[4.0] Consistent public error types in the Rust API #745

cdisselkoen · 2024-03-25T16:42:53Z

Describe the feature you'd like to request

Currently Cedar is inconsistent about what error data is public (accessible to Cedar consumers and thus part of our stable interface) and what error data is private (implementation details we're free to change at will, e.g., to improve error messages). In fact, at least six different patterns are currently in use in different places in the Cedar codebase, as described below. This issue requests that Cedar standardize all of its error types into Pattern C. This will be a breaking change to some APIs.

Pattern A

Methods that can return errors do not publicize what type they are returning, just impl miette::Diagnostic. Callers (Cedar consumers) can use the methods on miette::Diagnostic to get the error's message, help text, source location, etc but no other info.

Policy::to_json() currently takes this approach:

pub fn to_json(&self) -> Result<serde_json::Value, impl miette::Diagnostic>

This option is maximally flexible for Cedar developers, as Cedar can make any changes to error structures that it wants, even in patch releases, as long as whatever is returned still implements miette::Diagnostic (which will always be the case). But, this option is undesirable because it does not provide enough information to callers (Cedar consumers). Callers cannot distinguish between different error cases in a structured way, and cannot get other information about the error in a structured way -- for instance, seeing which EntityUid was responsible for an EntityDoesNotExist error, except by inspecting the error message.

Pattern B

Methods that can return errors return a named error type, but that type is a struct with private field(s). Of course, the error type implements miette::Diagnostic so provides all of those methods, to get the error's message, help text, source location, etc. The error type can also provide additional public methods if it wants.

eval_expression() (the public interface for the Cedar evaluator) currently takes this approach:

pub fn eval_expression(
    request: &Request,
    entities: &Entities,
    expr: &Expression,
) -> Result<EvalResult, EvaluationError>

where callers (Cedar consumers) just see EvaluationError as a struct with private fields in the docs.

This option is still extremely flexible for Cedar developers, as Cedar can make any changes to error structures that it wants, even in patch releases, as long as whatever is returned still implements miette::Diagnostic and any additional public methods we've already committed to providing for that error type. By choosing what public methods to provide, we have full control over what's public for each error type. However, the public methods we provide do have to "make sense" for all enum variants, assuming there is an enum under the hood. We can't have different public methods per enum variant, like we can in Pattern C. For this reason, and to improve consumers' ability to pattern-match on Cedar error types, we prefer Pattern C.

Pattern C (Proposed to standardize on)

Methods that can return errors return a public enum, but each enum variant is a single struct with private field(s). The enum itself, and also all of the member structs, implement miette::Diagnostic. The member structs can also provide additional public methods if they want, as could the enum.

TypeErrorKind currently takes something close to this approach. As you can see in its docs, it is a public enum with 13 variants, most of which are structs with private fields:

pub enum TypeErrorKind {
    UnexpectedType(UnexpectedType),
    IncompatibleTypes(IncompatibleTypes),
    ...
}

where callers (Cedar consumers) just see UnexpectedType and IncompatibleTypes as structs with private fields in the docs.

This option is somewhat less flexible for Cedar developers than Patterns A or B. Cedar cannot remove or rename enum variants in patch releases, and can't add new enum variants either unless the overall enum is marked as non_exhaustive (like TypeErrorKind is). However, Cedar can still add or remove fields from the member structs (like UnexpectedType) at will, as long as all public methods are maintained.

Callers (Cedar consumers) get significantly more structured information than they do with Patterns A or B. They get not only the methods on miette::Diagnostic and any other public methods provided by the enum, but also can match on the error kind, so they can for instance distinguish between UnexpectedType and IncompatibleTypes in a structured way, that is, without matching on the error message. And, they can use whatever variant-specific public methods Cedar chooses to provide on each member struct.

Pattern D

Methods that can return errors return a public enum, where each enum variant contains its data inline, or, contains a struct with public fields. The enum itself, and also all of the member structs (if any), implement miette::Diagnostic. The member structs (if any), as well as the public enum, can also provide additional public methods if they want, but this would be only a convenience and not expose any new information, because all of the information is already public.

SchemaError currently takes something close to this approach. As you can see in its docs, it is a public enum with 21 variants, which contain a mix of inline data related to the error and/or other error types with public fields (like EntityAttrEvaluationError).

pub enum SchemaError {
    Serde(serde::Error),
    ...
    UndeclaredEntityTypes(HashSet<String>),
    ...
    CycleInActionHierarchy(EntityUid),
    ...
    ActionAttrEval(EntityAttrEvaluationError),
    ExprEscapeUsed,
}

Callers (Cedar consumers) see detailed data, like the HashSet of entity types that were found to be undeclared in UndeclaredEntityTypes, or the public fields of EntityAttrEvaluationError. This is in contrast to Pattern C, where UndeclaredEntityTypes would contain a struct called UndeclaredEntityTypes instead of containing the HashSet directly; and the member struct UndeclaredEntityTypes would perhaps have a public method for iterating over the individual types, but would perhaps not promise that it's a HashSet (so Cedar would be free to change that implementation detail in patch releases), and would also allow Cedar to add more information to UndeclaredEntityType (like a source location) without breaking changes.

This option is much less flexible for Cedar developers than the above patterns. Like Pattern C, Cedar cannot remove or rename enum variants in patch releases, and cannot add enum variants unless the enum is marked non_exhaustive (which SchemaError is not). Unlike Pattern C, with Pattern D Cedar cannot add or remove data from the enum variants or their member structs, or change the type of any contained data. For instance, Cedar could not change the HashSet above to a BTreeSet or Vec, and could not add source locations to these error types. With Pattern C, Cedar could make those changes as long as we still provide the same public methods on each member struct. (We would write the public methods to, e.g., not expose the collection type contained in UndeclaredEntityTypes, but only allow iterating over it.)

Another complication of this option is the need to use only public-visible types in the enum variants and member structs. For instance, the EntityUid in the CycleInActionHierarchy variant is a public EntityUid, not a Core EntityUID. Of course, the validator internally needs to return Core EntityUID as it has no way to name the public-interface EntityUid (depending on cedar-policy would be circular). So, Cedar currently duplicates the entire SchemaError enum -- one declaration in the validator crate, using internal types, and another declaration in the cedar-policy crate, using public-interface types, with conversion methods between them. Patterns B and C also have to deal with converting internal types to public-interface types, but this is both easier and potentially more performant to do within the type's public methods rather than in the type itself. Pattern C would probably require us to re-declare the public enum, with wrapped member structs and wrapped public methods.

Pattern E (definitely bad)

Some Cedar errors are in weird in-between states, for instance EntitiesError (docs), which is a public enum with member structs that have public fields but are not themselves interface types, only Core types. This is unclear to callers (Cedar consumers) what they're "allowed" to match on. Depending on the syntactic situation, they might be able to access the fields of the member structs, or might have to take a dependency on cedar-policy-core to do so (because they're unable to syntactically name the type without importing it from Core). This is both confusing and bad.

Pattern F (definitely bad)

Another bad state is also exemplified by EntitiesError (sorry, EntitiesError): one of its enum variants is Duplicate which contains a Core EntityUID as inline data. To the extent that fields are public (like this one), they should only contain public-interface types. I'm not sure why Rust allows the Core type EntityUID to leak out of the public interface in this way. Regardless of whether Rust should or should not allow it, Cedar should not do this in its error types.

Describe alternatives you've considered

Patterns A, B, D are plausible alternatives (with B the most plausible of those), but this issue proposes standardizing on Pattern C.

Additional context

No response

Is this something that you'd be interested in working on?

👋 I may be able to implement this feature request
⚠️ This feature might incur a breaking change

The text was updated successfully, but these errors were encountered:

cdisselkoen · 2024-03-25T16:48:12Z

cdisselkoen · 2024-03-25T17:57:10Z

Small amendment, since Pattern C doesn't say this explicitly: It's also OK for an enum variant to be itself another enum that complies with the Pattern C guidelines. It doesn't always have to be a struct with private fields.

Unresolved question: Is it OK for an enum variant to be serde_json::Error, or do we want to wrap that? (E.g., in PolicyToJsonError.)

Another unresolved question: The current EvaluationError is a struct-with-private-fields, with a public method error_kind(&self) -> &EvaluationErrorKind where EvaluationErrorKind is an enum (that we would plan to conform to Pattern C). Is this OK, or do we want to convert to Pattern C's suggestion, requiring that EvaluationError is an enum directly (and thus each member struct contains the private advice and source_loc fields itself)? Previously we said (e.g., as referenced in this comment) that we prefer the *Kind pattern in order to avoid duplicating these fields on every enum variant. Do we want to reverse that decision for consistency in Pattern C, or allow this pattern (in addition to Pattern C)? What's most consistent for users and what leads to the best/cleanest code internally?

khieta · 2024-03-25T18:59:14Z

I'm fine with either, but I slightly prefer a wrapped error instead of serde_json::Error.
I believe the advice field could be replaced with miette's help. Is there anything in miette that we can use in place of the source_loc field? If so, I'd prefer the approach in Pattern C.

cdisselkoen · 2024-03-26T13:27:03Z

Currently, miette's help is sourced from the advice field of this error. I will see if it's possible to refactor to remove the field and just compute the advice directly in help()

cdisselkoen · 2024-03-27T13:44:06Z

that last suggestion became #748

john-h-kastner-aws · 2024-04-30T15:33:59Z

UnsupportedFeature -- this is currently publically exported in api.rs but I don't think it's actually used, so we can just remove it from public API?

This should be used inside of SchemaError::UnsupportedFeature, but that currently holds a String instead. We should unexport it as part of updating SchemaError.

Signed-off-by: Shaobo He <shaobohe@amazon.com>

cdisselkoen added pending-triage Hasn't been triaged yet feature-request Request for a new feature backlog We hope to work on this in the future 4.0 and removed pending-triage Hasn't been triaged yet labels Mar 25, 2024

This was referenced Mar 25, 2024

Change Result<T, impl miette::Diagnostic> to Result<T, E> #723

Open

Re-export core crate errors in cedar_policy crate #695

Open

Error codes for each error variant in Core/Validator #184

Open

This was referenced Mar 26, 2024

cedar#745 for PolicyToJsonError #747

Merged

reorg how advice is stored on EvaluationError #748

Merged

mstalzer assigned cdisselkoen Apr 5, 2024

khieta mentioned this issue May 2, 2024

Refactoring in api.rs #830

Merged

3 tasks

john-h-kastner-aws mentioned this issue May 2, 2024

Remove ParseErrors::errors_as_strings #543

Closed

2 tasks

shaobo-he-aws added a commit that referenced this issue May 15, 2024

Implement #745 for api::SchemaError

e50aa54

Signed-off-by: Shaobo He <shaobohe@amazon.com>

shaobo-he-aws mentioned this issue May 15, 2024

Implement #745 for api::SchemaError #876

Open

3 tasks

cdisselkoen mentioned this issue May 17, 2024

Box error types #878

Open

2 tasks

andrewmwells-amazon mentioned this issue May 20, 2024

Standardize PolicySetErrors #889

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[4.0] Consistent public error types in the Rust API #745

[4.0] Consistent public error types in the Rust API #745

cdisselkoen commented Mar 25, 2024 •

edited

cdisselkoen commented Mar 25, 2024 •

edited by andrewmwells-amazon

cdisselkoen commented Mar 25, 2024

khieta commented Mar 25, 2024

cdisselkoen commented Mar 26, 2024

cdisselkoen commented Mar 27, 2024

john-h-kastner-aws commented Apr 30, 2024 •

edited

[4.0] Consistent public error types in the Rust API #745

[4.0] Consistent public error types in the Rust API #745

Comments

cdisselkoen commented Mar 25, 2024 • edited

Category

Describe the feature you'd like to request

Pattern A

Pattern B

Pattern C (Proposed to standardize on)

Pattern D

Pattern E (definitely bad)

Pattern F (definitely bad)

Describe alternatives you've considered

Additional context

Is this something that you'd be interested in working on?

cdisselkoen commented Mar 25, 2024 • edited by andrewmwells-amazon

cdisselkoen commented Mar 25, 2024

khieta commented Mar 25, 2024

cdisselkoen commented Mar 26, 2024

cdisselkoen commented Mar 27, 2024

john-h-kastner-aws commented Apr 30, 2024 • edited

cdisselkoen commented Mar 25, 2024 •

edited

cdisselkoen commented Mar 25, 2024 •

edited by andrewmwells-amazon

john-h-kastner-aws commented Apr 30, 2024 •

edited