Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

errors: replace anyhow with thiserror #212

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

mkulke
Copy link

@mkulke mkulke commented Apr 24, 2024

hey there @anakrish I took a stab at this one, PTAL. note: this would be breaking change.

fixes #161

This change replaces stringly-typed anyhow errors with thiserror-constructed explicit error enums.

The following strategy has been used in the conversion:

  • Every module maintains its own error enum, exception is src/builtins/*.rs in which all errors are consolidated in a single enum to avoid duplication
  • Source errors are being wrapped and propagated in the error string
  • Crate-specific business logic errors have been converted into their own enum
  • A simple bail! macro has been added to reduce changes
  • Extension fn's require a Box<dyn std::error::Error> error type. it would be nice if the extension fn's could have a result with trait bound std::error::Error but that requires some unergonomic type gymnastics on the trait + impls
  • A local type alias for Result has been added to modules to reduce changes in the signatures where reasonable.

@mkulke mkulke force-pushed the mkulke/use-thiserror branch 2 times, most recently from d6bf7cf to 76b50d3 Compare April 24, 2024 13:02
fixes microsoft#161

This change replaces stringly-typed anyhow errors with
thiserror-constructed explicit error enums.

The following strategy has been used in the conversion:

- Every module maintains its own error enum
- Exception is builtins/*.rs in which all errors are consolidated in one
  enum to avoid duplication
- Source errors are being wrapped and propagated in the error string
- Crate-specific businesslogic errors have been convernted into their
  own enum
- A simple `bail!` macro has been added to reduce changes
- Extension fn's require a `Box<dyn std::error::Error>` error type. it
  would be nice if the Extension fn's could have a result with trait
  bound `std::error::Error` but that requires some unergonomic type
  gymnastics on the trait + impls
- A local type alias for `Result` has been added to modules to reduce
  changes in the signatures where reasonable.

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
@anakrish
Copy link
Collaborator

Thanks for taking this on @mkulke.

I have a few questions:

  • When I do cargo run -r --example regorus -- eval "1+(a+b)" I get
Error: scheduler error: lexer error: 
--> <query.rego>:1:4
 |
1 | 1+(a+b)
 |    ^
error: use of undefined variable `a` is unsafe

Caused by:
   0: lexer error: 
      --> <query.rego>:1:4
        |
      1 | 1+(a+b)
        |    ^
      error: use of undefined variable `a` is unsafe
   1: 
      --> <query.rego>:1:4
        |
      1 | 1+(a+b)
        |    ^
      error: use of undefined variable `a` is unsafe
  • Is there some way to avoid printing the Caused by: information?
  • Is there some way to hide the scheduler error: lexer error from the user of an application like the regorus example? Such a user may not really care whether it is a lexer error or a scheduler error.

@anakrish
Copy link
Collaborator

this would be breaking change

What would be the impact on a user of the library? The regorus example program seems to compile fine without any changes; however the error printout is different.

What would be the recommended way to notify the user of such breaking changes? Does cargo provide any way?

src/engine.rs Outdated
#[error("value error: {0}")]
ValueError(#[from] ValueError),
#[error("data must be object")]
DataMustBeObject,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Say in a future version we refactor the code and eliminate the chance of such an error, would that be a breaking change?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The enum can be marked #[non_exhaustive], but I'm not sure whether that's a good pattern. Ideally the types cover more or less what could go wrong in a given public fn call w/ a granularity that is useful to the consumer.

It's annoying if the error type is shared and there will be a new fn added to the API. Adding a new error case would be a breaking change. But maybe the types could be per-fn or logically grouped (like engine::EvalError)?

@anakrish
Copy link
Collaborator

Every module maintains its own error enum, exception is src/builtins/*.rs in which all errors are consolidated in a single enum to avoid duplication

Say we later find out a better way to group and name errors, is there a way to make that a non-breaking change?

@mkulke
Copy link
Author

mkulke commented Apr 25, 2024

this would be breaking change

What would be the impact on a user of the library? The regorus example program seems to compile fine without any changes; however the error printout is different.

What would be the recommended way to notify the user of such breaking changes? Does cargo provide any way?

At the moment the API exposes an anyhow::Result in the public fn signatures, so I'd expect this to be breaking change. Curiously cargo-semver-checks doesn't detect a breaking change, but I wonder why. If someone e.g. relied on some anyhow::Result specifics like ::context() it would fail, I'd assume 🤔

@mkulke
Copy link
Author

mkulke commented Apr 25, 2024

  • Is there some way to avoid printing the Caused by: information?
  • Is there some way to hide the scheduler error: lexer error from the user of an application like the regorus example? Such a user may not really care whether it is a lexer error or a scheduler error.

yeah, if the error is marked as transparent then it will be just passed through w/o adding an additional layer of causes:

#[derive(Error, Debug)]
pub enum SchedulerError {
    ...
    #[error(transparent)]
    LexerError(#[from] LexerError),
    ...
}

...and similarly for SchedulerError in engine.rs, this will yield:

$ cargo run -r --example regorus -- eval "1+(a+b)"
   Compiling regorus v0.1.5 (/home/magnuskulke/dev/regorus)
    Finished release [optimized + debuginfo] target(s) in 13.08s
     Running `target/release/examples/regorus eval '1+(a+b)'`
Error:
--> <query.rego>:1:4
  |
1 | 1+(a+b)
  |    ^
error: use of undefined variable `a` is unsafe

@mkulke
Copy link
Author

mkulke commented Apr 25, 2024

Every module maintains its own error enum, exception is src/builtins/*.rs in which all errors are consolidated in a single enum to avoid duplication

Say we later find out a better way to group and name errors, is there a way to make that a non-breaking change?

I think we probably want to think a bit which errors should be exposed as top-level API, which are useful for the consumer, so .e.g InternalError | ParserError | DataMustBeAnObject | QueryDidNotProduceAnyValue | QueryProducedMoreThanOneValue or something...

@mkulke
Copy link
Author

mkulke commented Apr 25, 2024

What would be the recommended way to notify the user of such breaking changes? Does cargo provide any way?

Cargo relies on semver, so if it's bumped to 0.2.0 that should cover a breaking change (for > 1.0 crates).

- Internal errors are merely propagated along the call stack now
- engine::* errors have been split and grouped logically

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
@anakrish
Copy link
Collaborator

@mkulke
I've been thinking about this very important change.

  • It would be nice if we could store the Span in each error and the error/warning message. Span.error or Span.message takes the message and displays it in a rustc-like format pointing to the location within the erroring line. This can be done on-demand only when the error is being converted to string. Having source location in the error objects would also make it easier for downstream tools that might want to process the error.
  • We could keep the public interface as anyhow while we work towards figuring out the best representation/grouping for error.
    Alternatively we could use the strategy descibed in https://docs.rs/thiserror/latest/thiserror/ for "Another use case is hiding implementation details of an error representation behind an opaque error type, so that the representation is able to evolve without breaking the crate’s public API.". That would also allow us multiple iterations till we feel confident about the error representation.

@mkulke mkulke marked this pull request as draft May 15, 2024 16:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Avoid use of anyhow
2 participants