-
Notifications
You must be signed in to change notification settings - Fork 212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: more granular conversion from object_store::Error to Error #2316
base: main
Are you sure you want to change the base?
Conversation
ACTION NEEDED Lance follows the Conventional Commits specification for release automation. The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification. For details on the error please inspect the "PR Title Check" action. |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2316 +/- ##
==========================================
- Coverage 80.68% 80.67% -0.02%
==========================================
Files 191 191
Lines 56695 56721 +26
Branches 56695 56721 +26
==========================================
+ Hits 45746 45761 +15
- Misses 8386 8402 +16
+ Partials 2563 2558 -5
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking at this.
I think we should probably go for a different approach. Some of these error mappings take something very generic, and map it to a very specific error. For example, object_store::Error::NotFound
could be any file isn't found, but we are mapping it here to specifically mean that the dataset is not found. Or object_store::Error::InvalidPath
being mapped to Error::InvalidInput
, even though the path could be one that internal and not user provided, and thus a more appropriate error would be Error::Internal
.
Here's what I think might be a better solution:
- Change
Error::IO
variant to holdBoxError
, so it could be downcast if needed. - To solve the "opening non-existent dataset should return Error::NotFound instead of Error::IO" issue, we should add the error the specific code path where we generate the error. We should also have a unit test that validates we produce an error.
If you'd like, you can just start with a PR to do (1), as I think that on it's own would be a meaningful improvement.
Thanks for the feedback, I will start with (1) and see what I can do with (2). |
@wjones127 if we change Error::IO from #[derive(Debug, Snafu)]
#[snafu(visibility(pub))]
pub enum Error {
...
#[snafu(display("LanceError(IO): {message}, {location}"))]
IO { message: String, location: Location },
...
} to #[derive(Debug, Snafu)]
#[snafu(visibility(pub))]
pub enum Error {
...
#[snafu(display("LanceError(IO): {source}, {location}"))]
IO { source: BoxedError, location: Location },
...
} and maybe add a helper function like this: impl Error {
pub fn io(message: impl Into<String>, location: Location) -> Self {
let message: String = message.into();
Self::IO {
source: message.into(),
location,
}
}
} do we want Error::IO to be a wrapper of other Error enums? like this: return Err(Error::IO {
source: Box::new(Error::Internal {
message: "HashJoiner: No data".to_string(),
location: location!(),
}),
location: location!()
}); if so, where do we expect to downcast the Error::IO to the specific Error enum? or we just make everything Error::IO, like this: return Err(Error::io(
"HashJoiner: No data".to_string(),
location!(),
)); I am very new to Lance and have only read a small portion of the codebase yet so sorry if these are naive questions |
No, it's not meant to wrap our own errors. The source field will wrap other error types, like object store error. Also, keep in mind some of the error handling in our code base is very bad. This is a good example where someone used an IO error when the should have used something else, since this error doesn't have to do with reading or writing data from some IO source/sink. So if you see something like this in our codebase, do not assume it is "the right way". What I meant is that object store errors would be put in a box and wrapped in
The downcasts would not be in our codebase. This is something users of our library would do if they wanted to get the underlying object store error. |
This PR tries to fix issue #2067 by adding a more granular conversion from object_store::Error types to Error types.