New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-43121: Set up server to client error propagation #973
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #973 +/- ##
==========================================
+ Coverage 88.89% 88.90% +0.01%
==========================================
Files 329 329
Lines 42280 42330 +50
Branches 8699 8703 +4
==========================================
+ Hits 37583 37634 +51
Misses 3444 3444
+ Partials 1253 1252 -1 ☔ View full report in Codecov by Sentry. |
15b404c
to
78d6fbc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
), "Subclasses of ButlerUserError must have unique 'error_type' property" | ||
|
||
|
||
def create_butler_user_error(error_type: str, message: str) -> ButlerUserError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add a docstring for this since it is used elsewhere and would help people reading the code and getting previews from VSCode.
try: | ||
model = self._get_file_info(datasetRefOrType, dataId, collections, kwargs) | ||
except DatasetNotFoundError as e: | ||
raise FileNotFoundError(str(e)) from e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we need to talk about this oddity for the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good call out... I'm pretty sure this is just incorrect even though it passes tests.
I thought this was needed because of the datastore vs registry thing. Datastore pretty consistently throws FileNotFoundError when stuff can't be found, and we're mostly acting in place of datastore here.
But I took a closer look at DirectButler and it does actually throw DatasetNotFoundError (formerly LookupError) for most common cases of not being able to lookup a ref in get(). Apparently that just isn't covered in test_butler.py
.
The FileNotFoundErrors being expected in the tests are actually for a couple of weird edge cases:
- changing the UUID of a ref to something invalid (line 500 in test_butler.py)
- holding on to a ref after calling
pruneDatasets(purge=True)
, then trying to call get() on it (line 417 in test_butler.py)
Arguably in both these cases it would make just as much sense to throw the DatasetNotFoundError, but because these checks are inside Datastore they happen to throw FileNotFoundError because that's what Datastore does.
8d09b13
to
8a89a98
Compare
Add an exception class that can be used to tag an exception as being a user-facing error message. Set up a FastAPI exception handler that can forward these error messages to the client side. Have RemoteButler raise the corresponding exception when errors are received from the server.
Hoist MissingDatasetTypeError up to the top-level exceptions file, and mark it as as ButlerUserError so that it can be sent from the server to the client. This allows removing some code specific to this exception type from RemoteButler.
In unit tests, we now convert unhandled server errors to a special exception class 'UnhandledServerError' that is not part of Butler's usual exception hierarchy. The FastAPI TestClient by default passes unhandled exceptions up from the server to the client. This is useful behavior for unit testing because it gives you traceability from the test to the problem in the server code. However, because RemoteButler is in some ways just a proxy for the server-side Butler, we raise similar exceptions on the client and server side. Thus the default TestClient behavior can mask missing error-handling logic. The registry test suite in test_remote_butler.py does not yet have all error-handling logic in place, so it opts out of the new behavior.
Remove one-off code handling FileNotFoundError and 404s in RemoteButler, replacing it with the new error propagation framework.
LookupError is currently thrown from a variety of places for mostly-unrelated reasons, most of which don't need to be propagated to the server. Give it a more specific name instead.
8a89a98
to
98b0e44
Compare
It turns out that DirectButler typically throws two different exceptions from get(): 1. DatasetNotFoundError if the dataset can't be found in registry 2. FileNotFoundError if the file itself cannot be found or under some other error conditions (some of which involve not being able to find the dataset in registry.) Added test coverage for case #1, and fixed RemoteButler to comply with it. Loosened the tests for some unusual error conditions to allow either of the two exception types, because RemoteButler doesn't have enough information available to replicate the DirectButler behavior.
2a535c5
to
9bb59b2
Compare
9bb59b2
to
f6eb3c8
Compare
Set up a framework for transferring exceptions from the server to the client and raising them in RemoteButler. Exceptions intended as user-facing error messages should now be raised as exceptions that subclass a new class
ButlerUserError
. This allows us to distinguish between "intentional"/"expected" error messages which should be sent to the client and should not raise alarms, vs server-internal breakage that our team will have to investigate and fix.Any
ButlerUserError
raised on the server will be caught and transferred to the client as JSON with a 422 status code, then raised as the original exception type on the client.The top-level
_exceptions.py
will eventually become the single file where Butler-specific exceptions for the public API are defined. Added a new exception typeDatasetNotFoundError
to replace someLookupError
s raised when datasets were not found. HoistedMissingDatasetTypeError
fromregistry._exceptions
to the top-level_exceptions
, and put its base classes in a separate file_exceptions_legacy.py
to indicate they should not be used from new code.Updated enough exceptions to pass the tests in
test_butler.py
using this framework. The registry tests intest_remote_butler.py
will require more work in a later PR.Checklist
doc/changes