-
Notifications
You must be signed in to change notification settings - Fork 61
Handle ValueError from _decode_response #55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@chekunkov Please take a look and let me know if we need a better message for such cases, or more tests probably. I consider such issues as temporary ones, so the only advice is to retry the request later, maybe we also should create an additional subclass for |
scrapinghub/client/exceptions.py
Outdated
| # exception on attempt to decode wrong json in _decode_response | ||
| raise ScrapinghubAPIError( | ||
| "Error decoding server response ({}). Please try again later.". | ||
| format(str(exc))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's a good idea to catch all ValueErrors here and convert them to ScrapinghubAPIError. This is specifically a problem of legacy python-scrapinghub, we can fix it there, where it happens - e.g. by checking response code before trying to get json and raising and APIError. I Even if this will change behaviour - not a big deal, this is rather bug than a feature of legacy code. And I think that the new client desires major version bump, so we can allow ourselves to have some small incompatibilities in v2.0.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, makes sense, let's handle it there 👍
| elif 400 <= status_code < 500: | ||
| raise ScrapinghubAPIError(http_error=exc) | ||
| elif 500 <= status_code < 600: | ||
| raise ServerError(http_error=exc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chekunkov Reworked a bit: the simplest way would be to call raise_for_status in the legacy client, and handle it here, does it look better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vshlapakov depending on whether it's possible to receive a non 200 response with some json body that should proceed to self._decode_response (not sure what's the return code for e.g. this error https://github.com/scrapinghub/python-scrapinghub/pull/55/files#diff-cd5ba6b088815a563525dde1900bbfecR150)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a fair point, thanks. Return code is 403 for the authentication errors, so just calling raise_for_status doesn't fit. There's a few options how to handle it more gracefully in my mind:
- always raise HTTPError for all 500s errors(500<=status<600), w/o trying to parse body
- wrap
json.loadswith try-except and raise HTTPError from except clause (assuming that successful requests' responses always have proper body type, i.e. json)
My guess that 2nd approach is a way to go, and we also should do the same for jsonlines case. And as ValueError can happen in _decode_response on json.loads only, I'd wrap the whole method by catching ValueErrors and raising HTTPErrors based on response. Do you agree or I'm missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with the wrapper is that for jl format ValueError will never be catched as method returns generator in this case. I think it'd be reasonable to raise appropriate APIError for 404 (APIError.ERR_NOT_FOUND is never used) and 5xx (new err type) in _decode_response
P.S. let's change APIError error type values to strings vs current meaningless ints, e.g. ERR_DEFAULT = 0 -> ERR_DEFAULT = 'err_default'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, please review (but let me squash it before merge if it looks good)
|
LGTM, just squash it |
19eae10 to
7cf1744
Compare
7cf1744 to
a273b76
Compare
Please, review.