Skip to content

Conversation

@vshlapakov
Copy link
Contributor

@vshlapakov vshlapakov commented Mar 21, 2017

Please, review.

@vshlapakov vshlapakov self-assigned this Mar 21, 2017
@vshlapakov vshlapakov requested a review from chekunkov March 22, 2017 09:33
@vshlapakov
Copy link
Contributor Author

@chekunkov Please take a look and let me know if we need a better message for such cases, or more tests probably. I consider such issues as temporary ones, so the only advice is to retry the request later, maybe we also should create an additional subclass for InternalServerError representing HTTP 500 responses.

# exception on attempt to decode wrong json in _decode_response
raise ScrapinghubAPIError(
"Error decoding server response ({}). Please try again later.".
format(str(exc)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's a good idea to catch all ValueErrors here and convert them to ScrapinghubAPIError. This is specifically a problem of legacy python-scrapinghub, we can fix it there, where it happens - e.g. by checking response code before trying to get json and raising and APIError. I Even if this will change behaviour - not a big deal, this is rather bug than a feature of legacy code. And I think that the new client desires major version bump, so we can allow ourselves to have some small incompatibilities in v2.0.0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, makes sense, let's handle it there 👍

elif 400 <= status_code < 500:
raise ScrapinghubAPIError(http_error=exc)
elif 500 <= status_code < 600:
raise ServerError(http_error=exc)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chekunkov Reworked a bit: the simplest way would be to call raise_for_status in the legacy client, and handle it here, does it look better?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vshlapakov depending on whether it's possible to receive a non 200 response with some json body that should proceed to self._decode_response (not sure what's the return code for e.g. this error https://github.com/scrapinghub/python-scrapinghub/pull/55/files#diff-cd5ba6b088815a563525dde1900bbfecR150)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fair point, thanks. Return code is 403 for the authentication errors, so just calling raise_for_status doesn't fit. There's a few options how to handle it more gracefully in my mind:

  • always raise HTTPError for all 500s errors(500<=status<600), w/o trying to parse body
  • wrap json.loads with try-except and raise HTTPError from except clause (assuming that successful requests' responses always have proper body type, i.e. json)

My guess that 2nd approach is a way to go, and we also should do the same for jsonlines case. And as ValueError can happen in _decode_response on json.loads only, I'd wrap the whole method by catching ValueErrors and raising HTTPErrors based on response. Do you agree or I'm missing something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with the wrapper is that for jl format ValueError will never be catched as method returns generator in this case. I think it'd be reasonable to raise appropriate APIError for 404 (APIError.ERR_NOT_FOUND is never used) and 5xx (new err type) in _decode_response

P.S. let's change APIError error type values to strings vs current meaningless ints, e.g. ERR_DEFAULT = 0 -> ERR_DEFAULT = 'err_default'

Copy link
Contributor Author

@vshlapakov vshlapakov Mar 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, please review (but let me squash it before merge if it looks good)

@chekunkov
Copy link
Contributor

LGTM, just squash it

@vshlapakov vshlapakov force-pushed the sc1467-handle-json-decode-error branch 2 times, most recently from 19eae10 to 7cf1744 Compare March 23, 2017 13:39
@vshlapakov vshlapakov force-pushed the sc1467-handle-json-decode-error branch from 7cf1744 to a273b76 Compare March 23, 2017 13:40
@vshlapakov vshlapakov merged commit 471580a into sc1467-1 Mar 23, 2017
@vshlapakov vshlapakov deleted the sc1467-handle-json-decode-error branch March 23, 2017 13:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants