Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow GithubStream to continue on specific errors #16

Open
ericboucher opened this issue Sep 17, 2021 · 2 comments
Open

Allow GithubStream to continue on specific errors #16

ericboucher opened this issue Sep 17, 2021 · 2 comments

Comments

@ericboucher
Copy link
Contributor

ericboucher commented Sep 17, 2021

As I was implementing a GithubStream for community_profile, I stumbled upon the following issue:

Some repos actually return a 404 when we request their community_profile. It is quite lucky that this got caught early thanks to the main test repo being one of these edge-cases. Though I still do not understand why.

As a workaround, we could allow tolerated_http_errors and handle them differently. This is what I have implemented here. But I am open to other ideas 😄

Open question:

  • Is it worth implementing at the SDK level? This seems a bit specific to GitHub but 🤷
@ericboucher
Copy link
Contributor Author

ericboucher commented Sep 17, 2021

Linked issue on the SDK - Error Handling - https://gitlab.com/meltano/sdk/-/issues/134

@aaronsteers
Copy link
Contributor

aaronsteers commented Sep 20, 2021

@ericboucher - For the SDK generic handling, I'm thinking a multi-facet approach along these lines:

  1. Add a public and overridable method in RESTStream called raise_if_error() (or similar), which would abstract the case logic in _request_with_backoff(). By default, the raise_if_error() would check status codes just as we do today, and then raise one of a specific set of exception types to be handled by _request_with_backoff() and the SDK internals.
  2. Predefined exception types would be documented in the dev guide, to be optionally sent by custom implementations, and then caught and handled by the SDK.
  3. We probably would want to introduce both "Fatal" and "Retryable" exception classes, and also add exception classes to signify rate limits.
    • Rate limit exceptions could similarly be categorized as retryable or fatal, depending on period (per-minute limits being retryable and per-day limits not, for instance).
  4. The set of status codes that roll up into each exception type should also be overridable by the developer.
    • The example here uses tolerated_http_errors. We could support a RESTStream.tolerated_http_errors property or perhaps just allow users to override fatal_http_errors to exclude the error code in question.
    • If the user overrides the numeric code list (RESTStream.tolerated_http_errors or RESTStream.fatal_http_errors, for instance), then they may or may not actually need to also override raise_if_error(). Presumably, they'd only need to do both when the status code is not sufficient information or when the response needs to be parsed in order to properly determine the nature of the exception message.
    • Potential flaw in this plan is that if we want an entire series of codes to represent a failure, it may require adding all the codes in the sequence, something like range(start=400, stop=500). I'm not sure how much of an issue this would be, but wanted to call it out.

@ericboucher - What do you think of this approach? It should give you less you would need to override in cases such as these.

cc @edgarrmondragon

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants