create a version of requests that retries on certain HTTP statuses by default #147

Closed
rudolfix opened this issue Feb 22, 2023 · 0 comments
Labels
good first issue Good for newcomers

rudolfix commented Feb 22, 2023

Motivation

  1. Most pipelines make hundreds of requests; with no retry built in, a single failed request fails the whole pipeline.
  2. We want a blueprint for a "source helper": something used so often that it becomes part of dlt core.

Tasks

  1. Implement a retry session using HTTPAdapter: https://majornetwork.net/2022/04/handling-retries-in-python-requests/
  2. Allow the timeout to be configured as well.
  3. Provide a default session with sane timeout and retry parameters.
  4. Provide a configurable session like: https://github.com/bustawin/retry-requests
  5. Look at thread safety, e.g. here someone explains how to make a shared connection pool that is thread safe: https://stackoverflow.com/questions/18188044/is-the-session-object-from-pythons-requests-library-thread-safe
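
A minimal sketch of tasks 1 to 4, assuming the standard `requests` and `urllib3` packages. The default timeout, the status list, the parameter names and the `TimeoutHTTPAdapter` class are illustrative assumptions, not values or names decided in this issue. Only `requests_with_retry` is a name taken from the text.

```python
from typing import Sequence

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Assumed defaults for illustration; the issue does not specify any values.
DEFAULT_TIMEOUT = 30.0  # seconds
DEFAULT_STATUS_FORCELIST = (429, 500, 502, 503, 504)


class TimeoutHTTPAdapter(HTTPAdapter):
    """Hypothetical adapter that applies a default timeout when none is given."""

    def __init__(self, *args, timeout: float = DEFAULT_TIMEOUT, **kwargs):
        self.timeout = timeout
        super().__init__(*args, **kwargs)

    def send(self, request, **kwargs):
        # Session passes timeout=None when the caller did not set one.
        if kwargs.get("timeout") is None:
            kwargs["timeout"] = self.timeout
        return super().send(request, **kwargs)


def requests_with_retry(
    max_retries: int = 5,
    backoff_factor: float = 1.0,
    status_forcelist: Sequence[int] = DEFAULT_STATUS_FORCELIST,
    timeout: float = DEFAULT_TIMEOUT,
) -> requests.Session:
    """Build a Session that retries on the given HTTP statuses."""
    retry = Retry(
        total=max_retries,
        backoff_factor=backoff_factor,
        status_forcelist=list(status_forcelist),
    )
    adapter = TimeoutHTTPAdapter(max_retries=retry, timeout=timeout)
    session = requests.Session()
    # Mount the adapter for both schemes so every request goes through it.
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session
```

Note that `Retry` only retries idempotent methods by default, so a `POST` would not be retried on these statuses unless `allowed_methods` is widened. Whether that is the right default is a design choice this sketch leaves open.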

Place the code in the dlt.sources.helpers.requests module and expose the default session as requests and the configurable session as requests_with_retry(...).

The point of the above is to provide a drop-in replacement for people making requests in the pipelines repository. They'd just import from a different module, and the remaining code could stay the same.
