
[RFC] Consider switching to httpx instead of requests #1368

Closed

Wauplin opened this issue Mar 1, 2023 · 3 comments

Wauplin (Contributor) commented Mar 1, 2023

1. requests.Session would be really cool

huggingface_hub is essentially a library for making calls to the Hub. For that we use requests, which is great but has its limitations. requests has the concept of a Session to keep a connection alive when making multiple requests in a row. I ran some experiments this morning and the gain is very substantial. The following snippet uses hfh to create a repo, upload 2 (almost empty) files and delete it. It's not perfect but quite representative of what users could do.

from huggingface_hub import create_repo, delete_repo, upload_file

repo_id = create_repo("tmp_repo_id_lucain").repo_id
upload_file(repo_id=repo_id, path_in_repo="text.txt", path_or_fileobj=b"content")
upload_file(repo_id=repo_id, path_in_repo="lfs.bin", path_or_fileobj=b"content")
delete_repo(repo_id=repo_id)

I tweaked huggingface_hub so that this snippet runs with and without requests.Session. Average execution time goes down from 9.3s to 5.5s (-40%). The gain comes from the fact that we don't open a new HTTPS connection on each call.
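For context, here is a minimal standalone sketch (not the actual huggingface_hub patch) of where the gain comes from; the URL and call count are just placeholders:

# Compare a fresh connection per call vs. reusing a single requests.Session.
import time
import requests

URL = "https://huggingface.co/api/models?limit=1"  # placeholder endpoint

def without_session(n: int = 5) -> float:
    start = time.perf_counter()
    for _ in range(n):
        requests.get(URL)  # new TLS handshake on every call
    return time.perf_counter() - start

def with_session(n: int = 5) -> float:
    start = time.perf_counter()
    with requests.Session() as session:
        for _ in range(n):
            session.get(URL)  # connection is kept alive and reused
    return time.perf_counter() - start

print(f"without session: {without_session():.2f}s")
print(f"with session:    {with_session():.2f}s")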

I also tried a real example: loading weights with diffusers when they are already cached. A discussion was started by @patrickvonplaten yesterday (internal link).

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16)

=> It turns out I save around 1s (from 4.6s to 3.7s), which is again quite substantial. In this particular case diffusers could also be improved, but it still highlights that huggingface_hub can do better.

Note: besides significantly improving performance, being able to pass a Session object is also useful for users who need to configure their proxy (and headers/user-agent/cookies/...). At the moment, proxies can be configured in hf_hub_download and snapshot_download but not in HfApi.

2. But requests.Session is not thread-safe

I was about to push a draft PR adding an (optional) session parameter everywhere when I realized that doing so is not entirely thread-safe. The issue was raised in 2015 and referenced quite a lot, but it is still not solved. It seems that most problems have been fixed in the underlying urllib3 package, but that's not enough: see for example this issue reported in Dec 2022. I haven't investigated the why behind it, but I think it's quite a no-go for us. We use threads to upload/download files (though the Session is less needed in that case), and any user can also make concurrent calls using HfApi (for example, loop over repos and do something with each). Since it would be nice to still guarantee that HfApi is thread-safe, let's not use requests.Session.
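To make the concern concrete, this is roughly the usage pattern whose safety is not guaranteed (an illustrative sketch, not hfh code; the repo ids and endpoint are placeholders):

# A single requests.Session shared across worker threads, e.g. a user
# looping over repos concurrently while the library reuses one session.
from concurrent.futures import ThreadPoolExecutor
import requests

session = requests.Session()  # shared by all threads below

def fetch(repo_id: str) -> int:
    response = session.get(f"https://huggingface.co/api/models/{repo_id}")
    return response.status_code

repo_ids = ["gpt2", "bert-base-uncased", "distilbert-base-uncased"]
with ThreadPoolExecutor(max_workers=3) as pool:
    print(list(pool.map(fetch, repo_ids)))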

3. Use httpx instead

httpx describes itself as "a fully featured HTTP client for Python 3, which provides sync and async APIs, and support for both HTTP/1.1 and HTTP/2". What's interesting for us is:

  • it's very similar to requests. They have a compatibility guide and nothing looks too difficult to handle (see the sketch after this list).
  • it's guaranteed to be thread-safe
  • it aims to be as compliant as possible with HTTP standards
  • (not a priority) switching to httpx would also be a good start if we want to support some async stuff some day.
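To illustrate the first point, here is a rough sketch of what an httpx-based call could look like; this is not a proposed implementation, and the user-agent value and URL are placeholders:

# httpx.Client plays the role of requests.Session and is documented as thread-safe.
# follow_redirects=True is needed because httpx does not follow redirects by default.
import httpx

with httpx.Client(follow_redirects=True, headers={"user-agent": "huggingface_hub/<version>"}) as client:
    response = client.get("https://huggingface.co/api/models?limit=1")
    response.raise_for_status()  # raises an httpx.HTTPError subclass on 4xx/5xx
    print(len(response.json()), "model(s) returned")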

4. Caveats and workarounds

Switching to httpx still has some drawbacks:

  • The community might be less used to it. It also adds a new dependency.
  • httpx.HTTPError and requests.HTTPError are not the same objects (same for Response and Request). It's not much of an issue since huggingface_hub never returns a Request/Response directly. However, I think we should make HfHubHTTPError inherit from both httpx.HTTPError and requests.HTTPError. This way we don't break any try/except implemented by downstream libraries (see the sketch after this list).
  • httpx does not follow redirects by default, but we can enable it. To keep compatibility, I guess we should do so (at least to still handle hf.co => huggingface.co).
  • stream=True is used once in hfh and its behavior has changed => should be ok
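Regarding the error classes, a minimal sketch of the dual-inheritance idea (constructor handling is simplified; the point is only that downstream try/except blocks keep working):

# HfHubHTTPError would be caught by both `except requests.HTTPError`
# and `except httpx.HTTPError` blocks in downstream libraries.
import httpx
import requests

class HfHubHTTPError(httpx.HTTPError, requests.HTTPError):
    def __init__(self, message: str) -> None:
        super().__init__(message)

err = HfHubHTTPError("401 Client Error: Unauthorized")
assert isinstance(err, httpx.HTTPError)
assert isinstance(err, requests.HTTPError)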

5. What's next?

WDYT about it? Any strong feelings against doing it? My take is that it's a good idea to do it now, deal with the few issues that will come up, and then we'll be fine to build on top of it.

So basically the plan is:

  1. Make a PR to switch from requests to httpx without adding features. Keep requests as a dependency.
  2. Once we have httpx, use a session everywhere (called httpx.Client in httpx). Also deprecate the proxies parameters since this configuration can be handled by the user directly.
  3. (later) start to use async capabilities

tagging @sgugger @LysandreJik @patrickvonplaten @lhoestq @julien-c @osanseviero for feedback

patrickvonplaten (Contributor) commented

I don't know enough about httpx to have a good opinion here, but from your points above it seems very much worth switching to httpx if we're able to bring latencies down significantly when making multiple calls to the Hub in rapid succession.

julien-c (Member) commented Mar 7, 2023

maybe worth asking the community about this, too

Wauplin (Contributor, Author) commented Mar 27, 2023

As part of #1394, I used a requests.Session object to keep the connection alive between HTTP calls. To tackle the thread-safety issue, we now have a get_session() helper to get the currently opened session. One session per thread is created to avoid mixing them. Sessions can be globally configured using configure_http_backend.
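For reference, a short sketch of how a user can configure this (roughly following the documented API; the proxy URL is a placeholder):

# Every HTTP call in huggingface_hub now goes through get_session().
# configure_http_backend lets users customize the Session factory globally.
import requests
from huggingface_hub import configure_http_backend, get_session

def backend_factory() -> requests.Session:
    session = requests.Session()
    session.proxies = {"https": "http://proxy.example.com:3128"}  # placeholder proxy
    return session

configure_http_backend(backend_factory=backend_factory)

# One Session per thread is created lazily from this factory:
session = get_session()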

So no need for httpx for now. It would have been cleaner to have a single httpx.Client, but adding a new dependency was too heavy compared to the implemented workaround. I'm closing this issue as "won't fix". Let's revisit httpx if we introduce async at some point.

Wauplin closed this as not planned on Mar 27, 2023.