
More advanced pipelining for hf_transfer to increase download speed even further #32

Open
aikitoria opened this issue Mar 28, 2024 · 7 comments


Is your feature request related to a problem? Please describe.
hf_transfer is very fast for individual files, but for models with many split files, it's not quite as fast as it could be.

Describe the solution you'd like
Currently, it seems like hf_transfer is invoked serially for each file, starting over from scratch with a new set of connections. That results in a timeline similar to this:
[image: timeline of files downloaded one after another, new connections opened for each file]

To fully maximize download speed, it should instead use a shared pool of connections, reused both across small files and across chunks of multiple larger files downloaded in parallel:
[image: timeline with a shared connection pool reused across files and chunks in parallel]
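
A minimal sketch of what that could look like (hypothetical, not hf_transfer's actual implementation; it assumes reqwest, tokio, and futures, and the chunk size / concurrency constants are made up): every file is split into byte-range chunks up front, and all chunks from all files feed one bounded worker pool sharing a single reqwest::Client, so connections and HTTP/2 streams are reused across file boundaries.

```rust
use futures::stream::{self, StreamExt};
use reqwest::Client;

const CHUNK_SIZE: u64 = 10 * 1024 * 1024; // illustrative: 10 MiB per range request
const MAX_PARALLEL: usize = 16;           // illustrative: shared across *all* files

/// Download one byte range; the shared client's pool is reused across calls.
async fn download_chunk(client: &Client, url: &str, start: u64, end: u64) -> reqwest::Result<usize> {
    let body = client
        .get(url)
        .header("Range", format!("bytes={start}-{end}"))
        .send()
        .await?
        .error_for_status()?
        .bytes()
        .await?;
    Ok(body.len()) // a real version would write this at the right file offset
}

/// `files` is a list of (url, total_size) pairs.
async fn download_all(files: &[(String, u64)]) -> reqwest::Result<()> {
    // One client for everything: clones of a reqwest Client share one
    // connection pool, and HTTP/2 is negotiated via ALPN by default.
    let client = Client::builder().build()?;

    // Flatten every (file, chunk) pair into a single work queue so small files
    // and chunks of large files interleave instead of running file-by-file.
    let mut chunks = Vec::new();
    for (url, len) in files {
        let mut start = 0;
        while start < *len {
            chunks.push((url.clone(), start, (start + CHUNK_SIZE - 1).min(*len - 1)));
            start += CHUNK_SIZE;
        }
    }

    stream::iter(chunks)
        .map(|(url, start, end)| {
            let client = client.clone();
            async move { download_chunk(&client, &url, start, end).await }
        })
        .buffer_unordered(MAX_PARALLEL) // at most MAX_PARALLEL requests in flight
        .for_each(|res| async {
            if let Err(e) = res {
                eprintln!("chunk failed: {e}");
            }
        })
        .await;

    Ok(())
}
```

A real implementation would also need retries and ordered writes to the correct file offsets, but the key point is a single shared pool that outlives any one file.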

aikitoria (Author) commented Mar 28, 2024

A simpler alternative might be to launch the next hf_transfer instance when the previous file is about 75% done rather than waiting for 100%. But reusing connections from a pool, ideally with at least HTTP/2 so that the second download onwards can start at full speed immediately, would be better.
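
A hedged sketch of that simpler scheme (illustrative names, not hf_transfer code): each per-file download publishes its completion fraction on a tokio::sync::watch channel, and the driver launches file N+1 as soon as file N crosses the threshold, so the tail of one transfer overlaps the connection setup and ramp-up of the next.

```rust
use tokio::sync::watch;

const START_NEXT_AT: f64 = 0.75; // launch the next file at 75% of the previous one

/// Hypothetical stand-in for the real per-file download; it is assumed to
/// report a completion fraction in [0.0, 1.0] via `progress` as bytes arrive.
async fn download_file(url: String, progress: watch::Sender<f64>) {
    // ... fetch byte ranges, calling progress.send_replace(done / total) ...
    let _ = url;
    let _ = progress.send_replace(1.0);
}

async fn pipeline(urls: Vec<String>) {
    let mut handles = Vec::new();
    for url in urls {
        let (tx, mut rx) = watch::channel(0.0f64);
        handles.push(tokio::spawn(download_file(url, tx)));

        // Hold back the launch loop (not the download itself) until this
        // file is START_NEXT_AT done, then start the next one.
        while *rx.borrow() < START_NEXT_AT {
            if rx.changed().await.is_err() {
                break; // sender dropped: the download already finished (or failed)
            }
        }
    }
    for h in handles {
        let _ = h.await; // wait for all in-flight downloads to complete
    }
}
```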

Wauplin (Contributor) commented Mar 29, 2024

Hi @aikitoria, thanks for the suggestion! Do you have an example repo where such an improvement could have a significant impact? At the moment hf_transfer is optimized for single-file downloads, which is usually enough for model repos where at most a few big files have to be downloaded (and therefore the time to open a connection is negligible in comparison to the download time). The changes you're suggesting are not trivial, so we would need good reasons to implement them :)

I'm also cc-ing @Narsil, who implemented hf_transfer.

Wauplin transferred this issue from huggingface/huggingface_hub on Mar 29, 2024
aikitoria (Author) commented Mar 29, 2024

Sure, here is an example where it would make a difference:

https://huggingface.co/databricks/dbrx-instruct/tree/main

When downloading this on a server with a 10 Gbit/s connection, re-establishing the connections and growing the TCP window again (slow start) takes up a significant portion of the time. We end up with something like this (exaggerated):

[image: bandwidth graph ramping up from zero again for each file (exaggerated)]

aikitoria (Author) commented:

@Wauplin here is another model where this would be very beneficial:

https://huggingface.co/CohereForAI/c4ai-command-r-plus

Narsil (Collaborator) commented Jun 4, 2024

@aikitoria

The reconnect only occurs because of this: https://github.com/huggingface/hf_transfer/blob/main/src/lib.rs#L162
hyperium/hyper#2136 (comment)

The reconnects probably only occur because large chunks take more than 20s to download.
Or maybe you're targeting a server that doesn't support HTTP/2?
Also, I believe your first picture is incorrect: the reconnects don't happen all at the same time, but continuously too (and if the download doesn't exceed the keep-alive, you shouldn't see any reconnects). (Misread your naming.)

Edit: Reflecting on this, since we're cloning the client across tasks, it's very possible that connection reuse doesn't happen, HTTP/2 or not.
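
(For what it's worth, reqwest's documentation says Client already holds its pool behind an Arc, so clone() shares connections across tasks; whether that applies here depends on what hf_transfer actually does at the line linked above. Below is a sketch of the reqwest builder knobs that govern the reconnect behavior discussed here, with purely illustrative values, not a verified fix.)

```rust
use std::time::Duration;

// Illustrative configuration: keep idle pooled connections alive longer, and
// send HTTP/2 keep-alive pings so connections that sit idle while a slow
// chunk finishes elsewhere aren't torn down and reopened.
fn build_client() -> reqwest::Result<reqwest::Client> {
    reqwest::Client::builder()
        .pool_idle_timeout(Duration::from_secs(120))
        .http2_keep_alive_interval(Duration::from_secs(15))
        .http2_keep_alive_while_idle(true)
        .build()
}
```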

aikitoria (Author) commented:

> Or maybe you're targeting a server that doesn't support HTTP/2?

I dunno? Hugging Face hosts the server themselves. I wasn't even aware you could use hf_transfer for anything other than downloading models from Hugging Face.

Narsil (Collaborator) commented Jun 4, 2024

> I wasn't even aware you could use hf_transfer for anything other than downloading models from Hugging Face.

It's just a tool that multiplexes the download over multiple byte ranges (bypassing some server-side rate limits and using all your cores).

Everything I said is a bit wrong: your whole argument is about multiple files, and that still holds. I was only thinking about thread multiplexing within a single file.
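
(To make the byte-range multiplexing described above concrete, here is a self-contained sketch under assumed names; hf_transfer's real logic lives in the src/lib.rs linked earlier. One file of known length is fetched as several concurrent Range requests and reassembled in order.)

```rust
use reqwest::Client;

/// Fetch `url` (with a known length of `total` bytes) as `parts` concurrent
/// byte-range requests and return the pieces in order. Illustrative only.
async fn fetch_in_ranges(
    client: &Client,
    url: &str,
    total: u64,
    parts: u64,
) -> reqwest::Result<Vec<Vec<u8>>> {
    let part_len = (total + parts - 1) / parts; // ceiling division
    let mut tasks = Vec::new();
    for i in 0..parts {
        let start = i * part_len;
        let end = ((i + 1) * part_len - 1).min(total - 1);
        let (client, url) = (client.clone(), url.to_owned()); // clones share the pool
        tasks.push(tokio::spawn(async move {
            client
                .get(&url)
                .header("Range", format!("bytes={start}-{end}"))
                .send()
                .await?
                .error_for_status()?
                .bytes()
                .await
                .map(|b| b.to_vec())
        }));
    }
    let mut out = Vec::with_capacity(tasks.len());
    for t in tasks {
        out.push(t.await.expect("download task panicked")?);
    }
    Ok(out)
}
```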
