Commit

Document hf_transfer more prominently (#1714)
* draft doc

* alias

* advice to use  for faster downloads

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Wauplin and stevhliu committed Oct 6, 2023
1 parent 35117fb commit 6ccc1f6
Showing 6 changed files with 41 additions and 10 deletions.
5 changes: 4 additions & 1 deletion docs/source/en/_redirects.yml
@@ -8,4 +8,7 @@ how-to-model-cards: guides/model-cards
how-to-upstream: guides/upload
search-the-hub: guides/search
guides/manage_spaces: guides/manage-spaces
package_reference/inference_api: package_reference/inference_client

# Alias for hf-transfer description
hf_transfer: package_reference/environment_variables#hfhubenablehftransfer
12 changes: 11 additions & 1 deletion docs/source/en/guides/download.md
@@ -231,4 +231,14 @@ For a full list of the arguments, you can run:

```bash
huggingface-cli download --help
```

## Faster downloads

If you are running on a machine with high bandwidth, you can increase your download speed with [`hf_transfer`](https://github.com/huggingface/hf_transfer), a Rust-based library developed to speed up file transfers with the Hub. To enable it, install the package (`pip install hf_transfer`) and set `HF_HUB_ENABLE_HF_TRANSFER=1` as an environment variable.

<Tip warning={true}>

`hf_transfer` is a power user tool! It is tested and production-ready, but it lacks user-friendly features like progress bars or advanced error handling. For more details, please take a look at this [section](https://huggingface.co/docs/huggingface_hub/hf_transfer).

</Tip>
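
The two steps above (install the package, set the variable) can be sketched in Python. This is a minimal illustration, not part of `huggingface_hub`: the helper name `enable_hf_transfer` is hypothetical, and it assumes the variable has to be set before `huggingface_hub` is imported, since the library exposes it as a module-level constant.

```python
import importlib.util
import os


def enable_hf_transfer(environ=None) -> bool:
    """Set HF_HUB_ENABLE_HF_TRANSFER=1 when the `hf_transfer` package is importable.

    Hypothetical helper for illustration only. Returns True if the variable was set.
    """
    environ = os.environ if environ is None else environ
    if importlib.util.find_spec("hf_transfer") is None:
        # Package not installed (`pip install hf_transfer`): leave the environment untouched.
        return False
    environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
    return True


# Assumed usage: call the helper first, then import and use `huggingface_hub` normally.
# enable_hf_transfer()
# from huggingface_hub import hf_hub_download
```

Exporting `HF_HUB_ENABLE_HF_TRANSFER=1` in the shell before launching Python achieves the same result without any helper.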
2 changes: 1 addition & 1 deletion docs/source/en/guides/upload.md
@@ -493,7 +493,7 @@ be re-uploaded twice but checking it client-side can still save some time.
uploads on machines with very high bandwidth. To use it, you must install it (`pip install hf_transfer`) and enable it
by setting `HF_HUB_ENABLE_HF_TRANSFER=1` as an environment variable. You can then use `huggingface_hub` normally.
Disclaimer: this is a power user tool. It is tested and production-ready but lacks user-friendly features like progress
bars or advanced error handling. For more details, please refer to this [section](https://huggingface.co/docs/huggingface_hub/hf_transfer).
## (legacy) Upload files with Git LFS
12 changes: 5 additions & 7 deletions docs/source/en/package_reference/environment_variables.md
@@ -136,17 +136,15 @@ You can set `HF_HUB_DISABLE_TELEMETRY=1` as environment variable to globally dis

### HF_HUB_ENABLE_HF_TRANSFER

Set to `True` to download files from the Hub using `hf_transfer`. It's a Rust-based package
that enables faster download (up to x2 speed-up). Be aware that this is still experimental
so it might cause issues in your workflow. In particular, it does not support features such
as progress bars, resume download, proxies or error handling.
Set to `True` for faster uploads and downloads from the Hub using `hf_transfer`.

**Note:** `hf_transfer` has to be installed separately [from Pypi](https://pypi.org/project/hf-transfer/).
By default, `huggingface_hub` uses the Python-based `requests.get` and `requests.post` functions. Although these are reliable and versatile, they may not be the most efficient choice for machines with high bandwidth. [`hf_transfer`](https://github.com/huggingface/hf_transfer) is a Rust-based package developed to maximize the bandwidth used by dividing large files into smaller parts and transferring them simultaneously using multiple threads. This approach can potentially double the transfer speed. To use `hf_transfer`, you need to install it separately [from PyPI](https://pypi.org/project/hf-transfer/) and set `HF_HUB_ENABLE_HF_TRANSFER=1` as an environment variable.

Please note that using `hf_transfer` comes with certain limitations. Since it is not purely Python-based, debugging errors may be challenging. Additionally, `hf_transfer` lacks several user-friendly features such as progress bars, resumable downloads and proxies. These omissions are intentional to maintain the simplicity and speed of the Rust logic. Consequently, `hf_transfer` is not enabled by default in `huggingface_hub`.
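
The idea of dividing a large file into parts and transferring them simultaneously can be sketched in pure Python. This is only a conceptual illustration under stated assumptions, not the Rust code in `hf_transfer`: the names `split_ranges` and `parallel_transfer` are hypothetical, and an in-memory `bytes` object stands in for the remote file so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(total_size: int, chunk_size: int) -> list[tuple[int, int]]:
    """Return inclusive (start, end) byte ranges that cover `total_size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, total_size) - 1)
        for start in range(0, total_size, chunk_size)
    ]


def parallel_transfer(source: bytes, chunk_size: int, max_workers: int = 4) -> bytes:
    """Copy `source` part by part using a thread pool."""
    destination = bytearray(len(source))

    def fetch(byte_range: tuple[int, int]) -> None:
        start, end = byte_range
        # A real client would issue a ranged request here; this sketch
        # slices the in-memory source instead. Ranges never overlap.
        destination[start : end + 1] = source[start : end + 1]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list(...) waits for all parts and re-raises any worker exception.
        list(pool.map(fetch, split_ranges(len(source), chunk_size)))
    return bytes(destination)
```

Each part is independent of the others, which is what allows several of them to be in flight at once on a high-bandwidth connection.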

## From external tools

Some environment variables are not specific to `huggingface_hub` but are still taken into account when they are set.

### NO_COLOR

10 changes: 10 additions & 0 deletions src/huggingface_hub/commands/download.py
@@ -42,10 +42,14 @@
from huggingface_hub import logging
from huggingface_hub._snapshot_download import snapshot_download
from huggingface_hub.commands import BaseHuggingfaceCLICommand
from huggingface_hub.constants import HF_HUB_ENABLE_HF_TRANSFER
from huggingface_hub.file_download import hf_hub_download
from huggingface_hub.utils import disable_progress_bars, enable_progress_bars


logger = logging.get_logger(__name__)


class DownloadCommand(BaseHuggingfaceCLICommand):
    @staticmethod
    def register_subcommand(parser: _SubParsersAction):
@@ -164,6 +168,12 @@ def _download(self) -> str:
            if self.exclude is not None and len(self.exclude) > 0:
                warnings.warn("Ignoring `--exclude` since filenames have been explicitly set.")

        if not HF_HUB_ENABLE_HF_TRANSFER:
            logger.info(
                "Consider using `hf_transfer` for faster downloads. This solution comes with some limitations. See"
                " https://huggingface.co/docs/huggingface_hub/hf_transfer for more details."
            )

        # Single file to download: use `hf_hub_download`
        if len(self.filenames) == 1:
            return hf_hub_download(
10 changes: 10 additions & 0 deletions src/huggingface_hub/commands/upload.py
@@ -51,10 +51,14 @@
from huggingface_hub import logging
from huggingface_hub._commit_scheduler import CommitScheduler
from huggingface_hub.commands import BaseHuggingfaceCLICommand
from huggingface_hub.constants import HF_HUB_ENABLE_HF_TRANSFER
from huggingface_hub.hf_api import HfApi
from huggingface_hub.utils import disable_progress_bars, enable_progress_bars


logger = logging.get_logger(__name__)


class UploadCommand(BaseHuggingfaceCLICommand):
    @staticmethod
    def register_subcommand(parser: _SubParsersAction):
@@ -192,6 +196,12 @@ def _upload(self) -> str:
            if self.delete is not None and len(self.delete) > 0:
                warnings.warn("Ignoring `--delete` since a single file is uploaded.")

        if not HF_HUB_ENABLE_HF_TRANSFER:
            logger.info(
                "Consider using `hf_transfer` for faster uploads. This solution comes with some limitations. See"
                " https://huggingface.co/docs/huggingface_hub/hf_transfer for more details."
            )

        # Schedule commits if `every` is set
        if self.every is not None:
            if os.path.isfile(self.local_path):
