v0.20.0: Authentication, speed, safetensors metadata, access requests and more.

@Wauplin Wauplin released this 20 Dec 10:16

(Discuss the release in our Community Tab. Feedback welcome!! 🤗)

🔐 Authentication

Authentication has been greatly improved in Google Colab. The best way to authenticate in a Colab notebook is to define an HF_TOKEN secret in your personal secrets. When a notebook tries to reach the Hub, a pop-up asks whether you want to share the HF_TOKEN secret with this notebook, as an opt-in mechanism. This way, there is no need to call huggingface_hub.login and copy-paste your token anymore! 🔥🔥🔥

In addition to the Google Colab integration, the login guide has been revisited to focus on security. It is recommended to authenticate either using huggingface_hub.login or the HF_TOKEN environment variable, rather than passing a hardcoded token in your scripts. Check out the new guide here.
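
As a minimal sketch of the recommendation above, the snippet below checks for a token in the HF_TOKEN environment variable instead of hardcoding one in the script. The helper name `token_from_env` is hypothetical and not part of huggingface_hub; it is only a way to verify the setup.

```python
import os
from typing import Optional


def token_from_env(var_name: str = "HF_TOKEN") -> Optional[str]:
    """Return the token stored in the environment, or None if unset or blank.

    Hypothetical helper for illustration; not a huggingface_hub function.
    """
    value = os.environ.get(var_name, "").strip()
    return value or None


if __name__ == "__main__":
    if token_from_env() is None:
        print("HF_TOKEN is not set; export it or call huggingface_hub.login().")
    else:
        # Never print the token itself; only confirm that it was found.
        print("Token found in the environment.")
```

The point of the design is that the secret lives outside the source file, so the script can be shared or committed without leaking it.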

🏎️ Faster HfFileSystem

HfFileSystem is a pythonic fsspec-compatible file interface to the Hugging Face Hub. Its implementation has been reworked to greatly improve the performance of fs.find.

Here is a quick benchmark with the bigcode/the-stack-dedup dataset:

| Call | v0.19.4 | v0.20.0 |
|---|---|---|
| hffs.find("datasets/bigcode/the-stack-dedup", detail=False) | 46.2s | 1.63s |
| hffs.find("datasets/bigcode/the-stack-dedup", detail=True) | 47.3s | 24.2s |
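
A timing like the one above could be reproduced with a small harness along these lines. This is a sketch, not the script used for the published numbers: `timed` and `benchmark_find` are hypothetical names, and results will vary with network conditions.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[[], Any]) -> Tuple[Any, float]:
    """Run fn once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def benchmark_find(path: str = "datasets/bigcode/the-stack-dedup") -> None:
    """Time hffs.find with and without details (requires network access)."""
    # Imported lazily so the timing helper works without huggingface_hub installed.
    from huggingface_hub import HfFileSystem

    hffs = HfFileSystem()
    for detail in (False, True):
        _, elapsed = timed(lambda: hffs.find(path, detail=detail))
        print(f"detail={detail}: {elapsed:.2f}s")
```

Calling `benchmark_find()` under each installed version gives comparable figures.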

🚪 Access requests API (gated repos)

Models and datasets can be gated to monitor who is accessing the data you are sharing. You can also filter access by manually approving requests. Access requests can now be managed programmatically using HfApi. This can be useful, for example, if you have advanced user request screening requirements (e.g. for compliance) or if you want to make access to a model conditional on completing a payment flow.

Check out this guide to learn more about gated repos.

>>> from huggingface_hub import list_pending_access_requests, accept_access_request

# List pending requests
>>> requests = list_pending_access_requests("meta-llama/Llama-2-7b")
>>> requests
[
    AccessRequest(
        username='clem',
        fullname='Clem 🤗',
        email='***',
        timestamp=datetime.datetime(2023, 11, 23, 18, 4, 53, 828000, tzinfo=datetime.timezone.utc),
        status='pending',
        fields=None,
    ),
    ...
]

# Accept Clem's request
>>> accept_access_request("meta-llama/Llama-2-7b", "clem")
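
The screening use case mentioned above could look roughly like this. It is a sketch under assumptions: the allowlist policy and the helper names `usernames_to_accept` and `accept_allowlisted` are hypothetical, and only the `username` and `status` fields shown in the output above are relied on.

```python
from typing import Any, Iterable, List, Set


def usernames_to_accept(requests: Iterable[Any], allowlist: Set[str]) -> List[str]:
    """Return usernames of pending requests that appear in the allowlist."""
    return [
        req.username
        for req in requests
        if req.status == "pending" and req.username in allowlist
    ]


def accept_allowlisted(repo_id: str, allowlist: Set[str]) -> None:
    """Accept every pending request whose user is allowlisted (calls the Hub)."""
    # Lazy import: only needed when actually talking to the Hub.
    from huggingface_hub import accept_access_request, list_pending_access_requests

    pending = list_pending_access_requests(repo_id)
    for username in usernames_to_accept(pending, allowlist):
        accept_access_request(repo_id, username)
```

Keeping the decision logic in a pure function makes the policy easy to test without touching a real repo.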

🔍 Parse Safetensors metadata

Safetensors is a simple, fast and secure format to save tensors in a file. Its advantages make it the preferred format to host weights on the Hub. Thanks to its specification, it is possible to parse the file metadata on-the-fly. HfApi now provides get_safetensors_metadata, a helper to get safetensors metadata from a repo.

>>> from huggingface_hub import get_safetensors_metadata

# Parse repo with single weights file
>>> metadata = get_safetensors_metadata("bigscience/bloomz-560m")
>>> metadata
SafetensorsRepoMetadata(
    metadata=None,
    sharded=False,
    weight_map={'h.0.input_layernorm.bias': 'model.safetensors', ...},
    files_metadata={'model.safetensors': SafetensorsFileMetadata(...)}
)
>>> metadata.files_metadata["model.safetensors"].metadata
{'format': 'pt'}
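
On-the-fly parsing is possible because a safetensors file starts with an 8-byte little-endian length prefix followed by a JSON header, so only the first bytes of the file are needed. The sketch below illustrates that layout on an in-memory payload; `parse_safetensors_header` is a hypothetical name, and this is not how get_safetensors_metadata is implemented.

```python
import json
import struct
from typing import Any, Dict


def parse_safetensors_header(data: bytes) -> Dict[str, Any]:
    """Parse the JSON header at the start of a safetensors payload.

    Layout: 8 bytes (little-endian unsigned 64-bit header size N),
    followed by N bytes of UTF-8 JSON.
    """
    if len(data) < 8:
        raise ValueError("need at least 8 bytes to read the header size")
    (header_size,) = struct.unpack("<Q", data[:8])
    header_bytes = data[8 : 8 + header_size]
    if len(header_bytes) < header_size:
        raise ValueError("payload is shorter than the declared header size")
    return json.loads(header_bytes.decode("utf-8"))


# Build a tiny in-memory example instead of downloading a real file.
header = {
    "__metadata__": {"format": "pt"},
    "weight": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]},
}
encoded = json.dumps(header).encode("utf-8")
blob = struct.pack("<Q", len(encoded)) + encoded + bytes(8)

parsed = parse_safetensors_header(blob)
print(parsed["__metadata__"])  # {'format': 'pt'}
```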

Other improvements

List and filter collections

You can now list collections on the Hub. You can filter them to return only collections containing a given item, or created by a given author.

>>> from huggingface_hub import list_collections
>>> collections = list_collections(item="models/TheBloke/OpenHermes-2.5-Mistral-7B-GGUF", sort="trending", limit=5)
>>> for collection in collections:
...   print(collection.slug)
teknium/quantized-models-6544690bb978e0b0f7328748
AmeerH/function-calling-65560a2565d7a6ef568527af
PostArchitekt/7bz-65479bb8c194936469697d8c
gnomealone/need-to-test-652007226c6ce4cdacf9c233
Crataco/favorite-7b-models-651944072b4fffcb41f8b568

Respect .gitignore

upload_folder now respects .gitignore files!

Previously it was possible to filter which files should be uploaded from a folder using the allow_patterns and ignore_patterns parameters. This can now automatically be done by simply creating a .gitignore file in your repo.
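
To make the idea of pattern-based filtering concrete, here is a simplified sketch using shell-style wildcards from the standard library. It is an illustration only: `filter_paths` is a hypothetical helper, and real .gitignore files support richer rules (negation, directory anchoring) that this does not model.

```python
from fnmatch import fnmatch
from typing import Iterable, List


def filter_paths(paths: Iterable[str], ignore_patterns: Iterable[str]) -> List[str]:
    """Keep the paths that match none of the ignore patterns."""
    patterns = list(ignore_patterns)
    return [p for p in paths if not any(fnmatch(p, pat) for pat in patterns)]


files = ["model.safetensors", "logs/run1.txt", "notes.tmp", "README.md"]
kept = filter_paths(files, ["logs/*", "*.tmp"])
print(kept)  # ['model.safetensors', 'README.md']
```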

Robust uploads

Uploading LFS files has also become more robust, with a retry mechanism if a transient error happens while uploading to S3.
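
A retry mechanism of this kind is commonly built as a loop with exponential backoff. The sketch below shows that general technique; the function name, the retried exception types, and the delays are all assumptions and do not describe the library's actual implementation.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn until it succeeds, doubling the wait after each transient failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("max_attempts must be at least 1")
```

Errors outside `retry_on` propagate immediately, so only failures assumed to be transient are retried.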

Target language in InferenceClient.translation

InferenceClient.translation now supports src_lang/tgt_lang for applicable models.

>>> from huggingface_hub import InferenceClient
>>> client = InferenceClient()
>>> client.translation("My name is Sarah Jessica Parker but you can call me Jessica", model="facebook/mbart-large-50-many-to-many-mmt", src_lang="en_XX", tgt_lang="fr_XX")
"Mon nom est Sarah Jessica Parker mais vous pouvez m'appeler Jessica"
>>> client.translation("My name is Sarah Jessica Parker but you can call me Jessica", model="facebook/mbart-large-50-many-to-many-mmt", src_lang="en_XX", tgt_lang="es_XX")
'Mi nombre es Sarah Jessica Parker pero puedes llamarme Jessica'

Support source in reported EvalResult

EvalResult now supports source_name and source_link to provide a custom source for a reported result.

  • Support source in EvalResult for model cards by @Wauplin in #1874

🛠️ Misc

Fetch all pull requests refs with list_repo_refs.

  • Add include_pull_requests to list_repo_refs by @Wauplin in #1822

Filter discussions when listing them with get_repo_discussions.

# List open PRs from "sanchit-gandhi" on model repo "openai/whisper-large-v3"
>>> from huggingface_hub import get_repo_discussions
>>> discussions = get_repo_discussions(
...     repo_id="openai/whisper-large-v3",
...     author="sanchit-gandhi",
...     discussion_type="pull_request",
...     discussion_status="open",
... )

New field createdAt for ModelInfo, DatasetInfo and SpaceInfo.

It's now possible to create an inference endpoint running on a custom docker image (typically: a TGI container).

# Start an Inference Endpoint running Zephyr-7b-beta on TGI
>>> from huggingface_hub import create_inference_endpoint
>>> endpoint = create_inference_endpoint(
...     "aws-zephyr-7b-beta-0486",
...     repository="HuggingFaceH4/zephyr-7b-beta",
...     framework="pytorch",
...     task="text-generation",
...     accelerator="gpu",
...     vendor="aws",
...     region="us-east-1",
...     type="protected",
...     instance_size="medium",
...     instance_type="g5.2xlarge",
...     custom_image={
...         "health_route": "/health",
...         "env": {
...             "MAX_BATCH_PREFILL_TOKENS": "2048",
...             "MAX_INPUT_LENGTH": "1024",
...             "MAX_TOTAL_TOKENS": "1512",
...             "MODEL_ID": "/repository"
...         },
...         "url": "ghcr.io/huggingface/text-generation-inference:1.1.0",
...     },
... )
  • Allow create inference endpoint from docker image by @Wauplin in #1861

Upload CLI: create branch when revision does not exist

  • Create branch if missing in hugginface-cli upload by @Wauplin in #1857

🖥️ Environment variables

huggingface_hub.constants.HF_HOME has been made a public constant (see reference).

Offline mode has become more consistent. If HF_HUB_OFFLINE is set, any HTTP call to the Hub will fail. The fallback mechanism in snapshot_download has been refactored to align with the hf_hub_download workflow. If offline mode is activated (or a connection error happens) and the files are already in the cache, snapshot_download returns the corresponding snapshot directory.
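
The fallback described above can be modeled as follows. This is a simplified sketch of the behavior, not the library's code: `resolve_snapshot` and its parameters are hypothetical, and the real function handles more cases (revisions, partial caches, other error types).

```python
from pathlib import Path
from typing import Callable


def resolve_snapshot(
    cached_dir: Path,
    download: Callable[[], Path],
    offline: bool,
) -> Path:
    """Return a snapshot directory, using the network unless offline."""
    if not offline:
        try:
            return download()
        except ConnectionError:
            pass  # connection error: fall through to the cache
    if cached_dir.is_dir():
        return cached_dir
    raise FileNotFoundError(
        f"no cached snapshot at {cached_dir} and the Hub cannot be reached"
    )
```

In offline mode `download` is never called, which mirrors the rule that no HTTP call is made when HF_HUB_OFFLINE is set.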

DO_NOT_TRACK environment variable is now respected to deactivate telemetry calls. This is similar to HF_HUB_DISABLE_TELEMETRY but not specific to Hugging Face.
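
A sketch of how the two opt-out variables could be checked together is shown below. The set of accepted truthy spellings and the helper name `telemetry_disabled` are assumptions for illustration, not the library's definition.

```python
import os
from typing import Mapping

# Assumed set of truthy spellings, compared case-insensitively.
TRUE_VALUES = {"1", "ON", "YES", "TRUE"}


def telemetry_disabled(env: Mapping[str, str] = os.environ) -> bool:
    """Return True if either opt-out variable is set to a truthy value."""
    return any(
        env.get(name, "").strip().upper() in TRUE_VALUES
        for name in ("HF_HUB_DISABLE_TELEMETRY", "DO_NOT_TRACK")
    )
```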

📚 Documentation

Doc fixes

💔 Breaking change

timeout parameter has been removed from list_repo_files, as part of a planned deprecation cycle.

Otherwise, breaking changes should not be expected in this release. Note that upload_file and upload_folder now return a CommitInfo dataclass instead of a str. Those two methods previously returned the URL of the uploaded file or folder on the Hub as a string. However, a plain string loses information that CommitInfo carries: commit id, commit title, description, author, etc. To keep the change backward compatible, the return type CommitInfo inherits from both dataclass and str. The plan is to switch to dataclass-only in release v1.0 (not planned yet).
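
The "dataclass that is also a str" pattern can be sketched as follows. `CommitInfoSketch` is a hypothetical stand-in with a reduced set of fields, and the URL is a placeholder; the real CommitInfo may differ in its fields and details.

```python
from dataclasses import dataclass


@dataclass
class CommitInfoSketch(str):
    """Hypothetical stand-in: a dataclass that also behaves like its URL string."""

    commit_url: str
    commit_message: str
    oid: str

    def __new__(cls, *args, commit_url: str, **kwargs):
        # The str value is fixed at creation time, so it is set in __new__;
        # the dataclass-generated __init__ then fills in the fields.
        return str.__new__(cls, commit_url)


info = CommitInfoSketch(
    commit_url="https://huggingface.co/user/repo/commit/abc123",  # placeholder URL
    commit_message="Upload model",
    oid="abc123",
)
print(info.startswith("https://"))  # True
print(info.oid)  # abc123
```

Existing code that treats the return value as a URL string keeps working, while new code can read the structured fields.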

Finally, HfFolder is now deprecated in favor of get_token, login and logout. The goal is to force users and integrations to use login/logout (instead of HfFolder.save_token/HfFolder.delete_token) which contain more checks and warning messages. The plan is to get rid of HfFolder in release v1.0 (not planned yet).

Small fixes and maintenance

⚙️ fixes

⚙️ internal

  • Prepare for v0.20.0 by @Wauplin in #1807
  • (nit) fix fsspec default mode by @Wauplin (direct commit on main)
  • Use ruff formatter in check_static_imports.py by @Wauplin in #1824
  • ruff formatte by @Wauplin (direct commit on main)
  • Check pydantic correct installation by @Wauplin in #1829
  • FIX ?? send ref in LFS endpoint by @Wauplin in #1838
  • Install doc-builder from source by @Wauplin in #1849
  • robustness by @Wauplin (direct commit on main)
  • style by @Wauplin (direct commit on main)
  • fix list_space_author test by @Wauplin (direct commit on main)
  • finally fix robustness? by @Wauplin (direct commit on main)
  • 4 parallel tests in repo CI instead of 8 to improve stability by @Wauplin (direct commit on main)
  • Remove delete_doc_comment.yaml and delete_doc_comment_trigger.yaml from CI by @Wauplin in #1887
  • skip flaky test by @Wauplin (direct commit on main)
  • Rerun flaky tests in CI by @Wauplin in #1914
  • Sentence Transformers test (soon) no longer expected to fail by @tomaarsen in #1918
  • flakyness by @Wauplin (direct commit on main)

Significant community contributions

The following contributors have made significant changes to the library over the last release: