snapshot_download ignores 5xx errors during resume, returns unsafe cached data #3007

Open
@NearBirdEZ

Describe the bug

Description
When resuming a download, snapshot_download catches the following errors (raised, for example, during Hub downtime):
[requests.exceptions.ConnectionError, requests.exceptions.Timeout, huggingface_hub.errors.OfflineModeIsEnabled, requests.HTTPError]
Instead of raising, it falls back to the files already present locally and returns that path, even when the download is incomplete. This happens because the code prioritizes cache availability over the server communication status.
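To make the expected behavior concrete, here is a minimal, self-contained model of the fallback and of a stricter alternative. It is only a sketch and is not the huggingface_hub source: `resolve_snapshot`, `HubUnreachable`, the `fetch_repo_info` callable, and the `strict` flag are all hypothetical names invented for illustration.

```python
import os
from typing import Callable


class HubUnreachable(Exception):
    """Hypothetical error for this sketch; not part of huggingface_hub."""


def resolve_snapshot(
    local_dir: str,
    fetch_repo_info: Callable[[], object],
    strict: bool = False,
) -> str:
    """Model of the fallback: return local_dir when the server call fails.

    With strict=True the server error is surfaced instead, so the caller
    never mistakes a partial download for a complete snapshot.
    """
    try:
        fetch_repo_info()  # stands in for the repo metadata request
    except (ConnectionError, TimeoutError) as exc:
        if strict or not os.path.isdir(local_dir):
            raise HubUnreachable(f"cannot verify {local_dir!r}") from exc
        # Lenient path (the behavior reported here): nothing guarantees
        # that the files in local_dir form a complete snapshot.
        return os.path.abspath(local_dir)
    # Server reachable: a real implementation would download missing files here.
    return os.path.abspath(local_dir)
```

With a `fetch_repo_info` that always raises, the lenient call returns the existing directory, while `strict=True` raises `HubUnreachable`. The latter is the behavior this report argues should be available when resuming.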

Reproduction

import os
import random
from typing import Never, Type
from unittest.mock import patch

import huggingface_hub
import requests


class BadHfApi(huggingface_hub.HfApi):
    def repo_info(self, *args, **kwargs) -> Never:
        expected_errors: list[Type[Exception]] = [
            requests.exceptions.ConnectionError,
            requests.exceptions.Timeout,
            huggingface_hub.errors.OfflineModeIsEnabled,
            requests.HTTPError
        ]
        error_class: Type[Exception] = random.choice(expected_errors)
        print(f"Will be raise {error_class=}")
        raise error_class("test timeout from hf")


def main() -> None:
    print("Continuing the download")
    repo_id: str = "deepcogito/cogito-v1-preview-qwen-32B"
    revision: str = "118e6e43c46a51667f544dae9cbe4027f59eb697"
    local_path: str = f"models/deepcogito/cogito_v1_preview_qwen_32b/{revision}"
    print(f"Before download: {os.listdir(local_path)=}")
    with patch("huggingface_hub._snapshot_download.HfApi", new=BadHfApi):
        result: str = huggingface_hub.snapshot_download(
            repo_id=repo_id,
            local_dir=local_path,
            revision=revision
        )
    print(f"{result=}")
    print(f"After download attempt: {os.listdir(local_path)=}")



if __name__ == "__main__":
    main()

Logs

Continuing the download
Before download: os.listdir(local_path)=['added_tokens.json', 'model-00001-of-00014.safetensors', 'images', 'config.json', 'README.md', 'merges.txt', '.gitattributes', '.cache']
Will be raise error_class=<class 'requests.exceptions.Timeout'>
result='/Users/smith/work/hf_bug/models/deepcogito/cogito_v1_preview_qwen_32b/118e6e43c46a51667f544dae9cbe4027f59eb697'
After download attempt: os.listdir(local_path)=['added_tokens.json', 'model-00001-of-00014.safetensors', 'images', 'config.json', 'README.md', 'merges.txt', '.gitattributes', '.cache']
Returning existing local_dir `models/deepcogito/cogito_v1_preview_qwen_32b/118e6e43c46a51667f544dae9cbe4027f59eb697` as remote repo cannot be accessed in `snapshot_download` (test timeout from hf).

Process finished with exit code 0
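
The listing above contains only `model-00001-of-00014.safetensors`, yet the call returned normally. Until the library surfaces the error, a caller-side check along the following lines could flag the gap. This is a sketch under the assumption that shards follow the `<stem>-NNNNN-of-NNNNN.<ext>` naming seen in the log; `missing_shards` is a hypothetical helper, it cannot detect a set where no shard was downloaded at all, and it does not detect partially written files.

```python
import re

# Matches names such as "model-00001-of-00014.safetensors".
SHARD_RE = re.compile(
    r"^(?P<stem>.+)-(?P<idx>\d{5})-of-(?P<total>\d{5})(?P<ext>\.\w+)$"
)


def missing_shards(filenames: list[str]) -> list[str]:
    """Return the shard file names that are absent from `filenames`."""
    present: dict[tuple[str, int, str], set[int]] = {}
    for name in filenames:
        match = SHARD_RE.match(name)
        if match is None:
            continue  # not a sharded file (config.json, README.md, ...)
        key = (match["stem"], int(match["total"]), match["ext"])
        present.setdefault(key, set()).add(int(match["idx"]))

    missing: list[str] = []
    for (stem, total, ext), indices in sorted(present.items()):
        for idx in range(1, total + 1):
            if idx not in indices:
                missing.append(f"{stem}-{idx:05d}-of-{total:05d}{ext}")
    return missing
```

Passing `os.listdir(local_path)` from the reproduction to `missing_shards` would report shards 2 through 14 as missing, which a caller can treat as a failed download.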

System info

- huggingface_hub version: 0.30.2
- Platform: macOS-13.6.3-arm64-arm-64bit
- Python version: 3.12.0
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Running in Google Colab Enterprise ?: No
- Token path ?: /Users/smith/.cache/huggingface/token
- Has saved token ?: False
- Configured git credential helpers: osxkeychain
- FastAI: N/A
- Tensorflow: N/A
- Torch: N/A
- Jinja2: N/A
- Graphviz: N/A
- keras: N/A
- Pydot: N/A
- Pillow: N/A
- hf_transfer: N/A
- gradio: N/A
- tensorboard: N/A
- numpy: N/A
- pydantic: N/A
- aiohttp: N/A
- hf_xet: N/A
- ENDPOINT: https://huggingface.co
- HF_HUB_CACHE: /Users/smith/.cache/huggingface/hub
- HF_ASSETS_CACHE: /Users/smith/.cache/huggingface/assets
- HF_TOKEN_PATH: /Users/smith/.cache/huggingface/token
- HF_STORED_TOKENS_PATH: /Users/smith/.cache/huggingface/stored_tokens
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: False
- HF_HUB_ETAG_TIMEOUT: 10
- HF_HUB_DOWNLOAD_TIMEOUT: 10
