Description
System Info
- `transformers` version: 4.57.1
- Platform: Linux-6.6.84.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
- Python version: 3.13.0
- Huggingface_hub version: 0.36.0
- Safetensors version: 0.6.2
- Accelerate version: not installed
- Accelerate config: not found
- DeepSpeed version: not installed
- PyTorch version (accelerator?): 2.9.1+cpu (NA)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?: No
Who can help?
@ydshieh I have created a reproducible example of the issue I mentioned in #41311 (comment).
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Reproducible example: https://github.com/fr1ll/HF_HUB_OFFLINE
Warming the cache in a subprocess, then disabling sockets, then loading the same model should work.
Instead, the load attempts to open a network socket and fails with "Can't load" errors.
The script `subprocess-warm_then_offline-load.py` reproduces this failure.
Interestingly, warming the cache in-process, then disabling sockets, then loading the same model works.
This is reproduced by `inprocess-warm_then_offline-load.py` in the repo above.
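For reference, the "disable sockets" step in the scripts above can be sketched with a stdlib-only guard. This is an illustrative monkeypatch, not necessarily the exact mechanism the linked repo uses; the `BlockedSocket` name is made up here:

```python
import socket

def disable_sockets():
    """Replace socket.socket so any attempted network access raises.

    Anything that later tries to open a connection (e.g. an HTTP HEAD
    request to the Hub) will fail loudly instead of silently going online.
    """
    class BlockedSocket(socket.socket):
        def __init__(self, *args, **kwargs):
            raise RuntimeError("network access is disabled for this test")

    socket.socket = BlockedSocket

disable_sockets()

# Any socket creation after this point raises RuntimeError.
try:
    socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    blocked = False
except RuntimeError:
    blocked = True
print(blocked)
```

With a guard like this installed after warming the cache, a fully cache-backed model load should complete without ever hitting the `RuntimeError`.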
Expected behavior
When a model is already present in the local cache ("warm cache") and `HF_HUB_OFFLINE="1"` is set, a Transformers pipeline should be able to load the model without accessing any network sockets.