Description
Describe the bug
When HF_HUB_OFFLINE=1 is set, download_model() does not properly use the local HuggingFace cache. Instead, the model_info() API call fails immediately with an EnvironmentError (offline mode), and fastembed falls back to downloading ~83MB from storage.googleapis.com. In air-gapped environments where Google Cloud Storage is also unreachable, fastembed cannot load models that are already present in the local cache.
Expected behavior
With HF_HUB_OFFLINE=1, fastembed should resolve models from the local HF cache (models--org--name/snapshots/...) without any network calls.
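To make the expected lookup concrete, here is a minimal sketch of how a cached snapshot could be located on disk without touching the network. It assumes the standard hub cache layout, where models--org--name/refs/main holds a commit hash and snapshots/<hash>/ holds the files. find_cached_snapshot is a hypothetical helper written for illustration; it is not part of fastembed or huggingface_hub.

```python
from pathlib import Path
from typing import Optional


def find_cached_snapshot(
    repo_id: str, cache_dir: Path, revision: str = "main"
) -> Optional[Path]:
    """Return the local snapshot directory for repo_id, or None if it is not cached."""
    # "org/name" is stored on disk as "models--org--name"
    repo_dir = cache_dir / ("models--" + repo_id.replace("/", "--"))

    # refs/<revision> stores the commit hash that names the snapshot folder
    ref_file = repo_dir / "refs" / revision
    if not ref_file.is_file():
        return None

    commit = ref_file.read_text().strip()
    snapshot = repo_dir / "snapshots" / commit
    return snapshot if snapshot.is_dir() else None
```

In practice, huggingface_hub's snapshot_download(repo_id, local_files_only=True) performs this kind of resolution; the sketch only shows that the information needed is already on disk.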
Actual behavior
- download_model() calls download_files_from_huggingface() with local_files_only=False
- Inside, model_info(hf_source_repo) is called; this is a network API call
- With HF_HUB_OFFLINE=1, huggingface_hub raises EnvironmentError: offline mode is enabled
- fastembed catches this and logs "Could not download model from HuggingFace... Falling back to other sources."
- Falls back to retrieve_model_gcs() → downloads ~83MB from storage.googleapis.com
- In air-gapped environments, GCS is also unreachable → complete failure
Note: on current main (v0.7.x), there is a local_files_only=True first pass before the retry loop. However, if that pass fails for any reason (e.g. missing metadata file), the retry loop still hits the network path described above.
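The failure path described above can be modeled with a few stubs. This is a simplified illustration of the control flow reported in this issue, not fastembed's actual code: fetch_from_hf, fetch_from_gcs, and load_model are hypothetical stand-ins, and the exception types are chosen to mirror the errors described above.

```python
import os


def fetch_from_hf(repo: str, local_files_only: bool) -> str:
    """Stand-in for the HuggingFace path: the metadata lookup needs the network."""
    if not local_files_only and os.environ.get("HF_HUB_OFFLINE") == "1":
        raise EnvironmentError("offline mode is enabled")
    return f"hf-cache:{repo}"


def fetch_from_gcs(repo: str) -> str:
    """Stand-in for the GCS fallback, which is unreachable when air-gapped."""
    raise ConnectionError("storage.googleapis.com is unreachable")


def load_model(repo: str, local_files_only: bool = False) -> str:
    try:
        return fetch_from_hf(repo, local_files_only)
    except EnvironmentError:
        # Mirrors the "Falling back to other sources." step from the log
        return fetch_from_gcs(repo)
```

With HF_HUB_OFFLINE=1 set, load_model("org/name") ends in the ConnectionError from the fallback, while load_model("org/name", local_files_only=True) returns from the cache path without reaching the fallback at all.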
Steps to reproduce
from fastembed import TextEmbedding
import os
# Step 1: Download the model (populates HF cache)
TextEmbedding("sentence-transformers/all-MiniLM-L6-v2")
# Step 2: Enable offline mode
os.environ["HF_HUB_OFFLINE"] = "1"
# Step 3: Try to load the same model — triggers GCS download instead of using local cache
TextEmbedding("sentence-transformers/all-MiniLM-L6-v2")
Environment
- fastembed 0.6.0 (pinned in container image) and 0.7.4 (current main)
- Deployed in NeMo Guardrails on Red Hat OpenShift AI
- Corporate air-gapped environment (both HuggingFace and Google Cloud Storage unreachable)
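An air-gapped environment like the one listed above can be approximated on a developer machine by making every outbound socket connection fail. The sketch below is one possible way to do that with the standard library only; no_network is a hypothetical helper, and it only intercepts socket.connect, so it is a rough guard rather than a full sandbox.

```python
import socket
from contextlib import contextmanager


@contextmanager
def no_network():
    """Make outbound socket connections raise, to simulate an air-gapped host."""
    original_connect = socket.socket.connect

    def blocked_connect(self, address):
        raise ConnectionError(f"network access blocked: {address!r}")

    socket.socket.connect = blocked_connect
    try:
        yield
    finally:
        # Always restore the real connect, even if the body raised
        socket.socket.connect = original_connect
```

Wrapping the second TextEmbedding(...) call from the reproduction steps in "with no_network():" should then surface any remaining network attempt as an error instead of a silent download. As far as I know, huggingface_hub reads HF_HUB_OFFLINE once, when it is first imported, so exporting the variable before starting Python is the safer way to reproduce.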
Fix
PR #614 — sets local_files_only=True in download_model() when HF_HUB_OFFLINE=1 is detected, ensuring zero network calls.
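The idea behind the fix can be sketched as follows. This is not the code from the PR: hf_offline_requested and resolve_local_files_only are hypothetical names, and the set of accepted truthy spellings is an assumption made for this example.

```python
import os
from typing import Mapping, Optional

# Assumption: spellings treated as "enabled"; the real library may differ
_TRUE_VALUES = {"1", "ON", "YES", "TRUE"}


def hf_offline_requested(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when HF_HUB_OFFLINE is set to a truthy value."""
    env = os.environ if environ is None else environ
    return env.get("HF_HUB_OFFLINE", "").strip().upper() in _TRUE_VALUES


def resolve_local_files_only(
    requested: bool, environ: Optional[Mapping[str, str]] = None
) -> bool:
    """Force cache-only resolution whenever offline mode is requested."""
    return requested or hf_offline_requested(environ)
```

A caller would pass the result as local_files_only to the download step and skip every network-backed fallback when it is True, so a cache miss fails fast instead of attempting a download.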