fix: persist reranker ONNX cache to ~/.cache/fastembed #19
Closed
joyson-fernandes wants to merge 1 commit into lyonzin:master
Before this change TextCrossEncoder was instantiated without a cache_dir, so FastEmbed defaulted to a temp directory. macOS periodically purges it, which meant that any run with HF_HUB_OFFLINE=1 couldn't find the ONNX weights and failed with "Could not load model Xenova/ms-marco-MiniLM-L-6-v2 from any source."

TextEmbedding already had this fix applied on line 145; the reranker just never got the same treatment when it was added. Match the embedder's cache_dir so both live at ~/.cache/fastembed/ and survive macOS tmp cleaning.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
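A minimal sketch of the change described in this commit message, assuming the reranker is built through a small helper. `durable_cache_dir` and `build_reranker` are hypothetical names that do not come from the repository. The `cache_dir` keyword and the `fastembed.rerank.cross_encoder` import path reflect FastEmbed's API as I understand it, so check them against the installed version.

```python
from pathlib import Path
from typing import Optional, Union


def durable_cache_dir() -> Path:
    """Return ~/.cache/fastembed, a location outside the system temp directory."""
    return Path.home() / ".cache" / "fastembed"


def build_reranker(
    model_name: str = "Xenova/ms-marco-MiniLM-L-6-v2",
    cache_dir: Optional[Union[str, Path]] = None,
):
    """Hypothetical helper: create the reranker with an explicit cache_dir."""
    # Imported lazily so this module can be loaded without FastEmbed installed.
    from fastembed.rerank.cross_encoder import TextCrossEncoder

    target = Path(cache_dir) if cache_dir is not None else durable_cache_dir()
    target.mkdir(parents=True, exist_ok=True)
    # Use the same directory as the embedder so both models share one durable cache.
    return TextCrossEncoder(model_name=model_name, cache_dir=str(target))
```

Calling `build_reranker()` with no arguments would place the ONNX weights under `~/.cache/fastembed`. Passing `cache_dir` explicitly keeps the helper usable in tests that point at a scratch directory.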
Owner
Hey @joyson-fernandes, thanks for taking the time to report this! You're right that the macOS [...] That said, we'll handle the fix internally since we want to apply it a bit differently. Closing this one out, but appreciate the heads-up.
lyonzin pushed a commit that referenced this pull request on Apr 20, 2026:
CrossEncoderReranker was not passing cache_dir to TextCrossEncoder, causing model re-download on macOS where $TMPDIR is periodically purged. Aligns with TextEmbedding which already uses config.models_cache_dir. Closes #19
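To illustrate why a cache under the temp directory is fragile: Python's `tempfile.gettempdir()` follows `$TMPDIR`, so anything stored beneath it inherits the OS cleanup policy. The sketch below uses only the standard library. `default_temp_dir` is a hypothetical helper name, and whether FastEmbed derives its default cache path from this exact call is an assumption based on the behavior described in the commit message.

```python
import os
import tempfile


def default_temp_dir(env_tmpdir: str) -> str:
    """Return the temp dir Python resolves when TMPDIR is set to env_tmpdir."""
    old_env = os.environ.get("TMPDIR")
    old_cached = tempfile.tempdir
    try:
        os.environ["TMPDIR"] = env_tmpdir
        tempfile.tempdir = None  # drop the cached value so TMPDIR is re-read
        return tempfile.gettempdir()
    finally:
        # Restore the previous state so the caller's environment is untouched.
        if old_env is None:
            os.environ.pop("TMPDIR", None)
        else:
            os.environ["TMPDIR"] = old_env
        tempfile.tempdir = old_cached
```

The directory passed in must exist and be writable, otherwise `tempfile` falls back to its next candidate. An explicit `cache_dir` sidesteps this lookup entirely.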
lyonzin added a commit that referenced this pull request on Apr 20, 2026:
CrossEncoderReranker was not passing cache_dir to TextCrossEncoder, causing model re-download on macOS where $TMPDIR is periodically purged. Aligns with TextEmbedding which already uses config.models_cache_dir. Closes #19
Summary
- Pass cache_dir=~/.cache/fastembed to TextCrossEncoder so the reranker ONNX model is cached in a durable location instead of the default system temp directory.
- Matches the embedder's existing setting (TextEmbedding, line 145).

Why
On macOS the default FastEmbed cache dir resolves to $TMPDIR, which the OS periodically purges. Combined with HF_HUB_OFFLINE=1 (recommended for startup stability), any purge between runs causes the load to fail with "Could not load model Xenova/ms-marco-MiniLM-L-6-v2 from any source."

When the reranker was first added it missed the cache_dir kwarg that TextEmbedding already uses. This PR makes both consistent so both models survive tmp cleaning.

Test plan
- rm -rf ~/.cache/fastembed/models--Xenova--ms-marco-MiniLM-L-6-v2
- HF_HUB_OFFLINE=1 → reranker now loads (previously failed)
- search_knowledge returns reranked results as before

🤖 Generated with Claude Code
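The test plan can be backed by a small pre-flight check that looks for the model folder before starting in offline mode. This is a sketch under stated assumptions: `cached_model_dir` and `is_model_cached` are hypothetical helpers, and the `models--<org>--<name>` folder naming is inferred from the path used in the test plan rather than from the repository's code.

```python
from pathlib import Path
from typing import Union


def cached_model_dir(cache_dir: Union[str, Path], repo_id: str) -> Path:
    """Map a repo id such as 'Xenova/ms-marco-MiniLM-L-6-v2' to its cache folder."""
    return Path(cache_dir).expanduser() / ("models--" + repo_id.replace("/", "--"))


def is_model_cached(cache_dir: Union[str, Path], repo_id: str) -> bool:
    """Return True if the model folder exists and contains at least one entry."""
    model_dir = cached_model_dir(cache_dir, repo_id)
    return model_dir.is_dir() and any(model_dir.iterdir())
```

A caller could run `is_model_cached("~/.cache/fastembed", "Xenova/ms-marco-MiniLM-L-6-v2")` at startup and only set HF_HUB_OFFLINE=1 when it returns True, which avoids the load failure on a freshly purged cache.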