Add support for HuggingFace GGUF models in Ollama #3
Conversation
Updated OllamaInfo.is_model_available() to recognize HuggingFace GGUF model paths
I'm curious about your setup. I imagine you have uploaded your SSH key to Hugging Face. And when you run the model, does the inference happen on Hugging Face? Can you talk on the PlanExe Discord?
Hey! To download models from Hugging Face, you simply need a valid HF_TOKEN set as an environment variable locally. There's no need to use huggingface-cli or any other pip library; model inference is handled locally via Ollama rather than on Hugging Face’s servers. I'll try to join your Discord server when I can. Thank you for your attention!
I can't make sense of the […]. I have tried recreating your scenario, using […]. I was unaware that Ollama could fetch GGUFs directly from HF, thanks. I have updated the docs, roughly describing how to fetch GGUF models. Let me know if the docs can be improved further.
The […] I think there's a misunderstanding in your documentation. When using HuggingFace GGUF models with the […] For example: […]
The confusion might be because your […] To clarify the core functionality: Ollama downloads the model file from HuggingFace and then runs inference locally on the user's machine, not on HuggingFace's servers.
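To make the model reference concrete, here is a minimal sketch that splits an `hf.co/` reference into its parts. It assumes the `hf.co/{user}/{repository}:{quantization}` layout seen in this PR's example configuration; `parse_hf_gguf_reference` and `HfGgufRef` are hypothetical names for illustration and are not part of PlanExe or Ollama.

```python
from typing import NamedTuple, Optional

HF_PREFIX = "hf.co/"


class HfGgufRef(NamedTuple):
    """Parts of an hf.co model reference (hypothetical helper type)."""
    user: str
    repo: str
    quant: Optional[str]


def parse_hf_gguf_reference(model: str) -> Optional[HfGgufRef]:
    """Return the parts of an hf.co reference, or None if it is not one.

    Assumed layout: hf.co/{user}/{repository}[:{quantization}]
    """
    if not model.startswith(HF_PREFIX):
        return None
    # Split off the optional ":quantization" suffix.
    path, _, quant = model[len(HF_PREFIX):].partition(":")
    parts = path.split("/")
    if len(parts) != 2 or not all(parts):
        return None
    return HfGgufRef(user=parts[0], repo=parts[1], quant=quant or None)


ref = parse_hf_gguf_reference(
    "hf.co/lmstudio-community/Qwen2.5-7B-Instruct-1M-GGUF:Q6_K"
)
print(ref)
```

The reference only names where the GGUF file comes from. Once Ollama has fetched it, inference runs on the local machine.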
I would like to talk with you on Discord about your fix. In particular why […]. I have updated the docs with your recommendations. Thank you.
Phase 1 (Critical Security):
- Fix SECRET_KEY validation to detect both 'your-secret-key' AND 'dev-secret-key' defaults
- Fail hard in production (when FLASK_ENV=production or PLANEXE_PUBLIC_BASE_URL set)
- Add session cookie security flags (SECURE, HTTPONLY, SAMESITE=Lax)
- Update .env examples with SECRET_KEY generation command

Phase 2 (Error Handling & UX):
- Wrap OAuth callback in try/except for better error handling
- Add profile field validation with clear error messages
- Log warning when OAuth profile missing email
- Update login.html to display error messages

Addresses Issues #1, #3, #5, PlanExeOrg#6, PlanExeOrg#7 from OAUTH_ANALYSIS.md
…hip-set Updates two docs to reflect the post-PlanExeOrg#753 state of the napkin-math pipeline. methology.md: describe the current pipeline behaviour — two-batch compress with paraphrase-tolerant quote match and cross-bucket promoter; extract's source-arithmetic preservation, threshold-pairing, and dropped_signals field; 19-check validator (added aggregate_not_bounded, requirement_has_margin, dropped_signals_schema); bounds' asymmetric source label on commitment defaults, calculation-output strip, reserved correlations block, reserved lognormal/pert disciplines with loud NotImplementedError; advisory audit_source_preservation.py step. 20260520_plan.md → 20260522_plan.md: bump status date; mark PR PlanExeOrg#750 merged; add PR PlanExeOrg#751/PlanExeOrg#752/PlanExeOrg#753 entries (proposal 141 implementation); update Phase status table (added 4.5 audit row, reclassified Phase 8 as partially done, Phase 10 marked done for current ship-set); add v58 14-plan empirical snapshot (1 viable / 5 fragile / 8 doom); reorder Next likely move now that proposal 141 has shipped — Phase 5 citation verifier promoted to PlanExeOrg#1, Phase 8 samplers added as PlanExeOrg#2 with v58 cases that bite now, Phase 9 composite-band cap as PlanExeOrg#3. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add support for HuggingFace GGUF models in Ollama
Description
This PR adds support for running GGUF models directly from HuggingFace through Ollama using the hf.co/ prefix in model configurations. This enables users to leverage a wider range of models without requiring manual model installation.

Changes
- Updated OllamaInfo.is_model_available() to recognize HuggingFace GGUF model paths

Example Configuration
{
    "lmstudio-qwen2.5-7b-instruct-1m-gguf": {
        "comment": "This runs on your own computer via Ollama using GGUF models from HuggingFace.",
        "class": "Ollama",
        "arguments": {
            "model": "hf.co/lmstudio-community/Qwen2.5-7B-Instruct-1M-GGUF:Q6_K",
            "temperature": 0.5,
            "request_timeout": 120.0,
            "is_function_calling_model": false
        }
    }
}
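As a rough illustration of the change, the sketch below reads a trimmed copy of the example configuration and applies an availability check that is aware of the `hf.co/` prefix. The PR's actual `OllamaInfo.is_model_available()` logic is not shown in this description, so `is_model_available_sketch` is a hypothetical stand-in. It assumes that an `hf.co/` reference counts as available because Ollama can fetch it, and the list of installed models is made up.

```python
import json

# Trimmed copy of the example configuration from this PR description.
EXAMPLE_CONFIG = """
{
  "lmstudio-qwen2.5-7b-instruct-1m-gguf": {
    "class": "Ollama",
    "arguments": {
      "model": "hf.co/lmstudio-community/Qwen2.5-7B-Instruct-1M-GGUF:Q6_K"
    }
  }
}
"""


def is_model_available_sketch(model_name: str, installed_names: set) -> bool:
    """Hypothetical stand-in for an hf.co-aware availability check.

    Assumption: a model is available if it is already installed locally,
    or if its name uses the hf.co/ prefix (Ollama can download it).
    """
    if model_name in installed_names:
        return True
    return model_name.startswith("hf.co/")


config = json.loads(EXAMPLE_CONFIG)
installed = {"llama3.1:latest"}  # made-up list of locally installed models

for config_id, entry in config.items():
    model = entry["arguments"]["model"]
    print(config_id, is_model_available_sketch(model, installed))
```

A stricter variant could also require that the reference has both a user and a repository segment before accepting it.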