HF2Ollama is an all-in-one toolkit that bridges the gap between Hugging Face and Ollama. It lets you download models, convert them to GGUF format with advanced quantization (including Q4_K_M and MXFP4_MOE), and register them directly with Ollama, via both a CLI and a modern web UI.
- 📥 One-Click Download: Easily download models from Hugging Face Hub.
- 🔄 Advanced Quantization:
  - Supports standard conversions (f16, f32).
  - Auto-setup: Automatically downloads `llama-quantize` binaries for your OS.
  - 2-Step Quantization: Supports high-performance formats like `q4_k_m` and `mxfp4_moe`.
- 🔐 User Authentication:
  - Built-in login & signup system with bcrypt password hashing.
  - Admin approval system for new users.
  - User management dashboard with approval/rejection controls.
- 🖥️ Modern Web UI:
  - Dark-themed, developer-friendly interface.
  - Real-time terminal-style logging.
  - File Manager: Manage/delete large model files directly from the browser.
- 🚀 Seamless Integration: Register converted GGUF models to Ollama with a single click.
- Clone the repository:

  ```bash
  git clone https://github.com/your-username/hf2ollama.git
  cd hf2ollama
  ```

- Set up a virtual environment (recommended):

  ```bash
  # Windows
  python -m venv .venv
  .\.venv\Scripts\activate

  # Mac/Linux
  python3 -m venv .venv
  source .venv/bin/activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- (Optional) Configure environment: create a `.env` file to set your Hugging Face token (for private models), change the Web UI port, or set a custom secret key:

  ```env
  HF_TOKEN=your_token_here
  FLASK_PORT=5000
  SECRET_KEY=your-secret-key-here-change-in-production
  ```
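At runtime these settings are read from environment variables. A minimal sketch of how that typically looks (the variable names match the `.env` example above, but the defaults and structure here are assumptions, not HF2Ollama's actual code):

```python
import os

# Illustrative sketch only -- HF2Ollama's real settings code may differ.
# HF_TOKEN is optional: without it, only public models can be downloaded.
hf_token = os.environ.get("HF_TOKEN")

# FLASK_PORT and SECRET_KEY fall back to assumed defaults when unset.
port = int(os.environ.get("FLASK_PORT", "5000"))
secret_key = os.environ.get("SECRET_KEY", "change-me-in-production")

print(f"Web UI will listen on port {port}")
```

Tools like `python-dotenv` load the `.env` file into the process environment before code like this runs.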
HF2Ollama includes a built-in user authentication system to secure your instance.
- Start the application and navigate to `http://localhost:5000`.
- You'll be redirected to the Sign In page.
- Click "Don't have an account? Sign up" to create your first account.
- The first user automatically becomes an admin and can access all features immediately.
- Admin Dashboard: Click the "Admin" button in the navbar (only visible to admins) to manage users.
- User Approval: New users must be approved by an admin before they can sign in.
- Admin Features:
  - View all registered users
  - Approve pending users
  - Reject or delete users
- Passwords are hashed with bcrypt before storage; plaintext passwords are never kept.
- Each session is managed securely using Flask's built-in session management.
- The database is stored locally in `instance/app.db` (included in `.gitignore`).
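The approval flow described above can be sketched roughly as follows. This is an illustrative model only, using `sqlite3` and an assumed schema — not HF2Ollama's actual code:

```python
import sqlite3

# Illustrative sketch: first registered user becomes an approved admin;
# later users are created pending and must be approved by an admin.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        username TEXT UNIQUE NOT NULL,
        password_hash TEXT NOT NULL,
        is_admin INTEGER NOT NULL DEFAULT 0,
        is_approved INTEGER NOT NULL DEFAULT 0
    )
""")

def signup(conn, username, password_hash):
    # The first user is auto-admin and auto-approved; everyone else waits.
    first = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0] == 0
    conn.execute(
        "INSERT INTO users (username, password_hash, is_admin, is_approved)"
        " VALUES (?, ?, ?, ?)",
        (username, password_hash, int(first), int(first)),
    )
    conn.commit()

signup(conn, "alice", "<bcrypt-hash>")  # first user -> admin, approved
signup(conn, "bob", "<bcrypt-hash>")    # later user -> pending approval
rows = conn.execute(
    "SELECT username, is_admin, is_approved FROM users ORDER BY id"
).fetchall()
print(rows)  # [('alice', 1, 1), ('bob', 0, 0)]
```

The sign-in check then simply refuses users whose `is_approved` flag is still 0.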
The easiest way to use HF2Ollama is via the Web UI.
- Windows: double-click `start_web.bat`.
- Mac/Linux: run `./start_web.sh`.
- Open your browser at `http://localhost:5000`.
- Sign up as your first user (you'll automatically become an admin).
- Click "Install Tools" in the navbar to automatically set up quantization binaries.
- Enjoy downloading, converting, and creating models!
You can also use the command-line interface for automation.
Download required binaries (`llama-quantize`) for advanced quantization:

```bash
python main.py install-tools
```

Download a model from Hugging Face:

```bash
python main.py download "HuggingFaceTB/SmolLM2-135M"
```

Convert to GGUF. 2-step quantization is applied automatically if `install-tools` was run:

```bash
# Basic f16 conversion
python main.py convert "./models/HuggingFaceTB--SmolLM2-135M"

# Advanced 4-bit quantization (requires install-tools)
python main.py convert "./models/HuggingFaceTB--SmolLM2-135M" --out-type q4_k_m

# Force overwrite existing files
python main.py convert "./models/HuggingFaceTB--SmolLM2-135M" --force
```

Register the converted GGUF with Ollama:

```bash
python main.py create "smollm2" --gguf-path "./gguf/HuggingFaceTB--SmolLM2-135M-q4_k_m.gguf"
```

Run your model:

```bash
ollama run smollm2
```
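For scripted automation, the CLI steps above can be chained from Python. A minimal sketch — the `pipeline` helper and its injectable `run` parameter are illustrative, not part of HF2Ollama:

```python
import subprocess

def pipeline(run=subprocess.run):
    """Run the HF2Ollama CLI steps above in order, stopping on the first failure."""
    steps = [
        ["python", "main.py", "install-tools"],
        ["python", "main.py", "download", "HuggingFaceTB/SmolLM2-135M"],
        ["python", "main.py", "convert",
         "./models/HuggingFaceTB--SmolLM2-135M", "--out-type", "q4_k_m"],
        ["python", "main.py", "create", "smollm2",
         "--gguf-path", "./gguf/HuggingFaceTB--SmolLM2-135M-q4_k_m.gguf"],
    ]
    for cmd in steps:
        # check=True raises CalledProcessError if a step fails,
        # so later steps never run against a broken intermediate state.
        run(cmd, check=True)
    return steps
```

Calling `pipeline()` executes each step with `subprocess.run(check=True)`; passing a custom `run` makes the sequence easy to dry-run or log.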