A fullscreen TUI and headless CLI that turns any Hugging Face GGUF repository into a running, hardware-tuned Ollama model — then lets you back it up, restore it, or hand it off to Claude Code, Codex, Copilot CLI, OpenCode, Droid, and seven more integrations without leaving your shell.
- Why
- Install
- Quick start
- Features
- CLI reference
- Keyboard reference
- Configuration
- Architecture
- Development
- Privacy, security, cost
- Roadmap
- Author
- License
## Why

Running open-weight LLMs locally via Ollama is powerful but punishingly repetitive:
- Browse Hugging Face. Pick a repo. Eyeball the quant table.
- Guess which quant fits your VRAM. Read the card for temperature / top_p hints.
- Download the `.gguf`. Hand-write a `Modelfile`. `ollama create` it.
- Tune env vars (`OLLAMA_FLASH_ATTENTION`, `OLLAMA_KV_CACHE_TYPE`, …) differently on every OS.
- Back it up. Restore it. Manage a dozen tags. Hand it off to Claude Code or Codex.
hfo collapses that into a single session. It reads your rig, scores every quant in the repo
against your VRAM and RAM, parses the model card for recommended sampling parameters, and — on the
happy path — installs a ready-to-run model with one Enter keystroke. Every TUI capability has a
non-interactive flag so you can script the same pipeline from CI, cron, or an editor plugin.
It is written in strict TypeScript, runs on Windows / macOS / Linux, talks only to the public Hugging Face API and your local Ollama daemon, and ships as both an npm package and standalone binaries for each OS.
Two entry points share the same engine — the fullscreen TUI for interactive exploration and headless flags for scripts, cron, and editor plugins. Both shells read the same settings file, write the same backup format, and invoke the same Ollama commands underneath.
```mermaid
flowchart LR
  A[hfo CLI] --> B{argv?}
  B -->|--view, --list, --backup,<br/>--restore, --tune, --delete,<br/>--launch| C[Headless dispatch]
  B -->|no flag, or URL / --tab| D[Fullscreen TUI]
  C --> E[Plain stdout + exit code]
  D --> F[Alt-screen buffer]
  F --> G[Dashboard · Models · Install ·<br/>Tune · Help · Settings]
  G --> H[Optional handoff to<br/>ollama launch<br/>on exit]
  C -.reads.-> S[(settings.json)]
  D -.reads + writes.-> S
  C -.calls.-> O[(Ollama daemon)]
  D -.calls.-> O
  C -.GET.-> HF[(huggingface.co public API)]
  D -.GET.-> HF
```
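The dispatch rule is simple enough to sketch. The flag names below are the real ones from the CLI reference; the function itself and its name are illustrative, not hfo's actual code:

```typescript
// Sketch of the entry-point split: headless flags short-circuit to plain-stdout
// handlers; anything else (no args, a repo URL, or --tab) opens the fullscreen
// TUI. `pickMode` is a hypothetical helper, not part of hfo's source.
const HEADLESS_FLAGS = new Set([
  "--view", "--list", "--backup", "--restore", "--tune", "--delete", "--launch",
]);

type Mode = "headless" | "tui";

function pickMode(argv: string[]): Mode {
  return argv.some((arg) => HEADLESS_FLAGS.has(arg)) ? "headless" : "tui";
}
```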
## Install

```sh
# macOS / Linux — detects OS + arch, installs to /usr/local/bin or ~/.local/bin.
curl -fsSL https://hfo.carrillo.app/install.sh | sh
```

```powershell
# Windows PowerShell — installs to %LOCALAPPDATA%\Programs\hfo and adds it
# to your user PATH. No admin prompt required.
irm https://hfo.carrillo.app/install.ps1 | iex
```

Both scripts always pull the latest GitHub Release binary for your platform. Override with `HFO_VERSION=v0.2.0` (or `$env:HFO_VERSION`) to pin a specific tag, or `HFO_INSTALL_DIR=/some/dir` to relocate.
Binaries are built and published only when a `v*.*.*` tag is pushed, never on regular commits. Every-commit CI runs only typecheck, lint, test, and build.
Pick the one you already have. All four ship the same `hfo-cli` package from the npm registry.

```sh
npm i -g hfo-cli
pnpm add -g hfo-cli
yarn global add hfo-cli
bun add -g hfo-cli
```

Requires Node.js ≥ 20. Ollama is optional — hfo will install it for you via winget, brew, or the official shell script if it isn't already on your PATH.
Each tagged release on GitHub publishes self-contained executables you can drop straight onto your PATH:
| Asset | OS · Arch |
|---|---|
| `hfo-linux-x64` | Linux x64 |
| `hfo-linux-arm64` | Linux arm64 |
| `hfo-macos-x64` | macOS x64 (Intel) |
| `hfo-macos-arm64` | macOS arm64 (Apple Silicon) |
| `hfo-win-x64.exe` | Windows x64 |
Download directly from the Releases page, make it executable, and put it on your PATH — no Node runtime required. The one-liner scripts above wrap exactly this flow for you.
## Quick start

```sh
# Explore — fullscreen dashboard: hardware, capacity, picks, live metrics
hfo

# Install a specific repo (opens the Install tab pre-filled)
hfo bartowski/Llama-3.2-3B-Instruct-GGUF

# Headless view: hardware profile, capacity score, tier, picks
hfo --view

# One-keystroke install
hfo bartowski/Qwen2.5-Coder-7B-Instruct-GGUF --code
# (the new quick-confirm screen auto-accepts with Enter)

# Launch a coding agent wired to a local model
hfo --launch claude --model llama3.1:8b

# Back up a model folder
hfo --backup qwen2-5-coder-7b:q4-k-m

# Restore it later
hfo --restore ~/.config/hfo/backups/2026-04-23_11-48-25/qwen2-5-coder-7b_q4-k-m.zip
```

On the happy path a new install is three keystrokes: pick the quant, press Enter, wait for the download. Conflicts and customisations branch off into their own screens — no mandatory detours when there's nothing to decide.
```mermaid
stateDiagram-v2
  [*] --> Probing
  Probing --> OllamaMissing: ollama absent
  OllamaMissing --> OfferTune: installed + reachable
  Probing --> OfferTune: ollama ok
  OfferTune --> LoadingRepo
  LoadingRepo --> Scored: repo + card + quant scores
  Scored --> QuantsPicked: user selects quant(s)
  state if_conflict <<choice>>
  QuantsPicked --> if_conflict
  if_conflict --> QuickConfirm: single quant, no conflicts
  if_conflict --> Reviewing: multi-quant OR tag/file conflicts
  QuickConfirm --> Processing: Enter (recommended defaults)
  QuickConfirm --> ChangeDir: O (file browser)
  QuickConfirm --> ParamsEditing: C (customize)
  QuickConfirm --> Scored: Esc
  ChangeDir --> QuickConfirm: new dir chosen
  Reviewing --> Processing: confirm plans
  ParamsEditing --> Processing: save
  Processing --> Installed
  Installed --> Done: launch or skip
  Done --> [*]
```
## Features

- Fullscreen TUI with six tabs. Dashboard (live GPU / VRAM / RAM meters + loaded models + capacity tier), Models (installed + orphan-available-to-reinstall + loaded status), Install (quant picker + quick-confirm + file browser + plan reviewer + parameter editor), Tune (default vs suggested vs current env table), Help, and Settings.
- Hardware scoring. Every GGUF is graded 0–100 against your usable VRAM (GPU reserve subtracted) and RAM headroom, with per-quant labels like `Full GPU`, `Partial 87%`, `CPU-heavy`.
- HF model-card parameters. The README is parsed for recommended `temperature`, `top_p`, `top_k`, `repeat_penalty`, `min_p`, and context size — overlaid onto your hardware-tuned defaults, with a star marker on rows the card contributed.
- Quick-confirm install. New in this release: if the picked quant has no tag or directory conflicts, the install collapses to a single "Enter to install / O to change dir / C to customize" screen. Three-keystroke flow end to end.
- Orphan reinstall. Tags removed from Ollama but whose GGUFs still live on disk appear in a separate "Available to reinstall" section. `I` regenerates the Modelfile from the GGUF alone using your hardware + HF card and re-registers with Ollama.
- Zip backup + restore. Level-9 compression, streaming (multi-GB safe), metadata sidecar. `B` zips a model; `hfo --restore <zip>` extracts and re-registers, regenerating the Modelfile if needed.
- Deep delete. `d` removes the Ollama tag only; `Alt+d` also wipes the tracked directory.
- Launch integrations. Press `L` on any model to run `ollama launch <integration>` with that model as the backend. Supports Claude Code, Cline, Codex, Copilot CLI, Droid, Hermes, Kimi, OpenCode, OpenClaw, Pi, and VS Code, with runtime probing against `ollama launch --help` to mark unsupported targets.
- ~90% capacity tuner. Side-by-side default / suggested / current for `OLLAMA_FLASH_ATTENTION`, `OLLAMA_KV_CACHE_TYPE`, `OLLAMA_KEEP_ALIVE`, `OLLAMA_NUM_PARALLEL`, `OLLAMA_MAX_LOADED_MODELS`, `OLLAMA_MAX_QUEUE`. Persists via `setx` on Windows, `launchctl setenv` + `~/.zprofile` on macOS, and `~/.profile` + a `systemd` override on Linux.
- Seven themes, 20 languages. Dark, Light, Dracula, Solarized Dark/Light, Nord, Monokai. Translations in English, Spanish, Chinese, Hindi, Arabic, Portuguese, Bengali, Russian, Japanese, German, French, Korean, Italian, Turkish, Vietnamese, Indonesian, Polish, Dutch, Thai, and Ukrainian. Live language switching from the Settings tab.
- Reusable searchable dropdown. Themes, languages, and sort-order pickers all use the same keyboard-driven component — type to filter, arrow to nav, Enter to select.
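As a rough mental model of the 0–100 grading (the real `scoring.ts` weighs more signals; the thresholds and weights below are invented for illustration):

```typescript
// Illustrative quant-vs-hardware score — NOT hfo's actual scoring.ts logic.
// A quant that fits entirely in usable VRAM scores highest; partial offload
// scales with the GPU-resident fraction; anything that overflows VRAM + RAM
// is graded as CPU-heavy.
interface QuantScore {
  score: number; // 0-100
  label: string; // e.g. "Full GPU", "Partial 80%", "CPU-heavy"
}

function scoreQuant(quantBytes: number, usableVram: number, freeRam: number): QuantScore {
  if (quantBytes <= usableVram) return { score: 100, label: "Full GPU" };
  if (quantBytes <= usableVram + freeRam) {
    const pct = Math.round((usableVram / quantBytes) * 100);
    return { score: Math.round(40 + 0.5 * pct), label: `Partial ${pct}%` };
  }
  return { score: 10, label: "CPU-heavy" };
}
```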
## CLI reference

Every TUI capability ships a headless flag equivalent that prints plain text and exits with a non-zero status on failure.
| Command | Description |
|---|---|
| `hfo` | Open the fullscreen TUI (Dashboard tab) |
| `hfo <hf-url-or-repo>` | Open the TUI pre-filled on the Install tab |
| `hfo --help` / `-h` | Print the full flag reference |
| `hfo --version` | Print `hfo vX.Y.Z · MIT · <author>` |
| `hfo --view` | Print hardware profile, capacity score, tier, picks, pre-filtered HF search URLs |
| `hfo --list` | List installed models + orphaned GGUFs on disk |
| `hfo --tune` | Persist the ~90% capacity Ollama env profile, then restart the daemon |
| `hfo --backup <tag>` | Create a `.zip` backup of the model's folder with level-9 compression |
| `hfo --restore <zip>` | Extract a backup, regenerate the Modelfile if needed, re-register with Ollama |
| `hfo --delete <tag>` | Remove the tag from Ollama (shallow) |
| `hfo --delete <tag> --deep` | Remove the tag AND delete its on-disk folder |
| `hfo --launch <integration>` | Run `ollama launch <integration>` with the given model as backend |
| `hfo --tab <name>` | Open the TUI on dashboard · models · install · tune · help · settings |
Additional flags:
| Flag | Description |
|---|---|
| `--dir`, `-d` | Base directory for the install file browser (default: `settings.modelDir`) |
| `--token`, `-t` | Hugging Face token for gated/private repos (or set `$HF_TOKEN`) |
| `--code`, `-c` | Mark the install as code-specialized (adjusts SYSTEM prompt) |
| `--ctx` | Force context size (default: auto — 8192 if it fits in VRAM, else 4096) |
| `--model` | Model tag to pass to `--launch` |
| `--deep` | Deep-delete alongside `--delete` |
| `--no-fullscreen` | Disable the alt-screen buffer (helpful on legacy terminals) |
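Because every headless command prints plain text and signals failure through its exit code, the table above composes into scripts. A minimal Node sketch (the flags are the real ones documented above; the helper functions are hypothetical):

```typescript
// Drives hfo's documented headless flags from Node. The argument builder is
// kept separate from the runner so it can be tested without hfo installed.
import { execFileSync } from "node:child_process";

function hfoArgs(op: "backup" | "restore" | "delete", target: string, deep = false): string[] {
  const args = [`--${op}`, target];
  if (op === "delete" && deep) args.push("--deep"); // deep-delete per the flags table
  return args;
}

// Returns hfo's plain-text stdout; execFileSync throws on a non-zero exit,
// matching the documented headless failure contract.
function runHfo(args: string[]): string {
  return execFileSync("hfo", args, { encoding: "utf8" });
}
```

For example, a nightly cron job could call `runHfo(hfoArgs("backup", "qwen2-5-coder-7b:q4-k-m"))` and rely on the thrown error to surface failures.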
## Keyboard reference

| Scope | Keys | Action |
|---|---|---|
| Global | `1`–`6` | Jump to a tab |
| Global | `Tab` / `Shift+Tab` | Cycle tabs |
| Global | `,` | Open Settings |
| Global | `?` | Open Help |
| Global | `Ctrl+H` | Open the author's homepage |
| Global | `q` · `Ctrl+C` | Quit (restores scrollback) |
| Dashboard | `Alt+↑` · `Alt+↓` | Focus / cycle panels |
| Dashboard | `Esc` · `b` | Return to the 4-panel grid |
| Dashboard | `Enter` | Open the selected link in the default browser |
| Dashboard | `r` | Force refresh |
| Models | `↑` `↓` | Navigate rows |
| Models | `L` / `G` | Launch integration with selected model / Launch menu without preset |
| Models | `I` | Reinstall an orphan (regenerates Modelfile if missing) |
| Models | `B` | Create a `.zip` backup |
| Models | `d` / `Alt+d` | Remove tag / Remove tag AND delete files |
| Models | `r` | Refresh |
| Install (quick-confirm) | `Enter` | Install now with recommended defaults |
| Install (quick-confirm) | `O` | Change target directory |
| Install (quick-confirm) | `C` | Customize Modelfile parameters |
| Install (quick-confirm) | `Esc` | Back to the quant picker |
| Tune | `↑↓` · `←→` | Navigate rows · cycle enum values |
| Tune | `E` · `D` · `S` · `R` · `X` · `A` | Edit freely · Default · Suggested · Reset→suggested · Reset→defaults · Apply |
| Settings | `Enter` | Open dropdown (themes / 20 languages, searchable) |
## Configuration

hfo persists a JSON settings file under the OS-appropriate config directory:

- Windows — `%APPDATA%\hfo\settings.json`
- macOS — `~/Library/Application Support/hfo/settings.json`
- Linux — `$XDG_CONFIG_HOME/hfo/settings.json` (defaults to `~/.config/hfo/`)
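The three locations resolve roughly like this (a sketch; hfo's actual `platform.ts` may differ in details such as the fallbacks used when `APPDATA` or `XDG_CONFIG_HOME` are unset — those fallbacks are assumptions here):

```typescript
import os from "node:os";
import path from "node:path";

// Resolves the settings.json location per platform, following the list above.
// The env-var fallbacks are assumptions, not hfo's verified behavior.
function settingsPath(platform: NodeJS.Platform = process.platform): string {
  if (platform === "win32") {
    const base = process.env.APPDATA ?? path.join(os.homedir(), "AppData", "Roaming");
    return path.join(base, "hfo", "settings.json");
  }
  if (platform === "darwin") {
    return path.join(os.homedir(), "Library", "Application Support", "hfo", "settings.json");
  }
  const xdg = process.env.XDG_CONFIG_HOME ?? path.join(os.homedir(), ".config");
  return path.join(xdg, "hfo", "settings.json");
}
```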
Schema:
```typescript
interface Settings {
  theme: 'dark' | 'light' | 'dracula' | 'solarized-dark' | 'solarized-light' | 'nord' | 'monokai';
  language: 'en' | 'es' | 'zh' | 'hi' | 'ar' | 'pt' | 'bn' | 'ru' | 'ja' | 'de'
    | 'fr' | 'ko' | 'it' | 'tr' | 'vi' | 'id' | 'pl' | 'nl' | 'th' | 'uk';
  refreshIntervalMs: number;     // Dashboard polling cadence
  defaultSort: 'trending' | 'downloads' | 'likes7d' | 'modified';
  defaultCodeMode: boolean;
  modelDir: string | null;       // Base dir for the file browser; null = platform default
  useAltScreen: boolean;
  installations: Installation[]; // { tag, dir, repoId, quant, installedAt }
}
```

Backups are written under `<configDir>/hfo/backups/<timestamp>/<slug>.zip` with a sidecar `.metadata.json` containing tag, repo, quant, byte counts, and compression ratio.
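The sidecar makes each backup self-describing. A sketch of consuming one — the field names follow the description above, but the exact JSON shape is an assumption, not a published schema:

```typescript
import { readFileSync } from "node:fs";

// Assumed shape of the .metadata.json sidecar, inferred from the fields the
// text lists (tag, repo, quant, byte counts, compression ratio).
interface BackupMetadata {
  tag: string;
  repoId: string;
  quant: string;
  originalBytes: number;
  compressedBytes: number;
  ratio: number;
}

function parseBackupMetadata(json: string): BackupMetadata {
  return JSON.parse(json) as BackupMetadata;
}

// Convenience wrapper for reading the sidecar next to a backup zip.
function readBackupMetadata(sidecarPath: string): BackupMetadata {
  return parseBackupMetadata(readFileSync(sidecarPath, "utf8"));
}
```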
## Architecture

Module layout, grouped by concern. Tests live under `test/`, UI components under `src/components/`, tab containers under `src/tabs/`; everything else is pure TypeScript suitable for unit testing and reuse.
```mermaid
graph TB
  subgraph Entry["Entry points"]
    CLI[cli.tsx]
    Shell[Shell.tsx]
    Headless[headless.ts]
  end
  subgraph Flow["Install flow state machine"]
    App[App.tsx]
    QuickConfirm[components/QuickConfirm]
    Planner[plan.ts]
    Reviewer[components/PlanReviewer]
    ParamsEditor[components/ParamsEditor]
    Processor[components/Processor]
  end
  subgraph Domain["Pure-logic modules"]
    HF[hf.ts]
    HW[hardware.ts]
    Scoring[scoring.ts]
    Capacity[capacity.ts]
    Readme[readme.ts]
    Modelfile[modelfile.ts]
    Describe[describe.ts]
    Backup[backup.ts]
    Restore[restore.ts]
    Reinstall[reinstall.ts]
    Launch[launch.ts]
  end
  subgraph Infra["Cross-OS infra"]
    Ollama[ollama.ts]
    Platform[platform.ts]
    Settings[settings.ts]
    Live[live.ts]
    Theme[theme.ts]
    I18n[i18n.ts]
  end
  CLI --> Shell
  CLI --> Headless
  Shell --> App
  Shell --> Settings
  App --> QuickConfirm
  App --> Planner
  App --> Reviewer
  App --> ParamsEditor
  App --> Processor
  Processor --> HF
  Processor --> Modelfile
  Processor --> Ollama
  Processor --> Settings
  Planner --> HF
  Planner --> Ollama
  Scoring --> HF
  Capacity --> HW
  App --> Scoring
  App --> Capacity
  App --> Readme
  Headless --> Backup
  Headless --> Restore
  Headless --> Reinstall
  Headless --> Launch
  Headless --> HW
  Headless --> Ollama
  Backup --> Settings
  Restore --> Reinstall
  Reinstall --> Scoring
  Reinstall --> Modelfile
  Shell --> Live
  Shell --> Theme
  Shell --> I18n
```
```
hfo/
├─ .github/                    # Community health + CI/CD
│  ├─ ISSUE_TEMPLATE/          # bug_report.yml · feature_request.yml · config.yml
│  ├─ workflows/               # ci.yml (every push) · pages.yml (docs) · release.yml (tags only)
│  └─ CODEOWNERS · FUNDING.yml · dependabot.yml · SUPPORT.md · PULL_REQUEST_TEMPLATE.md
├─ bin/hfo.js                  # npm bin entrypoint → dist/cli.js
├─ docs/                       # Static site deployed at hfo.carrillo.app
│  ├─ .well-known/security.txt # RFC 9116 disclosure contact
│  ├─ assets/                  # logo-mark.svg · logo-wordmark.svg · og-image.svg
│  ├─ 404.html                 # Not-found page sharing the site design
│  ├─ CNAME · humans.txt · robots.txt · sitemap.xml · manifest.webmanifest
│  ├─ install.sh · install.ps1 # One-liner installers (hfo.carrillo.app/install.*)
│  └─ index.html               # Landing: Hero · Features · Install · CLI · Kb · FAQ
├─ scripts/
│  ├─ generate-favicons.mjs    # SVG → PNG/ICO + og.png, invoked by the Pages workflow
│  └─ install-skills.mjs       # `pnpm run skills:install`
├─ src/                        # Strict-TypeScript source — grouped by layer (entry points stay at root)
│  ├─ cli.tsx                  # argv parsing + headless dispatch
│  ├─ Shell.tsx                # 6-tab router
│  ├─ App.tsx                  # install-flow state machine
│  ├─ headless.ts              # non-interactive command dispatch
│  ├─ core/                    # Pure domain logic — zero UI, zero OS side effects
│  │  ├─ hf.ts                 # public HF API client + resumable downloader
│  │  ├─ hardware.ts           # nvidia-smi + systeminformation probes
│  │  ├─ live.ts               # real-time samplers for the Dashboard
│  │  ├─ scoring.ts            # per-quant compatibility 0-100
│  │  ├─ capacity.ts           # rig tier + picks + HF search URLs
│  │  ├─ readme.ts             # HF model-card parser
│  │  ├─ plan.ts               # install plans + conflict detection
│  │  ├─ modelfile.ts          # Modelfile generator + tag slugifier
│  │  ├─ reinstall.ts          # orphan detection + Modelfile regeneration
│  │  ├─ backup.ts             # archiver-based zip writer + metadata sidecar
│  │  ├─ restore.ts            # adm-zip reader + reinstall handoff
│  │  ├─ launch.ts             # ollama launch targets + runtime probing
│  │  └─ describe.ts           # quant flavour + modality detection
│  ├─ infra/                   # Cross-OS platform integration
│  │  ├─ ollama.ts             # Ollama CLI wrappers + env persistence (setx/launchctl/systemd)
│  │  ├─ platform.ts           # openUrl + config paths (Windows / macOS / Linux)
│  │  ├─ settings.ts           # persisted preferences + installations index
│  │  └─ about.ts              # package.json metadata loader
│  ├─ ui/                      # UI-layer primitives consumed by tabs/ + components/
│  │  ├─ theme.ts              # 7-theme registry with contrast pairs
│  │  ├─ i18n.ts               # 20-language translation catalogs
│  │  ├─ icons.ts              # figures-backed icon set
│  │  ├─ hooks.ts              # useTerminalSize · useInterval · useNow
│  │  └─ format.ts             # byte / ETA / progress-bar formatters
│  ├─ tabs/                    # Dashboard · Models · Install · Tune · Help · Settings
│  └─ components/              # Dropdown · QuickConfirm · BootScreen · PlanReviewer · ParamsEditor · FileBrowser · LaunchMenu · …
├─ test/                       # 12 vitest suites · 71 tests · v8 coverage
├─ CHANGELOG.md · CODE_OF_CONDUCT.md · CONTRIBUTING.md · LICENSE · README.md · SECURITY.md
├─ eslint.config.js · tsconfig.json · vitest.config.ts
└─ package.json                # packageManager pinned; pkg config for binaries
```
## Development

```sh
git clone https://github.com/carrilloapps/hfo.git
cd hfo
pnpm install

pnpm dev            # run from source via tsx
pnpm build          # compile TS to dist/
pnpm typecheck      # tsc --noEmit (strict mode, all flags on)
pnpm lint           # eslint flat config with @typescript-eslint
pnpm test           # vitest (12 suites, 71 tests)
pnpm test:watch     # TDD mode
pnpm test:coverage  # generate coverage/index.html
pnpm run ci         # typecheck + lint + test + build in one go
```

The CI workflow runs the same chain on Ubuntu / macOS / Windows × Node 20 / 22 for every push and PR. The release workflow publishes to npm and uploads per-OS binaries to GitHub Releases when a `v*.*.*` tag is pushed.
- Strict TypeScript. Every flag in `tsconfig.json` is on, including `strictNullChecks`, `noImplicitReturns`, `noFallthroughCasesInSwitch`, and `isolatedModules`.
- No raw emojis in source. Use `src/icons.ts` (backed by the `figures` package, with ASCII fallback on legacy consoles). ESLint guards this.
- Cross-OS is non-negotiable. Anything OS-specific lives behind `src/platform.ts` or `src/ollama.ts`.
- No telemetry. `hfo` only talks to the public Hugging Face API and the local Ollama daemon.
- English source. Spanish and 18 other languages live in `src/i18n.ts`.
See CONTRIBUTING.md for the PR workflow and
CODE_OF_CONDUCT.md for community guidelines.
Security issues: see SECURITY.md.
## Privacy, security, cost

hfo makes exactly two kinds of external requests, both to Hugging Face's public API:
```
GET https://huggingface.co/api/models/<repo>/tree/main
GET https://huggingface.co/<repo>/resolve/main/<file>
```
Everything else — `nvidia-smi`, `systeminformation`, `ollama ps`, `ollama create`, `ollama launch` — runs locally. No telemetry. No accounts. Gated or private repos use a standard `HF_TOKEN` environment variable or the `--token` flag.
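A minimal sketch of the first request, with the optional token attached the standard Hugging Face way (a Bearer header). The endpoint is the one listed above; the function names are illustrative, and retries/error handling are omitted:

```typescript
// Builds the tree-listing URL hfo uses to enumerate a repo's GGUF files.
function treeUrl(repo: string): string {
  return `https://huggingface.co/api/models/${repo}/tree/main`;
}

// Fetches the file tree; the Authorization header is sent only when a token
// is provided (gated/private repos). Illustrative, not hfo's hf.ts.
async function fetchRepoTree(repo: string, token?: string): Promise<unknown[]> {
  const res = await fetch(treeUrl(repo), {
    headers: token ? { Authorization: `Bearer ${token}` } : {},
  });
  if (!res.ok) throw new Error(`HF API ${res.status} for ${repo}`);
  return (await res.json()) as unknown[];
}
```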
The Ollama install helpers (`winget install`, `brew install`, the official shell script) run only when the user explicitly opts in from the TUI's Ollama installer overlay.
All costs are zero: no subscriptions, no API keys, no rate-limited tiers.
## Roadmap

- Real-time progress in backup/restore for multi-gigabyte archives (pause / resume).
- Multi-install batch mode via `--install <file-with-repos>`.
- Snapshot diffing between two Modelfiles (for A/B testing parameter changes).
- Plugin API for custom launch integrations.
- Windows-native toast notifications when a long backup finishes.
Open an issue with the label `enhancement` to propose items.
## Author

José Carrillo — Senior Full-stack Developer (Tech Lead)
- Website — carrillo.app
- Email — m@carrillo.app
- GitHub — @carrilloapps
- LinkedIn — in/carrilloapps
- X — @carrilloapps
- Dev.to — @carrilloapps
- Medium — @carrilloapps
## License

MIT — © 2026 José Carrillo.