Core-Fidelity presents Pullama.

Resumable model puller for Ollama.

Downloads models directly from the Ollama registry with crash-safe resume. Works over unreliable connections — kill it, reboot, run it again, and it picks up exactly where it left off.

No Ollama server required. Single static binary. Zero runtime dependencies.

Install

Download a release binary, or build from source:

go build -trimpath -ldflags="-buildid=" -o pullama .

Put it on your PATH (e.g. ~/.local/bin):

mkdir -p ~/.local/bin
cp pullama ~/.local/bin/

Cross-compile:

GOOS=darwin  GOARCH=arm64 go build -trimpath -ldflags="-buildid=" -o pullama-darwin-arm64 .
GOOS=linux   GOARCH=amd64 go build -trimpath -ldflags="-buildid=" -o pullama-linux-amd64 .
GOOS=windows GOARCH=amd64 go build -trimpath -ldflags="-buildid=" -o pullama-windows-amd64.exe .

Usage

# Pull a model
pullama llama3.2
pullama mistral:7b
pullama user/my-model

# Override storage location
pullama llama3.2 --models-dir /data/ollama

# Use plain HTTP (for local registries)
pullama my-model --insecure

# Output modes
pullama llama3.2 --output json      # structured JSON events
pullama llama3.2 --output compact   # minimal one-line updates
pullama llama3.2 --output debug     # verbose Go struct output

# Quiet mode (one summary line)
pullama llama3.2 --quiet

# Verbose mode (checkpoint saves, HTTP details, chunk boundaries)
pullama llama3.2 --verbose

# List locally installed models
pullama list

# Show model details (family, parameters, quantization, layers)
pullama show llama3.2

# Remove a model (shared-blob aware — won't delete blobs used by other models)
pullama rm llama3.2

# Clean up disposable artifacts (partial downloads, locks, checkpoints)
pullama clean

Queue

Queue multiple models for sequential download. Failed models are marked and skipped — one bad model doesn't block the rest.

# Add models to the queue
pullama queue add llama3.2 mistral:7b phi3:mini
# added 3 model(s) to queue

# List the queue
pullama queue list
#   · 1  llama3.2:latest  queued
#   · 2  mistral:7b       queued
#   · 3  phi3:mini        queued

# Remove an entry (by position number)
pullama queue rm 2

# Start processing the queue
pullama queue start
# ▸ queue [1/2] pulling llama3.2:latest
#   ... (normal pull output) ...
# ▸ queue [1/2] ✓ completed llama3.2:latest
# ▸ queue [2/2] pulling phi3:mini
#   ...

Duplicates are skipped: adding a model that's already queued or active does nothing. Only one `pullama queue start` can run at a time (queue-level lock). Ctrl+C stops after the current model finishes and leaves the queue paused; run `pullama queue start` again to continue.

Active entries can't be removed from the queue (cancel the running process instead). Pending entries can be removed freely.

Flags

Flag	Default	Description
`--insecure`	off	Use `http://` instead of `https://`
`--models-dir`	`$OLLAMA_MODELS` or `~/.ollama/models`	Storage root
`--quiet`	off	Suppress progress bar; emit final summary only
`--verbose`	off	Log checkpoint saves, HTTP details, chunk boundaries
`--no-color`	off	Strip ANSI colors and Unicode box-drawing
`--output`	`table`	Output mode: `table`, `compact`, `json`, `debug`
`--max-retries`	6	Max transient retries per chunk
`--chunk-size`	64 MiB	Chunk size when server doesn't provide chunksums
`--timeout`	30m	Per-chunk HTTP timeout
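
The `--models-dir` default reads as a three-step fallback. The Go sketch below is one way to express it. `resolveModelsDir` is a hypothetical helper, and the assumption that the flag takes precedence over `$OLLAMA_MODELS` is inferred from the table, not confirmed by pullama's source.

```go
package main

import "path/filepath"

// resolveModelsDir picks the storage root.
// Assumed precedence (not confirmed by pullama's source):
// the --models-dir flag, then $OLLAMA_MODELS, then ~/.ollama/models.
func resolveModelsDir(flagValue, envValue, homeDir string) string {
	if flagValue != "" {
		return flagValue
	}
	if envValue != "" {
		return envValue
	}
	return filepath.Join(homeDir, ".ollama", "models")
}
```

A caller would typically pass `os.Getenv("OLLAMA_MODELS")` and the result of `os.UserHomeDir()` as the last two arguments.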

Exit Codes

Code	Meaning
0	Success
1	General error
2	Disk full — try `pullama clean`
3	Authentication failed — check `~/.ollama/id_ed25519`
4	Model not found
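
For scripts or services that wrap pullama, the exit code is the simplest signal to branch on. The following Go sketch shows one way a caller might run the binary and interpret the codes from the table. `explainExit` and `pull` are hypothetical names, and the sketch assumes `pullama` is on `PATH`.

```go
package main

import (
	"errors"
	"os/exec"
)

// explainExit maps the documented exit codes to a short message.
func explainExit(code int) string {
	switch code {
	case 0:
		return "success"
	case 1:
		return "general error"
	case 2:
		return "disk full (try `pullama clean`)"
	case 3:
		return "authentication failed (check ~/.ollama/id_ed25519)"
	case 4:
		return "model not found"
	default:
		return "unknown exit code"
	}
}

// pull runs `pullama <model>` and returns its exit code.
// It returns -1 if the binary could not be started at all.
func pull(model string) int {
	err := exec.Command("pullama", model).Run()
	if err == nil {
		return 0
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return exitErr.ExitCode()
	}
	return -1
}
```

A wrapper could then log `explainExit(pull("llama3.2"))` and, for example, run `pullama clean` before retrying when the code is 2.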

How It Works

`pullama` talks directly to the Ollama registry (the same one `ollama pull` uses). It authenticates with the same Ed25519 key at `~/.ollama/id_ed25519` and writes blobs to the same `~/.ollama/models/blobs/` directory. Models pulled with `pullama` appear in `ollama list` without any extra steps.

Crash-safe resume

Every download writes persistent checkpoints to disk. If the process is killed (SIGINT, SIGKILL, kernel panic), a subsequent run:

Reopens the .partial file
Validates the checkpoint against the manifest
Re-verifies the last chunk that was written (catches torn writes)
Truncates any unverified data
Resumes from the exact byte boundary

Already-verified data is never re-downloaded. Full-blob SHA-256 verification runs before every `.partial`-to-final rename.
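
The resume steps can be illustrated with a small in-memory Go sketch. It is not pullama's actual checkpoint format: the `chunk` type and `resumeOffset` function are hypothetical, and the `.partial` contents are modeled as a byte slice to keep the example self-contained. The function drops chunk records that extend past the data on disk, re-hashes the last remaining chunk to catch a torn write, and returns the byte offset to resume from.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// chunk is a hypothetical checkpoint record: a byte range of the blob
// and the SHA-256 (hex) that range is expected to have.
type chunk struct {
	Start, End int64
	SHA256     string
}

// resumeOffset returns the offset at which a download can safely resume.
// partial holds the bytes currently in the .partial file; done lists the
// chunks the checkpoint claims were written, in order.
func resumeOffset(partial []byte, done []chunk) int64 {
	// Drop records that point past the data actually present.
	for len(done) > 0 && done[len(done)-1].End > int64(len(partial)) {
		done = done[:len(done)-1]
	}
	// Re-verify the last chunk: a torn write shows up as a hash mismatch.
	if n := len(done); n > 0 {
		last := done[n-1]
		sum := sha256.Sum256(partial[last.Start:last.End])
		if hex.EncodeToString(sum[:]) != last.SHA256 {
			done = done[:n-1]
		}
	}
	if len(done) == 0 {
		return 0
	}
	return done[len(done)-1].End
}
```

In a file-based version, the caller would truncate the `.partial` file to the returned offset (for example with `os.Truncate`) and request the remaining bytes from there.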

File layout

Files written to ~/.ollama/models/:

blobs/sha256-<hex>             # final verified blobs (shared with Ollama)
blobs/sha256-<hex>.partial     # in-progress download data
blobs/sha256-<hex>.lock        # OS advisory lock (auto-released on crash)
.pullm/sha256-<hex>.json       # download checkpoint
.pullm/queue.json              # download queue (pullama queue)
manifests/<host>/<ns>/<model>/<tag>  # model manifest (written last)

Partial files, locks, checkpoints, and queue state are disposable — deleting them is always safe (worst case: download restarts from offset 0, queue is lost). Final blobs and manifests are never modified after write.
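
As a reading aid, the layout above can be written as a handful of path helpers. This Go sketch is illustrative only: the `layout` type and its methods are hypothetical, and it assumes digests arrive in the `sha256:<hex>` form used in registry manifests and are stored with the colon replaced by a dash.

```go
package main

import (
	"path/filepath"
	"strings"
)

// layout derives on-disk paths from the models root
// (for example ~/.ollama/models).
type layout struct{ root string }

// blobName turns "sha256:<hex>" into the file name "sha256-<hex>".
func blobName(digest string) string {
	return strings.Replace(digest, ":", "-", 1)
}

// blob is the final, verified blob path.
func (l layout) blob(digest string) string {
	return filepath.Join(l.root, "blobs", blobName(digest))
}

// partial is the in-progress download next to the final blob.
func (l layout) partial(digest string) string { return l.blob(digest) + ".partial" }

// lock is the advisory lock file for the blob.
func (l layout) lock(digest string) string { return l.blob(digest) + ".lock" }

// checkpoint is the per-blob download checkpoint under .pullm/.
func (l layout) checkpoint(digest string) string {
	return filepath.Join(l.root, ".pullm", blobName(digest)+".json")
}

// manifest is the model manifest path, written last.
func (l layout) manifest(host, ns, model, tag string) string {
	return filepath.Join(l.root, "manifests", host, ns, model, tag)
}
```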

Atomicity

All state transitions follow the same pattern:

write path.tmp → fsync(path.tmp) → rename(path.tmp, path) → fsync(parent_dir)

A crash at any point leaves the previous valid state intact.
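
A minimal Go sketch of that pattern follows. `atomicWrite` is a hypothetical helper name, not necessarily what pullama's `atomic.go` contains, and the final directory sync assumes a Unix-like system, where a directory can be opened and fsynced.

```go
package main

import (
	"os"
	"path/filepath"
)

// atomicWrite replaces path with data using:
// write path.tmp, fsync it, rename over path, fsync the parent directory.
func atomicWrite(path string, data []byte) error {
	tmp := path + ".tmp"
	f, err := os.OpenFile(tmp, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0o644)
	if err != nil {
		return err
	}
	if _, err := f.Write(data); err != nil {
		f.Close()
		return err
	}
	// Flush file contents to stable storage before the rename.
	if err := f.Sync(); err != nil {
		f.Close()
		return err
	}
	if err := f.Close(); err != nil {
		return err
	}
	// rename is atomic on POSIX file systems: readers see old or new, never a mix.
	if err := os.Rename(tmp, path); err != nil {
		return err
	}
	// Sync the directory so the rename itself survives a crash.
	dir, err := os.Open(filepath.Dir(path))
	if err != nil {
		return err
	}
	defer dir.Close()
	return dir.Sync()
}
```

If the process dies before the rename, only `path.tmp` is affected and the old `path` is untouched. If it dies after the rename, the new contents are already complete on disk.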

Concurrency

Downloads proceed one blob at a time, one chunk at a time. OS advisory locks (`flock` on Unix, `LockFileEx` on Windows) prevent two `pullama` processes from writing the same `.partial` file. The kernel releases locks on process exit, so stale locks cannot occur.
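
The Unix side of this can be sketched in Go with `syscall.Flock`. This is an assumption-laden illustration: `tryLock` is a hypothetical name, the code compiles only on Unix-like systems, and the Windows `LockFileEx` path is not shown.

```go
package main

import (
	"os"
	"syscall"
)

// tryLock opens (or creates) the lock file and takes an exclusive,
// non-blocking flock on it. The lock is held for as long as the returned
// file stays open; the kernel drops it when the process exits.
func tryLock(path string) (*os.File, error) {
	f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE, 0o644)
	if err != nil {
		return nil, err
	}
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX|syscall.LOCK_NB); err != nil {
		f.Close()
		return nil, err // typically EWOULDBLOCK: someone else holds the lock
	}
	return f, nil
}
```

Because the lock belongs to the open file rather than to the file's presence on disk, a leftover `.lock` file from a crashed run does not block the next one.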

Signal Handling

SIGINT / SIGTERM — graceful shutdown:

The current chunk finishes its write and hash verification
If verified, the checkpoint is saved; if not, unverified data is truncated
The lock is released
Prints a resume hint: interrupted — resume with: pullama <model>
Exits 0

A second signal exits immediately with code 1.
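
The two-stage behavior can be sketched in Go as a small state machine wired to `os/signal`. All names here (`action`, `shutdown`, `installHandler`) are hypothetical. The sketch assumes the download loop watches a `context.Context` and finishes its current chunk, saves the checkpoint, and exits 0 once that context is cancelled.

```go
package main

import (
	"context"
	"os"
	"os/signal"
	"syscall"
)

type action int

const (
	finishChunk action = iota // first signal: graceful shutdown
	exitNow                   // any later signal: exit immediately
)

// shutdown counts signals and decides what each one means.
type shutdown struct{ signals int }

func (s *shutdown) next() action {
	s.signals++
	if s.signals == 1 {
		return finishChunk
	}
	return exitNow
}

// installHandler cancels the download context on the first SIGINT/SIGTERM
// and exits with code 1 on the second.
func installHandler(cancel context.CancelFunc) {
	sigs := make(chan os.Signal, 2)
	signal.Notify(sigs, os.Interrupt, syscall.SIGTERM)
	go func() {
		var s shutdown
		for range sigs {
			if s.next() == exitNow {
				os.Exit(1)
			}
			cancel() // the download loop sees ctx.Done() after the current chunk
		}
	}()
}
```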

Retry Behavior

Condition	Action	Limit
Connection reset / timeout / DNS / 5xx / 429	Exponential backoff retry (0.5s–120s ± jitter)	6 retries per chunk
401 from registry	Regenerate auth token	3 refreshes per blob
403 from CDN	Re-resolve blob URL	5 refreshes per blob
Chunk hash mismatch	Truncate to verified boundary, retry	6 retries per chunk
Full-blob hash mismatch	Delete .partial + checkpoint, re-download	2 full re-downloads
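
To make the first row concrete, here is a hedged Go sketch of capped exponential backoff with jitter. The 0.5s floor and 120s cap come from the table. The doubling factor and the jitter fraction passed by the caller are assumptions, and `backoff` and `withJitter` are hypothetical names.

```go
package main

import (
	"math/rand"
	"time"
)

const (
	baseDelay = 500 * time.Millisecond
	maxDelay  = 120 * time.Second
)

// backoff returns the un-jittered delay before retry number attempt
// (0-based). The delay doubles each attempt (an assumed factor) and is
// capped at maxDelay.
func backoff(attempt int) time.Duration {
	d := baseDelay
	for i := 0; i < attempt && d < maxDelay; i++ {
		d *= 2
	}
	if d > maxDelay {
		d = maxDelay
	}
	return d
}

// withJitter shifts d by a random amount in [-frac*d, +frac*d] so that
// many clients retrying at once do not hit the server in lockstep.
func withJitter(d time.Duration, frac float64) time.Duration {
	delta := (rand.Float64()*2 - 1) * frac * float64(d)
	return d + time.Duration(delta)
}
```

With these constants the un-jittered delays run 0.5s, 1s, 2s, 4s, 8s, 16s for the six default retries. The 120s cap only comes into play if `--max-retries` is raised.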

`pullama rm` — safe deletion

Removal is shared-blob aware:

Acquires a directory lock on manifests/
Reads the target manifest
Scans all other manifests to build the active-digest set
If any other manifest fails to parse, the entire deletion aborts — no files are removed
Deletes only blobs referenced exclusively by the target model
Prunes empty parent directories
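
The core of the procedure is a set difference: blobs referenced by the target minus blobs referenced by anything else. A minimal Go sketch, using a simplified hypothetical `manifest` type that holds only a name and its digests:

```go
package main

// manifest is a simplified stand-in for a parsed model manifest:
// just the model name and the blob digests it references.
type manifest struct {
	Name    string
	Digests []string
}

// deletable returns the digests referenced by target and by no other
// manifest. Callers are expected to abort before calling this if any
// other manifest failed to parse, since an incomplete "others" list
// could make a shared blob look unreferenced.
func deletable(target manifest, others []manifest) []string {
	inUse := make(map[string]bool)
	for _, m := range others {
		for _, d := range m.Digests {
			inUse[d] = true
		}
	}
	var out []string
	for _, d := range target.Digests {
		if !inUse[d] {
			out = append(out, d)
		}
	}
	return out
}
```

This is also why a single unparseable manifest aborts the whole deletion: without the full in-use set, there is no safe way to tell shared blobs from exclusive ones.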

`pullama clean` — disposable cleanup

Removes `.pullm/*.json` checkpoints, `blobs/*.partial`, and `blobs/*.lock`. Never touches final blobs or manifests. Safe to run at any time. Idempotent.

Output Modes

Table (default — pretty TTY output)

  ▸ core-fidelity - pullama
╭─ pulling llama3.2:latest ─────────────────────╮
│  ✓ manifest · 5 blobs · 4.4 GB
│  ◆ cached    34bb5ab01051 125.0 kB [1/5]
│  › downloading def456abc789 4.2 GB [2/5]
│  [███████████████████▌░░░░░░░░░] 67% 2.9 GB/4.4 GB · 8.2 MB/s · eta 2m30s
│  ✓ verified  def456abc789 (12m34s)
│  ✓ finalized def456abc789
│  ✓ manifest written
╰─────────────────────────────────────────────╯
╔═════════════════════════════════════════════╗
║  ✓  pulled llama3.2:latest                  ║
╠═════════════════════════════════════════════╣
║    size      4.4 GB                         ║
║    blobs     5                              ║
║    elapsed   12m34s                         ║
║    avg rate  5.9 MB/s                       ║
╚═════════════════════════════════════════════╝

Non-TTY output (piped, CI, TERM=dumb) automatically strips colors, Unicode, and spinners.

Compact

pulling llama3.2:latest
67% 3.1/4.7 GB
completed 4.7 GB 12m34s

JSON

Each event is a JSON line — useful for piping to jq or programmatic consumption:

{"Model":"llama3.2","Tag":"latest"}
{"BlobCount":5,"TotalSize":4831838208}
{"pct":67,"OverallDone":3145728000,"OverallTotal":4831838208}
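
For programmatic consumption from Go, a line-by-line decoder is usually enough. The sketch below makes no claims about the full event schema: it decodes each line into a generic map, and the only field names it relies on are the ones visible in the sample above. `decodeEvent` and `eachEvent` are hypothetical names.

```go
package main

import (
	"bufio"
	"encoding/json"
	"io"
)

// decodeEvent parses one JSON line into a generic map.
// JSON numbers arrive as float64, per encoding/json defaults.
func decodeEvent(line []byte) (map[string]any, error) {
	var ev map[string]any
	if err := json.Unmarshal(line, &ev); err != nil {
		return nil, err
	}
	return ev, nil
}

// eachEvent reads newline-delimited JSON from r and calls fn per event.
// Blank lines are skipped; the first malformed line stops the loop.
func eachEvent(r io.Reader, fn func(map[string]any)) error {
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		if len(sc.Bytes()) == 0 {
			continue
		}
		ev, err := decodeEvent(sc.Bytes())
		if err != nil {
			return err
		}
		fn(ev)
	}
	return sc.Err()
}
```

A consumer could pipe `pullama llama3.2 --output json` into a program that calls `eachEvent(os.Stdin, ...)` and reacts to events that carry a `pct` field.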

Debug

Raw Go struct formatting of every event — for development and debugging.

Authentication

`pullama` uses the same Ed25519 key as Ollama (`~/.ollama/id_ed25519`). If you've used `ollama pull` or `ollama run` on this machine, the key already exists and `pullama` will use it. If not, you'll need to generate one or copy it from a machine that has one.

Compatibility

Writes files that `ollama list` reads natively
Coexists with a running Ollama server — blobs are content-addressed, so concurrent writes are safe
Works on macOS (arm64/amd64), Linux (amd64/arm64), and Windows (amd64)

Support

If Pullama saved you time (or bandwidth), consider supporting development.

License

MIT
