Release mtop 1.0.0 · eladser/mtop

First release.

mtop is one terminal window for your local AI: loaded models and the VRAM they hold, GPU state, every request with its tok/s, and a throughput sparkline.

What's in 1.0:

Models, GPU, requests and throughput panes. Zero config: run mtop and it finds Ollama on localhost.
Per-request tok/s via a pass-through proxy on 127.0.0.1:4321. Ollama has no metrics endpoint; the response stream is the only place those numbers exist, so mtop sits in the middle and reads them as they pass. Point OLLAMA_HOST at it.
Model unload: arrows to select, u to evict. Models that blow past their expiry get marked overdue.
-idle-unload 15m evicts anything that hasn't served a request in 15 minutes, for the times Ollama forgets to.
Binaries for Windows, Linux and macOS below. GPU stats are NVIDIA-only for now; AMD and Apple Silicon are next on the roadmap, along with llama.cpp and LM Studio.
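
For the curious, here is a rough sketch of where a per-request tok/s figure can come from. It is not mtop's actual code. It assumes the final chunk of an Ollama streaming response (the one with done set to true) carries eval_count and eval_duration in nanoseconds, as Ollama's API docs describe. The function name tokens_per_second and the sample chunks, including the model name, are made up for illustration.

```python
import json
from typing import Iterable, Optional


def tokens_per_second(stream_lines: Iterable[str]) -> Optional[float]:
    """Return generation speed from an Ollama-style NDJSON stream, or None.

    Hypothetical helper: assumes the last chunk has "done": true plus
    "eval_count" (tokens generated) and "eval_duration" (nanoseconds).
    """
    for line in stream_lines:
        line = line.strip()
        if not line:
            continue  # skip blank keep-alive lines
        chunk = json.loads(line)
        # Intermediate chunks only carry text; timing lives in the final one.
        if chunk.get("done"):
            count = chunk.get("eval_count")
            duration_ns = chunk.get("eval_duration")
            if not count or not duration_ns:
                return None  # e.g. a load-only request with no generation
            return count * 1e9 / duration_ns
    return None  # stream ended without a final chunk


# Made-up sample stream: 120 tokens generated in 2 seconds.
sample = [
    '{"model":"llama3","response":"Hel","done":false}',
    '{"model":"llama3","response":"lo","done":false}',
    '{"model":"llama3","response":"","done":true,'
    '"eval_count":120,"eval_duration":2000000000}',
]

rate = tokens_per_second(sample)
print(f"{rate:.1f} tok/s")
```

A proxy in the middle would forward each line to the client unchanged and run the same arithmetic on the last one, which is why the numbers only show up for traffic that goes through 127.0.0.1:4321.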

The gif in the README is a real run against a live model, not a mockup.
