Release mtop 1.3.0 · eladser/mtop

Watch more than one box, see more per request, and get a couple of new numbers.

Multi-host: give -ollama a comma-separated list and mtop stacks the models and GPUs from each machine, tagged by host. Handy if you run models on a couple of boxes.
GPU util and memory now draw as sparklines over time, next to the live numbers.
Request inspector: run with -inspect, press i, and you get the last request's prompt, completion, and a load/prompt/decode timing split. Off by default; the text it captures is stripped of control bytes so a model can't smuggle escape sequences into your terminal.
Session energy on the TOK/S line: watt-hours used and tokens per watt-hour. It's whole-GPU power, so read it as a rough efficiency number.
compare -openai <url> runs the comparison against llama.cpp, LM Studio or vLLM, not just ollama.
-mem-alert and -temp-alert set the alert thresholds, overriding the built-in 93% and 87°C.
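
To make the session energy numbers concrete, here is a rough sketch of how watt-hours and tokens per watt-hour can be derived from periodic GPU power readings. It is an illustration, not mtop's code: the function names, the sample format, and the trapezoid integration are all assumptions.

```python
def watt_hours(samples):
    """Integrate (timestamp_seconds, watts) samples into watt-hours.

    Hypothetical helper: assumes samples are sorted by time and that
    power changes roughly linearly between readings (trapezoid rule).
    """
    joules = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        joules += (p0 + p1) / 2.0 * (t1 - t0)
    return joules / 3600.0  # 1 Wh = 3600 J


def tokens_per_watt_hour(tokens, wh):
    """Tokens generated per watt-hour; 0.0 until any energy is recorded."""
    return tokens / wh if wh > 0 else 0.0


# A GPU drawing a steady 180 W for two minutes, 3000 tokens generated.
samples = [(0, 180.0), (60, 180.0), (120, 180.0)]
wh = watt_hours(samples)
print(wh, tokens_per_watt_hour(3000, wh))  # 6.0 500.0
```

Because the power reading covers the whole GPU, anything else running on the card during the session lowers the tokens per watt-hour figure.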
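
For the curious, stripping control bytes from captured text can look something like the sketch below. This is not the inspector's implementation: the function name and the choice to keep tabs and newlines are assumptions made for the example.

```python
import re

# C0 controls except tab (\x09) and newline (\x0a), plus DEL and the C1 range.
# Keeping tab and newline is an assumption, so multi-line prompts stay readable.
_CONTROL = re.compile(r"[\x00-\x08\x0b-\x1f\x7f-\x9f]")


def strip_control(text):
    """Remove control characters so ESC-based sequences cannot reach the terminal."""
    return _CONTROL.sub("", text)


# The ESC byte is dropped; the rest of the sequence is left as inert text.
print(strip_control("\x1b[31mred\x1b[0m"))  # [31mred[0m
```

With the ESC byte gone, the terminal sees the remainder of a color or title sequence as ordinary printable characters.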

brew and scoop pick this up as usual; winget follows once Microsoft merges the bump.
