SiliconScope v1.3.0
Local-AI monitoring — a dedicated cockpit for running LLMs on Apple Silicon.
- AI runtime detection — recognizes Ollama, llama.cpp, LM Studio, MLX, Jan, GPT4All, and vLLM by process (bundle-first, no sudo required) and reports each runtime's RAM/CPU usage.
- Model memory budget — "largest model that fits" + "if you unload " per quant, with a rate-based swap-risk signal that warns before tokens/sec collapse.
- Runtime API (opt-in, off by default) — reads the loaded model, the authoritative GPU/CPU offload split (Ollama size_vram/size), and tokens/sec (llama.cpp /metrics) from 127.0.0.1. Nothing leaves your Mac.
- AI Workload classifier calibrated against real M1 Max LLM runs (MoE / dense / memory-pressured) and stabilized with a rolling average.
- Honest attribution — GPU activity is not assumed to be AI; "LLM" is asserted only with evidence (a runtime serving a model), ANE ⇒ CoreML, Media engine ⇒ video, otherwise "type unknown."
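
For readers curious what the Runtime API readings look like on the wire, here is a minimal Python sketch of the two local lookups named above. It is not SiliconScope's implementation. The URLs, ports (11434 for Ollama, 8080 for llama.cpp's server), the `/api/ps` route, and the `llamacpp:predicted_tokens_seconds` metric name are assumptions drawn from those projects' defaults rather than from these notes, and llama.cpp's server only exposes `/metrics` when started with metrics enabled. The helper names are hypothetical.

```python
import urllib.request

# Assumed defaults, not taken from the release notes.
OLLAMA_PS_URL = "http://127.0.0.1:11434/api/ps"
LLAMACPP_METRICS_URL = "http://127.0.0.1:8080/metrics"


def fetch(url: str, timeout: float = 1.0) -> str:
    """Read a loopback-only endpoint and return its body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def offload_split(ps_payload: dict) -> dict:
    """Map each loaded model to size_vram / size, the GPU-resident fraction."""
    split = {}
    for model in ps_payload.get("models", []):
        size = model.get("size", 0)
        size_vram = model.get("size_vram", 0)
        if size > 0:  # skip malformed entries rather than divide by zero
            split[model.get("name", "unknown")] = size_vram / size
    return split


def tokens_per_second(metrics_text: str,
                      metric: str = "llamacpp:predicted_tokens_seconds"):
    """Return one gauge value from Prometheus text format, or None if absent."""
    for line in metrics_text.splitlines():
        if line.startswith("#"):  # HELP / TYPE comment lines
            continue
        parts = line.split()
        # Drop any {label="..."} suffix before comparing the metric name.
        if len(parts) >= 2 and parts[0].split("{")[0] == metric:
            return float(parts[1])
    return None


# Offline demonstration with made-up sample payloads (no network access).
sample_ps = {"models": [{"name": "example-model",
                         "size": 6_000_000_000,
                         "size_vram": 4_500_000_000}]}
sample_metrics = (
    "# TYPE llamacpp:predicted_tokens_seconds gauge\n"
    "llamacpp:predicted_tokens_seconds 42.5\n"
)
print(offload_split(sample_ps))
print(tokens_per_second(sample_metrics))
```

Against a live runtime, the same parsers would be fed `json.loads(fetch(OLLAMA_PS_URL))` and `fetch(LLAMACPP_METRICS_URL)`. Both requests stay on the loopback interface, which matches the "Nothing leaves your Mac" guarantee above.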
Designed via a multi-stage review, then validated and tuned on-device. Thanks to @durul for the earlier AI-workload groundwork.
Install: download the DMG, open it, drag SiliconScope to Applications. Developer-ID signed + notarized. macOS 14+, Apple Silicon.