Release SiliconScope v1.3.0 · kennss/SiliconScope

Local-AI monitoring — a dedicated cockpit for running LLMs on Apple Silicon.

AI runtime detection — recognizes Ollama, llama.cpp, LM Studio, MLX, Jan, GPT4All, and vLLM by process (bundle-first, no sudo required) and reports each runtime's RAM and CPU use.
Model memory budget — "largest model that fits" + "if you unload " per quant, with a rate-based swap-risk signal that warns before tokens/sec collapse.
Runtime API (opt-in, off by default) — reads the loaded model, the authoritative GPU/CPU offload split (Ollama size_vram/size), and tokens/sec (llama.cpp /metrics) from 127.0.0.1. Nothing leaves your Mac.
AI Workload classifier calibrated against real M1 Max LLM runs (MoE / dense / memory-pressured) and stabilized with a rolling average.
Honest attribution — GPU activity is not assumed to be AI; "LLM" is asserted only with evidence (a runtime serving a model), ANE ⇒ CoreML, Media engine ⇒ video, otherwise "type unknown."
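For readers curious how the GPU/CPU offload split mentioned under Runtime API can be derived from Ollama's size_vram/size fields, here is a minimal Python sketch. It is not SiliconScope's implementation. It assumes Ollama's list-running-models endpoint (`/api/ps`) on its default port 11434, and the helper names `offload_split` and `loaded_models` are hypothetical.

```python
import json
from urllib.request import urlopen

# Assumption: Ollama's default local address; change it if your server listens elsewhere.
OLLAMA_PS_URL = "http://127.0.0.1:11434/api/ps"


def offload_split(model: dict) -> dict:
    """Compute the GPU/CPU share of one loaded model from its size fields.

    `size` is the model's total resident size in bytes and `size_vram` is the
    part held in GPU memory; the remainder is treated as CPU-side.
    """
    name = model.get("name", "?")
    total = int(model.get("size", 0))
    in_vram = int(model.get("size_vram", 0))
    if total <= 0:
        # No usable size reported: avoid dividing by zero.
        return {"name": name, "gpu_pct": 0.0, "cpu_pct": 0.0}
    # Clamp to [0, 1] in case size_vram is reported slightly above size.
    gpu = min(max(in_vram / total, 0.0), 1.0)
    return {
        "name": name,
        "gpu_pct": round(gpu * 100, 1),
        "cpu_pct": round((1 - gpu) * 100, 1),
    }


def loaded_models(url: str = OLLAMA_PS_URL, timeout: float = 2.0) -> list:
    """Query the local runtime (loopback only) and return one split per loaded model."""
    with urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    return [offload_split(m) for m in payload.get("models", [])]
```

Calling `loaded_models()` while Ollama is serving a model returns one entry per loaded model. A model that fits entirely in GPU memory reports `size_vram` equal to `size`, which this sketch maps to a 100% GPU share.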

Designed via a multi-stage review, then validated and tuned on-device. Thanks to @durul for the earlier AI-workload groundwork.

Install: download the DMG, open it, and drag SiliconScope to Applications. The app is Developer ID signed and notarized. Requires macOS 14 or later on Apple Silicon.
