SiliconScope v1.3.0

@kennss kennss released this 14 Jun 17:17
· 9 commits to main since this release

Local-AI monitoring — a dedicated cockpit for running LLMs on Apple Silicon.

  • AI runtime detection — recognizes Ollama, llama.cpp, LM Studio, MLX, Jan, GPT4All, vLLM by process (bundle-first, sudoless) with RAM/CPU.
  • Model memory budget — "largest model that fits" + "if you unload " per quant, with a rate-based swap-risk signal that warns before tokens/sec collapse.
  • Runtime API (opt-in, off by default) — reads the loaded model, the authoritative GPU/CPU offload split (Ollama size_vram/size), and tokens/sec (llama.cpp /metrics) from 127.0.0.1. Nothing leaves your Mac.
  • AI Workload classifier calibrated against real M1 Max LLM runs (MoE / dense / memory-pressured) and stabilized with a rolling average.
  • Honest attribution — GPU activity is not assumed to be AI; "LLM" is asserted only with evidence (a runtime serving a model), ANE ⇒ CoreML, Media engine ⇒ video, otherwise "type unknown."
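For readers curious what the opt-in Runtime API reads, here is a minimal Python sketch of the same idea. It is not SiliconScope's implementation. It assumes Ollama on its default port 11434 (whose `/api/ps` response lists loaded models with `size` and `size_vram`) and a llama.cpp server on its default port 8080 started with `--metrics`, exposing a `llamacpp:predicted_tokens_seconds` value in Prometheus text format. The function names and the simplified line parser are illustrative assumptions.

```python
import json
import urllib.request

# Assumed default local endpoints; adjust if your runtimes listen elsewhere.
OLLAMA_PS_URL = "http://127.0.0.1:11434/api/ps"
LLAMACPP_METRICS_URL = "http://127.0.0.1:8080/metrics"


def fetch_text(url, timeout=2.0):
    """Fetch a loopback URL and return the body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def offload_split(ps_payload):
    """Return (model name, GPU fraction) pairs from an Ollama /api/ps payload.

    GPU fraction is size_vram / size; a model with no reported size yields 0.0.
    """
    result = []
    for model in ps_payload.get("models", []):
        size = model.get("size", 0)
        size_vram = model.get("size_vram", 0)
        gpu_fraction = size_vram / size if size else 0.0
        result.append((model.get("name", "unknown"), gpu_fraction))
    return result


def tokens_per_second(metrics_text, metric="llamacpp:predicted_tokens_seconds"):
    """Return the first sample of `metric` from Prometheus text, or None.

    Simplified parser: assumes label values contain no spaces.
    """
    for line in metrics_text.splitlines():
        if line.startswith("#"):
            continue  # skip HELP/TYPE comment lines
        parts = line.split()
        if len(parts) >= 2 and parts[0].split("{")[0] == metric:
            return float(parts[1])
    return None


def read_local_runtimes():
    """Query both local endpoints; requires the runtimes to be running."""
    split = offload_split(json.loads(fetch_text(OLLAMA_PS_URL)))
    rate = tokens_per_second(fetch_text(LLAMACPP_METRICS_URL))
    return split, rate
```

Calling `read_local_runtimes()` with both runtimes up returns the per-model GPU fraction and the current generation rate. Both requests go to 127.0.0.1 only, matching the "nothing leaves your Mac" behavior described above.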

Designed via a multi-stage review, then validated and tuned on-device. Thanks to @durul for the earlier AI-workload groundwork.

Install: download the DMG, open it, drag SiliconScope to Applications. Developer-ID signed + notarized. macOS 14+, Apple Silicon.