Releases · signerless/llm-checker

v3.7.4 Latest

signerless released this 20 Jun 12:43

v3.7.4

6264c76

Published to npm as llm-checker@3.7.4. Folds in everything since 3.7.0 (3.7.1–3.7.4). Full suite green at 48/48.

Highlights since 3.7.0:

Correct MoE memory sizing on ALL paths: weights are sized by the TOTAL parameter count and a real observed artifact size always wins, so a large MoE (e.g. a 236B / 397B-A17B model) can no longer falsely "fit" small hardware. Active params drive speed only.
A size-unknown Ollama variant (e.g. :latest) no longer inherits model_sizes[0]: qwen3:latest is now sized at ~9B (not 30B) and no longer corrupts the size map for the real qwen3:30b, a 19 GB model that was falsely reported as fitting in 16 GB.
Multi-GPU VRAM is no longer double-counted (a 2x24=48GB box stays 48GB).
Recommendation diversity (3.7.1): the registry surfaces Hugging Face / GPT4All models, not just Ollama — quant/shard variants of the same model collapse to one distinct pick, and a source that scores close to the top is guaranteed a slot. Use --runtime vllm|mlx|llama.cpp|transformers or --source to target explicitly.
Registry CLI validation (3.7.3): registry-search/registry-recommend reject invalid --source/--format/--runtime/--optimize with a clear error, and never silently fall back to the built-in catalog when no artifacts match.
Registry ingestor data quality (3.7.4): LoRA adapters and optimizer/training files are no longer ingested as models; F16/FP16/BF16 are precisions (not quantizations); GPT4All sizes/canonical ids fixed; dead index dropped. Regenerated seed: 3 sources, 3,259 repos, 32,779 artifacts.
filterByCategory and other guards hardened against malformed input; MCP cli_exec now exposes the registry commands.
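
The MoE sizing rule from the first highlight can be sketched roughly as follows. This is a minimal illustration, not the code that ships in llm-checker: the function names, the 0.5 bytes-per-parameter figure (roughly a 4-bit quantization), and the 1.2 overhead factor are all assumptions.

```javascript
// Hypothetical sketch: names and constants are illustrative assumptions,
// not llm-checker's actual implementation.
const GIB = 1024 ** 3;

// Estimate resident weight memory in GiB.
// - A real observed artifact size (bytes) always wins when present.
// - Otherwise weights are sized by TOTAL parameters, never active parameters.
function estimateWeightGiB({ totalParamsB, observedBytes, bytesPerParam = 0.5 }) {
  if (Number.isFinite(observedBytes) && observedBytes > 0) {
    return observedBytes / GIB;
  }
  return (totalParamsB * 1e9 * bytesPerParam) / GIB;
}

// The fit check only looks at the memory estimate; activeParamsB is
// deliberately ignored here because it affects speed, not residency.
function fitsInMemory(model, availableGiB, overheadFactor = 1.2) {
  return estimateWeightGiB(model) * overheadFactor <= availableGiB;
}

const moe = { totalParamsB: 397, activeParamsB: 17 };
console.log(fitsInMemory(moe, 16));  // false: sized by 397B total, not 17B active
console.log(fitsInMemory(moe, 256)); // true
```

Under these assumed constants the 397B model needs roughly 185 GiB for weights alone, so it cannot pass the check on a 16 GiB machine even though only 17B parameters are active per token.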

Full notes: docs/reference/changelog.md

v3.7.0

signerless released this 20 Jun 09:20

v3.7.0

4977b92

Published to npm as llm-checker@3.7.0. Adds a packaged multi-source model registry and wires it into the recommendation flow. Full suite green at 44/44.

Highlights:

Multi-source registry: a packaged snapshot of ~3,259 repos / ~33,736 exact installable/downloadable artifacts from Hugging Face, Ollama, and GPT4All, with per-source install commands (hf download ..., ollama pull ...). New registry-sync, registry-search, and registry-recommend commands.
recommend (and the check recommendation card) now source candidates from the registry through the canonical deterministic scoring core, with --runtime auto plus Ollama / vLLM / MLX / llama.cpp / Transformers targeting; falls back to the Ollama catalog when the registry is empty or unavailable.
Mixture-of-Experts memory sizing fixed: MoE models (e.g. Mixtral-8x7B, Qwen3-397B-A17B) are sized by their TOTAL parameter count (all experts are resident under Ollama / Metal / vLLM), re-derived from the model name so a stale/under-reported DB value can never make a huge model falsely "fit" small hardware. The packaged seed DB was regenerated so stored MoE totals are correct (Mixtral-8x7B 7B→56B; Qwen3.5-397B-A17B 17B→397B total / 17B active).
Packaged src/data/seed/models.db is ~45 MB unpacked (tarball ~6.5 MB).
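
Re-deriving MoE totals from the model name, as described above, might look something like the sketch below. The regular expressions, the return shape, and the "take the larger value" rule are assumptions made for illustration. The expert-count product (8 x 7B = 56B) follows the figure quoted in these notes.

```javascript
// Hypothetical sketch: regexes, names and return shape are assumptions.

// Derive { totalB, activeB } (billions of parameters) from a model name.
// Returns null when the name carries no recognizable MoE size hint.
function moeParamsFromName(name) {
  // "397B-A17B" style: total first, active after the "A" prefix.
  const totalActive = /(\d+(?:\.\d+)?)B-A(\d+(?:\.\d+)?)B/i.exec(name);
  if (totalActive) {
    return { totalB: Number(totalActive[1]), activeB: Number(totalActive[2]) };
  }
  // "8x7B" style: expert count times per-expert size.
  const expertsBySize = /(\d+)x(\d+(?:\.\d+)?)B/i.exec(name);
  if (expertsBySize) {
    const experts = Number(expertsBySize[1]);
    const perExpertB = Number(expertsBySize[2]);
    return { totalB: experts * perExpertB, activeB: null };
  }
  return null;
}

// Prefer the name-derived total when it exceeds the stored value, so an
// under-reported DB entry cannot shrink the model.
function resolveTotalParamsB(name, storedB) {
  const derived = moeParamsFromName(name);
  return derived ? Math.max(derived.totalB, storedB ?? 0) : storedB;
}

console.log(resolveTotalParamsB('Mixtral-8x7B', 7));       // 56
console.log(resolveTotalParamsB('Qwen3.5-397B-A17B', 17)); // 397
```

Names without a size hint fall through unchanged, so dense models keep their stored parameter count.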

Carries everything from 3.6.1 (issue #88 scoring unification, #95 hardware VRAM, #97 MCP hardening, #86/#98 Windows UI). Full notes: docs/reference/changelog.md

v3.6.1

signerless released this 19 Jun 19:57

v3.6.1

9526ca3

Published to npm as llm-checker@3.6.1. First npm release since 3.5.15 — also carries the previously-unpublished 3.6.0 batch. Every fix below ships with an integration test; the full suite is green at 39/39.

Highlights:

Fixes #88 (root cause): check, recommend, and smart-recommend now rank through one canonical scoring core (src/models/scoring-core.js), so they agree on the best model and the high-capacity right-sizing floor applies everywhere — tiny 2B–8B models no longer out-rank large models on high-end hardware. (#96)
Corrects GPU-VRAM detection for high-end / multi-GPU machines: workstation/datacenter cards (RTX PRO 6000, A100, H100, L40, …) no longer collapse to a generic 8 GB, fixes the GB normalization "dead zone", and guards a willModelFit divide-by-zero. A dual RTX PRO 6000 box now reports ~192 GB instead of ~16 GB. (#95)
Hardens the Claude MCP server: reads hardware facts from hw-detect --json instead of regex-scraping CLI text, fixes bogus tokens/sec, runs compare_models sequentially, syncs the advertised version from package.json, and makes the module importable for testing. (#97)
Fixes the Windows interactive-panel flicker (#86): resolves full-panel height overflow on 46–49 row terminals, adds debounced terminal resize handling, and stops the banner pulse from clearing the whole screen 8×/second. (#98)
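
As a rough illustration of the VRAM behaviour described in #95, the sketch below sums per-device memory once and falls back to a name lookup instead of a generic 8 GB. The lookup table, its values, and the function names are assumptions; the 96 GB figure is inferred from the ~192 GB dual-card total quoted above.

```javascript
// Hypothetical sketch: table contents and names are illustrative assumptions.
const KNOWN_VRAM_GB = { 'rtx pro 6000': 96, 'l40': 48 };

// Resolve one GPU's VRAM: trust a reported value, else look the card up
// by name, else return null rather than guessing a generic default.
function vramForGpuGB(gpu) {
  if (Number.isFinite(gpu.vramGB) && gpu.vramGB > 0) return gpu.vramGB;
  const name = String(gpu.name || '').toLowerCase();
  for (const [key, gb] of Object.entries(KNOWN_VRAM_GB)) {
    if (name.includes(key)) return gb;
  }
  return null;
}

// Sum each device exactly once; unknown devices contribute 0.
function totalVramGB(gpus) {
  return gpus.reduce((sum, gpu) => sum + (vramForGpuGB(gpu) ?? 0), 0);
}

const dualPro = [
  { name: 'NVIDIA RTX PRO 6000 Blackwell' },
  { name: 'NVIDIA RTX PRO 6000 Blackwell' },
];
console.log(totalVramGB(dualPro)); // 192
```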

Full notes: docs/reference/changelog.md

v3.5.13

signerless released this 03 May 13:20

v3.5.13

984fa32

Published to npm as llm-checker@3.5.13.

Highlights:

Ships a prebuilt Ollama SQLite catalog with 229 models and 7,176 variants.
Adds weekly model DB update workflow and seed DB refresh tooling.
Cleans the interactive panel with animated status verbs instead of verbose progress bars.
AI Run now shows tokens/sec beside model responses.
Recommendation scoring avoids stale aliases, all-zero pulls, and cloud-only local picks.

v3.5.11

signerless released this 27 Mar 20:10

v3.5.11

b650796

3.5.11 — Windows Ollama Host Normalization Follow-up

Fixed the remaining Windows Ollama client path where OLLAMA_HOST could be inherited as a wildcard bind address such as 0.0.0.0 or [::]
Wildcard bind hosts now normalize back to localhost for client requests
Missing Ollama ports now default to 11434
Kept the native-fetch retry fallback in the release path for retryable network failures such as fetch failed
Updated CLI/docs guidance so custom client endpoints use OLLAMA_BASE_URL
Added regression coverage for wildcard-host normalization
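
The normalization rules listed above could be sketched as follows. This is an assumption-laden illustration rather than the shipped code: the function name, the regex-based parsing, and the choice to fall back to localhost on unparseable input are all hypothetical.

```javascript
// Hypothetical sketch: function name and parsing approach are assumptions.
const DEFAULT_OLLAMA_PORT = 11434;
const WILDCARD_HOSTS = new Set(['0.0.0.0', '[::]']);

// Turn an OLLAMA_HOST-style value into a base URL usable for client requests.
function normalizeOllamaHost(raw) {
  const value = String(raw ?? '').trim();
  // Optional scheme, then a bracketed IPv6 or plain host, then an optional port.
  const match = /^(?:(https?):\/\/)?(\[[^\]]+\]|[^:\/]+)(?::(\d+))?\/?$/i.exec(value);
  if (!match) return `http://localhost:${DEFAULT_OLLAMA_PORT}`;
  const scheme = (match[1] || 'http').toLowerCase();
  // A wildcard bind address is valid for a server but not for a client.
  const host = WILDCARD_HOSTS.has(match[2]) ? 'localhost' : match[2];
  const port = match[3] ? Number(match[3]) : DEFAULT_OLLAMA_PORT;
  return `${scheme}://${host}:${port}`;
}

console.log(normalizeOllamaHost('0.0.0.0'));            // http://localhost:11434
console.log(normalizeOllamaHost('[::]:8080'));          // http://localhost:8080
console.log(normalizeOllamaHost('http://192.168.1.5')); // http://192.168.1.5:11434
```

The host is parsed by hand rather than with `new URL(...)` because the WHATWG URL parser drops a scheme's default port (for example `:80` on `http`), which would make an explicit port indistinguishable from a missing one.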

npm package: llm-checker@3.5.11

v3.5.9

signerless released this 26 Mar 09:27

v3.5.9

bf5eda1

v3.5.9

Fixed the remaining Ollama localhost bypasses in selector flows.
Deterministic speed probes now use the shared Ollama client instead of a hardcoded http://localhost:11434 endpoint.
AI evaluator chat requests now use the same resolved Ollama base URL path as the rest of the CLI.
Added selector-specific regression coverage for Windows-style localhost failure with successful 127.0.0.1 fallback.
The separate Windows backend wording question (Best backend: cpu with Runtime assist: Vulkan) remains tracked in #71.

npm package: llm-checker@3.5.9

v3.5.8

signerless released this 25 Mar 22:02

v3.5.8

435b4f0

Windows Ollama localhost fallback + Vulkan assist visibility

retry Ollama availability across localhost loopback candidates and persist the working base URL
filter fake Windows remote display adapters from fallback GPU inventory
surface Vulkan runtime assist metadata for integrated Windows GPU paths
improve hw-detect output for integrated/shared-memory acceleration paths
add regression tests for loopback fallback and Windows GPU reporting
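
The loopback retry described in the first item can be pictured with a sketch like the one below. The candidate list, the injected `probe` callback, and the in-memory caching are assumptions for illustration. How the real client persists the working URL is not stated in these notes.

```javascript
// Hypothetical sketch: candidates, probe signature and caching are assumptions.
const LOOPBACK_CANDIDATES = ['http://localhost:11434', 'http://127.0.0.1:11434'];

let workingBaseUrl = null; // remembered after the first successful probe

// probe(baseUrl) should resolve to true when Ollama answers at that URL.
async function resolveOllamaBaseUrl(probe, candidates = LOOPBACK_CANDIDATES) {
  if (workingBaseUrl) return workingBaseUrl;
  for (const baseUrl of candidates) {
    try {
      if (await probe(baseUrl)) {
        workingBaseUrl = baseUrl;
        return baseUrl;
      }
    } catch {
      // A network error such as "fetch failed": try the next candidate.
    }
  }
  return null; // no candidate responded
}
```

On Node 18+ a probe could be as simple as `(url) => fetch(url + '/api/tags').then((res) => res.ok)`. Injecting the probe keeps the fallback logic testable without a running Ollama instance.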

Published npm package: llm-checker@3.5.8

v3.5.7

signerless released this 25 Mar 20:37

v3.5.7

2dc28da

Highlights

fixed Windows CPU detection noise on modern Windows builds where wmic is retired
fixed oversized local Ollama recommendation edge cases on CPU-backed systems
bumped package and MCP server metadata to 3.5.7

Included fixes

#66 WMIC retired
#67 Recommended model with memory requirement more than I have

Validation

npm test passed (26/26)
npm pack --dry-run passed

v3.5.6

signerless released this 13 Mar 15:21

v3.5.6

4c137d4

Integrated GPU Inventory & Hybrid Visibility

Added first-class integrated GPU inventory handling in unified hardware summaries.
Hybrid systems now keep both dedicated and integrated GPU models visible.
Integrated-only systems keep GPU visibility even when the runtime backend remains CPU.
Recommendation, tiering, and token-speed estimation now use canonical integrated-GPU signals more consistently.
CLI output now shows dedicated vs integrated GPU inventory explicitly.
Added regression coverage for hybrid and integrated-only detection paths.

v3.5.4

signerless released this 05 Mar 09:25

v3.5.4

ec1df33

v3.5.4

Fixed

Linux hybrid GPU detection fallback now includes lspci parsing and improved dedicated-GPU enrichment (#58).
AMD ROCm VRAM unit parsing fixed to prevent massively overreported memory (#59).

Added

Fine-tuning suitability labels in check, recommend, and ai-check outputs (Full FT / LoRA / QLoRA support bands) (#60).

Tests

Added regression tests for ROCm VRAM parsing, hybrid GPU fallback detection, and fine-tuning classification.

Commit: ec1df33
