Published to npm as llm-checker@3.7.4. Folds in everything since 3.7.0 (3.7.1–3.7.4). Full test suite: 48/48 passing.
Highlights since 3.7.0:
- Correct MoE memory sizing on ALL paths: weights are sized by the TOTAL parameter count and a real observed artifact size always wins, so a large MoE (e.g. a 236B / 397B-A17B model) can no longer falsely "fit" small hardware. Active params drive speed only.
- A size-unknown Ollama variant (e.g. `:latest`) no longer inherits `model_sizes[0]`: `qwen3:latest` is sized ~9B (not 30B) and stops poisoning the real `qwen3:30b` size map (a 19GB model that was falsely fitting 16GB).
- Multi-GPU VRAM is no longer double-counted (a 2x24=48GB box stays 48GB).
- Recommendation diversity (3.7.1): the registry surfaces Hugging Face / GPT4All models, not just Ollama; quant/shard variants of the same model collapse to one distinct pick, and a source that scores close to the top is guaranteed a slot. Use `--runtime vllm|mlx|llama.cpp|transformers` or `--source` to target explicitly.
- Registry CLI validation (3.7.3): `registry-search` / `registry-recommend` reject invalid `--source` / `--format` / `--runtime` / `--optimize` with a clear error, and never silently fall back to the built-in catalog when no artifacts match.
- Registry ingestor data quality (3.7.4): LoRA adapters and optimizer/training files are no longer ingested as models; F16/FP16/BF16 are precisions (not quantizations); GPT4All sizes/canonical ids fixed; dead index dropped. Regenerated seed: 3 sources, 3,259 repos, 32,779 artifacts.
- `filterByCategory` and other guards hardened against malformed input; MCP `cli_exec` now exposes the registry commands.
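
The memory-sizing rules in the highlights above can be illustrated with a rough sketch. This is not llm-checker's actual code: the function names (`estimateWeightsGB`, `totalVramGB`, `fits`), the object fields, the 0.5 bytes-per-parameter figure (roughly a 4-bit quantization) and the 2GB overhead are all assumptions made for illustration.

```javascript
// Illustrative sketch only; names and constants are hypothetical,
// not taken from llm-checker's source.

// Assumption: ~4 bits per weight, i.e. 0.5 bytes per parameter.
const BYTES_PER_PARAM = 0.5;

function estimateWeightsGB(model) {
  // A real observed artifact size always wins over any estimate.
  if (typeof model.artifactSizeGB === "number" && model.artifactSizeGB > 0) {
    return model.artifactSizeGB;
  }
  // Size by TOTAL parameters: every expert of an MoE must be resident,
  // even though only the active parameters are used per token.
  return model.totalParamsB * BYTES_PER_PARAM;
}

function totalVramGB(gpus) {
  // Count each physical GPU exactly once.
  return gpus.reduce((sum, gpu) => sum + gpu.vramGB, 0);
}

function fits(model, gpus, overheadGB = 2) {
  // overheadGB is an assumed allowance for KV cache and runtime buffers.
  return estimateWeightsGB(model) + overheadGB <= totalVramGB(gpus);
}

// A 397B-total / 17B-active MoE on a 2x24GB box.
const moe = { totalParamsB: 397, activeParamsB: 17 };
const box = [{ vramGB: 24 }, { vramGB: 24 }];

// A model with a known 19GB artifact on a single 16GB GPU.
const dense30b = { totalParamsB: 30, artifactSizeGB: 19 };
const small = [{ vramGB: 16 }];
```

Under these assumptions, `fits(moe, box)` and `fits(dense30b, small)` both evaluate to `false`. Sizing the MoE by its 17B active parameters instead would give about 8.5GB and report a false fit, which is the failure mode the first highlight describes.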
Full notes: docs/reference/changelog.md