vMLX 1.5.56
Highlights:
- Fixes false rejection of Gemma 4 12B VLM image prefill on high-memory Macs by scaling the default single-buffer guard from the Metal working set instead of using a fixed 8 GB limit.
- Keeps explicit VMLX_VLM_IMAGE_PREFILL_BUFFER_GB behavior intact for users who need a hard guard.
- Prevents failed media turns from poisoning later text prompts by rolling back the failed media user message in the chat panel.
- Defaults Gemma 4 unified chat/image requests to visible-answer mode unless thinking is explicitly requested.
- Removes deprecated mx.metal.clear_cache runtime calls in favor of mx.clear_cache with fallback.
- Fixes model-family lookup for served aliases so registry defaults use the loaded model path.
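
The guard scaling and the cache-clearing fallback described above can be sketched roughly as follows. This is an illustrative sketch, not the shipped vMLX code: the function names, the `ASSUMED_WORKING_SET_FRACTION` value, and the way the working-set size is passed in are all assumptions. Only the `VMLX_VLM_IMAGE_PREFILL_BUFFER_GB` variable and the `mx.clear_cache` / `mx.metal.clear_cache` names come from the notes.

```python
import os

GIB = 1024 ** 3

# Assumption: the real scaling factor is not stated in these notes.
ASSUMED_WORKING_SET_FRACTION = 0.5


def image_prefill_guard_bytes(working_set_bytes, env=None):
    """Return the single-buffer guard in bytes (hypothetical helper).

    An explicit VMLX_VLM_IMAGE_PREFILL_BUFFER_GB value wins and acts as a
    hard guard; otherwise the default scales with the Metal working set.
    A non-numeric override raises ValueError.
    """
    env = os.environ if env is None else env
    override = env.get("VMLX_VLM_IMAGE_PREFILL_BUFFER_GB")
    if override:
        return int(float(override) * GIB)
    return int(working_set_bytes * ASSUMED_WORKING_SET_FRACTION)


def clear_cache_compat(mx):
    """Call mx.clear_cache if present, else fall back to mx.metal.clear_cache.

    `mx` is the imported mlx.core module (or any object with the same
    attributes). Returns the name of the call that ran, or None.
    """
    top_level = getattr(mx, "clear_cache", None)
    if callable(top_level):
        top_level()
        return "mx.clear_cache"
    legacy = getattr(getattr(mx, "metal", None), "clear_cache", None)
    if callable(legacy):
        legacy()
        return "mx.metal.clear_cache"
    return None
```

Passing the module in as an argument keeps the fallback testable on machines without MLX installed; in real use the caller would pass `mlx.core`.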
Verification:
- Live from source, Gemma 4 12B JANG_4M: image, Responses image, text, multi-turn recall, cache-hit, oversized 413 recovery, and no-traceback guard behavior verified.
- Live from source, Gemma 4 12B MXFP4 and MXFP8: image prompts verified with visible output.
- Live from source, Qwen3.6 35B MXFP8 MTP: verified without gdn_sink crash, including Chat, Responses, 160-token decode, TurboQuant KV, paged hybrid cache, and native MTP accepted-token logging.
- Live from source, Step 3.7 Flash JANG_2L: text Chat and Responses verified stable; the image request no longer crashes, but output-quality overgeneration remains tracked separately.
- Packaged installed /Applications/vMLX.app 1.5.56 verified: app version, bundled engine version, pycache-clean bundle, strict codesign, Gatekeeper Notarized Developer ID, GUI launch, OpenAI Chat, Responses, Anthropic, Ollama, multi-turn recall, prefix/cache hit, cache stats, and JIT soft wake on Gemma 4 12B JANG_4M.
- Sequoia and Tahoe DMGs are Developer ID signed, Apple notarized, stapled, Gatekeeper accepted, and verified by the release DMG verifier.
Known follow-up:
- DSV4 long full-output/code-generation exactness remains deferred for separate runtime-quality clearance.
- Full cross-family real Electron UI live matrix remains tracked separately.
- Structured JSON/XML repair and optional guided schema decoding support are planned for the next release so benchmark/database pipelines can validate, normalize, repair, and retry malformed model structured output.
DMG SHA256:
- Sequoia: 42c053cd2422e72ef74753cbc240a68a319d6c10ff60c105d5ed4c4c34f34a9c
- Tahoe: b35e6cb55ca0f7e50a9a4a8733f111ea3df070ccf32caa87510c8027f16fb2f2
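
A downloaded DMG can be checked against the digests above with a short script. The sketch below uses only the Python standard library; the helper names are illustrative, and the file path is whatever location the DMG was saved to.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Stream the file in chunks so large DMGs are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare the file's SHA256 with a published hex digest, ignoring case."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Call `matches_published_digest` with the local DMG path and the Sequoia or Tahoe digest listed above; a `False` result means the download should not be installed.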