Patch release.
If you fine-tuned a 4-bit base (e.g. Llama-3.2-1B-Instruct-4bit) and saved with model.save_pretrained_merged(...), the reloaded merged model could behave like the base model, as if the fine-tune had been lost, even though inference right after training looked correct. This is now fixed. If you hit this, re-merge after upgrading; your adapters were fine all along.
Upgrade
uv pip install -U mlx-tune
No API changes. The default save_method="merged_16bit" now does the right thing.
Affected before fix
Merging a LoRA adapter into a quantized base with save_pretrained_merged. Light fine-tunes were the most likely to be affected. A strong fine-tune (higher LR / more steps) often survived the 4-bit round-trip, which is why the bug was intermittent.
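To see why a light fine-tune is more fragile than a strong one, here is a minimal sketch of a 4-bit round-trip on a single group of weights. It is a toy illustration and not mlx-tune's merge code: the helper names (quantize_codes, merge), the min/max quantization scheme, and all the numbers are assumptions chosen for the example.

```python
# Toy illustration only: the quantization scheme and the values are
# assumptions for this sketch, not mlx-tune internals.

def quantize_codes(weights, levels=16):
    """Map each weight to the nearest of `levels` evenly spaced steps
    between the group's min and max (4-bit means 16 levels)."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / (levels - 1)
    return [round((w - lo) / scale) for w in weights]

def merge(base, delta):
    """Add an adapter update to the base weights, element by element."""
    return [b + d for b, d in zip(base, delta)]

# One hypothetical group of base weights; the 4-bit step size here is 0.1.
base = [-0.75, -0.35, 0.05, 0.35, 0.75]

# Hypothetical adapter updates: a light one (well under half a step)
# and a strong one (around a full step).
light_delta = [0.0, 0.02, -0.02, 0.01, 0.0]
strong_delta = [0.0, 0.12, -0.12, 0.08, 0.0]

base_codes = quantize_codes(base)
light_codes = quantize_codes(merge(base, light_delta))
strong_codes = quantize_codes(merge(base, strong_delta))

# The light update rounds back to the base codes; the strong one does not.
print("light fine-tune survives:", light_codes != base_codes)
print("strong fine-tune survives:", strong_codes != base_codes)
```

In this sketch an update smaller than half a quantization step rounds back to the original 4-bit code, while a larger update lands on a different code and is preserved.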
Unaffected
Adapter-only saves (save_pretrained + load_adapter), merging into a non-quantized (bf16/fp16) base, and the VLM / STT / Embedding merge paths (separate code).
Full Changelog: https://github.com/arahim3/mlx-tune/commits/v0.5.1