Local-first MLX-native toolkit for discovering, preparing, and fine-tuning small language models on Apple Silicon.
mlx-lab provides:

- a Python library (`mlx_lab`)
- a CLI (`mlx-lab`)
- deterministic data-cleaning and LoRA run workflows
- reproducibility artifacts for replay and run comparison
```
                        +----------------------+
                        | Hugging Face Hub API |
                        +----------+-----------+
                                   |
                                   | model/dataset metadata
                                   v
+------------------+   +-----------+------------+   +-------------------+
| Local raw data   +-->+ mlx-lab CLI + library  +-->+ Cleaned JSONL     |
| JSON / JSONL     |   | (model/dataset/data/   |   | prompt/completion |
+------------------+   | train/run commands)    |   +---------+---------+
                       +-----------+------------+             |
                                   |               train lora |
                                   v                          v
                       +-----------+------------+   +---------+---------+
                       | Runtime + preflight    +-->+ Run directory     |
                       | checks (macOS arm64,   |   | metrics,          |
                       | dataset validity, deps)|   | checkpoints,      |
                       +------------------------+   | manifests, state  |
                                                    +-------------------+
```
```
1) Discover candidates
   mlx-lab model search / dataset search
         |
         v
2) Inspect one model + dataset
   mlx-lab model inspect / dataset inspect
         |
         v
3) Clean raw data to canonical JSONL
   mlx-lab data clean
         |
         v
4) Run LoRA training (mlx or simulated)
   mlx-lab train lora
         |
         v
5) Replay or compare runs
   mlx-lab run replay / run compare
```
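The cleaning step normalizes heterogeneous raw records into one canonical prompt/completion object per JSONL line. The real rules live behind `mlx-lab data clean`; the sketch below is a minimal stand-in, and the accepted field names (`instruction`/`output` as aliases) and the dedup/sort policy are assumptions, not mlx-lab's documented behavior:

```python
import json

def clean_records(raw_records):
    """Normalize raw records into canonical prompt/completion pairs.

    Hypothetical rules: strip whitespace, drop incomplete or duplicate
    pairs, and sort so the output byte stream is deterministic.
    """
    seen = set()
    cleaned = []
    for rec in raw_records:
        prompt = (rec.get("instruction") or rec.get("prompt") or "").strip()
        completion = (rec.get("output") or rec.get("completion") or "").strip()
        if not prompt or not completion:
            continue  # skip incomplete examples
        key = (prompt, completion)
        if key in seen:
            continue  # skip exact duplicates
        seen.add(key)
        cleaned.append({"prompt": prompt, "completion": completion})
    cleaned.sort(key=lambda r: (r["prompt"], r["completion"]))  # stable order
    return cleaned

def write_jsonl(records, path):
    """Write one JSON object per line (the canonical cleaned format)."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")

raw = [
    {"instruction": "Say hi", "output": "Hi!"},
    {"instruction": "Say hi", "output": "Hi!"},  # duplicate, dropped
    {"instruction": "", "output": "orphan"},     # incomplete, dropped
]
print(clean_records(raw))
```

Determinism matters here because the run fingerprint covers the cleaned dataset bytes: the same raw input must always produce the same JSONL output.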
- Docs index: `docs/README.md`
- Quickstart: `docs/quickstart.md`
- Architecture and internals: `docs/architecture.md`
- CLI command reference: `docs/cli-reference.md`
- Release process: `docs/release-checklist.md`
- macOS only
- Apple Silicon only
- Python 3.10+
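These requirements are the kind of thing the runtime preflight verifies before training. A minimal sketch of such a check, assuming the macOS/arm64/Python-version rules above (the function name and messages are illustrative, not mlx-lab's API):

```python
import platform
import sys

def preflight_runtime(os_name, machine, version_info):
    """Return a list of requirement violations; an empty list means OK.

    Illustrative only: mlx-lab's real preflight runs inside `train lora`
    and records its report in preflight.json.
    """
    problems = []
    if os_name != "darwin":
        problems.append(f"macOS required, got {os_name}")
    if machine != "arm64":
        problems.append(f"Apple Silicon (arm64) required, got {machine}")
    if version_info < (3, 10):
        problems.append("Python 3.10+ required")
    return problems

# Check the current interpreter/host:
issues = preflight_runtime(sys.platform, platform.machine(), sys.version_info)
```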
```
uv sync --frozen
uv run mlx-lab --help
```

```
mlx-lab
|-- model
|   |-- search
|   `-- inspect
|-- dataset
|   |-- search
|   `-- inspect
|-- data
|   `-- clean
|-- train
|   `-- lora
`-- run
    |-- replay
    `-- compare
```
`mlx-lab train lora` writes reproducibility artifacts into each run directory:

- `run_manifest.json`: model/dataset fingerprint, effective config hash, backend, and environment snapshot.
- `metrics.jsonl`: per-step structured logs (`step`, `loss`, `throughput_tokens_per_s`, `learning_rate`, `timestamp`).
- `checkpoints/`: periodic adapter checkpoints for resume/replay.
- `train_config.resolved.json`: resolved config plus config hash.
- `preflight.json`: dataset/runtime validation report.
- `run_state.json`: latest step/checkpoint summary.
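Because `metrics.jsonl` is plain line-delimited JSON, a run can be inspected with nothing but the standard library. A sketch using the field names listed above (the summary logic is ours, not what `run compare` computes):

```python
import json

def load_metrics(path):
    """Parse the per-step records from a metrics.jsonl file."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

def summarize(steps):
    """Reduce per-step logs to first/final loss and mean throughput."""
    losses = [s["loss"] for s in steps]
    tput = [s["throughput_tokens_per_s"] for s in steps]
    return {
        "steps": len(steps),
        "first_loss": losses[0],
        "final_loss": losses[-1],
        "mean_throughput_tokens_per_s": sum(tput) / len(tput),
    }

# Example records shaped like the metrics.jsonl fields above:
steps = [
    {"step": 1, "loss": 2.31, "throughput_tokens_per_s": 950.0, "learning_rate": 2e-4},
    {"step": 2, "loss": 2.05, "throughput_tokens_per_s": 1010.0, "learning_rate": 2e-4},
]
print(summarize(steps))
```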
Deterministic defaults:

- `max_steps=50`
- `checkpoint_interval=10`
- `learning_rate=2e-4`
- `batch_size=4`
- `lora_rank=16`
- `seed=7`
Determinism limits:
- Requires identical cleaned dataset bytes and effective config.
- Numeric behavior can differ across backend/library versions.
- Resume and replay depend on unchanged checkpoint artifacts.
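The first limit, byte-identical dataset plus identical effective config, can be checked up front with plain hashing. A sketch of that idea, assuming SHA-256 over raw bytes and a canonical JSON serialization (mlx-lab's actual fingerprint and config-hash formats may differ):

```python
import hashlib
import json

def dataset_fingerprint(path):
    """SHA-256 over the cleaned dataset's exact bytes; any byte change
    changes the fingerprint and breaks replay guarantees."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

def config_hash(config):
    """Hash an effective config with stable key order, so semantically
    identical configs hash identically regardless of insertion order."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

cfg = {"max_steps": 50, "learning_rate": 2e-4, "batch_size": 4,
       "lora_rank": 16, "seed": 7}
# Key order does not affect the hash:
assert config_hash(cfg) == config_hash(dict(reversed(list(cfg.items()))))
```

Comparing these two values before a replay catches the "same config, different bytes" failure mode early, before any training time is spent.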
- Source package: `src/mlx_lab/`
- CLI entrypoint: `mlx-lab`
- Tests: `tests/`
Run tests:

```
uv run python -m unittest discover -s tests -p "test_*.py"
uv run python -m unittest tests/test_release_packaging.py -v
```