mlx-eval

Utilities to evaluate MLX quantizations.

See the detailed results for more information.

Usage

# clone the repo
git clone git@github.com:deepsweet/mlx-eval.git
cd mlx-eval/

# install dependencies
uv sync

# prepare an original reference MLX model
# for example, text-only using mlx-lm, or multimodal using mlx-vlm:
uv tool install mlx-vlm --with torchvision
mlx_vlm.convert \
  --hf-path Qwen/Qwen3.6-35B-A3B \
  --mlx-path /path/to/Qwen3.6-35B-A3B-MLX

# prepare a quantized target MLX model
# for example:
mlx_vlm.convert \
  --hf-path Qwen/Qwen3.6-35B-A3B \
  --mlx-path /path/to/Qwen3.6-35B-A3B-MLX-Q4 \
  --quantize \
  --q-bits 4

# compute and store the reference model data into outputs/
# mlx_eval.reference <reference_model_path> <window_count> <max_tokens>
uv run mlx_eval.reference /path/to/Qwen3.6-35B-A3B-MLX 16 8192

# and compare the target quantized model against it
# mlx_eval.compare <target_model_path> <window_count>
uv run mlx_eval.compare /path/to/Qwen3.6-35B-A3B-MLX-Q4 16
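
When several quantizations are evaluated against the same reference, the compare step above can be wrapped in a small loop. The following is a minimal sketch, not part of mlx-eval: the `run` and `compare_all` helpers, the `DRY_RUN` switch, and the `-Q8` model path are hypothetical and exist only for illustration. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: compare several quantized models against one stored reference.
# Assumptions (not from mlx-eval itself): the DRY_RUN switch, the run() and
# compare_all() helpers, and the -Q8 path are hypothetical.

# Same window count as passed to mlx_eval.reference in the example above.
WINDOW_COUNT=16

# Default to printing commands only; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Print a command, then execute it unless DRY_RUN is 1.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then
    "$@"
  fi
}

# Run mlx_eval.compare once per target model path.
compare_all() {
  for target in "$@"; do
    run uv run mlx_eval.compare "$target" "$WINDOW_COUNT"
  done
}

compare_all \
  /path/to/Qwen3.6-35B-A3B-MLX-Q4 \
  /path/to/Qwen3.6-35B-A3B-MLX-Q8
```

Running the script with `DRY_RUN=0` executes each comparison in turn; the reference data must already have been computed with `mlx_eval.reference`.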

Generate chart

uv run results/<model_name>.py

Lint and test

uv sync --group dev
uv run ruff check .
uv run pytest .

License

MIT.

The evaluation prompt is derived from Aes Sedai's combined_all_micro.txt.
