v0.2.0 - EXL3 Converter on Apple Silicon

@beamivalice released this 18 Jun 12:10
· 30 commits to master since this release

Inference

  • MiniCPM5-1B EXL3 support (model_type llama)
  • ~152 tok/s greedy decode on M5 Max; ~0.9 GB resident memory

Converter (ponyexl3-convert)

  • HF → EXL3 conversion on Metal: trellis search, Hessian/LDLQ, regularization, calibration, allocation
  • Full-model MiniCPM5-1B conversion in ~7 min (direct path)
  • KL divergence (KLD) vs bf16 is on par with turboderp/MiniCPM5-1B-exl3 at 4.00bpw (KLD 0.0422 vs 0.0428)
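
For readers unfamiliar with the KLD figure above, here is a minimal sketch of how a mean per-token KL divergence between a reference model and a quantized model can be computed from their logits. It is not PonyExl3's evaluation code. The function names (`log_softmax`, `token_kld`, `mean_kld`), the KL(reference || quantized) direction, and the use of nats are all assumptions made for illustration; the release does not state which convention or evaluation data produced its numbers.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one row of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def token_kld(ref_logits, quant_logits):
    """KL(P_ref || P_quant) for a single token position, in nats.

    Assumption: the reference (e.g. bf16) distribution is P and the
    quantized distribution is Q.
    """
    log_p = log_softmax(ref_logits)
    log_q = log_softmax(quant_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def mean_kld(ref_rows, quant_rows):
    """Average the per-token KL divergence over all token positions."""
    klds = [token_kld(r, q) for r, q in zip(ref_rows, quant_rows)]
    return sum(klds) / len(klds)


if __name__ == "__main__":
    # Toy logits over a 3-token vocabulary at two positions (made-up values).
    ref = [[2.0, 1.0, 0.1], [0.5, 0.5, 3.0]]
    quant = [[1.9, 1.1, 0.2], [0.4, 0.7, 2.8]]
    print(f"mean KLD: {mean_kld(ref, quant):.6f}")
```

Identical logits give a divergence of zero, and the value grows as the quantized model's next-token distribution drifts from the reference, so lower is better.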

Install

pip install "ponyexl3 @ git+https://github.com/beamivalice/PonyExl3.git@v0.2.0"
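
After installing, one way to confirm that the package landed in the active environment is to query its installed metadata with the standard library. This is a small sketch, not part of PonyExl3. The helper name `installed_version` is hypothetical, the distribution name `ponyexl3` is taken from the pip command above, and the version string reported depends on the package's own metadata.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="ponyexl3"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found if found is not None else "ponyexl3 is not installed in this environment")
```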

Full changelog: CHANGELOG.md