Releases · dgrauet/claude-skill-mlx-porting

20 Apr 12:00

dgrauet

v2.1.0

2bad819

v2.1.0 — Pitfall #8: Tekken / Pixtral tokenizer skips the BOS

Latest

Added

Pitfall #8: Tekken / Pixtral tokenizer skips the BOS. When wiring `mlx-lm` Mistral-family text encoders (Ministral3, Mistral Small 3, Pixtral) to a diffusion DiT, `add_special_tokens=True` does NOT auto-prepend the BOS token. Token 0 then enters the attention stack at an out-of-distribution magnitude, and the error compounds layer by layer (in the ERNIE-Image port, it diverged from HF transformers by 100× starting at layer 2). The content tokens remain fine, but the DiT receives conditioning it was not trained on.

The pitfall documents the symptom, the layer-by-layer measurement from the ERNIE-Image port that isolated it, and a one-line fix for the pipeline's `_tokenize` helper.
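The kind of guard such a fix implies can be sketched as follows. This is a minimal illustration, not the skill's actual `_tokenize` change: the helper name `ensure_bos` and the id value `1` are assumptions made for the example, and in a real pipeline the BOS id would be read from the tokenizer itself (for instance a `bos_token_id` attribute on a Hugging Face style tokenizer, if it exposes one).

```python
from typing import List


def ensure_bos(token_ids: List[int], bos_id: int) -> List[int]:
    """Return token_ids with bos_id at position 0, without duplicating it."""
    if token_ids and token_ids[0] == bos_id:
        # BOS is already present, so return an unchanged copy.
        return list(token_ids)
    return [bos_id] + list(token_ids)


# Hypothetical ids: 1 stands in for the BOS id, the rest for content tokens.
BOS_ID = 1
without_bos = [5012, 88, 3]
with_bos = ensure_bos(without_bos, BOS_ID)
print(with_bos)
```

Checking position 0 rather than prepending unconditionally keeps the helper idempotent, so it stays safe if a later tokenizer version starts adding the BOS on its own.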

Why this matters

Pitfall #7 (checkerboard trap, shipped in v2.0.0) gave us the diagnostic procedure. Pitfall #8 is a follow-up trap that the same port surfaced. Both are now codified so future MLX diffusion ports using Mistral-family text encoders won't have to rediscover them.

Skill asset

The `mlx-porting.skill` artifact below is built by the release workflow (introduced in v2.0.0) and can be dropped straight into Claude Code via `/skill install` or unpacked into `~/.claude/skills/`.

Commits

`2bad819` — feat(pitfalls): add #8 — Tekken/Pixtral tokenizer skips BOS

Assets 3

20 Apr 00:23

dgrauet

v2.0.0

e43d7df

v2.0.0 — Rename to mlx-porting + checkerboard pitfall + CI

Breaking

- Skill directory renamed `porting-pytorch-to-mlx/` → `mlx-porting/`. Same move inside the packaged `.skill` tarball.
- Frontmatter `name` updated accordingly: `mlx-porting`.
- Existing v1.0.0 installs (the `~/.claude/skills/porting-pytorch-to-mlx/` layout) will keep working but will NOT receive these updates. Reinstall from source or download the new `mlx-porting.skill` artifact below.

Added

- Pitfall #7: the checkerboard trap, in `references/common-pitfalls.md`. Covers the four recurring causes (`mx.tile` vs `mx.repeat`, pixel-shuffle axis order, text-encoder `hidden_states[-2]` off-by-one, scheduler dtype leaking fp32 into a bf16 DiT) and a three-test diagnostic procedure to run before shipping every port.
- `SKILL.md` upgrades: new reading-time checklist bullet flagging the checkerboard trap, plus a caveat in Step 5 that small-scale random-weight parity is necessary but insufficient.
- Helpers in `scripts/parity_helpers.py`: `detect_checkerboard(image)` (autocorrelation-based) and `noise_decode_check(decode_fn, shape)` to wire the diagnostic as a permanent smoke test.
- GitHub Actions:
  - `ci.yml` validates frontmatter, `evals.json` schema, cross-references, and Python syntax on every push and PR, plus a `.skill` packaging smoke test.
  - `release.yml` auto-builds `mlx-porting.skill` on tag push and attaches it to the release (this release is the first to use it).
- CI badge on the README.
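To make the first of the four causes concrete, here is a small pure-Python sketch of the ordering difference between element-wise repetition and whole-sequence tiling. It assumes `mx.repeat` and `mx.tile` follow the same semantics as their NumPy namesakes; the function names `repeat_each` and `tile_whole` are illustrative only. One well-known way to hit this trap is that PyTorch's `Tensor.repeat` tiles the whole tensor (as `numpy.tile` does), so a literal rename to a NumPy-style `repeat` silently changes the element order.

```python
from typing import List


def repeat_each(xs: List[int], n: int) -> List[int]:
    """Element-wise repetition: each item is duplicated in place."""
    return [x for x in xs for _ in range(n)]


def tile_whole(xs: List[int], n: int) -> List[int]:
    """Whole-sequence repetition: the full list is concatenated n times."""
    return xs * n


row = [1, 2, 3]
print(repeat_each(row, 2))  # [1, 1, 2, 2, 3, 3]
print(tile_whole(row, 2))   # [1, 2, 3, 1, 2, 3]
```

Both results have the same length and the same values, which is why shape checks alone do not catch the swap; only the order differs, and on a spatial axis that reordering is what produces the interleaved pattern.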

Why this matters

Every MLX port I've shipped has hit a checkerboard-looking output at some point, because layer-level parity passes with random weights at small scale while the bug only manifests at production scale. This release codifies the fix: the three-test diagnostic catches 95% of the class in under 90 seconds, and the rule "never tweak sampling parameters to mask a spatial-operator bug" is now part of the checklist.
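As a rough illustration of what an autocorrelation-based check can look like, the sketch below flags a one-pixel checkerboard by testing for strongly negative correlation between neighbouring pixels on both axes. It is not the repository's `detect_checkerboard`: the function names, the NumPy dependency, the grayscale 2D input, and the `-0.5` threshold are all assumptions chosen for the example, and artifacts with a larger period (for example patch-sized blocks) would need a matching shift.

```python
import numpy as np


def lag1_autocorr(image: np.ndarray, axis: int) -> float:
    """Normalized autocorrelation of a 2D image at a shift of one pixel."""
    x = image.astype(np.float64)
    x = x - x.mean()
    denom = float((x * x).sum())
    if denom == 0.0:
        return 0.0  # constant image: nothing to correlate
    a = x[:-1, :] if axis == 0 else x[:, :-1]
    b = x[1:, :] if axis == 0 else x[:, 1:]
    return float((a * b).sum() / denom)


def looks_like_checkerboard(image: np.ndarray, threshold: float = -0.5) -> bool:
    """True when neighbours anti-correlate strongly along both axes."""
    return (lag1_autocorr(image, 0) < threshold
            and lag1_autocorr(image, 1) < threshold)


# Synthetic inputs: an alternating 0/1 board and a smooth horizontal ramp.
board = np.indices((8, 8)).sum(axis=0) % 2
ramp = np.tile(np.arange(8), (8, 1))
print(looks_like_checkerboard(board), looks_like_checkerboard(ramp))
```

Natural images tend to have positive neighbour correlation, so a strongly negative value on both axes is a cheap signal that a spatial operator is interleaving values it should keep adjacent.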

Asset

`mlx-porting.skill` below is produced by the new release workflow. Drop it into Claude Code via `/skill install mlx-porting.skill` or unpack under `~/.claude/skills/`.

Assets 3

19 Apr 16:51

dgrauet

v1.0.0

997dc50

v1.0.0 — Initial release

First public release of the porting-pytorch-to-mlx Claude Code skill.

Install

Download `porting-pytorch-to-mlx.skill` and drop it into Claude Code, or clone the repo and copy the source:

```bash
git clone https://github.com/dgrauet/claude-skill-mlx-porting.git
cp -r claude-skill-mlx-porting/porting-pytorch-to-mlx ~/.claude/skills/
```

What's included

- `SKILL.md`: 7-step porting workflow plus six reading-time traps
- 6 reference files: MLX docs, common pitfalls, attention patterns, weight conversion, parity testing, repo layout
- `scripts/parity_helpers.py`: reusable PyTorch↔MLX helpers
- `evals/evals.json`: 5 representative test cases

Measured performance

- Triggering accuracy: 100% (precision and recall both 1.0 across 20 queries)
- Pass-rate lift vs baseline Opus 4.7: +10 to +25 percentage points on workflow-intensive tasks

See the README for full details.

Assets 3
