Problem
mlxcel_core::slice_update is a functional op: it wraps mlx::core::slice_update, which returns a NEW array (MLX arrays are immutable) and does NOT mutate its src argument. Several visual-injection / section-assembly loops call it as a bare statement and discard the return value, so result is never updated and the write is silently lost.
This exact bug made Granite 4 Vision completely blind (the model ran text-only) and was fixed in PR #649 (see inject_at_positions in src/models/granitemoehybrid.rs, which now captures the return and has a co-located regression test). The same broken pattern exists at these other sites, copied from the same template:
src/models/qwen3_vl.rs:798 (DeepStack injection loop)
src/models/qwen3_vl_moe.rs:991 (DeepStack injection loop)
src/models/qwen2_vl.rs:205 (section-assembly loop)
src/models/paddleocr_vl.rs:191 (section-assembly loop)
Impact
- qwen3_vl / qwen3_vl_moe: the MAIN image features enter via
inputs_embeds, and DeepStack is only an auxiliary deeper-layer refinement, so the dropped write degrades output quality rather than fully blinding the model. Still a real correctness bug.
- qwen2_vl / paddleocr_vl: these are section-assembly loops (mrope / feature packing); the discarded return may or may not drop data depending on control flow. Verify each before fixing.
Fix
Capture the return: result = mlxcel_core::slice_update(&result, &val, ...). For a contiguous run, collapse the per-row loop to a single slice_update (see the granite4 fix). Add a regression test per model where practical.
Validation
Each fix must be validated on a real checkpoint with a real image, comparing per-op absmeans (q/k/v post-rope, SDPA output) against the mlx-vlm reference. Diagnostic signature of a dropped injection: hidden-state at image positions stays ~0, so the all-positions q/k/v absmean collapses to roughly reference x (n_text / seq_len).
Problem
mlxcel_core::slice_updateis a functional op: it wrapsmlx::core::slice_update, which returns a NEW array (MLX arrays are immutable) and does NOT mutate itssrcargument. Several visual-injection / section-assembly loops call it as a bare statement and discard the return value, soresultis never updated and the write is silently lost.This exact bug made Granite 4 Vision completely blind (the model ran text-only) and was fixed in PR #649 (see
inject_at_positionsinsrc/models/granitemoehybrid.rs, which now captures the return and has a co-located regression test). The same broken pattern exists at these other sites, copied from the same template:src/models/qwen3_vl.rs:798(DeepStack injection loop)src/models/qwen3_vl_moe.rs:991(DeepStack injection loop)src/models/qwen2_vl.rs:205(section-assembly loop)src/models/paddleocr_vl.rs:191(section-assembly loop)Impact
inputs_embeds, and DeepStack is only an auxiliary deeper-layer refinement, so the dropped write degrades output quality rather than fully blinding the model. Still a real correctness bug.Fix
Capture the return:
result = mlxcel_core::slice_update(&result, &val, ...). For a contiguous run, collapse the per-row loop to a singleslice_update(see the granite4 fix). Add a regression test per model where practical.Validation
Each fix must be validated on a real checkpoint with a real image, comparing per-op absmeans (q/k/v post-rope, SDPA output) against the mlx-vlm reference. Diagnostic signature of a dropped injection: hidden-state at image positions stays ~0, so the all-positions q/k/v absmean collapses to roughly
reference x (n_text / seq_len).