
Fix KDA equivalence tests and add accelerate dependency #488

Merged
jlamypoirier merged 1 commit into main from jlp_fix_external_model_tests
Apr 23, 2026

@jlamypoirier
Collaborator

Summary

  • KDA gate activation: the kda_mixer_config test fixture was missing "activation": "sigmoid" in its normalization config. Apriel2 defaults to silu, but FLA's KimiDeltaAttention hardcodes sigmoid, which caused a max diff of ~0.5 and 324 test failures on GPU. Fixed by making the test explicit about the activation it is testing against.
  • Mode alignment: removed the forced mode = "chunk" on both models, since FLA and Apriel2 share the same auto-heuristic (fused_recurrent for seq_len <= 64 in eval); the override was always silently ignored on Apriel2's side.
  • accelerate dependency: transformers 4.57 auto-detects the torch CUDA context and sets device_map implicitly, which in turn requires accelerate. Added accelerate>=1.4.0 to the HUGGINGFACE extras in setup.cfg.
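
To make the gate activation point concrete, here is a minimal standard-library sketch of how far apart silu and sigmoid are on the same gate input. The function names and the sample inputs are hypothetical and chosen only for illustration; this is not the code from test_mixer_equivalence.py, FLA, or Apriel2, and it does not attempt to reproduce the ~0.5 figure measured on the full mixer output.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def silu(x: float) -> float:
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x * sigmoid(x)


def max_gate_gap(gate_inputs: list[float]) -> float:
    """Largest absolute difference between the two activations on the inputs."""
    return max(abs(silu(g) - sigmoid(g)) for g in gate_inputs)


# Hypothetical gate pre-activations, for illustration only.
gate_inputs = [-2.0, -1.0, 0.0, 1.0, 2.0]
gap = max_gate_gap(gate_inputs)

# At x = 0 the two gates already disagree: silu gives 0.0, sigmoid gives 0.5.
print(f"silu(0)={silu(0.0)}, sigmoid(0)={sigmoid(0.0)}, max gap={gap:.4f}")
```

Because the two activations disagree on almost every input (they coincide only at x = 1), an equivalence test that leaves the activation implicit compares two different functions, which is why the fixture now names sigmoid explicitly.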

Test plan

  • All 2117 remote GPU tests pass, 42 skipped, 0 failures
  • All 371 local CPU tests pass

🤖 Generated with Claude Code

…endency

- Add "activation": "sigmoid" to kda_mixer_config normalization in
  test_mixer_equivalence.py to match FLA's hardcoded sigmoid gate;
  Apriel2's default is silu but the test must be explicit about which
  activation it's comparing against.
- Align fla_kda.mode with Apriel2's auto-heuristic (fused_recurrent for
  seq_len<=64 in eval) instead of forcing chunk mode, which both
  implementations override anyway.
- Add accelerate>=1.4.0 to HUGGINGFACE deps; transformers 4.57 now
  auto-detects a CUDA device context and requires accelerate when
  device_map is set implicitly.
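
The mode heuristic described above can be sketched as follows. This is an illustration under stated assumptions, not the FLA or Apriel2 source: resolve_mode and SHORT_SEQUENCE_THRESHOLD are hypothetical names, and only the rule itself (fused_recurrent for seq_len <= 64 in eval) comes from the description.

```python
# Threshold quoted in the description; the constant name is hypothetical.
SHORT_SEQUENCE_THRESHOLD = 64


def resolve_mode(configured_mode: str, seq_len: int, training: bool) -> str:
    """Return the mode that would actually run under the described heuristic.

    Short sequences in eval always use the fused recurrent path, so a
    configured mode such as "chunk" has no effect in that case.
    """
    if not training and seq_len <= SHORT_SEQUENCE_THRESHOLD:
        return "fused_recurrent"
    return configured_mode


# A forced "chunk" setting is overridden for short eval sequences ...
print(resolve_mode("chunk", seq_len=32, training=False))
# ... and only takes effect for longer sequences or in training.
print(resolve_mode("chunk", seq_len=128, training=False))
print(resolve_mode("chunk", seq_len=32, training=True))
```

Under this rule, forcing chunk mode in a short-sequence eval test changes nothing about which path runs, so removing the override keeps the test configuration honest about what is being compared.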

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jlamypoirier jlamypoirier merged commit b901d8e into main Apr 23, 2026
2 checks passed
@jlamypoirier jlamypoirier deleted the jlp_fix_external_model_tests branch April 23, 2026 00:29