Fix KDA equivalence tests and add accelerate dependency by jlamypoirier · Pull Request #488 · ServiceNow/Fast-LLM

jlamypoirier · 2026-04-23T00:28:40Z

Summary

- KDA gate activation: the kda_mixer_config test fixture was missing "activation": "sigmoid" in its normalization config. Apriel2 defaults to silu, but FLA's KimiDeltaAttention hardcodes sigmoid, which caused a max diff of ~0.5 and 324 test failures on GPU. Fixed by making the test explicit about the activation it compares against.
- Mode alignment: removed the forced mode = "chunk" on both models. FLA and Apriel2 share the same auto-heuristic, which selects fused_recurrent for seq_len <= 64 in eval, so the override was always silently ignored on Apriel2's side.
- accelerate dependency: transformers 4.57 introduced auto-detection of the torch CUDA context and sets device_map implicitly, which in turn requires accelerate. Added accelerate>=1.4.0 to the HUGGINGFACE extras in setup.cfg.
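
To illustrate why a silu/sigmoid mismatch in the gate cannot pass an equivalence test, here is a minimal pure-Python sketch comparing the two activations pointwise. It is not the gate computation from FLA or Apriel2, and it does not reproduce the diff measured in the tests; the helper names (sigmoid, silu, max_abs_diff) and the input grid are assumptions made for illustration only.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def silu(x: float) -> float:
    """SiLU (swish), defined as x * sigmoid(x)."""
    return x * sigmoid(x)


def max_abs_diff(xs):
    """Largest pointwise gap between the two activations over xs."""
    return max(abs(sigmoid(x) - silu(x)) for x in xs)


# Hypothetical input grid on [-3, 3]; real gate pre-activations will differ.
xs = [i / 10.0 for i in range(-30, 31)]

for x in (-1.0, 0.0, 1.0, 2.0):
    print(f"x={x:+.1f}  sigmoid={sigmoid(x):.4f}  silu={silu(x):.4f}")
print("max |sigmoid - silu| on grid:", max_abs_diff(xs))
```

The two functions agree only at x = 1, so any tolerance tight enough for an equivalence check will flag the mismatch unless both sides use the same activation.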

Test plan

All 2117 remote GPU tests pass, 42 skipped, 0 failures
All 371 local CPU tests pass

🤖 Generated with Claude Code

…endency

- Add "activation": "sigmoid" to kda_mixer_config normalization in test_mixer_equivalence.py to match FLA's hardcoded sigmoid gate; Apriel2's default is silu but the test must be explicit about which activation it's comparing against.
- Align fla_kda.mode with Apriel2's auto-heuristic (fused_recurrent for seq_len<=64 in eval) instead of forcing chunk mode, which both implementations override anyway.
- Add accelerate>=1.4.0 to HUGGINGFACE deps; transformers 4.57 now auto-detects a CUDA device context and requires accelerate when device_map is set implicitly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
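
The mode auto-heuristic mentioned in the commit message can be sketched roughly as follows. This is an assumption-laden illustration, not code from FLA or Apriel2: the function name select_mode and its parameters are hypothetical, and the fallback to the requested mode outside the short-sequence eval case is assumed rather than stated in this PR.

```python
def select_mode(requested_mode: str, seq_len: int, training: bool) -> str:
    """Hypothetical sketch of the mode heuristic described in this PR.

    In eval with seq_len <= 64, fused_recurrent is chosen regardless of the
    requested mode. Otherwise the requested mode is kept (an assumption).
    """
    if not training and seq_len <= 64:
        return "fused_recurrent"
    return requested_mode


# A forced "chunk" is overridden for short eval sequences,
# which is why the explicit override in the test had no effect there.
print(select_mode("chunk", seq_len=32, training=False))
print(select_mode("chunk", seq_len=128, training=False))
```

Under this reading, forcing mode = "chunk" in a short-sequence eval test changes nothing, so dropping the override keeps the test configuration honest about what actually runs.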

jlamypoirier merged commit b901d8e into main Apr 23, 2026
2 checks passed

jlamypoirier deleted the jlp_fix_external_model_tests branch April 23, 2026 00:29

jlamypoirier merged 1 commit into main from jlp_fix_external_model_tests
