
Refactor testing framework, add ring attention benchmark #38

Merged
MasterJH5574 merged 5 commits into mlc-ai:main from haok1402:0512-refactor-test-bench on May 13, 2026

Conversation

@haok1402 (Collaborator) commented May 13, 2026

  • Removed the legacy testing scripts for the MLA operators, which FA4 now handles.
  • Refactored the testing framework to launch torch.distributed via mp.spawn with pytest integration, so future operator tests can follow this pattern. It has yet to be extended for unit tests that require a multi-node setup.
  • Revised the ring attention benchmark and test:
    • Prepares for the next PR, which will address workload balancing for standard ring attention.
    • The benchmark captures an nsys profile of one iteration per scenario (bracketed by cudaProfilerStart/Stop).
    • Added a real-scenario benchmark case: Qwen3-30B-A3B GQA dimensions at 32K context with CP=4.
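The mp.spawn-with-pytest pattern described above can be sketched as follows. This is a minimal single-node illustration using the gloo backend, not the repo's actual harness; the names `_worker` and `test_all_reduce` are hypothetical, as is the choice of port.

```python
# Sketch: launching a torch.distributed test via mp.spawn so that
# pytest can collect it as an ordinary test function. Assumes a
# single node; helper names are illustrative only.
import os

import torch
import torch.distributed as dist
import torch.multiprocessing as mp

WORLD_SIZE = 2


def _worker(rank: int, world_size: int) -> None:
    # Each spawned process initializes its own process group.
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29512"  # arbitrary free port
    dist.init_process_group("gloo", rank=rank, world_size=world_size)
    t = torch.ones(1) * rank
    dist.all_reduce(t)  # default op is SUM across ranks
    assert t.item() == sum(range(world_size))
    dist.destroy_process_group()


def test_all_reduce() -> None:
    # pytest collects this like any other test; mp.spawn forks
    # `nprocs` workers and re-raises any worker failure, so an
    # assertion error in a worker fails the test.
    mp.spawn(_worker, args=(WORLD_SIZE,), nprocs=WORLD_SIZE, join=True)


if __name__ == "__main__":
    test_all_reduce()
```

Because mp.spawn re-raises worker exceptions in the parent, a failed assertion on any rank surfaces as a normal pytest failure.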
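The cudaProfilerStart/Stop bracketing mentioned above can be sketched like this. It is a hedged illustration, not the benchmark's actual code: `profiled_iteration` is a hypothetical helper, and the CPU fallback is added here so the sketch runs without a GPU.

```python
# Sketch: bracket exactly one iteration with cudaProfilerStart/Stop
# so that `nsys profile --capture-range=cudaProfilerApi` records only
# that iteration. Helper name and structure are illustrative.
import torch


def profiled_iteration(fn, *args):
    fn(*args)  # warm-up iteration, outside the capture range
    if torch.cuda.is_available():
        torch.cuda.synchronize()
        torch.cuda.profiler.start()  # cudaProfilerStart
        out = fn(*args)
        torch.cuda.synchronize()
        torch.cuda.profiler.stop()   # cudaProfilerStop
    else:
        out = fn(*args)  # no GPU: plain call, nothing to profile
    return out
```

With this bracketing, the nsys report contains a single clean iteration per scenario rather than warm-up noise.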


@gemini-code-assist (Bot) left a comment

Code Review

This pull request introduces a performance benchmarking suite and a structured testing framework for ring attention. Key additions include a Python benchmark script and a shell launcher for latency measurement and profiling, along with a new unit test suite that validates ring attention against a dense baseline. The PR also refactors distributed testing utilities and updates model import paths. Reviewer feedback focuses on improving robustness and accuracy, specifically by suggesting assertions for sequence length divisibility, optimizing CUDA event management in benchmarks, enhancing CLI argument validation, and applying Bash quoting best practices.
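The sequence-length divisibility assertion the review suggests could look like the following. This is a hedged sketch: `shard_sequence` is a hypothetical helper, not a function from the PR.

```python
# Sketch: fail fast when the sequence length does not divide evenly
# across context-parallel (CP) ranks, rather than silently producing
# uneven shards. Helper name is illustrative.
import torch


def shard_sequence(x: torch.Tensor, cp_size: int, dim: int = 1):
    seq_len = x.shape[dim]
    assert seq_len % cp_size == 0, (
        f"sequence length {seq_len} is not divisible by cp_size={cp_size}"
    )
    return x.chunk(cp_size, dim=dim)
```

Asserting up front gives a clear error at test setup instead of a shape mismatch deep inside the ring attention loop.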

Comment thread tests/operators/test_ring_attention.py
Comment thread benchmarks/operators/bench_ring_attention.py
Comment thread benchmarks/operators/bench_ring_attention.py
Comment thread benchmarks/operators/bench_ring_attention.py
Comment thread benchmarks/operators/bench_ring_attention.sh
@MasterJH5574 MasterJH5574 merged commit c76a591 into mlc-ai:main May 13, 2026
1 check passed


2 participants