Updates benchmark test configurations for better coverage #45

LoserCheems · 2025-07-01T03:17:26Z

Adjusts query and key lengths in test configurations to provide more balanced testing scenarios.

Changes small sequence length tests from 4 to 64 tokens to better represent realistic use cases, and modifies the largest configuration to use matching sequence lengths with non-causal attention for improved test diversity.
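As a rough illustration of the configuration layout being adjusted, here is a minimal sketch that gives names to the 7-element tuples used in `benchmarks/benchmark_forward_equivalence.py`. The field names and their order are assumptions inferred from the values quoted in the review on this PR; the PR itself does not state them. The two tuples are the ones quoted in that review.

```python
from typing import NamedTuple


class AttnTestConfig(NamedTuple):
    """Hypothetical names for the 7-tuple layout; the order is assumed, not confirmed by the PR."""
    batch_size: int
    num_heads: int
    num_kv_heads: int
    query_len: int
    key_len: int
    head_dim: int
    is_causal: bool


# The two tuples quoted in the review comments on this PR.
configs = [
    AttnTestConfig(1, 1, 1, 64, 64, 32, True),
    AttnTestConfig(1, 2, 1, 256, 256, 128, True),
]


def has_matching_lengths(cfg: AttnTestConfig) -> bool:
    """True when the query and key sequence lengths are equal."""
    return cfg.query_len == cfg.key_len


for cfg in configs:
    print(cfg, "matching lengths:", has_matching_lengths(cfg))
```

Naming the fields this way makes it easier to see at a glance which position holds the sequence lengths and which holds the causal flag when reading a configuration list.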


Copilot

Pull Request Overview

This PR updates benchmark test configurations to improve coverage by adjusting sequence lengths and attention settings.

- Increased small-sequence tests from 4 to 64 tokens for more realistic scenarios
- Matched query and key lengths and switched the largest test to non-causal attention

Comments suppressed due to low confidence (2)

benchmarks/benchmark_forward_equivalence.py:360

By removing the minimal-length test (query_len=4), we lose an edge-case check for very short sequences; consider retaining at least one small-length test to validate behavior at minimal token counts.

        (1, 1, 1, 64, 64, 32, True),

benchmarks/benchmark_forward_equivalence.py:379

Replacing the largest causal test with a non-causal one removes coverage of causal attention at the maximum sequence length; consider including both causal and non-causal variants for the 512×512 configuration.

        (1, 2, 1, 256, 256, 128, True),
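Both suppressed comments ask for broader coverage: keep a minimal-length case, and test the largest size with and without causal masking. One way to follow them could look like the sketch below. It assumes the last tuple element is the causal flag and that positions 4 and 5 are the query and key lengths. The `with_both_masks` helper, the minimal-length tuple, and the 512×512 tuple are hypothetical and are not taken from the benchmark file.

```python
from itertools import product


def with_both_masks(base_configs):
    """Return every config twice: once causal, once non-causal.

    Assumes the last element of each tuple is the is_causal flag.
    """
    expanded = []
    for cfg, is_causal in product(base_configs, (True, False)):
        expanded.append((*cfg[:-1], is_causal))
    return expanded


base = [
    (1, 1, 1, 4, 4, 32, True),        # hypothetical minimal-length edge case
    (1, 1, 1, 64, 64, 32, True),      # tuple quoted in the review
    (1, 2, 1, 512, 512, 128, True),   # hypothetical 512x512 configuration
]

expanded = with_both_masks(base)
for cfg in expanded:
    print(cfg)
```

Expanding the list programmatically keeps causal and non-causal coverage in step at every sequence length, at the cost of doubling the benchmark run time.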

LoserCheems requested review from Evanwu1125, SNHuan, Copilot and wubingheng111 July 1, 2025 03:17

LoserCheems assigned SNHuan, Evanwu1125, wubingheng111 and LoserCheems Jul 1, 2025

LoserCheems added the bug label Jul 1, 2025

Copilot AI reviewed Jul 1, 2025


LoserCheems merged commit f006b7a into main Jul 1, 2025
