
Conversation

@LoserCheems
Collaborator

Updates terminology throughout the performance benchmark to use "SDPA" (Scaled Dot Product Attention) instead of "Flash Attention" for consistency with PyTorch's official naming convention.

Changes function names, variable references, command-line options, and display text to reflect that the baseline implementation uses PyTorch's SDPA rather than specifically Flash Attention, improving clarity about the actual underlying implementation being benchmarked.

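For context, here is a minimal sketch (toy shapes, not the benchmark's actual code) of the PyTorch SDPA baseline being renamed: torch.nn.functional.scaled_dot_product_attention is a dispatcher that may route to the Flash Attention, memory-efficient, or math backend depending on hardware and inputs, which is why "SDPA" is the more accurate label for the baseline.

    import torch
    import torch.nn.functional as F

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    # Toy (batch, heads, seq_len, head_dim) tensors.
    q = torch.randn(1, 8, 128, 64, device=device, dtype=dtype)
    k = torch.randn(1, 8, 128, 64, device=device, dtype=dtype)
    v = torch.randn(1, 8, 128, 64, device=device, dtype=dtype)

    # PyTorch selects the backend (Flash Attention, memory-efficient, or math)
    # automatically, so this call benchmarks SDPA, not Flash Attention per se.
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)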
@LoserCheems requested review from Copilot and removed request for Copilot July 10, 2025 12:56
Contributor

Copilot AI left a comment


Pull Request Overview

Updates terminology from “Flash Attention” to “SDPA” across the benchmark suite to align with PyTorch’s official naming and improve clarity.

  • Renamed function, variables, flags, and display text from Flash Attention to SDPA.
  • Adjusted command-line options (--test-type) and titles/headers in output.
  • Updated comments and docstrings to reference SDPA baseline.
Comments suppressed due to low confidence (3)

benchmarks/benchmark_forward_performance.py:535

  • [nitpick] Consider renaming the variable run_flash to run_sdpa (and updating its usage) to better reflect that it's selecting SDPA runs rather than Flash Attention.
    run_flash = test_type in ['all', 'sdpa', 'sdpa-vs-cuda', 'sdpa-vs-triton', 'sdpa-vs-flex']
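  The rename the reviewer suggests would look roughly like this (run_sdpa is the suggested name, not necessarily what was merged):

    run_sdpa = test_type in ['all', 'sdpa', 'sdpa-vs-cuda', 'sdpa-vs-triton', 'sdpa-vs-flex']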

benchmarks/benchmark_forward_performance.py:813

  • [nitpick] The tuple for 'sdpa' still references 'flash_attention'; consider renaming it (e.g., 'sdpa_attention') or matching the new function name scaled_dot_product_attention_cuda for consistency.
            'sdpa': ('flash_attention', results['flash_attention_status'], results['flash_attention_times']),
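  One way the suggested rename could read, assuming the results dict keys are renamed alongside the label (illustrative only; the merged code may differ):

            'sdpa': ('sdpa_attention', results['sdpa_attention_status'], results['sdpa_attention_times']),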

benchmarks/benchmark_forward_performance.py:172

  • Update the docstring to reflect that this function returns a tuple (attn_outputs, time_ms) or ("OOM", 0) rather than just the attention output, so the documentation matches the actual return values.
    Returns:
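  A possible wording for the corrected Returns section, based on the return values the reviewer describes (illustrative, not the merged docstring):

    Returns:
        tuple: (attn_outputs, time_ms) on success, or ("OOM", 0) when the run
            hits an out-of-memory error.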

@LoserCheems merged commit bf4dddc into main Jul 10, 2025