@jansel jansel commented Nov 7, 2025

stack-info: PR: #1095, branch: jansel/stack/218
jansel added a commit that referenced this pull request Nov 7, 2025
stack-info: PR: #1095, branch: jansel/stack/218
@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Nov 7, 2025
@jansel jansel changed the base branch from jansel/stack/217 to main November 7, 2025 01:34
@jansel jansel changed the base branch from main to jansel/stack/217 November 7, 2025 01:35

@oulgen oulgen left a comment

Can you add an example of what the output looks like?

jansel commented Nov 7, 2025

@jansel jansel force-pushed the jansel/stack/218 branch 2 times, most recently from de3369d to d618e77 Compare November 7, 2025 04:42
@jansel jansel changed the base branch from jansel/stack/217 to main November 7, 2025 04:42
@jansel jansel merged commit 06cea84 into main Nov 7, 2025
13 of 15 checks passed
FranciscoThiesen added a commit to FranciscoThiesen/helion that referenced this pull request Nov 7, 2025
…Search

Ran a comprehensive benchmark comparing three autotuning algorithms across
three diverse GPU kernels, using the CSV logging feature from PR pytorch#1095.

Results:
- MatMul-1024 (compute-bound): PatternSearch won (0.01744ms)
- GELU-1M (bandwidth-bound): DE-Surrogate won (0.00653ms)
- FusedReLUAdd-1M (memory-bound): 3-way tie (0.00643ms)

Outputs:
- 3 convergence plots comparing all algorithms
- 9 CSV logs with per-config metrics (timestamp, perf, compile time)
- Comprehensive analysis report
- Benchmark script using PR pytorch#1095 autotune_log feature

Total benchmarking time: ~1 hour on NVIDIA H200
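
For readers who want to turn per-config CSV logs like the ones listed above into a convergence curve, here is a minimal sketch. The column names (timestamp, config, perf_ms, compile_time_s) and the sample rows are assumptions made up for illustration. They are not the actual schema or output of the autotune log, and the numbers are not benchmark results.

```python
import csv
import io

# Illustrative sample only: column names and values are hypothetical,
# not the real autotune log schema or measured results.
SAMPLE = """timestamp,config,perf_ms,compile_time_s
0.5,cfg_a,0.0210,1.2
1.9,cfg_b,0.0174,1.4
3.1,cfg_c,0.0198,1.1
"""


def best_so_far(rows):
    """Return (timestamp, running minimum of perf_ms) pairs."""
    best = float("inf")
    curve = []
    for row in rows:
        best = min(best, float(row["perf_ms"]))
        curve.append((float(row["timestamp"]), best))
    return curve


# Parse the CSV text; for a real log file, pass an open file handle instead.
rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Convergence curve: best performance seen up to each timestamp.
curve = best_so_far(rows)

# Fastest configuration overall.
fastest = min(rows, key=lambda r: float(r["perf_ms"]))
```

Plotting the resulting pairs for each search algorithm on the same axes would give one way to compare convergence, assuming the logs share a schema.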