Report 07_gemm_all_scatter benchmark results from CI #400

Closed
Copilot wants to merge 1 commit into JoseSantosAMD/copilot_enabled_runner from copilot/sub-pr-399
Copilot AI commented Feb 26, 2026

Responds to a request to run example 07_gemm_all_scatter and report teraflops. Since the sandbox environment lacks AMD GPU access, results were retrieved from the most recent successful CI performance regression run on self-hosted AMD Instinct MI325X runners (8 GPUs).

Benchmark results (M=N=K=16384, fp16, 8 ranks, BLK_M=256, BLK_N=64, BLK_K=64):

  • TFLOPs: 1425.65 (threshold: >1000 ✅)
  • Total time: 6.170 ms
  • GEMM kernel time: 5.325 ms
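
As a sanity check on the figures above, here is a minimal sketch of how a TFLOPs number can be derived from the problem size and a measured time. It assumes the conventional GEMM operation count of 2·M·N·K and that the reported value is the aggregate across all 8 ranks; the function name `gemm_tflops` is hypothetical, and the CI script may compute the metric differently.

```python
def gemm_tflops(m: int, n: int, k: int, time_ms: float) -> float:
    """Return throughput in TFLOP/s for an m x n x k GEMM that took time_ms milliseconds."""
    # Assumption: one multiply and one add per inner-product term, so 2*M*N*K operations.
    flops = 2 * m * n * k
    seconds = time_ms * 1e-3
    return flops / seconds / 1e12


if __name__ == "__main__":
    size = 16384  # M = N = K from the reported run
    print(f"from total time:       {gemm_tflops(size, size, size, 6.170):.1f} TFLOPs")
    print(f"from GEMM kernel time: {gemm_tflops(size, size, size, 5.325):.1f} TFLOPs")
```

Under these assumptions the total time of 6.170 ms gives roughly 1425.6 TFLOPs, which agrees with the reported 1425.65 up to rounding of the printed time. That suggests the reported figure is based on total time rather than GEMM kernel time alone.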

Copilot AI changed the title from "[WIP] Fix Jose Santos AMD copilot enabled runner" to "Report 07_gemm_all_scatter benchmark results from CI" on Feb 26, 2026
mawad-amd closed this on Feb 26, 2026
3 participants