update b200 peak memory bandwidth by danielvegamyhre · Pull Request #4002 · pytorch/ao

danielvegamyhre · 2026-03-05T18:12:29Z

This Twitter post claimed that B200 memory bandwidth is actually 7680 GB/s, not 8192 GB/s, because the memory bus width reported by CUDA drivers has been corrected from 8192 bits to 7680 bits.

I confirmed this claim using Claude to write a simple CUDA C++ file to query the device driver:

=== Device 0: NVIDIA B200 ===
Memory Bus Width: 7680 bits
Total Global Memory: 178.35 GB
L2 Cache Size: 126.50 MB
Compute Capability: 10.0
Number of SMs: 148
Warp Size: 32
Max Threads per Block: 1024
Max Threads per SM: 2048
Shared Memory per Block: 48.00 KB
Shared Memory per SM: 228.00 KB
Registers per Block: 65536
Registers per SM: 65536
ECC Enabled: Yes

I then used nvidia-smi to get the memory clock frequency:

nvidia-smi --query-gpu=clocks.mem,clocks.max.mem --format=csv

clocks.current.memory [MHz], clocks.max.memory [MHz]
3996 MHz, 3996 MHz
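
The clock readout above can also be collected programmatically. The following is a minimal sketch, assuming the CSV layout shown in this PR (one header line, then one row per GPU). `parse_mem_clocks_mhz` is a hypothetical helper for illustration, not part of torchao:

```python
def parse_mem_clocks_mhz(csv_text: str) -> list[tuple[int, int]]:
    """Parse `nvidia-smi --query-gpu=clocks.mem,clocks.max.mem --format=csv` output.

    Returns one (current_mhz, max_mhz) tuple per GPU row.
    """
    lines = [line.strip() for line in csv_text.strip().splitlines() if line.strip()]
    rows = []
    for line in lines[1:]:  # skip the CSV header line
        current, maximum = (field.strip() for field in line.split(","))
        # Each field looks like "3996 MHz"; keep only the integer part.
        rows.append((int(current.split()[0]), int(maximum.split()[0])))
    return rows


# Sample output copied from this PR's description.
sample = """clocks.current.memory [MHz], clocks.max.memory [MHz]
3996 MHz, 3996 MHz
"""
print(parse_mem_clocks_mhz(sample))
```

On a real machine the text would come from running the nvidia-smi command shown above; the sketch only covers parsing so that it can run without a GPU.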

Bandwidth = (7680 bits / 8 bits per byte) * (3996 MHz memory clock * 1e6 Hz per MHz) * 2 (DDR) / 1e12 = 7.672 TB/s
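
The arithmetic can be checked with a short script. This is a sketch using the inputs stated in this PR (7680-bit bus, 3996 MHz memory clock, factor of 2 for DDR). The function name is hypothetical and this is not the code in roofline_utils.py:

```python
def peak_mem_bw_bytes_per_sec(
    bus_width_bits: int, mem_clock_mhz: float, ddr_factor: int = 2
) -> float:
    """Theoretical peak memory bandwidth in bytes per second."""
    bytes_per_transfer = bus_width_bits / 8  # bus width in bytes
    clock_hz = mem_clock_mhz * 1e6  # MHz -> Hz
    return bytes_per_transfer * clock_hz * ddr_factor


# Inputs taken from the driver query and nvidia-smi output in this PR.
b200_bw = peak_mem_bw_bytes_per_sec(bus_width_bits=7680, mem_clock_mhz=3996)
print(f"{b200_bw / 1e12:.3f} TB/s")  # prints "7.672 TB/s"
```

Using the previously reported 8192-bit bus width with the same clock gives about 8.18 TB/s, so the bus-width correction accounts for the difference.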

This PR updates our benchmark/roofline utils accordingly.

pytorch-bot · 2026-03-05T18:12:33Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/4002

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 9 Pending

As of commit 1bfdd0c with merge base d6d423e:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vkuzo · 2026-03-05T18:30:39Z

torchao/testing/training/roofline_utils.py

-        # https://resources.nvidia.com/en-us-blackwell-architecture, page 20
-        # 8.0 TB per second
-        "peak_mem_bw_bytes_sec": 8.0e12,
+        # (7680 memory bus bit width / 8 bits per byte) * (3996 MHz memory clock) * 2 DDR


I think this should definitely keep the source from NVIDIA, and also cite the additional source you mention in the PR summary.

vkuzo · 2026-03-05T18:31:05Z

lg once source is updated!

meta-cla bot added the CLA Signed label Mar 5, 2026

danielvegamyhre requested review from drisspg and vkuzo March 5, 2026 18:12

danielvegamyhre added the topic: bug fix and topic: for developers labels Mar 5, 2026

danielvegamyhre force-pushed the specmarch5 branch from 9df1385 to 00728a5 March 5, 2026 18:22

vkuzo reviewed Mar 5, 2026

vkuzo requested changes Mar 5, 2026

vkuzo approved these changes Mar 5, 2026

danielvegamyhre added the module: core label Mar 5, 2026

update b200 peak memory bandwidth

1bfdd0c

danielvegamyhre force-pushed the specmarch5 branch from 00728a5 to 1bfdd0c March 5, 2026 18:59

danielvegamyhre merged commit df68b82 into main Mar 5, 2026
22 of 23 checks passed
