Adding a wallclock consistency detection preset by gilbertlee-amd · Pull Request #258 · ROCm/TransferBench

gilbertlee-amd · 2026-04-18T04:53:29Z

Motivation

TransferBench uses GFX wall clock timestamps to time individual Transfers within a kernel, which may require comparing timestamps across the multiple threadblocks that work together on the same Transfer. On AMD hardware, each XCC has its own wall clock counter, and these counters may be slightly out of sync with one another.

This new wallclock preset runs a simple kernel that tries to capture timestamps from the various XCCs at the same moment, then compares the differences between them. The preset is multi-node capable, allowing convenient checking across a cluster of nodes.

Technical Details

The kernel launches one threadblock per XCC, plus one extra threadblock that issues a "go" command telling the other threadblocks when to capture a timestamp. This assumes that threadblocks are assigned to XCCs in round-robin order, one at a time. The timestamps are collected and processed on the host, and the results are printed.
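To make the round-robin assumption concrete, here is a minimal host-side sketch of the placement it implies. It is only a model, not the preset's code: the function name `assign_round_robin` is hypothetical, and treating the extra "go" threadblock as the last block launched is an assumption, since the description does not say which block index plays that role.

```python
def assign_round_robin(num_blocks: int, num_xccs: int) -> list[int]:
    """Return the XCC index each threadblock would land on,
    assuming blocks are dispatched to XCCs one at a time in round-robin order."""
    return [block % num_xccs for block in range(num_blocks)]


NUM_XCCS = 8  # the MI355X example output shows eight XCC columns

# One threadblock per XCC, plus one extra block (assumed here to be the last one).
placement = assign_round_robin(NUM_XCCS + 1, NUM_XCCS)

for block, xcc in enumerate(placement):
    print(f"threadblock {block} -> XCC {xcc}")
```

Under this model the first eight blocks cover each XCC exactly once, and the ninth wraps around to XCC 0. If the hardware dispatched blocks in any other order, two timestamping blocks could share an XCC and the comparison would no longer cover every counter.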

Test Result

Example output (MI355X)

[WallClock Related]
NUM_GPU_DEVICES      =            8 : Limit to using 8 GPUs (per rank)
NUM_ITERATIONS       =           10 : Number of iterations
NUM_WARMUPS          =            3 : Number of warmup iterations
SHOW_ITERATIONS      =            0 : Showing per iteration details. Set to 2 to see raw wallclock values

Running 10 iterations.  Detected wall clock rate of 100000Khz = 0.01 usec per cycle

 Rank GPU Iter Delta(cycles) Delta(usec)
-----------------------------------------
  0    0  AVG      23.00        0.23
  0    1  AVG      24.90        0.25
  0    2  AVG      23.30        0.23
  0    3  AVG      21.30        0.21
  0    4  AVG      22.90        0.23
  0    5  AVG      21.20        0.21
  0    6  AVG      25.10        0.25
  0    7  AVG      22.00        0.22

Minimum Delta detected: 21.20 cycles (0.21 usec)
Maximum Delta detected: 25.10 cycles (0.25 usec)
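The unit conversion and summary lines above can be reproduced with a few lines of arithmetic. This is a sketch based only on the printed output: the helper name `cycles_to_usec` is hypothetical, and the reading that the minimum and maximum are taken over the per-GPU averages is an inference from the numbers shown, not something confirmed from the preset's source.

```python
def cycles_to_usec(cycles: float, clock_khz: float) -> float:
    """Convert wall clock cycles to microseconds.
    1 kHz is 1e-3 cycles per microsecond, so divide by (kHz / 1000)."""
    cycles_per_usec = clock_khz / 1000.0
    return cycles / cycles_per_usec


CLOCK_KHZ = 100000  # "Detected wall clock rate of 100000Khz"

# Per-GPU AVG deltas (in cycles) copied from the table above.
avg_delta_cycles = [23.00, 24.90, 23.30, 21.30, 22.90, 21.20, 25.10, 22.00]

min_delta = min(avg_delta_cycles)
max_delta = max(avg_delta_cycles)

print(f"{cycles_to_usec(1, CLOCK_KHZ):.2f} usec per cycle")
print(f"Minimum Delta detected: {min_delta:.2f} cycles "
      f"({cycles_to_usec(min_delta, CLOCK_KHZ):.2f} usec)")
print(f"Maximum Delta detected: {max_delta:.2f} cycles "
      f"({cycles_to_usec(max_delta, CLOCK_KHZ):.2f} usec)")
```

At 100000 kHz (100 MHz) each cycle lasts 0.01 usec, so a delta of about 21 to 25 cycles corresponds to roughly 0.21 to 0.25 usec.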

Additional information, including raw timestamp values, can be shown by setting SHOW_ITERATIONS (set it to 2 to see raw wallclock values).

Example showing raw timestamps (SHOW_ITERATIONS=2):

[WallClock Related]
NUM_GPU_DEVICES      =            8 : Limit to using 8 GPUs (per rank)
NUM_ITERATIONS       =           10 : Number of iterations
NUM_WARMUPS          =            3 : Number of warmup iterations
SHOW_ITERATIONS      =            2 : Showing per iteration details. Set to 2 to see raw wallclock values

Running 10 iterations.  Detected wall clock rate of 100000Khz = 0.01 usec per cycle

 Rank GPU Iter Delta(cycles) Delta(usec)      XCC 0           XCC 1           XCC 2           XCC 3           XCC 4           XCC 5           XCC 6           XCC 7
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  0    0   0        20          0.20     355750333035786 355750333035796 355750333035800 355750333035802 355750333035782 355750333035789 355750333035782 355750333035787
  0    0   1        24          0.24     355750333048306 355750333048296 355750333048300 355750333048302 355750333048282 355750333048289 355750333048282 355750333048287
  0    0   2        22          0.22     355750333060766 355750333060776 355750333060780 355750333060778 355750333060758 355750333060769 355750333060762 355750333060767
  0    0   3        22          0.22     355750333073606 355750333073616 355750333073620 355750333073618 355750333073606 355750333073609 355750333073598 355750333073603
  0    0   4        22          0.22     355750333086046 355750333086060 355750333086064 355750333086062 355750333086042 355750333086053 355750333086046 355750333086047
  0    0   5        20          0.20     355750333098614 355750333098624 355750333098628 355750333098630 355750333098610 355750333098617 355750333098610 355750333098615
  0    0   6        20          0.20     355750333111094 355750333111108 355750333111108 355750333111110 355750333111094 355750333111101 355750333111090 355750333111095
  0    0   7        24          0.24     355750333123550 355750333123560 355750333123564 355750333123566 355750333123546 355750333123553 355750333123542 355750333123551
  0    0   8        22          0.22     355750333136026 355750333136036 355750333136044 355750333136038 355750333136022 355750333136029 355750333136022 355750333136027
  0    0   9        24          0.24     355750333148446 355750333148456 355750333148460 355750333148462 355750333148438 355750333148449 355750333148442 355750333148447
  0    0  AVG      22.00        0.22
  0    1   0        26          0.26     355750333850367 355750333850373 355750333850369 355750333850376 355750333850386 355750333850393 355750333850390 355750333850384
  0    1   1        26          0.26     355750333862759 355750333862765 355750333862761 355750333862768 355750333862782 355750333862785 355750333862778 355750333862776
  0    1   2        23          0.23     355750333875099 355750333875105 355750333875101 355750333875108 355750333875118 355750333875121 355750333875122 355750333875116
  0    1   3        27          0.27     355750333887447 355750333887453 355750333887449 355750333887456 355750333887474 355750333887465 355750333887470 355750333887464
  0    1   4        25          0.25     355750333899815 355750333899821 355750333899813 355750333899824 355750333899838 355750333899837 355750333899830 355750333899832
  0    1   5        26          0.26     355750333912143 355750333912149 355750333912145 355750333912152 355750333912166 355750333912169 355750333912158 355750333912160
  0    1   6        25          0.25     355750333924471 355750333924477 355750333924469 355750333924480 355750333924490 355750333924493 355750333924494 355750333924484
  0    1   7        26          0.26     355750333936799 355750333936805 355750333936801 355750333936808 355750333936822 355750333936825 355750333936822 355750333936816
  0    1   8        27          0.27     355750333949163 355750333949169 355750333949165 355750333949172 355750333949190 355750333949185 355750333949178 355750333949180
  0    1   9        23          0.23     355750333961491 355750333961497 355750333961493 355750333961500 355750333961514 355750333961513 355750333961506 355750333961508
  0    1  AVG      25.40        0.25
  0    2   0        22          0.22     355750334646365 355750334646379 355750334646387 355750334646383 355750334646367 355750334646372 355750334646366 355750334646375
  0    2   1        24          0.24     355750334658781 355750334658795 355750334658803 355750334658799 355750334658779 355750334658788 355750334658786 355750334658783
  0    2   2        23          0.23     355750334671173 355750334671163 355750334671167 355750334671171 355750334671151 355750334671160 355750334671150 355750334671155
  0    2   3        25          0.25     355750334683557 355750334683571 355750334683579 355750334683575 355750334683559 355750334683564 355750334683554 355750334683563
  0    2   4        21          0.21     355750334695921 355750334695931 355750334695939 355750334695935 355750334695923 355750334695928 355750334695918 355750334695919
  0    2   5        21          0.21     355750334708293 355750334708303 355750334708311 355750334708307 355750334708291 355750334708300 355750334708290 355750334708295
  0    2   6        28          0.28     355750334720685 355750334720679 355750334720683 355750334720691 355750334720663 355750334720676 355750334720670 355750334720667
  0    2   7        21          0.21     355750334733037 355750334733051 355750334733055 355750334733055 355750334733035 355750334733044 355750334733034 355750334733039
  0    2   8        22          0.22     355750334745385 355750334745399 355750334745403 355750334745407 355750334745387 355750334745392 355750334745386 355750334745391
  0    2   9        25          0.25     355750334757741 355750334757755 355750334757763 355750334757759 355750334757743 355750334757748 355750334757738 355750334757747
  0    2  AVG      23.20        0.23
  0    3   0        20          0.20     355750335420016 355750335420031 355750335420036 355750335420030 355750335420024 355750335420016 355750335420020 355750335420018
  0    3   1        19          0.19     355750335432524 355750335432539 355750335432536 355750335432538 355750335432520 355750335432528 355750335432524 355750335432526
  0    3   2        20          0.20     355750335444924 355750335444919 355750335444924 355750335444918 355750335444904 355750335444904 355750335444912 355750335444906
  0    3   3        20          0.20     355750335457276 355750335457291 355750335457292 355750335457286 355750335457272 355750335457276 355750335457280 355750335457282
  0    3   4        23          0.23     355750335469652 355750335469671 355750335469664 355750335469666 355750335469648 355750335469652 355750335469660 355750335469654
  0    3   5        21          0.21     355750335482008 355750335482027 355750335482024 355750335482018 355750335482008 355750335482012 355750335482016 355750335482006
  0    3   6        20          0.20     355750335494380 355750335494395 355750335494396 355750335494390 355750335494384 355750335494376 355750335494380 355750335494382
  0    3   7        19          0.19     355750335506760 355750335506779 355750335506776 355750335506774 355750335506760 355750335506764 355750335506768 355750335506762
  0    3   8        20          0.20     355750335519116 355750335519131 355750335519136 355750335519130 355750335519124 355750335519120 355750335519116 355750335519118
  0    3   9        20          0.20     355750335531660 355750335531675 355750335531676 355750335531670 355750335531664 355750335531656 355750335531660 355750335531666
  0    3  AVG      20.20        0.20
  0    4   0        21          0.21     355750336235450 355750336235453 355750336235455 355750336235449 355750336235462 355750336235463 355750336235469 355750336235470
  0    4   1        24          0.24     355750336248026 355750336248033 355750336248031 355750336248029 355750336248050 355750336248039 355750336248045 355750336248046
  0    4   2        24          0.24     355750336260582 355750336260589 355750336260591 355750336260585 355750336260606 355750336260599 355750336260601 355750336260602
  0    4   3        23          0.23     355750336273146 355750336273153 355750336273151 355750336273149 355750336273162 355750336273167 355750336273169 355750336273162
  0    4   4        24          0.24     355750336285746 355750336285741 355750336285747 355750336285741 355750336285754 355750336285755 355750336285765 355750336285758
  0    4   5        20          0.20     355750336298338 355750336298345 355750336298343 355750336298341 355750336298358 355750336298351 355750336298357 355750336298358
  0    4   6        24          0.24     355750336311506 355750336311513 355750336311515 355750336311509 355750336311526 355750336311519 355750336311525 355750336311530
  0    4   7        23          0.23     355750336324058 355750336324065 355750336324063 355750336324061 355750336324078 355750336324071 355750336324081 355750336324078
  0    4   8        24          0.24     355750336336606 355750336336613 355750336336615 355750336336609 355750336336630 355750336336619 355750336336625 355750336336626
  0    4   9        23          0.23     355750336349158 355750336349153 355750336349147 355750336349149 355750336349170 355750336349163 355750336349169 355750336349166
  0    4  AVG      23.00        0.23
  0    5   0        27          0.27     355750337052912 355750337052923 355750337052917 355750337052919 355750337052937 355750337052929 355750337052934 355750337052939
  0    5   1        26          0.26     355750337065492 355750337065499 355750337065493 355750337065499 355750337065509 355750337065513 355750337065518 355750337065511
  0    5   2        25          0.25     355750337078104 355750337078107 355750337078101 355750337078107 355750337078117 355750337078117 355750337078126 355750337078119
  0    5   3        25          0.25     355750337090732 355750337090739 355750337090733 355750337090739 355750337090757 355750337090749 355750337090750 355750337090751
  0    5   4        25          0.25     355750337103372 355750337103379 355750337103373 355750337103375 355750337103397 355750337103389 355750337103390 355750337103391
  0    5   5        23          0.23     355750337116276 355750337116283 355750337116277 355750337116283 355750337116293 355750337116293 355750337116298 355750337116299
  0    5   6        23          0.23     355750337128884 355750337128895 355750337128885 355750337128891 355750337128901 355750337128905 355750337128906 355750337128907
  0    5   7        26          0.26     355750337141536 355750337141539 355750337141529 355750337141539 355750337141549 355750337141549 355750337141554 355750337141555
  0    5   8        26          0.26     355750337154172 355750337154179 355750337154169 355750337154175 355750337154193 355750337154185 355750337154190 355750337154195
  0    5   9        26          0.26     355750337166756 355750337166763 355750337166757 355750337166763 355750337166777 355750337166773 355750337166782 355750337166775
  0    5  AVG      25.20        0.25
  0    6   0        25          0.25     355750337911680 355750337911686 355750337911675 355750337911678 355750337911697 355750337911693 355750337911700 355750337911698
  0    6   1        26          0.26     355750337924324 355750337924330 355750337924327 355750337924334 355750337924341 355750337924345 355750337924348 355750337924350
  0    6   2        25          0.25     355750337936952 355750337936958 355750337936947 355750337936954 355750337936965 355750337936965 355750337936972 355750337936970
  0    6   3        24          0.24     355750337949600 355750337949610 355750337949603 355750337949606 355750337949617 355750337949617 355750337949624 355750337949622
  0    6   4        25          0.25     355750337962224 355750337962234 355750337962227 355750337962230 355750337962241 355750337962249 355750337962244 355750337962246
  0    6   5        25          0.25     355750337974840 355750337974850 355750337974843 355750337974846 355750337974865 355750337974857 355750337974864 355750337974858
  0    6   6        22          0.22     355750337987460 355750337987466 355750337987463 355750337987470 355750337987477 355750337987481 355750337987480 355750337987482
  0    6   7        27          0.27     355750338000092 355750338000094 355750338000087 355750338000098 355750338000105 355750338000109 355750338000112 355750338000114
  0    6   8        22          0.22     355750338012724 355750338012730 355750338012727 355750338012734 355750338012745 355750338012741 355750338012740 355750338012746
  0    6   9        25          0.25     355750338025352 355750338025358 355750338025347 355750338025350 355750338025365 355750338025369 355750338025372 355750338025370
  0    6  AVG      24.60        0.25
  0    7   0        19          0.19     355750338751894 355750338751910 355750338751906 355750338751905 355750338751891 355750338751897 355750338751891 355750338751899
  0    7   1        23          0.23     355750338764574 355750338764594 355750338764586 355750338764589 355750338764571 355750338764577 355750338764575 355750338764579
  0    7   2        19          0.19     355750338777250 355750338777266 355750338777262 355750338777265 355750338777247 355750338777253 355750338777251 355750338777255
  0    7   3        22          0.22     355750338789886 355750338789898 355750338789902 355750338789905 355750338789883 355750338789889 355750338789887 355750338789891
  0    7   4        19          0.19     355750338802462 355750338802478 355750338802474 355750338802473 355750338802459 355750338802465 355750338802459 355750338802467
  0    7   5        23          0.23     355750338815010 355750338815030 355750338815022 355750338815025 355750338815011 355750338815017 355750338815007 355750338815019
  0    7   6        19          0.19     355750338827594 355750338827610 355750338827606 355750338827609 355750338827591 355750338827593 355750338827595 355750338827599
  0    7   7        23          0.23     355750338840162 355750338840182 355750338840174 355750338840177 355750338840159 355750338840169 355750338840163 355750338840167
  0    7   8        23          0.23     355750338852746 355750338852766 355750338852758 355750338852761 355750338852743 355750338852749 355750338852747 355750338852751
  0    7   9        27          0.27     355750338865582 355750338865570 355750338865578 355750338865573 355750338865555 355750338865561 355750338865559 355750338865563
  0    7  AVG      21.70        0.22

Minimum Delta detected: 20.20 cycles (0.20 usec)
Maximum Delta detected: 25.40 cycles (0.25 usec)
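In the rows checked, the Delta column matches the spread between the largest and smallest XCC timestamp of that iteration. The sketch below recomputes it for GPU 0 from the values printed above. The names `xcc_delta` and `offsets_from_earliest` are hypothetical, and treating Delta as max minus min is an inference from the output rather than a statement about the preset's implementation.

```python
def xcc_delta(timestamps: list[int]) -> int:
    """Spread, in cycles, between the latest and earliest XCC timestamp."""
    return max(timestamps) - min(timestamps)


def offsets_from_earliest(timestamps: list[int]) -> list[int]:
    """Each XCC's timestamp relative to the earliest one in the same iteration."""
    earliest = min(timestamps)
    return [t - earliest for t in timestamps]


# Rank 0, GPU 0, iteration 0: XCC 0 through XCC 7, copied from the table above.
gpu0_iter0 = [
    355750333035786, 355750333035796, 355750333035800, 355750333035802,
    355750333035782, 355750333035789, 355750333035782, 355750333035787,
]

# Delta(cycles) column for GPU 0, iterations 0 through 9, copied from the table.
gpu0_deltas = [20, 24, 22, 22, 22, 20, 20, 24, 22, 24]

print("delta:", xcc_delta(gpu0_iter0))
print("offsets:", offsets_from_earliest(gpu0_iter0))
print(f"average: {sum(gpu0_deltas) / len(gpu0_deltas):.2f}")
```

The per-XCC offsets are useful when looking for a counter that is consistently ahead of or behind the others, which the single Delta value hides.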

Copilot

Pull request overview

Adds a new preset entry intended to measure AMD GPU wallclock consistency across XCCs, and records the feature in the changelog.

Changes:

- Registers a new "wallclock" preset in the preset dispatcher.
- Adjusts macro cleanup behavior for GetXccId in TransferBench.hpp.
- Documents the new preset in CHANGELOG.md.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File	Description
src/header/TransferBench.hpp	Stops undefining `GetXccId` at the end of the header (macro now leaks past the header boundary).
src/client/Presets/Presets.hpp	Adds include and preset map entry for the new `wallclock` preset.
CHANGELOG.md	Adds a bullet noting the new wallclock preset.

alex-breslow-amd

LGTM. Is there a missing Wallclock.hpp header? I didn't see the implementation. I wasn't sure if it was already committed.

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

- Initial pod communication support (#235)
- cuda + MNNVL update & pod presets (#241)
- Increase CQ size for high qps (#244)
- fix hang when NVML is present but fabricmanager isnt (#246)
- Adding nica2a preset (#248)
- Adding HBM read bandwidth preset (#250)
- Pod Ring preset (#251)
- gfxsweep preset (#254) (#256)
- Adding Batched DMA support (hipMemcpyBatchAsync), and bmasweep preset (#255)
- Adding a wallclock consistency detection preset (#258)
- Adding smoketest preset for simple correctness tests (#266)
- Help / envvars / presets presets (#267)
- Modernize CMake build (#268)
- Replace version-based pod/amd-smi detection with compile-time API probes (#269)
- Fix collective mismatch hangs in multi-rank error paths (#270)
- Fix SHOW_ITERATIONS table truncation with multiple transfers per executor (#271)
- Reformat a2asweep output to match gfxsweep style (#272)
- Gfx sweep update (#274)
- Increasing flush frequency in smoketest (#275)
- Adding new experimental copy-only GFX kernel, gfxsweep update (#277)
- Fixes for cuMem compilation and invalid device ordinal (#278)
- Simplifying socket connect, allow for using host address (#279)
- Updating podring to run on single node without need to force single pod (#280)
- Adding SHOW_PERCENTILES to show extra per-iteration statistics (#281)

---------

Co-authored-by: AtlantaPepsi <timhu102@gmail.com>
Co-authored-by: Pak Nin Lui <pak.lui@amd.com>
Co-authored-by: pierreantoineH <PierreAntoine.Harraud@amd.com>
Co-authored-by: Nilesh M Negi <Nilesh.Negi@amd.com>
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Adding a wallclock consistency detection preset

2cb701f

gilbertlee-amd requested review from a team as code owners April 18, 2026 04:53

nileshnegi requested a review from Copilot April 18, 2026 04:54

Copilot started reviewing on behalf of nileshnegi April 18, 2026 04:56

Copilot AI reviewed Apr 18, 2026

Comment thread src/client/Presets/Presets.hpp

Comment thread src/header/TransferBench.hpp

Comment thread src/client/Presets/Presets.hpp

alex-breslow-amd approved these changes Apr 18, 2026

Forgot to add new Preset file

7aaa2c3

nileshnegi requested a review from Copilot April 18, 2026 07:47

Copilot started reviewing on behalf of nileshnegi April 18, 2026 07:48

Copilot AI reviewed Apr 18, 2026

Comment thread src/client/Presets/WallClock.hpp

Comment thread src/client/Presets/WallClock.hpp Outdated

Comment thread src/client/Presets/WallClock.hpp

Switching to Utils memory allocation/deallocation

4ba3bce

gilbertlee-amd merged commit b57f2e2 into ROCm:candidate Apr 19, 2026
1 check passed

mustafabar reviewed Apr 21, 2026

Comment thread src/client/Presets/WallClock.hpp

nileshnegi mentioned this pull request Apr 27, 2026

TransferBench v1.67.0 #273

Open

1 task

gilbertlee-amd merged 3 commits into ROCm:candidate from gilbertlee-amd:wallClockPreset

4 participants
