[STF] Rework dot tool to have really nested sections #5723

Merged
caugonnet merged 26 commits into NVIDIA:main from caugonnet:stf_hierarchical_dot on Sep 4, 2025

@caugonnet (Contributor) commented Aug 29, 2025

Description

This is a significant rewrite of the dot generation tool: it now takes nested contexts into account and can collapse parts of the graph, among other changes.
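
For readers unfamiliar with how nesting is usually expressed in Graphviz, here is a minimal sketch of the general idea behind the description above. It is not the code from this PR: the `Section` struct, `emit_section`, and `to_dot` are hypothetical names made up for illustration, and the real STF data structures are likely different. The sketch only relies on documented Graphviz behavior: a `subgraph` whose name starts with `cluster` is drawn as a box, and clusters can be nested. A collapsed section is emitted as a single node instead of a cluster.

```cpp
#include <cstddef>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical model of a section (not an STF type): a named scope that
// holds task nodes and child sections.
struct Section {
  std::string name;
  std::vector<std::string> tasks;                 // node identifiers in this section
  std::vector<std::unique_ptr<Section>> children; // nested sections
  bool collapsed = false;                         // render the whole subtree as one node
};

// Recursively emit one section. Each section receives a unique id so that
// cluster names never collide, whatever the nesting depth.
void emit_section(const Section& s, std::ostream& out, std::size_t& next_id, int depth) {
  const std::string indent(static_cast<std::size_t>(2 * depth), ' ');
  const std::size_t id = next_id++;
  if (s.collapsed) {
    // A collapsed section hides its tasks and children behind a single box node.
    out << indent << "\"section_" << id << "\" [shape=box, label=\"" << s.name << "\"];\n";
    return;
  }
  out << indent << "subgraph cluster_" << id << " {\n";
  out << indent << "  label=\"" << s.name << "\";\n";
  for (const auto& t : s.tasks) {
    out << indent << "  \"" << t << "\";\n";
  }
  for (const auto& c : s.children) {
    emit_section(*c, out, next_id, depth + 1);
  }
  out << indent << "}\n";
}

// Wrap the root section in a digraph and return the DOT text.
std::string to_dot(const Section& root) {
  std::ostringstream out;
  std::size_t next_id = 0;
  out << "digraph G {\n";
  emit_section(root, out, next_id, 1);
  out << "}\n";
  return out.str();
}
```

Writing the string returned by `to_dot` to a file and running `dot -Tpdf` on it should show the inner section as a box drawn inside the outer one. Edges between tasks are left out of the sketch to keep it short.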

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

caugonnet requested a review from a team as a code owner on August 29, 2025 20:20
caugonnet requested a review from griwes on August 29, 2025 20:20
github-project-automation (bot) moved this to Todo in CCCL on Aug 29, 2025
copy-pr-bot (bot, Contributor) commented Aug 29, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

caugonnet added the "stf" (Sequential Task Flow programming model) label on Aug 29, 2025
caugonnet self-assigned this on Aug 29, 2025
cccl-authenticator-app (bot) moved this from Todo to In Review in CCCL on Aug 29, 2025
@caugonnet
Contributor Author

/ok to test 31c03d1

@caugonnet
Contributor Author

/ok to test 1eeff57

@caugonnet
Contributor Author

/ok to test 35f1ba2

@andralex
Contributor

/ok to test fd49e94

@github-actions
Contributor

🟩 CI finished in 41m 28s: Pass: 100%/32 | Total: 7h 40m | Avg: 14m 23s | Max: 29m 27s | Hits: 67%/15498
  • 🟩 cudax: Pass: 100%/28 | Total: 7h 13m | Avg: 15m 29s | Max: 29m 27s | Hits: 67%/15498

    🟩 cpu
      🟩 amd64              Pass: 100%/24  | Total:  6h 16m | Avg: 15m 40s | Max: 29m 27s | Hits:  69%/13110 
      🟩 arm64              Pass: 100%/4   | Total: 57m 38s | Avg: 14m 24s | Max: 15m 33s | Hits:  61%/2388  
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 37m 34s | Avg: 12m 31s | Max: 14m 27s | Hits:  67%/1485  
      🟩 12.9               Pass: 100%/25  | Total:  6h 36m | Avg: 15m 51s | Max: 29m 27s | Hits:  67%/14013 
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 37m 34s | Avg: 12m 31s | Max: 14m 27s | Hits:  67%/1485  
      🟩 nvcc12.9           Pass: 100%/25  | Total:  6h 36m | Avg: 15m 51s | Max: 29m 27s | Hits:  67%/14013 
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/28  | Total:  7h 13m | Avg: 15m 29s | Max: 29m 27s | Hits:  67%/15498 
    🟩 cxx
      🟩 Clang14            Pass: 100%/2   | Total: 26m 39s | Avg: 13m 19s | Max: 14m 12s | Hits:  61%/1196  
      🟩 Clang15            Pass: 100%/1   | Total: 15m 32s | Avg: 15m 32s | Max: 15m 32s | Hits:  61%/597   
      🟩 Clang16            Pass: 100%/1   | Total: 15m 25s | Avg: 15m 25s | Max: 15m 25s | Hits:  61%/597   
      🟩 Clang17            Pass: 100%/1   | Total: 16m 26s | Avg: 16m 26s | Max: 16m 26s | Hits:  61%/597   
      🟩 Clang18            Pass: 100%/1   | Total: 15m 23s | Avg: 15m 23s | Max: 15m 23s | Hits:  61%/597   
      🟩 Clang19            Pass: 100%/4   | Total: 50m 54s | Avg: 12m 43s | Max: 15m 39s | Hits:  70%/2388  
      🟩 GCC10              Pass: 100%/2   | Total: 30m 34s | Avg: 15m 17s | Max: 16m 07s | Hits:  60%/1196  
      🟩 GCC11              Pass: 100%/1   | Total: 16m 11s | Avg: 16m 11s | Max: 16m 11s | Hits:  60%/597   
      🟩 GCC12              Pass: 100%/1   | Total: 21m 00s | Avg: 21m 00s | Max: 21m 00s | Hits:  60%/597   
      🟩 GCC13              Pass: 100%/8   | Total:  2h 02m | Avg: 15m 17s | Max: 20m 41s | Hits:  70%/4776  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 10m 40s | Avg: 10m 40s | Max: 10m 40s | Hits:  95%/291   
      🟩 MSVC14.43          Pass: 100%/3   | Total: 34m 00s | Avg: 11m 20s | Max: 12m 19s | Hits:  95%/879   
      🟩 NVHPC25.7          Pass: 100%/2   | Total: 58m 50s | Avg: 29m 25s | Max: 29m 27s | Hits:  58%/1190  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/10  | Total:  2h 20m | Avg: 14m 01s | Max: 16m 26s | Hits:  65%/5972  
      🟩 GCC                Pass: 100%/12  | Total:  3h 10m | Avg: 15m 50s | Max: 21m 00s | Hits:  67%/7166  
      🟩 MSVC               Pass: 100%/4   | Total: 44m 40s | Avg: 11m 10s | Max: 12m 19s | Hits:  95%/1170  
      🟩 NVHPC              Pass: 100%/2   | Total: 58m 50s | Avg: 29m 25s | Max: 29m 27s | Hits:  58%/1190  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total: 21m 28s | Avg: 10m 44s | Max: 13m 07s | Hits:  80%/1194  
      🟩 rtx2080            Pass: 100%/26  | Total:  6h 52m | Avg: 15m 51s | Max: 29m 27s | Hits:  66%/14304 
    🟩 jobs
      🟩 Build              Pass: 100%/25  | Total:  6h 37m | Avg: 15m 53s | Max: 29m 27s | Hits:  63%/13707 
      🟩 Test               Pass: 100%/3   | Total: 36m 47s | Avg: 12m 15s | Max: 20m 41s | Hits:  99%/1791  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total: 21m 28s | Avg: 10m 44s | Max: 13m 07s | Hits:  80%/1194  
      🟩 90;90a             Pass: 100%/2   | Total: 25m 47s | Avg: 12m 53s | Max: 14m 38s | Hits:  72%/890   
      🟩 100;120            Pass: 100%/2   | Total: 26m 10s | Avg: 13m 05s | Max: 15m 38s | Hits:  72%/890   
    🟩 std
      🟩 17                 Pass: 100%/3   | Total: 57m 37s | Avg: 19m 12s | Max: 29m 23s | Hits:  60%/1789  
      🟩 20                 Pass: 100%/25  | Total:  6h 16m | Avg: 15m 03s | Max: 29m 27s | Hits:  68%/13709 
    
  • 🟩 packaging: Pass: 100%/4 | Total: 26m 33s | Avg: 6m 38s | Max: 8m 02s

    🟩 cpu
      🟩 amd64              Pass: 100%/4   | Total: 26m 33s | Avg:  6m 38s | Max:  8m 02s
    🟩 ctk
      🟩 12.0               Pass: 100%/2   | Total: 11m 00s | Avg:  5m 30s | Max:  7m 59s
      🟩 12.9               Pass: 100%/2   | Total: 15m 33s | Avg:  7m 46s | Max:  8m 02s
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/2   | Total: 11m 00s | Avg:  5m 30s | Max:  7m 59s
      🟩 nvcc12.9           Pass: 100%/2   | Total: 15m 33s | Avg:  7m 46s | Max:  8m 02s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/4   | Total: 26m 33s | Avg:  6m 38s | Max:  8m 02s
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total:  3m 01s | Avg:  3m 01s | Max:  3m 01s
      🟩 Clang19            Pass: 100%/1   | Total:  7m 31s | Avg:  7m 31s | Max:  7m 31s
      🟩 GCC12              Pass: 100%/1   | Total:  7m 59s | Avg:  7m 59s | Max:  7m 59s
      🟩 GCC13              Pass: 100%/1   | Total:  8m 02s | Avg:  8m 02s | Max:  8m 02s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/2   | Total: 10m 32s | Avg:  5m 16s | Max:  7m 31s
      🟩 GCC                Pass: 100%/2   | Total: 16m 01s | Avg:  8m 00s | Max:  8m 02s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/4   | Total: 26m 33s | Avg:  6m 38s | Max:  8m 02s
    🟩 jobs
      🟩 Test               Pass: 100%/4   | Total: 26m 33s | Avg:  6m 38s | Max:  8m 02s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
CCCL Packaging
libcu++
CUB
Thrust
+/- CUDA Experimental
stdpar
python
CCCL C Parallel Library
Catch2Helper
NVBench Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- CCCL Packaging
libcu++
CUB
Thrust
+/- CUDA Experimental
stdpar
python
CCCL C Parallel Library
Catch2Helper
NVBench Helper

🏃‍ Runner counts (total jobs: 32)

# Runner
17 linux-amd64-cpu16
6 linux-amd64-gpu-rtx2080-latest-1
4 linux-arm64-cpu16
4 windows-amd64-cpu16
1 linux-amd64-gpu-h100-latest-1

@andralex
Contributor

/ok to test ef970f8

@github-actions
Contributor

🟩 CI finished in 2h 03m: Pass: 100%/32 | Total: 2h 50m | Avg: 5m 19s | Max: 13m 48s | Hits: 99%/15498
  • 🟩 cudax: Pass: 100%/28 | Total: 2h 33m | Avg: 5m 29s | Max: 13m 48s | Hits: 99%/15498

    🟩 cpu
      🟩 amd64              Pass: 100%/24  | Total:  2h 21m | Avg:  5m 54s | Max: 13m 48s | Hits:  99%/13110 
      🟩 arm64              Pass: 100%/4   | Total: 11m 53s | Avg:  2m 58s | Max:  3m 17s | Hits:  99%/2388  
    🟩 ctk
      🟩 12.0               Pass: 100%/3   | Total: 18m 07s | Avg:  6m 02s | Max: 11m 44s | Hits:  98%/1485  
      🟩 12.9               Pass: 100%/25  | Total:  2h 15m | Avg:  5m 25s | Max: 13m 48s | Hits:  99%/14013 
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/3   | Total: 18m 07s | Avg:  6m 02s | Max: 11m 44s | Hits:  98%/1485  
      🟩 nvcc12.9           Pass: 100%/25  | Total:  2h 15m | Avg:  5m 25s | Max: 13m 48s | Hits:  99%/14013 
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/28  | Total:  2h 33m | Avg:  5m 29s | Max: 13m 48s | Hits:  99%/15498 
    🟩 cxx
      🟩 Clang14            Pass: 100%/2   | Total:  6m 09s | Avg:  3m 04s | Max:  3m 16s | Hits: 100%/1196  
      🟩 Clang15            Pass: 100%/1   | Total:  3m 22s | Avg:  3m 22s | Max:  3m 22s | Hits: 100%/597   
      🟩 Clang16            Pass: 100%/1   | Total:  3m 16s | Avg:  3m 16s | Max:  3m 16s | Hits: 100%/597   
      🟩 Clang17            Pass: 100%/1   | Total:  3m 17s | Avg:  3m 17s | Max:  3m 17s | Hits: 100%/597   
      🟩 Clang18            Pass: 100%/1   | Total:  3m 13s | Avg:  3m 13s | Max:  3m 13s | Hits: 100%/597   
      🟩 Clang19            Pass: 100%/4   | Total: 17m 22s | Avg:  4m 20s | Max:  8m 45s | Hits: 100%/2388  
      🟩 GCC10              Pass: 100%/2   | Total:  7m 11s | Avg:  3m 35s | Max:  3m 41s | Hits:  99%/1196  
      🟩 GCC11              Pass: 100%/1   | Total:  3m 47s | Avg:  3m 47s | Max:  3m 47s | Hits:  99%/597   
      🟩 GCC12              Pass: 100%/1   | Total:  3m 43s | Avg:  3m 43s | Max:  3m 43s | Hits:  99%/597   
      🟩 GCC13              Pass: 100%/8   | Total: 41m 25s | Avg:  5m 10s | Max: 13m 48s | Hits:  99%/4776  
      🟩 MSVC14.39          Pass: 100%/1   | Total: 11m 44s | Avg: 11m 44s | Max: 11m 44s | Hits:  95%/291   
      🟩 MSVC14.43          Pass: 100%/3   | Total: 35m 03s | Avg: 11m 41s | Max: 12m 54s | Hits:  95%/879   
      🟩 NVHPC25.7          Pass: 100%/2   | Total: 14m 09s | Avg:  7m 04s | Max:  7m 08s | Hits:  97%/1190  
    🟩 cxx_family
      🟩 Clang              Pass: 100%/10  | Total: 36m 39s | Avg:  3m 39s | Max:  8m 45s | Hits: 100%/5972  
      🟩 GCC                Pass: 100%/12  | Total: 56m 06s | Avg:  4m 40s | Max: 13m 48s | Hits:  99%/7166  
      🟩 MSVC               Pass: 100%/4   | Total: 46m 47s | Avg: 11m 41s | Max: 12m 54s | Hits:  95%/1170  
      🟩 NVHPC              Pass: 100%/2   | Total: 14m 09s | Avg:  7m 04s | Max:  7m 08s | Hits:  97%/1190  
    🟩 gpu
      🟩 h100               Pass: 100%/2   | Total:  9m 43s | Avg:  4m 51s | Max:  6m 38s | Hits:  99%/1194  
      🟩 rtx2080            Pass: 100%/26  | Total:  2h 23m | Avg:  5m 32s | Max: 13m 48s | Hits:  99%/14304 
    🟩 jobs
      🟩 Build              Pass: 100%/25  | Total:  2h 04m | Avg:  4m 58s | Max: 12m 54s | Hits:  99%/13707 
      🟩 Test               Pass: 100%/3   | Total: 29m 11s | Avg:  9m 43s | Max: 13m 48s | Hits:  99%/1791  
    🟩 sm
      🟩 90                 Pass: 100%/2   | Total:  9m 43s | Avg:  4m 51s | Max:  6m 38s | Hits:  99%/1194  
      🟩 90;90a             Pass: 100%/2   | Total: 14m 45s | Avg:  7m 22s | Max: 11m 17s | Hits:  98%/890   
      🟩 100;120            Pass: 100%/2   | Total: 14m 36s | Avg:  7m 18s | Max: 10m 52s | Hits:  98%/890   
    🟩 std
      🟩 17                 Pass: 100%/3   | Total: 12m 57s | Avg:  4m 19s | Max:  7m 01s | Hits:  99%/1789  
      🟩 20                 Pass: 100%/25  | Total:  2h 20m | Avg:  5m 37s | Max: 13m 48s | Hits:  99%/13709 
    
  • 🟩 packaging: Pass: 100%/4 | Total: 16m 41s | Avg: 4m 10s | Max: 6m 42s

    🟩 cpu
      🟩 amd64              Pass: 100%/4   | Total: 16m 41s | Avg:  4m 10s | Max:  6m 42s
    🟩 ctk
      🟩 12.0               Pass: 100%/2   | Total:  6m 36s | Avg:  3m 18s | Max:  3m 20s
      🟩 12.9               Pass: 100%/2   | Total: 10m 05s | Avg:  5m 02s | Max:  6m 42s
    🟩 cudacxx
      🟩 nvcc12.0           Pass: 100%/2   | Total:  6m 36s | Avg:  3m 18s | Max:  3m 20s
      🟩 nvcc12.9           Pass: 100%/2   | Total: 10m 05s | Avg:  5m 02s | Max:  6m 42s
    🟩 cudacxx_family
      🟩 nvcc               Pass: 100%/4   | Total: 16m 41s | Avg:  4m 10s | Max:  6m 42s
    🟩 cxx
      🟩 Clang14            Pass: 100%/1   | Total:  3m 20s | Avg:  3m 20s | Max:  3m 20s
      🟩 Clang19            Pass: 100%/1   | Total:  3m 23s | Avg:  3m 23s | Max:  3m 23s
      🟩 GCC12              Pass: 100%/1   | Total:  3m 16s | Avg:  3m 16s | Max:  3m 16s
      🟩 GCC13              Pass: 100%/1   | Total:  6m 42s | Avg:  6m 42s | Max:  6m 42s
    🟩 cxx_family
      🟩 Clang              Pass: 100%/2   | Total:  6m 43s | Avg:  3m 21s | Max:  3m 23s
      🟩 GCC                Pass: 100%/2   | Total:  9m 58s | Avg:  4m 59s | Max:  6m 42s
    🟩 gpu
      🟩 rtx2080            Pass: 100%/4   | Total: 16m 41s | Avg:  4m 10s | Max:  6m 42s
    🟩 jobs
      🟩 Test               Pass: 100%/4   | Total: 16m 41s | Avg:  4m 10s | Max:  6m 42s
    

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
CCCL Packaging
libcu++
CUB
Thrust
+/- CUDA Experimental
stdpar
python
CCCL C Parallel Library
Catch2Helper
NVBench Helper

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- CCCL Packaging
libcu++
CUB
Thrust
+/- CUDA Experimental
stdpar
python
CCCL C Parallel Library
Catch2Helper
NVBench Helper

🏃‍ Runner counts (total jobs: 32)

# Runner
17 linux-amd64-cpu16
6 linux-amd64-gpu-rtx2080-latest-1
4 linux-arm64-cpu16
4 windows-amd64-cpu16
1 linux-amd64-gpu-h100-latest-1

caugonnet enabled auto-merge (squash) on September 2, 2025 11:56
@caugonnet
Contributor Author

/ok to test 9f6e3d8

@github-actions
Contributor

github-actions bot commented Sep 3, 2025

📖 Doc Preview CI

🚀 Preview URL: https://NVIDIA.github.io/cccl/pr-preview/pr-5723/

Preview will be available once GitHub Pages deployment completes.

@caugonnet
Contributor Author

/ok to test f5b7e9a

@caugonnet
Contributor Author

/ok to test 9b7b8c7

@caugonnet
Contributor Author

/ok to test 7839aa8

@github-actions
Contributor

github-actions bot commented Sep 4, 2025

🥳 CI Workflow Results

🟩 Finished in 40m 18s: Pass: 100%/46 | Total: 13h 12m | Max: 36m 47s | Hits: 65%/20719

See results here.

caugonnet merged commit cc4ec5a into NVIDIA:main on Sep 4, 2025
57 checks passed
github-project-automation (bot) moved this from In Review to Done in CCCL on Sep 4, 2025
davebayer pushed a commit to davebayer/cccl that referenced this pull request Sep 23, 2025
* Rework the implementation of the dot tool to have nested sections

* Add new test

* get_next_prereq_unique_id()

* Revert "Add new test"

This reverts commit 9ad0a1a.

* fix visibility issues

* remove stages from dot

* clang-format

* Get code to build with gcc13 and a few minor improvements

* use the appropriate comment format

* update comment format

---------

Co-authored-by: Andrei Alexandrescu <andrei@erdani.com>
Labels

stf Sequential Task Flow programming model

Projects

Archived in project

3 participants