PTX: mark cp_async_bulk*_multicast functions sm_90a #1734

Merged (1 commit) on May 13, 2024


ahendriksen
Contributor

Description

closes #1733

Mark the cp_async_bulk*_multicast functions as sm_90a. They were previously marked sm_90, which led to the PTXAS advisory reported in #1733.
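
For context, here is a minimal sketch of what the sm_90 versus sm_90a distinction means at a call site. It assumes that nvcc defines `__CUDA_ARCH_FEAT_SM90_ALL` only when device code is compiled for the architecture-specific target (for example `-arch=sm_90a`), so the macro is undefined for plain sm_90 and in host-only builds. The `EXAMPLE_HAS_SM90A` macro and the `example_*` helpers are hypothetical names used for illustration only; this is not the change made in this PR.

```cpp
// Sketch only: compile-time guard for code paths that need sm_90a.
// Assumption: __CUDA_ARCH_FEAT_SM90_ALL is defined by nvcc when compiling
// device code for sm_90a, and is not defined for plain sm_90 or host code.
#if defined(__CUDA_ARCH_FEAT_SM90_ALL)
#  define EXAMPLE_HAS_SM90A 1
#else
#  define EXAMPLE_HAS_SM90A 0
#endif

// Hypothetical helper: true only when the sm_90a feature set is available
// in the current compilation pass.
constexpr bool example_multicast_available()
{
  return EXAMPLE_HAS_SM90A != 0;
}

// Hypothetical helper: names the copy path a caller would select.
// "multicast" stands for a cp_async_bulk*_multicast call,
// "fallback" for a non-multicast bulk copy.
constexpr const char* example_copy_path()
{
  return example_multicast_available() ? "multicast" : "fallback";
}
```

Compiled with a host compiler, or for plain sm_90 under the assumption above, `example_multicast_available()` is false and callers take the fallback branch.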

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@ahendriksen requested review from a team as code owners on May 13, 2024 08:26
@ahendriksen
Contributor Author

pre-commit.ci autofix

copy-pr-bot bot commented May 13, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@ahendriksen force-pushed the fix-ptx-cp-async-bulk-multicast branch from 2ecea8f to 5047b34 on May 13, 2024 08:33
Contributor

🟩 CI Results [ Failed: 0 | Passed: 302 | Total: 302 ]
  • 🟩 Project cub [ Failed: 0 | Passed: 99 | Total: 99 ]

    🟩 cpu
      🟩 amd64 (0% Fail)              Failed:  0  -- Passed: 91  -- Total: 91 
      🟩 arm64 (0% Fail)              Failed:  0  -- Passed:  8  -- Total:  8 
    🟩 ctk
      🟩 11.1 (0% Fail)               Failed:  0  -- Passed: 15  -- Total: 15 
      🟩 11.8 (0% Fail)               Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 12.4 (0% Fail)               Failed:  0  -- Passed: 81  -- Total: 81 
    🟩 cudacxx_full
      🟩 clang-cuda16 (0% Fail)       Failed:  0  -- Passed:  2  -- Total:  2 
      🟩 nvcc11.1 (0% Fail)           Failed:  0  -- Passed: 15  -- Total: 15 
      🟩 nvcc11.8 (0% Fail)           Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 nvcc12.4 (0% Fail)           Failed:  0  -- Passed: 79  -- Total: 79 
    🟩 cudacxx_name
      🟩 clang-cuda (0% Fail)         Failed:  0  -- Passed:  2  -- Total:  2 
      🟩 nvcc (0% Fail)               Failed:  0  -- Passed: 97  -- Total: 97 
    🟩 cxx_full
      🟩 clang9 (0% Fail)             Failed:  0  -- Passed:  6  -- Total:  6 
      🟩 clang10 (0% Fail)            Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 clang11 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang12 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang13 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang14 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang15 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang16 (0% Fail)            Failed:  0  -- Passed: 14  -- Total: 14 
      🟩 gcc6 (0% Fail)               Failed:  0  -- Passed:  2  -- Total:  2 
      🟩 gcc7 (0% Fail)               Failed:  0  -- Passed:  6  -- Total:  6 
      🟩 gcc8 (0% Fail)               Failed:  0  -- Passed:  6  -- Total:  6 
      🟩 gcc9 (0% Fail)               Failed:  0  -- Passed:  6  -- Total:  6 
      🟩 gcc10 (0% Fail)              Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 gcc11 (0% Fail)              Failed:  0  -- Passed:  7  -- Total:  7 
      🟩 gcc12 (0% Fail)              Failed:  0  -- Passed: 16  -- Total: 16 
      🟩 Intel2023.2.0 (0% Fail)      Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 MSVC14.16 (0% Fail)          Failed:  0  -- Passed:  1  -- Total:  1 
      🟩 MSVC14.29 (0% Fail)          Failed:  0  -- Passed:  2  -- Total:  2 
      🟩 MSVC14.39 (0% Fail)          Failed:  0  -- Passed:  3  -- Total:  3 
    🟩 cxx_name
      🟩 clang (0% Fail)              Failed:  0  -- Passed: 43  -- Total: 43 
      🟩 gcc (0% Fail)                Failed:  0  -- Passed: 47  -- Total: 47 
      🟩 Intel (0% Fail)              Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 MSVC (0% Fail)               Failed:  0  -- Passed:  6  -- Total:  6 
    🟩 gpu
      🟩 v100 (0% Fail)               Failed:  0  -- Passed: 99  -- Total: 99 
    🟩 jobs
      🟩 build (0% Fail)              Failed:  0  -- Passed: 91  -- Total: 91 
      🟩 test (0% Fail)               Failed:  0  -- Passed:  8  -- Total:  8 
    🟩 os
      🟩 ubuntu18.04 (0% Fail)        Failed:  0  -- Passed: 14  -- Total: 14 
      🟩 ubuntu20.04 (0% Fail)        Failed:  0  -- Passed: 35  -- Total: 35 
      🟩 ubuntu22.04 (0% Fail)        Failed:  0  -- Passed: 44  -- Total: 44 
      🟩 windows2022 (0% Fail)        Failed:  0  -- Passed:  6  -- Total:  6 
    🟩 sm
      🟩 60;70;80;90 (0% Fail)        Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 90a (0% Fail)                Failed:  0  -- Passed:  4  -- Total:  4 
    🟩 std
      🟩 11 (0% Fail)                 Failed:  0  -- Passed: 26  -- Total: 26 
      🟩 14 (0% Fail)                 Failed:  0  -- Passed: 29  -- Total: 29 
      🟩 17 (0% Fail)                 Failed:  0  -- Passed: 28  -- Total: 28 
      🟩 20 (0% Fail)                 Failed:  0  -- Passed: 16  -- Total: 16 
    
  • 🟩 Project thrust [ Failed: 0 | Passed: 99 | Total: 99 ]

    🟩 cpu
      🟩 amd64 (0% Fail)              Failed:  0  -- Passed: 91  -- Total: 91 
      🟩 arm64 (0% Fail)              Failed:  0  -- Passed:  8  -- Total:  8 
    🟩 ctk
      🟩 11.1 (0% Fail)               Failed:  0  -- Passed: 15  -- Total: 15 
      🟩 11.8 (0% Fail)               Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 12.4 (0% Fail)               Failed:  0  -- Passed: 81  -- Total: 81 
    🟩 cudacxx_full
      🟩 clang-cuda16 (0% Fail)       Failed:  0  -- Passed:  2  -- Total:  2 
      🟩 nvcc11.1 (0% Fail)           Failed:  0  -- Passed: 15  -- Total: 15 
      🟩 nvcc11.8 (0% Fail)           Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 nvcc12.4 (0% Fail)           Failed:  0  -- Passed: 79  -- Total: 79 
    🟩 cudacxx_name
      🟩 clang-cuda (0% Fail)         Failed:  0  -- Passed:  2  -- Total:  2 
      🟩 nvcc (0% Fail)               Failed:  0  -- Passed: 97  -- Total: 97 
    🟩 cxx_full
      🟩 clang9 (0% Fail)             Failed:  0  -- Passed:  6  -- Total:  6 
      🟩 clang10 (0% Fail)            Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 clang11 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang12 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang13 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang14 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang15 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang16 (0% Fail)            Failed:  0  -- Passed: 14  -- Total: 14 
      🟩 gcc6 (0% Fail)               Failed:  0  -- Passed:  2  -- Total:  2 
      🟩 gcc7 (0% Fail)               Failed:  0  -- Passed:  6  -- Total:  6 
      🟩 gcc8 (0% Fail)               Failed:  0  -- Passed:  6  -- Total:  6 
      🟩 gcc9 (0% Fail)               Failed:  0  -- Passed:  6  -- Total:  6 
      🟩 gcc10 (0% Fail)              Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 gcc11 (0% Fail)              Failed:  0  -- Passed:  7  -- Total:  7 
      🟩 gcc12 (0% Fail)              Failed:  0  -- Passed: 16  -- Total: 16 
      🟩 Intel2023.2.0 (0% Fail)      Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 MSVC14.16 (0% Fail)          Failed:  0  -- Passed:  1  -- Total:  1 
      🟩 MSVC14.29 (0% Fail)          Failed:  0  -- Passed:  2  -- Total:  2 
      🟩 MSVC14.39 (0% Fail)          Failed:  0  -- Passed:  3  -- Total:  3 
    🟩 cxx_name
      🟩 clang (0% Fail)              Failed:  0  -- Passed: 43  -- Total: 43 
      🟩 gcc (0% Fail)                Failed:  0  -- Passed: 47  -- Total: 47 
      🟩 Intel (0% Fail)              Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 MSVC (0% Fail)               Failed:  0  -- Passed:  6  -- Total:  6 
    🟩 gpu
      🟩 v100 (0% Fail)               Failed:  0  -- Passed: 99  -- Total: 99 
    🟩 jobs
      🟩 build (0% Fail)              Failed:  0  -- Passed: 91  -- Total: 91 
      🟩 test (0% Fail)               Failed:  0  -- Passed:  8  -- Total:  8 
    🟩 os
      🟩 ubuntu18.04 (0% Fail)        Failed:  0  -- Passed: 14  -- Total: 14 
      🟩 ubuntu20.04 (0% Fail)        Failed:  0  -- Passed: 35  -- Total: 35 
      🟩 ubuntu22.04 (0% Fail)        Failed:  0  -- Passed: 44  -- Total: 44 
      🟩 windows2022 (0% Fail)        Failed:  0  -- Passed:  6  -- Total:  6 
    🟩 sm
      🟩 60;70;80;90 (0% Fail)        Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 90a (0% Fail)                Failed:  0  -- Passed:  4  -- Total:  4 
    🟩 std
      🟩 11 (0% Fail)                 Failed:  0  -- Passed: 26  -- Total: 26 
      🟩 14 (0% Fail)                 Failed:  0  -- Passed: 29  -- Total: 29 
      🟩 17 (0% Fail)                 Failed:  0  -- Passed: 28  -- Total: 28 
      🟩 20 (0% Fail)                 Failed:  0  -- Passed: 16  -- Total: 16 
    
  • 🟩 Project libcudacxx [ Failed: 0 | Passed: 104 | Total: 104 ]

    🟩 cpu
      🟩 amd64 (0% Fail)              Failed:  0  -- Passed: 96  -- Total: 96 
      🟩 arm64 (0% Fail)              Failed:  0  -- Passed:  8  -- Total:  8 
    🟩 ctk
      🟩 11.1 (0% Fail)               Failed:  0  -- Passed: 15  -- Total: 15 
      🟩 11.8 (0% Fail)               Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 12.4 (0% Fail)               Failed:  0  -- Passed: 86  -- Total: 86 
    🟩 cudacxx_full
      🟩 clang-cuda16 (0% Fail)       Failed:  0  -- Passed:  2  -- Total:  2 
      🟩 nvcc11.1 (0% Fail)           Failed:  0  -- Passed: 15  -- Total: 15 
      🟩 nvcc11.8 (0% Fail)           Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 nvcc12.4 (0% Fail)           Failed:  0  -- Passed: 84  -- Total: 84 
    🟩 cudacxx_name
      🟩 clang-cuda (0% Fail)         Failed:  0  -- Passed:  2  -- Total:  2 
      🟩 nvcc (0% Fail)               Failed:  0  -- Passed: 102 -- Total: 102
    🟩 cxx_full
      🟩 clang9 (0% Fail)             Failed:  0  -- Passed:  6  -- Total:  6 
      🟩 clang10 (0% Fail)            Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 clang11 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang12 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang13 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang14 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang15 (0% Fail)            Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 clang16 (0% Fail)            Failed:  0  -- Passed: 14  -- Total: 14 
      🟩 gcc6 (0% Fail)               Failed:  0  -- Passed:  2  -- Total:  2 
      🟩 gcc7 (0% Fail)               Failed:  0  -- Passed:  6  -- Total:  6 
      🟩 gcc8 (0% Fail)               Failed:  0  -- Passed:  6  -- Total:  6 
      🟩 gcc9 (0% Fail)               Failed:  0  -- Passed:  6  -- Total:  6 
      🟩 gcc10 (0% Fail)              Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 gcc11 (0% Fail)              Failed:  0  -- Passed:  7  -- Total:  7 
      🟩 gcc12 (0% Fail)              Failed:  0  -- Passed: 21  -- Total: 21 
      🟩 Intel2023.2.0 (0% Fail)      Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 MSVC14.16 (0% Fail)          Failed:  0  -- Passed:  1  -- Total:  1 
      🟩 MSVC14.29 (0% Fail)          Failed:  0  -- Passed:  2  -- Total:  2 
      🟩 MSVC14.39 (0% Fail)          Failed:  0  -- Passed:  3  -- Total:  3 
    🟩 cxx_name
      🟩 clang (0% Fail)              Failed:  0  -- Passed: 43  -- Total: 43 
      🟩 gcc (0% Fail)                Failed:  0  -- Passed: 52  -- Total: 52 
      🟩 Intel (0% Fail)              Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 MSVC (0% Fail)               Failed:  0  -- Passed:  6  -- Total:  6 
    🟩 gpu
      🟩 v100 (0% Fail)               Failed:  0  -- Passed: 104 -- Total: 104
    🟩 jobs
      🟩 build (0% Fail)              Failed:  0  -- Passed: 91  -- Total: 91 
      🟩 nvrtc (0% Fail)              Failed:  0  -- Passed:  4  -- Total:  4 
      🟩 test (0% Fail)               Failed:  0  -- Passed:  8  -- Total:  8 
      🟩 verify_codegen (0% Fail)     Failed:  0  -- Passed:  1  -- Total:  1 
    🟩 os
      🟩 ubuntu18.04 (0% Fail)        Failed:  0  -- Passed: 14  -- Total: 14 
      🟩 ubuntu20.04 (0% Fail)        Failed:  0  -- Passed: 35  -- Total: 35 
      🟩 ubuntu22.04 (0% Fail)        Failed:  0  -- Passed: 49  -- Total: 49 
      🟩 windows2022 (0% Fail)        Failed:  0  -- Passed:  6  -- Total:  6 
    🟩 sm
      🟩 60;70;80;90 (0% Fail)        Failed:  0  -- Passed:  3  -- Total:  3 
      🟩 90a (0% Fail)                Failed:  0  -- Passed:  4  -- Total:  4 
    🟩 std
      🟩 11 (0% Fail)                 Failed:  0  -- Passed: 27  -- Total: 27 
      🟩 14 (0% Fail)                 Failed:  0  -- Passed: 30  -- Total: 30 
      🟩 17 (0% Fail)                 Failed:  0  -- Passed: 29  -- Total: 29 
      🟩 20 (0% Fail)                 Failed:  0  -- Passed: 17  -- Total: 17 
    

🏃‍ Runner counts (total jobs: 302)

# Runner
232 linux-amd64-cpu16
28 linux-amd64-gpu-v100-latest-1
24 linux-arm64-cpu16
18 windows-amd64-cpu16

👃 Inspect Changes

Modifications in project?

Project
CCCL Infrastructure
+/- libcu++
CUB
Thrust
CUDA Experimental

Modifications in project or dependencies?

Project
CCCL Infrastructure
+/- libcu++
+/- CUB
+/- Thrust
+/- CUDA Experimental

@miscco merged commit f8a26b2 into NVIDIA:main on May 13, 2024
2 checks passed

Successfully merging this pull request may close these issues.

[BUG]: PTXAS emits advisory regarding cp.async.bulk.*.multicast use on sm_90