Add cuda::ptx::tensormap_{replace,cp_fenceproxy} #1441

Merged on Feb 28, 2024 (2 commits)

ahendriksen
Contributor

Description

closes #1439, #1440

Adds:

  • cuda::ptx::tensormap_replace
  • cuda::ptx::tensormap_cp_fenceproxy

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@ahendriksen
Contributor Author

CI is failing because NV_HAS_FEATURE_SM_90a does not yet support NVRTC. See #1446.

@ahendriksen
Contributor Author

Tests are passing. I think this can be merged?

@miscco miscco merged commit dff1e18 into NVIDIA:main Feb 28, 2024
563 checks passed
miscco added a commit to miscco/cccl that referenced this pull request Feb 29, 2024
* Add `cuda::ptx::tensormap_{replace,cp_fenceproxy}`

* Add a hacky workaround until nvrtc knows about SM90a

---------

Co-authored-by: Michael Schellenberger Costa <miscco@nvidia.com>
Development

Successfully merging this pull request may close these issues.

[FEA]: Add cuda::ptx::tensormap_replace