-
Couldn't load subscription status.
- Fork 217
Add cuTENSOR support & bug fixes discovered while working on conda testing #1194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
… testing is INCOMPLETE.
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
|
/ok to test |
|
…test_load_nvidia_dynamic_lib.py
…or conda anomaly, to restore proper functioning for standard CTK installations
|
/ok to test |
Manually tested with: pip install nvidia-cufftmp-cu13==12.1.3.1 That wheel was yanked, therefore not adding to pyproject.toml
|
Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
|
/ok to test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
{"summary": "
Sequence Diagram
sequenceDiagram
participant User
participant find_nvidia_header_directory
participant find_nvidia_headers.py
participant supported_nvidia_headers.py
participant load_nvidia_dynamic_lib
participant supported_nvidia_libs.py
participant Test Suite
User->>find_nvidia_header_directory: "Request header for libname (e.g., 'cutensor')"
find_nvidia_header_directory->>supported_nvidia_headers.py: "Check if libname in SUPPORTED_HEADERS_CTK"
alt CTK Library
supported_nvidia_headers.py-->>find_nvidia_header_directory: "Found in CTK"
find_nvidia_header_directory->>find_nvidia_headers.py: "Call _find_ctk_header_directory()"
else Non-CTK Library (e.g., cutensor)
supported_nvidia_headers.py-->>find_nvidia_header_directory: "Check SUPPORTED_HEADERS_NON_CTK"
find_nvidia_header_directory->>supported_nvidia_headers.py: "Get canonical header basename"
supported_nvidia_headers.py-->>find_nvidia_header_directory: "Return 'cutensor.h'"
find_nvidia_header_directory->>find_nvidia_headers.py: "Search site-packages"
find_nvidia_headers.py->>find_nvidia_headers.py: "Check SUPPORTED_SITE_PACKAGE_HEADER_DIRS_NON_CTK"
find_nvidia_headers.py->>find_nvidia_headers.py: "_find_based_on_conda_layout() for non-CTK"
alt Windows Conda Anomaly (cccl)
find_nvidia_headers.py->>find_nvidia_headers.py: "Handle targets/x64 path anomaly"
end
find_nvidia_headers.py->>find_nvidia_headers.py: "Check SUPPORTED_INSTALL_DIRS_NON_CTK"
end
find_nvidia_header_directory-->>User: "Return header directory path or None"
User->>load_nvidia_dynamic_lib: "Load dynamic library (e.g., 'cutensor')"
load_nvidia_dynamic_lib->>supported_nvidia_libs.py: "Query SUPPORTED_LINUX_SONAMES or SUPPORTED_WINDOWS_DLLS"
alt CTK Library
supported_nvidia_libs.py-->>load_nvidia_dynamic_lib: "Return from SUPPORTED_*_SONAMES_CTK"
else Non-CTK Library (cutensor)
supported_nvidia_libs.py-->>load_nvidia_dynamic_lib: "Return from SUPPORTED_*_SONAMES_OTHER"
Note over supported_nvidia_libs.py: "Added: cutensor -> libcutensor.so.2 (Linux)<br/>cutensor -> cutensor.dll (Windows)"
end
load_nvidia_dynamic_lib->>supported_nvidia_libs.py: "Check SITE_PACKAGES_LIBDIRS_*"
supported_nvidia_libs.py-->>load_nvidia_dynamic_lib: "Return site-packages paths"
load_nvidia_dynamic_lib-->>User: "Return LoadedDL object or DynamicLibNotFoundError"
User->>Test Suite: "Run test_find_nvidia_headers"
Test Suite->>find_nvidia_header_directory: "Test all SUPPORTED_HEADERS_NON_CTK.keys()"
Test Suite->>Test Suite: "Loop over cutensor, nvshmem, etc."
Test Suite-->>User: "Report test results"
User->>Test Suite: "Run test_load_nvidia_dynamic_lib"
Test Suite->>load_nvidia_dynamic_lib: "Test all SUPPORTED_LINUX_SONAMES or SUPPORTED_WINDOWS_DLLS"
Test Suite->>Test Suite: "Run in spawned child process for isolation"
Test Suite-->>User: "Report test results with abs_path or 'Not found'"
14 files reviewed, 12 comments
cuda_pathfinder/cuda/pathfinder/_headers/find_nvidia_headers.py
Outdated
Show resolved
Hide resolved
cuda_pathfinder/cuda/pathfinder/_headers/find_nvidia_headers.py
Outdated
Show resolved
Hide resolved
cuda_pathfinder/cuda/pathfinder/_headers/find_nvidia_headers.py
Outdated
Show resolved
Hide resolved
|
@ZzEeKkAa I believe this is ready for review. It'd be great if you could take a look. The only thing left to do is go through the manual testing systematically. |
|
CI ran successfully at commit b079c84: https://github.com/NVIDIA/cuda-python/actions/runs/18886735456/job/53903916554?pr=1194 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
This incremental review covers only the changes made since the last review, not the entire PR. The developer has addressed previous feedback by applying the _abs_norm() wrapper to all remaining return paths in find_nvidia_header_directory() (lines 150, 153, 159) and fixed the PowerShell syntax error in the conda setup script. The _abs_norm() helper (lines 15-18) normalizes path separators and converts relative paths to absolute paths, ensuring consistent path format across all return points—critical for the new cuTENSOR/non-CTK library support that can be installed in diverse locations (site-packages, conda, standard directories). The PowerShell script now includes 'cutensor' in the package list for testing and corrects the missing comma after "libnvshmem-dev". These changes ensure that header discovery returns predictable, normalized paths regardless of installation method or platform, addressing Windows conda path anomalies mentioned in the PR description.
Important Files Changed
| Filename | Score | Overview |
|---|---|---|
| cuda_pathfinder/cuda/pathfinder/_headers/find_nvidia_headers.py | 5/5 | Applied _abs_norm() wrapper to all non-CTK header return paths for consistent path normalization |
| toolshed/conda_create_for_pathfinder_testing.ps1 | 5/5 | Added 'cutensor' package and fixed trailing comma syntax error in package list |
Confidence score: 5/5
- This PR is safe to merge with minimal risk as the changes are targeted fixes addressing specific previous review feedback
- Score reflects that all previously identified issues have been resolved: path normalization is now consistent across all return paths, and the PowerShell syntax error has been corrected
- No files require special attention; both changes are straightforward defensive improvements that enhance cross-platform reliability
2 files reviewed, no comments
| SUPPORTED_HEADERS_CTK | ||
| SUPPORTED_HEADERS_NON_CTK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'll have to document the OS-dependent flavors. The other day I was looking at this doc
https://nvidia.github.io/cuda-python/cuda-pathfinder/latest/generated/cuda.pathfinder.SUPPORTED_NVIDIA_LIBNAMES.html#cuda.pathfinder.SUPPORTED_NVIDIA_LIBNAMES
and noticed that nccl is not on the list, but we clearly support nccl.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created issue #1197 to track follow-on work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This (SUPPORTED_HEADERS_NON_CTK) unfortunately needs to be decided now, no matter how hard we want to push out a release. Once this is documented, it becomes a public API, and there is no turning back without breaking the major version. Let's make sure we reach a conclusion before cutting a new (patch) release, if not in this PR.
| # conda has this anomaly | ||
| cdir_ctk12 = os.path.join(idir, "targets", "x64") | ||
| cdir_ctk13 = os.path.join(cdir_ctk12, "cccl") | ||
| if _joined_isfile(cdir_ctk13, h_basename): | ||
| return cdir_ctk13 | ||
| if _joined_isfile(cdir_ctk12, h_basename): | ||
| return cdir_ctk12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going through systematically:
cu13 = 13.0.2
cu12 = 12.9.1
local-ctk cu13: IINFO test_find_ctk_headers[cccl]: hdr_dir='C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v13.0\\include\\cccl'
local-ctk cu12: INFO test_find_ctk_headers[cccl]: hdr_dir='C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.9\\include'
conda-ctk cu13: INFO test_find_ctk_headers[cccl]: hdr_dir='C:\\Users\\rgrossekunst\\AppData\\Local\\miniforge3\\envs\\pathfinder_testing_cu13.0.2\\Library\\include\\targets\\x64\\cccl'
conda-ctk cu12: INFO test_find_ctk_headers[cccl]: hdr_dir='C:\\Users\\rgrossekunst\\AppData\\Local\\miniforge3\\envs\\pathfinder_testing_cu12.9.1\\Library\\include\\targets\\x64'
I moved the comment to the end of the line, to make it more clear that "anomaly" applies to the targets\x64 part (commit 4d5a41a).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So by "anomaly" you meant x64 shows up only in CCCL's path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had the whole thing in mind: targets\x64 (appears only specifically if conda && windows && cccl)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for spotting it. I now recall we documented this
https://github.com/conda-forge/cccl-feedstock/blob/1fc3562e90c7d81203e62826779c8d45fe66191b/recipe/recipe.yaml#L1-L4
However, I don't think it is Windows only. This is the path on Linux (as documented):
- https://github.com/conda-forge/cuda-cccl-feedstock/blob/19281e6b2bbe5b2a2332795e7bac85e02e1421ca/recipe/build.sh#L20
- https://github.com/conda-forge/cuda-cccl-feedstock/blob/19281e6b2bbe5b2a2332795e7bac85e02e1421ca/recipe/meta.yaml#L91
Could you check again?
ab4c8f3 to
9671b24
Compare
| } | ||
| SUPPORTED_HEADERS_NON_CTK_LINUX = SUPPORTED_HEADERS_NON_CTK_COMMON | SUPPORTED_HEADERS_NON_CTK_LINUX_ONLY | ||
| SUPPORTED_HEADERS_NON_CTK_WINDOWS = SUPPORTED_HEADERS_NON_CTK_COMMON | ||
| SUPPORTED_HEADERS_NON_CTK_ALL = SUPPORTED_HEADERS_NON_CTK_COMMON | SUPPORTED_HEADERS_NON_CTK_LINUX_ONLY |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: What's the difference between SUPPORTED_HEADERS_NON_CTK_ALL and SUPPORTED_HEADERS_NON_CTK_LINUX? They look the same to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In supported_nvidia_dynamic_libs.py there is also SUPPORTED_..._WINDOWS_ONLY. It doesn't exist here, but I wanted to follow the more general form.
We can maybe look at this some more under issue #1197, although I think it's a useful pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I follow. Shouldn't it be
| SUPPORTED_HEADERS_NON_CTK_ALL = SUPPORTED_HEADERS_NON_CTK_COMMON | SUPPORTED_HEADERS_NON_CTK_LINUX_ONLY | |
| SUPPORTED_HEADERS_NON_CTK_ALL = SUPPORTED_HEADERS_NON_CTK_COMMON | SUPPORTED_HEADERS_NON_CTK_LINUX_ONLY | SUPPORTED_HEADERS_NON_CTK_WINDOWS_ONLY |
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe keep an empty dict here so that we express the intent while leaving room for future extension?
|
/ok to test |
Closes #1144, #1116
Bump
cuda-pathfinderversion to1.3.2TODOs:
Main changes:
find_nvidia_headers.pygeneralization for non-CTK headers, introducing a new family ofSUPPORTED_*NON_CTK*variables insupported_nvidia_headers.py.test_load_nvidia_dynamic_lib.pynow loops consistently over allSUPPORTED_LINUX_SONAMESorSUPPORTED_WINDOWS_DLLS, depending on the platform.Piggy-backed changes:
cublasmpdependencies are not reflected insupported_nvidia_libs.py#1116SITE_PACKAGES_LIBDIRS_WINDOWS_OTHERcudsspaths insupported_nvidia_libs.py.ccclIS_WINDOWSconda anomaly (see example below) in find_nvidia_headers.py, e.g. (this was overlooked before due to lack of automatic testing).nvidia-cufftmp-cu13data insupported_nvidia_libs.pyExample for
ccclIS_WINDOWSconda anomaly (notetargets\x64afterinclude\):