Conversation

@rwgk
Collaborator

@rwgk rwgk commented Oct 28, 2025

Closes #1144, #1116

Bump cuda-pathfinder version to 1.3.2

TODOs:

  • Paste all outputs: (site-packages, conda, standard-CTK) x (cu12, cu13) x (linux-64, linux-aarch64, win-64)

Main changes:

  • find_nvidia_headers.py generalization for non-CTK headers, introducing a new family of SUPPORTED_*NON_CTK* variables in supported_nvidia_headers.py.
  • test_load_nvidia_dynamic_lib.py now loops consistently over all SUPPORTED_LINUX_SONAMES or SUPPORTED_WINDOWS_DLLS, depending on the platform.
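
A minimal sketch of the platform-dependent loop described in the second bullet. The two tables here are hypothetical stand-ins, not the real contents of supported_nvidia_libs.py: only the cutensor names are taken from this page, and `example_linux_only`, the tuple value shape, and the helper `libnames_for_platform` are assumptions made for illustration.

```python
import sys

# Hypothetical stand-ins for the real tables; entries are illustrative only.
SUPPORTED_LINUX_SONAMES = {
    "cutensor": ("libcutensor.so.2",),
    "example_linux_only": ("libexample.so.1",),  # made-up entry
}
SUPPORTED_WINDOWS_DLLS = {
    "cutensor": ("cutensor.dll",),
}

IS_WINDOWS = sys.platform == "win32"


def libnames_for_platform(is_windows=IS_WINDOWS):
    """Return the sorted libnames a test would loop over on this platform."""
    table = SUPPORTED_WINDOWS_DLLS if is_windows else SUPPORTED_LINUX_SONAMES
    return sorted(table)


# A test would attempt one load per libname; here we only list them.
for libname in libnames_for_platform():
    print(libname)
```

Driving the loop from a single table per platform means a newly supported library is picked up by the test without touching the test file.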

Piggy-backed changes:

Example for cccl IS_WINDOWS conda anomaly (note targets\x64 after include\):

INFO test_find_ctk_headers[cccl]: hdr_dir='C:\\Users\\rgrossekunst\\AppData\\Local\\miniforge3\\envs\\pathfinder_testing_cu12.9.1\\Library\\include\\targets\\x64'

@copy-pr-bot
Contributor

copy-pr-bot bot commented Oct 28, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@rwgk
Collaborator Author

rwgk commented Oct 28, 2025

/ok to test

@rwgk
Collaborator Author

rwgk commented Oct 28, 2025

/ok to test

@rwgk rwgk marked this pull request as ready for review October 28, 2025 19:23
@copy-pr-bot
Contributor

copy-pr-bot bot commented Oct 28, 2025

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@rwgk
Collaborator Author

rwgk commented Oct 28, 2025

/ok to test

Contributor

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Sequence Diagram

sequenceDiagram
    participant User
    participant find_nvidia_header_directory
    participant find_nvidia_headers.py
    participant supported_nvidia_headers.py
    participant load_nvidia_dynamic_lib
    participant supported_nvidia_libs.py
    participant Test Suite

    User->>find_nvidia_header_directory: "Request header for libname (e.g., 'cutensor')"
    find_nvidia_header_directory->>supported_nvidia_headers.py: "Check if libname in SUPPORTED_HEADERS_CTK"
    alt CTK Library
        supported_nvidia_headers.py-->>find_nvidia_header_directory: "Found in CTK"
        find_nvidia_header_directory->>find_nvidia_headers.py: "Call _find_ctk_header_directory()"
    else Non-CTK Library (e.g., cutensor)
        supported_nvidia_headers.py-->>find_nvidia_header_directory: "Check SUPPORTED_HEADERS_NON_CTK"
        find_nvidia_header_directory->>supported_nvidia_headers.py: "Get canonical header basename"
        supported_nvidia_headers.py-->>find_nvidia_header_directory: "Return 'cutensor.h'"
        find_nvidia_header_directory->>find_nvidia_headers.py: "Search site-packages"
        find_nvidia_headers.py->>find_nvidia_headers.py: "Check SUPPORTED_SITE_PACKAGE_HEADER_DIRS_NON_CTK"
        find_nvidia_headers.py->>find_nvidia_headers.py: "_find_based_on_conda_layout() for non-CTK"
        alt Windows Conda Anomaly (cccl)
            find_nvidia_headers.py->>find_nvidia_headers.py: "Handle targets/x64 path anomaly"
        end
        find_nvidia_headers.py->>find_nvidia_headers.py: "Check SUPPORTED_INSTALL_DIRS_NON_CTK"
    end
    find_nvidia_header_directory-->>User: "Return header directory path or None"

    User->>load_nvidia_dynamic_lib: "Load dynamic library (e.g., 'cutensor')"
    load_nvidia_dynamic_lib->>supported_nvidia_libs.py: "Query SUPPORTED_LINUX_SONAMES or SUPPORTED_WINDOWS_DLLS"
    alt CTK Library
        supported_nvidia_libs.py-->>load_nvidia_dynamic_lib: "Return from SUPPORTED_*_SONAMES_CTK"
    else Non-CTK Library (cutensor)
        supported_nvidia_libs.py-->>load_nvidia_dynamic_lib: "Return from SUPPORTED_*_SONAMES_OTHER"
        Note over supported_nvidia_libs.py: "Added: cutensor -> libcutensor.so.2 (Linux)<br/>cutensor -> cutensor.dll (Windows)"
    end
    load_nvidia_dynamic_lib->>supported_nvidia_libs.py: "Check SITE_PACKAGES_LIBDIRS_*"
    supported_nvidia_libs.py-->>load_nvidia_dynamic_lib: "Return site-packages paths"
    load_nvidia_dynamic_lib-->>User: "Return LoadedDL object or DynamicLibNotFoundError"

    User->>Test Suite: "Run test_find_nvidia_headers"
    Test Suite->>find_nvidia_header_directory: "Test all SUPPORTED_HEADERS_NON_CTK.keys()"
    Test Suite->>Test Suite: "Loop over cutensor, nvshmem, etc."
    Test Suite-->>User: "Report test results"

    User->>Test Suite: "Run test_load_nvidia_dynamic_lib"
    Test Suite->>load_nvidia_dynamic_lib: "Test all SUPPORTED_LINUX_SONAMES or SUPPORTED_WINDOWS_DLLS"
    Test Suite->>Test Suite: "Run in spawned child process for isolation"
    Test Suite-->>User: "Report test results with abs_path or 'Not found'"

14 files reviewed, 12 comments
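
The "spawned child process for isolation" step in the diagram can be sketched roughly as follows. This is an assumption-laden illustration, not the test suite's actual mechanism: it uses `subprocess` with a fresh interpreter and plain `ctypes.CDLL` instead of `load_nvidia_dynamic_lib`, and the helper name `try_load_in_child` and the JSON result shape are hypothetical.

```python
import json
import subprocess
import sys

# Code run in the child: try to load a library by name with ctypes and
# report the outcome as a single JSON line on stdout.
CHILD_CODE = """
import ctypes, json, sys
name = sys.argv[1]
try:
    ctypes.CDLL(name)
    result = {"libname": name, "status": "loaded"}
except OSError as exc:
    result = {"libname": name, "status": "Not found", "error": str(exc)}
print(json.dumps(result))
"""


def try_load_in_child(name, timeout=30):
    """Attempt the load in a fresh interpreter so that a loaded library
    (or a crash while loading) cannot affect the parent process."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, name],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode != 0:
        return {"libname": name, "status": "child failed", "stderr": proc.stderr}
    return json.loads(proc.stdout)


print(try_load_in_child("definitely_not_a_real_library_name"))
```

Because each attempt runs in its own process, one library's load cannot change the outcome of the next attempt.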

@rwgk
Collaborator Author

rwgk commented Oct 28, 2025

@ZzEeKkAa I believe this is ready for review. It'd be great if you could take a look.

The only thing left to do is go through the manual testing systematically.

@rwgk
Collaborator Author

rwgk commented Oct 28, 2025

Contributor

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Greptile Summary

This incremental review covers only the changes made since the last review, not the entire PR. The developer has addressed previous feedback by applying the _abs_norm() wrapper to all remaining return paths in find_nvidia_header_directory() (lines 150, 153, 159) and fixed the PowerShell syntax error in the conda setup script. The _abs_norm() helper (lines 15-18) normalizes path separators and converts relative paths to absolute paths, ensuring consistent path format across all return points—critical for the new cuTENSOR/non-CTK library support that can be installed in diverse locations (site-packages, conda, standard directories). The PowerShell script now includes 'cutensor' in the package list for testing and corrects the missing comma after "libnvshmem-dev". These changes ensure that header discovery returns predictable, normalized paths regardless of installation method or platform, addressing Windows conda path anomalies mentioned in the PR description.
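
For readers who have not seen the helper, a path normalizer of the kind the summary describes might look like the sketch below. The real `_abs_norm()` body is not shown on this page, so the name `abs_norm_sketch` and the `None` pass-through are assumptions. The only library behavior relied on is that `os.path.abspath` both makes a path absolute and normalizes it.

```python
import os


def abs_norm_sketch(path):
    """Return an absolute, normalized form of path, or None if path is None.

    Hypothetical stand-in for _abs_norm(); os.path.abspath() already applies
    os.path.normpath(), so separators and '..' segments are cleaned up too.
    """
    if path is None:
        return None
    return os.path.abspath(path)


# A relative path with a redundant '..' segment comes back absolute and clean.
print(abs_norm_sketch(os.path.join("Library", "include", "..", "include")))
```

Applying the same wrapper at every return point means callers can compare or log the returned paths without caring which search branch produced them.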

Important Files Changed

Filename | Score | Overview
cuda_pathfinder/cuda/pathfinder/_headers/find_nvidia_headers.py | 5/5 | Applied _abs_norm() wrapper to all non-CTK header return paths for consistent path normalization
toolshed/conda_create_for_pathfinder_testing.ps1 | 5/5 | Added 'cutensor' package and fixed trailing comma syntax error in package list

Confidence score: 5/5

  • This PR is safe to merge with minimal risk as the changes are targeted fixes addressing specific previous review feedback
  • Score reflects that all previously identified issues have been resolved: path normalization is now consistent across all return paths, and the PowerShell syntax error has been corrected
  • No files require special attention; both changes are straightforward defensive improvements that enhance cross-platform reliability

2 files reviewed, no comments

@leofang leofang assigned leofang and rwgk and unassigned leofang Oct 28, 2025
@leofang leofang added the enhancement (Any code-related improvements), P0 (High priority - Must do!), and cuda.pathfinder (Everything related to the cuda.pathfinder module) labels Oct 28, 2025
Comment on lines 20 to +21
SUPPORTED_HEADERS_CTK
SUPPORTED_HEADERS_NON_CTK
Member

I think we'll have to document the OS-dependent flavors. The other day I was looking at this doc
https://nvidia.github.io/cuda-python/cuda-pathfinder/latest/generated/cuda.pathfinder.SUPPORTED_NVIDIA_LIBNAMES.html#cuda.pathfinder.SUPPORTED_NVIDIA_LIBNAMES
and noticed that nccl is not on the list, but we clearly support nccl.

Collaborator Author

I created issue #1197 to track follow-on work.

Member

@leofang leofang Oct 29, 2025

This (SUPPORTED_HEADERS_NON_CTK) unfortunately needs to be decided now, no matter how hard we want to push out a release. Once this is documented, it becomes a public API, and there is no turning back without breaking the major version. Let's make sure we reach a conclusion before cutting a new (patch) release, if not in this PR.

Comment on lines 42 to 48
# conda has this anomaly
cdir_ctk12 = os.path.join(idir, "targets", "x64")
cdir_ctk13 = os.path.join(cdir_ctk12, "cccl")
if _joined_isfile(cdir_ctk13, h_basename):
    return cdir_ctk13
if _joined_isfile(cdir_ctk12, h_basename):
    return cdir_ctk12
Member

@leofang leofang Oct 28, 2025

I don't understand this comment. The difference between CUDA 12 & 13 is universal (differ by a cccl subdir) and not limited to conda?

  • CUDA 12 Windows path is %CONDA_PREFIX%\Library\include\targets\x64\ (link)
  • CUDA 13 Windows path is %CONDA_PREFIX%\Library\include\targets\x64\cccl (link)

Collaborator Author

@rwgk rwgk Oct 29, 2025

Going through systematically:

cu13 = 13.0.2
cu12 = 12.9.1

local-ctk cu13: INFO test_find_ctk_headers[cccl]: hdr_dir='C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v13.0\\include\\cccl'

local-ctk cu12: INFO test_find_ctk_headers[cccl]: hdr_dir='C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.9\\include'

conda-ctk cu13: INFO test_find_ctk_headers[cccl]: hdr_dir='C:\\Users\\rgrossekunst\\AppData\\Local\\miniforge3\\envs\\pathfinder_testing_cu13.0.2\\Library\\include\\targets\\x64\\cccl'

conda-ctk cu12: INFO test_find_ctk_headers[cccl]: hdr_dir='C:\\Users\\rgrossekunst\\AppData\\Local\\miniforge3\\envs\\pathfinder_testing_cu12.9.1\\Library\\include\\targets\\x64'

I moved the comment to the end of the line, to make it more clear that "anomaly" applies to the targets\x64 part (commit 4d5a41a).
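
The probing order behind the two conda results above can be sketched as a standalone function. This mirrors the snippet under review but is not the actual implementation: `os.path.isfile` on a joined path stands in for `_joined_isfile`, the function name is hypothetical, and `marker.h` is a made-up header basename used only for the demo.

```python
import os
import tempfile


def find_cccl_in_conda_windows_include(idir, h_basename):
    """Probe <idir>/targets/x64/cccl (cu13 layout) first, then
    <idir>/targets/x64 (cu12 layout); return the first hit or None."""
    cdir_ctk12 = os.path.join(idir, "targets", "x64")
    cdir_ctk13 = os.path.join(cdir_ctk12, "cccl")
    for candidate in (cdir_ctk13, cdir_ctk12):
        if os.path.isfile(os.path.join(candidate, h_basename)):
            return candidate
    return None


# Demo: build a cu12-style layout in a temporary directory and look it up.
with tempfile.TemporaryDirectory() as idir:
    cu12_dir = os.path.join(idir, "targets", "x64")
    os.makedirs(cu12_dir)
    open(os.path.join(cu12_dir, "marker.h"), "w").close()
    found = find_cccl_in_conda_windows_include(idir, "marker.h")
    print(os.path.relpath(found, idir))
```

Checking the deeper cccl directory first matters in this sketch because the cu13 directory is nested inside the cu12 one.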

Member

So by "anomaly" you meant x64 shows up only in CCCL's path?

Collaborator Author

I had the whole thing in mind: targets\x64 (appears only specifically if conda && windows && cccl)

@rwgk rwgk force-pushed the cutensor_support branch from ab4c8f3 to 9671b24 Compare October 28, 2025 22:13
}
SUPPORTED_HEADERS_NON_CTK_LINUX = SUPPORTED_HEADERS_NON_CTK_COMMON | SUPPORTED_HEADERS_NON_CTK_LINUX_ONLY
SUPPORTED_HEADERS_NON_CTK_WINDOWS = SUPPORTED_HEADERS_NON_CTK_COMMON
SUPPORTED_HEADERS_NON_CTK_ALL = SUPPORTED_HEADERS_NON_CTK_COMMON | SUPPORTED_HEADERS_NON_CTK_LINUX_ONLY
Member

Q: What's the difference between SUPPORTED_HEADERS_NON_CTK_ALL and SUPPORTED_HEADERS_NON_CTK_LINUX? They look the same to me.

Collaborator Author

In supported_nvidia_dynamic_libs.py there is also SUPPORTED_..._WINDOWS_ONLY. It doesn't exist here, but I wanted to follow the more general form.

We can maybe look at this some more under issue #1197, although I think it's a useful pattern.

Member

@leofang leofang Oct 29, 2025

I am not sure I follow. Shouldn't it be

Suggested change
- SUPPORTED_HEADERS_NON_CTK_ALL = SUPPORTED_HEADERS_NON_CTK_COMMON | SUPPORTED_HEADERS_NON_CTK_LINUX_ONLY
+ SUPPORTED_HEADERS_NON_CTK_ALL = SUPPORTED_HEADERS_NON_CTK_COMMON | SUPPORTED_HEADERS_NON_CTK_LINUX_ONLY | SUPPORTED_HEADERS_NON_CTK_WINDOWS_ONLY

?

Member

Maybe keep an empty dict here so that we express the intent while leaving room for future extension?
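
A small sketch of what the empty-dict suggestion could look like, assuming the tables are plain dicts merged with the `|` operator (Python 3.9+). The entries are placeholders, not the real table contents: `cutensor.h` is the basename mentioned in the review diagram above, and `example_linux_only` is made up.

```python
# Placeholder entries for illustration only.
SUPPORTED_HEADERS_NON_CTK_COMMON = {"cutensor": "cutensor.h"}
SUPPORTED_HEADERS_NON_CTK_LINUX_ONLY = {"example_linux_only": "example_linux_only.h"}
# Empty for now: states the intent and leaves room for Windows-only entries.
SUPPORTED_HEADERS_NON_CTK_WINDOWS_ONLY = {}

SUPPORTED_HEADERS_NON_CTK_LINUX = (
    SUPPORTED_HEADERS_NON_CTK_COMMON | SUPPORTED_HEADERS_NON_CTK_LINUX_ONLY
)
SUPPORTED_HEADERS_NON_CTK_WINDOWS = (
    SUPPORTED_HEADERS_NON_CTK_COMMON | SUPPORTED_HEADERS_NON_CTK_WINDOWS_ONLY
)
SUPPORTED_HEADERS_NON_CTK_ALL = (
    SUPPORTED_HEADERS_NON_CTK_COMMON
    | SUPPORTED_HEADERS_NON_CTK_LINUX_ONLY
    | SUPPORTED_HEADERS_NON_CTK_WINDOWS_ONLY
)

# While the Windows-only dict is empty, ALL and LINUX hold the same entries.
print(SUPPORTED_HEADERS_NON_CTK_ALL == SUPPORTED_HEADERS_NON_CTK_LINUX)
```

With this shape, adding a Windows-only header later changes only the one dict, and the `_ALL` and `_WINDOWS` tables follow automatically.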

@rwgk
Collaborator Author

rwgk commented Oct 29, 2025

/ok to test

Labels

cuda.pathfinder: Everything related to the cuda.pathfinder module
enhancement: Any code-related improvements
P0: High priority - Must do!

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support cuTENSOR in pathfinder
[BUG]: cublasmp dependencies are not reflected in supported_nvidia_libs.py

2 participants