EXPERIMENT — NOT FOR MERGING — cuda-pathfinder in build-system.requires#1803

Draft
rwgk wants to merge 23 commits into NVIDIA:main from rwgk:CUDA_PATH_CUDA_HOME_cleanup_pathfinder_in_build-system_requires

@rwgk
Collaborator

@rwgk rwgk commented Mar 21, 2026

Based on #1801, with commit f44a308 from #1801 reverted here.

rwgk and others added 17 commits March 20, 2026 11:45
…d onto main.

Adds cuda.pathfinder._utils.env_var_constants with canonical search order,
enhances get_cuda_home_or_path() with robust path comparison and caching,
and updates documentation across all packages to reflect the new priority.

Co-authored-by: Rob Parolin <rparolin@nvidia.com>
Made-with: Cursor
Drop os.pathsep splitting of CUDA_PATH/CUDA_HOME in both build_hooks.py files.
Both functions now delegate to get_cuda_home_or_path() from cuda.pathfinder,
returning a single path.

See NVIDIA#1801 (comment)

Made-with: Cursor
Safe: currently an internal-only API (not yet public).
Made-with: Cursor
Export get_cuda_path_or_home from cuda.pathfinder.__init__. External
consumers now import from cuda.pathfinder directly. Rename constant
to _CUDA_PATH_ENV_VARS_ORDERED and remove all public references to it.

Made-with: Cursor
Pathfinder 1.5.0 release notes no longer claim cross-package consistency
(that depends on future bindings/core releases). cuda_bindings env var
docs now defer to pathfinder release notes for migration guidance.

Made-with: Cursor
…docs/nv-versions.json before

Discovered via independent review from GPT-5.4 Extra High
Aligns the provenance label with the new CUDA_PATH-first priority.
The label signals the highest-priority env var name, not necessarily
which variable supplied the value.

Discovered via independent review from GPT-5.4 Extra High

Made-with: Cursor
The build backends run in an isolated venv created by pyproject-build.
Although cuda-pathfinder is listed in build-system.requires and gets
installed, the cuda namespace package from backend-path=["."] shadows
the installed cuda-pathfinder, making `from cuda.pathfinder import ...`
fail with ModuleNotFoundError. This broke all CI wheel builds.

Revert _get_cuda_path() to use os.environ.get() directly with
CUDA_PATH > CUDA_HOME priority, and remove cuda-pathfinder from
build-system.requires (it was not there on main; our PR added it).

Made-with: Cursor
Temporary diagnostic to investigate why `from cuda.pathfinder import
get_cuda_path_or_home` fails inside the pyproject-build isolated venv
despite cuda-pathfinder being in build-system.requires.

Made-with: Cursor
@copy-pr-bot
Contributor

copy-pr-bot bot commented Mar 21, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

@rwgk
Collaborator Author

rwgk commented Mar 21, 2026

/ok to test

@rwgk
Collaborator Author

rwgk commented Mar 21, 2026

Tracking Cursor Opus 4.6 1M Thinking analysis that was the motivation for this experimental PR:

My prompt:

Could you please help me develop my understanding: assume cuda-pathfinder 1.5.0 (this PR) is released on PyPI. Could we then revert the last commit? Or will this still be an issue:

> the cuda namespace package from backend-path=["."] shadows the installed cuda-pathfinder

Cursor response:

 The namespace issue would persist even with 1.5.0 on PyPI. The problem isn't about which version is available — it's structural.
  Here's the mechanism: pyproject-build creates an isolated venv, installs build-system.requires (which would include cuda-pathfinder), then runs the build backend
  in a subprocess. In that subprocess, backend-path=["."] prepends cuda_bindings/ to sys.path. Now sys.path has two locations providing the cuda namespace:
  • cuda_bindings/cuda/ (from backend-path, contains bindings/)
  • <venv>/site-packages/cuda/ (from installed cuda-pathfinder, contains pathfinder/)

  Both are implicit namespace packages (no __init__.py). In theory, Python should merge them into a single namespace with __path__ covering both locations. In
  practice, this failed in CI — and cuda-pathfinder 1.4.3 IS on PyPI and IS a pure-Python wheel, so it should have installed fine. The namespace structure in 1.5.0
  would be identical.
  I'll caveat: I'm not 100% certain the root cause is namespace shadowing vs. a silent pip install failure inside the cibuildwheel container. You could verify by
  adding a diagnostic like CIBW_BEFORE_BUILD: "pip list | grep cuda" to confirm whether the package is actually installed. But either way, the os.environ.get()
  approach is the right call for build backends — it has zero external dependencies and the full get_cuda_path_or_home() features (symlink comparison, warnings,
  caching) aren't useful during a build.

@rwgk
Collaborator Author

rwgk commented Mar 21, 2026

Tracking Cursor Opus 4.6 1M Thinking analysis of https://github.com/NVIDIA/cuda-python/actions/runs/23381904071 logs (at commit 47ea03e here):

  Mystery solved definitively:
  • Line 5128: `Successfully installed cuda-pathfinder-1.4.3` — pip installed it successfully into the isolated venv at /tmp/build-env-iic2uvbl/
  • Line 5147: The _in_process.py subprocess runs from /opt/python/cp312-cp312/lib/python3.12/site-packages/ — that's the host Python's site-packages, not the
    isolated venv
  • Line 5162: ModuleNotFoundError: No module named 'cuda.pathfinder'

  This confirms the namespace package resolution issue. cuda-pathfinder 1.4.3 is installed and present, but from cuda.pathfinder import ... fails because
  backend-path=["."] puts /project/cuda_bindings/ on sys.path — its cuda/ directory (containing bindings/) prevents the namespace from merging with the venv's
  cuda/pathfinder/.
  Releasing 1.5.0 on PyPI would not help. The issue is structural: Python's namespace package resolution doesn't merge cuda/ across the backend-path and the
  isolated venv's site-packages in this environment. The os.environ.get() approach on CUDA_PATH_CUDA_HOME_cleanup is the right fix.

rwgk added 2 commits March 21, 2026 08:12
Temporary instrumentation to understand why `import cuda.pathfinder`
fails inside pyproject-build's isolated venv despite cuda-pathfinder
being successfully installed. Prints sys.path, cuda.__path__,
cuda.__spec__, and whether cuda.pathfinder is importable.

Made-with: Cursor
If `import cuda.pathfinder` fails, try `pkgutil.extend_path` to merge
the cuda namespace across sys.path entries and retry. This tests whether
extend_path is a viable fix for the backend-path namespace conflict.

Made-with: Cursor
@rwgk
Collaborator Author

rwgk commented Mar 21, 2026

/ok to test


Temporary reduction to match the build matrix cut, one entry per
platform (linux-64, linux-aarch64, win-64).

Made-with: Cursor
@rwgk
Collaborator Author

rwgk commented Mar 21, 2026

The analysis below is based on the 3 successful build logs and the corresponding 9 successful test logs from this CI run:

https://github.com/NVIDIA/cuda-python/actions/runs/23382608485?pr=1803


Build hooks patch: use cuda.pathfinder at build time

Apply after cuda-pathfinder >= 1.5 is released on PyPI and the
build-system.requires in both cuda_bindings/pyproject.toml and
cuda_core/pyproject.toml pin cuda-pathfinder >= 1.5.

Root cause (confirmed by CI diagnostic on 3 platforms)

When pyproject-build (or pip) builds a wheel it:

  1. Creates an isolated venv and installs build-system.requires
    (including cuda-pathfinder).
  2. Prepends the project directory to sys.path (via backend-path=["."])
    and imports the build backend (build_hooks).
  3. Removes the project directory from sys.path again (via the
    _BackendHook context manager inside pyproject_hooks).

Between steps 2 and 3 the cuda namespace package has already been
resolved. Python caches it in sys.modules with a _NamespacePath that
points only to the project's cuda/ directory
(e.g. /project/cuda_bindings/cuda). The site-packages cuda/ directory
(which contains pathfinder/) is never merged in.

Key evidence from all 6 diagnostic blocks (2 packages × 3 platforms):

cuda.__path__: _NamespacePath(['/project/cuda_bindings/cuda'])
cuda.__file__: None
import cuda.pathfinder failed (before extend_path): No module named 'cuda.pathfinder'
cuda.__path__ (after extend_path): _NamespacePath(['/project/cuda_bindings/cuda'])
import cuda.pathfinder failed (after extend_path): No module named 'cuda.pathfinder'
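Diagnostics of this shape can be gathered with a small helper. The following is a minimal sketch, not the instrumentation used in this PR: `describe_package` is a hypothetical name, and `json` / `decoder` are stand-ins for `cuda` / `pathfinder` so the snippet runs anywhere.

```python
import importlib


def describe_package(package_name, submodule_name):
    """Collect the kind of facts shown in the diagnostic output above."""
    package = importlib.import_module(package_name)
    package_path = getattr(package, "__path__", None)
    report = {
        # A namespace package reports _NamespacePath here; a regular one reports list.
        "path_type": type(package_path).__name__,
        "path": list(package_path or []),
        # None for namespace packages, which have no __init__.py.
        "file": getattr(package, "__file__", None),
    }
    try:
        importlib.import_module(f"{package_name}.{submodule_name}")
    except ModuleNotFoundError as exc:
        report["submodule_error"] = str(exc)
    else:
        report["submodule_error"] = None
    return report


# Stand-in names; a build hook would inspect "cuda" and "pathfinder" instead.
for key, value in describe_package("json", "decoder").items():
    print(f"{key}: {value}")
```

For a namespace package, `path_type` would be `_NamespacePath` and `file` would be `None`, matching the CI output above.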

pkgutil.extend_path did not add the site-packages path. This is
because _NamespacePath._recalculate() is driven by the path finders
(not os.path.isdir), and the import machinery has already locked the
cuda namespace to a single portion.
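A further detail that may contribute to this result: `pkgutil.extend_path` is documented to return its input unchanged when the input is not a list, and a `_NamespacePath` is not a list. The sketch below illustrates that behavior with a hypothetical stand-in class (`FakeNamespacePath`) rather than the real `cuda.__path__`, so it is an illustration of the documented rule, not a reproduction of the CI run.

```python
import pkgutil


class FakeNamespacePath:
    """Hypothetical stand-in for a non-list __path__ such as _NamespacePath."""

    def __init__(self, entries):
        self._entries = list(entries)

    def __iter__(self):
        return iter(self._entries)


original = FakeNamespacePath(["/project/cuda_bindings/cuda"])
result = pkgutil.extend_path(original, "cuda")

# extend_path hands back the very same object when it is not a list,
# so no additional portion can be appended through this call.
print(result is original)
```

If this is what happened in CI, it would be consistent with the "after extend_path" line still showing a `_NamespacePath` with a single entry.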

Fix strategy

Replace cuda.__path__ (a _NamespacePath) with a plain list that
includes both the project's cuda/ and the site-packages cuda/.
A plain list is not subject to _NamespacePath._recalculate(), so the
additional path sticks.
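The strategy can be tried in isolation. The following is a minimal sketch under stated assumptions: `nsdemo_pkg` is a hypothetical namespace package created in a temporary directory, with a "project" portion that is on `sys.path` and a "site" portion that is not. It does not reproduce the PEP 517 build environment; it only shows that assigning a plain list to `__path__` makes the second portion importable.

```python
import importlib
import os
import sys
import tempfile


def demo_plain_list_path():
    """Show that a plain-list __path__ can expose a second namespace portion."""
    with tempfile.TemporaryDirectory() as tmp:
        project = os.path.join(tmp, "project")
        site = os.path.join(tmp, "site")
        # Two portions of the hypothetical namespace package "nsdemo_pkg".
        for root, sub in ((project, "bindings"), (site, "pathfinder")):
            pkg_dir = os.path.join(root, "nsdemo_pkg", sub)
            os.makedirs(pkg_dir)
            with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
                f.write(f"NAME = {sub!r}\n")

        sys.path.insert(0, project)  # only the project portion is visible
        try:
            importlib.invalidate_caches()
            import nsdemo_pkg

            try:
                importlib.import_module("nsdemo_pkg.pathfinder")
                before = "importable"
            except ModuleNotFoundError:
                before = "missing"

            # The fix: swap in a plain list that also names the second portion.
            nsdemo_pkg.__path__ = list(nsdemo_pkg.__path__) + [
                os.path.join(site, "nsdemo_pkg")
            ]
            module = importlib.import_module("nsdemo_pkg.pathfinder")
            return before, module.NAME
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(project)
            for name in [n for n in sys.modules if n.split(".")[0] == "nsdemo_pkg"]:
                del sys.modules[name]


print(demo_plain_list_path())
```

The first element of the returned tuple records whether the submodule was importable before the `__path__` swap, the second is read from the submodule after the swap.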

Patch

Apply to both cuda_bindings/build_hooks.py and cuda_core/build_hooks.py.

Step 1 — add _import_get_cuda_path_or_home helper

def _import_get_cuda_path_or_home():
    """Import get_cuda_path_or_home, working around PEP 517 namespace shadowing.

    In isolated build environments, backend-path=["."] causes the ``cuda``
    namespace package to resolve to only the project's ``cuda/`` directory,
    hiding ``cuda.pathfinder`` installed in the build-env's site-packages.
    Fix by replacing ``cuda.__path__`` with a plain list that includes the
    site-packages ``cuda/`` directory.
    """
    try:
        import cuda.pathfinder
    except ModuleNotFoundError:
        pass
    else:
        return cuda.pathfinder.get_cuda_path_or_home

    # In PEP 517 isolated builds, the cuda namespace resolves to only the
    # project's cuda/ dir (via backend-path=["."]).  The _NamespacePath
    # cached in sys.modules never merges the site-packages cuda/ where
    # pathfinder lives.  Replace it with a plain list so the fix sticks.
    import cuda

    for p in sys.path:
        sp_cuda = os.path.join(p, "cuda")
        if os.path.isdir(os.path.join(sp_cuda, "pathfinder")):
            cuda.__path__ = list(cuda.__path__) + [sp_cuda]
            break

    import cuda.pathfinder

    return cuda.pathfinder.get_cuda_path_or_home

Step 2 — replace _get_cuda_path

@functools.cache
def _get_cuda_path() -> str:
    get_cuda_path_or_home = _import_get_cuda_path_or_home()
    cuda_path = get_cuda_path_or_home()
    if not cuda_path:
        raise RuntimeError("Environment variable CUDA_PATH or CUDA_HOME is not set")
    print("CUDA path:", cuda_path)
    return cuda_path

Step 3 — remove _diagnose_namespace_packages

Delete the _diagnose_namespace_packages() function entirely.

Step 4 — restore cuda-pathfinder in build-system.requires

In both cuda_bindings/pyproject.toml and cuda_core/pyproject.toml,
add "cuda-pathfinder>=1.5" back to the [build-system] requires list.

rwgk added 2 commits March 21, 2026 09:48
Replace the temporary namespace diagnostic with _import_get_cuda_path_or_home
which works around PEP 517 namespace shadowing: in isolated build envs,
backend-path=["."] causes the cuda namespace to resolve to only the project's
cuda/ dir, hiding the installed cuda-pathfinder. The fix replaces the
_NamespacePath with a plain list that includes the site-packages cuda/ dir.

Made-with: Cursor
cuda-pathfinder 1.4.x (currently on PyPI) does not expose
get_cuda_path_or_home yet. Fall back to os.environ.get via getattr
so the namespace fix can be tested on CI now.
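A fallback of this kind could look roughly like the sketch below. The function name `resolve_cuda_path`, the injected `environ` mapping, and the choice to treat empty strings as unset are assumptions made for illustration; the actual commit may differ.

```python
import os
import types


def resolve_cuda_path(pathfinder_module, environ=os.environ):
    """Prefer get_cuda_path_or_home when the module has it, else read env vars."""
    getter = getattr(pathfinder_module, "get_cuda_path_or_home", None)
    if getter is not None:
        return getter()
    # Fallback with CUDA_PATH > CUDA_HOME priority.
    # Assumption: an empty string is treated the same as an unset variable.
    return environ.get("CUDA_PATH") or environ.get("CUDA_HOME") or None


# Stand-ins for an older module without the function and a newer one with it.
old_pathfinder = types.SimpleNamespace()
new_pathfinder = types.SimpleNamespace(get_cuda_path_or_home=lambda: "/from/pathfinder")
env = {"CUDA_HOME": "/opt/cuda-home", "CUDA_PATH": "/opt/cuda-path"}

print(resolve_cuda_path(old_pathfinder, env))
print(resolve_cuda_path(new_pathfinder, env))
```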

Made-with: Cursor
@rwgk
Collaborator Author

rwgk commented Mar 21, 2026

/ok to test

`from cuda import pathfinder` raises ImportError (not ModuleNotFoundError)
when the cuda namespace exists but pathfinder is not a submodule.
Switch to `import cuda.pathfinder` which raises ModuleNotFoundError
when the submodule is missing.
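The difference between the two import forms can be checked with any existing package. The sketch below uses the standard library's `json` as a stand-in for `cuda` (an assumption for portability; `json` is a regular package rather than a namespace package) and a deliberately nonexistent submodule name.

```python
def from_import_error_type():
    """Return the exception type raised by the `from pkg import name` form."""
    try:
        from json import no_such_submodule_xyz  # noqa: F401
    except ImportError as exc:
        return type(exc)
    return None


def dotted_import_error_type():
    """Return the exception type raised by the `import pkg.name` form."""
    try:
        import json.no_such_submodule_xyz  # noqa: F401
    except ImportError as exc:
        return type(exc)
    return None


print(from_import_error_type().__name__)
print(dotted_import_error_type().__name__)
```

Because `ModuleNotFoundError` is a subclass of `ImportError`, an `except ModuleNotFoundError` clause catches only the dotted form, which is why the import style matters for the helper's `try`/`except`.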

Made-with: Cursor
@rwgk
Collaborator Author

rwgk commented Mar 21, 2026

/ok to test
