Support CUDA 12.8 #309

Open
zaz wants to merge 21 commits into geometric-intelligence:main from zaz:support-cu128

zaz commented Apr 11, 2026

Checklist

  • My pull request has a clear and explanatory title.
  • My pull request passes the Linting test.
  • I added appropriate unit tests and I made sure the code passes all unit tests. (refer to comment below)
  • My PR follows PEP8 guidelines. (refer to comment below)
  • My code is properly documented, using numpy docs conventions, and I made sure the documentation renders properly.
  • I linked to issues and PRs that are relevant to this PR.

PR series: #303 #307 #308 #309

This PR covers only commits da969a2..9741e93. It is number 3 in a series of staged PRs and builds on #308; if #308 is rejected, the commits in this PR need to be reviewed manually. In particular, 2953cfb is required so that the OGB dataset loader does not crash.

Description

Commits:

  1. Loosen the torch pin from ==2.3.0 to >=2.3.0.
  2. Set 3 torch sparse packages to never build from source, because the source build was taking priority over the pre-built wheels.
  3. Add CUDA 12.8 support. Fixes #306 (Support cu128 for Blackwell GPUs).
  4. Make backbone-specific imports optional.
  5. Add tests for item 4.

The last two shouldn't affect users who use the automated install. For users who are experimenting with later CUDA versions, later torch versions, etc., making those dependencies optional means fewer dependencies to manage when those backbones are not in use.

If you don't want to support torch >2.3.0, we could add a CLI option that selects a specific torch version and update the installation instructions to use it, noting that non-2.3.0 versions are experimental. Still, it would be good to support torch versions above 2.3.0, because GPUs that require them are becoming more common.

Issue

Fixes #306.

zaz and others added 21 commits April 10, 2026 15:56
Run `pre-commit run --all-files`.

Fixes linter warning: numpydoc-validation flagged mismatched
underline lengths in docstring section headers.

Fixes linter warning: numpydoc-validation flagged GL08 (missing
docstring) on 4 modules.

Run `pre-commit run --all-files`.

After adding these, running `pre-commit run --all-files` indicates
no existing issues.

Run `codespell --write-changes`, then manually correct.

Fix grammar using Claude Haiku 4.5, then manually correct.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

Use `return 1` instead of `exit 1` so that sourcing the script
with an invalid platform prints the error without killing the shell.

Fixes geometric-intelligence#305

Verify that `source uv_env_setup.sh INVALID` does not kill the
user's shell (uses return instead of exit).
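
The difference between `return` and `exit` in a sourced script can be checked from Python. The following is only a sketch of that kind of check, not the test added in this PR: the script bodies, the file name `setup_sketch.sh`, and the helper `shell_survives_sourcing` are hypothetical, and it assumes `bash` is on the PATH.

```python
import os
import shutil
import subprocess
import tempfile

# Hypothetical stand-ins for the error branch of a setup script.
WITH_RETURN = 'echo "Invalid platform: $1" >&2\nreturn 1\n'
WITH_EXIT = 'echo "Invalid platform: $1" >&2\nexit 1\n'


def shell_survives_sourcing(script_body: str) -> bool:
    """Source the script in a fresh bash and report whether the shell kept running."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "setup_sketch.sh")
        with open(path, "w") as fh:
            fh.write(script_body)
        # The trailing echo only runs if sourcing did not terminate the shell.
        result = subprocess.run(
            ["bash", "-c", 'source "$0" INVALID; echo alive', path],
            capture_output=True,
            text=True,
        )
        return "alive" in result.stdout


if shutil.which("bash"):
    print("return keeps shell alive:", shell_survives_sourcing(WITH_RETURN))
    print("exit keeps shell alive:", shell_survives_sourcing(WITH_EXIT))
```

The error message goes to stderr, so only the `alive` marker on stdout decides the result.
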
Replace if/elif chain with case statement and string concatenation.
No functional changes.

torch >= 2.6 defaults to weights_only=True, which breaks OGB's
torch.load calls that serialize PyG data classes. Register the
needed classes as safe globals so datasets load without errors.
Guarded with hasattr for compatibility with torch < 2.6.
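
A minimal sketch of the hasattr-guarded registration described above. The helper name `register_safe_globals` and the `ExampleData` class are hypothetical; the commit does not list which classes it registers, so a stand-in is used here. In real use the first argument would be `torch.serialization`, whose `add_safe_globals` function takes a list of classes.

```python
from types import SimpleNamespace


def register_safe_globals(serialization, classes):
    """Allowlist `classes` for weights_only loading when the API is available.

    Returns True if the classes were registered, and False when the module
    does not expose `add_safe_globals` (older torch builds).
    """
    if not hasattr(serialization, "add_safe_globals"):
        return False
    serialization.add_safe_globals(list(classes))
    return True


class ExampleData:
    """Hypothetical stand-in for a PyG data class stored in a dataset file."""


# Stubs that mimic a torch with and without the allowlist API.
seen = []
new_torch_like = SimpleNamespace(add_safe_globals=seen.extend)
old_torch_like = SimpleNamespace()

print(register_safe_globals(new_torch_like, [ExampleData]))
print(register_safe_globals(old_torch_like, [ExampleData]))
```

Passing the module in as a parameter keeps the guard testable without torch installed.
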
Prevents PyG's fs.torch_load from falling back to weights_only=False
when loading processed datasets that contain numpy scalars and dtypes.
Includes numpy.dtypes.*DType subclasses for numpy >= 1.25.
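
One way to gather the `numpy.dtypes.*DType` classes is to scan the module namespace by name. This is a sketch under that assumption, not necessarily how the commit does it; `collect_dtype_classes` is a hypothetical helper, and the `numpy.dtypes` import is guarded because the module only exists from numpy 1.25.

```python
import inspect


def collect_dtype_classes(dtypes_module):
    """Return the unique classes exposed under names ending in 'DType', sorted by name."""
    found = {
        obj
        for name, obj in vars(dtypes_module).items()
        if inspect.isclass(obj) and name.endswith("DType")
    }
    return sorted(found, key=lambda cls: cls.__name__)


try:
    import numpy.dtypes as np_dtypes  # only present on numpy >= 1.25
except ImportError:
    np_dtypes = None

if np_dtypes is not None:
    dtype_classes = collect_dtype_classes(np_dtypes)
    print(len(dtype_classes), "DType classes found")
```

The resulting list could then be passed to the same safe-globals registration used for the PyG classes.
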
torch.tensor(existing_tensor) is deprecated; use .detach().clone().

nx.from_numpy_matrix is removed in NetworkX 3.0; use from_numpy_array.

Replace hardcoded TORCH_VER="2.3.0" with auto-detection so the setup
script works with any torch version resolved by uv. Loosen torch pin
from ==2.3.0 to >=2.3.0 to allow newer versions.
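
The auto-detection idea can be sketched in Python as follows. The setup script itself is a shell script and its actual detection logic is not shown here, so everything below is an assumption: the helper names are hypothetical, and the URL follows the `torch-<version>+<platform>.html` pattern from the PyG installation instructions. A wheel index is not guaranteed to exist for every torch version.

```python
from importlib import metadata


def base_version(version: str) -> str:
    """Drop a local build suffix such as '+cu128' from a version string."""
    return version.split("+", 1)[0]


def pyg_find_links(torch_version: str, platform: str) -> str:
    """Build the PyG wheel index URL for a torch version and platform tag."""
    return f"https://data.pyg.org/whl/torch-{base_version(torch_version)}+{platform}.html"


def installed_torch_version():
    """Return the installed torch version, or None if torch is not installed."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


detected = installed_torch_version()
if detected is not None:
    print(pyg_find_links(detected, "cu128"))
```

Reading the version from package metadata avoids importing torch just to find out which wheels to request.
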
Add no-build-package to pyproject.toml to prevent uv from building
these packages from PyPI sdists. This forces resolution from the PyG
find-links wheels, which are pre-built for the correct PyTorch + CUDA
version. Applies to all uv commands, not just the setup script.

Remove extra-build-dependencies section (no longer needed since we
never build from source).

Add pytorch-cu128 index to pyproject.toml and cu128 option to the
setup script. This is required for newer GPUs (e.g. Blackwell
architecture) that need CUDA 12.8+.
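
The platform-to-index mapping behind an option like cu128 might look like the sketch below. The set of platforms and the helper `index_url_for` are assumptions for illustration; only cu128 is confirmed by this PR, and the real logic lives in the shell setup script and pyproject.toml.

```python
# Assumed platform tags; only "cu128" is confirmed by this PR.
PYTORCH_INDEXES = {
    "cpu": "https://download.pytorch.org/whl/cpu",
    "cu118": "https://download.pytorch.org/whl/cu118",
    "cu121": "https://download.pytorch.org/whl/cu121",
    "cu128": "https://download.pytorch.org/whl/cu128",
}


def index_url_for(platform: str) -> str:
    """Return the PyTorch wheel index for a platform tag, or raise ValueError."""
    try:
        return PYTORCH_INDEXES[platform]
    except KeyError:
        valid = ", ".join(sorted(PYTORCH_INDEXES))
        raise ValueError(
            f"Invalid platform {platform!r}; expected one of: {valid}"
        ) from None


print(index_url_for("cu128"))
```

Raising an error for unknown tags mirrors the invalid-platform branch that the setup script reports to the user.
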
This only affects users doing a manual install; the setup script
installs them via --all-extras. Making them optional avoids install
failures for users who don't need NSD, ED-GNN, or point cloud lifting
backbones, as these packages require pre-built wheels matching the
exact PyTorch + CUDA version.

Move top-level imports of torch_sparse, torch_scatter, and
torch_cluster to lazy imports inside the functions that use them,
so that importing topobench doesn't crash without the [sparse] extra.
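
The lazy-import pattern described above can be sketched like this. Both `require_optional` and `scatter_sum` are hypothetical names used for illustration, not functions from topobench; the `torch_scatter.scatter` call is only reached when the helper is actually invoked.

```python
import importlib


def require_optional(module_name: str, extra: str):
    """Import an optional dependency at call time, with an actionable error."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"{module_name} is required for this feature; "
            f"install the [{extra}] extra to enable it."
        ) from err


def scatter_sum(src, index, dim_size):
    """Hypothetical backbone helper that needs torch_scatter only when called."""
    torch_scatter = require_optional("torch_scatter", "sparse")
    return torch_scatter.scatter(src, index, dim=0, dim_size=dim_size, reduce="sum")
```

Because the import happens inside the function body, importing the module that defines `scatter_sum` succeeds even when torch_scatter is missing; the error appears only if the function is called.
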
Add [sparse] to the [all] extra group.

Verify that importing topobench and triggering backbone auto-discovery
works without the [sparse] extra installed.