Fix wheel builds and publish GPU wheels to PyPI by jameslehoux · Pull Request #259 · BASE-Laboratory/OpenImpala

jameslehoux · 2026-04-21T11:34:10Z

Now that openimpala-cuda is published to PyPI (previous commit switched the
GPU wheel workflow), the install collapses from

pip install openimpala-cuda --find-links \
  https://github.com/BASE-Laboratory/OpenImpala/releases/expanded_assets/v4.0.6 \
  nvidia-cuda-runtime-cu12 nvidia-cublas-cu12 nvidia-cusparse-cu12 \
  nvidia-curand-cu12

down to plain

pip install openimpala-cuda

The nvidia-*-cu12 packages were only needed because the --find-links index
didn't carry them; PyPI's resolver will pull whatever the wheel actually
declares. Updates every call site that showed the old incantation:

- README.md, docs/getting-started.md, docs/user-guide/gpu.md: advanced install sections
- paper.md: corrects "via GitHub Releases" wording for the JOSS draft
- notebooks/visualization_yt.ipynb: §0 install cell
- tutorials/02_digital_twin.ipynb: install cell
- tutorials/04_multiphase_and_fields.ipynb: install cell
- tutorials/07_hpc_scaling.ipynb: §6 install cell

Also fixes a malformed .sif wget URL in docs/getting-started.md (a stray
concatenation of expanded_assets/v4.0.6 with the filename) by switching to
a vX.Y.Z placeholder to match the pattern already used in tutorial 7.
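
One way to confirm that a plain `pip install openimpala-cuda` really brought in the CUDA runtime packages is to read the metadata of the installed distribution. The sketch below uses only the standard library; the helper names are illustrative and not part of OpenImpala, and the `nvidia-*-cu12` naming pattern is taken from the old install command above.

```python
import re
from importlib.metadata import PackageNotFoundError, requires


def cuda_runtime_requirements(requirement_strings):
    """Return the nvidia-*-cu12 package names found in Requires-Dist strings."""
    names = []
    for req in requirement_strings:
        # The distribution name is the leading run of name characters, before
        # any version specifier, extras, or environment marker.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", req.strip())
        if match and re.fullmatch(r"nvidia-.+-cu12", match.group(0)):
            names.append(match.group(0))
    return names


def declared_cuda_requirements(dist_name="openimpala-cuda"):
    """Look up what an installed distribution declares; None if not installed."""
    try:
        declared = requires(dist_name) or []
    except PackageNotFoundError:
        return None
    return cuda_runtime_requirements(declared)


if __name__ == "__main__":
    # Prints None when openimpala-cuda is not installed in this environment.
    print(declared_cuda_requirements())
```

An empty list for an installed wheel would suggest the wheel declares no CUDA runtime dependencies, in which case the libraries have to come from a system CUDA toolkit.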

CPU wheel error ("gmake: *** No rule to make target '_core'") traced to the SKBUILD_CMAKE_ARGS env var interfering with scikit-build-core's cmake.args merge. The OPENIMPALA_ENABLE_TINY_PROFILE option was redundant anyway: when AMReX is built with AMReX_TINY_PROFILE=ON, it sets AMREX_TINY_PROFILE in its installed AMReX_Config.H header, which every file including AMReX.H picks up automatically. Removed the option and the env var; kept the AMReX-side build flag.

GPU wheel error ("CUDA::nvToolsExt target not found") is AMReX 25.03 vs. CUDA 12: libnvToolsExt was removed in CUDA 12 in favour of NVTX3 (header-only). Patch AMReX 25.03's CMake to use CUDA::nvtx3 instead, applied via sed before configure. CMake 3.25+ (we have 3.28) exposes CUDA::nvtx3 from CUDAToolkit, so this is drop-in.

Cache keys bumped (CPU v5, GPU nvtx3-v4) to force a fresh dep rebuild.

https://claude.ai/code/session_011dJ5Bwq4Tnr8wxH597XJFf
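
The NVTX patch amounts to a plain text substitution on AMReX's CMake sources. The commit applies it with sed; the Python sketch below shows the same substitution for illustration only. The sample CMake line is hypothetical and is not copied from AMReX 25.03, and which AMReX file is patched is not stated here.

```python
OLD_TARGET = "CUDA::nvToolsExt"  # imported target that fails to resolve with CUDA 12
NEW_TARGET = "CUDA::nvtx3"       # header-only NVTX3 target from FindCUDAToolkit


def use_nvtx3(cmake_text: str) -> str:
    """Replace every reference to the old NVTX target with the NVTX3 one."""
    return cmake_text.replace(OLD_TARGET, NEW_TARGET)


if __name__ == "__main__":
    # Hypothetical line standing in for whatever AMReX's CMake actually contains.
    sample = "target_link_libraries(amrex PUBLIC CUDA::nvToolsExt)\n"
    print(use_nvtx3(sample), end="")
```

Running the substitution a second time leaves the text unchanged, so the patch step is safe to repeat on a cached source tree.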

Now that openimpala-cuda has been granted the 320 MiB per-file PyPI limit, the GPU wheels fit and can be installed via `pip install openimpala-cuda` like any other package.

Mirror the CPU workflow's publish job: use the PyPI trusted-publisher flow (environment: pypi, id-token: write) via pypa/gh-action-pypi-publish. Gate on github.event_name == 'release' so workflow_dispatch runs still produce artifacts for manual inspection without touching the index.

https://claude.ai/code/session_011dJ5Bwq4Tnr8wxH597XJFf
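
The two conditions in this commit (publish only on a release event, and only wheels that fit the per-file limit) can be sketched as a small pre-publish check. This is an illustrative script, not part of the workflow: the `dist` directory and both function names are assumptions, while `GITHUB_EVENT_NAME` is the environment variable GitHub Actions sets for the triggering event.

```python
import os
from pathlib import Path

PYPI_LIMIT_MIB = 320  # per-file limit quoted above for openimpala-cuda


def should_publish(event_name: str) -> bool:
    """Mirror the workflow gate: only release events upload to the index."""
    return event_name == "release"


def oversized_wheels(dist_dir, limit_mib=PYPI_LIMIT_MIB):
    """Return the names of wheels in dist_dir that exceed the size limit."""
    root = Path(dist_dir)
    if not root.is_dir():
        return []
    limit_bytes = limit_mib * 1024 * 1024
    return [p.name for p in sorted(root.glob("*.whl")) if p.stat().st_size > limit_bytes]


if __name__ == "__main__":
    event = os.environ.get("GITHUB_EVENT_NAME", "")
    too_big = oversized_wheels("dist")  # "dist" is an assumed output directory
    print(f"publish={should_publish(event) and not too_big} oversized={too_big}")
```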

auditwheel repair --exclude drops libcudart / libcublas / libcusparse / libcurand / libnvJitLink from the openimpala-cuda wheel payload, which means the wheel only works on machines that already have the CUDA 12 toolkit installed: driver-only Colab/Kaggle runtimes have the libraries, but a bare Python venv on an NVIDIA workstation does not.

Declare the nvidia-*-cu12 PyPI packages as runtime deps so pip pulls them automatically. Keep them commented out in pyproject.toml with clear markers so the CPU wheel (which uses the same file) doesn't grow a 1-2 GB dep tree. The GPU workflow's existing sed step already rewrites `name = "openimpala"` to `"openimpala-cuda"`; a second sed uncomments the `#"nvidia-..."` lines in the same pass.

Verified with python3 -m tomllib that both variants produce valid TOML and the expected dependency lists:

CPU: ['numpy', 'scipy>=1.7']
GPU: ['numpy', 'scipy>=1.7', 'nvidia-cuda-runtime-cu12', 'nvidia-cublas-cu12', 'nvidia-cusparse-cu12', 'nvidia-curand-cu12', 'nvidia-nvjitlink-cu12']

https://claude.ai/code/session_011dJ5Bwq4Tnr8wxH597XJFf

github-actions · 2026-04-21T11:36:59Z

Performance Benchmark Results

Size	Solver	Wall Time (s)	Tortuosity	Expected	Rel. Error	Iters	Status
64³	pcg	0.7091	0.984375	0.984375	0.00e+00	1	PASS
64³	flexgmres	0.4182	0.984375	0.984375	0.00e+00	N/A	PASS
64³	bicgstab	0.4105	0.984375	0.984375	0.00e+00	N/A	PASS
64³	gmres	0.4124	0.984375	0.984375	0.00e+00	N/A	PASS
128³	pcg	8.4434	0.992188	0.992188	0.00e+00	1	PASS
128³	flexgmres	5.6633	0.992188	0.992188	0.00e+00	N/A	PASS
128³	bicgstab	5.4838	0.992188	0.992188	0.00e+00	N/A	PASS
128³	gmres	5.5678	0.992188	0.992188	0.00e+00	N/A	PASS

Fastest solver: bicgstab at 64³ (0.4105s)

Benchmark: uniform block (analytical τ = (N-1)/N)
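
The Expected column follows directly from the analytical formula for the uniform block. The short check below reproduces it for the two grid sizes in the table; the relative-error definition is an assumption, since the benchmark script's exact formula is not shown here.

```python
def analytical_tortuosity(n: int) -> float:
    """Analytical tortuosity of the uniform block benchmark: (N - 1) / N."""
    return (n - 1) / n


def relative_error(measured: float, expected: float) -> float:
    """Assumed definition: absolute difference scaled by the expected value."""
    return abs(measured - expected) / expected


if __name__ == "__main__":
    for n in (64, 128):
        tau = analytical_tortuosity(n)
        print(f"{n}^3: tau = {tau:.6f}, rel. error vs itself = {relative_error(tau, tau):.2e}")
```

For N = 64 this gives 63/64 = 0.984375, and for N = 128 it gives 127/128 = 0.9921875, which the table shows rounded to six decimals as 0.992188.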

github-actions · 2026-04-21T11:45:45Z

Code Coverage Report

------------------------------------------------------------------------------
                           GCC Code Coverage Report
Directory: .
------------------------------------------------------------------------------
File                                       Lines     Exec  Cover   Missing
------------------------------------------------------------------------------
src/io/CathodeWrite.cpp                       95       83    87%   40-41,97-100,115-116,182-185
src/io/CathodeWrite.H                          1        1   100%
src/io/DatReader.cpp                         135      105    77%   26-27,30,35,92-93,99-100,107-109,135-137,141,144-148,152-155,162,164,208-209,242,245
src/io/DatReader.H                             1        1   100%
src/io/HDF5Reader.cpp                        344       84    24%   40-41,43-44,46-49,52,54-56,58-59,62,64-66,68-74,92-93,126-128,144-145,154-157,174-180,182-187,204,213-215,217,219-228,230-233,236-238,240-251,253-258,266,266,266,266,266,266,266,270,270,270,270,270,270,270,274,276,278,280,282,288,290,297,297,297,297,297,297,297,301,301,301,301,301,301,301,305,305,305,305,305,305,305-306,306,306,306,306,306,306,309,309,309,309,309,309,309-310,310,310,310,310,310,310-311,311,311,311,311,311,311,313,313,313,313,313,313,313-314,314,314,314,314,314,314-315,315,315,315,315,315,315,319,319,319,319,319,319,319,324,324,324,324,324,324,324-325,325,325,325,325,325,325-326,326,326,326,326,326,326-327,327,327,327,327,327,327,332,332,332,332,332,332,332,337,337,337,337,337,337,337-338,338,338,338,338,338,338,343,343,343,343,343,343,343,350,350,350,350,350,350,350,357-358,432-435,437-440
src/io/HDF5Reader.H                            3        3   100%
src/io/ImageLoader.cpp                        61       42    68%   25,38,48,60-62,64-70,72,77,89-90,92,94
src/io/RawReader.cpp                         266      135    50%   49-50,89-90,111-112,115-117,120-121,140-142,155-157,166-168,174-177,185-186,192-196,200-204,209-212,219-224,231-237,271,273-274,276,283-284,301,312,314,318,325,327,331-334,338,346-347,353-355,361-363,365-366,369,372,374,377-380,382-384,386,388-389,391,393-394,396,398-399,401,403-404,406,410-411,413,417-418,420,425,465,471-472,521-524,538,540-542,544,546-548,558,562-564,566,588
src/io/RawReader.H                             1        1   100%
src/io/TiffReader.cpp                        384      130    33%   59-65,67-69,71-73,75-77,79-80,82-84,86-88,90-92,94-96,98-99,101-103,106-108,111-112,114-117,119,122,124-127,143-144,148-150,152-158,160,186,210,217,226,228-231,240,242-245,248,255,288-293,306,309-317,319-320,323-327,331-335,338-342,344-348,351-357,359-363,367,369,375-377,379-393,396,398-402,404-409,413-418,420-425,428-429,432-434,555-575,577-578,581-588,590,593-609,612-614,670,673-674,677-683,685,689-700,702-703
src/io/TiffReader.H                            5        5   100%
src/props/BoundaryCondition.H                131       74    56%   63,68,70,216,224-229,233-236,238-244,247-249,252-253,255,258-261,264-265,271-272,274-279,285-287,290-296,299,303,365-366,371,373
src/props/ConnectedComponents.cpp             69       67    97%   94-95
src/props/ConnectedComponents.H                4        4   100%
src/props/DeffTensor.cpp                      62       59    95%   122,128-129
src/props/Diffusion.cpp                      510      378    74%   93-94,97-98,103-104,106-116,118,123-132,134-141,144-150,153-157,159-163,165,168-173,175-177,179,182-184,186-187,190-191,193,195-198,200,202-203,288-289,297-298,300,349,359-360,368-371,373-375,404-413,415,453,461,465-467,526-527,533,535,539,547,581,610,638,646,735-736,739-740,757-760,771-772,774,824
src/props/EffDiffFillMtx.H                   120      106    88%   58,216-217,221-225,229,231-235
src/props/EffectiveDiffusivityHypre.cpp      389      347    89%   189-191,193-197,305,367-370,479,612-615,617-619,621-624,633-636,643,672,684-687,689-691,693,705,716,718
src/props/EffectiveDiffusivityHypre.H          7        7   100%
src/props/FloodFill.cpp                       84       81    96%   94-95,203
src/props/HypreStructSolver.cpp              343      210    61%   87-88,121,133-134,145,299,309,311,314,346,356,358,361,367-370,372-376,378-379,381-385,388-389,391-392,394,397-398,401-402,404-407,409-413,415-416,418-422,425-426,428-429,431,434-435,438-439,441-443,445-451,453-457,460-461,463-464,466,469-470,473,475-477,479-485,487-491,494-495,497-498,500,503-504,507,509-511,513-516,518-522,525-526,528-529,531,534-535,538,541-542,555
src/props/HypreStructSolver.H                  6        6   100%
src/props/MacroGeometry.H                     17       17   100%
src/props/ParticleSizeDistribution.cpp        11       11   100%
src/props/ParticleSizeDistribution.H           6        6   100%
src/props/PercolationCheck.cpp                53       46    86%   32-33,49-51,68,73
src/props/PercolationCheck.H                   4        4   100%
src/props/PhysicsConfig.H                     90       89    98%   150
src/props/ResultsJSON.H                      225      222    98%   242,395,416
src/props/REVStudy.cpp                       151      128    84%   72,83-91,159,170-173,175,183-186,188-190
src/props/SolverConfig.H                      32       20    62%   30,32,37-44,75-76
src/props/SpecificSurfaceArea.cpp             56       55    98%   59
src/props/SpecificSurfaceArea.H                6        6   100%
src/props/ThroughThicknessProfile.cpp         38       38   100%
src/props/ThroughThicknessProfile.H            5        5   100%
src/props/Tortuosity.H                         2        2   100%
src/props/TortuosityDirect.cpp               219      191    87%   81-83,86,100-106,113-114,125,134,140,202-209,226,394,424,433
src/props/TortuosityDirect.H                   5        5   100%
src/props/TortuosityHypre.cpp                784      563    71%   149-150,155-156,240-243,246-248,311,335-337,340-341,343,353-355,358-360,390-393,573,597,601,622,639-640,642-644,646-655,657,660-664,668-670,673-680,682-686,690-692,694-696,698-707,709-713,715-726,728-731,733,743,749-752,754-756,765-768,770-772,788,791-792,815-820,831-834,836,873,878-881,884-886,890-893,895,897-900,902,907-909,911,960,969,974,977-982,998-1001,1015-1019,1024-1029,1039-1043,1048-1053,1058-1062,1065-1068,1075-1078,1089,1098,1100,1104,1106,1128,1159-1160,1246-1248,1374-1377
src/props/TortuosityHypre.H                   15       15   100%
src/props/TortuosityHypreFill.H              127       98    77%   85,203,205-212,237-239,241-245,247-248,250,252,255-256,258-262
src/props/TortuosityKernels.H                 97       53    54%   52,56-60,62-65,69-74,76-80,84-85,90,129,143,157,243,245-248,250-253,257-260,262-265
src/props/TortuosityMLMG.cpp                  99       91    91%   160,181-183,185-186,193,206
src/props/TortuosityMLMG.H                     1        1   100%
src/props/TortuositySolverBase.cpp           301      237    78%   70-72,74-75,94-101,104,106,142-145,200,203,205,255,280,298,327,391,394-396,398,406-409,411-417,422,427-429,435-436,438-440,454,460,464-465,467,478,492,496-498,500,502,506
src/props/TortuositySolverBase.H              13       13   100%
src/props/VolumeFraction.cpp                  25       25   100%
src/props/VolumeFraction.H                     4        4   100%
------------------------------------------------------------------------------
TOTAL                                       5407     3874    71%
------------------------------------------------------------------------------

Generated by CI — coverage data from gcovr

codecov · 2026-04-21T11:46:11Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

James Le Houx added 4 commits April 21, 2026 11:22

jameslehoux merged commit 8d155b7 into master Apr 21, 2026
6 checks passed

github-actions Bot added the devops, documentation, and gpu labels Apr 21, 2026

jameslehoux merged 4 commits into master from claude/upbeat-mccarthy-f1mNN
