Add CUDA 12.0 build matrix #967

Merged: 7 commits into rapidsai:branch-0.33 from cuda-12-support on Jun 27, 2023


pentschev
Member

@pentschev pentschev commented Jun 23, 2023

Closes #927

@pentschev pentschev requested a review from a team as a code owner June 23, 2023 18:37
@pentschev
Member Author

We're seeing the following error for CUDA 12:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.9/site-packages/boa/cli/mambabuild.py", line 141, in mamba_get_install_actions
    solution = solver.solve_for_action(_specs, prefix)
  File "/opt/conda/lib/python3.9/site-packages/boa/core/solver.py", line 244, in solve_for_action
    t = self.solve(specs)
  File "/opt/conda/lib/python3.9/site-packages/boa/core/solver.py", line 234, in solve
    raise RuntimeError("Solver could not find solution." + error_string)
RuntimeError: Solver could not find solution.Mamba failed to solve:
 - cudatoolkit 12.0.*
 - libgcc-ng >=12
 - numpy >=1.21
 - pynvml >=11.4.1
 - python >=3.9,<3.10.0a0
 - python_abi 3.9.* *_cp39
 - libstdcxx-ng >=12
 - ucx >=1.14.1,<1.15.0a0

with channels:

The reported errors are:
- Encountered problems while solving:
-   - nothing provides requested cudatoolkit 12.0.*

I'm not sure what's missing or wrong here. I've compared against cuDF's dependencies.yaml, and the cudatoolkit section looks correct. @jakirkham @bdice @vyasr, would any of you be able to take a quick look and see if there's something clearly wrong with my changes here?

@bdice
Contributor

bdice commented Jun 23, 2023

If this is an error in a conda build, check on meta.yaml. This line probably needs to be conditional on CUDA 11.

- cudatoolkit {{ cuda_version }}.*

@pentschev
Member Author

> If this is an error in a conda build, check on meta.yaml. This line probably needs to be conditional on CUDA 11.
>
> - cudatoolkit {{ cuda_version }}.*

@bdice I looked at that, but cuDF seems to do it the same way, with no conditional on that line. It does have some conditionals, but those are for the requirements subsections, which don't really apply to UCX-Py. Do we need to introduce some conditionals in UCX-Py, even though it doesn't really depend on them directly?

@jakirkham
Member

jakirkham commented Jun 23, 2023

Could you please try dropping cudatoolkit?

It's already pulled in by UCX when it is needed

Edit: Filed an issue about cuDF ( rapidsai/cudf#13613 )

@bdice
Contributor

bdice commented Jun 23, 2023

@pentschev That’s a bug in cudf’s recipe but we disable recipe tests so it isn’t seen. https://github.com/rapidsai/cudf/blob/0b4e3543fb35a6f8bbae3ad3e07b544c534901ac/ci/build_python.sh#L19

This package isn’t skipping the tests so you do see a failure here.

@bdice
Contributor

bdice commented Jun 23, 2023

> Could you please try dropping cudatoolkit?
>
> It's already pulled in by UCX when it is needed

Be sure that all the CUDA 11 test configurations pull in their target version. That’s why this specific test pinning exists — otherwise all the CI jobs for 11.2, 11.5, 11.8 would install any compatible version like >=11,<12 and not test all the intended toolkit versions.
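
The pinning point above can be illustrated with a small sketch. This is a toy model and not conda's real solver or match-spec logic: the `AVAILABLE` list, the helper names, and the "pick the newest match" rule are all assumptions made for illustration. It shows why a loose range would give every CUDA 11 CI job the same toolkit, while a per-job pin keeps each job on its intended version.

```python
# Toy model of version selection; NOT conda's real solver or MatchSpec logic.
# AVAILABLE is a hypothetical package index invented for this example.
AVAILABLE = ["11.2.2", "11.5.1", "11.8.0"]


def parse(version):
    """Turn '11.8.0' into (11, 8, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def newest_in_range(available, lower, upper):
    """Pick the newest version v with lower <= v < upper (a loose range)."""
    lo, hi = parse(lower), parse(upper)
    matches = [v for v in available if lo <= parse(v) < hi]
    return max(matches, key=parse) if matches else None


def newest_with_prefix(available, prefix):
    """Pick the newest version matching a pin such as '11.2.*'."""
    want = parse(prefix)
    matches = [v for v in available if parse(v)[: len(want)] == want]
    return max(matches, key=parse) if matches else None


ci_jobs = ["11.2", "11.5", "11.8"]

# Loose spec (>=11,<12): every job resolves to the same newest toolkit.
loose = {job: newest_in_range(AVAILABLE, "11", "12") for job in ci_jobs}

# Per-job pin (e.g. 11.2.*): each job gets the version it is meant to test.
pinned = {job: newest_with_prefix(AVAILABLE, job) for job in ci_jobs}

# A pin with no matching package has no solution, similar in spirit to
# "nothing provides requested cudatoolkit 12.0.*".
missing = newest_with_prefix(AVAILABLE, "12.0")
```

In this toy index the loose range collapses all three jobs onto one version, which is the situation the test pinning is meant to avoid.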

@jakirkham
Member

We could use cuda-version instead

@pentschev
Member Author

> Could you please try dropping cudatoolkit?
>
> It's already pulled in by UCX when it is needed

No, cudatoolkit isn't pulled in by UCX. Remember that in conda-forge/ucx-split-feedstock#111 we switched to a hybrid build mode that works for CPU-only and also for CUDA if it's available, so by default no CUDA packages will be pulled in.

> We could use cuda-version instead

I'm also not sure what this comment relates to. Are we supposed to add it somewhere else besides https://github.com/rapidsai/ucx-py/pull/967/files#diff-5475a6d76de4c506ee92cf6f941bba30a6a07b5881cfea531d42a6ec035095a6R77 ?

@bdice
Contributor

bdice commented Jun 26, 2023

Try replacing this line in meta.yaml:

- cudatoolkit {{ cuda_version }}.*

with

 - cuda-version ={{ cuda_version }}
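
As a rough illustration of why this swap helps, the sketch below renders both template lines for a given `cuda_version` and checks them against a made-up index. Everything here is an assumption for illustration: conda-build renders recipes with real Jinja2 rather than `str.replace`, the `INDEX` contents are hypothetical, and `is_satisfiable` is a simplified stand-in for the solver.

```python
# Stand-in for recipe rendering; conda-build uses real Jinja2, this sketch
# only does a plain string substitution.
OLD_LINE = "cudatoolkit {{ cuda_version }}.*"
NEW_LINE = "cuda-version ={{ cuda_version }}"

# Hypothetical index: package name -> versions on offer. The only property
# taken from the thread is that nothing provides cudatoolkit 12.0.*.
INDEX = {
    "cudatoolkit": ["11.2", "11.5", "11.8"],
    "cuda-version": ["11.2", "11.5", "11.8", "12.0"],
}


def render(template, cuda_version):
    """Substitute the cuda_version variable into a template line."""
    return template.replace("{{ cuda_version }}", cuda_version)


def is_satisfiable(spec):
    """Simplified check: does any indexed version match the rendered spec?"""
    name, constraint = spec.split(" ", 1)
    wanted = constraint.lstrip("=")
    if wanted.endswith(".*"):
        wanted = wanted[:-2]
    versions = INDEX.get(name, [])
    return any(v == wanted or v.startswith(wanted + ".") for v in versions)


old_spec = render(OLD_LINE, "12.0")  # "cudatoolkit 12.0.*"
new_spec = render(NEW_LINE, "12.0")  # "cuda-version =12.0"

old_ok = is_satisfiable(old_spec)
new_ok = is_satisfiable(new_spec)
```

Under these assumptions the old line cannot be satisfied for 12.0 while the new one can, and both still resolve for a CUDA 11 value such as `"11.8"`.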

@pentschev
Member Author

Thanks @bdice and @jakirkham, with the latest suggestion the tests are passing now. Do you think we're good here and should move ahead with merging this, or are there further changes required?

dependencies.yaml (outdated, resolved)
@jakirkham jakirkham requested a review from bdice June 26, 2023 20:47
conda/recipes/ucx-py/meta.yaml (outdated, resolved)
dependencies.yaml (outdated, resolved)
Member

@jakirkham jakirkham left a comment


Thanks Peter! 🙏

Had a couple comments below

dependencies.yaml (resolved)
conda/recipes/ucx-py/meta.yaml (outdated, resolved)
CUDA is not a hard-dependency, thus we should remove it.
@jarmak-nv

Hi @pentschev, could you please link this PR to #927, either by adding closing keywords to the description or through the UI?

Thanks!

@jakirkham jakirkham requested a review from bdice June 27, 2023 19:12
Contributor

@bdice bdice left a comment


Seems fine, one minor comment on the open conversation about dependencies.yaml but nothing blocking.

dependencies.yaml (resolved)
@pentschev
Member Author

Woohoo! Thanks all for the reviews here! 😄

@pentschev
Member Author

/merge

@rapids-bot rapids-bot bot merged commit c371b5d into rapidsai:branch-0.33 Jun 27, 2023
31 checks passed
@jakirkham
Member

Thanks Peter and thanks everyone for reviewing! 🙏

@pentschev pentschev deleted the cuda-12-support branch September 26, 2023 17:31
Development

Successfully merging this pull request may close these issues.

UCX-Py: CUDA 12 Conda Packages
5 participants