Add support for CUDA 12.0. #1742
I can have a look @maleadt. CUDA v12.0 should fix some bugs, but it also removed a lot of deprecated routines that we are using.
The current error is `ERROR: LoadError: UndefVarError: libcudart not defined`, and I have a similar error:
julia> using CUDA
julia> CUDA.versioninfo()
CUDA runtime 11.8, artifact installation
CUDA driver 11.8
NVIDIA driver 520.61.5
Libraries:
- CUBLAS: 11.11.3
- CURAND: 10.3.0
- CUFFT: 10.9.0
- CUSOLVER: 11.4.1
- CUSPARSE: 11.7.5
- CUPTI: 18.0.0
- NVML: 11.0.0+520.61.5
Toolchain:
- Julia: 1.8.1
- LLVM: 13.0.1
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5, 7.0, 7.1, 7.2
- Device capability support: sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75, sm_80, sm_86
8 devices:
0: Tesla P100-PCIE-12GB (sm_60, 11.910 GiB / 12.000 GiB available)
1: Tesla P100-PCIE-12GB (sm_60, 11.910 GiB / 12.000 GiB available)
2: Tesla P100-PCIE-12GB (sm_60, 11.910 GiB / 12.000 GiB available)
3: Tesla P100-PCIE-12GB (sm_60, 11.910 GiB / 12.000 GiB available)
4: Tesla P100-PCIE-12GB (sm_60, 11.910 GiB / 12.000 GiB available)
5: Tesla P100-PCIE-12GB (sm_60, 11.910 GiB / 12.000 GiB available)
6: Tesla P100-PCIE-12GB (sm_60, 11.910 GiB / 12.000 GiB available)
7: Tesla P100-PCIE-12GB (sm_60, 11.910 GiB / 12.000 GiB available)
julia> CUDA.set_runtime_version!(v"12.0")
[ Info: Set CUDA Runtime version preference to 12.0, please re-start Julia for this to take effect.
montalex@pandora.gerad.lan:~/git/CUDA.jl (1016)>julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.8.1 (2022-09-06)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |
julia> using CUDA
[ Info: Precompiling CUDA [052768ef-5323-5732-b1bb-66c8b64840ba]
┌ Error: No CUDA Runtime library found. This can have several reasons:
│ * you are using an unsupported platform: CUDA.jl only supports Linux (x86_64, aarch64, ppc64le), and Windows (x86_64).
│ refer to the documentation for instructions on how to use a custom CUDA runtime.
│ * you precompiled CUDA.jl in an environment where the CUDA driver was not available.
│ in that case, you need to specify (during pre compilation) which version of CUDA to use.
│ refer to the documentation for instructions on how to use `CUDA.set_runtime_version!`.
│ * you requested use of a local CUDA toolkit, but not all components were discovered.
│ try running with JULIA_DEBUG=CUDA_Runtime_Discovery for more information.
└ @ CUDA ~/git/CUDA.jl/src/initialization.jl:77
I'll update the headers.
That's because your driver is too old: it only supports up to CUDA 11.8, not CUDA 12.0. EDIT: Oh, I forgot to bump the Project.toml change, maybe that's the reason.
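To illustrate the driver/runtime mismatch described above, here is a minimal sketch. `driver_supports` is a hypothetical helper, not part of CUDA.jl, and it assumes a simplified rule: a runtime is usable only if it is not newer than the CUDA version the driver reports. It deliberately ignores NVIDIA's minor-version and forward-compatibility mechanisms, and the versions are hard-coded from the `versioninfo()` output rather than queried.

```julia
# Hypothetical helper (not a CUDA.jl API): assume a driver can serve a runtime
# only if the runtime's major.minor is not newer than what the driver reports.
# Simplification: compatibility packages and minor-version compatibility are ignored.
driver_supports(driver::VersionNumber, runtime::VersionNumber) =
    (runtime.major, runtime.minor) <= (driver.major, driver.minor)

# "CUDA driver 11.8" as reported by CUDA.versioninfo() earlier in this thread.
driver = v"11.8"

for runtime in (v"11.8", v"12.0")
    status = driver_supports(driver, runtime) ? "supported" : "too new for this driver"
    println("CUDA runtime ", runtime, ": ", status)
end
```

Under this assumption, requesting the 12.0 runtime on an 11.8 driver would be rejected, which matches the "No CUDA Runtime library found" failure shown above.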
a10716f to 0dbc162
@maleadt
Now may be a good time to remove that, so yeah feel free to get rid of those.
FWIW, now that we've bumped the minimum requirement to CUDA 11.0, these are the driver-level functions that are supported (since it isn't always documented exactly which functions are supported; often deprecated or undocumented ones are silently still available):
Posting this list because we can't do CI for this
ae8d1a9 to d5e36ce
@maleadt PS: I also added some other fixes for the v"12.0" release on my branch
Thanks, I've merged them here.
As expected, the upgrade to 12.0 Update 1 didn't help with the CUSPARSE issues.
@amontoison Any update from NVIDIA? Worst case, we make the cusparse/broadcast tests use larger inputs and introduce a test marked as broken on CUDA 12 for the problematic conversions.
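As a rough sketch of what "a test marked as broken on CUDA 12" could look like, the `Test` standard library accepts a `broken` keyword on `@test` (Julia 1.7 and later). Everything below is illustrative: `runtime` is a hard-coded stand-in for the detected CUDA runtime version, and `conversion_ok` is a hypothetical placeholder for the problematic CUSPARSE conversion check, not code from this PR.

```julia
using Test

# Stand-in for the detected runtime version (assumption: hard-coded for illustration).
runtime = v"12.0"

# Hypothetical placeholder for a conversion check that fails on CUDA 12.
conversion_ok(version::VersionNumber) = version < v"12"

@testset "sparse conversions" begin
    # On CUDA 12 the failure is recorded as "Broken" instead of failing the suite;
    # on older runtimes it is evaluated as a regular test.
    @test conversion_ok(runtime) broken=(runtime >= v"12")
end
```

One advantage of the conditional form is that once the underlying issue is fixed, the test reports an unexpected pass, which is a reminder to drop the `broken` marker.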
They are still trying to understand where the issue is.
Yep, that's OK for me :-)
I will open a PR tonight to fix
e700005 to 6669cab
The remaining problem with the cudadrv tests is do
* Update CUSPARSE for CUDA v12.0
* Fix a test with CuMatrix * CuSparseMatrixCOO
* Keep the previous signature of scatter! and gather!
Also add a conversion CuSparseVector -> CuVector
e03f2e3 to c010c22
Ahh, so close. @amontoison Looks like some of the conversions regressed on CUDA 11.0.
Fixed with #1791 🤞
Great! Let's do a final check after merging your other changes, and get this over the finish line. Just in time for CUDA 12.1... 😅
Codecov Report
Patch coverage:
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1742      +/-   ##
==========================================
- Coverage   60.46%   59.85%   -0.61%
==========================================
  Files         148      147       -1
  Lines       11885    12086     +201
==========================================
+ Hits         7186     7234      +48
- Misses       4699     4852     +153
... and 28 files with indirect coverage changes
Finally, CI is all green. Thanks for the help @amontoison!
Co-authored-by: Alexis Montoison <alexis.montoison@polymtl.ca>
Release notes: https://developer.nvidia.com/blog/cuda-toolkit-12-0-released-for-general-availability/
TODO: