Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

patch for cusparseLt 0.6.1 #8074

Open
wants to merge 46 commits into
base: main
Choose a base branch
from

Conversation

gdaisukesuzuki
Copy link

@gdaisukesuzuki gdaisukesuzuki commented Jan 1, 2024

#6757

This PR is compatible only on CUDA 12 and cusparseLt 0.6.0/1 (not 0.5.0, 0.5.2).

@gdaisukesuzuki gdaisukesuzuki changed the title patch for cusparselt5 patch for cusparseLt 5.0.2 Jan 1, 2024
@gdaisukesuzuki gdaisukesuzuki changed the title patch for cusparseLt 5.0.2 patch for cusparseLt 0.5.2 Jan 2, 2024
@kmaehashi kmaehashi self-assigned this Jan 5, 2024
@kmaehashi kmaehashi added cat:enhancement Improvements to existing features prio:medium labels Jan 5, 2024
@ev-br
Copy link
Contributor

ev-br commented Feb 8, 2024

Stumbled on this while trying to figure out why the cupy main does not build on a machine it did in December. Tracked it down to cusparseLT 0.5.2 installed on that machine. Rebasing this PR on today's main (f643379) builds fine (great!) but fails down the line with (the specific test file is fairly random)

$ pytest tests/cupyx_tests/scipy_tests/interpolate_tests/test_rbfinterp.py -vx
=========================================================================================== test session starts ===========================================================================================
platform linux -- Python 3.10.0, pytest-8.0.0, pluggy-1.4.0 -- /home/ev-br/.conda/envs/cupyx2/bin/python3.10
cachedir: .pytest_cache
rootdir: /home/ev-br/repos/cupy
configfile: setup.cfg
collected 80 items                                                                                                                                                                                        

tests/cupyx_tests/scipy_tests/interpolate_tests/test_rbfinterp.py::test_conditionally_positive_definite[cubic] SKIPPED (conditionally posdef: skip for now)                                         [  1%]
tests/cupyx_tests/scipy_tests/interpolate_tests/test_rbfinterp.py::test_conditionally_positive_definite[gaussian] SKIPPED (conditionally posdef: skip for now)                                      [  2%]
tests/cupyx_tests/scipy_tests/interpolate_tests/test_rbfinterp.py::test_conditionally_positive_definite[inverse_multiquadric] SKIPPED (conditionally posdef: skip for now)                          [  3%]
tests/cupyx_tests/scipy_tests/interpolate_tests/test_rbfinterp.py::test_conditionally_positive_definite[inverse_quadratic] SKIPPED (conditionally posdef: skip for now)                             [  5%]
tests/cupyx_tests/scipy_tests/interpolate_tests/test_rbfinterp.py::test_conditionally_positive_definite[linear] SKIPPED (conditionally posdef: skip for now)                                        [  6%]
tests/cupyx_tests/scipy_tests/interpolate_tests/test_rbfinterp.py::test_conditionally_positive_definite[multiquadric] SKIPPED (conditionally posdef: skip for now)                                  [  7%]
tests/cupyx_tests/scipy_tests/interpolate_tests/test_rbfinterp.py::test_conditionally_positive_definite[quintic] SKIPPED (conditionally posdef: skip for now)                                       [  8%]
tests/cupyx_tests/scipy_tests/interpolate_tests/test_rbfinterp.py::test_conditionally_positive_definite[thin_plate_spline] SKIPPED (conditionally posdef: skip for now)                             [ 10%]
tests/cupyx_tests/scipy_tests/interpolate_tests/test_rbfinterp.py::TestRBFInterpolatorNeighborsNone::test_scale_invariance_1d[cubic] FAILED                                                         [ 11%]

================================================================================================ FAILURES =================================================================================================
____________________________________________________________________ TestRBFInterpolatorNeighborsNone.test_scale_invariance_1d[cubic] _____________________________________________________________________

self = <cupyx_tests.scipy_tests.interpolate_tests.test_rbfinterp.TestRBFInterpolatorNeighborsNone object at 0x7fa5140afca0>, xp = <module 'cupy' from '/home/ev-br/repos/cupy/cupy/__init__.py'>
scp = <module 'cupyx.scipy' from '/home/ev-br/repos/cupy/cupyx/scipy/__init__.py'>, kernel = 'cubic'

    @testing.numpy_cupy_allclose(scipy_name='scp')
    @pytest.mark.parametrize('kernel', sorted(_SCALE_INVARIANT))
    def test_scale_invariance_1d(self, xp, scp, kernel):
        # Verify that the functions in _SCALE_INVARIANT are insensitive to the
        # shape parameter (when smoothing == 0) in 1d.
        seq = Halton(1, scramble=False, seed=_np.random.RandomState())
        x = xp.asarray(3*seq.random(50))
        y = _1d_test_function(x, xp)
        xitp = xp.asarray(3*seq.random(50))
>       yitp1 = self.build(scp, x, y, epsilon=1.0, kernel=kernel)(xitp)

tests/cupyx_tests/scipy_tests/interpolate_tests/test_rbfinterp.py:126: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/cupyx_tests/scipy_tests/interpolate_tests/test_rbfinterp.py:435: in build
    return scp.interpolate.RBFInterpolator(*args, **kwargs)
cupyx/scipy/interpolate/_rbfinterp.py:672: in __init__
    shift, scale, coeffs = _build_and_solve_system(
cupyx/scipy/interpolate/_rbfinterp.py:465: in _build_and_solve_system
    lhs, rhs, shift, scale = _build_system(
cupyx/scipy/interpolate/_rbfinterp.py:283: in _build_system
    mins = cp.min(y, axis=0)
cupy/_statistics/order.py:44: in amin
    return a.min(axis=axis, out=out, keepdims=keepdims)
cupy/_core/core.pyx:1019: in cupy._core.core._ndarray_base.min
    cpdef _ndarray_base min(self, axis=None, out=None, keepdims=False):
cupy/_core/core.pyx:1027: in cupy._core.core._ndarray_base.min
    return _statistics._ndarray_min(self, axis, out, None, keepdims)
cupy/_core/_routines_statistics.pyx:64: in cupy._core._routines_statistics._ndarray_min
    return _amin(self, axis=axis, out=out, dtype=dtype, keepdims=keepdims)
cupy/_core/_reduction.pyx:618: in cupy._core._reduction._SimpleReductionKernel.__call__
    return self._call(
cupy/_core/_reduction.pyx:370: in cupy._core._reduction._AbstractReductionKernel._call
    cub_success = _cub_reduction._try_to_call_cub_reduction(
cupy/_core/_cub_reduction.pyx:689: in cupy._core._cub_reduction._try_to_call_cub_reduction
    _launch_cub(
cupy/_core/_cub_reduction.pyx:540: in cupy._core._cub_reduction._launch_cub
    func = _SimpleCubReductionKernel_get_cached_function(
cupy/_util.pyx:64: in cupy._util.memoize.decorator.ret
    result = f(*args, **kwargs)
cupy/_core/_cub_reduction.pyx:240: in cupy._core._cub_reduction._SimpleCubReductionKernel_get_cached_function
    return _create_cub_reduction_function(
cupy/_core/_cub_reduction.pyx:223: in cupy._core._cub_reduction._create_cub_reduction_function
    module = compile_with_cache(
cupy/_core/core.pyx:2254: in cupy._core.core.compile_with_cache
    return cuda.compiler._compile_module_with_cache(
cupy/cuda/compiler.py:484: in _compile_module_with_cache
    return _compile_with_cache_cuda(
cupy/cuda/compiler.py:562: in _compile_with_cache_cuda
    ptx, mapping = compile_using_nvrtc(
cupy/cuda/compiler.py:319: in compile_using_nvrtc
    return _compile(source, options, cu_path,
cupy/cuda/compiler.py:290: in _compile
    options, headers, include_names = _jitify_prep(
cupy/cuda/compiler.py:233: in _jitify_prep
    jitify._init_module()
cupy/cuda/jitify.pyx:212: in cupy.cuda.jitify._init_module
    cpdef void _init_module() except*:
cupy/cuda/jitify.pyx:236: in cupy.cuda.jitify._init_module
    _init_cupy_headers()
cupy/cuda/jitify.pyx:209: in cupy.cuda.jitify._init_cupy_headers
    _init_cupy_headers_from_scratch()
cupy/cuda/jitify.pyx:192: in cupy.cuda.jitify._init_cupy_headers_from_scratch
    jitify(warmup_kernel, options)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   load_program(cuda_source, headers, nullptr, &include_paths,
E   AssertionError: Only cupy raises error
E   
E   Traceback (most recent call last):
E     File "/home/ev-br/repos/cupy/cupy/testing/_loops.py", line 45, in _call_func
E       result = impl(*args, **kw)
E     File "/home/ev-br/repos/cupy/tests/cupyx_tests/scipy_tests/interpolate_tests/test_rbfinterp.py", line 126, in test_scale_invariance_1d
E       yitp1 = self.build(scp, x, y, epsilon=1.0, kernel=kernel)(xitp)
E     File "/home/ev-br/repos/cupy/tests/cupyx_tests/scipy_tests/interpolate_tests/test_rbfinterp.py", line 435, in build
E       return scp.interpolate.RBFInterpolator(*args, **kwargs)
E     File "/home/ev-br/repos/cupy/cupyx/scipy/interpolate/_rbfinterp.py", line 672, in __init__
E       shift, scale, coeffs = _build_and_solve_system(
E     File "/home/ev-br/repos/cupy/cupyx/scipy/interpolate/_rbfinterp.py", line 465, in _build_and_solve_system
E       lhs, rhs, shift, scale = _build_system(
E     File "/home/ev-br/repos/cupy/cupyx/scipy/interpolate/_rbfinterp.py", line 283, in _build_system
E       mins = cp.min(y, axis=0)
E     File "/home/ev-br/repos/cupy/cupy/_statistics/order.py", line 44, in amin
E       return a.min(axis=axis, out=out, keepdims=keepdims)
E     File "cupy/_core/core.pyx", line 1019, in cupy._core.core._ndarray_base.min
E       cpdef _ndarray_base min(self, axis=None, out=None, keepdims=False):
E     File "cupy/_core/core.pyx", line 1027, in cupy._core.core._ndarray_base.min
E       return _statistics._ndarray_min(self, axis, out, None, keepdims)
E     File "cupy/_core/_routines_statistics.pyx", line 64, in cupy._core._routines_statistics._ndarray_min
E       return _amin(self, axis=axis, out=out, dtype=dtype, keepdims=keepdims)
E     File "cupy/_core/_reduction.pyx", line 618, in cupy._core._reduction._SimpleReductionKernel.__call__
E       return self._call(
E     File "cupy/_core/_reduction.pyx", line 370, in cupy._core._reduction._AbstractReductionKernel._call
E       cub_success = _cub_reduction._try_to_call_cub_reduction(
E     File "cupy/_core/_cub_reduction.pyx", line 689, in cupy._core._cub_reduction._try_to_call_cub_reduction
E       _launch_cub(
E     File "cupy/_core/_cub_reduction.pyx", line 540, in cupy._core._cub_reduction._launch_cub
E       func = _SimpleCubReductionKernel_get_cached_function(
E     File "cupy/_util.pyx", line 64, in cupy._util.memoize.decorator.ret
E       result = f(*args, **kwargs)
E     File "cupy/_core/_cub_reduction.pyx", line 240, in cupy._core._cub_reduction._SimpleCubReductionKernel_get_cached_function
E       return _create_cub_reduction_function(
E     File "cupy/_core/_cub_reduction.pyx", line 223, in cupy._core._cub_reduction._create_cub_reduction_function
E       module = compile_with_cache(
E     File "cupy/_core/core.pyx", line 2254, in cupy._core.core.compile_with_cache
E       return cuda.compiler._compile_module_with_cache(
E     File "/home/ev-br/repos/cupy/cupy/cuda/compiler.py", line 484, in _compile_module_with_cache
E       return _compile_with_cache_cuda(
E     File "/home/ev-br/repos/cupy/cupy/cuda/compiler.py", line 562, in _compile_with_cache_cuda
E       ptx, mapping = compile_using_nvrtc(
E     File "/home/ev-br/repos/cupy/cupy/cuda/compiler.py", line 319, in compile_using_nvrtc
E       return _compile(source, options, cu_path,
E     File "/home/ev-br/repos/cupy/cupy/cuda/compiler.py", line 290, in _compile
E       options, headers, include_names = _jitify_prep(
E     File "/home/ev-br/repos/cupy/cupy/cuda/compiler.py", line 233, in _jitify_prep
E       jitify._init_module()
E     File "cupy/cuda/jitify.pyx", line 212, in cupy.cuda.jitify._init_module
E       cpdef void _init_module() except*:
E     File "cupy/cuda/jitify.pyx", line 236, in cupy.cuda.jitify._init_module
E       _init_cupy_headers()
E     File "cupy/cuda/jitify.pyx", line 209, in cupy.cuda.jitify._init_cupy_headers
E       _init_cupy_headers_from_scratch()
E     File "cupy/cuda/jitify.pyx", line 192, in cupy.cuda.jitify._init_cupy_headers_from_scratch
E       jitify(warmup_kernel, options)
E     File "cupy/cuda/jitify.pyx", line 267, in cupy.cuda.jitify.jitify
E       load_program(cuda_source, headers, nullptr, &include_paths,
E   RuntimeError: Runtime compilation failed

cupy/cuda/jitify.pyx:267: AssertionError
------------------------------------------------------------------------------------------ Captured stdout call -------------------------------------------------------------------------------------------
---------------------------------------------------
--- JIT compile log for cupy_jitify_exercise ---
---------------------------------------------------
cub/util_cpp_dialect.cuh(143): warning #161-D: unrecognized #pragma
       CUB_COMPILER_DEPRECATION_SOFT(C++14, C++11);
       ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

cooperative_groups/details/helpers.h(448): error: identifier "cudaCGGetIntrinsicHandle" is undefined
              return (cudaCGGetIntrinsicHandle(cudaCGScopeMultiGrid));
                      ^

cooperative_groups/details/helpers.h(453): error: identifier "cudaCGSynchronize" is undefined
              cudaError_t err = cudaCGSynchronize(handle, 0);
                                ^

cooperative_groups/details/helpers.h(459): error: identifier "cudaCGGetSize" is undefined
              cudaCGGetSize(&numThreads, NULL, handle);
              ^

cooperative_groups/details/helpers.h(466): error: identifier "cudaCGGetRank" is undefined
              cudaCGGetRank(&threadRank, NULL, handle);
              ^

cooperative_groups/details/helpers.h(473): error: identifier "cudaCGGetRank" is undefined
              cudaCGGetRank(NULL, &gridRank, handle);
              ^

cooperative_groups/details/helpers.h(480): error: identifier "cudaCGGetSize" is undefined
              cudaCGGetSize(NULL, &numGrids, handle);
              ^

6 errors detected in the compilation of "cupy_jitify_exercise".

---------------------------------------------------
========================================================================================= short test summary info =========================================================================================
FAILED tests/cupyx_tests/scipy_tests/interpolate_tests/test_rbfinterp.py::TestRBFInterpolatorNeighborsNone::test_scale_invariance_1d[cubic] - AssertionError: Only cupy raises error
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
====================================================================================== 1 failed, 8 skipped in 37.38s ======================================================================================

On a different machine which does not have cursparseLt at all, this same pytest invocation works fine on main. So this patch is great and needs a further small tweak somewhere?

Hat tip @andfoy for effectively guiding me down this exercise.

@leofang
Copy link
Member

leofang commented Feb 8, 2024

@ev-br could you create an issue for the error that you see (and ping me)? It's not about cuSPARSELt but something else.

@ev-br
Copy link
Contributor

ev-br commented Feb 8, 2024

@leofang done: #8171

@ev-br
Copy link
Contributor

ev-br commented Feb 14, 2024

Either an update to this PR or cleaning up the cupy cache locally made the problem reported in #8074 (comment) disappear. So I guess I can confirm that this PR, when rebased on main, works for

$ python -c'import cupy; cupy.show_config()'
OS                           : Linux-5.15.0-76-generic-x86_64-with-glibc2.35
Python Version               : 3.10.0
CuPy Version                 : 13.0.0rc1
CuPy Platform                : NVIDIA CUDA
NumPy Version                : 1.26.0
SciPy Version                : 1.11.2
Cython Build Version         : 0.29.37
Cython Runtime Version       : None
CUDA Root                    : /usr/local/cuda
nvcc PATH                    : /usr/local/cuda/bin/nvcc
CUDA Build Version           : 12010
CUDA Driver Version          : 12010
CUDA Runtime Version         : 12010 (linked to CuPy) / 12000 (locally installed)
cuBLAS Version               : (available)
cuFFT Version                : 11002
cuRAND Version               : 10302
cuSOLVER Version             : (11, 4, 4)
cuSPARSE Version             : (available)
NVRTC Version                : (12, 1)
Thrust Version               : 200200
CUB Build Version            : 200200
Jitify Build Version         : e4bfacd69
cuDNN Build Version          : None
cuDNN Version                : None
NCCL Build Version           : None
NCCL Runtime Version         : None
cuTENSOR Version             : None
cuSPARSELt Build Version     : 502
Device 0 Name                : NVIDIA GeForce RTX 2060
Device 0 Compute Capability  : 75
Device 0 PCI Bus ID          : 0000:01:00.0
Device 1 Name                : NVIDIA GeForce RTX 2060
Device 1 Compute Capability  : 75
Device 1 PCI Bus ID          : 0000:21:00.0

@gdaisukesuzuki
Copy link
Author

@ev-br
CusparseLt is designed to support the Ampere architecture (or later), which means it might not operate on the RTX2060....

@gdaisukesuzuki
Copy link
Author

@Micket Amended. Thank you so much for your comments.

Copy link

@Micket Micket left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some more things i saw

cupy_backends/cuda/libs/cusparselt.pyx Outdated Show resolved Hide resolved
cupy_backends/cuda/libs/cusparselt.pyx Outdated Show resolved Hide resolved
cupy_backends/cuda/libs/cusparselt.pyx Outdated Show resolved Hide resolved
cupy_backends/cuda/libs/cusparselt.pyx Outdated Show resolved Hide resolved
cupy_backends/cuda/libs/cusparselt.pyx Outdated Show resolved Hide resolved
cupy_backends/cuda/libs/cusparselt.pyx Outdated Show resolved Hide resolved
cupy_backends/cuda/libs/cusparselt.pyx Show resolved Hide resolved
cupy_backends/cuda/libs/cusparselt.pyx Show resolved Hide resolved
@kmaehashi
Copy link
Member

Hi @gdaisukesuzuki, Thank you for working on this one, and I'm sorry for keeping you waiting! Could you check @Micket's comments regarding type annotation?

Copy link
Member

@kmaehashi kmaehashi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order looks different between the docs and the actual header file...

cupy_backends/cuda/libs/cusparselt.pxd Outdated Show resolved Hide resolved
@gdaisukesuzuki
Copy link
Author

gdaisukesuzuki commented Mar 27, 2024

@kmaehashi
Thank you for your comment.
I've made corrections based on your comment. Regarding the comments from @Micket , some have been corrected, while others seemed unnecessary to amend (the reasons are noted as comments). If you believe my judgment is incorrect, please comment again to notify me.

@gdaisukesuzuki
Copy link
Author

@kmaehashi
I'm sorry, I left the patch un-applied that was recommended during the static check (#11290). I've applied it and pushed again. Please check it out.

Copy link
Member

@kmaehashi kmaehashi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTM 👍

@kmaehashi
Copy link
Member

/test mini

@gdaisukesuzuki
Copy link
Author

@kmaehashi
This May, cusparselt 0.6.1 has been released. I confirmed that it can work without modifying this PR on 0.6.1. Please note that 0.6.1 only improves the features of 0.6.0 and there seems no change on the API.

@gdaisukesuzuki gdaisukesuzuki changed the title patch for cusparseLt 0.6.0 patch for cusparseLt 0.6.1 May 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cat:enhancement Improvements to existing features prio:high
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants