Skip to content

Conversation

@ChrisRackauckas-Claude
Copy link
Contributor

Summary

This PR fixes issue #827 by making LUFactorization the default algorithm for both CuSparseMatrixCSC and CuSparseMatrixCSR. Previously, these sparse CUDA matrix types defaulted to algorithms that are not GPU-compatible and cause scalar indexing errors.

Problem

Currently, the default algorithm choice for sparse CUDA matrices has several issues:

  1. CuSparseMatrixCSC: Always defaults to GenericLU which is not GPU compatible (causes scalar indexing errors)
  2. CuSparseMatrixCSR without CUDSS: Falls back to KrylovJL_GMRES when CUDSS is not loaded, which is also not GPU compatible (causes scalar indexing errors)
  3. Symmetric sparse CUDA matrices: Default to Cholesky which is not GPU compatible

All of these issues result in poor user experience with confusing scalar indexing errors.

Solution

This PR makes the following changes:

1. LinearSolveCUDAExt.jl

  • Simplified defaultalg() for CuSparseMatrixCSR: Now always returns LUFactorization instead of falling back to Krylov methods
  • Added defaultalg() for CuSparseMatrixCSC: New method that returns LUFactorization
  • Added error_no_cudss_lu() for CuSparseMatrixCSC: Provides clear error message when CUDSS is required but not loaded

2. LinearSolveCUDSSExt.jl

  • Added cudss_loaded() for CuSparseMatrixCSC: Returns true when CUDSS is available for CSC matrices

Benefits

  • Consistent behavior: Both CSR and CSC sparse CUDA matrices now use the same default algorithm
  • Clear error messages: When CUDSS is not loaded, users get a clear error message explaining what package they need to load
  • GPU-compatible: The default algorithm now works properly on GPU without scalar indexing issues
  • Simpler code: Removed the Krylov fallback that didn't work anyway

Testing

The package loads successfully without errors. Full CUDA tests require CUDA hardware and the CUDSS package.

Fixes

Fixes #827

🤖 Generated with Claude Code

ChrisRackauckas and others added 2 commits November 20, 2025 08:49
Previously:
- CuSparseMatrixCSC defaulted to GenericLU (not GPU compatible)
- CuSparseMatrixCSR fell back to Krylov when CUDSS unavailable (not GPU compatible)
- Symmetric sparse matrices defaulted to Cholesky (not GPU compatible)

Changes:
- Both CuSparseMatrixCSC and CuSparseMatrixCSR now default to LUFactorization
- Removed Krylov fallback - if CUDSS is not loaded, clear error message is shown
- Added cudss_loaded() support for CuSparseMatrixCSC
- Added error_no_cudss_lu() for CuSparseMatrixCSC

Fixes SciML#827

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@ChrisRackauckas-Claude
Copy link
Contributor Author

Implementation Details

Code Changes

1. ext/LinearSolveCUDAExt.jl

Simplified defaultalg() for CuSparseMatrixCSR (lines 21-24):

function LinearSolve.defaultalg(A::CUDA.CUSPARSE.CuSparseMatrixCSR{Tv, Ti}, b,
        assump::OperatorAssumptions{Bool}) where {Tv, Ti}
    LinearSolve.DefaultLinearSolver(LinearSolve.DefaultAlgorithmChoice.LUFactorization)
end
  • Removed conditional logic that fell back to Krylov when CUDSS was unavailable
  • Now always returns LUFactorization

Added defaultalg() for CuSparseMatrixCSC (lines 26-29):

function LinearSolve.defaultalg(A::CUDA.CUSPARSE.CuSparseMatrixCSC{Tv, Ti}, b,
        assump::OperatorAssumptions{Bool}) where {Tv, Ti}
    LinearSolve.DefaultLinearSolver(LinearSolve.DefaultAlgorithmChoice.LUFactorization)
end
  • New method for CSC sparse matrices
  • Matches the behavior of CSR matrices

Added error_no_cudss_lu() for CuSparseMatrixCSC (lines 38-43):

function LinearSolve.error_no_cudss_lu(A::CUDA.CUSPARSE.CuSparseMatrixCSC)
    if !LinearSolve.cudss_loaded(A)
        error("CUDSS.jl is required for LU Factorizations on CuSparseMatrixCSC. Please load this library.")
    end
    nothing
end
  • Provides clear error when CUDSS is needed but not available
  • Called during cache initialization to fail fast with a helpful message

2. ext/LinearSolveCUDSSExt.jl

Added cudss_loaded() for CuSparseMatrixCSC (line 7):

LinearSolve.cudss_loaded(A::CUDSS.CUDA.CUSPARSE.CuSparseMatrixCSC) = true
  • Returns true when CUDSS package is loaded
  • Enables detection of CUDSS availability for CSC matrices

Design Rationale

The previous approach of falling back to Krylov methods when CUDSS was unavailable didn't work because Krylov methods also suffer from scalar indexing issues on GPU. The new approach is simpler and more user-friendly:

  1. Always default to LUFactorization for sparse CUDA matrices
  2. Fail fast with clear error message if CUDSS is not loaded
  3. Consistent behavior between CSR and CSC matrix types

This matches the philosophy that it's better to give a clear error message than to silently fall back to an algorithm that will fail anyway.

Files Modified

  • ext/LinearSolveCUDAExt.jl
  • ext/LinearSolveCUDSSExt.jl

@ChrisRackauckas-Claude
Copy link
Contributor Author

Development Process

Steps Taken

  1. Analyzed Issue Default algs for CUSPARSE failing #827: Reviewed the issue and example code to understand the problem with default algorithms for sparse CUDA matrices

  2. Explored Codebase: Used the Explore agent to find all relevant code for default algorithm selection:

    • src/default.jl: Core default algorithm selection logic
    • ext/LinearSolveCUDAExt.jl: CUDA-specific algorithm selection
    • ext/LinearSolveCUDSSExt.jl: CUDSS integration
    • ext/LinearSolveSparseArraysExt.jl: Sparse array handling
  3. Implemented Fix:

    • Added defaultalg() method for CuSparseMatrixCSC
    • Simplified defaultalg() for CuSparseMatrixCSR to always use LUFactorization
    • Added error_no_cudss_lu() for CuSparseMatrixCSC to provide clear error messages
    • Added cudss_loaded() support for CuSparseMatrixCSC in CUDSS extension
  4. Tested: Verified package loads without errors (full CUDA tests require GPU hardware)

  5. Formatted Code: Applied JuliaFormatter with SciMLStyle() to changed files

  6. Created PR: Opened this PR with comprehensive documentation

Next Steps

  • Wait for CI to complete and check results
  • Address any test failures if they occur
  • Respond to reviewer feedback

CI Status

CI tests are currently running. The PR includes changes to CUDA extension files, so CUDA-specific tests will be particularly important to monitor.

Updated defaultalg() to check if CUDSS is loaded and error immediately
if not available, rather than removing the check entirely or falling
back to Krylov (which also doesn't work on GPU).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@ChrisRackauckas-Claude
Copy link
Contributor Author

Update: Added cudss_loaded() Check

Updated the implementation based on feedback. Now defaultalg() checks if CUDSS is loaded:

function LinearSolve.defaultalg(A::CUDA.CUSPARSE.CuSparseMatrixCSR{Tv, Ti}, b,
        assump::OperatorAssumptions{Bool}) where {Tv, Ti}
    if LinearSolve.cudss_loaded(A)
        LinearSolve.DefaultLinearSolver(LinearSolve.DefaultAlgorithmChoice.LUFactorization)
    else
        error("CUDSS.jl is required for LU Factorizations on CuSparseMatrixCSR. Please load this library.")
    end
end

Same logic applies to CuSparseMatrixCSC.

Benefits:

  • ✅ Errors immediately when CUDSS is not available (fail fast)
  • ✅ Clear error message tells user exactly what to do
  • ✅ Avoids fallback to Krylov which also doesn't work on GPU
  • ✅ When CUDSS is loaded, uses LUFactorization as intended

@ChrisRackauckas ChrisRackauckas merged commit 5c87a24 into SciML:main Nov 20, 2025
132 of 142 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Default algs for CUSPARSE failing

2 participants