Fix default algorithm for sparse CUDA matrices to LUFactorization #828

ChrisRackauckas-Claude · 2025-11-20T13:53:11Z

Summary

This PR fixes issue #827 by making LUFactorization the default algorithm for both CuSparseMatrixCSC and CuSparseMatrixCSR. Previously, these sparse CUDA matrix types defaulted to algorithms that are not GPU-compatible and cause scalar indexing errors.

Problem

Currently, the default algorithm choice for sparse CUDA matrices has several issues:

CuSparseMatrixCSC: Always defaults to GenericLU which is not GPU compatible (causes scalar indexing errors)
CuSparseMatrixCSR without CUDSS: Falls back to KrylovJL_GMRES when CUDSS is not loaded, which is also not GPU compatible (causes scalar indexing errors)
Symmetric sparse CUDA matrices: Default to Cholesky which is not GPU compatible

All of these issues result in poor user experience with confusing scalar indexing errors.

Solution

This PR makes the following changes:

1. LinearSolveCUDAExt.jl

Simplified defaultalg() for CuSparseMatrixCSR: Now always returns LUFactorization instead of falling back to Krylov methods
Added defaultalg() for CuSparseMatrixCSC: New method that returns LUFactorization
Added error_no_cudss_lu() for CuSparseMatrixCSC: Provides clear error message when CUDSS is required but not loaded

2. LinearSolveCUDSSExt.jl

Added cudss_loaded() for CuSparseMatrixCSC: Returns true when CUDSS is available for CSC matrices

Benefits

Consistent behavior: Both CSR and CSC sparse CUDA matrices now use the same default algorithm
Clear error messages: When CUDSS is not loaded, users get a clear error message explaining what package they need to load
GPU-compatible: The default algorithm now works properly on GPU without scalar indexing issues
Simpler code: Removed the Krylov fallback that didn't work anyway

Testing

The package loads successfully without errors. Full CUDA tests require CUDA hardware and the CUDSS package.

Fixes

Fixes #827

🤖 Generated with Claude Code

Previously: - CuSparseMatrixCSC defaulted to GenericLU (not GPU compatible) - CuSparseMatrixCSR fell back to Krylov when CUDSS unavailable (not GPU compatible) - Symmetric sparse matrices defaulted to Cholesky (not GPU compatible) Changes: - Both CuSparseMatrixCSC and CuSparseMatrixCSR now default to LUFactorization - Removed Krylov fallback - if CUDSS is not loaded, clear error message is shown - Added cudss_loaded() support for CuSparseMatrixCSC - Added error_no_cudss_lu() for CuSparseMatrixCSC Fixes SciML#827 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

ChrisRackauckas-Claude · 2025-11-20T13:53:38Z

Implementation Details

Code Changes

1. ext/LinearSolveCUDAExt.jl

Simplified defaultalg() for CuSparseMatrixCSR (lines 21-24):

function LinearSolve.defaultalg(A::CUDA.CUSPARSE.CuSparseMatrixCSR{Tv, Ti}, b,
        assump::OperatorAssumptions{Bool}) where {Tv, Ti}
    LinearSolve.DefaultLinearSolver(LinearSolve.DefaultAlgorithmChoice.LUFactorization)
end

Removed conditional logic that fell back to Krylov when CUDSS was unavailable
Now always returns LUFactorization

Added defaultalg() for CuSparseMatrixCSC (lines 26-29):

function LinearSolve.defaultalg(A::CUDA.CUSPARSE.CuSparseMatrixCSC{Tv, Ti}, b,
        assump::OperatorAssumptions{Bool}) where {Tv, Ti}
    LinearSolve.DefaultLinearSolver(LinearSolve.DefaultAlgorithmChoice.LUFactorization)
end

New method for CSC sparse matrices
Matches the behavior of CSR matrices

Added error_no_cudss_lu() for CuSparseMatrixCSC (lines 38-43):

function LinearSolve.error_no_cudss_lu(A::CUDA.CUSPARSE.CuSparseMatrixCSC)
    if !LinearSolve.cudss_loaded(A)
        error("CUDSS.jl is required for LU Factorizations on CuSparseMatrixCSC. Please load this library.")
    end
    nothing
end

Provides clear error when CUDSS is needed but not available
Called during cache initialization to fail fast with a helpful message

2. ext/LinearSolveCUDSSExt.jl

Added cudss_loaded() for CuSparseMatrixCSC (line 7):

LinearSolve.cudss_loaded(A::CUDSS.CUDA.CUSPARSE.CuSparseMatrixCSC) = true

Returns true when CUDSS package is loaded
Enables detection of CUDSS availability for CSC matrices

Design Rationale

The previous approach of falling back to Krylov methods when CUDSS was unavailable didn't work because Krylov methods also suffer from scalar indexing issues on GPU. The new approach is simpler and more user-friendly:

Always default to LUFactorization for sparse CUDA matrices
Fail fast with clear error message if CUDSS is not loaded
Consistent behavior between CSR and CSC matrix types

This matches the philosophy that it's better to give a clear error message than to silently fall back to an algorithm that will fail anyway.

Files Modified

ext/LinearSolveCUDAExt.jl
ext/LinearSolveCUDSSExt.jl

ChrisRackauckas-Claude · 2025-11-20T13:54:33Z

Development Process

Steps Taken

Analyzed Issue Default algs for CUSPARSE failing #827: Reviewed the issue and example code to understand the problem with default algorithms for sparse CUDA matrices
Explored Codebase: Used the Explore agent to find all relevant code for default algorithm selection:
- src/default.jl: Core default algorithm selection logic
- ext/LinearSolveCUDAExt.jl: CUDA-specific algorithm selection
- ext/LinearSolveCUDSSExt.jl: CUDSS integration
- ext/LinearSolveSparseArraysExt.jl: Sparse array handling
Implemented Fix:
- Added defaultalg() method for CuSparseMatrixCSC
- Simplified defaultalg() for CuSparseMatrixCSR to always use LUFactorization
- Added error_no_cudss_lu() for CuSparseMatrixCSC to provide clear error messages
- Added cudss_loaded() support for CuSparseMatrixCSC in CUDSS extension
Tested: Verified package loads without errors (full CUDA tests require GPU hardware)
Formatted Code: Applied JuliaFormatter with SciMLStyle() to changed files
Created PR: Opened this PR with comprehensive documentation

Next Steps

Wait for CI to complete and check results
Address any test failures if they occur
Respond to reviewer feedback

CI Status

CI tests are currently running. The PR includes changes to CUDA extension files, so CUDA-specific tests will be particularly important to monitor.

Updated defaultalg() to check if CUDSS is loaded and error immediately if not available, rather than removing the check entirely or falling back to Krylov (which also doesn't work on GPU). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

ChrisRackauckas-Claude · 2025-11-20T13:59:33Z

Update: Added cudss_loaded() Check

Updated the implementation based on feedback. Now defaultalg() checks if CUDSS is loaded:

function LinearSolve.defaultalg(A::CUDA.CUSPARSE.CuSparseMatrixCSR{Tv, Ti}, b,
        assump::OperatorAssumptions{Bool}) where {Tv, Ti}
    if LinearSolve.cudss_loaded(A)
        LinearSolve.DefaultLinearSolver(LinearSolve.DefaultAlgorithmChoice.LUFactorization)
    else
        error("CUDSS.jl is required for LU Factorizations on CuSparseMatrixCSR. Please load this library.")
    end
end

Same logic applies to CuSparseMatrixCSC.

Benefits:

✅ Errors immediately when CUDSS is not available (fail fast)
✅ Clear error message tells user exactly what to do
✅ Avoids fallback to Krylov which also doesn't work on GPU
✅ When CUDSS is loaded, uses LUFactorization as intended

ChrisRackauckas and others added 2 commits November 20, 2025 08:49

Apply JuliaFormatter with SciMLStyle to changed files

a0668de

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

ChrisRackauckas merged commit 5c87a24 into SciML:main Nov 20, 2025
132 of 142 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Fix default algorithm for sparse CUDA matrices to LUFactorization #828

Fix default algorithm for sparse CUDA matrices to LUFactorization #828

ChrisRackauckas-Claude commented Nov 20, 2025

Uh oh!

ChrisRackauckas-Claude commented Nov 20, 2025

Uh oh!

ChrisRackauckas-Claude commented Nov 20, 2025

Uh oh!

ChrisRackauckas-Claude commented Nov 20, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Fix default algorithm for sparse CUDA matrices to LUFactorization #828

Fix default algorithm for sparse CUDA matrices to LUFactorization #828

Conversation

ChrisRackauckas-Claude commented Nov 20, 2025

Summary

Problem

Solution

1. LinearSolveCUDAExt.jl

2. LinearSolveCUDSSExt.jl

Benefits

Testing

Fixes

Uh oh!

ChrisRackauckas-Claude commented Nov 20, 2025

Implementation Details

Code Changes

1. ext/LinearSolveCUDAExt.jl

2. ext/LinearSolveCUDSSExt.jl

Design Rationale

Files Modified

Uh oh!

ChrisRackauckas-Claude commented Nov 20, 2025

Development Process

Steps Taken

Next Steps

CI Status

Uh oh!

ChrisRackauckas-Claude commented Nov 20, 2025

Update: Added cudss_loaded() Check

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants