
Feature sparse linalg solvers #2841

Open
abagusetty wants to merge 43 commits into IntelPython:master from
abagusetty:feature-sparse-linalg-solvers

Conversation

@abagusetty
Contributor

Adds support for `from dpnp.scipy.sparse.linalg import LinearOperator, cg, gmres, minres`.
Fixes: #2831

  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to an issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • Have you checked performance impact of proposed changes?
  • Have you added documentation for your changes, if necessary?
  • Have you added your changes to the changelog?

abagusetty and others added 30 commits April 2, 2026 12:52
…oneMKL hooks

- _interface.py: add full operator algebra (.H, .T, +, *, **, neg),
  _AdjointLinearOperator, _TransposedLinearOperator, _SumLinearOperator,
  _ProductLinearOperator, _ScaledLinearOperator, _PowerLinearOperator,
  IdentityOperator, MatrixLinearOperator, _AdjointMatrixOperator,
  _CustomLinearOperator factory dispatch; extend aslinearoperator
  to handle dpnp sparse and dense arrays

- _iterative.py: add _make_system (dtype validation, preconditioner
  wiring, working dtype selection); add _make_fast_matvec CSR/oneMKL
  SpMV hook; fix GMRES Arnoldi inner product to single oneMKL BLAS
  gemv (dpnp.dot) instead of slow Python vdot loop; offload
  Hessenberg lstsq to numpy.linalg.lstsq (CPU, matches CuPy);
  fix SciPy host-fallback tol->rtol deprecation via _scipy_tol_kwarg;
  add preconditioner support to CG; keep MINRES as SciPy-backed stub

Refs: CuPy v14.0.1 cupyx/scipy/sparse/linalg/_interface.py,
      cupyx/scipy/sparse/linalg/_iterative.py
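A minimal sketch of the operator algebra described above, using SciPy's own `LinearOperator` as a stand-in (the dpnp interface mirrors this API for dpnp arrays):

```python
import numpy as np
from scipy.sparse.linalg import aslinearoperator

# SciPy's LinearOperator as a stand-in; dpnp.scipy.sparse.linalg mirrors
# this interface for dpnp sparse and dense arrays.
A = aslinearoperator(np.array([[2.0, 0.0], [1.0, 3.0]]))
x = np.ones(2)

print(A.matvec(x))    # A @ x            -> [2. 4.]
print(A.H.matvec(x))  # adjoint: A.conj().T @ x -> [3. 3.]

# Operator algebra composes lazily: no dense matrix is formed for B.
B = 2 * A + A.T
print(B.matvec(x))    # 2*(A @ x) + A.T @ x -> [7. 11.]
```

The composite operators (`_ScaledLinearOperator`, `_SumLinearOperator`, `_TransposedLinearOperator`, ...) only wrap matvec callables, so `B` above never materializes a matrix.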
…gmres, minres

Modeled after CuPy's cupyx_tests/scipy_tests/sparse_tests/test_linalg.py.
Covers:
  - LinearOperator: shape, dtype inference, matvec/rmatvec/matmat,
    subclassing, __matmul__, __call__, edge cases
  - aslinearoperator: dense array, duck-type, identity passthrough,
    rmatvec from dense, invalid inputs
  - cg: SPD convergence, scipy reference match, x0 warm start, b_ndim=2,
    callback, atol, LinearOperator path, invalid inputs,
    non-convergence info check
  - gmres: diag-dominant convergence, scipy reference match, restart
    variants, x0, b_ndim=2, callbacks, complex systems, atol,
    non-convergence info check, Hilbert-matrix stress test
  - minres: SPD, symmetric-indefinite, scipy reference, shift parameter,
    non-square guard, LinearOperator path, callback
  - Integration: parametric (n, dtype) cross-solver tests via LinearOperator
  - Import smoke tests: __all__ completeness
- Use dpnp.tests.helper: assert_dtype_allclose, generate_random_numpy_array,
  get_all_dtypes, get_float_complex_dtypes, has_support_aspect64
- Use dpnp.tests.third_party.cupy testing harness (with_requires, etc.)
- Use numpy.testing assert_allclose / assert_array_equal / assert_raises
- Use dpnp.asnumpy() instead of numpy.asarray()
- Use pytest parametrize ids matching existing test conventions
- Use is_scipy_available() helper from tests/helper.py
- Strict class-per-solver organisation matching TestCholesky / TestDet etc.
…or dtype

Two bugs fixed:
1. _init_dtype() was calling dpnp.zeros(n) which defaults to float64,
   so a float32 matvec would upcast and return float64, making the
   inferred dtype wrong.  Fix: use dpnp.zeros(n, dtype=dpnp.int8) as
   SciPy/CuPy do — any numeric matvec will promote int8 to its own dtype.
2. _CustomLinearOperator.__init__ called _init_dtype() even when an
   explicit dtype was already supplied, overwriting the caller's value.
   Fix: _init_dtype() now short-circuits when self.dtype is already set.
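The int8 probe from fix 1 can be sketched in plain NumPy (the helper name here is illustrative, not the dpnp internal):

```python
import numpy as np

def infer_operator_dtype(matvec, n):
    # Probe with int8 zeros: any numeric matvec promotes int8 to its own
    # dtype, so the result's dtype is the operator's dtype (the SciPy/CuPy
    # convention).  A float64 probe would instead mask a float32 operator.
    return np.asarray(matvec(np.zeros(n, dtype=np.int8))).dtype

A32 = np.eye(3, dtype=np.float32)
print(infer_operator_dtype(lambda v: A32 @ v, 3))  # float32, not float64
```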
…ption handling

Align gemv.cpp with the conventions established in blas/gemm.cpp:

Headers added:
- ext/common.hpp         (dpctl_td_ns, consistent with other extensions)
- utils/memory_overlap.hpp   (MemoryOverlap guard on x vs y)
- utils/output_validation.hpp (CheckWritable + AmpleMemory on y)
- utils/type_utils.hpp       (validate_type_for_device<T> in impl)
- <sstream>                  (needed for stringstream error_msg)

Exception handling added in sparse_gemv_impl():
- try/catch(oneapi::mkl::exception) around all oneMKL sparse calls
- try/catch(sycl::exception) around all oneMKL sparse calls
- release_matrix_handle cleanup in the exception error path
- throw std::runtime_error with descriptive message on catch

Input validation added in sparse_gemv():
- ndim checks: x and y must be 1-D
- queues_are_compatible() across all 5 USM arrays
- MemoryOverlap()(x, y) aliasing guard
- CheckWritable::throw_if_not_writable(y)
- AmpleMemory::throw_if_not_ample(y, num_rows)
- keep_args_alive() at function exit (was missing, returning empty event)
… table

Modeled after blas/gemm.cpp (2-D table: value type x index type) and
blas/gemv.cpp (dispatch vector pattern with ContigFactory + init_dispatch_table).

Changes:
- Add sparse/types_matrix.hpp with SparseGemvTypePairSupportFactory<Tv, Ti>
  encoding the 4 supported combinations: {float32,float64} x {int32,int64}
- Rewrite sparse_gemv_impl() to take typeless char* pointers (matching
  the blas gemv_impl signature style) — type info flows through template
  params only, no runtime branching inside the impl
- Replace the 60-line if/else val_typenum/idx_typenum chain in sparse_gemv()
  with a 2-D dispatch table lookup (gemv_dispatch_table[val_id][idx_id])
- Rename init_sparse_gemv_dispatch_vector -> init_sparse_gemv_dispatch_table
  and implement it via init_dispatch_table<> from ext/common.hpp
- All validation guards and exception handling from prior commit are preserved
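A Python analogue of the 2-D dispatch, with a reference CSR SpMV standing in for the templated impls (all names here are illustrative, not the extension's actual symbols):

```python
import numpy as np

def _make_csr_gemv(val_t, idx_t):
    # One "template instantiation" per (value, index) dtype pair.
    def gemv(data, indices, indptr, x):
        y = np.zeros(len(indptr) - 1, dtype=val_t)
        for row in range(len(y)):
            for k in range(indptr[row], indptr[row + 1]):
                y[row] += data[k] * x[indices[k]]
        return y
    return gemv

# 2-D table keyed on (value dtype, index dtype): the Python counterpart of
# gemv_dispatch_table[val_id][idx_id].  Unsupported pairs are simply absent.
DISPATCH = {
    (v, i): _make_csr_gemv(v, i)
    for v in (np.float32, np.float64)
    for i in (np.int32, np.int64)
}

# 2x2 identity in CSR form
data = np.array([1.0, 1.0], dtype=np.float64)
indices = np.array([0, 1], dtype=np.int32)
indptr = np.array([0, 1, 2], dtype=np.int32)
y = DISPATCH[(np.float64, np.int32)](data, indices, indptr,
                                     np.array([3.0, 4.0]))
print(y)  # [3. 4.]
```

A table lookup like this replaces the runtime if/else chain: a missing key is an unsupported combination, and the selected impl carries its types statically.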
…se_gemv_dispatch_table

Follows the rename made in gemv.cpp when the dispatch mechanism was
changed from a 1-D vector to a 2-D table (value type x index type).
All other declarations (sparse_gemv signature, parameters) are unchanged.
The oneMKL 2025-2 sparse BLAS API deprecated the old 8-argument
set_csr_data(queue, handle, nrows, ncols, index_base, row_ptr, col_ind,
values, deps) overload in favour of a new signature that takes the
sparse matrix handle as `spmat` and adds an explicit `nnz` argument:

  set_csr_data(queue, spmat, nrows, ncols, nnz, index_base,
               row_ptr, col_ind, values, deps)

Fixes:
- Replace old set_csr_data call with the new nnz-aware signature
- Silence the resulting -Wunused-parameter warning on `nnz` (now used)
- No functional change; all other logic is unchanged
…tring

Line 477: `hasattr(A, "rmatmat\")` had a Markdown-escaped backslash
leaked into the Python source, causing an unterminated string literal.
Fixed to `hasattr(A, "rmatmat")`.
dpnp.ndarray blocks implicit NumPy conversion via __array__ to prevent
silent dtype=object arrays. All test assertions must use .asnumpy()
to materialize device arrays onto the host explicitly.

Also replaces numpy.asarray(x_dp) in _rel_residual helper.
…dation order

- _iterative.py: raise NotImplementedError for M != None *before* the
  _HOST_N_THRESHOLD SciPy fast-path in cg() and gmres(), so the contract
  is enforced regardless of system size (fixes test_cg_preconditioner_unsupported_raises,
  test_gmres_preconditioner_unsupported_raises).
- _iterative.py: validate callback_type and raise NotImplementedError for
  'pr_norm' *before* the _HOST_N_THRESHOLD branch in gmres(), so small-n
  systems also see the error (fixes test_gmres_callback_type_pr_norm_raises).
- _iterative.py: pass callback_type='legacy' to scipy.sparse.linalg.gmres
  when delegating on the fast path to suppress SciPy DeprecationWarning.
- test_scipy_sparse_linalg.py: add dtype=numpy.float64 to expected arange()
  calls in test_identity_operator and test_gmres_happy_breakdown so strict
  NumPy 2.0 dtype-equality checks pass (float64 result vs int64 expected).
- Replace .asnumpy() method calls with dpnp.asnumpy() module fn
  (asnumpy is not an ndarray method in dpnp; it is a top-level fn)
- Fix dpnp.any(x) ambiguous truth value in x0 zero-check; replace
  with explicit `x0 is not None` guard for r0 initialisation
- Fix V_mat.T.conj() -> dpnp.conj(V_mat.T) in GMRES Arnoldi step
- Guard minres beta sqrt against tiny negative floats: sqrt(abs(...))
- Unify GMRES Hessenberg h_np assignment to avoid .real stripping
  producing wrong dtype for complex systems
- Fix float() cast on dpnp scalar norm inside GMRES inner h_j1 line
…failures)

The committed code used hypot(gbar, oldb) as delta_k which is the
gamma (norm) from the PREVIOUS rotation step, not the correct diagonal
entry from applying the previous Givens rotation to the current column.

The correct Paige-Saunders (1975) two-rotation recurrence is:

  oldeps = epsln
  delta  = cs * dbar + sn * alpha   # apply previous rotation
  gbar_k = sn * dbar - cs * alpha   # residual -> new rotation input
  epsln  = sn * beta
  dbar   = -cs * beta

  gamma = hypot(gbar_k, beta)       # NEW rotation eliminates beta
  cs    = gbar_k / gamma
  sn    = beta   / gamma

  w_new = (v - oldeps*w - delta*w2) / gamma  # three-term update

This matches scipy.sparse.linalg.minres and Choi (2006) eq. 6.11.
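The new rotation can be checked numerically in isolation (a small NumPy sketch, not the solver code itself):

```python
import numpy as np

# The rotation (cs, sn) built from gbar_k and beta must map the pair
# (gbar_k, beta) to (gamma, 0), i.e. it eliminates beta as claimed above.
gbar_k, beta = 0.8, 0.6
gamma = np.hypot(gbar_k, beta)
cs, sn = gbar_k / gamma, beta / gamma

rotated = np.array([cs * gbar_k + sn * beta,    # -> gamma
                    sn * gbar_k - cs * beta])   # -> 0
print(gamma, rotated)  # 1.0 [1. 0.]
```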

The buggy recurrence produced solutions ~1.08x away from the true
solution (rel_err ~1e0) instead of the expected ~1e-13.

Co-authored-by: fix-minres-recurrence
abagusetty and others added 10 commits April 6, 2026 22:14
…stagnation checks and use 10*eps floor

- Move residual convergence check (rnorm <= atol_eff) before stagnation
  check so convergence always wins when both conditions trigger on the
  same iteration (fixes info=2 on float32 SPD with tol=1e-7).
- Raise stagnation floor from eps to 10*eps, matching SciPy's minres.py,
  so float32 (eps~1.19e-7) does not prematurely stagnate when tol is
  near machine epsilon.
- Also raise the Lanczos beta-collapse floor from eps*beta1 to
  10*eps*beta1 for the same reason.

Fixes: TestMINRES.test_minres_spd_convergence[float32-5/10/20]
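The float32 numbers behind the change (NumPy sketch):

```python
import numpy as np

eps32 = np.finfo(np.float32).eps  # ~1.1920929e-07
# With a stagnation floor of eps alone, tol=1e-7 sits below the floor and
# the solver can report stagnation before the residual test fires; a
# 10*eps floor leaves headroom for tolerances near machine epsilon.
floor_old, floor_new = eps32, 10 * eps32
print(eps32, floor_new)
```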
Bug 1 — GMRES crash: `_dpnp.asnumpy(h_dp)` does not exist as a
module-level function in dpnp.  Changed to the correct array-method
form `h_dp.asnumpy()`.

Bug 2 — GMRES performance: `_dpnp.stack(V_cols, axis=1)` was called
on every inner Arnoldi iteration, reallocating a growing (n x j)
device matrix at each step (O(j^2*n) memory traffic per restart).
Replaced with a pre-allocated V matrix `(n, restart+1)` filled
column-by-column; back-substitution and the solution update now index
directly into V rather than stacking V_cols.

Bug 3 — MINRES silent ignore of atol: `_get_atol("minres", bnrm,
atol=None, rtol=tol)` hard-coded `atol=None`, discarding the caller's
`atol` argument entirely.  Changed to `atol=atol` so the caller's
absolute tolerance is respected.
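The pre-allocation pattern from Bug 2, sketched with NumPy (random vectors stand in for `A @ v`, and dimensions are arbitrary):

```python
import numpy as np

n, restart = 100, 20
rng = np.random.default_rng(0)

# Allocate the full Krylov basis once, then fill columns in place; this
# avoids re-stacking a growing list of vectors on every Arnoldi step.
V = np.empty((n, restart + 1), order="F")
V[:, 0] = rng.standard_normal(n)
V[:, 0] /= np.linalg.norm(V[:, 0])

for j in range(restart):
    w = rng.standard_normal(n)  # stand-in for A @ V[:, j]
    # One gemv-style product against the basis built so far replaces a
    # per-column vdot loop for the Arnoldi inner products.
    h = V[:, : j + 1].conj().T @ w
    w -= V[:, : j + 1] @ h
    V[:, j + 1] = w / np.linalg.norm(w)

# The columns come out orthonormal: V.T @ V is close to the identity.
print(np.max(np.abs(V.T @ V - np.eye(restart + 1))))
```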
… guard, Fortran V, pr_norm callback, full MINRES stopping battery
… precond, MINRES SPD guard, GMRES pr_norm callback

- gemv.cpp: split into _sparse_gemv_init / _sparse_gemv_compute / _sparse_gemv_release
  so optimize_gemv fires exactly once per operator rather than once per iteration.
- types_matrix.hpp: register complex64/complex128 × int32/int64 pairs for oneMKL
  sparse::gemv (std::complex<float/double>).
- gemv.hpp: declare the three new entry points.
- sparse_py.cpp: bind _sparse_gemv_init, _sparse_gemv_compute, _sparse_gemv_release
  and remove the old monolithic _sparse_gemv binding.
- _iterative.py: redesign around _CachedSpMV (init once, compute per matvec,
  release in __del__); rename tol→rtol with backward-compat alias; enable M
  preconditioner for cg/gmres; fix MINRES beta SPD check (check sign before sqrt,
  not after abs); add Paige-Saunders multi-criterion stopping (Anorm/ynorm/Acond);
  implement GMRES callback_type='pr_norm'; fix GMRES maxiter default semantics;
  add order='F' to GMRES Krylov basis V.
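The init/compute/release lifecycle, sketched in Python with a SciPy CSR matrix standing in for the oneMKL matrix handle (the class and its internals are illustrative; the real extension binds the three `_sparse_gemv_*` entry points):

```python
import numpy as np
from scipy.sparse import csr_matrix

class CachedSpMV:
    def __init__(self, A):
        # init: build/optimize the sparse handle exactly once per operator
        # (the extension calls _sparse_gemv_init here)
        self._handle = csr_matrix(A)

    def matvec(self, x):
        # compute: every solver iteration reuses the prepared handle
        # (_sparse_gemv_compute)
        return self._handle @ x

    def __del__(self):
        # release: drop the handle when the operator is garbage-collected
        # (_sparse_gemv_release)
        self._handle = None

op = CachedSpMV(np.array([[2.0, 0.0], [0.0, 3.0]]))
print(op.matvec(np.ones(2)))  # [2. 3.]
```

Tying the release to `__del__` means the handle's lifetime follows the Python operator object, so solvers need no explicit cleanup call.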
@intel-python-devops

Can one of the admins verify this patch?



Development

Successfully merging this pull request may close these issues.

Request support for scipy.sparse.linalg LinearOperator, GMRES, and MINRES
