add more tests into azure pipeline by shssf · Pull Request #16 · IntelPython/dpnp

shssf · 2020-09-15T20:28:12Z

No description provided.

…ost Hessenberg lstsq

Closes audit items IntelPython#9, IntelPython#15, IntelPython#16, IntelPython#17, IntelPython#19 from the prior solver review. All four touch the iterative-solver inner loop and were left out of the previous correctness-only commit because they require a new C++ binding.

backend/extensions/blas/gemv.{cpp,hpp}, blas_py.cpp

- Extend the typed gemv_impl signature with alpha and beta as doubles; the per-T impl casts them to the matrix value type at dispatch time. dpnp.dot and other legacy callers are unaffected -- the existing gemv() public function now forwards (1.0, 0.0) to the shared gemv_dispatch helper.
- Add gemv_alpha_beta() entry point and a _gemv_alpha_beta pybind method computing y = alpha * op(A) * x + beta * y for caller-supplied scalars. Required by the GMRES Arnoldi fast path, which fires gemv with (alpha=1, beta=0) writing into a Hessenberg column slice, then (alpha=-1, beta=1) fusing u -= V @ h into one kernel. For complex matrices the scalars are always one of {-1, 0, 1} and so survive the cast exactly; the impl docstring flags the silent imag-loss caveat for any other complex caller.
- Refactor: hoist the existing validation / strides / dispatch plumbing into a static gemv_dispatch helper so both entry points share identical behaviour without duplication.

scipy/sparse/linalg/_iterative.py

- _make_compute_hu now takes both V and H. The closure writes the projection coefficients h = V[:, :j+1]^H @ u directly into the Hessenberg column slice H[:j+1, j] via a single gemv with the output pointing at that slice -- no intermediate h buffer, no slice-assign copy (audit item IntelPython#16). Pass 2 fuses the AXPY-style update u = -V @ h + 1 * u into one gemv with alpha=-1, beta=1 -- no tmp buffer, one kernel instead of gemv-plus-subtract (audit item IntelPython#19). For complex V the (j+1)-element h slice is conjugated in-place between the two passes (V^T -> V^H), negligible cost next to the n*(j+1) gemv.
- Switch the Hessenberg least-squares H y = e from a device dpnp.linalg.lstsq (which dispatches an SVD kernel for a tiny 21x20 problem per restart) to numpy.linalg.lstsq on the host (audit item IntelPython#17). Matches CuPy's choice and removes a device-side SVD launch that on Intel GPU dominates the restart cost for the default restart=20. RHS e is now maintained as a numpy array; H is copied via dpnp.asnumpy once per restart and the resulting y is shipped back as a dpnp array for the V @ y update.
- V[:, j+1] = v retained as a single contiguous USM slice store (audit item IntelPython#15 closes as no-change-required: the assignment is already one dpnp op on an F-order buffer and there is no fused 'normalise-then-store' path without further binding work).
- cg per-iter syncs collapsed from 3-4 down to 1 (audit item IntelPython#9). The pAp and rz_new breakdown checks are no longer transferred to the host on every iteration; instead the loop relies on IEEE-754 inf / NaN propagation through alpha = rz / pAp. When pAp underflows the resulting alpha is inf or NaN, poisons the next residual via x += alpha * p and r -= alpha * Ap, and the single norm sync at the top of the next iteration detects the breakdown via numpy.isfinite(rnorm_host) and exits with info > 0. Mirrors the cuBLAS-style CG inner loop (nrm2 + scalar test, one host barrier per iter); the initial rz breakdown guard remains so a zero preconditioned residual still short-circuits correctly.

tests/third_party/cupyx/scipy_tests/sparse_tests/test_linalg.py

- test_gmres_complex_arnoldi_fast_path: complex-dtype regression guard for the conjugate-in-place branch of _make_compute_hu -- a silent miss of the conjugate would lose orthogonality and misconverge.
- test_cg_inf_breakdown_returns_positive_info: regression guard for the per-iter-sync collapse. Runs cg on a deliberately singular symmetric positive semidefinite operator and asserts info > 0 (not zero, not -1) so a future re-introduction of explicit breakdown syncs would still pass but a regression to the old info contract would not.
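
For readers following the Arnoldi change, here is a minimal NumPy sketch of the two-pass projection described in the commit message, for the real-valued case only (the complex conjugation step is left out). The names gemv_alpha_beta_ref and compute_hu_ref are hypothetical stand-ins written for this illustration; they are not the dpnp binding or the actual _make_compute_hu closure, and plain NumPy still allocates temporaries that the fused kernel would avoid.

```python
import numpy as np


def gemv_alpha_beta_ref(a, x, y, alpha=1.0, beta=0.0, trans=False):
    """Hypothetical reference: y <- alpha * op(a) @ x + beta * y, stored in place."""
    op_a = a.T if trans else a
    if beta == 0.0:
        # BLAS convention: y is overwritten and never read when beta == 0.
        y[...] = alpha * (op_a @ x)
    else:
        y[...] = alpha * (op_a @ x) + beta * y
    return y


def compute_hu_ref(V, H, u, j):
    """Project u against the first j+1 columns of V (real dtypes only)."""
    Vj = V[:, : j + 1]
    h = H[: j + 1, j]  # a view, so pass 1 writes straight into the Hessenberg column
    # Pass 1: h = V^T u with (alpha=1, beta=0).
    gemv_alpha_beta_ref(Vj, u, h, alpha=1.0, beta=0.0, trans=True)
    # Pass 2: u = -V h + u with (alpha=-1, beta=1), one gemv-shaped update.
    gemv_alpha_beta_ref(Vj, h, u, alpha=-1.0, beta=1.0)
    return u


# Small demonstration with an orthonormal basis in the first j+1 columns.
rng = np.random.default_rng(0)
n, m, j = 8, 4, 2
V = np.zeros((n, m + 1), order="F")
V[:, : j + 1] = np.linalg.qr(rng.standard_normal((n, j + 1)))[0]
H = np.zeros((m + 1, m), order="F")
u = rng.standard_normal(n)
compute_hu_ref(V, H, u, j)
# After the call, u should be orthogonal to V[:, :j+1] up to rounding.
```

The point of the sketch is only the data flow: the output of pass 1 is the Hessenberg slice itself, and pass 2 reuses that slice as the gemv input while accumulating into u.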
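
The cg change can be illustrated the same way. Below is a rough host-only NumPy sketch of an unpreconditioned CG loop that has a single check per iteration and lets inf / NaN propagate instead of testing pAp explicitly. The function name cg_single_check and the choice to return the iteration count as the positive info value are assumptions made for this example, not the code in _iterative.py.

```python
import numpy as np


def cg_single_check(A, b, tol=1e-8, maxiter=100):
    """Hypothetical CG sketch: one norm check per iteration, no pAp test."""
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x
    p = r.copy()
    rz = r @ r  # stays a NumPy scalar, so division by zero yields inf, not an exception
    bnorm = np.linalg.norm(b)
    if rz == 0.0:
        # Initial guard: a zero residual short-circuits before any division.
        return x, 0
    for it in range(1, maxiter + 1):
        rnorm = np.linalg.norm(r)  # stands in for the single host sync
        if not np.isfinite(rnorm):
            return x, it  # breakdown detected one iteration late, info > 0
        if rnorm <= tol * bnorm:
            return x, 0
        Ap = A @ p
        pAp = p @ Ap
        with np.errstate(divide="ignore", invalid="ignore", over="ignore"):
            alpha = rz / pAp  # inf or NaN when pAp is zero
            x += alpha * p
            r -= alpha * Ap  # a non-finite alpha poisons r here
            rz_new = r @ r
            beta = rz_new / rz
            p = r + beta * p
        rz = rz_new
    return x, maxiter


# Well-posed SPD system: converges with info == 0.
x_ok, info_ok = cg_single_check(np.array([[4.0, 1.0], [1.0, 3.0]]),
                                np.array([1.0, 2.0]))
# Singular operator with an inconsistent right-hand side: info > 0.
x_bad, info_bad = cg_single_check(np.diag([1.0, 0.0]), np.array([1.0, 1.0]))
```

The trade-off is that a breakdown is noticed at the top of the following iteration rather than at the point where pAp vanishes, which costs one wasted update but removes the extra per-iteration transfers.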

add more tests into azure pipeline

7ecfa7c

shssf merged commit 9b63bae into master Sep 15, 2020

shssf deleted the azure_ci_tests branch September 15, 2020 20:37

shssf merged 1 commit into master from azure_ci_tests
