argmin, argmax, refactoring of sorting funcs by Alexander-Makaryev · Pull Request #17 · IntelPython/dpnp

Alexander-Makaryev · 2020-09-16T12:10:41Z

No description provided.

shssf · 2020-09-16T12:51:40Z

+    for (size_t i = 0; i < size; ++i)
+    {
+        result[i] = i;
+    }
+


Please use std::iota
http://www.cplusplus.com/reference/numeric/iota/

shssf · 2020-09-16T12:52:36Z

+    for (size_t i = 0; i < size; ++i)
+    {
+        result[i] = array_1[i];
+    }


std::copy
http://www.cplusplus.com/reference/algorithm/copy/?kw=copy

shssf · 2020-09-16T12:53:13Z

+include "backend_searching.pyx"
 include "backend_sorting.pyx"


Suggested change

include "backend_searching.pyx"

include "backend_sorting.pyx"

include "backend_search.pyx"

include "backend_sort.pyx"

These names are used in purpose to be compliant with original project/documentation. Are you really want to change them?
Sorting, searching, and counting

shssf · 2020-09-16T12:54:51Z

+    policy.queue().wait();
+}
+
+template void custom_sort_c<double>(void* array1_in, void* result1, size_t size);


Not sure in it but it would be greate to add these kernels into FPTR

dpnp/dpnp/backend/backend_iface_fptr.cpp

Line 46 in 9b63bae

func_map_t func_map =

shssf · 2020-09-16T12:55:54Z

+
+    cdef size_t size = in_array1.size
+
+    if call_type == numpy.float64:


If the kernel exists in FPTR you can use

dpnp/dpnp/backend_mathematical.pyx

Lines 116 to 128 in 9b63bae

cpdef dparray dpnp_add(dparray array1, dparray array2):

cdef DPNPFuncType param1_type = dpnp_dtype_to_DPNPFuncType(array1.dtype)

cdef DPNPFuncType param2_type = dpnp_dtype_to_DPNPFuncType(array2.dtype)

cdef DPNPFuncData kernel_data = get_dpnp_function_ptr(DPNP_FN_ADD, param1_type, param2_type)

result_type = dpnp_DPNPFuncType_to_dtype( < size_t > kernel_data.return_type)

cdef dparray result = dparray(array1.shape, dtype=result_type)

cdef custom_math_2in_1out_func_ptr_t func = <custom_math_2in_1out_func_ptr_t > kernel_data.ptr

func(array1.get_data(), array2.get_data(), result.get_data(), array1.size)

return result

shssf · 2020-09-16T12:56:36Z

+
+
+def argmax(in_array1, axis=None, out=None):
+    """


shssf

Some proposed change are not required but likely
Also 2 tests filed

FAILED tests/test_linalg.py::test_eig_arange[8-float32] - AssertionError: 
FAILED tests/third_party/cupy/math_tests/test_sumprod.py::TestSumprod::test_sum_all2

Alexander-Makaryev · 2020-09-16T13:41:05Z

@shssf

Some proposed change are not required but likely
Also 2 tests filed

FAILED tests/test_linalg.py::test_eig_arange[8-float32] - AssertionError: 
FAILED tests/third_party/cupy/math_tests/test_sumprod.py::TestSumprod::test_sum_all2

Same tests are failed on master branch, They are not related to these changes

…ost Hessenberg lstsq Closes audit items IntelPython#9, IntelPython#15, IntelPython#16, IntelPython#17, IntelPython#19 from the prior solver review. All four touch the iterative-solver inner loop and were left out of the previous correctness-only commit because they require a new C++ binding. backend/extensions/blas/gemv.{cpp,hpp}, blas_py.cpp - Extend the typed gemv_impl signature with alpha and beta as doubles; the per-T impl casts them to the matrix value type at dispatch time. dpnp.dot and other legacy callers are unaffected -- the existing gemv() public function now forwards (1.0, 0.0) to the shared gemv_dispatch helper. - Add gemv_alpha_beta() entry point and a _gemv_alpha_beta pybind method computing y = alpha * op(A) * x + beta * y for caller- supplied scalars. Required by the GMRES Arnoldi fast path which fires gemv with (alpha=1, beta=0) writing into a Hessenberg column slice, then (alpha=-1, beta=1) fusing u -= V @ h into one kernel. For complex matrices the scalars are always one of {-1, 0, 1} and so survive the cast exactly; the impl docstring flags the silent imag-loss caveat for any other complex caller. - Refactor: hoist the existing validation / strides / dispatch plumbing into a static gemv_dispatch helper so both entry points share identical behaviour without duplication. scipy/sparse/linalg/_iterative.py - _make_compute_hu now takes both V and H. The closure writes the projection coefficients h = V[:, :j+1]^H @ u directly into the Hessenberg column slice H[:j+1, j] via a single gemv with the output pointing at that slice -- no intermediate h buffer, no slice-assign copy (audit item IntelPython#16). Pass 2 fuses the AXPY-style update u = -V @ h + 1 * u into one gemv with alpha=-1, beta=1 -- no tmp buffer, one kernel instead of gemv-plus-subtract (audit item IntelPython#19). For complex V the (j+1)-element h slice is conjugated in-place between the two passes (V^T -> V^H), negligible cost next to the n*(j+1) gemv. - Switch the Hessenberg least-squares H y = e from a device dpnp.linalg.lstsq (which dispatches an SVD kernel for a tiny 21x20 problem per restart) to numpy.linalg.lstsq on the host (audit item IntelPython#17). Matches CuPy's choice and removes a device- side SVD launch that on Intel GPU dominates the restart cost for the default restart=20. RHS e is now maintained as a numpy array; H is copied via dpnp.asnumpy once per restart and the resulting y is shipped back as a dpnp array for the V @ y update. - V[:, j+1] = v retained as a single contiguous USM slice store (audit item IntelPython#15 closes as no-change-required: the assignment is already one dpnp op on an F-order buffer and there is no fused 'normalise-then-store' path without further binding work). - cg per-iter syncs collapsed from 3-4 down to 1 (audit item IntelPython#9). The pAp and rz_new breakdown checks are no longer transferred to the host on every iteration; instead the loop relies on IEEE-754 inf / NaN propagation through alpha = rz / pAp. When pAp underflows the resulting alpha is inf or NaN, poisons the next residual via x += alpha * p and r -= alpha * Ap, and the single norm sync at the top of the next iteration detects the breakdown via numpy.isfinite(rnorm_host) and exits with info > 0. Mirrors the cuBLAS-style CG inner loop (nrm2 + scalar test, one host barrier per iter); the initial rz breakdown guard remains so a zero preconditioned residual still short- circuits correctly. tests/third_party/cupyx/scipy_tests/sparse_tests/test_linalg.py - test_gmres_complex_arnoldi_fast_path: complex-dtype regression guard for the conjugate-in-place branch of _make_compute_hu -- a silent miss of the conjugate would lose orthogonality and misconverge. - test_cg_inf_breakdown_returns_positive_info: regression guard for the per-iter-sync collapse. Runs cg on a deliberately singular SPD operator and asserts info > 0 (not zero, not -1) so a future re-introduction of explicit breakdown syncs would still pass but a regression to the old info contract would not.

argmin, argmax, refactoring of sorting funcs

37a4346

Alexander-Makaryev requested a review from shssf September 16, 2020 12:25

shssf reviewed Sep 16, 2020

View reviewed changes

shssf suggested changes Sep 16, 2020

View reviewed changes

Alexander-Makaryev added 2 commits September 17, 2020 06:23

iota, copy, docs

011e536

FPTR functions map

55f8017

Alexander-Makaryev requested a review from shssf September 17, 2020 17:07

shssf approved these changes Sep 17, 2020

View reviewed changes

shssf merged commit ab10a38 into master Sep 17, 2020

shssf deleted the argmax-argmin branch September 17, 2020 19:27

		include "backend_searching.pyx"
		include "backend_sorting.pyx"


		cdef size_t size = in_array1.size

		if call_type == numpy.float64:

	cpdef dparray dpnp_add(dparray array1, dparray array2):
	cdef DPNPFuncType param1_type = dpnp_dtype_to_DPNPFuncType(array1.dtype)
	cdef DPNPFuncType param2_type = dpnp_dtype_to_DPNPFuncType(array2.dtype)

	cdef DPNPFuncData kernel_data = get_dpnp_function_ptr(DPNP_FN_ADD, param1_type, param2_type)

	result_type = dpnp_DPNPFuncType_to_dtype( < size_t > kernel_data.return_type)
	cdef dparray result = dparray(array1.shape, dtype=result_type)

	cdef custom_math_2in_1out_func_ptr_t func = <custom_math_2in_1out_func_ptr_t > kernel_data.ptr
	func(array1.get_data(), array2.get_data(), result.get_data(), array1.size)

	return result

Conversation

Alexander-Makaryev commented Sep 16, 2020

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

shssf left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Alexander-Makaryev commented Sep 16, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

shssf left a comment •

edited

Loading