Migrate dpctl.tensor into dpnp.tensor #2856

Open
vlad-perevezentsev wants to merge 48 commits into master from include-dpctl-tensor

@vlad-perevezentsev
Contributor

This PR migrates the tensor implementation from dpctl.tensor into dpnp.tensor, making dpnp the primary owner of the Array API-compliant tensor layer.

Major changes:

  • Move compiled C++/SYCL extensions (_tensor_impl, _tensor_elementwise_impl, _tensor_reductions_impl, _tensor_sorting_impl, _tensor_accumulation_impl, tensor linalg) into dpnp.tensor
  • Move usm_ndarray, compute-follows-data utilities and tensor tests from dpctl
  • Replace all dpctl.tensor references with dpnp.tensor in docstrings, error messages and comments
  • Remove redundant dpctl.tensor C-API interface
  • Add a tensor.rst documentation page that describes the module and its relationship to dpnp.ndarray and dpctl, and links to the dpctl 0.21.1 API reference

This simplifies maintenance, reduces cross-project dependencies, and enables independent development and release cycles.

  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to an issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • Have you checked performance impact of proposed changes?
  • Have you added documentation for your changes, if necessary?
  • Have you added your changes to the changelog?

antonwolfy and others added 30 commits January 26, 2026 20:36
The PR adds a header file `dpnp4pybind11.hpp` which contains the minimum
content necessary to write pybind11 extensions and includes a caster for
`usm_ndarray` and type enumerators.

This PR also moves the part of the dpctl.tensor headers that was
previously used in dpnp code. This is needed to get rid of include
conflicts, since the new `dpnp4pybind11.hpp` header now has to be
included everywhere.
Merge master into include-dpctl-tensor
This PR proposes introducing `dpctl_ext` as a new internal extension
module (temporarily renamed from `dpctl` to avoid conflicts), adding
CMake/packaging support for building `_tensor_impl` via pybind11 and
switching dpnp to use `dpctl_ext.tensor._tensor_impl`

The migrated `_tensor_impl` currently supports the following functions:

>  '_array_overlap',
>  '_as_c_contig',
>  '_as_f_contig',
>  '_contract_iter',
>  '_contract_iter2',
>  '_contract_iter3',
>  '_contract_iter4',
>  '_copy_usm_ndarray_into_usm_ndarray',
>  '_ravel_multi_index',
>  '_same_logical_tensors',
>  '_unravel_index',
>  'default_device_bool_type',
>  'default_device_complex_type',
>  'default_device_fp_type',
>  'default_device_index_type',
>  'default_device_int_type',
>  'default_device_uint_type'

Files in `dpnp` that explicitly `import dpctl.tensor._tensor_impl`
This PR extends `_tensor_impl` in `dpctl_ext.tensor` with the remaining
functions that are explicitly used in `dpnp` implementations (`_take`,
`_full_usm_ndarray`, `_zeros_usm_ndarray`, `_triu`) enabling a complete
switch to `dpctl_ext.tensor._tensor_impl` instead of
`dpctl.tensor._tensor_impl`

It also adds `take()`, `put()`, `full()`, `tril()` and `triu()` to
`dpctl_ext.tensor` and updates the corresponding dpnp functions to use
these implementations internally
This PR extends `_tensor_impl` in `dpctl_ext.tensor` with the copy
functions (`_copy_usm_ndarray_for_reshape`,
`_copy_numpy_ndarray_into_usm_ndarray`, `_copy_usm_ndarray_for_roll_1d`,
`_copy_usm_ndarray_for_roll_nd`)

It also adds `asnumpy(), astype(), copy(), from_numpy(), to_numpy(),
roll(), and reshape()` to `dpctl_ext.tensor` and updates the
corresponding dpnp functions to use these implementations internally
This PR extends `_tensor_impl` in `dpctl_ext.tensor` with the advanced
indexing (`_extract, _place, _nonzero, mask_positions`), repeat
(`_cumsum_1d`) and `_eye` functions

It also adds `eye(), extract(), nonzero(), place(), put_along_axis(),
take_along_axis()` to `dpctl_ext.tensor` and updates the corresponding
dpnp functions to use these implementations internally
This PR adds a small cleanup to the already ported dpctl.tensor code:
* remove unused includes
* add missing includes
* remove redundant namespace qualifications when calling functions from
the same namespace
…2778)

This PR extends `_tensor_impl` in `dpctl_ext.tensor` with the `_where,
_clip` and repeat functions
(`_repeat_by_sequence, _repeat_by_scalar`)

It also adds `repeat(), where(), clip()` and `can_cast, finfo, iinfo,
isdtype, result_type` from `_type_utils.py` to `dpctl_ext.tensor` and
updates the corresponding dpnp functions to use these implementations
internally
This PR is the final one in the series extending the `_tensor_impl`
extension

It extends `_tensor_impl` in `dpctl_ext.tensor` with linear sequence
functions
(`_linspace_step` and `_linspace_affine`)

This PR also significantly expands the Python API of `dpctl_ext.tensor`
by adding all missing functions from `dpctl_ext.tensor._ctors` and
`dpctl_ext.tensor._manipulation_functions`

`_tensor_impl`: 45 / 45 functions
Python API `dpctl_ext.tensor`: 70 / 233 functions
This PR completely moves `_tensor_accumulation_impl` pybind11 extension
into `dpctl_ext.tensor` and extends `dpctl_ext.tensor` Python API with
the functions `cumulative_logsumexp, cumulative_prod and cumulative_sum`
reusing them in dpnp
This PR completely moves `_tensor_sorting_impl` pybind11 extension into
`dpctl_ext.tensor` and extends `dpctl_ext.tensor` Python API with the
functions `searchsorted, isin, unique_all, unique_counts, unique_inverse,
unique_values, argsort, sort and top_k`, reusing them in dpnp
This PR completely moves `_tensor_reductions_impl` pybind11 extension
into `dpctl_ext.tensor` and extends `dpctl_ext.tensor` Python API with the
functions `all, any, diff, argmax, argmin, count_nonzero, logsumexp, max, min, prod, reduce_hypot and sum`,
reusing them in dpnp
The PR adds missing includes to tensor source and header files.
…#2795)

This PR initializes `_tensor_elementwise_impl` pybind11 extension in
`dpctl_ext.tensor` and extends `dpctl_ext.tensor` Python API with
part of the unary functions: `abs, acos, acosh, angle, atan, atanh, bitwise_invert, ceil, conj`

This is the first part of the work on migrating
`_tensor_elementwise_impl` (unary)
This PR extends `_tensor_elementwise_impl` with part of the unary
functions: `cos, cosh, exp, expm1, floor, imag, isfinite, isinf, isnan,
log, log1p, log2, log10, logical_not, negative, positive`
This PR extends `_tensor_elementwise_impl` with the remaining unary
functions:  `real, reciprocal, round, rsqrt, sign, signbit, sin, sinh, sqrt, square, tan, tanh, trunc`
This PR migrates the `_tensor_linalg_impl` extension to
`dpctl_ext.tensor` and extends `dpctl_ext.tensor` Python API with
`dpctl.tensor` functions `matmul`, `matrix_transpose`, `tensordot`, and
`vecdot`
…r dpnp (#2803)

This PR extends `_tensor_elementwise_impl` with part of the binary
functions: `add, atan2, bitwise_and, bitwise_left_shift, bitwise_or,
bitwise_right_shift, bitwise_xor`
This PR extends `_tensor_elementwise_impl` with part of the binary
functions: `divide, equal, floor_divide, greater, greater_equal, hypot,
less, less_equal, logaddexp`
This PR extends `_tensor_elementwise_impl` with the remaining binary
functions: `copysign, logical_and, logical_or, logical_xor, maximum, minimum, multiply, nextafter, not_equal, pow, remainder, subtract`

This is the last PR in the series on the `_tensor_elementwise_impl`
migration; it completes the move of all elementwise functions to
`dpctl_ext.tensor`
This PR extends `dpctl_ext.tensor` API with the remaining statistical
and testing functions, adding `std(), var(), mean(), allclose()`
This PR proposes to migrate the tensor interface (`usm_ndarray, dlpack,
flags`) into `dpctl_ext/tensor`, making `dpnp` independent of `dpctl`'s
tensor module.

Updates:
> -  Introduce `dpctl_ext_capi.h`
> - Implement a clean CMake interface library `DpctlExtCAPI` to properly
propagate generated headers to consumers
> - Update remaining imports from `dpctl.tensor` to `dpctl_ext.tensor`
> - Link all backend extensions against `DpctlExtCAPI` to ensure
consistent access to the C-API
This PR removes the unused external C-API from `dpctl_ext.tensor` and
replaces function pointer calls with direct struct member access.

Changes:
1. Remove all `cdef api` functions from `_usmarray.pyx`
2. Delete `dpctl_ext_capi.h` and `DpctlExtCAPI` CMake interface library
3. Update `dpnp4pybind11.hpp` to access `PyUSMArrayObject` members
directly
4. Update build configuration
This PR proposes a refactoring that migrates `dpctl_ext.tensor` module
into `dpnp` package as `dpnp.tensor`

Changes:

1. Moved `dpctl_ext/tensor/` directory to `dpnp/tensor/`
2. Updated all imports from `dpctl_ext.tensor` to `dpnp.tensor` across
the codebase
3. Consolidated build: removed `dpctl_ext/CMakeLists.txt`, added
`build_dpnp_tensor_ext()` to `dpnp/CMakeLists.txt`
4. Added `DPNP_BUILD_COMPONENTS` CMake option
(`ALL/TENSOR_ONLY/SKIP_TENSOR`) for staged builds
5. Split coverage workflow into two steps to avoid memory issues
6. Updated include paths in all backend extension CMake files
7. Removed `dpctl_ext/` directory and cleaned up `.gitignore`
This PR moves all tensor-related tests to `dpnp/tests/tensor` as part of
the ongoing migration of tensor functionality from `dpctl` to
`dpnp.tensor`

Key changes:

> - Relocated 89 tensor tests (elementwise functions, `usm_ndarray`, and
tensor utilities)
> - Updated imports to use `dpnp.tensor`
> - Included tests in packaging configuration
> - Integrated tensor tests into CI
> - Fixed several issues discovered during migration (dtype
expectations, boolean reductions, etc.)
> - Fixed a circular import in _usmarray.py
> - Added `SKIP_TENSOR_TESTS` env variable to manage the launch of the
test scope
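
An environment switch such as `SKIP_TENSOR_TESTS` could be consumed along the following lines. This is a minimal sketch, not the code from this PR: the helper name `tensor_tests_skipped` and the set of values treated as "enabled" are assumptions made for illustration, since the commit message does not say how the variable is interpreted.

```python
import os

# Hypothetical: values treated as "enabled"; the PR does not specify them.
_TRUTHY = {"1", "true", "yes", "on"}


def tensor_tests_skipped(environ=None):
    """Return True when SKIP_TENSOR_TESTS asks to skip the tensor tests."""
    # Accept an explicit mapping so the helper is easy to test in isolation.
    env = os.environ if environ is None else environ
    value = env.get("SKIP_TENSOR_TESTS", "")
    return value.strip().lower() in _TRUTHY
```

A `conftest.py` could call a helper like this from pytest's `pytest_ignore_collect` hook to leave `dpnp/tests/tensor` out of collection.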

In a follow-up PR:

> - Conditional logic will be added to run dpctl_ext/tests only when
changes affect the tensor code.
> - Array API tests for tensor will be introduced and executed as a
separate CI job.
This PR proposes to move the file `_compute_follows_data.pyx` from
`dpctl.utils` to `dpnp.tensor` as part of the migration of
`dpctl.tensor` to `dpnp.tensor`

### Changes
>- **Moved file**: `dpctl/utils/_compute_follows_data.pyx` →
`dpnp/tensor/_compute_follows_data.pyx`
>- **Exports** (now available from `dpnp.tensor`):
>>- `ExecutionPlacementError` - exception for execution placement errors
>>- `get_execution_queue()` - determine execution queue from input
arrays
>>- `get_coerced_usm_type()` - determine output USM type for
compute-follows-data
>>- `validate_usm_type()` - validate USM type specifications
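
The queue-selection rule behind compute-follows-data can be modelled in a few lines of plain Python. This is an illustrative sketch, not the Cython implementation moved by this PR: `common_queue`, `require_common_queue` and `PlacementError` are hypothetical names, ordinary Python values stand in for `dpctl.SyclQueue` objects, and the sketch assumes that no execution queue is returned when the inputs' queues differ.

```python
def common_queue(queues):
    """Return the queue shared by all inputs, or None if there is none.

    Toy model: real code compares SYCL queues; here any objects
    supporting == stand in for them.
    """
    queues = list(queues)
    if not queues:
        return None
    first = queues[0]
    if all(q == first for q in queues[1:]):
        return first
    return None


class PlacementError(Exception):
    """Stand-in for ExecutionPlacementError in this sketch."""


def require_common_queue(queues):
    """Raise if the inputs do not agree on a single queue."""
    q = common_queue(queues)
    if q is None:
        raise PlacementError("inputs are allocated on different queues")
    return q
```

In this model, a function taking several arrays would call `require_common_queue` on the arrays' queues before launching any work, so that mixed-device inputs fail early instead of triggering implicit copies.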
ndgrigorian and others added 18 commits April 10, 2026 09:55
Add `__main__.py` for CLI options to get `libtensor` include dirs from module
A workaround was implemented in the scope of
[dpctl#2275](IntelPython/dpctl#2275).
Thus this PR enables the previously muted tests for `dpnp.cumlogsumexp`.
)

This PR proposes to replace `dpctl.tensor` with `dpnp.tensor` across the
error messages.
Add a Sphinx handler that redirects `dpnp.tensor.*` cross-references to
the dpctl 0.21.1 docs, and a `tensor.rst` page linking to the dpctl API
reference
…2851)

This PR proposes device-aware output dtype resolution for
`dpnp.tensor.round()` with boolean input to handle devices that do not
support `float16`.

Boolean support for `round()` was originally added in #2817
[6f5a792](6f5a792) to match NumPy behavior, where `numpy.round(bool)`
returns `float16` rather than an integral type like `int8`.
However, on devices without fp16 support, returning `float16` is not
viable.

The bool type mapping was removed from the round kernel, and an
acceptance function `_acceptance_fn_round` was added to ensure that the
fallback in `_find_buf_dtype` prefers floating-point output over
integral types for boolean input.

Result:
fp16 devices: round(bool) -> float16
non-fp16 devices: round(bool) -> float32
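
The resulting rule can be summarised as a small decision function. This sketch is an illustration under stated assumptions rather than the PR's `_acceptance_fn_round` or `_find_buf_dtype`: `round_bool_result_dtype` is a hypothetical helper, dtypes are represented by their names, and the `has_fp16` flag stands in for a device capability query.

```python
def round_bool_result_dtype(has_fp16):
    """Pick the output dtype name of round() for boolean input.

    Mirrors the outcome described above: float16 where the device
    supports it, otherwise the next floating-point type, float32.
    """
    # Candidate floating-point types, smallest first; integral types
    # are deliberately absent so bool never maps to e.g. int8.
    candidates = ["float16", "float32"]
    for name in candidates:
        if name == "float16" and not has_fp16:
            continue  # device cannot represent this type
        return name
    raise RuntimeError("no supported floating-point dtype")
```

Keeping integral types out of the candidate list is what the acceptance function achieves in spirit: the fallback may widen the floating-point type, but it never switches type kind.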
This PR proposes to fix test warnings in `dpnp.tensor` tests by
replacing deprecated strides assignment with
`np.lib.stride_tricks.as_strided` in `test_usm_ndarray_dlpack.py`
and suppressing overflow warnings from np.allclose in
`test_exp.py:test_exp_complex_contig`
@github-actions
Contributor

View rendered docs @ https://intelpython.github.io/dpnp/pull/2856/index.html

@github-actions
Contributor

Array API standard conformance tests for dpnp=0.20.0dev6=py313h509198e_56 ran successfully.
Passed: 1357
Failed: 3
Skipped: 16

Comment thread dpnp/tensor/_array_api.py
Returns a dictionary of default data types for ``device``.

Args:
device (Optional[:class:`dpctl.SyclDevice`, :class:`dpctl.SyclQueue`, :class:`dpctl.tensor.Device`, str]):
Contributor

we don't have dpctl.tensor anymore

Contributor

Or do you plan to update the tensor docstrings separately in a follow-up PR?

#endif

// Include dpctl C-API headers (both declarations and import functions)
#include "dpctl/_sycl_context.h"
Contributor

At what step are we going to use dpctl4pybind11.hpp?
