
[enhancement] add dlpack queue extraction and data conversions #2569


Open
icfaust wants to merge 60 commits into main

Conversation

@icfaust (Contributor) commented Jun 23, 2025

Description

Changes in this PR:

  • Expose the C++ operator== for SyclQueue and SyclDevice comparison. This should only be used when dpctl is not installed and queues need to be checked for residing on the same device in _sycl_queue_manager's from_data.
  • onedal.datatypes._dlpack is added; it contains methods for converting data to numpy and for extracting SyclQueues from pytorch data.
  • _sycl_queue_manager is modified to work with __dlpack__ data: new logic handles SYCL data, and non-SYCL, non-CPU data (e.g. CUDA GPU data) now causes the estimator to error unless allow_fallback_to_host is set in the config (see the sketch after this list).
  • Changes were required in sklearnex._device_offload._get_backend so it works in this circumstance. A special object __non_queue exists to flag this scenario in _sycl_queue_manager.
  • A new wrapper convert_sklearnex_queue is added to make sure that any queues created via the pybind11 interface are converted to dpctl SyclQueues before use. In general, only a single SyclQueue object type should be used throughout the codebase, determined at import time. However, there are cases where the queue must be accessed via the pybind11 interface (due to limitations in dpctl), so conversions must occur when dpctl exists (see pytorch). This may happen again with another framework in the future, and the utility is ready for general use in _third_party.
  • is_torch_tensor is included to allow lazy importing of pytorch as required for queue extraction.
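To make the dispatch logic concrete, here is a minimal sketch of how __dlpack__ data can be routed by device type. This is illustrative only: the helper name is hypothetical, the device-type constants come from the DLPack specification, and both queue extraction and the device-to-host copy are framework-specific details that the PR implements but this sketch leaves abstract.

```python
import numpy as np

# Device-type codes from the DLPack specification (dlpack.h):
# kDLCPU marks host data, kDLOneAPI marks SYCL device data.
kDLCPU, kDLOneAPI = 1, 14


def dispatch_dlpack(item, allow_fallback_to_host=False):
    """Hypothetical helper: route an object exposing __dlpack_device__."""
    device_type, _device_id = item.__dlpack_device__()
    if device_type == kDLCPU:
        # Host data converts to numpy through the standard DLPack protocol
        # (np.from_dlpack is available in recent numpy releases).
        return np.from_dlpack(item)
    if device_type == kDLOneAPI:
        # SYCL data: this is where a SyclQueue would be extracted so
        # computation stays on the data's device; the extraction itself is
        # framework-specific and omitted here.
        return item
    if allow_fallback_to_host:
        # A device-to-host copy is also framework-specific
        # (e.g. torch.Tensor.cpu() for CUDA tensors) and not sketched.
        raise NotImplementedError("device-to-host copy not sketched")
    raise RuntimeError(
        "Data resides on a non-SYCL, non-CPU device; enable "
        "allow_fallback_to_host in the config to copy it to the CPU."
    )
```

For host data such as a numpy array, dispatch_dlpack(x) simply returns a numpy view of the same buffer.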

The PR should start as a draft, then move to the ready-for-review state after CI has passed and all applicable checkboxes are completed.
This approach ensures that reviewers don't spend extra time asking for regular requirements.

You can remove a checkbox as not applicable only if it doesn't relate to this PR in any way.
For example, a PR with a docs update doesn't require performance checkboxes, while a PR with any change to actual code should keep them and justify how the change is expected to affect performance (or the justification should be self-evident).

Checklist to comply with before moving PR from draft:

PR completeness and readability

  • I have reviewed my changes thoroughly before submitting this pull request.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes or created a separate PR with the update and provided its number in the description, if necessary.
  • Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
  • I have added the respective label(s) to the PR if I have permission to do so.
  • I have resolved any merge conflicts that might occur with the base branch.

Testing

  • I have run it locally and tested the changes extensively.
  • All CI jobs are green or I have provided justification why they aren't.
  • I have extended the testing suite if new functionality was introduced in this PR.

Performance

  • I have measured performance for affected algorithms using scikit-learn_bench and provided at least a summary table with measured data, if a performance change is expected.
  • I have provided justification why performance has changed or why changes are not expected.
  • I have provided justification why quality metrics have changed or why changes are not expected.
  • I have extended the benchmarking suite and provided a corresponding scikit-learn_bench PR if new measurable functionality was introduced in this PR.

@icfaust requested a review from Copilot June 24, 2025 04:44

@icfaust marked this pull request as ready for review June 29, 2025 23:17
@icfaust requested a review from Copilot June 29, 2025 23:17

@Copilot (Copilot AI) left a comment


Pull Request Overview

This PR enhances device offloading and queue conversion by adding DLPack queue extraction and data conversion support, along with improving SyclQueue comparison and fallback logic. Key changes include:

  • Exposing C++ operator overloads (== and !=) for SyclQueue and SyclDevice in the Python bindings.
  • Adding new utilities to convert pybind11-defined SyclQueue objects to dpctl SyclQueues and to extract queues from DLPack-encapsulated data.
  • Modifying device offload logic in dispatch and the global queue management to support non-SYCL and fallback scenarios.

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Summary per file:

  • sklearnex/tests/test_config.py: Updated test references and added a new test for fallback behavior.
  • sklearnex/_device_offload.py: Adjusted fallback logic and simplified the assignment-expression usage.
  • onedal/utils/_third_party.py: Introduced a new convert_sklearnex_queue decorator with a dpctl availability check (sketched below).
  • onedal/utils/_sycl_queue_manager.py: Added support for dlpack queue extraction and updated data-interface handling.
  • onedal/datatypes/_dlpack.py: Provided new functions for dlpack-to-numpy conversion and queue extraction.
  • onedal/datatypes/__init__.py: Updated module exports to include the new dlpack-related functions.
  • onedal/common/tests/test_sycl.py: Extended tests to validate the new SyclQueue operator overloads.
  • onedal/common/sycl.cpp: Exposed operator== and operator!= for SyclQueue and SyclDevice in pybind11.
  • onedal/_device_offload.py: Revised _transfer_to_host logic to use the new dlpack_to_numpy conversion.
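The convert_sklearnex_queue decorator in onedal/utils/_third_party.py can be pictured roughly as follows. This is a sketch under stated assumptions, not the file's actual contents: the guarded dpctl import shows one common way to define dpctl_available, and the conversion helper is a named placeholder because how the underlying SYCL queue is rewrapped depends on what the pybind11 SyclQueue exposes.

```python
from functools import wraps

# One common way to define the availability flag the decorator relies on.
try:
    import dpctl
    dpctl_available = True
except ImportError:
    dpctl_available = False


def to_dpctl_queue(pybind_queue):
    """Placeholder for the real conversion of a pybind11 SyclQueue into a
    dpctl.SyclQueue; the actual mechanism is not shown in this sketch."""
    raise NotImplementedError


def convert_queue_sketch(func):
    """Illustrative decorator: ensure any queue returned by func is a
    dpctl.SyclQueue whenever dpctl is installed."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        queue = func(*args, **kwargs)
        if dpctl_available and queue is not None and not isinstance(queue, dpctl.SyclQueue):
            queue = to_dpctl_queue(queue)
        return queue

    return wrapper
```

The point of the pattern is that a single SyclQueue type (the dpctl one) flows through the rest of the codebase even when queues originate from the pybind11 layer.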
Comments suppressed due to low confidence (7)

sklearnex/_device_offload.py:57

  • The new block triggers a fallback for non-SYCL device data by calling QM.fallback_to_host() and returning (None, None). Please confirm that this behavior is consistent with the overall API design and intended error handling.
    if get_config()["allow_fallback_to_host"]:

sklearnex/tests/test_config.py:198

  • Using the class method reference '_Estimator._onedal_test' instead of the instance method improves clarity in the dispatch. Please verify that this change aligns with the intended design.
                    {"onedal": _Estimator._onedal_test, "sklearn": None},

sklearnex/_device_offload.py:52

  • [nitpick] The replacement of the assignment expression with a direct config access enhances readability; please confirm that this change does not affect any intended side effects.
        if not patching_status.get_status() and get_config()["allow_fallback_to_host"]:

onedal/utils/_third_party.py:142

  • Ensure that the variable 'dpctl_available' is defined or imported in the module to prevent potential NameError issues.
    if dpctl_available:

onedal/utils/_sycl_queue_manager.py:41

  • Including '__non_queue' in the condition ensures non-queue objects are passed through unchanged. Confirm that '__non_queue' is used consistently across the codebase for non-device scenarios.
    if isinstance(target, SyclQueue) or target is None or target is __non_queue:
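For context on the __non_queue check above: such a sentinel is typically a unique module-level object compared by identity, so it cannot collide with a real queue, None, or any user-supplied value. A plausible sketch, not necessarily how _sycl_queue_manager defines it:

```python
# A unique sentinel; only identity checks (is / is not) are meaningful.
__non_queue = object()


def is_non_queue(target):
    """Hypothetical helper: True only for the sentinel itself."""
    return target is __non_queue
```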

onedal/utils/_sycl_queue_manager.py:144

  • The previous exception handling for 'sycl_usm_array_interface' has been removed. Please confirm that a RuntimeError will no longer be raised in scenarios where this attribute access fails.
        if usm_iface := getattr(item, "__sycl_usm_array_interface__", None):

onedal/_device_offload.py:72

  • [nitpick] Replacing the detailed dlpack handling inline with a call to dlpack_to_numpy simplifies the code. Please ensure that this condition covers all intended dlpack-supported cases.
        elif not isinstance(item, np.ndarray) and (hasattr(item, "__dlpack_device__")):

@@ -139,6 +162,9 @@ def from_data(*data):
data_dev = data_queue.sycl_device
global_dev = global_queue.sycl_device
if (data_dev and global_dev) is not None and data_dev != global_dev:

Copilot AI Jun 29, 2025


[nitpick] Using '(data_dev and global_dev) is not None' can be ambiguous; consider explicitly checking both 'data_dev is not None' and 'global_dev is not None' to improve clarity.

Suggested change
if (data_dev and global_dev) is not None and data_dev != global_dev:
if data_dev is not None and global_dev is not None and data_dev != global_dev:
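The ambiguity stems from short-circuiting: (data_dev and global_dev) evaluates to one of the two operands, so only that operand is ever compared with None. A toy demonstration, not code from the PR:

```python
# Short-circuiting `and` returns one operand, so the parenthesized form
# compares only that operand with None.
data_dev, global_dev = None, "gpu:0"
print((data_dev and global_dev) is not None)            # False (checked data_dev only)

# With a falsy but non-None left operand the two forms even disagree:
data_dev, global_dev = 0, None
print((data_dev and global_dev) is not None)            # True
print(data_dev is not None and global_dev is not None)  # False
```

Real SyclDevice objects are presumably always truthy, so the behavior likely coincides in practice, but the explicit form states the intent directly.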


@icfaust (Contributor Author) commented Jun 29, 2025

/intelci: run

@icfaust (Contributor Author) commented Jun 30, 2025

/intelci: run

@icfaust (Contributor Author) commented Jul 3, 2025

/intelci: run
