[enhancement] accelerate array_api inputs for sklearnex's validate_data
and _check_sample_weight
#2296
Description
Continues dlpack work from #2275.
`validate_data` and `_check_sample_weight` do not follow standard sklearnex offloading practice: they always compute wherever the data already resides (the computation is extraordinarily simple, so data movement could erase any speedup provided by oneDAL), and they do not patch out sklearn functions. They must therefore be enabled separately for array_api support. Since they are included in every zero-copy array_api-supported algorithm, this is a prerequisite for enabling every other estimator.

Previously this aspect was controlled by looking for the `flags` attribute, which is not part of the array_api standard. The array_api standard does not include Python-facing attributes or methods that show whether an array is C-contiguous or F-contiguous; it does, however, require DLPack support, so the attributes of a DLPack tensor can be checked for the memory layout instead. This PR introduces a special onedal backend function which extracts and checks the necessary memory layout (without taking ownership of the tensor). A Python function is added which first checks for and queries the `flags` or `__dlpack__` attributes. If neither is available, it returns False, triggering sklearn's `_assert_all_finite`. This is done because `to_table` would otherwise attempt to convert the data to a contiguous memory layout, which again would ruin the performance gain.

The PR should start as a draft, then move to the ready-for-review state after CI passes and all applicable checkboxes are closed.
This approach ensures that reviewers don't spend extra time asking for regular requirements.
You can remove a checkbox as not applicable only if it doesn't relate to this PR in any way.
For example, a PR with only a docs update doesn't require performance checkboxes, while a PR with any change to actual code should keep those checkboxes and justify how the change is expected to affect performance (or the justification should be self-evident).
Checklist to comply with before moving PR from draft:
PR completeness and readability
Testing
Performance