[maintenance] lazy load dpnp.tensor/dpnp and prepare for array_api lazy importing #2509

Open
wants to merge 71 commits into
base: main
Commits
7d14b79
starting point
icfaust Jun 3, 2025
4a83297
Merge branch 'dev/lazy_load' of https://github.com/icfaust/scikit-lea…
icfaust Jun 3, 2025
523e84b
first cut
icfaust Jun 5, 2025
6f4775f
rename
icfaust Jun 5, 2025
219e26f
fix various testing imports
icfaust Jun 5, 2025
54af074
don't get ahead of my skis
icfaust Jun 5, 2025
f3c5d5b
attempt to further move things apart
icfaust Jun 5, 2025
bfdd3e0
remove get_unique_values_with_dpep
icfaust Jun 5, 2025
436405c
remove actually
icfaust Jun 5, 2025
55eab86
Update _array_api.py
icfaust Jun 5, 2025
a7c8fb0
try to fix
icfaust Jun 6, 2025
982c7c4
Update _device_offload.py
icfaust Jun 7, 2025
e975a4f
Update _device_offload.py
icfaust Jun 7, 2025
d46d175
Update _device_offload.py
icfaust Jun 7, 2025
8e8b6d9
Update _device_offload.py
icfaust Jun 7, 2025
125e727
Update _sycl_usm.py
icfaust Jun 7, 2025
fc6fa24
Update _third_party.py
icfaust Jun 7, 2025
c9244b8
Update _device_offload.py
icfaust Jun 7, 2025
c171175
Update _device_offload.py
icfaust Jun 7, 2025
18308b2
Update _device_offload.py
icfaust Jun 7, 2025
603e7d3
Update _device_offload.py
icfaust Jun 7, 2025
bc1c0e3
Update _sycl_usm.py
icfaust Jun 7, 2025
51a6b06
Update _sycl_usm.py
icfaust Jun 7, 2025
1f1648c
Update _third_party.py
icfaust Jun 7, 2025
0ec3ed8
Update _third_party.py
icfaust Jun 7, 2025
5688076
Update _sycl_usm.py
icfaust Jun 7, 2025
39d300e
Update _third_party.py
icfaust Jun 7, 2025
62611c0
Update _third_party.py
icfaust Jun 7, 2025
3688b1b
Update _array_api.py
icfaust Jun 7, 2025
65bc9ae
Update _array_api.py
icfaust Jun 7, 2025
8efea19
formatting
icfaust Jun 8, 2025
4520268
Update setup.py
icfaust Jun 8, 2025
474ab8f
Update _third_party.py
icfaust Jun 8, 2025
f744a6e
Update _third_party.py
icfaust Jun 8, 2025
c1ce7af
Update _third_party.py
icfaust Jun 8, 2025
fc41abc
Merge branch 'uxlfoundation:main' into dev/lazy_load
icfaust Jun 9, 2025
e697ca3
Merge branch 'main' into dev/lazy_load
icfaust Jun 18, 2025
273f4a7
Update _data_conversion.py
icfaust Jun 18, 2025
a6013e1
Update __init__.py
icfaust Jun 18, 2025
c1176b4
Update __init__.py
icfaust Jun 18, 2025
49ba4e6
Update _data_conversion.py
icfaust Jun 18, 2025
d4317b4
Update _device_offload.py
icfaust Jun 18, 2025
70d7557
Update _third_party.py
icfaust Jun 18, 2025
1a2e04c
add requested comments to code
icfaust Jun 20, 2025
f401c89
add requested comments to code
icfaust Jun 20, 2025
1231735
fix codespell hits
icfaust Jun 20, 2025
d4e3e4d
Merge branch 'uxlfoundation:main' into dev/lazy_load
icfaust Jun 20, 2025
53d9cb6
Update test_common.py
icfaust Jun 21, 2025
e4e08c1
Update test_common.py
icfaust Jun 21, 2025
ee12c4f
Update test_common.py
icfaust Jun 21, 2025
d0a8de2
Update test_common.py
icfaust Jun 21, 2025
20723ba
Update test_common.py
icfaust Jun 21, 2025
ed51abc
Update test_common.py
icfaust Jun 21, 2025
0b9785b
Update test_common.py
icfaust Jun 21, 2025
72c9f9c
Update test_common.py
icfaust Jun 21, 2025
09dfab2
Update test_common.py
icfaust Jun 21, 2025
59e9119
Update test_common.py
icfaust Jun 21, 2025
de631bc
Update test_common.py
icfaust Jun 21, 2025
ec3ce64
Update test_common.py
icfaust Jun 21, 2025
31bc191
Update test_common.py
icfaust Jun 21, 2025
145ab80
Update test_common.py
icfaust Jun 21, 2025
869e036
Update test_common.py
icfaust Jun 21, 2025
400aad8
Update test_common.py
icfaust Jun 21, 2025
556cb16
Update test_common.py
icfaust Jun 21, 2025
de1c1ea
Update test_common.py
icfaust Jun 21, 2025
ab76a76
Update test_common.py
icfaust Jun 21, 2025
03b99a0
Update test_common.py
icfaust Jun 21, 2025
733511e
Update test_common.py
icfaust Jun 21, 2025
24fa5ba
Update test_common.py
icfaust Jun 21, 2025
a37bacb
Update test_common.py
icfaust Jun 21, 2025
f31c296
Update test_common.py
icfaust Jun 21, 2025
5 changes: 4 additions & 1 deletion .ci/scripts/run_sklearn_tests.sh
@@ -29,8 +29,11 @@ export DESELECT_FLAGS="--public ${DESELECT_FLAGS}"
if [ -n "${SKLEARNEX_PREVIEW}" ]; then
export DESELECT_FLAGS="--preview ${DESELECT_FLAGS}"
fi
export DESELECTED_TESTS=$(python ../.circleci/deselect_tests.py ../deselected_tests.yaml ${DESELECT_FLAGS})
if [ "$1" == "gpu" ]; then
export DESELECT_FLAGS="--gpu ${DESELECT_FLAGS}"
fi

export DESELECTED_TESTS=$(python ../.circleci/deselect_tests.py ../deselected_tests.yaml ${DESELECT_FLAGS})
# manual setting of OCL_ICD_FILENAMES is required in
# specific MSYS environment with conda packages downloaded from intel channel
if [[ "$(uname)" =~ "MSYS" ]] && [ -z "${OCL_ICD_FILENAMES}" ] && [ -n "${CONDA_PREFIX}" ]; then
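The hunk above moves the `DESELECTED_TESTS` export below the new `gpu` check, so the `--gpu` flag is already in `DESELECT_FLAGS` when `deselect_tests.py` runs. A minimal Python sketch of the resulting flag order follows. The function name and the `base_flags` parameter are hypothetical and stand in for whatever the script set before line 29; only the three flag names come from the diff.

```python
def build_deselect_flags(device, preview=False, base_flags=""):
    """Return the flags in the order the shell script prepends them."""
    flags = base_flags.split()
    # export DESELECT_FLAGS="--public ${DESELECT_FLAGS}"
    flags.insert(0, "--public")
    # if [ -n "${SKLEARNEX_PREVIEW}" ]
    if preview:
        flags.insert(0, "--preview")
    # if [ "$1" == "gpu" ]  (the check added by this change)
    if device == "gpu":
        flags.insert(0, "--gpu")
    return flags


print(build_deselect_flags("gpu", preview=True))
print(build_deselect_flags("cpu"))
```

Only after all three steps would the list be handed to the deselection script, which is the ordering the moved export line appears to enforce.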
246 changes: 211 additions & 35 deletions .github/workflows/ci.yml
@@ -33,9 +33,35 @@ env:
DPCTL_VERSION: 0.18.1
DPNP_VERSION: 0.16.0
DPCTL_PY_VERSIONS: '3.9\|3.11'
UXL_PYTHONVERSION: "3.12"
UXL_SKLEARNVERSION: "1.4"
ONEDAL_REPO: "uxlfoundation/oneDAL"

jobs:

onedal_nightly:
runs-on: ubuntu-24.04
name: Identify oneDAL nightly
timeout-minutes: 2

steps:
- name: Get run ID of "Nightly-build" workflow
id: get-run-id
run: |
WF_NAME="Nightly-build"
JQ_QUERY='map(select(.event == "workflow_dispatch" or .event == "schedule")) | .[0].databaseId'
RUN_ID=`gh run --repo ${{ env.ONEDAL_REPO }} list --workflow "${WF_NAME}" --json databaseId,event --status success --jq "${JQ_QUERY}"`
echo "Detected latest run id of ${RUN_ID} for workflow ${WF_NAME}"
echo "run-id=${RUN_ID}" >> "$GITHUB_OUTPUT"
env:
GH_TOKEN: ${{ github.token }}
outputs:
run-id: ${{ steps.get-run-id.outputs.run-id }}
uxl-python: ${{ env.UXL_PYTHONVERSION }}
uxl-sklearn: ${{ env.UXL_SKLEARNVERSION }}

sklearn_lnx:
needs: onedal_nightly
strategy:
fail-fast: false
matrix:
@@ -46,7 +72,7 @@ jobs:
SKLEARN_VERSION: "1.2"
- PYTHON_VERSION: "3.11"
SKLEARN_VERSION: "1.3"
name: LinuxNightly/pip Python${{ matrix.PYTHON_VERSION }}_Sklearn${{ matrix.SKLEARN_VERSION }}
name: LinuxNightly/venv Python${{ matrix.PYTHON_VERSION }}_Sklearn${{ matrix.SKLEARN_VERSION }}
runs-on: ubuntu-24.04
timeout-minutes: 120

@@ -57,32 +83,21 @@ jobs:
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.PYTHON_VERSION }}
- name: Get run ID of "Nightly-build" workflow
id: get-run-id
run: |
OTHER_REPO="uxlfoundation/oneDAL"
WF_NAME="Nightly-build"
JQ_QUERY='map(select(.event == "workflow_dispatch" or .event == "schedule")) | .[0].databaseId'
RUN_ID=`gh run --repo ${OTHER_REPO} list --workflow "${WF_NAME}" --json databaseId,event --status success --jq "${JQ_QUERY}"`
echo "Detected latest run id of ${RUN_ID} for workflow ${WF_NAME}"
echo "run-id=${RUN_ID}" >> "$GITHUB_OUTPUT"
env:
GH_TOKEN: ${{ github.token }}
- name: Download oneDAL build artifact
uses: actions/download-artifact@v4
with:
name: __release_lnx
github-token: ${{ github.token }}
repository: uxlfoundation/oneDAL
run-id: ${{ steps.get-run-id.outputs.run-id }}
repository: ${{ env.ONEDAL_REPO }}
run-id: ${{ needs.onedal_nightly.outputs.run-id }}
path: ./__release_lnx
- name: Download oneDAL environment artifact
uses: actions/download-artifact@v4
with:
name: oneDAL_env
github-token: ${{ github.token }}
repository: uxlfoundation/oneDAL
run-id: ${{ steps.get-run-id.outputs.run-id }}
repository: ${{ env.ONEDAL_REPO }}
run-id: ${{ needs.onedal_nightly.outputs.run-id }}
path: .ci/env
- name: Set Environment Variables
id: set-env
@@ -161,6 +176,7 @@ jobs:
bash .ci/scripts/run_sklearn_tests.sh $CPU

sklearn_win:
needs: onedal_nightly
strategy:
fail-fast: false
matrix:
@@ -171,7 +187,7 @@ jobs:
SKLEARN_VERSION: "1.2"
- PYTHON_VERSION: "3.11"
SKLEARN_VERSION: "1.3"
name: WindowsNightly/pip Python${{ matrix.PYTHON_VERSION }}_Sklearn${{ matrix.SKLEARN_VERSION }}
name: WindowsNightly/venv Python${{ matrix.PYTHON_VERSION }}_Sklearn${{ matrix.SKLEARN_VERSION }}
runs-on: windows-2025
timeout-minutes: 120

@@ -182,33 +198,21 @@ jobs:
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.PYTHON_VERSION }}
- name: Get run ID of "Nightly-build" workflow
id: get-run-id
shell: bash
run: |
OTHER_REPO="uxlfoundation/oneDAL"
WF_NAME="Nightly-build"
JQ_QUERY='map(select(.event == "workflow_dispatch" or .event == "schedule")) | .[0].databaseId'
RUN_ID=`gh run --repo ${OTHER_REPO} list --workflow "${WF_NAME}" --json databaseId,event --status success --jq "${JQ_QUERY}"`
echo "Detected latest run id of ${RUN_ID} for workflow ${WF_NAME}"
echo "run-id=${RUN_ID}" >> "$GITHUB_OUTPUT"
env:
GH_TOKEN: ${{ github.token }}
- name: Download oneDAL build artifact
uses: actions/download-artifact@v4
with:
name: __release_win
github-token: ${{ github.token }}
repository: uxlfoundation/oneDAL
run-id: ${{ steps.get-run-id.outputs.run-id }}
repository: ${{ env.ONEDAL_REPO }}
run-id: ${{ needs.onedal_nightly.outputs.run-id }}
path: ./__release_win
- name: Download Intel BaseKit artifact
uses: actions/download-artifact@v4
with:
name: intel_oneapi_basekit
github-token: ${{ github.token }}
repository: uxlfoundation/oneDAL
run-id: ${{ steps.get-run-id.outputs.run-id }}
repository: ${{ env.ONEDAL_REPO }}
run-id: ${{ needs.onedal_nightly.outputs.run-id }}
- name: Decompress Intel BaseKit
shell: cmd
run: |
@@ -234,8 +238,8 @@ jobs:
with:
name: opencl_rt_installer
github-token: ${{ github.token }}
repository: uxlfoundation/oneDAL
run-id: ${{ steps.get-run-id.outputs.run-id }}
repository: ${{ env.ONEDAL_REPO }}
run-id: ${{ needs.onedal_nightly.outputs.run-id }}
path: .
- name: Install Intel OpenCL CPU Runtime
if: ${{ steps.set-env.outputs.DPCFLAG == '' }}
@@ -313,3 +317,175 @@ jobs:
if "${{ steps.set-env.outputs.DPCFLAG }}"=="" set CPU=cpu
set SKLEARNEX_PREVIEW=YES
bash .ci/scripts/run_sklearn_tests.sh %CPU%

build_uxl:
if: github.repository == 'uxlfoundation/scikit-learn-intelex'
needs: onedal_nightly
name: LinuxNightly build Python${{ needs.onedal_nightly.outputs.uxl-python }}
runs-on: uxl-xlarge
timeout-minutes: 30

steps:
- name: Checkout Scikit-learn-intelex
uses: actions/checkout@v4
- name: Install Python
uses: actions/setup-python@v5
with:
python-version: ${{ env.UXL_PYTHONVERSION }}
cache: 'pip'
cache-dependency-path: |
**/dependencies-dev
**/requirements-test.txt
- name: Download oneDAL build artifact
uses: actions/download-artifact@v4
with:
name: __release_lnx
github-token: ${{ github.token }}
repository: ${{ env.ONEDAL_REPO }}
run-id: ${{ needs.onedal_nightly.outputs.run-id }}
path: ./__release_lnx
- name: Download oneDAL environment artifact
uses: actions/download-artifact@v4
with:
name: oneDAL_env
github-token: ${{ github.token }}
repository: ${{ env.ONEDAL_REPO }}
run-id: ${{ needs.onedal_nightly.outputs.run-id }}
path: .ci/env
- name: Set Environment Variables
id: set-env
run: |
# Disable SPMD testing
echo "NO_DIST=1" >> "$GITHUB_ENV"
# enable coverage report generation
echo "SKLEARNEX_GCOV=1" >> "$GITHUB_ENV"
- name: apt-get
run: sudo apt-get update && sudo apt-get install -y clang-format
- name: dpcpp installation
run: |
# This CI system yields oneAPI dependencies from the oneDAL repository
bash .ci/env/apt.sh dpcpp
- name: describe system
run: |
source /opt/intel/oneapi/setvars.sh
bash .ci/scripts/describe_system.sh
- name: Install develop requirements
run: |
pip install -r dependencies-dev
pip list
- name: Build daal4py/sklearnex
run: |
source .github/scripts/activate_components.sh ${{ steps.set-env.outputs.DPCFLAG }}
python setup.py bdist_wheel
- name: Archive sklearnex build
uses: actions/upload-artifact@v4
with:
name: sklearnex_build_${{ env.UXL_PYTHONVERSION }}
path: |
./dist/*.whl

test_uxl:
strategy:
fail-fast: false
matrix:
include:
- OS: uxl-gpu-xlarge
FRAMEWORKS: "numpy,pytorch"
DEVICE: gpu
- OS: uxl-xlarge
FRAMEWORKS: "numpy,pandas"
DEVICE: cpu
needs: [onedal_nightly, build_uxl]
name: LinuxNightly ${{ matrix.DEVICE }} test Python${{ needs.onedal_nightly.outputs.uxl-python }}_Sklearn${{ needs.onedal_nightly.outputs.uxl-sklearn }}
runs-on: ${{ matrix.OS }}
timeout-minutes: 120
steps:
- name: Checkout Scikit-learn-intelex
uses: actions/checkout@v4
- name: Install Python
uses: actions/setup-python@v5
with:
python-version: ${{ env.UXL_PYTHONVERSION }}
cache-dependency-path: |
**/dependencies-dev
**/requirements-test.txt
- name: Download oneDAL build artifact
uses: actions/download-artifact@v4
with:
name: __release_lnx
github-token: ${{ github.token }}
repository: ${{ env.ONEDAL_REPO }}
run-id: ${{ needs.onedal_nightly.outputs.run-id }}
path: ./__release_lnx
- name: Download oneDAL environment artifact
uses: actions/download-artifact@v4
with:
name: oneDAL_env
github-token: ${{ github.token }}
repository: ${{ env.ONEDAL_REPO }}
run-id: ${{ needs.onedal_nightly.outputs.run-id }}
path: .ci/env
- name: Set Environment Variables
id: set-env
run: |
echo "NO_DIST=1" >> "$GITHUB_ENV"
# enable coverage report generation
echo "COVERAGE_RCFILE=$(readlink -f .coveragerc)" >> "$GITHUB_ENV"
echo "ONEDAL_PYTEST_FRAMEWORKS=${{ matrix.FRAMEWORKS }}" >> "$GITHUB_ENV"
# reduce GPU driver/runner related memory issues
echo "NEOReadDebugKeys=1" >> "$GITHUB_ENV"
echo "EnableRecoverablePageFaults=1" >> "$GITHUB_ENV"
echo "GpuFaultCheckThreshold=0" >> "$GITHUB_ENV"
- name: apt-get
run: sudo apt-get update
- name: dpcpp installation
run: |
# This CI system yields oneAPI dependencies from the oneDAL repository
bash .ci/env/apt.sh dpcpp
- name: describe system
run: |
source /opt/intel/oneapi/setvars.sh
bash .ci/scripts/describe_system.sh
- name: Install test requirements
run: |
pip install -r dependencies-dev
echo "NUMPY_BUILD=$(python -m pip freeze | grep numpy)" >> "$GITHUB_ENV"
bash .ci/scripts/setup_sklearn.sh ${{ env.UXL_SKLEARNVERSION }}
pip install --upgrade -r requirements-test.txt
pip install $(python .ci/scripts/get_compatible_scipy_version.py ${{ env.UXL_SKLEARNVERSION }}) pyyaml
pip list
- name: Download sklearnex wheel
uses: actions/download-artifact@v4
with:
name: sklearnex_build_${{ env.UXL_PYTHONVERSION }}
- name: Install PyTorch
if: contains(matrix.FRAMEWORKS, 'pytorch')
run: pip install torch --index-url https://download.pytorch.org/whl/xpu
- name: Install daal4py/sklearnex
run: pip install *.whl
- name: Sklearnex testing
run: |
source .github/scripts/activate_components.sh
export COVERAGE_FILE=$(pwd)/.coverage.sklearnex
cd .ci
../conda-recipe/run_test.sh
- name: Sklearn testing
run: |
source .github/scripts/activate_components.sh
export COVERAGE_FILE=$(pwd)/.coverage.sklearn
bash .ci/scripts/run_sklearn_tests.sh ${{ matrix.DEVICE }}
- name: Create coverage report
run: |
source .github/scripts/activate_components.sh
bash .github/scripts/generate_coverage_reports.sh uxl_lnx_${{ matrix.DEVICE }}
- name: Archive coverage report
uses: actions/upload-artifact@v4
with:
name: coverage_uxl_lnx_${{ matrix.DEVICE }}
path: |
*uxl_lnx_${{ matrix.DEVICE }}.info
- name: Sklearn testing [preview]
run: |
source .github/scripts/activate_components.sh
export SKLEARNEX_PREVIEW='YES'
bash .ci/scripts/run_sklearn_tests.sh ${{ matrix.DEVICE }}
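The new `onedal_nightly` job runs the run-ID lookup once and shares the result through job outputs, instead of repeating it in each test job. The `jq` filter it uses keeps only runs triggered by `workflow_dispatch` or `schedule` and takes the `databaseId` of the first one. A rough Python equivalent of that filter is sketched below. The function name and the sample records are illustrative, and the sketch assumes the input list is ordered newest first, as the job's "latest run id" message implies.

```python
def latest_nightly_run_id(runs):
    """Mimic: map(select(.event == "workflow_dispatch" or .event == "schedule")) | .[0].databaseId"""
    wanted = {"workflow_dispatch", "schedule"}
    matching = [run for run in runs if run.get("event") in wanted]
    # jq yields null when no run matches; None plays that role here
    return matching[0]["databaseId"] if matching else None


# Illustrative records only; real IDs come from `gh run list --json databaseId,event`
sample_runs = [
    {"databaseId": 3, "event": "push"},
    {"databaseId": 2, "event": "schedule"},
    {"databaseId": 1, "event": "workflow_dispatch"},
]
print(latest_nightly_run_id(sample_runs))
```

With no matching run, the lookup yields an empty value rather than failing, so the downstream `download-artifact` steps would be the first place such a problem surfaces.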
27 changes: 13 additions & 14 deletions daal4py/sklearn/utils/validation.py
@@ -72,21 +72,20 @@ def _assert_all_finite(

# Data with small size has too big relative overhead
# TODO: tune threshold size
if hasattr(X, "size"):
if X.size < 32768:
if sklearn_check_version("1.1"):
_sklearn_assert_all_finite(
X,
allow_nan=allow_nan,
msg_dtype=msg_dtype,
estimator_name=estimator_name,
input_name=input_name,
)
else:
_sklearn_assert_all_finite(X, allow_nan=allow_nan, msg_dtype=msg_dtype)
return

is_df = is_DataFrame(X)
if not (is_df or isinstance(X, np.ndarray)) or X.size < 32768:
if sklearn_check_version("1.1"):
_sklearn_assert_all_finite(
X,
allow_nan=allow_nan,
msg_dtype=msg_dtype,
estimator_name=estimator_name,
input_name=input_name,
)
else:
_sklearn_assert_all_finite(X, allow_nan=allow_nan, msg_dtype=msg_dtype)
return

num_of_types = get_number_of_types(X)

# if X is heterogeneous pandas.DataFrame then
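The `validation.py` hunk replaces the `hasattr(X, "size")` test with an explicit type check: anything that is not a DataFrame or a `np.ndarray` now goes to scikit-learn's own finiteness check, whatever its size. The sketch below isolates that decision using only the standard library. `FakeArray`, `ForeignArray`, and `takes_sklearn_fallback` are hypothetical stand-ins; only the threshold value 32768 comes from the diff.

```python
SMALL_INPUT_THRESHOLD = 32768  # value taken from the diff


class FakeArray:
    """Stand-in for the types accepted by the fast path (np.ndarray, DataFrame)."""

    def __init__(self, size):
        self.size = size


class ForeignArray:
    """Stand-in for any other array type that also exposes a size attribute."""

    size = 10**6


def takes_sklearn_fallback(X, fast_path_types, threshold=SMALL_INPUT_THRESHOLD):
    # Unsupported types fall back first, so X.size is never read for them
    if not isinstance(X, fast_path_types):
        return True
    # Supported but small inputs also fall back: relative overhead is too high
    return X.size < threshold


print(takes_sklearn_fallback(FakeArray(10), (FakeArray,)))
print(takes_sklearn_fallback(FakeArray(40000), (FakeArray,)))
print(takes_sklearn_fallback(ForeignArray(), (FakeArray,)))
```

Under the earlier `hasattr` check, an object like `ForeignArray` with a large `size` would have continued past the fallback. With the type check it returns early, which seems to fit the PR's goal of not touching dpnp or dpctl types unless they are needed.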