Commit 7062a1e

update: update algorithm.rst based on current implementation (#2354)
* update: update algorithm and preview
* fix: remove incremental algos and added extra trees and tsne n_components description
* fix: fix basic stats
* fix: fix basic stats
* fix: went through all algos
* fix: address comments
* fix move incremental limitation under notes
* fix: add notes in other incremental algos
* fix: sklearn < 1.5
* fix: removed versions
* Update doc/sources/algorithms.rst
  Co-authored-by: david-cortes-intel <david.cortes@intel.com>
* fix: various changes based on comments
* fix: add verbose note
* Update doc/sources/algorithms.rst
  Co-authored-by: david-cortes-intel <david.cortes@intel.com>
* fix: verbose html
* fix: verbose note
* fix: address comments
* fix: incremental notes
* fix: verbose
* fix: LogisticRegression
* fix: doc ci
* fix: fix basic stats format
* fix: fix basic stats
* fix: fix basic stats
* fix: format
* fix: fix indent
* fix: format
* fix: address comments
* fix: fix logistic regression cpu
* fix: fix logistic regression cpu
* fix: fix spmd logistic regression

---------

Co-authored-by: david-cortes-intel <david.cortes@intel.com>
1 parent c549b86 commit 7062a1e

5 files changed: +145 -33 lines changed

doc/sources/algorithms.rst

Lines changed: 130 additions & 32 deletions
@@ -19,6 +19,10 @@
  Supported Algorithms
  ####################

+ .. note::
+ To verify that oneDAL is being used for these algorithms, you can enable verbose mode.
+ See :ref:`verbose mode documentation <verbose>` for details.
+
  Applying |sklearnex| impacts the following |sklearn| estimators:

  on CPU
@@ -48,6 +52,13 @@ Classification
  - ``ccp_alpha`` != `0`
  - ``criterion`` != `'gini'`
  - Multi-output and sparse data are not supported
+ * - `ExtraTreesClassifier`
+ - All parameters are supported except:
+
+ - ``warm_start`` = `True`
+ - ``ccp_alpha`` != `0`
+ - ``criterion`` != `'gini'`
+ - Multi-output and sparse data are not supported
  * - `KNeighborsClassifier`
  -
  - For ``algorithm`` == `'kd_tree'`:
@@ -58,12 +69,8 @@ Classification
  all parameters except ``metric`` not in [`'euclidean'`, `'manhattan'`, `'minkowski'`, `'chebyshev'`, `'cosine'`]
  - Multi-output and sparse data are not supported
  * - `LogisticRegression`
- - All parameters are supported except:
-
- - ``solver`` not in [`'lbfgs'`, `'newton-cg'`]
- - ``class_weight`` != `None`
- - ``sample_weight`` != `None`
- - Only dense data is supported
+ - All parameters are supported
+ - No limitations

  Regression
  **********
@@ -89,6 +96,13 @@ Regression
  - ``ccp_alpha`` != `0`
  - ``criterion`` != `'mse'`
  - Multi-output and sparse data are not supported
+ * - `ExtraTreesRegressor`
+ - All parameters are supported except:
+
+ - ``warm_start`` = `True`
+ - ``ccp_alpha`` != `0`
+ - ``criterion`` != `'mse'`
+ - Multi-output and sparse data are not supported
  * - `KNeighborsRegressor`
  - All parameters are supported except:

@@ -97,16 +111,17 @@ Regression
  * - `LinearRegression`
  - All parameters are supported except:

- - ``normalize`` != `False`
  - ``sample_weight`` != `None`
+ - ``positive`` = `True`
  - Only dense data is supported.
  * - `Ridge`
  - All parameters are supported except:

- - ``normalize`` != `False`
  - ``solver`` != `'auto'`
  - ``sample_weight`` != `None`
- - Only dense data is supported, `#observations` should be >= `#features`.
+ - ``positive`` = `True`
+ - ``alpha`` must be scalar
+ - Only dense data is supported.
  * - `ElasticNet`
  - All parameters are supported except:

@@ -132,8 +147,10 @@ Clustering
  * - `KMeans`
  - All parameters are supported except:

- - ``precompute_distances``
- - ``sample_weight`` != `None`
+ - ``algorithm`` != ``'lloyd'`` ('elkan' falls back to 'lloyd')
+ - ``n_clusters`` = ``1``
+ - ``sample_weight`` must be None, constant, or equal weights
+ - ``init`` = `'k-means++'` falls back to CPU
  - No limitations
  * - `DBSCAN`
  - All parameters are supported except:
@@ -156,16 +173,16 @@ Dimensionality Reduction
  * - `PCA`
  - All parameters are supported except:

- - ``svd_solver`` not in [`'full'`, `'covariance_eigh'`]
- - Sparse data is not supported
- * - `IncrementalPCA`
- - All parameters are supported
+ - ``svd_solver`` not in [`'full'`, `'covariance_eigh'`, `'onedal_svd'`]
+ - For |sklearn| < 1.5: `'full'` solver is automatically mapped to `'covariance_eigh'`
  - Sparse data is not supported
  * - `TSNE`
  - All parameters are supported except:

  - ``metric`` != 'euclidean' or `'minkowski'` with ``p`` != `2`

+ - ``n_components`` can only be `2`
+
  Refer to :ref:`TSNE acceleration details <acceleration_tsne>` to learn more.
  - Sparse data is not supported

@@ -204,17 +221,33 @@ Other Tasks
  * - `EmpiricalCovariance`
  - All parameters are supported
  - Only dense data is supported
+ * - `BasicStatistics`
+ - All parameters are supported
+ - Supported data formats:
+
+ - Dense data
+ - CSR sparse matrices
+ - Sample weights **not** supported for CSR data format
  * - `train_test_split`
  - All parameters are supported
- - Only dense data is supported
+ - Supported data formats:
+
+ - Only dense data is supported
+ - Only integer and 32/64-bits floating point types are supported
+ - Data with more than 3 dimensions is not supported
+ - Only ``np.ndarray`` inputs are supported.
  * - `assert_all_finite`
  - All parameters are supported
  - Only dense data is supported
  * - `pairwise_distance`
  - All parameters are supported except:

  - ``metric`` not in [`'cosine'`, `'correlation'`]
- - Only dense data is supported
+ - Supported data formats:
+
+ - Only dense data is supported
+ - ``Y`` must be `None`
+ - Input dtype must be `np.float64`
  * - `roc_auc_score`
  - All parameters are supported except:

@@ -255,6 +288,15 @@ Classification
  - ``oob_score`` = `True`
  - ``sample_weight`` != `None`
  - Multi-output and sparse data are not supported
+ * - `ExtraTreesClassifier`
+ - All parameters are supported except:
+
+ - ``warm_start`` = `True`
+ - ``ccp_alpha`` != `0`
+ - ``criterion`` != `'gini'`
+ - ``oob_score`` = `True`
+ - ``sample_weight`` != `None`
+ - Multi-output and sparse data are not supported
  * - `KNeighborsClassifier`
  - All parameters are supported except:

@@ -269,7 +311,13 @@ Classification
  - ``class_weight`` != `None`
  - ``sample_weight`` != `None`
  - ``penalty`` != `'l2'`
- - Only dense data is supported
+ - ``dual`` = `True`
+ - ``intercept_scaling`` != `1`
+ - ``multi_class`` != `'multinomial'`
+ - ``warm_start`` = `True`
+ - ``l1_ratio`` != `None`
+ - Only binary classification is supported
+ - No limitations

  Regression
  **********
@@ -291,6 +339,15 @@ Regression
  - ``oob_score`` = `True`
  - ``sample_weight`` != `None`
  - Multi-output and sparse data are not supported
+ * - `ExtraTreesRegressor`
+ - All parameters are supported except:
+
+ - ``warm_start`` = `True`
+ - ``ccp_alpha`` != `0`
+ - ``criterion`` != `'mse'`
+ - ``oob_score`` = `True`
+ - ``sample_weight`` != `None`
+ - Multi-output and sparse data are not supported
  * - `KNeighborsRegressor`
  - All parameters are supported except:

@@ -301,8 +358,8 @@ Regression
  * - `LinearRegression`
  - All parameters are supported except:

- - ``normalize`` != `False`
  - ``sample_weight`` != `None`
+ - ``positive`` = `True`
  - Only dense data is supported.

  Clustering
@@ -319,10 +376,11 @@ Clustering
  * - `KMeans`
  - All parameters are supported except:

- - ``precompute_distances``
- - ``sample_weight`` != `None`
- - ``Init`` = `'k-means++'` fallbacks to CPU.
- - Sparse data is not supported
+ - ``algorithm`` != ``'lloyd'`` ('elkan' falls back to 'lloyd')
+ - ``n_clusters`` = ``1``
+ - ``sample_weight`` must be None, constant, or equal weights
+ - ``init`` = `'k-means++'` falls back to CPU
+ - No limitations
  * - `DBSCAN`
  - All parameters are supported except:

@@ -344,7 +402,8 @@ Dimensionality Reduction
  * - `PCA`
  - All parameters are supported except:

- - ``svd_solver`` not in [`'full'`, `'covariance_eigh'`]
+ - ``svd_solver`` not in [`'full'`, `'covariance_eigh'`, `'onedal_svd'`]
+ - For |sklearn| < 1.5: `'full'` solver is automatically mapped to `'covariance_eigh'`
  - Sparse data is not supported

  Nearest Neighbors
@@ -380,6 +439,13 @@ Other Tasks
  * - `EmpiricalCovariance`
  - All parameters are supported
  - Only dense data is supported
+ * - `BasicStatistics`
+ - All parameters are supported
+ - Supported data formats:
+
+ - Dense data
+ - CSR sparse matrices
+ - Sample weights **not** supported for CSR data format

  .. _spmd-support:

@@ -408,6 +474,15 @@ Classification
  - ``oob_score`` = `True`
  - ``sample_weight`` != `None`
  - Multi-output and sparse data are not supported
+ * - `ExtraTreesClassifier`
+ - All parameters are supported except:
+
+ - ``warm_start`` = `True`
+ - ``ccp_alpha`` != `0`
+ - ``criterion`` != `'gini'`
+ - ``oob_score`` = `True`
+ - ``sample_weight`` != `None`
+ - Multi-output and sparse data are not supported
  * - `KNeighborsClassifier`
  - All parameters are supported except:

@@ -423,7 +498,13 @@ Classification
  - ``class_weight`` != `None`
  - ``sample_weight`` != `None`
  - ``penalty`` != `'l2'`
- - Only dense data is supported
+ - ``dual`` = `True`
+ - ``intercept_scaling`` != `1`
+ - ``multi_class`` != `'multinomial'`
+ - ``warm_start`` = `True`
+ - ``l1_ratio`` != `None`
+ - Only binary classification is supported
+ - No limitations

  Regression
  **********
@@ -445,6 +526,15 @@ Regression
  - ``oob_score`` = `True`
  - ``sample_weight`` != `None`
  - Multi-output and sparse data are not supported
+ * - `ExtraTreesRegressor`
+ - All parameters are supported except:
+
+ - ``warm_start`` = `True`
+ - ``ccp_alpha`` != `0`
+ - ``criterion`` != `'mse'`
+ - ``oob_score`` = `True`
+ - ``sample_weight`` != `None`
+ - Multi-output and sparse data are not supported
  * - `KNeighborsRegressor`
  - All parameters are supported except:

@@ -455,8 +545,8 @@ Regression
  * - `LinearRegression`
  - All parameters are supported except:

- - ``normalize`` != `False`
  - ``sample_weight`` != `None`
+ - ``positive`` = `True`
  - Only dense data is supported.

  Clustering
@@ -473,10 +563,11 @@ Clustering
  * - `KMeans`
  - All parameters are supported except:

- - ``precompute_distances``
- - ``sample_weight`` != `None`
- - ``Init`` = `'k-means++'` fallbacks to CPU.
- - Sparse data is not supported
+ - ``algorithm`` != ``'lloyd'`` ('elkan' falls back to 'lloyd')
+ - ``n_clusters`` = ``1``
+ - ``sample_weight`` must be None, constant, or equal weights
+ - ``init`` = `'k-means++'` falls back to CPU
+ - No limitations
  * - `DBSCAN`
  - All parameters are supported except:

@@ -498,8 +589,8 @@ Dimensionality Reduction
  * - `PCA`
  - All parameters are supported except:

- - ``svd_solver`` not in [`'full'`, `'covariance_eigh'`]
- - ``fit`` is the only method supported
+ - ``svd_solver`` not in [`'full'`, `'covariance_eigh'`, `'onedal_svd'`]
+ - For |sklearn| < 1.5: `'full'` solver is automatically mapped to `'covariance_eigh'`
  - Sparse data is not supported

  Nearest Neighbors
@@ -535,6 +626,13 @@ Other Tasks
  * - `EmpiricalCovariance`
  - All parameters are supported
  - Only dense data is supported
+ * - `BasicStatistics`
+ - All parameters are supported
+ - Supported data formats:
+
+ - Dense data
+ - CSR sparse matrices
+ - Sample weights **not** supported for CSR data format

  Scikit-learn Tests
  ------------------
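
The note added at the top of algorithms.rst says verbose mode can be used to check whether oneDAL is handling a call. A minimal sketch of turning it on from Python follows. The logger name `sklearnex` and the `INFO` level are assumptions here and should be confirmed against the verbose mode documentation the note links to; only the standard `logging` module is used.

```python
import logging

# Assumption: the extension reports its dispatch decisions on a logger
# named "sklearnex", and INFO is the level at which they become visible.
sklearnex_logger = logging.getLogger("sklearnex")
sklearnex_logger.setLevel(logging.INFO)

# Estimators created and fitted after this point should log whether the
# oneDAL path or the stock scikit-learn fallback was taken.
```

This has to run before the estimator is fitted; messages from earlier calls are not replayed.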

sklearnex/basic_statistics/incremental_basic_statistics.py

Lines changed: 3 additions & 1 deletion
@@ -106,7 +106,9 @@ class IncrementalBasicStatistics(oneDALEstimator, BaseEstimator):
  Attribute exists only if corresponding result option has been provided.

  Names of attributes without the trailing underscore are supported
- currently but deprecated in 2025.1 and will be removed in 2026.0
+ currently but deprecated in 2025.1 and will be removed in 2026.0.
+
+ Sparse data formats are not supported. Input dtype must be ``float32`` or ``float64``.

  %incremental_serialization_note%

sklearnex/covariance/incremental_covariance.py

Lines changed: 4 additions & 0 deletions
@@ -98,6 +98,10 @@ class IncrementalEmpiricalCovariance(oneDALEstimator, BaseEstimator):
  n_features_in_ : int
  Number of features seen during ``fit`` or ``partial_fit``.

+ Notes
+ -----
+ Sparse data formats are not supported. Input dtype must be ``float32`` or ``float64``.
+
  %incremental_serialization_note%

  Examples

sklearnex/linear_model/incremental_linear.py

Lines changed: 4 additions & 0 deletions
@@ -110,6 +110,10 @@ class IncrementalLinearRegression(
  n_features_in_ : int
  Number of features seen during ``fit`` or ``partial_fit``.

+ Notes
+ -----
+ Sparse data formats are not supported. Input dtype must be ``float32`` or ``float64``.
+
  %incremental_serialization_note%

  Examples

sklearnex/linear_model/incremental_ridge.py

Lines changed: 4 additions & 0 deletions
@@ -97,6 +97,10 @@ class IncrementalRidge(MultiOutputMixin, RegressorMixin, oneDALEstimator, BaseEs
  batch_size_ : int
  Inferred batch size from ``batch_size``.

+ Notes
+ -----
+ Sparse data formats are not supported. Input dtype must be ``float32`` or ``float64``.
+
  %incremental_serialization_note%
  """
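
The same note is added to all four incremental estimators: sparse input is not supported, and the input dtype must be ``float32`` or ``float64``. One way to honor it on the caller's side is to normalize each batch before `partial_fit`. The sketch below is only an illustration: `as_supported_dense` is a hypothetical helper, not part of sklearnex, and detecting sparse containers through a `toarray` attribute is an assumption. The diff does not say what the estimators do with other dtypes, so the upcast to float64 is a choice made here, not documented behavior.

```python
import numpy as np

# float32 and float64 are the two dtypes named in the added docstring notes.
SUPPORTED_DTYPES = (np.dtype(np.float32), np.dtype(np.float64))


def as_supported_dense(X):
    """Return X as a dense float32/float64 array; reject sparse containers.

    Hypothetical helper for illustration only.
    """
    # Assumption: sparse containers are recognized by duck typing on
    # ``toarray``, which SciPy sparse matrices and arrays expose.
    if hasattr(X, "toarray"):
        raise TypeError("sparse input is not supported; densify it explicitly")
    X = np.asarray(X)
    if X.dtype not in SUPPORTED_DTYPES:
        # Integer, bool, or float16 data is upcast to float64.
        X = X.astype(np.float64)
    return X


batch = as_supported_dense([[1, 2], [3, 4]])
```

Raising on sparse input instead of densifying silently keeps a potentially large memory allocation visible to the caller.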
