'AttributeError: 'PCA' object has no attribute 'n_oversamples' #1018

Closed
Gabriel-p opened this issue Jul 8, 2022 · 5 comments · Fixed by #1032
@Gabriel-p

Describe the bug

Running the code below with sklearnex enabled fails inside PCA. This command produces the error shown below:

python -m sklearnex PCA_test.py

This command does not:

python PCA_test.py

To Reproduce

Save this code as PCA_test.py and run it with the commands above:

import numpy as np
from sklearn.decomposition import PCA

data = np.random.uniform(-10, 10, (1000, 3))
pca = PCA(n_components=3)
data_pca = pca.fit(data).transform(data)

Expected behavior
No error

Output/Screenshots


Intel(R) Extension for Scikit-learn* enabled (https://github.com/intel/scikit-learn-intelex)
Traceback (most recent call last):
  File "/home/gperren/miniconda3/envs/pyupmask/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/gperren/miniconda3/envs/pyupmask/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/gperren/miniconda3/envs/pyupmask/lib/python3.8/site-packages/sklearnex/__main__.py", line 55, in <module>
    sys.exit(_main())
  File "/home/gperren/miniconda3/envs/pyupmask/lib/python3.8/site-packages/sklearnex/__main__.py", line 52, in _main
    runf(args.name, run_name='__main__')
  File "/home/gperren/miniconda3/envs/pyupmask/lib/python3.8/runpy.py", line 265, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "/home/gperren/miniconda3/envs/pyupmask/lib/python3.8/runpy.py", line 97, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "/home/gperren/miniconda3/envs/pyupmask/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "PCA_test.py", line 9, in <module>
    data_pca = pca.fit(data).transform(data)
  File "/home/gperren/miniconda3/envs/pyupmask/lib/python3.8/site-packages/sklearn/decomposition/_pca.py", line 402, in fit
    self.n_oversamples,
AttributeError: 'PCA' object has no attribute 'n_oversamples'

Environment:

System:
    python: 3.8.13 (default, Mar 28 2022, 11:38:47)  [GCC 7.5.0]
executable: /home/gperren/miniconda3/envs/pyupmask/bin/python
   machine: Linux-5.0.16-100.fc28.x86_64-x86_64-with-glibc2.17

Python dependencies:
      sklearn: 1.1.1
          pip: 21.2.4
   setuptools: 61.2.0
        numpy: 1.22.3
        scipy: 1.7.3
       Cython: None
       pandas: None
   matplotlib: None
       joblib: 1.1.0
threadpoolctl: 2.2.0

Built with OpenMP: True

threadpoolctl info:
       filepath: /home/gperren/miniconda3/envs/pyupmask/lib/python3.8/site-packages/scikit_learn.libs/libgomp-a34b3233.so.1.0.0
         prefix: libgomp
       user_api: openmp
   internal_api: openmp
        version: None
    num_threads: 48

       filepath: /home/gperren/miniconda3/envs/pyupmask/lib/libmkl_rt.so.1
         prefix: libmkl_rt
       user_api: blas
   internal_api: mkl
        version: 2021.4-Product
    num_threads: 24
threading_layer: intel

       filepath: /home/gperren/miniconda3/envs/pyupmask/lib/libiomp5.so
         prefix: libiomp
       user_api: openmp
   internal_api: openmp
        version: None
    num_threads: 48
@Gabriel-p Gabriel-p added the bug Something isn't working label Jul 8, 2022
@FavorMylikes
Contributor

This seems to reproduce only on macOS:

CI - Test

------------------------------- Captured stdout --------------------------------
Command '['/usr/local/miniconda/envs/CB/bin/python', '/Users/runner/work/1/s/daal4py/sklearn/monkeypatch/tests/utils/_launch_algorithms.py']' returned non-zero exit status 1.
------------------------------- Captured stderr --------------------------------
dispatcher.py:151: FutureWarning: 
Scikit-learn patching with daal4py is deprecated and will be removed in the future.
Use Intel(R) Extension for Scikit-learn* module instead (pip install scikit-learn-intelex).
To enable patching, please use one of the following options:
1) From the command line:
    python -m sklearnex <your_script>
2) From your script:
    from sklearnex import patch_sklearn
    patch_sklearn()
Intel(R) oneAPI Data Analytics Library solvers for sklearn enabled: https://intelpython.github.io/daal4py/sklearn.html
Traceback (most recent call last):
  File "/Users/runner/work/1/s/daal4py/sklearn/monkeypatch/tests/utils/_launch_algorithms.py", line 117, in <module>
    run_algotithms()
  File "/Users/runner/work/1/s/daal4py/sklearn/monkeypatch/tests/utils/_launch_algorithms.py", line 93, in run_algotithms
    run_patch(info, t)
  File "/Users/runner/work/1/s/daal4py/sklearn/monkeypatch/tests/utils/_launch_algorithms.py", line 61, in run_patch
    model.fit(X, y)
  File "/usr/local/miniconda/envs/CB/lib/python3.9/site-packages/sklearn/decomposition/_pca.py", line 402, in fit
    self.n_oversamples,
AttributeError: 'PCA' object has no attribute 'n_oversamples'
=========================== short test summary info ============================
ERROR s/daal4py/sklearn/monkeypatch/tests/test_patching.py - SystemExit: 1
!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!

@samir-nasibli
Contributor

Hi @Gabriel-p @FavorMylikes, thank you for your detailed reports. This seems to be caused by the new sklearn version 1.1.1, which adds new PCA parameters such as n_oversamples. So it is a sklearnex bug.
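
A minimal sketch of how this kind of failure can arise, assuming the replacement class was written against the older constructor signature. The classes UpstreamPCA and PatchedPCA are hypothetical stand-ins, not the real sklearn or sklearnex code:

```python
class UpstreamPCA:
    """Hypothetical stand-in for an estimator after a release adds a parameter."""

    def __init__(self, n_components=None, n_oversamples=10):
        self.n_components = n_components
        self.n_oversamples = n_oversamples

    def fit(self, data):
        # The newer fit() reads the newly added attribute.
        self.oversamples_used_ = self.n_oversamples
        return self


class PatchedPCA(UpstreamPCA):
    """Hypothetical replacement written against the older signature."""

    def __init__(self, n_components=None):
        # The parent __init__ is never called, so n_oversamples is never set.
        self.n_components = n_components


def try_fit(estimator):
    """Return None if fit() succeeds, or the error message if it fails."""
    try:
        estimator.fit([[0.0, 1.0]])
    except AttributeError as exc:
        return str(exc)
    return None


print(try_fit(UpstreamPCA()))  # fit() succeeds
print(try_fit(PatchedPCA()))   # inherited fit() cannot find n_oversamples
```

The replacement inherits the newer fit() but keeps the older __init__, so the attribute lookup fails only once fit() runs, which matches the shape of the tracebacks above.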

@samir-nasibli samir-nasibli self-assigned this Jul 27, 2022
@mjoy296

mjoy296 commented Jul 29, 2022

Running into the same issue, how do we solve it?

@FavorMylikes
Contributor

> Running into the same issue, how do we solve it?

@mjoy296 Downgrade sklearn to 1.0.2
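
If downgrading is not an option everywhere, a script could at least detect the affected combination before enabling the patch. The sketch below assumes plain numeric release strings such as "1.1.1", and it only encodes what this thread reports (1.0.2 works, 1.1.x fails with the sklearnex release discussed here). The helper names are made up for illustration:

```python
def parse_major_minor(version):
    """Extract (major, minor) from a plain release string such as '1.1.1'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def affected_by_issue(sklearn_version):
    """True for sklearn >= 1.1, the range reported as failing in this thread."""
    return parse_major_minor(sklearn_version) >= (1, 1)


for version in ("1.0.2", "1.1.1"):
    print(version, affected_by_issue(version))
```

In a real script the version string would come from sklearn.__version__, and the check stops being relevant once a sklearnex release containing the fix is installed.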

@mauriciocramos

@mjoy296, I'm running into the same issue as well.

@FavorMylikes, I appreciate your solution but unfortunately I can't downgrade sklearn to 1.0.2 as I have dependencies on 1.1.

As seen in the latest sklearn (1.1.2) docs, PCA() has new parameters since 1.1:

n_oversamples : int, default=10
power_iteration_normalizer : {‘auto’, ‘QR’, ‘LU’, ‘none’}, default=’auto’

Here is the evidence:

>>> from sklearnex import patch_sklearn
>>> patch_sklearn()
Intel(R) Extension for Scikit-learn* enabled (https://github.com/intel/scikit-learn-intelex)
>>> from sklearn.decomposition import PCA
>>> p = PCA()
>>> p.get_params()
{'copy': True, 'iterated_power': 'auto', 'n_components': None, 'random_state': None, 'svd_solver': 'auto', 'tol': 0.0, 'whiten': False}
>>> from sklearnex import unpatch_sklearn
>>> unpatch_sklearn()
>>> from sklearn.decomposition import PCA
>>> p = PCA()
>>> p.get_params()
{'copy': True, 'iterated_power': 'auto', 'n_components': None, 'n_oversamples': 10, 'power_iteration_normalizer': 'auto', 'random_state': None, 'svd_solver': 'auto', 'tol': 0.0, 'whiten': False}
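
The difference between the two get_params() results can be computed directly. This sketch copies the two dictionaries from the session above; the helper name missing_params is made up for illustration:

```python
# get_params() output with sklearnex patching enabled (copied from above).
patched_params = {
    'copy': True, 'iterated_power': 'auto', 'n_components': None,
    'random_state': None, 'svd_solver': 'auto', 'tol': 0.0, 'whiten': False,
}

# get_params() output from stock sklearn after unpatching (copied from above).
stock_params = {
    'copy': True, 'iterated_power': 'auto', 'n_components': None,
    'n_oversamples': 10, 'power_iteration_normalizer': 'auto',
    'random_state': None, 'svd_solver': 'auto', 'tol': 0.0, 'whiten': False,
}


def missing_params(reference, candidate):
    """Return the parameter names present in reference but absent from candidate."""
    return sorted(set(reference) - set(candidate))


print(missing_params(stock_params, patched_params))
```

The result lists exactly the two parameters added in 1.1: n_oversamples and power_iteration_normalizer.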

I wonder if sklearnex could support the new default PCA() parameters "from the future", or at least ignore their existence. Otherwise I'd prefer to call sklearnex.unpatch_sklearn() just for PCA(), as its performance seems acceptable without sklearnex for now.
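
One way to leave only PCA unpatched might look like the sketch below. It assumes sklearnex's get_patch_names() returns the patchable estimator names and that patch_sklearn() accepts a list of names, as its documentation describes. The helpers names_to_patch and patch_all_except are hypothetical, and I have not verified this against the failing release:

```python
def names_to_patch(available, excluded):
    """Return the available estimator names minus the excluded ones.

    The comparison is case-insensitive, on the assumption that the
    names sklearnex reports may differ in case from the class names.
    """
    skip = {name.lower() for name in excluded}
    return [name for name in available if name.lower() not in skip]


def patch_all_except(excluded):
    """Patch every estimator sklearnex offers except the excluded ones.

    Assumes sklearnex is installed; the import is done lazily so the
    helper above stays usable without it.
    """
    from sklearnex import get_patch_names, patch_sklearn

    selected = names_to_patch(get_patch_names(), excluded)
    patch_sklearn(selected)
    return selected


# Example with a made-up list of names, to show the filtering only.
print(names_to_patch(["pca", "kmeans", "SVC"], ["PCA"]))
```

Calling patch_all_except(["PCA"]) before importing from sklearn would then keep the stock PCA while still accelerating the other estimators, if the assumptions above hold.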

Eventually I would seek more PCA performance by other means, such as using GPUs with the PCA from RAPIDS/cuml.
