Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SKLEARN_NO_OPENMP not working #14332

Closed
adamjstewart opened this issue Jul 13, 2019 · 14 comments
Closed

SKLEARN_NO_OPENMP not working #14332

adamjstewart opened this issue Jul 13, 2019 · 14 comments

Comments

@adamjstewart
Copy link
Contributor

@adamjstewart adamjstewart commented Jul 13, 2019

Description

I'm trying to install scikit-learn from source on macOS. Sklearn recently added an OpenMP dependency, which I am still in the process of getting working. In the meantime, I tried setting SKLEARN_NO_OPENMP to TRUE, but found that compilation still fails.

Steps/Code to Reproduce

$ export SKLEARN_NO_OPENMP=TRUE
$ python setup.py build
$ python setup.py install

Expected Results

Successful compilation without OpenMP support.

Actual Results

I see the following error message during compilation:

sklearn/ensemble/_hist_gradient_boosting/splitting.c:614:10: fatal error: 'omp.h' file not found
#include <omp.h>
         ^~~~~~~
1 warning and 1 error generated.

All compilation is done using the Spack package manager. Full build logs are presented below. Don't worry about understanding the inner workings of Spack, I'm a Spack developer so I can handle that part.

Build Output
Build Environment

Versions

Dependency Version
OS macOS 10.14.5
Compiler Apple Clang 10.0.1
Platform Darwin-18.6.0-x86_64-i386-64bit
Python 3.7.3
NumPy 1.16.4
SciPy 1.2.1
Sklearn 0.21.2
Cython 0.29.7
@jeremiedbb

This comment has been minimized.

Copy link
Member

@jeremiedbb jeremiedbb commented Jul 13, 2019

Just a question: If you try to build without setting the environment variable, do you get the following error message ?

    It seems that scikit-learn cannot be built with OpenMP support.

        - Make sure you have followed the installation instructions:

            https://scikit-learn.org/dev/developers/advanced_installation.html

        - If your compiler supports OpenMP but the build still fails, please
          submit a bug report at:

            https://github.com/scikit-learn/scikit-learn/issues

        - If you want to build scikit-learn without OpenMP support, you can set
          the environment variable SKLEARN_NO_OPENMP and rerun the build
          command. Note however that some estimators will run in sequential
          mode and their `n_jobs` parameter will have no effect anymore.
@adamjstewart

This comment has been minimized.

Copy link
Contributor Author

@adamjstewart adamjstewart commented Jul 13, 2019

Yes, I do get that exact error message if no flags are set.

@jnothman

This comment has been minimized.

Copy link
Member

@jnothman jnothman commented Jul 14, 2019

@adamjstewart

This comment has been minimized.

Copy link
Contributor Author

@adamjstewart adamjstewart commented Jul 14, 2019

@jnothman I tried again and made sure I was building in a clean working directory and the problem still occurred, same error message.

@jeremiedbb

This comment has been minimized.

Copy link
Member

@jeremiedbb jeremiedbb commented Jul 16, 2019

@adamjstewart can you execute this:

SKLEARN_NO_OPENMP=TRUE python -c 'from sklearn._build_utils import check_openmp_support; print(check_openmp_support())'
@adamjstewart

This comment has been minimized.

Copy link
Contributor Author

@adamjstewart adamjstewart commented Jul 16, 2019

@jeremiedbb you want me to execute that in the build directory before cythonizing, or after the build fails?

@jeremiedbb

This comment has been minimized.

Copy link
Member

@jeremiedbb jeremiedbb commented Jul 16, 2019

Sorry this won't work if sklearn isn't correctly installed.
The thing to do is go to sklearn/_build_utils/ and then execute:

SKLEARN_NO_OPENMP=TRUE python -c 'from openmp_helpers import check_openmp_support; print(check_openmp_support())'

you can do it before or after trying to build, it does not matter.

@adamjstewart

This comment has been minimized.

Copy link
Contributor Author

@adamjstewart adamjstewart commented Jul 18, 2019

Sorry for the delay, here you go:

$ SKLEARN_NO_OPENMP=TRUE python -c 'from openmp_helpers import check_openmp_support; print(check_openmp_support())'
False
@jeremiedbb

This comment has been minimized.

Copy link
Member

@jeremiedbb jeremiedbb commented Jul 18, 2019

How did you get the sources of sklearn ? cloning the repo ?
Do you have a file named "PKG-INFO" in scikit-learn folder ? If not, the only possibility I can see is a cython bug where the compile_time_env is somehow ignored. It's hard to debug because I can't reproduce it :/

@adamjstewart

This comment has been minimized.

Copy link
Contributor Author

@adamjstewart adamjstewart commented Jul 18, 2019

The error message I reported above and the build logs I shared came from building the latest release tarball downloaded from PyPI. The check_openmp_support results I showed above are from both the latest release tarball and from the latest commit on the master branch of the repo. The tarball contains a PKG-INFO file, but the master branch does not. If you want, I can try building the master branch.

It's hard to debug because I can't reproduce it :/

You're also building on macOS with Apple Clang? Are you sure you don't have any omp.h file on the system? Let me see if I can share that splitting.c file so we can compare.

@jeremiedbb

This comment has been minimized.

Copy link
Member

@jeremiedbb jeremiedbb commented Jul 18, 2019

The error message I reported above and the build logs I shared came from building the latest release tarball downloaded from PyPI. [...] The tarball contains a PKG-INFO file.

In the tarball we ship the sources already cythonized. And they were cythonized with OpenMP support.
So from there there's no way to build without OpenMP support.
That's an issue. But we don't want to ship cythonized sources without OpenMP support. It would for sequential mode to users who actually have OpenMP.

Maybe it's possible to ship 2 different cythonized version, with and without OpenMP support, and the final build would choose the right one. What do you think @jnothman ?

If you want, I can try building the master branch.

Yes please, with the SKLEARN_NO_OPEN env var. This should work, hopefully :)

You're also building on macOS with Apple Clang? Are you sure you don't have any omp.h file on the system?

My tests were on the master branch from the repo, hence why I couldn't reproduce.

Let me see if I can share that splitting.c file so we can compare.

I guess it's not necessary since we now know the cause.

@adamjstewart

This comment has been minimized.

Copy link
Contributor Author

@adamjstewart adamjstewart commented Jul 18, 2019

Ah, that makes sense. Is there any way to re-cythonize the sources from the tarballs? Then you can ship cythonized versions with OpenMP support, and anyone who doesn't have OpenMP installed can re-cythonize them.

@adamjstewart

This comment has been minimized.

Copy link
Contributor Author

@adamjstewart adamjstewart commented Jul 18, 2019

Update: the master branch does indeed install fine with SKLEARN_NO_OPENMP, so the issue was that I was trying to install an already cythonized release tarball.

@jeremiedbb

This comment has been minimized.

Copy link
Member

@jeremiedbb jeremiedbb commented Jul 18, 2019

Is there any way to re-cythonize the sources from the tarballs?

as a temporary workaround, you can delete the PKG-INFO file and remove all problematic .c files, then build. Don't delete all .c files because some don't come from the cythonization.

We ship already cythonized sources to avoid the cython dependency on user installing from Pypi.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants
You can’t perform that action at this time.