diff --git a/doc/developers/advanced_installation.rst b/doc/developers/advanced_installation.rst index dc6824d8fb306..e341268fc04d2 100644 --- a/doc/developers/advanced_installation.rst +++ b/doc/developers/advanced_installation.rst @@ -1,9 +1,12 @@ .. _advanced-installation: -================================================================== -Installing the development version of scikit-learn (master branch) -================================================================== +================================================== +Installing the development version of scikit-learn +================================================== + +This section introduces how to install the **master branch** of scikit-learn. +This can be done by either installing a nightly build or building from source. .. _install_nightly_builds: @@ -12,7 +15,16 @@ Installing nightly builds The continuous integration servers of the scikit-learn project build, test and upload wheel packages for the most recent Python version on a nightly -basis to help users test bleeding edge features or bug fixes:: +basis. + +Installing a nightly build is the quickest way to: + +- try a new feature that will be shipped in the next release (that is, a + feature from a pull-request that was recently merged to the master branch); + +- check whether a bug you encountered has been fixed since the last release. + +:: pip install --pre -f https://sklearn-nightly.scdn8.secure.raxcdn.com scikit-learn @@ -20,129 +32,202 @@ basis to help users test bleeding edge features or bug fixes:: .. _install_bleeding_edge: Building from source -===================== +==================== + +Building from source is required to work on a contribution (bug fix, new +feature, code or documentation improvement). + +.. _git_repo: + +#. 
Use `Git `_ to check out the latest source from the + `scikit-learn repository `_ on + GitHub:: + + git clone git://github.com/scikit-learn/scikit-learn.git + cd scikit-learn + + If you plan on submitting a pull-request, you should clone from your fork + instead. + +#. Install a compiler with OpenMP_ support for your platform. See instructions + for :ref:`compiler_windows`, :ref:`compiler_macos`, :ref:`compiler_linux` + and :ref:`compiler_freebsd`. + +#. Optional (but recommended): create and activate a dedicated virtualenv_ + or `conda environment`_. -In the vast majority of cases, building scikit-learn for development purposes -can be done with:: +#. Install Cython_ and build the project with pip in :ref:`editable_mode`:: - pip install cython pytest flake8 + pip install cython + pip install --verbose --editable . -Then, in the main repository:: +#. Check that the installed scikit-learn has a version number ending with + `.dev0`:: - pip install --editable . + python -c "import sklearn; sklearn.show_versions()" -Please read below for details and more advanced instructions. +#. Please refer to the :ref:`developers_guide` and :ref:`pytest_tips` to run + the tests on the module of your choice. + +.. note:: + + You will have to re-run the ``pip install --editable .`` command every time + the source code of a Cython file is updated (ending in `.pyx` or `.pxd`). Dependencies ------------ -Scikit-learn requires: +Runtime dependencies +~~~~~~~~~~~~~~~~~~~~ + +Scikit-learn requires the following dependencies both at build time and at +runtime: - Python (>= 3.5), - NumPy (>= 1.11), - SciPy (>= 0.17), - Joblib (>= 0.11). +Those dependencies are **automatically installed by pip** if they were missing +when building scikit-learn from source. + .. note:: - For installing on PyPy, PyPy3-v5.10+, Numpy 1.14.0+, and scipy 1.1.0+ + For running on PyPy, PyPy3-v5.10+, Numpy 1.14.0+, and scipy 1.1.0+ are required. For PyPy, only installation instructions with pip apply.
+Build dependencies +~~~~~~~~~~~~~~~~~~ -Building Scikit-learn also requires +Building Scikit-learn also requires: -- Cython >=0.28.5 -- OpenMP +- Cython >= 0.28.5 +- A C/C++ compiler and a matching OpenMP_ runtime library. See the + :ref:`platform system specific instructions + ` for more details. .. note:: It is possible to build scikit-learn without OpenMP support by setting the ``SKLEARN_NO_OPENMP`` environment variable (before cythonization). This is not recommended since it will force some estimators to run in sequential - mode and their ``n_jobs`` parameter will be ignored. + mode. + +Since version 0.21, scikit-learn automatically detects and uses the linear +algebra library used by SciPy **at runtime**. Scikit-learn has therefore no +build dependency on BLAS/LAPACK implementations such as OpenBLAS, ATLAS, BLIS +or MKL. +Test dependencies +~~~~~~~~~~~~~~~~~ -Running tests requires +Running tests requires: -.. |PytestMinVersion| replace:: 3.3.0 +.. |PytestMinVersion| replace:: 4.6.2 - pytest >=\ |PytestMinVersion| Some tests also require `pandas `_. -.. _git_repo: -Retrieving the latest code --------------------------- - -We use `Git `_ for version control and -`GitHub `_ for hosting our main repository. - -You can check out the latest sources with the command:: - - git clone git://github.com/scikit-learn/scikit-learn.git +Building a specific version from a tag +-------------------------------------- If you want to build a stable version, you can ``git checkout `` to get the code for that particular version, or download an zip archive of the version from github. -Once you have all the build requirements installed (see below for details), -you can build and install the package in the following way. +.. _editable_mode: -If you run the development version, it is cumbersome to reinstall the -package each time you update the sources. Therefore it's recommended that you -install in editable mode, which allows you to edit the code in-place.
This -builds the extension in place and creates a link to the development directory -(see `the pip docs `_):: +Editable mode +------------- - pip install --editable . +If you run the development version, it is cumbersome to reinstall the package +each time you update the sources. Therefore it is recommended that you install +with the ``pip install --editable .`` command, which allows you to edit the +code in-place. This builds the extension in place and creates a link to the +development directory (see `the pip docs +`_). -.. note:: - - This is fundamentally similar to using the command ``python setup.py develop`` - (see `the setuptool docs `_). - It is however preferred to use pip. +This is fundamentally similar to using the command ``python setup.py develop`` +(see `the setuptools docs +`_). +It is however preferred to use pip. -.. note:: +On Unix-like systems, you can equivalently type ``make in`` from the top-level +folder. Have a look at the ``Makefile`` for additional utilities. - You will have to re-run:: +.. _platform_specific_instructions: - pip install --editable . +Platform-specific instructions +============================== - every time the source code of a compiled extension is changed (for - instance when switching branches or pulling changes from upstream). - Compiled extensions are Cython files (ending in `.pyx` or `.pxd`). +Here are instructions to install a working C/C++ compiler with OpenMP support +to build scikit-learn Cython extensions for each supported platform. -On Unix-like systems, you can equivalently type ``make in`` from the -top-level folder. Have a look at the ``Makefile`` for additional utilities. +.. _compiler_windows: -Mac OSX +Windows ------- -The default C compiler, Apple-clang, on Mac OSX does not directly support -OpenMP. We present two solutions to enable OpenMP support (you need to do only -one). +First, install `Build Tools for Visual Studio 2019 +`_. -.. note:: +.. 
warning:: - First, clean any previously built files in the source folder of - scikit-learn:: + You DO NOT need to install Visual Studio 2019. You only need the "Build + Tools for Visual Studio 2019", under "All downloads" -> "Tools for Visual + Studio 2019". - make clean +Secondly, find out if you are running 64-bit or 32-bit Python. The building +command depends on the architecture of the Python interpreter. You can check +the architecture by running the following in ``cmd`` or ``powershell`` +console:: -Using conda -~~~~~~~~~~~ + python -c "import struct; print(struct.calcsize('P') * 8)" -One solution is to install another compiler which supports OpenMP. If you use -the conda package manager, you can install the ``compilers`` meta-package from -the conda-forge channel, which provides OpenMP-enabled C compilers. +For 64-bit Python, configure the build environment with:: -It is recommended to use a dedicated conda environment to build scikit-learn -from source:: + SET DISTUTILS_USE_SDK=1 + "C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build\vcvarsall.bat" x64 + +Replace ``x64`` by ``x86`` to build for 32-bit Python. + +Please be aware that the path above might be different from user to user. The +aim is to point to the "vcvarsall.bat" file that will set the necessary +environment variables in the current command prompt. + +Finally, build scikit-learn from this command prompt:: + + pip install --verbose --editable . + +.. _compiler_macos: + +macOS +----- + +The default C compiler on macOS, Apple clang (confusingly aliased as +`/usr/bin/gcc`), does not directly support OpenMP. We present two alternatives +to enable OpenMP support: + +- either install `conda-forge::compilers` with conda; + +- or install `libomp` with Homebrew to extend the default Apple clang compiler. 
+ +macOS compilers from conda-forge +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +If you use the conda package manager, you can install the ``compilers`` +meta-package from the conda-forge channel, which provides OpenMP-enabled C/C++ +compilers based on the llvm toolchain. + +It is recommended to use a dedicated `conda environment`_ to build +scikit-learn from source:: conda create -n sklearn-dev python numpy scipy cython joblib pytest \ conda-forge::compilers conda-forge::llvm-openmp conda activate sklearn-dev + make clean pip install --verbose --editable . .. note:: @@ -166,19 +251,20 @@ variables:: echo $CXXFLAGS echo $LDFLAGS -They point to files and folders from your sklearn-dev conda environment +They point to files and folders from your ``sklearn-dev`` conda environment (in particular in the bin/, include/ and lib/ subfolders). -The compiled extensions should be built with the clang and clang++ compilers -with the ``-fopenmp`` command line flag. +In the log, you should see the compiled extension being built with the clang +and clang++ compilers installed by conda with the ``-fopenmp`` command line +flag. -Using homebrew -~~~~~~~~~~~~~~ +macOS compilers from Homebrew +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Another solution is to enable OpenMP support for the clang compiler shipped by default on macOS. -You first need to install the OpenMP library:: +You first need to install the OpenMP library using Homebrew_:: brew install libomp @@ -191,132 +277,89 @@ Then you need to set the following environment variables:: export CXXFLAGS="$CXXFLAGS -I/usr/local/opt/libomp/include" export LDFLAGS="$LDFLAGS -Wl,-rpath,/usr/local/opt/libomp/lib -L/usr/local/opt/libomp/lib -lomp" -Finally, build scikit-learn in verbose mode:: +Finally, build scikit-learn in verbose mode (to check for the presence of the +``-fopenmp`` flag in the compiler commands):: + make clean pip install --verbose --editable . 
-FreeBSD -------- - -The clang compiler included in FreeBSD 12.0 and 11.2 base systems does not -include OpenMP support. You need to install the `openmp` library from packages -(or ports):: - - sudo pkg install openmp - -This will install header files in ``/usr/local/include`` and libs in -``/usr/local/lib``. Since these directories are not searched by default, you -can set the environment variables to these locations:: - - export CFLAGS="$CFLAGS -I/usr/local/include" - export CXXFLAGS="$CXXFLAGS -I/usr/local/include" - export LDFLAGS="$LDFLAGS -Wl,-rpath,/usr/local/lib -L/usr/local/lib -lomp" - -Finally you can build the package using the standard command. - -For the upcomming FreeBSD 12.1 and 11.3 versions, OpenMP will be included in -the base system and these steps will not be necessary. - - -Installing build dependencies -============================= +.. _compiler_linux: Linux ----- -Installing from source without conda requires you to have installed the -scikit-learn runtime dependencies, Python development headers and a working -C/C++ compiler. Under Debian-based operating systems, which include Ubuntu:: +Linux compilers from the system +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - sudo apt-get install build-essential python3-dev python3-setuptools \ - python3-pip - -and then:: +Installing scikit-learn from source without using conda requires you to have +installed the Python development headers and a working C/C++ +compiler with OpenMP support (typically the GCC toolchain). - pip3 install numpy scipy cython +Install build dependencies for Debian-based operating systems, e.g. +Ubuntu:: -.. note:: + sudo apt-get install build-essential python3-dev python3-pip - In order to build the documentation and run the example code contains in - this documentation you will need matplotlib:: +then proceed as usual:: - pip3 install matplotlib + pip3 install cython + pip3 install --verbose --editable . 
-When precompiled wheels are not avalaible for your architecture, you can -install the system versions:: +Cython and the pre-compiled wheels for the runtime dependencies (numpy, scipy +and joblib) should automatically be installed in +``$HOME/.local/lib/pythonX.Y/site-packages``. Alternatively you can run the +above commands from a virtualenv_ or a `conda environment`_ to get full +isolation from the Python packages installed via the system packager. When +using an isolated environment, ``pip3`` should be replaced by ``pip`` in the +above commands. - sudo apt-get install cython3 python3-numpy python3-scipy python3-matplotlib +When precompiled wheels of the runtime dependencies are not available for your +architecture (e.g. ARM), you can install the system versions:: + + sudo apt-get install cython3 python3-numpy python3-scipy On Red Hat and clones (e.g. CentOS), install the dependencies using:: sudo yum -y install gcc gcc-c++ python-devel numpy scipy -.. note:: +Linux compilers from conda-forge +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - To use a high performance BLAS library (e.g. OpenBlas) see - `scipy installation instructions - `_. - -Windows -------- - -To build scikit-learn on Windows you need a working C/C++ compiler in -addition to numpy, scipy and setuptools. - -The building command depends on the architecture of the Python interpreter, -32-bit or 64-bit. You can check the architecture by running the following in -``cmd`` or ``powershell`` console:: - - python -c "import struct; print(struct.calcsize('P') * 8)" - -The above commands assume that you have the Python installation folder in your -PATH environment variable. - -You will need `Build Tools for Visual Studio 2017 -`_. - -.. warning:: - You DO NOT need to install Visual Studio 2019. - You only need the "Build Tools for Visual Studio 2019", - under "All downloads" -> "Tools for Visual Studio 2019". 
- -For 64-bit Python, configure the build environment with:: - - SET DISTUTILS_USE_SDK=1 - "C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Auxiliary\Build\vcvarsall.bat" x64 - -Please be aware that the path above might be different from user to user. -The aim is to point to the "vcvarsall.bat" file. - -And build scikit-learn from this environment:: - - python setup.py install - -Replace ``x64`` by ``x86`` to build for 32-bit Python. +Alternatively, install a recent version of the GNU C Compiler toolchain (GCC) +in the user folder using conda:: + conda create -n sklearn-dev numpy scipy joblib cython conda-forge::compilers + conda activate sklearn-dev + pip install --verbose --editable . -Building binary packages and installers ---------------------------------------- +.. _compiler_freebsd: -The ``.whl`` package and ``.exe`` installers can be built with:: +FreeBSD +------- - pip install wheel - python setup.py bdist_wheel bdist_wininst -b doc/logos/scikit-learn-logo.bmp +The clang compiler included in FreeBSD 12.0 and 11.2 base systems does not +include OpenMP support. You need to install the `openmp` library from packages +(or ports):: -The resulting packages are generated in the ``dist/`` folder. + sudo pkg install openmp +This will install header files in ``/usr/local/include`` and libs in +``/usr/local/lib``. Since these directories are not searched by default, you +can set the environment variables to these locations:: -Using an alternative compiler ------------------------------ + export CFLAGS="$CFLAGS -I/usr/local/include" + export CXXFLAGS="$CXXFLAGS -I/usr/local/include" + export LDFLAGS="$LDFLAGS -Wl,-rpath,/usr/local/lib -L/usr/local/lib -lomp" -It is possible to use `MinGW `_ (a port of GCC to Windows -OS) as an alternative to MSVC for 32-bit Python. Not that extensions built with -mingw32 can be redistributed as reusable packages as they depend on GCC runtime -libraries typically not installed on end-users environment. 
+Finally, build the package using the standard command:: -To force the use of a particular compiler, pass the ``--compiler`` flag to the -build step:: + pip install --verbose --editable . - python setup.py build --compiler=my_compiler install +For the upcoming FreeBSD 12.1 and 11.3 versions, OpenMP will be included in +the base system and these steps will not be necessary. -where ``my_compiler`` should be one of ``mingw32`` or ``msvc``. +.. _OpenMP: https://en.wikipedia.org/wiki/OpenMP +.. _Cython: https://cython.org +.. _Homebrew: https://brew.sh +.. _virtualenv: https://docs.python.org/3/tutorial/venv.html +.. _conda environment: https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html