scikit-learn: machine learning in Python




scikit-learn is a Python module for machine learning built on top of SciPy and distributed under the 3-Clause BSD license.

The project was started in 2007 by David Cournapeau as a Google Summer of Code project, and since then many volunteers have contributed. See the AUTHORS.rst file for a complete list of contributors.

It is currently maintained by a team of volunteers.

Note: scikit-learn was previously referred to as scikits.learn.

Important links


scikit-learn is tested to work under Python 2.6, Python 2.7, and Python 3.4, using the same codebase thanks to an embedded copy of six. It should also work with Python 3.3.

The required dependencies to build the software are NumPy >= 1.6.2, SciPy >= 0.9 and a working C/C++ compiler.

Running the examples requires Matplotlib >= 1.1.1, and running the tests requires nose >= 1.1.2.

This configuration matches the Ubuntu Precise 12.04 LTS release from April 2012.
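The minimum versions listed above can be checked before building. The following is a minimal sketch, not part of scikit-learn: the names `MINIMUM_VERSIONS`, `parse_version`, `meets_minimum`, and `check_dependencies` are hypothetical, and the sketch assumes each package exposes a `__version__` string.

```python
import importlib
import re

# Minimum versions quoted in this README (import name -> version tuple).
MINIMUM_VERSIONS = {
    "numpy": (1, 6, 2),
    "scipy": (0, 9),
    "matplotlib": (1, 1, 1),  # needed only for the examples
    "nose": (1, 1, 2),        # needed only for the tests
}


def parse_version(text):
    """Keep the leading integer of each dotted component: '1.8.0rc1' -> (1, 8, 0)."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'dev'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the version string `installed` is at least `minimum`."""
    found = parse_version(installed)
    width = max(len(found), len(minimum))
    # Pad with zeros so that (0, 9) and (0, 9, 0) compare as equal.
    found += (0,) * (width - len(found))
    minimum = tuple(minimum) + (0,) * (width - len(minimum))
    return found >= minimum


def check_dependencies(minimums=MINIMUM_VERSIONS):
    """Map each package name to 'ok', 'too old', 'unknown', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            module = importlib.import_module(name)
        except Exception:
            # Any import-time failure means the package is not usable here.
            report[name] = "missing"
            continue
        version = getattr(module, "__version__", None)
        if version is None:
            report[name] = "unknown"
        elif meets_minimum(version, minimum):
            report[name] = "ok"
        else:
            report[name] = "too old"
    return report


if __name__ == "__main__":
    for package, status in sorted(check_dependencies().items()):
        print("%-12s %s" % (package, status))
```

The check covers only the Python packages; it does not detect a C/C++ compiler or a system CBLAS.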

scikit-learn also uses CBLAS, the C interface to the Basic Linear Algebra Subprograms library. scikit-learn comes with a reference implementation, but the system CBLAS will be detected by the build system and used if present. CBLAS exists in many implementations; see Linear algebra libraries for known issues.


This package uses distutils, which is the default way of installing Python modules. To install in your home directory, use:

python setup.py install --user

To install for all users on Unix/Linux:

python setup.py build
sudo python setup.py install
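After either installation route, a quick import confirms which copy of the package Python picks up. This is a small sketch under the assumption that the package is importable as `sklearn` (the name used in the test commands in this README); `describe_install` is a hypothetical helper, not part of scikit-learn.

```python
import importlib


def describe_install(name="sklearn"):
    """Return (version, location) for an importable module, or None if it is absent."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    version = getattr(module, "__version__", "unknown")
    location = getattr(module, "__file__", "unknown")
    return version, location


if __name__ == "__main__":
    info = describe_install()
    if info is None:
        print("sklearn is not importable from this interpreter")
    else:
        print("sklearn %s loaded from %s" % info)
```

Run it from outside the source directory, so that the installed copy is imported and not the unbuilt source tree.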




You can check out the latest sources with the command:

git clone

or if you have write privileges:

git clone


Quick tutorial on how to go about setting up your environment to contribute to scikit-learn:

Before opening a Pull Request, have a look at the full Contributing page to make sure your code complies with our guidelines:


After installation, you can launch the test suite from outside the source directory (you will need to have the nose package installed):

$ nosetests -v sklearn

Under Windows, it is recommended to use the following command (adjust the path to the python.exe program), because the nosetests.exe program can interact badly with tests that use multiprocessing:

C:\Python34\python.exe -c "import nose; nose.main()" -v sklearn

See the web page for more information.

Random number generation can be controlled during testing by setting the SKLEARN_SEED environment variable.
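As an illustration of the idea, a test harness can read such a variable once at start-up and seed its generators from it. The sketch below is an assumption about how this could be done, not scikit-learn's actual implementation; `seed_from_environment` is a hypothetical name, and only Python's standard `random` module is seeded here, whereas a real test suite would also have to seed NumPy's generator.

```python
import os
import random


def seed_from_environment(environ=os.environ):
    """Seed Python's global RNG from SKLEARN_SEED and return the seed used.

    If the variable is unset, a fresh seed is drawn so that it can be
    reported and reused to reproduce a failing run.
    """
    raw = environ.get("SKLEARN_SEED")
    if raw is None:
        seed = random.randrange(2 ** 31 - 1)
    else:
        seed = int(raw)
    random.seed(seed)
    return seed


if __name__ == "__main__":
    print("Seeding RNG with %d" % seed_from_environment())
```

On a Unix shell, the variable can be set for a single run with `SKLEARN_SEED=42 nosetests -v sklearn`.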