Third party projects and code snippets

azrdev edited this page Apr 23, 2018 · 43 revisions

This page references projects and code snippets (gists) which are compatible with the scikit-learn API conventions.

See also the list in the scikit-learn repository at doc/related_projects.rst.
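As a rough illustration of the scikit-learn API conventions mentioned above, here is a minimal, dependency-free sketch of an estimator. The class name `MajorityClassifier` and its `fallback` parameter are hypothetical and chosen only for this example. Real projects typically subclass `sklearn.base.BaseEstimator`, which provides `get_params` and `set_params`, so they do not need to write those methods by hand.

```python
from collections import Counter


class MajorityClassifier:
    """Hypothetical estimator that always predicts the most frequent training label."""

    def __init__(self, fallback=None):
        # Convention: __init__ only stores hyperparameters and does no work.
        self.fallback = fallback

    def get_params(self, deep=True):
        # Convention: expose hyperparameters as a dict so tools can clone and tune.
        return {"fallback": self.fallback}

    def set_params(self, **params):
        # Convention: set hyperparameters by name and return self.
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"Invalid parameter {name!r}")
            setattr(self, name, value)
        return self

    def fit(self, X, y):
        # Convention: learned attributes end with an underscore; fit returns self.
        counts = Counter(y)
        self.majority_ = counts.most_common(1)[0][0] if counts else self.fallback
        return self

    def predict(self, X):
        # Convention: predict returns one output per input sample.
        return [self.majority_ for _ in X]


clf = MajorityClassifier().fit([[0], [1], [2]], ["a", "b", "b"])
predictions = clf.predict([[5], [6]])
```

Because `fit` returns the estimator itself, construction and training can be chained in one expression, which is the pattern most of the projects listed below rely on when they plug into pipelines or grid search.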


Different models and algorithms

  • auto_ml Automated machine learning for production and analytics, built on top of scikit-learn. Trains production-ready pipelines with all the standard ML steps built in, and prints verbose analytics
  • lightning Large-scale linear classification and regression in Python/Cython
  • libOPF Optimal path forest classifier
  • pyIPCA Incremental Principal Component Analysis
  • sklearn_pandas Bridge between scikit-learn pipelines and pandas data frames, with dedicated transformers.
  • py-earth Multivariate adaptive regression splines
  • HMMLearn Hidden Markov Models
  • sklearn-compiledtrees Generates a C++ implementation of the predict function for decision trees (and ensembles) trained by sklearn. Useful for latency-sensitive production environments.
  • glm-sklearn scikit-learn compatible wrapper around the GLM module in statsmodels.
  • Fast svmlight / libsvm file loader
  • pyensemble An implementation of Caruana et al.'s Ensemble Selection algorithm in Python, based on scikit-learn
  • seqlearn Sequence classification library (HMMs, structured perceptron)
  • lda Fast implementation of Latent Dirichlet Allocation in Cython
  • random-output-trees Multi-output random forest on randomised output space
  • nolearn scikit-learn compatible wrappers for neural net libraries, and other utilities.
  • sklearn-theano scikit-learn compatible tools using Theano
  • Sparse Filtering Unsupervised feature learning based on sparse-filtering
  • Kernel Regression Implementation of Nadaraya-Watson kernel regression with automatic bandwidth selection
  • Extreme Learning Machines Implementation of ELM (random layer + ridge) with a scikit-learn compatible interface.
  • gplearn Genetic Programming for symbolic regression tasks.
  • auto-sklearn Drop-in replacement for a scikit-learn estimator that performs automatic model and parameter selection
  • fastFM Fast factorization machine implementation compatible with scikit-learn
  • pyFM Another implementation of FMs in Python
  • kmodes k-modes clustering algorithm for categorical data, and several of its variations
  • sklearn-deap Use evolutionary algorithms instead of grid search in scikit-learn.
  • gp-extras Additional kernels that can be used in scikit-learn's GaussianProcessRegressor

Application-specific projects



Code snippets that do not follow the fit / predict / transform API.
