Third party projects and code snippets

Mathieu Blondel edited this page Feb 22, 2016 · 41 revisions

This page lists projects and code snippets (gists) that are compatible with the scikit-learn API conventions.
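As a rough illustration of what "compatible with the scikit-learn API conventions" tends to mean in practice, here is a minimal sketch of a toy estimator. The class name `MajorityClassifier` and its `fallback` parameter are hypothetical and do not come from any project listed here. To stay dependency-free, the sketch writes `get_params` / `set_params` by hand; real code would normally inherit them from `sklearn.base.BaseEstimator` instead.

```python
from collections import Counter


class MajorityClassifier:
    """Toy classifier that always predicts the most frequent training label.

    Hypothetical example; not part of scikit-learn or any listed project.
    """

    def __init__(self, fallback=None):
        # Convention: the constructor only stores hyperparameters,
        # without validating them or learning anything.
        self.fallback = fallback

    def get_params(self, deep=True):
        # Hand-written to avoid dependencies; inheriting from
        # sklearn.base.BaseEstimator would provide this method.
        return {"fallback": self.fallback}

    def set_params(self, **params):
        for name, value in params.items():
            if name not in self.get_params():
                raise ValueError(f"Invalid parameter {name!r}")
            setattr(self, name, value)
        return self

    def fit(self, X, y):
        counts = Counter(y)
        # Convention: learned state lives in attributes ending with "_".
        self.classes_ = sorted(counts)
        self.majority_ = counts.most_common(1)[0][0] if counts else self.fallback
        return self  # fit returns self so that calls can be chained

    def predict(self, X):
        # One prediction per input sample.
        return [self.majority_ for _ in X]


X = [[0.0], [1.0], [2.0], [3.0]]
y = ["a", "b", "b", "b"]
predictions = MajorityClassifier().fit(X, y).predict([[5.0], [6.0]])
```

Because `fit` returns `self` and hyperparameters are exposed through `get_params` / `set_params`, an object shaped like this can generally be cloned, chained, and tuned by generic tooling. Transformers follow the same pattern with `transform` in place of `predict`.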


Different models and algorithms

  • lightning Large-scale linear classification and regression in Python/Cython
  • libOPF Optimal path forest classifier
  • pyIPCA Incremental Principal Component Analysis
  • sklearn_pandas Bridge between scikit-learn pipelines and pandas data frames, with dedicated transformers.
  • py-earth Multivariate adaptive regression splines
  • HMMLearn Hidden Markov Models
  • sklearn-compiledtrees Generates a C++ implementation of the predict function for decision trees (and ensembles) trained by scikit-learn. Useful for latency-sensitive production environments.
  • glm-sklearn scikit-learn compatible wrapper around the GLM module in statsmodels.
  • Fast svmlight / libsvm file loader
  • pyensemble An implementation of Caruana et al.'s Ensemble Selection algorithm in Python, based on scikit-learn
  • seqlearn Sequence classification library (HMMs, structured perceptron)
  • lda Fast implementation of Latent Dirichlet Allocation in Cython (github)
  • random-output-trees Multi-output random forest on randomised output space
  • nolearn scikit-learn compatible wrappers for neural net libraries, and other utilities.
  • sklearn-theano scikit-learn compatible tools using Theano
  • Sparse Filtering Unsupervised feature learning based on sparse-filtering
  • Kernel Regression Implementation of Nadaraya-Watson kernel regression with automatic bandwidth selection
  • Extreme Learning Machines Implementation of ELM (random layer + ridge) with a scikit-learn compatible interface.
  • gplearn Genetic Programming for symbolic regression tasks.
  • auto-sklearn Drop-in replacement for a scikit-learn estimator that performs automatic model and parameter selection
  • fastFM Fast factorization machine implementation compatible with scikit-learn
  • pyFM Another implementation of FMs in Python
  • kmodes k-modes clustering algorithm for categorical data, and several of its variations
  • sklearn-deap Use evolutionary algorithms instead of grid search in scikit-learn.
  • gp-extras Additional kernels that can be used in scikit-learn's GaussianProcessRegressor

Application-specific projects



Code snippets that do not follow the fit / predict / transform API.