machine-learning.md

Problems

Classification

Binary classification

  • special case of: 'Classification'

Multiclass classification

  • special case of: 'Classification'

Regression

One-class classification

Positive-unlabeled learning

Novelty detection

Anomaly detection

Discrete sequence anomaly detection

  • also called: 'symbolic sequence anomaly detection'
  • type of: 'One-class classification'
  • domain: 'Machine learning'

Covariance matrix estimation

Sparse inverse covariance matrix estimation

  • also called: 'Sparse precision matrix estimation'
  • domain: 'statistics', 'Multivariate analysis'

ML related, but not really ML

GPipe

  • paper: 'GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism'
  • applications: 'Neural Network Scaling and Distribution'
  • input: 'Neural network'
  • implemented in: 'GPipe library'

Meta algorithms

Bootstrap aggregating

Machine learning models and algorithms

FairCo

  • paper: 'Controlling Fairness and Bias in Dynamic Learning-to-Rank' (2020) https://doi.org/10.1145/3397271.3401100
  • applications: 'Learning to rank'
  • domain: 'Machine learning', 'Information retrieval'
  • solves: 'Dynamic learning to rank'
  • implemented in: 'MarcoMorik/Dynamic-Fairness'

Dense Passage Retriever

  • also called: 'DPR'
  • solves: 'Information retrieval', 'Ranking'
  • uses: 'BERT', 'FAISS'
  • applications: 'Open-domain question answering'

Naive Bayes classifier

  • also called: 'Naïve Bayes classifiers'
  • solves: 'Classification'
  • is a: 'Linear classifier'
  • uses model: 'Naïve Bayes probability model'
  • decision rule: 'Maximum a posteriori estimation'
  • related: 'Logistic regression'
  • properties: often a decent classifier, but its predicted class probabilities are poorly calibrated

Gaussian Naive Bayes

  • usually optimized by: 'Maximum likelihood'
  • variant of: 'Naive Bayes classifier'
  • implemented in (libraries): 'sklearn.naive_bayes.GaussianNB'
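
As a minimal sketch of how the pieces listed above fit together (maximum-likelihood fitting plus the maximum a posteriori decision rule), the following pure-Python code estimates per-class means and variances and then picks the class with the highest log posterior. The function names `fit_gaussian_nb` and `predict`, the `var_floor` smoothing term, and the toy data are illustrative assumptions, not part of `sklearn.naive_bayes.GaussianNB`.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Maximum-likelihood estimates of class priors, feature means and variances."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)

    model = {}
    for label, rows in rows_by_class.items():
        columns = list(zip(*rows))
        means = [sum(col) / len(col) for col in columns]
        # var_floor (an assumption of this sketch) avoids division by zero
        variances = [
            sum((v - m) ** 2 for v in col) / len(col) + var_floor
            for col, m in zip(columns, means)
        ]
        log_prior = math.log(len(rows) / len(X))
        model[label] = (log_prior, means, variances)
    return model


def predict(model, x):
    """MAP decision rule: argmax over classes of log prior + summed log likelihoods."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, means, variances)
        )
        return log_prior + log_likelihood

    return max(model, key=log_posterior)


# Toy data: two well-separated clusters
X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [5.0, 6.0], [5.2, 5.8], [4.8, 6.2]]
y = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(X, y)
```

The "naive" part is the sum over features inside `log_posterior`: each feature contributes an independent Gaussian term given the class.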

Multinomial Naive Bayes

  • variant of: 'Naive Bayes classifier'
  • implemented in (libraries): 'sklearn.naive_bayes.MultinomialNB'

Bernoulli Naive Bayes

  • variant of: 'Naive Bayes classifier'
  • implemented in (libraries): 'sklearn.naive_bayes.BernoulliNB'

Linear discriminant analysis

  • also called: 'LDA', 'Normal discriminant analysis', 'NDA', 'Discriminant function analysis'
  • https://en.wikipedia.org/wiki/Linear_discriminant_analysis
  • related: 'Analysis of variance', 'Principal component analysis'
  • implemented in (libraries): 'sklearn.discriminant_analysis.LinearDiscriminantAnalysis'
  • quadratic variant: 'Quadratic discriminant analysis'
  • is a: 'Linear classifier'

Quadratic discriminant analysis

Bagging SVM

Structured support vector machine

Positive Unlabeled Random Forest

REINFORCE

  • also called: 'Monte-Carlo policy gradient'
  • paper: 'Simple statistical gradient-following algorithms for connectionist reinforcement learning' (1992)
  • is a: 'Policy gradient algorithm'
  • applications: 'Reinforcement learning'
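
The policy-gradient update behind REINFORCE can be illustrated on a deliberately tiny problem. The sketch below assumes a two-armed bandit with deterministic rewards, one-step episodes (so the return equals the immediate reward), a softmax policy over per-arm preferences, and no baseline; the names `reinforce_bandit`, `lr`, and `steps` are hypothetical choices for this example.

```python
import math
import random


def softmax(prefs):
    """Numerically stable softmax over a list of preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(rewards, steps=500, lr=0.1, seed=0):
    """Run REINFORCE on a bandit whose arm i always pays rewards[i]."""
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)  # one preference per arm
    for _ in range(steps):
        probs = softmax(theta)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        ret = rewards[action]  # Monte-Carlo return of the one-step episode
        for a in range(len(theta)):
            # d/d theta[a] of log pi(action) for a softmax policy
            grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
            theta[a] += lr * ret * grad_log_pi
    return softmax(theta)


policy = reinforce_bandit([0.0, 1.0])
```

After training, `policy` puts most of its probability on the arm that pays a reward. In practice a baseline is usually subtracted from the return to reduce variance.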

Actor-Critic

  • paper: 'Neuronlike adaptive elements that can solve difficult learning control problems' (1983)
  • type: 'Temporal-Difference Learning'

Advantage Actor-Critic

  • also called: 'Actor Advantage Critic', 'A2C'

Asynchronous Advantage Actor-Critic

  • also called: 'A3C'
  • paper: 'Asynchronous Methods for Deep Reinforcement Learning' (2016)
  • is a: 'Policy gradient algorithm'
  • applications: 'Reinforcement learning'

Soft Actor-Critic

  • also called: 'SAC'
  • paper: 'Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor' (2018)
  • based on: 'Actor-Critic'
  • applications: 'Reinforcement learning'
  • properties: 'off-policy'

Q-learning

Deep Q-Networks

  • also called: 'DQN'
  • paper: 'Playing Atari with Deep Reinforcement Learning' (2013)
  • based on: 'Q-learning'
  • properties: 'value based'
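
DQN replaces the Q-table with a neural network, but the underlying Q-learning update is easiest to see in tabular form. The sketch below assumes a hypothetical four-state chain (move left or right, reward 1 on reaching the last state) and, instead of epsilon-greedy exploration, sweeps over every state-action pair so the result is deterministic. The function name and the `alpha`, `gamma`, and `sweeps` parameters are illustrative.

```python
def q_learning_chain(n_states=4, gamma=0.9, alpha=0.5, sweeps=200):
    """Tabular Q-learning on a deterministic chain with a rewarding goal state."""
    LEFT, RIGHT = 0, 1
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]

    def step(state, action):
        # Moving left from state 0 leaves the agent in state 0
        nxt = max(state - 1, 0) if action == LEFT else state + 1
        reward = 1.0 if nxt == goal else 0.0
        return nxt, reward, nxt == goal

    for _ in range(sweeps):
        for state in range(goal):  # the goal state is terminal
            for action in (LEFT, RIGHT):
                nxt, reward, done = step(state, action)
                # Off-policy target: bootstrap from the best next action
                target = reward if done else reward + gamma * max(q[nxt])
                q[state][action] += alpha * (target - q[state][action])
    return q


q = q_learning_chain()
# Greedy action per non-terminal state (0 = left, 1 = right)
greedy = [max((0, 1), key=lambda a: q[s][a]) for s in range(3)]
```

The `max(q[nxt])` term is what makes the method value based and off-policy: the target uses the greedy next action regardless of how the transition was generated.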

Self-critical sequence training

  • also called: 'SCST'
  • paper: 'Self-critical Sequence Training for Image Captioning' (2016)
  • form of: 'REINFORCE'

Sparse FC

  • paper: 'Kernelized Synaptic Weight Matrices'
  • applications: 'Collaborative Filtering', 'Recommender Systems'
  • compare: 'I-AutoRec', 'CF-NADE', 'I-CFN', 'GC-MC'

BERT

  • also called: 'Bidirectional Encoder Representations from Transformers'
  • paper: 'BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding'
  • applications: 'Natural Language Processing', 'Question Answering', 'Natural Language Inference', 'Sentiment Analysis'
  • is a: 'Masked Language Model', 'Semantic hash', 'Trained hash'
  • implemented in: 'google-research/bert'
  • trained on: 'Cloze task', 'Next Sentence Prediction'
  • based on: 'Transformer'
  • domain: 'Unsupervised machine learning'

OpenAI GPT

  • also called: 'Generative Pre-trained Transformer'
  • paper: 'Improving Language Understanding by Generative Pre-Training'
  • based on: 'Transformer'
  • is a: 'Language Model'
  • domain: 'Unsupervised machine learning'

GPT-2

DrQA

  • paper: 'Reading Wikipedia to Answer Open-Domain Questions'
  • applications: 'Extractive Question Answering'

HyperQA

  • paper: 'Hyperbolic Representation Learning for Fast and Efficient Neural Question Answering'
  • applications: 'Multiple Choice Question Answering / Answer selection'

RMDL

  • paper: 'RMDL: Random Multimodel Deep Learning for Classification' (2018)
  • applications: 'Document Classification', 'Image Classification'
  • implemented in: 'Python kk7nc/RMDL'

MSRN

Automatic Differentiation Variational Inference

  • also called: 'ADVI'
  • paper: 'Automatic Variational Inference in Stan' (2015)
  • implemented in: 'Stan'
  • does: 'approximate Bayesian inference'

Gradient Tree Boosting

  • also called: 'Gradient boosting machine' (GBM), 'Gradient boosted regression tree' (GBRT)
  • also called: 'Gradient Boosting Decision Tree' (GBDT), 'Multiple Additive Regression Trees' (MART)
  • https://en.wikipedia.org/wiki/Gradient_boosting#Gradient_tree_boosting
  • implemented in: 'xgboost', 'LightGBM', 'pGBRT', 'catboost'
  • applications: 'Learning to rank'
  • easily distributable
  • properties: 'non-parametric'

Elastic net regularization

Tikhonov regularization

LASSO regularization

  • https://en.wikipedia.org/wiki/Lasso_(statistics)
  • also called: 'Least absolute shrinkage and selection operator'
  • applications: 'regression analysis'
  • implemented in: 'sklearn.linear_model.Lasso' (fitted with 'Coordinate descent')
  • properties: 'linear'
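
To make the coordinate descent fitting mentioned above concrete, here is a minimal pure-Python sketch. It assumes the objective (1 / (2 n)) * ||y - Xw||^2 + alpha * ||w||_1 without an intercept, which is the scaling documented for `sklearn.linear_model.Lasso`; the function names and the toy data are hypothetical, and this is not the library's implementation.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values within the band become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Cyclic coordinate descent for the L1-penalized least-squares objective."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j
            rho = 0.0
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
            rho /= n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, alpha) / z if z > 0 else 0.0
    return w


# Toy data generated as y = 3 * x0 + 0 * x1
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [3.0, 0.0, 3.0, 6.0]

w_unpenalized = lasso_coordinate_descent(X, y, alpha=0.0)
w_sparse = lasso_coordinate_descent(X, y, alpha=1.0)
w_all_zero = lasso_coordinate_descent(X, y, alpha=100.0)
```

The soft-thresholding step is what produces exact zeros: with a moderate penalty the irrelevant coefficient is set to zero while the relevant one is shrunk, which is the "selection" and "shrinkage" in the name.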

Isomap

UMAP

  • also called: 'Uniform Manifold Approximation and Projection'
  • paper: 'UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction' (2018) https://arxiv.org/abs/1802.03426
  • solves: 'Nonlinear dimensionality reduction'
  • applications: 'Data visualization'
  • implemented in: 'Python umap-learn'

Self-organizing map

  • also called: 'Kohonen map', 'Kohonen network'
  • https://en.wikipedia.org/wiki/Self-organizing_map
  • type of: 'Artificial neural network'
  • unsupervised
  • solves: 'Nonlinear dimensionality reduction', 'Dimensionality reduction'
  • applications: 'Data visualization'
  • implemented in: 'Python mvpa2.mappers.som.SimpleSOMMapper', 'Python Bio.Cluster.somcluster'
  • paradigm: 'Competitive learning'

t-SNE

Barnes-Hut t-SNE

Recursive autoencoder

  • also called: 'RAE'
  • paper: 'Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection' (2011)
  • applications: 'Paraphrase detection'

Conditional Variational Autoencoder

  • also called: 'CVAE'
  • paper: 'Learning Structured Output Representation using Deep Conditional Generative Models' (2015)

OneClass SVM

  • also called: 'OCSVM'
  • solves: 'Novelty detection'
  • implemented in: 'libsvm', 'sklearn.svm.OneClassSVM'
  • uses: 'Support estimation'
  • applications: 'One-class classification'

OneClass CRF

  • also called: 'OCCRF', 'One-class conditional random fields'
  • paper: 'One-class conditional random fields for sequential anomaly detection' (2013) https://dl.acm.org/doi/10.5555/2540128.2540370
  • applications: 'One-class classification', 'Sequence anomaly detection'
  • version of: 'Conditional random field'

Support Vector Data Description

Support-vector clustering

Chinese Whispers

ID3 algorithm

C4.5 algorithm

Classification and regression tree

DBSCAN

Mini-batch k-Means clustering

BIRCH

  • also called: 'Balanced iterative reducing and clustering using hierarchies'
  • paper: 'BIRCH: an efficient data clustering method for very large databases' (1996) https://doi.org/10.1145/235968.233324
  • https://en.wikipedia.org/wiki/BIRCH
  • type: 'unsupervised'
  • implemented in: 'sklearn.cluster.Birch'
  • properties: 'online'
  • alternative to: 'Mini-batch k-Means clustering'
  • domain: 'data mining'
  • applications: 'Hierarchical clustering'

OPTICS algorithm

  • also called: 'Ordering points to identify the clustering structure algorithm'
  • paper: 'OPTICS: Ordering Points To Identify the Clustering Structure' (1999)
  • https://en.wikipedia.org/wiki/OPTICS_algorithm
  • is a: 'Density-based clustering algorithm'
  • generalization of: 'DBSCAN'

SUBCLU

Robust Random Cut Forest

  • also called: 'RRCF', 'Random Cut Forest', 'RCF'
  • paper: 'Robust random cut forest based anomaly detection on streams' (2016) https://dl.acm.org/doi/proceedings/10.5555/3045390
  • implemented in (cloud): 'AWS sagemaker.RandomCutForest'
  • implemented in: 'kLabUM/rrcf'
  • applications: 'Anomaly detection'
  • type: 'unsupervised'

Isolation Forest

Extended Isolation Forest

Local Outlier Factor

  • also called: 'LOF'
  • paper: 'LOF: identifying density-based local outliers'
  • solves: 'Novelty detection', 'Anomaly detection'
  • https://en.wikipedia.org/wiki/Local_outlier_factor
  • implemented in: 'sklearn.neighbors.LocalOutlierFactor', 'ELKI'
  • is a: 'Nearest neighbor method'

Elliptic Envelope

  • paper: 'A Fast Algorithm for the Minimum Covariance Determinant Estimator' (1998)
  • solves: 'Anomaly detection'
  • implemented in: 'sklearn.covariance.EllipticEnvelope'
  • input: 'Normally distributed data' with n_samples > n_features ** 2
  • uses: 'Covariance estimation'

Graphical Lasso

Ledoit-Wolf estimator

  • also called: 'LW'
  • paper: 'A well-conditioned estimator for large-dimensional covariance matrices' (2004) https://doi.org/10.1016/S0047-259X(03)00096-4
  • implemented in: 'sklearn.covariance.LedoitWolf'
  • applications: 'Covariance matrix estimation'
  • is a: 'well-conditioned estimator'
  • assumes distribution: None
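
For orientation, shrinkage estimators of this kind are commonly written as a convex combination of the empirical covariance and a scaled identity matrix. The symbols below are assumptions introduced for this note: \hat{\Sigma} is the empirical covariance, p the number of features, and \alpha the shrinkage coefficient.

```latex
\hat{\Sigma}_{\text{shrunk}}
  = (1 - \alpha)\,\hat{\Sigma}
  + \alpha\,\frac{\operatorname{tr}(\hat{\Sigma})}{p}\, I_p,
\qquad 0 \le \alpha \le 1
```

The Ledoit-Wolf estimator chooses \alpha from the data to minimize the expected squared error against the true covariance. Pulling the eigenvalues toward their mean is what keeps the estimate well-conditioned even when p is large relative to the number of samples.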

Oracle Approximating Shrinkage estimator

  • also called: 'OAS'
  • paper: 'Shrinkage Algorithms for MMSE Covariance Estimation' (2010) https://doi.org/10.1109/TSP.2010.2053029
  • implemented in: 'sklearn.covariance.OAS'
  • applications: 'Covariance matrix estimation'
  • improvement of: 'Ledoit-Wolf estimator'
  • assumes distribution: 'Gaussian'

Minimum Covariance Determinant

FAST-MCD algorithm

Random forest

Random Forest regression

AdaBoost

AdaBoost-SAMME

AdaBoost.R2

Multi-Channel Convolutional Neural Network

  • also called: 'MCCNN'
  • paper: 'Question Answering on Freebase via Relation Extraction and Textual Evidence' (2016) https://doi.org/10.18653/v1/P16-1220
  • applications: 'Relation Extraction'