A cheat sheet for Machine Learning and Statistics topics. The purpose of this repo is to prepare for interviews and offer resources for initial problem statements.

cmiley/ml-stat-cheat-sheet

ML/Stat Cheat Sheet (IN PROGRESS)

A cheat sheet for Machine Learning and Statistics topics. The purpose of this repo is to prepare for interviews and offer resources for initial problem statements. The format is a list of topics (not in any particular order) that are relevant to machine learning and statistical modeling. Each topic will include a short description, some resources related to the topic, and any tags that may be useful for those searching the page as an index.

If you think something should change, whether it is an error in a topic description, a relevant topic to add, or something else, please create a pull request and I will add you as a contributor to that topic.

Contributors: Cayler Miley

Topics

  • Singular Value Decomposition (SVD)
  • Maximum Likelihood Estimation (MLE)
  • Bayesian Data Analysis (BDA)
  • Principal Component Analysis (PCA)
  • t-SNE Visualization
  • Uniform Manifold Approximation and Projection (UMAP)
  • Agglomerative Clustering
  • Decision Trees
  • Random Forest (Mean Trees)
  • Random Forest (Quantile Trees)
  • Linear Regression
  • Lasso Regression
  • Factor Analysis
  • Multivariate Multiple Regression
  • Least Squares Estimation
  • ANOVA & MANOVA
  • Fisher's Method
  • Minkowski Metric
  • Euclidean Distance
  • Canberra Metric
  • Czekanowski Coefficient
  • Complete Linkage
  • Average Linkage
  • Ward's Hierarchical Clustering
  • k-Means Clustering
  • Multi-Dimensional Scaling
  • Correspondence Analysis
  • Biplots
  • Procrustes Analysis
  • Confidence Intervals
  • Normalization
  • Bernoulli Distribution
  • Binomial Distribution
  • Poisson Distribution
  • Normal Distribution
  • Lognormal Distribution
  • Exponential Distribution
  • Central Limit Theorem
  • Hypothesis Testing
  • Factorial Experiments
  • Hypergeometric Distribution
  • Geometric Distribution
  • Negative Binomial Distribution
  • Multinomial Distribution
  • Uniform Distribution
  • Erlang Distribution
  • Gamma Distribution
  • Weibull Distribution
  • Markov Chain
  • Numerical Integration
  • Importance Sampling
  • Gibbs Sampler
  • Metropolis and Metropolis-Hastings Algorithm
  • Hamiltonian Monte Carlo
  • Variational Inference
  • Expectation Propagation
  • Expectation Maximization
  • Robust Inference
  • Multiple Imputation
  • Splines and Weighted Sums
  • Gaussian Process Regression
  • Latent Gaussian Process
  • Finite Mixture Modeling
  • Dirichlet Process Modeling
  • Moore-Penrose Pseudoinverse
  • Trace Operator
  • Marginal Probability
  • Conditional Probability
  • Chain Rule for Probability
  • Bayes' Rule
  • Underflow
  • Overflow
  • Hyperparameters
  • Stochastic Gradient Descent (SGD)
  • Deep Feedforward Networks
  • Perceptron
  • Back-Propagation
  • Parameter Norm Penalty
  • Regularization
  • Noise Robustness
  • Multitask Learning
  • Early Stopping
  • Bagging
  • Ensemble Methods
  • Dropout
  • Adversarial Training
  • Generative Adversarial Network (GAN)
  • Tangent Distance, Tangent Prop, and Manifold Tangent Classifier
  • Difference Between Learning and Optimization
  • Convex Optimization
  • Definition of Convex
  • Convolutional Neural Network (CNN)
  • Pooling
  • Efficient Convolutions
  • Recurrent Neural Network (RNN)
  • Encoder-Decoder Sequence to Sequence
  • Leaky Units
  • Echo State Networks
  • Long Short-Term Memory (LSTM) and Gated Units
  • Sparse Coding
  • Independent Component Analysis
  • Autoencoders
  • Representation Learning
  • Log-Likelihood Gradient
  • Contrastive Divergence
  • Pseudolikelihood
  • Partition Function
  • Deep Generative Modeling
  • Boltzmann Machines and Restricted Boltzmann Machines (RBM)
  • Deep Belief Network
  • Boosting
  • Transformer
  • Attention
  • Multimodal
  • Classification vs. Regression
  • Hyperparameter Tuning
  • Graph Neural Networks (GNN)
  • Sampling
  • k-Nearest Neighbors (KNN)
  • Support Vector Machine (SVM)
  • Levenshtein Distance
  • Edit Distance
