RoozbehSanaei/Stat-and-ML-Blogs

Statistics and Machine Learning Blogs

The objective of this compilation is to bring together a variety of resources that provide straightforward and accessible explanations of fundamental principles in statistics and machine learning.

Definitions

Probability

Set theory
Venn Diagrams
Probability Axioms
pdf, cdf, ppf
Quantiles
Experiment, Sample space, Event, Probability function, Random variable
Properties of cdf and pdf
Transformations of Random Variables
Joint Probability Distribution
Expected Values, Properties of Expected Value
Variance Values, Properties of Variance
Bayes Rule

Probability Distributions

Univariate distributions
Bernoulli distribution
Binomial distribution
Continuous Uniform Distribution
Poisson distribution
Exponential distribution

Statistics

General Concepts

Probability vs Statistics
Likelihoodist, Bayesian, and Frequentist Methods
Mathematical Basis of Bayesian vs Frequentist Debate

Correlation

Covariance and Correlation
Pearson Correlation
Partial Correlation
Kendall Rank Correlation
Wald-Wolfowitz Runs Test

Estimators

Estimators
Difference between an estimator and a statistic
Maximum Likelihood Estimation
MLE vs MAP Estimation
MLE vs MAP Bayesian Inference

Hypothesis Testing

Law of Large Numbers
Difference between Strong and Weak laws of large numbers
Central Limit Theorem
Effect Size, p-value
Right, Left, and Two tailed test
Type I and Type II Errors
SVD and PCA

Z-test

Z-Score
One Proportion Z-test
Two Sample Z-test

t-test

t-distribution
Paired t-test
Unpaired t-test, Pooled t-test, Welch's t-test

Chi-square test

Chi-square distribution
Pearson's Theorem
Chi-square test

Non-Parametric Tests

Nonparametric Tests vs. Parametric Tests
Mann Whitney U Test (Wilcoxon Rank Sum Test)
Wilcoxon Signed Rank Test
Sign Test
The Kruskal-Wallis Test
Permutation Test

Linear Regression

Ordinary Least Squares through minimising the sum of square errors
Projection and Orthogonality
Method of Moments
Linear Regression as Maximum Likelihoods
Regression vs Correlation coefficients
Bayesian Linear Regression
Applying SVD to Linear Regression
Linear Regression Metrics

Multi Linear Regression

Variance Inflation Factors
Multi Linear Regression and multicollinearity and also

Bias-Variance

Bias-Variance Decomposition of the Squared Loss
Bias-Variance Trade-off and Double Descent
Regularization: the path to Bias-Variance Trade-off

Multiple Hypothesis Testing

F-Test

F-distribution
General Linear F-test
Calculating F-Statistic
Coding Systems For Categorical Variables

ANOVA

What is ANOVA
One-Way ANOVA
ANOVA mathematical model
ANOVA Assumptions
Linear Combinations and Contrasts
Fixed Effect, Random Effect and Mixed Effect models
Factorial and Unbalanced ANOVA
ANCOVA

Multiple Comparison Problem

Multiple Comparison Problem
Bonferroni’s Correction
Holm’s Step-Down and Hochberg’s Step-Up Procedure
Studentized range distribution
Tukey's Range Test

Multivariate Hypothesis Testing

MANOVA
PCA
Factor Analysis
Canonical Analysis

Structural Equation Modeling

Basics
Tutorial

Statistical Paradoxes

Monty Hall
Russell's Paradox

Bayesian Statistics

Bayesian Learning
A/B testing, Bayesian
Hierarchical Modeling

Bayesian Samplers

Rejection Sampling
Importance Sampling
Inverse Transform Sampling
The Metropolis-Hastings algorithm and also
Gibbs Sampling
Gibbs Sampling as a Special Case of Metropolis–Hastings

Causal Inference

Structural Causal Models
Chains and Forks
Colliders
d-separation
Model Testing and Causal Search
Interventions
The Adjustment Formula
Backdoor Criterion
Front-door Criterion

Gaussian Process
Bootstrapping

Machine Learning

Decision Trees

Decision Trees
ID3, C4.5, C5.0, CART decision tree difference
C4.5 and C5.0 Algorithm
ID3 Algorithm
Pruning
Gini Impurity, Entropy, Classification Error

Expectation Maximization (Kmeans, and GMM)

K-means clustering
Gaussian Mixture Modeling

Support Vector Machines

Support Vector Machine
SVM vs logistic regression

Ensemble Methods

Ensemble methods: bagging, boosting and stacking
Adaboost
Gradient Boosting

Explanation Methods

LIME
Shapley and SHAP
Counterfactual Explanations
Global Surrogate

Time Series Modeling

ARIMA
SARIMA, SARIMAX
Prophet
Forecasting: Principles and Practice

Anomaly Detection

General Introduction
Isolation Forest
One Class SVM
Local Outlier Factor
Robust Covariance Estimator

Data

Data Cleaning
Imbalanced datasets
Data Set Shift
Covariate Shift

Data Splitting

The Importance of Data Splitting
Training, Development and Test errors

Deep Learning

My slides on Convolutional Neural Networks
My slides on Sequence Models

Transformers

Mechanics of Seq2seq Models With Attention
The illustrated transformer
Line-by-line implementation of “Attention is All You Need”
Illustrated GPT-2
Decoding Strategies
