Machine Learning

A 15-module course on classical machine learning: regression, classification, ensemble methods, and how to choose between them. Each module is a self-contained Jupyter notebook that walks through one algorithm from problem framing to working code, with a worked example you can compute by hand and a "when it breaks" section that names the failure modes you'll actually hit.

The course is designed to be read in order, but every module also stands on its own as a reference.

What's inside

Modules

#	Topic
00	Introduction — what machine learning is, the workflow, and how to use this course
01	Linear Regression
02	Polynomial & Regularized Regression
03	Logistic Regression
04	Support Vector Machines
05	Decision Trees
06	Bagging
07	Random Forests
08	AdaBoost
09	Gradient Boosting
10	Stacking & Voting
11	k-Nearest Neighbors
12	Naive Bayes
13	Algorithm Selection — given a problem, which algorithm should you reach for?
14	Model Comparison & Evaluation — capstone on metrics, validation, and head-to-head comparison

Reference notebooks (`references/`)

Concept-focused notebooks the modules cross-link to. Read them when a module assumes a concept you want to revisit.

correlation.ipynb
feature-encoding.ipynb
gradient-descent.ipynb
learning-rate.ipynb
loss-functions.ipynb
n-grams.ipynb
regularization.ipynb
standardization.ipynb
statistical-inference.ipynb
tf-idf.ipynb

Datasets (`data/`)

File	Used in	Description
`california-housing.csv`	Modules 00–02 (regression)	20,640 California Census block groups, 8 features, median house value target
`breast-cancer.csv`	Modules 03–12 (classification)	569 samples, 30 features, malignant/benign target (Wisconsin Diagnostic)
`breast-cancer-explore.csv`	Exploration	A small subset of the same breast cancer data, for hand-readable walkthroughs

See data/README.md for full dataset documentation, sources, and the pattern for adding new ones.

Running the notebooks

You'll need Python 3.10+ and the standard scientific stack. From the repo root:

pip install jupyter numpy pandas scikit-learn matplotlib seaborn
jupyter notebook

Then open any .ipynb. Notebooks load data from the local data/ directory, so they work offline.
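
If a notebook fails on its first import cell, a quick environment check can narrow down the cause. The sketch below is not part of the course; it assumes the packages from the pip command above, and the names `REQUIRED_MODULES`, `MIN_PYTHON`, and `missing_modules` are illustrative. Note that scikit-learn is imported as `sklearn`. Jupyter itself is left out because it is the launcher rather than something the notebooks import.

```python
import importlib.util
import sys

# Import names for the libraries in the pip command (assumed list;
# scikit-learn's import name is "sklearn").
REQUIRED_MODULES = ["numpy", "pandas", "sklearn", "matplotlib", "seaborn"]
MIN_PYTHON = (3, 10)


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


python_ok = sys.version_info >= MIN_PYTHON
missing = missing_modules(REQUIRED_MODULES)

print("Python version OK:", python_ok)
print("Missing modules:", ", ".join(missing) if missing else "none")
```

Anything listed as missing can be installed by rerunning the pip command from the repo root.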

How the modules are structured

Every algorithm module follows the same arc:

Problem — the situation where this algorithm earns its keep
Model — what it predicts and what shape its decisions take
How it learns — the training procedure, formulas, and intuition
Superpower — what this algorithm does better than its peers
When it breaks — the failure modes, visually and quantitatively
Worked example — a 10-row dataset you can compute by hand
Code — full scikit-learn implementation with sensible defaults
Comparisons — head-to-head tables against algorithms covered earlier
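
To give a feel for the "Code" step, here is a minimal sketch of what a scikit-learn implementation with sensible defaults might look like. It is an illustration, not a cell copied from any module. It assumes scikit-learn's bundled copy of the Wisconsin Diagnostic data (`load_breast_cancer`) rather than `data/breast-cancer.csv`, and the constant names are hypothetical.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative settings, named so they are easy to find and change.
TEST_FRACTION = 0.25
RANDOM_SEED = 0
REGULARIZATION_C = 1.0  # inverse regularization strength
MAX_ITERATIONS = 1000

# 569 samples, 30 features, binary target.
X, y = load_breast_cancer(return_X_y=True)

# Hold out a stratified test set so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=TEST_FRACTION, random_state=RANDOM_SEED, stratify=y
)

# Standardize features, then fit a logistic regression classifier.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(C=REGULARIZATION_C, max_iter=MAX_ITERATIONS),
)
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy: {test_accuracy:.3f}")
```

Putting the scaler inside the pipeline means it is fit on the training split only, which keeps test-set statistics out of training.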

Formulas are followed by a "Reading this formula" block that names every symbol. Tunable numbers (learning rates, depths, regularization strengths) are named variables, not magic literals.
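
As a rough illustration of the named-variables convention, the sketch below fits a line by gradient descent in plain Python. The toy data, the constant names, and the `fit_line` helper are all made up for this example and do not come from the notebooks.

```python
# Toy data chosen so the exact answer is known: y = 2x + 1.
XS = [0.0, 1.0, 2.0, 3.0, 4.0]
YS = [1.0, 3.0, 5.0, 7.0, 9.0]

# Tunable numbers live in named variables, not inline literals.
LEARNING_RATE = 0.05
N_STEPS = 2000


def fit_line(xs, ys, learning_rate, n_steps):
    """Fit y = w*x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_steps):
        # Residuals of the current line on every point.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum((w*x + b - y)**2).
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = fit_line(XS, YS, LEARNING_RATE, N_STEPS)
print(f"w = {w:.3f}, b = {b:.3f}")
```

With these settings the fit should settle very close to w = 2 and b = 1. Changing `LEARNING_RATE` in one place is enough to see how the step size affects convergence.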
