polleoai/machinelearning

Machine Learning

A 15-module course on classical machine learning: regression, classification, ensemble methods, and how to choose between them. Each algorithm module is a self-contained Jupyter notebook that walks through one algorithm from problem framing to working code, with a worked example you can compute by hand and a "when it breaks" section that names the failure modes you'll actually hit.

The course is designed to be read in order, but every module also stands on its own as a reference.

What's inside

Modules

| # | Topic |
|---|---|
| 00 | Introduction: what machine learning is, the workflow, and how to use this course |
| 01 | Linear Regression |
| 02 | Polynomial & Regularized Regression |
| 03 | Logistic Regression |
| 04 | Support Vector Machines |
| 05 | Decision Trees |
| 06 | Bagging |
| 07 | Random Forests |
| 08 | AdaBoost |
| 09 | Gradient Boosting |
| 10 | Stacking & Voting |
| 11 | k-Nearest Neighbors |
| 12 | Naive Bayes |
| 13 | Algorithm Selection: given a problem, which algorithm should you reach for? |
| 14 | Model Comparison & Evaluation: capstone on metrics, validation, and head-to-head comparison |

Reference notebooks (references/)

Concept-focused notebooks the modules cross-link to. Read them when a module assumes a concept you want to revisit.

  • correlation.ipynb
  • feature-encoding.ipynb
  • gradient-descent.ipynb
  • learning-rate.ipynb
  • loss-functions.ipynb
  • n-grams.ipynb
  • regularization.ipynb
  • standardization.ipynb
  • statistical-inference.ipynb
  • tf-idf.ipynb

Datasets (data/)

| File | Used in | Description |
|---|---|---|
| california-housing.csv | Modules 00–02 (regression) | 20,640 California Census block groups, 8 features, median house value target |
| breast-cancer.csv | Modules 03–12 (classification) | 569 samples, 30 features, malignant/benign target (Wisconsin Diagnostic) |
| breast-cancer-explore.csv | Exploration | A small subset of the same data for hand-readable walkthroughs |

See data/README.md for full dataset documentation, sources, and the pattern for adding new ones.

Running the notebooks

You'll need Python 3.10+ and the standard scientific stack. From the repo root:

pip install jupyter numpy pandas scikit-learn matplotlib seaborn
jupyter notebook

Then open any .ipynb. Notebooks load data from the local data/ directory, so they work offline.
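
As a rough illustration of the offline loading described above, a notebook cell might read a CSV from data/ like this. The helper name `load_dataset` and the `DATA_DIR` constant are hypothetical and not taken from the notebooks; the sketch also assumes Jupyter was started from the repo root, so that the relative path data/ resolves.

```python
from pathlib import Path

import pandas as pd

# Assumption: the working directory is the repo root, so "data" resolves.
DATA_DIR = Path("data")


def load_dataset(file_name: str, data_dir: Path = DATA_DIR) -> pd.DataFrame:
    """Read one CSV from the local data directory (hypothetical helper)."""
    csv_path = data_dir / file_name
    if not csv_path.is_file():
        # Fail early with a hint instead of a bare pandas error.
        raise FileNotFoundError(
            f"{csv_path} not found; was Jupyter started from the repo root?"
        )
    return pd.read_csv(csv_path)
```

In a notebook, `load_dataset("breast-cancer.csv")` would then return the classification data as a DataFrame. No network access is involved.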

How the modules are structured

Every algorithm module follows the same arc:

  1. Problem — the situation where this algorithm earns its keep
  2. Model — what it predicts and what shape its decisions take
  3. How it learns — the training procedure, formulas, and intuition
  4. Superpower — what this algorithm does better than its peers
  5. When it breaks — the failure modes, visually and quantitatively
  6. Worked example — a 10-row dataset you can compute by hand
  7. Code — full scikit-learn implementation with sensible defaults
  8. Comparisons — head-to-head tables against algorithms covered earlier
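
To give a feel for the "Worked example" step, here is a minimal sketch of a calculation small enough to check by hand: a least-squares line fitted to five points. The data values are invented for illustration and are not from the course, which uses 10-row datasets. Simple linear regression is chosen here only because it is the easiest case to verify on paper.

```python
# Five-row toy dataset (hypothetical values, chosen so the arithmetic is easy).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(xs)
mean_x = sum(xs) / n  # 3.0
mean_y = sum(ys) / n  # 4.0

# Sum of cross-deviations and sum of squared x-deviations.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))  # 6.0
s_xx = sum((x - mean_x) ** 2 for x in xs)  # 10.0

# Closed-form least-squares estimates for y = intercept + slope * x.
slope = s_xy / s_xx  # 0.6
intercept = mean_y - slope * mean_x  # about 2.2

# With an intercept term, the residuals of a least-squares fit sum to zero.
predictions = [intercept + slope * x for x in xs]
residual_sum = sum(y - p for y, p in zip(ys, predictions))

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

Each intermediate quantity (the two means, the two sums) can be checked on paper before running the cell.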

Formulas are followed by a "Reading this formula" block that names every symbol. Tunable numbers (learning rates, depths, regularization strengths) are named variables, not magic literals.
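
The "named variables, not magic literals" convention might look like the sketch below. The variable names and values are illustrative assumptions, not copied from the notebooks. The sketch also uses scikit-learn's bundled `load_breast_cancer` (the Wisconsin Diagnostic data, 569 samples and 30 features) rather than data/breast-cancer.csv, so that it runs without the repo checked out.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Tunable numbers as named variables (illustrative values, not the course's).
TEST_FRACTION = 0.2
RANDOM_SEED = 42
INVERSE_REG_STRENGTH = 1.0  # scikit-learn's C: smaller means stronger regularization
MAX_ITERATIONS = 1000

# Bundled copy of the Wisconsin Diagnostic dataset; stands in for the CSV.
X, y = load_breast_cancer(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=TEST_FRACTION, random_state=RANDOM_SEED, stratify=y
)

# Standardize features, then fit a regularized logistic regression.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(C=INVERSE_REG_STRENGTH, max_iter=MAX_ITERATIONS),
)
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
print(f"test accuracy: {test_accuracy:.3f}")
```

Changing a single named constant at the top is then enough to rerun the experiment with a different split or regularization strength.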

License

All rights reserved.
