My notes and superstitions about common machine learning algorithms

Supervised learning superstitions cheat sheet

This notebook contains my notes and beliefs about several commonly used supervised learning algorithms. I hope it will be useful as a quick reference, or for people studying for machine learning interviews, quizzes, and the like.

After some setup code, the methods discussed are:

  • Logistic regression
  • Decision trees
  • Support vector machines
  • K nearest neighbors
  • Naive Bayes

To better understand each classifier, we train it on various versions of the "two moons" dataset and plot its empirical decision boundary. Each plot shows the training data on top of a few thousand randomly chosen points, each colored by the output of the learned model. Superstition #1: The plots suggest that linear classifiers are often outperformed on high-quality training sets but still produce sane results on small, noisy datasets. Note: the plots do not all share the same x and y axis ranges.
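The plotting procedure described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual setup code: it assumes scikit-learn's `make_moons` as the source of the "two moons" data, and the helper names `sample_decision_regions` and `plot_decision_regions`, the sample size, the noise level, and the 0.5 margin are all hypothetical choices made for the example.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression


def sample_decision_regions(model, X, n_points=3000, seed=0):
    """Label random points drawn uniformly over the (padded) bounding box of X.

    Hypothetical helper: the 0.5 margin around the data is an arbitrary choice.
    """
    rng = np.random.default_rng(seed)
    lo = X.min(axis=0) - 0.5
    hi = X.max(axis=0) + 0.5
    points = rng.uniform(lo, hi, size=(n_points, X.shape[1]))
    return points, model.predict(points)


def plot_decision_regions(points, labels, X, y):
    """Draw the model-colored random points, then the training data on top."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    plt.scatter(points[:, 0], points[:, 1], c=labels, alpha=0.2, s=10)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors="k")
    plt.show()


# Assumed data settings: 200 samples with moderate noise.
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# Any fitted classifier with a predict() method could be used here.
model = LogisticRegression().fit(X, y)
points, labels = sample_decision_regions(model, X)
```

Calling `plot_decision_regions(points, labels, X, y)` then draws the figure. Sampling random points rather than a regular grid matches the description above; a dense grid with a filled contour plot would be a common alternative.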

Other resources: