ML Algorithms from Scratch
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
benchmarking
mlfromscratch
.gitignore
LICENSE
README.md
requirements.txt
setup.py

README.md

ML From Scratch

A self-lead refresher in basic ML algorithms

This project was inspired by a hackathon in the Spring of 2016 when I was working on the engineering team at Magnetic. At Magnetic I had been working on our machine learning systems that used Vowpal Wabbit, and in this hackathon we attempted to implement a similar logistic regression solver in the Go language - which we called Gowpal Wabbit (a large part of projects at Magnetic involved arguing about how to construct a witty/punny name, Gowpal Wabbit was not our best work). My colleague Dan Crosta wrote about what he learned about logistic regression from the process.

While I had learned a lot about the most commonly used algorithms in grad school and at work, writing logistic regression from scratch and teaching a team of software engineers the math and intuition beyond the gradient descent solver made me think much harder about the various choices that go into writing a working implementation. It was a surprisingly educational experience.

Since then I've changed jobs, and after a traveling a lot in the Summer and Fall of 2016 I've found some free time again. I am (intermittently) writing some of the more commonly used ML algorithms from near-scratch and comparing their performance (both in terms of predictive power and computational efficiency) versus scikit-learn. While I feel that I had a pretty good handle on this subject beforehand, I think that forcing myself to reinvent the wheel has been worthwhile.

These algorithms exist for my re-education and very little else. My algorithms will hopefully be just as good at prediction as scikit-learn's options, but theirs are more fully-featured and much faster (since they're written in Cython). There is no reason anybody should be using my algorithms unless they find my code educational.

Note: this project is completely unrelated to (and was started before) the very similarly named repo ML-from-scratch. I guess my name wasn't as original as I had hoped.