A machine learning algorithm library in pure Python, with a mini project included for every algorithm.



Repo Intro

This repo builds a basic machine learning algorithm library for learning and testing. Each algorithm comes with a classic mini project application.

File Structure

Each directory contains one algorithm and a mini project that uses it (data included). The algorithms and projects are listed in the recommended reading order:

-- KNN

    -- kNeibohood.py
      """ core algorithm implementation"""
    -- imageRecognizer.py
      """ mini project, MNIST image recognizer"""
    -- digits
      """ MNIST dataset"""
    -- basicFunction.py
      """ Helper function"""
-- KTrees

    -- ktrees.py
      """ core algorithm implementation"""
    -- lenseproject.py
      """ mini project, lens recognizer"""
    -- lenses.txt
      """ dataset for lens project"""
    -- plottree.py
      """ helper to plot your ktrees data structure"""
-- Bayes

    -- bernoullibayers.py
      """ core algorithm implementation"""
    -- spamproject.py
      """ mini project, spam recognizer"""
    -- email
      """ dataset for spam project"""
-- Logistic Regression

    -- logisticRegression.py
      """ core algorithm implementation"""
    -- horseproject.py
      """ mini project, horse clinic classifier"""
    -- horseClinicTest.txt & horseClinicTraining.txt
      """ dataset for horse project"""
-- SVM

    -- svm.py
      """ core algorithm implementation"""
-- Adaboost

    -- adaboost.py
      """ core algorithm implementation"""
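
As a rough idea of what a pure-Python "core algorithm implementation" can look like, here is a minimal k-nearest-neighbors sketch. It is not the code in kNeibohood.py: the function name knn_classify, its parameters and the toy data are all hypothetical, chosen only for illustration.

```python
from collections import Counter
import math


def knn_classify(query, samples, labels, k=3):
    """Return the majority label among the k samples nearest to query.

    Hypothetical helper for illustration; not the repo's actual API.
    """
    # Euclidean distance from the query to every training sample.
    distances = [
        math.sqrt(sum((q - s) ** 2 for q, s in zip(query, sample)))
        for sample in samples
    ]
    # Indices of the k samples with the smallest distances.
    nearest = sorted(range(len(samples)), key=lambda i: distances[i])[:k]
    # Majority vote over the labels of those neighbors.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy data, made up for this example.
samples = [[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]]
labels = ["A", "A", "B", "B"]
prediction = knn_classify([0.1, 0.2], samples, labels, k=3)
```

The sketch uses only the standard library and sorts all distances, which is fine for small datasets like the ones bundled here but scales poorly to large ones.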

How to use

1. Download the repo to your local machine. A star for the repo is appreciated.

2. Make sure Python 2 is installed. A virtual environment is recommended.

3. Run pip install -r requirement.txt

4. Run a core algorithm file or a mini project file directly.

More to do

More algorithms are coming soon, including:

[x] SVM

[x] Adaboost

[] Regression and Tree Regression

[] Kmeans

[] EM

[] PCA

[] SVD

Special Thanks

This repo references some content and datasets from the book Machine Learning in Action (https://www.amazon.com/Machine-Learning-Action-Peter-Harrington/dp/1617290181/ref=sr_1_1?ie=UTF8&qid=1508746100&sr=8-1&keywords=Machine+Learning+in+Action). Thanks a lot for this great handbook.

This repo also draws on the Stanford CS229 Machine Learning course. Thanks a lot for the great materials.

Contact Me

Email: nick_fandingwei@outlook.com

Twitter: https://twitter.com/nick_fandingwei

For Chinese users, Zhihu is the fastest way to get a response from me: https://www.zhihu.com/people/NickWey

You can also check my tech blog for more: http://nickiwei.github.io/

Consider following me on Zhihu, Twitter and GitHub, thanks!