Overview


This is a machine learning library, written from scratch in Python.

It uses:

  • numpy: for handling matrices/vectors
  • scipy: for various mathematical operations
  • cvxopt: for convex optimization
  • networkx: for handling graphs in decision trees

It contains the following functionality:

  • Supervised Learning:
    • Linear and Logistic regression
      • Regularization
      • Solvers
        • Gradient descent
        • Steepest descent
        • Newton's method
        • SGD
        • Backtracking line search
        • Closed form solutions
    • Support Vector Machines
      • Soft and hard margins
      • Kernels
    • Tree Methods
      • CART (classification and regression)
      • PRIM
      • AdaBoost
      • Gradient Boost
      • Random Forests
    • Kernel Smoothing Methods
      • Nadaraya–Watson average
      • Local linear regression
      • Local logistic regression
      • Kernel density classification
    • Discriminant Analysis
      • LDA, QDA, RDA
    • Naive Bayes Classification
      • Gaussian
      • Bernoulli
    • Prototype Methods
      • KNN
      • LVQ
      • DANN
    • Perceptron
  • Unsupervised Learning
    • K-means/medoids clustering
    • PCA
    • Gaussian Mixtures
  • Model Selection and Validation

Examples

Examples are shown in two dimensions for visualisation purposes; however, all methods can handle high-dimensional data.

Regression

  • Linear and logistic regression with regularization. Closed form, gradient descent, and SGD solvers.
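As a sketch of the two solver styles mentioned above, the snippet below fits a linear model with both the closed-form normal equations and plain gradient descent. Variable names and hyperparameters are illustrative, not the library's API.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_w = np.array([2.0, -1.0])
y = X @ true_w + 3.0 + 0.01 * rng.normal(size=200)

Xb = np.hstack([np.ones((200, 1)), X])  # prepend a bias column

# Closed form: solve the normal equations (X^T X) w = X^T y
w_closed = np.linalg.solve(Xb.T @ Xb, Xb.T @ y)

# Gradient descent on the mean-squared-error loss
w_gd = np.zeros(3)
lr = 0.1
for _ in range(2000):
    grad = Xb.T @ (Xb @ w_gd - y) / len(y)
    w_gd -= lr * grad

# Both solvers recover weights near [3, 2, -1] (bias first).
```

SGD would replace the full-batch gradient with the gradient of a single randomly drawn sample per step; the closed form is exact but costs a linear solve, which is why iterative solvers matter at scale.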


Support Vector Machines

  • Support vector machines maximize the margin between classes


  • Using kernels, support vector machines can produce non-linear decision boundaries; the RBF kernel is one example
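The RBF kernel itself is simple to sketch. The hypothetical `rbf_kernel` helper below (illustrative, not the library's API) computes the Gram matrix a kernelized SVM would work with:

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gram matrix K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clamp tiny negatives

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = rbf_kernel(X, X, gamma=0.5)
# The diagonal is exactly 1: each point is at distance 0 from itself.
```

Replacing inner products with kernel evaluations like these is what lets the dual SVM formulation separate data that is not linearly separable in the original space.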


  • An alternative learning algorithm, the perceptron, can linearly separate classes. It does not maximize the margin, and is severely limited.
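The perceptron update rule is short enough to sketch in full; this is the classic algorithm with labels in {-1, +1}, with illustrative names rather than the library's API:

```python
import numpy as np

def perceptron(X, y, epochs=20):
    """On each misclassified point, add y_i * x_i to the weights."""
    Xb = np.hstack([np.ones((len(X), 1)), X])  # bias term
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(Xb, y):
            if yi * (w @ xi) <= 0:  # misclassified (or on the boundary)
                w += yi * xi
    return w

X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -1.0], [-3.0, -2.0]])
y = np.array([1, 1, -1, -1])
w = perceptron(X, y)
preds = np.sign(np.hstack([np.ones((4, 1)), X]) @ w)
```

The limitation mentioned above is visible in the code: the loop only terminates with a correct classifier when the data are linearly separable, and nothing pushes the resulting hyperplane away from the closest points.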


Tree Methods

  • The library contains a large collection of tree methods, the basis of which are decision trees for classification and regression
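The core operation inside a classification tree is choosing a split. A minimal sketch, assuming Gini impurity and a single feature (function names are illustrative):

```python
import numpy as np

def gini(labels):
    """Gini impurity 1 - sum(p_k^2) of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - (p ** 2).sum()

def best_split(x, y):
    """Threshold on one feature minimizing weighted child impurity."""
    best_t, best_score = None, np.inf
    for t in np.unique(x):
        left, right = y[x <= t], y[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y = np.array([0, 0, 0, 1, 1, 1])
t, score = best_split(x, y)  # a clean split: impurity drops to 0
```

A full CART implementation applies this search over every feature and recurses on the two children; regression trees swap Gini for within-node variance.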


These decision trees can be aggregated and the library supports the following ensemble methods:

  • AdaBoost
  • Gradient Boosting
  • Random Forests

Kernel Methods

Kernel methods estimate the target function by fitting separate functions at each point, using local smoothing of the training data.

  • Nadaraya–Watson estimation uses a local weighted average
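The Nadaraya–Watson estimator reduces to a few lines; a sketch with a Gaussian kernel (the bandwidth and names are illustrative choices, not the library's defaults):

```python
import numpy as np

def nadaraya_watson(x0, X, y, bandwidth=0.5):
    """Kernel-weighted local average of the responses y around x0."""
    w = np.exp(-0.5 * ((X - x0) / bandwidth) ** 2)
    return (w * y).sum() / w.sum()

X = np.linspace(0, np.pi, 50)
y = np.sin(X)
est = nadaraya_watson(np.pi / 2, X, y, bandwidth=0.3)
# est is close to sin(pi/2) = 1, slightly biased low by the curvature.
```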


  • Local linear regression uses weighted least squares to locally fit an affine function to the data
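Local linear regression solves a small weighted least-squares problem at each query point. A sketch under the same Gaussian-kernel assumption as above (names are illustrative):

```python
import numpy as np

def local_linear(x0, X, y, bandwidth=0.3):
    """Fit a kernel-weighted affine model around x0; return its value at x0."""
    w = np.exp(-0.5 * ((X - x0) / bandwidth) ** 2)
    B = np.column_stack([np.ones_like(X), X - x0])  # centered design
    W = np.diag(w)
    beta = np.linalg.solve(B.T @ W @ B, B.T @ W @ y)
    return beta[0]  # intercept = fitted value at x0

X = np.linspace(0, 1, 50)
y = 2.0 * X + 1.0
est = local_linear(0.5, X, y)  # exact for linear data: 2.0
```

Because it fits a line rather than a constant, this estimator removes the boundary bias that a plain local average suffers at the edges of the data.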


  • The library also supports kernel density estimation (KDE), which is used for kernel density classification
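Kernel density classification follows directly from KDE: estimate p(x | class) for each class and pick the class with the larger (prior-weighted) density. A 1-D sketch with equal priors and illustrative names:

```python
import numpy as np

def kde(x0, samples, bandwidth=0.5):
    """Gaussian kernel density estimate at x0."""
    u = (x0 - samples) / bandwidth
    return np.exp(-0.5 * u ** 2).sum() / (
        len(samples) * bandwidth * np.sqrt(2 * np.pi))

rng = np.random.default_rng(1)
class_a = rng.normal(-2.0, 1.0, size=300)
class_b = rng.normal(+2.0, 1.0, size=300)

def classify(x0):
    # Bayes rule with equal priors: compare class-conditional densities.
    return 'a' if kde(x0, class_a) > kde(x0, class_b) else 'b'
```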


Discriminant Analysis

  • Linear Discriminant Analysis creates decision boundaries by assuming classes share the same covariance matrix.
  • LDA can only form linear boundaries.
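The shared-covariance assumption can be sketched directly: pool the within-class scatter, then compare linear discriminant scores. This is a minimal two-class illustration with equal priors, not the library's implementation:

```python
import numpy as np

rng = np.random.default_rng(2)
X0 = rng.normal([-2.0, 0.0], 1.0, size=(200, 2))
X1 = rng.normal([+2.0, 0.0], 1.0, size=(200, 2))

mu0, mu1 = X0.mean(0), X1.mean(0)
# Pooled covariance: the defining LDA assumption.
Xc = np.vstack([X0 - mu0, X1 - mu1])
sigma_inv = np.linalg.inv(Xc.T @ Xc / (len(Xc) - 2))

def lda_predict(x):
    # Linear discriminant scores; the resulting boundary is a line.
    d0 = x @ sigma_inv @ mu0 - 0.5 * mu0 @ sigma_inv @ mu0
    d1 = x @ sigma_inv @ mu1 - 0.5 * mu1 @ sigma_inv @ mu1
    return int(d1 > d0)
```

QDA drops the pooling and keeps one covariance per class, which makes the scores quadratic in x and the boundary curved.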


  • Quadratic Discriminant Analysis creates decision boundaries by assuming classes have independent covariance matrices.
  • QDA can form non-linear boundaries.


  • Regularized Discriminant Analysis uses a combination of pooled and class covariance matrices to determine decision boundaries.


Prototype Methods

  • K-nearest neighbors determines target values by averaging the k-nearest data points. The library supports both regression and classification.
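A minimal KNN classification sketch (majority vote; for regression the vote would become a mean). Function and variable names are illustrative:

```python
import numpy as np

def knn_predict(x0, X, y, k=3):
    """Majority vote among the k nearest training points."""
    d = np.linalg.norm(X - x0, axis=1)
    nearest = y[np.argsort(d)[:k]]
    values, counts = np.unique(nearest, return_counts=True)
    return values[np.argmax(counts)]

X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])
```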


  • Learning vector quantization is a prototype method where prototypes are iteratively repelled by out-of-class data and attracted to in-class data
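The attract/repel rule is the whole of LVQ1; a sketch with one prototype per class (learning rate, epochs, and initialization are illustrative choices, not the library's defaults):

```python
import numpy as np

def lvq1(X, y, prototypes, proto_labels, lr=0.1, epochs=30, seed=5):
    rng = np.random.default_rng(seed)
    P = prototypes.copy()
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            j = np.argmin(np.linalg.norm(P - X[i], axis=1))  # winner
            if proto_labels[j] == y[i]:
                P[j] += lr * (X[i] - P[j])  # attract toward in-class point
            else:
                P[j] -= lr * (X[i] - P[j])  # repel from out-of-class point
    return P

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [5, 5], [5, 6], [6, 5], [6, 6]], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
P = lvq1(X, y, np.array([[1.0, 1.0], [4.0, 4.0]]), np.array([0, 1]))
# Each prototype drifts toward the centre of its own class.
```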


  • Discriminant Adaptive Nearest Neighbors (DANN) adaptively elongates neighborhoods along boundary regions.
  • This is useful for high-dimensional data.


Unsupervised Learning

  • K-means and K-medoids clustering partition data into K clusters.
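K-means alternates two steps, which Lloyd's algorithm makes explicit below (K-medoids would instead restrict centers to actual data points). Names and defaults are illustrative:

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each center moves to the mean of its cluster
        # (keep the old center if a cluster is momentarily empty).
        centers = np.array([X[labels == j].mean(axis=0)
                            if np.any(labels == j) else centers[j]
                            for j in range(k)])
    return centers, labels

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
centers, labels = kmeans(X, k=2)
```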


  • Gaussian Mixture Models. Assumes data are generated from a mixture of Gaussians and estimates those Gaussians via the EM algorithm. The decision boundary between two estimated Gaussians is shown below.
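The EM iteration for a mixture of Gaussians is compact in one dimension. This is a minimal two-component sketch, not the library's implementation; the starting values are deliberately poor to show that EM still recovers the components:

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.concatenate([rng.normal(-3, 1, 400), rng.normal(3, 1, 400)])

mu = np.array([-1.0, 1.0])   # poor initial means
var = np.array([1.0, 1.0])
pi = np.array([0.5, 0.5])    # mixing weights

def normal_pdf(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

for _ in range(100):
    # E step: responsibility of each component for each point
    dens = pi * normal_pdf(x[:, None], mu, var)
    r = dens / dens.sum(1, keepdims=True)
    # M step: re-estimate weights, means, and variances
    nk = r.sum(0)
    mu = (r * x[:, None]).sum(0) / nk
    var = (r * (x[:, None] - mu) ** 2).sum(0) / nk
    pi = nk / len(x)
# mu converges to approximately (-3, 3).
```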


  • Principal Component Analysis (PCA) transforms a given data set into an orthonormal basis that maximizes variance.
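One common way to compute that basis is an eigendecomposition of the sample covariance matrix (an SVD of the centered data works equally well). A sketch with illustrative data stretched along the direction (1, 1):

```python
import numpy as np

rng = np.random.default_rng(3)
t = rng.normal(size=300)
X = np.column_stack([t, t]) + 0.1 * rng.normal(size=(300, 2))
Xc = X - X.mean(0)                       # center the data

cov = Xc.T @ Xc / (len(Xc) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues ascending
components = eigvecs[:, ::-1]            # principal axes, largest first
projected = Xc @ components              # data in the orthonormal basis
# The first principal axis aligns with (1, 1) / sqrt(2).
```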

