This is a personal project whose goal is to concentrate all my Python code about Machine Learning.
You are completely free to use it anyway you want. However, it'd be cool if I was notified... just to know that this is useful for someone else =).
Of course, use it without any warranty... at your own risk. I can't guarantee complete documentation and stability. I make it stable and documented just enough for my own purposes.
Also, any contribution (new algorithms and, particularly, tests) are welcome. Send me a pull request or a patch and i'll be very happy to review it.
Currently, the following features are implemented:
Supervised Learning:
- Bayesian classifier based on Gaussian Discriminant Analysis
- Multivariate Kernel Density Estimation
- Bayesian classifier based on Kernel Density Estimation
- kNN and Weighted-kNN classifiers
- Rules for combination of classifiers (sum, product, median, min, max, majority) with single/multiple feature spaces
- Stratified k-Fold cross validation
- Some convenient functions to help evaluation
Unsupervised Learning:
- Partitioning Around Medoids (PAM) realization of the k-Medoids class of clustering algorithms
- Some convenient functions to help partition clustering evaluation
Statistics:
- Efficient mode and frequency count functions
- Multivariate normal PDFs
This project depends upon the following libraries:
- NumPy >= 1.6.0 (http://numpy.scipy.org/)
- SciPy >= 0.9.0 (http://www.scipy.org/)