Pydata Dallas 2015 Scikit-Learn Tutorial
0. Setup of Libraries and Version Numbers.ipynb
1. Scikit Learn - Introduction to Supervised Learning Problem and Cross Validation.ipynb
2. Scikit Learn - Parameter Search via Grid Search.ipynb
3. Scikit Learn - Parameter Search via Randomized Parameter Search.ipynb
4. Scikit Learn - Pipeline.ipynb
5. Scikit Learn - Distance Functions and Scoring Functions.ipynb
6. Scikit Learn - Feature Unions.ipynb
7. Scikit Learn - Generalization and Model Selection.ipynb
8. Scikit Learn - Out of Core Learning via partial_fit.ipynb
9. Scikit Learn - Deployment.ipynb
Language Detector.ipynb

A Thorough Machine Learning Pipeline via Scikit-Learn

This repo includes all of the IPython notebooks that I will go through in the tutorial. It is targeted at people who have intermediate knowledge of machine learning and want to learn the more advanced features of Scikit-learn.

It tries to cover the following concepts in Scikit-Learn:

  1. Pipeline
  2. Cross-Validation
  3. Grid-Search
  4. Randomized Parameter Search
  5. Distance and Scoring Functions
  6. Feature Unions and Engineering
  7. Out-of-core Learning (partial_fit)
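To give a feel for how the first three concepts fit together, here is a minimal sketch that is not taken from the notebooks. It assumes a recent Scikit-learn release, where cross-validation and grid search live in `sklearn.model_selection` (releases from around 2015 used `sklearn.cross_validation` and `sklearn.grid_search` instead). The iris dataset, the step names `scale` and `clf`, and the candidate values for `C` are illustrative choices only.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Small built-in dataset, used here only for illustration.
X, y = load_iris(return_X_y=True)

# Pipeline: chain preprocessing and the estimator into one object, so the
# scaler is re-fit on the training part of every cross-validation split.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Cross-validation: one accuracy score per fold (5 folds).
cv_scores = cross_val_score(pipeline, X, y, cv=5)
print("Mean CV accuracy:", cv_scores.mean())

# Grid search: try each candidate value of C, scoring each with 5-fold CV.
# The "<step name>__<parameter>" syntax addresses a parameter inside the pipeline.
param_grid = {"clf__C": [0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X, y)
print("Best parameters:", search.best_params_)
print("Best CV accuracy:", search.best_score_)
```

Wrapping the scaler and classifier in a single pipeline before searching keeps the preprocessing inside each cross-validation split, which avoids leaking information from the held-out fold into the fitted scaler.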

The dependencies are listed in notebook 0. To reproduce the results, make sure your installed versions are at least those given in that notebook. If you still run into problems, please feel free to open an issue in this repository.
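A quick way to compare your environment against notebook 0 is to print the versions you have installed. The sketch below is not part of the notebooks; the helper name `installed_versions` is hypothetical and the package list is an assumption, so adjust it to match what notebook 0 actually lists.

```python
import importlib

# Assumed package list; notebook 0 is the authoritative source.
PACKAGES = ["sklearn", "numpy", "scipy", "pandas", "matplotlib"]


def installed_versions(names):
    """Map each package name to its version string, or None if not installed."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Some modules do not expose __version__; report that explicitly.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


found = installed_versions(PACKAGES)
for name, version in found.items():
    print(name, version if version is not None else "NOT INSTALLED")
```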

