Pydata Dallas 2015 Scikit-Learn Tutorial
data Bank data Apr 11, 2015
.gitignore Initial commit Apr 4, 2015
0. Setup of Libraries and Version Numbers.ipynb Rerun some ipython notebooks and various minor changes Apr 24, 2015
1. Scikit Learn - Introduction to Supervised Learning Problem and Cross Validation.ipynb Rerun some ipython notebooks and various minor changes Apr 24, 2015
2. Scikit Learn - Parameter Search via Grid Search.ipynb Rerun some ipython notebooks and various minor changes Apr 24, 2015
3. Scikit Learn - Parameter Search via Randomized Parameter Search.ipynb Rerun some ipython notebooks and various minor changes Apr 24, 2015
4. Scikit Learn - Pipeline.ipynb Rerun some ipython notebooks and various minor changes Apr 24, 2015
5. Scikit Learn - Distance Functions and Scoring Functions.ipynb Rerun some ipython notebooks and various minor changes Apr 24, 2015
6. Scikit Learn - Feature Unions.ipynb Rerun some ipython notebooks and various minor changes Apr 24, 2015
7. Scikit Learn - Generalization and Model Selection.ipynb Rerun some ipython notebooks and various minor changes Apr 24, 2015
8. Scikit Learn - Out of Core Learning via partial_fit.ipynb Rerun some ipython notebooks and various minor changes Apr 24, 2015
9. Scikit Learn - Deployment.ipynb Rerun some ipython notebooks and various minor changes Apr 24, 2015
Language Detector.ipynb Rerun some ipython notebooks and various minor changes Apr 24, 2015
README.md Titanic data, preprocessing and randomized parameter search is added Apr 4, 2015
load.py Basic flow and naming of most of the ipython notebooks are complete Apr 4, 2015


A Thorough Machine Learning Pipeline via Scikit-Learn

This repo includes all of the IPython notebooks that I will go through in the tutorial. It is targeted at people who have intermediate knowledge of machine learning and want to learn the more advanced features of Scikit-Learn.

It tries to cover the following concepts in Scikit-Learn:

  1. Pipeline
  2. Cross-Validation
  3. Grid-Search
  4. Randomized Parameter Search
  5. Distance and Scoring Functions
  6. Feature Unions and Engineering
  7. Out-of-core Learning (partial_fit)
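
To give a flavor of how the first three concepts fit together, here is a minimal sketch, not taken from the notebooks. It assumes a recent Scikit-Learn release, where cross-validation and grid search live in `sklearn.model_selection` (the 2015 notebooks may import them from older module paths). The synthetic dataset, the step names `scale` and `clf`, and the grid of `C` values are all made up for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption; not the data shipped in this repo).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Pipeline: chain scaling and classification into a single estimator.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Cross-validation: 5-fold scores of the untuned pipeline.
cv_scores = cross_val_score(pipeline, X, y, cv=5)

# Grid search: evaluate every candidate C with 5-fold cross-validation.
# "clf__C" addresses the parameter C of the pipeline step named "clf".
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X, y)

print("CV score per fold:", cv_scores)
print("Best parameters:", search.best_params_)
print("Best mean CV score:", search.best_score_)
```

Because the scaler sits inside the pipeline, it is refit on the training folds of each split, so no information from the held-out fold leaks into preprocessing.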

The dependencies are listed in notebook 0. To reproduce the results, make sure you have at least the versions given in that notebook. If you still run into problems, please feel free to open an issue in this repository.
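
If you want a quick look at what is installed before opening notebook 0, something like the following sketch may help. It uses only the standard library (`importlib.metadata`, Python 3.8+). The package list is a guess for illustration; the authoritative list and version numbers are the ones in notebook 0.

```python
from importlib.metadata import PackageNotFoundError, version

# Hypothetical package list; check notebook 0 for the real dependencies.
PACKAGES = ["numpy", "scipy", "pandas", "scikit-learn", "matplotlib"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```

Compare the printed versions against the ones recorded in notebook 0.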

You can browse the IPython notebooks in nbviewer.