Advanced Scikit-learn training session
Type Name Latest commit message Commit time
data add bank campaign description Jun 8, 2016
figures add book cover Jun 6, 2016
mglearn backward compatibility fixes Jun 8, 2016
plots make warmstarts into warmstarts and cv objects Jun 7, 2016
solutions add a lot of notebooks etc Jun 6, 2016
.gitignore wix gitignore Jun 6, 2016
00.0 Overview.ipynb minor fixes to overview Jun 8, 2016
01.1 Review of Supervised Learning.ipynb minor fixes during course Jun 9, 2016
01.2 Linear models.ipynb minor fixes during course Jun 9, 2016
02.1 Scaling and Normalization.ipynb minor fixes during course Jun 9, 2016
02.2 Feature preprocessing, feature selection, interactions.ipynb minor fixes during course Jun 9, 2016
03.1 Cross Validation and Grid Searches.ipynb minor fixes during course Jun 9, 2016
03.2 Evaluation Metrics.ipynb minor fixes during course Jun 9, 2016
04.1 Pipelines.ipynb minor fixes during course Jun 9, 2016
05.1 Trees and Forests.ipynb minor fixes during course Jun 9, 2016
05.2 Gradient Boosting.ipynb minor fixes during course Jun 9, 2016
05.3 Support Vector Machines.ipynb consistent naming Jun 6, 2016
05.4 Neural Networks.ipynb fix interactive tree plotting, add exercise for 05.4 neural net Jun 7, 2016
06.1 Unsupervised Feature Extraction.ipynb remove clustering from unsupervised feature extraction, add exercise Jun 7, 2016
07.1 Outlier Detection.ipynb minor fixes during course Jun 9, 2016
08.1 Gaussian Processes.ipynb minor fixes during course Jun 9, 2016
9.1 Out Of Core Learning.ipynb move 10 to 9 Jun 8, 2016
9.2 Custom Estimators.ipynb move 10 to 9 Jun 8, 2016
LICENSE Initial commit Jun 6, 2016
README.md shuffling around more stuff Jun 6, 2016
dataset test bikes.ipynb moving mglearn, adding bike and campaign data Jun 7, 2016
datasets.rst add a lot of notebooks etc Jun 6, 2016
old-supervised-learning.ipynb make warmstarts into warmstarts and cv objects Jun 7, 2016
preamble.py moving mglearn, adding bike and campaign data Jun 7, 2016
robust_pca.py slowly getting there.... Jun 7, 2016
toc.rst add a lot of notebooks etc Jun 6, 2016
todo.rst add stuff to gp Jun 8, 2016

README.md

advanced_training

Advanced Scikit-learn training session

Outline

1 Basic algorithms

  • Review of supervised learning
  • Linear models for classification and regression
  • Loss functions, regularization, empirical risk minimization
  • Path algorithms
  • Exercise: FIXME Regression
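
The regularization and path-algorithm topics above can be sketched as follows. This is a minimal illustration, not the course material: the synthetic dataset from `make_regression` and the specific `alpha` values are assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge, lasso_path

# Synthetic regression problem (an assumption for illustration):
# 20 features, only 5 of which carry signal.
X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)

# Ridge (L2 penalty): a larger alpha shrinks the coefficient norm.
ridge_norms = [np.linalg.norm(Ridge(alpha=a).fit(X, y).coef_)
               for a in (0.1, 10.0, 1000.0)]

# Lasso (L1 penalty): a larger alpha tends to set more coefficients
# exactly to zero, so we count the nonzero ones.
lasso_nonzero = [int(np.sum(Lasso(alpha=a).fit(X, y).coef_ != 0))
                 for a in (0.1, 10.0, 100.0)]

# Path algorithm: coefficients for a whole grid of alphas in one call.
# coefs has one row per feature and one column per alpha.
alphas, coefs, _ = lasso_path(X, y)

print("Ridge coefficient norms:", ridge_norms)
print("Lasso nonzero counts:", lasso_nonzero)
print("Lasso path shape:", coefs.shape)
```

Plotting each row of `coefs` against `alphas` gives the usual regularization-path picture, in which features enter the model one by one as the penalty weakens.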

2 Basic tools

  • Cross-validation vs train/test split
  • GridSearchCV
  • Overfitting Parameters
  • Scoring Metrics
  • Exercise: FIXME
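
A minimal sketch of how these tools fit together, assuming the breast cancer dataset, an `SVC` model, and an arbitrary parameter grid (none of which are prescribed by the outline). The test split is held back so that the parameters tuned by the grid search are not evaluated on data they were selected on.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import (GridSearchCV, cross_val_score,
                                     train_test_split)
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Hold out a test set that the parameter search never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Cross-validation: five accuracy estimates instead of a single split.
cv_scores = cross_val_score(SVC(), X_train, y_train, cv=5)

# Grid search with cross-validation, using ROC AUC as the scoring metric.
# The grid values are illustrative assumptions.
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [1e-4, 1e-3, 1e-2]}
grid = GridSearchCV(SVC(), param_grid, cv=5, scoring="roc_auc")
grid.fit(X_train, y_train)

# Evaluate the refit best model once on the held-out test set.
test_auc = grid.score(X_test, y_test)

print("CV accuracy per fold:", cv_scores)
print("Best parameters:", grid.best_params_)
print("Best CV AUC:", grid.best_score_)
print("Test AUC:", test_auc)
```

Reporting `grid.best_score_` alone would be optimistic, because the same folds were used to pick the parameters. The separate test score is the more honest estimate.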

3 Preprocessing

  • Scaling and normalization

  • Feature selection:

    • Univariate
    • Model-based
    • RFE
    • Forward / backward selection
  • Polynomial and interaction features

  • Exercise: FIXME
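
The preprocessing steps listed above might look like this in code. The dataset, the number of selected features, and the choice of estimators inside `SelectFromModel` and `RFE` are assumptions for illustration. Forward / backward selection is not shown.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import (RFE, SelectFromModel, SelectKBest,
                                       f_classif)
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Scaling: zero mean and unit variance per feature.
X_scaled = StandardScaler().fit_transform(X)

# Univariate selection: keep the 10 features with the highest F-score.
X_uni = SelectKBest(f_classif, k=10).fit_transform(X_scaled, y)

# Model-based selection: keep features whose random forest importance
# is at least the median importance.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=100, random_state=0),
    threshold="median")
X_model = selector.fit_transform(X_scaled, y)

# Recursive feature elimination down to 10 features.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=10)
X_rfe = rfe.fit_transform(X_scaled, y)

# Pairwise interaction features on the 10 univariate-selected columns:
# 10 original columns plus 45 pairwise products.
poly = PolynomialFeatures(degree=2, interaction_only=True,
                          include_bias=False)
X_inter = poly.fit_transform(X_uni)

print(X_uni.shape, X_model.shape, X_rfe.shape, X_inter.shape)
```

Here every transformer is fit on the full dataset to keep the sketch short. When estimating generalization performance, the same steps should be fit on the training folds only.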

4 Advanced tools

  • Pipelines
  • FeatureUnion
  • FunctionTransformer?
  • Exercise: FIXME
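
A sketch of how the three tools above can be combined, assuming the breast cancer dataset and an arbitrary choice of steps (log transform, scaling, PCA plus univariate selection, logistic regression). The step names such as `"features"` and `"clf"` are made up for this example.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# FeatureUnion: concatenate PCA components with the 5 best raw features.
features = FeatureUnion([
    ("pca", PCA(n_components=3)),
    ("kbest", SelectKBest(k=5)),
])

# Pipeline: every step is refit on each training fold, so no
# preprocessing statistics leak from the validation fold.
pipe = Pipeline([
    ("log", FunctionTransformer(np.log1p)),  # stateless transform
    ("scale", StandardScaler()),
    ("features", features),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Nested parameters are addressed as <step>__<parameter>.
param_grid = {
    "features__pca__n_components": [1, 3, 5],
    "clf__C": [0.1, 1, 10],
}
grid = GridSearchCV(pipe, param_grid, cv=5)
grid.fit(X, y)

print("Best parameters:", grid.best_params_)
print("Best CV accuracy:", grid.best_score_)
```

Searching over the pipeline rather than over the final classifier alone lets the preprocessing parameters be tuned together with the model parameters.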

5 Advanced Supervised Learning

  • Decision Tree Recap
  • Random Forests
  • Gradient Boosting / xgboost
  • Kernel SVMs
  • Kernel approximation
  • Neural Networks
  • Exercise: FIXME
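
For a rough side-by-side view of the model families in this section, the sketch below cross-validates each of them on a small synthetic dataset. The `make_moons` data and all hyperparameters are assumptions. scikit-learn's own `GradientBoostingClassifier` stands in for xgboost, and `RBFSampler` followed by a linear model stands in for kernel approximation.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Two interleaving half circles: not linearly separable.
X, y = make_moons(n_samples=500, noise=0.25, random_state=0)

models = {
    "decision tree": DecisionTreeClassifier(max_depth=3, random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100,
                                            random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
    "kernel SVM": SVC(kernel="rbf", gamma=1.0, C=1.0),
    # Kernel approximation: random Fourier features + linear classifier.
    "RBF approximation": make_pipeline(
        RBFSampler(gamma=1.0, n_components=100, random_state=0),
        SGDClassifier(random_state=0)),
    "neural network": MLPClassifier(hidden_layer_sizes=(50,),
                                    max_iter=1000, random_state=0),
}

# Mean 5-fold cross-validated accuracy per model.
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in models.items()}

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

The numbers depend on the untuned settings used here, so they illustrate the common estimator interface rather than a ranking of the methods.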

6 Unsupervised feature extraction and visualization

  • PCA
  • NMF
  • Robust PCA?
  • t-SNE
  • Exercise: FIXME
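
A short sketch of PCA, NMF, and t-SNE applied to the digits dataset. The dataset, the number of components, and the 500-sample subset for t-SNE are assumptions made to keep the example quick. Robust PCA is not part of scikit-learn and is not shown.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import NMF, PCA
from sklearn.manifold import TSNE

# 8x8 digit images flattened to 64 nonnegative features.
X, y = load_digits(return_X_y=True)

# PCA: linear projection onto the two directions of largest variance.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

# NMF: X is approximated by W @ H with W and H nonnegative,
# which requires nonnegative input data.
nmf = NMF(n_components=10, init="nndsvda", max_iter=500, random_state=0)
W = nmf.fit_transform(X)
H = nmf.components_

# t-SNE: nonlinear 2-D embedding for visualization only; it has no
# transform method for new data. A subset keeps the run time short.
X_tsne = TSNE(n_components=2, random_state=0).fit_transform(X[:500])

print("PCA explained variance ratio:", pca.explained_variance_ratio_)
print("Shapes:", X_pca.shape, W.shape, H.shape, X_tsne.shape)
```

Scatter plots of `X_pca` and `X_tsne` colored by `y` show how well each embedding separates the digit classes.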

7 Outlier Detection

  • Elliptic Envelope?
  • Isolation Forest?
  • What else?
  • KDE?
  • SVM?
  • Robust PCA?
  • Exercise: FIXME
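
Several of the candidate detectors above share the same `fit` / `predict` interface, where `predict` returns -1 for outliers and 1 for inliers. The sketch below assumes a synthetic 2-D dataset and an expected outlier fraction of 10%. The density threshold used for the KDE variant is likewise an assumption.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import KernelDensity
from sklearn.svm import OneClassSVM

# Synthetic data: a Gaussian blob plus uniformly scattered points.
rng = np.random.RandomState(0)
inliers = rng.normal(size=(200, 2))
outliers = rng.uniform(low=-6, high=6, size=(20, 2))
X = np.vstack([inliers, outliers])

detectors = {
    "EllipticEnvelope": EllipticEnvelope(contamination=0.1,
                                         random_state=0),
    "IsolationForest": IsolationForest(contamination=0.1,
                                       random_state=0),
    "OneClassSVM": OneClassSVM(nu=0.1, gamma=0.5),
}

# Count the points each detector labels as outliers (-1).
flagged = {name: int(np.sum(det.fit(X).predict(X) == -1))
           for name, det in detectors.items()}

# KDE has no predict method, so we threshold the estimated
# log-density ourselves: the lowest 10% are treated as outliers.
kde = KernelDensity(bandwidth=0.5).fit(X)
log_density = kde.score_samples(X)
kde_flagged = log_density < np.percentile(log_density, 10)

print(flagged)
print("KDE flagged:", int(kde_flagged.sum()))
```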

8 Gaussian Processes

  • Non-iid data
  • Gaussian fit...
  • Covariance matrix is a kernel
  • Regression, outlier detection, time series modelling
  • Exercise: FIXME
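
To make "covariance matrix is a kernel" concrete, here is a minimal Gaussian process regression sketch. The noisy sine data and the kernel (a scaled RBF plus a white-noise term) are assumptions for illustration. The kernel defines the covariance between function values at any two inputs, and its hyperparameters are tuned during `fit`.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Noisy samples of a sine curve on [0, 10].
rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 10, size=30))[:, np.newaxis]
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=30)

# Smooth signal (scaled RBF) plus observation noise (WhiteKernel).
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True,
                              random_state=0)
gp.fit(X, y)

# Predict on a grid that extends beyond the training range.
X_new = np.linspace(0, 15, 100)[:, np.newaxis]
mean, std = gp.predict(X_new, return_std=True)

print("Fitted kernel:", gp.kernel_)
print("Std inside the data (x = 5):", std[33])
print("Std outside the data (x = 15):", std[-1])
```

The predictive standard deviation is what makes Gaussian processes usable for outlier detection as well as regression: a point that falls far outside `mean ± 2 * std` is unlikely under the fitted model.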

9 More Neural Networks

10 Beyond standard sklearn

  • warm starts
  • out of core
  • custom estimators
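
The three topics above can each be shown in a few lines. This is a sketch under stated assumptions: the synthetic dataset and batch size are arbitrary, and `ColumnClipper` is a hypothetical transformer written for this example, not part of scikit-learn.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Warm start: a second fit keeps the 10 existing trees and adds 10 more.
forest = RandomForestClassifier(n_estimators=10, warm_start=True,
                                random_state=0)
forest.fit(X, y)
forest.set_params(n_estimators=20)
forest.fit(X, y)

# Out of core: feed the data in batches through partial_fit. Here the
# batches are slices of an in-memory array; in practice they would be
# read from disk. All class labels must be given on the first call.
sgd = SGDClassifier(random_state=0)
classes = np.unique(y)
for start in range(0, len(X), 200):
    batch = slice(start, start + 200)
    sgd.partial_fit(X[batch], y[batch], classes=classes)


class ColumnClipper(TransformerMixin, BaseEstimator):
    """Hypothetical transformer: clip each column to quantiles
    learned during fit."""

    def __init__(self, quantile=0.01):
        # __init__ only stores parameters, as get_params/set_params
        # and cloning rely on that.
        self.quantile = quantile

    def fit(self, X, y=None):
        X = np.asarray(X)
        # Learned attributes end with an underscore by convention.
        self.lower_ = np.percentile(X, 100 * self.quantile, axis=0)
        self.upper_ = np.percentile(X, 100 * (1 - self.quantile), axis=0)
        return self

    def transform(self, X):
        return np.clip(np.asarray(X), self.lower_, self.upper_)


clipper = ColumnClipper(quantile=0.05)
X_clipped = clipper.fit_transform(X)

print("Trees in forest:", len(forest.estimators_))
print("SGD training accuracy:", sgd.score(X, y))
print("Clipper params:", clipper.get_params())
```

Because `ColumnClipper` follows the estimator conventions, it can be used as a step in a `Pipeline` and its `quantile` parameter can be tuned with `GridSearchCV`.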