Machine Learning using Scikit-Learn
0. Beginning.ipynb
1. Hello, World!.ipynb
2.0 First Impressions of Machine Learning.ipynb
2.1 Supervised Learning - Classification.ipynb
2.2 Supervised Learning - Regression.ipynb
2.3 Unsupervised Learning - Transformations and Dimensionality Reduction.ipynb
2.4 Unsupervised Learning - Clustering.ipynb
2.5 Review of Scikit-learn API.ipynb
3. Validations and Learning Curves.ipynb
4.1 Example - Supervised Spam Classification.ipynb
4.2 Example - Face Recognition.ipynb
5. Where do we go from here.ipynb

Machine Learning using Scikit-Learn


This repository contains notebooks based on the material I used during the BangPypers July meetup. They are written for an audience with little or no experience with scikit-learn and/or machine learning but with some knowledge of Python.


  • Clone this repo git clone
  • If you don't have python-dev, install it using sudo apt-get install python-dev or whatever the equivalent command is for your distribution.
  • Installing the scientific Python stack by hand is generally non-trivial. Fortunately, the conda environment manager, part of the Anaconda Scientific Distribution, handles it for you. The best course of action is to download and install miniconda.
    • Once you have miniconda installed, issue the following commands in your shell:
    • conda install numpy scipy matplotlib scikit-learn ipython-notebook seaborn
    • conda install -c conda-forge ipywidgets
    • Note: The above process requires a good net connection and time. Please do this before coming to the workshop.
  • If you want to simplify the process further, you can install the full-fledged Anaconda distribution instead of following the steps above. (This is the preferred method.)
    • After installing, issue conda install -c conda-forge ipywidgets
  • The data-fetching script in the repo downloads the data required for the Face Recognition example. The dataset is ~230 MB. If you want to follow along during the workshop, run the script with python after cd'ing into the repo directory. If you don't want to download it, you are welcome to just watch the example during the workshop.
  • NOTE: This repo is a work in progress. Issue a git pull before attending the workshop to make sure you are on the latest version.
  • NOTE: If you face any problems during installation, please create an issue on GitHub.
  • That's it.
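
Once the installation steps above are done, a quick import check can confirm that everything is in place before the workshop. The snippet below is a minimal sketch and not part of the repo: the helper name check_modules and the WORKSHOP_MODULES list are made up for illustration, and the list assumes the usual import names of the packages installed above (scikit-learn imports as sklearn, the notebook package as IPython).

```python
from __future__ import print_function  # keeps print() working on Python 2.7

import importlib

# Assumed import names for the packages installed above.
WORKSHOP_MODULES = ["numpy", "scipy", "matplotlib", "sklearn",
                    "IPython", "seaborn", "ipywidgets"]


def check_modules(names):
    """Map each module name to its version string, to "unknown" if it
    imports but exposes no __version__, or to None if the import fails."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
        else:
            report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    # Print one line per package; anything marked MISSING needs installing.
    for name, version in sorted(check_modules(WORKSHOP_MODULES).items()):
        print("{:<12} {}".format(name, version if version else "MISSING"))
```

Any package reported as MISSING can be installed with the conda commands listed above.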


  • Python-2.7
  • Working knowledge of Python


  • This workshop is intended for people with little or no experience of scikit-learn and/or machine learning.
  • Please download the repo and fetch the dependencies before coming to the workshop. Installation takes time that is better spent on the workshop itself.

Credits where credit's due

  • These notebooks owe a lot to the notebooks published by Jake Vanderplas and Andreas Muller, which cover the topics much more extensively. If you want to go further with the black-box approach to scikit-learn, I highly recommend going through their notebooks and screencasts. These tutorials helped me a lot in understanding scikit-learn and its application.

Where to go from here?