
Scikit-Learn tutorial for the UIUC ADSA Data Science summit, based on material for Scipy 2015


vene/adsa_uiuc_sklearn_tutorial

 
 


Scikit-learn Tutorial

Heavily based on (and shortened from) the SciPy 2015 tutorial by Kyle Kastner (@kastnerkyle) and Andreas Mueller (@t3kcit), which was, in turn, based on the SciPy 2013 tutorial by Gael Varoquaux, Olivier Grisel and Jake VanderPlas.

You can find the video recordings of the SciPy 2015 tutorial on YouTube.

Instructor

Installation Notes

First, you need to make sure you have Python on your machine. To check, type:

python -V 

If Python is installed, you should see something like this:

Python 2.7.11

Otherwise, you will see an error message. If you don't have Python, install it as follows:

  • On OS X, install it via brew (if you don't have brew, you can install it here):

    brew install python

  • On Windows use the installer from the official website.

  • On Linux, use the package manager of your distribution (e.g. apt for Ubuntu).

Python 2.7 and 3.4 should both work fine for this tutorial.
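The same version check can also be done from inside Python. The snippet below is a minimal sketch, not part of the tutorial material: the helper name version_supported is hypothetical, and treating 3.x releases newer than 3.4 as acceptable is an assumption, since only 2.7 and 3.4 are named above.

```python
import sys


def version_supported(version_info):
    """Return True for Python 2.7, or for Python 3.4 and later 3.x.

    Accepting 3.x releases newer than 3.4 is an assumption; the
    tutorial itself only names 2.7 and 3.4.
    """
    major, minor = version_info[0], version_info[1]
    if major == 2:
        return minor == 7
    return major == 3 and minor >= 4


# Report on the interpreter that is running this snippet.
print("Python %d.%d supported: %s" % (
    sys.version_info[0],
    sys.version_info[1],
    version_supported(sys.version_info),
))
```

Running this with the interpreter you plan to use for the tutorial confirms that the python command on your path is the one you expect.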

This tutorial will require recent installations of numpy, scipy, matplotlib, scikit-learn and ipython with ipython notebook.

The last one is important: you should be able to type

ipython notebook

in your terminal window and see the notebook panel load in your web browser. Try opening and running a notebook from the material to check that it works. If you don't have ipython, you can install it through pip:

pip install ipython 

For users who do not yet have these packages installed, a relatively painless way to install all the requirements is to use a package such as Anaconda CE, which can be downloaded and installed for free.

After getting the material, you should run python check_env.py to verify your environment.
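For a rough idea of what such an environment check involves, here is a minimal sketch. It is not the contents of check_env.py: the function name check_packages and the REQUIRED list are illustrative assumptions. Note that scikit-learn is imported as sklearn and ipython as IPython.

```python
import importlib

# Import names of the packages listed above (an illustrative list,
# not necessarily what check_env.py actually checks).
REQUIRED = ["numpy", "scipy", "matplotlib", "sklearn", "IPython"]


def check_packages(names):
    """Map each import name to its version string, or None if it cannot be imported."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:  # ImportError, or a broken installation
            found[name] = None
        else:
            found[name] = getattr(module, "__version__", "unknown")
    return found


# Print one line per package so missing ones are easy to spot.
for name, version in check_packages(REQUIRED).items():
    print("%-12s %s" % (name, version if version else "MISSING"))
```

Any package reported as MISSING still needs to be installed.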

If you are missing any packages, use pip to install them. For example, to install numpy:

pip install numpy

Downloading the Tutorial Materials

I would highly recommend using git, not only for this tutorial, but for the general betterment of your life. Once git is installed, you can clone the material in this tutorial by using the git address shown above:

git clone https://github.com/vene/adsa_uiuc_sklearn_tutorial.git

If you can't or don't want to install git, there is a link above to download the contents of this repository as a zip file. We may make minor changes to the repository in the days before the tutorial, however, so cloning the repository is a much better option.

Data Downloads

The data for this tutorial is not included in the repository. We will be using several data sets during the tutorial: most are built into scikit-learn, which includes code that automatically downloads and caches them. Because the wireless network at conferences can often be spotty, it is a good idea to download these data sets before arriving at the conference. Run fetch_data.py to download all necessary data beforehand.
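To illustrate the idea of pre-fetching, here is a minimal sketch of calling scikit-learn's dataset fetchers ahead of time so that later calls read from the local cache. It is not the contents of fetch_data.py: the helpers prefetch and default_fetchers are hypothetical, and the two data sets named are stand-ins, since which data sets the tutorial downloads is not stated here.

```python
def prefetch(fetchers):
    """Call each dataset fetcher once so its data lands in the local cache.

    `fetchers` maps a label to a zero-argument callable. Returns a dict
    mapping each label to True on success, or to the error message on failure.
    """
    results = {}
    for label, fetch in fetchers.items():
        try:
            fetch()
            results[label] = True
        except Exception as exc:  # e.g. no network connection
            results[label] = str(exc)
    return results


def default_fetchers():
    """Example fetchers from scikit-learn.

    Assumption: these two data sets are stand-ins for whatever
    fetch_data.py really downloads.
    """
    from sklearn.datasets import fetch_20newsgroups, fetch_olivetti_faces
    return {
        "20newsgroups": fetch_20newsgroups,
        "olivetti_faces": fetch_olivetti_faces,
    }
```

Calling prefetch(default_fetchers()) while on a reliable connection triggers the downloads; sklearn.datasets.get_data_home() returns the folder where scikit-learn keeps the cached files.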
