Material for Pandas Tutorial at PyData Carolinas 2016

Tutorial Objective

Let's move our data analysis out of spreadsheet programs and learn Python and Pandas along the way.

Don't get me wrong, I use spreadsheets, but not for data analysis.

Also, the notes folder contains some notes from people I talked to during the conference. Click a .md file, and GitHub will render the document on the website (like the file you are reading now).



PyData Carolinas 2016
September 14-16, 2016
Hosted by IBM Emerging Technologies
Research Triangle Park, NC

IBM RTP Activity Center, 3039 East Cornwallis Road, Building 400, Research Triangle, NC 27709

PyData Schedule


Covered in the tutorial

  1. Pandas DataFrame basics
  2. Data assembly
  3. Missing Data
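To give a feel for the three topics before the tutorial, here is a minimal sketch. The `people` and `sites` tables and their columns are made up for illustration and are not one of the tutorial datasets; the tutorial itself may approach each topic differently.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, NOT one of the tutorial datasets.
people = pd.DataFrame({
    "name": ["Ada", "Grace", "Alan"],
    "age": [36, np.nan, 41],  # one age is deliberately missing
})
sites = pd.DataFrame({
    "name": ["Ada", "Grace", "Alan"],
    "site": ["A", "B", "A"],
})

# 1. DataFrame basics: inspect the shape and select a column.
n_rows, n_cols = people.shape
ages = people["age"]

# 2. Data assembly: combine two tables on a shared key column.
combined = people.merge(sites, on="name")

# 3. Missing data: count missing values, then fill them with the column mean.
n_missing = int(combined["age"].isna().sum())
filled = combined["age"].fillna(combined["age"].mean())

print(combined)
print("missing ages:", n_missing)
```

Filling with the mean is only one of several options; dropping the rows with `dropna` is another, and which is appropriate depends on the analysis.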

Not covered in the tutorial

  1. Plotting


The easiest way to get everything you need for the tutorial is to install Anaconda.

You can download and install it here:

I will be using the Python 3 version during the tutorial.

I actually ended up using Python 2 because I had a last-minute computer change.

Install seaborn for plotting

conda install seaborn
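If you want to confirm the setup before the tutorial, a small check like the following may help. It is only a sketch: the `missing_packages` helper is a hypothetical name, and the list of required packages is an assumption based on what this README mentions.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed requirements, taken from the packages named in this README.
required = ["pandas", "seaborn"]
missing = missing_packages(required)

print("Python", sys.version.split()[0])
print("Missing packages:", missing or "none")
```

Run it with the same Python interpreter that Anaconda installed; any package it lists as missing can be installed with `conda install`.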


  1. Gapminder:
  2. Survey: Comes from the Software-Carpentry SQL lesson
  3. Ebola: