# PANDAS
No, not [that kind of pandas](https://www.google.co.uk/search?q=pandas&source=lnms&tbm=isch&sa=X&ved=0CAcQ_AUoAWoVChMI25_rx56IxgIVUSrbCh2BQgA4&biw=960&bih=771). This kind is a [data-analysis library built for Python](http://pandas.pydata.org/).
## So what does that mean?
Well, it's built on a library called [NumPy](http://www.numpy.org/), which is ***the*** scientific computing library for Python (IMHO). It's rather cool, but we only need to worry about it insofar as what it gives to pandas. Pandas is a really powerful library for manipulating and processing tabular and series data; far better than any spreadsheet!

In [27]:
# First we import it.
# The convention is to import it as the shortened 'pd'
import pandas as pd

### First steps
Pandas is all about indexed data, whether in lists ('Series') or in rows and columns ('DataFrames').
The first step is a series. It's better than a list, because it has loads of handy methods ready to go straight out of the box *and* has some really powerful indexing built-in. It combines the properties of the list and the ordered dictionary. Shhhhweeet.

In [28]:
# Building a Series from two Python lists: one for the values, one for the index.
star_trek_captains = ["Kirk", "Picard", "Janeway", "Cisco", "Archer"]
star_trek_shows = ["Original", "Next Gen", "Voyager", "DS9", "Enterprise"]
captains_by_show = pd.Series(data=star_trek_captains, index=star_trek_shows)
# Look up a value by its index label
print(captains_by_show['Original'])
print("-"*10)
# Sort by index label
print(captains_by_show.sort_index())
print("-"*10)
# Sort by value (sort_values() replaces the older order() method)
print(captains_by_show.sort_values())

Kirk
----------
DS9             Cisco
Enterprise     Archer
Next Gen       Picard
Original         Kirk
Voyager       Janeway
dtype: object
----------
Enterprise     Archer
DS9             Cisco
Voyager       Janeway
Original         Kirk
Next Gen       Picard
dtype: object
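
To make the "list plus dictionary" idea a bit more concrete, here is a minimal sketch using a cut-down version of the captains Series. The variable names (`captains`, `by_label` and so on) are just for illustration.

```python
import pandas as pd

captains = pd.Series(
    data=["Kirk", "Picard", "Janeway"],
    index=["Original", "Next Gen", "Voyager"],
)

# Dictionary-style: look up a value by its label
by_label = captains["Next Gen"]

# List-style: look up a value by its position (.iloc is purely positional)
by_position = captains.iloc[1]

# Dictionary-style membership test: 'in' checks the index labels, not the values
has_voyager = "Voyager" in captains

# List-style slicing by position; the labels stay attached to the values
first_two = captains.iloc[:2]

print(by_label, by_position, has_voyager)
print(first_two)
```

Both lookups land on the same entry; which one you reach for depends on whether you know the label or the position.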


So, we've got our captains, indexed by the show they were on. But what if we want to add some extra info? Like how many episodes they all did? It's the same *index*, because it's by show, but it won't fit in this **Series**, because a Series holds just one list of values (here, the names). Let's make a new **Series** indexed by the same shows, and see about combining them.

In [29]:
series_episodes = [79, 176, 168, 173, 97]
episode_count_by_show = pd.Series(data=series_episodes, index=star_trek_shows)
episode_count_by_show

Original       79
Next Gen      176
Voyager       168
DS9           173
Enterprise     97
dtype: int64

In [30]:
# So, now we have two different series. Combining them could be tricky...
# Or it could be really easy, because they share the same index!
star_trek = pd.concat([captains_by_show, episode_count_by_show], axis=1)
star_trek

                  0    1
Original       Kirk   79
Next Gen     Picard  176
Voyager     Janeway  168
DS9           Cisco  173
Enterprise   Archer   97
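
Why does a shared index make this easy? As far as I understand it, `pd.concat` with `axis=1` lines the rows up by index *label*, not by position. Here's a small sketch with cut-down data where the second Series is deliberately in a different order (the names `captains`, `episodes` and `combined` are just for this example).

```python
import pandas as pd

captains = pd.Series(
    ["Kirk", "Picard", "Janeway"],
    index=["Original", "Next Gen", "Voyager"],
)

# Same labels, deliberately shuffled
episodes = pd.Series(
    [168, 79, 176],
    index=["Voyager", "Original", "Next Gen"],
)

# Rows are matched up by label, so each captain still gets the right count
combined = pd.concat([captains, episodes], axis=1)
print(combined)

# Unnamed Series end up as columns 0 and 1
print(combined.loc["Voyager", 1])
```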


Ah hah! Looks like a table! Shame about the columns being labeled 0 and 1. If only pandas had a way we could name the columns...

In [31]:
star_trek.columns = ['Captain', 'Episodes']
star_trek

            Captain  Episodes
Original       Kirk        79
Next Gen     Picard       176
Voyager     Janeway       168
DS9           Cisco       173
Enterprise   Archer        97
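
Renaming after the fact works fine, but you can also name the columns as you build the table. A minimal sketch of two ways to do that, again with cut-down data; `via_concat` and `via_dict` are just illustrative names.

```python
import pandas as pd

shows = ["Original", "Next Gen", "Voyager"]
captains = pd.Series(["Kirk", "Picard", "Janeway"], index=shows)
episodes = pd.Series([79, 176, 168], index=shows)

# Option 1: pass keys= to concat; with axis=1 the keys become the column names
via_concat = pd.concat([captains, episodes], axis=1, keys=["Captain", "Episodes"])

# Option 2: build the DataFrame from a dict mapping column name -> Series
via_dict = pd.DataFrame({"Captain": captains, "Episodes": episodes})

print(via_concat)
print(via_dict)
```

Which one reads better is mostly a matter of taste; the dict version is probably the more common sight.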


Well, whad'dya know. So what does a dataframe let us do?

In [32]:
# Get to individual columns
star_trek['Captain']

Original         Kirk
Next Gen       Picard
Voyager       Janeway
DS9             Cisco
Enterprise     Archer
Name: Captain, dtype: object

In [33]:
# We can even add columns
starting_year = [1966, 1987, 1995, 1993, 2001]
star_trek['First Aired'] = starting_year
star_trek

            Captain  Episodes  First Aired
Original       Kirk        79         1966
Next Gen     Picard       176         1987
Voyager     Janeway       168         1995
DS9           Cisco       173         1993
Enterprise   Archer        97         2001
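
Adding a column from a plain list relies on the list being in the same order as the rows. Two other options, sketched here on cut-down data: add a column from a Series (which gets matched to rows by label), or compute one from an existing column. The column name 'Over 100 Episodes' is made up for this example.

```python
import pandas as pd

star_trek = pd.DataFrame(
    {"Episodes": [79, 176, 168]},
    index=["Original", "Next Gen", "Voyager"],
)

# From a Series: values are matched to rows by index label,
# so the order of the Series doesn't matter
first_aired = pd.Series({"Voyager": 1995, "Original": 1966, "Next Gen": 1987})
star_trek["First Aired"] = first_aired

# Computed from an existing column, one result per row
star_trek["Over 100 Episodes"] = star_trek["Episodes"] > 100

print(star_trek)
```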


In [37]:
# We can get to rows by position with 'iloc'
# ('ix' did both jobs in old pandas, but it has since been removed)
star_trek.iloc[0]

Captain        Kirk
Episodes         79
First Aired    1966
Name: Original, dtype: object

In [38]:
# More than one at the same time, in fact
star_trek.iloc[0:2]

         Captain  Episodes  First Aired
Original     Kirk        79         1966
Next Gen   Picard       176         1987


In [40]:
# We can EVEN pick columns AND rows
star_trek.iloc[0:1, 1:3]

          Episodes  First Aired
Original        79         1966


In [43]:
# BY NAME, using 'loc' for labels
star_trek.loc[['Voyager', 'DS9'], 'Episodes']

Voyager    168
DS9        173
Name: Episodes, dtype: int64
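
One gotcha worth a quick sketch: slicing by position and slicing by label don't treat the end point the same way. In current pandas, position-based access is `.iloc` and label-based access is `.loc`. The example below uses a cut-down copy of the table.

```python
import pandas as pd

star_trek = pd.DataFrame(
    {"Captain": ["Kirk", "Picard", "Janeway"], "Episodes": [79, 176, 168]},
    index=["Original", "Next Gen", "Voyager"],
)

# Positional slice: the end position is excluded, just like a Python list
by_position = star_trek.iloc[0:1]

# Label slice: the end label is INCLUDED
by_label = star_trek.loc["Original":"Next Gen"]

# A single cell, by row label and column label
voyager_episodes = star_trek.loc["Voyager", "Episodes"]

print(len(by_position), len(by_label), voyager_episodes)
```

So `iloc[0:1]` gives one row, while `loc["Original":"Next Gen"]` gives two.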

So, that's the basics of dataframes. In the next section we'll have a look at a bigger dataframe, that we'll load from a file. FANCY