# Pandas Demo

![Pandas Logo](images/pandas_logo.png)

This demo is adapted from Brandon Rhodes's Pandas tutorial at PyCon 2015.

[https://github.com/brandon-rhodes/pycon-pandas-tutorial](https://github.com/brandon-rhodes/pycon-pandas-tutorial)

It's a really good tutorial, and worth walking through if you want a sense of what you can do with Pandas.

In [None]:
from IPython.display import YouTubeVideo
YouTubeVideo("5JnMutdy6Fw")

In [None]:
from IPython.display import HTML
# Load the custom table stylesheet and inject it into the notebook
with open('style_table.css') as f:
    css = f.read()
HTML('<style>{}</style>'.format(css))

In [None]:
import pandas as pd

In [None]:
import seaborn
%matplotlib inline

In [None]:
titles = pd.read_csv('data/titles.csv')
titles.head()

In [None]:
cast = pd.read_csv('data/cast.csv')
cast.head()

## What are the ten most common movie names of all time?

In [None]:
titles.title.value_counts().head(10)

## Which three years of the 1930s saw the most films released?

In [None]:
t = titles
# Floor division by 10 maps every year from 1930 to 1939 to 193
t = t[t.year // 10 == 193]
t.year.value_counts().head(3)

## Plot the number of films that have been released each decade over the history of cinema.

In [None]:
t = titles
(t.year // 10 * 10).value_counts().sort_index().plot(kind='bar')
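
The decade bucketing in the cell above can be checked on a small example. This is a minimal sketch on made-up years (the `years` Series is hypothetical, not taken from `titles.csv`): floor division by 10 drops the last digit, multiplying by 10 gives the decade label, and `sort_index()` puts the counts in chronological order instead of ordering them by frequency.

```python
import pandas as pd

# Hypothetical release years, made up for illustration only.
years = pd.Series([1931, 1935, 1939, 1940, 1948, 1951])

# 1931 // 10 == 193, and 193 * 10 == 1930, so each year maps to its decade.
decades = years // 10 * 10

# value_counts() orders by frequency; sort_index() reorders by decade.
per_decade = decades.value_counts().sort_index()
print(per_decade)
```

Without `sort_index()`, the bar chart would show the decades ordered by film count rather than by time.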

## What are the 11 most common character names in movie history?

In [None]:
cast.character.value_counts().head(11)

## Who are the 10 people most often credited as "Herself" in film history?

In [None]:
c = cast
c[c.character == 'Herself'].name.value_counts().head(10)

## Who are the 10 people most often credited as "Himself" in film history?

In [None]:
c = cast
h = c[c.character == 'Himself'].name.value_counts()
h.head(10)

In [None]:
# Rank 1 is the most-credited name; method='first' gives tied names distinct ranks
h.rank(ascending=False, method='first')['Donald Trump']
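
To see what the `rank` call does, here is a small sketch on invented counts (the names and numbers in `counts` are hypothetical). With `ascending=False` the largest count gets rank 1, and `method='first'` assigns tied values consecutive ranks instead of an averaged one.

```python
import pandas as pd

# Hypothetical credit counts, shaped like the output of value_counts().
counts = pd.Series({'Alice': 9, 'Bob': 5, 'Carol': 5, 'Dave': 2})

# Largest count ranks first; ties receive distinct consecutive ranks.
ranks = counts.rank(ascending=False, method='first')
print(ranks)

# Look up a single person's position by index label.
print(ranks['Dave'])
```

The ranks come back as floats, so wrap the lookup in `int(...)` if a whole-number position is wanted.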

## Plot the n-values of the roles that Judi Dench has played over her career.


In [None]:
c = cast
c = c[c.name == 'Judi Dench'].sort_values(by='year')
c = c[c.n.notnull()]
c.plot(x='year', y='n', kind='scatter')

## List each of the characters that Frank Oz has portrayed at least twice.

In [None]:
c = cast
c = c[c.name == 'Frank Oz']
g = c.groupby(['character']).size()
g[g > 1].sort_values()
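
The "at least twice" filter above follows a general pattern: count rows per group with `groupby(...).size()`, then keep only the groups whose count passes a threshold. A minimal sketch, assuming a made-up `credits` frame with placeholder character names:

```python
import pandas as pd

# Hypothetical credits with placeholder character names.
credits = pd.DataFrame({'character': ['A', 'B', 'A', 'C', 'A', 'B']})

# size() returns a Series indexed by character, holding the row count per group.
g = credits.groupby(['character']).size()

# A boolean mask on the Series keeps only characters that appear more than once.
repeated = g[g > 1].sort_values()
print(repeated)
```

Here `C` appears only once, so it is dropped, and the remaining characters are listed from least to most frequent.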

## How many years in film history have been Superman years?

A "Superman year" is a year whose films feature more Superman characters than Batman characters.

In [None]:
c = cast
c = c[(c.character == 'Superman') | (c.character == 'Batman')]
c = c.groupby(['year', 'character']).size()
c = c.unstack()
c = c.fillna(0)
c.head()

In [None]:
d = c.Superman - c.Batman
print('Superman years:')
print(len(d[d > 0.0]))
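
The reshaping steps are easier to follow on a tiny table. This sketch uses an invented `roles` frame (the years and counts are hypothetical): `size()` counts rows per (year, character) pair, `unstack()` pivots the characters into columns, and `fillna(0)` replaces the gaps left for years where one character did not appear.

```python
import pandas as pd

# Hypothetical roles, made up for illustration only.
roles = pd.DataFrame({
    'year': [1978, 1978, 1989, 1989, 1989, 1995],
    'character': ['Superman', 'Superman', 'Batman',
                  'Batman', 'Superman', 'Batman'],
})

# Count rows per (year, character) pair.
counts = roles.groupby(['year', 'character']).size()

# Pivot the inner index level into columns: one row per year.
table = counts.unstack()

# Years with no Batman (or no Superman) come out as NaN; treat them as zero.
table = table.fillna(0)

# Positive means more Superman roles that year, negative means more Batman roles.
diff = table.Superman - table.Batman
print(table)
print('Superman years:', int((diff > 0).sum()))
```

Skipping `fillna(0)` would leave `NaN` in the difference for any year with only one of the two characters, and those years would then match neither `d > 0` nor `d < 0`.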

## How many years have been "Batman years"?

A "Batman year" is the reverse: a year whose films feature more Batman characters than Superman characters.

In [None]:
print('Batman years:')
print(len(d[d < 0.0]))

![Pandas Cheatsheet](images/pandas_cheat_sheet.jpg)