# Common DataFrame Operations

Let's import Pandas and load some sample data:

In [None]:
import pandas as pd

rows = [
['Aberdeen Township', 18150, 19, 0, 13, 6],
['Absecon', 8380, 21, 0, 4, 15],
['Allendale', 6712, 0, 0, 0, 0],
['Allenhurst', 493, 0, 0, 0, 0],
['Allentown', 1812, 3, 0, 0, 3],
['Alpine', 2314, 1, 0, 0, 1],
['Andover Township', 6273, 1, 0, 0, 1],
]
df = pd.DataFrame.from_records(
    rows,
    columns=['City', 'Population', 'Violent Crimes', 'Murders', 'Robberies', 'Aggravated Assaults'],
)

df

We're going to demonstrate a few of the basic things Pandas can do with a `DataFrame`. A complete list can be found in the [`DataFrame` API reference](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html).

## Functions that Operate on the Entire DF

In [None]:
df.sum()
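
Other whole-frame reductions follow the same pattern as `sum()`. The sketch below is a minimal illustration on a hypothetical three-row, numeric-only sample (the name `sample` and its values are assumptions for this example, not part of the dataset above); each call returns one result per column.

```python
import pandas as pd

# Hypothetical three-row sample with numeric columns only.
sample = pd.DataFrame({
    'Population': [8380, 2314, 6712],
    'Violent Crimes': [21, 1, 0],
})

column_totals = sample.sum()    # one total per column
column_means = sample.mean()    # one average per column
column_maxima = sample.max()    # largest value in each column
summary = sample.describe()     # count, mean, std, min, quartiles, max

print(column_totals)
print(summary)
```

`describe()` is often a convenient first look, since it bundles several of these reductions into a single table.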

## Grouping and Indexing

Sometimes it's useful to group data. Some columns aren't really data per se, but rather labels. In our case, `City` is a label: looking at the output of `df.sum()` above, it doesn't make much sense to treat it like data. To handle this, Pandas gives us an `index`, which marks such labels as identifiers for each row rather than values to be operated on.

In [None]:
df = df.set_index('City')

df

In [None]:
df.sum()
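
Once a label column becomes the index, it can also be used to look rows up by name. The following is a small sketch on a hypothetical two-row frame (`sample` and the chosen columns are assumptions for illustration); it shows `.loc` for label-based lookup and `reset_index()` for turning the index back into an ordinary column.

```python
import pandas as pd

# Hypothetical two-row sample, indexed by city name.
sample = pd.DataFrame({
    'City': ['Absecon', 'Alpine'],
    'Population': [8380, 2314],
    'Murders': [0, 0],
}).set_index('City')

# The index is no longer one of the data columns.
print(list(sample.columns))

# Look up a single value by row label and column name.
alpine_population = sample.loc['Alpine', 'Population']
print(alpine_population)

# reset_index() moves the index back into a regular column.
restored = sample.reset_index()
print(list(restored.columns))
```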

In [None]:
# Label each city 'A' if its population is above 5,000, otherwise 'B'.
df['Cohort'] = df['Population'].apply(lambda x: 'A' if x > 5000 else 'B')

df

In [None]:
df.groupby('Cohort').sum()
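
A group does not have to be reduced with a single function. As a rough sketch, assuming a hypothetical four-row frame that already has a `Cohort` column, `agg` with named aggregations can compute a different statistic per output column (the output names `cities`, `total_population`, and `mean_violent_crimes` are made up for this example).

```python
import pandas as pd

# Hypothetical sample with a pre-assigned cohort label per row.
sample = pd.DataFrame({
    'Cohort': ['A', 'A', 'B', 'B'],
    'Population': [18150, 8380, 493, 1812],
    'Violent Crimes': [19, 21, 0, 3],
})

# Each keyword is an output column: (source column, aggregation function).
summary = sample.groupby('Cohort').agg(
    cities=('Population', 'count'),
    total_population=('Population', 'sum'),
    mean_violent_crimes=('Violent Crimes', 'mean'),
)

print(summary)
```

The result is indexed by `Cohort`, just like the output of `groupby('Cohort').sum()`.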

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

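# Total each column per cohort.
cohort_df = df.groupby('Cohort').sum()
# Drop Population so its much larger values don't dwarf the crime counts on the shared axis.
cohort_df = cohort_df.drop('Population', axis=1)
cohort_df.plot(kind='bar')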