# Data Aggregation and Group Operations
pandas supports a number of grouped operations, which can use any function that accepts a pandas object or NumPy array:
* Split a pandas object into pieces using one or more keys
* Compute group summary statistics
* Apply a varying set of functions to each column of a DataFrame
* Apply within-group transformations or other manipulations
* Compute pivot tables and cross-tabulations
* Perform quantile analysis and other data-derived group analysis

## GroupBy Mechanics
The term _split-apply-combine_ can be decomposed as follows:
* First, data contained in a pandas object is __split__ into groups based on one or more _keys_.
* Then, a function is __applied__ to each group, producing a new value.
* Finally, the results of all those function applications are __combined__ into a result object.
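
The three steps can be sketched by hand. This is only an illustration of the idea, not how pandas implements it internally, and the frame `toy` with its values is made up for this example.

```python
import pandas as pd

# Hypothetical toy data, not the DataFrame used elsewhere in this section.
toy = pd.DataFrame({'key': ['a', 'a', 'b', 'b', 'a'],
                    'value': [1.0, 2.0, 3.0, 4.0, 6.0]})

# Split: one sub-Series of values per distinct key.
pieces = {k: toy.loc[toy['key'] == k, 'value'] for k in toy['key'].unique()}

# Apply: reduce each piece to a single number.
means = {k: piece.mean() for k, piece in pieces.items()}

# Combine: assemble the per-group results into one Series.
manual = pd.Series(means, name='value').sort_index()
manual.index.name = 'key'

# The same computation expressed with groupby.
via_groupby = toy.groupby('key')['value'].mean()
```

Both `manual` and `via_groupby` hold one mean per key; `groupby` simply carries out the split, apply, and combine steps in a single call.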

In [1]:
import pandas as pd
import numpy as np
df = pd.DataFrame({'key1' : ['a'] * 2 + ['b'] * 2 + ['a'],
                   'key2' : ['one', 'two', 'one', 'two', 'one'],
                   'data1' : np.random.randn(5), 
                   'data2' : np.random.randn(5)})
df

,data1,data2,key1,key2
0,-0.104501,-0.443625,a,one
1,0.199215,-0.355202,a,two
2,0.137974,-1.678113,b,one
3,0.277403,2.119903,b,two
4,-0.311725,-0.702552,a,one


In [2]:
grouped = df['data1'].groupby(df['key1'])
grouped

<pandas.core.groupby.SeriesGroupBy object at 0x7fb03021fa90>
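
The GroupBy object shown above has not computed anything yet; it only holds the information needed to apply an operation to each group. One way to look inside is to iterate over it, which yields `(name, group)` pairs. A small sketch with made-up values (`s` and `keys` are hypothetical stand-ins for `df['data1']` and `df['key1']`):

```python
import pandas as pd

# Hypothetical, deterministic stand-ins for the random data in this section.
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], name='data1')
keys = pd.Series(['a', 'a', 'b', 'b', 'a'], name='key1')

grouped = s.groupby(keys)

# Iterating yields the group key and the sub-Series belonging to that key.
group_sizes = {}
for name, group in grouped:
    group_sizes[name] = len(group)
    print(name, group.tolist())
```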

In [3]:
grouped.mean()

key1
a   -0.072337
b    0.207688
Name: data1, dtype: float64

In [4]:
df['data1'].groupby([df['key1'], df['key2']]).mean()

key1  key2
a     one    -0.208113
      two     0.199215
b     one     0.137974
      two     0.277403
Name: data1, dtype: float64

In [5]:
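# numeric_only=True skips the non-numeric key2 column; recent pandas versions raise an error without it
df.groupby('key1').mean(numeric_only=True)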

key1,data1,data2
a,-0.072337,-0.50046
b,0.207688,0.220895


In [6]:
df.groupby(['key1', 'key2']).size()

key1  key2
a     one     2
      two     1
b     one     1
      two     1
dtype: int64