# Groupby Method

- **`groupby` lets you group rows that share the same value in a column and apply an aggregate function to each group. An aggregate function takes multiple inputs and returns a single output *(for example, sum, mean, or count)*.**

In [1]:
import numpy as np
import pandas as pd

In [10]:
data = { 'Company': ['GOOG','GOOG','MSFT','MSFT','FB','FB'],
        'Person' : ['Sam', 'Charlie', 'Ani', 'Vanessa', 'Clay', 'Sarah'],
        'Sales' : [200,120,340,124,243,350]
       }

In [11]:
df = pd.DataFrame(data)
df

,Company,Person,Sales
0,GOOG,Sam,200
1,GOOG,Charlie,120
2,MSFT,Ani,340
3,MSFT,Vanessa,124
4,FB,Clay,243
5,FB,Sarah,350


In [12]:
# Group rows together based on the values in a column

byComp = df.groupby('Company')
byComp

<pandas.core.groupby.generic.DataFrameGroupBy object at 0x0000024CFC4A4580>
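
The object printed above is lazy: nothing is computed until an aggregate is called on it. As a rough sketch of what it holds, the snippet below rebuilds the same sample `df` and looks inside the groups. The names `n_groups`, `group_rows` and `fb_rows` are only illustrative and are not part of the original notebook.

```python
import pandas as pd

# Same sample data as in the cells above
df = pd.DataFrame({
    'Company': ['GOOG', 'GOOG', 'MSFT', 'MSFT', 'FB', 'FB'],
    'Person': ['Sam', 'Charlie', 'Ani', 'Vanessa', 'Clay', 'Sarah'],
    'Sales': [200, 120, 340, 124, 243, 350],
})
byComp = df.groupby('Company')

# Number of groups, and the row labels that belong to each group
n_groups = byComp.ngroups
group_rows = {name: list(idx) for name, idx in byComp.groups.items()}

# Pull out one group as an ordinary DataFrame
fb_rows = byComp.get_group('FB')

# Iterating yields (group name, sub-DataFrame) pairs, sorted by group key
for name, group in byComp:
    print(name, group['Sales'].tolist())
```

Each group is just a slice of the original rows, which is why any aggregate can later be applied group by group.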

In [13]:
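# numeric_only=True skips the text column 'Person'; pandas 2.0+ raises a TypeError without it
byComp.mean(numeric_only=True)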

Company,Sales
FB,296.5
GOOG,160.0
MSFT,232.0


In [27]:
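# numeric_only=True keeps only 'Sales'; without it, pandas 2.0+ also concatenates the 'Person' strings
byComp.sum(numeric_only=True)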

Company,Sales
FB,593
GOOG,320
MSFT,464


In [23]:
# Standard deviation of each group (sample standard deviation, ddof=1)
byComp.std(numeric_only=True)

Company,Sales
FB,75.660426
GOOG,56.568542
MSFT,152.735065
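
The three aggregates above can also be computed in one call. This is a minimal sketch, assuming the same sample `df`; the name `sales_stats` is made up for illustration. Selecting the `Sales` column first avoids having to deal with the text column at all.

```python
import pandas as pd

# Same sample data as in the cells above
df = pd.DataFrame({
    'Company': ['GOOG', 'GOOG', 'MSFT', 'MSFT', 'FB', 'FB'],
    'Person': ['Sam', 'Charlie', 'Ani', 'Vanessa', 'Clay', 'Sarah'],
    'Sales': [200, 120, 340, 124, 243, 350],
})

# Select the numeric column, then ask for several aggregates at once;
# the result has one row per company and one column per aggregate
sales_stats = df.groupby('Company')['Sales'].agg(['mean', 'sum', 'std'])
print(sales_stats)
```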


In [20]:
# Every person appears only once, so each group has a single row and the mean is just that row's Sales value
df.groupby('Person').mean(numeric_only=True)

Person,Sales
Ani,340.0
Charlie,120.0
Clay,243.0
Sam,200.0
Sarah,350.0
Vanessa,124.0
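
One way to see why grouping by `Person` changes nothing except the row order is to check the group sizes. A small sketch, assuming the same sample `df`; `by_person`, `sizes` and `means` are illustrative names.

```python
import pandas as pd

# Same sample data as in the cells above
df = pd.DataFrame({
    'Company': ['GOOG', 'GOOG', 'MSFT', 'MSFT', 'FB', 'FB'],
    'Person': ['Sam', 'Charlie', 'Ani', 'Vanessa', 'Clay', 'Sarah'],
    'Sales': [200, 120, 340, 124, 243, 350],
})
by_person = df.groupby('Person')

# size() counts the rows in each group; every person has exactly one row
sizes = by_person.size()

# With one row per group, the mean is the original value (as a float)
means = by_person['Sales'].mean()
print(sizes)
print(means)
```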


In [29]:
# count() works on any dtype, so the text column 'Person' appears in the result
df.groupby('Company').count()
# max() and min() also keep the 'Person' column, because strings can be ordered

Company,Person,Sales
FB,2,2
GOOG,2,2
MSFT,2,2
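
To illustrate the remark about `max()` and `min()`, here is a short sketch on the same sample `df` (the names `maxes` and `mins` are illustrative). Note that each column is aggregated independently, so a row of the result need not match any single original row.

```python
import pandas as pd

# Same sample data as in the cells above
df = pd.DataFrame({
    'Company': ['GOOG', 'GOOG', 'MSFT', 'MSFT', 'FB', 'FB'],
    'Person': ['Sam', 'Charlie', 'Ani', 'Vanessa', 'Clay', 'Sarah'],
    'Sales': [200, 120, 340, 124, 243, 350],
})

# Strings are compared alphabetically, numbers numerically
maxes = df.groupby('Company').max()
mins = df.groupby('Company').min()

# For MSFT the result pairs 'Vanessa' (alphabetical max) with 340 (Ani's sale):
# the two columns are aggregated separately
print(maxes)
print(mins)
```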


In [26]:
# Chain the methods to do everything on one line
df.groupby("Company").sum(numeric_only=True).loc['FB']

Sales    593
Name: FB, dtype: int64
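
If only one number is needed, the chain can select the column before aggregating, which returns a scalar instead of a Series. A minimal sketch on the same sample `df`; `fb_total` and `fb_total_mask` are illustrative names, and the boolean-mask version is shown only as a cross-check.

```python
import pandas as pd

# Same sample data as in the cells above
df = pd.DataFrame({
    'Company': ['GOOG', 'GOOG', 'MSFT', 'MSFT', 'FB', 'FB'],
    'Person': ['Sam', 'Charlie', 'Ani', 'Vanessa', 'Clay', 'Sarah'],
    'Sales': [200, 120, 340, 124, 243, 350],
})

# Pick the column first, aggregate, then look up one group
fb_total = df.groupby('Company')['Sales'].sum().loc['FB']

# Cross-check without groupby: filter the rows, then sum
fb_total_mask = df.loc[df['Company'] == 'FB', 'Sales'].sum()

print(fb_total, fb_total_mask)
```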

---

### Using `.describe()` to get several summary statistics at once:

In [41]:
df.groupby('Company').describe()

,Sales,Sales,Sales,Sales,Sales,Sales,Sales,Sales
Company,count,mean,std,min,25%,50%,75%,max
FB,2.0,296.5,75.660426,243.0,269.75,296.5,323.25,350.0
GOOG,2.0,160.0,56.568542,120.0,140.0,160.0,180.0,200.0
MSFT,2.0,232.0,152.735065,124.0,178.0,232.0,286.0,340.0
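
The quartile columns in this table come from linear interpolation between the sorted values. As a hedged check on the `FB` row, the sketch below recomputes the 25% value by hand on the same sample `df`; `fb_sales`, `q25`, `manual_q25` and `summary` are illustrative names.

```python
import pandas as pd

# Same sample data as in the cells above
df = pd.DataFrame({
    'Company': ['GOOG', 'GOOG', 'MSFT', 'MSFT', 'FB', 'FB'],
    'Person': ['Sam', 'Charlie', 'Ani', 'Vanessa', 'Clay', 'Sarah'],
    'Sales': [200, 120, 340, 124, 243, 350],
})

# The two FB sales values: 243 and 350
fb_sales = df.loc[df['Company'] == 'FB', 'Sales']

# pandas' default quantile uses linear interpolation
q25 = fb_sales.quantile(0.25)

# With two values, the 25% point lies a quarter of the way from min to max
manual_q25 = fb_sales.min() + 0.25 * (fb_sales.max() - fb_sales.min())

# Selecting the column first gives describe() output with flat column names
summary = df.groupby('Company')['Sales'].describe()
print(q25, manual_q25)
print(summary)
```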


### Transposing a DataFrame:
> Syntax: `df_name.T` OR `df_name.transpose()`

In [43]:
df

,Company,Person,Sales
0,GOOG,Sam,200
1,GOOG,Charlie,120
2,MSFT,Ani,340
3,MSFT,Vanessa,124
4,FB,Clay,243
5,FB,Sarah,350


In [44]:
df.T

,0,1,2,3,4,5
Company,GOOG,GOOG,MSFT,MSFT,FB,FB
Person,Sam,Charlie,Ani,Vanessa,Clay,Sarah
Sales,200,120,340,124,243,350
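
A quick sketch of what transposing does to the shape and labels, using the same sample `df`. The variable names are illustrative. Because each transposed column now mixes text and numbers, pandas typically stores them as generic objects rather than integers.

```python
import pandas as pd

# Same sample data as in the cells above
df = pd.DataFrame({
    'Company': ['GOOG', 'GOOG', 'MSFT', 'MSFT', 'FB', 'FB'],
    'Person': ['Sam', 'Charlie', 'Ani', 'Vanessa', 'Clay', 'Sarah'],
    'Sales': [200, 120, 340, 124, 243, 350],
})

transposed = df.T

# Rows and columns swap: (6, 3) becomes (3, 6)
shape_before = df.shape
shape_after = transposed.shape

# .T is shorthand for .transpose(); both give the same result
same_result = df.T.equals(df.transpose())

# The old column names become the new row labels
row_labels = list(transposed.index)
print(shape_before, shape_after, same_result, row_labels)
```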


In [45]:
df.groupby('Company').describe().transpose()

,Company,FB,GOOG,MSFT
Sales,count,2.0,2.0,2.0
Sales,mean,296.5,160.0,232.0
Sales,std,75.660426,56.568542,152.735065
Sales,min,243.0,120.0,124.0
Sales,25%,269.75,140.0,178.0
Sales,50%,296.5,160.0,232.0
Sales,75%,323.25,180.0,286.0
Sales,max,350.0,200.0,340.0


In [None]:
# Select one company's column from the transposed summary ('FB' is just an example key)
df.groupby('Company').describe().transpose()['FB']
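
After transposing, the companies are the columns and the rows carry a two-level label such as `('Sales', 'mean')`. The sketch below shows one possible way to pick values out of that layout on the same sample `df`. The names `summary_t`, `fb_col` and `fb_mean` are illustrative, and `'FB'` is just an example key.

```python
import pandas as pd

# Same sample data as in the cells above
df = pd.DataFrame({
    'Company': ['GOOG', 'GOOG', 'MSFT', 'MSFT', 'FB', 'FB'],
    'Person': ['Sam', 'Charlie', 'Ani', 'Vanessa', 'Clay', 'Sarah'],
    'Sales': [200, 120, 340, 124, 243, 350],
})

# Double brackets keep 'Sales' as a DataFrame column, so the row labels
# after transposing are pairs like ('Sales', 'count')
summary_t = df.groupby('Company')[['Sales']].describe().transpose()

# One company's statistics as a Series
fb_col = summary_t['FB']

# A single statistic: row label is a (column, statistic) pair
fb_mean = summary_t.loc[('Sales', 'mean'), 'FB']
print(fb_col)
print(fb_mean)
```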