<center>
    This is Set 6: Grouping Methods (Exercises 51-60)
</center>

**Prerequisites**
* The sample dataset [penguins](https://github.com/mwaskom/seaborn-data/blob/master/penguins.csv) from seaborn will be used for the exercises.
* The `seaborn` package must be installed, since it provides `load_dataset`.

In [1]:
import pandas as pd
from seaborn import load_dataset

# load_dataset already returns a pandas DataFrame, so no extra wrapping is needed
data = load_dataset('penguins')
data.head()

| | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex |
|---|---|---|---|---|---|---|---|
| 0 | Adelie | Torgersen | 39.1 | 18.7 | 181.0 | 3750.0 | Male |
| 1 | Adelie | Torgersen | 39.5 | 17.4 | 186.0 | 3800.0 | Female |
| 2 | Adelie | Torgersen | 40.3 | 18.0 | 195.0 | 3250.0 | Female |
| 3 | Adelie | Torgersen | NaN | NaN | NaN | NaN | NaN |
| 4 | Adelie | Torgersen | 36.7 | 19.3 | 193.0 | 3450.0 | Female |


**Exercise 51: Calculate the number of rows of each `species` in `data`**

In [2]:
data.groupby('species')['species'].count()

species
Adelie       152
Chinstrap     68
Gentoo       124
Name: species, dtype: int64
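
Note that `count()` counts non-null values, whereas `size()` counts rows. The two agree here because `species` has no missing values, but they can differ on a column that does. A minimal sketch of the difference, using a small made-up frame (`toy` is hypothetical, not the penguins data):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with missing values in body_mass_g
toy = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo", "Gentoo", "Gentoo"],
    "body_mass_g": [3750.0, np.nan, 5000.0, 5200.0, np.nan],
})

# size() counts every row in each group, nulls included
rows_per_species = toy.groupby("species").size()

# count() counts only the non-null values of the selected column
non_null_mass = toy.groupby("species")["body_mass_g"].count()

print(rows_per_species)
print(non_null_mass)
```

If the goal is strictly "number of rows", `size()` is arguably the safer choice.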

**Exercise 52: Calculate the sum of `body_mass_g` by `island` in `data`**

In [3]:
data.groupby('island')['body_mass_g'].sum()

island
Biscoe       787575.0
Dream        460400.0
Torgersen    189025.0
Name: body_mass_g, dtype: float64

**Exercise 53: Calculate the mean of `body_mass_g` by non-null `sex` in `data`**

In [4]:
data.groupby('sex')['body_mass_g'].mean()

sex
Female    3862.272727
Male      4545.684524
Name: body_mass_g, dtype: float64
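
No explicit filtering is needed above because `groupby` drops rows whose key is null by default. The sketch below illustrates this on a made-up frame (`toy` is hypothetical) and shows how `dropna=False`, available in pandas 1.1 and later, would keep the null group instead:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with missing values in the grouping column
toy = pd.DataFrame({
    "sex": ["Male", "Female", np.nan, "Male", np.nan],
    "body_mass_g": [4000.0, 3500.0, 3000.0, 4400.0, 3200.0],
})

# Default behaviour: rows with a null key are excluded from the result
mean_non_null = toy.groupby("sex")["body_mass_g"].mean()

# dropna=False keeps the null key as its own group
mean_with_null = toy.groupby("sex", dropna=False)["body_mass_g"].mean()

print(mean_non_null)
print(mean_with_null)
```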

**Exercise 54: Calculate the maximum and minimum of `flipper_length_mm` together by `species` in `data`**

In [5]:
data.groupby('species')['flipper_length_mm'].agg(["min", "max"])

| species | min | max |
|---|---|---|
| Adelie | 172.0 | 210.0 |
| Chinstrap | 178.0 | 212.0 |
| Gentoo | 203.0 | 231.0 |
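
As a possible variation, named aggregation lets the output columns carry more descriptive names than `min` and `max`. A small sketch on made-up data (`toy`, `shortest`, and `longest` are illustrative names, not part of the original exercise):

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo", "Gentoo"],
    "flipper_length_mm": [181.0, 195.0, 210.0, 230.0],
})

# Each keyword becomes an output column; its value is the aggregation to apply
flipper_range = toy.groupby("species")["flipper_length_mm"].agg(
    shortest="min",
    longest="max",
)

print(flipper_range)
```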


**Exercise 55: Calculate the median of `body_mass_g` by `species` and `sex` in `data`**

In [6]:
data.groupby(['species', 'sex'])['body_mass_g'].median()

species    sex   
Adelie     Female    3400.0
           Male      4000.0
Chinstrap  Female    3550.0
           Male      3950.0
Gentoo     Female    4700.0
           Male      5500.0
Name: body_mass_g, dtype: float64

**Exercise 56: Calculate the standard deviation of `bill_depth_mm` by `island` in `data`, expressed in cm instead of mm**

In [7]:
data.groupby('island')['bill_depth_mm'].std() / 10

island
Biscoe       0.182072
Dream        0.113312
Torgersen    0.133945
Name: bill_depth_mm, dtype: float64

**Exercise 57: Calculate the covariance between `flipper_length_mm` and `body_mass_g` by `species` and `island` in `data`**

In [8]:
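# Select the two columns first so the lambda only receives the data it needs
data.groupby(['species', 'island'])[['flipper_length_mm', 'body_mass_g']].apply(lambda x: x['flipper_length_mm'].cov(x['body_mass_g']))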

species    island   
Adelie     Biscoe       1727.021670
           Dream        1378.652597
           Torgersen    1209.225490
Chinstrap  Dream        1758.538191
Gentoo     Biscoe       2297.144476
dtype: float64
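
`Series.cov` returns the sample covariance, that is, the sum of products of deviations divided by `n - 1`. The sketch below applies the same per-group pattern to a made-up frame (`toy` is hypothetical) and checks one group by hand:

```python
import pandas as pd

# Hypothetical toy data: two islands with three birds each
toy = pd.DataFrame({
    "island": ["Biscoe"] * 3 + ["Dream"] * 3,
    "flipper_length_mm": [180.0, 190.0, 200.0, 180.0, 190.0, 200.0],
    "body_mass_g": [3500.0, 3700.0, 3900.0, 3900.0, 3700.0, 3500.0],
})

cols = ["flipper_length_mm", "body_mass_g"]

# Per-group covariance between the two selected columns
cov_by_island = toy.groupby("island")[cols].apply(
    lambda g: g["flipper_length_mm"].cov(g["body_mass_g"])
)

# Manual check for one group: sum of deviation products over (n - 1)
biscoe = toy[toy["island"] == "Biscoe"]
dx = biscoe["flipper_length_mm"] - biscoe["flipper_length_mm"].mean()
dy = biscoe["body_mass_g"] - biscoe["body_mass_g"].mean()
manual_cov = (dx * dy).sum() / (len(biscoe) - 1)

print(cov_by_island)
print(manual_cov)
```

In this toy data the two islands have equal and opposite covariances, since mass rises with flipper length on one and falls on the other.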

**Exercise 58: Calculate the absolute correlation between `bill_length_mm` and `bill_depth_mm` by `species` and `sex` in `data`**

In [9]:
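# Select the two columns first so the lambda only receives the data it needs
data.groupby(['species', 'sex'])[['bill_length_mm', 'bill_depth_mm']].apply(lambda x: abs(x['bill_length_mm'].corr(x['bill_depth_mm'])))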

species    sex   
Adelie     Female    0.160636
           Male      0.038247
Chinstrap  Female    0.256317
           Male      0.446270
Gentoo     Female    0.430444
           Male      0.306767
dtype: float64

**Exercise 59: Calculate the number of null values in `sex` by `island` in `data`**

In [10]:
data.groupby('island')['sex'].apply(lambda x: x.isnull().sum())

island
Biscoe       5
Dream        1
Torgersen    5
Name: sex, dtype: int64
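
An alternative that avoids `apply` is to build the boolean null mask first and then group that mask by `island`. A minimal sketch on made-up data (`toy` is hypothetical); summing booleans counts the `True` values:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with missing values in sex
toy = pd.DataFrame({
    "island": ["Biscoe", "Biscoe", "Dream", "Dream", "Dream"],
    "sex": ["Male", np.nan, np.nan, np.nan, "Female"],
})

# isna() yields True/False per row; grouping by the island column and
# summing counts the True values, i.e. the nulls, on each island
null_sex = toy["sex"].isna().groupby(toy["island"]).sum()

print(null_sex)
```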

**Exercise 60: Sort `data` by `bill_depth_mm` within each `sex` group and assign the result to `data_gs`**

In [11]:
data_gs = data.groupby('sex').apply(lambda x: x.sort_values('bill_depth_mm'))
data_gs

| sex | | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex |
|---|---|---|---|---|---|---|---|---|
| Female | 244 | Gentoo | Biscoe | 42.9 | 13.1 | 215.0 | 5000.0 | Female |
| Female | 220 | Gentoo | Biscoe | 46.1 | 13.2 | 211.0 | 4500.0 | Female |
| Female | 268 | Gentoo | Biscoe | 44.9 | 13.3 | 213.0 | 5100.0 | Female |
| Female | 228 | Gentoo | Biscoe | 43.3 | 13.4 | 209.0 | 4400.0 | Female |
| Female | 225 | Gentoo | Biscoe | 46.5 | 13.5 | 210.0 | 4550.0 | Female |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| Male | 35 | Adelie | Dream | 39.2 | 21.1 | 196.0 | 4150.0 | Male |
| Male | 61 | Adelie | Biscoe | 41.3 | 21.1 | 195.0 | 4400.0 | Male |
| Male | 49 | Adelie | Dream | 42.3 | 21.2 | 191.0 | 4150.0 | Male |
| Male | 13 | Adelie | Torgersen | 38.6 | 21.2 | 191.0 | 3800.0 | Male |
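
If the extra `sex` index level is not needed, a similar row ordering can be obtained without `groupby` by sorting on both columns. The sketch below uses a made-up frame (`toy` is hypothetical) and drops rows with a null `sex` first, mirroring the way `groupby` excludes null keys by default:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing sex value
toy = pd.DataFrame({
    "sex": ["Male", "Female", np.nan, "Female", "Male"],
    "bill_depth_mm": [19.0, 14.5, 17.0, 13.1, 18.2],
})

# Sort by sex first, then by bill depth within each sex;
# the original row labels are kept as a flat index
flat_sorted = toy.dropna(subset=["sex"]).sort_values(["sex", "bill_depth_mm"])

print(flat_sorted)
```

The result keeps a single-level index, which can be simpler to work with downstream than the two-level index produced by `groupby(...).apply(...)`.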


✅ This completes Set 6: Grouping Methods (Exercises 51-60)

The original exercises for the datatable package can be found [here](https://github.com/vopani/datatableton)