# Subplots and Pivoting

(Lab 9)


1. [Smoking Data](#Smoking-Data)
  * Group by age (rounded to 10 years) **and** smoking status and compute the survival rates separately for the two groups.
  * Compute the average difference in survival.
  * Create a grid of subplots with 1 row and 2 columns.
    - Left plot: Line plot of `p_alive` and `p_smokes` as a function of age. Annotate the maximum `p_smokes`.
    - Right plot: Line plot of survival rates for the two smoking groups. Annotate the maximum difference.
2. [Netflix Data](#Netflix-Data)
  * Create a grid of subplots with 2 rows and 1 column.
    - Top plot: Create a **daily** plot of hours spent watching 'Modern Family'.
    - Extra credit: Find the highest peak and annotate it with the user who watched the most that day (plus their total duration).
    - Bottom plot: Create a **weekly** plot of hours spent watching 'Modern Family'. (Hint: recall the `resample` method.)


If you need a refresher on `resample`, see the DataCamp course [Working with dates and times in Python](https://campus.datacamp.com/courses/working-with-dates-and-times-in-python/easy-and-powerful-dates-and-times-in-pandas?ex=4).
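
As a quick reminder of what `resample` does, here is a minimal sketch on invented data. The frame `sessions` and its columns `start` and `hours` are hypothetical and only serve the illustration; they are not part of the lab data.

```python
import pandas as pd

# Hypothetical toy data: one viewing session per row, duration in hours.
sessions = pd.DataFrame(
    {
        "start": pd.to_datetime(
            ["2021-01-04 20:00", "2021-01-04 22:30",
             "2021-01-06 19:00", "2021-01-12 21:00"]
        ),
        "hours": [1.0, 0.5, 2.0, 1.5],
    }
).set_index("start")  # resample needs a datetime-like index

# Daily totals: days without any session show up with a sum of 0.
daily = sessions["hours"].resample("D").sum()

# Weekly totals: with "W" each bin is labelled by the Sunday that ends it.
weekly = sessions["hours"].resample("W").sum()

print(daily.head(3))
print(weekly)
```

The same pattern works with other aggregations (`mean`, `count`, ...) in place of `sum`.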

In [3]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime

------------------------------------

## Smoking Data

The data set has 3 columns: `outcome`, `smoker` and `age`. `outcome` records whether the person was still alive after 10 years.

We want to glean the effect of smoking on survival probability.

In [4]:
df = pd.read_csv("https://calmcode.io/datasets/smoking.csv")
df = df.assign(alive = (df['outcome'] == 'Alive').astype(int),
               smokes = (df['smoker'] == 'Yes').astype(int))
df.head()

Unnamed: 0,outcome,smoker,age,alive,smokes
0,Alive,Yes,23,1,1
1,Alive,Yes,18,1,1
2,Dead,Yes,71,0,1
3,Alive,No,67,1,0
4,Alive,No,64,1,0


Is smoking good for your health? At first glance, smokers have a higher overall survival rate than non-smokers:

In [5]:
df.groupby(['smoker']).alive.mean()

smoker
No     0.685792
Yes    0.761168
Name: alive, dtype: float64

In [None]:
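# Same result via named aggregation; the string 'mean' avoids the pandas FutureWarning raised for np.mean
df.groupby(['smoker']).agg(prob=('alive', 'mean'))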

smoker,prob
No,0.685792
Yes,0.761168
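
Named aggregation can return several summaries at once, which helps when judging how much weight each group's rate deserves. The sketch below uses an invented frame `toy`; the extra column name `n` is an assumption made for the illustration.

```python
import pandas as pd

# Hypothetical toy data (values invented) mimicking the `smoker` / `alive` columns.
toy = pd.DataFrame(
    {
        "smoker": ["Yes", "Yes", "No", "No", "No"],
        "alive": [1, 0, 1, 1, 0],
    }
)

# Each keyword becomes an output column: name=(input column, aggregation).
summary = toy.groupby("smoker").agg(
    prob=("alive", "mean"),  # share of survivors in the group
    n=("alive", "size"),     # number of people in the group
)
print(summary)
```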


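Age adjustment: the two groups may differ in age, so we bin age to the nearest 10 years before comparing survival. Note that `np.round` rounds exact halves to the nearest even value, so 25 maps to 20 while 35 maps to 40.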

In [None]:
np.round(19 / 10) * 10

20.0
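
The rounding rule matters for ages ending in 5. The sketch below compares `np.round` with a "round half up" variant; the alternative is only an illustration of the difference, not something the lab requires.

```python
import numpy as np

ages = np.array([14, 15, 19, 25, 35, 45])

# np.round sends exact halves to the nearest even integer:
# 1.5 -> 2, 2.5 -> 2, 3.5 -> 4, 4.5 -> 4
age10_round = np.round(ages / 10) * 10

# Alternative (assumption: you prefer halves to always round up).
age10_half_up = np.floor(ages / 10 + 0.5) * 10

print(age10_round)    # [10. 20. 20. 20. 40. 40.]
print(age10_half_up)  # [10. 20. 20. 30. 40. 50.]
```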

In [6]:
df = df.assign(age10=np.round(df['age'] / 10) * 10)
df

Unnamed: 0,outcome,smoker,age,alive,smokes,age10
0,Alive,Yes,23,1,1,20.0
1,Alive,Yes,18,1,1,20.0
2,Dead,Yes,71,0,1,70.0
3,Alive,No,67,1,0,70.0
4,Alive,No,64,1,0,60.0
...,...,...,...,...,...,...
1309,Alive,Yes,35,1,1,40.0
1310,Alive,No,33,1,0,30.0
1311,Alive,Yes,21,1,1,20.0
1312,Alive,No,46,1,0,50.0


In [8]:
# Survival rate and share of smokers per 10-year age group
df2 = (df.groupby('age10')
         .agg(p_alive=('alive', 'mean'),
              p_smokes=('smokes', 'mean')))
df2


age10,p_alive,p_smokes
20.0,0.980645,0.445161
30.0,0.972332,0.438735
40.0,0.9,0.495833
50.0,0.821622,0.632432
60.0,0.587302,0.464286
70.0,0.203947,0.236842
80.0,0.0,0.168831
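
One possible way to approach the remaining smoking tasks is sketched below on invented data. The frame `toy` and all its values are hypothetical; the column names mirror `df`, and choices such as annotation offsets and marker styles are assumptions rather than requirements.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical toy data (values invented) with the same columns as `df`.
toy = pd.DataFrame(
    {
        "age10": [20] * 4 + [40] * 4 + [60] * 6,
        "smoker": ["Yes", "Yes", "No", "No",
                   "Yes", "Yes", "Yes", "No",
                   "Yes", "Yes", "No", "No", "No", "No"],
        "alive": [1, 1, 1, 1,
                  1, 0, 1, 1,
                  0, 1, 1, 1, 0, 1],
    }
)
toy["smokes"] = (toy["smoker"] == "Yes").astype(int)

# Per-age summary, analogous to `df2`.
by_age = toy.groupby("age10").agg(p_alive=("alive", "mean"),
                                  p_smokes=("smokes", "mean"))

# Survival rate per (age group, smoking status), pivoted so that
# each smoking group becomes its own column.
survival = toy.groupby(["age10", "smoker"])["alive"].mean().unstack("smoker")

# Positive values: non-smokers survive more often in that age group.
diff = survival["No"] - survival["Yes"]
print("average difference in survival:", diff.mean())

fig, (ax_left, ax_right) = plt.subplots(nrows=1, ncols=2, figsize=(10, 4))

# Left: overall survival and smoking rates by age.
by_age.plot(ax=ax_left, marker="o")
age_peak = by_age["p_smokes"].idxmax()
ax_left.annotate("max p_smokes",
                 xy=(age_peak, by_age.loc[age_peak, "p_smokes"]),
                 xytext=(10, 15), textcoords="offset points",
                 arrowprops=dict(arrowstyle="->"))

# Right: survival by smoking status, annotated at the largest gap.
survival.plot(ax=ax_right, marker="o")
age_gap = diff.abs().idxmax()
ax_right.annotate(f"max gap: {diff.loc[age_gap]:.2f}",
                  xy=(age_gap, survival.loc[age_gap, "Yes"]),
                  xytext=(10, -25), textcoords="offset points",
                  arrowprops=dict(arrowstyle="->"))
ax_right.set_ylabel("survival rate")
fig.tight_layout()
```

Averaging `diff` over age groups gives every group equal weight; weighting by group size would be an equally defensible reading of "average difference".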


----------------------------------

## Netflix Data

In [None]:
url = "https://drive.google.com/uc?id=1-1rqPVMKh3LMviUGyG1MfQpFTtn0rJ5V"
netflix = pd.read_csv(url, parse_dates=["Start.Time","Start.Day"])
#netflix['Start.Day'] = pd.to_datetime(netflix['Start.Day'])
netflix.head()

Unnamed: 0,Profile,Start.Time,Duration,Attributes,Title,Supplemental.Video.Type,Device.Type,Bookmark,Latest.Bookmark,Country,Start.Day,Show
0,Olivia,2021-01-17 16:22:20,32,,Ginny & Georgia: Season 2: Latkes Are Lit (Epi...,,Apple iPad 10.2 inch 8th Gen Wi-Fi iPad,00:09:06,00:09:06,GB (United Kingdom),2021-01-17,Ginny & Georgia
1,Olivia,2021-01-17 16:07:48,126,,Ginny & Georgia: Season 2: Latkes Are Lit (Epi...,,Apple iPad 10.2 inch 8th Gen Wi-Fi iPad,00:08:33,Not latest view,GB (United Kingdom),2021-01-17,Ginny & Georgia
2,Olivia,2021-01-17 15:55:31,385,,Ginny & Georgia: Season 2: Latkes Are Lit (Epi...,,Apple iPad 10.2 inch 8th Gen Wi-Fi iPad,00:06:26,Not latest view,GB (United Kingdom),2021-01-17,Ginny & Georgia
3,Olivia,2021-01-17 13:02:24,3426,,Ginny & Georgia: Season 2: Happy My Birthday t...,,Apple iPad 10.2 inch 8th Gen Wi-Fi iPad,00:55:56,00:55:56,GB (United Kingdom),2021-01-17,Ginny & Georgia
4,Olivia,2021-01-17 12:08:05,3257,,Ginny & Georgia: Season 2: What Are You Playin...,,Apple iPad 10.2 inch 8th Gen Wi-Fi iPad,00:54:18,00:54:18,GB (United Kingdom),2021-01-17,Ginny & Georgia
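
A possible layout for the Netflix plots is sketched below on invented data. The frame `toy`, the profiles `Ann` and `Bob`, and all values are hypothetical; the sketch also assumes that `Duration` is measured in seconds, which should be checked against the real data.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for `netflix` (values invented).
toy = pd.DataFrame(
    {
        "Profile": ["Ann", "Bob", "Ann", "Bob", "Bob", "Ann"],
        "Start.Time": pd.to_datetime(
            ["2021-01-04 20:00", "2021-01-04 21:00", "2021-01-06 19:00",
             "2021-01-06 20:00", "2021-01-07 20:00", "2021-01-12 21:00"]
        ),
        "Duration": [3600, 1800, 7200, 5400, 3600, 1800],  # assumed seconds
        "Show": ["Modern Family", "Modern Family", "Modern Family",
                 "Modern Family", "Other Show", "Modern Family"],
    }
)

# Keep one show and index by start time so that resample can be used.
mf = toy[toy["Show"] == "Modern Family"].set_index("Start.Time")
hours = mf["Duration"] / 3600

daily = hours.resample("D").sum()
weekly = hours.resample("W").sum()

# Extra credit: who watched the most on the peak day, and for how long?
peak_day = daily.idxmax()
on_peak = mf[mf.index.normalize() == peak_day]
by_user = on_peak.groupby("Profile")["Duration"].sum() / 3600
top_user = by_user.idxmax()

fig, (ax_top, ax_bottom) = plt.subplots(nrows=2, ncols=1, figsize=(8, 6))

ax_top.plot(daily.index, daily.values)
ax_top.set_title("Modern Family: hours per day")
ax_top.annotate(f"{top_user}: {by_user.max():.1f} h",
                xy=(peak_day, daily.max()),
                xytext=(15, -15), textcoords="offset points",
                arrowprops=dict(arrowstyle="->"))

ax_bottom.plot(weekly.index, weekly.values, marker="o")
ax_bottom.set_title("Modern Family: hours per week")
fig.tight_layout()
```

Plotting with `ax.plot` directly, rather than `Series.plot`, keeps the x-axis in matplotlib's own date units, so the `Timestamp` passed to `annotate` lands on the expected point.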
