<h1><center>A Day in an American's Life</center></h1>

The Earth completes one rotation in roughly 24 hours, which gives us day and night. The fascinating part is how people around the world choose to spend those 24 hours. Demographics play a vital role in analysing and understanding people's lifestyles, mainly through the influence of time zone, culture, population, socio-economic factors, current trends, and so on. In this article, we visually study the behaviour of people living in the United States, chosen because the country is a democratic republic with a diverse, multicultural population.

Going a step further, this article shows how Americans spend their day, broken down into activities such as Personal Care, Travelling, Socialising and Leisure, Religion, Work, Sports and Fitness, and so on.
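
As a rough illustration of that classification, the sketch below groups activity columns by code prefix on a small made-up table. The column names follow the `t` plus six-digit code pattern used in the data loaded later, but the values, the `toy` frame, the `ACTIVITY_PREFIXES` mapping, and the `minutes_by_activity` helper are assumptions for illustration only.

```python
import pandas as pd

# Toy stand-in for the survey summary file (values are made up, in minutes)
toy = pd.DataFrame({
    "TUCASEID": [1, 2, 3],
    "t010101": [480, 420, 510],  # sleep
    "t050101": [480, 0, 300],    # work, first column
    "t050102": [0, 0, 60],       # work, second column
    "t180501": [30, 0, 45],      # travel
})

# Hypothetical mapping from a readable label to a column prefix
ACTIVITY_PREFIXES = {"Sleep": "t0101", "Work": "t0501", "Travel": "t1805"}

def minutes_by_activity(df, prefixes):
    """Sum, per respondent, all columns that share each activity prefix."""
    out = pd.DataFrame(index=df.index)
    for label, prefix in prefixes.items():
        cols = [c for c in df.columns if c.startswith(prefix)]
        out[label] = df[cols].sum(axis=1)
    return out

totals = minutes_by_activity(toy, ACTIVITY_PREFIXES)
print(totals)
```

Each row of `totals` holds one respondent's minutes per activity, which is the shape the later per-state averages are built from.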

In [1]:
#nbi:hide_in
import pandas as pd
import numpy as np
import re
import bqplot as bq
import ipywidgets
from ipywidgets import Layout

In [2]:
#nbi:hide_in
#nbi:hide_out
ats_sum = pd.read_csv('https://raw.githubusercontent.com/sarvaniputta/nbinteract_tutorial/master/atussum_2017.csv')
cps = pd.read_csv('https://raw.githubusercontent.com/sarvaniputta/nbinteract_tutorial/master/atuscps_2017.csv', usecols=['TUCASEID', 'GESTFIPS'])
cps_ = cps.groupby('TUCASEID').GESTFIPS.first().to_frame()
merged = ats_sum.merge(cps_, left_on = 'TUCASEID', right_index = True)

def activity_columns(data, activity_code):
    # Activity columns are named "t" + six-digit code; match on the leading digits only
    col_prefix = "t" + activity_code
    return [column for column in data.columns if column.startswith(col_prefix)]



work_cols = activity_columns(merged, '0501')
travel_cols = activity_columns(merged, '1805')
sleep_cols = activity_columns(merged, '0101')
religion_cols = activity_columns(merged, '1401')
leisure_cols = activity_columns(merged, '1203')
sports_cols = activity_columns(merged, '1301')
housework_cols = activity_columns(merged, '0201')

work_statewise = merged.loc[:, work_cols].groupby(merged.GESTFIPS).mean()
travel_statewise = merged.loc[:, travel_cols].groupby(merged.GESTFIPS).mean()
sleep_statewise = merged.loc[:, sleep_cols].groupby(merged.GESTFIPS).mean()
religion_statewise = merged.loc[:, religion_cols].groupby(merged.GESTFIPS).mean()
leisure_statewise = merged.loc[:, leisure_cols].groupby(merged.GESTFIPS).mean()
sports_statewise = merged.loc[:, sports_cols].groupby(merged.GESTFIPS).mean()
housework_statewise = merged.loc[:, housework_cols].groupby(merged.GESTFIPS).mean()

activity_cols_dict = {0: work_cols, 1: travel_cols, 2: sleep_cols, 3: religion_cols, 4: leisure_cols,
                      5: sports_cols, 6: housework_cols}
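
The per-state tables above all follow one pattern: average each activity column within a state, then add the column means to get average total minutes per state. A minimal sketch on made-up data (the `toy` frame and its values are assumptions, not survey figures):

```python
import pandas as pd

# Made-up respondents: two in state 1, two in state 2 (FIPS-style codes)
toy = pd.DataFrame({
    "GESTFIPS": [1, 1, 2, 2],
    "t050101": [480, 0, 300, 500],  # minutes, first work column
    "t050102": [0, 60, 0, 100],     # minutes, second work column
})

work_cols = ["t050101", "t050102"]

# Mean of each column within a state
statewise = toy.loc[:, work_cols].groupby(toy.GESTFIPS).mean()

# Adding the column means gives average total work minutes per respondent in each state
state_totals = statewise.sum(axis=1).round(1).to_dict()
print(state_totals)
```

The resulting dictionary, keyed by state code, has the same form as the `color` argument passed to the map in the next cell.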

In [3]:
#nbi:hide_in
#nbi:hide_out
activity_list = ['Average Working Time', 'Average Travel Time', 'Average Sleeping Time', 'Average Religious Time',
                'Average Leisure Time', 'Average Sports Time', 'Average Housework Time']

act_dd = ipywidgets.Dropdown(options = activity_list, description = 'Select activity', 
                             style={'description_width': 'initial'})


######################### Map ############################
cscale = bq.ColorScale(scheme = 'Oranges')        
                   # reverse the colorscale or Hawaii is not visible

map_tt = bq.Tooltip(labels = ['State', 'Time (minutes)'], fields = ['name', 'color'])
sc_geo = bq.AlbersUSA(scale=2400)
states_map = bq.Map(color = work_statewise.sum(axis=1).round(1).to_dict(), 
                     map_data=bq.topo_load('map_data/USStatesMap.json'),
                    scales = {'projection':sc_geo, 'color':cscale},tooltip = map_tt,
                    interactions = {'click': 'select', 'hover':'tooltip', },
                    anchor_style = {'fill':'red'}, 
                    selected_style = {'opacity': 1.0},
                    unselected_style = {'opacity': 1.0})

cax = bq.ColorAxis(scale=cscale, orientation='vertical', side='left', label='Time(minutes)')
fig_map = bq.Figure(marks=[states_map],axes=[cax], title = 'Average Working Time',
                   layout=Layout(min_width='800px', min_height='800px'),
                   background_style = {'fill': 'white'},)
fig_map.fig_margin = {'bottom': 0, 'top': 0, 'left': 70, 'right': 0}
fig_map.aspect_ratio = 1920/1080

###########################################################

######################### Bar #############################
time_spent = merged.loc[:, ['TUCASEID'] + work_cols].set_index('TUCASEID').sum(axis=1)

def normalized_hist(data, bins=10):
    counts, bins = np.histogram(data, bins=bins)
    counts = counts*100/counts.sum()
    return bins, counts  

sc_x2 = bq.LinearScale()
sc_y2 = bq.LinearScale()
ax_x2 = bq.Axis(label = 'Time spent(minutes)', scale=sc_x2, orientation='horizontal')
ax_y2 = bq.Axis(label='Population%', scale=sc_y2,orientation='vertical', grid_color='gray', grid_lines='dashed')
x, y = normalized_hist(time_spent, bins=24)
bars_hist = bq.Bars(x = x[:-1], y=y, align='right', scales={'x': sc_x2, 'y': sc_y2}, colors=['#13294a'])
#           ,width=bins[1]-bins[0]  , edgecolor='black')
fig_bar1 = bq.Figure(marks = [bars_hist],axes=[ax_x2, ax_y2], background_style = {'fill': 'white'})
fig_bar1.layout = Layout(max_width='100%', max_height='60%')
#fig_bar1.fig_margin = {'bottom':50, 'top':50, 'left':20, 'right':10}

###########################################################

##################### Age Line Plot #######################
age_wise_act_sum = merged.groupby('TEAGE').sum()[work_cols].sum(axis=1)
age_wise_count = merged.groupby('TEAGE').count()['TUCASEID']
y_data_line = (age_wise_act_sum / age_wise_count).values
x_data_line = sorted(ats_sum['TEAGE'].unique())

x_sc_line = bq.OrdinalScale()
y_sc_line = bq.LinearScale()

x_ax_line = bq.Axis(scale = x_sc_line, num_ticks=10, label='Age', color='#13294a')
y_ax_line = bq.Axis(scale = y_sc_line, num_ticks=10, 
                   orientation = 'vertical', label='Time spent in minutes',
                    grid_color='gray', grid_lines='dashed')

lines = bq.Lines(x = x_data_line, y = y_data_line, scales = {'x': x_sc_line, 'y': y_sc_line},
                colors=['#13294a'], interpolation = 'basis')

fig_line = bq.Figure(marks = [lines], axes = [x_ax_line, y_ax_line], 
                    title='Average Working Time across US', 
                    background_style = {'fill': 'white'})
fig_line.layout = Layout(max_width='100%', max_height='60%')

###########################################################

##################### Interactivity #######################

def on_select_map(change):
    if not change['new']:
        # Nothing selected: fall back to every state in the data
        selected_fips = list(merged.GESTFIPS.unique())
    else:
        # Keep only the most recently clicked state
        selected_fips = [change['new'][-1]]
        states_map.selected = selected_fips
    activity = act_dd.index
    columns = activity_cols_dict[activity]
    x, y = normalized_hist(merged
                            .loc[merged.GESTFIPS.isin(selected_fips), ['TUCASEID']+columns]
                            .set_index('TUCASEID')
                            .sum(axis=1), bins=24)
    # np.histogram returns one more edge than counts; drop the last edge, as in the initial plot
    bars_hist.x = x[:-1]
    bars_hist.y = y
    # For updating line plot
    #merged_subset = merged[merged['GESTFIPS'] == selected_fips[-1]]
    #age_wise_act_sum = merged_subset.groupby('TEAGE').sum()[work_cols].sum(axis=1)
    #age_wise_count = merged_subset.groupby('TEAGE').count()['TUCASEID']
    #lines.y = (age_wise_act_sum / age_wise_count).values
    #lines.x = sorted(age_wise_act_sum.index)
    
    #print(selected_fips) Fix needed for selected FIPS [50, 36]
# Observe above changes 
states_map.observe(on_select_map, 'selected')

def on_activity_change(change):
    states_map.selected=[]
    activity = act_dd.index
    columns = activity_cols_dict[activity]
    filtered = merged.loc[:, columns].groupby(merged.GESTFIPS).mean().sum(axis=1).round(1).to_dict()
    states_map.color = filtered
    fig_map.title = act_dd.value
    # For updating line plot
    age_wise_act_sum = merged.groupby('TEAGE').sum()[columns].sum(axis=1)
    age_wise_count = merged.groupby('TEAGE').count()['TUCASEID']
    lines.y = (age_wise_act_sum / age_wise_count).values
    lines.x = sorted(ats_sum['TEAGE'].unique())
    fig_line.title = act_dd.value + ' across US'
    
# Observe above changes 
act_dd.observe(on_activity_change, 'value')

###########################################################

# ipywidgets.VBox([act_dd, ipywidgets.HBox([fig_map, ipywidgets.VBox([fig_bar1, fig_line])])])
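
One detail of the histogram code above is worth spelling out: `np.histogram` returns one more bin edge than it returns counts, so the bar positions need the last edge dropped (as the initial `bq.Bars` call does with `x[:-1]`). A small sketch with made-up minutes, using a copy of the notebook's `normalized_hist` helper with the returned edges renamed for clarity:

```python
import numpy as np

def normalized_hist(data, bins=10):
    # Counts per bin, rescaled so they sum to 100 (percent of respondents)
    counts, edges = np.histogram(data, bins=bins)
    return edges, counts * 100 / counts.sum()

minutes = np.array([0, 0, 30, 60, 60, 120, 240, 480])  # made-up values
edges, pct = normalized_hist(minutes, bins=4)

# 4 bins give 5 edges but only 4 percentages
left_edges = edges[:-1]
print(len(edges), len(pct), left_edges)
```

Passing `left_edges` and `pct` to a bar mark keeps the two arrays the same length.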

In [4]:
#nbi:left
#nbi:hide_in
ipywidgets.VBox([act_dd, fig_map])

[Interactive output: "Select activity" dropdown above the US state map]

In [5]:
#nbi:right
#nbi:hide_in
ipywidgets.VBox([fig_bar1, fig_line])

[Interactive output: histogram of time spent (minutes) above the line plot of average time by age]

## Top states that work the longest hours
#### Considering only respondents who logged work time on the diary day

In [6]:
#nbi:hide_in
merged['work_total'] = merged[work_cols].sum(axis=1)
m_work = merged
m_work_men = merged[merged['TESEX'] == 1]
m_work_women = merged[merged['TESEX'] == 2]

m_work = m_work[m_work['work_total'] != 0]
m_work_men = m_work_men[m_work_men['work_total'] != 0]
m_work_women = m_work_women[m_work_women['work_total'] != 0]

m_work = m_work.loc[:, work_cols].groupby(m_work.GESTFIPS).mean()
m_work_men = m_work_men.loc[:, work_cols].groupby(m_work_men.GESTFIPS).mean()
m_work_women = m_work_women.loc[:, work_cols].groupby(m_work_women.GESTFIPS).mean()

In [7]:
#nbi:hide_in
mer_work = (m_work.sum(axis=1).sort_values(ascending=False)/60).round(2).to_frame().reset_index()
mer_work_male = (m_work_men.sum(axis=1).sort_values(ascending=False)/60).round(2).to_frame().reset_index()
mer_work_female = (m_work_women.sum(axis=1).sort_values(ascending=False)/60).round(2).to_frame().reset_index()

merge1 = pd.merge(mer_work, mer_work_male, how='left', on='GESTFIPS')
merge2 = pd.merge(merge1, mer_work_female, how='left', on='GESTFIPS')
merge2 = merge2.rename(columns = {'0_x':'Overall', '0_y':'Male', 0:'Female'})
merge2 = merge2.sort_values(by='Overall', ascending=False)
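
The `'0_x'`, `'0_y'`, and `0` column names in the rename above come from merging unnamed Series: pandas labels the value column `0` and appends suffixes when names collide. One way to sidestep the rename, sketched here on made-up numbers (the `to_table` helper, the frames, and the values are illustrative assumptions), is to name each column before merging:

```python
import pandas as pd

def to_table(series, name):
    # Series -> two-column frame: GESTFIPS plus a named value column
    return series.rename(name).rename_axis("GESTFIPS").reset_index()

# Made-up average work hours keyed by state code
overall = to_table(pd.Series({46: 8.9, 35: 8.7}), "Overall")
male = to_table(pd.Series({46: 9.4, 35: 9.0}), "Male")
female = to_table(pd.Series({46: 8.1}), "Female")  # state 35 left out on purpose

table = (overall
         .merge(male, how="left", on="GESTFIPS")
         .merge(female, how="left", on="GESTFIPS")
         .sort_values(by="Overall", ascending=False))
print(table)
```

With `how="left"`, a state missing from one of the gender tables keeps its row and gets `NaN` in that column rather than being dropped.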


In [8]:
#nbi:hide_in
# Positional slice 1:11 skips the first row of the sorted table and keeps the next ten
top_worktimes = merge2['Overall'][1:11].values
top_worktimes_male = merge2['Male'][1:11].values
top_worktimes_female = merge2['Female'][1:11].values

top_workstates = ['South Dakota', 'New Mexico', 'Nebraska', 'Utah', 'Tennessee',
                  'Oregon', 'Alaska', 'Idaho', 'Louisiana', 'Georgia']

################### Top 10 work States Bar #######################

sc_x3 = bq.OrdinalScale()
sc_y3 = bq.LinearScale(min=5, max=9.7)
ax_x3 = bq.Axis(label='States',scale=sc_x3, orientation='horizontal', color='#13294a')
ax_y3 = bq.Axis(label = 'Avg. Work Time in Hours', scale=sc_y3, orientation='vertical', color='#13294a',
               grid_color='gray', grid_lines='dashed')

bars_topwork = bq.Bars(x = top_workstates, y=top_worktimes, padding=0.35,
                    scales={'x': sc_x3, 'y': sc_y3}, colors=['#13294a'])

lines_male = bq.Lines(x = top_workstates, y=top_worktimes_male, scales={'x': sc_x3, 'y': sc_y3}, 
                      colors=['#22A8DB'], stroke_width = 3, marker='circle', display_legend = True, 
                      labels=['Male'])

lines_female = bq.Lines(x = top_workstates, y=top_worktimes_female, scales={'x': sc_x3, 'y': sc_y3}, 
                        colors=['#FC0F3A'], stroke_width = 3, marker='circle', display_legend = True, 
                        labels=['Female'])


fig_bar2 = bq.Figure(marks = [bars_topwork, lines_male, lines_female], axes=[ax_x3, ax_y3], 
                     background_style = {'fill': 'white'},
                     layout={'flex': '1'},
                     title = 'States that work the longest hours and comparison of gender work gap',
                     title_style = {'font-family': 'Montserrat', 'font-size': '20px', 'text-transform': 'uppercase',
                                    'font-weight': '700', 'color': '#13294a'})

In [9]:
#nbi:hide_in
bar2box = ipywidgets.Box(children=[fig_bar2], layout=Layout(display='flex', justify_content='center'))
bar2box

[Interactive output: bar chart of average work time for the listed states, with male and female lines]

## Why the gap between men and women is so large in Alaska
*"Almost certainly the biggest factor — we have a higher percentage of oil and gas employment than other states do, and then that industry is the highest-paid in Alaska, and … the percentages are high for males," said Dan Robinson, chief of research and analysis at the Alaska Department of Labor and Workforce Development.*