# What is the First Day of the Week?

Whether the Gregorian calendar shows Sunday or Monday as the first day of the week depends on where you live.

Do more countries start the week on Sunday or on Monday? Do more people? How does the answer vary by region?

The file `first-day-of-week.csv` lists the first day of the week for each territory. The file `population.csv` gives each territory's 2020 population in millions, and the file `four-regions.csv` assigns each territory to one of four regions: `asia`, `europe`, `africa`, or `americas`.


In [1]:
# FOR GOOGLE COLAB ONLY.
# Uncomment and run the code below. A dialog will appear to upload files.
# Upload 'first-day-of-week.csv', 'population.csv', and 'four-regions.csv'.

# from google.colab import files
# uploaded = files.upload()

In [1]:
import pandas as pd
df = pd.read_csv('first-day-of-week.csv')
df.head()

Unnamed: 0,territory,alpha3,first_day,units,paper
0,Afghanistan,AFG,sat,metric,A4
1,Aland Islands,ALA,mon,metric,A4
2,Albania,ALB,mon,metric,A4
3,Algeria,DZA,sat,metric,A4
4,American Samoa,ASM,sun,metric,A4


In [2]:
pop = pd.read_csv('population.csv')
pop.head()

Unnamed: 0,alpha3,population
0,AFG,39.07
1,ALB,2.87
2,DZA,44.04
3,AND,0.08
4,AGO,33.45


In [3]:
regions = pd.read_csv('four-regions.csv')
regions.head()

Unnamed: 0,alpha3,four_regions
0,AUS,asia
1,BRN,asia
2,KHM,asia
3,CHN,asia
4,FJI,asia


### Project Ideas

- How many territories use each of Friday, Saturday, Sunday, and Monday as the `first_day` of the week?

- How many people start the week on each of Friday, Saturday, Sunday, and Monday?
	- Hint: This will involve a `merge`.

- Which of the `four_regions` predominantly start the week on Sunday? On Monday? Are there any regions that are more divided between Sunday and Monday?
	- Hint: This will also involve a `merge`.

In [14]:
# Count territories per first day, listed in weekday order.
# reindex fills in 0 for any day that has no territories.
first_day_territories = (
    df['first_day']
    .value_counts()
    .reindex(['fri', 'sat', 'sun', 'mon'], fill_value=0)
)

first_day_territories

first_day
fri      1
sat     15
sun     55
mon    186
Name: count, dtype: int64
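The raw counts are easier to compare as shares of all territories. The following is a minimal sketch, with the four counts copied by hand from the output above rather than read from `df`; `territory_share` is a name introduced here for illustration only.

```python
import pandas as pd

# Counts copied from the output above (territories per first day).
first_day_territories = pd.Series(
    {'fri': 1, 'sat': 15, 'sun': 55, 'mon': 186}, name='count'
)

# Fraction of all territories that start the week on each day.
territory_share = (first_day_territories / first_day_territories.sum()).round(3)
print(territory_share)
```

In the notebook itself, `df['first_day'].value_counts(normalize=True)` should give the same shares directly, without copying the counts.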

In [47]:
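# Left merge keeps every territory, even those missing from population.csv.
# Missing populations are treated as 0, so those territories add nothing to the totals.
merged_population_days = df.merge(
    pop,
    on='alpha3',
    how='left'
).assign(
    population=lambda x: x['population'].fillna(0),
    # population.csv is in millions; convert to a head count.
    pop_mill=lambda x: x['population'] * 1_000_000
)

# Total people (head count) and population (millions) per first day of the week.
first_day_people = (
    merged_population_days
    .groupby('first_day', as_index=False)
    .agg(people=('pop_mill', 'sum'),
         population=('population', 'sum'))
)

first_day_people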



Unnamed: 0,first_day,people,population
0,fri,500000.0,0.5
1,mon,3600190000.0,3600.19
2,sat,431790000.0,431.79
3,sun,3844080000.0,3844.08
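Because the left merge fills missing populations with 0, any territory without a row in `population.csv` silently drops out of these totals. One way to see which territories are affected is the `indicator` option of `merge`. The sketch below uses tiny stand-in frames (`toy_df`, `toy_pop`, both hypothetical names) built from a few rows shown in the `head()` outputs earlier, not the full files.

```python
import pandas as pd

# Stand-in frames with a few rows from the head() outputs; not the full data.
toy_df = pd.DataFrame({
    'territory': ['Afghanistan', 'Aland Islands', 'Albania'],
    'alpha3': ['AFG', 'ALA', 'ALB'],
    'first_day': ['sat', 'mon', 'mon'],
})
toy_pop = pd.DataFrame({'alpha3': ['AFG', 'ALB'], 'population': [39.07, 2.87]})

# indicator=True adds a '_merge' column saying where each row was found.
checked = toy_df.merge(toy_pop, on='alpha3', how='left', indicator=True)

# 'left_only' rows are territories with no population record.
unmatched = checked.loc[checked['_merge'] == 'left_only', 'territory'].tolist()
print(unmatched)  # ['Aland Islands']
```

Running the same check on the real `df` and `pop` would show how many territories contribute 0 to the totals above.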


In [171]:
# Merge the four regions with the first day of the week data. Only include matching alpha3 rows.

merged_four_regions = df.merge(
    regions,
    on='alpha3',
    how='inner'
)

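# Count territories for each (region, first day) pair.
first_day_per_region = (
    merged_four_regions
    .groupby(['four_regions', 'first_day'], as_index=False)
    .agg(first_day_amount=('first_day', 'count'))
)

# Keep only the Sunday and Monday rows so they can be compared side by side.
predominant_sun = first_day_per_region[first_day_per_region['first_day'] == 'sun']
predominant_mon = first_day_per_region[first_day_per_region['first_day'] == 'mon']

# One row per region with the Sunday and Monday counts in separate columns.
# The outer merge keeps regions that have only one of the two days.
merged_predominant = (
    predominant_sun.merge(
        predominant_mon,
        on='four_regions',
        suffixes=('_sun', '_mon'),
        how='outer',
    )
).fillna(0)

# Regions where Sunday-start territories outnumber Monday-start territories.
sun_regions_predominant = (
    merged_predominant
    .query('first_day_amount_sun > first_day_amount_mon')
    [['four_regions', 'first_day_sun', 'first_day_amount_sun']]
    .rename(columns={'first_day_sun': 'first_day'})
)

# Regions where Monday-start territories outnumber Sunday-start territories.
mon_regions_predominant = (
    merged_predominant
    .query('first_day_amount_mon > first_day_amount_sun')
    [['four_regions', 'first_day_mon', 'first_day_amount_mon']]
    .rename(columns={'first_day_mon': 'first_day'})
)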

merged_predominant

Unnamed: 0,four_regions,first_day_sun,first_day_amount_sun,first_day_mon,first_day_amount_mon
0,africa,sun,6,mon,43
1,americas,sun,20,mon,15
2,asia,sun,21,mon,27
3,europe,sun,2,mon,46
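The table shows which day leads in each region, but not how close the split is. A rough way to answer the "more divided" question is to compute Sunday's share of the Sunday-plus-Monday territories in each region. The sketch below copies the counts from the output above; the 35% to 65% band used to label a region as divided is an arbitrary cutoff chosen for illustration, and `counts`, `DIVIDED_BAND`, and `divided_regions` are hypothetical names.

```python
import pandas as pd

# Sunday and Monday counts per region, copied from the output above.
counts = pd.DataFrame(
    {'sun': [6, 20, 21, 2], 'mon': [43, 15, 27, 46]},
    index=pd.Index(['africa', 'americas', 'asia', 'europe'], name='four_regions'),
)

# Share of Sunday-start territories among those starting on Sunday or Monday.
counts['sun_share'] = counts['sun'] / (counts['sun'] + counts['mon'])

# Assumed cutoff: a region is "divided" if Sunday's share is between 35% and 65%.
DIVIDED_BAND = (0.35, 0.65)
counts['divided'] = counts['sun_share'].between(*DIVIDED_BAND)

print(counts.round(3))
divided_regions = counts.index[counts['divided']].tolist()
print(divided_regions)  # ['americas', 'asia']
```

Under that assumed cutoff, Africa and Europe are clearly Monday-first, while the Americas and Asia are the more divided regions. These shares count territories, not people, and ignore territories that start the week on Friday or Saturday.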
