# What is the First Day of the Week?

Whether the Gregorian calendar shows Sunday or Monday as the first day of the week depends on where you live.

Do more countries start the week on Sunday or on Monday? Does the answer change when counting people instead of countries? How does it vary by continent?

The file `first-day-of-week.csv` lists the first day of the week for each territory. The file `population.csv` gives each territory's population in 2020, in millions, and the file `four-regions.csv` assigns each territory to one of four regions: `asia`, `europe`, `africa`, or `americas`.


In [2]:
# FOR GOOGLE COLAB ONLY.
# Uncomment and run the code below. A dialog will appear to upload files.
# Upload 'first-day-of-week.csv', 'population.csv', and 'four-regions.csv'.

# from google.colab import files
# uploaded = files.upload()

In [3]:
import pandas as pd
df = pd.read_csv('first-day-of-week.csv')
df.head()

        territory alpha3 first_day   units paper
0     Afghanistan    AFG       sat  metric    A4
1   Aland Islands    ALA       mon  metric    A4
2         Albania    ALB       mon  metric    A4
3         Algeria    DZA       sat  metric    A4
4  American Samoa    ASM       sun  metric    A4


In [4]:
pop = pd.read_csv('population.csv')
pop.head()

  alpha3  population
0    AFG       39.07
1    ALB        2.87
2    DZA       44.04
3    AND        0.08
4    AGO       33.45


In [5]:
regions = pd.read_csv('four-regions.csv')
regions.head()

  alpha3 four_regions
0    AUS         asia
1    BRN         asia
2    KHM         asia
3    CHN         asia
4    FJI         asia


### Project Ideas

- How many territories use each of Friday, Saturday, Sunday, and Monday as the `first_day` of the week?

- How many people start the week on each of Friday, Saturday, Sunday, and Monday?
	- Hint: This will involve a `merge`.

- Which of the `four_regions` predominantly start the week on Sunday? On Monday? Are there any regions that are more divided between Sunday and Monday?
	- Hint: This will also involve a `merge`.

In [17]:
# Question 1: How many territories start the week on each day?
territory_counts = df['first_day'].value_counts().sort_index()
print(territory_counts)

first_day
fri      1
mon    186
sat     15
sun     55
Name: count, dtype: int64
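
The raw counts are easier to compare as shares of all territories. The sketch below re-enters the counts printed above as a stand-in Series (so it runs without the CSV) and converts them to percentages; on the real data, `df['first_day'].value_counts(normalize=True)` should give the same proportions. The names `total_territories` and `territory_share` are illustrative, not part of the original notebook.

```python
import pandas as pd

# Counts copied from the output above; on the real data use
# df['first_day'].value_counts().sort_index() instead.
territory_counts = pd.Series({'fri': 1, 'mon': 186, 'sat': 15, 'sun': 55}, name='count')

# Convert each count to a percentage of all territories.
total_territories = territory_counts.sum()
territory_share = (territory_counts / total_territories * 100).round(1)

print(f"Total territories: {total_territories}")
print(territory_share)
```

By territory count, Monday is the most common first day by a wide margin, which is worth keeping in mind when the population-weighted results point in a different direction.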


In [22]:
# Question 2: How many people start the week on each day?
print("=== Question 2: Population by First Day of Week ===")

# Merge the first day data with population data.
# The inner join keeps only territories whose alpha3 code appears in both files.
df_with_pop = df.merge(pop, on='alpha3', how='inner')
print(f"Successfully merged {len(df_with_pop)} territories with population data")
print(df_with_pop)

# Calculate total population for each first day
population_by_day = df_with_pop.groupby('first_day')['population'].sum().sort_index()
print("Population (in millions) by first day of week:")
for day, pop_millions in population_by_day.items():
    print(f"{day:9}: {pop_millions:8.1f} million people")

print()

# Calculate percentages
total_population = population_by_day.sum()
print("Percentage of world population by first day:")
for day, pop_millions in population_by_day.items():
    percentage = (pop_millions / total_population) * 100
    print(f"{day:9}: {percentage:5.1f}%")

print(f"\nTotal population analyzed: {total_population:.1f} million")


=== Question 2: Population by First Day of Week ===
Successfully merged 196 territories with population data
       territory alpha3 first_day   units      paper  population
0    Afghanistan    AFG       sat  metric         A4       39.07
1        Albania    ALB       mon  metric         A4        2.87
2        Algeria    DZA       sat  metric         A4       44.04
3        Andorra    AND       mon  metric         A4        0.08
4         Angola    AGO       mon  metric         A4       33.45
..           ...    ...       ...     ...        ...         ...
191    Venezuela    VEN       sun  metric  US-Letter       28.44
192      Vietnam    VNM       mon  metric         A4       98.08
193        Yemen    YEM       sun  metric         A4       36.13
194       Zambia    ZMB       mon  metric         A4       19.06
195     Zimbabwe    ZWE       sun  metric         A4       15.53

[196 rows x 6 columns]
Population (in millions) by first day of week:
fri      :      0.5 million people
mon  
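
The territory tally above covers 257 territories (1 + 186 + 15 + 55), but the inner merge keeps only 196, so territories without a population row drop out of the totals silently. Here is a minimal sketch of one way to list them, using the `head()` rows shown earlier as stand-in data rather than the full files. A left merge with `indicator=True` keeps every territory and adds a `_merge` column recording whether a match was found. The names `df_sample`, `pop_sample`, `checked`, and `unmatched` are illustrative.

```python
import pandas as pd

# Stand-in data: the head() rows shown earlier, not the full files.
df_sample = pd.DataFrame({
    'territory': ['Afghanistan', 'Aland Islands', 'Albania', 'Algeria', 'American Samoa'],
    'alpha3': ['AFG', 'ALA', 'ALB', 'DZA', 'ASM'],
    'first_day': ['sat', 'mon', 'mon', 'sat', 'sun'],
})
pop_sample = pd.DataFrame({
    'alpha3': ['AFG', 'ALB', 'DZA', 'AND', 'AGO'],
    'population': [39.07, 2.87, 44.04, 0.08, 33.45],
})

# A left merge keeps every territory; _merge records where each row was found.
checked = df_sample.merge(pop_sample, on='alpha3', how='left', indicator=True)
unmatched = checked.loc[checked['_merge'] == 'left_only', ['territory', 'alpha3']]

print(f"{len(unmatched)} of {len(checked)} sample territories have no population row")
print(unmatched)
```

To run the same check on the full data, substitute `df` and `pop` for the sample frames.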

In [32]:
# Question 3: Regional analysis of first day preferences
print("=== Question 3: Regional Analysis of First Day Preferences ===")

# Merge all three datasets
df_complete = df.merge(pop, on='alpha3', how='inner').merge(regions, on='alpha3', how='inner')
print(f"Successfully merged all datasets: {len(df_complete)} territories")
print()

# Let's check the column names to make sure we're using the right ones
print("Columns in merged dataset:")
print(df_complete.columns.tolist())
print()

# Analyze by region - territory counts
print("Territory counts by region and first day:")
region_territory_analysis = df_complete.groupby(['four_regions', 'first_day']).size().unstack(fill_value=0)
print(region_territory_analysis)
print()

# Analyze by region - population weighted
print("Population (millions) by region and first day:")
region_population_analysis = df_complete.groupby(['four_regions', 'first_day'])['population'].sum().unstack(fill_value=0)
print(region_population_analysis.round(1))
print()

# Calculate percentages for each region
print("Percentage distribution within each region (by population):")
region_percentages = region_population_analysis.div(region_population_analysis.sum(axis=1), axis=0) * 100
print(region_percentages.round(1))
print()


=== Question 3: Regional Analysis of First Day Preferences ===
Successfully merged all datasets: 196 territories

Columns in merged dataset:
['territory', 'alpha3', 'first_day', 'units', 'paper', 'population', 'four_regions']

Territory counts by region and first day:
first_day     fri  mon  sat  sun
four_regions                    
africa          0   43    5    6
americas        0   15    0   20
asia            1   27   10   21
europe          0   46    0    2

Population (millions) by region and first day:
first_day     fri     mon    sat     sun
four_regions                            
africa        0.0   890.4  208.3   280.4
americas      0.0   126.9    0.0   892.4
asia          0.5  1742.0  223.5  2660.4
europe        0.0   840.8    0.0    10.9

Percentage distribution within each region (by population):
first_day     fri   mon   sat   sun
four_regions                       
africa        0.0  64.6  15.1  20.3
americas      0.0  12.5   0.0  87.5
asia          0.0  37.7   4.8  57.
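
To answer the third question directly, the dominant day in each region can be read off the population table. The sketch below re-enters the rounded population figures printed above as a stand-in for `region_population_analysis`, normalizes each row, and uses `idxmax(axis=1)` to pick the day with the largest share. Because the inputs are rounded, the shares may differ from the notebook's in the last digit. The 60% cutoff for calling a region "divided" is an arbitrary assumption chosen for illustration.

```python
import pandas as pd

# Population (millions) copied from the printed table above (rounded values).
region_population = pd.DataFrame(
    {'fri': [0.0, 0.0, 0.5, 0.0],
     'mon': [890.4, 126.9, 1742.0, 840.8],
     'sat': [208.3, 0.0, 223.5, 0.0],
     'sun': [280.4, 892.4, 2660.4, 10.9]},
    index=pd.Index(['africa', 'americas', 'asia', 'europe'], name='four_regions'),
)

# Divide each row by its own total so every region sums to 100%.
region_share = region_population.div(region_population.sum(axis=1), axis=0) * 100

summary = pd.DataFrame({
    'top_day': region_share.idxmax(axis=1),
    'top_share': region_share.max(axis=1).round(1),
})
# Hypothetical cutoff: a region counts as "divided" if no day exceeds 60%.
summary['divided'] = summary['top_share'] < 60
print(summary)
```

Under this cutoff, Europe and Africa come out as Monday regions and the Americas as a Sunday region, while Asia leans toward Sunday but is flagged as divided.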