# What is the First Day of the Week?

Whether the Gregorian calendar shows Sunday or Monday as the first day of the week depends on where you live.

Do more countries start the week on Sunday or on Monday? Do more people? How does the answer vary by region?

The file `first-day-of-week.csv` lists the first day of the week for each territory. The file `population.csv` gives each territory's population in 2020, in millions. The file `four-regions.csv` assigns each territory to one of four regions: `asia`, `europe`, `africa`, or `americas`.


In [1]:
# FOR GOOGLE COLAB ONLY.
# Uncomment and run the code below. A dialog will appear to upload files.
# Upload 'first-day-of-week.csv', 'population.csv', and 'four-regions.csv'.

# from google.colab import files
# uploaded = files.upload()

In [5]:
import pandas as pd
df = pd.read_csv('first-day-of-week.csv')
df.head()

Unnamed: 0,territory,alpha3,first_day,units,paper
0,Afghanistan,AFG,sat,metric,A4
1,Aland Islands,ALA,mon,metric,A4
2,Albania,ALB,mon,metric,A4
3,Algeria,DZA,sat,metric,A4
4,American Samoa,ASM,sun,metric,A4


In [2]:
pop = pd.read_csv('population.csv')
pop.head()

Unnamed: 0,alpha3,population
0,AFG,39.07
1,ALB,2.87
2,DZA,44.04
3,AND,0.08
4,AGO,33.45


In [3]:
regions = pd.read_csv('four-regions.csv')
regions.head()

Unnamed: 0,alpha3,four_regions
0,AUS,asia
1,BRN,asia
2,KHM,asia
3,CHN,asia
4,FJI,asia
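
Each preview above starts with an `Unnamed: 0` column, which pandas creates when the first column of a CSV has no header name. The sketch below shows one way to avoid it, assuming the raw files have an empty first header cell. The inline sample copies a few rows from the `df.head()` output and stands in for the real file.

```python
import io
import pandas as pd

# A few rows copied from the df.head() output above, standing in for the real file.
# Assumption: the first header cell in the real CSV is empty.
sample_csv = io.StringIO(
    ",territory,alpha3,first_day,units,paper\n"
    "0,Afghanistan,AFG,sat,metric,A4\n"
    "1,Aland Islands,ALA,mon,metric,A4\n"
    "2,Albania,ALB,mon,metric,A4\n"
)

# index_col=0 uses the unnamed first column as the row index
# instead of loading it as a data column called "Unnamed: 0".
sample = pd.read_csv(sample_csv, index_col=0)
print(sample.columns.tolist())
```

With the real data, the same argument would be passed as `pd.read_csv('first-day-of-week.csv', index_col=0)`.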


### Project Ideas

- How many territories start the week on each `first_day` (Friday, Saturday, Sunday, or Monday)?

- How many people live in territories that start the week on Friday, Saturday, Sunday, or Monday?
	- Hint: This will involve a `merge`.

- Which of the `four_regions` predominantly start the week on Sunday? On Monday? Are there any regions that are more divided between Sunday and Monday?
	- Hint: This will also involve a `merge`.
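
Both hints rely on `merge`, and the previews show that all three files share an `alpha3` column to join on. The sketch below is one possible approach, not the only one. It uses only the rows shown in the `head()` outputs, so the totals it prints are illustrative and are not results for the full data. It does a left merge with `indicator=True` so that territories without a population row stay visible instead of being dropped silently.

```python
import pandas as pd

# Rows copied from the head() previews above (illustrative subset only).
first_day = pd.DataFrame({
    "territory": ["Afghanistan", "Aland Islands", "Albania", "Algeria", "American Samoa"],
    "alpha3": ["AFG", "ALA", "ALB", "DZA", "ASM"],
    "first_day": ["sat", "mon", "mon", "sat", "sun"],
})
population = pd.DataFrame({
    "alpha3": ["AFG", "ALB", "DZA", "AND", "AGO"],
    "population": [39.07, 2.87, 44.04, 0.08, 33.45],
})

# A left merge keeps every territory; the _merge column records whether
# a matching population row was found ("both") or not ("left_only").
merged = first_day.merge(population, on="alpha3", how="left", indicator=True)
unmatched = merged.loc[merged["_merge"] == "left_only", "territory"].tolist()
print("No population row for:", unmatched)

# Population (in millions) per first day; missing values are skipped by sum().
totals = merged.groupby("first_day")["population"].sum()
print(totals)
```

Checking the unmatched list before summing helps explain why the population totals can cover fewer territories than the territory counts do.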

In [26]:
import pandas as pd
import pycountry

# Load datasets
first_day_df = pd.read_csv("first-day-of-week.csv")
population_df = pd.read_csv("population.csv")
regions_df = pd.read_csv("four-regions.csv")  # uses 'alpha3' column

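# Look up alpha-3 codes from territory names with pycountry.
# Note: the CSV already has an 'alpha3' column (see df.head() above); this
# overwrites it and drops any territory that pycountry cannot resolve.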
def get_alpha3(name):
    try:
        return pycountry.countries.lookup(name).alpha_3
    except LookupError:
        return None

first_day_df["alpha3"] = first_day_df["territory"].apply(get_alpha3)
first_day_df = first_day_df.dropna(subset=["alpha3"])

# Merge population
merged_pop = pd.merge(first_day_df, population_df, on="alpha3")

# Q1: Count how many territories start the week on each day
territory_counts = first_day_df["first_day"].value_counts()
print("Territories by first day:\n", territory_counts)

# Q2: Total population per start-of-week day
people_per_day = merged_pop.groupby("first_day")["population"].sum().sort_values(ascending=False)
print("\nPopulation by first day:\n", people_per_day)

# Q3: Merge with four_regions on 'alpha3'
merged_regions = pd.merge(merged_pop, regions_df, on="alpha3")

# Group by region and day, then pivot
region_day = merged_regions.groupby(["four_regions", "first_day"])["population"].sum().unstack().fillna(0)

# Determine dominant start day per region
region_day["dominant_day"] = region_day.idxmax(axis=1)

print("\nRegion-wise breakdown:\n", region_day)


Territories by first day:
 first_day
mon    147
sun     48
sat     15
fri      1
Name: count, dtype: int64

Population by first day:
 first_day
sun    3782.00
mon    3227.08
sat     431.79
fri       0.50
Name: population, dtype: float64

Region-wise breakdown:
 first_day     fri      mon     sat      sun dominant_day
four_regions                                            
africa        0.0   759.05  208.31   280.38          mon
americas      0.0   126.60    0.00   890.85          sun
asia          0.5  1736.39  223.48  2599.88          sun
europe        0.0   605.04    0.00    10.89          mon
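
The region table reports absolute populations in millions, which makes it hard to judge how divided a region is. As a sketch, the snippet below re-enters the figures printed above by hand (rather than recomputing them from the CSV files) and converts each row to percentage shares.

```python
import pandas as pd

# Population in millions per region and first day, copied from the output above.
region_day = pd.DataFrame(
    {
        "fri": [0.0, 0.0, 0.5, 0.0],
        "mon": [759.05, 126.60, 1736.39, 605.04],
        "sat": [208.31, 0.00, 223.48, 0.00],
        "sun": [280.38, 890.85, 2599.88, 10.89],
    },
    index=pd.Index(["africa", "americas", "asia", "europe"], name="four_regions"),
)

# Divide each row by its own total to get the percentage share of each day.
shares = region_day.div(region_day.sum(axis=1), axis=0).mul(100).round(1)

# The dominant day is the column with the largest population in each row.
shares["dominant_day"] = region_day.idxmax(axis=1)
print(shares)
```

A region whose largest share sits close to 50% is more divided than one where a single day covers nearly everyone.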
