# **Birth Rates and Women's Labour Force in Singapore**

In the last blog post, I took a look at Singapore's fertility rates over the years, and how they have evolved since the introduction of pro-natalist policies like the Baby Bonus. 

The preliminary conclusion from that post was that Singapore's fertility rates have been decreasing drastically over the years. Though we did not establish any sort of causality, we could see that the introduction of the Baby Bonus did not lead to a reversal of this trend.

In this project, we look at one factor to which economists attribute declining fertility rates: an increase in women's participation in the labour force. We can also treat this as a proxy indicator for another very important factor, women's education. In Gary Becker's model of fertility, for instance, the demand for children is tied to the "price" of a child. Since women in more educated societies have 'more to lose' in terms of opportunity costs, they are less likely to want a higher number of children.

We will hence explore the relationship between Singapore's birth rates (specifically the Total Fertility Rate, or TFR) and the female Resident Labour Force Participation Rate (LFPR).

In [27]:
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

To start off, let's import the CSV file, which is from data.gov.sg - the Singapore Government's open data repository, and look up TFR values.

In [28]:
# load Birth Rates Data
birth_rates_df = pd.read_csv('BirthsAndFertilityRatesAnnual.csv')

# read first few rows 
birth_rates_df.head()

| | DataSeries | 2024 | 2023 | 2022 | 2021 | 2020 | 2019 | 2018 | 2017 | 2016 | ... | 1969 | 1968 | 1967 | 1966 | 1965 | 1964 | 1963 | 1962 | 1961 | 1960 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Total Fertility Rate (TFR) | 0.97 | 0.97 | 1.04 | 1.12 | 1.1 | 1.14 | 1.14 | 1.16 | 1.2 | ... | 3.22 | 3.53 | 3.91 | 4.46 | 4.66 | 4.97 | 5.16 | 5.21 | 5.41 | 5.76 |
| 1 | 15 - 19 Years | 2.3 | 2.2 | 2.1 | 2.2 | 2.3 | 2.5 | 2.5 | 2.6 | 2.7 | ... | 27.1 | 30.9 | 35.8 | 33.0 | 35.9 | 38.3 | 45.7 | 52.0 | 63.4 | 69.6 |
| 2 | 20 - 24 Years | 9.8 | 10.6 | 11.2 | 11.7 | 12.7 | 12.7 | 14.4 | 15.1 | 17.0 | ... | 150.1 | 165.8 | 195.8 | 218.5 | 227.1 | 240.0 | 249.0 | 245.5 | 241.1 | 250.5 |
| 3 | 25 - 29 Years | 42.6 | 43.7 | 48.8 | 53.4 | 54.6 | 59.4 | 60.6 | 62.2 | 65.8 | ... | 227.8 | 236.6 | 244.7 | 261.2 | 259.5 | 277.6 | 287.2 | 291.7 | 304.9 | 323.9 |
| 4 | 30 - 34 Years | 79.3 | 78.7 | 86.7 | 92.9 | 90.8 | 92.4 | 92.9 | 93.3 | 96.2 | ... | 134.3 | 152.0 | 166.7 | 202.0 | 216.2 | 226.7 | 228.7 | 231.5 | 238.4 | 259.7 |


Now we filter and clean up the data so that we are left with just the year and the TFR as columns. This will make it easier to create the plot later.

In [29]:
# first filter for TFR only 
tfr_df = birth_rates_df[birth_rates_df['DataSeries'] == 'Total Fertility Rate (TFR)'].copy()

# then we use melt to have year as a column
tfr_melt = tfr_df.melt(id_vars=['DataSeries'], var_name='Year', value_name='TFR')
tfr_melt['Year'] = pd.to_numeric(tfr_melt['Year'], errors='coerce')
tfr_melt['TFR'] = pd.to_numeric(tfr_melt['TFR'], errors='coerce')

# drop the DataSeries column since it's no longer needed
tfr_melt = tfr_melt.drop(columns=['DataSeries'])

# sort by Year
tfr_melt = tfr_melt.sort_values('Year').reset_index(drop=True)

tfr_melt.head()

| | Year | TFR |
|---|---|---|
| 0 | 1960 | 5.76 |
| 1 | 1961 | 5.41 |
| 2 | 1962 | 5.21 |
| 3 | 1963 | 5.16 |
| 4 | 1964 | 4.97 |
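
To make the reshaping step concrete, here is a minimal sketch of what `melt` does to a wide table. The three values are copied from the TFR row shown earlier; the names `wide` and `long` are only used for this illustration and are not part of the notebook.

```python
import pandas as pd

# Toy wide-format table shaped like the data.gov.sg file:
# one row per series, one column per year (values taken from the TFR row above)
wide = pd.DataFrame({
    "DataSeries": ["Total Fertility Rate (TFR)"],
    "1960": [5.76],
    "1961": [5.41],
    "1962": [5.21],
})

# melt turns every year column into its own row
long = wide.melt(id_vars=["DataSeries"], var_name="Year", value_name="TFR")

# column labels arrive as strings, so convert them before sorting or merging
long["Year"] = pd.to_numeric(long["Year"], errors="coerce")

print(long[["Year", "TFR"]])
```

The conversion to numeric matters later on: merging a string `Year` column with an integer one would either fail or match nothing, depending on the pandas version.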


Next, we carry out the same process for the female labour force participation rate.

In [30]:
# load labour force participation data
labour_df = pd.read_csv('ResidentLabourForceParticipationRatebySex.csv')

# inspect data
labour_df.head()

| | year | sex | lfpr |
|---|---|---|---|
| 0 | 1991 | male | 79.4 |
| 1 | 1991 | female | 48.0 |
| 2 | 1992 | male | 79.4 |
| 3 | 1992 | female | 48.6 |
| 4 | 1993 | male | 78.8 |


In [31]:
# filter for female
labour_df = labour_df[labour_df['sex'] == 'female'].copy()

# ensure Year is numeric
labour_df['year'] = pd.to_numeric(labour_df['year'], errors='coerce')

# rename columns to merge
labour_df = labour_df.rename(columns={'year': 'Year', 'lfpr': 'Female_LFPR'})

# drop the sex column as it's no longer needed
labour_df = labour_df.drop(columns=['sex'])

labour_df.head()

| | Year | Female_LFPR |
|---|---|---|
| 1 | 1991 | 48.0 |
| 3 | 1992 | 48.6 |
| 5 | 1993 | 48.0 |
| 7 | 1994 | 48.6 |
| 9 | 1996 | 49.9 |
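
Note that the output above jumps from 1994 to 1996, so the labour series has no entry for 1995. A quick gap check like the sketch below can surface such missing years before plotting. Here `years` is a hypothetical stand-in for `labour_df['Year']`, filled with the five values shown above.

```python
import pandas as pd

# Years copied from the filtered output above; stand-in for labour_df['Year']
years = pd.Series([1991, 1992, 1993, 1994, 1996], name="Year")

# every year we would expect if the series had no gaps
expected = range(int(years.min()), int(years.max()) + 1)

# years that are expected but absent from the data
missing_years = sorted(set(expected) - set(years.tolist()))

print(missing_years)
```

Any year listed here will also be absent from the merged dataframe, and a line chart will simply connect the points on either side of the gap.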


Next, we merge the two dataframes to create a single dataframe to plot.

In [32]:
# inner-join the two dataframes on Year, keeping only years present in both
merge_df = pd.merge(tfr_melt, labour_df, on='Year', how='inner')

# preview the first few merged rows
merge_df.head()

| | Year | TFR | Female_LFPR |
|---|---|---|---|
| 0 | 1991 | 1.73 | 48.0 |
| 1 | 1992 | 1.72 | 48.6 |
| 2 | 1993 | 1.74 | 48.0 |
| 3 | 1994 | 1.71 | 48.6 |
| 4 | 1996 | 1.66 | 49.9 |
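
Because this is an inner join, any year that appears in only one of the two dataframes is silently dropped (the TFR series starts in 1960, while the labour series starts in 1991). One way to see which rows fall out is an outer merge with `indicator=True`, sketched below on small stand-in frames. `tfr_small` and `lfpr_small` are hypothetical names, and their values are copied from the outputs above.

```python
import pandas as pd

# Small stand-ins for tfr_melt and labour_df, using values shown earlier
tfr_small = pd.DataFrame({"Year": [1960, 1991, 1992], "TFR": [5.76, 1.73, 1.72]})
lfpr_small = pd.DataFrame({"Year": [1991, 1992], "Female_LFPR": [48.0, 48.6]})

# an outer merge keeps every year; the _merge column records where each row came from
check = pd.merge(tfr_small, lfpr_small, on="Year", how="outer", indicator=True)

# rows that an inner merge would have discarded
dropped = check.loc[check["_merge"] != "both", ["Year", "_merge"]]

print(dropped)
```

In this toy case only 1960 is flagged, as `left_only`, since it has a TFR value but no matching labour force figure.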


We will now use a dual-axis line chart to show how both the Total Fertility Rate and Women's Labour Force Participation Rate have changed over time.

In [33]:
# create figure with secondary y-axis (help from AI to figure this bit out)
fig = make_subplots(specs=[[{"secondary_y": True}]])

fig.add_trace(
    go.Scatter(x=merge_df['Year'], y=merge_df['TFR'], name="Total Fertility Rate (TFR)", mode='lines+markers'),
    secondary_y=False,
)

fig.add_trace(
    go.Scatter(x=merge_df['Year'], y=merge_df['Female_LFPR'], name="Women's Labour Force Participation (%)", mode='lines+markers'),
    secondary_y=True,
)

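# adding title, with the year range taken from the merged data rather than hard-coded
fig.update_layout(
    title_text=(
        "Relationship between Birth Rate and Women's Labour Force Participation in Singapore "
        f"({merge_df['Year'].min()}-{merge_df['Year'].max()})"
    )
)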

# setting x-axis title
fig.update_xaxes(title_text="Year")

# setting y-axes titles
fig.update_yaxes(title_text="Total Fertility Rate (TFR)", secondary_y=False)
fig.update_yaxes(title_text="Women's Labour Force Participation Rate (%)", secondary_y=True)

fig.show()

## Takeaways

As we can see above, there seems to be an inverse correlation between the Total Fertility Rate (TFR) and the Women's Labour Force Participation Rate (LFPR) in Singapore. TFR declined from 1.73 in 1991 to 1.04 in 2022. Women's LFPR, which stood at 48.0% in 1991, rose over the same period.

Once again, though we cannot draw a direct cause-and-effect relationship, this inverse correlation aligns with the theories of many economists, and could hence warrant more in-depth analysis.
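
As a first step towards that analysis, the strength of the relationship can be put into a single number with a Pearson correlation coefficient. The sketch below uses only the five merged rows shown earlier, so it is an illustration of the method rather than a result. In the notebook, the same call would be made on the full `merge_df`. The name `sample` is hypothetical.

```python
import pandas as pd

# The five merged rows displayed earlier; swap in merge_df for the real analysis
sample = pd.DataFrame({
    "Year": [1991, 1992, 1993, 1994, 1996],
    "TFR": [1.73, 1.72, 1.74, 1.71, 1.66],
    "Female_LFPR": [48.0, 48.6, 48.0, 48.6, 49.9],
})

# Series.corr defaults to the Pearson correlation coefficient
r = sample["TFR"].corr(sample["Female_LFPR"])

print(f"Pearson r = {r:.2f}")
```

A negative value of `r` is consistent with the inverse relationship seen in the chart. Keep in mind that two series which both trend steadily over time can show a strong correlation without any causal link between them.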

