# Project 1: Birth Rates in Singapore


Singapore has one of the **world's lowest fertility rates**. It vies with East Asian countries such as South Korea for the lowest total fertility rate (TFR) each year. To raise birth rates to at least replacement level (TFR = 2.1), the Singapore government has introduced various pro-natalist policies, such as the Baby Bonus, a direct cash subsidy introduced in the early 2000s. Whilst we cannot establish causality in this short project, it would be interesting to see visually whether birth rates have stabilised since these policies were introduced. Before moving on to that, however, we can start with some summary statistics to understand the problem.

To start, let's load the CSV file from data.gov.sg, the Singapore Government's open data repository, and look up the TFR values.

In [35]:
import csv

tfr_sg = "BirthsAndFertilityRatesAnnual.csv"

# opens the file and saves the row dictionary for TFR in the variable tfr_row
with open(tfr_sg, "r") as f:
    reader = csv.DictReader(f)
    for row in reader:
        if "Total Fertility Rate" in row["DataSeries"]:
            tfr_row = row
            break

# looping through every key in the dictionary, keeping only the year columns (all-digit keys)
years = [int(k) for k in tfr_row if k.isdigit()]

# looping through every year, and looking up its corresponding TFR
values = [float(tfr_row[str(y)]) for y in years]
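
The cell above relies on the file being in a wide layout, with one column per year and one row per data series. As a minimal sketch of that idea, the snippet below parses a made-up in-memory sample (the rows, numbers, and the `na` cell are illustrative assumptions, not real data) and shows one way to skip non-numeric cells and keep the years in chronological order, in case the real file ever needs it. The names `sample`, `demo_row`, `to_float`, `years_demo`, and `values_demo` are hypothetical.

```python
import csv
import io

# Made-up sample in a wide layout (one column per year).
# The figures and the "na" cell are illustrative only.
sample = io.StringIO(
    "DataSeries,2024,2023,2022\n"
    "Total Live-Births (Number),100,na,120\n"
    "Total Fertility Rate (TFR) (Per Female),0.97,na,1.04\n"
)

# Find the first row whose label mentions the TFR.
demo_row = None
for row in csv.DictReader(sample):
    if "Total Fertility Rate" in row["DataSeries"]:
        demo_row = row
        break


def to_float(cell):
    """Return the cell as a float, or None if it is not numeric."""
    try:
        return float(cell)
    except ValueError:
        return None


# Keep only year columns that hold a usable number.
pairs = []
for key, cell in demo_row.items():
    if key.isdigit():
        value = to_float(cell)
        if value is not None:
            pairs.append((int(key), value))

pairs.sort()  # chronological order
years_demo = [y for y, _ in pairs]
values_demo = [v for _, v in pairs]
print(years_demo, values_demo)
```

Keeping each year and its value together as a pair until the end means that filtering or sorting cannot put the two lists out of step.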


Great! Now let's get some summary statistics, starting with the mean, which is calculated by dividing the sum of all TFR values by the number of years in the dataset.

In [34]:
mean_tfr = sum(values) / len(values)
print(mean_tfr)

2.063076923076923


So the mean TFR between 1965 and now has been below the replacement rate of TFR = 2.1.

Next, let's try and get the median value.

In [37]:
values_sort = sorted(values)
n = len(values_sort)
mid = n // 2

# checking if list length is even or odd  
# if odd, the middle number is indeed the median
if n % 2 == 1:
    median_tfr = values_sort[mid]

# if even, the average of the two middle numbers is the median
else:
    median_tfr = (values_sort[mid - 1] + values_sort[mid]) / 2

print(median_tfr)

1.62
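
A single run only exercises one of the two branches, depending on whether the list length is odd or even. As a rough sketch, the same logic can be wrapped in a function and tried on short lists of each kind. The function name `median_of` and the toy numbers are assumptions made for illustration.

```python
def median_of(nums):
    """Median of a non-empty list of numbers."""
    ordered = sorted(nums)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # odd length: the middle element is the median
        return ordered[mid]
    # even length: average the two middle elements
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_result = median_of([2.1, 1.2, 1.6])        # middle of 1.2, 1.6, 2.1
even_result = median_of([3.0, 1.2, 2.1, 1.6])  # mean of 1.6 and 2.1
print(odd_result, even_result)
```

Note the parentheses around the sum in the even case: without them, only the second number would be halved before the addition.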


Once again, the median TFR between 1965 and now has been below the replacement rate of TFR = 2.1. Now let's check the mode.


In [39]:
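# count how many times each TFR value appears
counts = {}
for value in values:
    counts[value] = counts.get(value, 0) + 1

# the mode is every value that reaches the highest count (there can be ties)
max_count = max(counts.values())
modes = [v for v, c in counts.items() if c == max_count]
print(modes)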


[1.61]
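
As a sanity check on the hand-written calculations, Python's standard `statistics` module offers the same three measures. The sketch below runs them on a small made-up list (`toy` is an illustrative assumption, not the real series); `statistics.multimode` requires Python 3.8 or later.

```python
import statistics

# Illustrative toy list, not the real TFR series.
toy = [1.61, 1.25, 1.61, 2.10, 1.25, 1.61]

toy_mean = statistics.fmean(toy)       # arithmetic mean as a float
toy_median = statistics.median(toy)    # averages the two middle values when the length is even
toy_modes = statistics.multimode(toy)  # list of all most-common values

print(toy_mean, toy_median, toy_modes)
```

In the notebook, passing `values` instead of `toy` should reproduce the mean, median, and mode printed above. Counting floats by exact equality is reasonable here because every value is parsed from a two-decimal string, so equal readings become identical floats.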


Now, we want to see if there have been any visible changes in TFR since the introduction of the Baby Bonus subsidies in 2001. Once again, although this exercise does not imply causality, it could give us a general idea of the trend since the policy's introduction.

*For this part, I solicited the help of ChatGPT to give me a general idea of where to go and how to get started (in line with the referencing guidelines in the course syllabus).*

In [47]:
def bar_chart(labels, nums, year_min=None, year_max=None, width=40, title="Total Fertility Rate (TFR) in Singapore"):
    
    # pair up years & values, and filter so that we can zoom into 2001-24 
    data = [(y, v) for y, v in zip(labels, nums)
            if (year_min is None or y >= year_min) and (year_max is None or y <= year_max)]
    
    # sort by year for chronological order
    data.sort()

    # compute range for normalization
    values = [v for _, v in data]
    vmin, vmax = min(values), max(values)
    span = vmax - vmin or 1.0

    # print title with divider
    print(f"\n{title}\n{'=' * len(title)}")

    # drawing the bars 
    for year, v in data:
        # important - dividing by span rescales the TFRs so that they fall between 0 and 1
        norm = (v - vmin) / span
        # normalised number * width of total bar chart, rounded.
        # also ensuring that every bar has at least one star
        bar = "*" * max(1, int(round(norm * width)))
        print(f"{year}: {bar} {v:.2f}")

bar_chart(years, values, year_min=2001, year_max=2024)


Total Fertility Rate (TFR) in Singapore
=======================================
2001: **************************************** 1.41
2002: ************************************ 1.37
2003: *************************** 1.27
2004: ************************** 1.26
2005: ************************** 1.26
2006: **************************** 1.28
2007: ***************************** 1.29
2008: **************************** 1.28
2009: *********************** 1.22
2010: **************** 1.15
2011: ********************* 1.20
2012: ***************************** 1.29
2013: ******************** 1.19
2014: ************************* 1.25
2015: ************************* 1.24
2016: ********************* 1.20
2017: ***************** 1.16
2018: *************** 1.14
2019: *************** 1.14
2020: ************ 1.10
2021: ************** 1.12
2022: ****** 1.04
2023: * 0.97
2024: * 0.97
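
To make the scaling concrete, here is a small sketch that recomputes a few bar lengths by hand, using the minimum (0.97) and maximum (1.41) from the chart output above. The helper name `bar_length` is an assumption for illustration; the formula is the same min-max scaling used in `bar_chart`.

```python
# Minimum, maximum, and chart width taken from the 2001-2024 chart above.
vmin, vmax, width = 0.97, 1.41, 40


def bar_length(v):
    """Number of stars drawn for value v under min-max scaling."""
    norm = (v - vmin) / (vmax - vmin)  # 0 at the minimum, 1 at the maximum
    return max(1, int(round(norm * width)))


for v in (1.41, 1.37, 1.15, 0.97):
    print(f"{v:.2f} -> {bar_length(v)} stars")
```

Because the bars start at the minimum rather than at zero, their lengths exaggerate the differences: 1.37 is only about 3% lower than 1.41, yet its bar is 10% shorter, and the lowest value gets a single star even though it is roughly two-thirds of the highest. The numbers printed beside each bar are the figures to read; the stars show only relative position within the 2001-2024 range.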


Interestingly, from what we can see above, Singapore's TFR has continued to fall even after pro-natalist policies were implemented. This could mean that the subsidies are working and the drop in TFR would have been even worse without them, or that the subsidies have had no impact on declining birth rates.
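
One way to put a number on the decline, without claiming anything about its cause, is to fit a straight line through the 2001-2024 values and read off the slope. The sketch below does this with a hand-written least-squares formula; the values are copied from the chart output above, and the names `tfr_2001_2024`, `years_2001_2024`, and `ols_slope` are assumptions for illustration.

```python
# TFR values for 2001-2024, copied from the chart output above.
tfr_2001_2024 = [
    1.41, 1.37, 1.27, 1.26, 1.26, 1.28, 1.29, 1.28,
    1.22, 1.15, 1.20, 1.29, 1.19, 1.25, 1.24, 1.20,
    1.16, 1.14, 1.14, 1.10, 1.12, 1.04, 0.97, 0.97,
]
years_2001_2024 = list(range(2001, 2025))


def ols_slope(xs, ys):
    """Slope of the least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # covariance of x and y divided by the variance of x
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


slope = ols_slope(years_2001_2024, tfr_2001_2024)
print(f"Average change in TFR per year: {slope:.4f}")
```

A negative slope only describes the average yearly change over the period. It cannot distinguish between the two explanations above, since we do not observe what the TFR would have been without the subsidies.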

There is a lot of interesting literature on this topic, including Gary Becker's quality-quantity trade-off. He suggests that families, who have a limited budget of time and money, must choose between having more children (quantity) and investing more resources, such as education, in each child (quality).

My hypothesis (unproven) is that the high-stress education system, and the educational arms race it perpetuates, lead parents in Singapore to devote a lot of resources to fewer children. I think this framework can be applied to other countries, such as South Korea, as well.