For my final project, I decided to apply what I’ve learned in this course to illustrate the relationship between agriculture, supply chains, food security, and life expectancy. I chose this topic because agriculture has always resonated with me. Growing up on a farm, I developed a deep appreciation for food by seeing the whole process, from planting and harvesting to the meal served at the table. I’m hopeful that, with advancements in technology, we can help people access food wherever they are, treating it as a necessity rather than a luxury.

For this assignment, I first looked into the percentage of the population that is undernourished. Then, I explored why certain regions are severely undernourished. Is it because there isn’t enough labor working in the fields? Is it due to a lower allocation of protein production, or is it because insufficient transportation makes it difficult for harvested goods to reach consumers? Finally, I am curious whether there is a direct correlation between food supply and life expectancy.

Disclaimer:
- The term "Sub-Saharan Africa" used in this project refers to the geographical region of the continent of Africa that lies south of the Sahara.
- All of the data in this assignment comes from Our World in Data (https://ourworldindata.org/)
- This project and all relevant files can also be retrieved from my GitHub: https://github.com/mariyyy9/Data_Visualization/blob/Final_Project/Mari_CSCA5702_Final_Project.ipynb

In [40]:
import pandas as pd
import altair as alt

# Pull undernourishment for two regions: t1_sample = 'Africa' and
# t2_sample = 'Europe', with 'World' included as a benchmark in both charts
t1 = pd.read_csv("prevalence-of-undernourishment.csv")
t1_sample = t1[t1['Country'].str.contains('Africa|World', case=False, na=False)]
chart1 = alt.Chart(t1_sample).mark_line().encode(
    x ='Year',
    y ='Percent',
    color = alt.Color('Country', type='nominal'),
    tooltip = ['Country','Percent']
).interactive()

t2 = pd.read_csv("prevalence-of-undernourishment.csv")
t2_sample = t2[t2['Country'].str.contains('Europe|World', case=False, na=False)]
chart2 = alt.Chart(t2_sample).mark_line().encode(
    x = 'Year',
    y = 'Percent',
    color = alt.Color('Country', type='nominal'),
    tooltip = ['Country','Percent']
).interactive()

chart1 | chart2


The side-by-side comparison shows that a few regions in Africa sit above the world average, with a large share of the population still undernourished, i.e., Middle Africa, East Africa, and most Sub-Saharan African countries. I then looked at whether the shortage was due to a lack of agricultural labor, but surprisingly, this is not the case.
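As a rough numeric companion to the charts, the gap between each region and the 'World' benchmark can be computed directly. This is a minimal sketch, assuming the same 'Country', 'Year', and 'Percent' columns used in the cell above; the helper name `regions_above_world` is hypothetical, and the numbers in `toy` are placeholders for illustration, not values from the dataset.

```python
import pandas as pd

def regions_above_world(df: pd.DataFrame) -> pd.Series:
    """Return regions whose latest-year value exceeds the 'World' row,
    as a gap in percentage points, largest gap first."""
    latest = df[df["Year"] == df["Year"].max()]
    by_region = latest.set_index("Country")["Percent"]
    gap = by_region.drop("World") - by_region["World"]
    return gap[gap > 0].sort_values(ascending=False)

# Placeholder numbers for illustration only, NOT taken from the dataset.
toy = pd.DataFrame({
    "Country": ["World", "Region A", "Region B", "World", "Region A", "Region B"],
    "Year": [2020, 2020, 2020, 2021, 2021, 2021],
    "Percent": [9.0, 20.0, 5.0, 10.0, 25.0, 4.0],
})

above = regions_above_world(toy)
print(above)
```

With the real data, the same call on `t1_sample` would list only the regions that sit above the world line in the most recent year, provided each region appears once per year.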

In [64]:
t3 = pd.read_csv("value-added-in-the-agricultural.csv")
alt.Chart(t3).mark_circle().encode(
    x ='GDP per capita',
    y ='Agriculture (% of GDP)',
    color = alt.Color('Region', type='nominal'),
    tooltip = ['Country','Population (historical)'],
    size = 'Agriculture (% of GDP)'
).interactive()


The illustration for 2021 shows that the majority of Sub-Saharan African countries rely heavily on agriculture as a key contributor to their economies, with some countries approaching 40% of GDP. I now wonder if the lack of diversification in their agricultural produce is a major factor contributing to undernourishment.

In [79]:
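t4 = pd.read_csv("meat-production-tonnes.csv")
# Keep 2021 only, then filter that subset (not the full table) by country
t4_sample = t4[t4['Year'] == 2021]
# isin matches names exactly, so 'Niger' does not also pull in 'Nigeria';
# this assumes the CSV uses exactly these country names
countries = ['Brazil', 'United States', 'Russia', 'Sierra Leone', 'Niger']
t4_sample = t4_sample[t4_sample['Country'].isin(countries)]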
alt.Chart(t4_sample).mark_bar().encode(
    x = 'Country',
    y = 'Tonnes',
    color = alt.Color('Country', type='nominal'),
    tooltip = ['Country','Tonnes']
).interactive()


Protein production seems to point to part of the answer. For example, Russia has five times the population of Niger, yet its meat production is 100 times greater. The lack of production certainly plays a role in undernourishment, but what about shipping? How efficient is the transportation of post-harvest produce to people in these regions?
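To make the Russia and Niger comparison concrete, the two ratios quoted above can be turned into a per-person figure. This is only a back-of-the-envelope sketch using the rounded ratios from the text, not values recomputed from the CSV; the variable names are my own.

```python
# Rounded ratios as stated in the text (Russia relative to Niger).
population_ratio = 5      # Russia has about 5x Niger's population
production_ratio = 100    # Russia produces about 100x Niger's meat

# Dividing total production by population gives production per person,
# so the per-person gap is the ratio of the two ratios.
per_capita_ratio = production_ratio / population_ratio
print(f"Meat produced per person: about {per_capita_ratio:.0f}x higher in Russia")
```

Even after adjusting for population size, the gap in meat production per person remains roughly twentyfold under these rounded figures.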

In [86]:
t5 = pd.read_csv("food-loss-postharvest-by-region.csv")
t5_sample = t5[t5['Country/Region'].str.contains('Africa|World')]
alt.Chart(t5_sample).mark_line().encode(
    x = 'Year',
    y = 'Percent',
    color = alt.Color('Country/Region', type='nominal'),
    tooltip = ['Country/Region','Percent']
)


The numbers are quite shocking. It appears that all regions in Africa experience significant post-harvest food loss, including losses during transportation, with some areas losing nearly 25%. This definitely worsens the situation, on top of the lack of production diversification. It also points to potential areas for future focus, such as improving supply chains and developing technology for post-harvest storage.

Finally, I'm curious whether there is a correlation between food supply and life expectancy.

In [100]:
t6 = pd.read_csv("food-supply-vs-life-expectancy.csv")
t6_sample = t6[t6['Year'] == 2021]
alt.Chart(t6_sample).mark_circle().encode(
    x = 'Life expectancy',
    y = 'Daily calorie supply per person',
    color = alt.Color('Country', type='nominal'),
    tooltip = ['Country','Daily calorie supply per person', 'Life expectancy'],
    size = 'Life expectancy'
).interactive()


The life expectancy in the majority of Sub-Saharan African countries falls below the world average, which is truly saddening to witness through the data. However, with the help of visualizations, the issues can be more easily identified and understood, providing clearer direction for addressing them.
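A scatter plot suggests a relationship visually, and a Pearson correlation coefficient is one simple way to put a number on it. The sketch below assumes the same 'Daily calorie supply per person' and 'Life expectancy' columns used in the last cell; the helper name `calorie_life_correlation` is hypothetical, and the `toy` values are placeholders for illustration, not real data.

```python
import pandas as pd

def calorie_life_correlation(
    df: pd.DataFrame,
    x: str = "Daily calorie supply per person",
    y: str = "Life expectancy",
) -> float:
    """Pearson correlation between two columns, ignoring incomplete rows."""
    clean = df[[x, y]].dropna()
    return clean[x].corr(clean[y])  # Series.corr uses Pearson by default

# Placeholder numbers for illustration only, NOT taken from the dataset.
toy = pd.DataFrame({
    "Daily calorie supply per person": [2000, 2500, 3000, 3500, None],
    "Life expectancy": [60.0, 65.0, 70.0, 75.0, 80.0],
})

r = calorie_life_correlation(toy)
print(f"Pearson r = {r:.2f}")
```

Running `calorie_life_correlation(t6_sample)` on the 2021 data would give a value between -1 and 1; a value near 1 would support a strong positive association, though correlation alone does not show that food supply causes longer lives.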