![Callysto.ca Banner](https://github.com/callysto/curriculum-notebooks/blob/master/callysto-notebook-banner-top.jpg?raw=true)

<a href="https://hub.callysto.ca/jupyter/hub/user-redirect/git-pull?repo=https%3A%2F%2Fgithub.com%2Fcallysto%2Fcurriculum-notebooks&branch=master&subPath=SocialStudies/HansardAnalysis/hansard-analysis.ipynb&depth=1" target="_parent"><img src="https://raw.githubusercontent.com/callysto/curriculum-notebooks/master/open-in-callysto-button.svg?sanitize=true" width="123" height="24" alt="Open in Callysto"/></a>

# Callysto's Weekly Data Visualization


## Global Food Prices

### Recommended Grade levels: 5-9

### Instructions

Click "Cell" and select "Run All".

This will import the data and run all the code, so you can see this week's data visualization. Scroll back to the top after you’ve run the cells.

![instructions](https://github.com/callysto/data-viz-of-the-week/blob/main/images/instructions.png?raw=true)

**You don't need to do any coding to view the visualizations**.

The plots generated in this notebook are interactive. You can hover over and click on elements to see more information. 

Email contact@callysto.ca if you experience issues.

### About this Notebook

Callysto's Weekly Data Visualization is a learning resource that aims to develop data literacy skills. We provide Grades 5-12 teachers and students with a data visualization, like a graph, to interpret. This companion resource walks learners through how the data visualization is created and interpreted by a data scientist. 

The steps of the data analysis process are listed below and applied to each weekly topic.

1. Question - What are we trying to answer?
2. Gather - Find the data source(s) you will need. 
3. Organize - Arrange the data, so that you can easily explore it. 
4. Explore - Examine the data to look for evidence to answer the question. This includes creating visualizations. 
5. Interpret - Describe what's happening in the data visualization. 
6. Communicate - Explain how the evidence answers the question. 

## Question

How do food prices vary around the world, and how do fluctuations in those prices affect economies and societies?

### Goal



### Background



## Gather

Global food price data was downloaded from [Kaggle](https://www.kaggle.com/datasets/lasaljaywardena/global-food-prices-dataset?resource=download). The data originates from the [WFP](https://www.wfp.org/) (World Food Programme) and is distributed by [HDX](https://data.humdata.org/) (the Humanitarian Data Exchange).

### Code: 

Run the code cells below to import the libraries we need for this project. Libraries are collections of pre-written code that make it easier to analyze our data.

In [11]:
import pandas as pd
import plotly.express as px
from plotly.subplots import make_subplots
import plotly.graph_objects as go
print("Libraries imported")

Libraries imported


In [12]:
# Read adm1_name (region names) as text so pandas does not have to guess its type
data_types = {'adm1_name': str}
global_food_prices = pd.read_csv('global_food_prices.csv', dtype=data_types)

global_food_prices

Unnamed: 0,adm0_name,adm1_name,mkt_name,cm_name,cur_name,pt_name,um_name,mp_month,mp_year,mp_price,mp_commoditysource
0,Afghanistan,Badakhshan,Fayzabad,Bread - Retail,AFN,Retail,KG,1,2014,50.0000,
1,Afghanistan,Badakhshan,Fayzabad,Bread - Retail,AFN,Retail,KG,2,2014,50.0000,
2,Afghanistan,Badakhshan,Fayzabad,Bread - Retail,AFN,Retail,KG,3,2014,50.0000,
3,Afghanistan,Badakhshan,Fayzabad,Bread - Retail,AFN,Retail,KG,4,2014,50.0000,
4,Afghanistan,Badakhshan,Fayzabad,Bread - Retail,AFN,Retail,KG,5,2014,50.0000,
...,...,...,...,...,...,...,...,...,...,...,...
1439616,Zimbabwe,Midlands,Mbilashaba,Salt - Retail,ZWL,Retail,KG,6,2021,71.0000,
1439617,Zimbabwe,Midlands,Mbilashaba,Beans (sugar) - Retail,ZWL,Retail,KG,6,2021,233.3333,
1439618,Zimbabwe,Midlands,Mbilashaba,Toothpaste - Retail,ZWL,Retail,100 ML,6,2021,112.5000,
1439619,Zimbabwe,Midlands,Mbilashaba,Laundry soap - Retail,ZWL,Retail,KG,6,2021,114.0000,


In [25]:
print(global_food_prices.dtypes)

adm0_name              object
adm1_name              object
mkt_name               object
cm_name                object
cur_name               object
pt_name                object
um_name                object
mp_month                int64
mp_year                 int64
mp_price              float64
mp_commoditysource    float64
dtype: object


In [26]:
global_food_prices['pt_name'].value_counts()

Retail       1287998
Wholesale     150711
Farm Gate        664
Producer         248
Name: pt_name, dtype: int64
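
The raw counts above are easier to compare as percentages. The following is a minimal sketch that rebuilds the counts by hand (copied from the output above) so it runs on its own; on the full dataset, `global_food_prices['pt_name'].value_counts(normalize=True)` should give the same shares as fractions. The names `price_type_counts` and `price_type_share` are only illustrative.

```python
import pandas as pd

# Counts copied from the value_counts() output above
price_type_counts = pd.Series(
    {"Retail": 1287998, "Wholesale": 150711, "Farm Gate": 664, "Producer": 248},
    name="pt_name",
)

# Share of each price type as a percentage of all counted records
price_type_share = (price_type_counts / price_type_counts.sum() * 100).round(2)
print(price_type_share)
```

Roughly nine in ten records are retail prices, which is one reason the rest of this notebook focuses on retail products.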

In [14]:
count_of_food = global_food_prices['cm_name'].value_counts().nlargest(20)
top_20 = pd.DataFrame({'cm_name': count_of_food.index, 'count': count_of_food.values})

top_20_foods = px.bar(top_20, x='cm_name', y='count', title='Top 20 Foods by Highest Count', color='count', labels={"count":"Count", "cm_name":"Foods"})
top_20_foods.show()

In [15]:
highest_year = global_food_prices['mp_year'].max()
lowest_year = global_food_prices['mp_year'].min()
print(f"Highest Year: {highest_year}")
print(f"Lowest Year: {lowest_year}")

Highest Year: 2021
Lowest Year: 1990


In [16]:
high_temp = global_food_prices[global_food_prices['mp_year'] == highest_year]
low_temp = global_food_prices[global_food_prices['mp_year'] == lowest_year]

top_20_foods_highest_year = high_temp['cm_name'].value_counts().nlargest(20)
top_20_highest_year = pd.DataFrame({'cm_name': top_20_foods_highest_year.index, 'count': top_20_foods_highest_year.values})

top_20_foods_lowest_year = low_temp['cm_name'].value_counts().nlargest(20)
top_20_lowest_year = pd.DataFrame({'cm_name': top_20_foods_lowest_year.index, 'count': top_20_foods_lowest_year.values})

high_low_fig = make_subplots(rows=2, cols=1, vertical_spacing=0.2)

high_low_fig.add_trace(go.Bar(x=top_20_highest_year['cm_name'], y=top_20_highest_year['count'], name='Highest Year'), row=1, col=1)
high_low_fig.add_trace(go.Bar(x=top_20_lowest_year['cm_name'], y=top_20_lowest_year['count'], name='Lowest Year'), row=2, col=1)

high_low_fig.update_layout(title_text=f"Top 20 Foods by Highest Count in Highest ({highest_year}) and Lowest ({lowest_year}) Years")
high_low_fig.update_yaxes(title="Count", row=1, col=1)
high_low_fig.update_yaxes(title="Count", row=2, col=1)


high_low_fig.update_layout(height=800)

high_low_fig.show()

In [20]:
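# Keep only retail millet prices
millet_retail_data = global_food_prices[global_food_prices['cm_name'] == 'Millet - Retail']

# For each country, find the row with its highest recorded price (across all years)
indices_highest_prices = millet_retail_data.groupby('adm0_name')['mp_price'].idxmax()

millet_retail_highest_prices = millet_retail_data.loc[indices_highest_prices]
# Keep only the countries whose highest price was recorded in 2021
millet_retail_highest_prices = millet_retail_highest_prices[millet_retail_highest_prices['mp_year'] == 2021].reset_index(drop=True)
# Build the figure first, then show it, so millet_plots holds the figure rather than None
millet_plots = px.scatter(millet_retail_highest_prices, x='adm0_name', y='mp_price', color='mp_price', title='Prices of Retail Millet in 2021', hover_data=['cur_name', 'mp_month'])
millet_plots.show()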

Prices are recorded in each country's local currency, so they cannot be compared directly. Exchange rates change over time, which makes a consistent conversion difficult, but converting everything to a single currency still gives a rough sense of which prices are high and which are low. For simplicity, let's convert to Canadian dollars (CAD). Using exchange rates as of August 2023, *Benin* has the highest price at 4.6040 CAD, while Cameroon has the lowest at 0.8905 CAD. The other countries range from approximately 1.15 CAD to 1.75 CAD.
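
The conversion described above can be sketched in pandas. This is only an illustration: the countries, currency codes, prices, and exchange rates below are made-up placeholders, not values from the dataset or real rates, and `cad_per_unit` is a hypothetical lookup table that you would need to fill in with rates for the date you care about.

```python
import pandas as pd

# Hypothetical prices in local currency; NOT taken from the dataset
sample_prices = pd.DataFrame({
    "adm0_name": ["Country A", "Country B", "Country C"],
    "cur_name": ["AAA", "BBB", "AAA"],
    "mp_price": [500.0, 1200.0, 650.0],
})

# Placeholder exchange rates: CAD received for 1 unit of the local currency.
# These numbers are invented for the example; replace them with real rates.
cad_per_unit = {"AAA": 0.0022, "BBB": 0.0015}

# Look up each row's rate by its currency code, then convert the price
sample_prices["price_cad"] = (
    sample_prices["mp_price"] * sample_prices["cur_name"].map(cad_per_unit)
).round(4)

print(sample_prices.sort_values("price_cad", ascending=False))
```

Because the result depends entirely on the rates chosen, converted prices are best treated as approximate.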

In [31]:
all_countries = global_food_prices['adm0_name'].unique()
print(f"Countries in the dataset: \n{all_countries}")

Countries in the dataset: 
['Afghanistan' 'Algeria' 'Angola' 'Argentina' 'Bangladesh' 'Belarus'
 'Benin' 'Bolivia' 'Burkina Faso' 'Burundi' 'Cambodia' 'Cameroon'
 'Cape Verde' 'Central African Republic' 'China' 'Colombia' 'Congo'
 "Cote d'Ivoire" 'Democratic Republic of the Congo' 'Djibouti' 'Ecuador'
 'Eritrea' 'Ethiopia' 'Gabon' 'Gambia' 'Georgia' 'Ghana' 'Guinea'
 'Guinea-Bissau' 'Haiti' 'Iran  (Islamic Republic of)' 'Japan'
 'Kazakhstan' 'Kenya' "Lao People's Democratic Republic" 'Lesotho'
 'Liberia' 'Madagascar' 'Malawi' 'Mali' 'Mauritania' 'Mexico'
 'Moldova Republic of' 'Mongolia' 'Myanmar' 'Namibia' 'Nepal' 'Niger'
 'Nigeria' 'Pakistan' 'Paraguay' 'Peru' 'Philippines' 'Russian Federation'
 'Rwanda' 'Senegal' 'Sierra Leone' 'Somalia' 'South Africa'
 'State of Palestine' 'Sudan' 'Syrian Arab Republic' 'Thailand'
 'Timor-Leste' 'Togo' 'Uganda' 'United Republic of Tanzania' 'Viet Nam'
 'Yemen' 'Zambia' 'Zimbabwe']


In [32]:
# Change user_country to any country found above
# For example, instead of 'Japan', you can use 'Somalia'
user_country = 'Japan'

country_subset = global_food_prices[global_food_prices['adm0_name'] == user_country]

indices_highest_prices = country_subset.groupby(['mp_year', 'mkt_name'])['mp_price'].idxmax()
indices_lowest_prices = country_subset.groupby(['mp_year', 'mkt_name'])['mp_price'].idxmin()

highest_prices_per_year_market = country_subset.loc[indices_highest_prices].reset_index(drop=True)
lowest_prices_per_year_market = country_subset.loc[indices_lowest_prices].reset_index(drop=True)

print("Highest Prices per Year and Market:")
display(highest_prices_per_year_market)

print("\nLowest Prices per Year and Market:")
display(lowest_prices_per_year_market)

Highest Prices per Year and Market:


Unnamed: 0,adm0_name,adm1_name,mkt_name,cm_name,cur_name,pt_name,um_name,mp_month,mp_year,mp_price,mp_commoditysource
0,Japan,Oosaka,Osaka,Rice - Retail,JPY,Retail,5 KG,12,2011,2299.0,
1,Japan,Tookyoo,Tokyo,Rice - Retail,JPY,Retail,5 KG,12,2011,2493.0,
2,Japan,Oosaka,Osaka,Rice - Retail,JPY,Retail,5 KG,11,2012,2456.0,
3,Japan,Tookyoo,Tokyo,Rice - Retail,JPY,Retail,5 KG,11,2012,2609.0,
4,Japan,Oosaka,Osaka,Rice - Retail,JPY,Retail,5 KG,1,2013,2417.0,
5,Japan,Tookyoo,Tokyo,Rice - Retail,JPY,Retail,5 KG,3,2013,2627.0,
6,Japan,Oosaka,Osaka,Rice - Retail,JPY,Retail,5 KG,5,2014,2327.0,
7,Japan,Tookyoo,Tokyo,Rice - Retail,JPY,Retail,5 KG,5,2014,2498.0,
8,Japan,Oosaka,Osaka,Rice - Retail,JPY,Retail,5 KG,12,2015,2112.0,
9,Japan,Tookyoo,Tokyo,Rice - Retail,JPY,Retail,5 KG,11,2015,2331.0,



Lowest Prices per Year and Market:


Unnamed: 0,adm0_name,adm1_name,mkt_name,cm_name,cur_name,pt_name,um_name,mp_month,mp_year,mp_price,mp_commoditysource
0,Japan,Oosaka,Osaka,"Rice (glutinous, unmilled) - Retail",JPY,Retail,KG,12,2011,488.0,
1,Japan,Tookyoo,Tokyo,"Rice (glutinous, unmilled) - Retail",JPY,Retail,KG,7,2011,578.0,
2,Japan,Oosaka,Osaka,"Rice (glutinous, unmilled) - Retail",JPY,Retail,KG,12,2012,491.0,
3,Japan,Tookyoo,Tokyo,"Rice (glutinous, unmilled) - Retail",JPY,Retail,KG,3,2012,568.0,
4,Japan,Oosaka,Osaka,"Rice (glutinous, unmilled) - Retail",JPY,Retail,KG,4,2013,504.0,
5,Japan,Tookyoo,Tokyo,"Rice (glutinous, unmilled) - Retail",JPY,Retail,KG,11,2013,567.0,
6,Japan,Oosaka,Osaka,Radish - Retail,JPY,Retail,KG,12,2014,141.0,
7,Japan,Tookyoo,Tokyo,Radish - Retail,JPY,Retail,KG,11,2014,117.0,
8,Japan,Oosaka,Osaka,Radish - Retail,JPY,Retail,KG,12,2015,130.0,
9,Japan,Tookyoo,Tokyo,Radish - Retail,JPY,Retail,KG,12,2015,111.0,
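
In the tables above, "Rice - Retail" is priced per 5 KG while the other products are priced per KG, so the highest and lowest prices are not measured on the same scale. One possible way to make them comparable is to convert every price to a price per kilogram. The sketch below uses two rows copied from the output above (Tokyo, 2015); `kg_per_unit` is a hypothetical helper written for this example, and it assumes units look like `KG` or `5 KG`.

```python
import pandas as pd

# Two rows copied from the Japan output above (Tokyo, 2015)
rows = pd.DataFrame({
    "cm_name": ["Rice - Retail", "Radish - Retail"],
    "um_name": ["5 KG", "KG"],
    "mp_price": [2331.0, 111.0],
})

def kg_per_unit(um_name):
    """Return how many kilograms one unit holds, e.g. '5 KG' -> 5.0, 'KG' -> 1.0.

    Units that are not in kilograms (such as '100 ML') return NaN.
    """
    parts = um_name.upper().split()
    if parts == ["KG"]:
        return 1.0
    if len(parts) == 2 and parts[1] == "KG":
        return float(parts[0])
    return float("nan")

# Divide each price by the number of kilograms it covers
rows["price_per_kg"] = rows["mp_price"] / rows["um_name"].map(kg_per_unit)
print(rows)
```

Even per kilogram, rice in this sample costs several times more than radish, but the gap is much smaller than the raw prices suggest.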


In [48]:
mean_prices_per_product_year = country_subset.groupby(['cm_name', 'mp_year'])['mp_price'].mean().reset_index()
mean_prices_per_product_year = mean_prices_per_product_year.sort_values(by='mp_year').reset_index(drop=True)
mean_prices_per_product_year

Unnamed: 0,cm_name,mp_year,mp_price
0,Rice - Retail,2011,2283.333333
1,"Rice (glutinous, unmilled) - Retail",2011,544.500000
2,Rice - Retail,2012,2465.875000
3,"Rice (glutinous, unmilled) - Retail",2012,552.541667
4,"Rice (glutinous, unmilled) - Retail",2013,558.958333
...,...,...,...
57,Cabbage - Retail,2020,208.222222
58,Radish - Retail,2020,186.833333
59,Potatoes - Retail,2020,418.777778
60,Sugar - Retail,2020,192.611111
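
The long table above has one row per product and year. To compare years side by side, one option is to reshape it so that each year becomes a column. The sketch below uses the first five rows copied from the output above; the names `mean_prices`, `wide`, and `change_2011_to_2012` are illustrative only.

```python
import pandas as pd

# First five rows copied from the mean-price output above
mean_prices = pd.DataFrame({
    "cm_name": [
        "Rice - Retail",
        "Rice (glutinous, unmilled) - Retail",
        "Rice - Retail",
        "Rice (glutinous, unmilled) - Retail",
        "Rice (glutinous, unmilled) - Retail",
    ],
    "mp_year": [2011, 2011, 2012, 2012, 2013],
    "mp_price": [2283.333333, 544.500000, 2465.875000, 552.541667, 558.958333],
})

# One row per product, one column per year (missing combinations become NaN)
wide = mean_prices.pivot(index="cm_name", columns="mp_year", values="mp_price")

# Percent change in the mean price from 2011 to 2012, per product
change_2011_to_2012 = (wide[2012] - wide[2011]) / wide[2011] * 100
print(change_2011_to_2012.round(2))
```

The same approach should work on the full `mean_prices_per_product_year` table, as long as each product appears at most once per year.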


In [None]:
mean_prices_fig = px.bar(mean_prices_per_product_year, x='cm_name', y='mp_price', color='mp_year', title='Mean Prices per Product per Year')
mean_prices_fig.show()