# Global Food Production

---

<img src="notebook_imgs/pop_graph.jpg" width='400px'/>
<figcaption><center><i>Projected world population growth (6).</i></center></figcaption>
    
---

### Introduction

The current world population is nearly 7.8 billion people, and it is estimated to rise to around 9.7 billion by 2050. Within the next 30 years, we will therefore need to feed about two billion more people without sacrificing the planet (1). To feed that growing population, we will need to double our crop production. Agriculture is one of the greatest contributors to global warming: farming consumes immense amounts of our water supplies and leaves major pollutants behind through fertilizer runoff (4). So how do we increase the food supply without destroying our environment?
    
There are many possible answers to this question, but this notebook focuses on analyzing the global production of consumables, as well as the ratio of food (human consumption) to feed (livestock consumption) produced by each country. Only 55% of current world crop production is consumed by humans, with the remainder being fed to livestock. On top of this, nearly 25% of the world’s food calories are wasted before they can be consumed (4).
    
Exploring the most produced food items and how they are used will show where the majority of our food is directed, and whether a shift in diet could lead to more crops being used for food instead of feed. This notebook also explores global population trends, comparing the change in global population with the change in crop production. Finally, we will check whether the top-producing countries have had the fastest-growing populations over the last 50 years.

---
### Data
- `FAO_FOOD_STAT` is a Food Balance Sheet obtained through [Kaggle](https://www.kaggle.com/dorbicycle/world-foodfeed-production) that originated from [FAO](http://www.fao.org/faostat/en/#data/FBSH) but has been reformatted for ease of use. It contains the yearly food production of 115 food items for 174 countries, spanning 1961 to 2013. Each record is labeled as either food or feed, representing human and livestock consumables, respectively.


- `FAO_POP` is an Annual Population Report obtained through [FAO](http://www.fao.org/faostat/en/#data/OA) that contains the estimated population for 245 countries from 1961 to 2018. Note that this dataset contains 71 more countries than the first dataset; they are kept so that total population figures remain complete. We will drop the 5 extra years (2014-2018) for this notebook, but keep them in modeling for prediction purposes.
---
    
### Notebook Outline
    
1) [Import Modules and Data Prep](#section1)
- Import necessary libraries for data exploration tasks.
- Load the `FAO_FOOD_STAT` and `FAO_POP` datasets and display the first rows of each.
- Drop all unnecessary columns and fill any missing values that need to be handled.
   
2) [Reformatting Population Data](#section2)
- Reformat the population dataset so that its layout matches the production dataset.
- Turn `Year` from a single column into one column per year.

3) [Production Exploration](#section3)
- *To be filled in...*


<a id='section1'></a>

## 1. Import Modules and Data Prep

In [3]:
# import general libraries for data exploration and cleaning
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# set default parameters for graphs
sns.set_style('darkgrid')
plt.rcParams['font.size'] = 12

In [8]:
data = pd.read_csv('FAO_FOOD_STAT.csv')
pop = pd.read_csv('FAO_POP.csv')

In [9]:
data.head()

Unnamed: 0,Area Abbreviation,Area Code,Area,Item Code,Item,Element Code,Element,Unit,latitude,longitude,...,Y2004,Y2005,Y2006,Y2007,Y2008,Y2009,Y2010,Y2011,Y2012,Y2013
0,AFG,2,Afghanistan,2511,Wheat and products,5142,Food,1000 tonnes,33.94,67.71,...,3249.0,3486.0,3704.0,4164.0,4252.0,4538.0,4605.0,4711.0,4810,4895
1,AFG,2,Afghanistan,2805,Rice (Milled Equivalent),5142,Food,1000 tonnes,33.94,67.71,...,419.0,445.0,546.0,455.0,490.0,415.0,442.0,476.0,425,422
2,AFG,2,Afghanistan,2513,Barley and products,5521,Feed,1000 tonnes,33.94,67.71,...,58.0,236.0,262.0,263.0,230.0,379.0,315.0,203.0,367,360
3,AFG,2,Afghanistan,2513,Barley and products,5142,Food,1000 tonnes,33.94,67.71,...,185.0,43.0,44.0,48.0,62.0,55.0,60.0,72.0,78,89
4,AFG,2,Afghanistan,2514,Maize and products,5521,Feed,1000 tonnes,33.94,67.71,...,120.0,208.0,233.0,249.0,247.0,195.0,178.0,191.0,200,200


In [11]:
# drop columns that will not be used in exploration
data.drop(['Item Code', 'Area Code', 'Element Code', 'latitude', 'longitude'], axis=1, inplace=True)

data.head()

Unnamed: 0,Area Abbreviation,Area,Item,Element,Unit,Y1961,Y1962,Y1963,Y1964,Y1965,...,Y2004,Y2005,Y2006,Y2007,Y2008,Y2009,Y2010,Y2011,Y2012,Y2013
0,AFG,Afghanistan,Wheat and products,Food,1000 tonnes,1928.0,1904.0,1666.0,1950.0,2001.0,...,3249.0,3486.0,3704.0,4164.0,4252.0,4538.0,4605.0,4711.0,4810,4895
1,AFG,Afghanistan,Rice (Milled Equivalent),Food,1000 tonnes,183.0,183.0,182.0,220.0,220.0,...,419.0,445.0,546.0,455.0,490.0,415.0,442.0,476.0,425,422
2,AFG,Afghanistan,Barley and products,Feed,1000 tonnes,76.0,76.0,76.0,76.0,76.0,...,58.0,236.0,262.0,263.0,230.0,379.0,315.0,203.0,367,360
3,AFG,Afghanistan,Barley and products,Food,1000 tonnes,237.0,237.0,237.0,238.0,238.0,...,185.0,43.0,44.0,48.0,62.0,55.0,60.0,72.0,78,89
4,AFG,Afghanistan,Maize and products,Feed,1000 tonnes,210.0,210.0,214.0,216.0,216.0,...,120.0,208.0,233.0,249.0,247.0,195.0,178.0,191.0,200,200
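
The outline mentions handling missing values, but no cell above shows that step. A minimal sketch of how it could look follows, using a small made-up frame in place of `data`. The toy rows and the `year_cols`, `missing_per_year`, and `filled` names are illustrative assumptions. Filling with 0 assumes that a missing year means nothing was reported for that item, which may not hold for every country.

```python
import numpy as np
import pandas as pd

# Toy stand-in for `data`; the values are invented for illustration only.
toy = pd.DataFrame({
    'Area': ['A', 'A', 'B'],
    'Item': ['Wheat and products', 'Maize and products', 'Wheat and products'],
    'Element': ['Food', 'Feed', 'Food'],
    'Y1961': [10.0, np.nan, 5.0],
    'Y1962': [12.0, 3.0, np.nan],
})

# Year columns follow the 'Y1961', ..., 'Y2013' naming used in the dataset.
year_cols = [c for c in toy.columns if c.startswith('Y') and c[1:].isdigit()]

# Count missing values per year column before deciding how to handle them.
missing_per_year = toy[year_cols].isna().sum()

# Assumption: a missing value means no reported quantity, so fill with 0.
filled = toy.copy()
filled[year_cols] = filled[year_cols].fillna(0)
```

Inspecting `missing_per_year` first helps show whether the gaps cluster in early years, which would suggest countries that did not yet report, before committing to a fill strategy.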


In [12]:
pop.head()

Unnamed: 0,Domain,Area,Element,Item,Year,Unit,Value,Note
0,Annual population,Afghanistan,Total Population - Both sexes,Population - Est. & Proj.,1961,1000 persons,9169.41,
1,Annual population,Afghanistan,Total Population - Both sexes,Population - Est. & Proj.,1962,1000 persons,9351.441,
2,Annual population,Afghanistan,Total Population - Both sexes,Population - Est. & Proj.,1963,1000 persons,9543.205,
3,Annual population,Afghanistan,Total Population - Both sexes,Population - Est. & Proj.,1964,1000 persons,9744.781,
4,Annual population,Afghanistan,Total Population - Both sexes,Population - Est. & Proj.,1965,1000 persons,9956.32,


In [16]:
# drop columns that will not be used in exploration
pop.drop(['Domain','Element', 'Item', 'Note'], axis=1, inplace=True)

pop.head()

Unnamed: 0,Area,Year,Unit,Value
0,Afghanistan,1961,1000 persons,9169.41
1,Afghanistan,1962,1000 persons,9351.441
2,Afghanistan,1963,1000 persons,9543.205
3,Afghanistan,1964,1000 persons,9744.781
4,Afghanistan,1965,1000 persons,9956.32
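
The Data section notes that the population dataset covers more areas than the production dataset. One way to list the mismatch is a set difference on the `Area` column. The sketch below uses small invented frames in place of `data` and `pop`, and it assumes both files spell area names identically, which should be verified on the real data.

```python
import pandas as pd

# Toy stand-ins for `data` and `pop`; the rows are illustrative only.
food_toy = pd.DataFrame({'Area': ['Afghanistan', 'Albania', 'Albania']})
pop_toy = pd.DataFrame({'Area': ['Afghanistan', 'Albania', 'Aruba']})

food_areas = set(food_toy['Area'].unique())
pop_areas = set(pop_toy['Area'].unique())

# Areas present in the population data but absent from the production data.
only_in_pop = sorted(pop_areas - food_areas)

# Areas present in the production data but absent from the population data;
# a non-empty result here would point to naming differences worth checking.
only_in_food = sorted(food_areas - pop_areas)
```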


<a id='section2'></a>

## 2. Reformatting Population Data

We want to reformat the population data so that it resembles the production dataset, which will make it easier to graph. To do this, we will turn the `Year` column into one column per year, `Y1961`, ..., `Y2013`, with one row per country. We will also drop years 2014 to 2018 so that the population data covers the same years as the production data.
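
The reshaping described above can be sketched with `DataFrame.pivot`. This is one possible approach rather than the notebook's final implementation. The toy frame stands in for `pop`, and the names `trimmed` and `pop_wide` are illustrative assumptions.

```python
import pandas as pd

# Toy stand-in for the long-format `pop` frame; values are invented.
pop_toy = pd.DataFrame({
    'Area': ['A', 'A', 'A', 'B', 'B', 'B'],
    'Year': [2012, 2013, 2014, 2012, 2013, 2014],
    'Unit': ['1000 persons'] * 6,
    'Value': [10.0, 11.0, 12.0, 20.0, 21.0, 22.0],
})

# Keep only the years covered by the production data (up to 2013).
trimmed = pop_toy[pop_toy['Year'] <= 2013]

# One row per area, one column per year.
pop_wide = trimmed.pivot(index='Area', columns='Year', values='Value')

# Rename the year columns to match the 'Y1961', ..., 'Y2013' convention.
pop_wide.columns = [f'Y{year}' for year in pop_wide.columns]
pop_wide = pop_wide.reset_index()
```

`pivot` raises an error if any (`Area`, `Year`) pair appears more than once. If the real data contains such duplicates, `pivot_table` with an explicit aggregation would be an alternative.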

<a id='section3'></a>

## 3. Production Exploration
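
As a possible starting point for the food-versus-feed comparison described in the introduction, the sketch below computes the share of each area's production that goes to human consumption for a single year. The toy frame stands in for `data`, and the `totals` and `food_share` names are illustrative assumptions rather than part of the dataset.

```python
import pandas as pd

# Toy stand-in for `data`; values are invented for illustration only.
toy = pd.DataFrame({
    'Area': ['A', 'A', 'A', 'B', 'B'],
    'Item': ['Wheat and products', 'Maize and products', 'Maize and products',
             'Wheat and products', 'Barley and products'],
    'Element': ['Food', 'Food', 'Feed', 'Food', 'Feed'],
    'Y2013': [30.0, 10.0, 60.0, 45.0, 5.0],
})

# Total quantity per area, split into Food and Feed columns.
totals = toy.pivot_table(index='Area', columns='Element',
                         values='Y2013', aggfunc='sum', fill_value=0)

# Share of each area's total that goes to human consumption.
totals['food_share'] = totals['Food'] / (totals['Food'] + totals['Feed'])
```

In this toy example, area `A` directs 40% of its total to food and area `B` directs 90%. An area with no recorded food or feed would produce a division by zero and a `NaN` share, so such rows would need separate handling.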