# FAOStat Dataset: Time Series

## Business Understanding 

### Overview

Food production is central to Kenya's food security. The agricultural sector employs a significant portion of the population, contributes to the national economy, and spans crop cultivation, livestock rearing, and fisheries.

In recent years, Kenya has worked to raise food production by promoting modern farming techniques, investing in irrigation infrastructure, and supporting small-scale farmers. These efforts have increased agricultural productivity and improved crop yields.

Even so, production does not yet reliably meet demand. Climate change, unpredictable weather patterns, and recurrent droughts pose significant risks to agricultural productivity, while limited access to affordable inputs, inadequate infrastructure, and post-harvest losses add to the challenge.

As a result, Kenya occasionally experiences food shortages and relies on imports to meet demand. Achieving long-term food sufficiency for a growing population will require further investment in sustainable agriculture, resilient farming practices, and improved market access.

### Problem Statement

The current state of food production in Kenya poses challenges to ensuring sufficient food supply for the growing population. Despite efforts to improve agricultural productivity, factors such as climate change, unpredictable weather patterns, and limited access to resources continue to impact the ability to accurately forecast and meet the population's food needs.

There is a need for a reliable prediction model that can forecast food production in Kenya to assess whether it will be sufficient to meet the population's requirements. Such a model would help policymakers, agricultural stakeholders, and government agencies make informed decisions regarding food security, resource allocation, and import/export planning.

By leveraging historical data, real-time information, and advanced analytical techniques, the model would provide valuable insights into future food production levels, helping to identify potential shortfalls or surpluses.

The development of a prediction model would support proactive planning and decision-making processes, allowing stakeholders to take appropriate measures in advance to bridge any potential food supply gaps. It would aid in optimizing resource allocation, promoting sustainable farming practices, and implementing targeted interventions to ensure food sufficiency for Kenya's population.

In short, the problem is the lack of a reliable model for forecasting food production. Without one, it is difficult to determine whether output will be sufficient to meet the needs of Kenya's growing population, or to plan resource allocation and food security measures in advance.

### Objectives

The objectives of the prediction model for food production in Kenya are as follows:

1. Forecasting Food Production: The primary objective of the model is to accurately predict food production levels in Kenya. By analyzing historical data, current conditions, and relevant variables, the model aims to provide forecasts that reflect the expected output of crops, livestock, and other food sources.

2. Assessing Food Sufficiency: The model seeks to determine whether the projected food production will be sufficient to meet the needs of the population. It aims to assess the adequacy of food supply in order to identify potential shortfalls or surpluses.

3. Informing Decision-Making: The model aims to provide valuable insights to policymakers, government agencies, and agricultural stakeholders. By offering reliable predictions, the model can inform decision-making processes related to resource allocation, import/export planning, and interventions to ensure food security.

4. Optimizing Resource Allocation: The model aims to optimize the allocation of resources by identifying areas of potential food shortages or surpluses. This can help in directing resources, such as irrigation, fertilizers, and agricultural investments, to areas that require them the most.

5. Promoting Sustainable Farming Practices: By considering factors that affect food production, such as climate conditions and agricultural practices, the model can support sustainable farming techniques. It can inform recommendations for resilient and environmentally friendly practices that enhance productivity while minimizing negative impacts.

6. Enhancing Food Security: Ultimately, the objective of the prediction model is to contribute to improving food security in Kenya. By accurately forecasting food production and assessing sufficiency, the model aims to support proactive measures that ensure a consistent and adequate food supply for the growing population.

These objectives collectively aim to provide valuable insights, aid decision-making processes, and contribute to long-term food security in Kenya.
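
Objective 2 can be made concrete with a simple unit conversion. The sketch below assumes production is reported in thousand tonnes (`1000 t`) and population in thousands of people (`1000 No`), as in the FAOSTAT extract explored in the Data Understanding section. The helper `per_capita_kg` is a hypothetical name, and the inputs are the 2020 Burundi maize production and population values from that extract. Production per capita is only a rough proxy for sufficiency, since it ignores imports, exports, stock changes, and non-food uses.

```python
def per_capita_kg(production_kt: float, population_k: float) -> float:
    """Convert production (1000 t) and population (1000 persons) to kg per person."""
    kg_total = production_kt * 1000 * 1000  # 1000 t -> t -> kg
    persons = population_k * 1000           # 1000 persons -> persons
    return kg_total / persons


# Burundi, 2020: 260.0 thousand tonnes of maize, 11890.78 thousand people
maize_2020 = per_capita_kg(260.0, 11890.78)
print(round(maize_2020, 2))
```

A forecast of production and a projection of population could be passed through the same conversion to compare expected supply per person against a chosen requirement threshold.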

## Data Understanding

In [1]:
import pandas as pd

In [2]:
# Preview the dataset
faoDf = pd.read_csv('FaoStat_EA.csv')
faoDf.head(10)

Unnamed: 0,Area Code (M49),Area,Element Code,Element,Item Code (CPC),Item,Year,Unit,Value,Flag,Flag Description
0,108,Burundi,511,Total Population - Both sexes,F2501,Population,2014,1000 No,9844.3,X,Figure from international organizations
1,108,Burundi,511,Total Population - Both sexes,F2501,Population,2015,1000 No,10160.03,X,Figure from international organizations
2,108,Burundi,511,Total Population - Both sexes,F2501,Population,2016,1000 No,10488.0,X,Figure from international organizations
3,108,Burundi,511,Total Population - Both sexes,F2501,Population,2017,1000 No,10827.02,X,Figure from international organizations
4,108,Burundi,511,Total Population - Both sexes,F2501,Population,2018,1000 No,11175.37,X,Figure from international organizations
5,108,Burundi,511,Total Population - Both sexes,F2501,Population,2019,1000 No,11530.58,X,Figure from international organizations
6,108,Burundi,511,Total Population - Both sexes,F2501,Population,2020,1000 No,11890.78,X,Figure from international organizations
7,108,Burundi,5301,Domestic supply quantity,F2501,Population,2014,1000 t,0.0,I,Imputed value
8,108,Burundi,5301,Domestic supply quantity,F2501,Population,2015,1000 t,0.0,I,Imputed value
9,108,Burundi,5301,Domestic supply quantity,F2501,Population,2016,1000 t,0.0,I,Imputed value


In [3]:
faoDf.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 38257 entries, 0 to 38256
Data columns (total 11 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   Area Code (M49)   38257 non-null  int64  
 1   Area              38257 non-null  object 
 2   Element Code      38257 non-null  int64  
 3   Element           38257 non-null  object 
 4   Item Code (CPC)   38257 non-null  object 
 5   Item              38257 non-null  object 
 6   Year              38257 non-null  int64  
 7   Unit              38257 non-null  object 
 8   Value             38257 non-null  float64
 9   Flag              38257 non-null  object 
 10  Flag Description  38257 non-null  object 
dtypes: float64(1), int64(3), object(7)
memory usage: 3.2+ MB


In [4]:
# Reset the index
df_reset = faoDf.reset_index()
# Pivot the 'Year' values into separate columns (one column per year)
df_pivot = df_reset.pivot(index=['index','Area Code (M49)', 'Area', 'Element', 'Unit', 'Value', 'Flag', 'Flag Description'],
                          columns='Year',
                          values='Value')
# Reset the index to convert the remaining columns to regular columns
df_pivot = df_pivot.reset_index()
# Merge the pivoted columns with the original DataFrame
df_merged = pd.merge(df_reset, df_pivot, on=['index','Area Code (M49)', 'Area','Element', 'Unit', 'Value', 'Flag', 'Flag Description'])
df_merged.head(10)

Unnamed: 0,index,Area Code (M49),Area,Element Code,Element,Item Code (CPC),Item,Year,Unit,Value,...,2011,2012,2013,2014,2015,2016,2017,2018,2019,2020
0,0,108,Burundi,511,Total Population - Both sexes,F2501,Population,2014,1000 No,9844.3,...,,,,9844.3,,,,,,
1,1,108,Burundi,511,Total Population - Both sexes,F2501,Population,2015,1000 No,10160.03,...,,,,,10160.03,,,,,
2,2,108,Burundi,511,Total Population - Both sexes,F2501,Population,2016,1000 No,10488.0,...,,,,,,10488.0,,,,
3,3,108,Burundi,511,Total Population - Both sexes,F2501,Population,2017,1000 No,10827.02,...,,,,,,,10827.02,,,
4,4,108,Burundi,511,Total Population - Both sexes,F2501,Population,2018,1000 No,11175.37,...,,,,,,,,11175.37,,
5,5,108,Burundi,511,Total Population - Both sexes,F2501,Population,2019,1000 No,11530.58,...,,,,,,,,,11530.58,
6,6,108,Burundi,511,Total Population - Both sexes,F2501,Population,2020,1000 No,11890.78,...,,,,,,,,,,11890.78
7,7,108,Burundi,5301,Domestic supply quantity,F2501,Population,2014,1000 t,0.0,...,,,,0.0,,,,,,
8,8,108,Burundi,5301,Domestic supply quantity,F2501,Population,2015,1000 t,0.0,...,,,,,0.0,,,,,
9,9,108,Burundi,5301,Domestic supply quantity,F2501,Population,2016,1000 t,0.0,...,,,,,,0.0,,,,


In [5]:
df = faoDf.copy()

# Pivot to wide format: one row per area/element/item/flag, one column per year
pivot_df = df.pivot(index=['Area Code (M49)', 'Area', 'Element Code', 'Element', 'Item Code (CPC)', 'Item', 'Unit', 'Flag',
                          'Flag Description'],
                    columns='Year',
                    values='Value').reset_index()

In [6]:
pivot_df

Year,Area Code (M49),Area,Element Code,Element,Item Code (CPC),Item,Unit,Flag,Flag Description,2010,2011,2012,2013,2014,2015,2016,2017,2018,2019,2020
0,108,Burundi,511,Total Population - Both sexes,F2501,Population,1000 No,X,Figure from international organizations,,,,,9844.30,10160.03,10488.00,10827.02,11175.37,11530.58,11890.78
1,108,Burundi,645,Food supply quantity (kg/capita/yr),F2511,Wheat and products,kg,E,Estimated value,,,,,4.74,2.48,5.26,6.41,7.02,7.63,5.63
2,108,Burundi,645,Food supply quantity (kg/capita/yr),F2513,Barley and products,kg,E,Estimated value,,,,,0.00,0.00,0.00,0.00,0.00,0.00,0.00
3,108,Burundi,645,Food supply quantity (kg/capita/yr),F2514,Maize and products,kg,E,Estimated value,,,,,13.57,16.08,23.66,23.67,28.49,23.92,21.54
4,108,Burundi,645,Food supply quantity (kg/capita/yr),F2515,Rye and products,kg,E,Estimated value,,,,,,,,0.00,0.00,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4479,834,United Republic of Tanzania,5911,Export Quantity,F2782,"Fish, Liver Oil",1000 t,E,Estimated value,,,,,,,,,,,0.00
4480,834,United Republic of Tanzania,5911,Export Quantity,F2782,"Fish, Liver Oil",1000 t,I,Imputed value,0.0,0.0,0.0,0.0,0.00,0.00,0.00,0.00,0.00,0.00,
4481,834,United Republic of Tanzania,5911,Export Quantity,F2807,Rice and products,1000 t,I,Imputed value,75.0,54.0,27.0,79.0,107.00,23.00,19.00,1.00,46.00,171.00,527.00
4482,834,United Republic of Tanzania,5911,Export Quantity,F2848,Milk - Excluding Butter,1000 t,I,Imputed value,0.0,0.0,0.0,0.0,0.00,0.00,0.00,0.00,0.00,0.00,0.00
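
In the output above, the Tanzania "Fish, Liver Oil" export series appears on two rows (4479 and 4480) because `Flag` is part of the pivot index, so years with different flags land on different rows. The following is a minimal sketch of one way to keep each series on a single row, assuming the flag is not needed downstream. The `toy` frame and the `keys` list are illustrative stand-ins, not part of the original notebook; the two values are the 2019 and 2020 entries shown above.

```python
import pandas as pd

# Toy long-format rows for one series whose flag changes between years:
# 2019 is flagged I (imputed), 2020 is flagged E (estimated).
toy = pd.DataFrame({
    "Area": ["United Republic of Tanzania"] * 2,
    "Element": ["Export Quantity"] * 2,
    "Item": ["Fish, Liver Oil"] * 2,
    "Unit": ["1000 t"] * 2,
    "Flag": ["I", "E"],
    "Year": [2019, 2020],
    "Value": [0.0, 0.0],
})

keys = ["Area", "Element", "Item", "Unit"]

# With Flag in the index, the series is split into one row per flag.
split = toy.pivot(index=keys + ["Flag"], columns="Year", values="Value")

# Without Flag, the series collapses to a single row. aggfunc="first" assumes
# there is at most one value per (keys, Year) combination.
wide = toy.pivot_table(index=keys, columns="Year", values="Value",
                       aggfunc="first").reset_index()

print(len(split), len(wide))
```

Whether dropping the flag is acceptable depends on how much the distinction between imputed and estimated values matters for the forecast.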


In [7]:
# Keep only production rows, excluding the population item
filtered_df = pivot_df[(pivot_df['Element'].str.contains('Production'))
                 & (pivot_df['Item'] != 'Population')]

# Print the filtered dataframe
filtered_df

Year,Area Code (M49),Area,Element Code,Element,Item Code (CPC),Item,Unit,Flag,Flag Description,2010,2011,2012,2013,2014,2015,2016,2017,2018,2019,2020
357,108,Burundi,5511,Production,F2511,Wheat and products,1000 t,I,Imputed value,,,,,6.00,7.0,8.0,8.0,23.0,5.0,9.0
358,108,Burundi,5511,Production,F2514,Maize and products,1000 t,I,Imputed value,,,,,128.00,161.0,244.0,228.0,290.0,271.0,260.0
359,108,Burundi,5511,Production,F2517,Millet and products,1000 t,I,Imputed value,,,,,10.00,10.0,10.0,10.0,10.0,10.0,11.0
360,108,Burundi,5511,Production,F2518,Sorghum and products,1000 t,I,Imputed value,,,,,22.00,31.0,30.0,25.0,28.0,9.0,25.0
361,108,Burundi,5511,Production,F2520,"Cereals, Other",1000 t,I,Imputed value,,,,,3.00,13.0,30.0,15.0,18.0,11.0,11.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4265,834,United Republic of Tanzania,5511,Production,F2781,"Fish, Body Oil",1000 t,I,Imputed value,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,
4266,834,United Republic of Tanzania,5511,Production,F2782,"Fish, Liver Oil",1000 t,E,Estimated value,,,,,,,,,,,0.0
4267,834,United Republic of Tanzania,5511,Production,F2782,"Fish, Liver Oil",1000 t,I,Imputed value,0.0,0.0,0.0,0.00,0.00,0.0,0.0,0.0,0.0,0.0,
4268,834,United Republic of Tanzania,5511,Production,F2807,Rice and products,1000 t,I,Imputed value,2650.0,2248.0,1801.0,2195.00,1681.00,1937.0,2229.0,2452.0,3415.0,3475.0,4528.0
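
For time series work, the wide table usually needs to be turned into one date-indexed series per item for a single country. The sketch below shows one possible reshaping on a toy frame built from two rows of the output above. The helper name `area_time_series` is hypothetical, and applying it to Kenya assumes the `Area` column contains the label `"Kenya"`, which the preview does not show.

```python
import pandas as pd

# Toy stand-in for filtered_df, using values from the rows shown above.
wide = pd.DataFrame({
    "Area": ["Burundi", "United Republic of Tanzania"],
    "Item": ["Maize and products", "Rice and products"],
    2018: [290.0, 3415.0],
    2019: [271.0, 3475.0],
    2020: [260.0, 4528.0],
})


def area_time_series(df: pd.DataFrame, area: str) -> pd.DataFrame:
    """Return a frame indexed by year (as dates) with one column per item."""
    year_cols = [c for c in df.columns if isinstance(c, int)]
    subset = df[df["Area"] == area]
    series = subset.set_index("Item")[year_cols].T
    # Convert integer years to timestamps so time series tools can use them.
    series.index = pd.to_datetime(series.index.astype(str), format="%Y")
    return series


burundi = area_time_series(wide, "Burundi")
print(burundi)
```

Calling `area_time_series(filtered_df, "Kenya")` would follow the same pattern, provided the area label matches the one used in the dataset.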


In [13]:
filtered_df["Item"].nunique()

94

In [8]:
filtered_df.shape

(533, 20)

In [10]:
filtered_df.corr(numeric_only=True)

Year,Area Code (M49),Element Code,2010,2011,2012,2013,2014,2015,2016,2017,2018,2019,2020
Area Code (M49),1.0,,-0.054128,-0.05018,-0.054177,-0.054919,-0.008238,-0.007903,-0.012965,-0.010458,-0.003826,-0.017723,-0.013933
Element Code,,,,,,,,,,,,,
2010,-0.054128,,1.0,0.998454,0.997928,0.996991,0.993084,0.993125,0.992062,0.989254,0.991707,0.987776,0.988815
2011,-0.05018,,0.998454,1.0,0.998478,0.997755,0.993504,0.993378,0.99292,0.992126,0.992202,0.988966,0.989086
2012,-0.054177,,0.997928,0.998478,1.0,0.998321,0.993774,0.994109,0.993051,0.990841,0.992711,0.9878,0.988129
2013,-0.054919,,0.996991,0.997755,0.998321,1.0,0.996045,0.996023,0.995459,0.992834,0.991792,0.986225,0.987886
2014,-0.008238,,0.993084,0.993504,0.993774,0.996045,1.0,0.998549,0.997781,0.994999,0.993362,0.988279,0.989953
2015,-0.007903,,0.993125,0.993378,0.994109,0.996023,0.998549,1.0,0.997931,0.992768,0.993889,0.988601,0.990147
2016,-0.012965,,0.992062,0.99292,0.993051,0.995459,0.997781,0.997931,1.0,0.995339,0.992473,0.985967,0.987589
2017,-0.010458,,0.989254,0.992126,0.990841,0.992834,0.994999,0.992768,0.995339,1.0,0.991686,0.988089,0.988147
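
Two things stand out in this matrix. `Area Code (M49)` and `Element Code` are identifiers rather than measurements, and the `Element Code` entries are all NaN, presumably because the filtered rows share a single element code and so have zero variance. The near-1 correlations between year columns may largely reflect scale differences between items (large series stay large every year) rather than a shared trend. The sketch below drops the identifier columns before computing correlations; the `toy` frame is an illustrative stand-in built from three Burundi production rows shown earlier.

```python
import pandas as pd

# Toy stand-in for filtered_df (Burundi wheat, maize, millet production).
toy = pd.DataFrame({
    "Area Code (M49)": [108, 108, 108],
    "Element Code": [5511, 5511, 5511],
    "Item": ["Wheat and products", "Maize and products", "Millet and products"],
    2018: [23.0, 290.0, 10.0],
    2019: [5.0, 271.0, 10.0],
    2020: [9.0, 260.0, 11.0],
})

# Identifier columns carry no quantitative meaning, so exclude them.
id_cols = ["Area Code (M49)", "Element Code"]
year_corr = toy.drop(columns=id_cols).corr(numeric_only=True)

print(year_corr)
```

If the goal is to compare how items move over time, correlating year-over-year changes within each item would be a more informative alternative than correlating raw levels across items.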
