# PROJECT OVERVIEW
This project conducts a time series analysis of the prices of basic food commodities in Kenya and builds models to forecast them. Using historical price data, it seeks insight into the patterns, trends, and potential factors that influence food prices over time. The ultimate objective is to develop predictive models that support informed decision-making and strategies to mitigate challenges related to food affordability and accessibility.

Given the significant portion of the population living in poverty and the challenges related to food security, the successful execution of this project has the potential to contribute to poverty alleviation, improved nutrition, and progress toward achieving the United Nations Sustainable Development Goal of zero hunger in Kenya.

### BUSINESS UNDERSTANDING

### FOOD AND NUTRITION SITUATION IN KENYA
Kenya, situated in Eastern Africa, faces significant socio-economic challenges, with 35.6% of its population living below the international poverty line of US$1.90 per day. In rural areas, 29% of children experience stunted growth, highlighting the pressing need for improved living conditions and nutritional access. The total population stands at 48.5 million, with a substantial portion struggling to secure adequate quantities of nutritious food.

### Challenges:
Access to quality and sufficient food remains a major hurdle for a third of the population, exacerbating issues related to poverty and malnutrition. This underscores the importance of addressing food security concerns and aligning efforts with the global goal of achieving zero hunger and improving overall nutrition.

### DATA UNDERSTANDING

### BUSINESS OBJECTIVES
> 1. Gain an in-depth understanding of the dynamics of food prices in Kenya.

> 2. Develop reliable predictive models for future price trends.

> 3. Provide informed recommendations for policymakers and stakeholders to address challenges related to food affordability and accessibility.

In [9]:
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sns
%matplotlib inline

In [10]:
data = pd.read_csv('Data/wfp_food_prices_ken.csv')
data.head(5)

Unnamed: 0,date,admin1,admin2,market,latitude,longitude,category,commodity,unit,priceflag,pricetype,currency,price,usdprice
0,#date,#adm1+name,#adm2+name,#loc+market+name,#geo+lat,#geo+lon,#item+type,#item+name,#item+unit,#item+price+flag,#item+price+type,#currency,#value,#value+usd
1,2006-01-15,Coast,Mombasa,Mombasa,-4.05,39.666667,cereals and tubers,Maize (white),90 KG,actual,Wholesale,KES,1480.0,20.5041
2,2006-01-15,Coast,Mombasa,Mombasa,-4.05,39.666667,pulses and nuts,Beans,KG,actual,Wholesale,KES,33.63,0.4659
3,2006-01-15,Coast,Mombasa,Mombasa,-4.05,39.666667,pulses and nuts,Beans (dry),90 KG,actual,Wholesale,KES,3246.0,44.9705
4,2006-01-15,Eastern,Kitui,Kitui,-1.366667,38.016667,cereals and tubers,Maize (white),KG,actual,Retail,KES,17.0,0.2355


In [11]:
# Drop the first row, which holds HXL hashtag metadata (e.g. #date, #adm1+name) rather than data
data.drop(0, inplace=True)
data.head()

Unnamed: 0,date,admin1,admin2,market,latitude,longitude,category,commodity,unit,priceflag,pricetype,currency,price,usdprice
1,2006-01-15,Coast,Mombasa,Mombasa,-4.05,39.666667,cereals and tubers,Maize (white),90 KG,actual,Wholesale,KES,1480.0,20.5041
2,2006-01-15,Coast,Mombasa,Mombasa,-4.05,39.666667,pulses and nuts,Beans,KG,actual,Wholesale,KES,33.63,0.4659
3,2006-01-15,Coast,Mombasa,Mombasa,-4.05,39.666667,pulses and nuts,Beans (dry),90 KG,actual,Wholesale,KES,3246.0,44.9705
4,2006-01-15,Eastern,Kitui,Kitui,-1.366667,38.016667,cereals and tubers,Maize (white),KG,actual,Retail,KES,17.0,0.2355
5,2006-01-15,Eastern,Kitui,Kitui,-1.366667,38.016667,cereals and tubers,Potatoes (Irish),50 KG,actual,Wholesale,KES,1249.99,17.3175
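
As an alternative to dropping the tag row after loading, the row can be skipped at read time so that pandas infers numeric dtypes directly. The sketch below is illustrative only: it uses a small inline sample in place of `Data/wfp_food_prices_ken.csv` (rows copied from the preview above), and `skiprows=[1]` assumes the hashtag row is always the second line of the file. The names `sample_csv` and `sample` are hypothetical.

```python
import io
import pandas as pd

# Inline stand-in for Data/wfp_food_prices_ken.csv (rows copied from the preview above)
sample_csv = io.StringIO(
    "date,admin1,admin2,market,latitude,longitude,category,commodity,unit,"
    "priceflag,pricetype,currency,price,usdprice\n"
    "#date,#adm1+name,#adm2+name,#loc+market+name,#geo+lat,#geo+lon,#item+type,"
    "#item+name,#item+unit,#item+price+flag,#item+price+type,#currency,#value,#value+usd\n"
    "2006-01-15,Coast,Mombasa,Mombasa,-4.05,39.666667,cereals and tubers,"
    "Maize (white),90 KG,actual,Wholesale,KES,1480.0,20.5041\n"
    "2006-01-15,Coast,Mombasa,Mombasa,-4.05,39.666667,pulses and nuts,"
    "Beans,KG,actual,Wholesale,KES,33.63,0.4659\n"
)

# skiprows=[1] skips file line 1 (the hashtag row) while keeping the header on line 0;
# parse_dates converts the date column to datetime during loading
sample = pd.read_csv(sample_csv, skiprows=[1], parse_dates=["date"])
print(sample.dtypes)
```

With the tag row skipped, `price`, `usdprice`, `latitude`, and `longitude` should load as floats rather than `object`, which avoids a separate conversion step later.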


## EXPLORATORY DATA ANALYSIS

In [12]:
# General info about the data
data.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 15735 entries, 1 to 15735
Data columns (total 14 columns):
 #   Column     Non-Null Count  Dtype 
---  ------     --------------  ----- 
 0   date       15735 non-null  object
 1   admin1     15735 non-null  object
 2   admin2     15735 non-null  object
 3   market     15735 non-null  object
 4   latitude   15735 non-null  object
 5   longitude  15735 non-null  object
 6   category   15735 non-null  object
 7   commodity  15735 non-null  object
 8   unit       15735 non-null  object
 9   priceflag  15735 non-null  object
 10  pricetype  15735 non-null  object
 11  currency   15735 non-null  object
 12  price      15735 non-null  object
 13  usdprice   15735 non-null  object
dtypes: object(14)
memory usage: 1.8+ MB


In [16]:
data.columns

Index(['date', 'admin1', 'admin2', 'market', 'latitude', 'longitude',
       'category', 'commodity', 'unit', 'priceflag', 'pricetype', 'currency',
       'price', 'usdprice'],
      dtype='object')

In [17]:
data.shape

(15735, 14)

In [27]:
print("Unique categories:")
print(data['category'].unique())
print('\n')
print("Unique commodities:")
print(data['commodity'].unique())
print('\n')
print("Unique units of measurements:")
print(data['unit'].unique())

Unique categories:
['cereals and tubers' 'pulses and nuts' 'milk and dairy' 'oil and fats'
 'non-food' 'meat, fish and eggs' 'miscellaneous food'
 'vegetables and fruits']


Unique commodities:
['Maize (white)' 'Beans' 'Beans (dry)' 'Potatoes (Irish)' 'Sorghum'
 'Bread' 'Maize' 'Milk (cow, pasteurized)' 'Oil (vegetable)'
 'Fuel (diesel)' 'Fuel (kerosene)' 'Fuel (petrol-gasoline)' 'Maize flour'
 'Rice' 'Wheat flour' 'Meat (beef)' 'Meat (goat)' 'Milk (UHT)' 'Sugar'
 'Cooking fat' 'Bananas' 'Kale' 'Onions (red)' 'Tomatoes'
 'Potatoes (Irish, red)' 'Beans (kidney)' 'Beans (rosecoco)'
 'Beans (yellow)' 'Cabbage' 'Onions (dry)' 'Spinach'
 'Potatoes (Irish, white)' 'Rice (aromatic)' 'Sorghum (red)'
 'Beans (dolichos)' 'Cowpeas' 'Cowpea leaves' 'Maize (white, dry)'
 'Beans (mung)' 'Millet (finger)' 'Rice (imported, Pakistan)'
 'Fish (omena, dry)' 'Sorghum (white)' 'Salt' 'Meat (camel)'
 'Milk (camel, fresh)' 'Milk (cow, fresh)']


Unique units of measurements:
['90 KG' 'KG' '50 KG' '400 G' '50
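
Because the same commodity is quoted in different pack sizes (for example `90 KG` and `KG`), prices are not directly comparable across units. The following is a minimal sketch, not part of the original analysis, of one way to turn weight-based unit labels into kilograms. The helper `unit_to_kg` is hypothetical, it only handles the `KG` and `G` labels visible above, and any other unit is returned as `NaN` for manual review.

```python
import re
import numpy as np
import pandas as pd

def unit_to_kg(unit):
    """Return the weight in kilograms for labels like '90 KG' or '400 G'; NaN otherwise."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)?\s*(KG|G)\s*", str(unit), flags=re.IGNORECASE)
    if match is None:
        return np.nan  # unit is not weight-based, or uses a label this sketch does not know
    quantity = float(match.group(1)) if match.group(1) else 1.0
    # Grams are converted to kilograms; kilograms are kept as-is
    return quantity if match.group(2).upper() == "KG" else quantity / 1000

# Unit labels taken from the output above
units = pd.Series(["90 KG", "KG", "50 KG", "400 G"])
weights = units.map(unit_to_kg)
print(weights)
```

Dividing a numeric price column by these weights would give a per-kilogram price, assuming the price column has already been converted from `object` to a numeric dtype.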

In [29]:
# Checking for null values
data.isna().sum()

date         0
admin1       0
admin2       0
market       0
latitude     0
longitude    0
category     0
commodity    0
unit         0
priceflag    0
pricetype    0
currency     0
price        0
usdprice     0
dtype: int64

In [30]:
# Checking the datatypes of columns
data.dtypes


date         object
admin1       object
admin2       object
market       object
latitude     object
longitude    object
category     object
commodity    object
unit         object
priceflag    object
pricetype    object
currency     object
price        object
usdprice     object
dtype: object
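
Every column is stored as `object` because the hashtag row was read as data, so dates and prices are still strings. A possible next step, sketched here on a small stand-in frame rather than the full dataset, is to convert the date and numeric columns explicitly. The frame `sample` and the list `numeric_cols` are illustrative names, and the date format `%Y-%m-%d` is assumed from the preview rows.

```python
import pandas as pd

# Small stand-in frame with string-typed columns, mirroring the dtypes shown above
sample = pd.DataFrame({
    "date": ["2006-01-15", "2006-01-15"],
    "latitude": ["-4.05", "-1.366667"],
    "longitude": ["39.666667", "38.016667"],
    "price": ["1480.0", "17.0"],
    "usdprice": ["20.5041", "0.2355"],
})

numeric_cols = ["latitude", "longitude", "price", "usdprice"]

# Parse dates with an explicit format and convert numeric strings to floats
sample["date"] = pd.to_datetime(sample["date"], format="%Y-%m-%d")
sample[numeric_cols] = sample[numeric_cols].apply(pd.to_numeric, errors="coerce")

# errors="coerce" turns unparseable strings into NaN, so re-check for nulls afterwards
print(sample.dtypes)
print(sample.isna().sum())
```

Re-running the null check after conversion matters because values that were non-null strings can become `NaN` if they fail to parse.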