# Nigerian Food Price Prediction - Notebook 1 (Lagos Only)

## Objective
This notebook will:
1. Load the raw Nigerian food price dataset.
2. Focus on **Lagos state only**.
3. Explore the dataset and understand its columns.
4. Keep only relevant columns for modeling (closing prices and overall food price index).
5. Filter years **2020–2025**.
6. Aggregate multiple markets in Lagos into a **single state-level price per commodity per date** (the data is monthly).
7. Handle missing values.
8. Export a clean CSV dataset ready for modeling.



## 2. Import Libraries

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime

# Set display options
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 100)
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette('husl')

print("Libraries imported successfully!")

Libraries imported successfully!


## 3. Load Data

In [2]:
# Load the dataset
df = pd.read_csv('../data_raw/NGA_market_2007 to 2025.csv')

print(f"Dataset shape: {df.shape}")
print(f"\nDate range: {df['price_date'].min()} to {df['price_date'].max()}")
df.head()

Dataset shape: (12597, 95)

Date range: 2007-01-01 to 2025-05-01


Unnamed: 0,ISO3,country,adm1_name,adm2_name,mkt_name,lat,lon,geo_id,price_date,year,month,currency,components,start_dense_data,last_survey_point,data_coverage,data_coverage_recent,index_confidence_score,spatially_interpolated,bread,cassava_meal,cowpeas,gari,groundnuts,maize,millet,rice,sorghum,yam,o_bread,h_bread,l_bread,c_bread,inflation_bread,trust_bread,o_cassava_meal,h_cassava_meal,l_cassava_meal,c_cassava_meal,inflation_cassava_meal,trust_cassava_meal,o_cowpeas,h_cowpeas,l_cowpeas,c_cowpeas,inflation_cowpeas,trust_cowpeas,o_gari,h_gari,l_gari,c_gari,inflation_gari,trust_gari,o_groundnuts,h_groundnuts,l_groundnuts,c_groundnuts,inflation_groundnuts,trust_groundnuts,o_maize,h_maize,l_maize,c_maize,inflation_maize,trust_maize,o_millet,h_millet,l_millet,c_millet,inflation_millet,trust_millet,o_rice,h_rice,l_rice,c_rice,inflation_rice,trust_rice,o_sorghum,h_sorghum,l_sorghum,c_sorghum,inflation_sorghum,trust_sorghum,o_yam,h_yam,l_yam,c_yam,inflation_yam,trust_yam,o_food_price_index,h_food_price_index,l_food_price_index,c_food_price_index,inflation_food_price_index,trust_food_price_index
0,NGA,Nigeria,Abia,Oboma Ngwa,Aba,5.15,7.36,gid_5150000073600000,2007-01-01,2007,1,NGN,"bread (1 Unit, Index Weight = 1), cassava_meal...",Jan 2007,Jan 2023,12.01,0,0.96,0,,,,,,,,,,,38.09,39.31,36.86,38.09,,7.2,10147.74,10302.97,9829.78,10165.96,,9.2,10576.87,11018.41,10080.64,10574.63,,6.6,7873.09,8059.34,7656.14,7912.79,,9.3,25225.53,25539.31,24390.68,25152.93,,8.8,3353.65,3404.27,3216.58,3363.81,,9.3,4560.92,4666.26,4421.32,4587.98,,8.9,6164.99,6235.31,6031.07,6209.82,,9.5,4233.74,4356.45,4095.72,4266.28,,8.3,61.01,64.19,59.13,61.74,,9.2,0.47,0.48,0.46,0.48,,9.4
1,NGA,Nigeria,Abia,Oboma Ngwa,Aba,5.15,7.36,gid_5150000073600000,2007-02-01,2007,2,NGN,"bread (1 Unit, Index Weight = 1), cassava_meal...",Jan 2007,Jan 2023,12.01,0,0.96,0,,,,,,,,,,,38.08,39.31,36.86,38.39,,7.2,10084.45,10321.47,9627.0,9627.0,,9.2,10547.29,11016.07,9214.69,9214.69,,6.6,7897.36,8099.98,7562.43,7562.43,,9.3,24893.14,25465.8,23673.78,23673.78,,8.8,3320.46,3414.59,3037.84,3037.84,,9.3,4570.74,4693.94,4182.67,4182.67,,8.9,6177.79,6280.65,5918.4,5918.4,,9.5,4258.57,4389.94,3748.06,3748.06,,8.3,62.4,64.96,59.25,59.25,,9.2,0.47,0.48,0.45,0.45,,9.4
2,NGA,Nigeria,Abia,Oboma Ngwa,Aba,5.15,7.36,gid_5150000073600000,2007-03-01,2007,3,NGN,"bread (1 Unit, Index Weight = 1), cassava_meal...",Jan 2007,Jan 2023,12.01,0,0.96,0,,,,,,,,,,,38.49,39.63,37.35,37.9,,7.2,9441.19,9704.99,9185.07,9704.99,,9.2,8759.96,9214.8,8317.34,9214.8,,6.6,7458.6,7652.24,7264.96,7614.74,,9.3,23141.3,23840.14,22442.47,23805.37,,8.8,2946.76,3043.77,2849.76,3041.09,,9.3,4072.13,4191.93,3959.82,4191.93,,8.9,5796.59,5929.94,5663.24,5923.32,,9.5,3653.49,3788.04,3518.95,3752.11,,8.3,59.06,61.52,56.61,59.43,,9.2,0.44,0.45,0.43,0.45,,9.4
3,NGA,Nigeria,Abia,Oboma Ngwa,Aba,5.15,7.36,gid_5150000073600000,2007-04-01,2007,4,NGN,"bread (1 Unit, Index Weight = 1), cassava_meal...",Jan 2007,Jan 2023,12.01,0,0.96,0,,,,,,,,,,,37.75,38.79,36.71,38.31,,7.2,9690.08,9938.5,9441.67,9711.28,,9.2,9000.2,9423.97,8576.42,9148.48,,6.6,7617.1,7822.72,7411.47,7624.58,,9.3,23684.74,24281.06,23088.42,23635.41,,8.8,3012.11,3107.03,2917.19,3018.89,,9.3,4140.71,4299.48,3981.94,4154.57,,8.9,5843.83,5963.77,5723.89,5894.22,,9.5,3667.32,3813.36,3521.28,3727.59,,8.3,60.13,61.91,58.35,59.05,,9.2,0.45,0.46,0.44,0.45,,9.4
4,NGA,Nigeria,Abia,Oboma Ngwa,Aba,5.15,7.36,gid_5150000073600000,2007-05-01,2007,5,NGN,"bread (1 Unit, Index Weight = 1), cassava_meal...",Jan 2007,Jan 2023,12.01,0,0.96,0,,,,,,,,,,,38.48,39.46,37.5,37.82,,7.2,9746.52,9950.41,9542.62,9611.01,,9.2,9085.91,9508.1,8663.71,9104.57,,6.6,7661.86,7833.59,7490.13,7549.9,,9.3,23584.98,24067.33,23102.63,23666.28,,8.8,3061.4,3152.35,2970.45,3003.7,,9.3,4164.1,4270.46,4057.73,4147.34,,8.9,5865.01,5965.65,5764.36,5877.18,,9.5,3788.8,3939.37,3638.23,3702.76,,8.3,59.57,61.3,57.83,59.02,,9.2,0.45,0.46,0.44,0.44,,9.4


## 4. Initial Data Exploration

In [3]:
# Column names and data types
df.info()

# Convert price_date to datetime; rows whose date cannot be parsed become NaT and are dropped
df['price_date'] = pd.to_datetime(df['price_date'], errors='coerce')
df = df.dropna(subset=['price_date'])


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12597 entries, 0 to 12596
Data columns (total 95 columns):
 #   Column                      Non-Null Count  Dtype  
---  ------                      --------------  -----  
 0   ISO3                        12597 non-null  object 
 1   country                     12597 non-null  object 
 2   adm1_name                   12597 non-null  object 
 3   adm2_name                   12597 non-null  object 
 4   mkt_name                    12597 non-null  object 
 5   lat                         12376 non-null  float64
 6   lon                         12376 non-null  float64
 7   geo_id                      12597 non-null  object 
 8   price_date                  12597 non-null  object 
 9   year                        12597 non-null  int64  
 10  month                       12597 non-null  int64  
 11  currency                    12597 non-null  object 
 12  components                  12597 non-null  object 
 13  start_dense_data            125
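
Because `errors='coerce'` silently turns unparseable dates into `NaT`, it can help to count how many rows the following `dropna` would remove. A minimal sketch on a toy frame (the `toy` data and the `n_bad` name are illustrative, not taken from the dataset):

```python
import pandas as pd

# Toy frame standing in for the raw data; the second value is deliberately invalid.
toy = pd.DataFrame({"price_date": ["2007-01-01", "not a date", "2007-03-01"]})

parsed = pd.to_datetime(toy["price_date"], errors="coerce")
n_bad = int(parsed.isna().sum())  # rows that dropna(subset=['price_date']) would remove
print(f"Unparseable dates: {n_bad} of {len(toy)}")

# Same two steps as the cell above, applied to the toy frame.
toy = toy.assign(price_date=parsed).dropna(subset=["price_date"])
```

Running the same count on `df` before the `dropna` shows whether any rows are lost at this step.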

In [4]:
# Summary statistics for numerical columns
df.describe()

Unnamed: 0,lat,lon,price_date,year,month,data_coverage,data_coverage_recent,index_confidence_score,spatially_interpolated,bread,cassava_meal,cowpeas,gari,groundnuts,maize,millet,rice,sorghum,yam,o_bread,h_bread,l_bread,c_bread,inflation_bread,trust_bread,o_cassava_meal,h_cassava_meal,l_cassava_meal,c_cassava_meal,inflation_cassava_meal,trust_cassava_meal,o_cowpeas,h_cowpeas,l_cowpeas,c_cowpeas,inflation_cowpeas,trust_cowpeas,o_gari,h_gari,l_gari,c_gari,inflation_gari,trust_gari,o_groundnuts,h_groundnuts,l_groundnuts,c_groundnuts,inflation_groundnuts,trust_groundnuts,o_maize,h_maize,l_maize,c_maize,inflation_maize,trust_maize,o_millet,h_millet,l_millet,c_millet,inflation_millet,trust_millet,o_rice,h_rice,l_rice,c_rice,inflation_rice,trust_rice,o_sorghum,h_sorghum,l_sorghum,c_sorghum,inflation_sorghum,trust_sorghum,o_yam,h_yam,l_yam,c_yam,inflation_yam,trust_yam,o_food_price_index,h_food_price_index,l_food_price_index,c_food_price_index,inflation_food_price_index,trust_food_price_index
count,12376.0,12376.0,12597,12597.0,12597.0,12597.0,12597.0,12597.0,12597.0,1544.0,1231.0,1487.0,1428.0,1446.0,1692.0,1591.0,1427.0,1669.0,1187.0,12597.0,12597.0,12597.0,12597.0,11913.0,12597.0,12597.0,12597.0,12597.0,12597.0,11913.0,12597.0,12597.0,12597.0,12597.0,12597.0,11913.0,12597.0,12597.0,12597.0,12597.0,12597.0,11913.0,12597.0,12597.0,12597.0,12597.0,12597.0,11913.0,12597.0,12597.0,12597.0,12597.0,12597.0,11913.0,12597.0,12597.0,12597.0,12597.0,12597.0,11913.0,12597.0,12597.0,12597.0,12597.0,12597.0,11913.0,12597.0,12597.0,12597.0,12597.0,12597.0,11913.0,12597.0,12597.0,12597.0,12597.0,12597.0,11913.0,12597.0,12597.0,12597.0,12597.0,12597.0,11913.0,12597.0
mean,11.505714,11.564286,2016-03-01 13:14:55.927601664,2015.714932,6.420814,12.01,0.0,0.96,0.0,261.046813,18746.527774,26093.39421,15887.914671,32816.223672,12214.501531,14456.949485,18712.062957,13537.006477,279.710017,140.571153,148.525824,132.82586,140.337157,24.122177,7.631984,17233.399268,17740.453702,16750.671326,17257.18935,10.47401,9.323529,18414.243016,19320.784403,17495.561244,18401.093754,7.11382,7.123974,15459.443635,15949.904354,14995.631303,15478.054562,12.02711,9.410542,36361.479455,37498.759301,35245.212728,36419.592546,6.648528,8.985886,10622.899632,11008.176826,10247.044923,10620.877704,13.852808,9.403826,12036.581992,12443.796041,11639.292312,12031.79566,14.188323,9.068929,18352.733691,18826.760429,17955.202059,18423.718622,15.458937,9.568651,10739.012204,11139.941679,10349.131838,10736.312786,14.938641,8.566762,187.245115,194.175955,180.611633,187.264897,13.942408,9.311431,1.024206,1.047745,1.001592,1.024619,10.392064,9.480416
min,5.15,3.4,2007-01-01 00:00:00,2007.0,1.0,12.01,0.0,0.96,0.0,96.53,4400.0,7472.0,4000.0,7646.0,2350.0,2235.0,6600.0,2295.0,22.0,9.1,9.97,8.63,9.35,-75.65,7.2,4192.29,4523.38,3861.21,4400.0,-65.84,9.2,5901.0,6400.36,5401.65,5966.08,-62.42,6.6,3825.4,4062.35,3588.44,4000.0,-68.76,9.3,8114.42,8781.53,7447.3,8199.5,-73.95,8.8,2329.21,2502.95,2109.85,2350.0,-46.3,9.3,2284.36,2433.27,2133.54,2235.69,-50.96,8.9,5198.25,5351.41,5008.11,5174.95,-31.29,9.5,2331.07,2496.98,2145.89,2295.0,-55.52,8.3,47.35,48.93,41.89,47.27,-84.86,9.2,0.36,0.38,0.35,0.36,-38.76,9.4
25%,11.2675,11.07,2011-08-01 00:00:00,2011.0,3.0,12.01,0.0,0.96,0.0,180.0,11400.0,14612.5,9274.25,21600.0,7000.0,8150.0,13131.0,7000.0,153.88,25.15,26.02,24.2,25.18,-11.01,7.2,11876.75,12174.67,11598.14,11876.12,-6.61,9.2,13680.87,14296.7,13059.21,13682.83,-14.46,6.6,10269.36,10565.28,10006.28,10270.16,-6.78,9.3,27594.15,28612.19,26764.71,27661.52,-9.13,8.8,5866.14,6047.17,5667.1,5861.3,-2.37,9.3,6590.98,6783.59,6402.8,6573.21,-1.85,8.9,9169.17,9341.22,8987.96,9132.79,0.75,9.5,5665.41,5845.66,5459.37,5641.04,-2.72,8.3,91.95,94.24,89.21,91.53,4.71,9.2,0.67,0.68,0.66,0.67,-2.35,9.4
50%,11.83,12.375,2016-03-01 00:00:00,2016.0,6.0,12.01,0.0,0.96,0.0,247.52,16240.0,22483.87,14577.42,26612.5,10425.0,12777.5,16500.0,11550.0,220.0,150.13,161.38,141.0,149.89,4.31,7.2,14547.2,14880.53,14211.7,14526.31,8.61,9.2,16269.69,17001.7,15581.73,16275.53,1.54,6.6,12581.63,12934.48,12273.22,12590.45,8.64,9.3,35597.12,36592.0,34535.17,35598.44,7.8,8.8,7571.27,7829.33,7300.0,7550.0,5.86,9.3,8514.56,8782.99,8241.55,8494.64,7.82,8.9,11328.81,11626.41,11119.2,11474.04,9.49,9.5,7406.2,7728.1,7131.68,7424.32,4.94,8.3,133.95,139.71,128.45,134.04,9.71,9.2,0.8,0.82,0.78,0.8,7.24,9.4
75%,12.1325,13.18,2020-10-01 00:00:00,2020.0,9.0,12.01,0.0,0.96,0.0,315.5475,22266.0,34729.305,20000.0,40900.0,17245.89,19280.335,25208.335,18000.0,314.075,221.37,233.82,209.42,220.36,30.37,7.2,19883.01,20663.33,19054.81,19932.2,19.86,9.2,20928.85,22008.14,19824.21,20857.26,21.7,6.6,17618.55,18279.77,16872.91,17622.26,21.79,9.3,43249.62,44494.91,42012.42,43313.33,14.65,8.8,15073.79,15865.46,14324.77,15078.57,21.93,9.3,16607.58,17375.33,15844.59,16645.53,21.46,8.9,23229.14,23799.94,22677.35,23257.78,21.29,9.5,15642.78,16459.2,14900.7,15700.0,22.42,8.3,241.92,253.44,234.18,242.59,19.49,9.2,1.28,1.31,1.25,1.28,18.92,9.4
max,13.15,14.49,2025-05-01 00:00:00,2025.0,12.0,12.01,0.0,0.96,0.0,711.39,58000.0,74150.0,44916.13,96774.19,27641.94,48130.67,44800.0,51482.67,1645.58,674.5,706.61,609.76,664.74,1247.56,10.0,69429.69,70929.23,67930.15,69895.59,254.17,10.0,73241.37,76102.25,69206.45,71932.76,267.91,10.0,55150.06,56115.06,54185.05,54802.61,210.37,10.0,92213.87,95747.25,88680.49,92857.02,184.18,10.0,31000.47,31645.34,30481.3,30824.92,288.74,10.0,47746.51,48507.05,45522.93,46985.96,221.74,10.0,73656.41,74682.15,71756.88,72548.56,143.54,10.0,52348.43,53810.14,49977.42,50784.71,376.0,10.0,1643.4,1759.78,1527.02,1624.51,475.77,10.0,3.0,3.05,2.95,2.98,101.96,10.0
std,1.431014,2.596024,,5.320301,3.458921,1.776427e-15,0.0,2.220534e-16,0.0,115.772024,10216.739859,14093.784599,8274.731001,15937.95447,6189.131596,8260.859959,8122.121105,8660.917417,220.701974,113.147087,119.154726,106.960285,112.217396,68.322976,0.983008,8268.080553,8546.988104,8010.700778,8302.130235,25.288369,0.28263,7728.166872,8225.768666,7184.637906,7638.796879,35.546522,1.19244,8154.851544,8458.9664,7880.229673,8190.425246,28.154395,0.249903,10877.700959,11155.026362,10597.467,10877.068179,22.646157,0.424284,6196.421316,6405.70767,6002.639541,6193.533755,28.656989,0.240388,7464.493981,7696.605914,7240.644305,7452.7152,27.580582,0.38582,14048.9641,14468.545034,13696.970202,14124.449749,20.95358,0.162587,6841.378423,7081.162519,6609.252331,6831.647224,34.271187,0.594029,136.785246,145.644882,128.380827,136.340171,22.704066,0.261775,0.492585,0.50454,0.481232,0.493384,17.050827,0.189934


In [5]:
# Commodities in dataset
commodities = ['bread', 'cassava_meal', 'cowpeas', 'gari', 'groundnuts', 
               'maize', 'millet', 'rice', 'sorghum', 'yam']

In [6]:
# Check unique states
print("Number of states:", df['adm1_name'].nunique())
print(df['adm1_name'].value_counts())

Number of states: 14
adm1_name
Borno             5967
Yobe              3315
Adamawa            663
Kaduna             442
Katsina            221
Abia               221
Kano               221
Gombe              221
Kebbi              221
Jigawa             221
Oyo                221
Zamfara            221
Lagos              221
Market Average     221
Name: count, dtype: int64
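
The count of 14 includes `Market Average`, which looks like an aggregate series rather than a state. That reading is an assumption based on the label alone. A sketch of counting states without it, using a toy frame in place of `df`:

```python
import pandas as pd

# Toy stand-in for df; only the adm1_name column matters here.
toy = pd.DataFrame({"adm1_name": ["Lagos", "Oyo", "Market Average", "Lagos"]})

# Assumption: 'Market Average' is an aggregate row, not an administrative state.
is_state = toy["adm1_name"] != "Market Average"
n_states = toy.loc[is_state, "adm1_name"].nunique()
print("Number of states (excluding aggregate):", n_states)
```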


In [7]:
# View the closing-price distribution for a sample commodity (e.g. rice)
df['c_rice'].describe()

count    12597.000000
mean     18423.718622
std      14124.449749
min       5174.950000
25%       9132.790000
50%      11474.040000
75%      23257.780000
max      72548.560000
Name: c_rice, dtype: float64

## 5. Filter to Lagos and 2020-2025 Timeframe

In [8]:
# Keep only Lagos
df_lagos = df[df['adm1_name'] == 'Lagos']

# Filter years 2020–2025
df_lagos = df_lagos[(df_lagos['price_date'].dt.year >= 2020) & (df_lagos['price_date'].dt.year <= 2025)]
print("Filtered Lagos shape:", df_lagos.shape)

df_lagos.head()

df_lagos['adm2_name'].unique()


Filtered Lagos shape: (65, 95)


array(['Kosofe'], dtype=object)
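
The 65 rows are consistent with one observation per month from January 2020 to May 2025 (5 × 12 + 5 = 65). One way to confirm there are no missing months is to compare the dates against a complete month-start range. The sketch below uses a toy series with one gap; `expected` and `missing` are illustrative names.

```python
import pandas as pd

# Toy monthly dates with March 2020 missing.
dates = pd.to_datetime(["2020-01-01", "2020-02-01", "2020-04-01"])

# 'MS' = month-start frequency, matching the first-of-month dates in the dataset.
expected = pd.date_range(dates.min(), dates.max(), freq="MS")
missing = expected.difference(dates)
print("Missing months:", list(missing.strftime("%Y-%m")))
```

Applied to the notebook's data, `dates` would be `df_lagos['price_date']`, and an empty `missing` would mean the series is complete.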

## 6. Columns to Keep
- Keep: `adm1_name`, `price_date`, the commodity closing prices (`c_*`), and `c_food_price_index`.
- Drop: metadata, opening/high/low prices, inflation, trust scores, ADM2 names, market names, and coordinates, none of which are needed for the current prediction task.
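
As an alternative to listing the closing-price columns by hand, they can be selected by prefix. A plain `startswith('c')` test would also catch `country`, `currency`, `components`, `cassava_meal`, and `cowpeas`, so the pattern has to include the underscore. A sketch on an empty toy frame carrying a few of the real column names:

```python
import pandas as pd

# Empty frame with a sample of the dataset's column names.
toy = pd.DataFrame(columns=["country", "currency", "cassava_meal", "o_rice",
                            "c_rice", "c_yam", "c_food_price_index"])

# Anchor the regex at the start and require the underscore after 'c'.
closing = toy.filter(regex=r"^c_").columns.tolist()
print(closing)
```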

In [9]:
# Closing price columns
closing_cols = [
    'c_bread', 'c_cassava_meal', 'c_cowpeas', 'c_gari', 
    'c_groundnuts', 'c_maize', 'c_millet', 'c_rice', 
    'c_sorghum', 'c_yam', 'c_food_price_index'
]

cols_to_keep = ['adm1_name', 'price_date'] + closing_cols
df_lagos = df_lagos[cols_to_keep]

df_lagos.head()


Unnamed: 0,adm1_name,price_date,c_bread,c_cassava_meal,c_cowpeas,c_gari,c_groundnuts,c_maize,c_millet,c_rice,c_sorghum,c_yam,c_food_price_index
8112,Lagos,2020-01-01,609.76,8700.0,16450.0,5034.31,26650.0,10663.99,12800.0,22475.0,11577.97,189.79,1.17
8113,Lagos,2020-02-01,609.76,8700.0,16300.0,5756.61,26900.0,10650.0,13225.0,22300.0,11475.0,196.25,1.18
8114,Lagos,2020-03-01,609.76,9964.15,17287.5,7028.4,28475.0,11912.5,14012.5,26525.0,13225.0,211.38,1.28
8115,Lagos,2020-04-01,664.74,11763.0,18466.67,9691.85,31786.67,13210.0,15153.33,28348.97,14136.67,234.98,1.41
8116,Lagos,2020-05-01,605.79,13769.13,18750.0,12397.37,33722.58,14022.58,17290.32,28093.55,14674.19,282.73,1.45


## 7. Aggregate Multiple Markets (State-Level)

In [10]:
# Aggregate markets per date by averaging closing prices
df_lagos_state = df_lagos.groupby(['adm1_name', 'price_date'])[closing_cols].mean().reset_index()

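# Handle missing values: forward-fill within the state, then backward-fill any leading gaps
# Note: groupby(...).ffill() drops the grouping column, so adm1_name is absent from the result
df_lagos_state = df_lagos_state.groupby('adm1_name').ffill().bfill()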

# Check final shape
print(df_lagos_state.shape)
df_lagos_state.head()


(65, 12)


Unnamed: 0,price_date,c_bread,c_cassava_meal,c_cowpeas,c_gari,c_groundnuts,c_maize,c_millet,c_rice,c_sorghum,c_yam,c_food_price_index
0,2020-01-01,609.76,8700.0,16450.0,5034.31,26650.0,10663.99,12800.0,22475.0,11577.97,189.79,1.17
1,2020-02-01,609.76,8700.0,16300.0,5756.61,26900.0,10650.0,13225.0,22300.0,11475.0,196.25,1.18
2,2020-03-01,609.76,9964.15,17287.5,7028.4,28475.0,11912.5,14012.5,26525.0,13225.0,211.38,1.28
3,2020-04-01,664.74,11763.0,18466.67,9691.85,31786.67,13210.0,15153.33,28348.97,14136.67,234.98,1.41
4,2020-05-01,605.79,13769.13,18750.0,12397.37,33722.58,14022.58,17290.32,28093.55,14674.19,282.73,1.45
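
If `adm1_name` needs to survive the fill step (for example, once more states are added), one option is to fill only the price columns and assign them back, since `groupby(...).ffill()` returns just the non-grouping columns. A sketch on toy data, where `value_cols` and the sample frame are illustrative:

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "adm1_name": ["Lagos"] * 4,
    "price_date": pd.to_datetime(["2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01"]),
    "c_rice": [np.nan, 22300.0, np.nan, 28348.97],
})
value_cols = ["c_rice"]

filled = toy.copy()
# Fill forward, then backward, separately within each state; the key column is left untouched.
filled[value_cols] = (
    toy.groupby("adm1_name")[value_cols]
       .transform(lambda s: s.ffill().bfill())
)
print(filled)
```

Doing both fills inside `transform` also keeps the backward-fill from borrowing values across states.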


## 8. Create Per-kg Columns for Each Commodity

- The `components` column lists each commodity's unit and index weight: bread (1 Unit, Index Weight = 1), cassava_meal (100 KG, Index Weight = 0.01), cowpeas (100 KG, Index Weight = 0.01),
gari (100 KG, Index Weight = 0.01), groundnuts (100 KG, Index Weight = 0.01), maize (100 KG, Index Weight = 0.01), millet (100 KG, Index Weight = 0.01),
rice (50 KG, Index Weight = 0.02), sorghum (100 KG, Index Weight = 0.01), yam (1 KG, Index Weight = 1).

- Most commodities are priced per 100 kg (cassava_meal, cowpeas, gari, groundnuts, maize, millet, sorghum), rice is priced per 50 kg, yam per 1 kg, and bread per unit.
- Many ML models work better when numeric features are on comparable scales.
- Converting to price per kg puts every commodity except bread on the same unit, so trends can be compared directly.
- Bread remains per unit; all features can be normalized later if needed.
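
Rather than typing the divisors by hand, they could be read from the `components` string itself. The sketch below parses the string quoted above with a regular expression; `pattern`, `units`, and `kg_divisors` are illustrative names, and the regex assumes every entry follows the `name (quantity unit, Index Weight = weight)` layout shown.

```python
import re

# The components string as quoted from the dataset's `components` column.
components = (
    "bread (1 Unit, Index Weight = 1), cassava_meal (100 KG, Index Weight = 0.01), "
    "cowpeas (100 KG, Index Weight = 0.01), gari (100 KG, Index Weight = 0.01), "
    "groundnuts (100 KG, Index Weight = 0.01), maize (100 KG, Index Weight = 0.01), "
    "millet (100 KG, Index Weight = 0.01), rice (50 KG, Index Weight = 0.02), "
    "sorghum (100 KG, Index Weight = 0.01), yam (1 KG, Index Weight = 1)"
)

# Captures: name, quantity, unit, index weight.
pattern = re.compile(r"(\w+) \((\d+) (\w+), Index Weight = ([\d.]+)\)")

units = {
    name: (int(qty), unit, float(weight))
    for name, qty, unit, weight in pattern.findall(components)
}

# Divisors for the commodities priced in KG; bread is per unit and is left out.
kg_divisors = {f"c_{name}": qty for name, (qty, unit, _) in units.items() if unit == "KG"}
print(kg_divisors)
```

In every entry the index weight equals 1 divided by the quantity (for example, 1 / 50 = 0.02 for rice), so the weight already acts as the per-unit conversion factor.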


In [11]:
# Define divisor for per-kg conversion
per_kg_divisor = {
    'c_bread': 1,                # priced per unit, not per kg; left unchanged for now
    'c_cassava_meal': 100,
    'c_cowpeas': 100,
    'c_gari': 100,
    'c_groundnuts': 100,
    'c_maize': 100,
    'c_millet': 100,
    'c_rice': 50,
    'c_sorghum': 100,
    'c_yam': 1                   # already priced per kg
}

# Create new per-kg columns
for col, divisor in per_kg_divisor.items():
    new_col = col + '_per_kg'
    df_lagos_state[new_col] = df_lagos_state[col] / divisor

df_lagos_state.head()


Unnamed: 0,price_date,c_bread,c_cassava_meal,c_cowpeas,c_gari,c_groundnuts,c_maize,c_millet,c_rice,c_sorghum,c_yam,c_food_price_index,c_bread_per_kg,c_cassava_meal_per_kg,c_cowpeas_per_kg,c_gari_per_kg,c_groundnuts_per_kg,c_maize_per_kg,c_millet_per_kg,c_rice_per_kg,c_sorghum_per_kg,c_yam_per_kg
0,2020-01-01,609.76,8700.0,16450.0,5034.31,26650.0,10663.99,12800.0,22475.0,11577.97,189.79,1.17,609.76,87.0,164.5,50.3431,266.5,106.6399,128.0,449.5,115.7797,189.79
1,2020-02-01,609.76,8700.0,16300.0,5756.61,26900.0,10650.0,13225.0,22300.0,11475.0,196.25,1.18,609.76,87.0,163.0,57.5661,269.0,106.5,132.25,446.0,114.75,196.25
2,2020-03-01,609.76,9964.15,17287.5,7028.4,28475.0,11912.5,14012.5,26525.0,13225.0,211.38,1.28,609.76,99.6415,172.875,70.284,284.75,119.125,140.125,530.5,132.25,211.38
3,2020-04-01,664.74,11763.0,18466.67,9691.85,31786.67,13210.0,15153.33,28348.97,14136.67,234.98,1.41,664.74,117.63,184.6667,96.9185,317.8667,132.1,151.5333,566.9794,141.3667,234.98
4,2020-05-01,605.79,13769.13,18750.0,12397.37,33722.58,14022.58,17290.32,28093.55,14674.19,282.73,1.45,605.79,137.6913,187.5,123.9737,337.2258,140.2258,172.9032,561.871,146.7419,282.73
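
The same conversion can be written without a loop by dividing the columns by a Series of divisors, which pandas aligns on column labels. A sketch using the first Lagos row above, restricted to three commodities (`row`, `divisors`, and `result` are illustrative names):

```python
import pandas as pd

# First Lagos row from the table above, three commodities only.
row = pd.DataFrame({"c_cassava_meal": [8700.0], "c_rice": [22475.0], "c_yam": [189.79]})
divisors = pd.Series({"c_cassava_meal": 100, "c_rice": 50, "c_yam": 1})

# div() matches the Series index to the column labels, so each column gets its own divisor.
per_kg = row[divisors.index].div(divisors).add_suffix("_per_kg")
result = row.join(per_kg)
print(result)
```

The values agree with the table: 8700 / 100 = 87.0 for cassava meal and 22475 / 50 = 449.5 for rice.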


## 9. Plot Trends for all Commodities

In [12]:
# Identify the per-kg commodity columns (those ending with "_per_kg");
# they live in df_lagos_state, not df_lagos
commodity_cols = [col for col in df_lagos_state.columns if col.endswith("_per_kg")]

# Loop through commodities and plot each one
for col in commodity_cols:
    plt.figure(figsize=(10, 4))
    plt.plot(df_lagos_state['price_date'], df_lagos_state[col])
    plt.title(f"{col} Price Trend in Lagos")
    plt.xlabel("Date")
    plt.ylabel("Price (NGN)")
    plt.tight_layout()
    plt.show()
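
Since the series differ widely in level (rice is in the hundreds of naira per kg while cassava meal starts below 100), one option for comparing trends on a single axis is to rebase each series to 100 at the first date. This is a sketch of that idea rather than part of the original analysis; `indexed` is an illustrative name, and the values are the first and fifth Lagos rows from the per-kg table.

```python
import pandas as pd

per_kg = pd.DataFrame(
    {
        "c_rice_per_kg": [449.5, 561.871],
        "c_cassava_meal_per_kg": [87.0, 137.6913],
    },
    index=pd.to_datetime(["2020-01-01", "2020-05-01"]),
)

# Divide every row by the first row, then scale so the first date equals 100.
indexed = per_kg.div(per_kg.iloc[0]) * 100
print(indexed.round(1))
```

Calling `indexed.plot()` would then draw all commodities on one chart, with each line showing the percentage change since the first date.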


## 10. Save Filtered Dataset

In [14]:
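# Save the filtered data
# Note: df_lagos holds the filtered Lagos columns only; the filled frame with the
# per-kg columns is df_lagos_state, so swap it in here if those should be exported too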
output_path = '../data_processed/lagos_filtered.csv'
df_lagos.to_csv(output_path, index=False)

print(f"✓ Filtered dataset saved to: {output_path}")
print(f"  Shape: {df_lagos.shape}")
print(f"  Size: {df_lagos.memory_usage(deep=True).sum() / 1024**2:.2f} MB")

✓ Filtered dataset saved to: ../data_processed/lagos_filtered.csv
  Shape: (65, 13)
  Size: 0.01 MB
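
CSV does not store dtypes, so `price_date` comes back as a string unless it is parsed on load. A sketch of reading the export back, using an in-memory buffer with two sample rows in place of the real file so that it runs anywhere:

```python
import io
import pandas as pd

# Two rows in the same layout as the exported file (subset of columns).
csv_text = (
    "adm1_name,price_date,c_rice\n"
    "Lagos,2020-01-01,22475.0\n"
    "Lagos,2020-02-01,22300.0\n"
)

# parse_dates restores the datetime dtype that to_csv wrote out as text.
reloaded = pd.read_csv(io.StringIO(csv_text), parse_dates=["price_date"])
print(reloaded.dtypes)
```

In the modeling notebook the buffer would be replaced by the path in `output_path`.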
