This cell connects to my PostgreSQL database. The actual connection function, `connect_to_db`, lives in the file db_connect.py.

In [None]:
# Import necessary packages
import pandas as pd
from db_connect import connect_to_db

# Step 1: Connect to the database
conn = connect_to_db()

# Step 2: Create a cursor and run a query
cursor = conn.cursor()
query = "SELECT * FROM food_prices_cleaned.food_prices_kenya;"
cursor.execute(query)

# Step 3: Fetch results and convert to a DataFrame
rows = cursor.fetchall()
df = pd.DataFrame(rows, columns=[desc[0] for desc in cursor.description])

# Step 4: Display the data
print("Connection successful! Previewing data:")
display(df.head(25))
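The fetch-and-convert pattern in Step 3 relies only on the standard Python DB-API (`cursor.fetchall()` plus `cursor.description`), so it can be tried without the real database. Below is a minimal sketch that uses an in-memory SQLite database as a stand-in for PostgreSQL. The table name `food_prices_demo` and its values are made up for illustration.

```python
import sqlite3
import pandas as pd

# In-memory SQLite stands in for the PostgreSQL connection (illustration only)
demo_conn = sqlite3.connect(":memory:")
demo_cursor = demo_conn.cursor()

# Hypothetical table with made-up values
demo_cursor.execute(
    "CREATE TABLE food_prices_demo (provinces TEXT, year INTEGER, c_maize REAL)"
)
demo_cursor.executemany(
    "INSERT INTO food_prices_demo VALUES (?, ?, ?)",
    [("Nairobi", 2007, 20.5), ("Coast", 2007, 22.0)],
)

# Same pattern as above: run the query, fetch the rows, build a DataFrame
demo_cursor.execute("SELECT * FROM food_prices_demo;")
demo_rows = demo_cursor.fetchall()

# cursor.description holds one tuple per column; item 0 is the column name
demo_df = pd.DataFrame(demo_rows, columns=[desc[0] for desc in demo_cursor.description])

# Release the cursor and connection once the data is in memory
demo_cursor.close()
demo_conn.close()

print(demo_df)
```

Closing the cursor and connection after the DataFrame is built is a good habit for the real PostgreSQL connection as well.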


Data exploration with Python, to get a first understanding of the data

In [None]:
# The first positional argument of info() is `verbose`, so spell it out
df.info(verbose=True)

In [None]:
df.describe()

# Standardising potatoes data to price per KG

Divide the potato price columns by 50 to standardise them to a price per 1 kg (the division assumes the original prices are quoted per 50 kg).
I do this in derived columns to avoid confusion, and in case I need the original data in future.

In [None]:
df['o_potatoes_1kg'] = df['o_potatoes'] / 50
df['h_potatoes_1kg'] = df['h_potatoes'] / 50
df['l_potatoes_1kg'] = df['l_potatoes'] / 50
df['c_potatoes_1kg'] = df['c_potatoes'] / 50
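The same conversion can be written as a loop over the prefixes, which avoids repeating the divisor. This is a small sketch on a toy DataFrame with made-up prices; `UNIT_KG` is a name introduced here for illustration, and only two of the four prefixes are shown.

```python
import pandas as pd

# Toy prices quoted per 50 kg (made-up values)
toy = pd.DataFrame({
    "o_potatoes": [2500.0, 3000.0],
    "c_potatoes": [2750.0, 3250.0],
})

UNIT_KG = 50  # assumed size of the original pricing unit

# Create one derived per-kg column per prefix; the originals stay untouched
for prefix in ["o", "c"]:
    toy[f"{prefix}_potatoes_1kg"] = toy[f"{prefix}_potatoes"] / UNIT_KG

print(toy)
```

Keeping the divisor in one named constant makes it easier to adjust if the unit turns out to differ.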

# Confirm that the additional potato columns have been added


In [None]:
df.info()

Renaming some columns for better understanding of what they represent

In [None]:
import pandas as pd

def rename_agric_columns(df):
    """
    Renames columns like o_beans, h_beans, c_maize, etc. 
    to a consistent format such as beans_open, maize_high, etc.
    """
    rename_map = {}
    prefix_map = {
        'o': 'open',
        'h': 'high',
        'l': 'low',
        'c': 'close'
    }

    # Iterate through existing columns
    for col in df.columns:
        # Check for trading-style prefixes (o_, h_, l_, c_)
        for prefix, new_prefix in prefix_map.items():
            if col.startswith(f"{prefix}_"):
                # Example: o_beans → beans_open
                rename_map[col] = f"{col.split('_', 1)[1]}_{new_prefix}"
                break
        # Columns without one of these prefixes (e.g. inflation_*, trust_*)
        # are left out of rename_map, so they keep their original names.
    
    # Apply renaming
    df = df.rename(columns=rename_map)
    return df

In [None]:
df = rename_agric_columns(df)

In [None]:
print(df.columns)
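To see what the renaming does without the full dataset, here is a compact sketch of the same prefix-to-suffix rule applied to an empty toy DataFrame. The helper `move_prefix_to_suffix` and the column `inflation_food` are hypothetical names used only for this illustration.

```python
import pandas as pd

prefix_map = {"o": "open", "h": "high", "l": "low", "c": "close"}

# Empty toy frame: only the column names matter here
toy = pd.DataFrame(
    columns=["year", "o_beans", "c_maize", "c_potatoes_1kg", "inflation_food"]
)

def move_prefix_to_suffix(col):
    """Turn 'c_maize' into 'maize_close'; leave other names unchanged."""
    prefix, sep, rest = col.partition("_")
    if sep and prefix in prefix_map:
        return f"{rest}_{prefix_map[prefix]}"
    return col

# DataFrame.rename accepts a function that is applied to every column name
renamed = toy.rename(columns=move_prefix_to_suffix)
print(list(renamed.columns))
```

Columns such as `year` and `inflation_food` pass through unchanged because their first segment is not one of the four single-letter prefixes.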

In this step I sort the data by province and then by year. Sorting (unlike a `groupby`) keeps every row and only rearranges them so that:
All rows from the same province sit together
Within each province, rows appear in year order

In [None]:
df = df.sort_values(['provinces', 'year']).reset_index(drop=True)
df
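A small sketch of the sort on toy data (made-up provinces and prices) shows that no rows are dropped or aggregated; they are only reordered.

```python
import pandas as pd

# Toy data in deliberately mixed order (made-up values)
toy = pd.DataFrame({
    "provinces": ["Nairobi", "Coast", "Nairobi", "Coast"],
    "year": [2008, 2008, 2007, 2007],
    "maize_close": [21.0, 23.0, 20.0, 22.0],
})

# Sort by province first, then by year within each province
toy_sorted = toy.sort_values(["provinces", "year"]).reset_index(drop=True)
print(toy_sorted)
```

`reset_index(drop=True)` replaces the shuffled index with a clean 0..n-1 range and discards the old one.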

# General Trends and Overview

What are the overall trends in food prices **(beans, maize, potatoes, and the food price index)** across Kenya over the years (2007–2025)?

I want to see how prices have changed over time for each commodity across all regions.
That means focusing on averages per year (the national trend) rather than on individual provinces for now.

We'll start by calculating the mean closing price for each commodity per year:

In [None]:
yearly_trends = df.groupby('year')[['beans_close','maize_close','potatoes_1kg_close','food_price_index_close']].mean().reset_index()
yearly_trends
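For intuition, here is the same yearly averaging on a toy DataFrame with made-up prices for two provinces. Each year's value is the plain mean over all rows for that year, so provinces with more rows would weigh more heavily.

```python
import pandas as pd

# Toy data: two provinces, two years (made-up values)
toy = pd.DataFrame({
    "provinces": ["Coast", "Coast", "Nairobi", "Nairobi"],
    "year": [2007, 2008, 2007, 2008],
    "maize_close": [22.0, 23.0, 20.0, 21.0],
})

# One row per year, averaging the closing price over all provinces
toy_yearly = toy.groupby("year")[["maize_close"]].mean().reset_index()
print(toy_yearly)
```

For 2007 the result is (22.0 + 20.0) / 2 = 21.0, and for 2008 it is (23.0 + 21.0) / 2 = 22.0.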

# Visualize the trends

## Overall Food Price Trends in Kenya (2007–2025):

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

plt.figure(figsize=(12, 6))
sns.lineplot(data=yearly_trends, x='year', y='beans_close', label='Beans')
sns.lineplot(data=yearly_trends, x='year', y='maize_close', label='Maize')
sns.lineplot(data=yearly_trends, x='year', y='potatoes_1kg_close', label='Potatoes')
sns.lineplot(data=yearly_trends, x='year', y='food_price_index_close', label='Food_Price_Index')

plt.title('Overall Food Price Trends in Kenya (2007–2025)', fontsize=14)
plt.xlabel('Year')
plt.ylabel('Average Closing Price')
plt.legend(title='Commodity')
plt.grid(True, linestyle='--', alpha=0.5)
plt.tight_layout()
plt.show()

## Breakdown based on the line chart above

### 1. Upward Trend Overall:
All commodities — **beans**, **maize**, and **potatoes** — show a gradual increase in average prices from 2007 through around 2023, followed by a slight decline toward 2025.  
This suggests **long-term inflationary pressure on food prices in Kenya**.
### 2. Beans are consistently the most expensive:
The **blue line (beans)** remains well above maize and potatoes throughout the period.
This may reflect both higher production costs and strong demand for beans as a protein source.
### 3. Parallel movement between maize and potatoes
The **orange (maize)** and **green (potatoes)** lines move roughly together, meaning price changes for one often coincide with changes in the other.
This may reflect **shared market influences**, such as weather conditions or fuel prices, that affect all staple crops.
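One way to put a number on "move roughly together" is a Pearson correlation between the two yearly series. The sketch below uses made-up yearly averages, not the real `yearly_trends` values. Because two series that both trend upward correlate strongly almost by default, it also correlates the year-over-year changes.

```python
import pandas as pd

# Made-up yearly averages standing in for yearly_trends
toy_trends = pd.DataFrame({
    "year": [2019, 2020, 2021, 2022],
    "maize_close": [20.0, 22.0, 25.0, 24.0],
    "potatoes_1kg_close": [10.0, 11.0, 13.0, 12.0],
})

# Correlation of price levels
level_corr = toy_trends["maize_close"].corr(toy_trends["potatoes_1kg_close"])

# Correlation of year-over-year changes (removes the shared upward trend)
changes = toy_trends[["maize_close", "potatoes_1kg_close"]].diff().dropna()
change_corr = changes["maize_close"].corr(changes["potatoes_1kg_close"])

print(f"Level correlation:  {level_corr:.2f}")
print(f"Change correlation: {change_corr:.2f}")
```

A high correlation of changes would support the shared-influence reading more than a high correlation of levels alone, though neither identifies the cause.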
### 4. Food Price Index follows the same direction:
Even though its values are smaller in scale, the **Food Price Index (red line)** mirrors the general direction of the other commodities.
It acts as a **summary indicator** of overall food inflation, showing peaks and troughs that align with the crops’ price changes.
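Because the index sits on a much smaller scale than the commodity prices, one option for comparing direction is to rebase every series so that its first year equals 100. This is a sketch on made-up numbers, not the approach used in the chart above.

```python
import pandas as pd

# Made-up yearly averages on very different scales
toy_trends = pd.DataFrame({
    "year": [2007, 2008, 2009],
    "beans_close": [80.0, 100.0, 120.0],
    "food_price_index_close": [1.0, 1.25, 1.5],
})

value_cols = ["beans_close", "food_price_index_close"]

# Divide each column by its first-year value and scale to 100
rebased = toy_trends[value_cols] / toy_trends[value_cols].iloc[0] * 100
rebased.insert(0, "year", toy_trends["year"])
print(rebased)
```

After rebasing, both series read as "percent of the 2007 level", so they can share one axis without the index flattening out near zero.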
### 5. Notable peaks (2022–2023):
There’s a sharp spike across all commodities around 2022–2023, likely due to **global and local disruptions** e.g., drought, COVID-19 aftereffects, global supply chain issues or elections.
After this spike, prices dip slightly toward 2025, suggesting a partial recovery or stabilization.
### 6. Notable dips:
There are **two visible dips**, around **2010** and **2018**, across most commodities.  
These dips fall in or just after **politically significant periods in Kenya**: the constitutional referendum was held in 2010, and the 2018 dip follows the 2017 general election.  
Such events may influence food prices through **market disruptions, political uncertainty, and short-term policy changes** that affect production and distribution, although the timing alone does not establish a cause.
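Peaks and dips read off a chart can be cross-checked with year-over-year percentage changes. The sketch below uses made-up yearly averages to show the idea: the largest positive change marks the sharpest rise, and the most negative change marks the sharpest drop.

```python
import pandas as pd

# Made-up yearly averages standing in for yearly_trends
toy_trends = pd.DataFrame({
    "year": [2020, 2021, 2022, 2023],
    "maize_close": [40.0, 42.0, 56.0, 49.0],
})

# Percentage change from the previous year (the first year has no predecessor)
yoy = toy_trends.set_index("year")["maize_close"].pct_change() * 100

print(yoy.round(1))
print("Sharpest rise:", yoy.idxmax())
print("Sharpest drop:", yoy.idxmin())
```

Applied per commodity to the real `yearly_trends` table, this would show whether the spike and dips land in the same years for every series.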
