# **Towards a Resilient Grid: Strategies for Variable Renewable Energy in Kenya**
##### *Data-Driven Solutions for a Reliable and Green Energy Future.*

Group Members:
1. Naomi Ngigi
2. Elvis Kiprono
3. Janine Makorre
4. Trevor Maina
5. Caroline Wachira

DSF 12 FT Remote 

July - August 2025

## Introduction

Kenya is on an ambitious path to achieve universal access to affordable, reliable, and modern energy while transitioning to a low-carbon economy. As outlined in the Kenya National Energy Policy (2025 draft), the National Energy Efficiency and Conservation Strategy (2020), and the 100% Renewable Energy Plan by 2050, the country is investing heavily in clean energy technologies such as geothermal, wind, and solar. Today, more than 80% of Kenya’s electricity already comes from renewable sources—making it one of the global leaders in sustainable power generation.

However, this transition faces challenges, including balancing variable renewable energy sources, ensuring grid stability, and meeting the growing energy demand driven by industrialization and population growth. Forecasting energy demand and optimizing renewable integration are critical to preventing outages, reducing reliance on thermal energy, and supporting sustainable economic development.

Our capstone project, “Towards a Resilient Grid: Strategies for Variable Renewable Energy in Kenya,” builds on this vision by developing real-time energy demand forecasting and grid optimization tools. By leveraging historical energy generation and consumption data, our solution will help Kenya maximize renewable energy utilization, enhance energy reliability, and guide infrastructure investments.

## Problem Statement 

Kenya’s power grid is expanding rapidly, yet it continues to face persistent challenges in balancing rising electricity demand with the variable output of renewable energy sources such as solar, wind, and hydro. Despite the country’s abundant clean energy potential, transmission and distribution networks remain constrained, resulting in frequent outages, underutilization of renewable capacity, and increased dependence on costly thermal power during peak demand periods.

This imbalance drives up system costs, reduces reliability, and undermines efforts to achieve affordable, sustainable, and inclusive electricity access. As energy demand accelerates and climate commitments intensify, there is an urgent need for accurate short- and medium-term demand forecasting, smarter renewable integration, and data-driven grid planning to prioritize upgrades where they will deliver the most impact.

Without these innovations, even high renewable penetration will not translate into a resilient, efficient, or equitable power system. The strategic deployment of intelligent forecasting and optimization tools can unlock Kenya’s clean energy potential—ensuring stability, affordability, and scalability for both urban and rural communities.

## Objectives

1. How can we accurately forecast electricity demand?
2. How can we optimize the share of variable renewables in the grid?
3. Which infrastructure investments reduce grid bottlenecks?


## Key Stakeholders

+ Kenya Power: Cuts outages and costly backups.
+ Energy & Petroleum Regulatory Authority (EPRA): Enables smarter, data-driven regulation.
+ Ministry of Energy: Guides strategic energy investments.
+ Renewable developers (solar farms, wind IPPs): Reduces curtailment, boosts project value.
+ Industrial & commercial users: Ensures reliable, predictable power.
+ Policy makers, donors, investors: Directs funding where impact is highest.

## Description of Dataset

All data used in this analysis is sourced from the Energy and Petroleum Regulatory Authority (EPRA), Kenya’s independent energy regulator. EPRA collects, verifies, and publishes official statistics on electricity generation, sales, customer connections, and grid infrastructure. These datasets provide a reliable foundation for demand forecasting, renewable integration, and grid optimization—supporting Kenya’s clean energy transition.

### Dataset Summary (2019 – April 2025)

We utilize five core EPRA datasets that provide a multi-dimensional view of Kenya’s power system:

### 1. Electricity Generation by Technology
- **Time Range:** Jan 2019 – Apr 2025 (Monthly)  
- **Features:** Energy generation (GWh) by source – Hydro, Thermal, Wind, Geothermal, Bagasse/Biogas, Imports, Solar, Total.

### 2. Installed Renewable Capacity
- **Reference Year:** 2024  
- **Features:** Installed capacity (MW) by technology – Hydro, Geothermal, Wind, Biomass, Solar.

### 3. Transmission & Distribution Infrastructure
- **Time Range:** FY 2019/20 – Apr 2025  
- **Features:** Circuit length (km) by voltage – HV (500 kV to 66 kV), MV (33 kV, 11 kV), LV (415/240 V).

### 4. Grid Connectivity
- **Time Range:** Jul 2019 – Apr 2025 (Monthly)  
- **Features:** New customer connections and cumulative totals.

### 5. Electricity Consumption
- **Time Range:** Jul 2019 – Apr 2025 (Monthly)  
- **Features:** Electricity sales to end-users (GWh).

## Project Approach

1. **Data Collection & Validation**  
   - Load and inspect EPRA datasets (2019–Apr 2025) on generation, consumption, capacity, grid infrastructure, and connectivity.  
   - Clean, standardize units (GWh/MW), handle missing values, and validate data integrity.

2. **Exploratory Data Analysis (EDA)**  
   - Visualize trends in demand vs. generation (line charts, stacked area plots).  
   - Analyze seasonal peaks and renewable contributions over time.  
   - Correlate connectivity and infrastructure expansion with consumption patterns.

3. **Feature Engineering**  
   - Create time-based features (month, quarter, year).  
   - Generate derived metrics: renewable share %, demand-growth rates, grid expansion indicators.

4. **Modeling & Forecasting**  
   - Apply time series models (ARIMA, Prophet, LSTM) for short- and medium-term demand forecasting.  
   - Evaluate models using MAE, RMSE, and MAPE.  
   - Use regression for outage risk scoring or grid stress indicators.

5. **Visualization & Insights**  
   - Build interactive dashboards (Tableau/Power BI) to visualize demand trends, forecasts, and renewable utilization scenarios.

6. **Deployment (MVP)**  
   - Package models into a FastAPI endpoint for real-time forecasting.  
   - Enable energy planners and developers to query future demand and optimize renewable integration.

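The evaluation step names MAE, RMSE, and MAPE. A minimal sketch of how these could be computed with NumPy is shown here; the function name `forecast_errors` and the sample numbers are hypothetical and are not taken from the project's models or results.

```python
import numpy as np


def forecast_errors(actual, predicted):
    """Return MAE, RMSE and MAPE (in percent) for two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if actual.shape != predicted.shape:
        raise ValueError("actual and predicted must have the same shape")
    errors = actual - predicted
    mae = float(np.mean(np.abs(errors)))
    rmse = float(np.sqrt(np.mean(errors ** 2)))
    # MAPE is undefined where actual == 0; monthly sales in GWh are assumed strictly positive.
    mape = float(np.mean(np.abs(errors / actual)) * 100)
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape}


# Illustrative values only: hypothetical actual and forecast monthly sales (GWh).
actual_gwh = [900.0, 950.0, 1000.0]
forecast_gwh = [910.0, 940.0, 1020.0]
scores = forecast_errors(actual_gwh, forecast_gwh)
print(scores)
```

MAE and RMSE are in the units of the series (GWh here), while MAPE is scale-free, which may make it easier to compare models across datasets of different magnitude.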
# Data Understanding

### Libraries and Modules

In [1]:
import pandas as pd

In [17]:
#Loading Datasets

df_consumption = pd.read_excel('epra_data/Electricity_Consumption.xlsx', skiprows=1).set_index('Month').T
df_consumption

| Month | Sales GWh |
| --- | --- |
| 2019-07-01 | 748.236679 |
| 2019-08-01 | 749.914883 |
| 2019-09-01 | 740.489519 |
| 2019-10-01 | 754.343940 |
| 2019-11-01 | 763.255304 |
| ... | ... |
| 2024-12-01 | 886.000000 |
| 2025-01-01 | 957.000000 |
| 2025-02-01 | 916.000000 |
| 2025-03-01 | 943.000000 |

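Because the displayed output is truncated, it does not show whether every month is present. One possible integrity check, sketched here under the assumption that the index holds month-start dates, lists any months missing between the first and last entry. The helper name `missing_months` is hypothetical, and the sample uses rounded values from the rows above with October 2019 deliberately left out to show the check firing.

```python
import pandas as pd


def missing_months(index):
    """Return the month-start dates absent between the first and last entries of `index`."""
    idx = pd.DatetimeIndex(index)
    expected = pd.date_range(idx.min(), idx.max(), freq="MS")  # "MS" = month start
    return expected.difference(idx)


# Hypothetical sample: rounded values from the output above, with 2019-10 removed on purpose.
sample = pd.Series(
    [748.2, 749.9, 740.5, 763.3],
    index=pd.to_datetime(["2019-07-01", "2019-08-01", "2019-09-01", "2019-11-01"]),
    name="Sales GWh",
)
gaps = missing_months(sample.index)
print(list(gaps.strftime("%Y-%m")))
```

On the real data, the same call would be `missing_months(df_consumption.index)`, provided the index has been parsed as dates.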

In [6]:
df_grid_connectivity = pd.read_excel('epra_data/Grid_Connectivity.xlsx')
df_grid_connectivity.head()

|  | Unnamed: 0 | Unnamed: 1 | Unnamed: 2 |
| --- | --- | --- | --- |
| 0 | Period | Number of new customers | Cummulative Connections |
| 1 | 2019-07-01 00:00:00 | 51685 | 7140389 |
| 2 | 2019-08-01 00:00:00 | 44145 | 7183311 |
| 3 | 2019-09-01 00:00:00 | 38045 | 7205121 |
| 4 | 2019-10-01 00:00:00 | 48889 | 7263265 |

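In the output above, the real column names (`Period`, `Number of new customers`, `Cummulative Connections`) sit in row 0 while the columns themselves are labelled `Unnamed: …`. A possible cleanup is sketched here on a small stand-in frame built from the first rows shown. The helper `promote_first_row` is hypothetical, and renaming the misspelled label is a choice made for this sketch, not something taken from the source file.

```python
import pandas as pd

# Hypothetical stand-in for the frame shown above: the real header sits in row 0.
raw = pd.DataFrame({
    "Unnamed: 0": ["Period", "2019-07-01", "2019-08-01"],
    "Unnamed: 1": ["Number of new customers", 51685, 44145],
    "Unnamed: 2": ["Cummulative Connections", 7140389, 7183311],
})


def promote_first_row(df):
    """Use the first row as column names and drop it from the data."""
    out = df.iloc[1:].reset_index(drop=True)
    out.columns = [str(c).strip() for c in df.iloc[0]]
    return out


clean = promote_first_row(raw)
# The label is spelled "Cummulative" in the loaded data; rename it for later use.
clean = clean.rename(columns={"Cummulative Connections": "Cumulative Connections"})
clean["Period"] = pd.to_datetime(clean["Period"])
for col in ["Number of new customers", "Cumulative Connections"]:
    clean[col] = pd.to_numeric(clean[col])
print(clean.dtypes)
```

If the sheet layout is as it appears here, passing `header=1` to `pd.read_excel` might achieve the same result at load time; that has not been checked against the actual file.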

In [18]:
df_av_price = pd.read_excel('epra_data/Average_Retail_Prices.xlsx').set_index('Average retail tariff (KShs/kWh)').T
df_av_price.head()

| Average retail tariff (KShs/kWh) | DC 1 Lifeline (0-30 kWh) | DC 2 Ordinary (30-100 kWh) | DC 3 Ordinary (100-15000 kWh) | Small Commercial 1 (0-30 kWh) | Small Commercial 2 (30kWh-100kWh) | Small Commercial 3 (100kWh-1500kWh) | SC3 Bulk Supply (1000kWh-1500kWh) | Commercial Industrial 1 - 415 V (> 15,000 kWh) | Commercial Industrial 2 - 11,000 V | Commercial Industrial 3 - 33,000 V | Commercial Industrial 4 - 66,000 V | Commercial Industrial 5 - 132,000 V | Commercial Industrial 6 - 220,000 V | Commercial Industrial 7 (SEZs) | E-Mobility | Street Lighting |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2011-01-01 00:00:00 | 21.1076 | 10.3556 | 13.69235 | 17.1548 | 15.9452 |  |  | 13.896878 | 11.3629 | 11.453198 | 10.859353 | 10.364754 |  |  |  | 14.351489 |
| 2011-02-01 00:00:00 | 21.9364 | 11.1844 | 14.52115 | 17.9836 | 16.774 |  |  | 14.725678 | 12.1917 | 12.281998 | 11.688153 | 11.193554 |  |  |  | 15.180289 |
| 2011-03-01 00:00:00 | 23.0452 | 12.2932 | 15.62995 | 19.0924 | 17.8828 |  |  | 15.834478 | 13.3005 | 13.390798 | 12.796953 | 12.302354 |  |  |  | 16.289089 |
| 2011-04-01 00:00:00 | 24.21 | 13.458 | 16.79475 | 20.2572 | 19.0476 |  |  | 16.999278 | 14.4653 | 14.555598 | 13.961753 | 13.467154 |  |  |  | 17.453889 |
| 2011-05-01 00:00:00 | 24.7812 | 14.0292 | 17.36595 | 20.8284 | 19.6188 |  |  | 17.570478 | 15.0365 | 15.126798 | 14.532953 | 14.038354 |  |  |  | 18.025089 |


In [19]:
df_electricity_generation_by_technology = pd.read_excel('epra_data/Electricity_Generation_By_Technology.xlsx').set_index('Energy Purchased (GWh)').T
df_electricity_generation_by_technology

| Energy Purchased (GWh) | HYDRO | Thermal | WIND | GEOTHERMAL | BAGASSE/BIOGAS | IMPORTS | SOLAR | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2019-01-01 | 278.925391 | 114.068527 | 147.551950 | 417.351390 | 0.021523 | 15.176310 | 7.995863 | 981.090954 |
| 2019-02-01 | 253.903281 | 98.556148 | 146.213428 | 374.429163 | 0.013541 | 14.277945 | 7.025732 | 894.419237 |
| 2019-03-01 | 282.826276 | 98.999828 | 143.998739 | 445.103910 | 0.017971 | 17.137795 | 8.073783 | 996.158301 |
| 2019-04-01 | 191.990307 | 181.133055 | 142.211181 | 397.662472 | 0.010936 | 27.224380 | 8.196664 | 948.428995 |
| 2019-05-01 | 242.622647 | 110.332924 | 164.119003 | 427.383580 | 0.003829 | 24.533480 | 7.916970 | 976.912433 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2024-12-01 | 286.308316 | 103.444933 | 148.026413 | 476.336640 | 0.000000 | 128.142403 | 44.878796 | 1187.137500 |
| 2025-01-01 | 282.660656 | 108.388546 | 159.353989 | 483.293050 | 0.000000 | 146.021055 | 37.375105 | 1217.092402 |
| 2025-02-01 | 240.079841 | 113.335233 | 194.406291 | 415.083120 | 0.000000 | 124.121381 | 44.424766 | 1131.450632 |
| 2025-03-01 | 274.267188 | 154.208331 | 168.144407 | 475.698910 | 0.000000 | 120.763716 | 43.968898 | 1237.051450 |

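The derived metrics named in the feature-engineering step (renewable share and time-based features) could be computed from this table roughly as follows. This is a sketch on two rows copied from the output above. Which columns count as renewable or as variable is an assumption made here, and the names `RENEWABLE_COLS`, `VARIABLE_COLS`, and `features` are hypothetical.

```python
import pandas as pd

# Two rows copied from the output above (GWh); the full frame would be used in practice.
generation = pd.DataFrame(
    {
        "HYDRO": [286.308316, 274.267188],
        "Thermal": [103.444933, 154.208331],
        "WIND": [148.026413, 168.144407],
        "GEOTHERMAL": [476.336640, 475.698910],
        "BAGASSE/BIOGAS": [0.0, 0.0],
        "IMPORTS": [128.142403, 120.763716],
        "SOLAR": [44.878796, 43.968898],
        "Total": [1187.137500, 1237.051450],
    },
    index=pd.to_datetime(["2024-12-01", "2025-03-01"]),
)

# Assumption: these columns count as renewable; Thermal and IMPORTS are excluded.
RENEWABLE_COLS = ["HYDRO", "WIND", "GEOTHERMAL", "BAGASSE/BIOGAS", "SOLAR"]
# Assumption: wind and solar are treated as the variable renewables.
VARIABLE_COLS = ["WIND", "SOLAR"]

features = pd.DataFrame(index=generation.index)
features["renewable_share_pct"] = (
    generation[RENEWABLE_COLS].sum(axis=1) / generation["Total"] * 100
)
features["variable_share_pct"] = (
    generation[VARIABLE_COLS].sum(axis=1) / generation["Total"] * 100
)
# Time-based features taken from the monthly date index.
features["month"] = generation.index.month
features["quarter"] = generation.index.quarter
features["year"] = generation.index.year
print(features.round(2))
```

Because `Total` includes imports, the share computed here is relative to all energy purchased; dividing by domestic generation only would give a higher figure.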

In [10]:
df_Installed_Capacity_Renewables = pd.read_excel('epra_data/Installed_Capacity_Renewables.xlsx')
df_Installed_Capacity_Renewables

|  | Technology | Installed Capacity(MW) |
| --- | --- | --- |
| 0 | Hydro | 839.48 |
| 1 | Geothermal | 939.98 |
| 2 | Wind | 435.5 |
| 3 | Biomass | 2.0 |
| 4 | Solar | 210.25 |
| 5 | Total | 2427.21 |

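As a small integrity check on this table, the per-technology capacities can be summed and compared with the reported `Total` row. The sketch below rebuilds the frame from the values shown above; the 0.01 MW tolerance is an arbitrary choice for illustration.

```python
import math

import pandas as pd

# Values copied from the table above (MW).
capacity = pd.DataFrame({
    "Technology": ["Hydro", "Geothermal", "Wind", "Biomass", "Solar", "Total"],
    "Installed Capacity(MW)": [839.48, 939.98, 435.5, 2.0, 210.25, 2427.21],
})

is_total = capacity["Technology"].eq("Total")
component_sum = capacity.loc[~is_total, "Installed Capacity(MW)"].sum()
reported_total = capacity.loc[is_total, "Installed Capacity(MW)"].iloc[0]

# Compare with a small absolute tolerance to absorb floating-point rounding.
totals_match = math.isclose(component_sum, reported_total, abs_tol=0.01)
print(
    f"sum of technologies = {component_sum:.2f} MW, "
    f"reported total = {reported_total:.2f} MW, match = {totals_match}"
)
```

The `Total` row should also be dropped before any per-technology aggregation so that capacity is not counted twice.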

In [21]:
df_Transmission_and_Distribution_Infrastructure = pd.read_excel('epra_data/Transmission_and_Distribution_Infrastructure.xlsx')
df_Transmission_and_Distribution_Infrastructure

Unnamed: 0,"TRANSMISSION AND DISTRIBUTION LINES, CIRCUIT LENGTH IN KILOMETRES",Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6
0,VOLTAGE,2019/20,2020/21,2021/22,2022/23,2023/24,2024/25 upto April 2025
1,500kV HVDC Ketraco,,,,1254,1254,1254
2,400 kV Ketraco,1980.98,1980.98,2030.98,2030.98,2031,2031
3,220kv Ketraco & KenGen links,454,454,724,724,792,792
4,132kv Ketraco,1022.034,1094.034,1094.034,1264.285,1358.285,1452
5,Kplc,,,,,,
6,220 kV,1352.3,1352.3,1352.3,1352.3,1352,1352
7,132 kV,2349.916,2349.916,2349.916,2349.916,2350,2350
8,66 kV,1187.18,1187.18,1188.18,1226.8554,1313,1423
9,33 kV,35703.1119,36569.727,38051,39167.6037,39940,40836
