# 📊 Benin Solar Data - Exploratory Data Analysis (EDA)
**Author:** Temesgen Awoke  
**Date:** 2025-05-19  

This notebook performs data profiling, cleaning, and exploratory analysis on Benin's solar dataset as part of the 10 Academy regional project.

In [None]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import missingno as msno 
import warnings
warnings.filterwarnings('ignore')

sns.set(style='whitegrid')

## 📁 Load Dataset

In [None]:
df = pd.read_csv('benin_solar_data.csv')  # replace with actual file name
df.head()

## 🔍 Data Profiling

In [None]:
print('Shape:', df.shape)
df.info()
# A cell displays only its last expression, so print each summary explicitly
print(df.describe())
print(df.isnull().sum())
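
Raw missing-value counts are easier to judge next to percentages. The sketch below is one way to build such a report; the `missing_report` helper and the synthetic `sample` frame (including the `temperature_c` column) are illustrative assumptions, not part of the Benin dataset.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the solar data; column names and values are assumptions.
sample = pd.DataFrame({
    'timestamp': pd.date_range('2025-01-01', periods=5, freq=pd.Timedelta(hours=1)),
    'energy_output_kwh': [0.0, np.nan, 1.2, np.nan, 2.4],
    'temperature_c': [24.1, 24.0, np.nan, 25.3, 26.0],
})

def missing_report(frame: pd.DataFrame) -> pd.DataFrame:
    """Return the count and percentage of missing values per column, worst first."""
    counts = frame.isnull().sum()
    report = pd.DataFrame({
        'missing': counts,
        'percent': (counts / len(frame) * 100).round(1),
    })
    return report.sort_values('missing', ascending=False)

report = missing_report(sample)
print(report)
```

Calling `missing_report(df)` on the loaded frame would give the same table for the real columns.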

## 🧼 Data Cleaning

In [None]:
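# Normalize column names first so later steps can rely on snake_case names
df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_')

# Convert the date column to datetime and order rows chronologically,
# so forward fill carries values from the preceding reading in time
df['timestamp'] = pd.to_datetime(df['timestamp'])
df = df.sort_values('timestamp').reset_index(drop=True)

# Forward-fill missing values (fillna(method='ffill') is deprecated since pandas 2.1)
df = df.ffill()

# Drop exact duplicate rows
df = df.drop_duplicates()
df.head()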
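
Forward fill copies the last known reading into every gap that follows it, however long the gap is. A minimal sketch of a more cautious variant, assuming hourly readings: sort by time, then cap how far a value may be carried with `limit`. The `raw` frame and the choice of `limit=2` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic, deliberately unsorted readings; values are illustrative only.
raw = pd.DataFrame({
    'timestamp': pd.to_datetime([
        '2025-01-01 02:00', '2025-01-01 00:00', '2025-01-01 01:00',
        '2025-01-01 03:00', '2025-01-01 04:00',
    ]),
    'energy_output_kwh': [np.nan, 1.0, np.nan, np.nan, 5.0],
})

# Sort first so each gap is filled from the reading that precedes it in time.
ordered = raw.sort_values('timestamp').reset_index(drop=True)

# limit=2 carries a value across at most two consecutive missing rows;
# anything beyond that stays NaN and remains visible as a gap.
filled = ordered.ffill(limit=2)
print(filled)
```

Here the 01:00 and 02:00 rows take the 00:00 value, while the 03:00 row stays missing because it is the third consecutive gap.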

## 📈 Exploratory Data Analysis

In [None]:
plt.figure(figsize=(12,6))
plt.plot(df['timestamp'], df['energy_output_kwh'], color='orange')
plt.title('Energy Output Over Time in Benin')
plt.xlabel('Date')
plt.ylabel('Energy Output (kWh)')
plt.grid(True)
plt.show()
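
The line plot above draws one point per record. If the data is recorded more often than once a day, daily totals can be computed with `resample` before plotting. This is a sketch on synthetic hourly data; the `hourly` frame is an assumption, and summing is only appropriate if each row holds the energy produced in its interval.

```python
import pandas as pd

# Synthetic hourly readings over two days; the column name mirrors the notebook.
hourly = pd.DataFrame({
    'timestamp': pd.date_range('2025-01-01', periods=48, freq=pd.Timedelta(hours=1)),
    'energy_output_kwh': [1.0] * 24 + [2.0] * 24,
})

# Index by time, then aggregate each calendar day into a single total.
daily = (
    hourly.set_index('timestamp')['energy_output_kwh']
    .resample('D')
    .sum()
)
print(daily)
```

The resulting `daily` series can be passed to `plt.plot(daily.index, daily.values)` in place of the raw columns.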

In [None]:
plt.figure(figsize=(10,6))
# numeric_only=True skips non-numeric columns, which raise an error in pandas 2.x
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap='coolwarm')
plt.title('Correlation Matrix')
plt.show()
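
A full heatmap can be hard to read when there are many columns. One complementary view, sketched here under assumptions, ranks the numeric columns by the strength of their correlation with the energy output. The `irradiance_wm2`, `cloud_cover_pct`, and `site` columns are hypothetical and stand in for whatever the real dataset contains.

```python
import pandas as pd

# Synthetic data; every column except energy_output_kwh is hypothetical.
sample = pd.DataFrame({
    'energy_output_kwh': [1.0, 2.0, 3.0, 4.0, 5.0],
    'irradiance_wm2': [100.0, 200.0, 300.0, 400.0, 500.0],
    'cloud_cover_pct': [90.0, 70.0, 50.0, 30.0, 10.0],
    'site': ['A', 'A', 'B', 'B', 'B'],  # non-numeric, excluded below
})

# Keep numeric columns only, then read off the target's column of the matrix.
numeric = sample.select_dtypes(include='number')
target_corr = (
    numeric.corr()['energy_output_kwh']
    .drop('energy_output_kwh')               # drop the trivial self-correlation
    .sort_values(key=abs, ascending=False)   # strongest relationships first
)
print(target_corr)
```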

In [None]:
sns.histplot(df['energy_output_kwh'], kde=True)
plt.title('Distribution of Energy Output')
plt.show()

In [None]:
df['month'] = df['timestamp'].dt.month
df['hour'] = df['timestamp'].dt.hour

plt.figure(figsize=(10,5))
sns.boxplot(x='month', y='energy_output_kwh', data=df)
plt.title('Energy Output by Month')
plt.show()
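
The `hour` column derived above also supports a time-of-day view. A minimal sketch, using synthetic readings in place of the real data, averages the output for each hour across all days:

```python
import pandas as pd

# Synthetic readings at three times of day over two days; values are illustrative.
readings = pd.DataFrame({
    'timestamp': pd.to_datetime([
        '2025-01-01 06:00', '2025-01-01 12:00', '2025-01-01 18:00',
        '2025-01-02 06:00', '2025-01-02 12:00', '2025-01-02 18:00',
    ]),
    'energy_output_kwh': [1.0, 5.0, 2.0, 3.0, 7.0, 0.0],
})

# Derive the hour, then average the output per hour across days.
readings['hour'] = readings['timestamp'].dt.hour
hourly_profile = readings.groupby('hour')['energy_output_kwh'].mean()
print(hourly_profile)
```

On the real frame, `df.groupby('hour')['energy_output_kwh'].mean()` would produce the same kind of profile.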

## 💾 Save Cleaned Dataset

In [None]:
df.to_csv('cleaned_benin_solar_data.csv', index=False)
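
CSV files do not store column types, so the `timestamp` column comes back as plain text unless it is parsed again on load. The sketch below shows the round trip with an in-memory buffer standing in for the file; the `cleaned` frame is synthetic.

```python
import io
import pandas as pd

# Synthetic cleaned frame; an in-memory buffer stands in for the CSV file.
cleaned = pd.DataFrame({
    'timestamp': pd.to_datetime(['2025-01-01 00:00', '2025-01-01 01:00']),
    'energy_output_kwh': [0.0, 1.5],
})

buffer = io.StringIO()
cleaned.to_csv(buffer, index=False)

# Without parse_dates, the timestamp column is read back as text.
buffer.seek(0)
plain = pd.read_csv(buffer)

# With parse_dates, the column is restored to datetime on load.
buffer.seek(0)
parsed = pd.read_csv(buffer, parse_dates=['timestamp'])

print(plain.dtypes)
print(parsed.dtypes)
```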

## ✅ Summary of Insights
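- Profiled the dataset's shape, column types, and missing-value counts
- Handled missing values with forward fill and removed duplicate rows
- Examined energy output over time, its distribution, and its spread by month
- Saved the cleaned dataset, ready for comparison with the Sierra Leone and Togo datasets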