# Cross-Country Solar Comparison: Benin, Sierra Leone, and Togo

**Objective:**  
Compare the cleaned solar datasets from three West African countries to identify relative solar potential and key differences.

**Analysis Plan:**
1. Load all three cleaned datasets
2. Compare solar metrics (GHI, DNI, DHI) using boxplots
3. Calculate summary statistics (mean, median, standard deviation)
4. Perform statistical testing (ANOVA/Kruskal-Wallis)
5. Identify key observations and insights
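
As a rough illustration of steps 3 and 4, the sketch below runs the summary statistics and both tests on synthetic values. The site names, means, and spreads are made up for the example and are not drawn from the Benin, Sierra Leone, or Togo data.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic GHI-like values (W/m^2) for three hypothetical sites.
# These are NOT real measurements; they only illustrate the workflow.
rng = np.random.default_rng(42)
samples = {
    "Site A": rng.normal(240, 30, 500),
    "Site B": rng.normal(200, 30, 500),
    "Site C": rng.normal(230, 30, 500),
}

# Step 3: mean, median, and sample standard deviation per site
summary = pd.DataFrame(
    {
        name: {
            "mean": values.mean(),
            "median": np.median(values),
            "std": values.std(ddof=1),
        }
        for name, values in samples.items()
    }
).T

# Step 4: one-way ANOVA (parametric) and Kruskal-Wallis (rank-based)
f_stat, p_anova = stats.f_oneway(*samples.values())
h_stat, p_kw = stats.kruskal(*samples.values())

print(summary)
print(f"ANOVA p-value: {p_anova:.3g}")
print(f"Kruskal-Wallis p-value: {p_kw:.3g}")
```

ANOVA assumes roughly normal groups with similar variances. Kruskal-Wallis drops the normality assumption, which may suit irradiance data with many zero nighttime readings.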

In [1]:
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
import warnings
# Suppress library warnings to keep the notebook output readable
warnings.filterwarnings('ignore')

# Configure display
pd.set_option('display.max_columns', None)
pd.set_option('display.float_format', '{:.2f}'.format)
sns.set_style("whitegrid")
plt.rcParams['figure.dpi'] = 100

print("✓ Libraries imported successfully!")

✓ Libraries imported successfully!


## Step 1: Load Cleaned Datasets

We'll load the three cleaned CSV files:
- `benin_clean.csv`
- `sierraleone_clean.csv`
- `togo-dapaong_qc.csv`
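
Before loading, it can help to confirm that the files are actually present. The following is a minimal sketch; `find_missing` is a hypothetical helper, not part of the original notebook, and the `../data` directory is assumed from the load cell.

```python
from pathlib import Path

# File names come from this notebook; '../data' is assumed from the load cell.
EXPECTED_FILES = ["benin_clean.csv", "sierraleone_clean.csv", "togo-dapaong_qc.csv"]


def find_missing(data_dir, names=EXPECTED_FILES):
    """Return the expected file names that are not present in data_dir."""
    base = Path(data_dir)
    return [name for name in names if not (base / name).is_file()]


missing = find_missing("../data")
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All expected files found.")
```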

In [2]:
# Load cleaned datasets
print("Loading cleaned datasets...")
print("="*60)

benin = pd.read_csv('../data/benin_clean.csv')
print(f"✓ Benin: {benin.shape[0]:,} rows × {benin.shape[1]} columns")

sierraleone = pd.read_csv('../data/sierraleone_clean.csv')
print(f"✓ Sierra Leone: {sierraleone.shape[0]:,} rows × {sierraleone.shape[1]} columns")

togo = pd.read_csv('../data/togo-dapaong_qc.csv')
print(f"✓ Togo: {togo.shape[0]:,} rows × {togo.shape[1]} columns")

# Add country identifier to each dataset
benin['Country'] = 'Benin'
sierraleone['Country'] = 'Sierra Leone'
togo['Country'] = 'Togo'

print("\n✓ All datasets loaded with country identifiers!")

Loading cleaned datasets...
============================================================
✓ Benin: 525,600 rows × 19 columns
✓ Sierra Leone: 525,600 rows × 19 columns
✓ Togo: 525,600 rows × 19 columns

✓ All datasets loaded with country identifiers!
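
With a `Country` column on each frame, the three datasets can be stacked into a single table for grouped plots and statistics. The sketch below uses tiny stand-in frames with invented values so that it runs on its own; the `*_demo` names are hypothetical, and the real frames would be `benin`, `sierraleone`, and `togo`.

```python
import pandas as pd

# Tiny stand-in frames with made-up values; the real data comes from the CSV files.
benin_demo = pd.DataFrame({"GHI": [100.0, 250.0], "DNI": [80.0, 200.0], "DHI": [40.0, 60.0]})
sierraleone_demo = pd.DataFrame({"GHI": [90.0, 210.0], "DNI": [60.0, 150.0], "DHI": [45.0, 70.0]})
togo_demo = pd.DataFrame({"GHI": [110.0, 240.0], "DNI": [85.0, 190.0], "DHI": [42.0, 65.0]})

frames = {"Benin": benin_demo, "Sierra Leone": sierraleone_demo, "Togo": togo_demo}

# Tag each frame with its country and stack them row-wise
combined = pd.concat(
    [df.assign(Country=name) for name, df in frames.items()],
    ignore_index=True,
)

# Long form: one row per (Country, Metric) reading, convenient for seaborn boxplots
long_form = combined.melt(
    id_vars="Country",
    value_vars=["GHI", "DNI", "DHI"],
    var_name="Metric",
    value_name="Irradiance",
)

print(combined.groupby("Country")[["GHI", "DNI", "DHI"]].mean())
```

The long form is one possible layout. Passing `x="Metric"`, `y="Irradiance"`, and `hue="Country"` to `sns.boxplot` would then draw all three metrics in one figure.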
