In [None]:
# Cross-Country Solar Potential Comparison

This notebook combines the cleaned datasets from Benin, Sierra Leone, and Togo to compare their solar potential using global horizontal irradiance (GHI), direct normal irradiance (DNI), and diffuse horizontal irradiance (DHI). Boxplots, a summary table, and a statistical test highlight the key differences to inform MoonLight Energy's investment decisions.

In [6]:
import sys
import os
sys.path.append(os.path.abspath('..'))  # Adds project root to sys.path

import pandas as pd
from scripts.viz_utils import (
    load_and_combine_data,
    plot_metric_boxplots,
    create_summary_table,
    run_kruskal_wallis,
    plot_ghi_ranking
)

# Load data
df = load_and_combine_data(
    '../data/benin_clean.csv',
    '../data/sierraleone_clean.csv',
    '../data/togo_clean.csv'
)
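The implementation of `load_and_combine_data` lives in `scripts/viz_utils.py` and is not shown in this notebook. As a rough idea of what such a helper might do, here is a minimal sketch that reads one CSV per country, tags each row with a `Country` column, and concatenates the results. The function name `combine_country_csvs`, its dictionary argument, and the toy CSV contents are assumptions made for illustration, not the project's actual code.

```python
import io

import pandas as pd


def combine_country_csvs(sources):
    """Hypothetical helper: `sources` maps a country name to a CSV path or file-like object."""
    frames = []
    for country, src in sources.items():
        frame = pd.read_csv(src)
        frame["Country"] = country  # label every row with its country of origin
        frames.append(frame)
    # Stack the per-country frames into one long table with a fresh index
    return pd.concat(frames, ignore_index=True)


# Toy in-memory data stands in for the real cleaned CSV files
demo = combine_country_csvs({
    "Benin": io.StringIO("GHI,DNI,DHI\n100,50,40\n200,150,60\n"),
    "Togo": io.StringIO("GHI,DNI,DHI\n120,70,45\n"),
})
print(demo)
```

A single long table with a `Country` column makes the later per-country grouping a one-line `groupby`.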

In [None]:
## Metric Comparison

Boxplots visualize the distribution of GHI, DNI, and DHI across countries, highlighting variability and central tendencies.

In [None]:
# Plot boxplots
plot_metric_boxplots(df, ['GHI', 'DNI', 'DHI'], 'figures/compare_boxplots.png')

![Boxplots](figures/compare_boxplots.png)
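To read the boxplots numerically, it can help to compute the statistics a box encodes. The sketch below derives the quartiles, the interquartile range (IQR), and the 1.5 × IQR fences per country; with Matplotlib's default settings, whiskers reach the most extreme data point inside those fences. The helper name `boxplot_stats` and the toy data are illustrative assumptions and are not part of `scripts.viz_utils`.

```python
import pandas as pd


def boxplot_stats(df, metric, group_col="Country"):
    """Hypothetical helper: per-group quartiles, IQR, and 1.5 * IQR fences."""
    q = (
        df.groupby(group_col)[metric]
        .quantile([0.25, 0.5, 0.75])
        .unstack()  # one row per group, one column per quantile
    )
    q.columns = ["q1", "median", "q3"]
    q["iqr"] = q["q3"] - q["q1"]
    # Points outside these fences are the ones usually drawn as outliers
    q["lower_fence"] = q["q1"] - 1.5 * q["iqr"]
    q["upper_fence"] = q["q3"] + 1.5 * q["iqr"]
    return q


# Toy data, not the real measurements
toy = pd.DataFrame({
    "Country": ["A"] * 5 + ["B"] * 5,
    "GHI": [0, 1, 2, 3, 4, 10, 20, 30, 40, 50],
})
stats = boxplot_stats(toy, "GHI")
print(stats)
```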

In [None]:
## Summary Table

The table below compares the mean, median, and standard deviation of GHI, DNI, and DHI (all in W/m²) for each country.

In [10]:
# Create and display summary table
summary_table = create_summary_table(df, ['GHI', 'DNI', 'DHI'])
display(summary_table)
summary_table.to_csv('figures/summary_table.csv', index=False)

| | Country | GHI_Mean | GHI_Median | GHI_Std | DNI_Mean | DNI_Median | DNI_Std | DHI_Mean | DHI_Median | DHI_Std |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Benin | 236.234508 | 0.7 | 328.288787 | 166.896673 | -0.1 | 262.081997 | 111.656477 | 0.5 | 153.099749 |
| 1 | Sierra Leone | 185.000024 | -0.4 | 279.01946 | 104.128943 | -0.1 | 200.954182 | 108.104278 | -0.6 | 153.691835 |
| 2 | Togo | 223.859675 | 0.5 | 317.306277 | 147.975931 | 0.0 | 247.68369 | 112.781249 | 1.5 | 151.571744 |
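A table with this column layout (`GHI_Mean`, `GHI_Median`, `GHI_Std`, and so on) can be produced with a pandas `groupby` followed by `agg`. The sketch below is one plausible way to do it; `summary_table_sketch` and the toy data are assumptions, and the real `create_summary_table` may differ. The toy values include zeros to show how night-time-like readings pull the median towards zero while the mean stays well above it.

```python
import pandas as pd


def summary_table_sketch(df, metrics, group_col="Country"):
    """Hypothetical helper: mean, median, and std of each metric per group."""
    agg = df.groupby(group_col)[metrics].agg(["mean", "median", "std"])
    # Flatten the (metric, statistic) MultiIndex into names such as "GHI_Mean"
    agg.columns = [f"{metric}_{stat.capitalize()}" for metric, stat in agg.columns]
    return agg.reset_index()


# Toy data, not the real measurements
toy = pd.DataFrame({
    "Country": ["A", "A", "A", "B", "B", "B"],
    "GHI": [0, 0, 300, 0, 100, 200],
})
table = summary_table_sketch(toy, ["GHI"])
print(table)
```

Both toy groups have the same mean, yet group A's median is zero because most of its readings are zero.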


In [None]:
## Statistical Testing

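The Kruskal-Wallis test is a non-parametric alternative to one-way ANOVA. It tests the null hypothesis that the GHI readings from the three countries come from the same distribution. A p-value below 0.05 leads to rejecting that hypothesis, meaning at least one country's GHI distribution differs from the others.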

In [11]:
# Run Kruskal-Wallis test
stat, p_value = run_kruskal_wallis(df, 'GHI')
print(f"Kruskal-Wallis Test for GHI: Statistic = {stat:.2f}, p-value = {p_value:.4f}")

Kruskal-Wallis Test for GHI: Statistic = 6548.53, p-value = 0.0000
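For reference, the test itself is available as `scipy.stats.kruskal`, which takes one sample per group. The sketch below shows how a wrapper along the lines of `run_kruskal_wallis` could be written; the name `kruskal_by_group` and the toy data are assumptions, and the project's own helper may be implemented differently. Printing the p-value in scientific notation avoids the uninformative `0.0000` that fixed-point formatting gives for very small values.

```python
import pandas as pd
from scipy.stats import kruskal


def kruskal_by_group(df, metric, group_col="Country"):
    """Hypothetical helper: Kruskal-Wallis H-test of `metric` across groups."""
    # One array of non-missing values per group
    samples = [grp[metric].dropna().to_numpy() for _, grp in df.groupby(group_col)]
    result = kruskal(*samples)
    return result.statistic, result.pvalue


# Toy data with three clearly separated groups, not the real measurements
toy = pd.DataFrame({
    "Country": ["A"] * 5 + ["B"] * 5 + ["C"] * 5,
    "GHI": list(range(1, 16)),
})
stat, p_value = kruskal_by_group(toy, "GHI")
print(f"H = {stat:.2f}, p = {p_value:.3e}")
```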


In [None]:
## Key Observations

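- Observation 1: Benin has the highest mean GHI (236.2 W/m²) and mean DNI (166.9 W/m²), followed by Togo (223.9 and 148.0 W/m²). Sierra Leone is lowest on both (185.0 and 104.1 W/m²). Benin also shows the largest GHI spread (std: 328.3 W/m²), and Sierra Leone the smallest (std: 279.0 W/m²).
- Observation 2: Medians are close to zero for every metric in all three countries (between -0.6 and 1.5 W/m²), so at least half of the readings are near zero, most likely night-time records. The means and standard deviations therefore describe the full day-night cycle rather than daylight conditions alone.
- Observation 3: DHI is nearly identical across countries (means of 108.1 to 112.8 W/m², std of 151.6 to 153.7 W/m²), so the gap in solar potential comes mainly from the direct component. The Kruskal-Wallis test (H = 6548.53, p < 0.0001) confirms that the GHI distributions differ significantly between countries.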

## Bonus: GHI Ranking

A bar chart ranks countries by average GHI to highlight top solar potential.

In [12]:
# Plot GHI ranking
plot_ghi_ranking(df, 'figures/ghi_ranking.png')

![GHI Ranking](figures/ghi_ranking.png)
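The ranking behind the chart can be reproduced directly from the `GHI_Mean` column of the summary table. The snippet below is a small sketch using those three reported means; the variable names are illustrative, and `plot_ghi_ranking` itself may compute and draw the ranking differently.

```python
import pandas as pd

# Mean GHI per country (W/m²), copied from the summary table in this notebook
ghi_means = pd.Series(
    {"Benin": 236.234508, "Sierra Leone": 185.000024, "Togo": 223.859675},
    name="GHI_Mean",
)

# Sort from highest to lowest average GHI
ranking = ghi_means.sort_values(ascending=False)

# Relative gap between the top-ranked and bottom-ranked country
relative_gap = (ranking.iloc[0] - ranking.iloc[-1]) / ranking.iloc[-1]

print(ranking)
print(f"Top vs. bottom gap: {relative_gap:.1%}")
```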