# Cross-Country Solar Potential Comparison

This notebook compares solar potential metrics (GHI, DNI, DHI) across Benin, Sierra Leone, and Togo.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

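# Set a seaborn theme for the plots
# (the 'seaborn' matplotlib style name was deprecated in matplotlib 3.6,
# so plt.style.use('seaborn') fails on recent versions)
sns.set_theme(style='darkgrid', palette='husl')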

# Read the cleaned datasets
benin_df = pd.read_csv('../data/benin_clean.csv')
sierraleone_df = pd.read_csv('../data/sierraleone_clean.csv')
togo_df = pd.read_csv('../data/togo_clean.csv')

# Add country column to each dataframe
benin_df['Country'] = 'Benin'
sierraleone_df['Country'] = 'Sierra Leone'
togo_df['Country'] = 'Togo'

# Combine all dataframes
combined_df = pd.concat([benin_df, sierraleone_df, togo_df], ignore_index=True)

## 1. Boxplots of Solar Metrics

In [None]:
# Create boxplots for each metric
metrics = ['GHI', 'DNI', 'DHI']
fig, axes = plt.subplots(1, 3, figsize=(18, 6))

for idx, metric in enumerate(metrics):
    sns.boxplot(data=combined_df, x='Country', y=metric, ax=axes[idx])
    axes[idx].set_title(f'{metric} Distribution by Country')
    axes[idx].set_ylabel(f'{metric} (W/m²)')
    axes[idx].tick_params(axis='x', rotation=45)

plt.tight_layout()
plt.show()

## 2. Summary Statistics

In [None]:
# Calculate summary statistics for each metric by country
summary_stats = combined_df.groupby('Country')[metrics].agg(['mean', 'median', 'std']).round(2)
summary_stats
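
The table returned by `agg` has two-level columns of the form (metric, statistic), which can be awkward to read at first. The sketch below uses a small made-up frame (`toy_df`, with illustrative values that are not the notebook's data) to show one way of pulling a single statistic or a single cell out of such a table.

```python
import pandas as pd

# Toy data standing in for combined_df; the numbers are invented for illustration
toy_df = pd.DataFrame({
    'Country': ['Benin', 'Benin', 'Togo', 'Togo'],
    'GHI': [100.0, 300.0, 150.0, 250.0],
    'DNI': [50.0, 150.0, 80.0, 120.0],
})

toy_stats = (
    toy_df.groupby('Country')[['GHI', 'DNI']]
    .agg(['mean', 'median', 'std'])
    .round(2)
)

# Columns are (metric, statistic) pairs; take the 'mean' slice across all metrics
toy_means = toy_stats.xs('mean', axis=1, level=1)
print(toy_means)

# A single cell is addressed with a (metric, statistic) tuple
benin_ghi_mean = toy_stats.loc['Benin', ('GHI', 'mean')]
print(benin_ghi_mean)
```

The same `xs` and `loc` calls should work on `summary_stats`, assuming it keeps the column layout produced by the cell above.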

## 3. Statistical Testing

In [None]:
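# Perform Kruskal-Wallis H-test for each metric
# (drop missing values first: with the default nan_policy, kruskal
# returns NaN if any input contains NaN)
for metric in metrics:
    benin_data = benin_df[metric].dropna()
    sierraleone_data = sierraleone_df[metric].dropna()
    togo_data = togo_df[metric].dropna()

    h_stat, p_value = stats.kruskal(benin_data, sierraleone_data, togo_data)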
    print(f"{metric} Kruskal-Wallis H-test:")
    print(f"H-statistic: {h_stat:.2f}")
    print(f"p-value: {p_value:.2e}")
    print()
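
A significant Kruskal-Wallis result only says that at least one country differs from the others; it does not say which. One common follow-up, not part of the original analysis, is a set of pairwise Mann-Whitney U tests with a Bonferroni correction. The sketch below runs on synthetic samples (the group names `A`, `B`, `C` and their distributions are assumptions made for illustration) rather than on the country data.

```python
from itertools import combinations

import numpy as np
from scipy import stats

# Synthetic stand-ins for the per-country series; C is shifted upward on purpose
rng = np.random.default_rng(0)
samples = {
    'A': rng.normal(200, 50, 500),
    'B': rng.normal(200, 50, 500),
    'C': rng.normal(260, 50, 500),
}

pairs = list(combinations(samples, 2))
adjusted = {}
for first, second in pairs:
    _, p_value = stats.mannwhitneyu(
        samples[first], samples[second], alternative='two-sided'
    )
    # Bonferroni: multiply by the number of comparisons, cap at 1
    adjusted[(first, second)] = min(p_value * len(pairs), 1.0)
    print(f"{first} vs {second}: adjusted p-value = {adjusted[(first, second)]:.2e}")
```

To apply this to the notebook's data, the dictionary could instead hold `benin_df[metric].dropna()` and the equivalent series for the other two countries. With very large samples, even small differences tend to come out as significant, so the effect size is worth checking alongside the p-value.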

## 4. Key Observations

1. **Benin shows the highest solar potential** with the highest mean GHI (236.23 W/m²) and DNI (166.90 W/m²) values among the three countries. This suggests that Benin has the most favorable conditions for both direct and global solar radiation.

2. **Sierra Leone exhibits the lowest solar potential** with clearly lower mean GHI (185.00 W/m²) and DNI (104.13 W/m²) values than the other two countries. However, its mean DHI (108.10 W/m²) is comparable to that of Benin and Togo, indicating that while direct sunlight is lower, diffuse radiation is at a similar level.

3. **Togo shows moderate but consistent solar potential** with mean GHI (223.86 W/m²) and DNI (147.98 W/m²) values falling between Benin and Sierra Leone. The country shows the highest DHI values (112.78 W/m²) and the lowest standard deviation in daily averages, suggesting more stable solar conditions throughout the year.
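
The stability claim in observation 3 rests on the standard deviation of daily averages, which the cells above do not compute. A minimal sketch of that calculation follows, assuming each cleaned dataset has a datetime column (called `Timestamp` here, which is an assumption about the schema). It uses a toy frame with invented values so that it runs on its own.

```python
import numpy as np
import pandas as pd

# Two days of hourly readings for two hypothetical locations, X and Y
hours = pd.Timestamp('2022-01-01') + pd.to_timedelta(np.arange(48), unit='h')
toy_df = pd.DataFrame({
    'Timestamp': list(hours) * 2,
    'Country': ['X'] * 48 + ['Y'] * 48,
    # X swings between days (100 then 300); Y barely changes (190 then 210)
    'GHI': [100.0] * 24 + [300.0] * 24 + [190.0] * 24 + [210.0] * 24,
})

# Step 1: average GHI per country per calendar day
daily_mean = (
    toy_df
    .groupby(['Country', toy_df['Timestamp'].dt.date])['GHI']
    .mean()
)

# Step 2: spread of those daily averages within each country
daily_std = daily_mean.groupby(level='Country').std().round(2)
print(daily_std)
```

A lower value in `daily_std` means the daily averages vary less from day to day. On the real data, `combined_df` would replace `toy_df` after parsing the datetime column with `pd.to_datetime`.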

## 5. Bonus: Country Ranking by Average GHI

In [None]:
# Calculate average GHI by country
avg_ghi = combined_df.groupby('Country')['GHI'].mean().sort_values(ascending=False)

plt.figure(figsize=(10, 6))
sns.barplot(x=avg_ghi.index, y=avg_ghi.values)
plt.title('Average GHI by Country')
plt.ylabel('Average GHI (W/m²)')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()