# **Data-Driven Insights for Solar Energy: Analysis, Findings, and Strategic Recommendations (Benin, Sierra Leone, and Togo)**

## **Business Overview**

MoonLight Energy Solutions aims to develop a strategic approach to significantly enhance its operational efficiency and sustainability through targeted solar investments. As an Analytics Engineer at MoonLight Energy Solutions, my task is to perform a quick analysis of the environmental measurements provided by the engineering team and translate my observations into a strategy report.

## **Understanding the Business Objective**

### **Why Analyze Solar Data?**

MoonLight Energy Solutions aims to **enhance operational efficiency and sustainability** by strategically investing in solar energy. Our primary objective is to:

- **Identify high-potential regions** for solar installations.
- **Analyze the impact of environmental factors** on solar panel performance.
- **Optimize maintenance strategies** to improve energy output.


### **Dataset Overview**  

The dataset contains solar radiation measurements and weather conditions, along with their impact on solar energy generation. Each row represents an observation recorded at a specific time, with key variables covering solar irradiance, temperature, pressure, humidity, precipitation, wind conditions, and sensor readings of radiation and temperature.

#### **Key Variables:**  
- **Timestamp (yyyy-mm-dd hh:mm):** Date and time of each recorded observation.  
- **Solar Irradiance Metrics:**  
  - **GHI (W/m²):** Global Horizontal Irradiance—total solar radiation on a horizontal surface.  
  - **DNI (W/m²):** Direct Normal Irradiance—solar radiation received per unit area perpendicular to sunlight.  
  - **DHI (W/m²):** Diffuse Horizontal Irradiance—solar radiation received indirectly due to scattering.  
  - **ModA / ModB (W/m²):** Irradiance measurements from specific sensors or modules.  
- **Environmental Conditions:**  
  - **Tamb (°C):** Ambient temperature.  
  - **RH (%):** Relative humidity.  
  - **BP (hPa):** Barometric pressure.  
- **Wind Data:**  
  - **WS (m/s):** Wind speed.  
  - **WSgust (m/s):** Maximum wind gust speed.  
  - **WSstdev (m/s):** Standard deviation of wind speed (variability).  
  - **WD (°N):** Wind direction in degrees from north.  
  - **WDstdev:** Standard deviation of wind direction.  
- **Cleaning & Precipitation:**  
  - **Cleaning (1/0):** Indicates whether a cleaning event occurred.  
  - **Precipitation (mm/min):** Rainfall rate measured in millimeters per minute.  
- **Module Temperature:**  
  - **TModA / TModB (°C):** Temperatures of individual solar modules.  
- **Comments:** A column for additional observations or notes.  

This dataset serves as a foundation for analyzing solar energy potential, identifying trends, and optimizing maintenance strategies.
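
As a quick illustration of the variable list above, the sketch below checks whether a DataFrame carries the expected columns. The column names are taken from the list; their exact spelling in the CSV headers is an assumption, and `missing_columns` is a hypothetical helper, not part of the project code. The sample values are made up purely for demonstration.

```python
import pandas as pd

# Column names copied from the variable list above.
# Assumption: the CSV headers use exactly these spellings.
EXPECTED_COLUMNS = [
    "Timestamp", "GHI", "DNI", "DHI", "ModA", "ModB",
    "Tamb", "RH", "BP",
    "WS", "WSgust", "WSstdev", "WD", "WDstdev",
    "Cleaning", "Precipitation", "TModA", "TModB", "Comments",
]


def missing_columns(df: pd.DataFrame, expected=EXPECTED_COLUMNS) -> list:
    """Return the expected column names that are absent from df, in list order."""
    return [col for col in expected if col not in df.columns]


# Toy frame with made-up values, only to demonstrate the check
sample = pd.DataFrame(
    {"Timestamp": ["2021-08-09 00:01"], "GHI": [0.0], "DNI": [0.0]}
)
print(missing_columns(sample))
```

Running a check like this right after loading each file would surface renamed or dropped columns before any analysis depends on them.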


# **Environment Setup**
Before diving into the analysis, we need to import all the necessary packages and modules.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from scipy.stats import zscore

# **Data Loading**
We start the analysis by loading the cleaned dataset for each country.


In [None]:
countries = ['benin', 'sierraleone', 'togo']

def load_data(country: str) -> pd.DataFrame:
    """Load the processed CSV for one country and tag each row with its country."""
    df = pd.read_csv(f"../data/processed/{country}_clean.csv")
    df['country'] = country  # keep each row's origin after merging
    return df

# Load the data and merge into a single DataFrame
full_df = pd.concat([load_data(country) for country in countries], ignore_index=True)
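
After merging, it can help to confirm that every country contributed rows. The sketch below shows one way to do that on a toy frame built the same way as the merged DataFrame. `summarize_by_country` is a hypothetical helper, the GHI values are made up, and the only assumption about the real data is that it has `GHI` and `country` columns as described above.

```python
import pandas as pd


def summarize_by_country(df: pd.DataFrame) -> pd.DataFrame:
    """Return the row count and mean GHI for each country."""
    return df.groupby("country").agg(
        rows=("GHI", "size"),
        mean_ghi=("GHI", "mean"),
    )


# Toy stand-in for the merged frame; values are illustrative only
toy = pd.concat(
    [
        pd.DataFrame({"GHI": [100.0, 200.0]}).assign(country="benin"),
        pd.DataFrame({"GHI": [300.0]}).assign(country="togo"),
    ],
    ignore_index=True,
)
summary = summarize_by_country(toy)
print(summary)
```

Applied to the merged DataFrame, a country that is missing from the summary or shows an unexpectedly low row count would point to a loading problem.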
