# 🌞 Solar Weather Prediction Project
Welcome to the starter notebook for predicting solar irradiance using weather data.

## 🧠 Problem Definition
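This project treats solar forecasting as a regression problem on daily weather data.
- Goal: Predict solar irradiance for Karachi using past weather data. Until irradiance measurements are available, the notebook uses sunshine duration (`tsun`) as a stand-in target.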
- Use: Help solar companies forecast energy output

## 📥 Data Collection
We'll use the Meteostat Python library to fetch historical daily weather data for Karachi.

In [None]:

from meteostat import Point, Daily
from datetime import datetime
import pandas as pd

# Define location (Karachi): latitude, longitude, altitude in meters
karachi = Point(24.8607, 67.0011, 8)

# Date range
start = datetime(2023, 7, 1)
end = datetime(2024, 7, 1)

# Fetch data
data = Daily(karachi, start, end)
df = data.fetch()

# Show first rows
df.head()


## 🧹 Data Cleaning
Clean null values, format dates, and prepare the data.

In [None]:

# Check for missing values
print(df.isnull().sum())

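# Drop every row that has at least one missing value.
# Caution: if a column has no data at all for this station, this removes every row.
df = df.dropna()

# Move the 'time' index into a regular column
df = df.reset_index()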
df.head()
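
Dropping every row with a gap can throw away a large share of the data. The following is a minimal sketch of a gentler alternative, run on a small synthetic frame rather than real Meteostat output. The helper name `clean_weather` and the two-day gap limit are assumptions made for illustration, and whether interpolation is appropriate depends on the variable (it is questionable for precipitation, for example).

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Meteostat frame (values are made up for illustration)
dates = pd.date_range("2023-07-01", periods=6, freq="D", name="time")
sample = pd.DataFrame(
    {
        "tavg": [31.0, np.nan, 32.0, 33.0, np.nan, 31.5],
        "prcp": [0.0, 0.0, np.nan, 1.2, 0.0, 0.0],
        "snow": [np.nan] * 6,  # a column with no data at all
    },
    index=dates,
)


def clean_weather(frame, max_gap=2):
    """Drop empty columns, interpolate short gaps, then drop rows still missing values."""
    cleaned = frame.dropna(axis=1, how="all")
    # Time-weighted linear interpolation, filling at most `max_gap` consecutive gaps
    cleaned = cleaned.interpolate(method="time", limit=max_gap)
    return cleaned.dropna()


cleaned = clean_weather(sample)
print(cleaned)
```

Here the all-empty `snow` column is removed first, so it no longer forces every row to be dropped.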


## 📊 EDA (Exploratory Data Analysis)
Visualize trends and understand correlations.

In [None]:

import matplotlib.pyplot as plt

# Plot temperature over time
plt.figure(figsize=(12, 4))
plt.plot(df['time'], df['tavg'])
plt.title('Average Temperature Over Time')
plt.xlabel('Date')
plt.ylabel('Temperature (°C)')
plt.grid(True)
plt.show()
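
The temperature plot covers trends; for the correlation part, one possible starting point is a pairwise correlation table. The sketch below uses synthetic data with made-up values, in which `tmax` is deliberately built to follow `tavg` and `wspd` is independent, so the numbers say nothing about Karachi's actual weather.

```python
import numpy as np
import pandas as pd

# Synthetic data standing in for the cleaned Meteostat frame (made-up values)
rng = np.random.default_rng(0)
tavg = rng.normal(30, 3, size=200)
demo = pd.DataFrame(
    {
        "tavg": tavg,
        "tmax": tavg + rng.normal(5, 1, size=200),  # built to track tavg closely
        "wspd": rng.normal(12, 4, size=200),        # independent of temperature
    }
)

# Pairwise Pearson correlations between the numeric columns
corr = demo.corr()
print(corr.round(2))

# Rank the other columns by absolute correlation with one column of interest
ranked = corr["tavg"].drop("tavg").abs().sort_values(ascending=False)
print(ranked)
```

On the real frame, the same two calls could be pointed at the target column to see which weather variables move with it.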


## 🧠 Model Training
Train a machine learning model to predict solar irradiance.

In [None]:

from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error

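# Example features and target
features = ['tavg', 'tmin', 'tmax', 'prcp', 'wspd']
target = 'tsun'  # sunshine duration in minutes (replace with irradiance if available)

# Keep only rows where every feature and the target are present
model_df = df.dropna(subset=features + [target])
X = model_df[features]
y = model_df[target]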

# Split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train
model = RandomForestRegressor(random_state=42)  # fixed seed for reproducible results
model.fit(X_train, y_train)

# Predict
y_pred = model.predict(X_test)

# Evaluate
mae = mean_absolute_error(y_test, y_pred)
# Take the square root manually; the `squared` argument is deprecated in recent scikit-learn releases
rmse = mean_squared_error(y_test, y_pred) ** 0.5
print(f"MAE: {mae:.2f}, RMSE: {rmse:.2f}")
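
`train_test_split` shuffles rows by default, so test days end up scattered among training days. For a forecasting use case it may be more realistic to hold out the most recent days instead. The sketch below shows one way to do that with pandas on a synthetic frame; the helper name `chronological_split` is hypothetical, and passing `shuffle=False` to `train_test_split` on time-sorted data is a comparable option.

```python
import numpy as np
import pandas as pd


def chronological_split(frame, time_col="time", test_size=0.2):
    """Return (train, test) where test holds the most recent share of rows."""
    ordered = frame.sort_values(time_col).reset_index(drop=True)
    n_test = max(1, round(len(ordered) * test_size))
    cut = len(ordered) - n_test
    return ordered.iloc[:cut], ordered.iloc[cut:]


# Synthetic daily frame (made-up values) standing in for the cleaned data
demo = pd.DataFrame(
    {
        "time": pd.date_range("2023-07-01", periods=10, freq="D"),
        "tavg": np.linspace(28.0, 32.5, 10),
    }
)

train, test = chronological_split(demo)
print(len(train), len(test))

# Every training day comes before every test day
print(train["time"].max() < test["time"].min())
```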


## 💾 Save Model
Save the trained model for later use.

In [None]:

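import os
import joblib

# Save model (create the target directory first so the dump does not fail)
os.makedirs('models', exist_ok=True)
joblib.dump(model, 'models/solar_model.pkl')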
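
To use a saved model later, it has to be loaded and given input with the same feature columns it was trained on. The sketch below walks through that round trip with a tiny stand-in model trained on random numbers and written to a temporary directory, so it runs on its own. The names `demo_model` and `new_day` and all values are made up for illustration.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

FEATURES = ["tavg", "tmin", "tmax", "prcp", "wspd"]

# Tiny stand-in model trained on random numbers, only to demonstrate the round trip
rng = np.random.default_rng(0)
X_demo = pd.DataFrame(rng.random((50, len(FEATURES))), columns=FEATURES)
y_demo = rng.random(50)
demo_model = RandomForestRegressor(n_estimators=10, random_state=0).fit(X_demo, y_demo)

# Save to a temporary location, then load it back as a later session would
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "solar_model.pkl")
    joblib.dump(demo_model, path)
    loaded = joblib.load(path)

# One new day of (made-up) weather, with the same column names and order as training
new_day = pd.DataFrame([[30.1, 26.4, 34.8, 0.0, 14.2]], columns=FEATURES)
prediction = loaded.predict(new_day)

# The reloaded model should reproduce the original model's prediction
same = np.allclose(prediction, demo_model.predict(new_day))
print(prediction.shape, same)
```

With the real model, the load path would be `models/solar_model.pkl` and `new_day` would come from fresh weather data.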


## 📤 Deployment (Optional)
Later you can build a Streamlit dashboard to use this model interactively.