# PV Power Forecasting with LSTM

This notebook demonstrates how to use the LSTM-based model for photovoltaic (PV) power forecasting. The model predicts power output for the next 24 hours based on historical data and weather forecasts.

## Overview

The forecasting process involves:
1. Loading a pre-trained LSTM model and corresponding scalers
2. Obtaining historical data for the initial sequence
3. Fetching weather forecast data for the prediction horizon
4. Processing and scaling the data appropriately
5. Iteratively predicting the power output one hour at a time
6. Visualizing and analyzing the results

Let's begin by importing the required modules.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta
import os

# Import our custom inference module
from lstm_lowres_inference import LSTMLowResInference

# Set plotting style
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)
plt.rcParams['font.size'] = 12

## 1. Initialize the Forecaster

First, we create an instance of the `LSTMLowResInference` class. This will automatically load the most recent trained model and its corresponding scalers.

In [None]:
# Create forecaster with default sequence length (24 hours)
forecaster = LSTMLowResInference()

# Print model information
print("Model summary:")
forecaster.model.summary()

## 2. Examine Historical Data

Let's load and explore the historical data that will be used as the initial sequence for our forecast.

In [None]:
# Load the most recent sequence of historical data
historical_data = forecaster.load_historical_data()

# Display the first few rows
print(f"Historical data shape: {historical_data.shape}")
historical_data.head()

In [None]:
# Plot historical power data
plt.figure(figsize=(12, 6))
plt.plot(historical_data.index, historical_data['power_w'], 'b-', linewidth=2)
plt.title('Recent Historical PV Power Output', fontsize=16)
plt.xlabel('Time', fontsize=12)
plt.ylabel('Power (W)', fontsize=12)
plt.grid(True, linestyle='--', alpha=0.7)
plt.gcf().autofmt_xdate()
plt.tight_layout()
plt.show()

## 3. Get Weather Forecast Data

Now we need to fetch weather forecast data for the prediction horizon (next 24 hours).

In [None]:
# Fetch weather forecast for next 24 hours
forecast_data = forecaster.fetch_weather_forecast(hours=24)

# Display forecast data
print(f"Forecast data shape: {forecast_data.shape}")
forecast_data.head()

In [None]:
# Plot forecasted weather parameters
fig, axs = plt.subplots(3, 1, figsize=(12, 12), sharex=True)

# Plot global radiation
axs[0].plot(forecast_data.index, forecast_data['GlobalRadiation [W m-2]'], 'r-', linewidth=2)
axs[0].set_title('Forecasted Global Radiation', fontsize=14)
axs[0].set_ylabel('Radiation (W/m²)', fontsize=12)
axs[0].grid(True, linestyle='--', alpha=0.7)

# Plot temperature
axs[1].plot(forecast_data.index, forecast_data['Temperature [degree_Celsius]'], 'g-', linewidth=2)
axs[1].set_title('Forecasted Temperature', fontsize=14)
axs[1].set_ylabel('Temperature (°C)', fontsize=12)
axs[1].grid(True, linestyle='--', alpha=0.7)

# Plot clear sky index (or cloud cover)
if 'ClearSkyIndex' in forecast_data.columns:
    axs[2].plot(forecast_data.index, forecast_data['ClearSkyIndex'], 'b-', linewidth=2)
    axs[2].set_title('Forecasted Clear Sky Index', fontsize=14)
    axs[2].set_ylabel('Clear Sky Index', fontsize=12)
else:
    axs[2].plot(forecast_data.index, forecast_data['total_cloud_cover'], 'b-', linewidth=2)
    axs[2].set_title('Forecasted Cloud Cover', fontsize=14)
    axs[2].set_ylabel('Cloud Cover', fontsize=12)
axs[2].grid(True, linestyle='--', alpha=0.7)
axs[2].set_xlabel('Time', fontsize=12)

plt.gcf().autofmt_xdate()
plt.tight_layout()
plt.show()

## 4. Process and Prepare Data

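Before the data reaches the model, it needs derived features (cyclical time encodings, a night flag, and the clear sky index) and scaling with the scalers loaded alongside the model. The cell below computes the derived features for both the historical and the forecast data.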
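
The scalers loaded with the model are not called directly in this notebook. As a rough illustration of what scaling involves, the sketch below applies min-max scaling with NumPy and then inverts it, which is the step needed to turn a scaled model output back into watts. The helpers `fit_minmax`, `scale`, and `unscale` are hypothetical and are not part of `lstm_lowres_inference`; the project's scalers may use a different method.

```python
import numpy as np

def fit_minmax(train):
    """Return per-column (min, range) computed on training data only."""
    train = np.asarray(train, dtype=float)
    col_min = train.min(axis=0)
    col_range = train.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0  # avoid division by zero for constant columns
    return col_min, col_range

def scale(x, col_min, col_range):
    """Map values so that the training minimum becomes 0 and the maximum 1."""
    return (np.asarray(x, dtype=float) - col_min) / col_range

def unscale(x_scaled, col_min, col_range):
    """Invert scale(), returning values in their original units."""
    return np.asarray(x_scaled, dtype=float) * col_range + col_min

# Made-up training data: columns are [radiation in W/m², temperature in °C]
train = np.array([[0.0, 5.0], [400.0, 15.0], [800.0, 25.0]])
col_min, col_range = fit_minmax(train)

# New rows are scaled with the training statistics, never re-fitted
new_rows = np.array([[200.0, 10.0], [1000.0, 30.0]])
scaled = scale(new_rows, col_min, col_range)
restored = unscale(scaled, col_min, col_range)
print(scaled)
```

Values outside the training range, such as the second row here, fall outside [0, 1] after scaling. That is expected behavior for min-max scaling rather than an error.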

In [None]:
# Add derived features to historical data
historical_data_with_features = forecaster.calculate_derived_features(historical_data)

# Add derived features to forecast data
forecast_data_with_features = forecaster.calculate_derived_features(forecast_data)

# Display derived features
print("Derived features for historical data:")
historical_data_with_features[['hour_sin', 'hour_cos', 'day_sin', 'day_cos', 'isNight', 'ClearSkyIndex']].head()
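
The column names `hour_sin`, `hour_cos`, `day_sin`, and `day_cos` suggest a cyclical encoding of time, in which hour 23 and hour 0 end up close together instead of far apart. The sketch below shows one common way to compute such features. It is an assumption about what `calculate_derived_features` does, not a copy of it; the function `add_cyclical_time_features` and the chosen periods are hypothetical.

```python
import numpy as np
import pandas as pd

def add_cyclical_time_features(df):
    """Append sine/cosine encodings of hour-of-day and day-of-year.

    Assumes df has a DatetimeIndex. The periods (24 h and 365.25 d) are
    illustrative choices, not taken from the project code.
    """
    out = df.copy()
    hour = out.index.hour.to_numpy() + out.index.minute.to_numpy() / 60.0
    day = out.index.dayofyear.to_numpy()
    out['hour_sin'] = np.sin(2 * np.pi * hour / 24.0)
    out['hour_cos'] = np.cos(2 * np.pi * hour / 24.0)
    out['day_sin'] = np.sin(2 * np.pi * day / 365.25)
    out['day_cos'] = np.cos(2 * np.pi * day / 365.25)
    return out

# One day of hourly timestamps with a placeholder power column
idx = pd.date_range('2024-06-21 00:00', periods=24, freq=pd.Timedelta(hours=1))
demo = add_cyclical_time_features(pd.DataFrame({'power_w': 0.0}, index=idx))
print(demo[['hour_sin', 'hour_cos']].iloc[[0, 6, 12, 18]])
```

Each sine/cosine pair lies on the unit circle, so the model sees midnight as adjacent to 23:00 without needing to learn the wrap-around.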

## 5. Generate 24-Hour Forecast

Now we'll run the forecasting algorithm to predict the next 24 hours of PV power output.

In [None]:
# Run the full forecast pipeline
forecast_results = forecaster.predict_next_24h()

# Display the forecast results
forecaster.display_forecast(forecast_results)
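
The internals of `predict_next_24h()` are not shown here. To make the idea of iterative one-hour-ahead prediction concrete, the sketch below rolls a fixed-length window forward: each prediction is appended to the window together with that hour's weather, and the oldest row is dropped. The function `iterative_forecast`, the column layout, and the stand-in predictor are all assumptions for illustration, and a simple mean replaces the LSTM.

```python
import numpy as np

def iterative_forecast(history, future_weather, predict_one_step):
    """Roll a fixed-length window forward, one step per weather row.

    history:          array (seq_len, n_features); column 0 is power
    future_weather:   array (horizon, n_features - 1) of forecast weather
    predict_one_step: callable mapping a (seq_len, n_features) window to a
                      scalar power prediction (stand-in for the LSTM)
    """
    window = np.array(history, dtype=float)
    predictions = []
    for weather_row in future_weather:
        power = float(predict_one_step(window))
        predictions.append(power)
        # Build the newest row from the prediction plus that hour's weather,
        # then drop the oldest row to keep the window length constant.
        new_row = np.concatenate(([power], weather_row))
        window = np.vstack([window[1:], new_row])
    return np.array(predictions)

def mean_power(window):
    """Toy predictor: average of the power column in the window."""
    return window[:, 0].mean()

# Three past hours (power, one weather feature) and two future weather rows
history = np.array([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
future_weather = np.array([[0.0], [0.0]])
preds = iterative_forecast(history, future_weather, mean_power)
print(preds)
```

Because each prediction becomes an input for the next step, errors can accumulate over the horizon. This is a general property of iterative forecasting and is worth keeping in mind when reading the later hours of the forecast.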

## 6. Visualize the Forecast

Let's create detailed visualizations of the forecast results.

In [None]:
# Plot the standard forecast
forecaster.plot_forecast(forecast_results)

In [None]:
# Create a more detailed visualization
plt.figure(figsize=(14, 8))

# Plot power output
plt.plot(forecast_results.index, forecast_results['power_w'], 'b-', linewidth=3, label='Predicted Power')

# Add an illustrative ±20% band (for demonstration only; not a statistical confidence interval)
upper_bound = forecast_results['power_w'] * 1.2
lower_bound = forecast_results['power_w'] * 0.8
plt.fill_between(forecast_results.index, lower_bound, upper_bound, color='blue', alpha=0.2,
                 label='Illustrative band (±20%)')

# Add day/night shading (fixed 18:00-06:00 window, a rough approximation)
night_labeled = False  # Label only the first shaded span so the legend has one entry
for i in range(len(forecast_results) - 1):
    hour = forecast_results.index[i].hour
    if hour >= 18 or hour < 6:
        plt.axvspan(forecast_results.index[i], forecast_results.index[i + 1],
                    alpha=0.2, color='gray',
                    label=None if night_labeled else 'Night')
        night_labeled = True

# Add annotations for key points
max_idx = forecast_results['power_w'].idxmax()
max_val = forecast_results.loc[max_idx, 'power_w']
plt.annotate(f'Peak: {max_val:.2f} W',
             xy=(max_idx, max_val),
             xytext=(10, 20),
             textcoords='offset points',
             arrowprops=dict(arrowstyle='->', lw=1.5),
             fontsize=12)

# Format plot
plt.title('24-Hour PV Power Forecast with Illustrative ±20% Band', fontsize=16)
plt.xlabel('Time', fontsize=14)
plt.ylabel('Power (W)', fontsize=14)
plt.grid(True, linestyle='--', alpha=0.7)
plt.legend(loc='upper left', fontsize=12)

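# Add total energy over the forecast horizon
# Summing hourly power values (W) gives Wh, assuming 1-hour steps; divide by 1000 for kWh
total_energy = forecast_results['power_w'].sum() / 1000  # kWh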
plt.text(0.02, 0.02, f'Total Energy: {total_energy:.2f} kWh', 
         transform=plt.gca().transAxes,
         bbox=dict(facecolor='white', alpha=0.8, boxstyle='round,pad=0.5'),
         fontsize=12)

# Format x-axis with nicer time labels
plt.gcf().autofmt_xdate()
plt.tight_layout()
plt.show()
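
The `sum() / 1000` conversion above is only valid when each row covers exactly one hour. For data at another resolution, each value has to be weighted by the step length. The sketch below shows one way to do that, assuming a regularly spaced `DatetimeIndex` and that each value is the mean power over one step; `energy_kwh` is a hypothetical helper, not part of the inference module.

```python
import pandas as pd

def energy_kwh(power_w):
    """Convert a regularly sampled power series (W) to energy (kWh).

    Assumes a DatetimeIndex with constant spacing and at least two rows,
    and treats each value as the mean power over one step.
    """
    step_hours = (power_w.index[1] - power_w.index[0]).total_seconds() / 3600.0
    return float(power_w.sum()) * step_hours / 1000.0

# The same constant 500 W day, sampled hourly and every 15 minutes
hourly_idx = pd.date_range('2024-06-21', periods=24, freq=pd.Timedelta(hours=1))
quarter_idx = pd.date_range('2024-06-21', periods=96, freq=pd.Timedelta(minutes=15))
hourly_energy = energy_kwh(pd.Series(500.0, index=hourly_idx))
quarter_energy = energy_kwh(pd.Series(500.0, index=quarter_idx))
print(hourly_energy, quarter_energy)
```

Both calls return the same energy, whereas a plain sum of the 15-minute series would overstate it by a factor of four.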

## 7. Export Forecast Results

We can export the forecast results to various formats for further use.

In [None]:
# Create output directory if it doesn't exist
os.makedirs('data', exist_ok=True)

# Export to CSV
csv_path = 'data/pv_forecast_results.csv'
forecast_results.to_csv(csv_path)
print(f"Forecast exported to CSV: {csv_path}")

# Export to JSON
json_path = 'data/pv_forecast_results.json'
forecast_json = forecast_results.reset_index().to_json(orient='records', date_format='iso')
with open(json_path, 'w') as f:
    f.write(forecast_json)
print(f"Forecast exported to JSON: {json_path}")

## 8. Analyzing Forecast Performance

In a real-world scenario, we would compare the forecast with actual values after the fact to evaluate model performance.

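Here we simulate a comparison by adding Gaussian noise to the forecast to create synthetic "actual" data. Because the errors are injected, the resulting metrics describe the noise rather than the model's skill; the cell only demonstrates the mechanics of such an analysis.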

In [None]:
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Create synthetic "actual" data (for demonstration only)
# In a real scenario, this would be measured data collected after the forecast period
np.random.seed(42)  # For reproducibility
# Gaussian noise with a standard deviation of 10% of the peak forecast power
noise = np.random.normal(0, forecast_results['power_w'].max() * 0.1, len(forecast_results))
actual_data = forecast_results.copy()
actual_data['actual_power_w'] = forecast_results['power_w'] + noise
actual_data['actual_power_w'] = actual_data['actual_power_w'].clip(lower=0)  # Ensure non-negative

# Calculate performance metrics

rmse = np.sqrt(mean_squared_error(actual_data['actual_power_w'], actual_data['power_w']))
mae = mean_absolute_error(actual_data['actual_power_w'], actual_data['power_w'])
r2 = r2_score(actual_data['actual_power_w'], actual_data['power_w'])

# Display metrics
print(f"RMSE: {rmse:.2f} W")
print(f"MAE: {mae:.2f} W")
print(f"R²: {r2:.4f}")

# Plot comparison
plt.figure(figsize=(14, 7))
plt.plot(actual_data.index, actual_data['power_w'], 'b-', linewidth=2, label='Forecast')
plt.plot(actual_data.index, actual_data['actual_power_w'], 'r-', linewidth=2, label='Actual (Simulated)')
plt.title('Forecast vs. Actual PV Power Output (Simulation)', fontsize=16)
plt.xlabel('Time', fontsize=14)
plt.ylabel('Power (W)', fontsize=14)
plt.grid(True, linestyle='--', alpha=0.7)
plt.legend(fontsize=12)
plt.gcf().autofmt_xdate()
plt.tight_layout()
plt.show()
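
With real measurements, the forecast and the measured series rarely cover exactly the same timestamps, so they need to be aligned before computing errors. The sketch below joins two power series on their shared timestamps and computes RMSE and MAE. The function `evaluate_forecast` and the sample values are hypothetical and exist only to show the alignment step.

```python
import numpy as np
import pandas as pd

def evaluate_forecast(forecast, measured):
    """Align two power series on shared timestamps and return error metrics."""
    joined = pd.concat(
        {'forecast': forecast, 'measured': measured}, axis=1, join='inner'
    ).dropna()
    error = joined['forecast'] - joined['measured']
    return {
        'n': len(joined),
        'rmse': float(np.sqrt((error ** 2).mean())),
        'mae': float(error.abs().mean()),
    }

# Made-up series: the measurements start one hour after the forecast
step = pd.Timedelta(hours=1)
forecast = pd.Series(
    [0.0, 100.0, 200.0, 300.0],
    index=pd.date_range('2024-06-21 08:00', periods=4, freq=step),
)
measured = pd.Series(
    [110.0, 190.0, 300.0, 50.0],
    index=pd.date_range('2024-06-21 09:00', periods=4, freq=step),
)
metrics = evaluate_forecast(forecast, measured)
print(metrics)
```

Only the three overlapping hours enter the metrics. Reporting `n` alongside the errors makes it easy to notice when gaps in the measured data shrink the comparison window.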

## 9. Conclusion and Key Insights

This notebook walks through a complete workflow for PV power forecasting using an LSTM model:

1. The forecast is expected to follow the diurnal pattern of solar energy production, which the plots in Section 6 let you check visually
2. Weather forecast data for the prediction horizon is integrated as model input
3. Iterative prediction produces hourly forecasts for the next 24 hours
4. The results can be visualized and exported to CSV and JSON

The metrics in Section 8 are based on synthetic data, so judging real accuracy requires comparing forecasts against measured power.

These forecasts can be used for:
- Energy management and planning
- Grid integration of renewable energy
- Optimizing battery storage systems
- Cost savings through demand-side management

For production use, consider implementing automated data pipelines and regular model retraining as more historical data becomes available.