# 🌬️ Wind Generation Forecasting with ANN (Weekly, NZ Islands)
This notebook uses Artificial Neural Networks to forecast **weekly wind energy generation** separately for the **South** and **North Islands**.

We address:
- **RQ1**: Accuracy of ANN using historical generation (univariate)
- **RQ2**: Improvement with lagged climate features based on correlation

📆 All analysis is done on weekly-aggregated data.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error
import warnings
warnings.filterwarnings('ignore')

In [None]:
# Load the daily wind + climate dataset (South and North Island columns in one file)
# parse_dates already converts 'Date' to datetime, so no second conversion is needed
df = pd.read_csv('wind_daily_data.csv', parse_dates=['Date'])
df = df.set_index('Date')

# Drop wind direction columns if present
df = df.drop(columns=[col for col in df.columns if 'WD50M' in col], errors='ignore')

In [None]:
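# Resample to weekly frequency ('W' bins end on Sunday): totals are summed, state variables are averaged
# min_count=1 keeps an all-missing week as NaN (the default sum would return 0), so dropna() can remove it
weekly_df = pd.DataFrame()
weekly_df['GENERATION_SOUTH'] = df['GENERATION_SOUTH'].resample('W').sum(min_count=1)
weekly_df['GENERATION_NORTH'] = df['GENERATION_NORTH'].resample('W').sum(min_count=1)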

# Climate state variables (wind speed, temperature, surface pressure, relative humidity): weekly mean
for var in ['WS50M', 'T2M', 'PS', 'RH2M']:
    weekly_df[f'{var}_SOUTH'] = df[f'{var}_SOUTH'].resample('W').mean()
    weekly_df[f'{var}_NORTH'] = df[f'{var}_NORTH'].resample('W').mean()

# Accumulated quantities (precipitation, land evaporation): weekly total
for var in ['PRECTOTCORR', 'EVLAND']:
    weekly_df[f'{var}_SOUTH'] = df[f'{var}_SOUTH'].resample('W').sum(min_count=1)
    weekly_df[f'{var}_NORTH'] = df[f'{var}_NORTH'].resample('W').sum(min_count=1)

# Drop weeks with missing values
weekly_df.dropna(inplace=True)
weekly_df.head()
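
A note on the weekly aggregation: with pandas defaults, `.sum()` over a week that contains only missing values returns 0, while `.mean()` returns NaN. A later `dropna()` therefore only removes such weeks from summed columns if `min_count=1` is passed. The sketch below illustrates this on a small synthetic series; the values and column names are made up for the demonstration and are not taken from `wind_daily_data.csv`.

```python
import numpy as np
import pandas as pd

# Synthetic daily data: Monday 1 Jan 2024 to Sunday 14 Jan 2024.
# The second week is entirely missing.
dates = pd.date_range('2024-01-01', periods=14, freq='D')
daily = pd.DataFrame(
    {
        'GENERATION': [10.0] * 7 + [np.nan] * 7,
        'WS50M': [5.0, 7.0, 6.0, 8.0, 4.0, 6.0, 6.0] + [np.nan] * 7,
    },
    index=dates,
)

# Default sum treats an all-NaN week as 0.
weekly_sum_default = daily['GENERATION'].resample('W').sum()

# min_count=1 requires at least one valid value, otherwise the result is NaN.
weekly_sum_strict = daily['GENERATION'].resample('W').sum(min_count=1)

# Mean already yields NaN for an all-NaN week.
weekly_mean = daily['WS50M'].resample('W').mean()

print(pd.DataFrame({
    'sum_default': weekly_sum_default,
    'sum_min_count_1': weekly_sum_strict,
    'mean': weekly_mean,
}))
```

In this toy case the second week shows 0 under the default sum and NaN under the other two aggregations, so only the latter two would be caught by `dropna()`.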

## 🔍 Correlation Analysis for Feature Selection
We compute the Pearson correlation between weekly wind generation and each meteorological variable, separately for each island.
Only variables whose absolute correlation with generation exceeds 0.3 (|r| > 0.3) are used in the RQ2 models.
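
The selection rule can be sketched as a small helper. This is a minimal illustration on synthetic data, not the notebook's own cell: the function name `select_by_correlation` and the column names `FEATURE_POS`, `FEATURE_NEG`, and `FEATURE_FLAT` are hypothetical. It shows why the absolute value matters: a strongly negative correlation passes the threshold just as a strongly positive one does.

```python
import numpy as np
import pandas as pd

def select_by_correlation(frame, target, threshold=0.3):
    """Return columns whose absolute Pearson correlation with `target` exceeds `threshold`."""
    corr = frame.corr()[target].drop(target)
    return corr[corr.abs() > threshold].index.tolist()

# Deterministic synthetic data (hypothetical columns, for illustration only).
t = np.arange(100)
base = np.sin(t / 5.0)
demo = pd.DataFrame({
    'GENERATION': 2.0 * base + 0.3 * np.cos(1.7 * t),  # driven mostly by `base`
    'FEATURE_POS': base,                               # strongly positively correlated
    'FEATURE_NEG': -base,                              # strongly negatively correlated
    'FEATURE_FLAT': np.where(t % 2 == 0, 1.0, -1.0),   # alternating, nearly uncorrelated
})

selected = select_by_correlation(demo, 'GENERATION', threshold=0.3)
print(selected)
```

Both the positively and the negatively correlated columns are kept, while the alternating column falls below the threshold.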

In [None]:
# Correlation: South Island
south_df = weekly_df[[col for col in weekly_df.columns if '_SOUTH' in col]].copy()
south_df.columns = [col.replace('_SOUTH', '') for col in south_df.columns]
corr_south = south_df.corr()
plt.figure(figsize=(8, 5))
sns.heatmap(corr_south, annot=True, cmap='coolwarm', fmt='.2f')
plt.title('South Island: Correlation Heatmap')
plt.show()

# Select features whose absolute correlation with GENERATION exceeds 0.3
selected_features_south = corr_south['GENERATION'].drop('GENERATION')
selected_features_south = selected_features_south[selected_features_south.abs() > 0.3].index.tolist()
print('Selected features (South):', selected_features_south)

In [None]:
# Correlation: North Island
north_df = weekly_df[[col for col in weekly_df.columns if '_NORTH' in col]].copy()
north_df.columns = [col.replace('_NORTH', '') for col in north_df.columns]
corr_north = north_df.corr()
plt.figure(figsize=(8, 5))
sns.heatmap(corr_north, annot=True, cmap='coolwarm', fmt='.2f')
plt.title('North Island: Correlation Heatmap')
plt.show()

# Select features whose absolute correlation with GENERATION exceeds 0.3
selected_features_north = corr_north['GENERATION'].drop('GENERATION')
selected_features_north = selected_features_north[selected_features_north.abs() > 0.3].index.tolist()
print('Selected features (North):', selected_features_north)
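
RQ2 refers to lagged climate features, whereas the heatmaps above correlate same-week values. One possible way to screen lags, shown here only as a sketch under assumptions, is to correlate generation at week t with a feature shifted back by 1 to `max_lag` weeks. The helper name `lagged_correlations`, the `max_lag` value, and the synthetic `TARGET`/`FEATURE` columns are all hypothetical and not part of the original notebook.

```python
import numpy as np
import pandas as pd

def lagged_correlations(frame, target, feature, max_lag=4):
    """Pearson correlation between `target` at week t and `feature` at week t - lag."""
    return pd.Series(
        {
            lag: frame[target].corr(frame[feature].shift(lag))
            for lag in range(1, max_lag + 1)
        },
        name=f'{feature}_lag_corr',
    )

# Synthetic weekly data in which the target simply repeats the feature two weeks later.
idx = pd.date_range('2024-01-07', periods=60, freq='W')
t = np.arange(60)
demo = pd.DataFrame({'FEATURE': np.sin(t / 4.0)}, index=idx)
demo['TARGET'] = demo['FEATURE'].shift(2)

lag_corr = lagged_correlations(demo, 'TARGET', 'FEATURE', max_lag=4)
best_lag = lag_corr.abs().idxmax()  # lag with the strongest absolute correlation
print(lag_corr)
print('Strongest lag:', best_lag)
```

`Series.corr` ignores the NaN rows introduced by `shift`, so no explicit `dropna()` is needed here. If a lag screen like this were used for model selection, computing it on the training period only would avoid leaking test-period information into the feature choice.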