# Modelling wind power generation with a neural network

In this example we train a neural network implemented with `Keras` on actual generation data provided by [ENTSO-E](https://transparency.entsoe.eu/) and wind speed data from [C3S ERA5](https://climate.copernicus.eu/climate-reanalysis).

In [None]:
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.compose import ColumnTransformer

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, InputLayer

from tensorflow.keras.callbacks import EarlyStopping

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats
import seaborn as sns

from IPython.display import Image

The selected onshore wind farm is in Scotland. The map shows the location of the 35 turbines (blue triangles) and the 16 grid points used as predictors. For each grid point we have the two components of the 100-metre wind (`u100` and `v100`) and the wind speed (`ws`).


In [None]:
Image("../input/d/matteodefelice/gordonbush-wind/gordonbush.png")
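As a reminder of how the three predictors relate, the wind speed is the magnitude of the horizontal wind vector. The sketch below assumes that `ws` was derived as the Euclidean norm of `u100` and `v100`; the dataset itself does not state this, and the sample values are made up for illustration.

```python
import numpy as np

# Made-up 100-metre wind components in m/s (illustrative values only)
u100 = np.array([3.0, -6.0, 0.0])
v100 = np.array([4.0, 8.0, -2.5])

# Assumption: wind speed is the magnitude of the (u, v) vector
ws = np.hypot(u100, v100)
print(ws)
```

If this assumption holds, `ws` carries no information that is not already in the two components, but it spares the network from having to learn the non-linear norm itself.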

The hourly data used for training is for the period 2016-2018.

In [None]:
df = pd.read_csv('../input/d/matteodefelice/gordonbush-wind/gordonbush-2016_2018.csv')
df.info()

A bit of data wrangling to transform the data frame into a "wide" format, with one row per time step and 16x3 columns (one per grid point and variable).

In [None]:
x = df[['latitude', 'longitude', 'time', 'u100', 'v100', 'ws']]
x = x.assign(point = x['latitude'].astype(str) + x['longitude'].astype(str))
x = x.drop(['latitude', 'longitude'], axis = 1)
x = x.pivot(index = 'time', columns = ['point'], values = ['u100', 'v100', 'ws'])
x.columns = x.columns.to_flat_index().str.join('_')
x = x.reset_index().drop('time', axis = 1)
print(x.shape)

In [None]:
x.head()
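The reshaping above may be easier to follow on a tiny frame. This is a minimal sketch with made-up values, two time steps and two grid points; the names `long_df` and `wide` are illustrative and not part of the notebook.

```python
import pandas as pd

# Toy long-format frame: 2 time steps x 2 grid points (values are made up)
long_df = pd.DataFrame({
    "time": ["t0", "t0", "t1", "t1"],
    "point": ["58.0-4.0", "58.25-4.0", "58.0-4.0", "58.25-4.0"],
    "u100": [1.0, 2.0, 3.0, 4.0],
    "v100": [0.5, 0.6, 0.7, 0.8],
    "ws": [1.1, 2.1, 3.1, 4.1],
})

# One row per time step, one column per (variable, point) pair
wide = long_df.pivot(index="time", columns="point", values=["u100", "v100", "ws"])

# Flatten the two-level column index into names such as "u100_58.0-4.0"
wide.columns = ["_".join(col) for col in wide.columns.to_flat_index()]

print(wide.shape)
print(list(wide.columns))
```

With 2 points and 3 variables the toy frame ends up with 6 columns, in the same way that 16 points give 48 columns in the real data.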

For the output we just need to extract the generation time series from a single grid point, because in the original data frame it is repeated for each lat/lon pair.

In [None]:
y = df.loc[(df['latitude'] == 58.25) & (df['longitude'] == -4), 'ActualGenerationOutput']
y.shape
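One way to check that the target really is repeated for every grid point is to count the distinct generation values per time step. The sketch below uses a made-up frame named `toy` with the same column names as the dataset; on the real data the same `groupby` could be applied to `df`.

```python
import pandas as pd

# Toy frame mimicking the original layout: the target is repeated per grid point
toy = pd.DataFrame({
    "time": ["t0", "t0", "t1", "t1"],
    "latitude": [58.0, 58.25, 58.0, 58.25],
    "longitude": [-4.0, -4.0, -4.0, -4.0],
    "ActualGenerationOutput": [12.0, 12.0, 30.0, 30.0],
})

# Number of distinct generation values at each time step:
# 1 everywhere means the target does not depend on the grid point
distinct_per_time = toy.groupby("time")["ActualGenerationOutput"].nunique()
print((distinct_per_time == 1).all())
```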

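Splitting the data into training and testing sets. Note that with `test_size = 0.75` only a quarter of the samples is used for training, while the remaining three quarters are kept for testing.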

In [None]:
X_train, X_test, Y_train, Y_test = train_test_split(x, y, test_size = 0.75, random_state = 41)

Predictors are scaled using a `StandardScaler`, which is fitted on the training set only and then applied to the testing set.

In [None]:
ct = ColumnTransformer( [ ("numeric", StandardScaler(), slice(0, 48))  ]  )
X_train = ct.fit_transform(X_train)
X_test  = ct.transform(X_test)
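What the scaler does can be reproduced with a few lines of NumPy. This is a minimal sketch on made-up data, not a replacement for `StandardScaler`; the point is that the mean and standard deviation come from the training set and are reused unchanged on the testing set.

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(loc=8.0, scale=3.0, size=(100, 2))  # made-up training features
test = rng.normal(loc=8.0, scale=3.0, size=(20, 2))    # made-up testing features

# Statistics are estimated on the training set only
mean = train.mean(axis=0)
std = train.std(axis=0)  # population standard deviation (ddof=0), as in StandardScaler

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std  # reuses the training mean and std

print(train_scaled.mean(axis=0), train_scaled.std(axis=0))
```

The scaled training features have zero mean and unit variance by construction, while the scaled testing features are only approximately standardised.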

We create a neural network with a single hidden layer of 128 neurons and a linear output neuron.

In [None]:
md = Sequential()
md.add(InputLayer(input_shape=(X_train.shape[1],)))
md.add(Dense(128, activation='relu'))
md.add(Dense(1, activation='linear'))

This network has 6401 trainable parameters. We can add layers or experiment with other activation functions or numbers of neurons.

In [None]:
md.summary()
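The parameter count reported by the summary can be verified by hand: each `Dense` layer has one weight per input-unit pair plus one bias per unit. The variable names below are illustrative.

```python
n_inputs = 48   # 16 grid points x 3 variables
n_hidden = 128  # neurons in the hidden layer
n_outputs = 1   # predicted generation

# Each Dense layer has (inputs x units) weights plus one bias per unit
hidden_params = n_inputs * n_hidden + n_hidden
output_params = n_hidden * n_outputs + n_outputs
total = hidden_params + output_params

print(hidden_params, output_params, total)  # 6272 129 6401
```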

We set an early stopping criterion, holding out 20% of the training data for validation, and we use the MAE as the loss function. If the validation loss does not improve for 20 epochs, the training stops. The training takes ~20 seconds.

In [None]:
stop = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=20)
md.compile(loss='mean_absolute_error', optimizer='adam', metrics=['mean_squared_error'])
md.fit(X_train, Y_train, epochs = 100, batch_size = 20, verbose=0,validation_split=0.2, callbacks=[stop])
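The stopping rule can be sketched in plain Python. This is a simplified illustration of a patience-based criterion, not the Keras implementation; the function name `early_stopping_epoch` and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 0-based epoch at which a patience-based rule would stop."""
    best = float("inf")
    wait = 0  # epochs elapsed since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    # The rule never triggered: training runs through the last epoch
    return len(val_losses) - 1


# Made-up validation losses: the best value is reached at epoch 2
losses = [5.0, 4.0, 3.0, 3.5, 3.2, 3.1, 3.05]
print(early_stopping_epoch(losses, patience=3))  # 5
```

A larger `patience` tolerates longer plateaus at the cost of more epochs; with `patience=20` and at most 100 epochs, as in the cell above, the rule may never trigger.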

Let's see the MAE and the MSE on the testing dataset.

In [None]:
# evaluate() returns the loss (MAE) followed by the compiled metrics (MSE)
mae, mse = md.evaluate(X_test, Y_test.values, verbose=0)
print("MAE:", mae, "MSE:", mse)
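For reference, both scores are simple averages over the prediction errors. The sketch below computes them with NumPy on made-up values; `y_true` and `y_pred` are illustrative names.

```python
import numpy as np

y_true = np.array([10.0, 0.0, 35.0, 50.0])  # made-up observations
y_pred = np.array([12.0, 3.0, 30.0, 50.0])  # made-up predictions

errors = y_pred - y_true
mae = np.mean(np.abs(errors))  # mean absolute error
mse = np.mean(errors ** 2)     # mean squared error

print(mae, mse)  # 2.5 9.5
```

The MAE is expressed in the same units as the generation, whereas the MSE is in squared units and penalises large errors more heavily.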

The correlation coefficient on the testing dataset is rather decent and, as the scatter plot suggests, the network seems able to generalise.

In [None]:
y_hat = md.predict(X_test, verbose=0)
print(scipy.stats.pearsonr(Y_test.values, y_hat.flatten()))
# PLOT
plt.scatter(y = y_hat, x = Y_test.values)
plt.xlabel('Testing wind generation')
plt.ylabel('Network testing prediction')
plt.title('Wind generation on the testing data')
plt.grid(True)
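`scipy.stats.pearsonr` returns the correlation coefficient together with a p-value. The coefficient itself is the covariance of the two series divided by the product of their standard deviations, which can be sketched with NumPy on made-up values:

```python
import numpy as np

y_true = np.array([10.0, 0.0, 35.0, 50.0, 22.0])  # made-up observations
y_pred = np.array([12.0, 3.0, 30.0, 50.0, 20.0])  # made-up predictions

# Deviations from the respective means
dx = y_true - y_true.mean()
dy = y_pred - y_pred.mean()

# Pearson correlation coefficient
r = np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))

# Cross-check against NumPy's own implementation
print(r, np.corrcoef(y_true, y_pred)[0, 1])
```

Keep in mind that a high correlation only measures linear association: a prediction with a constant bias would still score well, which is why the MAE and the scatter plot are worth looking at too.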

So far, we have trained and tested the network on the wind generation for the period 2016-2018. Let's see how the same network performs when predicting the year 2019. We start by loading the data as we did before.

In [None]:
df_test = pd.read_csv('../input/d/matteodefelice/gordonbush-wind/gordonbush-2019.csv')
xt = df_test[['latitude', 'longitude', 'time', 'u100', 'v100', 'ws']]
xt = xt.assign(point = xt['latitude'].astype(str) + xt['longitude'].astype(str))
xt = xt.drop(['latitude', 'longitude'], axis = 1)
xt = xt.pivot(index = 'time', columns = ['point'], values = ['u100', 'v100', 'ws'])
xt.columns = xt.columns.to_flat_index().str.join('_')
xt = xt.reset_index().drop('time', axis = 1)

yt = df_test.loc[(df_test['latitude'] == 58.25) & (df_test['longitude'] == -4), 'ActualGenerationOutput']
xt.shape, yt.shape

We scale the features with the same scaler that was fitted on the training data.

In [None]:
xt = ct.transform(xt)

Again we compute the correlation coefficient and draw the scatter plot. The correlation is lower, and we can see more cases where the network predicts generation while the actual one is zero or very small (~10). Is this maintenance? Or perhaps curtailment?

In [None]:
y_hat_test = md.predict(xt, verbose=0)
print(scipy.stats.pearsonr(yt.values, y_hat_test.flatten()))
# PLOT
plt.scatter(y = y_hat_test, x = yt.values)
plt.xlabel('Observed wind generation (2019)')
plt.ylabel('Network prediction (2019)')
plt.title('Wind generation on the 2019 data')
plt.grid(True)

fig, axs = plt.subplots(2, 1, figsize=(12, 5))
bins = np.linspace(0, 80, 20)

# Flatten the (n, 1) prediction array so that it is plotted as a single series
sns.histplot(y_hat_test.flatten(), bins = bins, ax = axs[0], kde = False).set(title='Predicted')
sns.histplot(yt, bins = bins, ax = axs[1], kde = False).set(title = 'Observed (2019)')

plt.show()

In [None]:
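# First 200 hours of 2019: prediction against observation
plt.plot(y_hat_test[0:200], label='Predicted')
plt.plot(yt.values[0:200], label='Observed')
plt.legend()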
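To look more closely at the hours where the network predicts generation while the observed value is zero or very small, one option is to flag them with two thresholds. This is a minimal sketch on made-up values: `LOW_OBSERVED` and `HIGH_PREDICTED` are hypothetical thresholds that would need tuning, and on the real data `observed` and `predicted` would be `yt.values` and `y_hat_test.flatten()`.

```python
import numpy as np

# Made-up observed and predicted generation, same units as the dataset
observed = np.array([0.0, 45.0, 8.0, 60.0, 2.0, 30.0])
predicted = np.array([40.0, 43.0, 35.0, 55.0, 5.0, 28.0])

LOW_OBSERVED = 10.0    # hypothetical threshold for "zero or very small" generation
HIGH_PREDICTED = 20.0  # hypothetical threshold for "the network predicts generation"

# Hours with almost no observed output although the wind suggests production
suspect = (observed <= LOW_OBSERVED) & (predicted >= HIGH_PREDICTED)
suspect_hours = np.flatnonzero(suspect)

print(suspect_hours)  # [0 2]
```

Flagged hours that cluster in consecutive runs would be compatible with maintenance or curtailment, but the wind data alone cannot tell the two apart.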