Trading-Wind-Energy

Objective

Model for the SGInnovate Datathon, July 2020.
The objective is to forecast the wind energy generated by each wind farm 18 hours ahead, so that the forecasts can be used as references for trading.
lstm.py is the Python file for training the model, while prediction.py makes predictions with the trained model to satisfy the organizer's requirements.

Packages

import keras 
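
The later snippets also rely on the following imports (a sketch assuming standalone Keras 2.x, as implied by import keras; adjust the paths if tensorflow.keras is used instead):

from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout
from keras.regularizers import l2
from keras.callbacks import EarlyStopping
from keras import optimizers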

Preliminary Data Analysis

There are two main datasets - Wind Energy Production and Wind Forecasts. In order to have a rough idea of the features we are going to use in our neural network model, we created several plots to gain a better understanding of the data.

Wind Energy Production Dataset

[Plot: wind energy production over time]
As observed in the plot, there is no obvious periodicity, which makes the series pretty challenging to predict.
[Plot: autocorrelation of wind energy production]

Wind Speed and Direction of 8 Wind Farms

[Plot: average wind speed vs energy production]
We averaged the wind speed data from the 8 farms and plotted it against the energy production. There is a positive correlation between the average wind speed and the energy production, so we take the average wind speed as the second input.
[Plot: wind direction vs energy production]
We also plotted wind direction against energy production for each wind farm, one of which is shown above. Although the relationship between the two is non-linear and less obvious, we still included wind direction as a feature, since energy production tends to be higher in two particular directions.

Data Preprocessing

Speed and direction data are produced every 6 hours. Since we are required to make a prediction every hour, we linearly interpolated the datasets to a time base of 1 hour. Missing data in all datasets are also linearly interpolated. Then, as mentioned above, we averaged the wind speed data across the 8 wind farms. A rough sketch of this step is shown below.
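
A minimal pandas sketch of the interpolation and averaging, assuming speed_df and direction_df are DataFrames with one column per farm and a DatetimeIndex (names are placeholders, not the actual schema):

import pandas as pd

def to_hourly(df):
    # Upsample a 6-hourly DataFrame to a 1-hour time base and fill gaps linearly
    return df.resample('1H').interpolate(method='linear')

speed_hourly = to_hourly(speed_df)
direction_hourly = to_hourly(direction_df)

# Average wind speed across the 8 farms into a single feature
avg_speed = speed_hourly.mean(axis=1)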
To prevent features with large values from dominating the learning, we normalized all our datasets to be between 0 and 1.

from sklearn.preprocessing import MinMaxScaler

def scale_data(data):
    # Scale each feature to the [0, 1] range
    scaler = MinMaxScaler()
    scaler.fit(data)
    return scaler.transform(data)

Persistence as Benchmark

We started off by building a persistence model as our benchmark; the test loss of every model we develop later is compared against this persistence loss. Using MSE as the metric, the persistence model scores 0.09.
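
A persistence forecast simply predicts that production 18 hours from now equals the latest observed value. A minimal sketch of computing its MSE (variable names are assumptions, not code from the repo):

import numpy as np

HORIZON = 18  # forecast horizon in hours

def persistence_mse(energy):
    # Predict energy[t + HORIZON] with energy[t] and score with MSE
    energy = np.asarray(energy)
    preds = energy[:-HORIZON]
    actuals = energy[HORIZON:]
    return np.mean((actuals - preds) ** 2)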

Parameters

BATCH_SIZE = 32
TIMESTEPS = 24
EPOCH = 100
PATIENCE = 10
LEARNING_RATE = 0.001
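
With TIMESTEPS = 24, each training sample is a 24-hour window of features, and the target is the energy produced 18 hours after the end of that window. A rough sketch of how such windows could be built (not necessarily how lstm.py does it):

import numpy as np

HORIZON = 18  # hours ahead to forecast

def make_windows(features, energy, timesteps=TIMESTEPS, horizon=HORIZON):
    # Returns X of shape (samples, timesteps, num_features) and y of shape (samples,)
    X, y = [], []
    for t in range(len(features) - timesteps - horizon + 1):
        X.append(features[t:t + timesteps])
        y.append(energy[t + timesteps + horizon - 1])
    return np.array(X), np.array(y)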

LSTM Layers and Nodes

# Single LSTM layer followed by two dense layers
model = Sequential()
model.add(LSTM(24, input_shape=(TIMESTEPS, NUM_FEATURES), return_sequences=False, activity_regularizer=l2(0.001)))
model.add(Dropout(0.1))
model.add(Dense(12, activation='relu'))
model.add(Dense(1, activation='linear'))

# Compile with MSE loss and the Adam optimizer
opt = optimizers.Adam(learning_rate=LEARNING_RATE)
model.compile(loss='mean_squared_error', optimizer=opt)
# Stop training once the validation loss stops improving
es = EarlyStopping(monitor='val_loss', patience=PATIENCE, min_delta=0.0001)

# Train model; shuffle=False preserves the temporal order of the windows
history = model.fit(X_train, y_train, epochs=EPOCH,
                    validation_split=0.2, batch_size=BATCH_SIZE, callbacks=[es], shuffle=False)

Prediction Visualization

[Plot: predictions vs actuals]
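
A plot like this can be reproduced along these lines (X_test and y_test are assumed held-out inputs and targets, not names from the repo):

import matplotlib.pyplot as plt

plt.figure(figsize=(12, 4))
plt.plot(y_test, label='Actual')
plt.plot(model.predict(X_test).flatten(), label='Predicted')
plt.xlabel('Hour')
plt.ylabel('Normalized energy production')
plt.legend()
plt.show()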

Producing Predictions

In prediction.py, energy production, wind speed and wind direction data are scheduled to be scraped from the web every hour. The datasets are then updated and used to generate predictions, which are sent to the organizer's API for simulated trading. A sketch of this loop is shown below.
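
A minimal sketch of that hourly cycle; scrape_latest_data, update_datasets and submit_prediction are hypothetical helpers standing in for the actual logic in prediction.py, and the model path is an assumption:

import time
from keras.models import load_model

model = load_model('lstm_model.h5')  # hypothetical path to the trained model

while True:
    raw = scrape_latest_data()        # hypothetical: fetch latest production, speed and direction
    features = update_datasets(raw)   # hypothetical: interpolate, average, scale and window
    prediction = model.predict(features)[-1, 0]
    submit_prediction(prediction)     # hypothetical: send to the organizer's API
    time.sleep(60 * 60)               # wait one hour before the next cycle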
