# Project Proposal

### Matt Gorbett

## Introduction

This past summer I worked on a project to forecast wind direction from a wind turbine dataset I came across.  Links are here: 
https://mattgorb.github.io/wind.html <br>
https://mattgorb.github.io/wind_multivariatelstm.html

I used the following ML models:
-  Support Vector Regression with scikit-learn
-  LSTM recurrent neural network with Keras
-  Multivariate recurrent neural network with Keras for one turbine using wind speed, wind direction, and temperature as predictors.  


This is a time series forecasting problem with circular output values: 0° and 360° are the same direction.  To handle this, I transformed the outputs to sine and cosine before training. 
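A minimal sketch of the sine/cosine transform, assuming the wind direction is recorded in degrees. The function name `encode_direction` is hypothetical and only used for illustration here.

```python
import numpy as np


def encode_direction(degrees):
    """Map wind direction in degrees to its (sin, cos) components."""
    radians = np.radians(np.asarray(degrees, dtype=float))
    return np.sin(radians), np.cos(radians)


# 350 and 10 degrees are only 20 degrees apart on the circle,
# but 340 apart as raw numbers. The encoded values stay close.
sin_vals, cos_vals = encode_direction([350.0, 10.0])
```

With this encoding the two directions share the same cosine and differ only in the sign of a small sine value, which is what makes them look close to a regression model.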

I propose experimenting with the following in this project: 
-  Multivariate recurrent neural network for predicting multiple wind turbines.  In my previous project, I used data for one wind turbine.  I hope to structure the dataset to use all four wind turbines it provides, with their variables wind direction, wind speed, and temperature.  I expect that having several sets of readings for each time step will improve results. 
-  Currently my dataset is 365x144 rows by 6x144 columns (365 days and 6 previous days, at 144 records per day).  This means each row has only 144*6 = 864 input values.  I want to explore increasing the number of input features and potentially decreasing the number of rows.  
-  As an alternative to simply decreasing the number of rows, I want to explore using data from similar seasons.  For example, rather than using the previous 365 days of data, I can try using the previous 120 days of data and retrieve the previous 3 years' data for the corresponding season.  For January 1-14 2017 predictions, I can gather training data from September 1-December 31 (roughly the previous 120 days), November 1-January 31 2016, November 1-January 31 2015, and November 1-January 31 2014.  
-  Understand RNNs better.  Start with this link: https://towardsdatascience.com/understanding-lstm-and-its-quick-implementation-in-keras-for-sentiment-analysis-af410fd85b47
-  Further explore RNN structures, with different activation functions and embedding nodes.  



**Note:** I will not be running this in a Jupyter notebook.  I will be using the Ubuntu Deep Learning machine image on AWS for simple GPU setup. I ran the p2.xlarge instance, which has 61 GiB RAM, 4 CPUs, and an NVIDIA K80 GPU.  I will be recording results and graphs in the project.  

## Methods

I have several ideas and experiments I want to try on this dataset to see if I can improve both the prediction error and the efficiency. I will start with these two steps:

1.  Increase test set size to determine error on more data.  I will increase the test set size from one day to two weeks.  This will give me a feel for how the model is performing on bigger test sets rather than just a single day.  

2.  Concatenate two datasets into one: <br>
https://opendata-renewables.engie.com/explore/dataset/la-haute-borne-data-2013-2016/ <br>
https://opendata-renewables.engie.com/explore/dataset/la-haute-borne-data-2017-2020/table/ <br>
Currently I'm training on only one dataset, 2013-2016.  I want to concatenate more data for better forecasting.  
This will also allow me to experiment with different structures of data for training.  
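The two steps above can be sketched as follows, assuming both files have already been loaded and sorted by time, and assuming 144 records per day as in the row counts quoted in the introduction. The arrays `older` and `newer` are hypothetical stand-ins for the 2013-2016 and 2017-2020 data, and `split_last_days` is an illustrative helper, not code from the earlier project.

```python
import numpy as np

RECORDS_PER_DAY = 144  # assumption: matches the 365x144 row count quoted earlier


def split_last_days(series, test_days=14, records_per_day=RECORDS_PER_DAY):
    """Hold out the final `test_days` of a chronologically ordered series."""
    test_size = test_days * records_per_day
    if test_size >= len(series):
        raise ValueError("series is too short for the requested test window")
    return series[:-test_size], series[-test_size:]


# Hypothetical stand-ins for the two datasets, already in time order.
older = np.arange(0, 1000 * RECORDS_PER_DAY)
newer = np.arange(1000 * RECORDS_PER_DAY, 1100 * RECORDS_PER_DAY)
combined = np.concatenate([older, newer])

train, test = split_last_days(combined)
```

Splitting by position rather than at random keeps every test record later in time than every training record, which matters for a forecasting problem.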

# Experiments 

## 1.  Training data setup 

This is an example of how the data is set up for each model, where t=time.  For each row, I take the sine and cosine of the wind direction and fill the inputs with the historical values, in order.  
 
#### SVR and single variate LSTM data setup
sin=sine(windDirection) <br>
cos=cosine(windDirection)

|x1    |x2    |x3    |x4    |x5    |y   |
|------|------|------|------|------|----|
|sin(t-5)|sin(t-4)|sin(t-3)|sin(t-2)|sin(t-1)|sin(t)|

|x1    |x2    |x3    |x4    |x5    |y   |
|------|------|------|------|------|----|
|cos(t-5)|cos(t-4)|cos(t-3)|cos(t-2)|cos(t-1)|cos(t)|

#### Multivariate LSTM data setup
tp=temperature <br>
sp=wind speed

|x1    |x2    |x3    |x4    |x5    |y   |
|------|------|------|------|------|----|
|sin(t-5)|sin(t-4)|sin(t-3)|sin(t-2)|sin(t-1)|sin(t)|
|tp(t-5)|tp(t-4)|tp(t-3)|tp(t-2)|tp(t-1)||
|sp(t-5)|sp(t-4)|sp(t-3)|sp(t-2)|sp(t-1)||

|x1    |x2    |x3    |x4    |x5    |y   |
|------|------|------|------|------|----|
|cos(t-5)|cos(t-4)|cos(t-3)|cos(t-2)|cos(t-1)|cos(t)|
|tp(t-5)|tp(t-4)|tp(t-3)|tp(t-2)|tp(t-1)||
|sp(t-5)|sp(t-4)|sp(t-3)|sp(t-2)|sp(t-1)||

Dimensions: <br>
Input=[rows, time_back, 3] <br>
Output=[rows, 1]
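A minimal sketch of how the lagged tables above could be turned into arrays with these dimensions, using toy data. The helper `make_windows` and the random inputs are assumptions for illustration; the sine component is shown, and the cosine model would be built the same way.

```python
import numpy as np


def make_windows(features, target, time_back):
    """Build inputs [rows, time_back, n_features] and targets [rows, 1].

    features: array [n_steps, n_features], ordered in time.
    target:   array [n_steps], the value to predict at each step.
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    rows = len(features) - time_back
    # Row i holds steps i .. i+time_back-1 and predicts step i+time_back.
    X = np.stack([features[i:i + time_back] for i in range(rows)])
    y = target[time_back:].reshape(-1, 1)
    return X, y


# Toy data: 100 time steps of direction (degrees), temperature, and speed.
rng = np.random.default_rng(0)
direction = rng.uniform(0, 360, size=100)
temperature = rng.normal(10, 3, size=100)
speed = rng.uniform(0, 20, size=100)

sin_dir = np.sin(np.radians(direction))
features = np.column_stack([sin_dir, temperature, speed])
X, y = make_windows(features, sin_dir, time_back=5)
```

Here `time_back=5` mirrors the five lag columns in the tables, so `X` has five time steps of three variables per row.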

## Proposed multi turbine setup for project
#### Hypothesis:
Having data from other wind turbines over the same time frame will help predict the wind direction of a single turbine.<br>
The dataset has data for 4 wind turbines, and currently I'm only using one.  For this model, I want to train the wind direction forecasts for all 4 turbines at once.  

  

To do this, I want to try two different structures: 
#### 1. Input=[rows, time_back, 12] <br> Output=[rows, 4 columns]
<br>
This structure has four output values for a single row, one per turbine.  Each row has a dimension of [time_back, 12]: 4 turbines × 3 variables, as in the tables below.

|x1    |x2    |x3    |x4    |x5    |y   |
|------|------|------|------|------|----|
|wt1_sin(t-5)|wt1_sin(t-4)|wt1_sin(t-3)|wt1_sin(t-2)|wt1_sin(t-1)|wt1_sin(t),wt2_sin(t),wt3_sin(t),wt4_sin(t)|
|wt1_tp(t-5)|wt1_tp(t-4)|wt1_tp(t-3)|wt1_tp(t-2)|wt1_tp(t-1)||
|wt1_sp(t-5)|wt1_sp(t-4)|wt1_sp(t-3)|wt1_sp(t-2)|wt1_sp(t-1)||
|wt2_sin(t-5)|wt2_sin(t-4)|wt2_sin(t-3)|wt2_sin(t-2)|wt2_sin(t-1)||
|wt2_tp(t-5)|wt2_tp(t-4)|wt2_tp(t-3)|wt2_tp(t-2)|wt2_tp(t-1)||
|wt2_sp(t-5)|wt2_sp(t-4)|wt2_sp(t-3)|wt2_sp(t-2)|wt2_sp(t-1)||
|wt3_sin(t-5)|wt3_sin(t-4)|wt3_sin(t-3)|wt3_sin(t-2)|wt3_sin(t-1)||
|wt3_tp(t-5)|wt3_tp(t-4)|wt3_tp(t-3)|wt3_tp(t-2)|wt3_tp(t-1)||
|wt3_sp(t-5)|wt3_sp(t-4)|wt3_sp(t-3)|wt3_sp(t-2)|wt3_sp(t-1)||
|wt4_sin(t-5)|wt4_sin(t-4)|wt4_sin(t-3)|wt4_sin(t-2)|wt4_sin(t-1)||
|wt4_tp(t-5)|wt4_tp(t-4)|wt4_tp(t-3)|wt4_tp(t-2)|wt4_tp(t-1)||
|wt4_sp(t-5)|wt4_sp(t-4)|wt4_sp(t-3)|wt4_sp(t-2)|wt4_sp(t-1)||

|x1    |x2    |x3    |x4    |x5    |y   |
|------|------|------|------|------|----|
|wt1_cos(t-5)|wt1_cos(t-4)|wt1_cos(t-3)|wt1_cos(t-2)|wt1_cos(t-1)|wt1_cos(t),wt2_cos(t),wt3_cos(t),wt4_cos(t)|
|wt1_tp(t-5)|wt1_tp(t-4)|wt1_tp(t-3)|wt1_tp(t-2)|wt1_tp(t-1)||
|wt1_sp(t-5)|wt1_sp(t-4)|wt1_sp(t-3)|wt1_sp(t-2)|wt1_sp(t-1)||
|wt2_cos(t-5)|wt2_cos(t-4)|wt2_cos(t-3)|wt2_cos(t-2)|wt2_cos(t-1)||
|wt2_tp(t-5)|wt2_tp(t-4)|wt2_tp(t-3)|wt2_tp(t-2)|wt2_tp(t-1)||
|wt2_sp(t-5)|wt2_sp(t-4)|wt2_sp(t-3)|wt2_sp(t-2)|wt2_sp(t-1)||
|wt3_cos(t-5)|wt3_cos(t-4)|wt3_cos(t-3)|wt3_cos(t-2)|wt3_cos(t-1)||
|wt3_tp(t-5)|wt3_tp(t-4)|wt3_tp(t-3)|wt3_tp(t-2)|wt3_tp(t-1)||
|wt3_sp(t-5)|wt3_sp(t-4)|wt3_sp(t-3)|wt3_sp(t-2)|wt3_sp(t-1)||
|wt4_cos(t-5)|wt4_cos(t-4)|wt4_cos(t-3)|wt4_cos(t-2)|wt4_cos(t-1)||
|wt4_tp(t-5)|wt4_tp(t-4)|wt4_tp(t-3)|wt4_tp(t-2)|wt4_tp(t-1)||
|wt4_sp(t-5)|wt4_sp(t-4)|wt4_sp(t-3)|wt4_sp(t-2)|wt4_sp(t-1)||


#### 2. Input=[rows, time_back, 3] <br> Output[rows,1]
<br>
This data structure stacks the wind turbines on top of each other, keeping the same time frames in groups of four.  

|x1    |x2    |x3    |x4    |x5    |y   |
|------|------|------|------|------|----|
|wt1_sin(t-5)|wt1_sin(t-4)|wt1_sin(t-3)|wt1_sin(t-2)|wt1_sin(t-1)|wt1_sin(t)|
|wt1_tp(t-5)|wt1_tp(t-4)|wt1_tp(t-3)|wt1_tp(t-2)|wt1_tp(t-1)||
|wt1_sp(t-5)|wt1_sp(t-4)|wt1_sp(t-3)|wt1_sp(t-2)|wt1_sp(t-1)||
|wt2_sin(t-5)|wt2_sin(t-4)|wt2_sin(t-3)|wt2_sin(t-2)|wt2_sin(t-1)|wt2_sin(t)|
|wt2_tp(t-5)|wt2_tp(t-4)|wt2_tp(t-3)|wt2_tp(t-2)|wt2_tp(t-1)||
|wt2_sp(t-5)|wt2_sp(t-4)|wt2_sp(t-3)|wt2_sp(t-2)|wt2_sp(t-1)||
|wt3_sin(t-5)|wt3_sin(t-4)|wt3_sin(t-3)|wt3_sin(t-2)|wt3_sin(t-1)|wt3_sin(t)|
|wt3_tp(t-5)|wt3_tp(t-4)|wt3_tp(t-3)|wt3_tp(t-2)|wt3_tp(t-1)||
|wt3_sp(t-5)|wt3_sp(t-4)|wt3_sp(t-3)|wt3_sp(t-2)|wt3_sp(t-1)||
|wt4_sin(t-5)|wt4_sin(t-4)|wt4_sin(t-3)|wt4_sin(t-2)|wt4_sin(t-1)|wt4_sin(t)|
|wt4_tp(t-5)|wt4_tp(t-4)|wt4_tp(t-3)|wt4_tp(t-2)|wt4_tp(t-1)||
|wt4_sp(t-5)|wt4_sp(t-4)|wt4_sp(t-3)|wt4_sp(t-2)|wt4_sp(t-1)||

|x1    |x2    |x3    |x4    |x5    |y   |
|------|------|------|------|------|----|
|wt1_cos(t-5)|wt1_cos(t-4)|wt1_cos(t-3)|wt1_cos(t-2)|wt1_cos(t-1)|wt1_cos(t)|
|wt1_tp(t-5)|wt1_tp(t-4)|wt1_tp(t-3)|wt1_tp(t-2)|wt1_tp(t-1)||
|wt1_sp(t-5)|wt1_sp(t-4)|wt1_sp(t-3)|wt1_sp(t-2)|wt1_sp(t-1)||
|wt2_cos(t-5)|wt2_cos(t-4)|wt2_cos(t-3)|wt2_cos(t-2)|wt2_cos(t-1)|wt2_cos(t)|
|wt2_tp(t-5)|wt2_tp(t-4)|wt2_tp(t-3)|wt2_tp(t-2)|wt2_tp(t-1)||
|wt2_sp(t-5)|wt2_sp(t-4)|wt2_sp(t-3)|wt2_sp(t-2)|wt2_sp(t-1)||
|wt3_cos(t-5)|wt3_cos(t-4)|wt3_cos(t-3)|wt3_cos(t-2)|wt3_cos(t-1)|wt3_cos(t)|
|wt3_tp(t-5)|wt3_tp(t-4)|wt3_tp(t-3)|wt3_tp(t-2)|wt3_tp(t-1)||
|wt3_sp(t-5)|wt3_sp(t-4)|wt3_sp(t-3)|wt3_sp(t-2)|wt3_sp(t-1)||
|wt4_cos(t-5)|wt4_cos(t-4)|wt4_cos(t-3)|wt4_cos(t-2)|wt4_cos(t-1)|wt4_cos(t)|
|wt4_tp(t-5)|wt4_tp(t-4)|wt4_tp(t-3)|wt4_tp(t-2)|wt4_tp(t-1)||
|wt4_sp(t-5)|wt4_sp(t-4)|wt4_sp(t-3)|wt4_sp(t-2)|wt4_sp(t-1)||


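Dimensions: <br>
Structure 1: Input=[rows, time_back, 12] (4 turbines × 3 variables), Output=[rows, 4] (one column per turbine) <br>
Structure 2: Input=[rows, time_back, 3], Output=[rows, 1] (four times as many rows, since the turbines are stacked)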
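The two layouts can be sketched with NumPy as below. This is an illustration under assumptions, not the final pipeline: the input is taken to be one array of shape [n_turbines, n_steps, 3] with the sine (or cosine) target in channel 0, and the helper names are hypothetical. With four turbines and three variables each, the wide layout comes out to 12 features per time step.

```python
import numpy as np


def windows(arr, time_back):
    """Slice [n_steps, n_features] into [rows, time_back, n_features]."""
    rows = len(arr) - time_back
    return np.stack([arr[i:i + time_back] for i in range(rows)])


def structure_one(turbines, time_back):
    """Wide layout: all turbines side by side, one target column per turbine."""
    n_turbines, n_steps, n_vars = turbines.shape
    # Columns end up ordered wt1_sin, wt1_tp, wt1_sp, wt2_sin, ...
    wide = turbines.transpose(1, 0, 2).reshape(n_steps, n_turbines * n_vars)
    X = windows(wide, time_back)
    y = turbines[:, time_back:, 0].T
    return X, y


def structure_two(turbines, time_back):
    """Stacked layout: rows for the same time step sit together in groups of four."""
    per_turbine = np.stack([windows(t, time_back) for t in turbines], axis=1)
    X = per_turbine.reshape(-1, time_back, turbines.shape[2])
    y = turbines[:, time_back:, 0].T.reshape(-1, 1)
    return X, y


rng = np.random.default_rng(0)
data = rng.normal(size=(4, 50, 3))  # toy data: 4 turbines, 50 steps, 3 variables
X1, y1 = structure_one(data, time_back=5)
X2, y2 = structure_two(data, time_back=5)
```

The first layout lets the network see every turbine when predicting each one, while the second quadruples the number of training rows but shows only one turbine per row.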



## 2.  Alternate training data structures

Once I have found the best input data structure from above, I will run further experiments on the input data to determine whether different row structures help with training.  

Currently I train with the following: 
1.  Training data [x,y,z] is:<br>
    365 days (rows), 6 previous days (columns), 3 input vars (wind direction, speed, temperature)
    
I will experiment with the following structures:
1.  120 days (rows), 30 previous days (columns), 3 input vars
2.  90 days (rows), 60 previous days (columns), 3 input vars
3.  180 days (90 from previous 90 days, 30 from test year-1, etc.), 30 previous days (columns), 3 input vars. 
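Structure 3 can be sketched as a set of date ranges to pull training rows from. The function below is a hypothetical illustration: it assumes 3 prior years and centers each 30-day seasonal window on the anniversary of the forecast date, which is only one possible reading of the plan.

```python
from datetime import date, timedelta


def seasonal_windows(forecast_start, recent_days=90, seasonal_days=30, n_years=3):
    """Return (start, end) date ranges, end exclusive, to draw training rows from."""
    windows = [(forecast_start - timedelta(days=recent_days), forecast_start)]
    half = timedelta(days=seasonal_days // 2)
    for k in range(1, n_years + 1):
        # Same calendar date k years earlier (assumes the date is not 29 February).
        anchor = forecast_start.replace(year=forecast_start.year - k)
        start = anchor - half
        windows.append((start, start + timedelta(days=seasonal_days)))
    return windows


windows = seasonal_windows(date(2017, 1, 1))
total_days = sum((end - start).days for start, end in windows)
```

With the defaults this gives 90 recent days plus 30 days from each of three earlier years, for 180 days in total.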


## 3. LSTM Network Understanding
http://colah.github.io/posts/2015-08-Understanding-LSTMs/ <br>
Explore alternate model structures.  

## Possible Results

I will run the single and multivariate LSTMs on the larger dataset to establish baseline RMSE and MAE values.  I will compare these results to the new experiments I run.  
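One way to compute these two metrics on predictions in degrees is sketched below. It assumes the error should wrap around the circle, so that 350° versus 10° counts as 20° and not 340°; whether the baseline numbers use this convention is an assumption, and the function names are hypothetical.

```python
import numpy as np


def angular_error(true_deg, pred_deg):
    """Smallest signed difference between two directions, in [-180, 180) degrees."""
    diff = np.asarray(pred_deg, dtype=float) - np.asarray(true_deg, dtype=float)
    return (diff + 180.0) % 360.0 - 180.0


def mae_rmse(true_deg, pred_deg):
    """Mean absolute error and root mean squared error of the wrapped differences."""
    err = angular_error(true_deg, pred_deg)
    return np.mean(np.abs(err)), np.sqrt(np.mean(err ** 2))


mae, rmse = mae_rmse([350.0, 10.0, 180.0], [10.0, 350.0, 170.0])
```

Reporting both values helps separate a model that is slightly off everywhere from one that is usually accurate but occasionally far off, since RMSE penalizes large misses more heavily.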

## Timeline

1.  Increase the test set size and determine performance of baseline models.  
2.  Run experiments above and compare results.  

# Information from my previous work

These are functions from my previous work that I will reuse and alter.  

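```python
# Imports assume standalone Keras 2.x, which provides CuDNNLSTM in keras.layers.
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.layers import CuDNNLSTM, Dense
from keras.models import Sequential
from sklearn.metrics import mean_absolute_error


def train_predict():
    # trainX_initial, trainY_initial, validationX, validationY, testX and
    # recordsBack are module-level variables prepared before this call.
    n_features = trainX_initial.shape[2]
    model = Sequential()
    model.add(CuDNNLSTM(128 * n_features * 2, input_shape=(recordsBack, n_features)))
    model.add(Dense(1))
    model.compile(loss='mean_absolute_error', optimizer='adam')

    # Save the weights with the lowest validation loss; stop after 2 epochs without improvement.
    checkpointer = ModelCheckpoint('weights.h5', monitor='val_loss', verbose=2, save_best_only=True, save_weights_only=True, mode='auto', period=1)
    earlystopper = EarlyStopping(monitor='val_loss', min_delta=0, patience=2, verbose=0, mode='auto')
    model.fit(trainX_initial, trainY_initial, validation_data=(validationX, validationY), epochs=20, batch_size=testX.shape[0], verbose=2, shuffle=False, callbacks=[checkpointer, earlystopper])

    # Restore the best checkpoint and score it on the validation set.
    model.load_weights('weights.h5')
    validationPredict = model.predict(validationX)
    validation_mae = mean_absolute_error(validationY, validationPredict)

    # One more epoch, with Keras's default shuffling, before predicting on the test set.
    model.fit(trainX_initial, trainY_initial, validation_data=(validationX, validationY), epochs=1, batch_size=testX.shape[0], verbose=2)

    testPredict = model.predict(testX)

    # Sine and cosine are bounded, so clip predictions to [-1, 1].
    testPredict[testPredict > 1] = 1
    testPredict[testPredict < -1] = -1
    return testPredict, validation_mae
```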



```python
import numpy as np


def convertToDegrees(sin_prediction, cos_prediction):
    '''
    Converting sine and cosine back to a circular angle depends on finding which of the
    4 circular quadrants the prediction falls into. If sin and cos are both >= 0, degrees
    fall in 0-90.  If sin >= 0 and cos < 0, degrees fall in 90-180, etc.
    Returns two arrays of angles in degrees: one derived from sine, one from cosine.
    '''
    inverseSin = np.degrees(np.arcsin(sin_prediction))
    inverseCos = np.degrees(np.arccos(cos_prediction))
    # Despite the names, these lists hold degrees.
    radians_sin = []
    radians_cos = []
    for a, b, c, d in zip(sin_prediction, cos_prediction, inverseSin, inverseCos):
        # Values of exactly 0 are assigned to a quadrant so that no prediction is dropped.
        if a >= 0 and b >= 0:      # quadrant 1: 0-90
            radians_sin.append(c)
            radians_cos.append(d)
        elif a >= 0 and b < 0:     # quadrant 2: 90-180
            radians_sin.append(180 - c)
            radians_cos.append(d)
        elif a < 0 and b < 0:      # quadrant 3: 180-270
            radians_sin.append(180 - c)
            radians_cos.append(360 - d)
        else:                      # quadrant 4: 270-360 (a < 0, b >= 0)
            radians_sin.append(360 + c)
            radians_cos.append(360 - d)
    radians_sin = np.array(radians_sin)
    radians_cos = np.array(radians_cos)
    return radians_sin, radians_cos
```



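```python
import numpy as np

def calcWeightedDegreePredictions(sin_error, cos_error, radians_sin, radians_cos):
    # Weight each estimate by the other component's share of the total error,
    # so the component with the smaller error contributes more.
    # radians_sin and radians_cos hold degrees, as returned by convertToDegrees.
    errorTotal = cos_error + sin_error
    sinWeight = (errorTotal - sin_error) / errorTotal
    cosWeight = (errorTotal - cos_error) / errorTotal
    return np.add(sinWeight * radians_sin, cosWeight * radians_cos)
```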
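As a cross-check on the quadrant logic, the angle can also be recovered in a single call with `np.arctan2`, which resolves the quadrant from the signs of both arguments. This is an alternative sketch, not the weighted method used above, and the function name `degrees_from_sin_cos` is hypothetical.

```python
import numpy as np


def degrees_from_sin_cos(sin_prediction, cos_prediction):
    """Recover a direction in [0, 360) degrees from sine and cosine predictions."""
    angle = np.degrees(np.arctan2(sin_prediction, cos_prediction))
    # arctan2 returns values in (-180, 180]; shift negatives into [0, 360).
    return np.mod(angle, 360.0)


true_deg = np.array([10.0, 100.0, 190.0, 280.0])
recovered = degrees_from_sin_cos(np.sin(np.radians(true_deg)), np.cos(np.radians(true_deg)))
```

Because it combines both components into one estimate, this version does not weight sine and cosine by their validation errors, so it may be most useful as a sanity check on the weighted predictions.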