## Table of contents
1. [Introduction](#introduction)

2. [Decision Trees](#dectrees)

    2.1 [Preprocessing](#preproc)
    
    2.2 [Algorithm](#alg)
    
    2.3 [Predictions](#pred)

3. [Neural Networks](#neural)
    
    3.1 [Algorithm](#alg2)
    
    3.2 [Predictions](#pred2)
    
4. [Discussion](#disc)
    
5. [References](#ref)

## Introduction <a name="introduction"></a>
The goal of this project is to build two predictive models that predict wind turbine power from wind turbine speed. The dataset's features and labels (i.e. input and output values) [1](#1) are both continuous, so regression models are employed. The two models are a decision tree regressor and a neural network.

## Decision Trees <a name="dectrees"></a>
This project uses the DecisionTreeRegressor class from the sklearn package to perform regression using decision trees.

Decision Trees work by breaking down the dataset into smaller and smaller segments while a "decision tree" (i.e. a node structure with tests at each node to divide the data) is developed node by node. [2](#2)

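The tree-building procedure is the same for both classification and regression. For regression, each leaf predicts a numeric value (the mean of the training labels that reach it) rather than a class.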
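
As a rough illustration of how a single node might choose its test, the sketch below searches one feature for the threshold that minimises the summed squared error of the two resulting groups. It is a simplified stand-in and not scikit-learn's actual implementation. The function name `best_split` and the toy `speed` and `power` arrays are assumptions made for this example.

```python
import numpy as np

def best_split(x, y):
    """Return (threshold, sse) for the split of x that minimises the
    summed squared error when each side is predicted by its own mean."""
    order = np.argsort(x)
    x_sorted, y_sorted = x[order], y[order]
    best_threshold, best_sse = None, np.inf
    for i in range(1, len(x_sorted)):
        if x_sorted[i] == x_sorted[i - 1]:
            continue  # equal values cannot be separated by a threshold
        left, right = y_sorted[:i], y_sorted[i:]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best_sse:
            best_sse = sse
            # place the threshold halfway between the neighbouring values
            best_threshold = (x_sorted[i - 1] + x_sorted[i]) / 2
    return best_threshold, best_sse

# Toy data (hypothetical): low speeds give low power, high speeds give high power
speed = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
power = np.array([0.0, 1.0, 2.0, 50.0, 51.0, 52.0])

threshold, sse = best_split(speed, power)
print(threshold, sse)
```

A full tree would apply the same search recursively to each of the two groups until a stopping rule is met.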


### Preprocessing <a name="preproc"></a>
First the data is imported and preprocessed. The preprocessing consists of splitting the data into feature data (X values) and labels (y values) and reshaping the X values into a 2D array, as scikit-learn estimators such as DecisionTreeRegressor do not accept a 1D array for the X values.

In [66]:
# import data 
import pandas as pd
lin_data = pd.read_csv('powerproduction.csv')
lin_data.head()

# X and y values for regression
X = lin_data.iloc[:, 0].values
y = lin_data.iloc[:, 1].values

# The X values are reshaped as 
# they only contain one feature
X = X.reshape(-1, 1)

Preprocessing often involves scaling the data and removing outliers etc. The data will not be scaled in this project so the models are as simple as possible and there is no confusion between the predicted data and the observed data. Outliers will not be removed as there is no specific reason given why they are not accurate data.
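
To make the reshaping step in the preprocessing cell concrete, the short sketch below shows what `reshape(-1, 1)` does to a 1D array. The `speeds` values are made up for illustration and are not taken from the dataset.

```python
import numpy as np

# Hypothetical wind speeds, stored as a flat 1D array
speeds = np.array([3.0, 7.5, 12.0])

# -1 lets NumPy infer the number of rows; 1 fixes a single feature column
column = speeds.reshape(-1, 1)

print(speeds.shape)  # one axis: (n_samples,)
print(column.shape)  # two axes: (n_samples, n_features)
```

The values are unchanged; only the shape differs, so each row now holds one sample with one feature.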

### Algorithm <a name="alg"></a>
The train_test_split function is imported from the model_selection module of the sklearn package. It is used to randomly split the data into training and testing data.
The DecisionTreeRegressor class is imported, and a regressor is created and fitted to the X_train and y_train datasets. A test size of 27% of the data is chosen (so 73% of the data is training data).

In [67]:
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.27, random_state=0)

regressor = DecisionTreeRegressor();
regressor.fit(X_train, y_train);
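
The effect of `test_size=0.27` can be checked on stand-in data. The sketch below assumes a dataset of 500 rows, and the `X_demo` and `y_demo` arrays are placeholders rather than the real wind turbine data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 500 rows with a single feature (assumed dataset size)
X_demo = np.arange(500, dtype=float).reshape(-1, 1)
y_demo = 2.0 * X_demo.ravel()

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.27, random_state=0
)

print(X_tr.shape, X_te.shape)
```

Fixing `random_state` makes the shuffle repeatable, so the same rows land in the test set on every run.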

### Predictions <a name="pred"></a>
The predicted values are calculated using the regressor and the X_test testing data. Then various metrics (mean absolute error, mean squared error and root mean squared error) are calculated to test the efficacy of the model. In addition, the coefficient of variation of the RMSE (the root mean squared error as a percentage of the mean of the observed values) is calculated.

In [68]:
from sklearn import metrics
import numpy as np
y_pred = regressor.predict(X_test)
print('Mean Absolute Error:', metrics.mean_absolute_error(y_test, y_pred))
print('Mean Squared Error:', metrics.mean_squared_error(y_test, y_pred))
print('Root Mean Squared Error:', np.sqrt(metrics.mean_squared_error(y_test, y_pred)))
print('Mean of observed y values:', np.mean(y))
# coefficient of variation 
print('Coefficient of variation:', (100*np.sqrt(metrics.mean_squared_error(y_test, y_pred)))/np.mean(y))

Mean Absolute Error: 5.822511111111111
Mean Squared Error: 172.3565864222222
Root Mean Squared Error: 13.128464739725747
Mean of observed y values: 48.014584
Coefficient of variation: 27.34266059605087


The coefficient of variation is 27.34%. A good coefficient of variation is considered to be less than 25% [3](#3), so the decision tree regression model comes close to this threshold but does not meet it.
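
For reference, the metrics used above can be reproduced by hand with NumPy. The sketch below uses four made-up observations (`y_true`) and predictions (`y_hat`), which are assumptions for illustration only.

```python
import numpy as np

# Made-up observed and predicted values
y_true = np.array([10.0, 20.0, 30.0, 40.0])
y_hat = np.array([12.0, 18.0, 33.0, 37.0])

errors = y_true - y_hat                  # [-2, 2, -3, 3]
mae = np.mean(np.abs(errors))            # mean absolute error
mse = np.mean(errors ** 2)               # mean squared error
rmse = np.sqrt(mse)                      # root mean squared error
cv_rmse = 100 * rmse / np.mean(y_true)   # RMSE as a percentage of the mean

print(mae, mse, rmse, cv_rmse)
```

Because the coefficient of variation divides by the mean of the observed values, it lets errors be compared across datasets with different scales.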

## Neural Networks <a name="neural"></a>
This project uses the Keras API from the tensorflow package to perform regression using artificial neural networks.

Neural networks operate in a similar manner to neurons in the brain [4](#4). Nodes in the neural network are connected together and they each have an activation function which depends on the inputs (which are the outputs of the other neurons, obtained via the connections). 

Activation functions differ, but many (such as the sigmoid) are approximately 1 once the input exceeds a certain threshold value and approximately 0 below it. In addition, the connections between the neurons have weights which amplify or diminish the outputs (in an artificial neural network these are initially set to randomly chosen small numbers) [5](#5).
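
A single neuron of this kind can be sketched in a few lines. The example below assumes a sigmoid activation, and the `inputs`, `weights` and `bias` values are hypothetical.

```python
import numpy as np

def sigmoid(z):
    """Squash z into (0, 1): near 0 for very negative z, near 1 for very positive z."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical outputs of three upstream neurons
inputs = np.array([0.5, -1.0, 2.0])
# Hypothetical small connection weights, as used at initialisation
weights = np.array([0.1, -0.2, 0.05])
bias = 0.0

# The neuron's output is the activation of the weighted sum of its inputs
activation = sigmoid(np.dot(weights, inputs) + bias)
print(activation)
```

Training adjusts the weights and bias so that the outputs of the final layer move closer to the labels.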

In a typical artificial neural network, there is an input layer of neurons (which have inputs not connected to any other neuron), an output layer (which have outputs not connected to any other neuron) and a hidden layer (neurons which have inputs and outputs connected to other neurons). A deep neural network has multiple hidden layers. [6](#6)

Neural networks can be used to learn functions and patterns to make predictions. TensorFlow is a library for building and training neural networks, and it is used here to make the predictions.

### Algorithm <a name="alg2"></a>
The data preprocessed for the decision tree regressor is used again here.

The train_test_split function is again imported from the model_selection module of the sklearn package and used to randomly split the data into training and testing data, this time with a test size of 25%.

The Input and Dense classes are imported from tensorflow.keras.layers. The input layer and the hidden layers of the neural network are created as objects of these classes. [7](#7)

There are 500 nodes in the first hidden layer of the artificial neural network. This number was chosen to match the 500 feature values and 500 labels (one pair per observation) available to train the model on. The inputs of this layer are connected to the outputs of the input layer (whose shape is given by the number of features in the input data X).

After this, two more hidden layers, the first with 100 neurons and the second with 50 neurons, are created. The outputs of the last hidden layer are connected to the output layer (which has only 1 neuron). 
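
The size of this architecture can be estimated by hand. The sketch below assumes that every Dense layer is fully connected with one bias per neuron (the Keras default) and that the input has a single feature. The names `layer_sizes` and `params_per_layer` are illustrative.

```python
# Neurons per layer: input, three hidden layers, output
layer_sizes = [1, 500, 100, 50, 1]

# Each dense layer has (inputs * outputs) weights plus one bias per output neuron
params_per_layer = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total_params = sum(params_per_layer)

print(params_per_layer)  # [1000, 50100, 5050, 51]
print(total_params)      # 56201
```

Most of the trainable parameters sit between the first and second hidden layers, which is typical when a wide layer feeds a narrower one.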

The activation function in this case is the rectified linear activation function or "relu". It is a piecewise linear function that returns the input if the input is positive. If the input is not positive it outputs 0. [8](#8)
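
The behaviour described above can be written directly in NumPy. This is a minimal sketch for illustration; the function name `relu` and the sample `values` are assumptions, and Keras provides its own implementation.

```python
import numpy as np

def relu(z):
    """Return z where z is positive and 0 elsewhere."""
    return np.maximum(0.0, z)

values = np.array([-2.0, 0.0, 3.5])
print(relu(values))  # [0.  0.  3.5]
```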

A regression model called model is created using the Model class. It uses the mean squared error as the loss function and the Adam algorithm as the optimiser. [9](#9)

The model is compiled as follows:

In [73]:
from sklearn.model_selection import train_test_split
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

input_layer = Input(shape=(X.shape[1],))
dense_layer_1 = Dense(500, activation='relu')(input_layer)
dense_layer_2 = Dense(100, activation='relu')(dense_layer_1)
dense_layer_3 = Dense(50, activation='relu')(dense_layer_2)
#dense_layer_4 = Dense(25, activation='relu')(dense_layer_3)
#dense_layer_5 = Dense(10, activation='relu')(dense_layer_4)
output = Dense(1)(dense_layer_3)

model = Model(inputs=input_layer, outputs=output)
model.compile(loss="mean_squared_error" , optimizer="adam", metrics=["mean_squared_error"])

The model is trained on the training data as follows:

In [74]:
history = model.fit(X_train, y_train, batch_size=2, epochs=200, verbose=1, validation_split=0.25)

Epoch 1/200
...
Epoch 200/200


### Predictions <a name="pred2"></a>
The predicted values are calculated using the model and the X_test testing data as before. Then various metrics (including the coefficient of variation) are calculated to test the efficacy of the model as before.

In [75]:
y_pred = model.predict(X_test)
print('Mean Absolute Error:', metrics.mean_absolute_error(y_test, y_pred))
print('Mean Squared Error:', metrics.mean_squared_error(y_test, y_pred))
print('Root Mean Squared Error:', np.sqrt(metrics.mean_squared_error(y_test, y_pred)))
print('Mean of observed y values:', np.mean(y))
# coefficient of variation 
print('Coefficient of variation:', (100*np.sqrt(metrics.mean_squared_error(y_test, y_pred)))/np.mean(y))

Mean Absolute Error: 6.508903464675902
Mean Squared Error: 210.48527312759373
Root Mean Squared Error: 14.508110598130749
Mean of observed y values: 48.014584
Coefficient of variation: 30.216049769650716


The coefficient of variation in this case is 30.2%. This does not meet the 25% threshold and is higher (i.e. worse) than the coefficient of variation for the decision tree regression (27.34%). The neural network also took longer to train than the decision tree, which was effectively instantaneous.

## Discussion <a name="disc"></a>

Both models produce results that come close to, but do not meet, the conventional threshold for a good regression model. The neural network was quite large (650 hidden neurons in total) and was trained for 200 epochs, yet it was less accurate than the decision tree regression.

Removal of outliers was intentionally left out of the preprocessing because there was no specific reason to do so except to improve the accuracy of the models. However, removal of outliers could improve the accuracy of the regression. This will be attempted in the "Fundamentals of Data Analysis" project which requires linear regression to be performed on data.

## References <a name="ref"></a>
[1] Medium. 2020. Some Key Machine Learning Definitions. [online] Available at: <https://medium.com/technology-nineleaps/some-key-machine-learning-definitions-b524eb6cb48> [Accessed 28 December 2020]. <a name="1"></a> <br>
[2] 2020. [online] Available at: <https://www.saedsayad.com/decision_tree_reg.html> [Accessed 28 December 2020]. <a name="2"></a> <br>
[3] kW Engineering. 2020. How To Assess A Regression's Predictive Power For Energy Use. [online] Available at: <https://www.kw-engineering.com/how-to-assess-a-regressions-predictive-power-energy-use/> [Accessed 28 December 2020]. <a name="3"></a> <br>
[4] Medium. 2020. A Beginner's Guide To Neural Networks: Part One. [online] Available at: <https://towardsdatascience.com/a-beginners-guide-to-neural-networks-b6be0d442fa4> [Accessed 28 December 2020]. <a name="4"></a> <br>
[5] Brownlee, J., 2020. Why Initialize A Neural Network With Random Weights?. [online] Machine Learning Mastery. Available at: <https://machinelearningmastery.com/why-initialize-a-neural-network-with-random-weights/> [Accessed 28 December 2020]. <a name="5"></a> <br>
[6] Brownlee, J., 2020. What Is Deep Learning?. [online] Machine Learning Mastery. Available at: <https://machinelearningmastery.com/what-is-deep-learning/> [Accessed 28 December 2020]. <a name="6"></a> <br>
[7] Stack Abuse. 2020. Tensorflow 2.0: Solving Classification And Regression Problems. [online] Available at: <https://stackabuse.com/tensorflow-2-0-solving-classification-and-regression-problems/> [Accessed 28 December 2020]. <a name="7"></a> <br>
[8] Brownlee, J., 2020. A Gentle Introduction To The Rectified Linear Unit (ReLU). [online] Machine Learning Mastery. Available at: <https://machinelearningmastery.com/rectified-linear-activation-function-for-deep-learning-neural-networks/> [Accessed 28 December 2020]. <a name="8"></a> <br>
[9] TensorFlow. 2020. tf.keras.optimizers.Adam | TensorFlow Core v2.4.0. [online] Available at: <https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Adam> [Accessed 28 December 2020]. <a name="9"></a>