# Wind Energy Data Model

## Project Description
A research engineer is investigating the use of windmills to generate electricity in different provinces of Canada.
She has collected data on the DC output of these windmills and the corresponding wind velocity. The data are listed
in "Windmill.csv".

Build a model to predict the DC output for a given wind speed in mph (miles per hour).

## Importing the Libraries

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

## Importing the Data and Checking

In [2]:
dataset = pd.read_csv('data/Windmill.csv')
X = dataset.iloc[:, 1:2].values
y = dataset.iloc[:, -1].values
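
Positional slicing with `iloc` depends on the column order of the file. As a minimal sketch, the same columns could be selected by name instead. It assumes the headers match those in the table displayed in this notebook, and it inlines a few of those rows rather than reading the CSV; the names `csv_text`, `sample`, `X_named` and `y_named` are illustrative only.

```python
import io
import pandas as pd

# A few rows from the table shown in this notebook, inlined so the sketch
# runs without the CSV file. The exact layout of Windmill.csv is an assumption.
csv_text = """Location,Wind Velocity(mph),DC Output
Manitoba,2.45,0.423
Manitoba,2.7,0.5
Manitoba,2.9,0.653
Manitoba,3.05,0.558
Manitoba,3.4,1.057
"""
sample = pd.read_csv(io.StringIO(csv_text))

# Double brackets keep the feature matrix 2D: (n_samples, 1).
X_named = sample[["Wind Velocity(mph)"]].values
# Single brackets give a 1D target vector: (n_samples,).
y_named = sample["DC Output"].values

print(X_named.shape, y_named.shape)  # (5, 1) (5,)
```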

In [3]:
dataset

Unnamed: 0,Location,Wind Velocity(mph),DC Output
0,Manitoba,2.45,0.423
1,Manitoba,2.7,0.5
2,Manitoba,2.9,0.653
3,Manitoba,3.05,0.558
4,Manitoba,3.4,1.057
5,Newfoundland,3.6,1.137
6,Newfoundland,3.95,1.144
7,Newfoundland,4.1,1.194
8,Newfoundland,4.6,1.562
9,Alberta,5.0,1.582


In [4]:
# quick check of X
X

array([[ 2.45],
       [ 2.7 ],
       [ 2.9 ],
       [ 3.05],
       [ 3.4 ],
       [ 3.6 ],
       [ 3.95],
       [ 4.1 ],
       [ 4.6 ],
       [ 5.  ],
       [ 5.45],
       [ 5.8 ],
       [ 6.  ],
       [ 6.2 ],
       [ 6.35],
       [ 7.  ],
       [ 7.4 ],
       [ 7.85],
       [ 8.15],
       [ 8.8 ],
       [ 9.1 ],
       [ 9.55],
       [ 9.7 ],
       [10.  ],
       [10.2 ]])

**Note:** StandardScaler only accepts 2D (matrix) input, so we need to reshape y from a 1D array into a
25x1 matrix.

In [5]:
y = y.reshape(len(y), 1)
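
As a small illustration of that reshape, the sketch below uses the first three DC outputs from the table. Passing `-1` lets NumPy infer the number of rows, which is equivalent to `len(y)` here; `ravel()` goes back to 1D. The variable names are illustrative only.

```python
import numpy as np

y_1d = np.array([0.423, 0.5, 0.653])  # first three DC outputs from the table
y_2d = y_1d.reshape(-1, 1)            # -1 lets NumPy infer the row count
y_back = y_2d.ravel()                 # flatten back to a 1D vector

print(y_1d.shape)    # (3,)
print(y_2d.shape)    # (3, 1)
print(y_back.shape)  # (3,)
```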

## Feature Scaling

In [6]:
from sklearn.preprocessing import StandardScaler

sc_X = StandardScaler()
sc_y = StandardScaler()
X = sc_X.fit_transform(X)
y = sc_y.fit_transform(y)

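**Note:** For SVR we scale both X **and** y. scikit-learn's SVR does not scale the data itself: the RBF kernel depends on distances between samples, and the `epsilon` margin is expressed in the units of y, so unscaled data can give a poor fit.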
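
To make the scaler's behaviour concrete, here is a minimal sketch on the first five wind speeds from the table. `fit_transform` learns the mean and standard deviation and applies them; `transform` reuses those learned statistics on new data; `inverse_transform` maps scaled values back to the original units. The value 3.0 mph is an arbitrary illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

speeds = np.array([[2.45], [2.7], [2.9], [3.05], [3.4]])  # from the table

scaler = StandardScaler()
scaled = scaler.fit_transform(speeds)  # learn mean/std, then scale

# The scaled column has (numerically) zero mean and unit standard deviation.
print(scaled.mean(), scaled.std())

# New data must reuse the learned statistics: transform, not fit_transform.
new_scaled = scaler.transform([[3.0]])

# inverse_transform undoes the scaling and recovers the original value.
restored = scaler.inverse_transform(new_scaled)
print(restored)
```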

## Training the SVR Model on the Dataset

**Note:** Here we chose the RBF kernel, which is also the default in scikit-learn's `SVR`. Other kernels we may try, with the value to pass as `kernel`:

- Polynomial kernel (`poly`)
- Gaussian radial basis function kernel (`rbf`)
- Linear kernel (`linear`)
- Sigmoid kernel (`sigmoid`)

This topic will be covered later under SVM.
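
As a rough sketch of how the kernels listed above could be compared, the loop below fits one `SVR` per kernel on the ten rows displayed in the table and reports the R^2 score on that same training data. This only shows how to switch kernels; a training score on ten points is not a sound basis for choosing one, and all other hyperparameters are left at their defaults.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# The ten rows displayed in the table above.
speeds = np.array([2.45, 2.7, 2.9, 3.05, 3.4, 3.6, 3.95, 4.1, 4.6, 5.0]).reshape(-1, 1)
output = np.array([0.423, 0.5, 0.653, 0.558, 1.057, 1.137, 1.144, 1.194, 1.562, 1.582])

X_s = StandardScaler().fit_transform(speeds)
y_s = StandardScaler().fit_transform(output.reshape(-1, 1)).ravel()

scores = {}
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    model = SVR(kernel=kernel).fit(X_s, y_s)
    scores[kernel] = model.score(X_s, y_s)  # R^2 on the training data
    print(f"{kernel:8s} training R^2 = {scores[kernel]:.3f}")
```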

In [7]:
from sklearn.svm import SVR

regressor = SVR(kernel='rbf')
# SVR expects a 1D target; ravel() avoids the DataConversionWarning
regressor.fit(X, y.ravel())

SVR()

## Using the Model to Predict the DC Output

In [8]:
# Select a wind speed and predict the DC output
new_X = [[9]]

# Scale it with the scaler already fitted on the training data
# (transform, not fit_transform, so the learned mean and std are reused)
scaled_new_X = sc_X.transform(new_X)

# Predict the output (still in scaled units)
scaled_result = regressor.predict(scaled_new_X)

# predict() returns a 1D array, but inverse_transform expects 2D, so reshape
result = sc_y.inverse_transform(scaled_result.reshape(-1, 1))

print(f"The DC Output will be {result[0, 0]:.3f}")
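
The scale, predict and inverse-transform steps can also be delegated to scikit-learn. The sketch below is an alternative arrangement, not the approach used in this notebook: a pipeline scales X, and `TransformedTargetRegressor` scales y during fitting and inverts the scaling on prediction. It is fitted only on the ten rows displayed in the table, so the query of 4.0 mph is chosen to lie inside that range.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# The ten rows displayed in the table above.
speeds = np.array([2.45, 2.7, 2.9, 3.05, 3.4, 3.6, 3.95, 4.1, 4.6, 5.0]).reshape(-1, 1)
output = np.array([0.423, 0.5, 0.653, 0.558, 1.057, 1.137, 1.144, 1.194, 1.562, 1.582])

model = TransformedTargetRegressor(
    regressor=make_pipeline(StandardScaler(), SVR(kernel="rbf")),  # scales X
    transformer=StandardScaler(),                                   # scales y
)
model.fit(speeds, output)

# Input and output are both in original units; no manual reshaping needed.
prediction = model.predict([[4.0]])
print(f"Predicted DC output at 4.0 mph: {prediction[0]:.3f}")
```

With this arrangement there is no separate scaler to call on new data, which removes the risk of refitting it by mistake.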

## Visualising the SVR Results

In [None]:
# Note: we plot the actual values, not the scaled data, so we
# inverse-transform the scaled data back to the original units.

plt.style.use('dark_background')  # set the style before drawing
plt.scatter(sc_X.inverse_transform(X), sc_y.inverse_transform(y), color='red')
# predict() returns a 1D array; inverse_transform expects 2D, so reshape
plt.plot(sc_X.inverse_transform(X), sc_y.inverse_transform(regressor.predict(X).reshape(-1, 1)))

plt.title('Wind Energy Data Model (SVR)')
plt.xlabel('Wind Velocity(mph)')
plt.ylabel('DC Output')
plt.show()
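
Plotting the prediction only at the observed wind speeds draws straight segments between them. One possible refinement is to evaluate the model on a dense grid so the curve looks smooth. The sketch below refits on the ten rows displayed in the table to stay self-contained; the grid size of 100 points and the names `sc_speed`, `sc_out`, `svr`, `grid` and `curve` are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# The ten rows displayed in the table above.
speeds = np.array([2.45, 2.7, 2.9, 3.05, 3.4, 3.6, 3.95, 4.1, 4.6, 5.0]).reshape(-1, 1)
output = np.array([0.423, 0.5, 0.653, 0.558, 1.057, 1.137, 1.144, 1.194, 1.562, 1.582])

sc_speed, sc_out = StandardScaler(), StandardScaler()
X_s = sc_speed.fit_transform(speeds)
y_s = sc_out.fit_transform(output.reshape(-1, 1)).ravel()
svr = SVR(kernel="rbf").fit(X_s, y_s)

# Dense grid in original units, covering the observed range only.
grid = np.linspace(speeds.min(), speeds.max(), 100).reshape(-1, 1)

# Scale the grid, predict, then map predictions back to original units.
curve = sc_out.inverse_transform(svr.predict(sc_speed.transform(grid)).reshape(-1, 1))

fig, ax = plt.subplots()
ax.scatter(speeds, output, color="red")
ax.plot(grid, curve)
ax.set_title("Wind Energy Data Model (SVR), dense grid")
ax.set_xlabel("Wind Velocity(mph)")
ax.set_ylabel("DC Output")
```

In a notebook the figure renders at the end of the cell; in a script, call `plt.show()` afterwards.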