# Support Vector Regression (SVR)

## Importing the libraries

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

## Importing the dataset

In [2]:
from google.colab import files

uploaded = files.upload()

Saving Power_Plant.csv to Power_Plant.csv


In [3]:
import io

dataset = pd.read_csv(io.BytesIO(uploaded['Power_Plant.csv']), delimiter=",")

# dataset = pd.read_csv('Power_Plant.csv')  # use this line instead when running locally (e.g. in VS Code)

dataset.head(10)

| | AT | V | AP | RH | PE |
|---|---|---|---|---|---|
| 0 | 14.96 | 41.76 | 1024.07 | 73.17 | 463.26 |
| 1 | 25.18 | 62.96 | 1020.04 | 59.08 | 444.37 |
| 2 | 5.11 | 39.4 | 1012.16 | 92.14 | 488.56 |
| 3 | 20.86 | 57.32 | 1010.24 | 76.64 | 446.48 |
| 4 | 10.82 | 37.5 | 1009.23 | 96.62 | 473.9 |
| 5 | 26.27 | 59.44 | 1012.23 | 58.77 | 443.67 |
| 6 | 15.89 | 43.96 | 1014.02 | 75.24 | 467.35 |
| 7 | 9.48 | 44.71 | 1019.12 | 66.43 | 478.42 |
| 8 | 14.64 | 45.0 | 1021.78 | 41.25 | 475.98 |
| 9 | 11.74 | 43.56 | 1015.14 | 70.72 | 477.5 |


## Splitting the dataset into the Training set and Test set

In [4]:
# First, divide the data set into independent (X) and dependent (y) variables
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values
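
As a small illustration of what the two `iloc` slices select, the sketch below rebuilds the first two rows shown by `dataset.head(10)` in a toy DataFrame. The names `demo`, `X_demo` and `y_demo` are hypothetical and used only for this example.

```python
import pandas as pd

# Toy frame holding the first two rows printed by dataset.head(10)
demo = pd.DataFrame({
    "AT": [14.96, 25.18],
    "V": [41.76, 62.96],
    "AP": [1024.07, 1020.04],
    "RH": [73.17, 59.08],
    "PE": [463.26, 444.37],
})

# All columns except the last one become the feature matrix (2D)
X_demo = demo.iloc[:, :-1].values
# Only the last column becomes the target vector (1D)
y_demo = demo.iloc[:, -1].values

print(X_demo.shape)  # (2, 4)
print(y_demo.shape)  # (2,)
```

The target comes out as a 1D array, which is why it has to be reshaped later before it can be passed to a scaler.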

In [5]:
# Import the train_test_split function from the sklearn.model_selection module
from sklearn.model_selection import train_test_split

# Use train_test_split to split the data into training and testing sets
# X: Features, y: Target variable, test_size: Proportion of the dataset to include in the test split
# random_state: Seed for reproducibility, ensures the same split every time the code is run
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
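
To see what `test_size=0.2` and `random_state=0` do, here is a minimal sketch on synthetic data. The arrays `X_demo` and `y_demo` are made-up stand-ins, not the power plant data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 100 samples with 4 features (hypothetical values)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 4))
y_demo = rng.normal(size=100)

# 20% of the rows go to the test set, the remaining 80% to the training set
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)
print(X_tr.shape, X_te.shape)  # (80, 4) (20, 4)

# Repeating the call with the same random_state reproduces the same split
X_tr2, X_te2, y_tr2, y_te2 = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)
print(np.array_equal(X_te, X_te2))  # True
```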

## Feature Scaling

### X_train and X_test scaling process


In [6]:
# X_train BEFORE scaling
print(X_train)

[[  11.22   43.13 1017.24   80.9 ]
 [  13.67   54.3  1015.92   75.42]
 [  32.84   77.95 1014.68   45.8 ]
 ...
 [  16.81   38.52 1018.26   75.21]
 [  12.8    41.16 1022.43   86.19]
 [  32.32   67.9  1006.08   37.93]]


In [7]:
# X_test BEFORE scaling
print(X_test)

[[  28.66   77.95 1009.56   69.07]
 [  17.48   49.39 1021.51   84.53]
 [  14.86   43.14 1019.21   99.14]
 ...
 [  12.24   44.92 1023.74   88.21]
 [  27.28   47.93 1003.46   59.22]
 [  17.28   39.99 1007.09   74.25]]


In [8]:
from sklearn.preprocessing import StandardScaler

# X_train and X_test scaling process

# Create scaler object ss_scaleler_X.
ss_scaleler_X = StandardScaler()

# Fit the scaler on X_train only, then apply the same transformation to X_test
X_train = ss_scaleler_X.fit_transform(X_train)
X_test = ss_scaleler_X.transform(X_test)

In [9]:
# X_train AFTER scaling
print(X_train)

[[-1.13572795 -0.88685592  0.67357894  0.52070558]
 [-0.80630243 -0.00971567  0.45145467  0.14531044]
 [ 1.77128416  1.84743445  0.24279248 -1.88374143]
 ...
 [-0.38409993 -1.24886277  0.84522042  0.13092486]
 [-0.9232821  -1.04155299  1.54693117  0.8830852 ]
 [ 1.70136528  1.05824381 -1.20438076 -2.42285818]]


In [10]:
# X_test AFTER scaling
print(X_test)

[[ 1.20924389  1.84743445 -0.61878043 -0.28968211]
 [-0.29401214 -0.39528045  1.39211729  0.76937061]
 [-0.64629575 -0.88607065  1.00508258  1.77019599]
 ...
 [-0.99857936 -0.74629361  1.76737267  1.02146078]
 [ 1.02368993 -0.50992904 -1.64526378 -0.96443434]
 [-0.32090402 -1.13342892 -1.03442205  0.0651622 ]]
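
The scaler is fitted on the training set only, and the test set is transformed with the training mean and standard deviation. The sketch below checks this by hand on a tiny made-up array; the names `train`, `test` and `scaler` are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Tiny made-up arrays with two features on very different scales
train = np.array([[10.0, 1000.0], [20.0, 1010.0], [30.0, 1020.0]])
test = np.array([[25.0, 1005.0]])

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train)  # learns mean and std from train
test_scaled = scaler.transform(test)        # reuses the train statistics

# Same computation by hand: (x - train mean) / train std
manual = (test - train.mean(axis=0)) / train.std(axis=0)
print(np.allclose(test_scaled, manual))  # True
```

Because the test set never contributes to the mean or standard deviation, no information from it leaks into preprocessing.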


### Scaling of **y_train**

In [11]:
# y_train BEFORE scaling
print(y_train)

[473.93 467.87 431.97 ... 459.01 462.72 428.12]


The following cells **standardize** the target variable **y_train** as a data preprocessing step.

To scale **y_train**, we create a second StandardScaler() object, which we name ss_scaleler_y. StandardScaler() only accepts a **2D array**, but y_train is a 1D array, so it must first be reshaped:

y_train = y_train.reshape(len(y_train), 1)

The **reshape** call turns the 1D array of target values into a 2D array with a single column.

In [12]:
# First, the y_train data is converted into a 2D array.
# The StandardScaler() object requires such a format.

y_train = y_train.reshape(len(y_train), 1)
print(y_train)

[[473.93]
 [467.87]
 [431.97]
 ...
 [459.01]
 [462.72]
 [428.12]]


In [13]:
# now scale y_train.
ss_scaleler_y = StandardScaler()
y_train = ss_scaleler_y.fit_transform(y_train)

In [14]:
print(y_train)

[[ 1.15069786]
 [ 0.79540777]
 [-1.30936356]
 ...
 [ 0.27595724]
 [ 0.49346982]
 [-1.53508417]]


In [15]:
# Flatten y_train back to its original 1D FORM, which SVR expects for the target.
y_train = y_train.reshape(len(y_train))
print(y_train)

[ 1.15069786  0.79540777 -1.30936356 ...  0.27595724  0.49346982
 -1.53508417]
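
The reshape, scale and flatten steps can be condensed into a short round trip. This sketch uses the first three y_train values printed above; `scaler_y` and the other names are hypothetical, and `reshape(-1, 1)` is an equivalent shorthand for `reshape(len(y), 1)`.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# First three y_train values printed earlier in the notebook
y_demo = np.array([473.93, 467.87, 431.97])

scaler_y = StandardScaler()

# Passing the 1D array directly is rejected by the scaler
try:
    scaler_y.fit_transform(y_demo)
    raised = False
except ValueError:
    raised = True
print(raised)  # True

# Reshape to a single column, scale, then flatten back to 1D
y_scaled = scaler_y.fit_transform(y_demo.reshape(-1, 1)).ravel()
print(y_scaled.shape)  # (3,)

# inverse_transform undoes the scaling
y_back = scaler_y.inverse_transform(y_scaled.reshape(-1, 1)).ravel()
print(np.allclose(y_back, y_demo))  # True
```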


## Training the SVR model on the Training set

In [16]:
# Import the SVR (Support Vector Regression) model from the sklearn.svm module
from sklearn.svm import SVR

# Create an SVR model with an RBF (Radial Basis Function) kernel
regressor = SVR(kernel='rbf')

# Fit the SVR model to the training data
regressor.fit(X_train, y_train)
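
`SVR(kernel='rbf')` leaves every other hyperparameter at its default. As a rough sketch, the defaults can be inspected with `get_params()`; the values in the comment are those of recent scikit-learn versions. The sine data and the name `demo_model` are assumptions made only for this illustration.

```python
import numpy as np
from sklearn.svm import SVR

demo_model = SVR(kernel='rbf')

# Defaults that the notebook's call relies on implicitly
params = demo_model.get_params()
print(params['C'], params['epsilon'], params['gamma'])  # 1.0 0.1 scale

# Made-up 1D regression problem: a noiseless sine curve
rng = np.random.default_rng(0)
X_demo = np.sort(rng.uniform(-3, 3, size=(80, 1)), axis=0)
y_demo = np.sin(X_demo).ravel()
demo_model.fit(X_demo, y_demo)

# Only samples on or outside the epsilon-tube become support vectors
n_sv = len(demo_model.support_)
print(n_sv <= len(X_demo))  # True
```

C, epsilon and gamma all act on distances and errors measured in the units of the data, which is one reason the notebook standardizes both X and y before fitting.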

## Predicting the Test set results
The trained SVR model predicts the target for each sample in the scaled test set.

In [17]:
y_pred = regressor.predict(X_test)

# The model was trained on the scaled target, so its predictions are
# also in scaled (standardized) form, as the output below shows.

print(y_pred)

[-1.18727339  0.21311322  0.39445405 ...  0.9556216  -0.87278266
  0.38779571]


### Inverse scaling of the predictions (y_pred)

Inverse scaling returns the predictions to the original scale of the target.

As shown above, the predicted values in **y_pred** are still in scaled form, so printing them again gives the same output:

In [18]:
print(y_pred)

[-1.18727339  0.21311322  0.39445405 ...  0.9556216  -0.87278266
  0.38779571]


Before calculating the performance of our model, **y_pred** must be returned to its **original scale** with the statement

y_pred = ss_scaleler_y.inverse_transform(y_pred)

The cells below first reshape y_pred into a 2D array, the same shape the scaler was fitted on.

In [19]:
# First, adjust its shape by converting it to a 2D array.
y_pred = y_pred.reshape(len(y_pred), 1)
print(y_pred)

[[-1.18727339]
 [ 0.21311322]
 [ 0.39445405]
 ...
 [ 0.9556216 ]
 [-0.87278266]
 [ 0.38779571]]


In [20]:
# Return to original SCALE.
y_pred = ss_scaleler_y.inverse_transform(y_pred)
print(y_pred)

[[434.05242921]
 [457.93810186]
 [461.03113894]
 ...
 [470.60268461]
 [439.41653548]
 [460.91757115]]


In [21]:
# And convert it to its original SHAPE.
y_pred = y_pred.reshape(len(y_pred))
print(y_pred)

[434.05242921 457.93810186 461.03113894 ... 470.60268461 439.41653548
 460.91757115]
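
As an alternative to reshaping and inverse-transforming by hand, scikit-learn's `TransformedTargetRegressor` can wrap the target scaling, and a pipeline can wrap the feature scaling. This is not what the notebook does; it is a sketch of an equivalent route, run on synthetic data (all names and values below are hypothetical) and compared against the manual steps.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data: 4 features on different scales, target near 450
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=[20, 50, 1013, 70], scale=[7, 12, 6, 15], size=(200, 4))
y_demo = 450 - 1.5 * (X_demo[:, 0] - 20) + rng.normal(scale=2.0, size=200)
X_tr, X_te = X_demo[:160], X_demo[160:]
y_tr, y_te = y_demo[:160], y_demo[160:]

# Manual route, mirroring the notebook: scale X and y, fit, predict, invert
sx, sy = StandardScaler(), StandardScaler()
svr = SVR(kernel='rbf')
svr.fit(sx.fit_transform(X_tr), sy.fit_transform(y_tr.reshape(-1, 1)).ravel())
scaled_pred = svr.predict(sx.transform(X_te))
manual_pred = sy.inverse_transform(scaled_pred.reshape(-1, 1)).ravel()

# Wrapped route: the pipeline scales X, the wrapper scales and un-scales y
model = TransformedTargetRegressor(
    regressor=make_pipeline(StandardScaler(), SVR(kernel='rbf')),
    transformer=StandardScaler(),
)
model.fit(X_tr, y_tr)
auto_pred = model.predict(X_te)

# Both routes should give the same predictions on the original scale
print(np.allclose(manual_pred, auto_pred))
```

The wrapped version keeps the scalers and the model in one object, so predictions always come back on the original scale of the target.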


## Evaluating the Model Performance

In [22]:
y_test

array([431.23, 460.01, 461.14, ..., 473.26, 438.  , 463.28])

In [23]:
# The outputs produced by the model (y_pred) and real values (y_test) are used.

# Evaluating the model

from sklearn.metrics import r2_score
r2_score(y_test, y_pred)

0.9480784049986258
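
The score returned by `r2_score` is the coefficient of determination, $R^2 = 1 - SS_{res} / SS_{tot}$. The sketch below recomputes it by hand from the first three y_test and y_pred values printed above. Since only three samples are used, the result will differ from the full test-set score of about 0.948; the variable names are hypothetical.

```python
import numpy as np
from sklearn.metrics import r2_score

# First three actual and predicted values printed earlier in the notebook
y_true = np.array([431.23, 460.01, 461.14])
y_hat = np.array([434.05242921, 457.93810186, 461.03113894])

# Residual sum of squares: squared errors of the model
ss_res = np.sum((y_true - y_hat) ** 2)
# Total sum of squares: squared deviations from the mean of the true values
ss_tot = np.sum((y_true - y_true.mean()) ** 2)

r2_manual = 1 - ss_res / ss_tot
print(np.isclose(r2_manual, r2_score(y_true, y_hat)))  # True
```

A score close to 1 means the residuals are small compared with the spread of the true values.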