# **Power Plant Energy Output**

## **Data Set Information:**

The dataset contains 9568 data points collected from a Combined Cycle Power Plant over 6 years (2006-2011), while the plant was running at full load. The features are hourly averages of four ambient variables: Temperature (T), Ambient Pressure (AP), Relative Humidity (RH) and Exhaust Vacuum (V). They are used to predict the plant's net hourly electrical energy output (EP). In the CSV file loaded below, temperature and energy output appear as the columns AT and PE.

### **Attribute Information:**

Each record holds hourly averages of the following variables:
- Temperature (T), in the range 1.81°C to 37.11°C
- Ambient Pressure (AP), in the range 992.89 to 1033.30 millibar
- Relative Humidity (RH), in the range 25.56% to 100.16%
- Exhaust Vacuum (V), in the range 25.36 to 81.56 cm Hg
- Net hourly electrical energy output (EP), in the range 420.26 to 495.76 MW
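
The documented ranges above lend themselves to a quick sanity check on the loaded data. The sketch below is one possible way to do it, not part of the original notebook. The helper `find_out_of_range` and the constant `DOCUMENTED_RANGES` are hypothetical names, and the keys assume the CSV column names AT and PE stand for T and EP. Rows are plain dictionaries here; with pandas, `df.to_dict("records")` would produce the same shape.

```python
# Documented ranges from the attribute list; keys assume the CSV column names.
DOCUMENTED_RANGES = {
    "AT": (1.81, 37.11),      # temperature, °C
    "AP": (992.89, 1033.30),  # ambient pressure, millibar
    "RH": (25.56, 100.16),    # relative humidity, %
    "V": (25.36, 81.56),      # exhaust vacuum, cm Hg
    "PE": (420.26, 495.76),   # net hourly energy output, MW
}


def find_out_of_range(rows, ranges=DOCUMENTED_RANGES):
    """Return (row_index, column, value) for every value outside its documented range."""
    problems = []
    for i, row in enumerate(rows):
        for column, (low, high) in ranges.items():
            value = row[column]
            if not low <= value <= high:
                problems.append((i, column, value))
    return problems


# First two rows of the dataset as printed by df.head() further down.
sample_rows = [
    {"AT": 14.96, "V": 41.76, "AP": 1024.07, "RH": 73.17, "PE": 463.26},
    {"AT": 25.18, "V": 62.96, "AP": 1020.04, "RH": 59.08, "PE": 444.37},
]
print(find_out_of_range(sample_rows))  # [] because both rows are within range
```

An empty list means every checked value lies inside its documented range; any tuple in the result points at a row worth inspecting before training.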

## **Data Preprocessing**

### **Importing the libraries**

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

### **Importing the dataset**

In [2]:
df = pd.read_csv('/content/Power Plant Data.csv')
print(df.head())
X = df.iloc[:, :-1].values  # features: every column except the last (AT, V, AP, RH)
y = df.iloc[:, -1].values   # target: the last column (PE)

      AT      V       AP     RH      PE
0  14.96  41.76  1024.07  73.17  463.26
1  25.18  62.96  1020.04  59.08  444.37
2   5.11  39.40  1012.16  92.14  488.56
3  20.86  57.32  1010.24  76.64  446.48
4  10.82  37.50  1009.23  96.62  473.90


### **Splitting the dataset into the Training set and Test set**

In [3]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
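
To make the effect of `test_size = 0.2` concrete, the arithmetic can be sketched without loading any data. This is an illustration only, and it assumes that the test share is rounded up and the training set receives the remainder, which is how scikit-learn is understood to handle a fractional `test_size`.

```python
import math

n_samples = 9568   # dataset size stated in the data set description
test_size = 0.2    # same fraction as in the train_test_split call

# Assumption: the test share is rounded up and the training set gets the rest.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test
print(n_train, n_test)  # 7654 1914
```

Comparing these numbers with `X_train.shape[0]` and `X_test.shape[0]` is a cheap way to confirm the split behaved as expected.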

## **Multiple Linear Regression**

### **Training the Multiple Linear Regression model on the Training set**


In [4]:
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)

LinearRegression()
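
Multiple linear regression fits an intercept and one coefficient per feature by ordinary least squares. The sketch below shows that idea with NumPy alone. It uses synthetic stand-in data with made-up coefficients, not the power plant data, and the names `X_demo`, `true_coef` and `true_intercept` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the power plant data): 200 rows, 4 features.
X_demo = rng.normal(size=(200, 4))
true_coef = np.array([-2.0, -0.3, 0.06, -0.15])  # made-up coefficients
true_intercept = 450.0                           # made-up intercept
y_demo = X_demo @ true_coef + true_intercept     # noise-free target

# Prepend a column of ones so the intercept is estimated with the slopes.
X_design = np.column_stack([np.ones(len(X_demo)), X_demo])

# Ordinary least squares: minimise ||X_design @ w - y_demo||^2.
solution, *_ = np.linalg.lstsq(X_design, y_demo, rcond=None)
intercept, coef = solution[0], solution[1:]
print(intercept, coef)
```

Because the synthetic target has no noise, the recovered values should match the made-up ones up to floating-point error. On the fitted model in this notebook, the analogous quantities are `regressor.intercept_` and `regressor.coef_`.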

### **Predicting the Test set results**

In [5]:
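y_pred = regressor.predict(X_test).round(2)
# Alternative: np.set_printoptions(precision=2) rounds only what is printed, not y_pred itself
# Print predictions (left column) next to the actual values (right column)
print(np.concatenate((y_pred.reshape(len(y_pred),1), y_test.reshape(len(y_test),1)),1))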

[[431.43 431.23]
 [458.56 460.01]
 [462.75 461.14]
 ...
 [469.52 473.26]
 [442.42 438.  ]
 [461.88 463.28]]


### **Evaluating the Model Performance**

In [6]:
from sklearn.metrics import r2_score
r2_score(y_test, y_pred)

0.9325301874814955
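
The score above is the coefficient of determination, R² = 1 - SS_res / SS_tot, where SS_res is the sum of squared prediction errors and SS_tot is the sum of squared deviations of the actual values from their mean. A minimal sketch of that formula follows. The helper `r2_manual` is a hypothetical name, and the example uses only the six (predicted, actual) pairs visible in the printed output, so its result is not expected to match the score computed on the full test set.

```python
import numpy as np


def r2_manual(y_true, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    ss_res = np.sum((y_true - y_hat) ** 2)          # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation
    return 1.0 - ss_res / ss_tot


# The six (predicted, actual) pairs visible in the printed output above.
shown_pred = [431.43, 458.56, 462.75, 469.52, 442.42, 461.88]
shown_true = [431.23, 460.01, 461.14, 473.26, 438.00, 463.28]
print(r2_manual(shown_true, shown_pred))
```

A value of 1 means perfect predictions, while 0 means the model does no better than always predicting the mean of the actual values.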

## **Polynomial Regression**

### **Training the Polynomial Regression model on the Training set**

In [7]:
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
poly_reg = PolynomialFeatures(degree = 4)
X_poly = poly_reg.fit_transform(X_train)
regressor = LinearRegression()
regressor.fit(X_poly, y_train)

LinearRegression()
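
A degree-4 expansion of four features produces many more columns than the original data has, which is worth knowing before fitting. The count can be sketched with the standard library, assuming the default `PolynomialFeatures` settings (all monomials up to the given total degree, including the constant term).

```python
from math import comb

n_features = 4  # AT, V, AP, RH
degree = 4      # same degree as PolynomialFeatures(degree = 4)

# Number of monomials of total degree <= `degree` in `n_features` variables,
# counting the constant term (assumes the default include_bias=True).
n_columns = comb(n_features + degree, degree)
print(n_columns)  # 70
```

Under that assumption `X_poly.shape[1]` should equal this number, so the linear model in this section fits 70 columns rather than 4.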

### **Predicting the Test set results**

In [8]:
y_pred = regressor.predict(poly_reg.transform(X_test)).round(2)
#np.set_printoptions(precision=2)
print(np.concatenate((y_pred.reshape(len(y_pred),1), y_test.reshape(len(y_test),1)),1))

[[433.94 431.23]
 [457.9  460.01]
 [460.52 461.14]
 ...
 [469.53 473.26]
 [438.27 438.  ]
 [461.66 463.28]]


### **Evaluating the Model Performance**

In [9]:
from sklearn.metrics import r2_score
r2_score(y_test, y_pred)

0.9458202288858492

## **Support Vector Regression (SVR)**

### **Feature Scaling**

In [10]:
from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
sc_y = StandardScaler()
Scaled_X_train = sc_X.fit_transform(X_train)
Scaled_y_train = sc_y.fit_transform(y_train.reshape(len(y_train),1))
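
Standardization subtracts the mean of each column and divides by its standard deviation, and the inverse step undoes both. The sketch below reproduces that arithmetic by hand on the five AT values shown by `df.head()`. It is an illustration only: the scalers in this notebook are fitted on the whole training set, so their statistics differ.

```python
import numpy as np

# AT values from the first five rows shown by df.head(); illustration only.
at = np.array([14.96, 25.18, 5.11, 20.86, 10.82])

mean = at.mean()
std = at.std()  # population standard deviation (ddof=0)

scaled = (at - mean) / std      # forward step: zero mean, unit variance
restored = scaled * std + mean  # inverse step: back to °C

print(scaled.round(3))
print(np.allclose(restored, at))  # True
```

The inverse step is what `sc_y.inverse_transform` later applies to the SVR predictions to bring them back to MW.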

### **Training the SVR model on the Training set**

In [11]:
from sklearn.svm import SVR
regressor = SVR(kernel = 'rbf')
regressor.fit(Scaled_X_train, Scaled_y_train.ravel())  # ravel() flattens the (n, 1) target to the 1-D shape SVR expects

SVR()

### **Predicting the Test set results**

In [12]:
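y_pred = regressor.predict(sc_X.transform(X_test))  # scale the test features with the training-set scaler
y_pred = y_pred.reshape(len(X_test), 1)  # inverse_transform expects a 2-D array
y_pred = sc_y.inverse_transform(y_pred).round(2)  # convert the predictions back to MW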
#np.set_printoptions(precision=2)
print(np.concatenate((y_pred.reshape(len(y_pred),1), y_test.reshape(len(y_test),1)),1))

[[434.05 431.23]
 [457.94 460.01]
 [461.03 461.14]
 ...
 [470.6  473.26]
 [439.42 438.  ]
 [460.92 463.28]]


### **Evaluating the Model Performance**

In [13]:
from sklearn.metrics import r2_score
r2_score(y_test, y_pred)

0.9480763274995765

## **Decision Tree Regression**

### **Training the Decision Tree Regression model on the Training set**

In [14]:
from sklearn.tree import DecisionTreeRegressor
regressor = DecisionTreeRegressor(random_state = 0)
regressor.fit(X_train, y_train)

DecisionTreeRegressor(random_state=0)
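
A regression tree is built by repeatedly choosing the feature and threshold that most reduce the squared error of the two resulting groups. The sketch below shows that search for a single feature and a single split. It is a simplified illustration under that assumption, not scikit-learn's implementation, and `best_split` is a hypothetical helper.

```python
import numpy as np


def best_split(x, y):
    """Return (threshold, sse) of the single split on x that minimises squared error."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    best = (None, np.inf)
    for i in range(1, len(x)):
        if x[i] == x[i - 1]:
            continue  # no threshold fits between two equal values
        left, right = y[:i], y[i:]
        # Each side predicts its own mean; add up the squared errors.
        sse = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if sse < best[1]:
            best = ((x[i - 1] + x[i]) / 2, sse)  # midpoint between neighbours
    return best


# AT and PE from the five rows shown by df.head(); far too few rows for a real tree.
at = np.array([14.96, 25.18, 5.11, 20.86, 10.82])
pe = np.array([463.26, 444.37, 488.56, 446.48, 473.90])
threshold, sse = best_split(at, pe)
print(threshold, sse)
```

A full tree repeats this search over every feature and then recurses into each side, which is why an unconstrained tree can fit the training set very closely.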

### **Predicting the Test set results**

In [15]:
y_pred = regressor.predict(X_test).round(2)
#np.set_printoptions(precision=2)
print(np.concatenate((y_pred.reshape(len(y_pred),1), y_test.reshape(len(y_test),1)),1))

[[431.28 431.23]
 [459.59 460.01]
 [460.06 461.14]
 ...
 [471.46 473.26]
 [437.76 438.  ]
 [462.74 463.28]]


### **Evaluating the Model Performance**

In [16]:
from sklearn.metrics import r2_score
r2_score(y_test, y_pred)

0.922905874177941

## **Random Forest Regression**

### **Training the Random Forest Regression model on the Training set**

In [17]:
from sklearn.ensemble import RandomForestRegressor
regressor = RandomForestRegressor(n_estimators = 10, random_state = 0)
regressor.fit(X_train, y_train)

RandomForestRegressor(n_estimators=10, random_state=0)

### **Predicting the Test set results**

In [18]:
y_pred = regressor.predict(X_test).round(2)
#np.set_printoptions(precision=2)
print(np.concatenate((y_pred.reshape(len(y_pred),1), y_test.reshape(len(y_test),1)),1))

[[434.05 431.23]
 [458.78 460.01]
 [463.02 461.14]
 ...
 [469.48 473.26]
 [439.57 438.  ]
 [460.38 463.28]]


### **Evaluating the Model Performance**

In [19]:
from sklearn.metrics import r2_score
r2_score(y_test, y_pred)

0.9615885203428433
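
The five scores are easier to compare side by side. The sketch below only collects the R² values reported in the evaluation cells above and sorts them; the dictionary labels are descriptive names chosen for this summary.

```python
# R² scores exactly as reported in the evaluation cells above.
reported_r2 = {
    "Multiple Linear Regression": 0.9325301874814955,
    "Polynomial Regression (degree 4)": 0.9458202288858492,
    "Support Vector Regression (RBF)": 0.9480763274995765,
    "Decision Tree Regression": 0.922905874177941,
    "Random Forest Regression (10 trees)": 0.9615885203428433,
}

# Highest score first.
ranking = sorted(reported_r2.items(), key=lambda item: item[1], reverse=True)
for model, score in ranking:
    print(f"{model:<36} {score:.4f}")
```

On this split the random forest scores highest and the single decision tree lowest. All five models were evaluated on the same single train/test split (`random_state = 0`), so small gaps between scores may not hold on a different split.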