# <b>Combined Cycle Power Plant Using Multiple Linear Regression</b>
The "Combined Cycle Power Plant" dataset contains the following features (variables):

* <b>Temperature :</b> The temperature measured in °C.

* <b>Pressure :</b> The ambient pressure measured in millibars.

* <b>Humidity :</b> The relative humidity measured in percent.

* <b>Vacuum :</b> The exhaust vacuum measured in cm Hg.

* <b>Energy Output :</b> The electrical energy output of the power plant measured in MW.

The first four variables are the predictors, and they are used to estimate the fifth, the electrical energy output of a combined cycle power plant. Each instance (row) in the dataset pairs one combination of predictor values with the corresponding measured energy output.

For this purpose, I will fit a multiple linear regression model and report its coefficient of determination (R²) on a held-out test set.
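As a sketch of what is being fitted, the model and its score can be written out as below. The symbols $w_1,\dots,w_4$ and $b$ correspond to `regressor.coef_` and `regressor.intercept_` used later in the notebook; the subscript names and the index $i$ over test rows are notation introduced here for illustration, with the features in the dataset's column order.

```latex
% Multiple linear regression with four predictors
\hat{y} = b + w_1\,x_{\text{Temperature}} + w_2\,x_{\text{Vacuum}}
            + w_3\,x_{\text{Pressure}} + w_4\,x_{\text{Humidity}}

% Coefficient of determination over n test observations
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i
```

An $R^2$ of 1 means the predictions match the observed outputs exactly, while an $R^2$ of 0 means the model does no better than always predicting the mean $\bar{y}$.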

## <b>Importing the libraries</b>

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

## <b>Importing the dataset</b>

In [None]:
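# Load the spreadsheet from the working directory
dataSet = pd.read_excel("Combined Cycle Power Plant.xlsx")

# Rename the abbreviated columns to descriptive names
dataSet.rename(columns={'AT': 'Temperature',
                        'V': 'Vacuum',
                        'AP': 'Pressure',
                        'RH': 'Humidity',
                        'PE': 'Energy Output'}, inplace=True)

# Predictors: every column except the last; target: the last column (Energy Output)
X = dataSet.iloc[:, :-1].values
Y = dataSet.iloc[:, -1].values
print("The Value of X:\n")
print(X)
print("\nThe Value of Y:\n")
print(Y)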

In [21]:
dataSet.columns

Index(['Temperature', 'Vacuum', 'Pressure', 'Humidity', 'Energy Output'], dtype='object')

In [22]:
dataSet.head()

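|   | Temperature | Vacuum | Pressure | Humidity | Energy Output |
|---|-------------|--------|----------|----------|---------------|
| 0 | 14.96 | 41.76 | 1024.07 | 73.17 | 463.26 |
| 1 | 25.18 | 62.96 | 1020.04 | 59.08 | 444.37 |
| 2 | 5.11 | 39.4 | 1012.16 | 92.14 | 488.56 |
| 3 | 20.86 | 57.32 | 1010.24 | 76.64 | 446.48 |
| 4 | 10.82 | 37.5 | 1009.23 | 96.62 | 473.9 |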


## <b>Descriptive Statistics</b>

In [None]:
print("\n\033[1m\033[36m\033[6m{:^50}\033[0m".format("Descriptive Statistics")) 
print(dataSet.describe())

## <b>Histograms</b>

In [None]:
# Selecting our variables 
parameters = ["Temperature", "Vacuum", "Pressure", "Humidity", "Energy Output"] 
# Creating histograms 
for parameter in parameters: 
    sns.histplot(data=dataSet, x=parameter)
    plt.title(f"Histogram of {parameter}")
    plt.show()

## <b>Boxplots</b>

In [None]:
# Creating boxplots
for parameter in parameters: 
    sns.catplot(data=dataSet, y = parameter, kind = "box", color = "#009E60") 
    plt.title(f"{parameter}'s Boxplot") 
    plt.show()

## <b>Splitting the dataset into the Training and Test sets</b>

In [28]:
from sklearn.model_selection import train_test_split 
# Hold out 20% of the rows for testing; random_state fixes the shuffle for reproducibility
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, random_state=0)

## <b>Training the Multiple Linear Regression model on the Training set</b>

In [29]:
from sklearn.linear_model import LinearRegression 
regressor = LinearRegression() 
regressor.fit(X_train, y_train)

## <b>Predicting the Test set results</b>

In [31]:
y_pred = regressor.predict(X_test)

In [33]:
from sklearn.metrics import r2_score 
r2_score(y_test, y_pred)

0.9325315554761303
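The value returned by `r2_score` is the coefficient of determination, 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below spells out that arithmetic using only the standard library. The helper name `r2_manual` and the toy numbers are made up for illustration and are not taken from the dataset.

```python
def r2_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: squared errors of the predictions
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    # Total sum of squares: squared deviations from the mean of y_true
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only (hypothetical, not from the dataset)
y_true_demo = [3.0, 5.0, 7.0, 9.0]
y_pred_demo = [2.5, 5.5, 7.0, 9.5]
print(r2_manual(y_true_demo, y_pred_demo))
```

Calling `r2_manual(y_test, y_pred)` in the notebook should agree with `r2_score(y_test, y_pred)` up to floating-point rounding.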

In [37]:
W = regressor.coef_
b = regressor.intercept_

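# One observation in the training column order: Temperature, Vacuum, Pressure, Humidity
input_value = [8.34, 40.77, 1010.84, 90.01]

# Manual prediction: weighted sum of the features plus the intercept
f_wb = np.dot(W, input_value) + b
print(f_wb)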

477.0864802653599
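The cell above reproduces a prediction by hand: each feature is multiplied by its learned coefficient, the products are summed, and the intercept is added. A minimal standard-library sketch of the same computation follows. The function name `predict_one` and the demo weights, bias, and input are hypothetical values chosen so the arithmetic is easy to follow; they are not the fitted coefficients.

```python
def predict_one(weights, bias, features):
    """Linear prediction for one observation: w . x + b."""
    return sum(w * x for w, x in zip(weights, features)) + bias


# Hypothetical coefficients and input, for illustration only
demo_weights = [2.0, -1.0, 0.5, 0.0]
demo_bias = 10.0
demo_input = [1.0, 2.0, 4.0, 8.0]

# 2.0*1.0 - 1.0*2.0 + 0.5*4.0 + 0.0*8.0 + 10.0
print(predict_one(demo_weights, demo_bias, demo_input))
```

With the fitted model, `predict_one(W, b, input_value)` is expected to match `regressor.predict([input_value])[0]` up to rounding, provided the features are given in the same column order used for training.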
