# Problem  Statement

#### predict the net hourly electrical energy output (EP)  of the plant.

##### The dataset contains 9568 data points collected from a Combined Cycle Power Plant over 6 years (2006-2011), when the power plant was set to work with full load. Features consist of hourly average ambient variables Temperature (T), Ambient Pressure (AP), Relative Humidity (RH) and Exhaust Vacuum (V) to predict the net hourly electrical energy output (EP)  of the plant.


In [145]:
import numpy  as np
import pandas as pd

In [146]:
df = pd.read_excel("C:/Users/user/Desktop/w/powerplant.xlsx")

In [147]:

df.columns = ["Temperature","Exhaust Vacuum ","Ambient Pressure","Relative Humidity","Electric Power output"]

In [148]:
df.head()

Unnamed: 0,Temperature,Exhaust Vacuum,Ambient Pressure,Relative Humidity,Electric Power output
0,14.96,41.76,1024.07,73.17,463.26
1,25.18,62.96,1020.04,59.08,444.37
2,5.11,39.4,1012.16,92.14,488.56
3,20.86,57.32,1010.24,76.64,446.48
4,10.82,37.5,1009.23,96.62,473.9


In [149]:
from sklearn.preprocessing import StandardScaler

In [150]:
scale = StandardScaler() 

In [151]:
x = scale.fit_transform(x)

In [152]:
def costfun(x,y,theta):
    j = np.sum((x.dot(theta) - y)**2)/len(y)
    
    return j
    

In [153]:
def grad(x,y,theta,lr,itr):
    c = [[0]*itr]
    for i in range(itr):
        pred = x.dot(theta)
        residuls = pred - y
        
        partial_terms = x.T.dot(residuls)/len(y)
        
        theta = theta - lr * partial_terms
        
        c.append(costfun(x,y,theta))
        
    return theta,c
        
      
        

In [154]:
def r2_score(x,y,theta):
    pred = x.dot(theta)
    resi = pred - y
    
    ssr = np.sum(resi**2)
    sst = np.sum((pred - np.mean(y))**2)
    
    return 1-(ssr/sst)
    

In [155]:
def adj_r2_score(a,N,p):
    return 1- ((1-a)*(N-1)/(N-p-1))
    

In [156]:
def pred(x,theta):
    return  x.dot(theta)

In [157]:
m = 6000
xtrain = x[:m,:]
ytrain = y[:m] 

xtest = x[m:,:]
ytest = y[m:]

xtrain = np.c_[np.ones(xtrain.shape[0]),xtrain]
theta  = np.zeros(xtrain.shape[1])

xtest = np.c_[np.ones(xtest.shape[0]),xtest]
theta,clist= grad(xtrain,ytrain,theta,lr = 0.1,itr = 5000)
print("cost function value is      :",clist[-1])

print("*-"*40)
print("\n")

print("r2_score value is           :",r2_score(xtrain,ytrain,theta))
print("adjusted_r2_score value is  :",adj_r2_score(r2_score(xtrain,ytrain,theta),len(y),xtrain.shape[1]))

estimated  = pred(xtest[0],theta) 
print("*-"*40)
print("\n")

print("estimated value is : ",estimated)
print("actual    value is : ",ytest[6000])

print("*-"*40)
print("\n")
print("theta values are :",theta)

cost function value is      : 20.206217976366087
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-


r2_score value is           : 0.9253448374815225
adjusted_r2_score value is  : 0.9253058000612556
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-


estimated value is :  466.6854059197665
actual    value is :  469.25
*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-


theta values are : [ 4.54329841e+02 -1.50039869e+01 -2.78397247e+00  3.43666059e-01
 -2.43881263e+00]
