The Combined Cycle Power Plant dataset contains 9568 data points collected from a combined cycle power plant over 6 years (2006-2011), while the plant was set to run at full load. The features are hourly averages of the ambient variables Temperature (T), Ambient Pressure (AP), Relative Humidity (RH) and Exhaust Vacuum (V), which are used to predict the net hourly electrical energy output (EP) of the plant.

### Imports needed

In [1]:
from sklearn import model_selection
from sklearn import preprocessing

import numpy as np

### Gradient Descent

In [2]:
def step_gradient(X_train, Y_train, learning_rate, coeff):
    # X_train already has the column of 1's appended, so n = num_features + 1
    # and the last coefficient is the intercept c
    n = len(X_train[0])
    gradient = np.zeros(n) # gradient of the cost w.r.t. [m1, m2, ..., mn, c]
    M = len(X_train) # number of training samples
    
    for i in range(M):
        x = X_train[i]
        y = Y_train[i]
        for j in range(n):
            # contribution of sample i to the partial derivative w.r.t. coefficient j
            gradient[j] += (-2/M)*(y - (coeff*x).sum())*x[j]
    # move the coefficients against the gradient
    new_coeff = coeff - learning_rate*gradient
    return new_coeff
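
The double loop above computes the gradient of the mean squared error one sample and one feature at a time. As a rough sketch, the same update can be written with NumPy array operations. `step_gradient_vectorized` is a name chosen here for illustration and is not part of the original notebook; it assumes `X_train` already has the column of ones appended.

```python
import numpy as np

def step_gradient_vectorized(X_train, Y_train, learning_rate, coeff):
    """One gradient-descent step on the mean squared error (illustrative name)."""
    M = len(X_train)
    # Residuals y_i - coeff . x_i for every sample at once
    residuals = Y_train - X_train @ coeff
    # Gradient of the mean squared error with respect to each coefficient
    gradient = (-2 / M) * (X_train.T @ residuals)
    return coeff - learning_rate * gradient

# Tiny synthetic check: the last column of ones plays the role of the intercept
rng = np.random.default_rng(0)
X_demo = np.append(rng.normal(size=(50, 3)), np.ones((50, 1)), axis=1)
Y_demo = X_demo @ np.array([1.0, -2.0, 0.5, 3.0])
new_coeff = step_gradient_vectorized(X_demo, Y_demo, 0.01, np.zeros(4))
```

On a dataset of this size the array form should give the same numbers as the loop while avoiding the per-element Python overhead.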

In [3]:
'''
Loop-based version of cost, kept for reference (equivalent to the vectorized one below):
def cost(X_train, Y_train, coeff):
    total_cost = 0
    M = len(X_train)
    for i in range(M):
        x = X_train[i]
        y = Y_train[i]
        total_cost += (1/M)*( (y - (coeff*x).sum())**2 )
    return total_cost
'''

def cost(X_train, Y_train, coeff):
    return ((Y_train - np.sum(coeff*X_train, axis = 1))**2).mean()
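
For reference, the quantity returned by `cost` and the gradient accumulated in `step_gradient` can be written out as follows. The notation is chosen here for illustration: $x_i$ is the $i$-th row of `X_train` (including the trailing 1 for the intercept), $m$ is the coefficient vector, $M$ is the number of samples and $\alpha$ is the learning rate.

```latex
\begin{align}
J(m) &= \frac{1}{M} \sum_{i=1}^{M} \left( y_i - m \cdot x_i \right)^2 \\
\frac{\partial J}{\partial m_j} &= -\frac{2}{M} \sum_{i=1}^{M} \left( y_i - m \cdot x_i \right) x_{ij} \\
m_j &\leftarrow m_j - \alpha \, \frac{\partial J}{\partial m_j}
\end{align}
```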

In [4]:
def gd(X_train, Y_train, learning_rate, num_iterations):
    # append a column of 1's to X_train so the intercept is learned as one more coefficient
    ones_col = np.ones(len(X_train)).reshape(-1,1) # reshape to get a column vector
    X_train = np.append(X_train, ones_col, axis=1)
    
    n = len(X_train[0]) # num_features + 1; the extra entry is the intercept c
    
    # initialise all coefficients to 0
    coefficients = np.zeros(n) # [m1, m2, m3, ... mn, c]
    
    for i in range(num_iterations):
        coefficients = step_gradient(X_train, Y_train, learning_rate, coefficients)
        
        # print the cost after every iteration to see after which iteration it stops decreasing much
        print("After iteration ",i+1, "Cost is:", cost(X_train, Y_train, coefficients))
        
    return coefficients
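
The per-iteration printout is meant for spotting the point where the cost stops decreasing much. One possible way to automate that check is to stop once the improvement between two iterations falls below a threshold. The sketch below is an assumption, not the notebook's method: `gd_with_tolerance`, `tol` and `max_iterations` are hypothetical names, and the gradient step is written in array form so the block runs on its own.

```python
import numpy as np

def gd_with_tolerance(X_train, Y_train, learning_rate, max_iterations, tol=1e-6):
    """Gradient descent that stops when the cost improves by less than tol."""
    # Append a column of ones so the last coefficient acts as the intercept
    X = np.append(X_train, np.ones((len(X_train), 1)), axis=1)
    coefficients = np.zeros(X.shape[1])
    previous_cost = np.inf
    iterations_run = 0
    for i in range(max_iterations):
        residuals = Y_train - X @ coefficients
        gradient = (-2 / len(X)) * (X.T @ residuals)
        coefficients = coefficients - learning_rate * gradient
        current_cost = ((Y_train - X @ coefficients) ** 2).mean()
        iterations_run = i + 1
        # Stop as soon as one more step no longer pays off
        if previous_cost - current_cost < tol:
            break
        previous_cost = current_cost
    return coefficients, iterations_run

# Noise-free synthetic data with known slopes (4.0, -1.5) and intercept 10.0
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(200, 2))
Y_demo = 4.0 * X_demo[:, 0] - 1.5 * X_demo[:, 1] + 10.0
coeffs, n_iter = gd_with_tolerance(X_demo, Y_demo, 0.1, 5000)
```

The value of `tol` is a judgment call: a smaller threshold gets closer to the minimum at the price of more iterations.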

In [5]:
'''
def predictions(X_test, m, c):
    M = len(X_test)
    y_pred = np.zeros(M)
    for i in range(M):
        x = X_test[i]
        y_pred[i] += ((m*x).sum()+c)
    return y_pred
'''
def predictions(X_test, m, c):
    return (np.sum(m*X_test, axis = 1)+c)
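
`gd` stores the intercept as the last entry of the coefficient array, while `predictions` takes the slopes `m` and the intercept `c` separately. A small sketch with made-up numbers (the parameter values and `X_demo` are hypothetical) shows that both layouts give the same predictions:

```python
import numpy as np

def predictions(X_test, m, c):
    return np.sum(m * X_test, axis=1) + c

# Hypothetical parameter vector laid out as gd returns it: slopes first, intercept last
parameters = np.array([2.0, -1.0, 0.5])
m, c = parameters[:-1], parameters[-1]

X_demo = np.array([[1.0, 2.0], [3.0, 4.0]])
split_form = predictions(X_demo, m, c)

# Same result from the augmented matrix (column of ones appended) used during training
X_aug = np.append(X_demo, np.ones((len(X_demo), 1)), axis=1)
augmented_form = X_aug @ parameters
```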


### run function: loads the data, applies feature scaling and calls gradient descent

In [6]:
def run():
    training_data = np.genfromtxt('train.csv', delimiter=',')
    X_train = training_data[:, :-1]
    Y_train = training_data[:, -1]
    
    X_test = np.genfromtxt('test.csv', delimiter=',')
    
    # Apply feature scaling
    scaler = preprocessing.StandardScaler() # create scaler object
    scaler.fit(X_train)
    transformed_X_train = scaler.transform(X_train)
    transformed_X_test = scaler.transform(X_test)
    
    learning_rate = 0.01
    num_iterations = 1000
    parameters = gd(transformed_X_train, Y_train, learning_rate, num_iterations)
    m = parameters[:-1]
    c = parameters[-1]
    print(m, c, sep="\n")
    
    # predict on the scaled test data
    pred = predictions(transformed_X_test, m, c).reshape(-1,1)
    # Optionally round to 5 decimal places
    #pred = np.round(pred, decimals=5)
    # Save predictions
    np.savetxt('predictions.csv', pred, delimiter=',')
    print(pred)
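
With a fixed 1000 iterations and a learning rate of 0.01, gradient descent may stop slightly short of the minimum. One way to sanity-check the learned `m` and `c` is to compare them with a closed-form least-squares fit on the same scaled features. The helper below is a sketch under that assumption; `least_squares_reference` is a hypothetical name, and the demo data is synthetic rather than taken from `train.csv`.

```python
import numpy as np

def least_squares_reference(X_scaled, Y):
    """Closed-form least-squares fit to compare against gradient descent."""
    # Same layout as in gd: features first, column of ones last
    X = np.append(X_scaled, np.ones((len(X_scaled), 1)), axis=1)
    solution, _, _, _ = np.linalg.lstsq(X, Y, rcond=None)
    return solution[:-1], solution[-1]

# Synthetic data with known slopes and intercept
rng = np.random.default_rng(2)
X_demo = rng.normal(size=(100, 4))
Y_demo = X_demo @ np.array([2.0, -1.0, 0.5, 3.0]) + 7.0
m_ref, c_ref = least_squares_reference(X_demo, Y_demo)
```

If the coefficients from `gd` differ noticeably from such a reference, more iterations or a larger learning rate may be worth trying.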
    

### Call the run function

In [7]:
run()

After iteration  1 Cost is: 198599.8033818715
After iteration  2 Cost is: 190724.33461838696
After iteration  3 Cost is: 183161.9092167872
After iteration  4 Cost is: 175900.0180028184
After iteration  5 Cost is: 168926.65792499116
After iteration  6 Cost is: 162230.3109852503
After iteration  7 Cost is: 155799.92410197752
After iteration  8 Cost is: 149624.88985905665
After iteration  9 Cost is: 143695.02809745626
After iteration  10 Cost is: 138000.5683083144
After iteration  11 Cost is: 132532.13278886522
After iteration  12 Cost is: 127280.72052473937
After iteration  13 Cost is: 122237.69176420965
After iteration  14 Cost is: 117394.75325185907
After iteration  15 Cost is: 112743.94409092401
After iteration  16 Cost is: 108277.62220522505
After iteration  17 Cost is: 103988.45137315059
After iteration  18 Cost is: 99869.38880760949
After iteration  19 Cost is: 95913.67325723005
After iteration  20 Cost is: 92114.81360535744
After iteration  21 Cost is: 88466.5779445976
...

After iteration  173 Cost is: 216.8069420367122
After iteration  174 Cost is: 209.24690433444576
After iteration  175 Cost is: 201.9853399895249
After iteration  176 Cost is: 195.01043533857433
After iteration  177 Cost is: 188.31084446753133
After iteration  178 Cost is: 181.87567069008367
After iteration  179 Cost is: 175.69444875954392
After iteration  180 Cost is: 169.75712778509944
After iteration  181 Cost is: 164.05405482456004
After iteration  182 Cost is: 158.57595912680608
After iteration  183 Cost is: 153.3139369982125
After iteration  184 Cost is: 148.25943726833862
After iteration  185 Cost is: 143.40424733115321
After iteration  186 Cost is: 138.74047973900156
After iteration  187 Cost is: 134.26055932743412
After iteration  188 Cost is: 129.95721084986576
After iteration  189 Cost is: 125.82344710188032
After iteration  190 Cost is: 121.85255751579503
After iteration  191 Cost is: 118.03809720685571
After iteration  192 Cost is: 114.37387645318387
...

After iteration  343 Cost is: 23.815475416729612
After iteration  344 Cost is: 23.7965072376733
After iteration  345 Cost is: 23.777897335631053
After iteration  346 Cost is: 23.75963317288716
After iteration  347 Cost is: 23.741702700368183
After iteration  348 Cost is: 23.724094338349552
After iteration  349 Cost is: 23.706796957925338
After iteration  350 Cost is: 23.689799863210887
After iteration  351 Cost is: 23.673092774249753
After iteration  352 Cost is: 23.656665810596717
After iteration  353 Cost is: 23.640509475549994
After iteration  354 Cost is: 23.624614641007536
After iteration  355 Cost is: 23.60897253292195
After iteration  356 Cost is: 23.593574717330966
After iteration  357 Cost is: 23.57841308694038
After iteration  358 Cost is: 23.563479848237645
After iteration  359 Cost is: 23.548767509115162
After iteration  360 Cost is: 23.534268866983098
After iteration  361 Cost is: 23.5199769973521
After iteration  362 Cost is: 23.505885242867922
...

After iteration  513 Cost is: 22.261943679583066
After iteration  514 Cost is: 22.25644285342045
After iteration  515 Cost is: 22.250964754784093
After iteration  516 Cost is: 22.24550927827646
After iteration  517 Cost is: 22.24007631941532
After iteration  518 Cost is: 22.234665774611322
After iteration  519 Cost is: 22.229277541146423
After iteration  520 Cost is: 22.223911517153102
After iteration  521 Cost is: 22.218567601594312
After iteration  522 Cost is: 22.21324569424413
After iteration  523 Cost is: 22.20794569566919
After iteration  524 Cost is: 22.20266750721069
After iteration  525 Cost is: 22.197411030967146
After iteration  526 Cost is: 22.19217616977762
After iteration  527 Cost is: 22.18696282720572
After iteration  528 Cost is: 22.181770907524037
After iteration  529 Cost is: 22.176600315699183
After iteration  530 Cost is: 22.171450957377377
After iteration  531 Cost is: 22.166322738870516
After iteration  532 Cost is: 22.1612155671427
...

After iteration  683 Cost is: 21.586972766514368
After iteration  684 Cost is: 21.584227025965575
After iteration  685 Cost is: 21.58149245237833
After iteration  686 Cost is: 21.57876900029606
After iteration  687 Cost is: 21.57605662444828
After iteration  688 Cost is: 21.573355279749812
After iteration  689 Cost is: 21.57066492130001
After iteration  690 Cost is: 21.567985504381976
After iteration  691 Cost is: 21.565316984461763
After iteration  692 Cost is: 21.56265931718762
After iteration  693 Cost is: 21.56001245838918
After iteration  694 Cost is: 21.557376364076767
After iteration  695 Cost is: 21.554750990440542
After iteration  696 Cost is: 21.552136293849827
After iteration  697 Cost is: 21.549532230852275
After iteration  698 Cost is: 21.5469387581732
After iteration  699 Cost is: 21.544355832714743
After iteration  700 Cost is: 21.541783411555198
After iteration  701 Cost is: 21.53922145194821
After iteration  702 Cost is: 21.536669911322104
...

After iteration  852 Cost is: 21.25089602252763
After iteration  853 Cost is: 21.249516950602718
After iteration  854 Cost is: 21.24814348642508
After iteration  855 Cost is: 21.246775607191083
After iteration  856 Cost is: 21.24541329018983
After iteration  857 Cost is: 21.244056512802814
After iteration  858 Cost is: 21.24270525250351
After iteration  859 Cost is: 21.241359486857
After iteration  860 Cost is: 21.240019193519633
After iteration  861 Cost is: 21.238684350238618
After iteration  862 Cost is: 21.23735493485167
After iteration  863 Cost is: 21.236030925286634
After iteration  864 Cost is: 21.234712299561135
After iteration  865 Cost is: 21.2333990357822
After iteration  866 Cost is: 21.232091112145874
After iteration  867 Cost is: 21.23078850693691
After iteration  868 Cost is: 21.229491198528358
After iteration  869 Cost is: 21.228199165381216
After iteration  870 Cost is: 21.226912386044113
After iteration  871 Cost is: 21.22563083915289
...