Combined Cycle Power Plant dataset contains 9568 data points collected from a Combined Cycle Power Plant over 6 years (2006-2011), when the power plant was set to work with full load. Features consist of hourly average ambient variables Temperature (T), Ambient Pressure (AP), Relative Humidity (RH) and Exhaust Vacuum (V) to predict the net hourly electrical energy output (EP) of the plant.


You are given:


1. A Readme file for more details on dataset. 

2. A Training dataset csv file with X train and Y train data.

3. A X test File and you have to predict and submit predictions for this file.


Your task is to:


1. Code Gradient Descent for N features and come with predictions.

2. Try and test with various combinations of learning rates and number of iterations.

3. Try using Feature Scaling, and see if it helps you in getting better results. 


Read Instructions carefully -


1. Use Gradient Descent as a training algorithm and submit results predicted.

2. Files are in csv format, you can use genfromtxt function in numpy to load data from csv file. Similarly you can use savetxt function to save data into a file.

3. Submit a csv file with only predictions for X test data. File should not have any headers and should only have one column i.e. predictions. Also predictions shouldn't be in exponential form.

4. Your score is based on coefficient of determination. So it can be possible that nobody gets full score.


In [1]:
# Ignore warnings
import warnings
warnings.filterwarnings("ignore", category = FutureWarning)

In [2]:
# Loading the training dataset
import numpy as np
training_data = np.loadtxt("training_CombinedCycle.csv", delimiter = ",")
training_data

array([[   8.58,   38.38, 1021.03,   84.37,  482.26],
       [  21.79,   58.2 , 1017.21,   66.74,  446.94],
       [  16.64,   48.92, 1011.55,   78.76,  452.56],
       ...,
       [  29.8 ,   69.34, 1009.36,   64.74,  437.65],
       [  16.37,   54.3 , 1017.94,   63.63,  459.97],
       [  30.11,   62.04, 1010.69,   47.96,  444.42]])

In [3]:
# Input features (training)
X_train = training_data[:,:4]

# Output (training)
Y_train = training_data[:,4]

In [4]:
# Shape of input features (training)
X_train.shape

(7176, 4)

In [5]:
# Shape of output (training)
Y_train.shape

(7176,)

In [6]:
# Converting the input features (training) into Pandas dataframe to check for string, NaN values
import pandas as pd
df = pd.DataFrame(X_train)
df.describe()

Unnamed: 0,0,1,2,3
count,7176.0,7176.0,7176.0,7176.0
mean,19.629712,54.288154,1013.263032,73.275818
std,7.475256,12.751468,5.964863,14.625093
min,1.81,25.36,992.89,25.56
25%,13.47,41.74,1009.01,63.2025
50%,20.315,52.05,1012.91,74.895
75%,25.72,66.54,1017.3025,84.925
max,35.77,81.56,1033.3,100.16


In [7]:
# Feature Scaling (training data)
from sklearn import preprocessing
standard_scaler_object = preprocessing.StandardScaler()
standard_scaler_object.fit(X_train)
X_train = standard_scaler_object.transform(X_train)

In [8]:
x_ = np.append(X_train, np.ones(len(X_train)).reshape(-1, 1), axis = 1)
y_ = Y_train

You have learnt how to code Gradient Descent for a single featured dataset. Try to code a more Generic Gradient Descent. Let us consider that the $i^{th}$ feature for the first row is $x_1^i$. Similarily for the $j^th$ row, the $i^{th}$ feature will be $x_j^i$.So, your cost function would look something like :

$$ cost = \frac{1}{M}\sum_i^M (y_i - (m_ix_i^1 + m_ix_i^2 + m_ix_i^3 + ...... + m_{n + 1}x_{n + 1} ))^2 $$

Here $m_{n + 1}x_{n + 1}$ is actually 'c', constant value. (We usually take them to be 1)

Also, to find the next m (m'), our equation becomes :
$$ m_j' = m_j - \alpha\frac{\partial cost}{\partial m_j} $$

and 

$$\frac{\partial cost}{\partial m_i} = \frac{1}{M}\sum_i^M 2(y_i - (m_ix_i^1 + m_ix_i^2 + m_ix_i^3 + ...... + m_{n + 1}x_{n + 1} ))x_i^j $$

In [9]:
# This function finds the new gradient at each step
def step_gradient(x_, y_, m, learning_rate):
    m_slope = np.zeros(len(x_[0]))
    M = len(x_)
    for i in range(M) :
        x = x_[i]
        y = y_[i]
        for j in range(len(x)):
            m_slope[j] += (-2/M) * (y - sum(m * x))*x[j]
    new_m = m - learning_rate * m_slope
    return new_m

In [10]:
# Gradient Descent Function
def gd(x_, y_, learning_rate, num_iterations):
    m = np.zeros(len(x_[0]))     # Intial random values taken as 0
    for i in range(num_iterations):
        m = step_gradient(x_, y_, m, learning_rate)
        print(i, " Cost: ", cost(x_, y_, m))
    return m

In [11]:
# This function finds the new cost after each optimisation
def cost(x_, y_, m):
    total_cost = 0
    M = len(x_)
    for i in range(M):
        total_cost += (1/M)*((y_[i] - sum(m*x_[i]))**2)
    return total_cost

In [12]:
def run():
    learning_rate = 0.1
    num_iterations = 100
    m = gd(x_, y_, learning_rate, num_iterations)
    print("Final m :", m[0:-1])
    print("Final c :", m[-1])
    return m

In [13]:
m = run()

0  Cost:  132273.17028197853
1  Cost:  84643.21738144128
2  Cost:  54177.474227792685
3  Cost:  34683.656935452964
4  Cost:  22208.479529620818
5  Cost:  14224.382169986968
6  Cost:  9114.383468776608
7  Cost:  5843.781284088756
8  Cost:  3750.4051743364307
9  Cost:  2410.4727339052597
10  Cost:  1552.762193756394
11  Cost:  1003.6891423474597
12  Cost:  652.1571417929908
13  Cost:  427.06242464503333
14  Cost:  282.8969056987255
15  Cost:  190.5340535532469
16  Cost:  131.33178845157047
17  Cost:  93.35828546206275
18  Cost:  68.97644681535036
19  Cost:  53.297930953967374
20  Cost:  43.193704743608556
21  Cost:  36.66077597552082
22  Cost:  32.41688558517924
23  Cost:  29.641097514004986
24  Cost:  27.807765771435207
25  Cost:  26.58026504888727
26  Cost:  25.742971704330984
27  Cost:  25.157726328859923
28  Cost:  24.73596559178298
29  Cost:  24.42088326947476
30  Cost:  24.176010622787736
31  Cost:  23.977905690256787
32  Cost:  23.81147271061384
33  Cost:  23.6669651644581
34  Cos

In [14]:
# Loading the testing dataset
import numpy as np
testing_data = np.loadtxt("testing_CombinedCycle.csv", delimiter = ",")
testing_data

array([[  11.95,   42.03, 1017.58,   90.89],
       [  12.07,   38.25, 1012.67,   81.66],
       [  26.91,   74.99, 1005.64,   78.98],
       ...,
       [  24.32,   66.25, 1009.09,   91.89],
       [  23.49,   42.8 , 1013.96,   65.31],
       [  21.76,   60.27, 1018.96,   85.06]])

In [15]:
# Input features (testing)
X_test = testing_data[:,:4]

In [16]:
# Shape of input features (testing)
X_test.shape

(2392, 4)

In [17]:
# Converting the input features (testing) into Pandas dataframe to check for string, NaN values
import pandas as pd
df = pd.DataFrame(X_test)
df.describe()

Unnamed: 0,0,1,2,3
count,2392.0,2392.0,2392.0,2392.0
mean,19.71579,54.358754,1013.247216,73.408457
std,7.38488,12.578763,5.861068,14.528135
min,3.38,25.36,993.74,26.67
25%,13.66,41.73,1009.3,63.615
50%,20.45,52.75,1013.025,75.09
75%,25.6725,66.49,1017.1725,84.4975
max,37.11,80.25,1033.29,100.13


In [18]:
# Feature Scaling (testing data)
standard_scaler_object = preprocessing.StandardScaler()
standard_scaler_object.fit(X_test)
X_test = standard_scaler_object.transform(X_test)

In [19]:
testing_data = np.append(X_test, np.ones(len(X_test)).reshape(-1, 1), axis = 1)

In [20]:
def predict(final_m, testing_data):
    y_pred = []
    for i in testing_data:
        ans = sum(i * m)
        y_pred.append(ans)
    return y_pred

In [21]:
y_pred = predict(m, testing_data)
y_pred

[470.5906787529307,
 472.28685104868714,
 433.4463937685436,
 457.9781887785976,
 465.4307852578021,
 448.30865341889495,
 478.7749798643719,
 446.225185424537,
 483.9561762967762,
 439.75264768643547,
 434.42360461094285,
 431.8229606101906,
 472.95311667475954,
 463.93800592338454,
 444.26924931792774,
 456.9197516937232,
 487.7339281082945,
 447.2171233532888,
 426.6412022447042,
 438.35753021672616,
 439.456911206729,
 482.76708146117306,
 459.607486983996,
 475.5105438666019,
 431.4207109748451,
 434.064749980003,
 468.07705048192923,
 470.7503351364738,
 432.8500927986768,
 476.76656931257287,
 442.720552924614,
 431.24697206648693,
 449.60751623613737,
 470.9679499501899,
 469.6260515720707,
 472.6647833997972,
 446.94219653406805,
 455.66652591909235,
 444.9699851016592,
 481.3246085658953,
 466.51416439412975,
 434.0894893774075,
 473.7368332952848,
 467.84589490576656,
 461.82878877445444,
 484.3996388671612,
 436.8302221238055,
 430.30316103426685,
 439.90902007669007,
 475.

In [22]:
# Dumping the output obtained from the evaluation data into a "CSV" file
np.savetxt('CombinedCycle Predictions.csv', y_pred, fmt = '%.5f')