# Tubes 2
## Feed Forward Neural Network
___Alvin Sullivan 13515048___

___Albertus Djauhari Djohan 13515054___

___Kevin 13515138___

### Multi Layer Neural Network

#### Backpropagation Algorithm Implementation

The classifier is implemented as a class named `MultiLayerNN`. The class models a neural network that learns with mini-batch stochastic gradient descent. Its attributes are the unlabeled input data matrix, the weight matrices of the hidden nodes, the weight matrix of the output node, the batch size (passed as `num_batch`), the learning rate, the tolerance, the momentum constant, and the number of epochs. The class has the following specification.

- At most 10 hidden layers
- The number of nodes in each hidden layer can vary
- Fully connected layers
- Sigmoid activation for all hidden layers and for the output layer
- A single output node
- The program offers the choice of using momentum or not
- The program implements mini-batch stochastic gradient descent

The class has a `train` function that performs mini-batch stochastic gradient descent learning. `train` runs for the configured number of epochs and batch size, and stops early if the change in loss between two epochs falls below the tolerance. For each batch in an epoch, it calls `gradient_descent`, which in turn calls three functions in order, following the gradient descent algorithm. First, `feed_forward` computes the output of every neuron. Second, `back_propagation` computes the delta of every neuron. Third, `update_weight` adjusts the weights of every neuron using the results of the previous two functions. Once all epochs have finished, the result is a neural network model represented by the updated weight matrices of its neurons.
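The three steps can be illustrated on a much smaller network. The following is a minimal sketch, not the `MultiLayerNN` class itself: it assumes a single hidden layer, no momentum, and made-up random data, and the names `sgd_step`, `x`, and `t` are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgd_step(x, t, w_hidden, w_out, learning_rate):
    """One gradient-descent step on a batch for a one-hidden-layer sigmoid net."""
    # 1. Feed forward: output of every neuron
    o_hidden = sigmoid(x @ w_hidden.T)                       # (batch, n_hidden)
    o_out = sigmoid(o_hidden @ w_out.T)                      # (batch, 1)
    # 2. Back propagation: delta of every neuron
    d_out = o_out * (1 - o_out) * (t - o_out)                # (batch, 1)
    d_hidden = o_hidden * (1 - o_hidden) * (d_out @ w_out)   # (batch, n_hidden)
    # 3. Weight update, summed over the batch
    w_out = w_out + learning_rate * (d_out.T @ o_hidden)
    w_hidden = w_hidden + learning_rate * (d_hidden.T @ x)
    return w_hidden, w_out, o_out

# Made-up data: 8 samples, 4 features, target depends on the first feature
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
t = (x[:, :1] > 0).astype(float)
w_hidden = rng.normal(size=(3, 4)) * 0.5
w_out = rng.normal(size=(1, 3)) * 0.5

batch_size = 4
losses = []
for epoch in range(200):
    loss = 0.0
    for start in range(0, len(x), batch_size):
        xb = x[start:start + batch_size]
        tb = t[start:start + batch_size]
        w_hidden, w_out, o_out = sgd_step(xb, tb, w_hidden, w_out, 0.5)
        loss += np.sum((tb - o_out) ** 2)
    losses.append(loss / len(x))   # mean squared error for this epoch
```

As in `train`, the loss of each batch is taken from the outputs computed before that batch's weight update.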

In [153]:
from __future__ import division
import numpy as np
import pandas as pd
from sklearn import preprocessing
from sklearn.model_selection import train_test_split

from keras.models import Sequential
from keras.layers import Dense
from keras import optimizers
from keras.layers import Dropout
import keras

In [154]:
var_layer_num = 4
var_nodes = [4, 3, 2, 3, 1]
var_epoch = 50
var_momentum = 0.001
var_learning_rate = 0.5

In [155]:
class MultiLayerNN:

    def __init__(self, data, hidden_node, output_node,
        num_batch, learning_rate_const, tolerance_const,
        **kwargs):
        
        self.instance = data
        self.w_hidden_node = hidden_node
        self.w_output_node = output_node
        # Gradient Descent Parameters
        self.batch_size = num_batch
        self.learning_rate = learning_rate_const
        self.tolerance = tolerance_const
        self.momentum = kwargs.get('momentum', 0)
        self.epochs = kwargs.get('epochs', 10)
       
    def train(self, instance_target):
        instance_target_t = np.array([instance_target]).T
        batch_iteration = int(np.ceil(self.instance.shape[0] / self.batch_size))
        old_loss = -np.inf
        for step in range(self.epochs):
            loss = 0
            for i in range(batch_iteration):
                start_index = i * self.batch_size
                end_index = i * self.batch_size + self.batch_size
                if end_index > len(instance_target_t):
                    end_index = len(instance_target_t)
                o_out = self.gradient_descent(self.instance[start_index:end_index:1], instance_target_t[start_index:end_index:1])
                loss = loss + self.loss_function(o_out, instance_target_t[start_index:end_index:1])
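            # Print the mean squared error over the whole training set
            print("Loss after epoch %i: %f" % (step, loss / self.instance.shape[0]))

            # Stop early once the summed loss changes by less than the tolerance
            if np.abs(loss - old_loss) < self.tolerance:
                break
            old_loss = loss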
    
    def feed_forward(self, instance):
        s = list()
        o = list()
        # Feed Forward Hidden Node
        s_temp = instance.dot(self.w_hidden_node[0].T)
        o_temp = self.sigmoid(s_temp)
        s.append(s_temp)
        o.append(o_temp)
        iteration = len(self.w_hidden_node)
        for i in range(1, iteration):
            s_temp = o[i-1].dot(self.w_hidden_node[i].T)
            o_temp = self.sigmoid(s_temp)
            s.append(s_temp)
            o.append(o_temp)
        # Feed Forward Output Node
        s_out = o[-1].dot(self.w_output_node.T)
        o_out = self.sigmoid(s_out)
        return s, o, s_out, o_out

    def sigmoid(self, X):
        output = 1 / (1 + np.exp(-X))
        return np.matrix(output)

    def loss_function(self, o_out, instance_target):
        # Sum of squared errors over the batch
        squared_error = np.square(instance_target - o_out)
        data_loss = np.sum(squared_error)
        return data_loss

    def back_propagation(self, instance_target, o, o_out):
        d = list()
        # Back Propagation Output Node
        d_temp = np.multiply(np.multiply(o_out, 1-o_out), instance_target-o_out)
        d.insert(0, d_temp)
        # Back Propagation Hidden Node
        d_temp = np.multiply(np.multiply(o[-1], 1-o[-1]), (self.w_output_node.T.dot(d[0].T)).T)
        d.insert(0, d_temp)
        iteration = len(self.w_hidden_node)
        for i in range(iteration-1, 0, -1):
            d_temp = np.multiply(np.multiply(o[i-1], 1-o[i-1]), (self.w_hidden_node[i].T.dot(d[0].T)).T)
            d.insert(0, d_temp)
        return d

    def update_weight(self, instance, o, d):
        # The momentum term here scales the current weight itself,
        # not the previous weight change as in classical momentum.
        # Update Weight Output Node
        self.w_output_node[0] = self.w_output_node[0] + self.momentum * self.w_output_node[0] + self.learning_rate * d[-1].T.dot(o[-1])
        # Update Weight Hidden Node
        iteration = len(self.w_hidden_node)
        for i in range(iteration-1, 0, -1):
            self.w_hidden_node[i] = self.w_hidden_node[i] + self.momentum * self.w_hidden_node[i] + self.learning_rate * d[i].T.dot(o[i-1])
        self.w_hidden_node[0] = self.w_hidden_node[0] + self.momentum * self.w_hidden_node[0] + self.learning_rate * d[0].T.dot(instance)

    def gradient_descent(self, instance, instance_target):
        # Feed Forward
        _, o, _, o_out = self.feed_forward(instance)
        # Back Propagation
        d = self.back_propagation(instance_target, o, o_out)
        # Update Weight
        self.update_weight(instance, o, d)
        return o_out

    def predict(self, instance):
        _, _, _, o_out = self.feed_forward(instance)
        return o_out
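The `update_weight` method above adds `self.momentum` times the current weight to each update. For comparison, here is a minimal sketch of the classical momentum rule, in which the extra term is a fraction of the previous weight change. It is an illustration under that assumption, not the method used in this notebook; `momentum_update`, `velocity`, and `gradient_step` are hypothetical names.

```python
import numpy as np

def momentum_update(weight, gradient_step, velocity, momentum):
    """Classical momentum: new change = momentum * previous change + current step."""
    velocity = momentum * velocity + gradient_step
    return weight + velocity, velocity

w = np.zeros(2)
v = np.zeros(2)                 # previous weight change, initially zero
step = np.array([1.0, -2.0])    # stands in for learning_rate * d.T.dot(o)

w, v = momentum_update(w, step, v, momentum=0.5)   # v = [1.0, -2.0], w = [1.0, -2.0]
w, v = momentum_update(w, step, v, momentum=0.5)   # v = [1.5, -3.0], w = [2.5, -5.0]
```

With a constant step, the change applied to the weights grows from one call to the next, which is the accelerating effect momentum is meant to provide.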

#### Keras Implementation

In the Keras implementation for classifying the weather data, a `Sequential()` instance is created first. The architecture of the neural network is defined on this model: the input nodes, the hidden layers together with the number of hidden nodes in each layer, and the output node. The activation function is also specified at this stage.

This exploration also uses a stochastic gradient descent optimizer to supply the learning rate, momentum, and decay factor to the network.

The model then uses the `mean_squared_error` loss and accuracy as its performance metrics.

In [156]:
model = Sequential()

# Adds Dense layers of var_nodes[1:] units (3, 2, 3, 1), each followed by
# Dropout, and then one more single-unit Dense layer as the output.
for i in range(0, len(var_nodes)):
    if i == 0:
        model.add(Dense(units=var_nodes[1], activation='sigmoid', input_dim=var_nodes[0]))
        model.add(Dropout(0.1))
    elif i == len(var_nodes)-1:
        model.add(Dense(units=1, activation='sigmoid'))
    else:
        model.add(Dense(units=var_nodes[i+1], activation='sigmoid'))
        model.add(Dropout(0.1))

# lr is fixed at 0.01 here; var_learning_rate is passed as the decay factor.
sgd = optimizers.SGD(lr=0.01, decay=var_learning_rate, momentum=var_momentum, nesterov=True)

model.compile(optimizer=sgd,
              loss='mean_squared_error',
              metrics=['accuracy'])
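In the cell above, `var_learning_rate` (0.5) is passed as `decay` while `lr` is fixed at 0.01. The sketch below gives a rough idea of what that implies. It assumes the time-based decay schedule of the legacy Keras `SGD` optimizer, `lr / (1 + decay * iterations)`, and the helper name `effective_learning_rate` is hypothetical.

```python
def effective_learning_rate(initial_lr, decay, iteration):
    """Time-based decay: the step size shrinks as the update count grows."""
    return initial_lr / (1.0 + decay * iteration)

# Learning rate after 0, 1, 10 and 100 weight updates with lr=0.01, decay=0.5
rates = [effective_learning_rate(0.01, 0.5, it) for it in (0, 1, 10, 100)]
```

Under this assumption the step size drops to a sixth of its starting value after ten updates, so most of the 50 epochs would run with a very small learning rate.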

### Comparison of Backpropagation Algorithm and Keras Results

#### Running on the Weather Data

- Read the weather dataset
- Preprocess the data (continuous and categorical)
    - Continuous features use StandardScaler
    - Categorical features use LabelEncoder
- Split the data for training with a 10% hold-out scheme
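The two preprocessing steps can be sketched in plain Python. This is an illustration only, with made-up values: it assumes standardization with the population standard deviation and label codes assigned in sorted class order, which is how scikit-learn's `StandardScaler` and `LabelEncoder` are understood to behave. The helper names `standardize` and `label_encode` are hypothetical.

```python
import statistics

def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def label_encode(labels):
    """Map each distinct label to its index in the sorted list of classes."""
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    return [index[label] for label in labels], classes

temps = [85.0, 80.0, 83.0, 70.0]                  # made-up temperatures
scaled = standardize(temps)

outlooks = ["sunny", "overcast", "rainy", "sunny"]  # made-up categories
codes, classes = label_encode(outlooks)
```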

In [157]:
from sklearn.preprocessing import StandardScaler
file = "dataset/weather.csv"
data = pd.read_csv(file)

#Handle Continuous Data
scaled_features = data.copy()
col_names = ['temp', 'humidity']
scaled_features[col_names] = StandardScaler().fit_transform(
                                        scaled_features[col_names])
data[col_names] = scaled_features[col_names]

#Handle Categorical Data
le = preprocessing.LabelEncoder()
data['outlook'] = le.fit_transform(data['outlook'])
data['windy'] = le.fit_transform(data['windy'])
data['play'] = le.fit_transform(data['play'])
print(data)
data = data.values
X = data[:, 0:-1]
y = data[:, -1]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=43)

print('X_train')
print(X_train)

print('X_test')
print(X_test)

print('y_train')
print(y_train)

print('y_test')
print(y_test)

    outlook      temp  humidity  windy  play
0         2  1.804715  0.338726      0     0
1         2  1.015152  0.843212      1     0
2         0  1.488890  0.439623      0     1
3         1 -0.563974  1.448595      0     1
4         1 -0.879799 -0.165760      0     1
5         1 -1.353537 -1.174731      1     0
6         0 -1.511449 -1.679217      1     1
7         2 -0.248148  1.347697      0     0
8         2 -0.721886 -1.174731      0     1
9         1  0.225589 -0.165760      0     1
10        2  0.225589 -1.174731      1     1
11        0 -0.248148  0.843212      1     1
12        0  1.173065 -0.670245      0     1
13        1 -0.406061  0.944109      1     0
X_train
[[ 2.          0.22558942 -1.17473092  1.        ]
 [ 1.         -0.40606095  0.9441089   1.        ]
 [ 0.          1.48889015  0.43962323  0.        ]
 [ 0.         -1.5114491  -1.67921659  1.        ]
 [ 2.         -0.72188614 -1.17473092  0.        ]
 [ 0.         -0.24814836  0.84321176  1.        ]
 [ 1.      



#### Initializing the Hidden Node and Output Node Weights

In [158]:
np.random.seed(0)
w_hidden_node = list()
for i in range(0,len(var_nodes)-1):
    if i == 0:
        w_temp = np.random.randn(4, 4) / np.sqrt(4)
    else:
        w_temp = np.random.randn(var_nodes[i], var_nodes[i-1]) / np.sqrt(var_nodes[i])
    w_hidden_node.append(w_temp)

w_output_node = np.random.randn(var_nodes[-1], var_nodes[-2])

print('Hidden node weight matrices:\n', w_hidden_node)
print('Output node weight matrix:\n', w_output_node)

Hidden node weight matrices:
 [array([[ 0.88202617,  0.2000786 ,  0.48936899,  1.1204466 ],
       [ 0.933779  , -0.48863894,  0.47504421, -0.0756786 ],
       [-0.05160943,  0.20529925,  0.07202179,  0.72713675],
       [ 0.38051886,  0.06083751,  0.22193162,  0.16683716]]), array([[ 0.86260696, -0.11844818,  0.18074972, -0.4931124 ],
       [-1.47396936,  0.37736687,  0.49908247, -0.42848917],
       [ 1.31044344, -0.83967841,  0.02641869, -0.10807065]]), array([[ 1.08383858,  1.03899355,  0.10956438],
       [ 0.26740128, -0.62775932, -1.40063461]]), array([[-0.20086717,  0.09026812],
       [ 0.71030866,  0.69419433],
       [-0.22362324, -0.17453457]])]
Output node weight matrix:
 [[-1.04855297 -1.42001794 -1.70627019]]


#### Mini-Batch Implementation (batch_size = 1)

In [159]:
var_batch = 1

##### Backpropagation Classifier

In [160]:
multiLayerNN = MultiLayerNN(X_train, w_hidden_node, w_output_node, var_batch,
                            var_learning_rate, 1e-6,
                            momentum=var_momentum, epochs=var_epoch)
multiLayerNN.train(y_train)
print('Hidden node weight matrices:\n', multiLayerNN.w_hidden_node)
print('Output node weight matrix:\n', multiLayerNN.w_output_node)

y_test_res = multiLayerNN.predict(X_test)
print("Prediction results on the test set: ")
print(y_test_res)

Loss after epoch 0: 0.530392
Loss after epoch 1: 0.490322
Loss after epoch 2: 0.440204
Loss after epoch 3: 0.385529
Loss after epoch 4: 0.335448
Loss after epoch 5: 0.296674
Loss after epoch 6: 0.270165
Loss after epoch 7: 0.253310
Loss after epoch 8: 0.242938
Loss after epoch 9: 0.236610
Loss after epoch 10: 0.232736
Loss after epoch 11: 0.230341
Loss after epoch 12: 0.228844
Loss after epoch 13: 0.227898
Loss after epoch 14: 0.227294
Loss after epoch 15: 0.226904
Loss after epoch 16: 0.226651
Loss after epoch 17: 0.226485
Loss after epoch 18: 0.226375
Loss after epoch 19: 0.226301
Loss after epoch 20: 0.226251
Loss after epoch 21: 0.226216
Loss after epoch 22: 0.226191
Loss after epoch 23: 0.226173
Loss after epoch 24: 0.226159
Loss after epoch 25: 0.226148
Loss after epoch 26: 0.226139
Loss after epoch 27: 0.226130
Loss after epoch 28: 0.226123
Loss after epoch 29: 0.226115
Loss after epoch 30: 0.226108
Loss after epoch 31: 0.226101
Loss after epoch 32: 0.226094
Loss after epoch 33:

##### Classifier Using Keras

In [161]:
model.fit(X_train, y_train, batch_size=var_batch, epochs=var_epoch, verbose=1)
score = model.evaluate(X_test, y_test, verbose=0)
print(score)

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50
[0.2531016767024994, 0.5]


#### Mini-Batch Implementation (batch_size = number of data points)

In [162]:
var_batch = X_train.shape[0]
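One point to keep in mind when reading the results of this section: `MultiLayerNN` keeps the weight objects it is given, and `update_weight` writes into them. The run below passes the same `w_hidden_node` and `w_output_node` again, so as far as the code shows it starts from the weights left by the batch-size-1 run rather than from the initial ones. The same `model` object is also passed to `model.fit` a second time. Below is a minimal sketch of the effect and of one way to avoid it, assuming fresh starting weights are wanted for each run. `TinyModel` is a hypothetical stand-in, not the notebook's class.

```python
import copy

class TinyModel:
    """Hypothetical stand-in that keeps a reference to the weights it is given."""

    def __init__(self, weights):
        self.weights = weights

    def train_step(self):
        # Writes into the list that was passed in, like update_weight does
        for i in range(len(self.weights)):
            self.weights[i] = self.weights[i] + 1.0

# Passing the list directly: training changes the caller's list as well
initial = [0.0, 0.0]
first = TinyModel(initial)
first.train_step()          # initial is now [1.0, 1.0]

# Passing a deep copy: the caller's list keeps its starting values
fresh = [0.0, 0.0]
second = TinyModel(copy.deepcopy(fresh))
second.train_step()         # fresh is still [0.0, 0.0]
```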

##### Backpropagation Classifier

In [163]:
multiLayerNN = MultiLayerNN(X_train, w_hidden_node, w_output_node, var_batch,
                            var_learning_rate, 1e-6,
                            momentum=var_momentum, epochs=var_epoch)
multiLayerNN.train(y_train)
print('Hidden node weight matrices:\n', multiLayerNN.w_hidden_node)
print('Output node weight matrix:\n', multiLayerNN.w_output_node)

y_test_res = multiLayerNN.predict(X_test)
print("Prediction results on the test set: ")
print(y_test_res)

Loss after epoch 0: 0.222158
Loss after epoch 1: 0.222136
Loss after epoch 2: 0.222120
Loss after epoch 3: 0.222109
Loss after epoch 4: 0.222100
Loss after epoch 5: 0.222092
Loss after epoch 6: 0.222086
Loss after epoch 7: 0.222080
Loss after epoch 8: 0.222075
Loss after epoch 9: 0.222070
Loss after epoch 10: 0.222066
Loss after epoch 11: 0.222061
Loss after epoch 12: 0.222057
Loss after epoch 13: 0.222053
Loss after epoch 14: 0.222048
Loss after epoch 15: 0.222044
Loss after epoch 16: 0.222040
Loss after epoch 17: 0.222036
Loss after epoch 18: 0.222032
Loss after epoch 19: 0.222027
Loss after epoch 20: 0.222023
Loss after epoch 21: 0.222019
Loss after epoch 22: 0.222015
Loss after epoch 23: 0.222010
Loss after epoch 24: 0.222006
Loss after epoch 25: 0.222002
Loss after epoch 26: 0.221997
Loss after epoch 27: 0.221993
Loss after epoch 28: 0.221989
Loss after epoch 29: 0.221984
Loss after epoch 30: 0.221980
Loss after epoch 31: 0.221975
Loss after epoch 32: 0.221970
Loss after epoch 33:

##### Classifier Using Keras

In [164]:
model.fit(X_train, y_train, batch_size=var_batch, epochs=var_epoch, verbose=1)
score = model.evaluate(X_test, y_test, verbose=0)
print(score)

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50
[0.25310471653938293, 0.5]


#### Comparison of Classifiers A and B

For batch = 1, the loss of algorithm 1.a in the early epochs is considerably larger than the early loss of 1.b. We attribute this to the SGD optimizer used in model 1.b, which lets the initial hidden node and output node weights approach the optimum solution.

### Task Distribution
1. Alvin Sullivan - 13515048 - Feed Forward and Backpropagation
2. Albertus Djauhari - 13515054 - Keras Exploration
3. Kevin - 13515138 - Feed Forward and Backpropagation