# Machine Learning Major Assignment 2

Muhammad Rizki Duwinanto - 13515006<br/>
Kevin Erdiza Yogatama - 13515016<br/>
Edwin Rachman - 13515042

#### Related Libraries

In [1]:
import pandas as pd
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn import datasets, metrics
import time
import random
import matplotlib.pyplot as plt
import math

In [2]:
iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.1)

In [3]:
train_data = [(x, y) for x, y in zip(X_train, y_train)]
test_data = [(x, y) for x, y in zip(X_test, y_test)]

## 1.a. Create a Classifier 

#### Algorithm Description

The neural network model is initialized with the following steps:
1. The number of nodes in each layer (parameter sizes) is specified.
2. The biases and weights of every layer after the first are drawn at random.
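The two initialization steps can be sketched in isolation to make the resulting shapes visible. This is a minimal sketch, not the class itself: the helper name `init_parameters` and its `seed` argument are hypothetical, while the value ranges (biases in [0, 1), weights in [-0.05, 0.05)) follow the source code in the next section.

```python
import numpy as np

def init_parameters(sizes, seed=0):
    """Return (biases, weights) for a fully connected network.

    sizes[i] is the number of nodes in layer i. The input layer gets no
    bias or weight of its own. (Hypothetical helper for illustration.)
    """
    rng = np.random.default_rng(seed)
    # One bias column vector per non-input layer, drawn from [0, 1).
    biases = [rng.random((n_out, 1)) for n_out in sizes[1:]]
    # One (n_out, n_in) weight matrix per pair of consecutive layers.
    weights = [rng.uniform(-0.05, 0.05, (n_out, n_in))
               for n_in, n_out in zip(sizes[:-1], sizes[1:])]
    return biases, weights

biases, weights = init_parameters([4, 5, 10, 1])
print([b.shape for b in biases])   # [(5, 1), (10, 1), (1, 1)]
print([w.shape for w in weights])  # [(5, 4), (10, 5), (1, 10)]
```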

Fitting the model uses the following algorithm:
1. The fit function requires the training data (training_data), the number of epochs (epochs), the mini-batch size (mini_batch_size), and the learning rate (learning_rate). Momentum (parameter momentum) can optionally be specified as well. Validation data (parameter validation_data) can be supplied directly, or it is split off from the training data if the validation_split parameter is specified.
2. The training data is divided into mini-batches according to the mini-batch size.
3. New weights and biases are computed per mini-batch by the update_mini_batch function, which implements stochastic gradient descent. The gradients are obtained from the backpropagation function.
4. Repeat step 3 for every mini-batch.
5. Compute the training accuracy and loss with the evaluate function, using the first mini-batch as input. The evaluate function derives accuracy and loss from the sigmoid feed-forward output (function feed_forward). Each feed-forward output is mapped to 1 if it is at least 0.5, and to 0 otherwise. Accuracy is the ratio of thresholded outputs that equal their label to the total number of samples. Loss is the sum of squared differences between each label and its thresholded output, divided by the total number of samples.
6. Compute the validation accuracy and loss with the evaluate function, using validation_data as input.
7. Repeat steps 2-6 for every epoch.
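Steps 2 and 5 above can be sketched on their own with plain Python lists. The helper names `make_mini_batches` and `accuracy_and_loss` are hypothetical and exist only for this illustration; the slicing and the 0.5 threshold mirror what the description states.

```python
def make_mini_batches(data, mini_batch_size):
    """Slice data into consecutive chunks; the last chunk may be shorter."""
    return [data[k:k + mini_batch_size]
            for k in range(0, len(data), mini_batch_size)]

def accuracy_and_loss(outputs, labels, threshold=0.5):
    """Threshold sigmoid outputs, then compute accuracy and mean squared error."""
    predictions = [1 if o >= threshold else 0 for o in outputs]
    n = len(labels)
    accuracy = sum(int(p == y) for p, y in zip(predictions, labels)) / n
    loss = sum((y - p) ** 2 for p, y in zip(predictions, labels)) / n
    return accuracy, loss

batches = make_mini_batches(list(range(12)), 5)
print([len(b) for b in batches])  # [5, 5, 2]

# Made-up sigmoid outputs and labels, for illustration only.
acc, loss = accuracy_and_loss([0.9, 0.2, 0.6, 0.4], [1, 0, 0, 1])
print(acc, loss)  # 0.5 0.5
```

Because the loss is computed on thresholded predictions, each sample contributes either 0 or 1, so in this binary setting the loss always equals one minus the accuracy.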

To make predictions, use the predict function with test_data as input. This function runs a feed-forward pass on every item in test_data, in the same way as the evaluate function.
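The prediction path can be sketched without NumPy to show what one feed-forward pass followed by thresholding does. The 2-2-1 network and its hand-picked parameters below are hypothetical, as are the helper names `feed_forward` and `predict_one`; they are not taken from the trained model.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def feed_forward(x, layers):
    """Propagate x through layers, a list of (weights, biases) pairs.

    weights[j] holds the incoming weights of node j in that layer.
    """
    activation = x
    for weights, biases in layers:
        activation = [sigmoid(sum(w * a for w, a in zip(row, activation)) + b)
                      for row, b in zip(weights, biases)]
    return activation

def predict_one(x, layers):
    """Map the single sigmoid output to a binary label."""
    return 1 if feed_forward(x, layers)[0] >= 0.5 else 0

# Hypothetical 2-2-1 network with hand-picked parameters.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # hidden layer: 2 nodes
    ([[2.0, -2.0]], [0.0]),                   # output layer: 1 node
]
print(predict_one([1.0, 0.0], layers))  # 1
print(predict_one([0.0, 1.0], layers))  # 0
```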

#### Source Code Program

In [4]:
class Network(object):
    def __init__(self, sizes):
        self.sizes = sizes
        self.num_layers = len(sizes)
        self.biases = [np.random.rand(y, 1) for y in sizes[1:]]
        self.weights = [np.array([np.random.uniform(-0.05, 0.05, x) for i in range(0,y)]) for x,y in zip(sizes[:-1], sizes[1:])]
        self.history = {'acc': [], 'val_acc': [], 'loss': [], 'val_loss': []}
    
    def feed_forward(self, activation):
        for bias, weight in zip(self.biases, self.weights):
            activation = sigmoid(np.dot(weight, activation) + bias.transpose()[0])
        return activation
    
    def fit(self, training_data, epochs, mini_batch_size, learning_rate,
            momentum=0, validation_data=None, validation_split=0.0, verbose=1):
        if validation_split != 0.0 and not validation_data :
            training_data, validation_data = train_test_split(training_data, test_size=validation_split, random_state=42)
        n_training = len(training_data)
        if validation_data or validation_split != 0.0: 
            n_validation = len(validation_data)
            if verbose != 0:
                print("Train on {} samples, validate on {} samples".format(n_training, n_validation))
        for epoch in range(epochs):
            mini_batches = [training_data[k:k + mini_batch_size] for k in range(0, n_training, mini_batch_size)]
            # Use the current parameters as the "previous" ones for the first mini-batch,
            # then carry forward whatever update_mini_batch returns.
            previous_weights, previous_biases = self.weights, self.biases
            start = time.time()  # time the whole epoch rather than only the last mini-batch
            for mini_batch in mini_batches:
                previous_weights, previous_biases = self.update_mini_batch(
                    mini_batch, learning_rate, momentum, previous_weights, previous_biases)
            end = time.time() - start
            if validation_data or validation_split != 0:
                training_accuracy, training_loss = self.evaluate(mini_batches[0])
                validation_accuracy, validation_loss = self.evaluate(validation_data)
                self.history['acc'].append(training_accuracy)
                self.history['val_acc'].append(validation_accuracy)
                self.history['loss'].append(training_loss)
                self.history['val_loss'].append(validation_loss)
                if verbose == 1 :
                    print("Epoch {}/{} : {} s - loss: {} - acc: {} - val_loss: {} - val_acc: {}".format(epoch + 1, 
                                                                                                        epochs,
                                                                                                        end,
                                                                                                        training_loss,
                                                                                                        training_accuracy,
                                                                                                        validation_loss,
                                                                                                        validation_accuracy))
                elif verbose == 2 :
                    print("Epoch {} complete.".format(epoch + 1))
            else :
                if verbose != 0:
                    print("Epoch {} complete.".format(epoch + 1))
        
    def update_mini_batch(self, mini_batch, learning_rate, momentum, previous_weights, previous_biases):
        nabla_biases = [np.zeros(bias.shape) for bias in self.biases]
        nabla_weights = [np.zeros(weight.shape) for weight in self.weights]
        for x, y in mini_batch:
            delta_nabla_bias, delta_nabla_weights = self.backpropagation(x, y)
            nabla_biases = [nb + dnb for nb, dnb in zip(nabla_biases, delta_nabla_bias)]
            nabla_weights = [nw + dnw for nw, dnw in zip(nabla_weights, delta_nabla_weights)]
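        temp_weights = self.weights
        temp_biases = self.biases
        # cost_derivative returns (y - output), the negative gradient of the squared error,
        # so adding the learning-rate term is a descent step. Note that the momentum term
        # scales the previous parameters themselves, not the previous update as in
        # classical momentum.
        self.weights = [w + momentum * pw + (learning_rate/len(mini_batch)) * nw 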
                        for w, nw, pw in zip(self.weights, nabla_weights, previous_weights)]
        self.biases = [b + momentum * pb + (learning_rate/len(mini_batch)) * nb 
                       for b, nb, pb in zip(self.biases, nabla_biases, previous_biases)]
        return (temp_weights, temp_biases)
        
    def backpropagation(self, x, y):
        nabla_bias = [np.zeros(bias.shape) for bias in self.biases]
        nabla_weights = [np.zeros(weight.shape) for weight in self.weights]
        
        activation = x
        activations = [x]
        z_vectors = []
        
        for bias, weight in zip(self.biases, self.weights):
            z = np.dot(weight, activation) + bias.transpose()[0]
            z_vectors.append(z)
            activation = sigmoid(z)
            activations.append(activation)
        
        delta = self.cost_derivative(activations[-1], y) * sigmoid_prime(z_vectors[-1])
        nabla_bias[-1] = delta
        delta_newaxis = delta[:, np.newaxis]
        m = len(activations[-2])
        activations_newaxis = activations[-2][:, np.newaxis].reshape(1, m)
        nabla_weights[-1] = np.dot(delta_newaxis, activations_newaxis)
        
        for layer in range(2, self.num_layers):
            z = z_vectors[-layer]
            sp = sigmoid_prime(z)
            delta = np.dot(self.weights[-layer+1].transpose(), delta) * sp
            nabla_bias[-layer] = delta
            delta_newaxis = delta[:, np.newaxis]
            m = len(activations[-layer-1].transpose())
            activations_newaxis = activations[-layer-1].transpose()[:, np.newaxis].reshape(1, m)
            nabla_weights[-layer] = np.dot(delta_newaxis, activations_newaxis)
        
        return (nabla_bias, nabla_weights)
    
    def evaluate(self, test_data):
        test_results = [(1 if self.feed_forward(x) * 2 >= 1 else 0, y) for x, y in test_data]
        accuracy = sum(int(x == y) for x, y in test_results)/len(test_results)
        loss = sum(math.pow((y - x), 2) for x, y in test_results)/len(test_results)
        return accuracy, loss
    
    def predict(self, test_data):
        test_results = [1 if self.feed_forward(x) * 2 >= 1 else 0 for x, y in test_data]
        return (test_results)
    
    def cost_derivative(self, output_activations, y):
        return np.squeeze(y - output_activations)

def sigmoid(z):
    return 1.0/(1.0+np.exp(-z))

def sigmoid_prime(z):
    return sigmoid(z)*(1-sigmoid(z))
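
The closed form used in `sigmoid_prime`, σ'(z) = σ(z)(1 − σ(z)), can be sanity-checked against a numerical derivative. The sketch below redefines both functions with the standard `math` module so that it runs on its own; the helper `central_difference` and the step size `h` are assumptions made for this check.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def central_difference(f, z, h=1e-5):
    """Numerical derivative of f at z (hypothetical helper)."""
    return (f(z + h) - f(z - h)) / (2.0 * h)

for z in (-2.0, 0.0, 1.5):
    analytic = sigmoid_prime(z)
    numeric = central_difference(sigmoid, z)
    # The two estimates should agree to within a small tolerance.
    print(z, abs(analytic - numeric) < 1e-8)
```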

#### Usage

In [5]:
neural_network = Network([4, 5, 10, 1])
neural_network.fit(train_data, 50, 5, 0.1, momentum=0.0001, validation_split=0.25, verbose=0)

The cell above is an example of running the algorithm on the iris training data. The results are not shown, however, because iris has 3 labels whereas the classifier we built is binary only. The classifier's results can be found in section 1.b.2.
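For reference, the `momentum` argument is usually associated with the textbook "classical momentum" update, in which a velocity term accumulates a decaying sum of past gradient steps. The sketch below shows that textbook rule on a one-dimensional quadratic; it is not the update implemented in `update_mini_batch`, and the function name `momentum_step` and the chosen constants are assumptions for illustration.

```python
def momentum_step(weight, gradient, velocity, learning_rate, momentum):
    """One classical-momentum update for a single scalar weight."""
    # The velocity keeps a decaying memory of previous updates.
    velocity = momentum * velocity - learning_rate * gradient
    return weight + velocity, velocity

# Minimize f(w) = w^2, whose gradient is 2w, starting from w = 5.
w, v = 5.0, 0.0
for _ in range(200):
    w, v = momentum_step(w, 2.0 * w, v, learning_rate=0.1, momentum=0.5)

print(abs(w) < 1e-6)  # True: the iterate has converged to the minimum at 0
```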

## 1.b.1 Keras Exploration

#### Algorithm Description

Training uses the Keras library with a <i>sequential</i> model and <i>dense</i> layers. The first layer has 1 neuron and takes an input of shape 4, matching the number of attributes in the training data. It is followed by three hidden layers with 2, 3, and 4 neurons respectively, and an output layer with 1 neuron. The optimizer is SGD, the loss is mean squared error, and the metric is accuracy.
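The number of trainable parameters in this architecture can be worked out by hand: a dense layer with `n_inputs` inputs and `n_units` neurons has `n_inputs * n_units` weights plus `n_units` biases. The helper `dense_param_count` below is a hypothetical name used only for this arithmetic.

```python
def dense_param_count(n_inputs, n_units):
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_units + n_units

layer_units = [1, 2, 3, 4, 1]  # neurons per Dense layer, in order
n_inputs = 4                   # number of input attributes
counts = []
for units in layer_units:
    counts.append(dense_param_count(n_inputs, units))
    n_inputs = units           # this layer's output feeds the next layer

print(counts, sum(counts))  # [5, 4, 9, 16, 5] 39
```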

#### Source Code Program

In [6]:
from keras.models import Sequential
from keras.layers import Dense, Activation

Using TensorFlow backend.


In [7]:
network = Sequential()
network.add(Dense(1, activation='sigmoid', input_shape=(4,)))
network.add(Dense(2, activation='sigmoid'))
network.add(Dense(3, activation='sigmoid'))
network.add(Dense(4, activation='sigmoid'))
network.add(Dense(1, activation='sigmoid'))

In [8]:
network.compile(optimizer='SGD', loss='mse', metrics=['accuracy'])

In [None]:
network.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_1 (Dense)              (None, 1)                 5         
_________________________________________________________________
dense_2 (Dense)              (None, 2)                 4         
_________________________________________________________________
dense_3 (Dense)              (None, 3)                 9         
_________________________________________________________________
dense_4 (Dense)              (None, 4)                 16        
_________________________________________________________________
dense_5 (Dense)              (None, 1)                 5         
Total params: 39
Trainable params: 39
Non-trainable params: 0
_________________________________________________________________


<b>Experiment on iris</b>

In [None]:
history = network.fit(X_train, y_train, epochs=200, verbose=0, batch_size=1, validation_split=0.1)

In [None]:
score = network.evaluate(X_test, y_test, batch_size=1)

In [None]:
print("Loss: {} %".format(score[0]*100.0))
print("Accuracy {} %".format(score[1]*100.0))

In [None]:
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()

In [None]:
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()

## 1.b.2 Data Categorization Framework Experiment

### 1.b.2.1 Data Preparation

The first thing we did was load the Weather Categorization training data from WEKA.

In [None]:
weather_df = pd.read_csv('dataset/weather.csv')

weather_df

As can be seen, the training data contains both numeric and categorical attributes. Preprocessing is therefore required, using scikit-learn's LabelEncoder as follows.
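What label encoding does can be shown with a small pure-Python sketch: the distinct values are sorted and each is replaced by its position in that sorted list, which is also how `LabelEncoder` assigns its codes. The function `label_encode` is a hypothetical stand-in, and the sample values are illustrative rather than read from the dataset file.

```python
def label_encode(values):
    """Map each distinct value to its index in the sorted list of classes."""
    classes = sorted(set(values))
    index = {c: i for i, c in enumerate(classes)}
    return [index[v] for v in values], classes

# Illustrative categorical column.
codes, classes = label_encode(["sunny", "overcast", "rainy", "sunny"])
print(classes)  # ['overcast', 'rainy', 'sunny']
print(codes)    # [2, 0, 1, 2]
```

Note that the resulting integers impose an arbitrary order on the categories, which the network then treats as a numeric scale.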

In [None]:
label_encoder = LabelEncoder()
weather_df['outlook'] = label_encoder.fit_transform(weather_df.outlook)
weather_df['windy'] = label_encoder.fit_transform(weather_df.windy)
weather_df['play'] = label_encoder.fit_transform(weather_df.play)

In [None]:
weather_df

In [None]:
X_weather = weather_df.iloc[:,:4].values
X_weather

In [None]:
y_weather = weather_df.play.values
y_weather

Then, after converting the training data to numeric form, we set aside a portion of it (10%) as test data, keeping the original row order.
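The sizes produced by such an unshuffled split can be checked with a short sketch. It assumes the test size is rounded up to a whole number of rows and that the last rows become the test set, and it assumes a dataset of 14 rows, which is consistent with the batch size of 12 mentioned in the analysis. The function `ordered_split` is a hypothetical helper, not a scikit-learn call.

```python
import math

def ordered_split(rows, test_size):
    """Split rows into (train, test) without shuffling.

    Assumption: the number of test rows is rounded up, and the test rows
    are taken from the end of the data.
    """
    n_test = math.ceil(test_size * len(rows))
    n_train = len(rows) - n_test
    return rows[:n_train], rows[n_train:]

rows = list(range(14))  # assumed dataset size of 14 rows
train, test = ordered_split(rows, 0.1)
print(len(train), len(test))  # 12 2
print(test)                   # [12, 13]
```

With only two test rows, the measured test accuracy can take just three values (0 %, 50 %, or 100 %), which is worth keeping in mind when reading the results.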

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X_weather, y_weather, test_size=0.1,shuffle=False)

In [None]:
X_train

In [None]:
X_test

In [None]:
y_train

In [None]:
y_test

In [None]:
train_data = [(x, y) for x, y in zip(X_train, y_train)]
test_data = [(x, y) for x, y in zip(X_test, y_test)]

### 1.b.2.2 Batch = 1

##### 1.b.2.2.1 Our Own Classifier

In [None]:
start = time.time()

In [None]:
neural_network1 = Network([4, 10, 8, 1])
neural_network1.fit(train_data, 100, 1, 0.1, validation_split=0.1)

In [None]:
end = time.time() - start

In [None]:
plt.plot(neural_network1.history['acc'])
plt.plot(neural_network1.history['val_acc'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()

In [None]:
plt.plot(neural_network1.history['loss'])
plt.plot(neural_network1.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()

In [None]:
accuracy1, loss1 = neural_network1.evaluate(test_data)

In [None]:
print("Loss: {} %".format(loss1*100.0))
print("Accuracy {} %".format(accuracy1*100.0))
print("Time: {} s".format(end))  # time.time() differences are in seconds

##### 1.b.2.2.2 Keras Model

Reinitialize the Keras model for the first experiment.

In [None]:
network1 = Sequential([
    Dense(10, activation='sigmoid', input_shape=(4,)),
    Dense(8, activation='sigmoid'),
    Dense(1, activation='sigmoid')
])

In [None]:
network1.compile(optimizer='SGD', loss='mse', metrics=['accuracy'])

In [None]:
network1.summary()

In [None]:
start = time.time()

In [None]:
history1 = network1.fit(X_train, y_train, epochs=100, batch_size=1, validation_split=0.1)

In [None]:
end = time.time() - start

In [None]:
plt.plot(history1.history['acc'])
plt.plot(history1.history['val_acc'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()

In [None]:
plt.plot(history1.history['loss'])
plt.plot(history1.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()

In [None]:
score1 = network1.evaluate(X_test, y_test, batch_size=1)

In [None]:
print("Loss: {} %".format(score1[0]*100.0))
print("Accuracy {} %".format(score1[1]*100.0))
print("Time: {} s".format(end))  # time.time() differences are in seconds

### 1.b.2.3 Batch = Number of Training Samples

##### 1.b.2.3.1 Our Own Classifier

In [None]:
start = time.time()

In [None]:
neural_network2 = Network([4, 1])
neural_network2.fit(train_data, 100, len(X_train), 0.1, validation_split=0.1)

In [None]:
end = time.time() - start

In [None]:
plt.plot(neural_network2.history['acc'])
plt.plot(neural_network2.history['val_acc'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()

In [None]:
plt.plot(neural_network2.history['loss'])
plt.plot(neural_network2.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()

In [None]:
accuracy2, loss2 = neural_network2.evaluate(test_data)

In [None]:
print("Loss: {} %".format(loss2*100.0))
print("Accuracy {} %".format(accuracy2*100.0))
print("Time: {} s".format(end))  # time.time() differences are in seconds

##### 1.b.2.3.2 Keras Model

Reinitialize the Keras model for the second experiment.

In [None]:
network2 = Sequential([
    Dense(10, activation='sigmoid', input_shape=(4,)),
    Dense(8, activation='sigmoid'),
    Dense(1, activation='sigmoid')
])

In [None]:
network2.compile(optimizer='SGD', loss='mse', metrics=['accuracy'])

In [None]:
network2.summary()

In [None]:
start = time.time()

In [None]:
history2 = network2.fit(X_train, y_train, epochs=100, batch_size=len(X_train), validation_split=0.1)

In [None]:
end = time.time() - start

In [None]:
plt.plot(history2.history['acc'])
plt.plot(history2.history['val_acc'])
plt.title('Model accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()

In [None]:
plt.plot(history2.history['loss'])
plt.plot(history2.history['val_loss'])
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()

In [None]:
score2 = network2.evaluate(X_test, y_test, batch_size=len(X_train))

In [None]:
print("Loss: {} %".format(score2[0]*100.0))
print("Accuracy {} %".format(score2[1]*100.0))
print("Time: {} s".format(end))  # time.time() differences are in seconds

### 1.b.2.4 Experiment Analysis

The results of the experiments with our own classifier and the Keras model are summarized in the following table.

In [None]:
analysis_df = pd.DataFrame({'Classifier' : ['Own Classifier', 'Keras Model', 'Own Classifier', 'Keras Model'],
              'Batch' : [1, 1, len(X_train), len(X_train)],
              'Loss' :[loss1, score1[0], loss2, score2[0]],
              'Accuracy' :[accuracy1, score1[1], accuracy2, score2[1]]})
analysis_df.index += 1
analysis_df

The accuracy produced by the two classifiers is the same. However, looking at the accuracy and loss plots for each configuration, classifier 4 (the Keras model with batch = 12, i.e. the number of training samples) yields a better model than the others. By comparison, training runs better in Keras because it is more optimized than our algorithm, so it performs better even without using Adam.

We can therefore conclude that the best classifier in this experiment is classifier 4, the Keras model with batch = 12 (the number of training samples). Another thing that can be concl...

### Division of Work

Muhammad Rizki Duwinanto - 13515006 : Algorithm, Keras, Report <br/>
Kevin Erdiza Yogatama - 13515016 : Algorithm, Keras, Report <br/>
Edwin Rachman - 13515042 : Algorithm, Keras, Report