# Final Exam Second Semester 2566 - Neural Network (Dry Bean Problem)

The objective of this exam problem is to develop a neural network model that classifies a dry bean into one of seven types (classes) from 16 features, where

type 0 = Seker,

type 1 = Barbunya,

type 2 = Bombay,

type 3 = Cali,

type 4 = Horoz,

type 5 = Sira, and

type 6 = Dermason

Credit: This dataset is adapted from KOKLU, M. and OZKAN, I.A., (2020), “Multiclass Classification of Dry Beans Using Computer Vision and Machine Learning Techniques.” Computers and Electronics in Agriculture, 174, 105507. DOI: https://doi.org/10.1016/j.compag.2020.105507

In [1]:
#importing libraries
import os
import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize
import utils_NN as utils

## We start the exam by loading the dataset.

In [2]:
# Load training dataset and test dataset
data = np.loadtxt(r'C:\Users\panda\OneDrive\Desktop\fifaaa\Data\NN_BeanData_Train.txt')
# First 16 columns of data are features and the last column is the label.
# Matrix X contains 16 features while vector y contains the label.
X, y = data[:, :16], data[:, 16].astype(int)
m = y.size  # number of training examples

# Read tab separated testing data
data_test = np.loadtxt(r'C:\Users\panda\OneDrive\Desktop\fifaaa\Data\NN_BeanData_Test.txt')
X_test, y_test = data_test[:, :16], data_test[:, 16].astype(int)

In [3]:
X.shape, X_test.shape

((6000, 16), (1500, 16))

You have been provided with a set of initialized network parameters ($\Theta^{(1)}, \Theta^{(2)}$). These are stored in `InitBeanWeight1.txt` and `InitBeanWeight2.txt` which will be loaded in the next cell of this notebook into `Theta1` and `Theta2`. The parameters have dimensions that are sized for a neural network with 40 units in the second layer (hidden layer) and 7 output units (corresponding to 7 dry bean types).

In [4]:
#load initialized network parameters

Theta1 = np.loadtxt(r'C:\Users\panda\OneDrive\Desktop\fifaaa\Data\InitBeanWeight1.txt')
Theta2 = np.loadtxt(r'C:\Users\panda\OneDrive\Desktop\fifaaa\Data\InitBeanWeight2.txt')

# Unroll parameters 
nn_params = np.concatenate([np.ravel(Theta1), np.ravel(Theta2)])

In [5]:
Theta1.shape, Theta2.shape

((40, 17), (7, 41))

### Initial parameters to be used in optimize.minimize

#### *** Do not initialize parameters by yourself in this exam problem. ***

In [6]:
initial_nn_params = nn_params

In [7]:
initial_nn_params.shape

(967,)
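The length 967 follows from the two weight shapes: 40 × 17 + 7 × 41 = 680 + 287 = 967. A minimal sketch of the unroll/reshape round trip, using placeholder matrices filled with `np.arange` values (an assumption for illustration only; the exam itself uses the loaded weights):

```python
import numpy as np

input_layer_size, hidden_layer_size, num_labels = 16, 40, 7

# Placeholder weights with the same shapes as the loaded Theta1 and Theta2.
# These values are illustrative only, NOT the exam's initial parameters.
n1 = hidden_layer_size * (input_layer_size + 1)   # 40 * 17 = 680
n2 = num_labels * (hidden_layer_size + 1)         # 7 * 41 = 287
Theta1_demo = np.arange(n1, dtype=float).reshape(hidden_layer_size, input_layer_size + 1)
Theta2_demo = np.arange(n2, dtype=float).reshape(num_labels, hidden_layer_size + 1)

# Unroll both matrices into one flat vector, the form the optimizer works with.
params_demo = np.concatenate([np.ravel(Theta1_demo), np.ravel(Theta2_demo)])
print(params_demo.shape)

# Split the flat vector at n1 and reshape each part back into a matrix.
Theta1_back = params_demo[:n1].reshape(hidden_layer_size, input_layer_size + 1)
Theta2_back = params_demo[n1:].reshape(num_labels, hidden_layer_size + 1)
print(np.array_equal(Theta1_back, Theta1_demo), np.array_equal(Theta2_back, Theta2_demo))
```

The extra column in each matrix (17 rather than 16, 41 rather than 40) holds the bias weights.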

### Model representation

This neural network has 3 layers - an input layer, a hidden layer and an output layer. 

The inputs are **16** features of the dry beans.

The hidden layer has **40** neurons.

The outputs are **7** dry bean types (0 to 6).

The training data was loaded into the variables `X` and `y` above.

In [8]:
# Set up the layer sizes used throughout this exam
input_layer_size  = 16 #features
hidden_layer_size = 40 #hidden units
num_labels = 7 #labels
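To make the three-layer structure concrete, the following sketch traces array shapes through one forward pass. It assumes a handful of made-up input rows and small random weights purely to show the shapes; these random weights are not meant for training, which must start from the provided initial parameters.

```python
import numpy as np

def sigmoid_demo(z):
    # Elementwise logistic function
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
m_demo = 5                                            # hypothetical number of examples
X_demo = rng.normal(size=(m_demo, 16))                # made-up feature rows
Theta1_demo = rng.normal(scale=0.1, size=(40, 17))    # shape-only placeholder
Theta2_demo = rng.normal(scale=0.1, size=(7, 41))     # shape-only placeholder

# Input layer: prepend a column of ones for the bias unit.
a1 = np.hstack([np.ones((m_demo, 1)), X_demo])
# Hidden layer: 40 sigmoid units, then prepend the bias unit again.
a2 = np.hstack([np.ones((m_demo, 1)), sigmoid_demo(a1 @ Theta1_demo.T)])
# Output layer: one sigmoid unit per bean type.
a3 = sigmoid_demo(a2 @ Theta2_demo.T)

print(a1.shape, a2.shape, a3.shape)
```

Each row of `a3` holds seven values between 0 and 1, one per bean type.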

<blockquote>Feed Forward (Forward Propagation)</blockquote>

In [9]:
def sigmoid(z):
    # Elementwise logistic function; accepts scalars, lists, or arrays
    z = np.array(z)
    g = 1 / (1 + np.exp(-z))
    return g

In [10]:
# Test the implementation of sigmoid function here
z = 0
g = sigmoid(z)

print('g(',z,') = ', g)

g( 0 ) =  0.5
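Because the function converts its input with `np.array`, it also works elementwise on vectors and matrices. A small check, assuming SciPy's built-in logistic function `scipy.special.expit` as an independent reference (the exam does not use it; `sigmoid_demo` is a local copy so the snippet runs on its own):

```python
import numpy as np
from scipy.special import expit

def sigmoid_demo(z):
    # Same formula as the notebook's sigmoid, repeated here for self-containment
    z = np.asarray(z, dtype=float)
    return 1 / (1 + np.exp(-z))

z_demo = np.array([[-2.0, 0.0],
                   [0.5, 3.0]])
g_demo = sigmoid_demo(z_demo)

print(g_demo.shape)                          # output keeps the input's shape
print(np.allclose(g_demo, expit(z_demo)))    # agrees with SciPy's reference
```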


In [None]:
#cost function of neural network
def nnCostFunction(nn_params,
                   input_layer_size,
                   hidden_layer_size,
                   num_labels,
                   X, y, lambda_=0.0):
    # Reshape nn_params back into the parameters Theta1 and Theta2, the weight matrices
    Theta1 = np.reshape(nn_params[:hidden_layer_size * (input_layer_size + 1)],
                        (hidden_layer_size, (input_layer_size + 1)))  

    Theta2 = np.reshape(nn_params[(hidden_layer_size * (input_layer_size + 1)):], 
                        (num_labels, (hidden_layer_size + 1)))  

    # Setup some useful variables
    m = y.size
         
    # You need to return the following variables correctly 
    J = 0
    grad = []
    Theta1_grad = np.zeros(Theta1.shape) 
    Theta2_grad = np.zeros(Theta2.shape) 

    # --- Feed forward ---
    # a1 = X with a bias column of ones prepended: m x (input_layer_size + 1)
    a1 = np.concatenate([np.ones((m, 1)), X], axis=1)
    z2 = np.dot(a1, Theta1.T)  # m x hidden_layer_size
    a2 = sigmoid(z2)
    # Prepend the bias unit to the hidden activations: m x (hidden_layer_size + 1)
    a2 = np.concatenate([np.ones((a2.shape[0], 1)), a2], axis=1)
    z3 = np.dot(a2, Theta2.T)  # m x num_labels
    a3 = sigmoid(z3)
    # h = activations of the output layer
    h = a3

    # --- Cost function ---
    # One-hot encode the labels: m x num_labels
    y_matrix = np.eye(num_labels)[y]
    logprobs = np.multiply(np.log(h), y_matrix) + np.multiply(np.log(1 - h), 1 - y_matrix)
    J = (-1 / m) * np.sum(logprobs)

    # --- Regularization term (bias columns are excluded) ---
    reg_term = (lambda_ / (2 * m)) * (np.sum(np.square(Theta1[:, 1:]))
                                      + np.sum(np.square(Theta2[:, 1:])))
    J = J + reg_term

    # --- Backpropagation (requires sigmoidGradient to be defined) ---
    delta_3 = h - y_matrix  # m x num_labels
    delta_2 = np.dot(delta_3, Theta2[:, 1:]) * sigmoidGradient(z2)  # m x hidden_layer_size
    Delta1 = np.dot(delta_2.T, a1)  # hidden_layer_size x (input_layer_size + 1)
    Delta2 = np.dot(delta_3.T, a2)  # num_labels x (hidden_layer_size + 1)
    Theta1_grad = (1 / m) * Delta1
    Theta2_grad = (1 / m) * Delta2

    # --- Regularize the gradient (bias columns are excluded) ---
    Theta1_grad[:, 1:] = Theta1_grad[:, 1:] + (lambda_ / m) * Theta1[:, 1:]
    Theta2_grad[:, 1:] = Theta2_grad[:, 1:] + (lambda_ / m) * Theta2[:, 1:]

    # Unroll the regularized gradients into a single vector
    grad = np.concatenate([np.ravel(Theta1_grad), np.ravel(Theta2_grad)])
    
    return J, grad
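For reference, the quantities computed by `nnCostFunction` can be written as follows. Here $K$ is `num_labels`, $h^{(i)}_k$ is output unit $k$ for example $i$, $y^{(i)}_k$ is the one-hot label, $g$ is the sigmoid, $\odot$ is the elementwise product, and the column index $l$ starts at 0 for the bias column, so the regularization sums over $l \ge 1$ match `Theta1[:,1:]` and `Theta2[:,1:]` in the code. The notation is a reading of the code above, not an additional requirement of the exam.

```latex
J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K}
  \left[ y^{(i)}_k \log h^{(i)}_k
       + \left(1 - y^{(i)}_k\right) \log\left(1 - h^{(i)}_k\right) \right]
  + \frac{\lambda}{2m} \left[
      \sum_{j} \sum_{l \ge 1} \left(\Theta^{(1)}_{jl}\right)^2
    + \sum_{j} \sum_{l \ge 1} \left(\Theta^{(2)}_{jl}\right)^2 \right]

\delta^{(3)} = h - y, \qquad
\delta^{(2)} = \left(\delta^{(3)} \, \Theta^{(2)}_{:,\,1:}\right) \odot g'\!\left(z^{(2)}\right)

\frac{\partial J}{\partial \Theta^{(1)}} = \frac{1}{m} \left(\delta^{(2)}\right)^{\top} a^{(1)}
  + \frac{\lambda}{m} \, \Theta^{(1)} \ \text{(bias column excluded)}, \qquad
\frac{\partial J}{\partial \Theta^{(2)}} = \frac{1}{m} \left(\delta^{(3)}\right)^{\top} a^{(2)}
  + \frac{\lambda}{m} \, \Theta^{(2)} \ \text{(bias column excluded)}
```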


In [13]:
# Run the sigmoidGradient cell first; nnCostFunction calls it
lambda_ = 0
J, _ = nnCostFunction(nn_params, input_layer_size, hidden_layer_size,
                   num_labels, X, y, lambda_)
print('Cost at parameters (loaded from NN_weights): %.6f ' % J)

Cost at parameters (loaded from NN_weights): 4.898536 


In [14]:
# Run the sigmoidGradient cell first; nnCostFunction calls it
# The bias terms are not regularized
lambda_ = 1
J, _ = nnCostFunction(nn_params, input_layer_size, hidden_layer_size,
                   num_labels, X, y, lambda_)
print('Cost at parameters (loaded from NN_weights): %.6f ' % J)

Cost at parameters (loaded from NN_weights): 4.898584 
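The two printed costs differ only by the regularization term, so their difference gives a rough estimate of the sum of squared non-bias weights in the initial parameters. The estimate is approximate because both costs are printed to six decimals; $m = 6000$ is the number of training rows shown in `X.shape`.

```latex
J_{\lambda=1} - J_{\lambda=0} = \frac{1}{2m} \sum_{\text{non-bias}} \Theta^2
\quad\Longrightarrow\quad
\sum_{\text{non-bias}} \Theta^2 \approx 2 \cdot 6000 \cdot (4.898584 - 4.898536)
  = 12000 \cdot 0.000048 \approx 0.58
```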


<blockquote>Backward Propagation</blockquote>

In [12]:
def sigmoidGradient(z):
    # Derivative of the sigmoid: g'(z) = g(z) * (1 - g(z))
    s = sigmoid(z)
    g = s * (1 - s)
    return g

In [15]:
z = np.array([-1, -0.5, 0, 0.5, 1])
g = sigmoidGradient(z)
print('Sigmoid gradient evaluated at [-1 -0.5 0 0.5 1]:\n  ')
print(g)

Sigmoid gradient evaluated at [-1 -0.5 0 0.5 1]:
  
[0.19661193 0.23500371 0.25       0.23500371 0.19661193]
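One way to gain confidence in the analytic derivative is to compare it with a central-difference estimate at the same points. The sketch below is an optional check, not part of the exam; the `_demo` function names and the step size `eps` are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid_demo(z):
    return 1 / (1 + np.exp(-np.asarray(z, dtype=float)))

def sigmoid_gradient_demo(z):
    # Analytic derivative: g'(z) = g(z) * (1 - g(z))
    s = sigmoid_demo(z)
    return s * (1 - s)

z_demo = np.array([-1, -0.5, 0, 0.5, 1.0])
eps = 1e-5

# Central difference: (g(z + eps) - g(z - eps)) / (2 * eps)
numeric = (sigmoid_demo(z_demo + eps) - sigmoid_demo(z_demo - eps)) / (2 * eps)
analytic = sigmoid_gradient_demo(z_demo)

print(np.max(np.abs(numeric - analytic)))  # should be tiny
```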


In [16]:
utils.predict

<function utils_NN.predict(Theta1, Theta2, X)>
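The source of `utils_NN.predict` is not shown in this notebook. A plausible sketch, assuming it runs the same forward pass as `nnCostFunction` and returns the index of the largest output unit for each row (the name `predict_sketch` and the tiny hand-built network are hypothetical):

```python
import numpy as np

def predict_sketch(Theta1, Theta2, X):
    """Forward-propagate X and return the most active output unit per row."""
    X = np.atleast_2d(X)
    m = X.shape[0]
    a1 = np.hstack([np.ones((m, 1)), X])                 # add bias unit
    a2 = 1 / (1 + np.exp(-(a1 @ Theta1.T)))              # hidden activations
    a2 = np.hstack([np.ones((m, 1)), a2])                # add bias unit
    a3 = 1 / (1 + np.exp(-(a2 @ Theta2.T)))              # output activations
    return np.argmax(a3, axis=1)

# Tiny hand-built network: 2 inputs, 2 hidden units, 2 classes.
Theta1_demo = np.array([[0.0, 5.0, 0.0],
                        [0.0, 0.0, 5.0]])
Theta2_demo = np.array([[0.0, 5.0, -5.0],
                        [0.0, -5.0, 5.0]])
X_demo = np.array([[1.0, -1.0],
                   [-1.0, 1.0]])

pred_demo = predict_sketch(Theta1_demo, Theta2_demo, X_demo)
print(pred_demo)
```

Because the labels in this exam run from 0 to 6, the `argmax` index can be compared with `y` directly, with no offset.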

In [17]:
#Gradient Checking
utils.checkNNGradients(nnCostFunction)

[[-9.27825235e-03 -9.27825236e-03]
 [-3.04978709e-06 -3.04978914e-06]
 [-1.75060084e-04 -1.75060082e-04]
 [-9.62660640e-05 -9.62660620e-05]
 [ 8.89911959e-03  8.89911960e-03]
 [ 1.42869450e-05  1.42869443e-05]
 [ 2.33146358e-04  2.33146357e-04]
 [ 1.17982666e-04  1.17982666e-04]
 [-8.36010761e-03 -8.36010762e-03]
 [-2.59383093e-05 -2.59383100e-05]
 [-2.87468729e-04 -2.87468729e-04]
 [-1.37149709e-04 -1.37149706e-04]
 [ 7.62813550e-03  7.62813551e-03]
 [ 3.69883257e-05  3.69883234e-05]
 [ 3.35320351e-04  3.35320347e-04]
 [ 1.53247082e-04  1.53247082e-04]
 [-6.74798369e-03 -6.74798370e-03]
 [-4.68759742e-05 -4.68759769e-05]
 [-3.76215583e-04 -3.76215587e-04]
 [-1.66560294e-04 -1.66560294e-04]
 [ 3.14544970e-01  3.14544970e-01]
 [ 1.64090819e-01  1.64090819e-01]
 [ 1.64567932e-01  1.64567932e-01]
 [ 1.58339334e-01  1.58339334e-01]
 [ 1.51127527e-01  1.51127527e-01]
 [ 1.49568335e-01  1.49568335e-01]
 [ 1.11056588e-01  1.11056588e-01]
 [ 5.75736494e-02  5.75736493e-02]
 [ 5.77867378e-02  5
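Each row of the output above pairs two estimates of the same partial derivative, and the two columns should agree to many decimal places. The internals of `utils.checkNNGradients` are not shown here; the following is a minimal sketch of the underlying idea, assuming central differences with a small step (the function `numerical_gradient` and the toy cost are hypothetical):

```python
import numpy as np

def numerical_gradient(cost, theta, eps=1e-4):
    """Approximate the gradient of cost at theta with central differences."""
    grad = np.zeros_like(theta, dtype=float)
    step = np.zeros_like(theta, dtype=float)
    for i in range(theta.size):
        step[i] = eps                       # perturb one parameter at a time
        grad[i] = (cost(theta + step) - cost(theta - step)) / (2 * eps)
        step[i] = 0.0
    return grad

# Toy cost with a known gradient: J = sum(theta^2), so dJ/dtheta = 2 * theta.
def toy_cost(t):
    return np.sum(t ** 2)

theta_demo = np.array([0.5, -1.0, 2.0])
numeric = numerical_gradient(toy_cost, theta_demo)
analytic = 2 * theta_demo

# Relative difference between the two estimates; small means they agree.
rel_diff = np.linalg.norm(numeric - analytic) / np.linalg.norm(numeric + analytic)
print(np.column_stack([numeric, analytic]))
print(rel_diff)
```

A large relative difference usually points to a bug in the backpropagation code rather than in the numerical estimate.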

In [18]:
#  Check gradients by running checkNNGradients
lambda_ = 3
utils.checkNNGradients(nnCostFunction, lambda_)

# Also output the costFunction debugging values
debug_J, _  = nnCostFunction(nn_params, input_layer_size,
                          hidden_layer_size, num_labels, X, y, lambda_)

print('\n\nCost at (fixed) debugging parameters (w/ lambda = %f): %f ' % (lambda_, debug_J))

[[-9.27825235e-03 -9.27825236e-03]
 [-1.67679797e-02 -1.67679797e-02]
 [-6.01744725e-02 -6.01744725e-02]
 [-1.73704651e-02 -1.73704651e-02]
 [ 8.89911959e-03  8.89911960e-03]
 [ 3.94334829e-02  3.94334829e-02]
 [-3.19612287e-02 -3.19612287e-02]
 [-5.75658668e-02 -5.75658668e-02]
 [-8.36010761e-03 -8.36010762e-03]
 [ 5.93355565e-02  5.93355565e-02]
 [ 2.49225535e-02  2.49225535e-02]
 [-4.51963845e-02 -4.51963845e-02]
 [ 7.62813550e-03  7.62813551e-03]
 [ 2.47640974e-02  2.47640974e-02]
 [ 5.97717617e-02  5.97717617e-02]
 [ 9.14587966e-03  9.14587966e-03]
 [-6.74798369e-03 -6.74798370e-03]
 [-3.26881426e-02 -3.26881426e-02]
 [ 3.86410548e-02  3.86410548e-02]
 [ 5.46101547e-02  5.46101547e-02]
 [ 3.14544970e-01  3.14544970e-01]
 [ 1.18682669e-01  1.18682669e-01]
 [ 2.03987128e-01  2.03987128e-01]
 [ 1.25698067e-01  1.25698067e-01]
 [ 1.76337550e-01  1.76337550e-01]
 [ 1.32294136e-01  1.32294136e-01]
 [ 1.11056588e-01  1.11056588e-01]
 [ 3.81928689e-05  3.81928696e-05]
 [ 1.17148233e-01  1

Do not randomize the weights or write a `randInitializeWeights` function; use the provided initial parameters.

In [19]:
options = {'maxfun': 500}  # maximum number of function evaluations; adjust as needed
#  You should also try different values of lambda
lambda_ = 0.11
costFunction = lambda p: nnCostFunction(p, input_layer_size,
                                        hidden_layer_size,
                                        num_labels, X, y, lambda_)
res = optimize.minimize(costFunction,
                        initial_nn_params,
                        jac=True,
                        method='TNC',
                        options=options)

nn_params = res.x
        
Theta1 = np.reshape(nn_params[:hidden_layer_size * (input_layer_size + 1)],
                    (hidden_layer_size, (input_layer_size + 1)))

Theta2 = np.reshape(nn_params[(hidden_layer_size * (input_layer_size + 1)):],
                    (num_labels, (hidden_layer_size + 1)))
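With `jac=True`, `optimize.minimize` expects the objective to return a `(cost, gradient)` pair, which is why `nnCostFunction` returns both `J` and `grad`. A minimal sketch of the same calling pattern on a toy quadratic (the target vector and function name are made up for illustration):

```python
import numpy as np
from scipy import optimize

target = np.array([1.0, -2.0, 3.0])   # hypothetical minimizer of the toy cost

def toy_cost_and_grad(p):
    # Returns both the cost and its gradient, as jac=True requires.
    diff = p - target
    return np.sum(diff ** 2), 2 * diff

res_demo = optimize.minimize(toy_cost_and_grad,
                             np.zeros(3),          # starting point
                             jac=True,
                             method='TNC',
                             options={'maxfun': 100})

print(res_demo.x)     # should be close to target
print(res_demo.fun)   # should be close to 0
```

The optimizer stops either on convergence or when the `maxfun` budget is spent, so a larger `maxfun` may lower the final cost further.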



In [20]:
print('Cost function when lambda = ', lambda_,'is ', res.fun)

Cost function when lambda =  0.11 is  0.3593126217340986
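When a sigmoid output saturates to exactly 0 or 1 in floating point, `np.log(h)` or `np.log(1-h)` evaluates to `-inf`, NumPy emits a RuntimeWarning, and the product with a zero label becomes `nan`. One common guard, which is not part of the original solution, is to clip `h` away from 0 and 1 before taking logarithms. A minimal sketch (the helper name and the `eps` value are assumptions):

```python
import numpy as np

def clipped_cross_entropy(h, y_matrix, eps=1e-12):
    """Unregularized cross-entropy cost with h clipped into [eps, 1 - eps]."""
    h = np.clip(h, eps, 1 - eps)
    m = h.shape[0]
    logprobs = y_matrix * np.log(h) + (1 - y_matrix) * np.log(1 - h)
    return -np.sum(logprobs) / m

# Made-up outputs: the first row is fully saturated, the second is not.
h_demo = np.array([[1.0, 0.0],
                   [0.2, 0.8]])
y_demo = np.array([[1, 0],
                   [0, 1]])

cost_demo = clipped_cross_entropy(h_demo, y_demo)
print(cost_demo, np.isfinite(cost_demo))
```

Clipping changes the cost only where an output is within `eps` of 0 or 1, so it has a negligible effect on well-behaved predictions.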


In [None]:
# debug_J, _  = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, num_labels, X, y, lambda_ = 1)
# debug_J

In [21]:
pred_train = utils.predict(Theta1, Theta2, X)
print('Training Set Accuracy: %f' % (np.mean(pred_train == y) * 100))
pred_test = utils.predict(Theta1, Theta2, X_test)
print('Test Set Accuracy: %f' % (np.mean(pred_test == y_test) * 100))

Training Set Accuracy: 93.500000
Test Set Accuracy: 93.133333


In [22]:
options= {'maxfun': 2000}

#  You should also try different values of lambda
lambda_ = 2
costFunction = lambda p: nnCostFunction(p, input_layer_size,
                                        hidden_layer_size,
                                        num_labels, X, y, lambda_)
res = optimize.minimize(costFunction,
                        initial_nn_params,
                        jac=True,
                        method='TNC',
                        options=options)
nn_params = res.x
Theta1 = np.reshape(nn_params[:hidden_layer_size * (input_layer_size + 1)],
                    (hidden_layer_size, (input_layer_size + 1)))

Theta2 = np.reshape(nn_params[(hidden_layer_size * (input_layer_size + 1)):],
                    (num_labels, (hidden_layer_size + 1)))

In [23]:
print('Cost function when lambda = ', lambda_,'is ', res.fun)

Cost function when lambda =  2 is  0.4189075576751514


In [None]:
# debug_J, _  = nnCostFunction(nn_params, input_layer_size, hidden_layer_size, num_labels, X, y, lambda_ = 2)
# debug_J

In [24]:
pred_train = utils.predict(Theta1, Theta2, X)
print('Training Set Accuracy: %f' % (np.mean(pred_train == y) * 100))
pred_test = utils.predict(Theta1, Theta2, X_test)
print('Test Set Accuracy: %f' % (np.mean(pred_test == y_test) * 100))

Training Set Accuracy: 93.650000
Test Set Accuracy: 93.400000
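A single accuracy figure does not show which bean types are confused with each other. A minimal sketch of a per-class breakdown, using small made-up label arrays rather than the exam data (the helper name is hypothetical); with the real data one would pass `y_test`, `pred_test`, and `num_labels=7`:

```python
import numpy as np

def confusion_matrix_sketch(y_true, y_pred, num_labels):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((num_labels, num_labels), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)   # count each (true, predicted) pair
    return cm

# Made-up labels for three classes.
y_true_demo = np.array([0, 0, 1, 2, 2, 2])
y_pred_demo = np.array([0, 1, 1, 2, 2, 0])

cm_demo = confusion_matrix_sketch(y_true_demo, y_pred_demo, num_labels=3)
# Fraction of each true class that was predicted correctly.
per_class_recall = np.diag(cm_demo) / cm_demo.sum(axis=1)

print(cm_demo)
print(per_class_recall)
```

Off-diagonal entries identify the specific pairs of classes that the network mixes up.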


In [None]:
# utils.displayData(Theta1[:, 1:])
# plt.show()

### End of Neural Network Problem