# Tuning the hyperparameters of the previously discussed Neural Network

Importing the libraries

In [31]:
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from itertools import product

Let's start by importing the training and testing datasets again.

Then we separate the input features from the output labels, as shown.

In [32]:
# Load the training and testing datasets 
train_data = pd.read_csv('ds2_train.csv')
test_data = pd.read_csv('ds2_test.csv')

# Extract features and labels from the datasets
# Transpose the features so that each example is a column vector, which keeps the later dot products simple
X_train = train_data[['x_1', 'x_2']].values.T
Y_train = train_data['y'].values.reshape(1, -1)
X_test = test_data[['x_1', 'x_2']].values.T
Y_test = test_data['y'].values.reshape(1, -1)

The neural network we chose, as described in the subtask_2 Jupyter notebook, uses only two activation functions: ReLU and sigmoid.

In [33]:
# Define activation functions
def relu(Z):
    return np.maximum(0, Z)

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))
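
For very negative inputs, `np.exp(-Z)` in the sigmoid above can overflow and trigger a NumPy runtime warning, even though the result still evaluates to 0. A minimal sketch of a numerically safer variant is given below. `stable_sigmoid` is a hypothetical name that does not appear in the notebook, and the rest of the notebook does not depend on it.

```python
import numpy as np

def stable_sigmoid(Z):
    """Sigmoid that never exponentiates a large positive number."""
    Z = np.asarray(Z, dtype=float)
    out = np.empty_like(Z)
    pos = Z >= 0
    # For Z >= 0, exp(-Z) is at most 1, so it cannot overflow
    out[pos] = 1.0 / (1.0 + np.exp(-Z[pos]))
    # For Z < 0, use the equivalent form exp(Z) / (1 + exp(Z)); exp(Z) is below 1 here
    exp_z = np.exp(Z[~pos])
    out[~pos] = exp_z / (1.0 + exp_z)
    return out

values = stable_sigmoid(np.array([-1000.0, 0.0, 1000.0]))
print(values)
```

Both branches compute the same function as `sigmoid`; only the order of operations differs.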

We initialize the weights of each layer randomly using NumPy and set the biases to zero.

W1 and W2 are multiplied by 0.1 so that they start at small random values.

This is a common technique for avoiding slow convergence during training. Small initial weights keep the activations and gradients in a reasonable range, which makes backprop, and hence gradient descent, more effective. Large initial weights can saturate the sigmoid output or make the gradients explode, so the network learns slowly or does not learn effectively at all.

In [34]:
# Initialization of parameters
def initialize_params(input_size, hidden_size, output_size):
    W1 = np.random.randn(hidden_size, input_size) * 0.1
    b1 = np.zeros((hidden_size, 1))
    W2 = np.random.randn(output_size, hidden_size) * 0.1
    b2 = np.zeros((output_size, 1))
    return W1, b1, W2, b2
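
To make the effect of the 0.1 factor concrete, here is a small sketch that compares how far the initial sigmoid outputs sit from 0.5 for a small and a large weight scale. The helper names (`init_with_scale`, `output_spread`), the synthetic inputs and the scale of 10 are all assumptions made for illustration; they are not part of the notebook.

```python
import numpy as np

def init_with_scale(input_size, hidden_size, output_size, scale, rng):
    # Same layout as initialize_params, but with a configurable scale
    W1 = rng.standard_normal((hidden_size, input_size)) * scale
    b1 = np.zeros((hidden_size, 1))
    W2 = rng.standard_normal((output_size, hidden_size)) * scale
    b2 = np.zeros((output_size, 1))
    return W1, b1, W2, b2

def output_spread(scale, seed=0):
    # Mean distance of the initial predictions from 0.5 on synthetic inputs
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((2, 200))
    W1, b1, W2, b2 = init_with_scale(2, 5, 1, scale, rng)
    A1 = np.maximum(0, W1 @ X + b1)
    A2 = 1 / (1 + np.exp(-(W2 @ A1 + b2)))
    return float(np.mean(np.abs(A2 - 0.5)))

small = output_spread(0.1)
large = output_spread(10.0)
print(f"scale 0.1: {small:.4f}, scale 10: {large:.4f}")
```

With the small scale the predictions start close to 0.5, where the sigmoid is most sensitive. With the large scale many of them start near 0 or 1, where the sigmoid is saturated.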

The implementation of forward prop and back prop has already been explained.

In [35]:
# Implementation of forward propagation
def forward_propagation(X, W1, b1, W2, b2):
    Z1 = W1.dot(X) + b1
    A1 = relu(Z1)
    Z2 = W2.dot(A1) + b2
    A2 = sigmoid(Z2)
    return Z1, A1, Z2, A2

In [36]:
# Implementation of backward propagation
def backward_propagation(X, Y, Z1, A1, Z2, A2, W1, W2):
    m = X.shape[1]
    dZ2 = A2 - Y
    dW2 = (1 / m) * dZ2.dot(A1.T)
    db2 = (1 / m) * np.sum(dZ2, axis=1, keepdims=True)
    dZ1 = W2.T.dot(dZ2) * (Z1 > 0)
    dW1 = (1 / m) * dZ1.dot(X.T)
    db1 = (1 / m) * np.sum(dZ1, axis=1, keepdims=True)
    return dW1, db1, dW2, db2
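
One way to gain confidence in the gradients above is a finite-difference check. The sketch below restates the forward and backward passes in compact form (weights only, on a small synthetic batch) and compares the analytic gradients with central differences of the cross-entropy cost. The data, seed and helper names are assumptions made for illustration.

```python
import numpy as np

def forward(X, W1, b1, W2, b2):
    Z1 = W1 @ X + b1
    A1 = np.maximum(0, Z1)
    Z2 = W2 @ A1 + b2
    A2 = 1 / (1 + np.exp(-Z2))
    return Z1, A1, Z2, A2

def cost(X, Y, W1, b1, W2, b2):
    A2 = forward(X, W1, b1, W2, b2)[3]
    return -np.mean(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))

def backward(X, Y, Z1, A1, A2, W2):
    # Same formulas as backward_propagation, weights only
    m = X.shape[1]
    dZ2 = A2 - Y
    dW2 = dZ2 @ A1.T / m
    dZ1 = (W2.T @ dZ2) * (Z1 > 0)
    dW1 = dZ1 @ X.T / m
    return dW1, dW2

def numerical_grad(f, W, eps=1e-6):
    # Central differences, perturbing one entry of W at a time (in place)
    grad = np.zeros_like(W)
    for idx in np.ndindex(W.shape):
        old = W[idx]
        W[idx] = old + eps
        plus = f()
        W[idx] = old - eps
        minus = f()
        W[idx] = old
        grad[idx] = (plus - minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 20))
Y = (rng.random((1, 20)) > 0.5).astype(float)
W1 = rng.standard_normal((5, 2)) * 0.5
b1 = np.zeros((5, 1))
W2 = rng.standard_normal((1, 5)) * 0.5
b2 = np.zeros((1, 1))

Z1, A1, Z2, A2 = forward(X, W1, b1, W2, b2)
dW1, dW2 = backward(X, Y, Z1, A1, A2, W2)
f = lambda: cost(X, Y, W1, b1, W2, b2)
err_W1 = np.max(np.abs(dW1 - numerical_grad(f, W1)))
err_W2 = np.max(np.abs(dW2 - numerical_grad(f, W2)))
print(err_W1, err_W2)
```

Both errors should be tiny if the analytic gradients match the cost. A large mismatch would point to a bug in either the forward or the backward pass.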

The code block below is very interesting. 

It uses a technique called regularisation, which is widely used to improve the generalisation of ML models.

There are two common types, L1 and L2. We chose L2, though L1 could also be used; both have their merits.

L2 regularization plays a crucial role in this function. It works by adding a regularization term to the gradient descent update for the neural network's weights. 

The primary aim is to tackle overfitting by making the model learn smaller weights, which in turn reduces its complexity.

By controlling the regularization strength, we can fine-tune how strongly this penalty affects the model. Hence reg_strength is an external parameter, or hyperparameter.

Higher values penalize large weights more heavily, leading to a simpler model that tends to generalise better. But values that are too large over-regularise the model so that it underfits, while very small values do not prevent overfitting.

Hence we will later use a search procedure, which this notebook calls random search, to choose the best set of hyperparameters by training and comparing different combinations of hyperparameter values.

The regularisation is done by adding (reg_strength / m) times each weight matrix to its gradient, where m is the number of training examples. The biases are not regularised.
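
The update can be read as plain gradient descent on a cost with an added L2 penalty. The short derivation below assumes the conventional penalty form with $\lambda$ = `reg_strength` and learning rate $\alpha$. The notebook does not write out this cost, so the exact form of the penalty is an assumption chosen to be consistent with the update used in the code.

```latex
J_{\text{reg}} = J + \frac{\lambda}{2m}\left(\lVert W_1 \rVert_F^2 + \lVert W_2 \rVert_F^2\right)

\frac{\partial J_{\text{reg}}}{\partial W_k} = dW_k + \frac{\lambda}{m} W_k,
\qquad
W_k \leftarrow W_k - \alpha\left(dW_k + \frac{\lambda}{m} W_k\right),
\qquad k \in \{1, 2\}
```

Here $dW_k$ is the unregularised gradient returned by backward propagation. The biases receive no penalty term, so their update is unchanged.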

In [24]:
# Update parameters using gradient descent with L2 regularization
def update_params(W1, b1, W2, b2, dW1, db1, dW2, db2, learning_rate, reg_strength):
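    # m is read from the global X_train, so this function assumes it is only used while training on that set
    m = X_train.shape[1]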
    W1 -= learning_rate * (dW1 + (reg_strength / m) * W1)
    b1 -= learning_rate * db1
    W2 -= learning_rate * (dW2 + (reg_strength / m) * W2)
    b2 -= learning_rate * db2
    return W1, b1, W2, b2

Here we define the training function, this time with the hyperparameters exposed as arguments so that they can be tuned.

The code block is commented extensively, as it is perhaps one of the most important code snippets in this notebook.

In [30]:
# Train the model for a given set of hyperparameters
def train_model(X, Y, hidden_size, output_size, num_epochs=1000, learning_rate=0.1, reg_strength=0.01):
    # Get the number of input features
    input_size = X.shape[0]
    
    # Initialize the weights and biases
    W1, b1, W2, b2 = initialize_params(input_size, hidden_size, output_size)
    
    # Start the training process for the specified number of epochs
    # Note that the number of epochs is also a hyperparameter, which will be tuned by random_search
    for epoch in range(num_epochs):
        # Perform forward propagation to compute the predictions (A2) and intermediate values (Z1, A1, Z2)
        Z1, A1, Z2, A2 = forward_propagation(X, W1, b1, W2, b2)
        
        # Perform backward propagation to compute the gradients of the parameters
        dW1, db1, dW2, db2 = backward_propagation(X, Y, Z1, A1, Z2, A2, W1, W2)
        
        # Update the parameters using gradient descent with L2 regularization
        W1, b1, W2, b2 = update_params(W1, b1, W2, b2, dW1, db1, dW2, db2, learning_rate, reg_strength)
        
        # Print the cross-entropy cost (without the L2 penalty term) every 100 epochs to monitor training progress,
        # so that any error or anomaly during training can be spotted early
        if epoch % 100 == 0:
            cost = (-1 / X.shape[1]) * np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))
            print(f"Epoch {epoch}/{num_epochs}, Cost: {cost:.4f}")
    
    # Return the trained parameters (weights and biases) of the neural network
    return W1, b1, W2, b2
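
The cost expression inside `train_model` calls `np.log` directly on `A2`, which returns `-inf` if a prediction ever reaches exactly 0 or 1. A minimal sketch of a clipped version is shown below; `cross_entropy` and the `eps` value are assumptions made for illustration, not part of the notebook. It also shows why a cost of 0.6931 in the training output is worth noticing: that is ln 2, the cost obtained when every prediction is 0.5.

```python
import numpy as np

def cross_entropy(A2, Y, eps=1e-12):
    # Clip predictions away from 0 and 1 so that log() stays finite
    A2 = np.clip(A2, eps, 1 - eps)
    return float(-np.mean(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)))

Y = np.array([[1.0, 0.0, 1.0, 0.0]])

# Every prediction at 0.5: the cost equals ln 2
chance_cost = cross_entropy(np.full((1, 4), 0.5), Y)
print(f"{chance_cost:.4f}")  # 0.6931

# Predictions exactly equal to the labels: finite thanks to the clipping
perfect_cost = cross_entropy(Y.copy(), Y)
print(perfect_cost)
```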


The function below is the star of the show.

This search is what lets us tune the hyperparameters in an automated manner.

It works simply: train and test each set of hyperparameters in turn. If a set achieves the best accuracy seen so far (here, the training accuracy), it is stored as the best set until a later iteration finds one that performs better. Note that, although the function is named random_search, it does not sample at random: it loops over every combination of the supplied values, which is an exhaustive grid search.

That's it: the most important piece of code in this notebook.

This code block is also explained extensively through comments.

In [38]:
# Random Search for hyperparameter tuning
# It takes the following input arguments:
#   - X: The input features 
#   - Y: The true labels 
#   - hidden_sizes: A list of integers representing the number of neurons in the hidden layer to be tried during random search.
#   - learning_rates: A list of float values representing the learning rates to be tried during random search.
#   - reg_strengths: A list of float values representing the regularization strengths to be tried during random search.
#   - num_epochs_list: A list of integers representing the number of epochs to be tried during random search.

def random_search(X, Y, hidden_sizes, learning_rates, reg_strengths, num_epochs_list):
    # Initialize variables to store the best accuracy and corresponding best hyperparameters.
    best_accuracy = 0.0
    best_params = {}
    
    # Iterate over all combinations of hyperparameters using itertools.product,
    # which generates the Cartesian product of the hyperparameter lists.
    # The Cartesian product yields every possible combination, taking one value from each list.
    for hidden_size, learning_rate, reg_strength, num_epochs in product(hidden_sizes, learning_rates, reg_strengths, num_epochs_list):
        # Print the current combination of hyperparameters being used for training.
        print(f"\nTraining with hidden_size={hidden_size}, learning_rate={learning_rate}, reg_strength={reg_strength}, num_epochs={num_epochs}")
        
        # Train the model with the current combination of hyperparameters using the 'train_model' function.
        W1, b1, W2, b2 = train_model(X, Y, hidden_size, 1, num_epochs=num_epochs, learning_rate=learning_rate, reg_strength=reg_strength)
        
        # Make predictions on the training data using the trained model.
        Y_pred_train = predict(X, W1, b1, W2, b2)
        
        # Calculate the training accuracy using the 'accuracy' function.
        train_accuracy = accuracy(Y_pred_train, Y)
        
        # Print the training accuracy achieved with the current hyperparameter combination.
        print(f"Train Accuracy: {train_accuracy:.2%}")

        # Check if the current training accuracy is better than the best accuracy found so far.
        if train_accuracy > best_accuracy:
            # If yes, update the best accuracy and the corresponding best hyperparameters.
            best_accuracy = train_accuracy
            best_params = {
                'hidden_size': hidden_size,
                'learning_rate': learning_rate,
                'reg_strength': reg_strength,
                'num_epochs': num_epochs,
            }
    
    # Return the best hyperparameter values and the corresponding best accuracy achieved during random search.
    return best_params, best_accuracy
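
The function above walks through every combination produced by `product`. For comparison, here is a minimal sketch of a search that samples a fixed number of combinations at random instead. `sample_random_search`, `evaluate`, `search_space`, `n_trials` and the toy scoring function are all hypothetical names introduced for this illustration. In the notebook, `evaluate` would wrap `train_model`, `predict` and `accuracy`.

```python
import random

def sample_random_search(evaluate, search_space, n_trials=10, seed=0):
    """Try n_trials randomly sampled combinations and keep the best one."""
    rng = random.Random(seed)
    best_score, best_params = float("-inf"), None
    for _ in range(n_trials):
        # Draw one value per hyperparameter, independently and uniformly
        params = {name: rng.choice(values) for name, values in search_space.items()}
        score = evaluate(**params)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score

# Toy stand-in for "train a model and return its accuracy" (an assumption)
def toy_score(hidden_size, learning_rate):
    return -abs(learning_rate - 0.1) - 0.001 * hidden_size

space = {"hidden_size": [5, 10, 15], "learning_rate": [0.01, 0.1, 0.5, 1]}
best_params, best_score = sample_random_search(toy_score, space, n_trials=8, seed=1)
print(best_params, best_score)
```

The cost of this approach is set by `n_trials` rather than by the size of the grid, at the price of possibly missing the best combination.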


Now everything is ALMOST done.

The predict and accuracy functions are very simple and are defined as shown below.

In [27]:
# Make predictions (0 or 1) for the given inputs
def predict(X, W1, b1, W2, b2):
    _, _, _, A2 = forward_propagation(X, W1, b1, W2, b2)
    return np.round(A2)

# Evaluate the model
def accuracy(Y_pred, Y_true):
    return np.mean(Y_pred == Y_true)
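
A small worked example of these two functions on hand-made values is given below (the arrays are made up for illustration). One detail worth knowing is that `np.round` rounds exact halves to the nearest even value, so an output of exactly 0.5 is predicted as class 0. That would be consistent with runs whose cost stays at 0.6931 reporting an accuracy close to 50%, assuming the classes are roughly balanced.

```python
import numpy as np

def accuracy(Y_pred, Y_true):
    return np.mean(Y_pred == Y_true)

# Hypothetical sigmoid outputs and labels for four examples
A2 = np.array([[0.5, 0.2, 0.9, 0.5]])
Y_true = np.array([[0, 0, 1, 1]])

Y_pred = np.round(A2)            # 0.5 rounds to 0.0 (round half to even)
print(Y_pred)                    # [[0. 0. 1. 0.]]
print(accuracy(Y_pred, Y_true))  # 0.75
```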

Now let's finish up. We put the values we want to search over into the lists shown below and then call the random search function, which returns the best hyperparameters and the accuracy they achieved.
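
Before running the search, it helps to know how much work it involves. The sketch below counts the runs and the total number of training epochs for the value lists used in the next cell. `n_runs` and `total_epochs` are names introduced here for illustration.

```python
from itertools import product

hidden_sizes = [5, 10, 15]
learning_rates = [0.01, 0.1, 0.5, 1]
reg_strengths = [0, 0.001, 0.01, 0.1, 1]
num_epochs_list = [1000, 2000, 3000, 5000, 10000]

combos = list(product(hidden_sizes, learning_rates, reg_strengths, num_epochs_list))
n_runs = len(combos)                       # 3 * 4 * 5 * 5
total_epochs = sum(c[3] for c in combos)   # epochs summed over all runs
print(n_runs, total_epochs)  # 300 1260000
```

Every run retrains the network from scratch, which is why the output of the search is so long.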

In [28]:
# Perform random search for hyperparameter tuning
hidden_sizes = [5, 10, 15]
learning_rates = [0.01, 0.1, 0.5, 1]
reg_strengths = [0, 0.001, 0.01, 0.1, 1]
num_epochs_list = [1000, 2000, 3000, 5000, 10000]

best_params, best_accuracy = random_search(X_train, Y_train, hidden_sizes, learning_rates, reg_strengths, num_epochs_list)

# Print the best hyperparameter values and the corresponding evaluation metric score
print("\nBest Hyperparameter Values:")
print(f"Hidden Size: {best_params['hidden_size']}")
print(f"Learning Rate: {best_params['learning_rate']}")
print(f"Regularization Strength: {best_params['reg_strength']}")
print(f"Number of Epochs: {best_params['num_epochs']}")
print(f"Best Accuracy: {best_accuracy:.2%}")


Training with hidden_size=5, learning_rate=0.01, reg_strength=0, num_epochs=1000
Epoch 0/1000, Cost: 0.6987
Epoch 100/1000, Cost: 0.6918
Epoch 200/1000, Cost: 0.6820
Epoch 300/1000, Cost: 0.6695
Epoch 400/1000, Cost: 0.6515
Epoch 500/1000, Cost: 0.6251
Epoch 600/1000, Cost: 0.5886
Epoch 700/1000, Cost: 0.5406
Epoch 800/1000, Cost: 0.4826
Epoch 900/1000, Cost: 0.4219
Train Accuracy: 90.38%

Training with hidden_size=5, learning_rate=0.01, reg_strength=0, num_epochs=2000
Epoch 0/2000, Cost: 0.6969
Epoch 100/2000, Cost: 0.6835
Epoch 200/2000, Cost: 0.6671
Epoch 300/2000, Cost: 0.6430
Epoch 400/2000, Cost: 0.6073
Epoch 500/2000, Cost: 0.5586
Epoch 600/2000, Cost: 0.4991
Epoch 700/2000, Cost: 0.4357
Epoch 800/2000, Cost: 0.3785
Epoch 900/2000, Cost: 0.3332
Epoch 1000/2000, Cost: 0.2999
Epoch 1100/2000, Cost: 0.2760
Epoch 1200/2000, Cost: 0.2589
Epoch 1300/2000, Cost: 0.2466
Epoch 1400/2000, Cost: 0.2375
Epoch 1500/2000, Cost: 0.2307
Epoch 1600/2000, Cost: 0.2255
Epoch 1700/2000, Cost: 0.22

Epoch 1400/3000, Cost: 0.3658
Epoch 1500/3000, Cost: 0.3487
Epoch 1600/3000, Cost: 0.3339
Epoch 1700/3000, Cost: 0.3211
Epoch 1800/3000, Cost: 0.3099
Epoch 1900/3000, Cost: 0.3002
Epoch 2000/3000, Cost: 0.2916
Epoch 2100/3000, Cost: 0.2840
Epoch 2200/3000, Cost: 0.2774
Epoch 2300/3000, Cost: 0.2715
Epoch 2400/3000, Cost: 0.2662
Epoch 2500/3000, Cost: 0.2616
Epoch 2600/3000, Cost: 0.2573
Epoch 2700/3000, Cost: 0.2535
Epoch 2800/3000, Cost: 0.2501
Epoch 2900/3000, Cost: 0.2470
Train Accuracy: 91.50%

Training with hidden_size=5, learning_rate=0.01, reg_strength=0.001, num_epochs=5000
Epoch 0/5000, Cost: 0.6963
Epoch 100/5000, Cost: 0.6938
Epoch 200/5000, Cost: 0.6918
Epoch 300/5000, Cost: 0.6891
Epoch 400/5000, Cost: 0.6846
Epoch 500/5000, Cost: 0.6763
Epoch 600/5000, Cost: 0.6616
Epoch 700/5000, Cost: 0.6376
Epoch 800/5000, Cost: 0.6018
Epoch 900/5000, Cost: 0.5536
Epoch 1000/5000, Cost: 0.4945
Epoch 1100/5000, Cost: 0.4316
Epoch 1200/5000, Cost: 0.3751
Epoch 1300/5000, Cost: 0.3305
Epo

Epoch 2800/5000, Cost: 0.2059
Epoch 2900/5000, Cost: 0.2055
Epoch 3000/5000, Cost: 0.2052
Epoch 3100/5000, Cost: 0.2049
Epoch 3200/5000, Cost: 0.2047
Epoch 3300/5000, Cost: 0.2045
Epoch 3400/5000, Cost: 0.2043
Epoch 3500/5000, Cost: 0.2041
Epoch 3600/5000, Cost: 0.2039
Epoch 3700/5000, Cost: 0.2038
Epoch 3800/5000, Cost: 0.2036
Epoch 3900/5000, Cost: 0.2035
Epoch 4000/5000, Cost: 0.2034
Epoch 4100/5000, Cost: 0.2033
Epoch 4200/5000, Cost: 0.2032
Epoch 4300/5000, Cost: 0.2031
Epoch 4400/5000, Cost: 0.2030
Epoch 4500/5000, Cost: 0.2029
Epoch 4600/5000, Cost: 0.2028
Epoch 4700/5000, Cost: 0.2027
Epoch 4800/5000, Cost: 0.2026
Epoch 4900/5000, Cost: 0.2026
Train Accuracy: 91.12%

Training with hidden_size=5, learning_rate=0.01, reg_strength=0.01, num_epochs=10000
Epoch 0/10000, Cost: 0.6934
Epoch 100/10000, Cost: 0.6868
Epoch 200/10000, Cost: 0.6794
Epoch 300/10000, Cost: 0.6655
Epoch 400/10000, Cost: 0.6435
Epoch 500/10000, Cost: 0.6092
Epoch 600/10000, Cost: 0.5592
Epoch 700/10000, Cost: 

Epoch 3400/10000, Cost: 0.2588
Epoch 3500/10000, Cost: 0.2568
Epoch 3600/10000, Cost: 0.2550
Epoch 3700/10000, Cost: 0.2532
Epoch 3800/10000, Cost: 0.2516
Epoch 3900/10000, Cost: 0.2500
Epoch 4000/10000, Cost: 0.2486
Epoch 4100/10000, Cost: 0.2472
Epoch 4200/10000, Cost: 0.2459
Epoch 4300/10000, Cost: 0.2447
Epoch 4400/10000, Cost: 0.2435
Epoch 4500/10000, Cost: 0.2424
Epoch 4600/10000, Cost: 0.2413
Epoch 4700/10000, Cost: 0.2402
Epoch 4800/10000, Cost: 0.2392
Epoch 4900/10000, Cost: 0.2383
Epoch 5000/10000, Cost: 0.2373
Epoch 5100/10000, Cost: 0.2364
Epoch 5200/10000, Cost: 0.2356
Epoch 5300/10000, Cost: 0.2348
Epoch 5400/10000, Cost: 0.2340
Epoch 5500/10000, Cost: 0.2332
Epoch 5600/10000, Cost: 0.2325
Epoch 5700/10000, Cost: 0.2317
Epoch 5800/10000, Cost: 0.2311
Epoch 5900/10000, Cost: 0.2304
Epoch 6000/10000, Cost: 0.2297
Epoch 6100/10000, Cost: 0.2291
Epoch 6200/10000, Cost: 0.2284
Epoch 6300/10000, Cost: 0.2278
Epoch 6400/10000, Cost: 0.2272
Epoch 6500/10000, Cost: 0.2266
Epoch 66

Epoch 7900/10000, Cost: 0.2016
Epoch 8000/10000, Cost: 0.2016
Epoch 8100/10000, Cost: 0.2015
Epoch 8200/10000, Cost: 0.2015
Epoch 8300/10000, Cost: 0.2015
Epoch 8400/10000, Cost: 0.2015
Epoch 8500/10000, Cost: 0.2015
Epoch 8600/10000, Cost: 0.2014
Epoch 8700/10000, Cost: 0.2014
Epoch 8800/10000, Cost: 0.2014
Epoch 8900/10000, Cost: 0.2014
Epoch 9000/10000, Cost: 0.2014
Epoch 9100/10000, Cost: 0.2014
Epoch 9200/10000, Cost: 0.2014
Epoch 9300/10000, Cost: 0.2013
Epoch 9400/10000, Cost: 0.2013
Epoch 9500/10000, Cost: 0.2013
Epoch 9600/10000, Cost: 0.2013
Epoch 9700/10000, Cost: 0.2013
Epoch 9800/10000, Cost: 0.2013
Epoch 9900/10000, Cost: 0.2013
Train Accuracy: 91.38%

Training with hidden_size=5, learning_rate=0.1, reg_strength=0, num_epochs=1000
Epoch 0/1000, Cost: 0.6887
Epoch 100/1000, Cost: 0.2878
Epoch 200/1000, Cost: 0.2128
Epoch 300/1000, Cost: 0.2055
Epoch 400/1000, Cost: 0.2035
Epoch 500/1000, Cost: 0.2026
Epoch 600/1000, Cost: 0.2020
Epoch 700/1000, Cost: 0.2016
Epoch 800/1000,

Epoch 700/2000, Cost: 0.2016
Epoch 800/2000, Cost: 0.2014
Epoch 900/2000, Cost: 0.2012
Epoch 1000/2000, Cost: 0.2011
Epoch 1100/2000, Cost: 0.2010
Epoch 1200/2000, Cost: 0.2010
Epoch 1300/2000, Cost: 0.2009
Epoch 1400/2000, Cost: 0.2009
Epoch 1500/2000, Cost: 0.2009
Epoch 1600/2000, Cost: 0.2009
Epoch 1700/2000, Cost: 0.2009
Epoch 1800/2000, Cost: 0.2009
Epoch 1900/2000, Cost: 0.2009
Train Accuracy: 91.50%

Training with hidden_size=5, learning_rate=0.1, reg_strength=0.001, num_epochs=3000
Epoch 0/3000, Cost: 0.6911
Epoch 100/3000, Cost: 0.3322
Epoch 200/3000, Cost: 0.2154
Epoch 300/3000, Cost: 0.2060
Epoch 400/3000, Cost: 0.2037
Epoch 500/3000, Cost: 0.2027
Epoch 600/3000, Cost: 0.2021
Epoch 700/3000, Cost: 0.2017
Epoch 800/3000, Cost: 0.2014
Epoch 900/3000, Cost: 0.2013
Epoch 1000/3000, Cost: 0.2011
Epoch 1100/3000, Cost: 0.2011
Epoch 1200/3000, Cost: 0.2010
Epoch 1300/3000, Cost: 0.2010
Epoch 1400/3000, Cost: 0.2010
Epoch 1500/3000, Cost: 0.2009
Epoch 1600/3000, Cost: 0.2009
Epoch 1

Epoch 1300/5000, Cost: 0.2010
Epoch 1400/5000, Cost: 0.2010
Epoch 1500/5000, Cost: 0.2010
Epoch 1600/5000, Cost: 0.2010
Epoch 1700/5000, Cost: 0.2010
Epoch 1800/5000, Cost: 0.2010
Epoch 1900/5000, Cost: 0.2010
Epoch 2000/5000, Cost: 0.2009
Epoch 2100/5000, Cost: 0.2009
Epoch 2200/5000, Cost: 0.2009
Epoch 2300/5000, Cost: 0.2009
Epoch 2400/5000, Cost: 0.2009
Epoch 2500/5000, Cost: 0.2009
Epoch 2600/5000, Cost: 0.2009
Epoch 2700/5000, Cost: 0.2009
Epoch 2800/5000, Cost: 0.2009
Epoch 2900/5000, Cost: 0.2009
Epoch 3000/5000, Cost: 0.2009
Epoch 3100/5000, Cost: 0.2009
Epoch 3200/5000, Cost: 0.2009
Epoch 3300/5000, Cost: 0.2009
Epoch 3400/5000, Cost: 0.2009
Epoch 3500/5000, Cost: 0.2009
Epoch 3600/5000, Cost: 0.2009
Epoch 3700/5000, Cost: 0.2009
Epoch 3800/5000, Cost: 0.2009
Epoch 3900/5000, Cost: 0.2009
Epoch 4000/5000, Cost: 0.2009
Epoch 4100/5000, Cost: 0.2009
Epoch 4200/5000, Cost: 0.2009
Epoch 4300/5000, Cost: 0.2009
Epoch 4400/5000, Cost: 0.2009
Epoch 4500/5000, Cost: 0.2009
Epoch 4600

Epoch 2400/10000, Cost: 0.6931
Epoch 2500/10000, Cost: 0.6931
Epoch 2600/10000, Cost: 0.6931
Epoch 2700/10000, Cost: 0.6931
Epoch 2800/10000, Cost: 0.6931
Epoch 2900/10000, Cost: 0.6931
Epoch 3000/10000, Cost: 0.6931
Epoch 3100/10000, Cost: 0.6931
Epoch 3200/10000, Cost: 0.6931
Epoch 3300/10000, Cost: 0.6931
Epoch 3400/10000, Cost: 0.6931
Epoch 3500/10000, Cost: 0.6931
Epoch 3600/10000, Cost: 0.6931
Epoch 3700/10000, Cost: 0.6931
Epoch 3800/10000, Cost: 0.6931
Epoch 3900/10000, Cost: 0.6931
Epoch 4000/10000, Cost: 0.6931
Epoch 4100/10000, Cost: 0.6931
Epoch 4200/10000, Cost: 0.6931
Epoch 4300/10000, Cost: 0.6931
Epoch 4400/10000, Cost: 0.6931
Epoch 4500/10000, Cost: 0.6931
Epoch 4600/10000, Cost: 0.6931
Epoch 4700/10000, Cost: 0.6931
Epoch 4800/10000, Cost: 0.6931
Epoch 4900/10000, Cost: 0.6931
Epoch 5000/10000, Cost: 0.6931
Epoch 5100/10000, Cost: 0.6931
Epoch 5200/10000, Cost: 0.6931
Epoch 5300/10000, Cost: 0.6931
Epoch 5400/10000, Cost: 0.6931
Epoch 5500/10000, Cost: 0.6931
Epoch 56

Epoch 7000/10000, Cost: 0.2010
Epoch 7100/10000, Cost: 0.2010
Epoch 7200/10000, Cost: 0.2010
Epoch 7300/10000, Cost: 0.2010
Epoch 7400/10000, Cost: 0.2010
Epoch 7500/10000, Cost: 0.2010
Epoch 7600/10000, Cost: 0.2010
Epoch 7700/10000, Cost: 0.2010
Epoch 7800/10000, Cost: 0.2010
Epoch 7900/10000, Cost: 0.2010
Epoch 8000/10000, Cost: 0.2010
Epoch 8100/10000, Cost: 0.2010
Epoch 8200/10000, Cost: 0.2010
Epoch 8300/10000, Cost: 0.2010
Epoch 8400/10000, Cost: 0.2010
Epoch 8500/10000, Cost: 0.2010
Epoch 8600/10000, Cost: 0.2010
Epoch 8700/10000, Cost: 0.2010
Epoch 8800/10000, Cost: 0.2010
Epoch 8900/10000, Cost: 0.2010
Epoch 9000/10000, Cost: 0.2010
Epoch 9100/10000, Cost: 0.2010
Epoch 9200/10000, Cost: 0.2010
Epoch 9300/10000, Cost: 0.2010
Epoch 9400/10000, Cost: 0.2010
Epoch 9500/10000, Cost: 0.2010
Epoch 9600/10000, Cost: 0.2010
Epoch 9700/10000, Cost: 0.2010
Epoch 9800/10000, Cost: 0.2010
Epoch 9900/10000, Cost: 0.2010
Train Accuracy: 91.50%

Training with hidden_size=5, learning_rate=0.5

Epoch 300/3000, Cost: 0.2272
Epoch 400/3000, Cost: 0.2227
Epoch 500/3000, Cost: 0.2202
Epoch 600/3000, Cost: 0.2194
Epoch 700/3000, Cost: 0.2200
Epoch 800/3000, Cost: 0.2209
Epoch 900/3000, Cost: 0.2112
Epoch 1000/3000, Cost: 0.2073
Epoch 1100/3000, Cost: 0.2156
Epoch 1200/3000, Cost: 0.2066
Epoch 1300/3000, Cost: 0.2040
Epoch 1400/3000, Cost: 0.2034
Epoch 1500/3000, Cost: 0.2048
Epoch 1600/3000, Cost: 0.2028
Epoch 1700/3000, Cost: 0.2107
Epoch 1800/3000, Cost: 0.2027
Epoch 1900/3000, Cost: 0.2024
Epoch 2000/3000, Cost: 0.2155
Epoch 2100/3000, Cost: 0.2023
Epoch 2200/3000, Cost: 0.2021
Epoch 2300/3000, Cost: 0.2020
Epoch 2400/3000, Cost: 0.2153
Epoch 2500/3000, Cost: 0.2021
Epoch 2600/3000, Cost: 0.2019
Epoch 2700/3000, Cost: 0.2018
Epoch 2800/3000, Cost: 0.2017
Epoch 2900/3000, Cost: 0.2017
Train Accuracy: 91.12%

Training with hidden_size=5, learning_rate=0.5, reg_strength=0.001, num_epochs=5000
Epoch 0/5000, Cost: 0.6928
Epoch 100/5000, Cost: 0.4081
Epoch 200/5000, Cost: 0.2054
Epoc

Epoch 1500/5000, Cost: 0.2016
Epoch 1600/5000, Cost: 0.2011
Epoch 1700/5000, Cost: 0.2013
Epoch 1800/5000, Cost: 0.2047
Epoch 1900/5000, Cost: 0.2012
Epoch 2000/5000, Cost: 0.2026
Epoch 2100/5000, Cost: 0.2012
Epoch 2200/5000, Cost: 0.2014
Epoch 2300/5000, Cost: 0.2011
Epoch 2400/5000, Cost: 0.2013
Epoch 2500/5000, Cost: 0.2153
Epoch 2600/5000, Cost: 0.2012
Epoch 2700/5000, Cost: 0.2015
Epoch 2800/5000, Cost: 0.2011
Epoch 2900/5000, Cost: 0.2013
Epoch 3000/5000, Cost: 0.2233
Epoch 3100/5000, Cost: 0.2012
Epoch 3200/5000, Cost: 0.2013
Epoch 3300/5000, Cost: 0.2021
Epoch 3400/5000, Cost: 0.2012
Epoch 3500/5000, Cost: 0.2014
Epoch 3600/5000, Cost: 0.2018
Epoch 3700/5000, Cost: 0.2012
Epoch 3800/5000, Cost: 0.2014
Epoch 3900/5000, Cost: 0.2056
Epoch 4000/5000, Cost: 0.2012
Epoch 4100/5000, Cost: 0.2012
Epoch 4200/5000, Cost: 0.2068
Epoch 4300/5000, Cost: 0.2022
Epoch 4400/5000, Cost: 0.2012
Epoch 4500/5000, Cost: 0.2013
Epoch 4600/5000, Cost: 0.2030
Epoch 4700/5000, Cost: 0.2092
Epoch 4800

Epoch 600/10000, Cost: 0.2094
Epoch 700/10000, Cost: 0.2068
Epoch 800/10000, Cost: 0.2054
Epoch 900/10000, Cost: 0.2038
Epoch 1000/10000, Cost: 0.2024
Epoch 1100/10000, Cost: 0.2022
Epoch 1200/10000, Cost: 0.2029
Epoch 1300/10000, Cost: 0.2045
Epoch 1400/10000, Cost: 0.2082
Epoch 1500/10000, Cost: 0.2062
Epoch 1600/10000, Cost: 0.2023
Epoch 1700/10000, Cost: 0.2017
Epoch 1800/10000, Cost: 0.2027
Epoch 1900/10000, Cost: 0.2103
Epoch 2000/10000, Cost: 0.2024
Epoch 2100/10000, Cost: 0.2015
Epoch 2200/10000, Cost: 0.2019
Epoch 2300/10000, Cost: 0.2133
Epoch 2400/10000, Cost: 0.2026
Epoch 2500/10000, Cost: 0.2011
Epoch 2600/10000, Cost: 0.2024
Epoch 2700/10000, Cost: 0.2040
Epoch 2800/10000, Cost: 0.2025
Epoch 2900/10000, Cost: 0.2019
Epoch 3000/10000, Cost: 0.2028
Epoch 3100/10000, Cost: 0.2027
Epoch 3200/10000, Cost: 0.2022
Epoch 3300/10000, Cost: 0.2026
Epoch 3400/10000, Cost: 0.2022
Epoch 3500/10000, Cost: 0.2018
Epoch 3600/10000, Cost: 0.2026
Epoch 3700/10000, Cost: 0.2025
Epoch 3800/1

Epoch 6800/10000, Cost: 0.2020
Epoch 6900/10000, Cost: 0.2019
Epoch 7000/10000, Cost: 0.2021
Epoch 7100/10000, Cost: 0.2020
Epoch 7200/10000, Cost: 0.2019
Epoch 7300/10000, Cost: 0.2021
Epoch 7400/10000, Cost: 0.2020
Epoch 7500/10000, Cost: 0.2019
Epoch 7600/10000, Cost: 0.2021
Epoch 7700/10000, Cost: 0.2019
Epoch 7800/10000, Cost: 0.2022
Epoch 7900/10000, Cost: 0.2020
Epoch 8000/10000, Cost: 0.2019
Epoch 8100/10000, Cost: 0.2052
Epoch 8200/10000, Cost: 0.2020
Epoch 8300/10000, Cost: 0.2019
Epoch 8400/10000, Cost: 0.3418
Epoch 8500/10000, Cost: 0.2020
Epoch 8600/10000, Cost: 0.2019
Epoch 8700/10000, Cost: 0.2388
Epoch 8800/10000, Cost: 0.2020
Epoch 8900/10000, Cost: 0.2019
Epoch 9000/10000, Cost: 0.2022
Epoch 9100/10000, Cost: 0.2019
Epoch 9200/10000, Cost: 0.2019
Epoch 9300/10000, Cost: 0.2021
Epoch 9400/10000, Cost: 0.2019
Epoch 9500/10000, Cost: 0.2019
Epoch 9600/10000, Cost: 0.2021
Epoch 9700/10000, Cost: 0.2019
Epoch 9800/10000, Cost: 0.2019
Epoch 9900/10000, Cost: 0.2020
Train Ac

Train Accuracy: 50.12%

Training with hidden_size=5, learning_rate=1, reg_strength=0.001, num_epochs=3000
Epoch 0/3000, Cost: 0.6977
Epoch 100/3000, Cost: 0.6931
Epoch 200/3000, Cost: 0.6931
Epoch 300/3000, Cost: 0.6931
Epoch 400/3000, Cost: 0.6931
Epoch 500/3000, Cost: 0.6931
Epoch 600/3000, Cost: 0.6931
Epoch 700/3000, Cost: 0.6931
Epoch 800/3000, Cost: 0.6931
Epoch 900/3000, Cost: 0.6931
Epoch 1000/3000, Cost: 0.6931
Epoch 1100/3000, Cost: 0.6931
Epoch 1200/3000, Cost: 0.6931
Epoch 1300/3000, Cost: 0.6931
Epoch 1400/3000, Cost: 0.6931
Epoch 1500/3000, Cost: 0.6931
Epoch 1600/3000, Cost: 0.6931
Epoch 1700/3000, Cost: 0.6931
Epoch 1800/3000, Cost: 0.6931
Epoch 1900/3000, Cost: 0.6931
Epoch 2000/3000, Cost: 0.6931
Epoch 2100/3000, Cost: 0.6931
Epoch 2200/3000, Cost: 0.3646
Epoch 2300/3000, Cost: 0.2606
Epoch 2400/3000, Cost: 0.2557
Epoch 2500/3000, Cost: 0.2569
Epoch 2600/3000, Cost: 0.2575
Epoch 2700/3000, Cost: 0.2589
Epoch 2800/3000, Cost: 0.2539
Epoch 2900/3000, Cost: 0.2504
Train 

Epoch 2500/5000, Cost: 0.2490
Epoch 2600/5000, Cost: 0.2493
Epoch 2700/5000, Cost: 0.2492
Epoch 2800/5000, Cost: 0.2490
Epoch 2900/5000, Cost: 0.2491
Epoch 3000/5000, Cost: 0.2491
Epoch 3100/5000, Cost: 0.2488
Epoch 3200/5000, Cost: 0.2486
Epoch 3300/5000, Cost: 0.2486
Epoch 3400/5000, Cost: 0.2486
Epoch 3500/5000, Cost: 0.2486
Epoch 3600/5000, Cost: 0.2486
Epoch 3700/5000, Cost: 0.2486
Epoch 3800/5000, Cost: 0.2486
Epoch 3900/5000, Cost: 0.2486
Epoch 4000/5000, Cost: 0.2484
Epoch 4100/5000, Cost: 0.2484
Epoch 4200/5000, Cost: 0.2484
Epoch 4300/5000, Cost: 0.2484
Epoch 4400/5000, Cost: 0.2484
Epoch 4500/5000, Cost: 0.2484
Epoch 4600/5000, Cost: 0.2484
Epoch 4700/5000, Cost: 0.2484
Epoch 4800/5000, Cost: 0.2484
Epoch 4900/5000, Cost: 0.2484
Train Accuracy: 90.50%

Training with hidden_size=5, learning_rate=1, reg_strength=0.01, num_epochs=10000
Epoch 0/10000, Cost: 0.6903
Epoch 100/10000, Cost: 0.6931
Epoch 200/10000, Cost: 0.6931
Epoch 300/10000, Cost: 0.6931
Epoch 400/10000, Cost: 0.6

Epoch 3600/10000, Cost: 0.2314
Epoch 3700/10000, Cost: 0.2314
Epoch 3800/10000, Cost: 0.2314
Epoch 3900/10000, Cost: 0.2314
Epoch 4000/10000, Cost: 0.2314
Epoch 4100/10000, Cost: 0.2314
Epoch 4200/10000, Cost: 0.2314
Epoch 4300/10000, Cost: 0.2314
Epoch 4400/10000, Cost: 0.2314
Epoch 4500/10000, Cost: 0.2314
Epoch 4600/10000, Cost: 0.2314
Epoch 4700/10000, Cost: 0.2314
Epoch 4800/10000, Cost: 0.2314
Epoch 4900/10000, Cost: 0.2314
Epoch 5000/10000, Cost: 0.2314
Epoch 5100/10000, Cost: 0.2314
Epoch 5200/10000, Cost: 0.2314
Epoch 5300/10000, Cost: 0.2314
Epoch 5400/10000, Cost: 0.2314
Epoch 5500/10000, Cost: 0.2314
Epoch 5600/10000, Cost: 0.2314
Epoch 5700/10000, Cost: 0.2314
Epoch 5800/10000, Cost: 0.2314
Epoch 5900/10000, Cost: 0.2314
Epoch 6000/10000, Cost: 0.2314
Epoch 6100/10000, Cost: 0.2314
Epoch 6200/10000, Cost: 0.2314
Epoch 6300/10000, Cost: 0.2314
Epoch 6400/10000, Cost: 0.2314
Epoch 6500/10000, Cost: 0.2314
Epoch 6600/10000, Cost: 0.2314
Epoch 6700/10000, Cost: 0.2314
Epoch 68

Epoch 8600/10000, Cost: 0.6931
Epoch 8700/10000, Cost: 0.6931
Epoch 8800/10000, Cost: 0.6931
Epoch 8900/10000, Cost: 0.6931
Epoch 9000/10000, Cost: 0.6931
Epoch 9100/10000, Cost: 0.6931
Epoch 9200/10000, Cost: 0.6931
Epoch 9300/10000, Cost: 0.6931
Epoch 9400/10000, Cost: 0.6931
Epoch 9500/10000, Cost: 0.6931
Epoch 9600/10000, Cost: 0.6931
Epoch 9700/10000, Cost: 0.6931
Epoch 9800/10000, Cost: 0.6931
Epoch 9900/10000, Cost: 0.6931
Train Accuracy: 50.00%

Training with hidden_size=10, learning_rate=0.01, reg_strength=0, num_epochs=1000
Epoch 0/1000, Cost: 0.6978
Epoch 100/1000, Cost: 0.6884
Epoch 200/1000, Cost: 0.6798
Epoch 300/1000, Cost: 0.6656
Epoch 400/1000, Cost: 0.6408
Epoch 500/1000, Cost: 0.6005
Epoch 600/1000, Cost: 0.5426
Epoch 700/1000, Cost: 0.4736
Epoch 800/1000, Cost: 0.4066
Epoch 900/1000, Cost: 0.3521
Train Accuracy: 90.88%

Training with hidden_size=10, learning_rate=0.01, reg_strength=0, num_epochs=2000
Epoch 0/2000, Cost: 0.6953
Epoch 100/2000, Cost: 0.6835
Epoch 200/

Epoch 1200/3000, Cost: 0.2489
Epoch 1300/3000, Cost: 0.2393
Epoch 1400/3000, Cost: 0.2322
Epoch 1500/3000, Cost: 0.2268
Epoch 1600/3000, Cost: 0.2227
Epoch 1700/3000, Cost: 0.2194
Epoch 1800/3000, Cost: 0.2169
Epoch 1900/3000, Cost: 0.2148
Epoch 2000/3000, Cost: 0.2132
Epoch 2100/3000, Cost: 0.2118
Epoch 2200/3000, Cost: 0.2107
Epoch 2300/3000, Cost: 0.2097
Epoch 2400/3000, Cost: 0.2089
Epoch 2500/3000, Cost: 0.2082
Epoch 2600/3000, Cost: 0.2076
Epoch 2700/3000, Cost: 0.2071
Epoch 2800/3000, Cost: 0.2067
Epoch 2900/3000, Cost: 0.2063
Train Accuracy: 91.00%

Training with hidden_size=10, learning_rate=0.01, reg_strength=0.001, num_epochs=5000
Epoch 0/5000, Cost: 0.7024
Epoch 100/5000, Cost: 0.6800
Epoch 200/5000, Cost: 0.6639
Epoch 300/5000, Cost: 0.6396
Epoch 400/5000, Cost: 0.6043
Epoch 500/5000, Cost: 0.5571
Epoch 600/5000, Cost: 0.5009
Epoch 700/5000, Cost: 0.4414
Epoch 800/5000, Cost: 0.3861
Epoch 900/5000, Cost: 0.3407
Epoch 1000/5000, Cost: 0.3062
Epoch 1100/5000, Cost: 0.2811
Ep

Epoch 3000/5000, Cost: 0.2054
Epoch 3100/5000, Cost: 0.2051
Epoch 3200/5000, Cost: 0.2048
Epoch 3300/5000, Cost: 0.2046
Epoch 3400/5000, Cost: 0.2044
Epoch 3500/5000, Cost: 0.2042
Epoch 3600/5000, Cost: 0.2040
Epoch 3700/5000, Cost: 0.2039
Epoch 3800/5000, Cost: 0.2037
Epoch 3900/5000, Cost: 0.2036
Epoch 4000/5000, Cost: 0.2035
Epoch 4100/5000, Cost: 0.2034
Epoch 4200/5000, Cost: 0.2032
Epoch 4300/5000, Cost: 0.2031
Epoch 4400/5000, Cost: 0.2030
Epoch 4500/5000, Cost: 0.2029
Epoch 4600/5000, Cost: 0.2028
Epoch 4700/5000, Cost: 0.2028
Epoch 4800/5000, Cost: 0.2027
Epoch 4900/5000, Cost: 0.2026
Train Accuracy: 91.12%

Training with hidden_size=10, learning_rate=0.01, reg_strength=0.01, num_epochs=10000
Epoch 0/10000, Cost: 0.6921
Epoch 100/10000, Cost: 0.6795
Epoch 200/10000, Cost: 0.6627
Epoch 300/10000, Cost: 0.6357
Epoch 400/10000, Cost: 0.5938
Epoch 500/10000, Cost: 0.5358
Epoch 600/10000, Cost: 0.4681
Epoch 700/10000, Cost: 0.4031
Epoch 800/10000, Cost: 0.3501
Epoch 900/10000, Cost:

Epoch 3300/10000, Cost: 0.2049
Epoch 3400/10000, Cost: 0.2047
Epoch 3500/10000, Cost: 0.2045
Epoch 3600/10000, Cost: 0.2043
Epoch 3700/10000, Cost: 0.2041
Epoch 3800/10000, Cost: 0.2039
Epoch 3900/10000, Cost: 0.2038
Epoch 4000/10000, Cost: 0.2037
Epoch 4100/10000, Cost: 0.2035
Epoch 4200/10000, Cost: 0.2034
Epoch 4300/10000, Cost: 0.2033
Epoch 4400/10000, Cost: 0.2032
Epoch 4500/10000, Cost: 0.2031
Epoch 4600/10000, Cost: 0.2030
Epoch 4700/10000, Cost: 0.2029
Epoch 4800/10000, Cost: 0.2028
Epoch 4900/10000, Cost: 0.2027
Epoch 5000/10000, Cost: 0.2026
Epoch 5100/10000, Cost: 0.2026
Epoch 5200/10000, Cost: 0.2025
Epoch 5300/10000, Cost: 0.2024
Epoch 5400/10000, Cost: 0.2024
Epoch 5500/10000, Cost: 0.2023
Epoch 5600/10000, Cost: 0.2022
Epoch 5700/10000, Cost: 0.2022
Epoch 5800/10000, Cost: 0.2021
Epoch 5900/10000, Cost: 0.2021
Epoch 6000/10000, Cost: 0.2020
Epoch 6100/10000, Cost: 0.2020
Epoch 6200/10000, Cost: 0.2019
Epoch 6300/10000, Cost: 0.2019
Epoch 6400/10000, Cost: 0.2019
Epoch 65

Epoch 8200/10000, Cost: 0.2017
Epoch 8300/10000, Cost: 0.2016
Epoch 8400/10000, Cost: 0.2016
Epoch 8500/10000, Cost: 0.2016
Epoch 8600/10000, Cost: 0.2016
Epoch 8700/10000, Cost: 0.2016
Epoch 8800/10000, Cost: 0.2016
Epoch 8900/10000, Cost: 0.2015
Epoch 9000/10000, Cost: 0.2015
Epoch 9100/10000, Cost: 0.2015
Epoch 9200/10000, Cost: 0.2015
Epoch 9300/10000, Cost: 0.2015
Epoch 9400/10000, Cost: 0.2015
Epoch 9500/10000, Cost: 0.2015
Epoch 9600/10000, Cost: 0.2015
Epoch 9700/10000, Cost: 0.2014
Epoch 9800/10000, Cost: 0.2014
Epoch 9900/10000, Cost: 0.2014
Train Accuracy: 91.38%

Training with hidden_size=10, learning_rate=0.1, reg_strength=0, num_epochs=1000
Epoch 0/1000, Cost: 0.6989
Epoch 100/1000, Cost: 0.3520
Epoch 200/1000, Cost: 0.2190
Epoch 300/1000, Cost: 0.2077
Epoch 400/1000, Cost: 0.2047
Epoch 500/1000, Cost: 0.2032
Epoch 600/1000, Cost: 0.2024
Epoch 700/1000, Cost: 0.2019
Epoch 800/1000, Cost: 0.2015
Epoch 900/1000, Cost: 0.2013
Train Accuracy: 91.38%

Training with hidden_size

Epoch 1300/2000, Cost: 0.2009
Epoch 1400/2000, Cost: 0.2009
Epoch 1500/2000, Cost: 0.2009
Epoch 1600/2000, Cost: 0.2009
Epoch 1700/2000, Cost: 0.2009
Epoch 1800/2000, Cost: 0.2009
Epoch 1900/2000, Cost: 0.2009
Train Accuracy: 91.50%

Training with hidden_size=10, learning_rate=0.1, reg_strength=0.001, num_epochs=3000
Epoch 0/3000, Cost: 0.6857
Epoch 100/3000, Cost: 0.2572
Epoch 200/3000, Cost: 0.2107
Epoch 300/3000, Cost: 0.2051
Epoch 400/3000, Cost: 0.2034
Epoch 500/3000, Cost: 0.2025
Epoch 600/3000, Cost: 0.2020
Epoch 700/3000, Cost: 0.2016
Epoch 800/3000, Cost: 0.2014
Epoch 900/3000, Cost: 0.2012
Epoch 1000/3000, Cost: 0.2011
Epoch 1100/3000, Cost: 0.2010
Epoch 1200/3000, Cost: 0.2010
Epoch 1300/3000, Cost: 0.2010
Epoch 1400/3000, Cost: 0.2009
Epoch 1500/3000, Cost: 0.2009
Epoch 1600/3000, Cost: 0.2009
Epoch 1700/3000, Cost: 0.2009
Epoch 1800/3000, Cost: 0.2009
Epoch 1900/3000, Cost: 0.2009
Epoch 2000/3000, Cost: 0.2009
Epoch 2100/3000, Cost: 0.2009
Epoch 2200/3000, Cost: 0.2009
Epo

Epoch 1500/5000, Cost: 0.2009
Epoch 1600/5000, Cost: 0.2009
Epoch 1700/5000, Cost: 0.2009
Epoch 1800/5000, Cost: 0.2009
Epoch 1900/5000, Cost: 0.2009
Epoch 2000/5000, Cost: 0.2009
Epoch 2100/5000, Cost: 0.2009
Epoch 2200/5000, Cost: 0.2009
Epoch 2300/5000, Cost: 0.2009
Epoch 2400/5000, Cost: 0.2009
Epoch 2500/5000, Cost: 0.2009
Epoch 2600/5000, Cost: 0.2009
Epoch 2700/5000, Cost: 0.2008
Epoch 2800/5000, Cost: 0.2008
Epoch 2900/5000, Cost: 0.2008
Epoch 3000/5000, Cost: 0.2008
Epoch 3100/5000, Cost: 0.2008
Epoch 3200/5000, Cost: 0.2008
Epoch 3300/5000, Cost: 0.2008
Epoch 3400/5000, Cost: 0.2008
Epoch 3500/5000, Cost: 0.2008
Epoch 3600/5000, Cost: 0.2008
Epoch 3700/5000, Cost: 0.2008
Epoch 3800/5000, Cost: 0.2008
Epoch 3900/5000, Cost: 0.2008
Epoch 4000/5000, Cost: 0.2008
Epoch 4100/5000, Cost: 0.2008
Epoch 4200/5000, Cost: 0.2008
Epoch 4300/5000, Cost: 0.2008
Epoch 4400/5000, Cost: 0.2008
Epoch 4500/5000, Cost: 0.2008
Epoch 4600/5000, Cost: 0.2008
Epoch 4700/5000, Cost: 0.2008
Epoch 4800

Epoch 1800/10000, Cost: 0.2009
Epoch 1900/10000, Cost: 0.2009
Epoch 2000/10000, Cost: 0.2009
Epoch 2100/10000, Cost: 0.2009
Epoch 2200/10000, Cost: 0.2009
Epoch 2300/10000, Cost: 0.2009
Epoch 2400/10000, Cost: 0.2009
Epoch 2500/10000, Cost: 0.2009
Epoch 2600/10000, Cost: 0.2009
Epoch 2700/10000, Cost: 0.2009
Epoch 2800/10000, Cost: 0.2009
Epoch 2900/10000, Cost: 0.2009
Epoch 3000/10000, Cost: 0.2009
Epoch 3100/10000, Cost: 0.2009
Epoch 3200/10000, Cost: 0.2009
Epoch 3300/10000, Cost: 0.2009
Epoch 3400/10000, Cost: 0.2009
Epoch 3500/10000, Cost: 0.2009
Epoch 3600/10000, Cost: 0.2009
Epoch 3700/10000, Cost: 0.2009
Epoch 3800/10000, Cost: 0.2009
Epoch 3900/10000, Cost: 0.2009
Epoch 4000/10000, Cost: 0.2009
Epoch 4100/10000, Cost: 0.2009
Epoch 4200/10000, Cost: 0.2009
Epoch 4300/10000, Cost: 0.2009
Epoch 4400/10000, Cost: 0.2009
Epoch 4500/10000, Cost: 0.2009
Epoch 4600/10000, Cost: 0.2009
Epoch 4700/10000, Cost: 0.2009
Epoch 4800/10000, Cost: 0.2009
Epoch 4900/10000, Cost: 0.2009
Epoch 50

Epoch 7800/10000, Cost: 0.2010
Epoch 7900/10000, Cost: 0.2010
Epoch 8000/10000, Cost: 0.2010
Epoch 8100/10000, Cost: 0.2010
Epoch 8200/10000, Cost: 0.2010
Epoch 8300/10000, Cost: 0.2010
Epoch 8400/10000, Cost: 0.2010
Epoch 8500/10000, Cost: 0.2010
Epoch 8600/10000, Cost: 0.2010
Epoch 8700/10000, Cost: 0.2010
Epoch 8800/10000, Cost: 0.2010
Epoch 8900/10000, Cost: 0.2010
Epoch 9000/10000, Cost: 0.2010
Epoch 9100/10000, Cost: 0.2010
Epoch 9200/10000, Cost: 0.2010
Epoch 9300/10000, Cost: 0.2010
Epoch 9400/10000, Cost: 0.2010
Epoch 9500/10000, Cost: 0.2010
Epoch 9600/10000, Cost: 0.2010
Epoch 9700/10000, Cost: 0.2010
Epoch 9800/10000, Cost: 0.2010
Epoch 9900/10000, Cost: 0.2010
Train Accuracy: 91.50%

Training with hidden_size=10, learning_rate=0.5, reg_strength=0, num_epochs=1000
Epoch 0/1000, Cost: 0.6889
Epoch 100/1000, Cost: 0.2999
Epoch 200/1000, Cost: 0.2547
Epoch 300/1000, Cost: 0.2292
Epoch 400/1000, Cost: 0.2324
Epoch 500/1000, Cost: 0.2174
Epoch 600/1000, Cost: 0.2118
Epoch 700/10

Epoch 300/3000, Cost: 0.2031
Epoch 400/3000, Cost: 0.2088
Epoch 500/3000, Cost: 0.2416
Epoch 600/3000, Cost: 0.2330
Epoch 700/3000, Cost: 0.2072
Epoch 800/3000, Cost: 0.2024
Epoch 900/3000, Cost: 0.2019
Epoch 1000/3000, Cost: 0.2017
Epoch 1100/3000, Cost: 0.2019
Epoch 1200/3000, Cost: 0.2019
Epoch 1300/3000, Cost: 0.2021
Epoch 1400/3000, Cost: 0.2025
Epoch 1500/3000, Cost: 0.2024
Epoch 1600/3000, Cost: 0.2022
Epoch 1700/3000, Cost: 0.2025
Epoch 1800/3000, Cost: 0.2025
Epoch 1900/3000, Cost: 0.2023
Epoch 2000/3000, Cost: 0.2020
Epoch 2100/3000, Cost: 0.2018
Epoch 2200/3000, Cost: 0.2011
Epoch 2300/3000, Cost: 0.2006
Epoch 2400/3000, Cost: 0.2009
Epoch 2500/3000, Cost: 0.2010
Epoch 2600/3000, Cost: 0.2005
Epoch 2700/3000, Cost: 0.2000
Epoch 2800/3000, Cost: 0.1998
Epoch 2900/3000, Cost: 0.1999
Train Accuracy: 91.38%

Training with hidden_size=10, learning_rate=0.5, reg_strength=0.001, num_epochs=5000
Epoch 0/5000, Cost: 0.6862
Epoch 100/5000, Cost: 0.2875
Epoch 200/5000, Cost: 0.2404
Epo

Epoch 1900/5000, Cost: 0.2022
Epoch 2000/5000, Cost: 0.2158
Epoch 2100/5000, Cost: 0.2023
Epoch 2200/5000, Cost: 0.2020
Epoch 2300/5000, Cost: 0.2019
Epoch 2400/5000, Cost: 0.2019
Epoch 2500/5000, Cost: 0.2088
Epoch 2600/5000, Cost: 0.2021
Epoch 2700/5000, Cost: 0.2017
Epoch 2800/5000, Cost: 0.2017
Epoch 2900/5000, Cost: 0.2016
Epoch 3000/5000, Cost: 0.2016
Epoch 3100/5000, Cost: 0.2017
Epoch 3200/5000, Cost: 0.2168
Epoch 3300/5000, Cost: 0.2016
Epoch 3400/5000, Cost: 0.2015
Epoch 3500/5000, Cost: 0.2015
Epoch 3600/5000, Cost: 0.2014
Epoch 3700/5000, Cost: 0.2014
Epoch 3800/5000, Cost: 0.2014
Epoch 3900/5000, Cost: 0.2014
Epoch 4000/5000, Cost: 0.2052
Epoch 4100/5000, Cost: 0.2015
Epoch 4200/5000, Cost: 0.2014
Epoch 4300/5000, Cost: 0.2013
Epoch 4400/5000, Cost: 0.2013
Epoch 4500/5000, Cost: 0.2013
Epoch 4600/5000, Cost: 0.2013
Epoch 4700/5000, Cost: 0.2012
Epoch 4800/5000, Cost: 0.2012
Epoch 4900/5000, Cost: 0.2014
Train Accuracy: 90.88%

Training with hidden_size=10, learning_rate=0.

Epoch 2100/10000, Cost: 0.2040
Epoch 2200/10000, Cost: 0.2022
Epoch 2300/10000, Cost: 0.2022
Epoch 2400/10000, Cost: 0.2036
Epoch 2500/10000, Cost: 0.2035
Epoch 2600/10000, Cost: 0.2029
Epoch 2700/10000, Cost: 0.2029
Epoch 2800/10000, Cost: 0.2029
Epoch 2900/10000, Cost: 0.2019
Epoch 3000/10000, Cost: 0.2022
Epoch 3100/10000, Cost: 0.2023
Epoch 3200/10000, Cost: 0.2026
Epoch 3300/10000, Cost: 0.2025
Epoch 3400/10000, Cost: 0.2026
Epoch 3500/10000, Cost: 0.2023
Epoch 3600/10000, Cost: 0.2024
Epoch 3700/10000, Cost: 0.2022
Epoch 3800/10000, Cost: 0.2022
Epoch 3900/10000, Cost: 0.2021
Epoch 4000/10000, Cost: 0.2020
Epoch 4100/10000, Cost: 0.2019
Epoch 4200/10000, Cost: 0.2019
Epoch 4300/10000, Cost: 0.2018
Epoch 4400/10000, Cost: 0.2016
Epoch 4500/10000, Cost: 0.2016
Epoch 4600/10000, Cost: 0.2015
Epoch 4700/10000, Cost: 0.2013
Epoch 4800/10000, Cost: 0.2011
Epoch 4900/10000, Cost: 0.2011
Epoch 5000/10000, Cost: 0.2008
Epoch 5100/10000, Cost: 0.2007
Epoch 5200/10000, Cost: 0.2003
Epoch 53

Epoch 6800/10000, Cost: 0.2134
Epoch 6900/10000, Cost: 0.2129
Epoch 7000/10000, Cost: 0.2130
Epoch 7100/10000, Cost: 0.2131
Epoch 7200/10000, Cost: 0.2131
Epoch 7300/10000, Cost: 0.2131
Epoch 7400/10000, Cost: 0.2131
Epoch 7500/10000, Cost: 0.2130
Epoch 7600/10000, Cost: 0.2130
Epoch 7700/10000, Cost: 0.2130
Epoch 7800/10000, Cost: 0.2130
Epoch 7900/10000, Cost: 0.2130
Epoch 8000/10000, Cost: 0.2130
Epoch 8100/10000, Cost: 0.2130
Epoch 8200/10000, Cost: 0.2128
Epoch 8300/10000, Cost: 0.2127
Epoch 8400/10000, Cost: 0.2129
Epoch 8500/10000, Cost: 0.2141
Epoch 8600/10000, Cost: 0.2121
Epoch 8700/10000, Cost: 0.2131
Epoch 8800/10000, Cost: 0.2137
Epoch 8900/10000, Cost: 0.2116
Epoch 9000/10000, Cost: 0.2149
Epoch 9100/10000, Cost: 0.2115
Epoch 9200/10000, Cost: 0.2138
Epoch 9300/10000, Cost: 0.2129
Epoch 9400/10000, Cost: 0.2117
Epoch 9500/10000, Cost: 0.2151
Epoch 9600/10000, Cost: 0.2110
Epoch 9700/10000, Cost: 0.2138
Epoch 9800/10000, Cost: 0.2134
Epoch 9900/10000, Cost: 0.2112
Train Ac

Epoch 1300/2000, Cost: 0.2162
Epoch 1400/2000, Cost: 0.2225
Epoch 1500/2000, Cost: 0.2182
Epoch 1600/2000, Cost: 0.2276
Epoch 1700/2000, Cost: 0.2232
Epoch 1800/2000, Cost: 0.2192
Epoch 1900/2000, Cost: 0.2203
Train Accuracy: 91.62%

Training with hidden_size=10, learning_rate=1, reg_strength=0.001, num_epochs=3000
Epoch 0/3000, Cost: 0.6957
Epoch 100/3000, Cost: 0.6931
Epoch 200/3000, Cost: 0.6931
Epoch 300/3000, Cost: 0.6931
Epoch 400/3000, Cost: 0.6915
Epoch 500/3000, Cost: 0.3102
Epoch 600/3000, Cost: 0.2365
Epoch 700/3000, Cost: 0.2279
Epoch 800/3000, Cost: 0.2355
Epoch 900/3000, Cost: 0.2289
Epoch 1000/3000, Cost: 0.2300
Epoch 1100/3000, Cost: 0.2284
Epoch 1200/3000, Cost: 0.2278
Epoch 1300/3000, Cost: 0.2291
Epoch 1400/3000, Cost: 0.2271
Epoch 1500/3000, Cost: 0.2275
Epoch 1600/3000, Cost: 0.2294
Epoch 1700/3000, Cost: 0.2280
Epoch 1800/3000, Cost: 0.2266
Epoch 1900/3000, Cost: 0.2263
Epoch 2000/3000, Cost: 0.2262
Epoch 2100/3000, Cost: 0.2261
Epoch 2200/3000, Cost: 0.2276
Epoch

Epoch 400/5000, Cost: 0.6931
Epoch 500/5000, Cost: 0.6931
Epoch 600/5000, Cost: 0.6931
Epoch 700/5000, Cost: 0.6931
Epoch 800/5000, Cost: 0.6931
Epoch 900/5000, Cost: 0.6845
Epoch 1000/5000, Cost: 0.6931
Epoch 1100/5000, Cost: 0.6931
Epoch 1200/5000, Cost: 0.6931
Epoch 1300/5000, Cost: 0.6931
Epoch 1400/5000, Cost: 0.6931
Epoch 1500/5000, Cost: 0.6931
Epoch 1600/5000, Cost: 0.6931
Epoch 1700/5000, Cost: 0.6931
Epoch 1800/5000, Cost: 0.6931
Epoch 1900/5000, Cost: 0.6931
Epoch 2000/5000, Cost: 0.6931
Epoch 2100/5000, Cost: 0.6931
Epoch 2200/5000, Cost: 0.6931
Epoch 2300/5000, Cost: 0.6931
Epoch 2400/5000, Cost: 0.6931
Epoch 2500/5000, Cost: 0.6931
Epoch 2600/5000, Cost: 0.6931
Epoch 2700/5000, Cost: 0.6931
Epoch 2800/5000, Cost: 0.6931
Epoch 2900/5000, Cost: 0.6931
Epoch 3000/5000, Cost: 0.6931
Epoch 3100/5000, Cost: 0.6931
Epoch 3200/5000, Cost: 0.6931
Epoch 3300/5000, Cost: 0.6931
Epoch 3400/5000, Cost: 0.6931
Epoch 3500/5000, Cost: 0.6931
Epoch 3600/5000, Cost: 0.6931
Epoch 3700/5000,

Epoch 500/10000, Cost: 0.6930
Epoch 600/10000, Cost: 0.6929
Epoch 700/10000, Cost: 0.6929
Epoch 800/10000, Cost: 0.6929
Epoch 900/10000, Cost: 0.6929
Epoch 1000/10000, Cost: 0.6930
Epoch 1100/10000, Cost: 0.6930
Epoch 1200/10000, Cost: 0.6930
Epoch 1300/10000, Cost: 0.6930
Epoch 1400/10000, Cost: 0.6929
Epoch 1500/10000, Cost: 0.6929
Epoch 1600/10000, Cost: 0.6929
Epoch 1700/10000, Cost: 0.6928
Epoch 1800/10000, Cost: 0.6928
Epoch 1900/10000, Cost: 0.6928
Epoch 2000/10000, Cost: 0.6928
Epoch 2100/10000, Cost: 0.6927
Epoch 2200/10000, Cost: 0.6926
Epoch 2300/10000, Cost: 0.6771
Epoch 2400/10000, Cost: 0.2803
Epoch 2500/10000, Cost: 0.2639
Epoch 2600/10000, Cost: 0.2535
Epoch 2700/10000, Cost: 0.2559
Epoch 2800/10000, Cost: 0.2592
Epoch 2900/10000, Cost: 0.2595
Epoch 3000/10000, Cost: 0.2567
Epoch 3100/10000, Cost: 0.2534
Epoch 3200/10000, Cost: 0.2555
Epoch 3300/10000, Cost: 0.2535
Epoch 3400/10000, Cost: 0.2530
Epoch 3500/10000, Cost: 0.2508
Epoch 3600/10000, Cost: 0.2511
Epoch 3700/10

Epoch 5000/10000, Cost: 0.2850
Epoch 5100/10000, Cost: 0.2598
Epoch 5200/10000, Cost: 0.2247
Epoch 5300/10000, Cost: 0.2355
Epoch 5400/10000, Cost: 0.2298
Epoch 5500/10000, Cost: 0.2398
Epoch 5600/10000, Cost: 0.2734
Epoch 5700/10000, Cost: 0.2869
Epoch 5800/10000, Cost: 0.2902
Epoch 5900/10000, Cost: 0.2718
Epoch 6000/10000, Cost: 0.2294
Epoch 6100/10000, Cost: 0.2323
Epoch 6200/10000, Cost: 0.2287
Epoch 6300/10000, Cost: 0.2403
Epoch 6400/10000, Cost: 0.2658
Epoch 6500/10000, Cost: 0.2901
Epoch 6600/10000, Cost: 0.3178
Epoch 6700/10000, Cost: 0.2817
Epoch 6800/10000, Cost: 0.2573
Epoch 6900/10000, Cost: 0.2277
Epoch 7000/10000, Cost: 0.2247
Epoch 7100/10000, Cost: 0.2352
Epoch 7200/10000, Cost: 0.2416
Epoch 7300/10000, Cost: 0.2667
Epoch 7400/10000, Cost: 0.2857
Epoch 7500/10000, Cost: 0.3337
Epoch 7600/10000, Cost: 0.2880
Epoch 7700/10000, Cost: 0.2496
Epoch 7800/10000, Cost: 0.2278
Epoch 7900/10000, Cost: 0.2362
Epoch 8000/10000, Cost: 0.2504
Epoch 8100/10000, Cost: 0.2728
Epoch 82

Epoch 9600/10000, Cost: 0.2011
Epoch 9700/10000, Cost: 0.2011
Epoch 9800/10000, Cost: 0.2011
Epoch 9900/10000, Cost: 0.2011
Train Accuracy: 91.38%

Training with hidden_size=15, learning_rate=0.01, reg_strength=0.001, num_epochs=1000
Epoch 0/1000, Cost: 0.6963
Epoch 100/1000, Cost: 0.6653
Epoch 200/1000, Cost: 0.6374
Epoch 300/1000, Cost: 0.5934
Epoch 400/1000, Cost: 0.5323
Epoch 500/1000, Cost: 0.4625
Epoch 600/1000, Cost: 0.3972
Epoch 700/1000, Cost: 0.3452
Epoch 800/1000, Cost: 0.3073
Epoch 900/1000, Cost: 0.2807
Train Accuracy: 90.62%

Training with hidden_size=15, learning_rate=0.01, reg_strength=0.001, num_epochs=2000
Epoch 0/2000, Cost: 0.6983
Epoch 100/2000, Cost: 0.6826
Epoch 200/2000, Cost: 0.6668
Epoch 300/2000, Cost: 0.6432
Epoch 400/2000, Cost: 0.6069
Epoch 500/2000, Cost: 0.5566
Epoch 600/2000, Cost: 0.4950
Epoch 700/2000, Cost: 0.4304
Epoch 800/2000, Cost: 0.3733
Epoch 900/2000, Cost: 0.3288
Epoch 1000/2000, Cost: 0.2963
Epoch 1100/2000, Cost: 0.2733
Epoch 1200/2000, Cos

Epoch 1400/3000, Cost: 0.2367
Epoch 1500/3000, Cost: 0.2300
Epoch 1600/3000, Cost: 0.2249
Epoch 1700/3000, Cost: 0.2210
Epoch 1800/3000, Cost: 0.2179
Epoch 1900/3000, Cost: 0.2155
Epoch 2000/3000, Cost: 0.2136
Epoch 2100/3000, Cost: 0.2120
Epoch 2200/3000, Cost: 0.2107
Epoch 2300/3000, Cost: 0.2097
Epoch 2400/3000, Cost: 0.2088
Epoch 2500/3000, Cost: 0.2081
Epoch 2600/3000, Cost: 0.2074
Epoch 2700/3000, Cost: 0.2069
Epoch 2800/3000, Cost: 0.2064
Epoch 2900/3000, Cost: 0.2060
Train Accuracy: 91.00%

Training with hidden_size=15, learning_rate=0.01, reg_strength=0.01, num_epochs=5000
Epoch 0/5000, Cost: 0.6989
Epoch 100/5000, Cost: 0.6610
Epoch 200/5000, Cost: 0.6329
Epoch 300/5000, Cost: 0.5898
Epoch 400/5000, Cost: 0.5303
Epoch 500/5000, Cost: 0.4619
Epoch 600/5000, Cost: 0.3974
Epoch 700/5000, Cost: 0.3455
Epoch 800/5000, Cost: 0.3076
Epoch 900/5000, Cost: 0.2807
Epoch 1000/5000, Cost: 0.2618
Epoch 1100/5000, Cost: 0.2484
Epoch 1200/5000, Cost: 0.2386
Epoch 1300/5000, Cost: 0.2314
Epo

Epoch 3100/5000, Cost: 0.2058
Epoch 3200/5000, Cost: 0.2055
Epoch 3300/5000, Cost: 0.2052
Epoch 3400/5000, Cost: 0.2050
Epoch 3500/5000, Cost: 0.2047
Epoch 3600/5000, Cost: 0.2045
Epoch 3700/5000, Cost: 0.2043
Epoch 3800/5000, Cost: 0.2042
Epoch 3900/5000, Cost: 0.2040
Epoch 4000/5000, Cost: 0.2039
Epoch 4100/5000, Cost: 0.2037
Epoch 4200/5000, Cost: 0.2036
Epoch 4300/5000, Cost: 0.2035
Epoch 4400/5000, Cost: 0.2033
Epoch 4500/5000, Cost: 0.2032
Epoch 4600/5000, Cost: 0.2031
Epoch 4700/5000, Cost: 0.2030
Epoch 4800/5000, Cost: 0.2029
Epoch 4900/5000, Cost: 0.2028
Train Accuracy: 91.25%

Training with hidden_size=15, learning_rate=0.01, reg_strength=0.1, num_epochs=10000
Epoch 0/10000, Cost: 0.6936
Epoch 100/10000, Cost: 0.6741
Epoch 200/10000, Cost: 0.6507
Epoch 300/10000, Cost: 0.6145
Epoch 400/10000, Cost: 0.5610
Epoch 500/10000, Cost: 0.4936
Epoch 600/10000, Cost: 0.4246
Epoch 700/10000, Cost: 0.3662
Epoch 800/10000, Cost: 0.3224
Epoch 900/10000, Cost: 0.2913
Epoch 1000/10000, Cost:

Epoch 2600/10000, Cost: 0.2076
Epoch 2700/10000, Cost: 0.2071
Epoch 2800/10000, Cost: 0.2067
Epoch 2900/10000, Cost: 0.2063
Epoch 3000/10000, Cost: 0.2059
Epoch 3100/10000, Cost: 0.2056
Epoch 3200/10000, Cost: 0.2054
Epoch 3300/10000, Cost: 0.2051
Epoch 3400/10000, Cost: 0.2049
Epoch 3500/10000, Cost: 0.2047
Epoch 3600/10000, Cost: 0.2045
Epoch 3700/10000, Cost: 0.2043
Epoch 3800/10000, Cost: 0.2041
Epoch 3900/10000, Cost: 0.2040
Epoch 4000/10000, Cost: 0.2038
Epoch 4100/10000, Cost: 0.2037
Epoch 4200/10000, Cost: 0.2036
Epoch 4300/10000, Cost: 0.2035
Epoch 4400/10000, Cost: 0.2034
Epoch 4500/10000, Cost: 0.2033
Epoch 4600/10000, Cost: 0.2032
Epoch 4700/10000, Cost: 0.2031
Epoch 4800/10000, Cost: 0.2030
Epoch 4900/10000, Cost: 0.2029
Epoch 5000/10000, Cost: 0.2028
Epoch 5100/10000, Cost: 0.2028
Epoch 5200/10000, Cost: 0.2027
Epoch 5300/10000, Cost: 0.2026
Epoch 5400/10000, Cost: 0.2026
Epoch 5500/10000, Cost: 0.2025
Epoch 5600/10000, Cost: 0.2024
Epoch 5700/10000, Cost: 0.2024
Epoch 58

Epoch 8200/10000, Cost: 0.2008
Epoch 8300/10000, Cost: 0.2008
Epoch 8400/10000, Cost: 0.2008
Epoch 8500/10000, Cost: 0.2008
Epoch 8600/10000, Cost: 0.2008
Epoch 8700/10000, Cost: 0.2008
Epoch 8800/10000, Cost: 0.2008
Epoch 8900/10000, Cost: 0.2008
Epoch 9000/10000, Cost: 0.2008
Epoch 9100/10000, Cost: 0.2008
Epoch 9200/10000, Cost: 0.2008
Epoch 9300/10000, Cost: 0.2008
Epoch 9400/10000, Cost: 0.2008
Epoch 9500/10000, Cost: 0.2008
Epoch 9600/10000, Cost: 0.2008
Epoch 9700/10000, Cost: 0.2008
Epoch 9800/10000, Cost: 0.2008
Epoch 9900/10000, Cost: 0.2008
Train Accuracy: 91.50%

Training with hidden_size=15, learning_rate=0.1, reg_strength=0.001, num_epochs=1000
Epoch 0/1000, Cost: 0.6872
Epoch 100/1000, Cost: 0.2540
Epoch 200/1000, Cost: 0.2108
Epoch 300/1000, Cost: 0.2053
Epoch 400/1000, Cost: 0.2035
Epoch 500/1000, Cost: 0.2026
Epoch 600/1000, Cost: 0.2020
Epoch 700/1000, Cost: 0.2016
Epoch 800/1000, Cost: 0.2014
Epoch 900/1000, Cost: 0.2012
Train Accuracy: 91.38%

Training with hidden_

Epoch 300/3000, Cost: 0.2054
Epoch 400/3000, Cost: 0.2034
Epoch 500/3000, Cost: 0.2025
Epoch 600/3000, Cost: 0.2020
Epoch 700/3000, Cost: 0.2016
Epoch 800/3000, Cost: 0.2014
Epoch 900/3000, Cost: 0.2013
Epoch 1000/3000, Cost: 0.2012
Epoch 1100/3000, Cost: 0.2011
Epoch 1200/3000, Cost: 0.2010
Epoch 1300/3000, Cost: 0.2010
Epoch 1400/3000, Cost: 0.2010
Epoch 1500/3000, Cost: 0.2010
Epoch 1600/3000, Cost: 0.2010
Epoch 1700/3000, Cost: 0.2009
Epoch 1800/3000, Cost: 0.2009
Epoch 1900/3000, Cost: 0.2009
Epoch 2000/3000, Cost: 0.2009
Epoch 2100/3000, Cost: 0.2009
Epoch 2200/3000, Cost: 0.2009
Epoch 2300/3000, Cost: 0.2009
Epoch 2400/3000, Cost: 0.2009
Epoch 2500/3000, Cost: 0.2009
Epoch 2600/3000, Cost: 0.2009
Epoch 2700/3000, Cost: 0.2009
Epoch 2800/3000, Cost: 0.2009
Epoch 2900/3000, Cost: 0.2009
Train Accuracy: 91.50%

Training with hidden_size=15, learning_rate=0.1, reg_strength=0.01, num_epochs=5000
Epoch 0/5000, Cost: 0.7007
Epoch 100/5000, Cost: 0.4795
Epoch 200/5000, Cost: 0.2948
Epoc

Epoch 1900/5000, Cost: 0.2009
Epoch 2000/5000, Cost: 0.2009
Epoch 2100/5000, Cost: 0.2009
Epoch 2200/5000, Cost: 0.2009
Epoch 2300/5000, Cost: 0.2009
Epoch 2400/5000, Cost: 0.2009
Epoch 2500/5000, Cost: 0.2009
Epoch 2600/5000, Cost: 0.2009
Epoch 2700/5000, Cost: 0.2009
Epoch 2800/5000, Cost: 0.2008
Epoch 2900/5000, Cost: 0.2008
Epoch 3000/5000, Cost: 0.2008
Epoch 3100/5000, Cost: 0.2008
Epoch 3200/5000, Cost: 0.2008
Epoch 3300/5000, Cost: 0.2008
Epoch 3400/5000, Cost: 0.2008
Epoch 3500/5000, Cost: 0.2008
Epoch 3600/5000, Cost: 0.2008
Epoch 3700/5000, Cost: 0.2008
Epoch 3800/5000, Cost: 0.2008
Epoch 3900/5000, Cost: 0.2008
Epoch 4000/5000, Cost: 0.2008
Epoch 4100/5000, Cost: 0.2008
Epoch 4200/5000, Cost: 0.2008
Epoch 4300/5000, Cost: 0.2008
Epoch 4400/5000, Cost: 0.2008
Epoch 4500/5000, Cost: 0.2008
Epoch 4600/5000, Cost: 0.2008
Epoch 4700/5000, Cost: 0.2008
Epoch 4800/5000, Cost: 0.2008
Epoch 4900/5000, Cost: 0.2008
Train Accuracy: 91.50%

Training with hidden_size=15, learning_rate=0.

Epoch 1300/10000, Cost: 0.2011
Epoch 1400/10000, Cost: 0.2011
Epoch 1500/10000, Cost: 0.2011
Epoch 1600/10000, Cost: 0.2011
Epoch 1700/10000, Cost: 0.2011
Epoch 1800/10000, Cost: 0.2011
Epoch 1900/10000, Cost: 0.2011
Epoch 2000/10000, Cost: 0.2010
Epoch 2100/10000, Cost: 0.2010
Epoch 2200/10000, Cost: 0.2010
Epoch 2300/10000, Cost: 0.2010
Epoch 2400/10000, Cost: 0.2010
Epoch 2500/10000, Cost: 0.2010
Epoch 2600/10000, Cost: 0.2010
Epoch 2700/10000, Cost: 0.2010
Epoch 2800/10000, Cost: 0.2010
Epoch 2900/10000, Cost: 0.2010
Epoch 3000/10000, Cost: 0.2010
Epoch 3100/10000, Cost: 0.2010
Epoch 3200/10000, Cost: 0.2010
Epoch 3300/10000, Cost: 0.2010
Epoch 3400/10000, Cost: 0.2010
Epoch 3500/10000, Cost: 0.2010
Epoch 3600/10000, Cost: 0.2010
Epoch 3700/10000, Cost: 0.2010
Epoch 3800/10000, Cost: 0.2010
Epoch 3900/10000, Cost: 0.2010
Epoch 4000/10000, Cost: 0.2010
Epoch 4100/10000, Cost: 0.2010
Epoch 4200/10000, Cost: 0.2010
Epoch 4300/10000, Cost: 0.2010
Epoch 4400/10000, Cost: 0.2010
Epoch 45

Epoch 6200/10000, Cost: 0.2007
Epoch 6300/10000, Cost: 0.2015
Epoch 6400/10000, Cost: 0.2009
Epoch 6500/10000, Cost: 0.2007
Epoch 6600/10000, Cost: 0.2015
Epoch 6700/10000, Cost: 0.2006
Epoch 6800/10000, Cost: 0.2026
Epoch 6900/10000, Cost: 0.2005
Epoch 7000/10000, Cost: 0.2021
Epoch 7100/10000, Cost: 0.2006
Epoch 7200/10000, Cost: 0.2012
Epoch 7300/10000, Cost: 0.2008
Epoch 7400/10000, Cost: 0.2007
Epoch 7500/10000, Cost: 0.2012
Epoch 7600/10000, Cost: 0.2007
Epoch 7700/10000, Cost: 0.2011
Epoch 7800/10000, Cost: 0.2007
Epoch 7900/10000, Cost: 0.2010
Epoch 8000/10000, Cost: 0.2008
Epoch 8100/10000, Cost: 0.2008
Epoch 8200/10000, Cost: 0.2008
Epoch 8300/10000, Cost: 0.2008
Epoch 8400/10000, Cost: 0.2008
Epoch 8500/10000, Cost: 0.2007
Epoch 8600/10000, Cost: 0.2007
Epoch 8700/10000, Cost: 0.2007
Epoch 8800/10000, Cost: 0.2007
Epoch 8900/10000, Cost: 0.2007
Epoch 9000/10000, Cost: 0.2006
Epoch 9100/10000, Cost: 0.2006
Epoch 9200/10000, Cost: 0.2006
Epoch 9300/10000, Cost: 0.2006
Epoch 94

Epoch 400/2000, Cost: 0.2058
Epoch 500/2000, Cost: 0.2062
Epoch 600/2000, Cost: 0.2106
Epoch 700/2000, Cost: 0.2040
Epoch 800/2000, Cost: 0.2045
Epoch 900/2000, Cost: 0.2046
Epoch 1000/2000, Cost: 0.2059
Epoch 1100/2000, Cost: 0.2063
Epoch 1200/2000, Cost: 0.2032
Epoch 1300/2000, Cost: 0.2045
Epoch 1400/2000, Cost: 0.2042
Epoch 1500/2000, Cost: 0.2041
Epoch 1600/2000, Cost: 0.2035
Epoch 1700/2000, Cost: 0.2027
Epoch 1800/2000, Cost: 0.2039
Epoch 1900/2000, Cost: 0.2043
Train Accuracy: 91.38%

Training with hidden_size=15, learning_rate=0.5, reg_strength=0.01, num_epochs=3000
Epoch 0/3000, Cost: 0.7108
Epoch 100/3000, Cost: 0.6648
Epoch 200/3000, Cost: 0.2030
Epoch 300/3000, Cost: 0.2021
Epoch 400/3000, Cost: 0.2022
Epoch 500/3000, Cost: 0.2023
Epoch 600/3000, Cost: 0.2030
Epoch 700/3000, Cost: 0.2054
Epoch 800/3000, Cost: 0.2075
Epoch 900/3000, Cost: 0.2075
Epoch 1000/3000, Cost: 0.2057
Epoch 1100/3000, Cost: 0.2039
Epoch 1200/3000, Cost: 0.2037
Epoch 1300/3000, Cost: 0.2037
Epoch 1400

Epoch 2500/3000, Cost: 0.2015
Epoch 2600/3000, Cost: 0.2014
Epoch 2700/3000, Cost: 0.2014
Epoch 2800/3000, Cost: 0.2405
Epoch 2900/3000, Cost: 0.2014
Train Accuracy: 91.38%

Training with hidden_size=15, learning_rate=0.5, reg_strength=0.1, num_epochs=5000
Epoch 0/5000, Cost: 0.6888
Epoch 100/5000, Cost: 0.2429
Epoch 200/5000, Cost: 0.2149
Epoch 300/5000, Cost: 0.2112
Epoch 400/5000, Cost: 0.2096
Epoch 500/5000, Cost: 0.2119
Epoch 600/5000, Cost: 0.2077
Epoch 700/5000, Cost: 0.2072
Epoch 800/5000, Cost: 0.2072
Epoch 900/5000, Cost: 0.2060
Epoch 1000/5000, Cost: 0.2051
Epoch 1100/5000, Cost: 0.2046
Epoch 1200/5000, Cost: 0.2054
Epoch 1300/5000, Cost: 0.2068
Epoch 1400/5000, Cost: 0.2064
Epoch 1500/5000, Cost: 0.2050
Epoch 1600/5000, Cost: 0.2040
Epoch 1700/5000, Cost: 0.2041
Epoch 1800/5000, Cost: 0.2049
Epoch 1900/5000, Cost: 0.2050
Epoch 2000/5000, Cost: 0.2046
Epoch 2100/5000, Cost: 0.2045
Epoch 2200/5000, Cost: 0.2044
Epoch 2300/5000, Cost: 0.2043
Epoch 2400/5000, Cost: 0.2043
Epoch

Epoch 4800/5000, Cost: 0.2088
Epoch 4900/5000, Cost: 0.2135
Train Accuracy: 90.75%

Training with hidden_size=15, learning_rate=0.5, reg_strength=1, num_epochs=10000
Epoch 0/10000, Cost: 0.6912
Epoch 100/10000, Cost: 0.2132
Epoch 200/10000, Cost: 0.2075
Epoch 300/10000, Cost: 0.2050
Epoch 400/10000, Cost: 0.2075
Epoch 500/10000, Cost: 0.2147
Epoch 600/10000, Cost: 0.2140
Epoch 700/10000, Cost: 0.2110
Epoch 800/10000, Cost: 0.2122
Epoch 900/10000, Cost: 0.2115
Epoch 1000/10000, Cost: 0.2120
Epoch 1100/10000, Cost: 0.2113
Epoch 1200/10000, Cost: 0.2121
Epoch 1300/10000, Cost: 0.2082
Epoch 1400/10000, Cost: 0.2111
Epoch 1500/10000, Cost: 0.2105
Epoch 1600/10000, Cost: 0.2088
Epoch 1700/10000, Cost: 0.2146
Epoch 1800/10000, Cost: 0.2066
Epoch 1900/10000, Cost: 0.2182
Epoch 2000/10000, Cost: 0.2045
Epoch 2100/10000, Cost: 0.2171
Epoch 2200/10000, Cost: 0.2036
Epoch 2300/10000, Cost: 0.2201
Epoch 2400/10000, Cost: 0.2016
Epoch 2500/10000, Cost: 0.2258
Epoch 2600/10000, Cost: 0.2008
Epoch 270

Epoch 4000/10000, Cost: 0.2310
Epoch 4100/10000, Cost: 0.2310
Epoch 4200/10000, Cost: 0.2309
Epoch 4300/10000, Cost: 0.2309
Epoch 4400/10000, Cost: 0.2309
Epoch 4500/10000, Cost: 0.2309
Epoch 4600/10000, Cost: 0.2309
Epoch 4700/10000, Cost: 0.2309
Epoch 4800/10000, Cost: 0.2309
Epoch 4900/10000, Cost: 0.2309
Epoch 5000/10000, Cost: 0.2309
Epoch 5100/10000, Cost: 0.2309
Epoch 5200/10000, Cost: 0.2309
Epoch 5300/10000, Cost: 0.2309
Epoch 5400/10000, Cost: 0.2309
Epoch 5500/10000, Cost: 0.2309
Epoch 5600/10000, Cost: 0.2309
Epoch 5700/10000, Cost: 0.2309
Epoch 5800/10000, Cost: 0.2309
Epoch 5900/10000, Cost: 0.2309
Epoch 6000/10000, Cost: 0.2309
Epoch 6100/10000, Cost: 0.2309
Epoch 6200/10000, Cost: 0.2309
Epoch 6300/10000, Cost: 0.2309
Epoch 6400/10000, Cost: 0.2309
Epoch 6500/10000, Cost: 0.2309
Epoch 6600/10000, Cost: 0.2309
Epoch 6700/10000, Cost: 0.2309
Epoch 6800/10000, Cost: 0.2309
Epoch 6900/10000, Cost: 0.2309
Epoch 7000/10000, Cost: 0.2309
Epoch 7100/10000, Cost: 0.2309
Epoch 72

Epoch 9500/10000, Cost: 0.2285
Epoch 9600/10000, Cost: 0.2285
Epoch 9700/10000, Cost: 0.2285
Epoch 9800/10000, Cost: 0.2285
Epoch 9900/10000, Cost: 0.2285
Train Accuracy: 90.88%

Training with hidden_size=15, learning_rate=1, reg_strength=0.01, num_epochs=1000
Epoch 0/1000, Cost: 0.6817
Epoch 100/1000, Cost: 0.6931
Epoch 200/1000, Cost: 0.6931
Epoch 300/1000, Cost: 0.6931
Epoch 400/1000, Cost: 0.6931
Epoch 500/1000, Cost: 0.6930
Epoch 600/1000, Cost: 0.6930
Epoch 700/1000, Cost: 0.6930
Epoch 800/1000, Cost: 0.6929
Epoch 900/1000, Cost: 0.6929
Train Accuracy: 50.12%

Training with hidden_size=15, learning_rate=1, reg_strength=0.01, num_epochs=2000
Epoch 0/2000, Cost: 0.7369
Epoch 100/2000, Cost: 0.6932
Epoch 200/2000, Cost: 0.6931
Epoch 300/2000, Cost: 0.6931
Epoch 400/2000, Cost: 0.6931
Epoch 500/2000, Cost: 0.6931
Epoch 600/2000, Cost: 0.6931
Epoch 700/2000, Cost: 0.6931
Epoch 800/2000, Cost: 0.6931
Epoch 900/2000, Cost: 0.6931
Epoch 1000/2000, Cost: 0.6931
Epoch 1100/2000, Cost: 0.69

Epoch 500/3000, Cost: 0.2328
Epoch 600/3000, Cost: 0.2283
Epoch 700/3000, Cost: 0.2337
Epoch 800/3000, Cost: 0.2320
Epoch 900/3000, Cost: 0.2334
Epoch 1000/3000, Cost: 0.2299
Epoch 1100/3000, Cost: 0.2321
Epoch 1200/3000, Cost: 0.2317
Epoch 1300/3000, Cost: 0.2315
Epoch 1400/3000, Cost: 0.2314
Epoch 1500/3000, Cost: 0.2298
Epoch 1600/3000, Cost: 0.2307
Epoch 1700/3000, Cost: 0.2294
Epoch 1800/3000, Cost: 0.2307
Epoch 1900/3000, Cost: 0.2291
Epoch 2000/3000, Cost: 0.2330
Epoch 2100/3000, Cost: 0.2310
Epoch 2200/3000, Cost: 0.2309
Epoch 2300/3000, Cost: 0.2316
Epoch 2400/3000, Cost: 0.2301
Epoch 2500/3000, Cost: 0.2294
Epoch 2600/3000, Cost: 0.2283
Epoch 2700/3000, Cost: 0.2291
Epoch 2800/3000, Cost: 0.2288
Epoch 2900/3000, Cost: 0.2288
Train Accuracy: 90.75%

Training with hidden_size=15, learning_rate=1, reg_strength=0.1, num_epochs=5000
Epoch 0/5000, Cost: 0.7154
Epoch 100/5000, Cost: 0.3536
Epoch 200/5000, Cost: 0.2596
Epoch 300/5000, Cost: 0.2528
Epoch 400/5000, Cost: 0.2552
Epoch 5

Epoch 1800/5000, Cost: 0.2667
Epoch 1900/5000, Cost: 0.2620
Epoch 2000/5000, Cost: 0.2729
Epoch 2100/5000, Cost: 0.2628
Epoch 2200/5000, Cost: 0.2746
Epoch 2300/5000, Cost: 0.2592
Epoch 2400/5000, Cost: 0.2644
Epoch 2500/5000, Cost: 0.2602
Epoch 2600/5000, Cost: 0.2684
Epoch 2700/5000, Cost: 0.2671
Epoch 2800/5000, Cost: 0.2609
Epoch 2900/5000, Cost: 0.2698
Epoch 3000/5000, Cost: 0.2687
Epoch 3100/5000, Cost: 0.2653
Epoch 3200/5000, Cost: 0.2661
Epoch 3300/5000, Cost: 0.2722
Epoch 3400/5000, Cost: 0.2712
Epoch 3500/5000, Cost: 0.2719
Epoch 3600/5000, Cost: 0.2681
Epoch 3700/5000, Cost: 0.2669
Epoch 3800/5000, Cost: 0.2682
Epoch 3900/5000, Cost: 0.2776
Epoch 4000/5000, Cost: 0.2738
Epoch 4100/5000, Cost: 0.2729
Epoch 4200/5000, Cost: 0.2738
Epoch 4300/5000, Cost: 0.2678
Epoch 4400/5000, Cost: 0.2609
Epoch 4500/5000, Cost: 0.2629
Epoch 4600/5000, Cost: 0.2667
Epoch 4700/5000, Cost: 0.2734
Epoch 4800/5000, Cost: 0.2646
Epoch 4900/5000, Cost: 0.2710
Train Accuracy: 90.50%

Training with hi

In [1]:
# Let's copy the final output we got
'''
Best Hyperparameter Values:
Hidden Size: 10
Learning Rate: 0.5
Regularization Strength: 0.001
Number of Epochs: 10000
Best Accuracy: 91.75%
'''

As we can see, hyperparameter tuning increased the accuracy by only about 1% compared with the untuned model. The best configuration found was:

"Best Hyperparameter Values:
Hidden Size: 10
Learning Rate: 0.5
Regularization Strength: 0.001
Number of Epochs: 10000
Best Accuracy: 91.75%"

This is good enough, though, since the accuracy was already high. Any further tweaking is unlikely to change it much.

The reasons are simple:

* The amount of training data is not large enough.
* The model is a simple neural network, without many layers or specialised techniques such as convolution.
* The accuracy of the simple model was already near its peak, so there was little room left for improvement.
* I am still learning and fairly new to ML and DL, so I may not yet have the skill to write better code.

### Note:

Since the random search uses a different seed each time its code block is run, the best result varies from run to run. In one run I achieved an accuracy of 91.89% with a certain parameter combination, but I did not copy that output. After realising that the random search gives a different best output every time, I decided to copy the best output from the current run, as shown in the cells above.
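
One way to make the search repeatable is to fix the seed that drives the sampling. The snippet below is only a minimal sketch under assumptions: the helper `sample_configs` and the dictionary `param_grid` are hypothetical names, and the grid values are simply those that appear in the training logs above, not necessarily the full search space used in this notebook.

```python
import numpy as np
from itertools import product

# Illustrative grid only: these values appear in the training logs above,
# but the notebook's actual search space may differ.
param_grid = {
    "hidden_size": [10, 15],
    "learning_rate": [0.01, 0.1, 0.5, 1],
    "reg_strength": [0, 0.001, 0.01, 0.1, 1],
    "num_epochs": [1000, 2000, 3000, 5000, 10000],
}

def sample_configs(grid, n_samples, seed):
    """Draw n_samples distinct configurations from the grid, reproducibly."""
    # A dedicated generator keeps the sampling independent of global state
    rng = np.random.default_rng(seed)
    # Enumerate every combination, then pick indices without replacement
    all_configs = list(product(*grid.values()))
    chosen = rng.choice(len(all_configs), size=n_samples, replace=False)
    return [dict(zip(grid.keys(), all_configs[i])) for i in chosen]

configs_a = sample_configs(param_grid, n_samples=5, seed=42)
configs_b = sample_configs(param_grid, n_samples=5, seed=42)
print(configs_a == configs_b)  # same seed, same configurations
```

Note that `initialize_params` draws the initial weights with `np.random.randn`, which uses NumPy's global random state. For the accuracies themselves to repeat, that state would also need to be fixed, for example by calling `np.random.seed(...)` before each training run.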

# Thank you

---