Author: Sumit Sharma, EE20D042
The task is to implement a feed-forward neural network and write the backpropagation code for training it. The network is trained and tested on the Fashion-MNIST dataset: given an input image of 28 x 28 = 784 pixels from Fashion-MNIST, the network is trained to classify the image into 1 of 10 classes.
In this GitHub repository, the feed-forward neural network is implemented in a single Google Colab notebook (attached).

Notation:
- inn----input neurons
- otn----output neurons
- hls----hidden layers
- hn----hidden neurons
- xtr----Xtrain
- ytr----Ytrain
- xte----Xtest
- yte----Ytest
- W----weight dictionary
- b----bias dictionary
- u----dictionary holding the input to each neuron
- v----dictionary holding the output of each neuron
- la----index of the "la"th training example
The code is divided into separate functions. The main functions are as follows (a minimal illustrative sketch of several of these helpers is given after the list):

- activ function----calls the chosen activation function, i.e., sigmoid, ReLU or tanh.
- deractiv function----calls the derivative of the chosen activation function, so that the correct derivative is used during backpropagation.
- OUT(ul, otn)----the output function, i.e., the softmax applied to the last layer of the network, which turns the neuron outputs into predictions/probabilities.
- forwardprop(xtr, hls, hn, inn, otn, W, b, u, v, la, activation)----defines the forward propagation step (the forward pass), which produces the prediction.
- init(W, b, inn, hls, hn, otn, initialization)----initializes the weights and biases; the initialization argument selects the method, i.e., "random" or "xavier".
- hot(m, n)----builds the one-hot vectors, so that each class is uniquely identified by its own encoding.
- Backprop(ytr, hls, hn, inn, otn, W, b, u, v, gradW, gradb, la, y_predicted, activation, los)----defines the backpropagation step, which computes the gradients of the loss with respect to the weights and biases.
- LOSS(y_predicted, ytr, los, la, otn)----defines the loss, i.e., squared error or cross-entropy; this function takes 5 arguments.
- acc(xte, yte, xtr, ytr, W, b, hls, otn, hn, activation)----the accuracy function, which takes 10 arguments and returns the validation accuracy along with the confusion matrix.
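As a rough guide to what these helpers compute, here is a minimal NumPy sketch of the activation, derivative, softmax, one-hot and loss functions. It is an illustration only: the signatures and the W/b/u/v dictionary bookkeeping in the actual notebook differ, and names such as num_classes are placeholders.

```python
# Minimal NumPy sketch of the helper functions described above.
# Illustration only -- the notebook's own activ/deractiv/OUT/hot/LOSS
# functions use the W/b/u/v dictionary layout and may differ in detail.
import numpy as np

def activ(x, activation):
    """Apply the chosen activation function element-wise."""
    if activation == "sigmoid":
        return 1.0 / (1.0 + np.exp(-x))
    if activation == "relu":
        return np.maximum(0.0, x)
    if activation == "tanh":
        return np.tanh(x)
    raise ValueError(f"unknown activation: {activation}")

def deractiv(x, activation):
    """Derivative of the chosen activation, evaluated at pre-activation x."""
    if activation == "sigmoid":
        s = 1.0 / (1.0 + np.exp(-x))
        return s * (1.0 - s)
    if activation == "relu":
        return (x > 0).astype(float)
    if activation == "tanh":
        return 1.0 - np.tanh(x) ** 2
    raise ValueError(f"unknown activation: {activation}")

def OUT(ul):
    """Softmax over the last layer's pre-activations -> class probabilities."""
    shifted = ul - np.max(ul)   # subtract the max for numerical stability
    e = np.exp(shifted)
    return e / np.sum(e)

def hot(label, num_classes=10):
    """One-hot encode an integer class label."""
    vec = np.zeros(num_classes)
    vec[label] = 1.0
    return vec

def LOSS(y_predicted, y_true_onehot, los):
    """Cross-entropy or squared-error loss for a single sample."""
    if los == "cross_entropy":
        return -np.sum(y_true_onehot * np.log(y_predicted + 1e-12))
    if los == "squared_error":
        return 0.5 * np.sum((y_predicted - y_true_onehot) ** 2)
    raise ValueError(f"unknown loss: {los}")
```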
The different optimization methods are defined as separate functions (an illustrative parameter-update step is sketched after this list):

- momentumbasedgd(xtr, ytr, hls, hn, inn, otn, W, b, u, v, learning_rate, epochs, size_batch, set_train, activation, los)----momentum-based gradient descent.
- RMS(xtr, ytr, hls, hn, inn, otn, W, b, u, v, learning_rate, epochs, size_batch, set_train, activation, los)----root-mean-square propagation (RMSProp).
- Nesterovagd(xtr, ytr, hls, hn, inn, otn, W, b, u, v, learning_rate, epochs, size_batch, set_train, activation, los)----Nesterov accelerated gradient descent.
- stochasticgd(xtr, ytr, hls, hn, inn, otn, W, b, u, v, learning_rate, epochs, size_batch, set_train, activation, los)----stochastic gradient descent.
- Nesterovadapt(xtr, ytr, hls, hn, inn, otn, W, b, u, v, learning_rate, epochs, size_batch, set_train, activation, los)----Nesterov adaptive moment estimation (NAdam).
- adaptivem(xtr, ytr, hls, hn, inn, otn, W, b, u, v, learning_rate, epochs, size_batch, set_train, activation, los)----adaptive moment estimation (Adam).
- trning(xtr, ytr, hls, hn, inn, otn, W, b, u, v, learning_rate, epochs, size_batch, Optimization, set_train, activation, los, initialization)----finally, the training function, which takes 18 arguments and calls the functions above according to the chosen Optimization method.
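All of these optimizer functions share the same epoch and mini-batch structure and differ only in how they turn the gradients gradW/gradb into parameter updates. As an illustration, a momentum-based update for one mini-batch might look like the sketch below; the history dictionaries histW/histb and the beta hyperparameter are illustrative names, not necessarily those used in the notebook.

```python
# Illustrative momentum update for one mini-batch, assuming gradW/gradb hold
# the accumulated gradients for each layer key. The notebook's
# momentumbasedgd() wraps a step like this inside its epoch/mini-batch loops;
# `beta`, `histW` and `histb` are illustrative names.
def momentum_step(W, b, gradW, gradb, histW, histb, learning_rate, beta=0.9):
    for key in W:
        # update the velocity (history) term, then move the parameters
        histW[key] = beta * histW[key] + learning_rate * gradW[key]
        histb[key] = beta * histb[key] + learning_rate * gradb[key]
        W[key] = W[key] - histW[key]
        b[key] = b[key] - histb[key]
    return W, b, histW, histb
```

RMS, adaptivem, Nesterovagd and Nesterovadapt follow the same pattern but replace the velocity update above with the RMSProp, Adam, NAG and NAdam update rules, respectively.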
For the wandb sweep, the ranges of the different hyperparameters are as follows:
```python
sweep_config = {
    'name': "Sumit Sharma",
    'method': 'random',
    'metric': {
        'name': 'val_acc',
        'goal': 'maximize'
    },
    'parameters': {
        'hls': {
            'values': [3, 4, 5]
        },
        'epochs': {
            'values': [10, 15]
        },
        'hn': {
            'values': [32, 64, 128]
        },
        'learning_rate': {
            'values': [1e-2, 1e-3, 5e-3]
        },
        'initialization': {
            'values': ["random", "xavier"]
        },
        'size_batch': {
            'values': [32, 64]
        },
        'Optimization': {
            'values': ["momentumbasedgd", "stochasticgd", "Nesterovadapt", "RMS", "adaptivem", "nesterovagd"]
        },
        'activation': {
            'values': ["sigmoid", "relu", "tanh"]
        }
    }
}
```
For training, 90% of the dataset is used, i.e., 54,000 samples, and the remaining 10% is used as the test set. The number of output neurons is set to 10 and the number of input neurons to 784. The los variable is set to "cross_entropy". Finally, there is an execute() function that calls trning() and initiates wandb.
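For reference, a sweep with this configuration would typically be registered and launched as sketched below; the project name is a placeholder, and execute() is assumed to read the sampled hyperparameters from wandb.config before calling trning().

```python
# Minimal sketch of launching the sweep (assumes wandb is installed and that
# `sweep_config` and `execute` are already defined in the notebook).
import wandb

# The project name below is a placeholder.
sweep_id = wandb.sweep(sweep_config, project="fashion-mnist-ffnn")

# Each agent run samples hyperparameters with the 'random' method and calls
# execute(), which is assumed to pass the wandb.config values to trning().
wandb.agent(sweep_id, function=execute, count=20)
```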