This repository demonstrates a deep learning algorithm, the Convolutional Neural Network (CNN), using TensorFlow.
The goal of this project is to build a deep learning model that distinguishes between types of ships. The dataset has 5 types of ships, labelled {'Cargo': 1, 'Military': 2, 'Carrier': 3, 'Cruise': 4, 'Tankers': 5}. A Convolutional Neural Network (CNN) is trained on images of the various ship types using 60% of the data, validated simultaneously against 20% of the data, and finally tested on the remaining 20%. The CNN is tuned by varying hyperparameters: activation functions, loss functions, the number of epochs, weight initialisers, and the network size and number of layers, in order to obtain the most accurate version of the CNN for this data. The code is written with TensorFlow, Google's deep learning library, and the loss and accuracy plots indicate how well or how poorly the network has been trained against the data.
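The 60/20/20 split described above can be sketched as follows. This is only an illustration using the Python standard library: the function name `split_indices`, the seed, and the sample count of 1000 are assumptions, not values taken from the notebook.

```python
import random


def split_indices(n_samples, train_frac=0.6, val_frac=0.2, seed=42):
    """Shuffle sample indices and split them into train/validation/test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible

    n_train = round(n_samples * train_frac)
    n_val = round(n_samples * val_frac)

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains, roughly 20%
    return train, val, test


# Hypothetical dataset size, used only to show the resulting proportions.
train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 600 200 200
```

Splitting on shuffled indices rather than on the images themselves keeps the three subsets disjoint and makes it easy to reuse the same split across experiments.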
Activations are changed, keeping all other hyperparameters constant for comparison
Cost Functions are changed, keeping all other hyperparameters constant
Epochs are changed, keeping all other hyperparameters constant
Gradients are changed, keeping all other hyperparameters constant
Initializations are changed, keeping all other hyperparameters constant
The network architecture, number of layers, epochs and filter sizes were all changed during earlier training runs to understand the results and decide which hyperparameter values work better.
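Changing one hyperparameter at a time while holding the rest constant can be organised as a small list of run configurations. The sketch below is an assumption about how such runs might be enumerated; the candidate values and the `one_at_a_time` helper are hypothetical and do not come from the notebook.

```python
# Baseline settings; the initializer entry is the Keras default for Conv2D/Dense.
baseline = {
    "activation": "relu",
    "optimizer": "adam",
    "initializer": "glorot_uniform",
    "epochs": 15,
}

# Hypothetical candidate values for each hyperparameter.
candidates = {
    "activation": ["relu", "tanh", "sigmoid"],
    "optimizer": ["adam", "sgd", "rmsprop"],
    "initializer": ["glorot_uniform", "glorot_normal", "he_normal"],
    "epochs": [10, 15, 30],
}


def one_at_a_time(baseline, candidates):
    """Yield (name, value, config) where config differs from baseline in one key."""
    for name, values in candidates.items():
        for value in values:
            if value == baseline[name]:
                continue  # the baseline itself is not a new experiment
            config = dict(baseline)  # copy so the baseline stays untouched
            config[name] = value
            yield name, value, config


runs = list(one_at_a_time(baseline, candidates))
for name, value, config in runs:
    print(f"vary {name} -> {value}")
```

Because every configuration differs from the baseline in exactly one entry, any change in accuracy between a run and the baseline can be attributed to that single hyperparameter.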
Parameters:
- Activation function = ReLU
- Loss function = SparseCategoricalCrossentropy()
- Optimizer = Adam
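A minimal Keras sketch of a CNN compiled with these three settings might look like the following. The input size of 64x64x3, the layer widths and the filter sizes are assumptions for illustration and are not the notebook's actual architecture. Note that `SparseCategoricalCrossentropy` expects integer class indices starting at 0, so labels numbered 1 to 5 would need to be shifted to 0 to 4 for a 5-unit output layer.

```python
import tensorflow as tf

NUM_CLASSES = 5            # Cargo, Military, Carrier, Cruise, Tankers
INPUT_SHAPE = (64, 64, 3)  # assumed image size, not taken from the notebook

model = tf.keras.Sequential([
    tf.keras.Input(shape=INPUT_SHAPE),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES),  # raw logits, one per ship type
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    # from_logits=True because the last layer has no softmax
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# Labels in the dataset run from 1 to 5; shift them to 0..4 before training.
zero_based = tf.constant([1, 2, 3, 4, 5]) - 1
```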
Further models can be found in the Jupyter Notebook. The results of one model are illustrated below.
- The model is initially trained for 15 epochs, but repeatedly training the same model in a running kernel continues from the already trained weights, which makes the model overfit the data and can produce misleading accuracies.
- With hyperparameter tuning, the model is improved by using Xavier (Glorot) initialization, namely Xavier Uniform and Xavier Gaussian.
- The model also trained with other loss functions and optimisers, but could yield better results with an appropriate batch size.
- Meanwhile, it is also concluded that using different layers and adding more layers to the model only increases the complexity and does not improve the accuracy unless an appropriate activation function is used.
- Later, the model is tuned with various values of epochs and batch_size to improve accuracy.
- Finally, the CNN reached its maximum accuracy of 63% with the initial ReLU activation function, the Adam optimiser, no explicit initialization, and SparseCategoricalCrossentropy as the best loss function.
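Two of the points above, comparing Xavier (Glorot) initializers and avoiding repeated training of the same model object, can be combined in one pattern: build a fresh model for every run. The sketch below assumes a deliberately small architecture and a 64x64x3 input; `build_model` is a hypothetical helper, not a function from the notebook. In Keras, Xavier Uniform and Xavier Gaussian correspond to the `glorot_uniform` and `glorot_normal` initializers.

```python
import tensorflow as tf


def build_model(initializer="glorot_uniform", input_shape=(64, 64, 3), num_classes=5):
    """Return a newly initialised, compiled CNN so no weights carry over between runs."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        tf.keras.layers.Conv2D(32, 3, activation="relu",
                               kernel_initializer=initializer),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(num_classes, kernel_initializer=initializer),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    return model


# One independent model per initializer; each starts from untrained weights.
models = {name: build_model(name) for name in ("glorot_uniform", "glorot_normal")}
```

Calling `fit` on a model returned by `build_model` always starts from new weights, whereas re-running a training cell on an existing model keeps training the weights from the previous run.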
References:
[1] ADL (24 April 2018), "An intuitive guide to Convolutional Neural Networks", retrieved from https://www.freecodecamp.org/news/an-intuitive-guide-to-convolutional-neural-networks-260c2de0a050/

[2] TensorFlow Tutorials, "Convolutional Neural Network (CNN)", retrieved from https://www.tensorflow.org/tutorials/images/cnn

[3] Analytics Vidhya Courses, "Convolutional Neural Networks (CNN) from Scratch", retrieved from https://courses.analyticsvidhya.com/courses/take/convolutional-neural-networks-cnn-from-scratch/texts/10844923-what-is-a-neural-network

[4] TensorFlow Core Documentation, "Module: tf.keras.initializers", retrieved from https://www.tensorflow.org/api_docs/python/tf/keras/initializers?version=nightly
90% contributed by me, 10% derived from official documentation