These are some simple TensorFlow scripts I put together while listening to Martin Görner's talk, TensorFlow and Deep Learning Without a PhD.
The `scripts` directory contains the scripts, and the `output_images` directory contains training/test accuracy and cross entropy plots for each neural network.
The simple neural network is straightforward but only achieves ~92% accuracy, which is poor performance for digit recognition on the MNIST dataset.
Max test accuracy achieved = 92.70%
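The simple network is a single dense layer feeding a softmax over the ten digit classes. A minimal numpy sketch of that forward pass (the layer shape of 784 inputs to 10 outputs matches flattened 28x28 MNIST images; the random weights here are placeholders, not trained values):

```python
import numpy as np

def softmax(logits):
    # Subtract the row max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# One dense layer: 28x28 images flattened to 784 inputs, 10 digit classes.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(784, 10))
b = np.zeros(10)

x = rng.random((32, 784))      # a batch of 32 fake "images"
probs = softmax(x @ W + b)     # class probabilities, shape (32, 10)
```

Training then minimizes the cross entropy between these probabilities and the one-hot labels.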
The deep network performs better, approaching ~98% accuracy, but its performance is unstable. This can be seen in the jagged test accuracy curve, which never settles.
Max test accuracy achieved = 98.15%
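The deep network stacks several fully connected ReLU layers before the final softmax. A hedged numpy sketch of the forward pass (the layer widths below are the ones Görner uses in the talk, but treat them as an assumption about these scripts):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(1)
sizes = [784, 200, 100, 60, 30, 10]   # assumed layer widths
weights = [rng.normal(scale=0.1, size=(m, n)) for m, n in zip(sizes, sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    # ReLU on every hidden layer; the last layer outputs raw logits
    # to be fed into a softmax cross-entropy loss.
    for W, b in zip(weights[:-1], biases[:-1]):
        x = relu(x @ W + b)
    return x @ weights[-1] + biases[-1]

logits = forward(rng.random((8, 784)))   # shape (8, 10)
```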
Decaying the learning rate helps stabilize model performance, as seen in the smoother test accuracy in later training iterations. However, it makes the overfitting, visible in the increasing test cross entropy, more evident.
Max test accuracy achieved = 98.12%
Regularization via dropout reduces the overfitting, as seen in the stable test cross entropy, but does not improve accuracy.
Max test accuracy achieved = 98.08%
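Dropout randomly zeroes activations during training; the "inverted" variant rescales the survivors so the expected activation is unchanged and no correction is needed at test time. A numpy sketch (the keep probability of 0.75 is the talk's value for dense layers, used here as an assumption):

```python
import numpy as np

def dropout(activations, pkeep, rng):
    # Keep each unit with probability pkeep, zero it otherwise, then
    # rescale the survivors by 1/pkeep so the expected value is unchanged.
    mask = rng.random(activations.shape) < pkeep
    return activations * mask / pkeep

rng = np.random.default_rng(2)
a = np.ones((4, 100))
dropped = dropout(a, pkeep=0.75, rng=rng)   # ~25% of entries zeroed
```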
Adding convolutions lets the network exploit the spatial structure of the images, which flattening into a dense layer discards, increasing performance. However, the overfitting, seen in the increasing test cross entropy, returns.
Max test accuracy achieved = 98.75%
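A convolution slides a small kernel across the image and responds to local patterns, which is how the layer preserves the 2-D neighbourhood structure. A minimal numpy sketch of one "valid" convolution (single channel, single filter; the real layers use many learned filters and padding):

```python
import numpy as np

def conv2d_valid(image, kernel):
    # Slide the kernel over every position where it fully fits ("valid"
    # padding) and take the dot product with the patch underneath.
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(28 * 28, dtype=float).reshape(28, 28)
kernel = np.full((5, 5), 1 / 25.0)          # a 5x5 averaging filter
feature_map = conv2d_valid(image, kernel)   # shape (24, 24)
```

An averaging filter is just an easy-to-verify stand-in; trained kernels instead learn edge and stroke detectors useful for digits.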
Adding dropout to the convolutional neural network once again reduces overfitting and even provides a slight performance boost.
Max test accuracy achieved = 99.12%
Using batch normalization gives another performance increase.
Max test accuracy achieved = 99.22%
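Batch normalization standardizes each unit's activations across the batch to zero mean and unit variance, then applies a learned scale (gamma) and shift (beta). A numpy sketch of the training-time transform, with gamma and beta at their identity initializations (the running statistics used at test time are omitted):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # Normalize each column (unit) over the batch dimension; gamma and
    # beta let the network learn to undo the normalization if needed.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(3)
x = rng.normal(loc=5.0, scale=2.0, size=(64, 10))
y = batch_norm(x, gamma=np.ones(10), beta=np.zeros(10))
```

Keeping every layer's input distribution centred like this lets training tolerate higher learning rates, which is one explanation for the accuracy bump.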
Using both batch normalization and dropout together does not appear to provide any additional benefit.
Max test accuracy achieved = 99.14%