In [1]:
%load_ext autoreload

In [2]:
%autoreload 2

In [3]:
import numpy as np
from plotly.offline import init_notebook_mode, plot, iplot
from plotly.graph_objs import Scatter, Figure, Layout
from tensorflow.examples.tutorials.mnist import input_data
from tensorflow.python.framework import dtypes
from solarimage.classifier.nn.trainer import DataSet, TrainIter
from solarimage.classifier.nn import tester

In [4]:
init_notebook_mode(connected=True)

In [5]:
mnist = input_data.read_data_sets("example/MNIST_data/", one_hot=True)

Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting example/MNIST_data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting example/MNIST_data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting example/MNIST_data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting example/MNIST_data/t10k-labels-idx1-ubyte.gz


In [6]:
trainer = TrainIter(images=mnist.train.images, 
                    labels=mnist.train.labels, 
                    reshape=False,
                    dtype=dtypes.uint8)
test_dataset = DataSet(mnist.test.images, mnist.test.labels,
                       reshape=False, dtype=dtypes.uint8)

In [7]:
gd_loss_list, gd_weights, gd_bias = trainer.run_simple_nn(batch_size=100, 
                                                          iter_max=1e4,
                                                          learning_rate=1e-2,
                                                          solver_method='gd')
sgd_loss_list, sgd_weights, sgd_bias = trainer.run_simple_nn(batch_size=100,
                                                             iter_max=1e4,
                                                             learning_rate=1e-2,
                                                             solver_method='sgd')

In [11]:
np.mean(gd_loss_list[5000:]), np.std(gd_loss_list[5000:]), np.mean(sgd_loss_list[5000:]), np.std(sgd_loss_list[5000:])

(0.36840883, 0.079107985, 0.26504198, 0.10471204)

A simple neural network with a softmax activation function is implemented above, with cross entropy as the objective function. With a learning rate of 1e-2, the loss converges after about 5k steps. As the figure below shows, over the last 5k steps the Adam optimizer (the `sgd` run) reaches a mean loss of about 0.27 with a standard deviation of 0.10, slightly better than gradient descent (the `gd` run), which reaches about 0.37 with a standard deviation of 0.08. Because the loss keeps fluctuating after convergence, the fidelity of image recognition also varies within a range. Further work is therefore needed to improve the fidelity and reduce the fluctuation.
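
To make the objective concrete, here is a minimal NumPy sketch of a single-layer softmax classifier with a cross-entropy loss and one plain gradient descent update. It is an illustration only and is not the code behind `run_simple_nn`; the function names `softmax`, `cross_entropy`, and `gd_step` are hypothetical, and the mean-over-batch loss is an assumption.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax; subtracting the row maximum avoids overflow."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def cross_entropy(probs, one_hot_labels, eps=1e-12):
    """Mean cross entropy over the batch (eps guards against log(0))."""
    per_sample = -np.sum(one_hot_labels * np.log(probs + eps), axis=1)
    return float(np.mean(per_sample))


def gd_step(weights, bias, images, one_hot_labels, learning_rate=1e-2):
    """One gradient descent update for the softmax layer (hypothetical helper)."""
    probs = softmax(images @ weights + bias)
    # Gradient of the mean cross entropy with respect to the logits.
    grad_logits = (probs - one_hot_labels) / images.shape[0]
    new_weights = weights - learning_rate * (images.T @ grad_logits)
    new_bias = bias - learning_rate * grad_logits.sum(axis=0)
    return new_weights, new_bias
```

Repeating `gd_step` on successive mini-batches and recording `cross_entropy` after each step would give a loss list comparable in shape to `gd_loss_list`, under the assumptions stated above.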

In [9]:
data = [
    Scatter(
        x = np.arange(len(gd_loss_list)),
        y = np.array(gd_loss_list),  # plot the gradient descent losses for this trace
        name = 'Gradient Descent',
    ),
    Scatter(
        x = np.arange(len(sgd_loss_list)),
        y = np.array(sgd_loss_list),
        name = 'Adam Optimizer',
    ),
]

layout = dict(title = 'Convergence of Simple Neural Network',
              xaxis = dict(title = 'Iterative Steps'),
              yaxis = dict(title = 'Error'),
              )

fig = dict(data=data, layout=layout)
iplot(fig)
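
Because the per-step loss is noisy, a moving average can make the trend of each curve easier to compare. The helper below is a small sketch; `moving_average` and its `window` parameter are hypothetical names and are not part of `solarimage`.

```python
import numpy as np


def moving_average(values, window=100):
    """Average each run of `window` consecutive values (no padding at the edges)."""
    values = np.asarray(values, dtype=float)
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    kernel = np.ones(window) / window
    # mode="valid" keeps only positions where the window fits entirely.
    return np.convolve(values, kernel, mode="valid")
```

A smoothed series such as `moving_average(gd_loss_list)` has `len(gd_loss_list) - window + 1` points, so the x values of a `Scatter` trace built from it need to be shortened to match.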

In [34]:
gd_fidelity = tester.runner(test_dataset, gd_weights, gd_bias)
sgd_fidelity = tester.runner(test_dataset, sgd_weights, sgd_bias)

In [35]:
gd_fidelity, sgd_fidelity

(0.87050003, 0.92839998)
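
The notebook does not show how `tester.runner` computes fidelity. Assuming it is the usual classification accuracy (the fraction of test images whose highest-scoring class matches the label), a minimal sketch looks like this; `accuracy` is a hypothetical name.

```python
import numpy as np


def accuracy(images, one_hot_labels, weights, bias):
    """Fraction of samples whose arg-max class matches the one-hot label."""
    logits = images @ weights + bias
    # Softmax is monotonic, so the arg-max of the logits equals
    # the arg-max of the softmax probabilities.
    predicted = np.argmax(logits, axis=1)
    expected = np.argmax(one_hot_labels, axis=1)
    return float(np.mean(predicted == expected))
```

Under that assumption, the values above would mean roughly 87% of test images are classified correctly with the gradient descent weights and roughly 93% with the Adam weights.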