Neural Network
========
@author: Matthew Rowley

### Acknowledgements
The whole idea for this notebook is taken from the excellent video series by the YouTube channel "3Blue1Brown". The series begins with the video *But what **is** a Neural Network? | Deep learning, Part 1* (<https://www.youtube.com/watch?v=aircAruvnKk&t=1s>). The channel is a one-man labor of love produced by the talented Grant Sanderson. (Since I wrote this, Grant has hired some help, but he is still the creative genius behind the channel.)

### Description
This notebook will explore the implementation and training of a neural network. The network will take images of hand-drawn numerals as inputs, and output guesses for which numeral was drawn.

Following Grant's lead, the network will have two hidden layers, each with 16 nodes. I also weighted the true value more heavily than the other numerals in the error function: its term counts 2.5 times as much.

In [1]:
from __future__ import division
import numpy as np
import cv2
import os
import idx2numpy
import gc
import pickle

#### Define the neural network class, then create an instance with random weights and biases, and test it on a random input.

In [2]:
class NeuralNetwork(object):
    """The Neural Network Class"""
    def __init__(self, W1=None, W2=None, W3=None, B2=None, B3=None, B4=None, ReLU=False, variability=None):
        """Initialization routine generates random values"""
        if(W1 is None): W1 = np.random.random([784,16]) - 0.5
        if(W2 is None): W2 = np.random.random([16,16]) - 0.5
        if(W3 is None): W3 = np.random.random([16,10]) - 0.5
        if(B2 is None): B2 = np.random.random(16) - 0.5
        if(B3 is None): B3 = np.random.random(16) - 0.5
        if(B4 is None): B4 = np.random.random(10) - 0.5
        if(variability is None): variability = 0.1
        self.variability = variability  # How much to randomize weights and biases in the randomize function
        self.W1=W1  # Weights
        self.W2=W2
        self.W3=W3
        self.B2=B2  # Node Biases
        self.B3=B3
        self.B4=B4
        self.N1 = np.zeros(784) # Nodes (initialized as zeros)
        self.N2 = np.zeros(16)
        self.N3 = np.zeros(16)
        self.N4 = np.zeros(10)
        self.ReLU = ReLU
        self.error = 0  # Metric for the quality of this network on the training set, cumulative over all training samples
        
    def identifyNumber(self, N1=None, trueValue=None):
        """Given an array of pixel values, give the most probable numeral"""
        if (N1 is None): N1 = np.random.random(784) # default to an image of random noise
        if (trueValue is None): trueValue = np.random.randint(10) # default to a random true value (0-9)
        self.N1 = N1
        self.N2 = self.normalize(np.dot(self.N1, self.W1) - self.B2)
        self.N3 = self.normalize(np.dot(self.N2, self.W2) - self.B3)
        self.N4 = self.normalize(np.dot(self.N3, self.W3) - self.B4)
        max_i = int(np.argmax(self.N4))  # index of the largest output (the first one on ties)
        for i in range(10):
            if i == trueValue:  # For the true value, probability should be close to 1 
                self.error += 2.5*(1.0-self.N4[i])**2  # Weigh the correct answer 2.5 times more than others
            else:  # For all others, probability should be close to 0
                self.error += (0.0-self.N4[i])**2
        return max_i, self.N4, self.error # this is (numeral, [probability values for numerals 0-9], error)
        
    def normalize(self, nodeVals):
        """Normalize the node values according to a sigmoid or ReLU function"""
        if(self.ReLU):
            return nodeVals.clip(min=0)
        else:
            return 1.0 / (1.0 + np.exp(-nodeVals))
    
    def randomize(self):
        """Modify the current network's parameters by small random values"""
        self.W1 = self.W1 + (np.random.random([784,16]) - 0.5)*self.variability
        self.W2 = self.W2 + (np.random.random([16,16]) - 0.5)*self.variability
        self.W3 = self.W3 + (np.random.random([16,10]) - 0.5)*self.variability
        self.B2 = self.B2 + (np.random.random(16) - 0.5)*self.variability
        self.B3 = self.B3 + (np.random.random(16) - 0.5)*self.variability
        self.B4 = self.B4 + (np.random.random(10) - 0.5)*self.variability
        
    def clone(self):
        """Return a cloned network with the same weights, biases, and settings as this one"""
        my_clone = NeuralNetwork(W1=np.copy(self.W1), W2=np.copy(self.W2), W3=np.copy(self.W3),
                                 B2=np.copy(self.B2), B3=np.copy(self.B3), B4=np.copy(self.B4),
                                 ReLU=self.ReLU, variability=self.variability)  # keep the activation and mutation size too
        return my_clone
    
    def resetError(self):
        """Reset the cumulative error variable to 0"""
        self.error=0

In [3]:
myNetwork=NeuralNetwork()

In [4]:
print(myNetwork.identifyNumber())

(7, array([0.67236826, 0.65166649, 0.42549631, 0.39339571, 0.27215409,
       0.46897357, 0.63070417, 0.85280158, 0.78170571, 0.1287161 ]), 3.2024103641032298)
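
The third element of the tuple above is the accumulated error. A minimal sketch of what a single sample contributes, mirroring the loop at the end of `identifyNumber`, might look like this. The function name `weighted_squared_error` and the `weight` parameter are hypothetical and not part of the class.

```python
import numpy as np

def weighted_squared_error(probs, true_value, weight=2.5):
    """Squared distance to a one-hot target, with the true numeral's
    term weighted more heavily than the others (hypothetical helper)."""
    probs = np.asarray(probs, dtype=float)
    target = np.zeros_like(probs)
    weights = np.ones_like(probs)
    target[true_value] = 1.0       # the true numeral should be close to 1
    weights[true_value] = weight   # and its miss counts `weight` times as much
    return float(np.sum(weights * (target - probs) ** 2))

# Three-output toy example: 0.1**2 + 2.5 * (1 - 0.9)**2 + 0.2**2
example = weighted_squared_error([0.1, 0.9, 0.2], true_value=1)
```

Because the class adds this quantity to `self.error` on every call, the value it returns grows until `resetError()` is called.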


#### I've drawn a "6." Here I import it, and run it on the network.

In [5]:
im = cv2.imread(os.path.join("Data","Test.png"), 0)

In [6]:
im = np.ndarray.flatten(im)/255.0
print(myNetwork.identifyNumber(N1=im, trueValue=6))

(7, array([0.66068723, 0.61452162, 0.4120595 , 0.3877832 , 0.2743762 ,
       0.48003389, 0.62309054, 0.85734742, 0.7694498 , 0.1147115 ]), 6.337846658864208)
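
Both calls above run the same forward pass: three matrix products, each followed by the sigmoid. As a rough standalone sketch (the names `forward`, `weights`, and `biases` are assumptions made for illustration, and the biases are subtracted to follow the notebook's convention):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(pixels, weights, biases):
    """Propagate a flattened image through each layer in turn."""
    activations = pixels
    for W, b in zip(weights, biases):
        activations = sigmoid(np.dot(activations, W) - b)
    return activations

rng = np.random.default_rng(0)
shapes = [(784, 16), (16, 16), (16, 10)]          # input -> hidden -> hidden -> output
weights = [rng.random(s) - 0.5 for s in shapes]   # random values in [-0.5, 0.5)
biases = [rng.random(s[1]) - 0.5 for s in shapes]
probs = forward(rng.random(784), weights, biases)
print(probs.shape)  # (10,)
```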


#### Now I will set up a short script for optimizing a network using a single training sample and a roughly genetic-esque optimization routine.

In [7]:
networks = [NeuralNetwork() for _ in range(10)]  # create a "generation" of 10 random individuals
for network in networks:  
    network.identifyNumber(N1=im, trueValue=6)  # Test each network on the sample to give it an error value

#### Having set up and run the network once, I am ready to "Train" it to recognize a 6. Run the cell below as many times as necessary to get very good results

In [55]:
min_i = 0
min_val = networks[0].error
for i, network in enumerate(networks):  # Find the index for the individual in this generation with the lowest error
    if network.error < min_val:
        min_i = i
        min_val = network.error
print("Best Error, index: {}, {}".format(min_val, min_i))
W1 = networks[min_i].W1
W2 = networks[min_i].W2
W3 = networks[min_i].W3
B2 = networks[min_i].B2
B3 = networks[min_i].B3
B4 = networks[min_i].B4
for i in range(len(networks)):  # Create a new generation of clones of the best network so far
    networks[i] = NeuralNetwork(W1=W1, W2=W2, W3=W3, B2=B2, B3=B3, B4=B4)
for i, network in enumerate(networks):
    if (i != min_i):  # Preserve the best individual from the last generation and randomize all others  
        network.randomize()
    network.identifyNumber(N1=im, trueValue=6)  # Test the new generation of networks

Best Error, index: 0.24118904178433456, 9
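
The select, clone, and mutate cycle in the cell above can be condensed into a small sketch on a toy problem. This is only an illustration under assumptions: `evolve`, the toy loss, and the parameter defaults are hypothetical, and a three-element vector stands in for the network's weights and biases.

```python
import numpy as np

def evolve(loss, start, generations=200, population=10, variability=0.1, seed=0):
    """Keep the best individual, fill the rest of the generation with
    randomly perturbed copies of it, and repeat."""
    rng = np.random.default_rng(seed)
    best = np.asarray(start, dtype=float)
    best_loss = loss(best)
    for _ in range(generations):
        # population - 1 mutated clones; the unmodified champion survives as `best`
        candidates = best + (rng.random((population - 1, best.size)) - 0.5) * variability
        losses = np.array([loss(c) for c in candidates])
        i = int(np.argmin(losses))
        if losses[i] < best_loss:
            best, best_loss = candidates[i], losses[i]
    return best, best_loss

# Toy loss (hypothetical): squared distance to a fixed target vector
target = np.array([1.0, -2.0, 0.5])
toy_loss = lambda p: float(np.sum((p - target) ** 2))
best, best_loss = evolve(toy_loss, start=np.zeros(3))
```

Since the champion is only replaced by a strictly better candidate, the best loss can never increase from one generation to the next.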


In [56]:
networks[0].identifyNumber(N1=im, trueValue=6)

(6, array([0.16395654, 0.12846182, 0.15337184, 0.14238429, 0.17579153,
        0.18848052, 0.91319048, 0.11418974, 0.22249144, 0.11111472]), 0.49467182507164564)

#### Of course, here I have grossly "overfit": with only one training sample, the network is almost certainly not doing any image processing at all, but rather gaming the numbers to always answer 6.

### Importing Images

#### I need many images (thousands) to adequately train the network. Thankfully, I can rely on the databases provided by Drs. LeCun and Cortes at: <http://yann.lecun.com/exdb/mnist/>

#### The databases include two sets of images (60,000 training and 10,000 testing), and two sets of true-value labels (training and testing). The training and testing sets are similar in every way, but just include different examples of handwritten numbers. By keeping them separate (i.e. *never* training with the testing set), we can be sure that our networks are able to recognize numbers outside of their training set.

#### The image arrays must also be flattened so that the image data is a 1-D array (rather than a 2-d one) and normalized to a maximum value of 1


In [57]:
unflattened_training_images = idx2numpy.convert_from_file(os.path.join("Data",'train-images.idx3-ubyte'))
training_labels = idx2numpy.convert_from_file(os.path.join("Data",'train-labels.idx1-ubyte'))
unflattened_testing_images = idx2numpy.convert_from_file(os.path.join("Data",'t10k-images.idx3-ubyte'))
testing_labels = idx2numpy.convert_from_file(os.path.join("Data",'t10k-labels.idx1-ubyte'))
print(unflattened_training_images.shape)
training_images = np.empty([60000, 784])
testing_images = np.empty([10000, 784])
for i in range(60000):
    training_images[i] = np.ndarray.flatten(unflattened_training_images[i]) / 255.0
for i in range(10000):
    testing_images[i] = np.ndarray.flatten(unflattened_testing_images[i]) / 255.0
print(training_images.shape)
# Clean up some memory
unflattened_training_images = None
unflattened_testing_images = None
gc.collect()

(60000, 28, 28)
(60000, 784)


41
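
The flatten-and-scale loops above could also be written as a single reshape. The sketch below uses a small array of fake images (an assumption, so it runs without the MNIST files) and checks that both approaches give the same result.

```python
import numpy as np

# Small stand-in for the MNIST array: 5 fake 28x28 images with uint8 pixels
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Loop version, as in the cell above
looped = np.empty([5, 784])
for i in range(5):
    looped[i] = np.ndarray.flatten(images[i]) / 255.0

# Vectorized version: reshape every image at once, then scale
vectorized = images.reshape(len(images), -1) / 255.0

print(np.array_equal(looped, vectorized))  # True
```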

#### Ideally, a trained network would give high probability to the true answer, and low probabilities to all other answers. I found that even reliable networks would give fairly high probabilities to *all* numbers, and the correct number would just barely beat out the others. I don't really know if this is a problem, but it bothered me so I came up with a method to discourage the network from making overconfident guesses. Here I append some images with random data and a "99" true-value label. These images will never be marked correct, so the network is rewarded for recognizing when an image is not a number at all, and returning low confidence values for all numerals.

In [58]:
training_images = np.concatenate((training_images, np.random.rand(6000,784)))
training_labels = np.concatenate((training_labels, 99.0 * np.ones(6000)))

testing_images = np.concatenate((testing_images, np.random.rand(1000,784)))
testing_labels = np.concatenate((testing_labels, 99.0 * np.ones(1000)))

print(training_images.shape)
print(training_labels.shape)

(66000, 784)
(66000,)


## Training the network the slow way

#### Now we can train the network using my simple genetic algorithm demonstrated above, and a random sample of images from the training data (20,000 per generation in the run below)

#### It will converge very slowly due to the enormous number of parameters and the randomness of the walk toward a local minimum, but it was relatively easy to code

#### Another advantage to this algorithm is that it parallelizes easily, but I haven't bothered with that here

#### The training set is too large to train each generation on the whole set, but we don't want to overtrain on a single small sample of the set. So, each generation selects a new random sample of training images from the whole collection
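
As a rough sketch of drawing a fresh sample for each generation: `np.random.randint` and its newer counterpart `Generator.integers` draw *with* replacement and treat the upper bound as exclusive, while `Generator.choice` with `replace=False` guarantees distinct images. The variable names below are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_images = 66000        # size of the augmented training set
sample_number = 20000

# With replacement: the upper bound is exclusive, so pass n_images itself
with_replacement = rng.integers(0, n_images, size=sample_number)

# Without replacement: every sampled image is distinct
without_replacement = rng.choice(n_images, size=sample_number, replace=False)
```

Either choice works for this kind of training; sampling without replacement simply avoids counting the same image twice within one generation.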

#### First, I set up the generational structure, this time with 50 individuals per generation

In [85]:
sample_number = 20000  # How many images to train on
load_best = True  # If a champion network is saved from a previous session, it can be loaded and included here
networks = [NeuralNetwork() for _ in range(50)]  # 50 random individuals per generation
if load_best:  # Load the saved network into the 0 index for this generation
    loadfile = os.path.join("Data", "Best.pkl")
    with open(loadfile, 'rb') as pickled_network:
        networks[0] = pickle.load(pickled_network, encoding='latin1')
        networks[0].resetError()  # Reset the error, in case it is held over from previous trainings
trainers = np.random.randint(0, len(training_images), sample_number)  # Randomly select images from the training set (upper bound is exclusive)
for network in networks:
    for index in trainers:
        network.identifyNumber(N1=training_images[index], trueValue=training_labels[index])
min_i = 0
min_val = networks[0].error
for i, network in enumerate(networks):
    if network.error < min_val:
        min_i = i
        min_val = network.error
min_val = min_val / sample_number
for network in networks:
    print(network.error / sample_number)
print("Best Error, index: {}, {}".format(min_val, min_i))

0.5215063002334722
2.49454599492054
3.333975887321924
3.085988107849417
3.1184290445502483
3.6027516039475214
3.8282239726667564
3.0161096906201705
2.9814189178171717
3.3954051459525245
3.0691021853498723
3.1573170169640092
3.1857467721091903
3.641016229138962
3.0697561214336786
3.244727720374784
3.473540500771438
3.0532785706714547
2.9330318052575652
2.9092709025123
3.2270412875834364
2.7349647332858646
2.960290007645651
2.839173800407695
3.217568646569299
4.609163923660004
2.8107678945260863
2.7731143992085254
3.8415748613317446
3.483121414037572
2.4550846694977815
2.7031090937536577
2.5423021971507014
3.345832819550705
2.801765565423807
3.099957769605241
3.2317306764650766
3.6585661391018056
3.11900147345297
2.78553652762927
2.7742257129931858
2.86758721649433
2.733456750574516
3.2954562047160243
2.9104051802153914
3.07439532852658
3.1846171628772617
3.2666017976172252
3.4145008992305206
3.1647903738974077
Best Error, index: 0.5215063002334722, 0


#### Now, there will be some variation in the error values due to the different samples in the training set. Ideally, we would use a large enough sample_number so that the differences between training sample sets are small, but a small enough sample_number so that the training proceeds quickly.

In [93]:
sample_number = 20000
errors = np.zeros(100)
for i in range(100):
    trainers = np.random.randint(0, len(training_images), sample_number)  # Randomly select images from the training set (upper bound is exclusive)
    networks[min_i].resetError()
    for index in trainers:
        networks[min_i].identifyNumber(N1=training_images[index], trueValue=training_labels[index])
    errors[i] = networks[min_i].error / sample_number
print("Errors Standard Deviation: {}".format(np.std(errors)))

Errors Standard Deviation: 0.006762171728391975
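
To see why a larger `sample_number` steadies the measurement above, the sketch below repeats the experiment on made-up data. The per-image errors are random numbers (an assumption standing in for a fixed network's results), and `spread_of_mean` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical per-image errors standing in for a fixed network's results
per_image_error = rng.random(66000)

def spread_of_mean(sample_number, repeats=100):
    """Standard deviation of the mean error across repeated random samples."""
    means = []
    for _ in range(repeats):
        picks = rng.integers(0, per_image_error.size, size=sample_number)
        means.append(per_image_error[picks].mean())
    return float(np.std(means))

small = spread_of_mean(500)     # noisier estimate
large = spread_of_mean(20000)   # steadier estimate
```

The spread of a sample mean shrinks roughly with the square root of the sample size, so going from 500 to 20,000 images should cut it by a factor of about six.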


#### We also need to choose an appropriate range to vary values over. If we keep the same set of training images but randomize some of the variables, we can see how much the error varies under randomization. This value should be greater than the standard deviation above, but ideally not too large. Maybe about twice as large. This will ensure that a new network will likely only outperform the current champion based on actual improvement in the weights and biases, not because of the variation between training sample subsets.

In [106]:
target_error = 0.5
min_val = 0.52
trainers = np.random.randint(0, len(training_images), sample_number)  # upper bound is exclusive
v_errors = np.zeros(100)
for i in range(100):
    network = networks[min_i].clone()
    network.variability = (1 - target_error/min_val)**2*400
    network.randomize()
    for index in trainers:
        network.identifyNumber(N1=training_images[index], trueValue=training_labels[index])
    v_errors[i] = network.error / sample_number
print("Errors Standard Deviation: {}".format(np.std(v_errors)))

Errors Standard Deviation: 0.009228342802777645
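
The variability in this notebook follows the expression `(1 - target_error/min_val)**2*400`. A small sketch makes its behaviour easier to inspect; the function name `variability_for` and the `scale` parameter are hypothetical.

```python
def variability_for(min_val, target_error, scale=400.0):
    """Mutation size from the notebook's expression: it shrinks toward
    zero as the best error approaches the target."""
    return (1.0 - target_error / min_val) ** 2 * scale

for err in (3.0, 1.0, 0.6, 0.52):
    print(err, variability_for(err, target_error=0.5))
```

With the values used above (`min_val = 0.52`, `target_error = 0.5`) the expression evaluates to roughly 0.59, and it reaches exactly zero when the best error equals the target.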


#### Now that the network is initialized, run the cell below to optimize until a target error value is reached

#### I have also set the variability to start large and diminish as the networks converge to the target error

In [108]:
target_error = .5  # The algorithm will stop once the error is equal to or less than this threshold
while(min_val > target_error):
    W1 = networks[min_i].W1
    W2 = networks[min_i].W2
    W3 = networks[min_i].W3
    B2 = networks[min_i].B2
    B3 = networks[min_i].B3
    B4 = networks[min_i].B4
    for i in range(len(networks)):
        networks[i] = NeuralNetwork(W1=W1, W2=W2, W3=W3, B2=B2, B3=B3, B4=B4,
                                    variability = (1 - target_error/min_val)**2*400)
    trainers = np.random.randint(0, len(training_images), sample_number) # Get new training data with each iteration (upper bound is exclusive)
    for i, network in enumerate(networks):
        if (i != 0):
            network.randomize()
        for index in trainers:
            network.identifyNumber(N1=training_images[index], trueValue=training_labels[index])
    min_i = 0
    min_val = networks[0].error
    for i, network in enumerate(networks):
        if network.error < min_val:
            min_i = i
            min_val = network.error
    min_val = min_val / sample_number
    print("Best Error, index: {}, {}".format(min_val, min_i))

Best Error, index: 0.5083064473964579, 0
Best Error, index: 0.5145689416729682, 40
Best Error, index: 0.5203577393245329, 0
Best Error, index: 0.5161638467304877, 0
Best Error, index: 0.5183822713342823, 0
Best Error, index: 0.5157293719696647, 0
Best Error, index: 0.5112078546793009, 0
Best Error, index: 0.5119283616098627, 0
Best Error, index: 0.5213163063343244, 0
Best Error, index: 0.5266270184091586, 0
Best Error, index: 0.520833121278936, 0
Best Error, index: 0.5174301434165369, 0
Best Error, index: 0.5244092991751279, 0
Best Error, index: 0.5224062264613547, 0
Best Error, index: 0.5056841157689633, 0
Best Error, index: 0.5283783233130219, 28
Best Error, index: 0.5177785502842542, 0
Best Error, index: 0.5197023094410425, 0
Best Error, index: 0.516102610991342, 0
Best Error, index: 0.5148799982330906, 0
Best Error, index: 0.5128952622793153, 0
Best Error, index: 0.5189479113321548, 0
Best Error, index: 0.5241842220385529, 0
Best Error, index: 0.5265243748521553, 0
Best Error, inde

KeyboardInterrupt: 

#### After a full day of number-crunching, the algorithm failed to reach a target error of 0.5, so I stopped it. The best performer is still a fairly well-trained network, so let's test its accuracy on the testing set.

In [124]:
mistakes = 0
set_size = testing_images.shape[0]
networks[min_i].resetError()
for image, label in zip(testing_images, testing_labels):
    guess, probs, error = networks[min_i].identifyNumber(N1=image, trueValue=label)
    if(label == 99):  # These are the random images, the network should have low confidence all around
        if(np.max(probs)>0.5): mistakes = mistakes + 1  # high confidence in any numeral counts as an error
    else:  # These are real images, the guess should match the label
        if(guess != label): mistakes = mistakes + 1
print("Total Mistakes: {} out of {}".format(mistakes, set_size))
print("Error: {}".format(networks[min_i].error / set_size))

Total Mistakes: 1987 out of 11000
Error: 0.5011689078124036


#### Now, 1987 mistakes out of 11000 (about 18%) isn't bad, but a more directed optimization algorithm might improve the network faster than the random walk used above. Proper gradient descent finds the direction of steepest descent in the full multi-dimensional parameter space and steps along it. A cheap, easy-to-code substitute looks at one parameter at a time instead: estimate how the error changes when that parameter is nudged up or down, keep the change if it reduces the error, then move on to the next parameter. To remove any randomness from this walk toward the minimum, we will use the entire training set.

#### Although this algorithm is guaranteed to proceed only to lower error, never taking a step backward, it will take an incredibly long time to complete even one cycle. This is because of the enormous number of parameters. W1, for example, includes 784*16=12544 individual weights. 
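
For a sense of scale, the full parameter count follows directly from the layer sizes. A short sketch (variable names are illustrative):

```python
layer_sizes = [784, 16, 16, 10]

# One weight per connection between consecutive layers
weights = sum(n_in * n_out for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))
# One bias per node in every layer after the input
biases = sum(layer_sizes[1:])

print(weights, biases, weights + biases)  # 12960 42 13002
```

Testing a single parameter this way means running three networks (unchanged, increased, decreased) over all 66,000 training images, which is why even one full cycle is so expensive.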

#### First, we save the work from above. Then, initialize a best_network variable to use in the new algorithm (we won't be using 50 individuals per generation any more from here onward).

In [117]:
networks[min_i].resetError()
savefile = os.path.join("Data", "Best.pkl")
with open(savefile, 'wb') as output:
    pickle.dump(networks[min_i], output)
best_network = networks[min_i]

#### The cell below is designed to be run over and over again until a cycle completes without changing any of the parameters (because they are at a local minimum). Then, either make the step size smaller or take the network as good enough.

In [125]:
step_size = 0.01  # This is a relative change in the parameter magnitude
changes = 0
verbose = True  # Set this True to print every improvement. Set it False to run silently until the end
# W1 is 784x16 -- This is simply too big to go over every weight. But we can't simply ignore it either
#                 For now, randomly select 200 weights and hope they make some difference
i_vals = np.random.randint(low=0, high=784, size=200)  # high is exclusive, so this covers rows 0-783
j_vals = np.random.randint(low=0, high=16, size=200)   # and this covers columns 0-15
for num in range(200):
    i = i_vals[num]
    j = j_vals[num]
    weight = best_network.W1[i][j]
    best_network.resetError()
    increase = best_network.clone()
    increase.W1[i][j] = (1+step_size)*weight
    decrease = best_network.clone()
    decrease.W1[i][j] = (1-step_size)*weight
    for image, label in zip(training_images, training_labels):
        best_network.identifyNumber(N1=image, trueValue=label)
        increase.identifyNumber(N1=image, trueValue=label)
        decrease.identifyNumber(N1=image, trueValue=label)
    if(increase.error < best_network.error and increase.error < decrease.error):
        best_network = increase
        changes = changes + 1
        if verbose: print("Error: {}".format(best_network.error))
    elif(decrease.error < best_network.error and decrease.error < increase.error):
        best_network = decrease
        changes = changes + 1
        if verbose: print("Error: {}".format(best_network.error))
# W2 is 16x16
for i in range(16):
    for j in range(16):
        weight = best_network.W2[i][j]
        best_network.resetError()
        increase = best_network.clone()
        increase.W2[i][j] = (1+step_size)*weight
        decrease = best_network.clone()
        decrease.W2[i][j] = (1-step_size)*weight
        for image, label in zip(training_images, training_labels):
            best_network.identifyNumber(N1=image, trueValue=label)
            increase.identifyNumber(N1=image, trueValue=label)
            decrease.identifyNumber(N1=image, trueValue=label)
        if(increase.error < best_network.error and increase.error < decrease.error):
            best_network = increase
            changes = changes + 1
            if verbose: print("Error: {}".format(best_network.error))
        elif(decrease.error < best_network.error and decrease.error < increase.error):
            best_network = decrease
            changes = changes + 1
            if verbose: print("Error: {}".format(best_network.error))
# W3 is 16x10
for i in range(16):
    for j in range(10):
        weight = best_network.W3[i][j]
        best_network.resetError()
        increase = best_network.clone()
        increase.W3[i][j] = (1+step_size)*weight
        decrease = best_network.clone()
        decrease.W3[i][j] = (1-step_size)*weight
        for image, label in zip(training_images, training_labels):
            best_network.identifyNumber(N1=image, trueValue=label)
            increase.identifyNumber(N1=image, trueValue=label)
            decrease.identifyNumber(N1=image, trueValue=label)
        if(increase.error < best_network.error and increase.error < decrease.error):
            best_network = increase
            changes = changes + 1
            if verbose: print("Error: {}".format(best_network.error))
        elif(decrease.error < best_network.error and decrease.error < increase.error):
            best_network = decrease
            changes = changes + 1
            if verbose: print("Error: {}".format(best_network.error))
# B2 is 16
for i in range(16):
    bias = best_network.B2[i]
    best_network.resetError()
    increase = best_network.clone()
    increase.B2[i] = (1+step_size)*bias
    decrease = best_network.clone()
    decrease.B2[i] = (1-step_size)*bias
    for image, label in zip(training_images, training_labels):
        best_network.identifyNumber(N1=image, trueValue=label)
        increase.identifyNumber(N1=image, trueValue=label)
        decrease.identifyNumber(N1=image, trueValue=label)
    if(increase.error < best_network.error and increase.error < decrease.error):
        best_network = increase
        changes = changes + 1
        if verbose: print("Error: {}".format(best_network.error))
    elif(decrease.error < best_network.error and decrease.error < increase.error):
        best_network = decrease
        changes = changes + 1
        if verbose: print("Error: {}".format(best_network.error))
# B3 is 16
for i in range(16):
    bias = best_network.B3[i]
    best_network.resetError()
    increase = best_network.clone()
    increase.B3[i] = (1+step_size)*bias
    decrease = best_network.clone()
    decrease.B3[i] = (1-step_size)*bias
    for image, label in zip(training_images, training_labels):
        best_network.identifyNumber(N1=image, trueValue=label)
        increase.identifyNumber(N1=image, trueValue=label)
        decrease.identifyNumber(N1=image, trueValue=label)
    if(increase.error < best_network.error and increase.error < decrease.error):
        best_network = increase
        changes = changes + 1
        if verbose: print("Error: {}".format(best_network.error))
    elif(decrease.error < best_network.error and decrease.error < increase.error):
        best_network = decrease
        changes = changes + 1
        if verbose: print("Error: {}".format(best_network.error))
# B4 is 10
for i in range(10):
    bias = best_network.B4[i]
    best_network.resetError()
    increase = best_network.clone()
    increase.B4[i] = (1+step_size)*bias
    decrease = best_network.clone()
    decrease.B4[i] = (1-step_size)*bias
    for image, label in zip(training_images, training_labels):
        best_network.identifyNumber(N1=image, trueValue=label)
        increase.identifyNumber(N1=image, trueValue=label)
        decrease.identifyNumber(N1=image, trueValue=label)
    if(increase.error < best_network.error and increase.error < decrease.error):
        best_network = increase
        changes = changes + 1
        if verbose: print("Error: {}".format(best_network.error))
    elif(decrease.error < best_network.error and decrease.error < increase.error):
        best_network = decrease
        changes = changes + 1
        if verbose: print("Error: {}".format(best_network.error))
print("Changes: {}".format(changes))
print("Error: {}".format(best_network.error))

KeyboardInterrupt: 
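
The per-parameter search in the cell above repeats the same pattern for every weight and bias array. As a minimal sketch of that pattern on a toy problem (the helper `coordinate_search` and the toy loss are assumptions, not the notebook's code):

```python
import numpy as np

def coordinate_search(loss, params, step_size=0.01, max_cycles=1000):
    """Scale each parameter up and down by step_size, keep the better
    change if it lowers the loss, and stop after a cycle with no changes."""
    params = np.array(params, dtype=float)
    best = loss(params)
    for _ in range(max_cycles):
        changes = 0
        for i in range(params.size):
            original = params[i]
            trials = []
            for factor in (1 + step_size, 1 - step_size):
                params[i] = original * factor
                trials.append((loss(params), params[i]))
            trial_loss, trial_value = min(trials)
            if trial_loss < best:          # accept only strict improvements
                best, params[i] = trial_loss, trial_value
                changes += 1
            else:
                params[i] = original       # otherwise restore the old value
        if changes == 0:
            break
    return params, best

target = np.array([1.0, -2.0])
toy_loss = lambda p: float(np.sum((p - target) ** 2))
found, found_loss = coordinate_search(toy_loss, [0.5, -0.5])
```

Because each step is a relative change, a parameter that is exactly zero never moves and no parameter can change sign, so this kind of search can only refine values that already have the right sign.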

#### After all that work, save the best network so it can be reused in another session

In [122]:
best_network.resetError()
savefile = os.path.join("Data", "Best.pkl")
with open(savefile, 'wb') as output:
    pickle.dump(best_network, output)

#### And run it against the testing set to see if the improvement is meaningful

In [123]:
mistakes = 0
set_size = testing_images.shape[0]
best_network.resetError()
for image, label in zip(testing_images, testing_labels):
    guess, probs, error = best_network.identifyNumber(N1=image, trueValue=label)
    if(label == 99):  # These are the random images, the network should have low confidence all around
        if(np.max(probs)>0.5): mistakes = mistakes + 1  # high confidence in any numeral counts as an error
    else:  # These are real images, the guess should match the label
        if(guess != label): mistakes = mistakes + 1
print("Total Mistakes: {} out of {}".format(mistakes, set_size))
print("Error: {}".format(best_network.error / set_size))

Total Mistakes: 1987 out of 11000
Error: 0.5011689078124036


#### Test the champion network on one of my hand-drawn numbers

In [173]:
im = cv2.imread(os.path.join("Data","One2.png"), 0)
im = np.ndarray.flatten(im)/255.0
print(best_network.identifyNumber(N1=im, trueValue=1))
best_network.resetError()

(8, array([  1.77460565e-03,   5.38313184e-06,   1.11935450e-03,
         1.17980550e-02,   3.67978335e-02,   4.91742404e-02,
         8.40655320e-03,   1.18378643e-02,   9.64306930e-02,
         1.42682652e-02]), 443.75654166911278)


## Below are cells used as calculators for my convenience. They can be ignored

In [156]:
min_i = 0
min_val=300

In [115]:
784*16

12544

In [127]:
step_size = 0.5
i = 3
j = 6
weight = best_network.W1[i][j]
best_network.resetError()
increase = best_network.clone()
increase.W1[i][j] = (1+step_size)*weight
decrease = best_network.clone()
decrease.W1[i][j] = (1-step_size)*weight
for image, label in zip(training_images, training_labels):
    best_network.identifyNumber(N1=image, trueValue=label)
    increase.identifyNumber(N1=image, trueValue=label)
    decrease.identifyNumber(N1=image, trueValue=label)
print("Errors: {}, {}, {}".format(best_network.error, increase.error, decrease.error))

Errors: 34392.923737694226, 34392.923737694226, 34392.923737694226


In [129]:
increase.W1[i][j] = 4
print(increase.W1[i][j])
print(decrease.W1[i][j])
print(best_network.W1[i][j])

4.0
4.0
4.0
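
Three identical values like those printed above are what NumPy produces when several objects hold references to the same array rather than independent copies. Whether that is what happened in this session is an assumption (the `clone` method shown earlier does call `np.copy`), but the behaviour itself is easy to reproduce:

```python
import numpy as np

original = np.zeros((2, 2))
alias = original                  # same underlying array, no data copied
independent = np.copy(original)   # separate array with the same values

alias[0][1] = 4
print(original[0][1])     # 4.0, because alias and original share memory
print(independent[0][1])  # 0.0, the copy is unaffected
```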
