## Part 4: Testing Our PyTorch model

We'll start by loading the neural network that we trained in the last tutorial. Reusing the saved model saves time: only a few minutes of training in this example, but much more once we train on neutrino interactions.

In [None]:
# import libraries and some functions that are written for you 
from handwrite_functions import *
model = torch.load('./my_mnist_model.pt')
model.eval()  # evaluation mode: we are only testing here, not training

Again, we'll load the datasets as before. We'll also define a path to save our confusion matrix figure.

In [None]:
# folders where your datasets were saved in the last tutorial
dataset = 'MY_DATASET'
testset = 'MY_TESTSET'

# where you want to save the picture of your confusion matrix
savepic = "confusionmatrix.png"

transform = transforms.Compose([transforms.ToTensor(),
                              transforms.Normalize((0.5,), (0.5,)),
                              ])

# load training and test sets from disk (download=False, so they must already be there)
trainset = datasets.MNIST(dataset, download=False, train=True, transform=transform)
valset = datasets.MNIST(testset, download=False, train=False, transform=transform)

# load training sets and test sets in batch sizes of 64
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
valloader = torch.utils.data.DataLoader(valset, batch_size=64, shuffle=True)

# prepare loaded data sets to iterate over
dataiter = iter(trainloader)
images, labels = next(dataiter)

Time to test our NN on a single image! Do you ever see it guessing incorrectly? If not, what is the lowest confidence you can see?

In [None]:
# pick an image from the batch below (index 0-63, since batch_size=64)
data_num = 44
images, labels = next(iter(valloader))
img = images[data_num].view(1, 784)
plt.imshow(images[data_num].numpy().squeeze(), cmap='gray_r')
with torch.no_grad():
    logps = model(img)

ps = torch.exp(logps)
probab = list(ps.numpy()[0])
print("Predicted Digit =", probab.index(max(probab)))
print(f"How sure?: {max(probab) * 100:.4} %")
# Print out the original shape and the new, flattened shape
print(f"Original shape: {images[data_num].shape}")
print(f"New shape: {img.shape}")
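
The cell above turns the model's output into probabilities with `torch.exp`. That only makes sense if the model outputs log-probabilities, which the variable name `logps` suggests but which is an assumption here, since the model definition lives in the previous tutorial. The following is a minimal sketch of the same arithmetic in plain Python, using made-up scores rather than real model output; `log_softmax` is a hypothetical helper written for this illustration.

```python
import math


def log_softmax(scores):
    """Turn raw scores into log-probabilities (numerically stable form)."""
    shift = max(scores)
    log_total = shift + math.log(sum(math.exp(s - shift) for s in scores))
    return [s - log_total for s in scores]


# Made-up raw scores for the ten digits (NOT real model output); index 3 is largest.
scores = [0.1, -1.2, 0.3, 4.0, 0.0, 1.5, -0.7, 0.2, 2.1, -0.3]
logps = log_softmax(scores)

# Undo the log, which is what torch.exp(logps) does in the cell above.
probs = [math.exp(lp) for lp in logps]

predicted_digit = probs.index(max(probs))  # position of the most probable class
confidence = max(probs)                    # probability assigned to that class

print("Predicted Digit =", predicted_digit)
print(f"How sure?: {confidence * 100:.4} %")
print("Probabilities sum to", round(sum(probs), 6))
```

Because the probabilities sum to one, the "How sure?" number is the share of the total probability that the NN assigns to its top guess.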

I personally haven't seen an instance where it provides the wrong output, but let's test it on the entirety of our test set (10,000 images). 

In [None]:
# now, let's have the code automatically look through our "testing" dataset and let us know how many it gets right. 
y_test = []
predictions = []
correct_count, all_count = 0, 0
for images, labels in valloader:
    for i in range(len(labels)):
        img = images[i].view(1, 784)
        with torch.no_grad():
            logps = model(img)

        ps = torch.exp(logps)
        probab = list(ps.numpy()[0])
        pred_label = probab.index(max(probab))
        predictions.append(pred_label)
        true_label = labels.numpy()[i]
        y_test.append(true_label)
        if true_label == pred_label:
            correct_count += 1
        all_count += 1

print("Number Of Images Tested =", all_count)
print("\nModel Accuracy =", (correct_count/all_count))



Alright, it looks pretty good! 

---
## Confusion matrix

A useful next step is to see how often the NN mistakes similar-looking digits, such as predicting a $9$ when the label, or answer key, says that it's a $4$. It's called a "confusion" matrix because it shows which digits the NN confuses with one another.

It's nice to display this information in a plot. This has been coded for you, but it is important to know how to make one. We'll start with a 2-d tensor filled with zeros:

\begin{matrix}
\begin{bmatrix}
9 \\
8 \\
7 \\
6 \\
5 \\
4 \\
3 \\
2 \\
1 \\
0 \\
\end{bmatrix}
\quad
\begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\end{bmatrix}
\end{matrix}

\begin{bmatrix}
0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9
\end{bmatrix}

This tensor is 10 places by 10 places. The horizontal axis holds the truth values for each digit, and the vertical axis holds the NN's predicted values. I've added these labels in blocks below and to the left of our matrix. Note that the vertical axis runs from $0$ at the bottom to $9$ at the top, as it would on a plot.

Now all that's left is to test our NN and record the results. Imagine that we provide a test image of a $5$ and it predicts the value to be a $6$. We would add one to the cell that corresponds to `truth == 5` and `predicted == 6`, which is the column for $5$ and the row for $6$ (counting rows from the bottom, starting at $0$).


\begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\end{bmatrix}


Now, if we give it a test image of a $7$ and it predicts it to be a $7$, we'll add 1 to the corresponding cell. 

\begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\end{bmatrix}
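
The counting procedure described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the implementation of `comp_confmat` (which is provided in `handwrite_functions` and not shown here). The names `build_confusion_matrix` and `as_displayed` are hypothetical, and storing the counts as `matrix[predicted][truth]` is an assumption chosen to match the axes described in the text.

```python
def build_confusion_matrix(y_true, y_pred, n_classes=10):
    """Count (truth, prediction) pairs; matrix[predicted][truth] holds each count."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for truth, predicted in zip(y_true, y_pred):
        matrix[predicted][truth] += 1
    return matrix


def as_displayed(matrix):
    """Reverse the row order so predicted digit 0 ends up in the bottom row."""
    return matrix[::-1]


# The two worked examples: a 5 mistaken for a 6, and a 7 recognized correctly.
y_true = [5, 7]
y_pred = [6, 7]

cm = build_confusion_matrix(y_true, y_pred)

# Print with the row for predicted digit 9 first and predicted digit 0 last,
# which is the layout the example matrices above use.
for row in as_displayed(cm):
    print(" ".join(str(count) for count in row))
```

Printed this way, the two entries land in the same cells as in the last example matrix. Passing the `y_test` and `predictions` lists from the accuracy loop instead of the two-element lists would fill in the full matrix.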


Let's print the confusion matrix from our NN and see what it looks like.


In [None]:
confusion_matrix = comp_confmat(y_test, predictions)
print(confusion_matrix)

In [None]:
plot_confusion_matrix(confusion_matrix, savepic)

Activities: 

- What does a confusion matrix with no "confusion" look like?
- What should the sum of all the cells be? Show how you would calculate that.
- Which numbers confused the NN most? Which numbers would confuse you if you were just learning to recognize digits?
- Why does my plotted confusion matrix look different from the example one in the tutorial? Which is better?