
#**Our First Neural Network**

We are going to import the MNIST dataset and train our first neural network! Don't worry too much about the arguments and parameters we specify when we get to the neural net piece; we will go over those subsequently.

#*Data and Library Imports*

In [1]:
from tensorflow.keras.datasets import mnist
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from PIL import Image as im
from IPython.display import Image

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step


We have 60,000 training images, and each is made up of 28x28 = 784 pixels.

In [None]:
train_images.shape

Pixels take on values between 0 and 255.

In [None]:
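# Flatten all images into one long vector of pixel values
train_vector = np.reshape(train_images,-1)

plt.hist(train_vector, bins=256)
# plt.title is a function; assigning a string to it would overwrite it
plt.title("Histogram of Pixel Values")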
plt.show()

pd.DataFrame(train_vector).describe()

Let's see what one of these arrays looks like as a picture...

In [None]:
plt.imshow(train_images[0],cmap=plt.cm.binary)
plt.show()

Every image array has a single label associated with it, an integer between 0 and 9.

In [None]:
plt.hist(train_labels,bins=10)
plt.show()

pd.DataFrame(np.reshape(train_labels,-1)).describe()

#*Train a Neural Net*

We will instantiate our first neural network. We begin by loading the Keras library, specifying the structure of each layer in the network, and indicating what activation function we will use in each layer.

In [None]:
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(512, activation="relu"),
    layers.Dense(10, activation="softmax")
])
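
To get a feel for the size of this network, the sketch below counts its trainable parameters by hand. It assumes the input is a flattened 28x28 image (784 features), which is how the data is prepared later in this notebook. The helper name dense_param_count is made up for this illustration and is not part of Keras.

```python
# Hypothetical helper: a Dense layer has one weight per (input, unit) pair,
# plus one bias per unit.
def dense_param_count(n_inputs, n_units):
    return n_inputs * n_units + n_units

n_features = 28 * 28  # assumed input size: one flattened MNIST image

hidden_params = dense_param_count(n_features, 512)  # first Dense layer
output_params = dense_param_count(512, 10)          # softmax output layer
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)
```

Once the model has seen data of this shape, model.summary() should report the same totals.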

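Next, we will set some configuration parameters: which optimization algorithm to use (RMSProp), which loss function to minimize (sparse categorical cross-entropy, i.e. multiclass cross-entropy computed from integer labels), and which metric to report during training and evaluation (accuracy). Note that the optimizer minimizes the loss; accuracy is only monitored.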

In [None]:
model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
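
To make the loss function a little less mysterious, here is a minimal NumPy sketch of how sparse categorical cross-entropy can be computed by hand. The probabilities and labels are made up for illustration (3 images, 4 classes), and the sketch assumes the per-image losses are simply averaged.

```python
import numpy as np

# Made-up predicted probabilities for 3 images over 4 classes (each row sums to 1).
probs = np.array([
    [0.70, 0.10, 0.10, 0.10],
    [0.25, 0.25, 0.25, 0.25],
    [0.05, 0.05, 0.80, 0.10],
])
# Integer labels, one per image ("sparse" means the labels are not one-hot encoded).
labels = np.array([0, 3, 2])

# Pick out the probability the model assigned to the true class of each image.
true_class_probs = probs[np.arange(len(labels)), labels]

# Cross-entropy per image is the negative log of that probability.
per_image_loss = -np.log(true_class_probs)
mean_loss = per_image_loss.mean()

print(per_image_loss)
print(mean_loss)
```

The second image, where the model is unsure (0.25 on every class), contributes a larger loss than the first and third, where most of the probability sits on the correct class.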

Finally, we need to reformat the data. We need to convert the values into floats (fractional values), scaled to the 0-1 range. Further, we need to reshape each of the 28x28 arrays into individual vectors of length 784.

In [None]:
train_images = train_images.reshape(len(train_images),28*28)
train_images = train_images.astype(float)/255
test_images = test_images.reshape(len(test_images),28*28)
test_images = test_images.astype(float)/255
print(train_images.shape)
test_images.shape

(60000, 784)


(10000, 784)

Now we can 'fit' the model to the training data. We will come back to what these arguments mean, but briefly: batch_size is the number of observations used in a single iteration of the optimization, and an epoch is a complete pass through the training data, i.e. enough iterations to 'cover' the entire sample (60000 / 128 = 468.75, so 469 batches per epoch in this case). Thus, 5 epochs means that we repeat the optimization procedure over the whole dataset 5 times.



In [None]:
model.fit(train_images, train_labels, epochs=5, batch_size=128)
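
The batch and epoch arithmetic for the call above can be worked out directly. This is just a back-of-the-envelope sketch using the numbers from this notebook; the variable names are ours.

```python
import math

n_train = 60000   # training images in MNIST
batch_size = 128
epochs = 5

# The final batch of each epoch is smaller than the rest, so round up.
batches_per_epoch = math.ceil(n_train / batch_size)
last_batch_size = n_train - (batches_per_epoch - 1) * batch_size
total_updates = batches_per_epoch * epochs

print(batches_per_epoch, last_batch_size, total_updates)
```

The batches-per-epoch figure is the step count you should see in the progress bar for each epoch.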

Now that we have fit the model, we can use it to generate predictions on the holdout data. Note that the output for each image is a vector of 10 values, one per class. These values are probabilities and sum to 1, so the index of the highest value is the most probable class.

In [None]:
predictions = model.predict(test_images)
predictions[1:5]
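
To see why each row of predictions sums to 1, here is a small NumPy sketch of the softmax function used in the output layer, followed by argmax to pick the most probable class. The input scores are random numbers made up for illustration, not real model output.

```python
import numpy as np

# Made-up raw scores for 2 images over 10 classes; not real model output.
rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 10))

# Softmax: exponentiate, then normalize each row so it sums to 1.
# Subtracting the row maximum first is a common trick for numerical stability.
shifted = logits - logits.max(axis=1, keepdims=True)
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

row_sums = probs.sum(axis=1)              # each entry should be 1 (up to rounding)
predicted_classes = probs.argmax(axis=1)  # index of the largest probability per row

print(row_sums)
print(predicted_classes)
```

Applied to this notebook's output, np.argmax(predictions, axis=1) gives one predicted digit per test image in the same way.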

As you can see, it's very accurate!

In [None]:
# Put the true labels and the predicted classes side by side
result = pd.DataFrame({"label": test_labels,
                       "predicted": np.argmax(predictions,axis=1)})
print(result[1:10])

We can use Keras's built-in evaluate method to return accuracy and loss pretty easily. Notice that the accuracy of predictions in the test data is lower than that in the training data.

In [None]:
test_loss, test_acc = model.evaluate(test_images, test_labels)
print("test_acc: ",test_acc)

test_acc: 0.9785
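
Accuracy is simply the share of images whose predicted class matches the true label, so it can also be computed by hand. The sketch below uses made-up labels and predictions for 8 images rather than real model output.

```python
import numpy as np

# Made-up labels and predictions for 8 images; not real model output.
true_labels = np.array([7, 2, 1, 0, 4, 1, 4, 9])
predicted = np.array([7, 2, 1, 0, 4, 1, 9, 9])

# True where the prediction matches the label, False otherwise.
correct = predicted == true_labels

# The mean of a boolean array is the fraction of True values.
accuracy = correct.mean()
print(accuracy)
```

With this notebook's variables, np.mean(np.argmax(predictions, axis=1) == test_labels) should land very close to the test_acc value reported by Keras.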