# A first look at a neural network

In [28]:
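# Use the Keras that ships with TensorFlow for both imports,
# so the model and the layers come from the same package
from tensorflow import keras
from tensorflow.keras import layers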

import pandas as pd
import numpy as np

We will now take a look at a first concrete example of a neural network, which uses the Python library Keras to learn to classify handwritten digits. Unless you already have experience with Keras or similar libraries, you will not understand everything about this first example right away. We start by loading the MNIST dataset that ships with Keras.

In [3]:
(train_images, train_labels), (test_images, test_labels) = keras.datasets.mnist.load_data()

`train_images` and `train_labels` form the "training set", the data that the model will learn from. The model will then be tested on the "test set", `test_images` and `test_labels`. Our images are encoded as NumPy arrays, and the labels are simply an array of digits, ranging from 0 to 9. There is a one-to-one correspondence between the images and the labels.

Let's check the training data

In [4]:
train_images.shape

(60000, 28, 28)

In [5]:
len(train_labels)

60000

In [6]:
train_labels

array([5, 0, 4, ..., 5, 6, 8], dtype=uint8)

Let's check the test data

In [7]:
test_images.shape

(10000, 28, 28)

In [8]:
len(test_labels)

10000

In [9]:
test_labels

array([7, 2, 1, ..., 4, 5, 6], dtype=uint8)

First we will present our neural network with the training data, `train_images` and `train_labels`. The network will then learn to associate images with labels.

Let's define the neural network architecture:

- The first two layers, `InputLayer` and `Flatten`, only receive and reshape the input (each 28x28 image becomes a vector of 784 values), so they have no trainable parameters.

- The next two layers are `Dense` layers with ReLU activation. They are densely connected (also called "fully connected") neural layers, and this is where the network learns weights on the input data.

- The last layer is again a `Dense` layer, but with softmax activation, which means it will return an array of 10 probability scores (summing to 1). Each score is the probability that the current digit image belongs to one of our 10 digit classes.
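
To make the shape flow concrete, here is a minimal NumPy sketch of a forward pass through the same layer sizes (784 → 32 → 32 → 10). It is only an illustration: the weights are random rather than trained, the input is a random array standing in for an image, and the helper names `relu` and `softmax` are our own, not part of the Keras API.

```python
import numpy as np

def relu(z):
    # ReLU keeps positive values and replaces negatives with zero
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtract the max before exponentiating for numerical stability
    e = np.exp(z - np.max(z))
    return e / e.sum()

rng = np.random.default_rng(0)
image = rng.random((28, 28))      # random stand-in for one 28x28 image

x = image.reshape(-1)             # Flatten: (28, 28) -> (784,)
sizes = [784, 32, 32, 10]
last = len(sizes) - 2             # index of the output layer

for i, (n_in, n_out) in enumerate(zip(sizes[:-1], sizes[1:])):
    W = rng.normal(0.0, 0.05, size=(n_in, n_out))  # random, untrained weights
    b = np.zeros(n_out)                            # one bias per unit
    z = x @ W + b
    x = softmax(z) if i == last else relu(z)

probs = x
print(probs.shape, probs.sum())   # ten scores that sum to 1 (up to rounding)
```

Whatever the weights are, the softmax output always contains 10 non-negative scores that sum to 1; training only changes which class gets the largest score.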


In [10]:
# Build a simple model with the Keras Functional API

inputs = keras.Input(shape=(28,28))

x = layers.Flatten()(inputs)
x = layers.Dense(32, activation="relu")(x)
x = layers.Dense(32, activation="relu")(x)
outputs = layers.Dense(10, activation="softmax")(x)

model = keras.Model(inputs, outputs)
model.summary()

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, 28, 28)]          0         
_________________________________________________________________
flatten (Flatten)            (None, 784)               0         
_________________________________________________________________
dense (Dense)                (None, 32)                25120     
_________________________________________________________________
dense_1 (Dense)              (None, 32)                1056      
_________________________________________________________________
dense_2 (Dense)              (None, 10)                330       
Total params: 26,506
Trainable params: 26,506
Non-trainable params: 0
_________________________________________________________________


Trainable params = 25,120 + 1,056 + 330 = 26,506 (each Dense layer has inputs × units weights plus one bias per unit)
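
The `Param #` column of the summary can be reproduced by hand. The sketch below assumes the usual Dense layer layout of one weight per (input, unit) pair plus one bias per unit; `dense_params` is a helper name introduced here for illustration.

```python
def dense_params(n_inputs, n_units):
    # One weight per (input, unit) pair, plus one bias per unit
    return n_inputs * n_units + n_units

# Flattened input size followed by the units of each Dense layer
layer_sizes = [28 * 28, 32, 32, 10]

counts = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(counts)

print(counts, total)  # [25120, 1056, 330] 26506
```

These numbers match the three Dense rows and the total reported by `model.summary()`.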


To make our network ready for training, we need to pick three more things as part of the "compilation" step:

- A loss function: this is how the network will be able to measure how good a job it is doing on its training data, and thus how it will be able to steer itself in the right direction.
- An optimizer: this is the mechanism through which the network will update itself based on the data it sees and its loss function.
- Metrics to monitor during training and testing. Here we will only care about accuracy (the fraction of the images that were correctly classified).
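
As a rough illustration of what the loss and the metric measure, here is a simplified NumPy version of sparse categorical cross-entropy and accuracy on three made-up predictions. The function names mirror the Keras string identifiers but are written here by hand; the Keras implementation also handles details such as clipping probabilities, which this sketch leaves out.

```python
import numpy as np

def sparse_categorical_crossentropy(probs, labels):
    # For each sample, take the probability assigned to the true class,
    # then average the negative log of those probabilities
    p_true = probs[np.arange(len(labels)), labels]
    return float(np.mean(-np.log(p_true)))

def accuracy(probs, labels):
    # Fraction of samples whose highest-scoring class is the true class
    return float(np.mean(np.argmax(probs, axis=1) == labels))

# Made-up predictions for 3 samples and 3 classes (each row sums to 1)
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.4, 0.3]])
labels = np.array([0, 1, 2])  # integer labels, as in train_labels

loss = sparse_categorical_crossentropy(probs, labels)
acc = accuracy(probs, labels)
print(loss, acc)
```

The third sample is misclassified (class 1 scores highest, the true class is 2), so the accuracy is 2/3, and its low probability for the true class contributes most of the loss.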

In [11]:
# Compile the model
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=['accuracy'])

In [16]:
# Train the model for 10 epochs from NumPy data
batch_size = 64

history = model.fit(train_images, train_labels, batch_size=batch_size, epochs=10)


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
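
Each epoch is one full pass over the training set, split into mini-batches. A small sketch of the arithmetic, assuming the last, smaller batch is kept (`steps_per_epoch` is a helper name introduced here):

```python
import math

def steps_per_epoch(n_samples, batch_size):
    # The final batch may be smaller than batch_size, so round up
    return math.ceil(n_samples / batch_size)

train_steps = steps_per_epoch(60000, 64)  # training set with batch_size=64
test_steps = steps_per_epoch(10000, 32)   # test set with the Keras default batch size of 32

print(train_steps, test_steps)  # 938 313
```

So every epoch performs 938 weight updates, and the `313/313` counter printed by `model.predict` on the test set corresponds to 313 batches of up to 32 images.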


In [33]:
test_loss, test_acc = model.evaluate(test_images, test_labels)



Our test set accuracy turns out to be 92.9%, a bit lower than the training set accuracy.

This concludes our very first example: you just saw how we could build and train a neural network to classify handwritten digits.


In [18]:
test_images[0].reshape(1, 28, 28).shape

(1, 28, 28)

In [25]:
test_labels.shape

(10000,)

In [32]:
# let's do prediction on one test example

np.argmax(model.predict(test_images[0].reshape(1, 28, 28)))

7
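
The reshape is there because the model expects a batch of images, even when the batch holds a single image, and `np.argmax` then turns the 10 probabilities into a digit. A small NumPy sketch with placeholder data (the zero image and the probabilities below are made up, not model output):

```python
import numpy as np

image = np.zeros((28, 28), dtype="uint8")  # placeholder standing in for test_images[0]

# Three equivalent ways to add a leading batch dimension
batch_a = image.reshape(1, 28, 28)
batch_b = image[np.newaxis, ...]
batch_c = np.expand_dims(image, axis=0)

# Made-up softmax output for one image: one row, ten class probabilities
fake_probs = np.array([[0.01, 0.01, 0.02, 0.01, 0.01, 0.01, 0.01, 0.90, 0.01, 0.01]])

# The predicted digit is the index of the largest probability
predicted_digit = int(np.argmax(fake_probs, axis=1)[0])
print(batch_a.shape, predicted_digit)  # (1, 28, 28) 7
```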

In [29]:
# prediction on all test examples
preds = model.predict(test_images, verbose=2)  # shape -> (10000, 10): one probability per class

# The predicted label is the index of the highest probability in each row
predicted_labels = np.argmax(preds, axis=1)

sub_df = pd.DataFrame()
sub_df["True Label"] = test_labels
sub_df["Predicted Label"] = predicted_labels


313/313 - 0s


In [31]:
sub_df.head()

|   | True Label | Predicted Label |
|---|------------|-----------------|
| 0 | 7          | 7               |
| 1 | 2          | 2               |
| 2 | 1          | 1               |
| 3 | 0          | 0               |
| 4 | 4          | 4               |
