This example is similar to, although not identical to, examples given in Chapter 2 of the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). 

# Non-CNN fully connected network

This first example builds a simple fully connected network. Its purpose is both to investigate the network topology and its abilities and to (re-)familiarise you with how to construct such a network in Keras.

This example uses the Keras functional API, which is more flexible than the Sequential API employed in COMP8270, although hopefully it will not prove more challenging to learn.

The network will again use a character recognition task, but this time using greyscale numerals from the standard MNIST dataset. This first cell loads the dataset for us to use.

In [19]:
from tensorflow.keras.datasets import mnist

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

Note that each image consists of 28 x 28 pixels. There are 60000 training images.

In [20]:
train_images.shape

(60000, 28, 28)

# Defining the network architecture #

This is the first section you need to write yourselves.

The workshop script takes you through what you need to do. Note, however, that the **Inputs** are defined for you. Your task is to fill in the missing sections to define the layers you will need to employ.

Note that the section concludes by printing out a summary of the model that has been defined.

In [21]:
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(784,))  # 28 x 28 greyscale images, flattened to 784 values

first = layers.Dense(784, activation="sigmoid")(inputs)
second = layers.Dense(784, activation="sigmoid")(first)
outputs = layers.Dense(10, activation="softmax")(second)  # 10 output classes

model = keras.Model(inputs=inputs, outputs=outputs)

model.summary()
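
To make the summary easier to interpret, the following is a minimal NumPy sketch (not Keras code) of what three stacked dense layers of these sizes compute and how many trainable parameters they hold. It assumes the usual dense-layer form of one weight per input-output pair plus one bias per unit; the helper names `dense_params`, `sigmoid` and `softmax`, and the random weights, are illustrative only.

```python
import numpy as np

layer_sizes = [784, 784, 784, 10]  # input, two hidden layers, output

def dense_params(n_in, n_out):
    # One weight per input-output pair plus one bias per output unit.
    return n_in * n_out + n_out

pairs = list(zip(layer_sizes[:-1], layer_sizes[1:]))
counts = [dense_params(a, b) for a, b in pairs]
total_params = sum(counts)
print("parameters per layer:", counts)
print("total parameters:", total_params)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Random weights stand in for an untrained network (assumption for illustration).
rng = np.random.default_rng(0)
weights = [rng.normal(0.0, 0.05, size=(a, b)) for a, b in pairs]
biases = [np.zeros(b) for _, b in pairs]

# Forward pass for two fake flattened images.
x = rng.random((2, 784))
h = x
for W, b in zip(weights[:-1], biases[:-1]):
    h = sigmoid(h @ W + b)
probs = softmax(h @ weights[-1] + biases[-1])
print(probs.shape, probs.sum(axis=1))
```

The per-layer counts computed here can be compared against the "Param #" column printed by `model.summary()`.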



# Compiling the model #

The next stage is to compile the model using an optimiser and an error calculation (the loss function). Follow the script to deduce what to put here.

In [22]:
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
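
As a rough illustration of what the chosen loss measures, the sketch below computes sparse categorical cross-entropy by hand in NumPy: for each sample it takes the negative log of the probability assigned to the true class, where the labels are plain integers rather than one-hot vectors. The function name and the example probabilities are made up for illustration; this is not the Keras implementation.

```python
import numpy as np

def sparse_categorical_crossentropy(probs, labels):
    # Select, for each row, the probability predicted for the true class.
    picked = probs[np.arange(len(labels)), labels]
    return -np.log(picked)

# Row 0: a uniform guess over 10 classes.
# Row 1: a confident prediction for class 3 (0.91 + 9 * 0.01 = 1.0).
uniform = np.full(10, 0.1)
confident = np.full(10, 0.01)
confident[3] = 0.91
probs = np.stack([uniform, confident])
labels = np.array([7, 3])  # integer class labels, not one-hot vectors

losses = sparse_categorical_crossentropy(probs, labels)
print(losses)         # the uniform guess gives ln(10), about 2.3026
print(losses.mean())  # a batch loss is typically the mean over samples
```

A loss close to ln(10) on a ten-class problem is therefore what a network guessing uniformly would score.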


# Preparing the image data #

This cell prepares the image data for presentation to the network. It is provided for you although you should examine it to ensure you understand its function.

It essentially reformats each input image into a flat vector of 784 pixel values and rescales those values from their original 0 to 255 range to the range 0 to 1.

In [23]:
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype("float32") / 255
test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype("float32") / 255
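
The same two steps can be tried on a small synthetic array to see their effect without loading MNIST. This is a sketch only: `fake_images` is random data standing in for real digits, and `-1` is used in the reshape so that NumPy infers the number of images instead of hard-coding it.

```python
import numpy as np

# Five fake 28 x 28 greyscale images with integer pixel values in 0..255.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Flatten each image to a 784-element vector, then rescale to the range 0..1.
flat = fake_images.reshape((-1, 28 * 28))
scaled = flat.astype("float32") / 255

print(fake_images.shape, fake_images.dtype)
print(scaled.shape, scaled.dtype)
print(scaled.min(), scaled.max())
```

Flattening is needed here because the network's input was declared with `shape=(784,)` rather than as a 28 x 28 grid.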

# Training the model #

To define this section, the number of epochs needs to be selected together with the batch size. The training algorithm itself has already been selected by the **compile** method, but the call to the **fit** method actually performs the training.
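
As a quick check on how these two settings interact, the arithmetic below works out how many weight updates a training run performs. It assumes the values used in this notebook's call to **fit** (60000 training images, a batch size of 32 and 5 epochs); the variable names are illustrative.

```python
import math

num_samples = 60000  # training images
batch_size = 32      # images per weight update
epochs = 5           # full passes over the training set

# Each epoch is split into batches; a final partial batch still counts as a step.
steps_per_epoch = math.ceil(num_samples / batch_size)
total_updates = steps_per_epoch * epochs

print("steps per epoch:", steps_per_epoch)
print("total weight updates:", total_updates)
```

The steps-per-epoch figure is the denominator shown in the Keras progress bar during training.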

In [24]:
print("Train images shape:", train_images.shape)  # (60000, 784)
print("Test images shape:", test_images.shape)  # (10000, 784)

Train images shape: (60000, 784)
Test images shape: (10000, 784)


In [None]:
model.fit(train_images, train_labels, epochs=5, batch_size=32, validation_data=(test_images, test_labels))


Epoch 1/5
 403/1875 ━━━━━━━━━━━━━━━━━━━━ 29s 20ms/step - accuracy: 0.6516 - loss: 1.1036

# Testing the network #

The following prints out the output probabilities produced by the final layer. Change the range of test images to investigate different images.

Each element in the **predictions** array represents the result for a given test pattern.

In [16]:
test_digits = test_images[0:10]
predictions = model.predict(test_digits)
predictions[0]

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 101ms/step


array([0.08837352, 0.11726644, 0.10589028, 0.10279153, 0.0963735 ,
       0.09085272, 0.1002186 , 0.10465974, 0.09830485, 0.09526886],
      dtype=float32)
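
As a small illustration of reading such an output, the sketch below copies the ten values printed above into a NumPy array (the variable names are illustrative only) and checks two properties: softmax outputs should sum to approximately 1, and the predicted class is the index of the largest value. All ten values here sit close to 0.1, which is what a network that has not yet learned anything useful would be expected to produce.

```python
import numpy as np

# The ten class probabilities printed for the first test image.
prediction = np.array(
    [0.08837352, 0.11726644, 0.10589028, 0.10279153, 0.0963735,
     0.09085272, 0.1002186, 0.10465974, 0.09830485, 0.09526886],
    dtype=np.float32,
)

total = float(prediction.sum())              # softmax outputs sum to about 1
predicted_class = int(prediction.argmax())   # index of the largest probability
confidence = float(prediction[predicted_class])

print("sum of probabilities:", total)
print("predicted class:", predicted_class, "with probability", confidence)
```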

In [11]:
predictions[0].argmax()

The **evaluate** method provides metric values for the entire test set.

In [18]:
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f"test_acc: {test_acc}")

313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.1160 - loss: 2.3009
test_acc: 0.11349999904632568
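
To show what the accuracy metric counts, the following sketch computes it by hand from a small made-up batch: the predicted class is the arg-max of each probability row, and accuracy is the fraction of rows where that matches the integer label. The arrays are invented for illustration and use three classes instead of ten to keep them readable.

```python
import numpy as np

# Made-up probabilities for four samples over three classes.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
labels = np.array([0, 1, 2, 1])  # true integer labels

predicted = probs.argmax(axis=1)         # most probable class per sample
accuracy = (predicted == labels).mean()  # fraction of correct predictions

print("predicted:", predicted)
print("accuracy:", accuracy)
```

On a ten-class problem with roughly balanced classes, an accuracy near 0.1 corresponds to chance-level performance.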
