2.0 Intro to Computer Vision

References

>[AI and Machine Learning for Coders - Laurence Moroney](https://www.oreilly.com/library/view/ai-and-machine/9781492078180/)

> https://www.kaggle.com/datasets/zalando-research/fashionmnist


IDE (Integrated Development Environment)

>[Colab](https://colab.research.google.com)

Our scripts will recognize items of clothing in an image using an approximation of a biological neural network

>We will use the Fashion MNIST database for this work

> * https://www.kaggle.com/datasets/zalando-research/fashionmnist

> * https://pjreddie.com/projects/mnist-in-csv/

> * https://github.com/zalandoresearch/fashion-

> * https://engineering.zalando.com

> The Fashion MNIST Dataset has the following characteristics

> * 60,000 images in the training set

> * 10,000 images in the test set

> * Each image is associated with a label from 1 of 10 classes

> * Column 1 is the class label

> * Remaining columns are pixel numbers (784 total)

> * Each image is 28 pixels high x 28 pixels wide = 784 pixels

> * Each pixel is grey scale (darkness) and has an integer value range from 0 to 255
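
> The CSV layout described above can be sketched as follows. This is a minimal illustration in plain Python, assuming a made-up row (the pixel values are invented, not taken from the dataset) and the hypothetical names row, pixels and image

```python
# Hypothetical CSV row in the layout described above:
# column 1 is the class label, followed by 784 pixel values (0 to 255).
# The pixel values here are invented for illustration only.
row = [9] + [(i * 7) % 256 for i in range(784)]

label = row[0]     # column 1: the class label
pixels = row[1:]   # remaining 784 columns: the pixels

# Rebuild the 28 x 28 image: 28 rows of 28 pixels each.
HEIGHT, WIDTH = 28, 28
image = [pixels[r * WIDTH:(r + 1) * WIDTH] for r in range(HEIGHT)]

print(label)                       # class label
print(len(image), len(image[0]))   # number of rows, number of columns
print(min(pixels), max(pixels))    # pixel values stay within 0 to 255
```

> The Keras loader used later in this notebook already returns images as 28 x 28 arrays, so this reshaping step only matters when reading the CSV files directly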


Let's take a look at a code snippet that trains a neural network on the Fashion MNIST dataset

> The first line imports the TensorFlow library that we will be using

> The second line references the Fashion MNIST dataset that ships with Keras

> The third line loads the dataset, which Keras has already split into a training set and a test set, each with its images and a label for each image

> The fourth and fifth lines normalize each image. This means scaling from an initial pixel value range of 0 to 255 down to a range of 0 to 1

> * Normalization typically helps the neural network train faster and more reliably, because the optimizer works better with small, similarly scaled input values
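
> Here is a minimal sketch of what the division by 255.0 does, using a tiny made-up 2 x 3 grid of plain Python lists in place of a real 28 x 28 image. The names raw_image and normalized are just for illustration

```python
# A tiny 2 x 3 stand-in for a 28 x 28 image (values are made up).
raw_image = [[0, 51, 102],
             [153, 204, 255]]

# Divide every pixel by 255.0 so that all values fall between 0 and 1.
normalized = [[pixel / 255.0 for pixel in row] for row in raw_image]

for row in normalized:
    print([round(p, 2) for p in row])
```

> In the notebook the images are NumPy arrays, so training_images / 255.0 applies the same division to every pixel in a single expression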

> The sixth line has multiple parts:

> * Sequential is used to define the layers of the neural network

> * Dense is a common layer type consisting of fully (densely) connected neurons

> * We are using the Flatten layer to crunch the 2 dimensional input array (28 x 28) into a 1 dimensional array or line (784 values)

> * We are using a SWAG (a rough guess) of 128 hidden nodes (neurons) and the relu activation function

> * We are using an output layer with 10 neurons (we have 10 different clothing labels - classes) and the softmax activation function
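
> To make the layers less abstract, here is a rough sketch of the same flatten, dense + relu, dense + softmax pipeline in plain Python. It is not how Keras implements these layers. The tiny 2 x 2 image, the weights and the helper names (flatten, dense, relu, softmax) are all made up for illustration

```python
import math

def flatten(image):
    """Turn a 2D grid (a list of rows) into one flat list."""
    return [pixel for row in image for pixel in row]

def dense(inputs, weights, biases):
    """Fully connected layer: every neuron sees every input."""
    return [sum(w * x for w, x in zip(neuron_weights, inputs)) + b
            for neuron_weights, b in zip(weights, biases)]

def relu(values):
    """Replace negative values with 0 and keep positive values."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(values)  # subtracting the max avoids overflow in exp
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up example: 2 x 2 "image", 3 hidden neurons, 2 output classes.
image = [[0.0, 0.5], [1.0, 0.25]]
hidden_w = [[0.2, -0.4, 0.1, 0.3], [-0.5, 0.1, -0.2, 0.0], [0.3, 0.3, 0.3, 0.3]]
hidden_b = [0.0, 0.0, 0.1]
output_w = [[1.0, -1.0, 0.5], [-0.5, 0.5, 1.0]]
output_b = [0.0, 0.0]

hidden = relu(dense(flatten(image), hidden_w, hidden_b))
probabilities = softmax(dense(hidden, output_w, output_b))
print(probabilities, sum(probabilities))

# Trainable parameters in the real model: weights plus one bias per neuron.
hidden_params = 784 * 128 + 128   # Flatten output (784) into 128 neurons
output_params = 128 * 10 + 10     # 128 hidden values into 10 neurons
print(hidden_params, output_params, hidden_params + output_params)
```

> The last three numbers show how the layer sizes translate into trainable weights and biases for the model in this notebook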

> The seventh line also has multiple parts:

> * The adam optimizer is an evolved version of the stochastic gradient descent optimizer

> * The sparse_categorical_crossentropy loss function is built into TensorFlow and is typically used for datasets with categories where each label is a single integer (we have 10 different clothing labels - categories - classes)

> * The accuracy metric reports back on the accuracy of the neural network as it is trained
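
> As a rough illustration of what this loss measures for a single image, the sketch below takes the negative log of the probability the network gave to the correct class. The function and the probability values are made up for illustration. Keras computes this over whole batches of images and averages the result

```python
import math

def sparse_categorical_crossentropy(probabilities, label):
    """Loss for one example: -log of the probability given to the true class.

    'Sparse' means the label is a single integer (such as 2),
    not a one-hot list (such as [0, 0, 1]).
    """
    return -math.log(probabilities[label])

true_label = 2

confident = [0.05, 0.05, 0.90]   # most of the probability is on the right class
unsure = [0.40, 0.30, 0.30]      # the right class only gets 0.30

confident_loss = sparse_categorical_crossentropy(confident, true_label)
unsure_loss = sparse_categorical_crossentropy(unsure, true_label)

# The loss is small when the network is confidently right and grows as it drifts.
print(round(confident_loss, 3), round(unsure_loss, 3))
```

> The optimizer adjusts the weights to push this number down, which means pushing the probability of the correct label up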

> The eighth line fits the training images to the training labels over five epochs

> * Each complete pass of the training dataset through the neural network is called an epoch

> * Choosing the appropriate number of epochs is a judgement call - too many and the model overfits - too few and accuracy suffers
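
> A quick piece of arithmetic shows how much work five epochs represents. This sketch assumes the Keras default batch size of 32, which applies here because model.fit is called without a batch_size argument. The variable names are just for illustration

```python
import math

training_examples = 60_000   # size of the Fashion MNIST training set
batch_size = 32              # Keras default when model.fit is given no batch_size
epochs = 5

# One epoch is one full pass over the training set, processed batch by batch.
steps_per_epoch = math.ceil(training_examples / batch_size)

# The weights are updated once per batch.
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)
```

> Adding more epochs adds more weight updates on the same images, which is why too many epochs can lead to overfitting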

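> The ninth line evaluates the performance of the model on the test data (images). The model predicts a label for each test image and the predictions are compared with the true test labels

> * The result is a list holding the test loss followed by the test accuracy

> * Typically test data accuracy is lower than training data accuracy, because the neural network has not seen the test images during training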

In [None]:
import tensorflow as tf

data = tf.keras.datasets.fashion_mnist

(training_images, training_labels), (test_images, test_labels) = data.load_data()

training_images = training_images / 255.0
test_images = test_images / 255.0

model = tf.keras.models.Sequential([
            tf.keras.layers.Flatten(input_shape=(28, 28)),
            tf.keras.layers.Dense(128, activation=tf.nn.relu),
            tf.keras.layers.Dense(10, activation=tf.nn.softmax)
        ])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(training_images, training_labels, epochs=5)

model.evaluate(test_images, test_labels)

# Moroney, Laurence. AI and Machine Learning for Coders (pp. 26-28). O'Reilly Media. Kindle Edition

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


[0.3710651397705078, 0.8686000108718872]

Let's explore the model output

> Training and test images, from this dataset, are associated with the following labels

> * 0 T-shirt/top

> * 1 Trouser

> * 2 Pullover

> * 3 Dress

> * 4 Coat

> * 5 Sandal

> * 6 Shirt

> * 7 Sneaker

> * 8 Bag

> * 9 Ankle boot

> This code snippet gives us back the values from the 10 output neurons (nodes)

> * These values are the probabilities that the image matches the label at that particular index

In [None]:
classifications = model.predict(test_images)
print(classifications[0])
print(test_labels[0])

[2.0841138e-05 1.1692101e-07 7.7806112e-07 5.1445809e-08 1.7237892e-06
 8.2627229e-02 2.3521095e-05 2.5002664e-01 9.0889516e-06 6.6728991e-01]
9
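
> To turn the ten probabilities into a single answer, pick the index with the largest value and look up its label. This is a minimal sketch in plain Python that copies the probabilities from the output above. The names class_names and predicted_index are just for illustration

```python
# Labels in index order, as listed earlier in this notebook.
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Probabilities copied from the classifications[0] output above.
probabilities = [2.0841138e-05, 1.1692101e-07, 7.7806112e-07, 5.1445809e-08,
                 1.7237892e-06, 8.2627229e-02, 2.3521095e-05, 2.5002664e-01,
                 9.0889516e-06, 6.6728991e-01]

# The prediction is the index holding the largest probability.
predicted_index = max(range(len(probabilities)), key=probabilities.__getitem__)

print(predicted_index, class_names[predicted_index])
print(sum(probabilities))  # softmax outputs add up to roughly 1
```

> The largest value sits at index 9 (Ankle boot), which matches test_labels[0]. In the notebook, np.argmax(classifications[0]) from NumPy would give the same index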
