In this example, we will build a model that classifies an image as 'rock', 'paper', or 'scissors'.

Dataset info: https://www.kaggle.com/drgfreeman/rockpaperscissors

1. Download the dataset from Kaggle (note: you may need to create a Kaggle account or log in first).
2. Unzip the downloaded archive.
3. Create a folder named `dataset/rock-paper-scissor` in your home directory.
4. Copy the folders named 'paper', 'rock', and 'scissors' from the unzipped folder to `<HOME_DIRECTORY>/dataset/rock-paper-scissor`.

In [None]:
from pathlib import Path
import numpy as np
import os

`pathlib.Path` is a class that represents file-system paths in an OS-independent way. We use it to locate the home folder and to list the files inside a folder.
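
As a small illustration of how `Path` objects behave, the sketch below builds a path with the `/` operator and inspects its components. The file name `example.png` is hypothetical and does not need to exist on disk.

```python
from pathlib import Path

# Path objects can be joined with the `/` operator instead of os.path.join.
example_dir = Path.home() / 'dataset' / 'rock-paper-scissor'

# A hypothetical image path inside that folder (the file need not exist).
example_file = example_dir / 'rock' / 'example.png'

print(example_file.name)         # final path component
print(example_file.parent.name)  # name of the containing folder
print(example_file.suffix)       # file extension
```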

In [None]:
data_dir=os.path.join(Path.home(),'dataset','rock-paper-scissor')
print(data_dir)

Next, we use the `.glob()` method to select all PNG images and count them.

In [None]:
data_dir = Path(data_dir)
image_count = len(list(data_dir.glob('*/*.png')))
image_count
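
To see what the `'*/*.png'` pattern matches, here is a self-contained sketch that builds a tiny, made-up folder tree in a temporary directory and counts the PNG files in it. The folder and file names are assumptions chosen to mirror the dataset layout.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Build a made-up tree: two PNGs per class plus one stray text file.
    for class_name in ('paper', 'rock', 'scissors'):
        (root / class_name).mkdir()
        for i in range(2):
            (root / class_name / f'img{i}.png').touch()
    (root / 'rock' / 'notes.txt').touch()

    # '*/*.png' matches PNG files exactly one folder below root.
    png_count = len(list(root.glob('*/*.png')))

    # Counting per class folder is a quick check for class imbalance.
    per_class = {d.name: len(list(d.glob('*.png')))
                 for d in sorted(root.iterdir()) if d.is_dir()}

print(png_count)
print(per_class)
```

The stray `notes.txt` file is ignored because the pattern only matches names ending in `.png`.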

Now we specify the size of the input images and some hyperparameters: the batch size, the number of epochs (how many times the model sees the full training set), and the fraction of images used for training. We also derive the class names from the sub-folder names.

In [None]:
IMAGE_HEIGHT = 100
IMAGE_WIDTH = 150

BATCH_SIZE = 32

EPOCHS = 10

TRAIN_PERC = 0.8

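# Use only sub-folders as classes, sorted so the class order is reproducible
CLASS_NAMES = np.array(sorted(item.name for item in data_dir.glob('*') if item.is_dir()))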
CLASS_NAMES

In [None]:
train_size = int(TRAIN_PERC * image_count)
# The test set takes whatever remains, so the two sizes always sum to image_count
test_size = image_count - train_size
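
As a quick sanity check of the split arithmetic, the sketch below works through the numbers for a hypothetical image count, taking the test size as the remainder so that the two sizes always add up. The value `1003` and the name `TRAIN_FRACTION` are illustrative only.

```python
TRAIN_FRACTION = 0.8   # same value as TRAIN_PERC in the notebook
example_count = 1003   # hypothetical number of images

# int() truncates towards zero, so 802.4 becomes 802.
example_train = int(TRAIN_FRACTION * example_count)

# Taking the remainder guarantees that no image is lost to rounding.
example_test = example_count - example_train

print(example_train, example_test)
```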

# Create Data Pipeline

To create a data pipeline, we start by passing a glob pattern that matches the image paths to the `tf.data.Dataset.list_files()` function.

In [None]:
import tensorflow as tf

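# Match the same PNG files that were counted above.
# shuffle=False lists them in a deterministic order; we shuffle explicitly below.
all_files = tf.data.Dataset.list_files(str(data_dir/'*/*.png'), shuffle=False)

In [None]:
for f in all_files.take(5):
    print(f.numpy())

Shuffle the image paths once and split them into a train set and a test set. The shuffled order must stay the same on every pass over the data; otherwise images would move between the two sets from one epoch to the next.

In [None]:
# reshuffle_each_iteration=False keeps one fixed order, so take/skip stay disjoint
all_files = all_files.shuffle(buffer_size=image_count, reshuffle_each_iteration=False)

train_files = all_files.take(train_size)
test_files = all_files.skip(train_size)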
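
The take/skip split can be pictured with plain Python lists. This is only a conceptual sketch, not how `tf.data` is implemented: the file names are made up, and list slicing stands in for `.take()` and `.skip()`.

```python
import random

paths = [f'img_{i}.png' for i in range(10)]  # hypothetical file names
n_train = int(0.8 * len(paths))

shuffled = paths[:]                  # copy so the original order is untouched
random.Random(0).shuffle(shuffled)   # fixed seed: one reproducible shuffle

train_part = shuffled[:n_train]      # analogous to .take(n_train)
test_part = shuffled[n_train:]       # analogous to .skip(n_train)

# The two parts share no element only because the order is shuffled once.
print(set(train_part).isdisjoint(test_part))
```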

Next, we create functions to load the images into a tensor and process their corresponding labels.

The `parse_img()` function reads an image file and outputs a tensor of the desired size.

The `get_label()` function extracts the class from the file path and returns it as a one-hot (boolean) vector.

In [None]:
def get_label(file_path):
    # convert the path to a list of path components
    parts = tf.strings.split(file_path, os.path.sep)
    # The second to last is the class-directory
    return parts[-2] == CLASS_NAMES
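
The same idea in plain Python, as a minimal sketch: the class is the second-to-last path component, and comparing it against every class name yields a one-hot boolean vector. The helper `one_hot_from_path`, the class order in `EXAMPLE_CLASSES`, and the example path are assumptions for illustration.

```python
import os

EXAMPLE_CLASSES = ['paper', 'rock', 'scissors']  # assumed class order

def one_hot_from_path(file_path, class_names=EXAMPLE_CLASSES):
    # Split the path into its components; the folder name is second to last.
    parts = file_path.split(os.path.sep)
    class_dir = parts[-2]
    # Exactly one comparison is True when the folder is a known class.
    return [name == class_dir for name in class_names]

example_path = os.path.join('dataset', 'rock-paper-scissor', 'rock', 'example.png')
example_label = one_hot_from_path(example_path)
print(example_label)
```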

In [None]:
def parse_img(img):
    # load the raw data from the file as a string
    img = tf.io.read_file(img)
    # decode the PNG data into a 3D uint8 tensor with 3 colour channels
    img = tf.image.decode_png(img, channels=3)
    # Use `convert_image_dtype` to convert to floats in the [0,1] range.
    img = tf.image.convert_image_dtype(img, tf.float32)
    # resize the image to the desired size.
    return tf.image.resize(img, [IMAGE_HEIGHT, IMAGE_WIDTH])
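
For 8-bit images, converting to floats in the [0, 1] range amounts to dividing each pixel value by 255. A plain-Python sketch of that scaling, with made-up pixel values and a hypothetical helper name:

```python
def to_unit_range(pixels):
    """Scale 8-bit values (0..255) to floats in [0, 1]."""
    return [value / 255.0 for value in pixels]

example_pixels = [0, 51, 255]   # made-up 8-bit intensities
scaled = to_unit_range(example_pixels)
print(scaled)
```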

In [None]:
def process_path(file_path):
    label = get_label(file_path)
    img = parse_img(file_path)
    return img, label

In [None]:
train_dataset = train_files.map(process_path)
test_dataset = test_files.map(process_path)

In [None]:
for image, label in train_dataset.take(1):
    print("Image shape: ", image.numpy().shape)
    print("Label: ", label.numpy())

Create batches of samples based on the BATCH_SIZE hyperparameter.

In [None]:
train_dataset_batch = train_dataset.batch(BATCH_SIZE)
test_dataset_batch = test_dataset.batch(BATCH_SIZE)
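
Conceptually, batching groups consecutive samples into chunks of `BATCH_SIZE`, and by default the last batch may be smaller when the sample count is not a multiple of the batch size. A plain-Python sketch, with a hypothetical helper `make_batches` and 70 made-up samples:

```python
def make_batches(items, batch_size):
    # Slice the list into consecutive chunks; the final chunk may be shorter.
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

example_items = list(range(70))            # 70 made-up samples
example_batches = make_batches(example_items, 32)
batch_lengths = [len(b) for b in example_batches]
print(batch_lengths)
```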

In [None]:
image_batch, label_batch = next(iter(train_dataset_batch))

In [None]:
import matplotlib.pyplot as plt

def show_batch(image_batch, label_batch):
    plt.figure(figsize=(10,10))
    for n in range(25):
        ax = plt.subplot(5,5,n+1)
        plt.imshow(image_batch[n])
        plt.title(CLASS_NAMES[label_batch[n]==1][0].title())
        plt.axis('off')

In [None]:
show_batch(image_batch.numpy(), label_batch.numpy())

# Model Training

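We will define a small convolutional network to classify the images.

It is made up of 3 convolutional layers, 2 max-pooling layers, and 2 dense layers (7 in total), plus a `Flatten` layer that reshapes the convolutional output into a vector for the dense layers. The final dense layer has 3 units, one per class, and outputs raw scores (logits).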

In [None]:
model = tf.keras.models.Sequential()

model.add(tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(IMAGE_HEIGHT, IMAGE_WIDTH, 3)))
model.add(tf.keras.layers.MaxPooling2D((2, 2)))
model.add(tf.keras.layers.Conv2D(64, (3, 3), activation='relu'))
model.add(tf.keras.layers.MaxPooling2D((2, 2)))
model.add(tf.keras.layers.Conv2D(64, (3, 3), activation='relu'))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(64, activation='relu'))
model.add(tf.keras.layers.Dense(3))

In [None]:
model.summary()
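
The shapes and parameter counts reported by `model.summary()` can be cross-checked by hand. The sketch below assumes the Keras defaults used above: 'valid' padding and stride 1 for the 3x3 convolutions, and non-overlapping 2x2 pooling. The helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    # 'valid' padding with stride 1 trims kernel - 1 pixels per dimension.
    return size - kernel + 1

def pool_out(size, pool=2):
    # Non-overlapping pooling; an incomplete window at the edge is dropped.
    return size // pool

h, w = 100, 150                      # IMAGE_HEIGHT, IMAGE_WIDTH
h, w = conv_out(h), conv_out(w)      # Conv2D(32)
h, w = pool_out(h), pool_out(w)      # MaxPooling2D
h, w = conv_out(h), conv_out(w)      # Conv2D(64)
h, w = pool_out(h), pool_out(w)      # MaxPooling2D
h, w = conv_out(h), conv_out(w)      # Conv2D(64)
flat_units = h * w * 64              # size of the Flatten output

def conv_params(in_ch, out_ch, kernel=3):
    # One kernel x kernel x in_ch filter per output channel, plus a bias each.
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(in_units, out_units):
    # Full weight matrix plus one bias per output unit.
    return in_units * out_units + out_units

total_params = (conv_params(3, 32) + conv_params(32, 64) + conv_params(64, 64)
                + dense_params(flat_units, 64) + dense_params(64, 3))
print((h, w), flat_units, total_params)
```

Most of the parameters sit in the first dense layer, because it connects every flattened feature to each of its 64 units.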

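Next, we define the training configuration. Here we use 'adam' as the optimizer, categorical cross-entropy as the loss function, and accuracy as the evaluation metric. Because the last layer outputs raw scores rather than probabilities, the loss is created with `from_logits=True`.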

In [None]:
model.compile(optimizer='adam',
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
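
To make the loss concrete, here is a plain-Python sketch of categorical cross-entropy computed from raw scores for a single sample. It illustrates the formula only and is not Keras's implementation; the function name and the example scores are made up.

```python
import math

def cross_entropy_from_logits(one_hot, logits):
    # Log-softmax via the log-sum-exp trick for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    log_probs = [z - log_sum_exp for z in logits]
    # Only the true class contributes, because the other targets are 0.
    return -sum(y * lp for y, lp in zip(one_hot, log_probs))

# Equal scores mean a uniform guess over 3 classes: the loss is log(3).
uniform_loss = cross_entropy_from_logits([0, 1, 0], [0.0, 0.0, 0.0])

# A confident, correct prediction gives a loss close to 0.
confident_loss = cross_entropy_from_logits([0, 1, 0], [-5.0, 5.0, -5.0])
print(uniform_loss, confident_loss)
```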

Calling `model.fit()` starts the training. The per-epoch training information it returns is stored in the 'history' variable.

In [None]:
history = model.fit(train_dataset_batch, epochs=EPOCHS, validation_data=test_dataset_batch)

We can use the information stored in the 'history' variable to visualize the trend of training loss.

In [None]:
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model Loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()

Similarly, we use the 'history' variable to plot the accuracy of the model in each epoch.

In [None]:
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Test'], loc='upper left')
plt.show()

# Evaluation

We can use functions provided by scikit-learn to produce a confusion matrix and a classification report to check the performance of our model.

In [None]:
test_pred = []
test_actual = []

for features, labels in test_dataset_batch:
    scores = model.predict(features)
    pred = np.argmax(scores, axis=1)
    
    test_pred = test_pred + list(pred)
    test_actual = test_actual + list(np.argmax(labels.numpy(), axis=1))

In [None]:
print(test_pred)

In [None]:
print(test_actual)

In [None]:
from sklearn.metrics import confusion_matrix, classification_report

print(confusion_matrix(test_actual, test_pred))

In [None]:
print(classification_report(test_actual, test_pred))
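
To clarify how a confusion matrix is read, the sketch below builds one by hand, following the scikit-learn convention that rows are actual classes and columns are predicted classes. The labels are made up and the helper `confusion_counts` is hypothetical.

```python
def confusion_counts(actual, predicted, n_classes):
    # matrix[a][p] counts samples of actual class a predicted as class p.
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix

example_actual = [0, 0, 1, 1, 2, 2]   # made-up true labels
example_pred = [0, 1, 1, 1, 2, 0]     # made-up predictions
example_matrix = confusion_counts(example_actual, example_pred, 3)

# Correct predictions lie on the diagonal.
example_accuracy = sum(example_matrix[i][i] for i in range(3)) / len(example_actual)

for row in example_matrix:
    print(row)
```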

# Inference

Once the model is trained, we can use it to predict a single image or batch of images.

We can use the `parse_img()` function that we defined above to load the image, and pass it as a tensor to the `model.predict()` function. Because the model expects a batch, we add a leading batch dimension first.

In [None]:
inference_image = os.path.join(data_dir, 'rock', 'BvjXvNTvapIFq4bK.png')

In [None]:
test_image = parse_img(inference_image)

In [None]:
prediction_score = model.predict(np.expand_dims(test_image, 0))

In [None]:
prediction_score

A single-sample prediction returns one raw score per class. To turn the scores into probabilities, we pass them through the softmax function, which squashes them into values between 0 and 1 that sum to 1.

In [None]:
softmax = tf.keras.layers.Softmax()
prediction_probability = softmax(prediction_score)

In [None]:
prediction_probability.numpy()
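
For reference, softmax itself is short enough to write out. This is a plain-Python sketch with made-up scores, not the Keras layer used above:

```python
import math

def softmax(scores):
    # Subtracting the maximum avoids overflow and does not change the result.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

example_scores = [2.0, -1.0, 0.5]   # made-up raw scores for 3 classes
example_probs = softmax(example_scores)
print(example_probs)
```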

Use the `np.argmax()` function to get the index of the element with the maximum value (the class with the highest probability).

In [None]:
prediction_class = np.argmax(prediction_probability, axis=1)

In [None]:
prediction_class

In [None]:
CLASS_NAMES[prediction_class]
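
Because softmax preserves the ordering of the scores, taking the argmax of the raw scores selects the same class as taking the argmax of the probabilities. A plain-Python sketch of the index-to-name lookup, with made-up scores and an assumed class order:

```python
EXAMPLE_CLASSES = ['paper', 'rock', 'scissors']  # assumed class order
example_scores = [-0.3, 4.1, 1.2]                # made-up raw scores

# Index of the largest score, i.e. what np.argmax returns for a 1D input.
best_index = max(range(len(example_scores)), key=lambda i: example_scores[i])
best_class = EXAMPLE_CLASSES[best_index]
print(best_index, best_class)
```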