<a href="https://colab.research.google.com/github/nyp-sit/it3103-tutors/blob/main/week3/multi_class_image_classification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Lab Exercise: Multi-class Image Classification

Now that you have learnt how to train a model for binary image classification of cats and dogs using a Convolutional Neural Network, modify the code to train a model that recognises which gesture of the rock-paper-scissors game a hand is showing.

The rock-paper-scissors dataset can be downloaded from https://nypai.s3-ap-southeast-1.amazonaws.com/datasets/rps2.zip

### Step 1: Import the necessary packages

In [None]:
import os 
import tensorflow as tf
import tensorflow.keras as keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import layers
from tensorflow.keras import models
from tensorflow.keras import optimizers
import matplotlib.pyplot as plt
import numpy as np

### Step 2: Download Datasets

Download the dataset and unzip the file to a folder.

In [None]:
dataset_URL = 'https://nypai.s3-ap-southeast-1.amazonaws.com/datasets/rps2.zip'
path_to_zip = tf.keras.utils.get_file('rps2.zip', origin=dataset_URL, extract=True, cache_dir='.')
print(path_to_zip)
PATH = os.path.join(os.path.dirname(path_to_zip), 'rps2')


### Step 3: Set up your train and validation directories

Examine your dataset folder and set your train_dir and validation_dir to point to the correct directories.

In [None]:
train_dir = os.path.join(PATH, 'train')
validation_dir = os.path.join(PATH, 'validation')
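One way to examine the dataset folder is to count the image files in each class sub-directory. The helper below is a minimal sketch: `count_images_per_class` is a hypothetical name that is not part of the lab, and it assumes each split directory contains one sub-folder per class.

```python
import os

def count_images_per_class(split_dir, extensions=('.png', '.jpg', '.jpeg')):
    """Return {class_name: number_of_image_files} for one split directory.

    Hypothetical helper for illustration; assumes one sub-folder per class.
    """
    counts = {}
    for class_name in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, class_name)
        if not os.path.isdir(class_dir):
            continue  # skip stray files that sit next to the class folders
        counts[class_name] = sum(
            1 for name in os.listdir(class_dir)
            if name.lower().endswith(extensions)
        )
    return counts
```

Calling `count_images_per_class(train_dir)` and `count_images_per_class(validation_dir)` should list the class folders and show whether the classes are roughly balanced.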

### Step 4: Set up the ImageDataGenerator 

Set up the ImageDataGenerator for both the train and validation sets.

In [None]:
# All images will be rescaled by 1./255

train_datagen = ImageDataGenerator(rescale=1./255)
validation_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        # This is the target directory
        train_dir,
        # All images will be resized to 150x150
        target_size=(150, 150),
        batch_size=20,
        # since our dataset has more than 2 classes, we choose either categorical or sparse categorical;
        # this must match the loss function we choose for our model
        class_mode='sparse')

validation_generator = validation_datagen.flow_from_directory(
        validation_dir,
        target_size=(150, 150),
        batch_size=20,
        class_mode='sparse')

In [None]:
for data_batch, labels_batch in train_generator:
    print('data batch shape:', data_batch.shape)
    print('labels batch shape:', labels_batch.shape)
    break

You can see that the labels are **NOT** one-hot-encoded.  Try changing the class_mode to 'categorical' and observe that the labels will be one-hot-encoded.
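To make the difference concrete, here is a small NumPy sketch with made-up labels (not taken from the dataset). It shows the two label layouts: integer class indices of shape `(batch,)` and one-hot vectors of shape `(batch, num_classes)`.

```python
import numpy as np

# Made-up integer labels for a batch of 4 images (illustration only)
sparse_labels = np.array([0, 2, 1, 2])
num_classes = 3

# Row i of the identity matrix is the one-hot vector for class i
one_hot_labels = np.eye(num_classes)[sparse_labels]

print(sparse_labels.shape)   # (4,)
print(one_hot_labels.shape)  # (4, 3)
print(one_hot_labels)
```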

Print out the class indices so that you know what label is assigned to which class.  Hint: use ``class_indices`` of the generator.

In [None]:
train_generator.class_indices

### Step 5: Create your model

In [None]:
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(3, activation='softmax'))
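It can help to trace how the spatial size shrinks through the convolution and pooling layers above. The sketch below does the arithmetic by hand, assuming the Keras defaults used in the model (`padding='valid'` and stride 1 for `Conv2D`, and a pooling stride equal to the pool size for `MaxPooling2D`). The helper names `conv_out` and `pool_out` are hypothetical.

```python
def conv_out(size, kernel=3):
    # 'valid' convolution with stride 1 trims kernel - 1 pixels
    return size - kernel + 1

def pool_out(size, pool=2):
    # max pooling with stride equal to the pool size, rounding down
    return size // pool

size = 150  # input height and width
for _ in range(4):  # four Conv2D + MaxPooling2D pairs
    size = pool_out(conv_out(size))

flattened = size * size * 128  # the last Conv2D layer has 128 filters
print(size, flattened)  # 7 6272
```

The same numbers should appear in the output of `model.summary()`.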



### Step 6: Compile and Train the Model

Make sure you choose the correct loss function. 
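Integer labels (class_mode `'sparse'`) pair with `sparse_categorical_crossentropy`, while one-hot labels (class_mode `'categorical'`) pair with `categorical_crossentropy`. The NumPy sketch below uses made-up softmax outputs to illustrate that the two losses compute the same value once the label format matches. It is a hand-written illustration of the formula, not the Keras implementation.

```python
import numpy as np

# Made-up softmax outputs for 2 images and 3 classes (illustration only)
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])
sparse_labels = np.array([0, 2])            # integer class indices
one_hot_labels = np.eye(3)[sparse_labels]   # same labels, one-hot encoded

# Sparse form: pick the probability of the true class directly
sparse_loss = -np.log(probs[np.arange(len(sparse_labels)), sparse_labels])

# Categorical form: sum over classes, weighted by the one-hot vector
categorical_loss = -np.sum(one_hot_labels * np.log(probs), axis=1)

print(np.allclose(sparse_loss, categorical_loss))  # True
```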

In [None]:
import time

def create_tb_callback():
    """Create a TensorBoard callback that logs each run to its own directory."""
    root_logdir = os.path.join(os.curdir, "tb_logs")

    # use a new, timestamped directory for each run
    run_id = time.strftime("run_%Y_%m_%d-%H_%M_%S")
    run_logdir = os.path.join(root_logdir, run_id)

    return tf.keras.callbacks.TensorBoard(run_logdir)

In [None]:
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=optimizers.RMSprop(learning_rate=1e-4),
              metrics=['acc'])

tb_callback = create_tb_callback()

earlystop_callback = tf.keras.callbacks.EarlyStopping(
    monitor='val_acc', patience=10, verbose=0,
    mode='auto', restore_best_weights=True
)

history = model.fit(
      train_generator,
      steps_per_epoch=126,
      epochs=30,
      validation_data=validation_generator,
      validation_steps=18, callbacks=[earlystop_callback, tb_callback])
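The hard-coded `steps_per_epoch` and `validation_steps` are consistent with one full pass over the data per epoch at a batch size of 20. The sketch below shows that arithmetic. The sample counts of 2520 and 360 are assumptions inferred from the values above, not confirmed dataset sizes, and `steps_for` is a hypothetical helper.

```python
import math

def steps_for(num_samples, batch_size):
    # number of batches needed to see every sample once
    return math.ceil(num_samples / batch_size)

# Assumed sample counts (inferred from 126 * 20 and 18 * 20)
print(steps_for(2520, 20))  # 126
print(steps_for(360, 20))   # 18
```

In the notebook, the generators report the counts they actually found, so `steps_for(train_generator.samples, train_generator.batch_size)` avoids hard-coding them.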

In [None]:
%load_ext tensorboard 
%tensorboard --logdir tb_logs

### Step 7: Save your Model

Save your model for use in inference later on.

In [None]:
model.save("rps_model")

### Step 8: Test your Model

The following code cells show you how to set up Google Colab to take a picture using your webcam. Take a picture of your hand gesture (rock, paper or scissors) and run inference on it using your saved model.

In [None]:
from IPython.display import display, Javascript
from google.colab.output import eval_js
from base64 import b64decode

def take_photo(filename='photo.jpg', quality=0.8):
  js = Javascript('''
    async function takePhoto(quality) {
      const div = document.createElement('div');
      const capture = document.createElement('button');
      capture.textContent = 'Capture';
      div.appendChild(capture);

      const video = document.createElement('video');
      video.style.display = 'block';
      const stream = await navigator.mediaDevices.getUserMedia({video: true});

      document.body.appendChild(div);
      div.appendChild(video);
      video.srcObject = stream;
      await video.play();

      // Resize the output to fit the video element.
      google.colab.output.setIframeHeight(document.documentElement.scrollHeight, true);

      // Wait for Capture to be clicked.
      await new Promise((resolve) => capture.onclick = resolve);

      const canvas = document.createElement('canvas');
      canvas.width = video.videoWidth;
      canvas.height = video.videoHeight;
      canvas.getContext('2d').drawImage(video, 0, 0);
      stream.getVideoTracks()[0].stop();
      div.remove();
      return canvas.toDataURL('image/jpeg', quality);
    }
    ''')
  display(js)
  data = eval_js('takePhoto({})'.format(quality))
  binary = b64decode(data.split(',')[1])
  with open(filename, 'wb') as f:
    f.write(binary)
  return filename

In [None]:
from IPython.display import Image
try:
  filename = take_photo()
  print('Saved to {}'.format(filename))
  
  # Show the image which was just taken.
  display(Image(filename))
except Exception as err:
  # Errors will be thrown if the user does not have a webcam or if they do not
  # grant the page permission to access it.
  print(str(err))

In [None]:
img = keras.preprocessing.image.load_img(
    filename, target_size=(150, 150)
)

# convert the image to a numpy array
img_array = keras.preprocessing.image.img_to_array(img)

# rescale the pixel values by 1./255, as the generators did during training
img_array = img_array / 255.0

# the model expects data in batches, so even for a single image
# we need to add the batch axis
img_array = tf.expand_dims(img_array, 0) # Create a batch

# load the model saved earlier and run the inference;
# the output holds one probability per class
model = tf.keras.models.load_model('rps_model')
predicted_label = model.predict(img_array)
# or predicted_label = model(img_array)

print(predicted_label)

In [None]:
print(train_generator.class_indices)
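The model outputs one probability per class, so the class name is found by taking the index of the largest probability and looking it up in the inverted `class_indices` mapping. The sketch below uses a made-up prediction and an assumed mapping. The folder names `paper`, `rock` and `scissors` are assumptions, so use the dictionary printed above in practice.

```python
import numpy as np

# Assumed mapping for illustration; replace with train_generator.class_indices
class_indices = {'paper': 0, 'rock': 1, 'scissors': 2}

# Invert the mapping: index -> class name
index_to_class = {index: name for name, index in class_indices.items()}

# Made-up model output for a batch containing one image
probabilities = np.array([[0.05, 0.15, 0.80]])

predicted_index = int(np.argmax(probabilities, axis=1)[0])
print(index_to_class[predicted_index])  # scissors
```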