<a href="https://colab.research.google.com/github/GowthamKumar1626/Machine-Learning-Youtube/blob/master/Computer%20Vision/Rock_Paper_Scissor.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Hello guys, welcome to the new session**<br>
Today we will work with the rock paper scissors dataset.<br>
Have you ever run into problems with <b>overfitting</b>?<br>
Do you know how to deal with overfitting in an image classification task?<br>
Join me and I will show you how to deal with it...

## **Imports**

In [None]:
# For Colab users: select TensorFlow 2.x
%tensorflow_version 2.x
import tensorflow as tf
import tensorflow_datasets as tfds
from tensorflow.keras import layers

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

np.random.seed(42)
tf.random.set_seed(42) #To make this notebook's output stable across runs

## **Dataset Builder**

In [None]:
builder = tfds.builder("rock_paper_scissors")
info = builder.info
print(info)

**About info**<br>
Image size: (300, 300, 3)<br>
No. of labels: 3<br>
No. of splits: 2 (train, test)<br>
Total no. of examples: 2892

## **Download dataset using builder**

In [None]:
builder.download_and_prepare()

In [None]:
(train, val, test) = tfds.load("rock_paper_scissors", split=["train", "test[:90%]", "test[90%:]"], shuffle_files=True, as_supervised=True)

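Note: `as_supervised=True` makes the dataset yield `(image, label)` tuples. The original `test` split is divided 90/10 here to provide the validation and test sets.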

## **Collect class names**

In [None]:
class_names = []
for i in range(info.features['label'].num_classes):
  class_names.append(info.features['label'].int2str(i))

class_names

## **Plot one random image**

In [None]:
image, label = next(iter(train))
_ = plt.imshow(image)
_ = plt.title(class_names[label])

In [None]:
#Let us define some variables
BATCH_SIZE = 16
BUFFER_SIZE = 1000
NUM_EPOCHS = 5

IMAGE_SIZE = 180
NUM_CLASSES = len(class_names)

## **A sequential model for rescale and resize**

In [None]:
resize_and_rescale = tf.keras.Sequential([
    tf.keras.layers.experimental.preprocessing.Resizing(IMAGE_SIZE, IMAGE_SIZE),
    tf.keras.layers.experimental.preprocessing.Rescaling(1./255)
])
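To get a feel for what the `Rescaling(1./255)` step does, here is a small NumPy-only sketch. The tiny 2x2 array is made-up data for illustration, not an image from the dataset, and the sketch leaves out the resizing step.

```python
import numpy as np

# Made-up 2x2 RGB "image" with uint8 pixel values (illustration only).
pixels = np.array(
    [[[0, 128, 255], [64, 64, 64]],
     [[255, 255, 255], [10, 20, 30]]],
    dtype=np.uint8,
)

# Rescaling multiplies every pixel by 1/255, mapping [0, 255] onto [0.0, 1.0].
rescaled = pixels.astype(np.float32) * (1.0 / 255)

print(rescaled.min(), rescaled.max())
```

Keeping inputs in a small, fixed range like this usually makes training more stable than feeding raw 0-255 values.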

In [None]:
resize_image = resize_and_rescale(np.expand_dims(image, axis=0))
_ = plt.imshow(resize_image[0])
plt.show()

## **Prepare train and val sets**

In [None]:
AUTOTUNE = tf.data.experimental.AUTOTUNE

def prepare(dataset, shuffle=False, training=False):
  if training:
    dataset = dataset.map(lambda x,y: (resize_and_rescale(x, training=True), y),
                        num_parallel_calls=AUTOTUNE)
  else:
    dataset = dataset.map(lambda x,y: (resize_and_rescale(x, training=False), y),
                        num_parallel_calls=AUTOTUNE)
  if shuffle:
    dataset = dataset.shuffle(BUFFER_SIZE)
  dataset = dataset.batch(BATCH_SIZE)

  return dataset.prefetch(buffer_size=AUTOTUNE)

In [None]:
train_ds = prepare(train, shuffle=True, training=True)
val_ds = prepare(val)
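For a sense of scale, here is a rough sketch of how many batches `batch(BATCH_SIZE)` yields per epoch. The train split size of 2520 is an assumption taken from the TFDS catalog; the notebook above only states the 2892 total.

```python
import math

NUM_TRAIN_EXAMPLES = 2520  # assumption: train split size, not stated in this notebook
BATCH_SIZE = 16

# batch() keeps the final partial batch by default, so we round up.
steps_per_epoch = math.ceil(NUM_TRAIN_EXAMPLES / BATCH_SIZE)
last_batch_size = NUM_TRAIN_EXAMPLES - (steps_per_epoch - 1) * BATCH_SIZE

print(steps_per_epoch, last_batch_size)
```

Under that assumption, the last batch of each epoch is smaller than the others.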

## **Create our MODEL**

In [None]:
model = tf.keras.models.Sequential([
        layers.Conv2D(32, kernel_size=3, padding="same", activation="relu"),
        layers.Conv2D(64, kernel_size=3, padding="same", activation="relu"),
        layers.MaxPool2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax")
])

model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
    optimizer = "adam",
    metrics=["accuracy"]
)

history = model.fit(
    train_ds,
    epochs = NUM_EPOCHS,
    validation_data = val_ds
)

**Overfit**<br>
Our model reached 100% accuracy on the training set but only 62% on the validation set.<br>
The model is overfitted.<br>
Let us look at the learning curves.

In [None]:
pd.DataFrame(history.history).plot()

The learning curves look really bad.
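One simple way to put a number on overfitting is the gap between training and validation accuracy per epoch. The sketch below uses a hypothetical helper, `accuracy_gap`, and made-up per-epoch values; only the final pair (1.00 vs 0.62) echoes the run above.

```python
def accuracy_gap(history):
    """Return train-minus-validation accuracy for each epoch."""
    return [t - v for t, v in zip(history["accuracy"], history["val_accuracy"])]

# Hypothetical values shaped like `history.history`; not real training output.
example_history = {
    "accuracy": [0.80, 0.95, 0.99, 1.00, 1.00],
    "val_accuracy": [0.60, 0.63, 0.61, 0.62, 0.62],
}

gaps = accuracy_gap(example_history)
print([round(g, 2) for g in gaps])
```

A gap that keeps growing while validation accuracy stays flat is the usual sign that the model is memorising the training set.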

## **Plot some predictions with overfitted model**

In [None]:
plt.figure(figsize=(15, 15))

for i, (img, lbl) in enumerate(test.take(25)):
  ax = plt.subplot(5, 5, i+1)
  plt.imshow(img)
  # Preprocess the image and add a batch dimension before predicting
  batch = np.expand_dims(resize_and_rescale(img), axis=0)
  predicted = np.argmax(model.predict(batch))  # predict once and reuse the result

  # Green title for a correct prediction, red for a wrong one
  color = "green" if predicted == lbl.numpy() else "red"
  plt.title(class_names[predicted], color=color)

  plt.axis("off")

plt.show()

OMG! More than 10 of the 25 predictions are wrong.

We will solve this problem in the next session, so please watch the next video.

**Hello Guys**

Welcome back to the session. Previously we created a model that is `overfitted`.<br>
What do we need to do now in order to avoid `overfitting`?

## **Data Augmentation**
It is a technique to increase the diversity or `randomness` of your training set by applying random transformations. <br>
This is the first thing to try if our model is overfitting (in the case of image classification).
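To make the idea concrete, here is a NumPy-only sketch of random transformations. It is a simplified stand-in, not what the Keras layers do internally: it only flips and rotates in 90-degree steps, and `random_augment` and `toy_image` are hypothetical names used for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

def random_augment(image, rng):
    """Randomly flip and rotate an (H, W, C) image array."""
    if rng.random() < 0.5:
        image = np.flip(image, axis=1)  # horizontal flip (left-right)
    if rng.random() < 0.5:
        image = np.flip(image, axis=0)  # vertical flip (up-down)
    k = int(rng.integers(0, 4))         # number of 90-degree turns (0 to 3)
    return np.rot90(image, k=k, axes=(0, 1))

# Toy 4x4 RGB array standing in for a real image.
toy_image = np.arange(4 * 4 * 3).reshape(4, 4, 3)
variants = [random_augment(toy_image, rng) for _ in range(8)]

print(len(variants), variants[0].shape)
```

Each variant contains the same pixels in a different arrangement, so the model sees a different view of the same hand every time without any new data being collected.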

Before that we will create a function for plotting our predictions

In [None]:
def plot_predictions(data, model, n_rows=5, n_cols=5):
  """Plot a grid of images titled with the predicted class (green = correct)."""
  plt.figure(figsize=(15, 15))

  for i, (img, lbl) in enumerate(data.take(n_rows * n_cols)):
    ax = plt.subplot(n_rows, n_cols, i+1)
    plt.imshow(img)
    # Preprocess the image and add a batch dimension before predicting
    batch = np.expand_dims(resize_and_rescale(img), axis=0)
    predicted = np.argmax(model.predict(batch))  # predict once and reuse the result

    # Green title for a correct prediction, red for a wrong one
    color = "green" if predicted == lbl.numpy() else "red"
    plt.title(class_names[predicted], color=color)

    plt.axis("off")

  plt.show()

Now let's write our augmentation code

In [None]:
augmentation = tf.keras.Sequential([
      tf.keras.layers.experimental.preprocessing.RandomZoom(0.3),
      tf.keras.layers.experimental.preprocessing.RandomFlip(mode='horizontal_and_vertical'),
      tf.keras.layers.experimental.preprocessing.RandomRotation(0.3)
])

## **Plot some augmented images**

In [None]:
image, label = next(iter(train))

augmented_images = augmentation(np.expand_dims(image, axis=0))
_ = plt.imshow(augmented_images[0])
plt.show()

In [None]:
plt.figure(figsize=(10, 10))
for i in range(9):
  augmented_images = augmentation(np.expand_dims(image, axis=0))
  ax = plt.subplot(3, 3, i+1)
  plt.imshow(augmented_images[0])
  plt.axis("off")
plt.show()

Yeah it's cool

Now let's make some changes to the `prepare` function (which we defined in the last session). <br>
We will add the `augmentation` step to it.

In [None]:
AUTOTUNE = tf.data.experimental.AUTOTUNE

def prepare(dataset, shuffle=False, augment=False):
  dataset = dataset.map(lambda x,y: (resize_and_rescale(x), y),
                        num_parallel_calls=AUTOTUNE)

  if shuffle:
    dataset = dataset.shuffle(BUFFER_SIZE)
  dataset = dataset.batch(BATCH_SIZE)

  if augment:
    dataset = dataset.map(lambda x,y: (augmentation(x, training=True), y),
                          num_parallel_calls=AUTOTUNE)

  return dataset.prefetch(buffer_size=AUTOTUNE)

In [None]:
train_ds = prepare(train, shuffle=True, augment=True)
val_ds = prepare(val)

OK, now we will reuse our previously defined model.<br>
* Without augmentation it overfitted.<br>
* Now let us check whether the same thing happens again.

In [None]:
model = tf.keras.models.Sequential([
        layers.Conv2D(32, kernel_size=3, padding="same", activation="relu"),
        layers.Conv2D(64, kernel_size=3, padding="same", activation="relu"),
        layers.MaxPool2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax")
])

model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
    optimizer = "adam",
    metrics=["accuracy"]
)

history = model.fit(
    train_ds,
    epochs = NUM_EPOCHS,
    validation_data = val_ds
)

OK, it seems our model is no longer overfitted. Great! But the accuracy is too low.

Let me first plot the learning curves

In [None]:
pd.DataFrame(history.history).plot()

Yeah, everything is fine, but the accuracy is low.<br>
* Previously we ran our model for 5 epochs. Now we will train for more epochs (`epochs=30`), and I will use the `EarlyStopping` callback to stop training once the validation loss stops improving, which is a sign that the model is starting to overfit.
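Here is a minimal pure-Python sketch of the patience idea behind early stopping. It is a simplification, not the Keras implementation: it assumes the monitored value is the validation loss, and `early_stopping_epoch` and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Made-up validation losses: the best value so far is reached at epoch 2.
losses = [1.10, 0.95, 0.97, 0.96, 0.90]
stop = early_stopping_epoch(losses, patience=2)
print(stop)
```

Note that with a small patience, training can stop before a later improvement (epoch 5 in this made-up sequence) is ever seen, so a larger `patience` is worth trying if training ends too soon.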

In [None]:
model = tf.keras.models.Sequential([
        layers.Conv2D(32, kernel_size=3, padding="same", activation="relu"),
        layers.Conv2D(64, kernel_size=3, padding="same", activation="relu"),
        layers.MaxPool2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax")
])

model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
    optimizer = "adam",
    metrics=["accuracy"]
)

history = model.fit(
    train_ds,
    epochs = 30,
    validation_data = val_ds,
    callbacks = [tf.keras.callbacks.EarlyStopping(patience=2)]
)

It seems we didn't achieve much. Let us plot learning curves.

In [None]:
pd.DataFrame(history.history).plot()

In [None]:
test_ds = prepare(test)
model.evaluate(test_ds)

`62.16%` accuracy on the test set. Can we do better? Yes, we can.<br>


Let me create a new model with a new architecture, since the previous model is not giving good accuracy. <br> In this new model I am also going to change the optimizer to `RMSprop`.
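For anyone who has not met RMSprop before, here is a minimal sketch of its basic update rule on a single weight (no momentum, no centering). The `rho` and `eps` values are what I believe the Keras defaults to be, so treat them as assumptions; the exact placement of `eps` also varies between implementations.

```python
import math

def rmsprop_step(w, grad, avg_sq, lr=0.001, rho=0.9, eps=1e-7):
    """One RMSprop update for a single scalar weight."""
    # Running average of squared gradients.
    avg_sq = rho * avg_sq + (1.0 - rho) * grad ** 2
    # Scale the step by the root of that average.
    w = w - lr * grad / (math.sqrt(avg_sq) + eps)
    return w, avg_sq

# Toy problem: minimise f(w) = w**2, whose gradient is 2*w, starting at w = 1.0.
w, avg_sq = 1.0, 0.0
for _ in range(200):
    w, avg_sq = rmsprop_step(w, 2.0 * w, avg_sq)

print(w)
```

Because each step is divided by the recent gradient magnitude, the step size stays close to the learning rate regardless of how large or small the raw gradients are.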

## **New model**

In [None]:
tf.keras.backend.clear_session()
tf.random.set_seed(42)
np.random.seed(42)

model = tf.keras.Sequential([
        layers.Conv2D(16, 3, padding='same', activation='relu'),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, padding='same', activation='relu'),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding='same', activation='relu'),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dropout(0.2),
        layers.Dense(128, activation='relu'),
        layers.Dense(3, activation='softmax')

])


model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
    optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),
    metrics=['accuracy']
)

history = model.fit(
    train_ds,
    epochs=15,
    validation_data = val_ds
)

I think it is good. Let us plot the learning curves.

In [None]:
pd.DataFrame(history.history).plot()

It is pretty good. Let us plot some predictions.

In [None]:
plot_predictions(test, model, n_rows=6, n_cols=6)

Out of `36` images, only one is wrong 😃

In [None]:
test_ds = prepare(test)
model.evaluate(test_ds)

`94.59%` accuracy. Awesome!
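Here is a back-of-the-envelope check on these accuracy figures. It is not output from the notebook: it assumes the `test` split has 372 images (the size I believe TFDS lists; only the 2892 total is stated above), so the `test[90%:]` slice would hold about 37 images. Under that assumption, the reported numbers match 23 and 35 correct predictions.

```python
def accuracy_percent(correct, total):
    """Accuracy as a percentage, rounded to two decimals."""
    return round(100.0 * correct / total, 2)

HELD_OUT = 37  # assumption: roughly 10% of a 372-image test split

first_model = accuracy_percent(23, HELD_OUT)   # hypothetical count of correct predictions
new_model = accuracy_percent(35, HELD_OUT)     # hypothetical count of correct predictions
per_image = round(100.0 / HELD_OUT, 2)         # accuracy points contributed by one image

print(first_model, new_model, per_image)
```

If that assumption holds, a single image moves the test accuracy by almost three points, so small differences between runs on this slice should not be over-interpreted.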

Thank you guys. We can do more, so try to achieve a better score than this. <br>
Follow my channel. Thank you, guys!