## Hand Gesture Recognition by Sky Scavengers

This cell block imports essential libraries from TensorFlow and other packages to set up data handling, image processing, and neural network modeling using Keras. It includes modules for image handling (PIL), numerical arrays (NumPy), and building a sequential model (Sequential) with specific layers for constructing a convolutional neural network (CNN). It also imports tools for regularization (l2), custom callbacks (Callback), and the Adam optimizer for training.


In [1]:
import gdown
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import os
from PIL import Image
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, BatchNormalization, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.regularizers import l2
from tensorflow.keras.callbacks import Callback
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing import image

This cell downloads the dataset zip file from the given Google Drive link using the "gdown" library. The downloaded file is saved as "dataset.zip" in the "/content" directory. The "quiet=False" parameter displays the download progress.

In [2]:
url = "https://drive.google.com/uc?id=1w05lAAN30z1Ho7k2731Sv9-THoEPjDe1"
output = "/content/dataset.zip"

gdown.download(url, output, quiet=False)

Downloading...
From: https://drive.google.com/uc?id=1w05lAAN30z1Ho7k2731Sv9-THoEPjDe1
To: /content/dataset.zip
100%|██████████| 474M/474M [00:03<00:00, 146MB/s]


'/content/dataset.zip'

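This cell extracts "dataset.zip" into the current directory ("/content") using the "unzip" command. The "-qq" flag suppresses the usual file-by-file output. The prompt shown below appears only when the files already exist from an earlier run; answering "A" overwrites all of them.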

In [3]:
!unzip -qq dataset.zip

replace Rock-Paper-Scissors/test/paper/testpaper01-00.png? [y]es, [n]o, [A]ll, [N]one, [r]ename: A
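
The overwrite prompt above can be avoided by extracting from Python instead. The following is a minimal sketch using the standard-library `zipfile` module, which overwrites existing files without asking; the helper name `extract_archive` is hypothetical and not part of the original notebook.

```python
import zipfile
from pathlib import Path

def extract_archive(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the number of archive members."""
    zip_path = Path(zip_path)
    if not zipfile.is_zipfile(zip_path):
        raise ValueError(f"{zip_path} is not a valid zip archive")
    with zipfile.ZipFile(zip_path) as archive:
        # Existing files in dest_dir are overwritten without a prompt.
        archive.extractall(dest_dir)
        return len(archive.namelist())
```

In this notebook the call would look like `extract_archive("/content/dataset.zip", "/content")`.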


These lines assign directory paths to variables "train" and "valid". "train" points to the training data directory "/content/Rock-Paper-Scissors/train", while "valid" points to the validation data directory "/content/Rock-Paper-Scissors/validation".

In [15]:
train = '/content/Rock-Paper-Scissors/train'
valid = '/content/Rock-Paper-Scissors/validation'

Here, an image data generator named "train_data" is created. It applies rescaling to pixel values by dividing them by 255 and allocates 20% of the training data for validation.

Then, a generator called "train_generator" is established using the "flow_from_directory" method. It reads images from the "train" directory, processes them in batches of 32, and categorizes them. The images are resized to (300, 300) pixels, and the "subset" parameter is set to "training" to indicate that this generator is used for training data.

In [16]:
train_data = ImageDataGenerator(rescale = 1.0/255, validation_split=0.2)

train_generator = train_data.flow_from_directory(train, batch_size=32, class_mode='categorical', target_size=(300, 300), subset='training')

Found 2016 images belonging to 3 classes.
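
The image count reported above determines how many batches make up one epoch, which is the `63` that appears in the training log further down. This is a small sketch of that arithmetic, assuming the final partial batch (if any) is kept; `batches_per_epoch` is a hypothetical helper.

```python
import math

def batches_per_epoch(num_samples, batch_size):
    """Number of batches needed to cover num_samples once (the last may be smaller)."""
    return math.ceil(num_samples / batch_size)

steps = batches_per_epoch(2016, 32)
print(steps)  # 63
```

In the notebook itself, `len(train_generator)` returns the same value.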


This cell segment accomplishes the following tasks:

1. It creates a list called "image_paths" by combining the "valid" directory path with each filename obtained from the "os.listdir" function.
2. It initializes an empty list named "preprocessed_images" to store the processed images.
3. It iterates through each image path in the "image_paths" list:
   a. Opens the image using the PIL library's "Image.open" function.
   b. Resizes the image to dimensions (300, 300).
   c. Converts the image to a NumPy array and normalizes the pixel values by dividing them by 255.0.
   d. Appends the preprocessed image to the "preprocessed_images" list.

This process prepares the validation images by resizing them, converting to NumPy arrays, and normalizing pixel values between 0 and 1.

In [17]:
image_paths = [os.path.join(valid, filename) for filename in os.listdir(valid)]

preprocessed_images = []
for image_path in image_paths:
    img = Image.open(image_path)
    img = img.resize((300, 300))
    img = np.array(img) / 255.0
    preprocessed_images.append(img)

This line of code converts the list of preprocessed validation images stored in the "preprocessed_images" list into a NumPy array named "validation_data". This NumPy array will be used as input data for validation during the model evaluation phase.

In [18]:
validation_data = np.array(preprocessed_images)
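
Stacking a list into one array only works cleanly when every image has the same shape. The sketch below uses synthetic arrays in place of the loaded PNG files; the random data and the names `fake_images` and `batch` are illustrative and do not come from the dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
# Four synthetic 8-bit RGB "images" standing in for the resized files.
fake_images = [
    rng.integers(0, 256, size=(300, 300, 3), dtype=np.uint8) for _ in range(4)
]

# Same normalization as above: scale 0..255 integers to 0.0..1.0 floats.
normalized = [img / 255.0 for img in fake_images]

# Stacking equally shaped arrays adds a leading "image index" axis.
batch = np.array(normalized)
print(batch.shape)  # (4, 300, 300, 3)
print(batch.dtype)  # float64
```

If some files had a different number of channels (for example RGBA or grayscale), the shapes would no longer match; calling PIL's `Image.convert('RGB')` on each image before resizing is one way to guard against that.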

The provided cell accomplishes the following tasks:

1. It calculates the number of images in the "image_paths" list and stores the count in the variable "num_images".
2. It creates a NumPy array named "dummy_labels" with dimensions (num_images, ) filled with zeros. This array is intended to serve as placeholder labels for the validation images. The array's shape corresponds to the number of validation images, and each element represents a dummy label (0 in this case).

In [19]:
num_images = len(image_paths)
dummy_labels = np.zeros((num_images,))
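
The model defined later in this notebook ends in a 3-unit softmax and is trained with `categorical_crossentropy`, so real labels would need the shape `(num_images, 3)` rather than `(num_images,)`. The sketch below shows the difference, assuming hypothetical integer class ids (`true_labels`) that this notebook does not actually have for the validation images.

```python
import numpy as np

num_images = 5                            # illustrative count, not the real one
dummy_labels = np.zeros((num_images,))    # placeholder labels, as in the cell above
true_labels = np.array([0, 2, 1, 1, 0])   # hypothetical class ids

# One row per image, one column per class, with a single 1 in each row.
one_hot = np.eye(3)[true_labels]

print(dummy_labels.shape)  # (5,)
print(one_hot.shape)       # (5, 3)
```

The placeholders are therefore enough to drive a generator, but they cannot yield a meaningful validation loss or accuracy.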

This cell segment establishes a validation data generator named "validation_generator" using the "flow" method from the "ImageDataGenerator" class. It specifies the following configurations:

1. **Input Data**: The preprocessed validation images array ("validation_data") serves as the input data ("x").
2. **Labels**: The "dummy_labels" array is used as the labels ("y") for the validation images.
3. **Batch Size**: The generator processes images in batches of 32 ("batch_size").
4. **Shuffle**: The parameter "shuffle" is set to "False," indicating that the order of validation data won't be shuffled.

The validation generator is designed to provide input data and labels for evaluating the model's performance during validation.

In [20]:
validation_generator = ImageDataGenerator().flow(
    x=validation_data,
    y=dummy_labels,
    batch_size=32,
    shuffle=False
)

This cell defines a sequential neural network model using TensorFlow's Keras. The model includes:

1. A convolutional layer with 64 filters and ReLU activation.
2. A batch normalization layer to normalize outputs.
3. A dropout layer to prevent overfitting.

4. Another convolutional layer with 64 filters, ReLU activation, and max-pooling.
5. Another dropout layer.

6. A convolutional layer with 128 filters, ReLU activation, and max-pooling.
7. Another dropout layer.

8. A flatten layer to reshape the data.
9. A dense layer with 256 units, ReLU activation, and regularization.
10. A dropout layer.

11. A dense output layer with 3 units and softmax activation.

The model's architecture is summarized using "model.summary()".

In [21]:
model = Sequential([
    Conv2D(64, (5, 5), activation='relu', input_shape=(300, 300, 3)),
    BatchNormalization(),
    Dropout(0.25),

    Conv2D(64, (3, 3), activation='relu', padding='same'),
    MaxPooling2D((2, 2)),
    Dropout(0.25),

    Conv2D(128, (3, 3), activation='relu', padding='same'),
    MaxPooling2D((2, 2)),
    Dropout(0.25),

    Flatten(),

    Dense(256, activation='relu', kernel_regularizer=l2(0.001)),
    Dropout(0.5),
    Dense(3, activation='softmax')
])

model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_3 (Conv2D)           (None, 296, 296, 64)      4864      
                                                                 
 batch_normalization_1 (Batc  (None, 296, 296, 64)     256       
 hNormalization)                                                 
                                                                 
 dropout_4 (Dropout)         (None, 296, 296, 64)      0         
                                                                 
 conv2d_4 (Conv2D)           (None, 296, 296, 64)      36928     
                                                                 
 max_pooling2d_2 (MaxPooling  (None, 148, 148, 64)     0         
 2D)                                                             
                                                                 
 dropout_5 (Dropout)         (None, 148, 148, 64)     
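
The shapes and parameter counts in the summary can be reproduced by hand. The sketch below recomputes them from the layer definitions; the helper names are hypothetical, and the values for layers beyond the truncated summary are derived from the model code, not read from the printed output.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return (in_units + 1) * out_units

def valid_conv_size(size, kernel):
    """Output width/height of a stride-1 convolution without padding."""
    return size - kernel + 1

side = valid_conv_size(300, 5)         # 296
conv1 = conv2d_params(5, 5, 3, 64)     # 4864
batch_norm = 4 * 64                    # gamma, beta, moving mean, moving variance: 256
conv2 = conv2d_params(3, 3, 64, 64)    # 36928 ('same' padding keeps 296 x 296)
side = side // 2                       # 148 after the first 2x2 max-pool
conv3 = conv2d_params(3, 3, 64, 128)   # 73856
side = side // 2                       # 74 after the second 2x2 max-pool
flat = side * side * 128               # 700928 values after Flatten
dense1 = dense_params(flat, 256)       # 179437824
output = dense_params(256, 3)          # 771

print(conv1, batch_norm, conv2, conv3, flat, dense1, output)
```

Nearly all of the parameters sit in the first dense layer, because the feature map is still 74 x 74 x 128 when it is flattened.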

This cell compiles the previously defined neural network model:

- It sets the loss function to "categorical_crossentropy", suitable for multi-class classification.
- The optimizer is set to Adam, a popular optimization algorithm.
- The metric for evaluation during training is accuracy.

The model is now ready for training using the specified loss function, optimizer, and evaluation metric.

In [22]:
model.compile(loss = 'categorical_crossentropy', optimizer= Adam(), metrics=['accuracy'])
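
For a single sample, categorical cross-entropy reduces to the negative log of the probability assigned to the true class. The sketch below is a plain-Python illustration, not Keras's implementation; it uses the rock prediction printed in the final output of this notebook and assumes "rock" is the true label for that image.

```python
import math

def categorical_crossentropy(y_true, y_pred):
    """Cross-entropy for one sample: -sum(true * log(predicted))."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)

y_true = [0.0, 1.0, 0.0]                        # one-hot label for "rock"
y_pred = [0.07260961, 0.88902897, 0.03836141]   # rock prediction from the last cell
loss = categorical_crossentropy(y_true, y_pred)
print(round(loss, 4))  # 0.1176
```

A confident correct prediction gives a loss near 0, while a confident wrong one is penalized heavily.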

This cell defines a custom callback class named `myCallback`:

- Inside the class, the `on_epoch_end` method is implemented. This method is executed at the end of each training epoch.

- If the accuracy (retrieved from the `logs` dictionary) surpasses 99.5%, the training process is halted using `self.model.stop_training = True`.

- An instance of this custom callback, named `callbacks`, is created. This instance can be used during model training to stop the training process if the desired accuracy threshold is achieved.

In [23]:
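class myCallback(tf.keras.callbacks.Callback):
  def on_epoch_end(self, epoch, logs=None):
    # Stop training once the training accuracy exceeds 99.5%.
    accuracy = (logs or {}).get('accuracy')
    if accuracy is not None and accuracy > 0.995:
      self.model.stop_training = True

callbacks = myCallback()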
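
The stopping rule can be checked on its own, without TensorFlow. The sketch below isolates the threshold test in a hypothetical helper, `should_stop`, and also handles the case where `logs` is missing or has no `accuracy` entry.

```python
def should_stop(logs, threshold=0.995):
    """Return True when the logged training accuracy exceeds threshold."""
    accuracy = (logs or {}).get("accuracy")
    return accuracy is not None and accuracy > threshold

print(should_stop({"accuracy": 0.9961}))  # True
print(should_stop({"accuracy": 0.9100}))  # False
print(should_stop({}))                    # False
```

Guarding against a missing key matters because comparing `None` with a float raises a `TypeError` in Python 3.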

This cell trains the neural network model on the training generator:

- `train_generator` is used as the training data source.
- The model is trained for up to 25 epochs.
- The training progress is displayed because `verbose=1`.
- The number of steps per epoch is set to the length of `train_generator`.
- `validation_steps` is set to the length of `validation_generator`. Note that the generator itself is not passed as `validation_data`, so no validation metrics are computed during training.
- The custom callback `callbacks` stops training early if the training accuracy exceeds 99.5%.

The training history is stored in the `history` variable, which can be used to analyze and visualize the model's performance during training.

In [24]:
history = model.fit(train_generator, epochs=25, verbose=1, steps_per_epoch=len(train_generator), validation_steps=len(validation_generator), callbacks = [callbacks])

Epoch 1/25
 6/63 [=>............................] - ETA: 35s - loss: 167.0104 - accuracy: 0.4115



Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25


The cell below saves the trained model to the file 'Sky_Scavengers.h5'.

In [25]:
model.save('Sky_Scavengers.h5')

The saved model is then loaded back into the variable `model`.

In [26]:
model = tf.keras.models.load_model('Sky_Scavengers.h5')

This cell defines a function, `detect`, that classifies a single image with the loaded model:

- Loads the image at `test_image_path` and resizes it to (300, 300).
- Converts the image to a NumPy array and adds a batch dimension with `np.expand_dims`.
- Normalizes the pixel values of the image array by dividing by 255.
- Defines a list of class names (`class_names`) whose order must match the model's output units.
- Uses the loaded model to predict the class probabilities for the input image.
- Identifies the index of the class with the highest probability and looks up the corresponding class name.
- Prints the predicted class and the associated class probabilities.

The function relies on the global `model` loaded in the previous cell and on the `image` module imported in the first cell.

In [27]:
def detect(test_image_path):
  img = image.load_img(test_image_path, target_size=(300, 300))
  img_array = image.img_to_array(img)
  img_array = np.expand_dims(img_array, axis=0)
  img_array /= 255.0
  class_names = ['Paper', 'Rock', 'Scissor']

  predictions = model.predict(img_array)
  predicted_class_index = np.argmax(predictions[0])
  predicted_class = class_names[predicted_class_index]

  print("Predicted class:", predicted_class)
  print("Predicted probabilities:", predictions[0])


The cell below calls `detect` on one test image from each class subfolder. For each image, the function prints the predicted class and the class probabilities.

In [28]:
test_image_path = '/content/Rock-Paper-Scissors/test/scissors/testscissors01-06.png'
detect(test_image_path)
test_image_path = '/content/Rock-Paper-Scissors/test/paper/testpaper02-13.png'
detect(test_image_path)
test_image_path = '/content/Rock-Paper-Scissors/test/rock/testrock03-16.png'
detect(test_image_path)

Predicted class: Scissor
Predicted probabilities: [2.3357444e-02 2.7949698e-05 9.7661465e-01]
Predicted class: Paper
Predicted probabilities: [9.9913388e-01 1.0707701e-07 8.6605188e-04]
Predicted class: Rock
Predicted probabilities: [0.07260961 0.88902897 0.03836141]
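
The order of `class_names` in `detect` matters because `flow_from_directory` assigns class indices to subfolder names in sorted order. The sketch below reproduces that mapping and the `argmax` lookup for the first prediction above; it assumes the training subfolders are named like the test subfolders ("rock", "paper", "scissors").

```python
class_dirs = ["rock", "paper", "scissors"]   # assumed training subfolder names

# Indices follow the sorted folder names.
class_indices = {name: i for i, name in enumerate(sorted(class_dirs))}
print(class_indices)  # {'paper': 0, 'rock': 1, 'scissors': 2}

# First prediction printed above.
probabilities = [2.3357444e-02, 2.7949698e-05, 9.7661465e-01]
best = max(range(len(probabilities)), key=probabilities.__getitem__)

index_to_name = {i: name for name, i in class_indices.items()}
print(index_to_name[best])  # scissors
```

In the notebook, the actual mapping can be read from `train_generator.class_indices` instead of being hard-coded.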
