# Final Project Presentation

## Andrew Bahsoun

## 11 December 2024

```python
from IPython.display import Latex
```

## Project Details
An image classification model that determines which step of hand washing a person is currently on.

<img src="images/allsteps.png" alt="drawing" width="400"/>

## Preprocessing the data

- What I have: a large set of videos for each step
- What I need: images of hands for each step


`get_frames_from_video` reads a video file, extracts every frame, and saves each frame as a JPEG image in an output directory.
It uses OpenCV to read the video frame by frame and names each file after the frame number and the original video filename.
The loop runs until no more frames can be read.

```python
import os

import cv2

def get_frames_from_video(directory, filename, step, output_frames_dir):
    # `steps` is not shown in this section; it is assumed to map a step
    # number to the name of the folder that holds that step's videos
    cap = cv2.VideoCapture(os.path.join(directory, steps[step], filename))
    video_name = os.path.splitext(filename)[0]

    is_success, image = cap.read()
    frame_number = 0

    # cap.read() reports failure once the video has no frames left
    while is_success:
        out_filename = "frame_{}_{}.jpg".format(frame_number, video_name)
        save_path_and_name = os.path.join(output_frames_dir, out_filename)
        cv2.imwrite(save_path_and_name, image)
        is_success, image = cap.read()
        frame_number += 1

    # Release the video file handle when we are done
    cap.release()

This code splits the videos for each step into training and testing sets using a `test_ratio` of 30%, then saves the extracted frames to the matching directory.
```python
test_ratio = 0.3

for step in range(1, 13):
    # Skip macOS metadata files so they do not count toward the split
    videos = [v for v in all_file_names_dict[step] if v != ".DS_Store"]
    # Size the split from this step's own video count, not the number of steps
    n_test = round(len(videos) * test_ratio)
    n_train = len(videos) - n_test

    for counter, video in enumerate(videos):
        if counter < n_train:
            # train data
            get_frames_from_video(input_dir, video, step, output_frames_dir_train)
        else:
            # test data
            get_frames_from_video(input_dir, video, step, output_frames_dir_test)
```
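As a quick sanity check of the intended 70/30 split, here is a minimal sketch on a toy list. The helper name `split_train_test` and the `clip_*.mp4` filenames are made up for illustration and are not part of the project code.

```python
def split_train_test(videos, test_ratio=0.3):
    """Return (train, test) lists, holding out roughly `test_ratio` for testing."""
    # Skip macOS metadata files so they do not count toward the split
    videos = [v for v in videos if v != ".DS_Store"]
    n_test = round(len(videos) * test_ratio)
    n_train = len(videos) - n_test
    return videos[:n_train], videos[n_train:]


# Hypothetical file names, used only to exercise the function
example = [f"clip_{i}.mp4" for i in range(10)] + [".DS_Store"]
train, test = split_train_test(example)
print(len(train), len(test))
```

Splitting by video rather than by frame keeps near-identical frames from the same clip out of both sets at once.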

Now we have two directories that contain the frames we need:
- output_frames_dir_train
- output_frames_dir_test

However, the frames are not yet sorted into subdirectories. We need one subdirectory per step so that TensorFlow can tell our classes apart.

Moving test frames into their directories

```python
#moving all test photos step(1-9) into their respective directories
for step in range(1, 10):
    move_video_into_subdirectory_onedigit(output_frames_dir_test, 
                                          os.path.join(output_frames_dir_test,('step_' + str(step))), step)

#moving all test photos step(10-12) into their respective directories
for step in range(10, 13):
    move_video_into_subdirectory_twodigit(output_frames_dir_test, 
                                          os.path.join(output_frames_dir_test,('step_' + str(step))), step)
```

Moving train photos into their directories
```python
#moving all train photos step(1-9) into their respective directories
for step in range(1, 10):
    move_video_into_subdirectory_onedigit(output_frames_dir_train, 
                                          os.path.join(output_frames_dir_train,('step_' + str(step))), step)
    
#moving all train photos step(10-12) into their respective directories
for step in range(10, 13):
    move_video_into_subdirectory_twodigit(output_frames_dir_train, 
                                          os.path.join(output_frames_dir_train,('step_' + str(step))), step)
```
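The loops above move frames into `step_N` folders. The sketch below shows one way those folders could be created beforehand, and what order they sort in. The helper `make_step_dirs` is hypothetical, and the temporary directory stands in for the real train/test directories.

```python
import os
import tempfile


def make_step_dirs(root, n_steps=12):
    """Create root/step_1 ... root/step_<n_steps> and return the sorted folder names."""
    for step in range(1, n_steps + 1):
        os.makedirs(os.path.join(root, "step_" + str(step)), exist_ok=True)
    return sorted(os.listdir(root))


# A throwaway directory is used here instead of the project's real paths
with tempfile.TemporaryDirectory() as tmp:
    class_dirs = make_step_dirs(tmp)

print(class_dirs)
```

The sorted order puts `step_10` right after `step_1`. To my knowledge Keras infers class labels from folder names in alphanumeric order, so label index 1 would correspond to `step_10` rather than `step_2`. That is worth keeping in mind when reading predictions or a confusion matrix.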

## Creating our datasets in TensorFlow

### Imports
```python
import os
import pathlib

import cv2
import matplotlib.pyplot as plt
import numpy as np
import PIL
import tensorflow as tf
from sklearn.metrics import confusion_matrix

# Import everything through tensorflow.keras so the two Keras namespaces are not mixed
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.callbacks import ReduceLROnPlateau
from tensorflow.keras.layers import (BatchNormalization, Conv2D, Dense, Dropout,
                                     Flatten, MaxPooling2D)
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam, RMSprop
```

## Now, we can create our training and validation data sets.

Each batch will have the shape (16, 128, 128, 3)
(16 = batch size, 128 = image height, 128 = image width, 3 = color channels (red, green, blue))
```python
# Parameters
batch_size = 16
img_height = 128
img_width = 128

# Load training dataset
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    train_dir,
    batch_size=batch_size,
    image_size=(img_height, img_width),
    seed=123,
    validation_split=0.2,
    subset="training"
)
```

```python
# Load validation dataset (the held-out 20% of the training directory)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    train_dir,
    batch_size=batch_size,
    image_size=(img_height, img_width),
    seed=123,
    validation_split=0.2,
    subset="validation"
)
```

**Let's check that our images have loaded correctly**

```python
import matplotlib.pyplot as plt

# Class names are inferred from the subdirectory names
class_names = train_ds.class_names

# Show the first nine images of one batch with their labels
plt.figure(figsize=(10, 10))
for images, labels in train_ds.take(1):
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[labels[i]])
        plt.axis("off")
```

<img src="images/testcodeoutput.png" alt="drawing" width="400"/>

# All of our images are loaded up, and we are ready to train our model. 

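A CNN is built from a stack of different layers. CNNs are commonly used for image classification because a convolution reuses the same small filter across the whole image, so the model needs far fewer weights than a fully connected network would for inputs this large.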
<img src="images/cnn_model.jpg" alt="drawing" width="400"/>

Each pixel of a color image stores three values: red, green, and blue. These three color channels are why the last dimension of our shape is 3: (batch size, 128, 128, 3)
<img src="images/dog_colors.png" alt="drawing" width="300"/>


<img src="images/feature_channels.png" alt="drawing" width="300"/>


**We are ready to start training. Let's add some data augmentation so the model is more robust to flipped, rotated, and zoomed images.**
```python
data_augmentation = keras.Sequential(
  [
    layers.RandomFlip("horizontal",
                      input_shape=(img_height,
                                  img_width,
                                  3)),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
  ]
)
```

**Now we can create our model.**

We will add every element to our model using `model = Sequential([ ... ])`. Each piece of that list is shown separately.

**data_augmentation**

```python
data_augmentation,
```

**normalization**
```python
layers.Rescaling(1./255),
```

**First Conv Block**
```python
layers.Conv2D(32, 3, padding='same', activation='relu'),
BatchNormalization(),
layers.MaxPooling2D(pool_size=(2, 2)),
```

## Convolutions:
![gif](https://miro.medium.com/v2/resize:fit:1400/format:webp/1*D6iRfzDkz-sEzyjYoVZ73w.gif)
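To make the animation concrete, here is a minimal NumPy sketch of a single-channel convolution with `'same'` padding. It is an illustration only: Keras `Conv2D` also sums over input channels, adds a bias, and learns the kernel values. The function name `conv2d_same` and the averaging kernel are my own choices.

```python
import numpy as np


def conv2d_same(image, kernel):
    """Slide an odd-sized `kernel` over a 2-D `image`, zero-padding so the size is kept."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros(image.shape, dtype=float)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            # Multiply the window under the kernel element-wise and sum the result
            out[r, c] = np.sum(padded[r:r + kh, c:c + kw] * kernel)
    return out


image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((3, 3)) / 9.0  # simple averaging filter
feature_map = conv2d_same(image, kernel)
print(feature_map.shape)
```

Like deep learning libraries, this sketch does not flip the kernel, so strictly speaking it computes a cross-correlation.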


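## ReLU

ReLU (rectified linear unit) adds non-linearity to the model: it outputs max(0, x), so negative values become 0 and positive values pass through unchanged.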

<img src="images/relu.jpg" alt="relu" width="400"/>
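A minimal NumPy sketch of ReLU, with sample values chosen only for illustration:

```python
import numpy as np


def relu(x):
    """Element-wise max(0, x)."""
    return np.maximum(0, x)


values = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
activated = relu(values)
print(activated)
```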

**Multiclass Cross-Entropy**

$$\text{Loss} = - \sum_{i=1}^{\text{output size}} y_i \cdot \log \hat{y}_i$$

- $\text{Loss}$: the overall loss function to minimize during training.
- $y_i$: the true label for the $i$-th class (1 for the correct class, 0 otherwise).
- $\hat{y}_i$: the predicted probability for the $i$-th class.
- $\log \hat{y}_i$: logarithm of the predicted probability; penalizes incorrect predictions.
- $\sum_{i=1}^{\text{output size}}$: summation over all classes in the output.
- $\text{output size}$: total number of classes in the problem.
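The cross-entropy loss above can be sketched in NumPy as follows. The three-class probabilities are made-up numbers, and the small `eps` clip is my addition to avoid taking log(0).

```python
import numpy as np


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Multiclass cross-entropy for one example with a one-hot label."""
    y_pred = np.clip(y_pred, eps, 1.0)  # guard against log(0)
    return -np.sum(y_true * np.log(y_pred))


y_true = np.array([0.0, 1.0, 0.0])        # the correct class is index 1
confident = np.array([0.05, 0.90, 0.05])  # puts most probability on the right class
unsure = np.array([0.40, 0.30, 0.30])     # puts little probability on the right class

loss_confident = cross_entropy(y_true, confident)
loss_unsure = cross_entropy(y_true, unsure)
print(loss_confident, loss_unsure)
```

Because the label is one-hot, only the predicted probability of the correct class contributes, so the loss grows as that probability shrinks.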

## Pooling Layer
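My model uses max pooling with a (2, 2) window: each 2×2 block of the feature map is replaced by its largest value, which halves the height and width.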
<img src="images/pooling.png" alt="drawing" width="400"/>
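A minimal NumPy sketch of 2×2 max pooling on one feature map, using a small made-up array:

```python
import numpy as np


def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling on a 2-D array (odd trailing rows/columns are dropped)."""
    h, w = feature_map.shape
    trimmed = feature_map[:h - h % 2, :w - w % 2]
    # Group pixels into 2x2 blocks, then take the maximum of each block
    blocks = trimmed.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))


fm = np.array([[1, 3, 2, 4],
               [5, 6, 1, 2],
               [7, 2, 9, 0],
               [1, 8, 3, 4]])
pooled = max_pool_2x2(fm)
print(pooled)
```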

## Batch Normalization
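- Normalizes the inputs to the next layer using the mean and variance of the current mini-batch, then rescales and shifts them with learned parameters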
<img src="images/batch_normalization.png" alt="drawing" width="400"/>

$$[3, 5, 8, 9, 11, 24]$$

$$\mu_B = \frac{1}{m_B} \sum_{i=1}^{m_B} x^{(i)} = \frac{1}{6}(3 + 5 + 8 + 9 + 11 + 24) = 10$$

$$\sigma_B^2 = \frac{1}{m_B} \sum_{i=1}^{m_B} \left(x^{(i)} - \mu_B\right)^2 = \frac{1}{6}\left((3 - 10)^2 + (5 - 10)^2 + \dots + (24 - 10)^2\right) = 46$$

$$\hat{x}^{(i)} = \frac{x^{(i)} - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}$$

$$\hat{x}^{(1)} = \frac{3 - 10}{\sqrt{46 + 0.00001}} = -1.03$$

$$[-1.03, -0.74, -0.29, -0.15, 0.15, 2.06]$$

Mean = 0, Std = 0.998

$$z^{(i)} = \gamma \otimes \hat{x}^{(i)} + \beta$$
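The worked example above can be reproduced with a few lines of NumPy. This sketch covers only the training-time computation on one batch; `gamma = 1` and `beta = 0` are placeholder values for the learned scale and shift.

```python
import numpy as np


def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a batch to zero mean and unit variance, then scale and shift."""
    mu = x.mean()
    var = x.var()  # population variance: divides by the batch size m_B
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta, mu, var


batch = np.array([3, 5, 8, 9, 11, 24], dtype=float)
z, mu, var = batch_norm(batch)
print(mu, var)
print(np.round(z, 2))
```

At inference time, Keras `BatchNormalization` uses moving averages of the mean and variance collected during training instead of the statistics of the current batch.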

**Second Conv Block**
```python
layers.Conv2D(64, 3, padding='same', activation='relu'),
BatchNormalization(),
layers.MaxPooling2D(pool_size=(2, 2)),
```

**Third Conv Block**
```python
layers.Conv2D(128, 3, padding='same', activation='relu'),
BatchNormalization(),
layers.MaxPooling2D(pool_size=(2, 2)),
```
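To see how the three conv blocks shrink the image, here is a small sketch that traces the output shape of each block. It assumes the default `MaxPooling2D` stride (equal to the pool size); `padding='same'` means the convolution itself leaves height and width unchanged. The function name `trace_shapes` is hypothetical.

```python
def trace_shapes(size=128, filters=(32, 64, 128)):
    """Return the (height, width, channels) shape after each conv block."""
    shapes = []
    for n_filters in filters:
        # The 'same' convolution keeps the spatial size; 2x2 max pooling halves it
        size //= 2
        shapes.append((size, size, n_filters))
    return shapes


shapes = trace_shapes()
height, width, channels = shapes[-1]
flattened = height * width * channels
print(shapes)
print(flattened)
```

So the `Flatten` layer that follows receives 16 × 16 × 128 = 32,768 values per image.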

**Regularization**
```python
layers.Dropout(0.5),
```
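A minimal NumPy sketch of what dropout with rate 0.5 does during training. The fixed seed and the all-ones input are only there to make the example repeatable.

```python
import numpy as np


def dropout(x, rate=0.5, rng=None):
    """Zero out roughly `rate` of the values and rescale the rest (inverted dropout)."""
    rng = np.random.default_rng(0) if rng is None else rng
    keep = rng.random(x.shape) >= rate
    # Scale the kept values so the expected sum stays the same
    return x * keep / (1.0 - rate)


activations = np.ones(10_000)
dropped = dropout(activations)
print(dropped[:10])
print(dropped.mean())
```

Each value is either dropped to 0 or doubled, so the average stays close to 1. Dropout is only active during training; at inference the layer passes its input through unchanged.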

**Fully Connected Layers**
```python
layers.Flatten(),
layers.Dense(256, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)),
BatchNormalization(),
layers.Dropout(0.5),
layers.Dense(num_classes, activation='softmax')
```
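The final `Dense` layer uses softmax to turn its raw scores into class probabilities. A minimal NumPy sketch, with made-up scores for three classes:

```python
import numpy as np


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(probs)
print(probs.argmax())
```

The predicted step is the class with the highest probability, and these probabilities are the $\hat{y}_i$ values that the cross-entropy loss compares against the true label.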