# **ASL (American Sign Language) Alphabet Classification Using Convolutional Neural Network**



*   Noam Cohen, 312213218
*   Kathy Agafonov, 206332348
*   Nadav Cherry, 209264241





In this assignment we use a **convolutional neural network (CNN)** to **classify the ASL alphabet**.

The assignment is based on the Kaggle dataset: https://www.kaggle.com/datasets/grassknoted/asl-alphabet

## **Introduction**

American Sign Language (ASL) is a visual language used by the Deaf and Hard of Hearing community for communication. In the realm of computer vision and artificial intelligence, the recognition of ASL gestures holds significant importance.

This assignment focuses on ASL Alphabet Classification using Convolutional Neural Networks (CNNs). The ASL Alphabet comprises distinct hand gestures representing each letter of the English alphabet. Leveraging the power of deep learning, specifically CNNs, our objective is to develop a model capable of accurately identifying and classifying these ASL gestures.

The ASL Alphabet Classification task explores the complexities of image recognition, showcasing the capabilities of neural networks. This project not only demonstrates the potential of advanced technologies but also contributes to the development of tools that facilitate communication and accessibility for the Deaf community.

Through this exploration, we aim to bridge the gap between computer vision and the intricacies of sign language. By fostering inclusivity and technological advancements, we envision this project making a positive impact in the field of assistive technology, bringing us closer to a more inclusive and accessible future.

In [None]:
import matplotlib.pyplot as plt

# Hypothetical data based on provided statistics
categories = ['Children with Hearing Loss', 'Adults with Hearing Trouble', 'Adults with Tinnitus', 'Potential Hearing Aid Users', 'Cochlear Implants']
percentages = [0.2, 15, 10, 10, 0.01]  # Replace with actual data

# Bar Chart
plt.figure(figsize=(10, 6))
plt.barh(categories, percentages, color='skyblue')
plt.xlabel('Percentage of Population')
plt.title('Hearing-Related Statistics in the United States')
plt.grid(axis='x', linestyle='--', alpha=0.6)

plt.show()

These hearing-related statistics motivate our deep learning work on American Sign Language (ASL). Hearing loss is prevalent across diverse demographics, which reinforces the need for inclusive ASL recognition models and for a robust, representative dataset. The gap in hearing aid usage also aligns with our goal of using technology to improve accessibility and communication for people with hearing-related challenges.
<br>
<br>
The data in the graph is based on the source:
https://www.nidcd.nih.gov/health/statistics/quick-statistics-hearing




## **Download and import of packages**

In [None]:
# ! pip install scikit-image
! pip install pytorch_lightning
! pip install prettytable

In [None]:
import warnings
# Silence library warnings (including import-time warnings) for the rest of the notebook
warnings.simplefilter("ignore")

import os
from pathlib import Path

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
import vutils
from PIL import Image
from sklearn.metrics import confusion_matrix, accuracy_score, log_loss
from sklearn.metrics import auc, precision_recall_curve
from sklearn.model_selection import StratifiedKFold

import tensorflow as tf
import torch
import torch.nn.functional as F
from torch import nn
from torch.utils.data import Dataset, DataLoader, random_split
from torchvision import datasets, transforms
from torchvision.io import read_image
from torchvision.transforms import ToTensor, Lambda
from torchmetrics import Accuracy

import lightning as L
import pytorch_lightning as pl
from pytorch_lightning.callbacks.early_stopping import EarlyStopping
import neptune.new as neptune
from neptune.new.integrations.pytorch_lightning import NeptuneLogger

## **Import data from Google Drive**

In [None]:
# Google Drive connection (Colab only)
from google.colab import drive

drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
# Drive paths

drive_train_path = '/content/drive/MyDrive/Academy/DL/Assignment_1/dataset/Old/asl_alphabet_subset_train.zip'
drive_val_path = '/content/drive/MyDrive/Academy/DL/Assignment_1/dataset/Old/asl_alphabet_subset_valid.zip'
drive_test_path = '/content/drive/MyDrive/Academy/DL/Assignment_1/dataset/Old/asl_alphabet_subset_test.zip'


# Colab paths
train_path = 'img_dataset/train'
val_path = 'img_dataset/val'
test_path = 'img_dataset/test'

In [None]:
from zipfile import ZipFile

def extractData(sourcePath, DestPath):
  """Extract the .zip archive at sourcePath into the DestPath folder."""
  # Use a name other than `zip` so the built-in is not shadowed
  with ZipFile(sourcePath, 'r') as archive:
    archive.extractall(DestPath)
    print("Extracted all image files")

In [None]:
# Extract train data
extractData(drive_train_path, train_path)

# Extract validation data
extractData(drive_val_path, val_path)

# Extract test data
extractData(drive_test_path, test_path)

Extracted all image files
Extracted all image files
Extracted all image files


## **Task 1- Data analysis**

Exploring the dataset


In [None]:
import os
from skimage import io

# Specify the path to your dataset
dataset_path = 'C:\\Users\\nadav\\Downloads\\Final_Data_Full\\Final_Data_Full\\train'
test_data_path = 'archive/test'
train_path = 'C:\\Users\\nadav\\Downloads\\Final_Data_Full\\Final_Data_Full\\train'

# Function to check the shape of images in the dataset and count unique classes
def check_image_shapes_and_classes(dataset_path):
    unique_shapes = set()  # Use a set to store unique shapes for the entire dataset
    unique_classes = set()  # Use a set to store unique classes in the dataset

    classes = os.listdir(dataset_path)

    for class_name in classes:
        class_folder = os.path.join(dataset_path, class_name)
        if os.path.isdir(class_folder):
            unique_classes.add(class_name)  # Add the class to the set of unique classes

            images = os.listdir(class_folder)

            for img_name in images:
                img_path = os.path.join(class_folder, img_name)

                # Check if the current item is a file before attempting to read it
                if os.path.isfile(img_path):
                    sample_image = io.imread(img_path)
                    image_shape = sample_image.shape

                    # Add the shape to the set
                    unique_shapes.add(image_shape)

    # Print the set of unique shapes for the entire dataset
    print("Unique Shapes for the Entire Dataset:", unique_shapes)

    # Print the set of unique classes in the dataset
    print("Unique Classes in the Dataset:", sorted(unique_classes))
    print("Number of Unique Classes in the Dataset:", len(unique_classes))

# Check the shape of images and count unique classes in the dataset
check_image_shapes_and_classes(dataset_path)

A Visual and Tabular Examination of Class Distribution

In [None]:
def count_images_per_class(folder_path):
    class_counts = {}
    for class_name in os.listdir(folder_path):
        class_folder = os.path.join(folder_path, class_name)
        if os.path.isdir(class_folder):
            num_images = len(os.listdir(class_folder))
            class_counts[class_name] = num_images
    return class_counts

train_counts = count_images_per_class(train_path)
test_counts = count_images_per_class(test_data_path)

class_labels = list(train_counts.keys())

train_values = [train_counts.get(label, 0) for label in class_labels]
test_values = [test_counts.get(label, 0) for label in class_labels]

bar_width = 0.4
index = np.arange(len(class_labels))

plt.figure(figsize=(12, 6))
plt.bar(index, train_values, width=bar_width, label='Training', color='skyblue')
# Place the test bars directly next to the training bars so neighbouring classes do not overlap
plt.bar(index + bar_width, test_values, width=bar_width, label='Test', color='green')

plt.title('Distribution of Samples Across Classes')
plt.xlabel('Class')
plt.ylabel('Number of Images')
# Centre each tick between the two bars of its class
plt.xticks(index + bar_width / 2, class_labels)
plt.legend()
plt.show()
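
The heading above promises a tabular view alongside the bar chart. A minimal sketch of such a table in plain Python follows. `format_counts_table` is a hypothetical helper that is not part of the original notebook, and the two small dictionaries are made-up placeholder counts standing in for `train_counts` and `test_counts`.

```python
def format_counts_table(train_counts, test_counts):
    """Return a plain-text table of per-class image counts (hypothetical helper)."""
    labels = sorted(set(train_counts) | set(test_counts))
    header = f"{'Class':<10}{'Train':>8}{'Test':>8}"
    rows = [header, "-" * len(header)]
    for label in labels:
        # Classes missing from one split are reported with a count of 0
        rows.append(f"{label:<10}{train_counts.get(label, 0):>8}{test_counts.get(label, 0):>8}")
    rows.append("-" * len(header))
    rows.append(f"{'Total':<10}{sum(train_counts.values()):>8}{sum(test_counts.values()):>8}")
    return "\n".join(rows)

# Placeholder counts for illustration only; not the real dataset statistics.
example_train = {"A": 300, "B": 280, "space": 310}
example_test = {"A": 30, "B": 28}
print(format_counts_table(example_train, example_test))
```

In the notebook itself the call would presumably be `print(format_counts_table(train_counts, test_counts))`, which also makes classes that are absent from the test folder easy to spot.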

Benchmarks

The first benchmark model, trained for 5 epochs, achieved a validation accuracy of approximately 89.5%. The training process used a deep neural network,
and the code was executed on Kaggle Kernels with a GPU.
<br>
https://www.kaggle.com/code/dansbecker/running-kaggle-kernels-with-a-gpu

<br>

The second benchmark model, trained for 10 epochs,
achieved a higher validation accuracy of around 91.2%. This model also utilized a convolutional neural network (CNN) implemented with the Keras framework.
<br>
 https://www.kaggle.com/code/paultimothymooney/interpret-sign-language-with-deep-learning

Sample Images

In [None]:
import random

import cv2
from matplotlib import gridspec

# Specify the path to your training data folder
# train_path = 'archive/asl_alphabet_train/asl_alphabet_train'

# Function to randomly select a sample image from each class
def get_sample_images(folder_path, num_samples_per_class=2):
    sample_images = []
    for class_name in os.listdir(folder_path):
        class_folder = os.path.join(folder_path, class_name)
        if os.path.isdir(class_folder):
            images = os.listdir(class_folder)
            selected_images = random.sample(images, min(num_samples_per_class, len(images)))
            for img_name in selected_images:
                img_path = os.path.join(class_folder, img_name)
                sample_images.append((class_name, cv2.imread(img_path)))
    return sample_images

# Get sample images from each class
sample_images = get_sample_images(train_path)

# Calculate grid size dynamically
num_samples = len(sample_images)
num_cols = min(5, num_samples)
num_rows = -(-num_samples // num_cols)  # Ceiling division

# Plotting the sample images
plt.figure(figsize=(15, 3 * num_rows))
gs = gridspec.GridSpec(num_rows, num_cols, wspace=0.1, hspace=0.1)

for i, (class_name, img) in enumerate(sample_images):
    ax = plt.subplot(gs[i])
    ax.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    ax.set_title(class_name)
    ax.axis('off')

plt.show()


## **Task 2**

2A

In [None]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device

In [None]:
torch.cuda.is_available()

In [None]:
torch.cuda.empty_cache()

In [None]:
# test_data_path = 'archive/test'
# train_path = 'C:\\Users\\nadav\\PycharmProjects\\assignment1\\pythonProject7\\archive\\asl_alphabet_subset'
# train_path = 'C:\\Users\\nadav\\Downloads\\Final_Data_Full\\Final_Data_Full\\train'
train_path = 'path_to_train'
test_data_path = 'path_to_test'

label_mapping = {
        'A': 0, 'B': 1, 'C': 2, 'D': 3, 'E': 4, 'F': 5, 'G': 6, 'H': 7, 'I': 8, 'J': 9,
        'K': 10, 'L': 11, 'M': 12, 'N': 13, 'O': 14, 'P': 15, 'Q': 16, 'R': 17, 'S': 18,
        'T': 19, 'U': 20, 'V': 21, 'W': 22, 'X': 23, 'Y': 24, 'Z': 25, 'del': 26,
        'nothing': 27, 'space': 28, 'five': 29
    }
inverted_label_mapping = {v: k for k, v in label_mapping.items()}
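
`datasets.ImageFolder` assigns class indices by sorting the class folder names, so the hand-written mapping above only matches the labels the model sees if it follows that order. The sketch below checks this in plain Python. It assumes the training folders are named exactly like the first 29 keys above (A to Z, `del`, `nothing`, `space`), which the notebook does not show; `class_names`, `class_to_idx`, and `idx_to_class` are illustrative names.

```python
import string

# Assumed class folder names: 26 letters plus the three extra classes.
class_names = list(string.ascii_uppercase) + ["del", "nothing", "space"]

# Sorting the folder names and enumerating them mirrors how ImageFolder
# builds its class_to_idx dictionary (uppercase sorts before lowercase).
class_to_idx = {name: idx for idx, name in enumerate(sorted(class_names))}
idx_to_class = {idx: name for name, idx in class_to_idx.items()}

print(len(class_to_idx))      # number of classes
print(class_to_idx["del"])    # index assigned to the 'del' folder
print(idx_to_class[28])       # class name for the last index
```

Under that assumption the sorted order reproduces indices 0 to 28 of `label_mapping`. The additional `'five': 29` entry would only come into play if a folder with that name existed and `num_classes` were raised to 30.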

In [None]:
batch_size = 64
data_path = './data_folder/'

class CNN_ASL(pl.LightningModule):
    def __init__(self, train_data, val_data, data_dir=data_path, num_classes=29, learning_rate=2e-4, batch_size=batch_size):
        super().__init__()

        self.run = neptune.init_run(
        project="nadavcherry/dp1",
        api_token="", # your credentials
        )

        # Set our init args as class attributes
        self.transform = transforms.Compose(
            [
                transforms.Resize((200, 200)),
                transforms.Grayscale(num_output_channels=1),
                transforms.ToTensor(),
                # transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

            ]
        )


        self.asl_test = datasets.ImageFolder(test_data_path, transform=self.transform)
        self.asl_train = train_data
        self.asl_val = val_data
        self.data_dir = data_dir
        self.learning_rate = learning_rate
        self.batch_size=batch_size
        self.y = []
        self.preds = []
        # Hardcode some dataset specific attributes
        self.num_classes = num_classes

        # conv1: 1 input channel (grayscale; RGB would be 3), 32 output channels
        # (the number of learned filters), kernel size 3 (arbitrary choice)
        self.conv1 = nn.Conv2d(1,32,3,padding='same')
        self.conv2 = nn.Conv2d(32,64,3,padding='same')
        self.conv3 = nn.Conv2d(64,32,3,padding='same')
        self.conv4 = nn.Conv2d(32,64,3,padding='same')
        self.linear1 = nn.Linear(50*50*64,50)
        self.linear2 = nn.Linear(50,self.num_classes)
        self.mp = nn.MaxPool2d(2,2)
        self.relu=nn.ReLU()
        self.val_accuracy = Accuracy(task="multiclass", num_classes=num_classes)
        self.test_accuracy = Accuracy(task="multiclass", num_classes=num_classes)
        self.train_accuracy = Accuracy(task="multiclass", num_classes=num_classes)

        self.count_batch_train = 0
        self.count_batch_val = 0
        self.count_batch_test = 0
        self.train_loss1 = 0
        self.train_accuracy1 = 0

        self.val_loss1 = 0
        self.val_accuracy1 = 0

        self.test_loss1 = 0
        self.test_accuracy1 = 0

        self.val_loss_arr = []
        self.val_accuracy_arr = []

        self.count_good_high_confidence = 0
        self.count_good_classification_uncertain_confidence = 0
        self.count_bad_classification = 0

    def forward(self, x):
        # x = self.relu(self.conv_11(x))
        # x = self.relu(self.conv_7(x))
        # x = self.mp(x)
        x = self.relu(self.conv1(x))
        x = self.relu(self.conv2(x))
        x = self.mp(x)
        x = self.relu(self.conv3(x))
        x = self.relu(self.conv4(x))
        x = self.mp(x)
        x = x.view(-1,50*50*64)
        x = self.relu(self.linear1(x))
        x = self.linear2(x)
        return F.log_softmax(x,dim=1)


    def training_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = F.nll_loss(logits, y)
        preds = torch.argmax(logits, dim=1)
        acc = self.train_accuracy(preds, y)
        acc1 = acc.item()
        loss1 = loss.item()
        self.train_accuracy1 += acc1
        self.train_loss1 += loss1
        self.count_batch_train += 1
        self.run["train/accuracy_batch"].log(acc1)
        self.run["train/loss_batch"].log(loss1)
        return loss

    def on_train_epoch_end(self):
        self.run["train/accuracy_epochs"].log(self.train_accuracy1/self.count_batch_train)
        self.run["train/loss_epochs"].log(self.train_loss1/self.count_batch_train)
        self.train_accuracy1 = 0
        self.train_loss1 = 0
        self.count_batch_train = 0

    def validation_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = F.nll_loss(logits, y)
        preds = torch.argmax(logits, dim=1)
        acc = self.val_accuracy(preds, y)
        acc1 = acc.item()
        loss1 = loss.item()
        self.val_accuracy1 += acc1
        self.val_loss1 += loss1
        self.count_batch_val += 1
        self.run["val/accuracy_batch"].log(acc1)
        self.run["val/loss_batch"].log(loss1)
        # Calling self.log will surface up scalars for you in TensorBoard
        self.log("val/val_loss", loss, on_step=False, on_epoch=True, prog_bar=True)
        self.log("val/val_acc", self.val_accuracy, on_step=False, on_epoch=True, prog_bar=True)
        return loss

    def on_validation_epoch_end(self):
        val_acc = self.val_accuracy1/self.count_batch_val
        val_loss_temp = self.val_loss1/self.count_batch_val
        self.run["val/accuracy_epochs"].log(val_acc)
        self.run["val/loss_epochs"].log(val_loss_temp)

        self.val_loss_arr.append(val_loss_temp)
        self.val_accuracy_arr.append(val_acc)

        self.val_accuracy1 = 0
        self.val_loss1 = 0
        self.count_batch_val = 0

    def test_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = F.nll_loss(logits, y)
        logits = logits.cpu()
        y = y.cpu()
        preds = torch.argmax(logits, dim=1)
        # The model outputs log-probabilities; a softmax over them recovers the class probabilities
        probs = F.softmax(logits, dim=1)
        acc = self.test_accuracy(preds, y)
        self.test_accuracy1 += acc.item()
        self.test_loss1 += loss.item()
        self.count_batch_test += 1
        self.count_good_high_confidence = 0
        self.count_good_classification_uncertain_confidence = 0
        transform1 = transforms.ToPILImage()
        for i in range(len(y)):
            self.y.append(y[i])
            self.preds.append(preds[i])
            if y[i] == preds[i]:
                if probs[i, preds[i]] > 0.95:  # High confidence good classification
                    if self.count_good_high_confidence == 2:
                        continue
                    self.count_good_high_confidence += 1
                    img_path = f"images/example_good_high_confidence_{batch_idx}_{i}.png"
                    transform1(x[i]).save(img_path)
                    self.run[f"Good classification High confidence/true label: {inverted_label_mapping[y[i].item()]}, prediction: {inverted_label_mapping[preds[i].item()]}, probabilities: {probs[i, preds[i]]}, Batch_id: {batch_idx}, Example: {i}"].upload(img_path)
                    # display(transform1(x[i]))
                elif probs[i, preds[i]] < 0.4:  # Uncertain classification
                    if self.count_good_classification_uncertain_confidence == 2:
                        continue
                    self.count_good_classification_uncertain_confidence += 1
                    img_path = f"images/Uncertain_confidence_{batch_idx}_{i}.png"
                    transform1(x[i]).save(img_path)
                    self.run[f"Good classification Uncertain classification/true label: {inverted_label_mapping[y[i].item()]}, prediction: {inverted_label_mapping[preds[i].item()]}, probabilities: {probs[i, preds[i]]}, Batch_id: {batch_idx}, Example: {i}"].upload(img_path)
            elif probs[i, preds[i]] > 0.4:
                self.count_bad_classification += 1
                # if self.count_bad_classification > 100:
                #     continue
                img_path = f"images/Bad_classification_{batch_idx}_{i}.png"
                transform1(x[i]).save(img_path)
                self.run[f"Bad_classification/true label: {inverted_label_mapping[y[i].item()]}, prediction: {inverted_label_mapping[preds[i].item()]}, probabilities: {probs[i, preds[i]]}, Batch_id: {batch_idx}, Example: {i}"].upload(img_path)
        self.log("test_loss", loss, on_step=False, on_epoch=True, prog_bar=True)
        self.log("test_acc", self.test_accuracy, on_step=False, on_epoch=True, prog_bar=True)
        return loss

    def on_test_end(self):
        cm = confusion_matrix(self.y, self.preds)

        plt.figure(figsize=(10, 8))
        sns.heatmap(cm, cmap='Greens', annot=True, fmt='d')
        plt.xlabel('Prediction')
        plt.ylabel('True label')
        plt.title('ASL Convolutional Model\nClassification Results on Test Set')

        # Save the confusion matrix plot
        cm_plot_path = 'confusion_matrix_plot.png'
        plt.savefig(cm_plot_path)
        plt.close()


        folder_path = "lightning_logs"
        version = os.listdir(folder_path)[-1] + '/checkpoints'
        file_path = os.listdir(folder_path+'/'+version)
        f = str(folder_path+'/'+version+'/'+file_path[0])
        self.run[f'Confusion_Matrix_Plot'].upload(cm_plot_path)
        self.run["test_accuracy"] = (self.test_accuracy1/self.count_batch_test)
        self.run["test_loss"] = (self.test_loss1/self.count_batch_test)
        self.run[f"{file_path[0]}"].upload(f)
        self.run.stop()

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=self.learning_rate)
        return optimizer


    def train_dataloader(self):
        return self.asl_train

    def val_dataloader(self):
        return self.asl_val

    def test_dataloader(self):
        return DataLoader(self.asl_test, batch_size=self.batch_size)
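
The example logging in `test_step` sorts each test prediction into one of three buckets using the 0.95 and 0.4 thresholds. The rule can be sketched on its own in plain Python. `bucket_prediction` is a hypothetical helper written for illustration, the log-probabilities are made up, and the cap of two logged examples per batch is left out.

```python
import math

def bucket_prediction(is_correct, confidence, high=0.95, low=0.4):
    """Name the logging bucket for one test example, or None if nothing is logged."""
    if is_correct:
        if confidence > high:
            return "good_high_confidence"
        if confidence < low:
            return "good_uncertain"
        return None  # correct, but neither very confident nor very uncertain
    if confidence > low:
        return "bad_classification"
    return None  # wrong with low confidence: not logged

# The model returns log-probabilities, so exponentiating recovers probabilities.
log_probs = [math.log(0.97), math.log(0.02), math.log(0.01)]  # made-up values
probs = [math.exp(v) for v in log_probs]
predicted = max(range(len(probs)), key=probs.__getitem__)

print(bucket_prediction(is_correct=True, confidence=probs[predicted]))
```

Note that correct predictions with confidence between 0.4 and 0.95, and wrong predictions with confidence of 0.4 or less, fall into no bucket and are therefore not uploaded.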


In [None]:
transform = transforms.Compose(
    [
        transforms.Resize((200, 200)),
        transforms.Grayscale(num_output_channels=1),
        transforms.ToTensor(),
        # transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ]
)

In [None]:
train_dataset = datasets.ImageFolder(train_path, transform=transform)


In [None]:
max_acc = 0
fold_num = 5
ensemble_models = []
test_accuracy = []
test_loss = []
val_accuracy = []
val_loss = []
kf = StratifiedKFold(n_splits=fold_num, shuffle=True)
for i, (train_index, val_index) in enumerate(kf.split(train_dataset,train_dataset.targets)):



    train_subset = torch.utils.data.Subset(train_dataset, train_index)
    val_subset = torch.utils.data.Subset(train_dataset, val_index)

    train_loader = torch.utils.data.DataLoader(train_subset, batch_size=batch_size,shuffle=True)
    # Validation data does not need shuffling; a fixed order keeps evaluation reproducible
    val_loader = torch.utils.data.DataLoader(val_subset, batch_size=batch_size, shuffle=False)

    kf_model = CNN_ASL(train_loader, val_loader, num_classes=29)
    # Define EarlyStopping callback
    early_stop_callback = EarlyStopping(
        monitor='val/val_loss',  # Metric to monitor for improvement
        min_delta=0.001,      # Minimum change in the monitored metric to qualify as improvement
        patience=3,           # Number of epochs with no improvement after which training will be stopped
        verbose=True,         # Print message when training is stopped due to early stopping
        mode='min'            # 'min' or 'max': whether the monitored metric should be minimized or maximized
    )

    # Initialize Trainer with EarlyStopping callback
    trainer = pl.Trainer(
        accelerator="auto",
        max_epochs=50,
        callbacks=[early_stop_callback]  # Pass the EarlyStopping callback to the Trainer
    )
    trainer.fit(kf_model)
    trainer.test(kf_model)

    ensemble_models.append(kf_model)
    # Calculate test accuracy and test loss

    test_accuracy.append(kf_model.test_accuracy1/kf_model.count_batch_test)
    test_loss.append(kf_model.test_loss1/kf_model.count_batch_test)
    val_accuracy.append(kf_model.val_accuracy_arr)
    val_loss.append(kf_model.val_loss_arr)
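
The stopping rule configured above (`min_delta=0.001`, `patience=3`, `mode='min'`) can be illustrated in plain Python. This is a simplified sketch of the idea rather than Lightning's implementation; `first_stop_epoch` and the loss values are hypothetical.

```python
def first_stop_epoch(val_losses, min_delta=0.001, patience=3):
    """Return the epoch index at which training would stop, or None if it never stops."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        # An epoch only counts as an improvement if it beats the best loss by more than min_delta
        if loss < best - min_delta:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

# Made-up validation losses: the curve flattens after the second epoch.
losses = [1.0, 0.9, 0.8995, 0.8999, 0.8998, 0.7]
print(first_stop_epoch(losses))
```

In this example the last value (0.7) is never reached, which shows the trade-off of a small `patience`: training ends sooner, but a late improvement can be missed.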



In [None]:
# Average the per-epoch validation metrics across folds.
# zip(*...) truncates to the shortest fold, since early stopping can end folds at different epochs.
column_averages_loss = [sum(col) / len(col) for col in zip(*val_loss)]
column_averages_acc = [sum(col) / len(col) for col in zip(*val_accuracy)]

# Derive the x-axis from the data instead of hard-coding the number of epochs
epochs = range(len(column_averages_loss))

# Create two subplots (one for loss and one for accuracy)
fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True, figsize=(10, 8))

# Plot the average validation loss
ax1.plot(epochs, column_averages_loss, label='Validation Loss', marker='o', linestyle='-')
ax1.set_ylabel('Loss')
ax1.set_title('Average Validation Loss vs. Epoch')
ax1.legend()

# Plot the average validation accuracy
ax2.plot(epochs, column_averages_acc, label='Validation Accuracy', marker='s', linestyle='--', color='orange')
ax2.set_xlabel('Epoch')
ax2.set_ylabel('Accuracy')
ax2.set_title('Average Validation Accuracy vs. Epoch')
ax2.legend()

plt.show()
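
Because early stopping can end each fold after a different number of epochs, the per-fold lists in `val_loss` and `val_accuracy` may have unequal lengths. The sketch below shows one way to average such curves, by truncating to the shortest fold; `average_curves` is a hypothetical helper and the numbers are made up.

```python
def average_curves(curves):
    """Average per-epoch values across folds, truncating to the shortest fold."""
    n_epochs = min(len(curve) for curve in curves)
    return [sum(curve[e] for curve in curves) / len(curves) for e in range(n_epochs)]

# Made-up validation losses for two folds that stopped at different epochs.
fold_losses = [[0.9, 0.6, 0.4], [1.0, 0.7, 0.5, 0.45]]
avg = average_curves(fold_losses)

# The x-axis follows the averaged curve rather than a fixed epoch count.
epochs = range(len(avg))
print(list(epochs), avg)
```

Truncation discards the tail of the longer folds. That is a design choice rather than the only option: averaging each epoch over the folds that reached it would keep the tail, at the cost of later points being based on fewer folds.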

In [None]:
from tabulate import tabulate

# Assuming you have test_accuracy and test_loss lists containing values for each fold

fold_num = len(test_accuracy)

# Create a list of lists to store data for the table
table_data = [["Fold", "Test Accuracy", "Test Loss"]]
for i in range(fold_num):
    table_data.append([i+1, test_accuracy[i], test_loss[i]])

# Calculate average test accuracy and test loss
avg_test_accuracy = sum(test_accuracy) / fold_num
avg_test_loss = sum(test_loss) / fold_num

# Add rows for average values
table_data.append(["Average", avg_test_accuracy, avg_test_loss])

# Print the table
print(tabulate(table_data, headers="firstrow", tablefmt="grid"))


2C

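Added layers: two convolutions with larger kernels (11×11 and 7×7) and a third max-pooling step precede the original convolutional stack, and the channel widths are increased.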
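
The extra pooling step changes the input size of the first linear layer. Since every convolution uses `padding='same'`, only the 2×2 max-pooling layers shrink the feature map, so the flattened size can be checked with a few lines of arithmetic. `flattened_size` below is a hypothetical helper written for this check.

```python
def flattened_size(input_side, num_pools, out_channels, pool=2):
    """Size of the flattened feature map for a square input.

    Assumes 'same'-padded convolutions (no spatial change) and
    num_pools max-pooling layers that each divide the side by `pool`.
    """
    side = input_side
    for _ in range(num_pools):
        side //= pool
    return side * side * out_channels

print(flattened_size(200, num_pools=2, out_channels=64))  # two pooling steps
print(flattened_size(200, num_pools=3, out_channels=64))  # three pooling steps
```

With a 200×200 input, two pooling steps give 50*50*64 and three give 25*25*64, which are the input sizes passed to `nn.Linear` in the two model variants.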

In [None]:
batch_size = 64
data_path = './data_folder/'

class CNN_ASL(pl.LightningModule):
    def __init__(self, train_data, val_data, data_dir=data_path, num_classes=29, learning_rate=2e-4, batch_size=batch_size):
        super().__init__()

        self.run = neptune.init_run(
        project="nadavcherry/dp1",
        api_token="", # your credentials
        )

        # Set our init args as class attributes
        self.transform = transforms.Compose(
            [
                transforms.Resize((200, 200)),
                transforms.Grayscale(num_output_channels=1),
                transforms.ToTensor(),
                # transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

            ]
        )


        self.asl_test = datasets.ImageFolder(test_data_path, transform=self.transform)
        self.asl_train = train_data
        self.asl_val = val_data
        self.data_dir = data_dir
        self.learning_rate = learning_rate
        self.batch_size=batch_size
        self.y = []
        self.preds = []
        # Hardcode some dataset specific attributes
        self.num_classes = num_classes

        # conv_11: 1 input channel (grayscale; RGB would be 3), 16 output channels, kernel size 11
        # conv_7 widens to 32 channels with kernel size 7; the remaining convolutions use kernel size 3
        self.conv_11 = nn.Conv2d(1,16,11, padding='same')
        self.conv_7 = nn.Conv2d(16,32,7, padding='same')
        self.conv1 = nn.Conv2d(32,64,3,padding='same')
        self.conv2 = nn.Conv2d(64,128,3,padding='same')
        self.conv3 = nn.Conv2d(128,64,3,padding='same')
        self.conv4 = nn.Conv2d(64,64,3,padding='same')
        self.linear1 = nn.Linear(25*25*64,50)
        self.linear2 = nn.Linear(50,self.num_classes)
        self.mp = nn.MaxPool2d(2,2)
        self.relu=nn.ReLU()
        self.val_accuracy = Accuracy(task="multiclass", num_classes=num_classes)
        self.test_accuracy = Accuracy(task="multiclass", num_classes=num_classes)
        self.train_accuracy = Accuracy(task="multiclass", num_classes=num_classes)

        self.count_batch_train = 0
        self.count_batch_val = 0
        self.count_batch_test = 0
        self.train_loss1 = 0
        self.train_accuracy1 = 0

        self.val_loss1 = 0
        self.val_accuracy1 = 0

        self.test_loss1 = 0
        self.test_accuracy1 = 0

        self.val_loss_arr = []
        self.val_accuracy_arr = []

        self.count_good_high_confidence = 0
        self.count_good_classification_uncertain_confidence = 0
        self.count_bad_classification = 0

    def forward(self, x):
        x = self.relu(self.conv_11(x))
        x = self.relu(self.conv_7(x))
        x = self.mp(x)
        x = self.relu(self.conv1(x))
        x = self.relu(self.conv2(x))
        x = self.mp(x)
        x = self.relu(self.conv3(x))
        x = self.relu(self.conv4(x))
        x = self.mp(x)
        x = x.view(-1,25*25*64)
        x = self.relu(self.linear1(x))
        x = self.linear2(x)
        return F.log_softmax(x,dim=1)


    def training_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = F.nll_loss(logits, y)
        preds = torch.argmax(logits, dim=1)
        acc = self.train_accuracy(preds, y)
        acc1 = acc.item()
        loss1 = loss.item()
        self.train_accuracy1 += acc1
        self.train_loss1 += loss1
        self.count_batch_train += 1
        self.run["train/accuracy_batch"].log(acc1)
        self.run["train/loss_batch"].log(loss1)
        return loss

    def on_train_epoch_end(self):
        self.run["train/accuracy_epochs"].log(self.train_accuracy1/self.count_batch_train)
        self.run["train/loss_epochs"].log(self.train_loss1/self.count_batch_train)
        self.train_accuracy1 = 0
        self.train_loss1 = 0
        self.count_batch_train = 0

    def validation_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = F.nll_loss(logits, y)
        preds = torch.argmax(logits, dim=1)
        acc = self.val_accuracy(preds, y)
        acc1 = acc.item()
        loss1 = loss.item()
        self.val_accuracy1 += acc1
        self.val_loss1 += loss1
        self.count_batch_val += 1
        self.run["val/accuracy_batch"].log(acc1)
        self.run["val/loss_batch"].log(loss1)
        # Calling self.log will surface up scalars for you in TensorBoard
        self.log("val/val_loss", loss, on_step=False, on_epoch=True, prog_bar=True)
        self.log("val/val_acc", self.val_accuracy, on_step=False, on_epoch=True, prog_bar=True)
        return loss

    def on_validation_epoch_end(self):
        val_acc = self.val_accuracy1/self.count_batch_val
        val_loss_temp = self.val_loss1/self.count_batch_val
        self.run["val/accuracy_epochs"].log(val_acc)
        self.run["val/loss_epochs"].log(val_loss_temp)

        self.val_loss_arr.append(val_loss_temp)
        self.val_accuracy_arr.append(val_acc)

        self.val_accuracy1 = 0
        self.val_loss1 = 0
        self.count_batch_val = 0

    def test_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = F.nll_loss(logits, y)
        logits = logits.cpu()
        y = y.cpu()
        preds = torch.argmax(logits, dim=1)
        # The model outputs log-probabilities; a softmax over them recovers the class probabilities
        probs = F.softmax(logits, dim=1)
        acc = self.test_accuracy(preds, y)
        self.test_accuracy1 += acc.item()
        self.test_loss1 += loss.item()
        self.count_batch_test += 1
        self.count_good_high_confidence = 0
        self.count_good_classification_uncertain_confidence = 0
        transform1 = transforms.ToPILImage()
        for i in range(len(y)):
            self.y.append(y[i])
            self.preds.append(preds[i])
            if y[i] == preds[i]:
                if probs[i, preds[i]] > 0.95:  # High confidence good classification
                    if self.count_good_high_confidence == 2:
                        continue
                    self.count_good_high_confidence += 1
                    img_path = f"images/example_good_high_confidence_{batch_idx}_{i}.png"
                    transform1(x[i]).save(img_path)
                    self.run[f"Good classification High confidence/true label: {inverted_label_mapping[y[i].item()]}, prediction: {inverted_label_mapping[preds[i].item()]}, probabilities: {probs[i, preds[i]]}, Batch_id: {batch_idx}, Example: {i}"].upload(img_path)
                    # display(transform1(x[i]))
                elif probs[i, preds[i]] < 0.4:  # Uncertain classification
                    if self.count_good_classification_uncertain_confidence == 2:
                        continue
                    self.count_good_classification_uncertain_confidence += 1
                    img_path = f"images/Uncertain_confidence_{batch_idx}_{i}.png"
                    transform1(x[i]).save(img_path)
                    self.run[f"Good classification Uncertain classification/true label: {inverted_label_mapping[y[i].item()]}, prediction: {inverted_label_mapping[preds[i].item()]}, probabilities: {probs[i, preds[i]]}, Batch_id: {batch_idx}, Example: {i}"].upload(img_path)
            elif probs[i, preds[i]] > 0.4:
                self.count_bad_classification += 1
                # if self.count_bad_classification > 100:
                #     continue
                img_path = f"images/Bad_classification_{batch_idx}_{i}.png"
                transform1(x[i]).save(img_path)
                self.run[f"Bad_classification/true label: {inverted_label_mapping[y[i].item()]}, prediction: {inverted_label_mapping[preds[i].item()]}, probabilities: {probs[i, preds[i]]}, Batch_id: {batch_idx}, Example: {i}"].upload(img_path)
        self.log("test_loss", loss, on_step=False, on_epoch=True, prog_bar=True)
        self.log("test_acc", self.test_accuracy, on_step=False, on_epoch=True, prog_bar=True)
        return loss

    def on_test_end(self):
        cm = confusion_matrix(self.y, self.preds)

        plt.figure(figsize=(10, 8))
        sns.heatmap(cm, cmap='Greens', annot=True, fmt='d')
        plt.xlabel('Prediction')
        plt.ylabel('True label')
        plt.title('ASL Convolutional Model\nClassification Results on Test Set')

        # Save the confusion matrix plot
        cm_plot_path = 'confusion_matrix_plot.png'
        plt.savefig(cm_plot_path)
        plt.close()


        folder_path = "lightning_logs"
        version = os.listdir(folder_path)[-1] + '/checkpoints'
        file_path = os.listdir(folder_path+'/'+version)
        f = str(folder_path+'/'+version+'/'+file_path[0])
        self.run[f'Confusion_Matrix_Plot'].upload(cm_plot_path)
        self.run["test_accuracy"] = (self.test_accuracy1/self.count_batch_test)
        self.run["test_loss"] = (self.test_loss1/self.count_batch_test)
        self.run[f"{file_path[0]}"].upload(f)
        self.run.stop()

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=self.learning_rate)
        return optimizer


    def train_dataloader(self):
        return self.asl_train

    def val_dataloader(self):
        return self.asl_val

    def test_dataloader(self):
        return DataLoader(self.asl_test, batch_size=self.batch_size)
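
`on_test_end` passes the collected true labels and predictions to `confusion_matrix`. As a rough illustration of what that matrix contains, here is a minimal pure-Python sketch; the function name `confusion_counts` and the toy labels are assumptions made for this example and are not part of the notebook.

```python
def confusion_counts(true_labels, predicted_labels, num_classes):
    """Build a num_classes x num_classes count matrix.

    Rows are indexed by the true label, columns by the predicted label.
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


# Hypothetical labels for a 3-class problem (not taken from the ASL data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cm = confusion_counts(y_true, y_pred, num_classes=3)
for row in cm:
    print(row)

# Accuracy is the diagonal sum divided by the number of samples.
toy_accuracy = sum(cm[i][i] for i in range(3)) / len(y_true)
print(toy_accuracy)
```

Off-diagonal cells show which pairs of classes are confused, which is why the heatmap is plotted with true labels on the vertical axis and predictions on the horizontal axis.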


change colors

In [None]:
batch_size = 64
data_path = './data_folder/'

class CNN_ASL(pl.LightningModule):
    def __init__(self, train_data, val_data, data_dir=data_path, num_classes=29, learning_rate=2e-4, batch_size=batch_size):
        super().__init__()

        self.run = neptune.init_run(
        project="nadavcherry/dp1",
        api_token="", # your credentials
        )

        # Set our init args as class attributes
        self.transform = transforms.Compose(
            [
                transforms.Resize((200, 200)),
                # transforms.Grayscale(num_output_channels=1),
                transforms.ToTensor(),
                transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

            ]
        )


        self.asl_test = datasets.ImageFolder(test_data_path, transform=self.transform)
        self.asl_train = train_data
        self.asl_val = val_data
        self.data_dir = data_dir
        self.learning_rate = learning_rate
        self.batch_size=batch_size
        self.y = []
        self.preds = []
        # Hardcode some dataset specific attributes
        self.num_classes = num_classes

        ## input channels 3 - RGB (a monochrome input would use 1)
        ## output channels 32 - as the number of filters that we train
        ## kernel size 3 - arbitrary selection
        # Disabled larger-kernel layers (monochrome input):
        # self.conv_11 = nn.Conv2d(1,16,11, padding='same')
        # self.conv_7 = nn.Conv2d(16,32,7, padding='same')
        self.conv1 = nn.Conv2d(3,32,3,padding='same')
        self.conv2 = nn.Conv2d(32,64,3,padding='same')
        self.conv3 = nn.Conv2d(64,32,3,padding='same')
        self.conv4 = nn.Conv2d(32,64,3,padding='same')
        self.linear1 = nn.Linear(50*50*64,50)
        self.linear2 = nn.Linear(50,self.num_classes)
        self.mp = nn.MaxPool2d(2,2)
        self.relu=nn.ReLU()
        self.val_accuracy = Accuracy(task="multiclass", num_classes=num_classes)
        self.test_accuracy = Accuracy(task="multiclass", num_classes=num_classes)
        self.train_accuracy = Accuracy(task="multiclass", num_classes=num_classes)

        self.count_batch_train = 0
        self.count_batch_val = 0
        self.count_batch_test = 0
        self.train_loss1 = 0
        self.train_accuracy1 = 0

        self.val_loss1 = 0
        self.val_accuracy1 = 0

        self.test_loss1 = 0
        self.test_accuracy1 = 0

        self.val_loss_arr = []
        self.val_accuracy_arr = []

        self.count_good_high_confidence = 0
        self.count_good_classification_uncertain_confidence = 0
        self.count_bad_classification = 0

    def forward(self, x):
        # x = self.relu(self.conv_11(x))
        # x = self.relu(self.conv_7(x))
        # x = self.mp(x)
        x = self.relu(self.conv1(x))
        x = self.relu(self.conv2(x))
        x = self.mp(x)
        x = self.relu(self.conv3(x))
        x = self.relu(self.conv4(x))
        x = self.mp(x)
        x = x.view(-1,50*50*64)
        x = self.relu(self.linear1(x))
        x = self.linear2(x)
        return F.log_softmax(x,dim=1)


    def training_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = F.nll_loss(logits, y)
        preds = torch.argmax(logits, dim=1)
        acc = self.train_accuracy(preds, y)
        acc1 = acc.item()
        loss1 = loss.item()
        self.train_accuracy1 += acc1
        self.train_loss1 += loss1
        self.count_batch_train += 1
        self.run["train/accuracy_batch"].log(acc1)
        self.run["train/loss_batch"].log(loss1)
        return loss

    def on_train_epoch_end(self):
        self.run["train/accuracy_epochs"].log(self.train_accuracy1/self.count_batch_train)
        self.run["train/loss_epochs"].log(self.train_loss1/self.count_batch_train)
        self.train_accuracy1 = 0
        self.train_loss1 = 0
        self.count_batch_train = 0

    def validation_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = F.nll_loss(logits, y)
        preds = torch.argmax(logits, dim=1)
        acc = self.val_accuracy(preds, y)
        acc1 = acc.item()
        loss1 = loss.item()
        self.val_accuracy1 += acc1
        self.val_loss1 += loss1
        self.count_batch_val += 1
        self.run["val/accuracy_batch"].log(acc1)
        self.run["val/loss_batch"].log(loss1)
        # Calling self.log will surface up scalars for you in TensorBoard
        self.log("val/val_loss", loss, on_step=False, on_epoch=True, prog_bar=True)
        self.log("val/val_acc", self.val_accuracy, on_step=False, on_epoch=True, prog_bar=True)
        return loss

    def on_validation_epoch_end(self):
        val_acc = self.val_accuracy1/self.count_batch_val
        val_loss_temp = self.val_loss1/self.count_batch_val
        self.run["val/accuracy_epochs"].log(val_acc)
        self.run["val/loss_epochs"].log(val_loss_temp)

        self.val_loss_arr.append(val_loss_temp)
        self.val_accuracy_arr.append(val_acc)

        self.val_accuracy1 = 0
        self.val_loss1 = 0
        self.count_batch_val = 0

    def test_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = F.nll_loss(logits, y)
        logits = logits.cpu()
        y = y.cpu()
        preds = torch.argmax(logits, dim=1)
        # The model returns log-probabilities; softmax maps them back to probabilities
        probs = F.softmax(logits, dim=1)
        acc = self.test_accuracy(preds, y)
        self.test_accuracy1 += acc.item()
        self.test_loss1 += loss.item()
        self.count_batch_test += 1
        self.count_good_high_confidence = 0
        self.count_good_classification_uncertain_confidence = 0
        transform1 = transforms.ToPILImage()
        for i in range(len(y)):
            self.y.append(y[i])
            self.preds.append(preds[i])
            if y[i] == preds[i]:
                if probs[i, preds[i]] > 0.95:  # High confidence good classification
                    if self.count_good_high_confidence == 2:
                        continue
                    self.count_good_high_confidence += 1
                    img_path = f"images/example_good_high_confidence_{batch_idx}_{i}.png"
                    transform1(x[i]).save(img_path)
                    self.run[f"Good classification High confidence/true label: {inverted_label_mapping[y[i].item()]}, prediction: {inverted_label_mapping[preds[i].item()]}, probabilities: {probs[i, preds[i]]}, Batch_id: {batch_idx}, Example: {i}"].upload(img_path)
                    # display(transform1(x[i]))
                elif probs[i, preds[i]] < 0.4:  # Uncertain classification
                    if self.count_good_classification_uncertain_confidence == 2:
                        continue
                    self.count_good_classification_uncertain_confidence += 1
                    img_path = f"images/Uncertain_confidence_{batch_idx}_{i}.png"
                    transform1(x[i]).save(img_path)
                    self.run[f"Good classification Uncertain classification/true label: {inverted_label_mapping[y[i].item()]}, prediction: {inverted_label_mapping[preds[i].item()]}, probabilities: {probs[i, preds[i]]}, Batch_id: {batch_idx}, Example: {i}"].upload(img_path)
            elif probs[i, preds[i]] > 0.4:
                self.count_bad_classification += 1
                # if self.count_bad_classification > 100:
                #     continue
                img_path = f"images/Bad_classification_{batch_idx}_{i}.png"
                transform1(x[i]).save(img_path)
                self.run[f"Bad_classification/true label: {inverted_label_mapping[y[i].item()]}, prediction: {inverted_label_mapping[preds[i].item()]}, probabilities: {probs[i, preds[i]]}, Batch_id: {batch_idx}, Example: {i}"].upload(img_path)
        self.log("test_loss", loss, on_step=False, on_epoch=True, prog_bar=True)
        self.log("test_acc", self.test_accuracy, on_step=False, on_epoch=True, prog_bar=True)
        return loss

    def on_test_end(self):
        cm = confusion_matrix(self.y, self.preds)

        plt.figure(figsize=(10, 8))
        sns.heatmap(cm, cmap='Greens', annot=True, fmt='d')
        plt.xlabel('Prediction')
        plt.ylabel('True label')
        plt.title('ASL Convolutional Model\nClassification Results on Test Set')

        # Save the confusion matrix plot
        cm_plot_path = 'confusion_matrix_plot.png'
        plt.savefig(cm_plot_path)
        plt.close()


        folder_path = "lightning_logs"
        version = os.listdir(folder_path)[-1] + '/checkpoints'
        file_path = os.listdir(folder_path+'/'+version)
        f = str(folder_path+'/'+version+'/'+file_path[0])
        self.run[f'Confusion_Matrix_Plot'].upload(cm_plot_path)
        self.run["test_accuracy"] = (self.test_accuracy1/self.count_batch_test)
        self.run["test_loss"] = (self.test_loss1/self.count_batch_test)
        self.run[f"{file_path[0]}"].upload(f)
        self.run.stop()

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=self.learning_rate)
        return optimizer


    def train_dataloader(self):
        return self.asl_train

    def val_dataloader(self):
        return self.asl_val

    def test_dataloader(self):
        return DataLoader(self.asl_test, batch_size=self.batch_size)
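
The input size of `linear1` (`50*50*64`) follows from the 200x200 resize, the `padding='same'` convolutions (which keep the spatial size) and the two 2x2 max-pooling steps. The arithmetic can be sketched as below; the helper name `flattened_features` is an assumption made for this example.

```python
def flattened_features(height, width, out_channels, num_pools, pool=2):
    """Size of the flattened feature map after `num_pools` pooling steps.

    Assumes every convolution uses padding='same' (spatial size unchanged)
    and every pooling step uses kernel size == stride == `pool`.
    """
    for _ in range(num_pools):
        height //= pool
        width //= pool
    return height * width * out_channels


# 200x200 input, two pooling steps, 64 channels after the last convolution.
n_features = flattened_features(200, 200, out_channels=64, num_pools=2)
print(n_features)
```

If the resize in `self.transform` changes, the first argument of `nn.Linear` and the `view` call in `forward` have to change with it.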


2d

In [None]:
import random
from PIL import Image
from torchvision import transforms, datasets, utils as tv_utils

class RandomTurnOffPixels(object):
    def __init__(self, probability=0.1):
        self.probability = probability

    def __call__(self, img):
        if random.random() < self.probability:
            img = img.convert("RGB")  # Convert to RGB if not already
            img_array = img.load()
            width, height = img.size
            num_pixels_to_turn_off = int(self.probability * width * height)
            for _ in range(num_pixels_to_turn_off):
                x = random.randint(0, width - 1)
                y = random.randint(0, height - 1)
                img_array[x, y] = (0, 0, 0)  # Set pixel to black
            return img
        return img

In [None]:
correct_predictions = 0
total_predictions = 0
test_dataloader = DataLoader(datasets.ImageFolder(test_data_path, transform=transform), batch_size=64)
augmentations = [
    transforms.Compose([
        transforms.ToPILImage(),
        RandomTurnOffPixels(probability=0.2),
        transforms.Grayscale(num_output_channels=3),  # Convert to grayscale with 3 channels
        transforms.ToTensor(),
    ]),

    transforms.Compose([
        transforms.ToPILImage(),
        transforms.RandomRotation(20),
        transforms.ToTensor(),
    ])

]

kf_model.eval()  # Use inference behavior while predicting
# Iterate over the test dataset
for images, labels in test_dataloader:
    # Accumulated log-probabilities, one row per image
    final_predictions = torch.zeros(len(labels), kf_model.num_classes)

    # One prediction pass per augmentation
    for augmentation in augmentations:
        with torch.no_grad():
            aug_images = torch.stack([augmentation(image) for image in images])
            preds = kf_model(aug_images.to(kf_model.device)).cpu()

        # Accumulate predictions
        final_predictions += preds

    # Average predictions over the augmentations
    final_predictions /= len(augmentations)

    # Count the correct predictions in this batch
    predicted_labels = torch.argmax(final_predictions, dim=1)
    correct_predictions += (predicted_labels == labels).sum().item()
    total_predictions += len(labels)

# Calculate accuracy
accuracy = correct_predictions / total_predictions
print("Accuracy:", accuracy)
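
Because `forward` returns `log_softmax` outputs, averaging the per-augmentation predictions averages log-probabilities, which is not the same as averaging probabilities. The following pure-Python sketch compares the two on made-up scores; all names and numbers here are assumptions made for this example, not values from the model.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of raw scores."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum for s in scores]


def average(rows):
    """Element-wise mean of equally long lists."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical raw scores for one image under two augmentations (3 classes).
view_a = log_softmax([2.0, 1.0, 0.1])
view_b = log_softmax([0.5, 2.5, 0.3])

# Averaging log-probabilities (what averaging the model outputs amounts to).
mean_log = average([view_a, view_b])
# Averaging probabilities instead.
mean_prob = average([[math.exp(v) for v in view_a],
                     [math.exp(v) for v in view_b]])

print(argmax(mean_log), argmax(mean_prob))
```

The two rules often agree on the predicted class, but they can differ when one augmented view is very confident and another is not, so it is worth deciding which one is intended.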

2e

In [None]:
max_acc = 0
fold_num = 5
ensemble_models = []
test_accuracy = []
test_loss = []
val_accuracy = []
val_loss = []
kf = StratifiedKFold(n_splits=fold_num, shuffle=True)
for i, (train_index, val_index) in enumerate(kf.split(train_dataset,train_dataset.targets)):



    train_subset = torch.utils.data.Subset(train_dataset, train_index)
    val_subset = torch.utils.data.Subset(train_dataset, val_index)

    train_loader = torch.utils.data.DataLoader(train_subset, batch_size=batch_size,shuffle=True)
    val_loader = torch.utils.data.DataLoader(val_subset, batch_size=batch_size, shuffle=True)

    kf_model = CNN_ASL(train_loader, val_loader, num_classes=30)
    # Define EarlyStopping callback
    early_stop_callback = EarlyStopping(
        monitor='val/val_loss',  # Metric to monitor for improvement
        min_delta=0.001,      # Minimum change in the monitored metric to qualify as improvement
        patience=3,           # Number of epochs with no improvement after which training will be stopped
        verbose=True,         # Print message when training is stopped due to early stopping
        mode='min'            # 'min' or 'max': whether the monitored metric should be minimized or maximized
    )

    # Initialize Trainer with EarlyStopping callback
    trainer = pl.Trainer(
        accelerator="auto",
        max_epochs=50,
        callbacks=[early_stop_callback]  # Pass the EarlyStopping callback to the Trainer
    )
    trainer.fit(kf_model)
    trainer.test(kf_model)

    ensemble_models.append(kf_model)
    # Calculate test accuracy and test loss

    test_accuracy.append(kf_model.test_accuracy1/kf_model.count_batch_test)
    test_loss.append(kf_model.test_loss1/kf_model.count_batch_test)
    val_accuracy.append(kf_model.val_accuracy_arr)
    val_loss.append(kf_model.val_loss_arr)
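
The `EarlyStopping` settings above (`min_delta=0.001`, `patience=3`, `mode='min'`) can be read as the simplified rule sketched below. This is an illustrative approximation applied to a list of per-epoch validation losses, not the library's implementation; the function name and the loss values are assumptions.

```python
def early_stop_epoch(val_losses, min_delta=0.001, patience=3):
    """Return the 1-based epoch at which training would stop, or None.

    Simplified 'min' mode rule: an epoch counts as an improvement only if
    its loss is lower than the best loss so far by more than min_delta.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Hypothetical per-epoch validation losses.
losses = [0.90, 0.60, 0.45, 0.4495, 0.46, 0.451, 0.30]
stop_at = early_stop_epoch(losses)
print(stop_at)
```

In this toy sequence the improvement at the last epoch is never reached, which illustrates the trade-off of a small `patience` value.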



## ***Task 3.a.b.c- Transfer learning model***

### Methods

In [None]:
# Function to save model parameters
def save_model(model, epoch, step, directory):
    os.makedirs(directory, exist_ok=True)
    filename = f"epoch={epoch}-step={step}.ckpt"
    filepath = os.path.join(directory, filename)
    torch.save(model.state_dict(), filepath)
    return filepath

In [None]:
batch_size = 64

# Define the transformations for our data
data_transform = transforms.Compose([
    transforms.Resize((224,224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

test_data_path = 'archive/test'
# train_path = 'C:\\Users\\nadav\\PycharmProjects\\assignment1\\pythonProject7\\archive\\asl_alphabet_subset'
train_path = 'C:\\Users\\nadav\\Downloads\\Final_Data_Full\\Final_Data_Full\\train'

train_dataset = datasets.ImageFolder(train_path, transform=data_transform)
test_dataset = datasets.ImageFolder(test_data_path, transform=data_transform)

dataset_size = len(train_dataset)
train_size = int(dataset_size * 0.8)
val_size = dataset_size - train_size
train_data, train_val = random_split(train_dataset, [train_size, val_size])
train_loader  = DataLoader(train_data, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(train_val, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size, shuffle=True)

In [None]:
def freeze_layers(model):
    for param in model.parameters():
        param.requires_grad = False

In [None]:
# Replace the last layer function
def replace_last_layer(model, num_classes=29):
    # Check if the model has a classifier
    if hasattr(model, 'classifier'):
        if isinstance(model.classifier, nn.Sequential):
            in_features = model.classifier[-1].in_features
            model.classifier[-1] = nn.Linear(in_features, num_classes)
        else:
            in_features = model.classifier.in_features
            model.classifier = nn.Linear(in_features, num_classes)

    # Check if the model has an fc layer
    elif hasattr(model, 'fc'):
        in_features = model.fc.in_features
        model.fc = nn.Linear(in_features, num_classes)

    else:
        raise ValueError("Unsupported model architecture")

    return model

In [None]:
# Function to calculate validation metrics
def calculate_validation_metrics(model, val_loader, criterion, device):
    model.eval()
    val_loss = 0
    val_correct = 0
    val_total = 0

    with torch.no_grad():
        for images, labels in val_loader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            loss = criterion(outputs, labels)
            val_loss += loss.item()
            _, predicted = torch.max(outputs.data, 1)
            val_total += labels.size(0)
            val_correct += (predicted == labels).sum().item()

    val_accuracy = val_correct / val_total if val_total != 0 else 0

    return val_loss / len(val_loader), val_accuracy
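
`calculate_validation_metrics` divides the summed batch losses by `len(val_loader)`, so the reported loss is a mean of batch means. When the last batch is smaller than the others, this differs slightly from the per-sample mean. A small sketch with made-up numbers (the function names and the values are assumptions for this example):

```python
def mean_of_batch_means(batch_losses):
    """Average the per-batch mean losses, ignoring batch sizes."""
    return sum(loss for loss, _ in batch_losses) / len(batch_losses)


def sample_weighted_mean(batch_losses):
    """Average the loss over samples by weighting each batch by its size."""
    total = sum(loss * n for loss, n in batch_losses)
    count = sum(n for _, n in batch_losses)
    return total / count


# Hypothetical (mean loss, batch size) pairs; the last batch is smaller.
batches = [(0.50, 64), (0.40, 64), (1.00, 8)]

by_batch = mean_of_batch_means(batches)
by_sample = sample_weighted_mean(batches)
print(by_batch, by_sample)
```

With equally sized batches the two values coincide, so the difference only matters when the dataset size is not a multiple of the batch size.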

In [None]:
def plot_metrics(model):
    # Per-epoch training and validation metrics are stored on the model by fine_tune_model
    epochs = range(1, len(model.train_loss1_arr) + 1)

    # Create two subplots (one for loss and one for accuracy)
    fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True, figsize=(10, 8))

    # Plot the training and validation losses
    ax1.plot(epochs, model.train_loss1_arr, label='Training Loss', marker='o', linestyle='-')
    ax1.plot(epochs, model.val_loss1_arr, label='Validation Loss', marker='o', linestyle='-')
    ax1.set_ylabel('Loss')
    ax1.set_title('Training and Validation Loss vs. Epoch')
    ax1.legend()

    # Plot the training and validation accuracies
    ax2.plot(epochs, model.train_accuracy1_arr, label='Training Accuracy', marker='s', linestyle='--', color='red')
    ax2.plot(epochs, model.val_accuracy1_arr, label='Validation Accuracy', marker='s', linestyle='--', color='orange')
    ax2.set_xlabel('Epoch')
    ax2.set_ylabel('Accuracy')
    ax2.set_title('Training and Validation Accuracy vs. Epoch')
    ax2.legend()


    # Save the plots
    plot_path = 'metrics_plot.png'
    plt.savefig(plot_path)
    run["EX.1/metrics_plot"].upload(plot_path)

    # Show the figure before closing it; closing first leaves nothing to display
    plt.show()
    plt.close()

In [None]:
def print_errors_images(model, loader, device):
    model.eval()
    error_images = []
    true_labels = []
    predicted_labels = []

    with torch.no_grad():
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs.data, 1)

            incorrect_indices = (predicted != labels).nonzero()
            for index in incorrect_indices:
                error_images.append(images[index].cpu().numpy())
                true_labels.append(labels[index].item())
                predicted_labels.append(predicted[index].item())

    return error_images, true_labels, predicted_labels

In [None]:
# Function to fine-tune the model
def fine_tune_model(model, train_loader, val_loader, test_loader, num_epochs=10, learning_rate=0.001):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)

    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

    # Create a table
    table = PrettyTable()
    table.field_names = ["Epoch", "Validation Loss", "Validation Accuracy", "Test Loss", "Test Accuracy", "# Unique Correct Samples", "# Unique Errors"]

    # Lists to store training and validation metrics
    model.train_loss1_arr = []
    model.val_loss1_arr = []
    model.train_accuracy1_arr = []
    model.val_accuracy1_arr = []

    for epoch in range(num_epochs):
        model.train()
        total_correct = 0
        total_samples = 0
        train_loss = 0

        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)

            optimizer.zero_grad()
            outputs = model(images)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            train_loss += loss.item()

            _, predicted = torch.max(outputs.data, 1)
            total_samples += labels.size(0)
            total_correct += (predicted == labels).sum().item()

        train_accuracy = total_correct / total_samples if total_samples != 0 else 0

        # Evaluate the model on the validation set
        model.eval()
        # Calculate validation loss and accuracy
        val_loss, val_accuracy = calculate_validation_metrics(model, val_loader, criterion, device)

        # Print error images for validation set
        error_images_val, true_labels_val, predicted_labels_val = print_errors_images(model, val_loader, device)

        # Append values to lists
        model.train_loss1_arr.append(train_loss / len(train_loader))
        model.val_loss1_arr.append(val_loss)  # already averaged over batches by calculate_validation_metrics
        model.train_accuracy1_arr.append(train_accuracy)
        model.val_accuracy1_arr.append(val_accuracy)

        val_correct = 0
        val_total = 0
        unique_correct_samples_val = set()
        unique_errors_val = set()
        val_loss = 0
        with torch.no_grad():
            for images, labels in val_loader:
                images, labels = images.to(device), labels.to(device)
                outputs = model(images)
                loss = criterion(outputs, labels)
                val_loss += loss.item()
                _, predicted = torch.max(outputs.data, 1)
                val_total += labels.size(0)
                val_correct += (predicted == labels).sum().item()

                unique_correct_samples_val.update(labels[predicted == labels].cpu().numpy())
                unique_errors_val.update(labels[predicted != labels].cpu().numpy())

        val_accuracy = val_correct / val_total if val_total != 0 else 0

        # Evaluate the model on the test set
        model.eval()
        test_correct = 0
        test_total = 0
        unique_correct_samples_test = set()
        unique_errors_test = set()
        test_loss = 0
        with torch.no_grad():
            for images, labels in test_loader:
                images, labels = images.to(device), labels.to(device)
                outputs = model(images)
                loss = criterion(outputs, labels)
                test_loss += loss.item()
                _, predicted = torch.max(outputs.data, 1)
                test_total += labels.size(0)
                test_correct += (predicted == labels).sum().item()

                unique_correct_samples_test.update(labels[predicted == labels].cpu().numpy())
                unique_errors_test.update(labels[predicted != labels].cpu().numpy())

        test_accuracy = test_correct / test_total if test_total != 0 else 0

        # Print the names of unique errors (images)
        # print("Unique Errors:")
        # for error in unique_errors_test:
        #     print(error)

        # Append information to the table
        table.add_row([epoch + 1, val_loss / len(val_loader), val_accuracy, test_loss / len(test_loader), test_accuracy, len(unique_correct_samples_test), len(unique_errors_test)])

        print("Done", epoch + 1, "/", num_epochs, "epochs")

    # Plot the training and validation metrics
    plot_metrics(model)

    # Save and upload table to Neptune
    table_txt = str(table)
    table_path = 'table.txt'
    with open(table_path, 'w') as f:
        f.write(table_txt)
    if run is not None:
        run["EX.1/Table"].upload(table_path)

    model_path = save_model(model, num_epochs, len(train_loader) * num_epochs, 'model_checkpoints')
    if run is not None:
        run["EX.1/Model"].upload(model_path)
    print(table)
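
The sets `unique_correct_samples_*` and `unique_errors_*` are updated with label values, so their sizes count distinct classes rather than individual images. The sketch below shows the difference on toy data; the labels and variable names are assumptions made for this example.

```python
# Hypothetical true labels and predictions for six test images.
labels = [0, 0, 1, 1, 2, 2]
predicted = [0, 0, 1, 0, 2, 1]

unique_correct = set()
unique_errors = set()
for y, p in zip(labels, predicted):
    if y == p:
        unique_correct.add(y)   # class with at least one correct prediction
    else:
        unique_errors.add(y)    # class with at least one wrong prediction

# Number of individual samples that were classified correctly.
num_correct_samples = sum(y == p for y, p in zip(labels, predicted))

print(len(unique_correct), len(unique_errors), num_correct_samples)
```

A class can appear in both sets, so the two table columns need not add up to the number of classes.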

### EX.1 - ResNet

batch_size = 64 <br>
num_epochs=10 <br>
learning_rate=0.001 <br>
**Without using freezing**

In [None]:
run = neptune.init_run(
    project="nadavcherry/dp1",
    api_token="", # your credentials
)

# 1. Load a pre-trained ResNet18 model
resnet_model_1 = models.resnet18(pretrained=True)

# 2. Replace the last layer for the new task
resnet_model_1 = replace_last_layer(resnet_model_1)

# 3. Fine-tune the model on your dataset
fine_tune_model(resnet_model_1, train_loader, val_loader, test_loader)

### EX.2 - ResNet

batch_size = 64 <br>
num_epochs=10 <br>
learning_rate=0.0001 <br>
**Without using freezing**

In [None]:
# 1. Load a pre-trained ResNet18 model
resnet_model_2 = models.resnet18(pretrained=True)

# 2. Replace the last layer for the new task
resnet_model_2 = replace_last_layer(resnet_model_2)

# 3. Fine-tune the model on your dataset
fine_tune_model(resnet_model_2, train_loader, val_loader, test_loader, num_epochs=10, learning_rate=0.0001)

### EX.3 - ResNet

batch_size = 64 <br>
num_epochs=10 <br>
learning_rate=0.0001 <br>
**Using freezing**

In [None]:
# 1. Load a pre-trained ResNet18 model
resnet_model_3 = models.resnet18(pretrained=True)

# 2. Freeze layers
freeze_layers(resnet_model_3)

# 3. Replace the last layer for the new task
resnet_model_3 = replace_last_layer(resnet_model_3)

# Set the requires_grad attribute for the parameters of the last layer to True
for param in resnet_model_3.fc.parameters():
    param.requires_grad = True

# 4. Fine-tune the model on your dataset
fine_tune_model(resnet_model_3, train_loader, val_loader, test_loader, num_epochs=10, learning_rate=0.0001)

### EX.4 - ResNet

batch_size = 64 <br>
num_epochs=10 <br>
learning_rate=0.001 <br>
**Using freezing**

In [None]:
# 1. Load a pre-trained ResNet18 model
resnet_model_4 = models.resnet18(pretrained=True)

# 2. Freeze layers
freeze_layers(resnet_model_4)

# 3. Replace the last layer for the new task
resnet_model_4 = replace_last_layer(resnet_model_4)

# Set the requires_grad attribute for the parameters of the last layer to True
for param in resnet_model_4.fc.parameters():
    param.requires_grad = True

# 4. Fine-tune the model on your dataset
fine_tune_model(resnet_model_4, train_loader, val_loader, test_loader)

### EX.5 - ResNet

batch_size = 64 <br>
num_epochs=10 <br>
learning_rate=0.0005 <br>
**Using freezing**

In [None]:
# 1. Load a pre-trained ResNet18 model
resnet_model_5 = models.resnet18(pretrained=True)

# 2. Freeze layers
freeze_layers(resnet_model_5)

# 3. Replace the last layer for the new task
resnet_model_5 = replace_last_layer(resnet_model_5)

# Set the requires_grad attribute for the parameters of the last layer to True
for param in resnet_model_5.fc.parameters():
    param.requires_grad = True

# 4. Fine-tune the model on your dataset
fine_tune_model(resnet_model_5, train_loader, val_loader, test_loader, num_epochs=10, learning_rate=0.0005)

### EX.6 - ResNet

batch_size = 64 <br>
num_epochs=10 <br>
learning_rate=0.001 <br>
**With the mean and std for normalization calculated from the training data** <br>
**Using freezing**

In [None]:
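# Calculate the mean and std
# --------------------------

# The statistics must come from un-normalized tensors, so the training split
# is reloaded here with Resize + ToTensor only (no Normalize)
stats_dataset = datasets.ImageFolder(train_path, transform=transforms.Compose([
    transforms.Resize((224,224)),
    transforms.ToTensor(),
]))
stats_loader = DataLoader(torch.utils.data.Subset(stats_dataset, train_data.indices), batch_size=batch_size)

mean = 0.0
std = 0.0
total_images = 0

for images, _ in stats_loader:
    batch_samples = images.size(0)  # separate name, so the global batch_size is not overwritten
    images = images.view(batch_samples, images.size(1), -1)
    mean += images.mean(2).sum(0)
    std += images.std(2).sum(0)
    total_images += batch_samples

mean /= total_images
std /= total_images

# Convert mean and std to arrays
mean_array = mean.numpy()
std_array = std.numpy()

print("Calculated Mean:", mean)
print("Calculated Std Dev:", std)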
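
The loop above averages per-image channel means and per-image channel standard deviations. The same idea, together with the `(x - mean) / std` step that `transforms.Normalize` applies, can be sketched in plain Python on a tiny made-up dataset; the function names and pixel values are assumptions made for this example.

```python
import statistics


def channel_stats(images):
    """Per-channel mean and std, each averaged over images.

    `images` is a list of images; each image is a list of channels and each
    channel is a flat list of pixel values in [0, 1].
    """
    num_channels = len(images[0])
    means, stds = [], []
    for c in range(num_channels):
        means.append(statistics.fmean([statistics.fmean(img[c]) for img in images]))
        # statistics.stdev is the sample standard deviation (n - 1 in the denominator)
        stds.append(statistics.fmean([statistics.stdev(img[c]) for img in images]))
    return means, stds


def normalize(image, means, stds):
    """Apply (x - mean) / std channel by channel."""
    return [[(x - m) / s for x in channel]
            for channel, m, s in zip(image, means, stds)]


# Two hypothetical images with 2 channels and 2 pixels per channel.
toy_images = [
    [[0.0, 1.0], [0.2, 0.4]],
    [[0.5, 0.5], [0.0, 1.0]],
]

toy_means, toy_stds = channel_stats(toy_images)
print(toy_means, toy_stds)
print(normalize(toy_images[0], toy_means, toy_stds))
```

Averaging per-image standard deviations is an approximation of the dataset-wide standard deviation; it is usually close enough for input normalization.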

In [None]:
# Define the transformations for our data
data_transform = transforms.Compose([
    transforms.Resize((224,224)),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

test_data_path = 'archive/test'
# train_path = 'C:\\Users\\nadav\\PycharmProjects\\assignment1\\pythonProject7\\archive\\asl_alphabet_subset'
train_path = 'C:\\Users\\nadav\\Downloads\\Final_Data_Full\\Final_Data_Full\\train'

train_dataset = datasets.ImageFolder(train_path, transform=data_transform)
test_dataset = datasets.ImageFolder(test_data_path, transform=data_transform)

dataset_size = len(train_dataset)
train_size = int(dataset_size * 0.8)
val_size = dataset_size - train_size
train_data, train_val = random_split(train_dataset, [train_size, val_size])
train_loader  = DataLoader(train_data, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(train_val, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size, shuffle=True)

In [None]:
# 1. Load a pre-trained ResNet18 model
resnet_model_6 = models.resnet18(pretrained=True)

# 2. Freeze layers
freeze_layers(resnet_model_6)

# 3. Replace the last layer for the new task
resnet_model_6 = replace_last_layer(resnet_model_6)

# Set the requires_grad attribute for the parameters of the last layer to True
for param in resnet_model_6.fc.parameters():
    param.requires_grad = True

# 4. Fine-tune the model on your dataset
fine_tune_model(resnet_model_6, train_loader, val_loader, test_loader)

### EX.7 - AlexNet

batch_size = 128 <br>
num_epochs=10 <br>
learning_rate=0.001 <br>
**Using freezing**

In [None]:
batch_size = 128

# Define the transformations for our data
data_transform = transforms.Compose([
    transforms.Resize((224,224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

test_data_path = 'archive/test'
# train_path = 'C:\\Users\\nadav\\PycharmProjects\\assignment1\\pythonProject7\\archive\\asl_alphabet_subset'
train_path = 'C:\\Users\\nadav\\Downloads\\Final_Data_Full\\Final_Data_Full\\train'

train_dataset = datasets.ImageFolder(train_path, transform=data_transform)
test_dataset = datasets.ImageFolder(test_data_path, transform=data_transform)

dataset_size = len(train_dataset)
train_size = int(dataset_size * 0.8)
val_size = dataset_size - train_size
train_data, train_val = random_split(train_dataset, [train_size, val_size])
train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)
# Evaluation loaders do not need shuffling
val_loader = DataLoader(train_val, batch_size=batch_size, shuffle=False)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

In [None]:
# 1. Load a pre-trained AlexNet model
alexnet_model_1 = models.alexnet(pretrained=True)

# 2. Freeze layers
freeze_layers(alexnet_model_1)

# 3. Replace the last layer for the new task
alexnet_model_1 = replace_last_layer(alexnet_model_1)

# Set the requires_grad attribute for the parameters of the last layer to True
for param in alexnet_model_1.classifier[6].parameters():
    param.requires_grad = True

# 4. Fine-tune the model on your dataset
fine_tune_model(alexnet_model_1, train_loader, val_loader, test_loader)

### EX.8 - DenseNet

batch_size = 128 <br>
num_epochs=10 <br>
learning_rate=0.001 <br>
**Using freezing**

In [None]:
# 1. Load a pre-trained DenseNet model
densenet_model_1 = models.densenet121(pretrained=True)

# 2. Freeze layers
freeze_layers(densenet_model_1)

# 3. Replace the last layer for the new task
densenet_model_1 = replace_last_layer(densenet_model_1)

# Set the requires_grad attribute for the parameters of the last layer to True
for param in densenet_model_1.classifier.parameters():
    param.requires_grad = True

# 4. Fine-tune the model on your dataset
fine_tune_model(densenet_model_1, train_loader, val_loader, test_loader, ex=8)

### EX.9 - VGG16

batch_size = 128 <br>
num_epochs=10 <br>
learning_rate=0.001 <br>
**Using freezing**

In [None]:
# 1. Load a pre-trained VGG16 model
vgg_model_1 = models.vgg16(pretrained=True)

# 2. Freeze layers
freeze_layers(vgg_model_1)

# 3. Replace the last layer for the new task
vgg_model_1 = replace_last_layer(vgg_model_1)

# Set the requires_grad attribute for the parameters of the last layer to True
# (classifier[6] is the final Linear layer; the rest of the classifier stays frozen)
for param in vgg_model_1.classifier[6].parameters():
    param.requires_grad = True

# 4. Fine-tune the model on your dataset
fine_tune_model(vgg_model_1, train_loader, val_loader, test_loader, ex=9)

### EX.10 - AlexNet

batch_size = 128 <br>
num_epochs=10 <br>
learning_rate=0.001 <br>
**Using freezing**

In [None]:
# Define the transformations for our data
data_transform = transforms.Compose([
    transforms.Resize((224,224)),
    transforms.ToTensor(),
    transforms.Normalize(mean, std)
])

test_data_path = 'archive/test'
# train_path = 'C:\\Users\\nadav\\PycharmProjects\\assignment1\\pythonProject7\\archive\\asl_alphabet_subset'
train_path = 'C:\\Users\\nadav\\Downloads\\Final_Data_Full\\Final_Data_Full\\train'

train_dataset = datasets.ImageFolder(train_path, transform=data_transform)
test_dataset = datasets.ImageFolder(test_data_path, transform=data_transform)

dataset_size = len(train_dataset)
train_size = int(dataset_size * 0.8)
val_size = dataset_size - train_size
train_data, train_val = random_split(train_dataset, [train_size, val_size])
train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)
# Evaluation loaders do not need shuffling
val_loader = DataLoader(train_val, batch_size=batch_size, shuffle=False)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

In [None]:
# 1. Load a pre-trained AlexNet model
alexnet_model_2 = models.alexnet(pretrained=True)

# 2. Freeze layers
freeze_layers(alexnet_model_2)

# 3. Replace the last layer for the new task
alexnet_model_2 = replace_last_layer(alexnet_model_2)

# Set the requires_grad attribute for the parameters of the last layer to True
for param in alexnet_model_2.classifier[6].parameters():
    param.requires_grad = True

# 4. Fine-tune the model on your dataset
fine_tune_model(alexnet_model_2, train_loader, val_loader, test_loader)

## **Task 3.d - Classic ML model using Feature extraction**


In [None]:
# Remove the last fully connected layer (classification layer)
def remove_last_layer(model):
    # Replace the last fully connected layer with an identity layer
    if isinstance(model, models.ResNet):
        model.fc = nn.Identity()
    else:
        raise ValueError("Unsupported model architecture")
    return model

resnet_model = remove_last_layer(resnet_model)

# Print the modified model to verify the changes
print(resnet_model)

In [None]:
# Extract features from the last layer before the removed one
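def extract_features(model, dataloader):
    features = []
    labels = []
    # Run on whichever device the model's weights live on
    device = next(model.parameters()).device
    model.eval()
    with torch.no_grad():
        for images, targets in dataloader:
            outputs = model(images.to(device))
            features.append(outputs.cpu())
            labels.append(targets)
    # Return NumPy arrays so scikit-learn can consume them directly
    features = torch.cat(features, dim=0).numpy()
    labels = torch.cat(labels, dim=0).numpy()
    return features, labels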

train_features, train_labels = extract_features(resnet_model, train_loader)
test_features, test_labels = extract_features(resnet_model, test_loader)

In [None]:
# Train logistic regression classifier on the extracted features
from sklearn.linear_model import LogisticRegression

logistic_regression_model = LogisticRegression(max_iter=1000)
logistic_regression_model.fit(train_features, train_labels)

In [None]:
# Evaluate logistic regression classifier on the train and test features
from sklearn.metrics import accuracy_score

train_predictions = logistic_regression_model.predict(train_features)
train_accuracy = accuracy_score(train_labels, train_predictions)
print("Train Accuracy:", train_accuracy)

test_predictions = logistic_regression_model.predict(test_features)
test_accuracy = accuracy_score(test_labels, test_predictions)
print("Test Accuracy:", test_accuracy)
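
Logistic regression usually converges more easily when its input features are standardised. The sketch below is a possible variation, not the method used above: it wraps `StandardScaler` and `LogisticRegression` in a scikit-learn pipeline and fits it on synthetic data. `make_classification` is only a stand-in for the extracted ResNet features, and all sizes are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for (features, labels); sizes are arbitrary
X, y = make_classification(n_samples=300, n_features=32, n_informative=16,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# The scaler's statistics are fitted on the training features only
classifier = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
classifier.fit(X_train, y_train)

test_accuracy = accuracy_score(y_test, classifier.predict(X_test))
print("Test Accuracy:", test_accuracy)
```

To try this on the real features, `X_train` and `y_train` would be replaced by `train_features` and `train_labels`, and likewise for the test set.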