Import the necessary libraries.

In [51]:
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

Set manual_seed to get reproducible results.

In [52]:
torch.manual_seed(42) 

<torch._C.Generator at 0x1fc6ce36030>

Load the test dataset that we created in the ImagepredictionCNN.ipynb file.

In [53]:
test_data  = torch.load('test_dataset.pth')



To get baseline scores, I only resize the images and convert them to tensors.

In [54]:
test_transform = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ToTensor(),
])

Apply the transform to the underlying dataset.

In [55]:
test_data.dataset.transform = test_transform

Load the data with PyTorch's DataLoader.

In [56]:
test_dataloader = DataLoader(test_data, batch_size=64, shuffle=True)

As in training, we need to define the CNN class again. We only need its structure here; we will not train it again.

In [57]:
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 20, 3), # 128 - 3 + 1 = 126, After each convolution, the size of the image is reduced by 2 pixels
            nn.BatchNorm2d(20), # Normalize the output of the convolutional layer
            nn.ReLU(), # Activation function
            nn.Conv2d(20, 40, 3), # 126 - 3 + 1 = 124, Again the size of image is reduced by 2 pixels
            nn.BatchNorm2d(40), 
            nn.ReLU(),
            nn.MaxPool2d(2, 2), # 124 / 2 = 62, Max pooling reduces the size of the image by half
            nn.Conv2d(40, 80, 3), # 62 - 3 + 1 = 60
            nn.BatchNorm2d(80),
            nn.ReLU(),
            nn.Conv2d(80, 160, 3), # 60 - 3 + 1 = 58
            nn.BatchNorm2d(160),
            nn.ReLU(),
            nn.MaxPool2d(2, 2) # 58 / 2 = 29
        )
        self.classifier = nn.Sequential(
            nn.Linear(160, 200), # We gave 160 as input because adaptive_avg_pool2d will return 160 features
            nn.ReLU(), # Activation function
            nn.Dropout(0.5), # Randomly drop 50% of the connections, which helps to prevent overfitting
            nn.Linear(200, 100),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(100, 10) # Because we have 10 classes, the last layer has 10 neurons
        )
    # Forward pass
    def forward(self, x):
        x = self.features(x) # Extract features
        x = F.adaptive_avg_pool2d(x, (1, 1)) # Average pooling, the size of the image is reduced to 1x1
        x = torch.flatten(x, 1) # Flatten the output of the convolutional layers
        x = self.classifier(x) # Classify the image
        return x # Return the output
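
The size arithmetic in the layer comments above can be checked with a few lines of plain Python. This is only a sketch: the helper names conv_out, pool_out and feature_map_sizes are hypothetical, and it assumes square inputs, no padding, stride 1 for the convolutions and a 2x2 pooling window with stride 2, as in the class above.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a square convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Output side length of max pooling, rounding down."""
    return (size - kernel) // stride + 1


def feature_map_sizes(size=128):
    """Side length after each layer of the feature extractor, in order."""
    sizes = [size]
    for layer in ("conv", "conv", "pool", "conv", "conv", "pool"):
        size = conv_out(size, 3) if layer == "conv" else pool_out(size)
        sizes.append(size)
    return sizes


print(feature_map_sizes())  # [128, 126, 124, 62, 60, 58, 29]
```

The final 29x29 map has 160 channels, and adaptive_avg_pool2d averages each channel down to a single value, which is why the first Linear layer takes 160 inputs.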

If CUDA is available, set the device to cuda; otherwise use the CPU.

In [58]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

Move the model to the available device.

In [59]:
cnn = CNN().to(device)

If we want to continue training later, we need the criterion and optimizer again.

In [60]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(cnn.parameters(), lr=0.001)

We can load the pretrained model with the torch.load() function.

In [61]:
checkpoint = torch.load("cnn_trained.pth", map_location=device)  # map to the selected device so this also works without a GPU
cnn.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
start_epoch = checkpoint['epoch'] 
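
The checkpoint is a plain dictionary, so the save and load steps can be tried on a small stand-in model. The sketch below is an assumption-laden illustration, not the code that produced cnn_trained.pth: the tiny Linear model, the epoch value 5 and the in-memory buffer are all hypothetical, and only the dictionary keys match the ones used above.

```python
import io

import torch
import torch.nn as nn
import torch.optim as optim

# Hypothetical tiny model standing in for the CNN.
model = nn.Linear(4, 2)
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Save a checkpoint with the same keys the notebook reads back.
buffer = io.BytesIO()  # in-memory file, used here instead of a .pth path
torch.save(
    {
        "epoch": 5,  # hypothetical value
        "model_state_dict": model.state_dict(),
        "optimizer_state_dict": optimizer.state_dict(),
    },
    buffer,
)
buffer.seek(0)

# Load it onto whichever device is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
checkpoint = torch.load(buffer, map_location=device)

restored = nn.Linear(4, 2).to(device)
restored.load_state_dict(checkpoint["model_state_dict"])
start_epoch = checkpoint["epoch"]
```

Passing the device object to map_location keeps the same cell usable on machines with and without a GPU.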




As in training, I define the same scoring function here. The comments and logic are almost the same as before; the only difference is that it takes the dataloader as a parameter.

In [62]:
def get_scores(model, test_dataloader):
    # Get class labels
    classes = list(test_data.dataset.class_to_idx.keys())
    # Initialize the correct and total predictions
    correct_prediction = 0
    total_prediction = 0
    # Prepare to count predictions for each class
    correct_pred = {classname: 0 for classname in classes}
    total_pred = {classname: 0 for classname in classes}

    # We are only evaluating, not training, so gradient tracking is disabled with torch.no_grad().
    with torch.no_grad():
        # Get the data from the test dataloader
        for data in test_dataloader:
            # Get images and labels and assign it to device
            images, labels = data
            images, labels = images.to(device), labels.to(device)
            
            # Predict the images
            outputs = model(images)

            # Take the index of the highest score as the predicted class
            _, predictions = torch.max(outputs, 1)

            # Update the overall totals
            total_prediction += labels.size(0)
            correct_prediction += (predictions == labels).sum().item()

            # Update the per-class counts
            for label, prediction in zip(labels, predictions):
                if label == prediction:
                    correct_pred[classes[label.item()]] += 1
                total_pred[classes[label.item()]] += 1

    # Print the accuracy of cnn.
    print(f'Accuracy of the network on the {total_prediction} test images: {100 * correct_prediction / total_prediction:.2f} %')
    
    # Print accuracy for each class
    for classname, correct_count in correct_pred.items():
        accuracy = 100 * float(correct_count) / total_pred[classname]
        print(f'Accuracy for class: {classname:5s} is {accuracy:.1f} %')
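
One point worth checking in a scoring function like this: torch.no_grad() only disables gradient tracking. It does not switch Dropout or BatchNorm to inference behavior; model.eval() does that. The cells shown here never call cnn.eval(), so, assuming it is not called elsewhere, Dropout stays active during scoring and repeated runs can give different accuracies. A minimal sketch of the effect, using a hypothetical stand-in for the classifier head rather than the trained CNN:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the classifier head; not the trained CNN.
head = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Dropout(0.5), nn.Linear(16, 3))
x = torch.randn(4, 8)

head.train()  # the default mode after construction: dropout is active
with torch.no_grad():
    train_a, train_b = head(x), head(x)

head.eval()  # dropout becomes a no-op, BatchNorm would use running statistics
with torch.no_grad():
    eval_a, eval_b = head(x), head(x)

print(torch.equal(train_a, train_b))  # typically False: a new dropout mask per call
print(torch.equal(eval_a, eval_b))    # True: deterministic in eval mode
```

Calling model.eval() at the top of get_scores would make the scores repeatable, but it would also change the numbers reported in this notebook, which were produced without it.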

To check that the model loaded correctly, I run the get_scores function again.

In [71]:
get_scores(cnn, test_dataloader)

Accuracy of the network on the 1300 test images: 77.23 %
Accuracy for class: collie is 76.9 %
Accuracy for class: dolphin is 93.1 %
Accuracy for class: elephant is 76.2 %
Accuracy for class: fox   is 60.0 %
Accuracy for class: giant+panda is 89.2 %
Accuracy for class: moose is 77.7 %
Accuracy for class: polar+bear is 93.1 %
Accuracy for class: rabbit is 56.9 %
Accuracy for class: sheep is 80.0 %
Accuracy for class: squirrel is 69.2 %


As we can see, this is our trained model's score with no further image manipulation. Let's manipulate the test images to see how the model performs.

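The first test uses ColorJitter with contrast=0.3. ColorJitter draws a random factor for each image, so this perturbs contrast (together with brightness, saturation and hue) rather than applying one fixed contrast reduction.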

In [72]:
decrease_contrast = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ColorJitter(brightness=1, contrast=0.3, saturation=1, hue=0.5),
    transforms.ToTensor(),
])
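
To see what these arguments mean in practice, the sketch below works out the sampling ranges. It relies on the documented behavior of torchvision's ColorJitter: a single float v for brightness, contrast or saturation means a factor drawn uniformly from [max(0, 1 - v), 1 + v], and a single float h for hue means a shift drawn from [-h, h]. The helper names factor_range and hue_range are hypothetical.

```python
def factor_range(value):
    """Range sampled for brightness, contrast or saturation given one float."""
    return (max(0.0, 1.0 - value), 1.0 + value)


def hue_range(value):
    """Range sampled for the hue shift given one float (0 <= value <= 0.5)."""
    return (-value, value)


# Arguments of the decrease_contrast pipeline above.
settings = {"brightness": 1, "contrast": 0.3, "saturation": 1, "hue": 0.5}

ranges = {
    name: hue_range(value) if name == "hue" else factor_range(value)
    for name, value in settings.items()
}

for name, (low, high) in ranges.items():
    print(f"{name}: sampled from [{low:.1f}, {high:.1f}]")
```

With these settings the contrast factor stays within [0.7, 1.3], while brightness and saturation range over [0, 2] and the hue shift covers its full range. For a fixed rather than random change, torchvision.transforms.functional.adjust_contrast(img, 0.3) applies a constant factor.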

In [73]:
contrast_test = torch.load('test_dataset.pth')



In [74]:
contrast_test.dataset.transform = decrease_contrast

In [77]:
contrast_dataloader = DataLoader(contrast_test, batch_size=64, shuffle=True)

In [78]:
get_scores(cnn, contrast_dataloader)

Accuracy of the network on the 1300 test images: 33.00 %
Accuracy for class: collie is 28.5 %
Accuracy for class: dolphin is 23.8 %
Accuracy for class: elephant is 33.1 %
Accuracy for class: fox   is 20.0 %
Accuracy for class: giant+panda is 36.2 %
Accuracy for class: moose is 40.0 %
Accuracy for class: polar+bear is 58.5 %
Accuracy for class: rabbit is 32.3 %
Accuracy for class: sheep is 32.3 %
Accuracy for class: squirrel is 25.4 %


As we can see, this jitter dropped accuracy from 77.23 % to 33.00 %. Let's set brightness=0.3 this time.

In [79]:
decrease_brightness = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ColorJitter(brightness=0.3, contrast=1, saturation=1, hue=0.5),
    transforms.ToTensor(),
])

In [80]:
brightness_test = torch.load('test_dataset.pth')



In [81]:
brightness_test.dataset.transform = decrease_brightness

In [82]:
brightness_dataloader = DataLoader(brightness_test, batch_size=64, shuffle=True)

In [83]:
get_scores(cnn, brightness_dataloader)

Accuracy of the network on the 1300 test images: 29.77 %
Accuracy for class: collie is 34.6 %
Accuracy for class: dolphin is 23.1 %
Accuracy for class: elephant is 30.8 %
Accuracy for class: fox   is 18.5 %
Accuracy for class: giant+panda is 27.7 %
Accuracy for class: moose is 40.0 %
Accuracy for class: polar+bear is 44.6 %
Accuracy for class: rabbit is 28.5 %
Accuracy for class: sheep is 27.7 %
Accuracy for class: squirrel is 22.3 %


As we can see again, model performance decreases dramatically (29.77 %). Let's run the test with saturation=0.3 this time.

In [84]:
decrease_saturation = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ColorJitter(brightness=1, contrast=1, saturation=0.3, hue=0.5),
    transforms.ToTensor(),
])

In [85]:
saturation_test = torch.load('test_dataset.pth')



In [86]:
saturation_test.dataset.transform = decrease_saturation

In [87]:
saturation_dataloader = DataLoader(saturation_test, batch_size=64, shuffle=True)

In [88]:
get_scores(cnn, saturation_dataloader)

Accuracy of the network on the 1300 test images: 26.38 %
Accuracy for class: collie is 29.2 %
Accuracy for class: dolphin is 22.3 %
Accuracy for class: elephant is 23.8 %
Accuracy for class: fox   is 16.9 %
Accuracy for class: giant+panda is 16.2 %
Accuracy for class: moose is 33.8 %
Accuracy for class: polar+bear is 52.3 %
Accuracy for class: rabbit is 32.3 %
Accuracy for class: sheep is 20.0 %
Accuracy for class: squirrel is 16.9 %


In all three cases, jittering the color properties of the images causes a significant drop in model performance. Let's run a test with a more extreme value, contrast=3.

In [89]:
increase_contrast = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ColorJitter(brightness=1, contrast=3, saturation=1, hue=0.5),
    transforms.ToTensor(),
])

In [90]:
increase_contrast_test = torch.load('test_dataset.pth')



In [91]:
increase_contrast_test.dataset.transform = increase_contrast

In [92]:
increase_contrast_dataloader = DataLoader(increase_contrast_test, batch_size=64, shuffle=True)

In [93]:
get_scores(cnn, increase_contrast_dataloader)

Accuracy of the network on the 1300 test images: 21.38 %
Accuracy for class: collie is 21.5 %
Accuracy for class: dolphin is 21.5 %
Accuracy for class: elephant is 20.0 %
Accuracy for class: fox   is 12.3 %
Accuracy for class: giant+panda is 15.4 %
Accuracy for class: moose is 21.5 %
Accuracy for class: polar+bear is 42.3 %
Accuracy for class: rabbit is 19.2 %
Accuracy for class: sheep is 20.8 %
Accuracy for class: squirrel is 19.2 %


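To conclude, our model performs poorly under changed lighting and color conditions, which indicates that it may have overfitted to the conditions in the training dataset. Because every test also applies a random hue shift (hue=0.5), part of the drop may come from hue rather than from the property named in each test. To get better performance under different lighting, we need to train the model on images with varied lighting, for example by adding color augmentation to the training transforms.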

## Color Consistency Algorithm

First, I define a function that applies the Gray World algorithm.

In [94]:
def apply_gray_world(image):
    image = image * 255.0  # Convert to [0, 255] range

    # Calculate the average value per channel; ToTensor yields channels in (R, G, B) order
    avg_r, avg_g, avg_b = image[0].mean(), image[1].mean(), image[2].mean()
    
    # Compute the gray value and the per-channel scaling factors
    gray_value = (avg_r + avg_g + avg_b) / 3
    scaling_factors = torch.tensor([gray_value / avg_r, gray_value / avg_g, gray_value / avg_b])
    
    # Apply the scaling factors to each channel (R, G, B)
    corrected_image = image * scaling_factors.view(3, 1, 1)  # Broadcast scaling factors

    # Ensure the result stays within the valid range and return it
    corrected_image = torch.clamp(corrected_image, 0, 255) / 255.0
    return corrected_image
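
Written as formulas, the function above does the following for an image with height H, width W and channel values I_c(i, j) scaled to [0, 255]. The symbols are notation introduced here for illustration; they do not appear in the code.

```latex
\bar{I}_c = \frac{1}{HW} \sum_{i=1}^{H} \sum_{j=1}^{W} I_c(i,j), \qquad c \in \{R, G, B\}

g = \frac{\bar{I}_R + \bar{I}_G + \bar{I}_B}{3}

I'_c(i,j) = \min\!\left(255,\ \max\!\left(0,\ \frac{g}{\bar{I}_c}\, I_c(i,j)\right)\right)
```

Before clamping, every channel of the corrected image has the same mean g, which is the Gray World assumption. A channel whose mean is 0 would cause a division by zero, so an all-black channel is a case the function does not handle.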

After defining the function, I create a class that lets us run it inside transforms.Compose().

In [95]:
class GrayWorldTransform:
    def __call__(self, image):
        return apply_gray_world(image)
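
The reason a small class with __call__ is enough: a composed pipeline only needs each step to be callable on the output of the previous step. The sketch below illustrates that idea in plain Python. MiniCompose, Scale and Clamp01 are hypothetical stand-ins operating on lists of numbers; this is a simplified illustration, not torchvision's implementation.

```python
class Scale:
    """Hypothetical stand-in transform: multiplies every value by a factor."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, values):
        return [v * self.factor for v in values]


class Clamp01:
    """Hypothetical stand-in transform: clamps every value to [0, 1]."""

    def __call__(self, values):
        return [min(1.0, max(0.0, v)) for v in values]


class MiniCompose:
    """Simplified chaining of callables, applied in list order."""

    def __init__(self, steps):
        self.steps = steps

    def __call__(self, values):
        for step in self.steps:
            values = step(values)
        return values


pipeline = MiniCompose([Scale(2.0), Clamp01()])
print(pipeline([0.1, 0.4, 0.8]))  # [0.2, 0.8, 1.0]
```

Order matters: each step sees the output of the one before it. In the same way, a pipeline that lists a ColorJitter step before GrayWorldTransform would apply the jitter first and the correction afterwards.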

With the setup done, we can apply the Gray World algorithm.

In [96]:
transform = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ToTensor(),
    GrayWorldTransform()
])

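To see how the algorithm works, I apply the transformation to a dataset we already have: increase_contrast_test. On this dataset we changed the contrast and got 21 % accuracy. Note that assigning the new transform replaces the earlier ColorJitter pipeline, so the images scored below are no longer contrast-jittered.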

In [100]:
increase_contrast_test.dataset.transform = transform # Apply the Gray World transform

In [101]:
increase_contrast_dataloader = DataLoader(increase_contrast_test, batch_size=64, shuffle=True) # Loading the data

In [102]:
get_scores(cnn, increase_contrast_dataloader)

Accuracy of the network on the 1300 test images: 67.69 %
Accuracy for class: collie is 72.3 %
Accuracy for class: dolphin is 76.9 %
Accuracy for class: elephant is 66.9 %
Accuracy for class: fox   is 44.6 %
Accuracy for class: giant+panda is 80.8 %
Accuracy for class: moose is 70.8 %
Accuracy for class: polar+bear is 90.0 %
Accuracy for class: rabbit is 47.7 %
Accuracy for class: sheep is 67.7 %
Accuracy for class: squirrel is 59.2 %


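As the results show, accuracy is back up to 67.69 %. Because the ColorJitter step is no longer in the pipeline, this score reflects Gray World applied to the unjittered images. To attribute the gain to Gray World itself, the jitter would need to stay in the pipeline ahead of GrayWorldTransform.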

Let's see how the scores change for contrast_test.

In [104]:
contrast_test.dataset.transform = transform # Apply gray world algorithm

In [105]:
contrast_dataloader = DataLoader(contrast_test, batch_size=64, shuffle=True) # Load the data

In [106]:
get_scores(cnn, contrast_dataloader)

Accuracy of the network on the 1300 test images: 66.92 %
Accuracy for class: collie is 73.8 %
Accuracy for class: dolphin is 76.9 %
Accuracy for class: elephant is 67.7 %
Accuracy for class: fox   is 44.6 %
Accuracy for class: giant+panda is 80.0 %
Accuracy for class: moose is 68.5 %
Accuracy for class: polar+bear is 86.2 %
Accuracy for class: rabbit is 43.8 %
Accuracy for class: sheep is 66.2 %
Accuracy for class: squirrel is 61.5 %


Again, the score is much higher than the 33.00 % measured with the jitter applied.

Finally, I want to apply the Gray World algorithm to our original dataset to see whether performance improves.

In [107]:
test_data.dataset.transform = transform # Apply gray world algorithm

In [108]:
test_dataloader = DataLoader(test_data, batch_size=64, shuffle=True) # Load the data

In [109]:
get_scores(cnn, test_dataloader)

Accuracy of the network on the 1300 test images: 69.08 %
Accuracy for class: collie is 76.9 %
Accuracy for class: dolphin is 76.9 %
Accuracy for class: elephant is 71.5 %
Accuracy for class: fox   is 45.4 %
Accuracy for class: giant+panda is 80.8 %
Accuracy for class: moose is 70.8 %
Accuracy for class: polar+bear is 87.7 %
Accuracy for class: rabbit is 49.2 %
Accuracy for class: sheep is 65.4 %
Accuracy for class: squirrel is 66.2 %


The model's initial score (77.23 %) was better than its score with Gray World applied (69.08 %).
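
To compare the runs side by side, the overall accuracies printed above can be tabulated against the baseline. The numbers are copied from the outputs in this notebook; the labels are descriptions chosen here for illustration.

```python
baseline = 77.23  # Resize + ToTensor only

# Overall accuracies (in %) as printed by get_scores above.
results = {
    "ColorJitter, contrast=0.3": 33.00,
    "ColorJitter, brightness=0.3": 29.77,
    "ColorJitter, saturation=0.3": 26.38,
    "ColorJitter, contrast=3": 21.38,
    "Gray World on increase_contrast_test": 67.69,
    "Gray World on contrast_test": 66.92,
    "Gray World on test_data": 69.08,
}

# Difference of each run to the baseline, in percentage points.
deltas = {name: round(acc - baseline, 2) for name, acc in results.items()}

for name, acc in results.items():
    print(f"{name:40s} {acc:6.2f} %  ({deltas[name]:+.2f} points vs. baseline)")
```

The three Gray World runs land within about two points of each other and roughly 8 to 10 points below the baseline.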

### In conclusion, our model has a good prediction score when the images are of good quality. When the images deviate from the expected style, we can try applying the Gray World algorithm to obtain reasonable predictions.