<a href="https://colab.research.google.com/github/dhiksha08/Sonar-Image-Classification-using-NST/blob/main/NST.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Importing the necessary libraries

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from PIL import Image
import matplotlib.pyplot as plt
import torchvision.transforms as transforms
import torchvision.models as models
import copy

Checking for GPU Availability
If a GPU is available, we use an image size of 512; otherwise we reduce it to 128, since the CPU is comparatively slower.

In [None]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
imsize = 512 if torch.cuda.is_available() else 128
print(imsize)

512


The transform pipeline below resizes each image to imsize x imsize and converts it to a tensor.

In [None]:
loader = transforms.Compose([transforms.Resize((imsize,imsize)), transforms.ToTensor()])

Deep learning frameworks such as PyTorch expect inputs in batches so that they can be processed in parallel.

We therefore add a leading dimension with unsqueeze(0) to form a batch of size 1. The tensor is also converted to the float data type.

The image tensor, originally [channels, height, width], now has dimensions [1, channels, height, width], which is the input shape the network expects.

In [None]:
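def image_loader(image_name):
    # Force 3 channels so grayscale or RGBA files still match VGG's RGB input
    image = Image.open(image_name).convert("RGB")
    # Resize, convert to a tensor, and add a batch dimension of size 1
    image = loader(image).unsqueeze(0)
    return image.to(device, torch.float)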
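
The batching step can be sketched on its own. Here a random tensor stands in for a loaded image (an assumption for illustration), and `unsqueeze(0)` adds the batch dimension.

```python
import torch

# Stand-in for the output of transforms.ToTensor(): [channels, height, width]
fake_image = torch.rand(3, 128, 128)

# Add a leading batch dimension of size 1
batched = fake_image.unsqueeze(0)

print(fake_image.shape)  # torch.Size([3, 128, 128])
print(batched.shape)     # torch.Size([1, 3, 128, 128])
```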

Load the style and content images and apply the transformations defined above.

In [None]:
style_img = image_loader("style.jpeg")
content_img = image_loader("content.jpeg")

Both the style and content images are now tensors; printing their shapes confirms this.

In [None]:
print(style_img.shape,content_img.shape)

torch.Size([1, 3, 512, 512]) torch.Size([1, 3, 512, 512])


Ensuring that both images are of the same size

In [None]:
assert style_img.size() == content_img.size()

### **Style loss is computed at several convolutional layers, whereas content loss is computed at a single, deeper layer. Deeper layers encode the overall structure of an image rather than exact pixel values, which is why they are used to compare the content image with the generated image.**

The class below subclasses nn.Module so that it can be inserted into the network like any other layer. It computes the content loss as the mean squared error between its input and the target feature map, stores the result in self.loss, and returns the input unchanged.

In [None]:
class ContentLoss(nn.Module):
    def __init__(self, target):
        super(ContentLoss, self).__init__()
        self.target = target.detach()
    def forward(self, input):
        self.loss = nn.functional.mse_loss(input, self.target)
        return input
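
A small sketch of how this "transparent" layer behaves, using toy tensors that are assumptions for illustration: the input passes through untouched while the loss is recorded as a side effect.

```python
import torch
import torch.nn as nn

class ContentLoss(nn.Module):
    def __init__(self, target):
        super().__init__()
        self.target = target.detach()  # fixed target, excluded from autograd

    def forward(self, input):
        self.loss = nn.functional.mse_loss(input, self.target)
        return input  # pass the input through unchanged

# Toy feature maps: an all-zeros target and an all-ones input
target = torch.zeros(1, 2, 4, 4)
layer = ContentLoss(target)
x = torch.ones(1, 2, 4, 4)
out = layer(x)

print(out is x)           # True: the layer does not alter its input
print(layer.loss.item())  # 1.0: mean of (1 - 0)^2 over all elements
```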

The Gram matrix is a measure of style: it captures the correlations between feature maps. The function below computes the Gram matrix of a given feature tensor and normalizes it by the total number of elements.

In [None]:
def gram_matrix(input):
    a, b, c, d = input.size()
    features = input.view(a * b, c * d)
    G = torch.mm(features, features.t())
    return G.div(a * b * c * d)
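
To make the shapes concrete, here is a minimal sketch that applies the same function to a small random tensor. The 4-channel, 8x8 feature map is a hypothetical example, not a real VGG activation.

```python
import torch

def gram_matrix(input):
    a, b, c, d = input.size()            # batch, channels, height, width
    features = input.view(a * b, c * d)  # one row per feature map
    G = torch.mm(features, features.t())
    return G.div(a * b * c * d)

# Hypothetical feature maps: batch of 1, 4 channels, 8x8 spatial grid
feature_maps = torch.rand(1, 4, 8, 8)
G = gram_matrix(feature_maps)

print(G.shape)                   # torch.Size([4, 4]): one entry per channel pair
print(torch.allclose(G, G.t()))  # True: the Gram matrix is symmetric
```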

This class defines the style loss. It calculates the MSE loss between the Gram matrix of the input and the target style. The target style is calculated using the gram_matrix function.

In [None]:
class StyleLoss(nn.Module):
    def __init__(self, target_feature):
        super(StyleLoss, self).__init__()
        self.target = gram_matrix(target_feature).detach()
    def forward(self, input):
        G = gram_matrix(input)
        self.loss = nn.functional.mse_loss(G, self.target)
        return input
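
One way to see why the Gram matrix describes style rather than layout is that it ignores where features occur spatially. The sketch below, built on random tensors that are assumptions for illustration, compares the style loss for a spatially shuffled copy of the target with the loss for unrelated features.

```python
import torch
import torch.nn as nn

def gram_matrix(input):
    a, b, c, d = input.size()
    features = input.view(a * b, c * d)
    G = torch.mm(features, features.t())
    return G.div(a * b * c * d)

class StyleLoss(nn.Module):
    def __init__(self, target_feature):
        super().__init__()
        self.target = gram_matrix(target_feature).detach()

    def forward(self, input):
        G = gram_matrix(input)
        self.loss = nn.functional.mse_loss(G, self.target)
        return input

torch.manual_seed(0)
style_features = torch.rand(1, 4, 8, 8)  # hypothetical style activations

# Same values, but with the 64 spatial positions shuffled identically in every channel
perm = torch.randperm(8 * 8)
shuffled = style_features.view(1, 4, -1)[:, :, perm].reshape(1, 4, 8, 8).contiguous()
unrelated = torch.rand(1, 4, 8, 8)

style_loss = StyleLoss(style_features)
style_loss(shuffled)
loss_shuffled = style_loss.loss.item()
style_loss(unrelated)
loss_unrelated = style_loss.loss.item()

# The shuffled copy has (numerically) the same Gram matrix, so its loss is far smaller
print(loss_shuffled < loss_unrelated)
```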

### **VGG is a classical convolutional neural network architecture. It was based on an analysis of how to increase the depth of such networks.**

Loads a pre-trained VGG19 model from torchvision, moves it to the selected device, and sets it to evaluation mode.

In [None]:
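# `pretrained=True` is deprecated in torchvision 0.13+; the `weights` argument replaces it
cnn = models.vgg19(weights=models.VGG19_Weights.DEFAULT).features.to(device).eval()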

Define the mean and standard deviation values used for normalizing inputs to the VGG network.

In [None]:
cnn_normalization_mean = torch.tensor([0.485, 0.456, 0.406]).to(device)
cnn_normalization_std = torch.tensor([0.229, 0.224, 0.225]).to(device)

This class defines the normalization module, which normalizes input images using the defined mean and standard deviation values.

In [None]:
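class Normalization(nn.Module):
    def __init__(self, mean, std):
        super(Normalization, self).__init__()
        # Reshape to [C, 1, 1] so the values broadcast over [B, C, H, W] images.
        # torch.as_tensor accepts lists or tensors without the copy warning
        # that torch.tensor(existing_tensor) emits.
        self.mean = torch.as_tensor(mean).view(-1, 1, 1)
        self.std = torch.as_tensor(std).view(-1, 1, 1)
    def forward(self, img):
        return (img - self.mean) / self.std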
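
The reshape to `[C, 1, 1]` is what lets the per-channel statistics broadcast over a batched image. A minimal sketch, using a toy image whose pixels all equal the channel mean (an assumption chosen so the result is easy to check):

```python
import torch

mean = torch.tensor([0.485, 0.456, 0.406]).view(-1, 1, 1)  # shape [3, 1, 1]
std = torch.tensor([0.229, 0.224, 0.225]).view(-1, 1, 1)   # shape [3, 1, 1]

# Toy batch of one 4x4 image in which every pixel equals its channel mean
img = torch.zeros(1, 3, 4, 4) + mean

normalized = (img - mean) / std

print(normalized.shape)               # torch.Size([1, 3, 4, 4])
print(normalized.abs().max().item())  # 0.0: mean-valued pixels normalize to zero
```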

These lists specify the layers in the VGG network used for content and style representations.

In [None]:
content_layers_default = ['conv_4']
style_layers_default = ['conv_1', 'conv_2', 'conv_3', 'conv_4', 'conv_5']

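The function below builds the style transfer model. It copies the VGG layers one by one into a new nn.Sequential, naming each layer as it goes, and inserts a ContentLoss or StyleLoss module directly after every layer listed in content_layers or style_layers. Finally, it trims off all layers that come after the last loss module, since they are not needed.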

In [None]:
def get_style_model_and_losses(cnn, normalization_mean, normalization_std,
                               style_img, content_img,
                               content_layers=content_layers_default,
                               style_layers=style_layers_default):
    cnn = copy.deepcopy(cnn)

    normalization = Normalization(normalization_mean, normalization_std).to(device)

    content_losses = []
    style_losses = []

    model = nn.Sequential(normalization)

    i = 0  # increment every time we see a conv
    for layer in cnn.children():
        if isinstance(layer, nn.Conv2d):
            i += 1
            name = 'conv_{}'.format(i)
        elif isinstance(layer, nn.ReLU):
            name = 'relu_{}'.format(i)
            layer = nn.ReLU(inplace=False)
        elif isinstance(layer, nn.MaxPool2d):
            name = 'pool_{}'.format(i)
        elif isinstance(layer, nn.BatchNorm2d):
            name = 'bn_{}'.format(i)
        else:
            raise RuntimeError('Unrecognized layer: {}'.format(layer.__class__.__name__))

        model.add_module(name, layer)

        if name in content_layers:
            target = model(content_img).detach()
            content_loss = ContentLoss(target)
            model.add_module("content_loss_{}".format(i), content_loss)
            content_losses.append(content_loss)

        if name in style_layers:
            target_feature = model(style_img).detach()
            style_loss = StyleLoss(target_feature)
            model.add_module("style_loss_{}".format(i), style_loss)
            style_losses.append(style_loss)

    for i in range(len(model) - 1, -1, -1):
        if isinstance(model[i], ContentLoss) or isinstance(model[i], StyleLoss):
            break

    model = model[:(i + 1)]

    return model, style_losses, content_losses
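
The naming scheme used inside the loop can be sketched in isolation. The `tiny_cnn` below is a hypothetical stand-in for the first few layers of VGG19's `features`, used so that no pretrained weights are needed.

```python
import torch.nn as nn

# Hypothetical stand-in network: conv, relu, conv, relu, pool, conv
tiny_cnn = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv2d(8, 8, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(2),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),
)

names = []
i = 0  # incremented every time a conv layer is seen
for layer in tiny_cnn.children():
    if isinstance(layer, nn.Conv2d):
        i += 1
        names.append('conv_{}'.format(i))
    elif isinstance(layer, nn.ReLU):
        names.append('relu_{}'.format(i))
    elif isinstance(layer, nn.MaxPool2d):
        names.append('pool_{}'.format(i))

print(names)
# ['conv_1', 'relu_1', 'conv_2', 'relu_2', 'pool_2', 'conv_3']
```

Because the counter simply counts convolutions in order, 'conv_4' refers to the fourth convolutional layer overall, not to the fourth block of VGG.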