## Engineer, Generative AI

### Development of an Adaptable Deep Learning Model for Artistic Style Transfer


Design a neural network, preferably a convolutional neural network, with the aim of learning features that capture the distinctive style of an artist. Select a dataset for training and ensure a proper split into training, validation and test sets to evaluate the model effectively. Use suitable loss functions during training to assess both content preservation and style emulation.

Following the training phase, develop a method to adapt the learned style features to new artworks while maintaining the original content and artistic integrity. Establish criteria for evaluation, including measures such as stylistic accuracy, content preservation and overall visual appeal, to gauge the effectiveness of the style transfer algorithm.

The ultimate objective is to create a versatile system for artistic style transfer that seamlessly incorporates new styles while preserving the essence of the original content.

### Design Approach



The approach starts with a convolutional neural network architecture that can learn features representing an artist's style. The model is trained on a curated dataset that is split into training, validation and test sets for evaluation, with loss functions that measure both content preservation and style emulation.
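
As a minimal sketch of the split described above, the snippet below shuffles a list of image paths with a fixed seed and cuts it into training, validation and test subsets. The helper name `split_dataset`, the 80/10/10 ratios and the file names are illustrative assumptions, not part of the original notebook.

```python
import random


def split_dataset(items, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle items reproducibly and split them into train/val/test lists.

    Hypothetical helper: the fractions and the seed are illustrative defaults.
    """
    if val_fraction < 0 or test_fraction < 0 or val_fraction + test_fraction >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")

    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # local RNG, global random state untouched

    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)

    test_set = shuffled[:n_test]
    val_set = shuffled[n_test:n_test + n_val]
    train_set = shuffled[n_test + n_val:]
    return train_set, val_set, test_set


# Placeholder file names standing in for a curated set of paintings.
paths = [f"painting_{i:03d}.jpg" for i in range(100)]
train_paths, val_paths, test_paths = split_dataset(paths)
print(len(train_paths), len(val_paths), len(test_paths))
```

Keeping the three subsets disjoint is what makes the validation and test scores meaningful: no image used for fitting is reused for evaluation.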

After training, a method is needed to apply the learned style features to new artworks while keeping the original content and artistic integrity. Evaluation criteria such as stylistic accuracy, content preservation and overall visual appeal are used to assess the style transfer algorithm systematically.
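
One way to make these criteria measurable is to reuse the quantities that define the losses: mean squared error between feature maps for content preservation, and mean squared error between Gram matrices for stylistic accuracy. The sketch below assumes feature maps of shape (1, channels, height, width); the function names, the Gram normalization and the random tensors standing in for real features are hypothetical choices for illustration. Visual appeal has no comparable formula and is left to human judgment here.

```python
import torch


def gram(features):
    """Gram matrix of a (1, C, H, W) feature map, normalized by C * H * W."""
    _, c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.t() / (c * h * w)


def content_distance(generated_features, content_features):
    """Lower is better: mean squared difference between feature maps."""
    return torch.mean((generated_features - content_features) ** 2).item()


def style_distance(generated_features, style_features):
    """Lower is better: mean squared difference between Gram matrices."""
    return torch.mean((gram(generated_features) - gram(style_features)) ** 2).item()


# Random tensors stand in for real feature maps (illustration only).
torch.manual_seed(0)
content_feats = torch.rand(1, 8, 16, 16)
style_feats = torch.rand(1, 8, 16, 16)
generated_feats = 0.9 * content_feats + 0.1 * style_feats

print("content distance:", content_distance(generated_feats, content_feats))
print("style distance:", style_distance(generated_feats, style_feats))
```

Both scores are zero when the compared features are identical, so they can be tracked across hyperparameter settings to compare runs.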

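The overall goal is a flexible system for artistic style transfer that can integrate new styles while staying faithful to the original content. The code in this notebook implements the optimization-based form of this idea: a pretrained VGG19 network with frozen weights extracts features, and the pixels of the output image are optimized directly against content and style losses.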

### Prepared By

Soumya Tarafder | 19AR10036
<br/>Indian Institute of Technology, Kharagpur

### Code Block

#### Import Libraries

In [None]:
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.optim as optim
from torchvision import transforms, models

%matplotlib inline

#### Preprocess Data

Load and Preprocess the Image

In [None]:
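def load_and_preprocess_image(image_path, max_dimension=400, custom_shape=None):
    """Load an image as a normalized tensor of shape (1, 3, H, W)."""
    image = Image.open(image_path).convert('RGB')

    if custom_shape is not None:
        # An explicit (height, width) takes precedence over max_dimension
        target_size = tuple(custom_shape)
    else:
        # transforms.Resize(int) sets the SHORTER side, which would let the longer
        # side exceed max_dimension (and would upscale small images), so the
        # (height, width) target is computed explicitly and never upscales
        width, height = image.size
        scale = min(1.0, max_dimension / max(width, height))
        target_size = (max(1, round(height * scale)), max(1, round(width * scale)))

    transform_pipeline = transforms.Compose([
        transforms.Resize(target_size),
        transforms.ToTensor(),
        # ImageNet mean and standard deviation, matching the pretrained VGG19 weights
        transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
    ])

    # The image is already RGB, so only a batch dimension needs to be added
    preprocessed_image = transform_pipeline(image).unsqueeze(0)

    return preprocessed_image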

Convert a Tensor to a NumPy Image

In [None]:
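def tensor_to_image(tensor):
    # Work on a detached CPU copy so the original tensor is left untouched
    image = tensor.to("cpu").clone().detach()
    # (1, 3, H, W) -> (3, H, W) -> (H, W, 3), the layout Matplotlib expects
    image = image.numpy().squeeze(0).transpose(1, 2, 0)
    # Undo the ImageNet normalization applied during preprocessing
    image = image * np.array((0.229, 0.224, 0.225)) + np.array((0.485, 0.456, 0.406))
    # Clamp to the displayable range [0, 1]
    image = np.clip(image, 0, 1)

    return image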
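
The constants in `tensor_to_image` are the same per-channel mean and standard deviation passed to `transforms.Normalize` in the loading step, applied in reverse. A small NumPy check, using a made-up 2x2 image, illustrates that denormalizing a normalized array recovers the original pixel values:

```python
import numpy as np

# ImageNet channel statistics, as used in the notebook.
mean = np.array((0.485, 0.456, 0.406))
std = np.array((0.229, 0.224, 0.225))

# Made-up 2x2 RGB image in height x width x channel layout, values in [0, 1).
rng = np.random.default_rng(0)
original = rng.random((2, 2, 3))

normalized = (original - mean) / std   # what Normalize computes for each channel
restored = normalized * std + mean     # the inverse applied in tensor_to_image
restored = np.clip(restored, 0, 1)

print(np.allclose(original, restored))
```

The final clipping only matters for optimized images, whose pixel values can drift outside [0, 1] during gradient updates.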

#### Finalize Model

In [None]:
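def get_frozen_vgg_model():
    # torchvision >= 0.13 selects weights via `weights`; `pretrained=True` is deprecated.
    # Only the convolutional part (`features`) is needed for style transfer.
    vgg_model = models.vgg19(weights=models.VGG19_Weights.DEFAULT).features.eval()

    # Freeze the weights: only the generated image is optimized
    for parameter in vgg_model.parameters():
        parameter.requires_grad_(False)

    return vgg_model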

In [None]:
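def extract_and_return_features(input_image, model, selected_layers=None):
    if selected_layers is None:
        # Indices within vgg19.features mapped to readable layer names;
        # conv4_2 is used for content, the other layers for style
        selected_layers = {'0': 'conv1_1',
                           '5': 'conv2_1',
                           '10': 'conv3_1',
                           '19': 'conv4_1',
                           '21': 'conv4_2',
                           '28': 'conv5_1'}

    features_dict = {}
    x = input_image

    # Run the image through the network layer by layer and keep the selected outputs.
    # torchvision's VGG uses in-place ReLU, so each stored tensor ends up holding
    # the post-activation values of its layer.
    for name, layer in model.named_children():
        x = layer(x)
        if name in selected_layers:
            features_dict[selected_layers[name]] = x

    return features_dict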

In [None]:
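def calculate_gram_matrix(tensor):
    # Assumes a batch of one: (1, d, h, w) is flattened to (d, h * w)
    _, d, h, w = tensor.size()
    tensor = tensor.view(d, h * w)
    # Channel-by-channel dot products over all spatial positions
    gram_matrix = torch.mm(tensor, tensor.t())
    return gram_matrix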
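
To see what the Gram matrix contains, it helps to evaluate it by hand on a tiny input. The sketch below repeats the same computation on a made-up feature map with two channels and a 1x2 spatial grid. Entry (i, j) is the dot product of flattened channel i with flattened channel j, so the spatial layout is discarded and only channel co-activation remains.

```python
import torch

# Made-up feature map: batch 1, 2 channels, height 1, width 2.
features = torch.tensor([[[[1.0, 2.0]],
                          [[3.0, 4.0]]]])

_, d, h, w = features.size()
flat = features.view(d, h * w)      # rows: channels, columns: spatial positions
gram = torch.mm(flat, flat.t())     # (d, d) matrix of channel dot products

print(gram)
# tensor([[ 5., 11.],
#         [11., 25.]])
```

For example, the off-diagonal entry is 1 * 3 + 2 * 4 = 11, and the matrix is symmetric by construction.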

In [None]:
def perform_style_transfer(content_path, style_path):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # The style image is resized to the content image's (height, width)
    content_image = load_and_preprocess_image(content_path).to(device)
    style_image = load_and_preprocess_image(style_path, custom_shape=content_image.shape[-2:]).to(device)

    vgg_model = get_frozen_vgg_model().to(device)

    # The targets are fixed, so they are computed once outside the loop
    content_features = extract_and_return_features(content_image, vgg_model)
    style_features = extract_and_return_features(style_image, vgg_model)

    style_grams = {layer: calculate_gram_matrix(style_features[layer]) for layer in style_features}

    # Start from the content image; its pixels are the only values being optimized
    generated_image = content_image.clone().requires_grad_(True)

    # Per-layer style weights: earlier layers are weighted more heavily
    style_weights = {'conv1_1': 1.0,
                     'conv2_1': 0.75,
                     'conv3_1': 0.2,
                     'conv4_1': 0.2,
                     'conv5_1': 0.2}

    # Relative importance of content and style in the total loss
    content_weight = 1
    style_weight = 1e9

    optimizer = optim.Adam([generated_image], lr=0.003)
    num_steps = 500

    for step in range(1, num_steps + 1):
        generated_features = extract_and_return_features(generated_image, vgg_model)

        # Content loss: mean squared error on the conv4_2 feature maps
        content_loss = torch.mean((generated_features['conv4_2'] - content_features['conv4_2'])**2)

        # Style loss: weighted mean squared error between Gram matrices,
        # normalized by the size of each feature map
        style_loss = 0
        for layer in style_weights:
            target_feature = generated_features[layer]
            target_gram = calculate_gram_matrix(target_feature)
            _, d, h, w = target_feature.shape
            style_gram = style_grams[layer]
            layer_style_loss = style_weights[layer] * torch.mean((target_gram - style_gram)**2)
            style_loss += layer_style_loss / (d * h * w)

        total_loss = content_weight * content_loss + style_weight * style_loss

        optimizer.zero_grad()
        total_loss.backward()
        optimizer.step()

    return generated_image
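
The loop above updates the pixels of `generated_image` rather than any network weights. The following toy sketch isolates that idea without VGG19: a random tensor is optimized with Adam until its Gram matrix approaches that of a target tensor. The tensor shapes, learning rate and step count are arbitrary choices for illustration.

```python
import torch
import torch.optim as optim


def gram_matrix(tensor):
    """Gram matrix of a (1, C, H, W) tensor, computed as in the notebook."""
    _, d, h, w = tensor.size()
    flat = tensor.view(d, h * w)
    return torch.mm(flat, flat.t())


torch.manual_seed(0)
target = torch.rand(1, 3, 8, 8)                          # stands in for style features
generated = torch.rand(1, 3, 8, 8).requires_grad_(True)  # the tensor being optimized

target_gram = gram_matrix(target)
optimizer = optim.Adam([generated], lr=0.01)

losses = []
for step in range(200):
    loss = torch.mean((gram_matrix(generated) - target_gram) ** 2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Recording the loss per step, as done here with `losses`, is also a simple way to check in the full notebook whether 500 steps are enough for the output to settle.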

#### Final Result

Run an Example

In [None]:
content_image_path = 'content.jpg'
style_image_path = 'style.jpg'
stylized_image = perform_style_transfer(content_image_path, style_image_path)

Display Content, Style and Stylized Images

In [None]:
fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(15, 15))
ax1.imshow(tensor_to_image(load_and_preprocess_image(content_image_path).to("cpu")))
ax1.set_title("Content Image", fontsize=20)
ax2.imshow(tensor_to_image(load_and_preprocess_image(style_image_path).to("cpu")))
ax2.set_title("Style Image", fontsize=20)
ax3.imshow(tensor_to_image(stylized_image))
ax3.set_title("Stylized Image", fontsize=20)
plt.show()
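
To keep the result beyond the notebook session, the array returned by `tensor_to_image` can be written to disk. The sketch below assumes a float array in [0, 1] with shape (height, width, 3), which is what `tensor_to_image` produces. The helper name `save_image_array` and the output file name are hypothetical, and a random array stands in for the stylized image.

```python
import os
import tempfile

import numpy as np
from PIL import Image


def save_image_array(image_array, output_path):
    """Save a float RGB array with values in [0, 1] as an 8-bit image file."""
    clipped = np.clip(image_array, 0, 1)
    as_uint8 = (clipped * 255).round().astype(np.uint8)
    Image.fromarray(as_uint8).save(output_path)
    return output_path


# Random array standing in for tensor_to_image(stylized_image).
rng = np.random.default_rng(0)
fake_stylized = rng.random((4, 6, 3))

output_path = os.path.join(tempfile.gettempdir(), "stylized_example.png")
save_image_array(fake_stylized, output_path)

with Image.open(output_path) as saved:
    print(saved.size, saved.mode)
```

PNG is used here because it is lossless; a JPEG file name would work with the same call at the cost of compression artifacts.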

### Conclusion

In conclusion, the code implements neural style transfer, combining the content of one image with the artistic style of another. It uses PyTorch with a pretrained VGG19 convolutional network as a fixed feature extractor.

The code is organized into small functions with descriptive names, which aids readability and maintenance. Style is captured by Gram matrices of VGG19 feature maps, content by the `conv4_2` feature map, and the generated image is refined with the Adam optimizer over 500 steps against a weighted sum of content and style losses.
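
Written out, the loss computed in `perform_style_transfer` takes the form below. The symbols are notation introduced here for illustration: $x$ is the generated image, $c$ the content image, $s$ the style image, $F_{\ell}$ the feature map of layer $\ell$, $\tilde{F}_{\ell}$ that feature map flattened to $D_{\ell}$ rows, and $\lambda_{\ell}$ the entries of `style_weights`. In the code, $\alpha$ is `content_weight` (1) and $\beta$ is `style_weight` (1e9).

```latex
\begin{aligned}
\mathcal{L}_{\text{content}} &= \operatorname{mean}\!\left[\left(F_{\text{conv4\_2}}(x) - F_{\text{conv4\_2}}(c)\right)^{2}\right] \\
G_{\ell}(x) &= \tilde{F}_{\ell}(x)\,\tilde{F}_{\ell}(x)^{\top},
  \qquad \tilde{F}_{\ell}(x) \in \mathbb{R}^{D_{\ell} \times H_{\ell} W_{\ell}} \\
\mathcal{L}_{\text{style}} &= \sum_{\ell} \frac{\lambda_{\ell}}{D_{\ell} H_{\ell} W_{\ell}}
  \operatorname{mean}\!\left[\left(G_{\ell}(x) - G_{\ell}(s)\right)^{2}\right] \\
\mathcal{L}_{\text{total}} &= \alpha\,\mathcal{L}_{\text{content}} + \beta\,\mathcal{L}_{\text{style}}
\end{aligned}
```

The sum runs over the five style layers `conv1_1` to `conv5_1`, and each mean is taken over all entries of the respective tensor.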

The final stylized image is displayed with Matplotlib next to the content and style images, which allows a visual check of how well content and style have been combined.