<a href="https://colab.research.google.com/github/mahankali777/Style-Transfer-Project/blob/main/transfering_style.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [68]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All"
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

---------------------------------------
# Introduction

![](https://miro.medium.com/max/1200/1*XI3beonBnOwp-y5BwNOqCw.gif)

Picture Credit: https://miro.medium.com

**Neural Style Transfer**

> Neural Style Transfer (NST) refers to a class of software algorithms that manipulate digital images, or videos, in order to adopt the appearance or visual style of another image. NST algorithms are characterized by their use of deep neural networks for the sake of image transformation. Common uses for NST are the creation of artificial artwork from photographs, for example by transferring the appearance of famous paintings to user-supplied photographs. Several notable mobile apps use NST techniques for this purpose, including DeepArt and Prisma. This method has been used by artists and designers around the globe to develop new artwork based on existent style(s).

Ref: https://en.wikipedia.org/wiki/Neural_Style_Transfer

Given a content image and a style image, style transfer produces a target image whose outlines and shapes follow the content image while its colors and textures follow the style image.

Separating content from style makes it possible to combine the content of one image with the style of another.


A pre-trained VGG19 network is used to extract the content and style features. The content and style losses are then used to update the target image iteratively until the desired result is achieved.

In [69]:
# import resources
%matplotlib inline

from PIL import Image
from io import BytesIO
import matplotlib.pyplot as plt
import numpy as np

import torch
import torch.optim as optim
import requests
from torchvision import transforms, models

------------------------------------------------
# 1. Load in model

VGG19 is divided into two parts.

* vgg19.features: all the convolutional and pooling layers
* vgg19.classifier: the final three linear layers, which form the classifier

We only need the features part. We also "freeze" its weights so that they are not updated while the target image is optimized.

In [33]:
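# get the "features" portion of VGG19 (we will not need the "classifier" portion)
# note: torchvision >= 0.13 deprecates `pretrained=True` in favor of `weights=models.VGG19_Weights.DEFAULT`
vgg = models.vgg19(pretrained=True).features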

# freeze all VGG parameters since we're only optimizing the target image
for param in vgg.parameters():
    param.requires_grad_(False)

Downloading: "https://download.pytorch.org/models/vgg19-dcbb9e9d.pth" to /root/.cache/torch/hub/checkpoints/vgg19-dcbb9e9d.pth
100%|██████████| 548M/548M [00:09<00:00, 62.3MB/s]


-----------------------------------------------
# 2. Load in Content and Style Images

Load the content image and style image to be used for style transfer. The load_image function resizes an image and returns it as a normalized tensor with a leading batch dimension.



In [39]:
def load_image(img_path, max_size=128, shape=None):
    ''' Load an image and return it as a normalized tensor with a batch dimension.
        The size passed to Resize is capped at max_size; an int there sets the
        shorter side, so the longer side can still exceed it. shape overrides it.'''
    # only treat the argument as a URL when it starts with "http"
    if img_path.startswith("http"):
        response = requests.get(img_path)
        image = Image.open(BytesIO(response.content)).convert('RGB')
    else:
        image = Image.open(img_path).convert('RGB')

    # large images will slow down processing
    size = min(max(image.size), max_size)

    if shape is not None:
        size = shape

    in_transform = transforms.Compose([
                        transforms.Resize(size),
                        transforms.ToTensor(),
                        transforms.Normalize((0.485, 0.456, 0.406),
                                             (0.229, 0.224, 0.225))])

    # discard the transparent, alpha channel (that's the :3) and add the batch dimension
    image = in_transform(image)[:3,:,:].unsqueeze(0)

    return image

Load the content image and style image.

In [14]:
import kagglehub

# Download latest version
path = kagglehub.dataset_download("rithishkannas/stylized-image-dataset")

print("Path to dataset files:", path)

Path to dataset files: /root/.cache/kagglehub/datasets/rithishkannas/stylized-image-dataset/versions/3


In [7]:
# helper function for un-normalizing an image
# and converting it from a Tensor image to a NumPy image for display
def im_convert(tensor):
    """ Convert a normalized tensor into a NumPy image in [0, 1] for display. """

    image = tensor.to("cpu").clone().detach()
    image = image.numpy().squeeze()
    image = image.transpose(1,2,0)
    image = image * np.array((0.229, 0.224, 0.225)) + np.array((0.485, 0.456, 0.406))
    image = image.clip(0, 1)

    return image
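As a quick sanity check, the sketch below normalizes a random image with the same mean and standard deviation used in `load_image`, then applies the same inverse steps as `im_convert`, and confirms the original pixels come back. The random image and the names `MEAN`, `STD`, `original`, and `restored` are illustrative assumptions, not part of the notebook.

```python
import numpy as np
import torch

MEAN = np.array((0.485, 0.456, 0.406))
STD = np.array((0.229, 0.224, 0.225))

rng = np.random.default_rng(0)
original = rng.random((8, 8, 3))  # H x W x C image with values in [0, 1)

# Forward direction: per-channel normalization, then reorder to (1, C, H, W).
normalized = (original - MEAN) / STD
tensor = torch.from_numpy(normalized.transpose(2, 0, 1)).unsqueeze(0)

# Inverse direction: the same steps im_convert performs.
restored = tensor.to("cpu").clone().detach().numpy().squeeze()
restored = restored.transpose(1, 2, 0)
restored = (restored * STD + MEAN).clip(0, 1)

print(np.allclose(restored, original))
```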

In [18]:
import kagglehub

# Download latest version
path = kagglehub.dataset_download("vbookshelf/art-by-ai-neural-style-transfer")

print("Path to dataset files:", path)

Downloading from https://www.kaggle.com/api/v1/datasets/download/vbookshelf/art-by-ai-neural-style-transfer?dataset_version_number=5...


100%|██████████| 463M/463M [00:06<00:00, 72.7MB/s]

Extracting files...





Path to dataset files: /root/.cache/kagglehub/datasets/vbookshelf/art-by-ai-neural-style-transfer/versions/5


In [9]:
def get_features(image, model, layers=None):
    """ Run an image forward through a model and get the features for
        a set of layers. Default layers are for VGGNet matching Gatys et al (2016)
    """

    if layers is None:
        layers = {'0': 'conv1_1',
                  '5': 'conv2_1',
                  '10': 'conv3_1',
                  '19': 'conv4_1',
                  '21': 'conv4_2',  ## content representation
                  '28': 'conv5_1'}

    features = {}
    x = image
    # model._modules is a dictionary holding each module in the model
    for name, layer in model._modules.items():
        x = layer(x)
        if name in layers:
            features[layers[name]] = x

    return features

--------------------------------------------------
# 3. Gram Matrix

![](https://miro.medium.com/max/1400/1*VAQs1KSfbysnloPah_fHGQ.gif)

Picture Credit: https://miro.medium.com


> The Gram matrix captures the correlations between the channels of a feature map. The style loss is defined as the difference between the Gram matrix of the style image and the Gram matrix of the newly generated image. To preserve the content, a second loss is computed element by element between the feature maps that the pre-trained CNN produces for the content image and for the generated image. The generated image is then optimized to minimize both the style loss and the content loss.

https://en.wikipedia.org/wiki/Gram_matrix


In [10]:
def gram_matrix(tensor):
    """ Calculate the Gram Matrix of a given tensor
    """

    # get the batch_size, depth, height, and width of the Tensor
    _, d, h, w = tensor.size()

    # reshape so we're multiplying the features for each channel
    tensor = tensor.view(d, h * w)

    # calculate the gram matrix
    gram = torch.mm(tensor, tensor.t())

    return gram
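To make the computation concrete, here is a small worked example that follows the same steps as `gram_matrix` on a hand-written feature map with 2 channels and a 2x2 spatial grid. The values are made up for illustration. It also shows why the Gram matrix describes style rather than layout: shuffling the spatial positions identically in every channel leaves the result unchanged.

```python
import torch

# One image, 2 channels, 2x2 spatial map (illustrative values).
feature_map = torch.tensor([[[[1., 2.], [3., 4.]],
                             [[0., 1.], [0., 1.]]]])

_, d, h, w = feature_map.size()
flat = feature_map.view(d, h * w)   # each row is one flattened channel
gram = torch.mm(flat, flat.t())     # (d, d) matrix of channel dot products

print(gram)
# tensor([[30.,  6.],
#         [ 6.,  2.]])

# Reorder the spatial positions the same way in every channel.
perm = torch.tensor([3, 1, 0, 2])
shuffled = flat[:, perm]
gram_shuffled = torch.mm(shuffled, shuffled.t())

print(torch.equal(gram, gram_shuffled))  # True: spatial arrangement is discarded
```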

We now have functions that extract the features at given convolutional layers and compute a Gram matrix. Putting them together, we extract the features of each image and compute the Gram matrix of every style layer to obtain the style representation.

In [20]:
import kagglehub

# Download latest version
path = kagglehub.dataset_download("burhanuddinlatsaheb/neural-style-transfer")

print("Path to dataset files:", path)

Downloading from https://www.kaggle.com/api/v1/datasets/download/burhanuddinlatsaheb/neural-style-transfer?dataset_version_number=1...


100%|██████████| 3.17M/3.17M [00:00<00:00, 127MB/s]

Extracting files...
Path to dataset files: /root/.cache/kagglehub/datasets/burhanuddinlatsaheb/neural-style-transfer/versions/1





---------------------------------------------
# 4. Define Losses and Weights

**Individual Layer Style Weights**

You can weight the style representation of each relevant layer individually. Layer weights between 0 and 1 are recommended. Weighting the earlier layers (conv1_1 and conv2_1) more heavily produces larger style artifacts in the final target image.

**Content and Style Weight**

Define alpha (content_weight) and beta (style_weight). Their ratio controls how strongly the final image is stylized. It is recommended to leave content_weight = 1 and adjust style_weight to reach the desired ratio.

In [12]:
# weights for each style layer
# weighting earlier layers more will result in *larger* style artifacts
# notice we are excluding `conv4_2` our content representation
style_weights = {'conv1_1': 1,
                 'conv2_1': 0.75,
                 'conv3_1': 0.2,
                 'conv4_1': 0.2,
                 'conv5_1': 0.2}

content_weight = 1  # alpha
style_weight = 1e3  # beta

# 5. Update Target and Calculate Losses

**Content Loss**

The content loss is calculated as the MSE between the target and the content feature in the 'conv4_2' layer.

**Style Loss**

The style loss measures how far the target image is from the style image. For each style layer, it is the MSE between the Gram matrix of the style image and the Gram matrix of the target image, scaled by that layer's style weight.

**Total Loss**

Finally, the total loss is the content loss weighted by alpha plus the style loss weighted by beta.
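The three losses described above can be sketched as a single function over feature dictionaries. This is a minimal sketch under stated assumptions, not the notebook's own code: `gram` repeats the `gram_matrix` computation, the division by `d * h * w` is an assumed normalization, and the random tensors merely stand in for real VGG19 features.

```python
import torch

def gram(t):
    # Same computation as the notebook's gram_matrix (batch size 1 assumed).
    _, d, h, w = t.size()
    flat = t.view(d, h * w)
    return torch.mm(flat, flat.t())

def total_loss(target_feats, content_feats, style_grams, style_weights,
               content_weight=1, style_weight=1e3):
    # Content loss: MSE between target and content features at conv4_2.
    content_loss = torch.mean(
        (target_feats['conv4_2'] - content_feats['conv4_2']) ** 2)

    # Style loss: weighted MSE between Gram matrices, summed over style layers.
    style_loss = 0
    for layer, weight in style_weights.items():
        _, d, h, w = target_feats[layer].shape
        target_gram = gram(target_feats[layer])
        layer_loss = weight * torch.mean((target_gram - style_grams[layer]) ** 2)
        style_loss = style_loss + layer_loss / (d * h * w)  # assumed normalization

    total = content_weight * content_loss + style_weight * style_loss
    return total, content_loss, style_loss

# Random stand-in features with made-up shapes (not real VGG19 activations).
torch.manual_seed(0)
style_weights = {'conv1_1': 1, 'conv2_1': 0.75}
shapes = {'conv1_1': (1, 4, 8, 8), 'conv2_1': (1, 8, 4, 4), 'conv4_2': (1, 16, 2, 2)}
content_feats = {name: torch.rand(shape) for name, shape in shapes.items()}
style_feats = {name: torch.rand(shape) for name, shape in shapes.items()}
style_grams = {name: gram(style_feats[name]) for name in style_weights}

# A target identical to the content image has zero content loss.
total, c_loss, s_loss = total_loss(content_feats, content_feats,
                                   style_grams, style_weights)
print(float(c_loss), float(s_loss), float(total))
```

The style Gram matrices are computed once up front because the style image never changes, whereas the target's Gram matrices must be recomputed at every step.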

In [70]:
import kagglehub
import torch
import torch.optim as optim
import numpy as np
import matplotlib.pyplot as plt
from torchvision import transforms, models
from PIL import Image
from io import BytesIO
import requests

# Define device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Download the dataset (this should be done outside of the load_image function)
# path = kagglehub.dataset_download("dmitryellison/egodekadem")
# print("Path to dataset files:", path)

import kagglehub

# Download latest version
path = kagglehub.dataset_download("duttasd28/image-style-transfergoogle-images")

print("Path to dataset files:", path)

Path to dataset files: /root/.cache/kagglehub/datasets/duttasd28/image-style-transfergoogle-images/versions/1
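The update loop for the target image is not shown above, so the following is a hedged sketch of how such a loop could look. Several things here are assumptions made to keep it small and runnable without downloading weights: a tiny frozen convolution stack stands in for VGG19, random tensors stand in for the content and style images, a single layer is used for style and a single layer for content, and the Adam optimizer, learning rate, and step count are illustrative choices.

```python
import torch
import torch.optim as optim
from torch import nn

torch.manual_seed(0)

# Stand-in for the frozen VGG19 features (assumption: avoids a weight download).
extractor = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
)
for param in extractor.parameters():
    param.requires_grad_(False)

def features_of(image):
    # Record the output of module '0' as style features and '2' as content features.
    feats, x = {}, image
    for name, layer in extractor._modules.items():
        x = layer(x)
        if name == '0':
            feats['style'] = x
        elif name == '2':
            feats['content'] = x
    return feats

def gram(t):
    _, d, h, w = t.size()
    flat = t.view(d, h * w)
    return torch.mm(flat, flat.t())

# Placeholder images: a darker "content" and a brighter "style".
content = 0.5 * torch.rand(1, 3, 32, 32)
style = 0.5 + 0.5 * torch.rand(1, 3, 32, 32)

content_feats = features_of(content)
style_gram = gram(features_of(style)['style'])

# The target starts as a copy of the content image and is the only tensor optimized.
target = content.clone().requires_grad_(True)
optimizer = optim.Adam([target], lr=0.003)

content_weight, style_weight = 1, 1e3
losses = []
for step in range(30):
    target_feats = features_of(target)
    content_loss = torch.mean(
        (target_feats['content'] - content_feats['content']) ** 2)
    _, d, h, w = target_feats['style'].shape
    style_loss = torch.mean(
        (gram(target_feats['style']) - style_gram) ** 2) / (d * h * w)
    total = content_weight * content_loss + style_weight * style_loss

    optimizer.zero_grad()
    total.backward()
    optimizer.step()
    losses.append(total.item())

print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```

In the real notebook the extractor would be the frozen `vgg`, the features would come from `get_features`, and the style loss would sum over all layers in `style_weights`. Far more steps are typically needed for a visible result.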


-----------------------------------------------------------------
# 6. Check the last result

In [30]:
import kagglehub

# Download latest version
path = kagglehub.dataset_download("tamajitbhattacharjee/neural-style-transfer")

print("Path to dataset files:", path)

Downloading from https://www.kaggle.com/api/v1/datasets/download/tamajitbhattacharjee/neural-style-transfer?dataset_version_number=1...


100%|██████████| 2.04M/2.04M [00:00<00:00, 111MB/s]

Extracting files...
Path to dataset files: /root/.cache/kagglehub/datasets/tamajitbhattacharjee/neural-style-transfer/versions/1
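To inspect the result, the content, style, and target images can be shown side by side. The sketch below assumes each image has already been converted to an H x W x 3 array in [0, 1], which is what `im_convert` returns. Random arrays are used here as placeholders for `im_convert(content)`, `im_convert(style)`, and `im_convert(target)`.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Placeholders for the arrays returned by im_convert (assumption).
images = {"Content": rng.random((64, 64, 3)),
          "Style": rng.random((64, 64, 3)),
          "Result": rng.random((64, 64, 3))}

fig, axes = plt.subplots(1, len(images), figsize=(12, 4))
for ax, (title, image) in zip(axes, images.items()):
    ax.imshow(image)     # expects float values in [0, 1]
    ax.set_title(title)
    ax.axis("off")
plt.show()
```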



