## 🧠 Introduction to Neural Style Transfer

Neural Style Transfer is a computer vision technique that allows us to combine the **content of one image** with the **style of another** to produce a new, artistic image.

In this project, we use the **VGG19** convolutional neural network pre-trained on ImageNet to extract:

- **Content features** (structure and layout) from a **content image**
- **Style features** (color, texture, and brushstrokes) from a **style image**

Then, we optimize a new image to minimize a **loss function** that balances:

- **Content loss**: how different the generated image is from the content image
- **Style loss**: how different the textures and colors are from the style image

This notebook implements the entire process step-by-step, using **TensorFlow** and **Keras**.

---

### 📝 Steps Overview:

1. Load and preprocess content & style images
2. Extract feature maps from VGG19 layers
3. Compute Gram matrices for style
4. Define loss functions
5. Optimize an image starting from the content image
6. Visualize stylized results



-----

## 📦 Step 1: Import Dependencies

We begin by importing essential Python libraries:

- **TensorFlow**: For deep learning and model operations (VGG19)
- **NumPy**: For numerical operations on image tensors
- **Matplotlib**: To visualize images during training
- **PIL (Python Imaging Library)**: To load and manipulate images
- **os / warnings**: To silence TensorFlow log messages and Python warnings so the notebook output stays clean


In [1]:
# 📦 Import Required Libraries

import os
import warnings

# Silence TensorFlow's C++ logging. This must be set before TensorFlow is imported to take effect.
# Levels: 0 = show all, 1 = hide INFO, 2 = hide INFO and WARNING, 3 = hide INFO, WARNING, and ERROR
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
warnings.filterwarnings('ignore')  # Hide Python warnings for cleaner output

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image


-----

## 🖼️ Step 2: Load and Preprocess Images

We define a utility function `load_image()` that:

- Opens the image using **PIL**
- Converts it to **RGB**
- Downscales it so that neither side exceeds `max_dim` (default: 300 px) while maintaining the aspect ratio; images that are already smaller are left unchanged
- Normalizes the pixel values to the [0, 1] range
- Adds a batch dimension so it can be fed into the VGG19 model

This ensures both the content and style images are in the correct format for processing.
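The normalization and batch-dimension steps can be illustrated without loading a file. The following is a minimal NumPy sketch; the helper name `to_model_input` and the tiny hand-written image are made up for illustration and are not part of the notebook.

```python
import numpy as np

def to_model_input(pixels_uint8):
    """Scale an (H, W, 3) uint8 array to float32 in [0, 1] and add a batch axis."""
    scaled = pixels_uint8.astype(np.float32) / 255.0
    return scaled[np.newaxis, ...]  # shape becomes (1, H, W, 3)

# Hypothetical 2x3 RGB image standing in for a decoded photo
fake_pixels = np.array(
    [[[0, 0, 0], [128, 128, 128], [255, 255, 255]],
     [[255, 0, 0], [0, 255, 0], [0, 0, 255]]],
    dtype=np.uint8,
)

batch = to_model_input(fake_pixels)
print(batch.shape, batch.dtype, batch.min(), batch.max())
```

The leading axis of size 1 is what lets a single image be passed to a model that expects a batch of images.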


In [2]:
def load_image(path_to_img, max_dim=300):
    img = Image.open(path_to_img)                  # Open the image file
    img = img.convert('RGB')                       # Ensure it's in RGB format
    img.thumbnail((max_dim, max_dim))              # Resize while preserving aspect ratio
    img = np.array(img)                            # Convert to NumPy array
    img = img[tf.newaxis, ...] / 255.0             # Add batch dimension & normalize [0,1]
    return tf.convert_to_tensor(img, dtype=tf.float32)  # Convert to TensorFlow tensor

In [None]:
content_image = load_image("content.jpg")
style_image = load_image("style.jpg")


----

## 🔍 Step 3: Load Pre-trained VGG19

We load the **VGG19** convolutional neural network from Keras applications.

- `include_top=False` removes the final fully connected layers, as we only need intermediate **feature maps**.
- `weights='imagenet'` loads pre-trained weights that have learned rich feature representations from the ImageNet dataset.
- `vgg.trainable = False` ensures the model parameters are **frozen** — we’re only using it as a **feature extractor**, not for training.

This model will provide the content and style representations for our input images.
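One detail worth keeping in mind: the Keras VGG19 weights were trained on inputs prepared by `tf.keras.applications.vgg19.preprocess_input`, which expects RGB values on the 0-255 scale, reorders the channels to BGR, and subtracts the ImageNet channel means without further scaling. This notebook passes `[0, 1]` images to the network directly. The sketch below is a NumPy illustration of that preprocessing, not the Keras function itself; the name `caffe_style_preprocess` is hypothetical.

```python
import numpy as np

# ImageNet channel means in BGR order, as used by Keras' "caffe"-style preprocessing
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(batch_01):
    """Map an RGB batch in [0, 1] to zero-centred BGR values on the 0-255 scale."""
    scaled = np.asarray(batch_01, dtype=np.float32) * 255.0  # back to the 0-255 scale
    bgr = scaled[..., ::-1]                                  # RGB -> BGR
    return bgr - IMAGENET_BGR_MEANS                          # subtract per-channel means

# A single pure-red pixel as a (1, 1, 1, 3) batch
red = np.array([[[[1.0, 0.0, 0.0]]]], dtype=np.float32)
print(caffe_style_preprocess(red))
```

If this were adopted, it would be applied to the image just before each extractor call, so that the optimized variable itself could stay in `[0, 1]` and the clipping step would remain valid. Whether it changes the stylized result noticeably is something to check experimentally.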


In [3]:
# Load pretrained VGG19 without the fully connected layers (for feature extraction)
vgg = tf.keras.applications.VGG19(include_top=False, weights='imagenet')

vgg.trainable = False  # Freeze the model weights; do not update during training

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg19/vgg19_weights_tf_dim_ordering_tf_kernels_notop.h5
[1m80134624/80134624[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 0us/step



----
## 🧬 Step 4: Select VGG19 Layers and Build Feature Extractors

To extract meaningful information from our images, we use specific layers of the **VGG19** model:

- **Content features**: captured from a deeper layer (`block4_conv2`) that encodes the structure and layout of the image.
- **Style features**: captured from multiple layers to encode textures, brushstrokes, and colors at various levels.

We then define a helper function that returns a new model that outputs activations from only the selected layers.
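To see why the five style layers capture "various levels", it helps to look at the feature-map shape each one produces. The sketch below is plain Python and does not load the model. It assumes the standard VGG19 layout: 3x3 convolutions with `same` padding, a 2x2 max-pool with stride 2 after each block, and 64, 128, 256, 512, 512 channels in blocks 1 to 5. The function name `style_feature_shapes` is made up for illustration.

```python
# Output channels of the first convolution in each VGG19 block
VGG19_STYLE_CHANNELS = {
    'block1_conv1': 64,
    'block2_conv1': 128,
    'block3_conv1': 256,
    'block4_conv1': 512,
    'block5_conv1': 512,
}

def style_feature_shapes(height, width):
    """Return the expected (height, width, channels) of each style layer's output."""
    shapes = {}
    for depth, (name, channels) in enumerate(VGG19_STYLE_CHANNELS.items()):
        # `depth` is the number of 2x2 poolings applied before this layer;
        # each pooling halves the spatial size, rounding down
        shapes[name] = (height // 2 ** depth, width // 2 ** depth, channels)
    return shapes

for name, shape in style_feature_shapes(300, 200).items():
    print(name, shape)
```

For a 300x200 input this gives (300, 200, 64) at `block1_conv1` down to (18, 12, 512) at `block5_conv1`: early layers see fine detail over many positions, later layers see coarse structure over few positions.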


In [4]:
content_layers = ['block4_conv2']  # Layer used to extract content features (high-level image structure)

style_layers = [   # Layers used to extract style features (textures, colors, patterns)
    'block1_conv1',
    'block2_conv1',
    'block3_conv1',
    'block4_conv1',
    'block5_conv1']


def vgg_layers(layer_names):
    outputs = [vgg.get_layer(name).output for name in layer_names]  # Get outputs from specified VGG layers
    model = tf.keras.Model([vgg.input], outputs)                   # Create new model returning these outputs
    return model

In [5]:
style_extractor = vgg_layers(style_layers)    # Model that outputs style layer activations
content_extractor = vgg_layers(content_layers)  # Model that outputs content layer activations


----

## 🧠 Step 5: Extract Feature Representations

To capture the **style** of an image, we use the **Gram matrix** of the feature maps. It measures correlations between different filter responses and effectively encodes textures and patterns.
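Written out, with notation introduced here for illustration: for a feature map $F$ of height $H$, width $W$, and $C$ channels, the entry of the Gram matrix for channels $c$ and $d$, normalized by the number of spatial locations as in this notebook's `gram_matrix()`, is

$$
G_{cd} = \frac{1}{H W} \sum_{i=1}^{H} \sum_{j=1}^{W} F_{ijc} \, F_{ijd}, \qquad c, d \in \{1, \dots, C\}.
$$

Because the sum runs over all positions, $G$ records which filters tend to fire together but discards where in the image they fire.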

We also define a function to extract:

- **Content features** using the `content_extractor`
- **Style features**, converted into **Gram matrices**, using the `style_extractor`

These will be used to compute the content and style loss during optimization.
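As a sanity check on the Gram matrix computation, the same `einsum` pattern can be run in NumPy on a small random tensor and compared against the flattened matrix product. This is a sketch with made-up data; `gram_matrix_np` is a hypothetical NumPy counterpart of the notebook's `gram_matrix()`.

```python
import numpy as np

def gram_matrix_np(features):
    """features: (batch, height, width, channels) -> (batch, channels, channels)."""
    # Sum channel-pair products over all spatial positions (i, j)
    summed = np.einsum('bijc,bijd->bcd', features, features)
    num_locations = features.shape[1] * features.shape[2]
    return summed / num_locations

rng = np.random.default_rng(0)
features = rng.normal(size=(1, 4, 5, 3))  # one 4x5 feature map with 3 channels
gram = gram_matrix_np(features)

# Equivalent view: flatten the spatial grid into rows and take F^T F / (H * W)
flat = features.reshape(-1, 3)
assert np.allclose(gram[0], flat.T @ flat / flat.shape[0])

print(gram.shape)  # (1, 3, 3)
```

The result keeps the batch axis, and its size depends only on the number of channels, not on the image dimensions.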


In [6]:
# Compute the Gram Matrix (Style Representation)
def gram_matrix(input_tensor):
    """
    Computes the Gram matrix of a 4D input tensor.

    Args:
        input_tensor (tf.Tensor): Tensor of shape (batch, height, width, channels)

    Returns:
        tf.Tensor: Gram matrices of shape (batch, channels, channels)
    """
    result = tf.linalg.einsum('bijc,bijd->bcd', input_tensor, input_tensor)  # Sum channel-pair products over spatial positions
    input_shape = tf.shape(input_tensor)
    num_locations = tf.cast(input_shape[1] * input_shape[2], tf.float32)
    return result / num_locations  # Normalize by number of spatial locations



# Extract Content and Style Features from Given Images
def get_feature_representations(model, content_img, style_img):
    """
    Extracts content and style features from the given images.

    Args:
        model: (unused) Kept for call-site compatibility; the global extractors are used instead
        content_img (tf.Tensor): Preprocessed content image
        style_img (tf.Tensor): Preprocessed style image

    Returns:
        tuple: (content_features, style_gram_matrices)
    """
    content_outputs = content_extractor(content_img)
    style_outputs = style_extractor(style_img)
    style_grams = [gram_matrix(output) for output in style_outputs]

    return content_outputs, style_grams


-----

## ⚖️ Step 6: Compute Total Loss

The total loss is a combination of:

- **Content Loss**: Measures how different the generated image is from the content image in terms of high-level structure.
- **Style Loss**: Measures how different the style (Gram matrices) of the generated image is from the target style image.

We also allow for **per-layer weights** to control how much each style layer contributes to the overall loss.

Finally, the total loss is calculated as:  

**Total Loss = (style_weight × style_loss) + (content_weight × content_loss)**
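The arithmetic behind this formula can be traced with toy numbers. The sketch below uses NumPy and plain lists instead of TensorFlow tensors and the layer-name dictionary; the function names and the tiny 2x2 "Gram matrices" are invented for illustration.

```python
import numpy as np

def weighted_style_loss(grams, targets, layer_weights):
    """Weighted mean-squared Gram difference, divided by the sum of the weights."""
    total = 0.0
    for gram, target, weight in zip(grams, targets, layer_weights):
        total += weight * np.mean((gram - target) ** 2)
    return total / sum(layer_weights)

def total_loss(style_loss, content_loss, style_weight=1e3, content_weight=1.0):
    """Combine the two terms with the global style/content weights."""
    return style_weight * style_loss + content_weight * content_loss

# Two hypothetical layers: the first is off by 1 everywhere, the second matches exactly
grams = [np.ones((2, 2)), np.zeros((2, 2))]
targets = [np.zeros((2, 2)), np.zeros((2, 2))]
layer_weights = [1.0, 0.5]

style = weighted_style_loss(grams, targets, layer_weights)  # (1.0 * 1 + 0.5 * 0) / 1.5
loss = total_loss(style, content_loss=2.0)
print(style, loss)
```

Dividing by the sum of the layer weights keeps the style term on a comparable scale when the per-layer weights are changed, so `style_weight` alone controls the overall style/content balance.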



In [7]:
# ⚖️ Compute Total Loss: Style + Content
def compute_loss(model, loss_weights, init_image, gram_style_features, content_features, style_layers, style_layer_weights):
    """
    Computes the total loss combining style and content loss.

    Args:
        model: (unused) Placeholder for consistency
        loss_weights (tuple): (style_weight, content_weight)
        init_image (tf.Variable): Current image being optimized
        gram_style_features (list): Target style Gram matrices
        content_features (list): Target content features
        style_layers (list): Names of style layers
        style_layer_weights (dict): Layer-wise weight for style loss

    Returns:
        tf.Tensor: Scalar loss value
    """
    input_tensor = tf.concat([init_image], axis=0)

    # Get activations
    style_output = style_extractor(input_tensor)
    content_output = content_extractor(input_tensor)

    style_score = 0
    content_score = 0

    style_weight, content_weight = loss_weights

    # Compute style loss with per-layer weights
    for i, (output, target) in enumerate(zip(style_output, gram_style_features)):
        gram_out = gram_matrix(output)
        layer_name = style_layers[i]
        weight = style_layer_weights.get(layer_name, 1.0)  # Default to 1.0
        style_score += weight * tf.reduce_mean(tf.square(gram_out - target))

    # Normalize style loss (optional but improves stability)
    total_style_weight = tf.reduce_sum(list(style_layer_weights.values()))
    style_score /= total_style_weight

    # Compute content loss
    for output, target in zip(content_output, content_features):
        content_score += tf.reduce_mean(tf.square(output - target))

    # Total loss
    loss = style_weight * style_score + content_weight * content_score

    return loss



-----

## 🚀 Step 7: Optimize the Image

We use **gradient-based optimization** to update the pixels of the generated image so that:

- It becomes more similar to the **content image** in structure.
- It mimics the **style image** in texture and color.

The `train_step()` function:

1. Computes the loss using our custom loss function.
2. Calculates gradients of the loss with respect to the image.
3. Applies those gradients using the **Adam optimizer**.
4. Clips the image values to keep them in the valid range [0, 1].
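The four steps above can be sketched on a toy problem in NumPy. This is not TensorFlow's optimizer: `adam_step` is a textbook Adam update written out by hand, and the "loss" is simply the mean squared distance to a made-up target, chosen so that the clipping step visibly matters.

```python
import numpy as np

def adam_step(x, grad, m, v, t, lr=0.003, beta1=0.9, beta2=0.999, eps=1e-7):
    """One textbook Adam update; returns the new (x, m, v)."""
    m = beta1 * m + (1 - beta1) * grad        # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction for step t
    v_hat = v / (1 - beta2 ** t)
    return x - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

def toy_loss_and_grad(x, target):
    """Mean squared error and its analytic gradient with respect to x."""
    diff = x - target
    return np.mean(diff ** 2), 2.0 * diff / diff.size

# The target lies outside [0, 1], so the update pushes pixels below zero
target = np.full((1, 2, 2, 3), -0.5)
image = np.full((1, 2, 2, 3), 0.002)
m = np.zeros_like(image)
v = np.zeros_like(image)

loss_before, grad = toy_loss_and_grad(image, target)  # steps 1-2: loss and gradient
image, m, v = adam_step(image, grad, m, v, t=1)       # step 3: apply the update
image = np.clip(image, 0.0, 1.0)                      # step 4: keep pixels valid
loss_after, _ = toy_loss_and_grad(image, target)
print(loss_before, loss_after, image.min())
```

On the first step Adam moves every pixel by roughly the learning rate, which takes the values slightly below zero; clipping then brings them back to exactly 0 so the image stays displayable.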


In [8]:
def train_step(image, loss_weights, gram_style_features, content_features, optimizer,
               style_layers, style_layer_weights):
    """
    Performs one optimization step on the image.

    Args:
        image (tf.Variable): Image being optimized
        loss_weights (tuple): (style_weight, content_weight)
        gram_style_features (list): Target style representations
        content_features (list): Target content representations
        optimizer (tf.optimizers.Optimizer): Optimizer instance
        style_layers (list): Names of style layers
        style_layer_weights (dict): Weights for each style layer
    """
    with tf.GradientTape() as tape:
        loss = compute_loss(None, loss_weights, image,
                            gram_style_features, content_features,
                            style_layers, style_layer_weights)

    # Compute gradients of the loss w.r.t. the image
    grad = tape.gradient(loss, image)

    # Apply the gradients to the image
    optimizer.apply_gradients([(grad, image)])

    # Ensure pixel values stay in valid range [0, 1]
    image.assign(tf.clip_by_value(image, 0.0, 1.0))



-----

## 🧪 Step 8: Stylization and Training Loop

We now run the optimization for several epochs, each consisting of a fixed number of steps. In each step, we:

1. Compute the style and content loss.
2. Update the image using the Adam optimizer.

At the end of each epoch, we display the current stylized image.

We also use:
- **Layer weights** to control the contribution of each style layer.
- `style_weight` and `content_weight` to balance the two objectives.


In [9]:
def show_image(tensor, title=''):
    img = tensor.numpy().squeeze()
    plt.imshow(img)
    plt.title(title)
    plt.axis('off')
    plt.show()

In [None]:
import time  # To measure training time

# Style layer weights (you can tune these for different effects)
style_layer_weights = {
    'block1_conv1': 1,
    'block2_conv1': 0.75,
    'block3_conv1': 0.2,
    'block4_conv1': 0.2,
    'block5_conv1': 0.2
}

# Style vs Content balance
style_weight = 1e3
content_weight = 1

# Initialize image (starting from content image)
init_image = tf.Variable(content_image)

# Optimizer
optimizer = tf.optimizers.Adam(learning_rate=0.003)

# Extract fixed features from content and style images
content_features, gram_style_features = get_feature_representations(vgg, content_image, style_image)

# Training settings
epochs = 10
steps_per_epoch = 200

start = time.time()

# Stylization loop
for epoch in range(epochs):

    start_epoch = time.time()

    for step in range(steps_per_epoch):
        train_step(
            init_image,
            (style_weight, content_weight),
            gram_style_features,
            content_features,
            optimizer,
            style_layers,
            style_layer_weights
        )

    end_epoch = time.time()

    print(f"Epoch {epoch + 1} completed in {end_epoch - start_epoch:.1f} seconds.")

    show_image(init_image, f"Stylized Image {epoch + 1}")

end = time.time()

print(f"✅ Total stylization time: {end - start:.1f} seconds")
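
To keep the result beyond the inline display, the optimized image can be converted back to 8-bit pixels, which reverses the normalization done in `load_image()`. The following is a minimal NumPy sketch; `to_uint8_image` is a hypothetical helper, not part of the notebook.

```python
import numpy as np

def to_uint8_image(batch_01):
    """Drop the batch axis and convert [0, 1] floats to 0-255 uint8 pixels."""
    img = np.squeeze(np.asarray(batch_01), axis=0)  # (1, H, W, 3) -> (H, W, 3)
    return np.clip(np.rint(img * 255.0), 0, 255).astype(np.uint8)

# A single made-up pixel as a (1, 1, 1, 3) batch
demo = np.array([[[[0.0, 0.2, 1.0]]]])
pixels = to_uint8_image(demo)
print(pixels.shape, pixels.dtype, pixels.tolist())
```

With the notebook's variables, something like `Image.fromarray(to_uint8_image(init_image.numpy())).save("stylized.png")` should write the result to disk; the filename is a placeholder.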