# **Historical style generator**

#### **Students – Leor Ariel Rose, Yahav Bar David**

#### **Academic advisor - Dr. Irina Rabaev**

This notebook contains our document style transfer model, along with explanations of how the model works.

##### Copyright 2018 The TensorFlow Authors.

In [1]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

### Import and configure modules

First, we import all the necessary libraries.


In [2]:
import os
import shutil
import numpy as np
import tensorflow as tf
from typing import List
from tensorflow import keras
from PIL import Image
from shutil import copyfile
from google.colab import drive

Next, let's mount the Google Drive that holds our data and the folders where results are saved.


In [3]:
drive.mount('/content/drive')

Mounted at /content/drive


### Define global variables

Choose intermediate layers from the network to represent the style and content of the image:


In [4]:
# best intermediate layers for content representation
content_layers: List[str] = ['block3_conv1'] 

# best intermediate layers for style representation
style_layers: List[str] = ['block1_conv1', 'block2_conv1', 'block3_conv1', 'block4_conv1', 'block5_conv1']

# get layers length
num_content_layers: int = len(content_layers)
num_style_layers: int = len(style_layers)

Create an optimizer. The paper recommends LBFGS, but `Adam` works okay, too:

In [5]:
opt: tf.optimizers.Adam = tf.optimizers.Adam(learning_rate=0.02, beta_1=0.99, epsilon=1e-1)
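
As a rough illustration of what one optimizer step does to the image, the sketch below performs a single textbook Adam update in NumPy with the hyperparameters used above. It is only a sketch under stated assumptions: `adam_step` is a hypothetical helper, `beta_2=0.999` is assumed to be the Keras default, and TensorFlow's internal update may apply `epsilon` slightly differently.

```python
import numpy as np

def adam_step(x, grad, m, v, t, lr=0.02, beta_1=0.99, beta_2=0.999, eps=1e-1):
    """One textbook Adam update; returns the new (x, m, v)."""
    # update the running averages of the gradient and the squared gradient
    m = beta_1 * m + (1.0 - beta_1) * grad
    v = beta_2 * v + (1.0 - beta_2) * grad ** 2
    # correct the bias caused by starting both averages at zero
    m_hat = m / (1.0 - beta_1 ** t)
    v_hat = v / (1.0 - beta_2 ** t)
    # move against the gradient, scaled by the gradient magnitude estimate
    x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
    return x, m, v

x = np.array([0.5, 0.5])        # two example pixel values
grad = np.array([1.0, -1.0])    # made-up gradient for illustration
x, m, v = adam_step(x, grad, m=np.zeros(2), v=np.zeros(2), t=1)
print(x)
```

With a relatively large `eps` such as `1e-1`, the step shrinks noticeably for pixels whose gradients are small, which tends to damp noisy updates.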

### Define helper functions

Define a function to convert a tensor to an image:

In [6]:
def tensor_to_image(tensor: tf.Tensor) -> Image.Image:
  """
  Function to convert a tensor to an image
  
  Args:
    tensor (tf.Tensor): tensor representing an image, with values in [0, 1]

  Returns:
    PIL.Image.Image: Pillow image object
  """ 
  # multiply by 255 to undo normalization
  tensor = tensor*255
  # convert to a uint8 array
  array: np.ndarray = np.array(tensor, dtype=np.uint8)
  # drop the batch dimension, leaving only the image
  if np.ndim(array)>3:
    assert array.shape[0] == 1
    array = array[0]
  # build an image from the array
  return Image.fromarray(array)

Define a function to load an image and rescale it so that its longer side is 512 pixels.

In [7]:
def load_img(path_to_img: str) -> tf.Tensor:
  """
  Function to load an image and limit its maximum dimension to 512 pixels
  
  Args:
    path_to_img (str): image path
    
  Returns:
    tf.Tensor: a tensor that represents an image
  """
  # set max dimensions
  max_dim: int = 512
  # read document
  img: tf.Tensor = tf.io.read_file(path_to_img)
  # decode to image
  img: tf.Tensor = tf.image.decode_image(img, channels=3)
  # convert to float, each val between [0,1]
  img: tf.Tensor = tf.image.convert_image_dtype(img, tf.float32)

  # get image height and width
  shape: tf.Tensor = tf.cast(tf.shape(img)[:-1], tf.float32)
  # get the longer side (height or width)
  long_dim: tf.Tensor = tf.reduce_max(shape)
  # get the scale ratio that maps the longer side to max_dim
  scale: tf.Tensor = max_dim / long_dim

  # get the new image shape by the scale ratio
  new_shape: tf.Tensor = tf.cast(shape * scale, tf.int32)

  # resize image
  img: tf.Tensor = tf.image.resize(img, new_shape)
  
  # expand dimension to image array
  img: tf.Tensor = img[tf.newaxis, :]
  return img
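
To make the resizing arithmetic concrete, here is a minimal pure-Python sketch of the shape computation in `load_img`. The helper name `scaled_shape` is hypothetical, and the sketch uses Python floats, so results may differ by a pixel from the `float32` arithmetic TensorFlow performs.

```python
def scaled_shape(height, width, max_dim=512):
    """Return (height, width) after scaling the longer side to max_dim."""
    scale = max_dim / max(height, width)
    # int() truncates toward zero, as tf.cast(..., tf.int32) does for positive values
    return int(height * scale), int(width * scale)

print(scaled_shape(1024, 768))  # (512, 384)
print(scaled_shape(256, 128))   # (512, 256)
```

The second call shows that images smaller than 512 pixels are enlarged, because the scale factor is applied unconditionally.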

Since this is a float image, define a function to keep the pixel values between 0 and 1:

In [8]:
def clip_0_1(image: tf.Tensor) -> tf.Tensor:
  """
  Function to keep the pixel values between 0 and 1

  Args:
      image (tf.Tensor): a tensor that represents an image

  Returns:
       tf.Tensor: a clipped tensor
  """
  return tf.clip_by_value(image, clip_value_min=0.0, clip_value_max=1.0)
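
The clipping matters because gradient updates can push pixel values slightly outside [0, 1], and the image is later scaled by 255 and cast to `uint8`. The NumPy sketch below shows the combined effect on a few example values; `to_uint8` is a hypothetical helper written for this illustration, not part of the notebook.

```python
import numpy as np

def to_uint8(image):
    """Clip a float image to [0, 1], then scale it to 8-bit pixel values."""
    clipped = np.clip(image, 0.0, 1.0)
    # astype truncates the fractional part, so 127.5 becomes 127
    return (clipped * 255).astype(np.uint8)

pixels = np.array([-0.2, 0.0, 0.5, 1.0, 1.3])
print(to_uint8(pixels))  # values: 0, 0, 127, 255, 255
```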

Define a function that computes the Gram matrix.

The content of an image is represented by the values of the intermediate feature maps.

The style of an image can be described by the means and correlations across the different feature maps. A Gram matrix captures this information: take the outer product of the feature vector with itself at each location, then average that outer product over all locations. The Gram matrix for a particular layer is:

$$G^l_{cd} = \frac{\sum_{ij} F^l_{ijc}(x)F^l_{ijd}(x)}{IJ}$$

This can be implemented concisely using the `tf.linalg.einsum` function:

In [9]:
def gram_matrix(input_tensor: tf.Tensor) -> tf.Tensor:
  """
  Function to calculate the Gram matrix of an image tensor (feature-wise outer product)

  Args:
    input_tensor (tf.Tensor): a tensor that represents an image

  Returns:
    tf.Tensor: the gram matrix result

  """
  # calculate the outer product of the feature vector with itself at each location
  result: tf.Tensor = tf.linalg.einsum('bijc,bijd->bcd', input_tensor, input_tensor)
  # get tensor shape
  input_shape: tf.Tensor = tf.shape(input_tensor)
  # calculate the number of locations used to average the outer product
  num_locations: tf.Tensor = tf.cast(input_shape[1]*input_shape[2], tf.float32)
  # return averaging outer product over all locations
  return result/(num_locations)
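
As a sanity check on the formula, the NumPy sketch below computes the Gram matrix with explicit loops (one outer product per location) and compares it with the same `einsum` expression used above. `gram_matrix_loops` is a hypothetical helper, and the random features stand in for a real feature map.

```python
import numpy as np

def gram_matrix_loops(features):
    """Gram matrix from the formula: average of per-location outer products."""
    batch, height, width, channels = features.shape
    gram = np.zeros((batch, channels, channels))
    for b in range(batch):
        for i in range(height):
            for j in range(width):
                f = features[b, i, j]        # feature vector at one location
                gram[b] += np.outer(f, f)    # accumulate F_ijc * F_ijd
    return gram / (height * width)           # divide by the number of locations IJ

rng = np.random.default_rng(0)
features = rng.normal(size=(1, 4, 5, 3))     # (batch, I, J, channels)
gram_einsum = np.einsum('bijc,bijd->bcd', features, features) / (4 * 5)
print(np.allclose(gram_matrix_loops(features), gram_einsum))  # True
```

The result has shape `(batch, channels, channels)` and is symmetric, since entry `(c, d)` and entry `(d, c)` sum the same products.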

### Build the model 
The following function builds a VGG19 model that returns a list of intermediate layer outputs:

In [10]:
def vgg_layers(layer_names: List[str]) ->  tf.keras.Model:
  """ 
  Function to create a vgg model that returns a list of intermediate output values.

  Args:
    layer_names (List[str]): names of the intermediate layers whose outputs are returned
    
  Returns:
     tf.keras.Model:  a vgg model that returns a list of intermediate output values
  
  """
  # Build a VGG19 model loaded with pre-trained ImageNet weights
  vgg: tf.keras.Model = tf.keras.applications.VGG19(include_top=False, weights='imagenet')
  
  # set model to not train itself
  vgg.trainable = False
  
  # get output of intermediate layers
  outputs: List[tf.Tensor] = [vgg.get_layer(name).output for name in layer_names]

  # create a vgg model
  model: tf.keras.Model = tf.keras.Model([vgg.input], outputs)
  return model

### Extract style and content


Build a model that returns the style and content tensors.

In [11]:
class StyleContentModel(tf.keras.models.Model):
  """
  Model that returns the style and content tensors.
  """
  def __init__(self, style_layers, content_layers):
    # init super
    super(StyleContentModel, self).__init__()
    # get a vgg model that returns a list of intermediate output values.
    self.vgg =  vgg_layers(style_layers + content_layers)
    # set style layers
    self.style_layers = style_layers
    # set content layers
    self.content_layers = content_layers
    # set number of style layers
    self.num_style_layers = len(style_layers)
    # set model to be untrainable
    self.vgg.trainable = False

  def call(self, inputs: tf.Tensor) -> dict:
    """
    Override the call method of the model to return the style and content tensors.

    Args:
        inputs (tf.Tensor): a tensor that represents an image, expects float input in [0,1]

    Returns:
       dict: content feature maps and style Gram matrices, each keyed by layer name
    """
    # undo normalization of image
    inputs: tf.Tensor  = inputs*255.0
    
    # pre process for vgg19 model
    preprocessed_input: tf.Tensor = tf.keras.applications.vgg19.preprocess_input(inputs)
    
    # forward pass image
    outputs: List[tf.Tensor] = self.vgg(preprocessed_input)
    
    # get layers style and content output
    style_outputs, content_outputs = (outputs[:self.num_style_layers], 
                                      outputs[self.num_style_layers:])

    # calc gram matrix for style output
    style_outputs: List[tf.Tensor] = [gram_matrix(style_output) for style_output in style_outputs]

    # get content layers value as dict that key is the layer name
    content_dict: dict = {content_name:value for content_name, value in zip(self.content_layers, content_outputs)}

    # get style layers value as dict that key is the layer name
    style_dict: dict = {style_name:value for style_name, value in zip(self.style_layers, style_outputs)}

    # return the style and content tensors
    return {'content':content_dict, 'style':style_dict}
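
The slicing in `call` relies on the VGG model returning its outputs in the order the layer names were requested (style layers first, then content layers). The pure-Python sketch below illustrates that bookkeeping with placeholder strings instead of tensors; the `demo_` names and the shortened layer lists are assumptions made for the example.

```python
demo_style_layers = ['block1_conv1', 'block2_conv1']
demo_content_layers = ['block3_conv1']

# stand-ins for the tensors the VGG model would return, in request order
outputs = [f"features({name})" for name in demo_style_layers + demo_content_layers]

# the first len(style) outputs belong to style, the rest to content
num_style = len(demo_style_layers)
style_outputs = outputs[:num_style]
content_outputs = outputs[num_style:]

# key each output by its layer name
style_dict = dict(zip(demo_style_layers, style_outputs))
content_dict = dict(zip(demo_content_layers, content_outputs))
print({'content': content_dict, 'style': style_dict})
```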

### Apply style transfer
Build a model that runs the style transfer process.

In [12]:
class DocumentStyleTransfer():
  """
  Model to apply document style transfer
  """

  def __init__(self, content_path, style_path):
    self.content_image: tf.Tensor = load_img(content_path)
    self.style_image: tf.Tensor = load_img(style_path)
    self.extractor: StyleContentModel =  StyleContentModel(style_layers, content_layers)
    self.style_targets: tf.Tensor = self.extractor(self.style_image)['style']
    self.content_targets: tf.Tensor = self.extractor(self.content_image)['content']
    self.image: tf.Tensor = tf.Variable(self.content_image)


  def style_content_loss(self, outputs: dict, style_weight: float, content_weight: float) -> tf.Tensor:
    """
    Function to calculate style and content loss
    
    Args:
        outputs (dict): style and content tensors
        style_weight (float): weight of the style loss
        content_weight (float): weight of the content loss

    Returns:
        tf.Tensor: weighted sum of the style and content losses
    """
    # get style Gram matrices, keyed by layer name
    style_outputs: dict = outputs['style']
    
    # get content feature maps, keyed by layer name
    content_outputs: dict = outputs['content']
    
    # calc style loss
    style_loss: tf.Tensor = tf.add_n([tf.reduce_mean((style_outputs[name]-self.style_targets[name])**2) for name in style_outputs.keys()])
    
    # average loss by layers
    style_loss *= style_weight / num_style_layers

    # calculate content loss
    content_loss: tf.Tensor = tf.add_n([tf.reduce_mean((content_outputs[name]-self.content_targets[name])**2) for name in content_outputs.keys()])
    
    # average loss by layers
    content_loss *= content_weight / num_content_layers
    
    # add style and content loss
    loss: tf.Tensor = style_loss + content_loss
    return loss


  @tf.function()
  def train_step(self, image, style_weight: float, content_weight: float, total_variation_weight: float):
    # record operations on the image so the loss can be differentiated
    with tf.GradientTape() as tape:
      # get image style and content tensors
      outputs: dict = self.extractor(image)
      # calculate style and content loss
      loss: tf.Tensor = self.style_content_loss(outputs, style_weight, content_weight)
      # add total variation loss
      loss += total_variation_weight*tf.image.total_variation(image)

    # calculate the gradient of the loss with respect to the image
    grad = tape.gradient(loss, image)
    # apply one optimizer step to the image
    opt.apply_gradients([(grad, image)])
    # clip the updated image back to [0, 1]
    image.assign(clip_0_1(image))


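  def render_image(self, results_dir:str = './', total_variation_weight:float = 1e-6, style_weight:float = 1e-6, content_weight:float = 2.5e-8, iter_save:int = 4000) -> None:
    """
    Method to render a style transfer image from content and style

    Args:
        results_dir (str): directory for the results; deleted and recreated if it already exists
        total_variation_weight (float): weight of the total variation loss
        style_weight (float): weight of the style loss
        content_weight (float): weight of the content loss
        iter_save (int): save an intermediate result every iter_save iterations
    """
    # create directory for results
    if os.path.exists(results_dir):
        shutil.rmtree(results_dir)
    os.mkdir(results_dir)
    
    # set the number of optimization iterations
    iterations: int = 4000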
    for i in range(1, iterations + 1):
      # apply a step in training process
      self.train_step(self.image, style_weight, content_weight, total_variation_weight)

      # for each iter_save iteration output results
      if i % iter_save == 0:
        print(f"iteration: {i}")
        file_name: str = f"{results_dir}/result_at_iteration_{i}.png"
        tensor_to_image(self.image).save(file_name) 
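
To show how the three loss terms in `train_step` combine, the NumPy sketch below evaluates them on tiny hand-made arrays. The helper names, layer keys (`s1`, `c1`) and numbers are all made up for illustration, and `total_variation` here sums over the whole batch, which should match `tf.image.total_variation` only for a batch of one image.

```python
import numpy as np

def mean_squared_difference(outputs, targets):
    """Sum over layers of the mean squared difference to the target."""
    return sum(np.mean((outputs[name] - targets[name]) ** 2) for name in outputs)

def total_variation(image):
    """Sum of absolute differences between neighbouring pixels, shape (1, H, W, C)."""
    vertical = np.abs(image[:, 1:, :, :] - image[:, :-1, :, :]).sum()
    horizontal = np.abs(image[:, :, 1:, :] - image[:, :, :-1, :]).sum()
    return vertical + horizontal

# toy style (Gram) and content outputs with their targets
style_out = {'s1': np.array([[1.0, 2.0], [2.0, 3.0]])}
style_tgt = {'s1': np.array([[0.0, 2.0], [2.0, 1.0]])}
content_out = {'c1': np.ones((1, 2, 2, 1))}
content_tgt = {'c1': np.zeros((1, 2, 2, 1))}
image = np.array([[[[0.0], [1.0]], [[1.0], [0.0]]]])  # shape (1, 2, 2, 1)

style_weight, content_weight, tv_weight = 1e-6, 2.5e-8, 1e-6
# each term is averaged over its layers, then weighted
style_loss = style_weight * mean_squared_difference(style_out, style_tgt) / len(style_out)
content_loss = content_weight * mean_squared_difference(content_out, content_tgt) / len(content_out)
loss = style_loss + content_loss + tv_weight * total_variation(image)
print(style_loss, content_loss, loss)
```

Here the checkerboard image has a total variation of 4, so the smoothness term dominates the toy loss; on real images the balance depends on the weights chosen for each experiment.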

### Run the process

Next, we define our content and style loss weights in order to control the ratio between them.

In [13]:
# ratio content_weight / style_weight
# from the paper 1 × 10^−3, 8 × 10^−4, , 5 × 10^−3, 5 × 10^−4
# weights
content_weights: List[float] = [1.00E-07, 1.00E-08, 1.00E-09, 1.00E-10, 1.00E-06, 1.00E-06, 1.00E-06, 1.00E-06, 1.00E-06, 1e8]
style_weights: List[float] =   [1.00E-06, 1.00E-06, 1.00E-06, 1.00E-06, 1.00E-06, 1.00E-07, 1.00E-08, 1.00E-09, 1.00E-10, 1e-2]
total_variation_weight:float = 1.00E-06
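
Since what matters is the ratio between the two weights, it can help to print the ratio each experiment actually uses. The short sketch below recomputes it from the lists above; the `ratios` name is introduced only for this example.

```python
content_weights = [1e-7, 1e-8, 1e-9, 1e-10, 1e-6, 1e-6, 1e-6, 1e-6, 1e-6, 1e8]
style_weights = [1e-6, 1e-6, 1e-6, 1e-6, 1e-6, 1e-7, 1e-8, 1e-9, 1e-10, 1e-2]

# content/style ratio for each experiment index k
ratios = [cw / sw for cw, sw in zip(content_weights, style_weights)]
for k, ratio in enumerate(ratios):
    print(f"experiment {k}: content/style = {ratio:.0e}")
```

The first four experiments favour style (ratio below 1), the fifth weighs both equally, and the remaining ones favour content.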

Now, for each pair of content and style weights, we apply document style transfer and save the result together with the experiment parameters.

In [None]:
experements_path: str = '/content/drive/MyDrive/final_project/experiments/content_modern_hebrew/style_middle_ages_hebrew/content_is_handwritten_document/style_is_document_with_text/our_model/block3_conv1'
experement_number: int = 1

# for each image in content
for i in os.listdir('/content/content'):
  # for each image in style
  for j in os.listdir('/content/style'):
    print(f"content:{i}, style:{j}")
    print(f"experement: {experement_number}")

    if not i.startswith('.') and not j.startswith("."):
      # experements
      for k in range(0, len(content_weights)):
        content_path = f'/content/content/{i}'
        style_path = f'/content/style/{j}'
        experement_dir: str = f"{experements_path}/test {experement_number}/experement{k}"

        if os.path.exists(experement_dir):
              shutil.rmtree(experement_dir)
        os.makedirs(experement_dir)
        
        DocumentStyleTransfer(content_path, style_path).render_image(experement_dir, total_variation_weight, style_weights[k], content_weights[k], 50)
        
        # create a readme of the experiment inside the folder
        with open(f"{experement_dir}/readme.txt", 'w') as readme:
          readme.write(f"experementNumber = {k}\n")
          readme.write("from = modern hebrew\n")
          readme.write("to = middle ages hebrew\n")
          readme.write(f"totalVariationWeight = {total_variation_weight}\n")
          readme.write(f"styleWeight = {style_weights[k]}\n")
          readme.write(f"contentWeight = {content_weights[k]}\n")
        
        # add content and style to the experiment folder
        copyfile(content_path, f"{experement_dir}/content.png")
        copyfile(style_path, f"{experement_dir}/style.png")
      experement_number += 1

content:1.png, style:1.png
experement: 1
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg19/vgg19_weights_tf_dim_ordering_tf_kernels_notop.h5
iteration: 50
iteration: 100
iteration: 150
iteration: 200
iteration: 250
iteration: 300
iteration: 350
iteration: 400
iteration: 450
iteration: 500
iteration: 550
iteration: 600
iteration: 650
iteration: 700
iteration: 750
iteration: 800
iteration: 850
iteration: 900
iteration: 950
iteration: 1000
iteration: 1050
iteration: 1100
iteration: 1150
iteration: 1200
iteration: 1250
iteration: 1300
iteration: 1350
iteration: 1400
iteration: 1450
iteration: 1500
iteration: 1550
iteration: 1600
iteration: 1650
iteration: 1700
iteration: 1750
iteration: 1800
iteration: 1850
iteration: 1900
iteration: 1950
iteration: 2000
iteration: 2050
iteration: 2100
iteration: 2150
iteration: 2250
iteration: 2300
iteration: 2350
iteration: 2400
iteration: 2450
iteration: 2500
iteration: 2550
iteration: 2600
iteration: 2650
iteratio

In [None]:
os.listdir('/content/style')