# Gradient Ascent

Generate a synthetic image that maximally activates a neuron.

$$I^{*} = \arg\max_{I} \left[ f(I) + R(I) \right]$$

where $f(I)$ is the value (activation) of the neuron and $R(I)$ is a natural-image regularizer.
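As a minimal sketch of this objective, the toy example below runs gradient ascent on a made-up "neuron". The linear neuron $f(I) = w \cdot I$, the regularizer $R(I) = -\lambda \lVert I \rVert^2$, and the function name `gradient_ascent` are assumptions chosen for illustration, not part of the Inception model used later. For this toy objective the maximizer is known in closed form, $I^{*} = w / (2\lambda)$, so the result is easy to check.

```python
import numpy as np

def gradient_ascent(w, lam=0.5, step=0.1, n_iter=200):
    """Maximize f(I) + R(I) for the toy choice f(I) = w . I, R(I) = -lam * ||I||^2."""
    image = np.zeros_like(w, dtype=np.float64)
    for _ in range(n_iter):
        # Gradient of the full objective with respect to the image
        grad = w - 2.0 * lam * image
        # Ascent: step along the gradient (the "weights" w stay fixed)
        image += step * grad
    return image

w = np.array([1.0, -2.0, 3.0])      # hypothetical neuron weights
i_star = gradient_ascent(w)         # converges towards w / (2 * lam) = w
```

With `lam=0.5` the closed-form optimum equals `w` itself, which the iterate approaches geometrically.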

## Deep Dream

Making the "dream" images is very simple. Essentially, it is a gradient ascent process that tries to maximize the activations of a particular CNN layer. The original description maximizes the L2 norm of the activations; the code in this notebook maximizes the mean value of the given layer tensor instead. The optimization resembles backpropagation, but instead of adjusting the network weights, the weights are held fixed and the input image is adjusted.

Here are a few simple tricks that were found useful for getting good images:

- Offset image by a random jitter
- Normalize the magnitude of gradient ascent steps
- Apply ascent across multiple scales (octaves)

### Algorithm

Compute the gradient of the mean activation of a chosen layer (or channel) of the network with respect to the input image. The gradient is then added to the input image, which increases that mean value. Repeating this step a number of times amplifies whatever patterns the Inception model sees in the input image.

Google's implementation applies gradient ascent at several image scales, which it calls "octaves".
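A simplified sketch of multi-scale ascent is shown below. It is not Google's code: it only runs ascent from the coarsest scale to the finest and upsamples in between, and it omits any re-injection of fine detail lost during downscaling. The helper names, the nearest-neighbour resize, and the defaults (`n_octaves=3`, `octave_scale=1.4`) are assumptions for illustration, and `grad_fn` again stands in for a call that returns the input gradient.

```python
import numpy as np

def resize_nearest(image, new_h, new_w):
    """Nearest-neighbour resize of an (H, W, C) array (illustrative helper)."""
    rows = np.arange(new_h) * image.shape[0] // new_h
    cols = np.arange(new_w) * image.shape[1] // new_w
    return image[rows][:, cols]

def ascend_over_octaves(image, grad_fn, n_octaves=3, octave_scale=1.4,
                        n_iter=10, step_size=1.0):
    """Run normalized gradient ascent at each scale, coarsest first."""
    h, w = image.shape[:2]
    # Target sizes from the smallest octave up to the original resolution
    sizes = [(max(1, int(round(h / octave_scale ** k))),
              max(1, int(round(w / octave_scale ** k))))
             for k in range(n_octaves - 1, -1, -1)]
    current = image
    for new_h, new_w in sizes:
        # Move the running result to the next scale
        current = resize_nearest(current, new_h, new_w)
        for _ in range(n_iter):
            grad = grad_fn(current)
            current = current + step_size * grad / (grad.std() + 1e-8)
    return current

rng = np.random.default_rng(0)
start = rng.uniform(size=(32, 48, 3))
dreamed = ascend_over_octaves(start, grad_fn=lambda x: x)  # toy gradient
```

Starting at a coarse scale lets the ascent lay down large structures first; later octaves then add finer patterns on top of them.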

In [57]:
import os
import zipfile
import PIL.Image

import numpy as np
import tensorflow as tf
from IPython.display import Image, display

`tf.app.flags`: a TensorFlow 1.x module for defining global flags whose values are parsed from the command line.

In [2]:
FLAGS = tf.app.flags.FLAGS

tf.app.flags.DEFINE_string('inception', './data/inception', 
                           help='Directory for storing Inception Network.')

tf.app.flags.DEFINE_string('jpeg', 'deep-dream.jpg', 
                           help='Where to save the resulting JPEG.')

# if __name__ == '__main__':
#   tf.app.run()

### Utility Functions

In [55]:
def get_layer(layer):
    """
    Helper for getting layer output (Tensor) in model Graph.
    
    Args:
        layer: str, layer name
    
    Returns:
        Tensor for the given layer name
    """
    graph = tf.get_default_graph()
    return graph.get_tensor_by_name('import/%s:0' % layer)

def download_network(dir_path):
    """
    Maybe download pretrained Inception Network.
    
    Args:
      dir_path: str, directory path to save data.
    """
    url = 'https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip'
    basename = 'inception5h.zip'
    local_file = tf.contrib.learn.datasets.base.maybe_download(basename, dir_path, url)
    
    # Uncompress the pretrained Inception Network
    print('Extracting: Inception Network')
    zip_ref = zipfile.ZipFile(local_file, 'r')
    zip_ref.extractall(dir_path)
    zip_ref.close()
    
def normalize_image(image):
    """
    Args:
      image: numpy array
    """
    # Clip to [0, 1] and then convert to uint8
    image = np.clip(image, 0, 1)
    image = np.uint8(image * 255)
    return image

def save_jpeg(jpeg_file, image):
    pil_image = PIL.Image.fromarray(image)
    pil_image.save(jpeg_file)
    print('Saved to file: ', jpeg_file)
    
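def show_image(a, fmt='jpeg'):
    """Display a numpy array inline as an image (values clipped to [0, 255])."""
    from io import BytesIO  # In-memory buffer for the encoded image
    a = np.uint8(np.clip(a, 0, 255))
    f = BytesIO()
    PIL.Image.fromarray(a).save(f, fmt)
    display(Image(data=f.getvalue()))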

In [4]:
# Download inception
download_network('./data/inception')

Instructions for updating:
Please write your own downloading logic.
Extracting: Inception Network


In [82]:
# Load the pretrained Inception model as a GraphDef
model_fn = os.path.join('./data/inception', 'tensorflow_inception_graph.pb')

# Open Inception graph using FastGFile
with tf.gfile.FastGFile(model_fn, mode='rb') as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())  # Reading graph
    
with tf.Graph().as_default():
    
    # Define input for the network
    input_image = tf.placeholder(np.float32, name='input')
    imagenet_mean = 117.0
    input_preprocessed = tf.expand_dims(input_image - imagenet_mean, 0)
    
    # Load initialized graph definition
    tf.import_graph_def(graph_def, {'input': input_preprocessed})
    
    ## Get a list of Tensor names that are the output of convolutions
    # Get a list of Convolution Op's 
    graph = tf.get_default_graph()
    layers = [op.name for op in graph.get_operations() if op.type == 'Conv2D' 
              and 'import/' in op.name]  # List of op_name
    
    # Tensor names are of the form "<op_name>:<output_index>".
    feature_nums = [int(graph.get_tensor_by_name(name + ':0').get_shape()[-1]) for 
                    name in layers]
    print('Number of Conv layers: ', len(layers))
    print('Number of feature channels: ', sum(feature_nums))  # Sum of n_channels
    
    # Pick an internal layer and node to visualize. NOTE: Use outputs before applying
    # the ReLU non-linearity to have non-zero gradients for features with negative activation
    layer = 'mixed4d_3x3_bottleneck_pre_relu'
    layer_n_channels = graph.get_tensor_by_name('import/' + layer + ':0').get_shape()[-1]
    print('Number of channels in "{}": {}'.format(layer, layer_n_channels))
    channel = 139
    layer_channel = get_layer(layer)[:, :, :, channel]
    
    # Define the optimization objective: the mean activation of the chosen channel
    objective = tf.reduce_mean(layer_channel)  # Gradient ascent will maximize this value
    
    # Gradients with respect to input image using Autodiff
    input_gradient = tf.gradients(objective, input_image)[0]
    
    # Use random noise as an image
    noise_image = np.random.uniform(size=(224*2, 224*3, 3)) + 100.0
    image = noise_image.copy()
    
    # Deep Dream
    step_scale = 1.0
    n_iter = 25
    with tf.Session() as sess:
        for i in range(n_iter):
            image_gradient, obj_value = sess.run([input_gradient, objective], 
                                                 {input_image: image})
            
            # Normalize the gradient, so the same step size should work
            image_gradient /= image_gradient.std() + 1e-8
            image += image_gradient * step_scale
            print('At step {}: objective value: {}'.format(i, obj_value))
            
    # Save the image
    std_dev = 0.1
    image = (image - image.mean()) / max(image.std(), 1e-4) * std_dev + 0.5
    image = normalize_image(image)
    save_jpeg('deep-dream.jpg', image)

Number of Conv layers:  59
Number of feature channels:  7548
Number of channels in "mixed4d_3x3_bottleneck_pre_relu": 144
At step 0: objective value: -12.84211540222168
At step 1: objective value: -29.949438095092773
At step 2: objective value: 11.467972755432129
At step 3: objective value: 74.13787841796875
At step 4: objective value: 137.16217041015625
At step 5: objective value: 192.8675537109375
At step 6: objective value: 243.3765869140625
At step 7: objective value: 294.82867431640625
At step 8: objective value: 350.43646240234375
At step 9: objective value: 390.0959167480469
At step 10: objective value: 439.7633361816406
At step 11: objective value: 477.1222229003906
At step 12: objective value: 522.21142578125
At step 13: objective value: 555.4069213867188
At step 14: objective value: 588.0800170898438
At step 15: objective value: 616.86865234375
At step 16: objective value: 644.4393310546875
At step 17: objective value: 672.7307739257812
At step 18: objective value: 696.489807

In [71]:
print(', '.join(layers))

import/conv2d0_pre_relu/conv, import/conv2d1_pre_relu/conv, import/conv2d2_pre_relu/conv, import/mixed3a_pool_reduce_pre_relu/conv, import/mixed3a_5x5_bottleneck_pre_relu/conv, import/mixed3a_5x5_pre_relu/conv, import/mixed3a_3x3_bottleneck_pre_relu/conv, import/mixed3a_3x3_pre_relu/conv, import/mixed3a_1x1_pre_relu/conv, import/mixed3b_pool_reduce_pre_relu/conv, import/mixed3b_5x5_bottleneck_pre_relu/conv, import/mixed3b_5x5_pre_relu/conv, import/mixed3b_3x3_bottleneck_pre_relu/conv, import/mixed3b_3x3_pre_relu/conv, import/mixed3b_1x1_pre_relu/conv, import/mixed4a_pool_reduce_pre_relu/conv, import/mixed4a_5x5_bottleneck_pre_relu/conv, import/mixed4a_5x5_pre_relu/conv, import/mixed4a_3x3_bottleneck_pre_relu/conv, import/mixed4a_3x3_pre_relu/conv, import/mixed4a_1x1_pre_relu/conv, import/head0_bottleneck_pre_relu/conv, import/mixed4b_pool_reduce_pre_relu/conv, import/mixed4b_5x5_bottleneck_pre_relu/conv, import/mixed4b_5x5_pre_relu/conv, import/mixed4b_3x3_bottleneck_pre_relu/conv, imp

The image detail generation method described above tends to produce some patterns more often than others. One easy way to improve the diversity of the generated images is to tweak the optimization objective by using an additional "guide" image as input.
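One way such a guided objective could look is sketched below in NumPy. This is an assumption about the approach, not code from this notebook: for each spatial position of the current image's features, it picks the guide feature with the largest dot product and uses it as the gradient with respect to the features. The function name `guided_objective_grad` and the flattened `(channels, positions)` layout are hypothetical.

```python
import numpy as np

def guided_objective_grad(features, guide_features):
    """Gradient of a dot-product matching objective with respect to `features`.

    features:       (C, N) array, layer activations of the image being optimized
    guide_features: (C, M) array, activations of the guide image at the same layer
    """
    # Dot product between every image position and every guide position: (N, M)
    similarity = features.T @ guide_features
    # Index of the best-matching guide position for each image position
    best = similarity.argmax(axis=1)
    # Pushing each feature towards its best match increases the matched dot product
    return guide_features[:, best]

features = np.array([[1.0, 0.0],
                     [0.0, 1.0]])
guide_features = np.array([[0.0, 2.0],
                           [3.0, 0.0]])
grad = guided_objective_grad(features, guide_features)
```

In a full pipeline this feature-space gradient would still need to be backpropagated to the input image, for example by feeding it as `grad_ys` to `tf.gradients`.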

In [61]:
guide = np.float32(PIL.Image.open('flowers.jpg'))  # TODO: Follow Google's Code

[Hvass Lab: Visual Analysis](https://github.com/Hvass-Labs/TensorFlow-Tutorials/blob/master/13_Visual_Analysis.ipynb)