# Ch. 8.2 - *DeepDream*

DeepDream is an artistic image-modification technique that uses the representations learned by convolutional neural networks. It quickly became a sensation thanks to the trippy pictures it could generate, filled with bird feathers and dog eyes. These motifs appear because the convnet behind DeepDream was trained on ImageNet, where dogs and birds are aplenty (see below).

![deepdream](images/8_2_0_DeepDream.png)

The DeepDream algorithm is almost identical to the convnet filter-visualization technique, which consists of running a convnet in reverse: doing gradient ascent on the input to the convnet in order to maximize the activation of a specific filter in an upper layer of the convnet. DeepDream uses this same idea, with a few simple differences:

 - With DeepDream, you try to maximize the activation of entire layers rather than that of a specific filter, thus mixing together visualizations of large numbers of features at once.
 - You start not from blank, slightly noisy input, but rather from an existing image—thus the resulting effects latch on to preexisting visual patterns, distorting elements of the image in a somewhat artistic fashion.
 - The input images are processed at different scales (called octaves), which improves the quality of the visualizations.
 
Now let's make some DeepDreams!

## 8.2.1 Implementing DeepDream in Keras
We’ll start from a convnet pretrained on ImageNet. Many such convnets are available in Keras: VGG16, VGG19, Xception, ResNet50, and so on. We can implement DeepDream with any of them, but our convnet of choice will naturally affect our visualizations. The convnet used in the original DeepDream release was an Inception model, so we’ll use the Inception V3 model that comes with Keras.

In [1]:
from keras.applications import inception_v3
from keras import backend as K

K.set_learning_phase(0)

model = inception_v3.InceptionV3(weights='imagenet', include_top=False)

Using TensorFlow backend.


Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.5/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5

Next, we will compute the loss, which is the quantity we will seek to maximize during the gradient-ascent process. To do this, we'll simultaneously maximize the activation of all filters in a number of layers. Specifically, we will maximize a weighted sum of the L2 norm of the activations of a set of high-level layers. Lower layers result in geometric patterns, whereas higher layers result in visuals in which you can recognize some classes from ImageNet. We will start from an arbitrary configuration involving four layers.
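
To make the quantity concrete, here is a minimal NumPy sketch of the same weighted sum on toy arrays. The layer names, shapes, and coefficients are hypothetical stand-ins, not real Inception V3 activations. Note that the Keras code in this section sums the *squared* activations and divides by the number of elements in the tensor, so the sketch does the same.

```python
import numpy as np

def layer_loss(activation, coeff):
    """Weighted, size-normalized sum of squared activations for one layer."""
    # Normalize by the element count of the full activation tensor.
    scaling = float(np.prod(activation.shape))
    # Skip a 2-pixel border, as the Keras code in this section does.
    inner = activation[:, 2:-2, 2:-2, :]
    return coeff * np.sum(np.square(inner)) / scaling

# Hypothetical activations with shape (batch, height, width, channels).
toy_activations = {
    "low_layer": np.ones((1, 8, 8, 4), dtype="float32"),
    "high_layer": 2.0 * np.ones((1, 6, 6, 2), dtype="float32"),
}
# Hypothetical per-layer weights.
toy_contributions = {"low_layer": 0.5, "high_layer": 2.0}

total_loss = sum(
    layer_loss(toy_activations[name], coeff)
    for name, coeff in toy_contributions.items()
)
print(total_loss)
```

Raising a layer's coefficient makes its features weigh more heavily in the total, which is the knob the configuration dictionary exposes.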

**SETTING UP THE DEEPDREAM CONFIGURATION**

In [2]:
# create dictionary mapping layer names to a coefficient quantifying how much the layer's
# activation contributes to the loss we want to maximize.

layer_contributions = {
    'mixed2': 0.2,
    'mixed3': 3.,
    'mixed4': 2.,
    'mixed5': 1.5,
}

In [3]:
# Now define a tensor that contains the loss: the weighted sum of the L2 norm of the activations of the layers

# create dict that maps layer names to layer instances
layer_dict = dict([(layer.name, layer) for layer in model.layers])

# define the loss by adding layer contributions to this scalar variable
loss = K.variable(0.)
for layer_name in layer_contributions:
    coeff = layer_contributions[layer_name]
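    activation = layer_dict[layer_name].output # retrieve the layer's output

    # normalize by the number of elements in the activation tensor
    scaling = K.prod(K.cast(K.shape(activation), 'float32'))
    # add the scaled sum of squared activations to the loss; the 2-pixel border
    # is skipped to avoid border artifacts
    loss = loss + coeff * K.sum(K.square(activation[:, 2: -2, 2: -2, :])) / scaling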

**GRADIENT-ASCENT PROCESS**

In [4]:
# This tensor holds the generated image
dream = model.input

# Computes the gradients of the loss with regard to the dream
grads = K.gradients(loss, dream)[0]

# Normalizes the gradients (important trick!)
grads /= K.maximum(K.mean(K.abs(grads)), 1e-7)

# Set up a Keras function to retrieve the value of loss & gradients
outputs = [loss, grads]
fetch_loss_and_grads = K.function([dream], outputs)

def eval_loss_and_grads(x):
    outs = fetch_loss_and_grads([x])
    loss_value = outs[0]
    grad_values = outs[1]
    return loss_value, grad_values

# This function runs gradient ascent for a number of iterations
def gradient_ascent(x, iterations, step, max_loss=None):
    for i in range(iterations):
        loss_value, grad_values = eval_loss_and_grads(x)
        if max_loss is not None and loss_value > max_loss:
            break
        print('...Loss value at', i, ':', loss_value)
        x += step * grad_values
    return x
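
The effect of the normalization trick is easier to see on a toy problem. The sketch below is an illustration only: it replaces the network with a made-up loss (the mean squared value of the input) and reuses the same loop structure, including the `max_loss` early exit. The function names are hypothetical.

```python
import numpy as np

def toy_loss_and_grads(x):
    # Stand-in for the network: the loss is the mean squared value of x.
    loss = np.mean(np.square(x))
    grads = 2.0 * x / x.size
    # Same trick as above: divide by the mean absolute gradient so that
    # the update size depends on `step`, not on the raw gradient scale.
    grads = grads / max(np.mean(np.abs(grads)), 1e-7)
    return loss, grads

def toy_gradient_ascent(x, iterations, step, max_loss=None):
    x = x.copy()  # leave the caller's array untouched
    history = []
    for _ in range(iterations):
        loss, grads = toy_loss_and_grads(x)
        if max_loss is not None and loss > max_loss:
            break  # stop before the loss grows past the cap
        history.append(float(loss))
        x += step * grads
    return x, history

start = np.array([1.0, -2.0, 3.0, -4.0])
result, losses = toy_gradient_ascent(start, iterations=20, step=0.1, max_loss=10.0)
print(losses)
```

Because the normalized gradient has a mean absolute value of 1, each update moves the input by roughly `step` per element, which keeps the ascent stable regardless of how large or small the raw gradients are.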

Finally, the actual DeepDream algorithm. First, we will define a list of scales (or octaves) at which to process the images. Each successive scale is larger than the previous one by a factor of 1.4. We start by processing a small image and then increasingly scale it up (see image below).

![deepdreamprocess](images/8_2_1_process.jpg)

For each successive scale, from the smallest to the largest, we will run gradient ascent to maximize the loss that was previously defined. After each gradient ascent run, we upscale the resulting image by 40%.
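
As a worked example of the scale arithmetic, the sketch below computes the octave sizes for a 2304 × 2304 image with three octaves and a ratio of 1.4. The helper name `octave_shapes` is hypothetical. The sizes it produces agree with the shapes logged in the run output further down.

```python
def octave_shapes(original_shape, num_octave=3, octave_scale=1.4):
    """Return the processing shapes from smallest to largest."""
    shapes = [tuple(original_shape)]
    for i in range(1, num_octave):
        # Each earlier octave is smaller by another factor of octave_scale.
        shapes.append(tuple(int(dim / (octave_scale ** i)) for dim in original_shape))
    return shapes[::-1]

shapes = octave_shapes((2304, 2304))
print(shapes)  # [(1175, 1175), (1645, 1645), (2304, 2304)]
```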

To avoid losing a lot of image detail after each successive scale-up, we can use a simple trick: after each scale-up, we’ll reinject the lost details back into the image, which is possible because we know what the original image should look like. Given a small image size S and a larger image size L, we can compute the difference between the original image resized to size L and the original resized to size S. This difference quantifies the details lost when going from S to L.
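
The detail-reinjection step can be sketched on a tiny array. This is an illustration under simplifying assumptions: the two resize helpers below (2 × 2 block averaging and pixel repetition) are hypothetical stand-ins for a real image resize, and "gradient ascent" is faked by adding a constant.

```python
import numpy as np

def downscale2(img):
    # Hypothetical resize: average each 2x2 block (size L -> size S).
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upscale2(img):
    # Hypothetical resize: repeat each pixel 2x2 (size S -> size L).
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

rng = np.random.default_rng(0)
original = rng.random((4, 4))   # the original at the large size L
small = downscale2(original)    # the original at the small size S

dream_small = small + 0.1       # pretend gradient ascent altered the small image
dream_large = upscale2(dream_small)  # blocky after the scale-up

# Detail that the S -> L resize cannot recover on its own.
lost_detail = original - upscale2(small)
dream_large += lost_detail      # reinject it into the dream
```

In this toy setup the reinjected result equals the original plus the dream's modification, so the fine structure of the original survives the scale-up.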

**RUNNING THE GRADIENT ASCENT OVER DIFFERENT SUCCESSIVE SCALES**

In [7]:
import numpy as np

In [10]:
# Playing with these hyperparameters will allow us to achieve new effects
step = 0.01 # gradient ascent step size
num_octave = 3 # number of scales to run gradient ascent
octave_scale = 1.4 # size ratio between scales
iterations = 20 # number of ascent steps to run at each scale

# if the loss grows larger than 10, we interrupt gradient ascent to avoid ugly artifacts
max_loss = 10.

In [13]:
# HELPER FUNCTIONS

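import scipy.misc     # provides imsave in SciPy < 1.2
import scipy.ndimage  # import the submodule explicitly so scipy.ndimage.zoom resolves
from keras.preprocessing import image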

def resize_img(img, size):
    img = np.copy(img)
    factors = (1,
               float(size[0]) / img.shape[1],
               float(size[1]) / img.shape[2],
               1)
    return scipy.ndimage.zoom(img, factors, order=1)

def save_img(img, fname):
    pil_img = deprocess_image(np.copy(img))
    scipy.misc.imsave(fname, pil_img)

# Util function to open, resize, and format pictures into tensors that Inception can process
def preprocess_image(image_path):
    img = image.load_img(image_path)
    img = image.img_to_array(img)
    img = np.expand_dims(img, axis=0)
    img = inception_v3.preprocess_input(img)
    return img

# Util function to convert tensor into a valid image
def deprocess_image(x):
    if K.image_data_format() == 'channels_first':
        x = x.reshape((3, x.shape[2], x.shape[3]))
        x = x.transpose((1, 2, 0))
    else:
        x = x.reshape((x.shape[1], x.shape[2], 3))
    # undoes the preprocessing performed by inception_v3.preprocess_input
    x /= 2.
    x += 0.5
    x *= 255.
    x = np.clip(x, 0, 255).astype('uint8')
    return x
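
The arithmetic in `deprocess_image` is the inverse of a scaling from [0, 255] to [-1, 1]. The sketch below checks that round trip on a few pixel values. It assumes that `inception_v3.preprocess_input` computes `x / 127.5 - 1`; the helper names are hypothetical.

```python
import numpy as np

def to_inception_range(pixels):
    # Assumed equivalent of inception_v3.preprocess_input: [0, 255] -> [-1, 1].
    return pixels.astype("float32") / 127.5 - 1.0

def from_inception_range(x):
    # Same arithmetic as deprocess_image, without the reshape.
    x = x / 2.0
    x = x + 0.5
    x = x * 255.0
    return np.clip(x, 0, 255).astype("uint8")

pixels = np.array([0, 64, 128, 255], dtype="uint8")
scaled = to_inception_range(pixels)
restored = from_inception_range(scaled)
print(scaled, restored)
```

Because the final cast to `uint8` truncates rather than rounds, a restored value can differ from the original by one intensity level.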

In [14]:
# Define path of image we want to use
base_image_path = 'images/rlatimer.jpg'

# Load the base image into a Numpy array
img = preprocess_image(base_image_path)

# Prepare a list of shape tuples defining the different scales at which to run gradient ascent
original_shape = img.shape[1:3]
successive_shapes = [original_shape]
for i in range(1, num_octave):
    shape = tuple([int(dim / (octave_scale ** i)) for dim in original_shape])
    successive_shapes.append(shape)
    
# Reverse the list of shapes so they're in increasing order
successive_shapes = successive_shapes[::-1]

In [21]:
# Resize Numpy array of the image to the smallest scale
original_img = np.copy(img)
shrunk_original_img = resize_img(img, successive_shapes[0])

for shape in successive_shapes:
    print('Processing image shape', shape)
    img = resize_img(img, shape) # scales up the dream image
    
    # Run gradient ascent, altering the dream image
    img = gradient_ascent(img, iterations=iterations, step=step, max_loss=max_loss)
    
    # Scale up the smaller version of original image
    upscaled_shrunk_original_img = resize_img(shrunk_original_img, shape)
    
    # Compute the high-quality version of the original
    same_size_original = resize_img(original_img, shape)
    
    # Difference between the two is the detail that was lost when scaling up
    lost_detail = same_size_original - upscaled_shrunk_original_img
    
    # Reinjects the lost detail into the dream
    img += lost_detail
    shrunk_original_img = resize_img(original_img, shape)
    save_img(img, fname='dream_at_scale_' + str(shape) + '.png')
    
save_img(img, fname='final_dream.png')

('Processing image shape', (1175, 1175))
('...Loss value at', 0, ':', 1.5334526)
('...Loss value at', 1, ':', 2.1093473)
('...Loss value at', 2, ':', 2.860016)
('...Loss value at', 3, ':', 3.751883)
('...Loss value at', 4, ':', 4.6573687)
('...Loss value at', 5, ':', 5.56007)
('...Loss value at', 6, ':', 6.4297724)
('...Loss value at', 7, ':', 7.250205)
('...Loss value at', 8, ':', 8.061167)
('...Loss value at', 9, ':', 8.838395)
('...Loss value at', 10, ':', 9.5841465)


`imsave` is deprecated in SciPy 1.0.0, and will be removed in 1.2.0.
Use ``imageio.imwrite`` instead.
  


('Processing image shape', (1645, 1645))
('...Loss value at', 0, ':', 3.6731281)
('...Loss value at', 1, ':', 5.0932336)
('...Loss value at', 2, ':', 6.2126055)
('...Loss value at', 3, ':', 7.2114797)
('...Loss value at', 4, ':', 8.140279)
('...Loss value at', 5, ':', 9.023996)
('...Loss value at', 6, ':', 9.836564)
('Processing image shape', (2304, 2304))
('...Loss value at', 0, ':', 3.704988)
('...Loss value at', 1, ':', 5.058856)
('...Loss value at', 2, ':', 6.1736317)
('...Loss value at', 3, ':', 7.187393)
('...Loss value at', 4, ':', 8.1419)
('...Loss value at', 5, ':', 9.051799)
('...Loss value at', 6, ':', 9.908897)


**NOTE:** Because the original Inception V3 network was trained to recognize concepts in images of size 299 × 299, and given that the process involves scaling the images down by a reasonable factor, the DeepDream implementation produces much better results on images that are somewhere between 300 × 300 and 400 × 400. Regardless, we can run the same code on images of any size and any ratio.

I started with this picture of me:

![starting_image](images/rlatimer.jpg)

And here is our DeepDream output!

![output_image](images/final_dream.png)

Now that we have an introduction to DeepDream, it would be valuable to explore what we can do by adjusting which layers we use in our loss. Layers that are lower in the network contain more-local, less-abstract representations and lead to dream patterns that look more geometric. Layers that are higher up lead to more-recognizable visual patterns based on the most common objects found in ImageNet. We can use random generation of the parameters in the `layer_contributions` dictionary to quickly explore many different layer combinations. The image below shows a range of results obtained from an image of a batch of pastries using different layer configurations.
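
One possible way to generate such random configurations is sketched below. The helper name, the coefficient range, and the minimum number of layers are assumptions chosen for illustration; only the `mixed2` to `mixed5` layer names come from the configuration used earlier.

```python
import random

# Layer names taken from the configuration used earlier in this section.
CANDIDATE_LAYERS = ["mixed2", "mixed3", "mixed4", "mixed5"]

def random_layer_contributions(rng, max_coeff=3.0, min_layers=2):
    """Pick a random subset of layers and a random coefficient for each."""
    k = rng.randint(min_layers, len(CANDIDATE_LAYERS))
    chosen = rng.sample(CANDIDATE_LAYERS, k)
    return {name: round(rng.uniform(0.1, max_coeff), 2) for name in sorted(chosen)}

rng = random.Random(42)  # seeded so the exploration is repeatable
configs = [random_layer_contributions(rng) for _ in range(5)]
for config in configs:
    print(config)
```

Each generated dictionary can take the place of `layer_contributions`. The loss and gradient tensors are built from that dictionary, so they need to be redefined before gradient ascent is run with a new configuration.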

![pastries](images/8_2_1_pastries.png)

## 8.2.2 Wrapping up
 - DeepDream consists of running a convnet in reverse to generate inputs based on the representations learned by the network.
 - The results produced are fun and somewhat similar to the visual artifacts induced in humans by the disruption of the visual cortex via psychedelics.
 - Note that the process isn’t specific to image models or even to convnets. It can be done for speech, music, and more.