<a href="https://colab.research.google.com/github/mlartorg/visualML/blob/master/sin_xy2rgb.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

##### Copyright 2018 Google LLC.

Licensed under the Apache License, Version 2.0 (the "License");

In [1]:
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Compositional Pattern Producing Networks for Feature Visualization

This notebook uses [**Lucid**](https://github.com/tensorflow/lucid) to produce aesthetically pleasing feature visualizations using a [Differentiable Image Parameterization](https://distill.pub/2018/differentiable-parameterizations/#section-xy2rgb) called a **Compositional Pattern Producing Network** (CPPN).

![](https://storage.googleapis.com/tensorflow-lucid/notebooks/xy2rgb/cppn-header.jpg)

This notebook additionally demonstrates:

* rendering videos of the CPPN's training process as it generates the visualizations,
* rendering videos that interpolate between sets of learned CPPN parameters,
* rendering high-resolution visualizations from a set of CPPN parameters.


This notebook doesn't introduce the abstractions behind Lucid; you may also wish to read the [Lucid tutorial](https://colab.research.google.com/github/tensorflow/lucid/blob/master/notebooks/tutorial.ipynb).

**Note**: The easiest way to use this tutorial is as a Colab notebook, which allows you to dive in with no setup. We recommend you enable a free GPU by going to:

> **Runtime**   →   **Change runtime type**   →   **Hardware Accelerator: GPU**

## Install, import, and load a model

In [3]:
!pip install -q "lucid>=0.2.3"  # quoted so the shell does not treat >= as a redirect

In [3]:
!pip install numpy==1.16  # allow_pickle workaround, restart runtime for this to take effect

Collecting numpy==1.16
  Downloading https://files.pythonhosted.org/packages/7b/74/54c5f9bb9bd4dae27a61ec1b39076a39d359b3fb7ba15da79ef23858a9d8/numpy-1.16.0-cp36-cp36m-manylinux1_x86_64.whl (17.3MB)
     |████████████████████████████████| 17.3MB 198kB/s 
ERROR: umap-learn 0.4.6 has requirement numpy>=1.17, but you'll have numpy 1.16.0 which is incompatible.
ERROR: datascience 0.10.6 has requirement folium==0.2.1, but you'll have folium 0.8.3 which is incompatible.
ERROR: albumentations 0.1.12 has requirement imgaug<0.2.7,>=0.2.5, but you'll have imgaug 0.2.9 which is incompatible.
Installing collected packages: numpy
  Found existing installation: numpy 1.18.5
    Uninstalling numpy-1.18.5:
      Successfully uninstalled numpy-1.18.5
Successfully installed numpy-1.16.0


In [4]:
# For video rendering

!pip install -q moviepy
!imageio_download_bin ffmpeg

Ascertaining binaries for: ffmpeg.


In [5]:
%tensorflow_version 1.x
from __future__ import print_function
import io
import string
import numpy as np
import PIL
import base64
from glob import glob

import matplotlib.pylab as pl

import tensorflow as tf
from tensorflow.contrib import slim

from IPython.display import clear_output, Image, display, HTML

import moviepy.editor as mpy
from moviepy.video.io.ffmpeg_writer import FFMPEG_VideoWriter


from google.colab import files

TensorFlow 1.x selected.


In [6]:
from lucid.modelzoo import vision_models
from lucid.misc.io import show, save, load
from lucid.optvis import objectives
from lucid.optvis import render
from lucid.misc.tfutil import create_session

In [7]:
model = vision_models.InceptionV1()
model.load_graphdef()













## Setting up the CPPN 

In [8]:
def composite_activation(x):
  x = tf.atan(x)
  # Coefficients computed by:
  #   def rms(x):
  #     return np.sqrt((x*x).mean())
  #   a = np.arctan(np.random.normal(0.0, 1.0, 10**6))
  #   print(rms(a), rms(a*a))
  return tf.concat([x/0.67, (x*x)/0.6], -1)


def composite_activation_unbiased(x):
  x = tf.atan(x)
  # Coefficients computed by:
  #   a = np.arctan(np.random.normal(0.0, 1.0, 10**6))
  #   aa = a*a
  #   print(a.std(), aa.mean(), aa.std())
  return tf.concat([x/0.67, (x*x-0.45)/0.396], -1)

def composite_activation_sin(x):
  x = tf.sin(x)
  # Coefficients computed by:
  #   a = np.sin(np.random.normal(0.0, 1.0, 10**6))
  #   aa = a*a
  #   print(a.std(), aa.mean(), aa.std())
  return tf.concat([x/0.631, (x*x)/0.616], -1)
  #   return tf.concat([x/0.63, (x*x-0.4)/0.47], -1)


def relu_normalized(x):
  x = tf.nn.relu(x)
  # Coefficients computed by:
  #   a = np.random.normal(0.0, 1.0, 10**6)
  #   a = np.maximum(a, 0.0)
  #   print(a.mean(), a.std())
  return (x-0.40)/0.58


def image_cppn(
    size,
    num_output_channels=3,
    num_hidden_channels=24,
    num_layers=8,
    activation_fn=composite_activation,
    normalize=False):
  r = 3.0**0.5  # std(coord_range) == 1.0
  coord_range = tf.linspace(-r, r, size)
  y, x = tf.meshgrid(coord_range, coord_range, indexing='ij')
  net = tf.expand_dims(tf.stack([x, y], -1), 0)  # add batch dimension

  with slim.arg_scope([slim.conv2d], kernel_size=1, activation_fn=None):
    for i in range(num_layers):
      in_n = int(net.shape[-1])
      net = slim.conv2d(
          net, num_hidden_channels,
          # this is untruncated version of tf.variance_scaling_initializer
          weights_initializer=tf.random_normal_initializer(0.0, np.sqrt(1.0/in_n)),
      )
      if normalize:
        net = slim.instance_norm(net)
      net = activation_fn(net)

    rgb = slim.conv2d(net, num_output_channels, activation_fn=tf.nn.sigmoid,
                      weights_initializer=tf.zeros_initializer())
  return rgb
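
The normalization constants above can be checked without TensorFlow. The following is a minimal sketch, not part of the original notebook: it re-runs the Monte Carlo estimate described in the comments of `composite_activation` using only the standard library, and also checks that a coordinate grid spanning `[-sqrt(3), sqrt(3)]` has roughly unit standard deviation. The helper names `rms` and `std`, the seed, and the sample count are illustrative choices.

```python
import math
import random


def rms(values):
    """Root mean square of a sequence of floats."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def std(values):
    """Population standard deviation of a sequence of floats."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


# Monte Carlo estimate for atan applied to unit-normal inputs.
rng = random.Random(0)  # fixed seed so the estimate is repeatable
samples = [math.atan(rng.gauss(0.0, 1.0)) for _ in range(200_000)]
squares = [a * a for a in samples]

atan_rms = rms(samples)      # compare with the divisor 0.67
atan_sq_rms = rms(squares)   # compare with the divisor 0.6

# Coordinate grid as built in image_cppn: `size` evenly spaced points in [-r, r].
size = 224
r = math.sqrt(3.0)
coords = [-r + 2.0 * r * i / (size - 1) for i in range(size)]
coord_std = std(coords)      # compare with 1.0

print(atan_rms, atan_sq_rms, coord_std)
```

With this many samples the two RMS estimates should land close to the divisors used in `composite_activation`. The coordinate standard deviation is slightly above 1 for a finite grid and approaches 1 as `size` grows.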

Estimate the number of parameters in the CPPN:

In [9]:
with tf.Graph().as_default():
  image_cppn(224)
  variables = tf.get_collection('variables')
  param_n = sum([v.shape.num_elements() for v in variables])
  print('CPPN parameter count:', param_n)

Instructions for updating:
Please use `layer.__call__` method instead.




CPPN parameter count: 8451
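
The reported count can be reproduced by hand. The sketch below is an illustration rather than code from the notebook: it assumes each 1x1 convolution carries one weight per input/output channel pair plus one bias per output channel, and that the composite activation doubles the channel count because it concatenates two tensors. The function name `cppn_param_count` and the `activation_multiplier` argument are hypothetical.

```python
def cppn_param_count(num_hidden_channels=24, num_layers=8,
                     num_output_channels=3, in_channels=2,
                     activation_multiplier=2):
    """Count weights and biases of a stack of 1x1 convolutions."""
    total = 0
    channels = in_channels  # the (x, y) coordinate pair
    for _ in range(num_layers):
        # weights (channels x hidden) plus one bias per hidden channel
        total += channels * num_hidden_channels + num_hidden_channels
        # the composite activation concatenates two tensors along channels
        channels = num_hidden_channels * activation_multiplier
    # final projection to the output channels, again with biases
    total += channels * num_output_channels + num_output_channels
    return total


default_count = cppn_param_count()
print('CPPN parameter count:', default_count)
```

Under these assumptions the default configuration gives 72 parameters for the first layer, 1176 for each of the seven following layers, and 147 for the output layer, which matches the 8451 printed above.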


Let's quickly sanity-check that this CPPN can learn to produce an image with the properties we expect.
As a simple test, we fit the XOR function by imposing a loss on the four corner pixels of the image.

In [10]:
cppn_f = lambda: image_cppn(64)
optimizer = tf.train.AdamOptimizer(0.01)

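def xor_objective(T):
  a = T('input')[0]  # drop the batch dimension: [height, width, channels]
  # Negated squared error: the corners at (0, 0) and (-1, -1) are pushed
  # toward 0, the two remaining corners toward 1.
  return -(tf.square(a[0, 0])      + tf.square(a[-1, -1]) +
           tf.square(1.0-a[-1, 0]) + tf.square(1.0-a[0, -1]))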

vis = render.render_vis(model, xor_objective, param_f=cppn_f, optimizer=optimizer, transforms=[], thresholds=range(10), verbose=False)
show(vis)
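
To make the corner loss concrete, here is a small framework-free sketch. It is a simplification and not the notebook's code: it works on a single-channel image stored as nested lists, whereas `xor_objective` operates on an RGB tensor. It assumes, as the leading minus sign suggests, that higher objective values are better. The name `xor_corner_score` is hypothetical.

```python
def xor_corner_score(img):
    """Score a 2D image (nested lists, values in [0, 1]) against an XOR pattern.

    The top-left and bottom-right corners should be 0; the other two
    corners should be 1. The score is the negated sum of squared errors,
    so a perfect XOR pattern reaches the maximum score of zero.
    """
    penalty = (img[0][0] ** 2 + img[-1][-1] ** 2 +
               (1.0 - img[-1][0]) ** 2 + (1.0 - img[0][-1]) ** 2)
    return 0.0 - penalty


perfect = [[0.0, 1.0],
           [1.0, 0.0]]
flat_gray = [[0.5, 0.5],
             [0.5, 0.5]]
inverted = [[1.0, 0.0],
            [0.0, 1.0]]

scores = [xor_corner_score(im) for im in (perfect, flat_gray, inverted)]
print(scores)
```

A uniform gray image incurs an error of 0.25 at each corner, while the inverted pattern incurs the largest possible error of 1 per corner.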











































That looks reasonable enough!
Let's move on to our original goal: feature visualization.

# Feature Visualization

Let's use our new CPPN to produce one of the feature visualizations similar to those in the header image:

In [11]:
def render_feature(
    cppn_f = lambda: image_cppn(224, activation_fn=composite_activation_sin),
    optimizer = tf.train.AdamOptimizer(0.005),
    objective = objectives.deepdream("mixed4d")):
  vis = render.render_vis(model, objective, param_f=cppn_f, optimizer=optimizer, transforms=[], thresholds=[2**i for i in range(5,10)], verbose=False)
  show(vis)


# render_feature( objective=objectives.channel("mixed4b_pool_reduce_pre_relu", 16))

### Varying the activation function

The following `render_story` function does several things: it sets up the optimization problem, writes one video frame per optimization step, and finally saves the trained weights and the final image. It reads the module-level `transforms` list, which is defined in the cell after the function, so run that cell before calling `render_story`.

In [28]:
render.make_vis_T?

In [12]:
from lucid.misc.io.serialize_array import _normalize_array

def render_story(size, obj_str, lr=0.004, step_n=512,
                 normalize=False,
                 activation_fn=composite_activation, num_layers=8, num_hidden=24):
  sess = create_session()

  obj = objectives.deepdream(obj_str)
  # obj = objectives.channel("mixed4b_pool_reduce_pre_relu", 16)
  # Set up optimization problem
  # size = 512
  t_size = tf.placeholder_with_default(size, [])
  T = render.make_vis_T(
      model, obj, 
      param_f=lambda: image_cppn(
          t_size, num_layers=num_layers, num_hidden_channels=num_hidden, normalize=normalize, activation_fn=activation_fn),
      transforms=transforms,
      optimizer=tf.train.AdamOptimizer(lr),
  )
  tf.global_variables_initializer().run()

  # Prepare video writer and filenames
  subst = {ord(':'):'_', ord('/'):'_'}
  out_name = 'xy2rgb_' + obj_str.translate(subst)
  video_fn = out_name + '.mp4'
  writer = FFMPEG_VideoWriter(video_fn, (size, size), 60.0)

  # Optimization loop
  try:
    for i in range(step_n):
      _, loss, img = sess.run([T("vis_op"), T("loss"), T("input")])
      writer.write_frame(_normalize_array(img))
      if i > 0 and i % 50 == 0:
        clear_output()
        print("%d / %d  score: %f"%(i, step_n, loss))
        show(img)
  except KeyboardInterrupt:
    pass
  finally:
    writer.close()

  # Show the resulting video
  clear_output()
  display(mpy.ipython_display(video_fn, height=size))

  # Save trained variables
  train_vars = sess.graph.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
  params = np.array(sess.run(train_vars), object)
  save(params, out_name + '.npy')

  # Save final image
  final_img = T("input").eval({t_size: size})
  save(final_img, out_name+'.jpg', quality=90)
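
The output filenames matter later, because the saved `.npy` files are loaded again by name. The sketch below isolates the naming logic from `render_story` so it can be tried on its own; `story_filenames` is a hypothetical helper, and the second objective string is an illustrative example rather than a layer name taken from the notebook.

```python
def story_filenames(obj_str):
    """Derive the video, parameter, and image filenames for an objective string."""
    # ':' and '/' are replaced because they are awkward in filenames.
    subst = {ord(':'): '_', ord('/'): '_'}
    out_name = 'xy2rgb_' + obj_str.translate(subst)
    return {
        'video': out_name + '.mp4',
        'params': out_name + '.npy',
        'image': out_name + '.jpg',
    }


names = story_filenames('mixed4d')
print(names['params'])

# Illustrative string containing both substituted characters.
print(story_filenames('mixed4b/concat:0')['video'])
```

Calling `render_story(224, 'mixed4d', ...)` therefore writes the parameters to `xy2rgb_mixed4d.npy`, the name that the later `load` calls expect.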

In [37]:
from lucid.optvis import transform

NUM_LAYERS = 8
NUM_FEATURES = 64

# transforms=[transform.random_rotate([137.5 * i % 180 for i in range(17)]), transform.jitter(1), transform.random_scale([1.61 * i % 1 for i in range(1, 2)])]
transforms = [transform.jitter(1)]

In [33]:
render_story(224, 'mixed4d', step_n=500, lr=0.005, activation_fn=composite_activation_sin, num_layers=NUM_LAYERS, num_hidden=NUM_FEATURES)

In [51]:
render_story(224, 'mixed5b', step_n=500, lr=0.003, activation_fn=composite_activation_sin, num_layers=NUM_LAYERS, num_hidden=NUM_FEATURES)

# Arbitrary resolution images

**Note**: Install `numpy==1.16` and restart the runtime for this section to work.


In [45]:
sess = create_session()
t_size = tf.placeholder_with_default(224, [])
t_image = image_cppn(t_size, activation_fn=composite_activation_sin, num_layers=NUM_LAYERS, num_hidden_channels=NUM_FEATURES)

train_vars = sess.graph.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)

def render_params(params, size=224):
  feed_dict = dict(zip(train_vars, params))
  feed_dict[t_size] = size
  return sess.run(t_image, feed_dict)[0]
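
Rendering at an arbitrary size works because a CPPN maps coordinates to colors, so the same parameters can be sampled on a grid of any density. The sketch below illustrates that idea with a tiny toy network in pure Python. It is not the notebook's `image_cppn`: it has a single hidden layer with a sine activation and no biases, and `make_toy_params`, `toy_cppn_pixel`, and `render_toy` are hypothetical names.

```python
import math
import random


def make_toy_params(hidden=4, seed=0):
    """Random weights for a 2 -> hidden -> 3 toy network (no biases)."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 1.0) for _ in range(hidden)] for _ in range(2)]
    w2 = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(hidden)]
    return w1, w2


def toy_cppn_pixel(params, x, y):
    """Map one (x, y) coordinate to an RGB triple in (0, 1)."""
    w1, w2 = params
    hidden = [math.sin(x * w1[0][j] + y * w1[1][j]) for j in range(len(w1[0]))]
    logits = [sum(hidden[j] * w2[j][c] for j in range(len(hidden)))
              for c in range(3)]
    return [1.0 / (1.0 + math.exp(-v)) for v in logits]  # sigmoid


def render_toy(params, size):
    """Sample the toy network on a size x size grid over [-sqrt(3), sqrt(3)]."""
    r = math.sqrt(3.0)
    coords = [-r + 2.0 * r * i / (size - 1) for i in range(size)]
    return [[toy_cppn_pixel(params, x, y) for x in coords] for y in coords]


params = make_toy_params()
low = render_toy(params, 8)     # coarse sampling
high = render_toy(params, 32)   # finer sampling of the same function
print(len(low), len(high))
```

Both renders cover the same coordinate range, so their corner pixels agree; the larger grid only adds samples in between.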




In [56]:
params = load('xy2rgb_mixed4d.npy')
vis = render_params(params, 512)
show(vis)

In [50]:
def interpolate_params(param1, param2, duration=5.0, size=224):

  def frame(t):
    t = t / duration
    t = (1.0-np.cos(2.0*np.pi*t))/2.0       # looping & easing
    params = param1*(1.0-t) + param2*t      # blending
    params *= 1.0 + t*(1.0-t)               # exaggerating
    img = render_params(params, size=size)
    return _normalize_array(img)

  clip = mpy.VideoClip(frame, duration=duration)
  clip.write_videofile('tmp.mp4', fps=30.0)
  display(mpy.ipython_display('tmp.mp4', height=512))
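
The timing inside `frame` is easier to follow when separated from the rendering. The sketch below is an illustration, not notebook code: it reproduces only the scalar schedule, returning the blend weight applied to `param2` and the gain applied to the blended parameters. The name `blend_schedule` is hypothetical.

```python
import math


def blend_schedule(t, duration=5.0):
    """Return (weight, gain) for time t in seconds, mirroring `frame`."""
    u = t / duration                                       # normalized time in [0, 1]
    weight = (1.0 - math.cos(2.0 * math.pi * u)) / 2.0     # looping and easing
    gain = 1.0 + weight * (1.0 - weight)                   # exaggeration factor
    return weight, gain


for t in (0.0, 1.25, 2.5, 3.75, 5.0):
    weight, gain = blend_schedule(t)
    print('t=%.2f  weight=%.3f  gain=%.3f' % (t, weight, gain))
```

The weight rises from 0 to 1 at the midpoint and returns to 0 at the end, so the clip loops from `param1` to `param2` and back. The gain peaks at 1.25 where the two parameter sets are mixed equally and is 1 at either endpoint.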

In [None]:
interpolate_params(
    load('xy2rgb_mixed4d.npy'),
    load('xy2rgb_mixed5b.npy'),
    size=512
  )