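Neural style transfer takes three steps.

1. Set up a network that computes VGG19 layer activations for the style reference image, the target image, and the generated image at the same time.
2. Use the layer activations of these three images to compute the overall loss function.
3. Set up a gradient descent process to minimize this loss function.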
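
As a toy illustration of step 3 only, the sketch below runs plain gradient descent on a weighted sum of two quadratic terms. The quadratics are stand-ins chosen for this example (the real content and style losses are computed on VGG19 activations), and `descend`, `content_target`, and `style_target` are hypothetical names.

```python
import numpy as np

def descend(content_target, style_target, content_weight, style_weight,
            lr=0.1, steps=200):
    """Minimize cw * ||x - c||^2 + sw * ||x - s||^2 by gradient descent."""
    x = content_target.copy()  # start from the content image, as the notebook does
    for _ in range(steps):
        grad = (2 * content_weight * (x - content_target)
                + 2 * style_weight * (x - style_target))
        x -= lr * grad
    return x

c = np.array([0.0, 10.0])  # stand-in "content"
s = np.array([4.0, 2.0])   # stand-in "style"
x = descend(c, s, content_weight=1.0, style_weight=3.0)
```

For these stand-in losses the minimizer is the weighted average of `c` and `s`, so the weights decide which term the result sits closer to. The same trade-off is what `content_weight` and `style_weight` control in the real loss.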

In [1]:
import tensorflow as tf

tf.compat.v1.disable_eager_execution()

from keras.preprocessing.image import load_img, img_to_array

target_image_path = '/Users/bifnudozhao/Projects/ai-playground/datasets/images/firenze_duomo.jpg'
style_reference_image_path = '/Users/bifnudozhao/Projects/ai-playground/datasets/images/starry_night.jpg'

width, height = load_img(target_image_path).size
img_height = 400
img_width = int(width * img_height / height)

In [2]:
import numpy as np
from keras.applications import vgg19

def preprocess_image(image_path):
    img = load_img(image_path, target_size=(img_height, img_width))
    img = img_to_array(img)
    img = np.expand_dims(img, axis=0)
    img = vgg19.preprocess_input(img)
    return img

# Reverses vgg19.preprocess_input: adds the ImageNet channel means back
# (modifying x in place), then converts BGR to RGB.
def deprocess_image(x):
    x[:, :, 0] += 103.939
    x[:, :, 1] += 116.779
    x[:, :, 2] += 123.68
    x = x[:, :, ::-1]  # BGR -> RGB
    x = np.clip(x, 0, 255).astype('uint8')
    return x
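
To see what `deprocess_image` undoes, here is a NumPy-only sketch of the "caffe"-style preprocessing that `vgg19.preprocess_input` applies: RGB is flipped to BGR and the ImageNet per-channel means are subtracted, with no scaling. `preprocess_np` and `deprocess_np` are illustrative names, not Keras functions, and the rounding step is an addition of this sketch.

```python
import numpy as np

IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68])

def preprocess_np(rgb):
    """RGB image of shape (H, W, 3) -> zero-centered BGR floats."""
    bgr = rgb[..., ::-1].astype('float64')
    return bgr - IMAGENET_BGR_MEAN

def deprocess_np(x):
    """Inverse of preprocess_np: add the means back, flip BGR -> RGB, cast to uint8."""
    x = x + IMAGENET_BGR_MEAN  # creates a new array, so the caller's data is untouched
    x = x[..., ::-1]
    # Round first so floating-point error cannot truncate 254.999... down to 254
    return np.clip(np.rint(x), 0, 255).astype('uint8')

rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(4, 5, 3)).astype('uint8')
restored = deprocess_np(preprocess_np(rgb))
```

Unlike this sketch, the notebook's `deprocess_image` modifies its argument in place, which is why the optimization loop passes it a copy.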

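Next, set up the VGG19 network. Its input is a batch of three images: the target image, the style reference image, and the generated image. The target and style reference images never change, so they are defined with `K.constant`; the generated image changes throughout the optimization, so it is a `K.placeholder`.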

In [3]:
from keras import backend as K

target_image = K.constant(preprocess_image(target_image_path))
style_reference_image = K.constant(preprocess_image(style_reference_image_path))
combination_image = K.placeholder((1, img_height, img_width, 3))

input_tensor = K.concatenate([target_image,
                              style_reference_image,
                              combination_image], axis=0)

model = vgg19.VGG19(input_tensor=input_tensor,
                    weights='imagenet',
                    include_top=False)  # drop the fully connected classifier layers
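
The batch order matters later, when the loss code picks activations by index. A small NumPy sketch with dummy arrays (the tiny shapes and fill values are made up for illustration) shows the layout that concatenating along axis 0 produces:

```python
import numpy as np

h, w = 4, 6  # tiny stand-in for (img_height, img_width)
target = np.full((1, h, w, 3), 0.0)
style = np.full((1, h, w, 3), 1.0)
combination = np.full((1, h, w, 3), 2.0)

batch = np.concatenate([target, style, combination], axis=0)
# batch[0] -> target image, batch[1] -> style reference, batch[2] -> generated image
```

Every layer output of the model keeps this leading batch axis, so index 0, 1, or 2 on a layer's activations selects the corresponding image.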


In [4]:
def content_loss(base, combination):
    return K.sum(K.square(combination - base))

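def gram_matrix(x):
    # Move channels first and flatten each channel into a row: (channels, height * width)
    features = K.batch_flatten(K.permute_dimensions(x, (2, 0, 1)))
    # Inner products between channel rows: (channels, channels)
    gram = K.dot(features, K.transpose(features))
    return gram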

def style_loss(style, combination):
    S = gram_matrix(style)
    C = gram_matrix(combination)
    channels = 3
    size = img_height * img_width
    return K.sum(K.square(S - C)) / (4. * (channels ** 2) * (size ** 2))

def total_variation_loss(x):
    a = K.square(
        x[:, :img_height - 1, :img_width - 1, :] -
        x[:, 1:, :img_width - 1, :])
    b = K.square(
        x[:, :img_height - 1, :img_width - 1, :] -
        x[:, :img_height - 1, 1:, :])
    return K.sum(K.pow(a + b, 1.25))
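
The Gram matrix is what lets the style loss ignore where a texture appears in the image. Below is a NumPy sketch of the same computation as `gram_matrix`; the name `gram_matrix_np` and the random feature map are assumptions made for this example.

```python
import numpy as np

def gram_matrix_np(x):
    """x: feature map of shape (height, width, channels) -> (channels, channels)."""
    features = np.transpose(x, (2, 0, 1)).reshape(x.shape[2], -1)
    return features @ features.T

rng = np.random.default_rng(0)
fmap = rng.normal(size=(5, 7, 4))
gram = gram_matrix_np(fmap)

# Shuffle the spatial positions: the channel co-activations do not change
flat = fmap.reshape(-1, 4)
shuffled = flat[rng.permutation(len(flat))].reshape(5, 7, 4)
gram_shuffled = gram_matrix_np(shuffled)
```

Because each entry sums channel products over all positions, rearranging the positions leaves the matrix unchanged up to floating-point error. Only which channels fire together is recorded, not where.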

In [5]:
outputs_dict = dict([(layer.name, layer.output) for layer in model.layers])
content_layer = 'block5_conv2'
style_layers = [
    'block1_conv1',
    'block2_conv1',
    'block3_conv1',
    'block4_conv1',
    'block5_conv1']
total_variation_weight = 1e-4
style_weight = 1.
content_weight = 0.025

loss = K.variable(0.)
layer_features = outputs_dict[content_layer]
target_image_features = layer_features[0, :, :, :]
combination_features = layer_features[2, :, :, :]
loss = loss + content_weight * content_loss(target_image_features,
                                      combination_features)

for layer_name in style_layers:
    layer_features = outputs_dict[layer_name]
    style_reference_features = layer_features[1, :, :, :]
    combination_features = layer_features[2, :, :, :]
    sl = style_loss(style_reference_features, combination_features)
    loss = loss + (style_weight / len(style_layers)) * sl

loss = loss + total_variation_weight * total_variation_loss(combination_image)
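
The total variation term added last penalizes differences between neighboring pixels of the generated image, which discourages noisy output. A NumPy sketch mirroring `total_variation_loss` (the name `total_variation_np` and the test images are hypothetical):

```python
import numpy as np

def total_variation_np(x):
    """x: image batch of shape (1, height, width, channels)."""
    a = np.square(x[:, :-1, :-1, :] - x[:, 1:, :-1, :])  # vertical neighbors
    b = np.square(x[:, :-1, :-1, :] - x[:, :-1, 1:, :])  # horizontal neighbors
    return np.sum(np.power(a + b, 1.25))

flat_image = np.full((1, 8, 8, 3), 128.0)
rng = np.random.default_rng(0)
noisy_image = flat_image + rng.normal(scale=10.0, size=flat_image.shape)

tv_flat = total_variation_np(flat_image)    # uniform image: no variation
tv_noisy = total_variation_np(noisy_image)  # noise makes neighbors differ
```

A perfectly uniform image scores zero and any pixel-level noise raises the value, so weighting this term more heavily should push the result toward smoother images.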

In [6]:
grads = K.gradients(loss, combination_image)[0]
fetch_loss_and_grads = K.function([combination_image], [loss, grads])

In [7]:
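# Computes the loss and gradients in one pass and caches the gradients,
# because fmin_l_bfgs_b requests them through two separate callables.
class Evaluator(object):
    def __init__(self):
        self.loss_value = None
        self.grad_values = None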

    def loss(self, x):
        assert self.loss_value is None
        x = x.reshape((1, img_height, img_width, 3))
        outs = fetch_loss_and_grads([x])
        loss_value = outs[0]
        grad_values = outs[1].flatten().astype('float64')
        self.loss_value = loss_value
        self.grad_values = grad_values
        return self.loss_value

    def grads(self, x):
        assert self.loss_value is not None
        grad_values = np.copy(self.grad_values)
        self.loss_value = None
        self.grad_values = None
        return grad_values

evaluator = Evaluator()
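
The point of this class is to avoid running the network twice per step: the loss and the gradients come out of one call, but the optimizer asks for them separately. The sketch below shows the same caching pattern with a cheap stand-in objective and a call counter; `CachingEvaluator`, `expensive_loss_and_grad`, and `calls` are all hypothetical names.

```python
import numpy as np

calls = {'count': 0}

def expensive_loss_and_grad(x):
    """Stand-in for fetch_loss_and_grads: loss and gradient of sum(x ** 2)."""
    calls['count'] += 1
    return float(np.sum(x ** 2)), 2.0 * x

class CachingEvaluator:
    def __init__(self):
        self.loss_value = None
        self.grad_values = None

    def loss(self, x):
        # One expensive call yields both values; keep the gradient for later
        self.loss_value, self.grad_values = expensive_loss_and_grad(x)
        return self.loss_value

    def grads(self, x):
        # Reuse the gradient computed by the preceding loss() call
        grad_values = np.copy(self.grad_values)
        self.loss_value = None
        self.grad_values = None
        return grad_values

ev = CachingEvaluator()
x0 = np.array([1.0, -2.0, 3.0])
loss_value = ev.loss(x0)
grad_value = ev.grads(x0)
```

The pattern relies on `grads` always being called right after `loss` with the same `x`. SciPy's documentation also describes passing a single callable that returns both the value and the gradient when `fprime` is omitted, which may be a simpler alternative to the cache.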

In [8]:
from scipy.optimize import fmin_l_bfgs_b
import imageio
import time

result_prefix = '/Users/bifnudozhao/Projects/ai-playground/results/neural_style_transfer/duomo'
iterations = 20

x = preprocess_image(target_image_path)
x = x.flatten()
for i in range(iterations):
    print('Start of iteration ', i)
    start_time = time.time()
    x, min_val, info = fmin_l_bfgs_b(evaluator.loss,
                                     x,
                                     fprime=evaluator.grads,
                                     maxfun=20)
    print('Current loss value ', min_val)
    img = x.copy().reshape((img_height, img_width, 3))
    img = deprocess_image(img)
    fname = result_prefix + '_at_iteration_%d.png' % i
    imageio.imwrite(fname, img)
    print('Image saved as ', fname)
    end_time = time.time()
    print('Iteration %d completed in %ds' % (i, end_time - start_time))

Start of iteration  0


Current loss value  1479477500.0
Image saved as  /Users/bifnudozhao/Projects/ai-playground/results/neural_style_transfer/duomo_at_iteration_0.png
Iteration 0 completed in 14s
Start of iteration  1
Current loss value  514530620.0
Image saved as  /Users/bifnudozhao/Projects/ai-playground/results/neural_style_transfer/duomo_at_iteration_1.png
Iteration 1 completed in 8s
Start of iteration  2
Current loss value  326447550.0
Image saved as  /Users/bifnudozhao/Projects/ai-playground/results/neural_style_transfer/duomo_at_iteration_2.png
Iteration 2 completed in 9s
Start of iteration  3
Current loss value  232816240.0
Image saved as  /Users/bifnudozhao/Projects/ai-playground/results/neural_style_transfer/duomo_at_iteration_3.png
Iteration 3 completed in 9s
Start of iteration  4
Current loss value  193397150.0
Image saved as  /Users/bifnudozhao/Projects/ai-playground/results/neural_style_transfer/duomo_at_iteration_4.png
Iteration 4 completed in 9s
Start of iteration  5
Current loss value  169