# Super-Resolution
Thank you to Jeremy Howard and Rachel Thomas of fast.ai for the code on which this program is based.

It uses the approach of Justin Johnson, Alexandre Alahi, and Li Fei-Fei described in the paper "Perceptual Losses for Real-Time Style Transfer and Super-Resolution" (https://arxiv.org/abs/1603.08155).

In [None]:
%matplotlib inline
import importlib
import utils2; importlib.reload(utils2)
from utils2 import *

from scipy.optimize import fmin_l_bfgs_b
from scipy.misc import imsave
from keras import metrics

from vgg16_avg import VGG16_Avg


In [None]:
# Cap this process at 80% of the GPU's RAM
#limit_mem()

config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.8
session = tf.Session(config=config)

In [None]:
cnmem= 0.8

In [None]:
path = './'
dpath = 'data/'

## Setup

We'll be using VGG16, so we need to subtract the per-channel mean of the ImageNet data and reverse the channel order from RGB to BGR. These are the preprocessing steps that the VGG authors used, and their model won't work unless we do the same thing.
We can do both in one step using broadcasting, which is a topic we'll be returning to many times during this course.

In [None]:
img = Image.open(os.path.join(path, 'CampusBuild7272.jpg'))
img

In [None]:
rn_mean = np.array([123.68, 116.779, 103.939], dtype=np.float32)
preproc = lambda x: (x - rn_mean)[:, :, :, ::-1]

When we generate images from this network, we'll need to undo the above preprocessing in order to view them.
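As a quick sanity check, the preprocessing and its inverse can be run on a small random batch using NumPy only. This is a minimal sketch: the two lambdas are copied from the surrounding cells, while the random `batch` is a hypothetical stand-in for real images.

```python
import numpy as np

# Per-channel ImageNet means in RGB order, as in the notebook.
rn_mean = np.array([123.68, 116.779, 103.939], dtype=np.float32)

# The two lambdas from the surrounding cells.
preproc = lambda x: (x - rn_mean)[:, :, :, ::-1]
deproc = lambda x, s: np.clip(x.reshape(s)[:, :, :, ::-1] + rn_mean, 0, 255)

# Hypothetical stand-in for a batch of two 4x4 RGB images.
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(2, 4, 4, 3)).astype(np.float32)

# rn_mean has shape (3,), so it broadcasts across the trailing channel axis.
pre = preproc(batch)

# After preprocessing, channel 0 holds the mean-centred blue values (BGR order).
blue_first = np.allclose(pre[..., 0], batch[..., 2] - rn_mean[2])

# deproc flips the channels back to RGB and adds the mean again.
restored = deproc(pre, batch.shape)
round_trip_ok = np.allclose(restored, batch, atol=1e-3)
print(blue_first, round_trip_ok)
```

Because `rn_mean` is subtracted before the channel flip, the inverse has to flip first and add the mean afterwards, which is the order `deproc` uses.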

In [None]:
deproc = lambda x,s: np.clip(x.reshape(s)[:, :, :, ::-1] + rn_mean, 0, 255)

## Recreate input

In [None]:
model = VGG16_Avg(include_top=False)

# Use content loss to create a super-resolution network

The following data will need to be downloaded from http://files.fast.ai/data/ into your dpath directory.

In [None]:
arr_lr = bcolz.open(dpath+'trn_resized_72.bc')[:] # slicing with [:] loads the bcolz array into memory as a NumPy array
arr_hr = bcolz.open(dpath+'trn_resized_288.bc')[:]

In [None]:
parms = {'verbose': 0, 'callbacks': [TQDMNotebookCallback(leave_inner=True)]}

To start, we'll define some of the building blocks of our network. In particular, recall the residual block (as used in ResNet), which is just a sequence of 2 convolutional layers whose output is added to the initial block input. We also have a de-convolutional layer (also known as a "transposed convolution" or "fractionally strided convolution"), whose purpose is to learn to "undo" the convolutional function. It does this by padding the smaller image with zeros, between and around its pixels, so that applying filters to it produces a larger image. The model further down uses the up_block variant instead, which upsamples first and then applies an ordinary convolution.
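To make the enlarging effect of a transposed convolution concrete, here is a minimal single-channel NumPy sketch. It uses the "scatter" view, a common equivalent formulation of the zero-padding description above: every input pixel stamps a scaled copy of the kernel onto the output. The function name `transposed_conv2d` is hypothetical, the kernel is assumed square, and no padding or cropping is applied, so the output size is `(n - 1) * stride + k` rather than the `n * stride` that Keras produces with `padding='same'`.

```python
import numpy as np

def transposed_conv2d(x, kernel, stride=2):
    """Single-channel transposed convolution with no padding (illustrative only)."""
    n_rows, n_cols = x.shape
    k = kernel.shape[0]  # assumes a square kernel
    out = np.zeros(((n_rows - 1) * stride + k, (n_cols - 1) * stride + k))
    for i in range(n_rows):
        for j in range(n_cols):
            # Each input pixel adds a scaled copy of the kernel to the output.
            r, c = i * stride, j * stride
            out[r:r + k, c:c + k] += x[i, j] * kernel
    return out

x = np.ones((2, 2))
kernel = np.ones((3, 3))
y = transposed_conv2d(x, kernel)
print(y.shape)  # (5, 5)
print(y)
```

Where neighbouring stamps overlap, their contributions add up, which is why the centre of `y` is larger than its corners.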

In [None]:
def conv_block(x, numFilters, size, stride=(2,2), mode='same', act=True):
    x = Conv2D(numFilters, size, strides=stride, padding=mode)(x)
    x = BatchNormalization()(x)
    return Activation('relu')(x) if act else x

In [None]:
def res_block(ip, numFilters=64):
    x = conv_block(ip, numFilters, 3, (1,1))
    x = conv_block(x, numFilters, 3, (1,1), act=False)
    return add([x, ip])

In [None]:
def deconv_block(x, numFilters, size, shape, stride=(2,2)):
    x = Conv2DTranspose(numFilters, (size, size), strides=stride, 
        padding='same')(x) #, output_shape=(None,)+shape
    x = BatchNormalization()(x)
    return Activation('relu')(x)

In [None]:
def up_block(x, numFilters, size):
    x = keras.layers.UpSampling2D()(x)
    x = Conv2D(numFilters, (size, size), padding='same')(x)
    x = BatchNormalization()(x)
    return Activation('relu')(x)

In [None]:
### this is a simple clipping layer which in some of our experiments replaces the tanh.
def clipLayer(x):    
    return keras.backend.clip(x, -1,1)

The model below uses the previously defined blocks to encode a low-resolution image and then upsample it to match the same image at high resolution.
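The shape arithmetic of the model can be checked without Keras. The sketch below assumes 72x72 inputs and 288x288 targets (as the file names and `shp` suggest) and mimics the default nearest-neighbour behaviour of `UpSampling2D` with `np.repeat`; `upsample_nearest` is a hypothetical helper, not part of the notebook.

```python
import numpy as np

def upsample_nearest(x, factor=2):
    """Repeat every pixel `factor` times along height and width (NHWC layout)."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

# Hypothetical stand-in for one 72x72 RGB input.
lr = np.zeros((1, 72, 72, 3), dtype=np.float32)

# Two up_blocks each double the spatial size: 72 -> 144 -> 288.
hr = upsample_nearest(upsample_nearest(lr))
print(hr.shape)  # (1, 288, 288, 3)

# tanh outputs lie in [-1, 1]; (x + 1) * 127.5 maps them onto [0, 255].
tanh_out = np.array([-1.0, 0.0, 1.0])
pixels = (tanh_out + 1) * 127.5
print(pixels)  # values 0, 127.5 and 255
```

The residual blocks and convolutions all use `padding='same'` with stride 1, so only the two upsampling steps change the spatial size.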

In [None]:
from keras import layers

inp=Input(arr_lr.shape[1:])
conv_in=conv_block(inp, 64, 9, (1,1))
x=res_block(conv_in)
for i in range(3): x=res_block(x)
x=up_block(x, 64, 3)
x=up_block(x, 64, 3)
#x=Conv2D(3, (9, 9), activation='tanh', padding='same')(x)


x = layers.Conv2D(3, 9, padding='same', kernel_initializer='random_normal',
                bias_initializer='zeros')(x)

### uncomment/comment the following lines for the combination of BN, clipping, tanh you wish to test.
#x = layers.BatchNormalization()(x)
#x=layers.Lambda(clipLayer, tuple(list((288,288,3))))(x)
x = Activation('tanh')(x)
outp=Lambda(lambda x: (x+1)*127.5)(x)

The idea here is that we feed two images to VGG16 and compare their convolutional outputs at some layer. These two images are the target image (which in our case is the same as the original but at higher resolution) and the output of the network we just defined, which we hope will learn to output a high-resolution image.
The key then is to train this other network to produce an image that minimizes the difference between the two sets of outputs at some convolutional layer in VGG16 (which the paper refers to as "perceptual loss"). In doing so, we are able to train a network that can upsample an image and recreate the higher-resolution details.
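The difference between a pixel-space loss and a feature-space loss can be illustrated with a toy example. In the sketch below, `edge_features` is a hypothetical stand-in for a VGG layer (simple horizontal and vertical differences), chosen only because it runs without a trained network. It is not how VGG features behave in general; the point is only that the loss is measured on features rather than on raw pixels.

```python
import numpy as np

def edge_features(img):
    """Hypothetical stand-in for a VGG layer: horizontal and vertical differences."""
    dx = img[:, 1:] - img[:, :-1]
    dy = img[1:, :] - img[:-1, :]
    return dx, dy

def feature_loss(a, b):
    """Mean squared difference between the stand-in features of two images."""
    pairs = zip(edge_features(a), edge_features(b))
    return sum(np.mean((fa - fb) ** 2) for fa, fb in pairs)

def pixel_mse(a, b):
    """Plain mean squared error on raw pixel values."""
    return np.mean((a - b) ** 2)

rng = np.random.default_rng(0)
target = rng.random((8, 8))
brighter = target + 0.1                                # same structure, shifted brightness
smeared = (target + np.roll(target, 1, axis=1)) / 2    # structure changed

print(pixel_mse(target, brighter), feature_loss(target, brighter))
print(pixel_mse(target, smeared), feature_loss(target, smeared))
```

With these stand-in features, a uniform brightness shift costs nothing in feature space even though the pixel error is non-zero, while smearing the image is penalised by both losses.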

In [None]:
shp = (288, 288, 3)

In [None]:

vgg_inp=Input(shp)
vgg= VGG16(include_top=False, input_tensor=Lambda(preproc)(vgg_inp)) #Lambda, turns the function into a layer of the network.

Since we only want to learn the "upsampling network", and are just using VGG to calculate the loss function, we set the VGG layers to not be trainable.

In [None]:
for l in vgg.layers: l.trainable=False 

An important difference in training for super resolution is the loss function. We use what's known as a perceptual loss function (which is simply the content loss for some layer).

Here we take the content (perceptual) output from the first convolutional layer of each of the five VGG blocks, although the weights defined below are non-zero only for blocks 1, 2 and 3. Remember that layers lower in the network maintain content detail better than upper layers like 4 and 5.
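The weighting across layers can be sketched in NumPy. This mirrors the structure of the `mean_sqr_b` and `content_fn` cells that follow, but on made-up arrays: the feature-map shapes and values are hypothetical, and unlike `mean_sqr_b` the sketch does not add a leading axis to the result.

```python
import numpy as np

def rms_per_sample(diff):
    """Root-mean-square over every axis except the batch axis."""
    axes = tuple(range(1, diff.ndim))
    return np.sqrt(np.mean(diff ** 2, axis=axes))

def weighted_content_loss(target_feats, pred_feats, weights):
    """Weighted sum of per-layer RMS differences; a zero weight drops a layer."""
    feats = target_feats + pred_feats      # list concatenation, not addition
    n = len(weights)
    return sum(w * rms_per_sample(feats[i] - feats[i + n])
               for i, w in enumerate(weights))

# Hypothetical feature maps for a batch of 2 at five layers (shapes are made up).
shapes = [(2, 8, 8, 4), (2, 4, 4, 8), (2, 2, 2, 16), (2, 1, 1, 32), (2, 1, 1, 32)]
target_feats = [np.zeros(s) for s in shapes]
pred_feats = [np.full(s, float(k + 1)) for k, s in enumerate(shapes)]

w = [0.1, 0.8, 0.1, 0, 0]
loss = weighted_content_loss(target_feats, pred_feats, w)
print(loss)  # one value per sample
```

Here layer `k` differs from its target by exactly `k + 1` everywhere, so each sample's loss is `0.1*1 + 0.8*2 + 0.1*3`, and the last two layers contribute nothing.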


In [None]:
def get_outp(m, ln): return m.get_layer('block'+ str(ln)+'_conv1').output

# vgg_content exposes the conv1 output of each of the five VGG blocks. We can treat it
# like a network and get those outputs by sending in an input.
vgg_content = Model(vgg_inp, [get_outp(vgg, o) for o in [1,2,3,4,5]])
vgg1 = vgg_content(vgg_inp) #Send in the high res target 
vgg2 = vgg_content(outp) # send in the generated high res prediction, note this is the output of our other model that we do train

In [None]:
def mean_sqr_b(diff): 
    print("mean_sqr")
    dims = list(range(1,K.ndim(diff)))
    return K.expand_dims(K.sqrt(K.mean(diff**2, dims)), 0)

In [None]:

w=[1.0/10, 8.0/10, 1.0/10,0,0] #Taking 80% of the content of layer 2 but only 10% of the content of layers 1 and 3; layers 4 and 5 get zero weight
def content_fn(x): 
    print("content_fn")
    res = 0; n=len(w)
    for i in range(n): 
        res += mean_sqr_b(x[i]-x[i+n]) * w[i]
        print(res, i)
    return res

In [None]:
vgg1+vgg2

In [None]:
#vgg1+vgg2 is not a sum; it is just a concatenation of the two lists of layer outputs

m_sr = Model([inp, vgg_inp], Lambda(content_fn)(vgg1+vgg2))
targ = np.zeros((arr_hr.shape[0], 1))

In [None]:
m_sr.summary() ###Super-resolution network summary

# Initializing the weights of the final BN Layer
If you wish to set the initial weights of the final BatchNorm layer (the one you get by uncommenting the BatchNormalization line in the model cell), you can do so by uncommenting the first line in the cell below.
Keras stores a BatchNorm layer's weights in the order gamma (scale), beta (offset), moving mean, moving variance. I have initialized gamma and beta to the RGB standard deviation and RGB mean of the target distribution, and set the moving mean and moving variance to 0 and 1 respectively.
The Keras BatchNorm implementation is here for reference:
https://github.com/keras-team/keras/blob/master/keras/layers/normalization.py
If you change the architecture you will have to determine the correct layer number; using m_sr.summary() you can count your way down to the appropriate BN layer.
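To see what these initial values do, here is a minimal NumPy sketch of batch normalisation as it is applied at inference time, using the moving statistics. The `eps` of 1e-3 matches the Keras default; the random activations `x` are a hypothetical stand-in for the output of the preceding layer. During training, Keras normalises with the statistics of the current batch instead, so this only approximates the effect at the start of training.

```python
import numpy as np

def batchnorm_inference(x, gamma, beta, moving_mean, moving_var, eps=1e-3):
    """Per-channel batch normalisation as applied at inference time."""
    return gamma * (x - moving_mean) / np.sqrt(moving_var + eps) + beta

# The four arrays from the commented-out set_weights call, in Keras order.
gamma = np.array([0.606, 0.585, 0.595])
beta = np.array([-0.142, -0.1866, -0.27282])
moving_mean = np.zeros(3)
moving_var = np.ones(3)

# Hypothetical pre-BN activations: roughly zero mean, unit variance per channel.
rng = np.random.default_rng(0)
x = rng.standard_normal((10000, 3))

y = batchnorm_inference(x, gamma, beta, moving_mean, moving_var)
print(y.mean(axis=0))  # close to beta
print(y.std(axis=0))   # close to gamma
```

With a moving mean of 0 and a moving variance of 1 the layer reduces to roughly `gamma * x + beta`, so its output starts out with the per-channel mean and spread of the target images.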

In [None]:
#m_sr.layers[37].set_weights(np.array([[0.606, 0.585, 0.595],[-0.142, -0.1866, -0.27282], [0,0,0], [1,1,1]]))

### you may wish to freeze the weights of the final BN layer. If so, you can do that with the following line.
#m_sr.layers[37].trainable = False


# A note on the values used above

In Jeremy's data, images are resized to fit a square without changing their original aspect ratio. This means that the bottom portion of each image is padded with colour values of [0,0,0]. We have chosen to calculate the statistics using only the valid colour values. Using all values darkened the images in the initial part of training and seemed to require the network to learn its way back to appropriate values.
The [0,0,0] rows appear to start at row 215.
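The effect of the padding on the statistics can be reproduced on synthetic data. In this sketch `arr` is a hypothetical stand-in for `arr_hr`, filled with random pixel values in the first 215 rows and zeros below, and the per-channel statistics are computed in one vectorised call per quantity rather than channel by channel.

```python
import numpy as np

# Hypothetical stand-in for arr_hr: 4 images, 288x288 RGB, rows 215+ are black padding.
rng = np.random.default_rng(0)
arr = np.zeros((4, 288, 288, 3), dtype=np.float32)
arr[:, :215] = rng.integers(0, 256, size=(4, 215, 288, 3))

valid = arr[:, :215]  # drop the padded rows

# Per-channel mean and standard deviation, rescaled to the tanh range [-1, 1].
mean_valid = valid.mean(axis=(0, 1, 2)) / 127.5 - 1
std_valid = valid.std(axis=(0, 1, 2)) / 127.5

# Including the padding drags the mean towards -1 (black).
mean_all = arr.mean(axis=(0, 1, 2)) / 127.5 - 1
print(mean_valid, std_valid, mean_all)
```

On this synthetic data the valid-row mean sits near 0 while the all-row mean is clearly negative, which is the darkening described above.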

In [None]:
print(arr_hr[:,0:215,:,0].mean()/127.5-1)
print(arr_hr[:,0:215,:,1].mean()/127.5-1)
print(arr_hr[:,0:215,:,2].mean()/127.5-1)
print(math.sqrt(arr_hr[:,0:215,:,0].var())/127.5)
print(math.sqrt(arr_hr[:,0:215,:,1].var())/127.5)
print(math.sqrt(arr_hr[:,0:215,:,2].var())/127.5)

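Finally, we compile this chain of models so that we can pass it the original low-resolution images together with their high-resolution versions to train on. We also define a zero vector as the target, which Keras requires when calling fit or train_on_batch on a model. Since the model's output is already the content loss, training against zeros simply pushes that loss towards zero.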
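The zero-target trick is easy to verify by hand. In the sketch below, `content_loss` holds hypothetical per-sample values of the kind the model's final Lambda layer would emit; the numbers are made up for illustration.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error, the loss named in the compile call."""
    return np.mean((y_true - y_pred) ** 2)

# Hypothetical per-sample content losses emitted by the model for a batch of 8.
content_loss = np.array([3.0, 1.0, 2.0, 0.5, 4.0, 1.5, 2.5, 0.0])
targ = np.zeros_like(content_loss)

# Against a zero target, MSE is just the mean of the squared content losses.
batch_loss = mse(targ, content_loss)
print(batch_loss)
print(np.mean(content_loss ** 2))
```

Minimising this quantity therefore minimises the content loss itself, with larger per-sample losses weighted more heavily because of the squaring.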

In [None]:
from tqdm import tqdm
import time
from time import sleep
from tqdm import trange
start = 0
batch_size = 8
m_sr.compile('adam', 'mse')
targ = np.zeros((batch_size, 1))
targ.shape


iterations = 2001
t =  trange(iterations, desc='')
top_model = Model(inp, outp) #This is the generator only and we can use this to test inference.

current_step = 0
img=Image.open((os.path.join(path, 'CampusBuild7272.jpg')))
img_arr = np.expand_dims(np.array(img), 0)
K.set_value(m_sr.optimizer.lr, 1e-4)
loss_curve = []

In [None]:

t =  trange(iterations, desc='')
for i in t:
    stop = start + batch_size
    
    lr_batch = arr_lr[start: stop]
    hr_batch = arr_hr[start: stop]
    
    loss = m_sr.train_on_batch([lr_batch, hr_batch], targ)
    
    start += batch_size
    
    if start > len(arr_hr) - batch_size:  # wrap around once the next batch would run past the last sample
        start = 0
    
    t.set_description('Loss %.4f' % loss)
    if current_step%50 == 0:
        p = top_model.predict(img_arr)
    
        imsave(os.path.join(path, 'out/sr_' + str(current_step)+'.png'), p[0])
    loss_curve.append(loss)
    current_step += 1

In [None]:
import csv
with open(os.path.join(path, 'out/loss.csv'), 'w', newline='') as myfile:
    wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)
    wr.writerow(loss_curve)  # one row holding the loss at every training step