# **Single Image Super-Resolution** 
**Joshua Lo (Peer Mentor), Justin Ashbaugh, Jared Habermehl, Allen Tu, Addison Waller**

Super-resolution is the process of recovering a high resolution (HR) image or video from its low resolution (LR) counterpart. It has a myriad of applications in many fields, including autonomous vehicles, medical imaging, security, and entertainment. 
![diagram](https://www.mathworks.com/help/examples/deeplearning_shared/win64/VeryDeepSuperResolutionDeepLearningExample_01.png)
Machine learning super-resolution uses a model trained on a dataset of images to predict additional pixels from an LR image input, essentially "filling in" the gaps between the pixels of the LR image to create an HR output. We refer to a recovered HR image as a super-resolved (SR) image. An SR image has more pixels than the LR image that it was created from, so it contains more information and will appear clearer due to its higher pixel density.

This notebook demonstrates our single image super-resolution (SISR) program and upscales a still image to 3 times its original size. Our full project can be found on our team's [GitHub](https://github.com/umd-fire-coml/2020-Image-Super-Resolution) repository. It is written in TensorFlow and uses the following imports.

In [None]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, Model
from tensorflow.keras.utils import Sequence  # public API path for Sequence
import numpy as np
import math
import os
from sys import maxsize
import random
import requests
import imageio
import matplotlib.pyplot as plt

## Designing the Model
Our [model](https://github.com/umd-fire-coml/2020-Image-Super-Resolution/blob/master/model.md) uses a series of convolutional layers to extract, or learn, information from the LR image. Then, it combines the data that it collected to create the SR image.  

![architecture](https://miro.medium.com/max/4902/1*n4cXo7DASn1_HEGrDNJVFg.png)

In technical terms, this is a seven-layer [Efficient Sub-Pixel Convolutional Neural Network (ESPCN)](https://arxiv.org/pdf/1609.05158.pdf) SISR model: it takes an LR image as input, extracts LR feature maps through six convolutional layers, then applies a sub-pixel convolution layer to assemble the LR feature maps into an HR image output. You can learn more about our model and how we arrived at it in its [documentation](https://github.com/umd-fire-coml/2020-Image-Super-Resolution/blob/master/model.md). 


In [None]:
# ESPCN SISR model: 6 convolutional layers followed by a sub-pixel layer
def espcn_model(r, channels = 3):
    # Arguments for Conv2D
    conv_args = {
      "activation": "relu",
      "padding" : "same",
    }
    # Input
    inputs = keras.Input(shape=(None, None, channels))
    # Feature Maps Extraction
    conv1 = layers.Conv2D(64, 5, **conv_args)(inputs)
    conv2 = layers.Conv2D(64, 3, **conv_args)(conv1)
    conv3 = layers.Conv2D(32, 3, **conv_args)(conv2)
    conv4 = layers.Conv2D(32, 3, **conv_args)(conv3)
    conv5 = layers.Conv2D(32, 3, **conv_args)(conv4)
    conv6 = layers.Conv2D(channels*(r*r), 3, **conv_args)(conv5)
    # Efficient Sub-Pixel Convolutional Layer
    outputs = tf.nn.depth_to_space(conv6, r)
    return Model(inputs, outputs)

The upscale factor, `r`, represents how much the model will upscale the LR image. For example, `r=3` below, so our SR image will be 3 times taller and wider in pixels than the LR image. It will also look sharper than the LR image if they are displayed at the same size because it has 9 times as many pixels. 
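
To make the effect of `r` concrete, here is a minimal NumPy sketch of the rearrangement that `tf.nn.depth_to_space` performs on the output of the last convolutional layer. The helper name `depth_to_space_np` and the toy inputs are ours, chosen for illustration; the notebook itself relies on TensorFlow's implementation.

```python
import numpy as np

def depth_to_space_np(x, r):
    """Rearrange (batch, H, W, C*r*r) into (batch, H*r, W*r, C).

    Illustrative NumPy version of the sub-pixel shuffle for NHWC arrays;
    the model above uses tf.nn.depth_to_space instead.
    """
    b, h, w, d = x.shape
    c = d // (r * r)
    x = x.reshape(b, h, w, r, r, c)      # split depth into an r x r block per LR pixel
    x = x.transpose(0, 1, 3, 2, 4, 5)    # interleave block rows/columns with H and W
    return x.reshape(b, h * r, w * r, c)

# One LR "pixel" holding 4 feature values becomes a 2 x 2 block of output pixels
toy = np.arange(4).reshape(1, 1, 1, 4)
print(depth_to_space_np(toy, 2)[0, :, :, 0])

# With r = 3 and 3 color channels, the last Conv2D emits 27 feature maps per LR pixel
features = np.zeros((1, 10, 12, 27), dtype="float32")
print(depth_to_space_np(features, 3).shape)  # (1, 30, 36, 3)
```

This is why the last `Conv2D` layer has `channels*(r*r)` filters: each LR pixel needs one value for every color channel of every pixel in its `r` by `r` output block.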

Training an effective SISR model takes hours, so to save you (a lot of) time, we [pre-trained our model](https://github.com/umd-fire-coml/2020-Image-Super-Resolution/blob/master/training.ipynb) on a [dataset of 900 images](https://data.vision.ee.ethz.ch/cvl/DIV2K/) for 100 epochs. We'll load those saved weights in the next section. For now, we compile the model.



In [None]:
r = 3 # Upscale Factor 

# Compile model
opt = tf.keras.optimizers.Adam(learning_rate=0.001)
def PSNR(y_true, y_pred):
    max_pixel = 1.0
    return tf.image.psnr(y_true, y_pred, max_val=max_pixel)
model = espcn_model(r)
model.compile(optimizer=opt, loss='mse', metrics=[PSNR])
model.summary()

## Loading and Generating Data
Now, we load the aforementioned pre-trained weights into our model. The `testing_dict()` function finds all of the images in our testing dataset and creates a dictionary matching each unique image to its scales. We also have a [`training_dict()`](https://github.com/umd-fire-coml/2020-Image-Super-Resolution/blob/master/dictionary.py) function, but we left it out here because we already trained the model. The training and testing datasets all support `r=2`, `3`, and `4`.
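
As a rough illustration of the dictionary's shape, the sketch below builds the entry for a single image using the same path layout as `testing_dict()`. The helper `image_entry` and the file name `example.png` are hypothetical and exist only for this example.

```python
def image_entry(dataset, image_name, data_directory="data/datasets/"):
    """Build the scale -> filepath mapping for one image (illustrative helper)."""
    dataset_directory = data_directory + dataset + "/"
    entry = {"original": dataset_directory + "original/" + image_name}
    for scale in ["LRbicx2", "LRbicx3", "LRbicx4"]:
        entry[scale] = dataset_directory + scale + "/" + image_name
    return entry

# 'example.png' is a made-up file name
entry = image_entry("Set5", "example.png")
for scale, path in entry.items():
    print(scale, "->", path)
```

With `r=3`, the data generator reads the HR image from the `'original'` path and the matching LR image from the `'LRbicx3'` path of each entry.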

In [None]:
# Load pre-trained weights
filepath = "model/weights/r3bs10epochs100weights.h5"
model.load_weights(filepath)

# Returns a dictionary containing all classical SR filepaths
def testing_dict():
    data_directory = 'data/datasets/'
    datasets = ['BSDS100', 'BSDS200', 'General100', 'historical', 'Set5', 'Set14', 'T91', 'urban100', 'manga109']
    scales = ['LRbicx2', 'LRbicx3', 'LRbicx4']

    # key = image name without directory path
    # value = dict of filepaths of original and scaled images
    images = {}
    # Build images dict for all classical SR images
    for dataset in datasets:
        dataset_directory = data_directory + dataset + '/'
        # Get list of all image names
        image_names = os.listdir(dataset_directory + 'original')
        for image_name in image_names:
            # image_scales dict for storing filepaths of original and scaled images
            # key = scale
            # value = filepath of image
            image_scales = {}
            image_scales['original'] = dataset_directory + 'original/' + image_name
            for scale in scales:
                image_scales[scale] = dataset_directory + scale + '/' + image_name
            # image name points to dictionary of scales
            images[image_name] = image_scales
    return images

num_images = len(testing_dict()) 
print("Number of Images in Testing Dataset: " + str(num_images))

A machine learning program's data generator feeds information to the model during training. Our [`DataGenerator`](https://github.com/umd-fire-coml/2020-Image-Super-Resolution/blob/master/datagenerator.py) loads LR and HR image pairs from the dictionaries, processes them so that they are compatible with the model, and then outputs batches of data as arrays.

We're using saved weights, so the generator won't be producing batches for training here. Instead, `testing_generator` outputs batches that each contain a single LR and HR image pair from the testing dataset, as this code block demonstrates.
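
Images in a batch must share the same dimensions before they can be stacked into one array, so the generator center-crops every LR image to the smallest LR height and width in the batch and every HR image to `r` times those values. Here is a small standalone sketch of that cropping step. The function mirrors the `crop_center` method of `DataGenerator`, and the array sizes are made up for illustration.

```python
import numpy as np

def crop_center(img, min_width, min_height):
    """Crop an (H, W, C) array around its center to (min_height, min_width, C)."""
    height, width = img.shape[0], img.shape[1]
    left = int(np.ceil((width - min_width) / 2))
    top = int(np.ceil((height - min_height) / 2))
    return img[top:top + min_height, left:left + min_width, ...]

r = 3
# Two made-up LR images of different sizes and their HR counterparts (r times larger)
lr_images = [np.zeros((40, 60, 3)), np.zeros((50, 48, 3))]
hr_images = [np.zeros((120, 180, 3)), np.zeros((150, 144, 3))]

# Smallest LR height and width in the batch
min_h = min(img.shape[0] for img in lr_images)
min_w = min(img.shape[1] for img in lr_images)

lr_batch = np.asarray([crop_center(img, min_w, min_h) for img in lr_images])
hr_batch = np.asarray([crop_center(img, r * min_w, r * min_h) for img in hr_images])
print(lr_batch.shape, hr_batch.shape)  # (2, 40, 48, 3) (2, 120, 144, 3)
```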

In [None]:
# Generates batches of LR and HR pairs
class DataGenerator(Sequence):
    def __init__(self, scale, batch_size, dictionary = "train", shuffle=True):
        'Initialization'
        if dictionary == "test":
          self.images = testing_dict()
        else:
          self.images = training_dict()
        self.scale = scale
        self.r = int(scale[-1])
        self.list_IDs = list(self.images.keys())
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.on_epoch_end()

    def __len__(self):
        'denotes the number of batches per epoch'
        return int(np.floor(len(self.list_IDs) / self.batch_size))

    def __getitem__(self, index):
        'Makes one batch of data'
        indexes = self.indexes[index*self.batch_size: (index+1)*self.batch_size] 
        list_IDs_temp = [self.list_IDs[k] for k in indexes] 
        # generate data
        X = self.__data_generation(list_IDs_temp)
        return X

    def on_epoch_end(self):
        'Updates indexes after each epoch'
        self.indexes = np.arange(len(self.list_IDs))
        if self.shuffle:
            np.random.shuffle(self.indexes)

    def __data_generation(self, list_IDs_temp):
        'Generates data containing batch_size samples' 
        LR = []
        HR = []
        min_height_LR = maxsize
        min_width_LR = maxsize
        # Append images as arrays to LR and HR
        for ID in list_IDs_temp:
            low_res = keras.preprocessing.image.load_img(self.images[ID][self.scale])
            high_res = keras.preprocessing.image.load_img(self.images[ID]['original'])
            low_res = np.asarray(low_res)
            high_res = np.asarray(high_res)
            low_res = low_res.astype('float32')
            high_res = high_res.astype('float32')
            # Normalize images to [0,1]
            low_res /= 255.0
            high_res /= 255.0
            LR.append(low_res)
            HR.append(high_res)
            # Find the minimum LR dimensions 
            min_height_LR = min(min_height_LR, low_res.shape[0])
            min_width_LR = min(min_width_LR, low_res.shape[1])
        # HR/SR image is bigger by a factor of r
        min_height_HR = self.r * min_height_LR
        min_width_HR = self.r * min_width_LR
        for i in range (0, len(LR)):
            # Crop LR and HR images to have the same dimensions 
            LR[i] = self.crop_center(LR[i], min_width_LR, min_height_LR)
            HR[i] = self.crop_center(HR[i], min_width_HR, min_height_HR)
        LR = np.asarray(LR)
        HR = np.asarray(HR)    
        return LR, HR 
    
    def crop_center(self, img, min_width, min_height):        
        'Crops image around the center given minimum width and height'
        width = img.shape[1]
        height = img.shape[0]
        # Calculates new boundaries around the center
        left = int(np.ceil((width - min_width) / 2))
        right = left + min_width
        top = int(np.ceil((height - min_height) / 2))
        bottom = top + min_height
        # Crop original image
        cropped_img = img[top:bottom, left:right, ...]
        return cropped_img

testing_generator = DataGenerator('LRbicx' + str(r), batch_size = 1, dictionary = "test")

# Display a random LR, HR pair
# Pick any valid batch index; randint includes both endpoints
lr, hr = testing_generator[random.randint(0, len(testing_generator) - 1)]
fig = plt.figure()
fig.set_size_inches(10, 10)
ax1 = fig.add_subplot(1,2,1)
ax1.set_title('Low Resolution (LR): ' + str(lr[0].shape[0]) + ' x ' + str(lr[0].shape[1]) + ' pixels')
ax1.imshow(lr[0])
ax2 = fig.add_subplot(1,2,2)
ax2.set_title('High Resolution (HR): ' + str(hr[0].shape[0]) + ' x ' + str(hr[0].shape[1]) + ' pixels')
ax2.imshow(hr[0])
plt.show()

## Testing and Performing Single Image Super-Resolution
Now, we have what we need to perform SISR. We randomly draw an LR and HR pair from the testing dataset using the data generator. Then, we use the trained model to predict an SR image from the LR image. Finally, we display the three images and their dimensions side-by-side to qualitatively compare them to each other. 

The [Peak Signal to Noise Ratio (PSNR)](https://github.com/umd-fire-coml/2020-Image-Super-Resolution/blob/master/psnr.py) is a quantitative measure, expressed in decibels, of how closely a prediction matches its ground truth; the higher the PSNR, the higher the quality of the prediction. We calculate the PSNR between the SR image and the HR image.
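
For images scaled to [0, 1], PSNR can be computed from the mean squared error (MSE) as `20 * log10(1 / sqrt(MSE))`. As a quick worked example with made-up data, two images that differ by 0.1 at every pixel have an MSE of 0.01 and therefore a PSNR of about 20 dB. The function name `psnr_example` is ours, used only for this sketch.

```python
import math
import numpy as np

def psnr_example(reference, prediction, max_pixel=1.0):
    """PSNR in decibels between two arrays with values in [0, max_pixel]."""
    mse = np.mean((reference.astype(float) - prediction.astype(float)) ** 2)
    return 20 * math.log10(max_pixel / math.sqrt(mse))

reference = np.full((4, 4, 3), 0.5)
prediction = np.full((4, 4, 3), 0.6)  # off by 0.1 at every pixel
print(round(psnr_example(reference, prediction), 2))  # 20.0
```

Because the scale is logarithmic, every tenfold reduction in MSE adds 10 dB to the score.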

**Try running this code block multiple times to see how well the model performs with different images.**

In [None]:
# Generate random LR, HR and predict SR
lr, hr = testing_generator[random.randint(0, len(testing_generator) - 1)]
sr = model.predict(lr)

# Display Images Side by Side
fig = plt.figure()
fig.set_size_inches(28, 28)
ax1 = fig.add_subplot(1,3,1)
ax1.set_title('Low Resolution (LR): ' + str(lr[0].shape[0]) + ' x ' + str(lr[0].shape[1]) + ' pixels')
ax1.imshow(lr[0])
ax2 = fig.add_subplot(1,3,2)
ax2.set_title('Super Resolution (SR): ' + str(sr[0].shape[0]) + ' x ' + str(sr[0].shape[1]) + ' pixels')
ax2.imshow(sr[0])
ax3 = fig.add_subplot(1,3,3)
ax3.set_title('High Resolution (HR): ' + str(hr[0].shape[0]) + ' x ' + str(hr[0].shape[1]) + ' pixels')
ax3.imshow(hr[0])
plt.show()

# Peak Signal to Noise Ratio
def psnr(oldimg, newimg):
    mse = np.mean((oldimg.astype(float) - newimg.astype(float)) ** 2)
    if mse != 0:
        max_pixel = 1.0
        return 20 * math.log10(max_pixel / math.sqrt(mse))
    else:
        # Identical images have zero error, so the PSNR is unbounded
        return math.inf
print("PSNR(SR, HR): " + str(psnr(hr[0], sr[0])))

## Results and Further Improvements
As you can see from your results and the example below, the SR image generated by our model is both qualitatively and quantitatively higher quality than the LR image. The PSNR between the SR and HR images is always greater than 10 dB, indicating that they are very similar.
![example](https://i.imgur.com/ToXzT1w.png)

The results are decent, but it is visibly apparent that the SR image is not quite on par with the HR image. Therefore, there is room for improvement:
* Create a model with more layers in hopes of capturing more information during training. This is called creating a deeper model.
* Add more images to the training dataset to give the model more examples to learn from.
* Train for more epochs (train this model longer). More information can be captured this way, but it is possible to overfit a model and negatively impact your results.  
* Fine tune the convolutional layers in `espcn_model` through experimentation to more effectively capture information from the LR images in the training dataset. Since we chose the parameters for `Conv2D` as estimates, this is our most promising next step. 

Improvements to the program would lead to even clearer SR images with higher PSNR scores.
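
When experimenting with deeper or differently sized layers, it can help to estimate how many trainable weights each configuration has. The sketch below counts parameters for a chain of `Conv2D` layers using the standard formula `(kernel * kernel * in_channels + 1) * filters`. The helper `conv_params` and the deeper variant are hypothetical examples, not configurations we trained.

```python
def conv_params(layer_specs, in_channels=3):
    """Count weights and biases for a chain of square-kernel Conv2D layers.

    layer_specs is a list of (filters, kernel_size) tuples.
    """
    total = 0
    for filters, kernel in layer_specs:
        total += (kernel * kernel * in_channels + 1) * filters
        in_channels = filters  # the next layer consumes this layer's output
    return total

r, channels = 3, 3
# Same layer sizes as espcn_model
current = [(64, 5), (64, 3), (32, 3), (32, 3), (32, 3), (channels * r * r, 3)]
# Hypothetical deeper variant with two extra 32-filter layers before the last one
deeper = current[:5] + [(32, 3), (32, 3)] + current[5:]

print(conv_params(current))  # 86555
print(conv_params(deeper))   # 105051
```

The first number should correspond to the parameter total reported by `model.summary()`, since the sub-pixel layer only rearranges values and has no weights of its own.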



## Try it Yourself
This model doesn't just work with images generated by our generator; you can use it on your own images as well! Change `url` to the image address of your image, then run the code to super-resolve it. 
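
Images from the web come in different layouts (grayscale, RGB, or RGBA with a transparency channel), while the model expects a batch of 3-channel float images in [0, 1]. The helper below, `to_model_input`, is a hypothetical sketch of how such images could be normalized before prediction. It assumes 8-bit input and is not part of our repository.

```python
import numpy as np

def to_model_input(image):
    """Convert an 8-bit image array to a (1, H, W, 3) float32 batch in [0, 1]."""
    image = np.asarray(image)
    if image.ndim == 2:
        # Grayscale: repeat the single channel three times
        image = np.stack([image] * 3, axis=-1)
    elif image.shape[-1] == 4:
        # RGBA: drop the alpha (transparency) channel
        image = image[..., :3]
    image = image.astype("float32") / 255.0
    return image[np.newaxis, ...]  # add the batch dimension

rgba = np.full((20, 30, 4), 255, dtype="uint8")  # made-up RGBA image
batch = to_model_input(rgba)
print(batch.shape, batch.dtype, batch.max())
```

The returned array could then be passed to `model.predict` in place of `LR`.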

In [None]:
url = "https://i.imgur.com/oIm04AT.png"
# imageio reads the image directly from the URL
image = imageio.imread(url)


# Load image and convert it to compatible LR format
image = np.asarray(image)
image = image.astype('float32')
image /= 255.0
LR = [image]
LR = np.asarray(LR)
# Predict SR image
SR = model.predict(LR)
# Display images side by side
fig = plt.figure()
fig.set_size_inches(20, 20)
ax1 = fig.add_subplot(1,2,1)
ax1.set_title('Low Resolution (LR): ' + str(LR[0].shape[0]) + ' x ' + str(LR[0].shape[1]) + ' pixels')
ax1.imshow(LR[0])
ax2 = fig.add_subplot(1,2,2)
ax2.set_title('Super Resolution (SR): ' + str(SR[0].shape[0]) + ' x ' + str(SR[0].shape[1]) + ' pixels')
ax2.imshow(SR[0])
plt.show()