# PA228 Project - machine learning in image processing

Author: Petr Kadlec, UČO: 485208

## Loading the dataset:

The basic structure of this code was taken from https://keras.io/examples/vision/super_resolution_sub_pixel/.

The definition of the EDSR model follows the ideas from the paper found here: https://arxiv.org/pdf/1707.02921.pdf

In this project we are using three models:

- baseline:           As a baseline we use the _rescale_ function from skimage.
- basic model:        This model consists of only a few convolutional layers; it serves as a comparison to check whether a bigger model is truly needed.
- EDSR model:         This model is the main "star of the show". Its architecture is explained later in the notebook.

Basic functions and loading of the dataset

In [None]:
import numpy as np
from matplotlib import pyplot as plt
from tqdm.notebook import tqdm

%matplotlib inline

In [None]:
# Enable MYPY in the jupyter notebook => commented out because then I wouldn't be able to do y = y.astype("float32") and similar
#%load_ext nb_mypy

In [None]:
import tensorflow as tf

In [None]:
gpus = tf.config.list_physical_devices('GPU')
print(f'Detected gpus: {gpus}')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

Basic downloader utility for all the datasets:

In [None]:
"""
import wget
import os

import shutil

basic_url: str = "https://data.vision.ee.ethz.ch/cvl/DIV2K/"

print("\nDownloading and extracting dataset to this directory.\n")

datasets = ["DIV2K_train_HR", "DIV2K_valid_HR"]

for name in datasets:
    # if you don't have the wget module then uncomment the next line and comment the wget command
    # os.system("wget " + basic_url + name + ".zip")
    # (the names in `datasets` have no trailing newline, so nothing must be stripped from them)
    wget.download(basic_url + name + ".zip")

    shutil.unpack_archive("./" + name + ".zip")
"""

print("\nAll done!")

In [None]:
dataset_location: str = "./../dataset/"

training_prefix = dataset_location + "DIV2K_train_"
validation_prefix = dataset_location + "DIV2K_valid_"

original_train = training_prefix + "HR"
original_test = validation_prefix + "HR"

set_difficult: list[str] = [training_prefix + "LR_difficult", original_train, validation_prefix + "LR_difficult", original_test, "4"]
set_mild: list[str] = [training_prefix + "LR_mild", original_train, validation_prefix + "LR_mild", original_test, "4"]
set_wild: list[str] = [training_prefix + "LR_wild", original_train, validation_prefix + "LR_wild", original_test, "4"]
set_x8: list[str] = [training_prefix + "LR_x8", original_train, validation_prefix + "LR_x8", original_test, "8"]

In [None]:
crop_size = 512
batch_size = 8

In [None]:
current_set: list[str] = set_mild

In [None]:
test_orig_ds = tf.keras.utils.image_dataset_from_directory(current_set[3],
                                                 labels=None,
                                                 label_mode="categorical",
                                                 image_size=(crop_size, crop_size),
                                                 batch_size=batch_size,
                                                 interpolation="nearest",
                                                 seed=1,
                                                 )

train_orig_ds = tf.keras.utils.image_dataset_from_directory(current_set[1],
                                                 labels=None,
                                                 label_mode="categorical",
                                                 image_size=(crop_size, crop_size),
                                                 batch_size=batch_size,
                                                 interpolation="nearest",
                                                 validation_split=0.2,
                                                 subset="training",
                                                 seed=1,
                                                 )

validation_orig_ds = tf.keras.utils.image_dataset_from_directory(current_set[1],
                                                 labels=None,
                                                 label_mode="categorical",
                                                 image_size=(crop_size, crop_size),
                                                 batch_size=batch_size,
                                                 interpolation="nearest",
                                                 validation_split=0.2,
                                                 subset="validation",
                                                 seed=1,
                                                 )

upscale_factor = int(current_set[4])
input_size = crop_size // upscale_factor

In [None]:
import os

def get_images_in_dir(dir_name: str) -> list[str]:
    return sorted(
        [
            os.path.join(dir_name + "/", fname)
            for fname in os.listdir(dir_name + "/")
            if fname.endswith(".png")
        ]
    )

Rescale all the datasets:

In [None]:
def scaling(input_image):
    input_image = input_image / 255.0
    return input_image

test_orig_ds = test_orig_ds.map(lambda x: tf.cast(x, tf.float32)).map(scaling)
validation_orig_ds = validation_orig_ds.map(lambda x: tf.cast(x, tf.float32)).map(scaling)
train_orig_ds = train_orig_ds.map(lambda x: tf.cast(x, tf.float32)).map(scaling)

## Crop and resize images

In [None]:
# Use TF Ops to process.

# Only the Y (luma) channel is used, as human vision is most sensitive to it
def process_input(input, input_size, upscale_factor):
    input = tf.image.rgb_to_yuv(input)
    last_dimension_axis = len(input.shape) - 1
    y, u, v = tf.split(input, 3, axis=last_dimension_axis)
    return tf.image.resize(y, [input_size, input_size], method="area")


def process_target(input):
    input = tf.image.rgb_to_yuv(input)
    last_dimension_axis = len(input.shape) - 1
    y, u, v = tf.split(input, 3, axis=last_dimension_axis)
    return y
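
The preprocessing above can be sketched in plain NumPy to make the shapes explicit. This is only an illustration, not the code the notebook runs: it assumes the Y channel uses the common BT.601 weights (0.299, 0.587, 0.114) and that `method="area"` with an integer factor behaves like averaging non-overlapping blocks. The names `luma` and `block_mean_downscale` are hypothetical helpers introduced here.

```python
import numpy as np


def luma(rgb):
    """Y channel of an RGB image in [0, 1]: (H, W, 3) -> (H, W, 1)."""
    # Assumed BT.601 luma weights; they sum to 1, so gray stays gray.
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    return (rgb @ weights)[..., np.newaxis]


def block_mean_downscale(y, factor):
    """Average non-overlapping factor x factor blocks of a (H, W, C) image."""
    h, w, c = y.shape
    assert h % factor == 0 and w % factor == 0, "sides must be divisible by factor"
    blocks = y.reshape(h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(1, 3))


# Random stand-in for one 512 x 512 training image scaled to [0, 1].
rgb = np.random.default_rng(0).random((512, 512, 3), dtype=np.float32)
target = luma(rgb)                              # high-resolution target
model_input = block_mean_downscale(target, 4)   # low-resolution model input
print(target.shape, model_input.shape)          # (512, 512, 1) (128, 128, 1)
```

With `crop_size = 512` and an upscale factor of 4, each training pair is therefore a 128 x 128 single-channel input and a 512 x 512 single-channel target.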

In [None]:
train_orig_ds = train_orig_ds.map(
    lambda x: (process_input(x, input_size, upscale_factor), process_target(x))
)
train_orig_ds = train_orig_ds.prefetch(buffer_size=32)

test_orig_ds = test_orig_ds.map(
    lambda x: (process_input(x, input_size, upscale_factor), process_target(x))
)
test_orig_ds = test_orig_ds.prefetch(buffer_size=32)


validation_orig_ds = validation_orig_ds.map(
    lambda x: (process_input(x, input_size, upscale_factor), process_target(x))
)
validation_orig_ds = validation_orig_ds.prefetch(buffer_size=32)


Defining the basic model used for upscaling (this is the simple CNN model, not the skimage baseline):

In [None]:
def get_model(upscale_factor=3, channels=1):
    conv_args = {
        "activation": "relu",
        "kernel_initializer": "Orthogonal",
        "padding": "same",
    }
    inputs = tf.keras.Input(shape=(None, None, channels))
    x = tf.keras.layers.Conv2D(64, 5, **conv_args)(inputs)
    x = tf.keras.layers.Conv2D(64, 3, **conv_args)(x)
    x = tf.keras.layers.Conv2D(32, 3, **conv_args)(x)
    x = tf.keras.layers.Conv2D(channels * (upscale_factor ** 2), 3, **conv_args)(x)
    outputs = tf.nn.depth_to_space(x, upscale_factor)

    return tf.keras.Model(inputs, outputs)

In [None]:
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1.inset_locator import zoomed_inset_axes
from mpl_toolkits.axes_grid1.inset_locator import mark_inset
from tensorflow.keras.preprocessing.image import img_to_array
import PIL


def plot_results(img, prefix, title, directory = None):
    """Plot the result with zoom-in area."""
    img_array = img_to_array(img)
    img_array = img_array.astype("float32") / 255.0

    # Create a new figure with a default 111 subplot.
    fig, ax = plt.subplots(figsize=(20,20))
    im = ax.imshow(img_array[::-1], origin="lower")

    plt.title(title)
    # zoom-factor: 2.0, location: upper-left
    axins = zoomed_inset_axes(ax, 2, loc=2)
    axins.imshow(img_array[::-1], origin="lower")

    # Specify the limits.
    x1, x2, y1, y2 = 200, 300, 100, 200
    # Apply the x-limits.
    axins.set_xlim(x1, x2)
    # Apply the y-limits.
    axins.set_ylim(y1, y2)

    plt.yticks(visible=False)
    plt.xticks(visible=False)

    # Make the line.
    mark_inset(ax, axins, loc1=1, loc2=3, fc="none", ec="blue")
    if (directory is not None): 
        plt.savefig("./output_images/" + directory + str(prefix) + "-" + title + ".png")
    plt.show()


def get_lowres_image(img, upscale_factor):
    """Return low-resolution image to use as model input."""
    return img.resize(
        (img.size[0] // upscale_factor, img.size[1] // upscale_factor),
        PIL.Image.Resampling.BICUBIC,
    )


def upscale_image(model, img):
    """Predict the result based on input image and restore the image as RGB."""
    ycbcr = img.convert("YCbCr")
    y, cb, cr = ycbcr.split()
    y = img_to_array(y)
    y = y.astype("float32") / 255.0

    input = np.expand_dims(y, axis=0)
    out = model.predict(input)

    out_img_y = out[0]
    out_img_y *= 255.0

    # Restore the image in RGB color space.
    out_img_y = out_img_y.clip(0, 255)
    out_img_y = out_img_y.reshape((np.shape(out_img_y)[0], np.shape(out_img_y)[1]))
    out_img_y = PIL.Image.fromarray(np.uint8(out_img_y), mode="L")
    out_img_cb = cb.resize(out_img_y.size, PIL.Image.Resampling.BICUBIC)
    out_img_cr = cr.resize(out_img_y.size, PIL.Image.Resampling.BICUBIC)
    out_img = PIL.Image.merge("YCbCr", (out_img_y, out_img_cb, out_img_cr)).convert(
        "RGB"
    )
    return out_img


Defining the main metric for evaluating the images: PSNR

PSNR stands for peak signal-to-noise ratio. In general, the higher the PSNR, the closer the reconstructed image is to the original.
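
As a minimal sketch of the definition, PSNR can be computed from the mean squared error (MSE) as 10 * log10(max_val^2 / MSE). The helper `psnr` below is a hypothetical NumPy stand-in written for illustration. For images scaled to [0, 1] (`max_val = 1`) the formula reduces to 10 * log10(1 / MSE), which is the expression the training callback applies to the MSE loss.

```python
import numpy as np


def psnr(y_true, y_pred, max_val=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_val^2 / MSE)."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    mse = np.mean((y_true - y_pred) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_val ** 2 / mse))


original = np.zeros((8, 8))
reconstruction = np.full((8, 8), 0.1)  # constant error of 0.1 -> MSE = 0.01
print(round(psnr(original, reconstruction), 2))  # 20.0
```

Because the scale is logarithmic, reducing the MSE by a factor of 10 raises the PSNR by 10 dB.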

In [None]:
from tensorflow.keras.preprocessing.image import load_img
import math

class ESPCNCallback(tf.keras.callbacks.Callback):
    def __init__(self):
        super(ESPCNCallback, self).__init__()
        self.test_img = get_lowres_image(load_img(original_test + "/0801.png"), upscale_factor)

    # Store PSNR value in each epoch.
    def on_epoch_begin(self, epoch, logs=None):
        self.psnr = []

    def on_epoch_end(self, epoch, logs=None):
        print("Mean PSNR for epoch: %.2f" % (np.mean(self.psnr)))
        if epoch % 20 == 0:
            prediction = upscale_image(self.model, self.test_img)
            plot_results(prediction, "epoch-" + str(epoch), "prediction", "training/")

    # Computing PSNR from definition here
    def on_test_batch_end(self, batch, logs=None):
        self.psnr.append(10 * math.log10(1 / logs["loss"]))


In [None]:
test_img_paths = get_images_in_dir(original_test)

In [None]:
import shutil

log_dir = 'logs'
shutil.rmtree(log_dir, ignore_errors=True)  # do not fail if the log directory does not exist yet

tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir + '/basic', histogram_freq=1)

In [None]:
early_stopping_callback = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=5)

checkpoint_filepath = "./tmp/checkpoint/basic"

model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_filepath,
    save_weights_only=True,
    monitor="loss",
    mode="min",
    save_best_only=True,
)

model_basic = get_model(upscale_factor=upscale_factor, channels=1)
model_basic.summary()

callbacks = [ESPCNCallback(), early_stopping_callback, model_checkpoint_callback, tensorboard_callback]

loss_fn = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)


In [None]:
tf.keras.utils.plot_model(model_basic, show_shapes=True)

In [None]:
def PSNR(y_true, y_pred):
    return tf.image.psnr(y_true, y_pred, max_val=1.0)

In [None]:
epochs = 100

model_basic.compile(
    optimizer=optimizer, loss=loss_fn, metrics=[PSNR]
)


# train on the training split and validate on the held-out validation split
train_log = model_basic.fit(
    train_orig_ds, epochs=epochs, callbacks=callbacks, validation_data=validation_orig_ds, verbose=2
)

# The model weights (that are considered the best) are loaded into the model.
model_basic.load_weights(checkpoint_filepath)


In [None]:
# plot training history
names = train_log.history.keys()

for name in names:
    plt.plot(train_log.history[name])

plt.title("training history")
plt.ylabel("loss")
plt.xlabel("Epoch")
plt.legend(names)
plt.show()

In [None]:
%load_ext tensorboard
%tensorboard --logdir logs/basic --host localhost --port 8081

In [None]:
from skimage.transform import rescale
from tensorflow.keras.preprocessing.image import array_to_img

total_bicubic_psnr = 0.0
total_test_psnr = 0.0
total_baseline_psnr = 0.0

model_prefix: str = "model_easy/"
model = model_basic

for index, test_img_path in enumerate(test_img_paths[30:40]):
    img = load_img(test_img_path)
    lowres_input = get_lowres_image(img, upscale_factor)
    w = lowres_input.size[0] * upscale_factor
    h = lowres_input.size[1] * upscale_factor
    highres_img = img.resize((w, h))
    prediction = upscale_image(model, lowres_input)
    lowres_img = lowres_input.resize((w, h))
    lowres_img_arr = img_to_array(lowres_img)
    highres_img_arr = img_to_array(highres_img)
    predict_img_arr = img_to_array(prediction)

    baseline_rescaled = array_to_img(rescale(img_to_array(lowres_input), (upscale_factor, upscale_factor, 1)))
    
    bicubic_psnr = tf.image.psnr(lowres_img_arr, highres_img_arr, max_val=255)
    test_psnr = tf.image.psnr(predict_img_arr, highres_img_arr, max_val=255)
    baseline_psnr = tf.image.psnr(img_to_array(baseline_rescaled), highres_img_arr, max_val=255)

    total_bicubic_psnr += bicubic_psnr
    total_test_psnr += test_psnr
    total_baseline_psnr += baseline_psnr

    print(
        "PSNR of low resolution image and high resolution image is %.4f" % bicubic_psnr
    )
    print("PSNR of predict and high resolution is %.4f" % test_psnr)
    print("PSNR of baseline and high resolution is %.4f" % baseline_psnr)
    plot_results(lowres_img, index, "lowres", model_prefix)
    plot_results(highres_img, index, "highres", model_prefix)
    plot_results(prediction, index, "prediction", model_prefix)
    plot_results(baseline_rescaled, index, "baseline", model_prefix)

print("Avg. PSNR of lowres images is %.4f" % (total_bicubic_psnr / 10))
print("Avg. PSNR of reconstructions is %.4f" % (total_test_psnr / 10))
print("Avg. PSNR of baseline is %.4f" % (total_baseline_psnr / 10))


As you can see, our basic model is only slightly better than the baseline.

However, I would still argue that the upscaling of the image was successful, even though the model has mostly learned to "smear" the image.

## EDSR model:

[Enhanced Deep Residual Networks for Single Image Super-Resolution](https://arxiv.org/abs/1707.02921) (EDSR) is a winner of the [NTIRE 2017](http://www.vision.ee.ethz.ch/ntire17/) super-resolution challenge. Here's an overview of the EDSR architecture:

![Fig. 1](edsr_archi.png)
<center>Fig. 1. EDSR architecture.</center>

Its residual block design differs from that of ResNet. Batch normalization layers have been removed together with the final ReLU activation as shown on the right side of the next figure (Fig. 2).  

![Fig. 2](blocks_diff.png)
<center>Fig. 2. Residual block design in ResNet (left) and in EDSR (right).</center>

The EDSR authors argue that batch normalization loses scale information of images and reduces the range flexibility of activations. Removing the batch normalization layers not only increases super-resolution performance but also reduces GPU memory usage by up to 40%, so significantly larger models can be trained.

EDSR uses a single sub-pixel upsampling layer for super-resolution scales (i.e. upsampling factors) $\times 2$ and $\times 3$ and two upsampling layers for scale $\times 4$.
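
To illustrate what a sub-pixel (pixel shuffle) layer does to tensor shapes, here is a small NumPy sketch. The function `pixel_shuffle_np` is a hypothetical helper written for this illustration; it is intended to mirror the channels-last rearrangement of `tf.nn.depth_to_space`, but treat that correspondence as an assumption rather than a guarantee.

```python
import numpy as np


def pixel_shuffle_np(x, r):
    """Rearrange (N, H, W, C * r * r) into (N, H * r, W * r, C)."""
    n, h, w, c_in = x.shape
    assert c_in % (r * r) == 0, "channel count must be divisible by r^2"
    c = c_in // (r * r)
    x = x.reshape(n, h, w, r, r, c)        # split channels into an r x r block
    x = x.transpose(0, 1, 3, 2, 4, 5)      # (N, H, r, W, r, C)
    return x.reshape(n, h * r, w * r, c)   # merge block rows/columns into space


# Four channels of a single pixel become one 2 x 2 spatial block.
x = np.arange(4).reshape(1, 1, 1, 4)
print(pixel_shuffle_np(x, 2)[0, :, :, 0])  # rows: [0 1] and [2 3]

# Two x2 shuffles in a row give x4 in each spatial dimension.
features = np.zeros((1, 3, 3, 16))
once = pixel_shuffle_np(features, 2)
twice = pixel_shuffle_np(once, 2)
print(once.shape, twice.shape)  # (1, 6, 6, 4) (1, 12, 12, 1)
```

In the actual model a convolution sits before each shuffle to restore the channel count, so the two stages are not applied back to back as in this shape-only demonstration.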

In [None]:
conv_args = {
    "activation": "relu",
    "kernel_initializer": "Orthogonal",
    "padding": "same",
}

def edsr(scale, num_filters=64, num_res_blocks=8, res_block_scaling=None, channels=1):
    """Creates an EDSR model."""
    x_in = tf.keras.layers.Input(shape=(None, None, channels))

    x = b = tf.keras.layers.Conv2D(num_filters, 3, **conv_args)(x_in)
    for i in range(num_res_blocks):
        b = res_block(b, num_filters, res_block_scaling)
    b = tf.keras.layers.Conv2D(num_filters, 3, **conv_args)(b)
    x = tf.keras.layers.Add()([x, b])

    x = upsample(x, scale, num_filters)
    x = tf.keras.layers.Conv2D(channels, 3, **conv_args)(x)

    return tf.keras.models.Model(x_in, x, name="edsr")


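def res_block(x_in, filters, scaling):
    """Creates an EDSR residual block."""
    # Note: conv_args adds a ReLU to both convolutions, whereas the block in the paper (Fig. 2, right) has no activation after the second one.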
    x = tf.keras.layers.Conv2D(filters, 3, **conv_args)(x_in)
    x = tf.keras.layers.Conv2D(filters, 3, **conv_args)(x)
    if scaling:
        x = tf.keras.layers.Lambda(lambda t: t * scaling)(x)
    x = tf.keras.layers.Add()([x_in, x])
    return x


def upsample(x, scale, num_filters):
    def upsample_1(x, factor, **kwargs):
        """Sub-pixel convolution."""
        x = tf.keras.layers.Conv2D(num_filters * (factor ** 2), 3, padding='same', **kwargs)(x)
        return tf.keras.layers.Lambda(pixel_shuffle(scale=factor))(x)

    if scale == 2:
        x = upsample_1(x, 2, name='conv2d_1_scale_2')
    elif scale == 3:
        x = upsample_1(x, 3, name='conv2d_1_scale_3')
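    elif scale == 4:
        x = upsample_1(x, 2, name='conv2d_1_scale_2')
        x = upsample_1(x, 2, name='conv2d_2_scale_2')
    else:
        # Other scales (e.g. 8 for set_x8) would otherwise silently skip the upsampling.
        raise ValueError(f"Unsupported scale: {scale}")

    return x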


def pixel_shuffle(scale):
    return lambda x: tf.nn.depth_to_space(x, scale)

In [None]:
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir + '/edsr', histogram_freq=1)

In [None]:
early_stopping_callback = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=5)

checkpoint_filepath = "./tmp/checkpoint/edsr"

model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_filepath,
    save_weights_only=True,
    monitor="loss",
    mode="min",
    save_best_only=True,
)

model_edsr = edsr(upscale_factor)
model_edsr.summary()

callbacks = [ESPCNCallback(), early_stopping_callback, model_checkpoint_callback, tensorboard_callback]

loss_fn = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)


In [None]:
tf.keras.utils.plot_model(model_edsr, show_shapes=True)

In [None]:
epochs = 100

model_edsr.compile(
    optimizer=optimizer, loss=loss_fn, metrics=[PSNR]
)


# train on the training split and validate on the held-out validation split
train_log = model_edsr.fit(
    train_orig_ds, epochs=epochs, callbacks=callbacks, validation_data=validation_orig_ds, verbose=2
)

# The model weights (that are considered the best) are loaded into the model.
model_edsr.load_weights(checkpoint_filepath)


In [None]:
model_edsr.save('pa228_project.h5')

In [None]:
# plot training history
names = train_log.history.keys()

for name in names:
    plt.plot(train_log.history[name])

plt.title("training history")
plt.ylabel("loss")
plt.xlabel("Epoch")
plt.legend(names)
plt.show()

In [None]:
%load_ext tensorboard
%tensorboard --logdir logs/edsr --host localhost --port 8082

In [None]:
from skimage.transform import rescale

total_bicubic_psnr = 0.0
total_test_psnr = 0.0
total_baseline_psnr = 0.0

model_prefix: str = "model_edsr/"
model = model_edsr

for index, test_img_path in enumerate(test_img_paths[30:40]):
    img = load_img(test_img_path)
    lowres_input = get_lowres_image(img, upscale_factor)
    w = lowres_input.size[0] * upscale_factor
    h = lowres_input.size[1] * upscale_factor
    highres_img = img.resize((w, h))
    prediction = upscale_image(model, lowres_input)
    lowres_img = lowres_input.resize((w, h))
    lowres_img_arr = img_to_array(lowres_img)
    highres_img_arr = img_to_array(highres_img)
    predict_img_arr = img_to_array(prediction)

    baseline_rescaled = array_to_img(rescale(img_to_array(lowres_input), (upscale_factor, upscale_factor, 1)))
    
    bicubic_psnr = tf.image.psnr(lowres_img_arr, highres_img_arr, max_val=255)
    test_psnr = tf.image.psnr(predict_img_arr, highres_img_arr, max_val=255)
    baseline_psnr = tf.image.psnr(img_to_array(baseline_rescaled), highres_img_arr, max_val=255)

    total_bicubic_psnr += bicubic_psnr
    total_test_psnr += test_psnr
    total_baseline_psnr += baseline_psnr

    print(
        "PSNR of low resolution image and high resolution image is %.4f" % bicubic_psnr
    )
    print("PSNR of predict and high resolution is %.4f" % test_psnr)
    print("PSNR of baseline and high resolution is %.4f" % baseline_psnr)
    plot_results(lowres_img, index, "lowres", model_prefix)
    plot_results(highres_img, index, "highres", model_prefix)
    plot_results(prediction, index, "prediction", model_prefix)
    plot_results(baseline_rescaled, index, "baseline", model_prefix)

print("Avg. PSNR of lowres images is %.4f" % (total_bicubic_psnr / 10))
print("Avg. PSNR of reconstructions is %.4f" % (total_test_psnr / 10))
print("Avg. PSNR of baseline is %.4f" % (total_baseline_psnr / 10))


I would say the predictions are relatively good: the images look noticeably sharper and less smeared or distorted than those of the basic model. The model still smears the images to some degree, but less than the simple model, and sharp edges and clear lines are sometimes visible.

## Evaluation over the whole dataset

In [None]:
model_basic.evaluate(test_orig_ds)

In [None]:
model_edsr.evaluate(test_orig_ds)

In [None]:
total_bicubic_psnr = []
total_edsr_psnr = []
total_baseline_psnr = []
total_basic_psnr = []

for index, test_img_path in enumerate(test_img_paths):
    img = load_img(test_img_path)
    lowres_input = get_lowres_image(img, upscale_factor)
    w = lowres_input.size[0] * upscale_factor
    h = lowres_input.size[1] * upscale_factor
    highres_img = img.resize((w, h))
    prediction_basic = upscale_image(model_basic, lowres_input)
    prediction_edsr = upscale_image(model_edsr, lowres_input)
    lowres_img = lowres_input.resize((w, h))
    lowres_img_arr = img_to_array(lowres_img)
    highres_img_arr = img_to_array(highres_img)
    predict_img_arr_b = img_to_array(prediction_basic)
    predict_img_arr_e = img_to_array(prediction_edsr)

    baseline_rescaled = array_to_img(rescale(img_to_array(lowres_input), (upscale_factor, upscale_factor, 1)))
    
    bicubic_psnr = tf.image.psnr(lowres_img_arr, highres_img_arr, max_val=255)
    basic_psnr = tf.image.psnr(predict_img_arr_b, highres_img_arr, max_val=255)
    edsr_psnr = tf.image.psnr(predict_img_arr_e, highres_img_arr, max_val=255)
    baseline_psnr = tf.image.psnr(img_to_array(baseline_rescaled), highres_img_arr, max_val=255)

    total_bicubic_psnr.append(bicubic_psnr)
    total_edsr_psnr.append(edsr_psnr)
    total_baseline_psnr.append(baseline_psnr)
    total_basic_psnr.append(basic_psnr)

    print(f"Image {index}, path: {test_img_path}:")
    print(
        "\tPSNR of low resolution image and high resolution image is %.4f" % bicubic_psnr
    )
    print("\tPSNR of baseline and high resolution is %.4f" % baseline_psnr)
    print("\tPSNR of basic model prediction and high resolution is %.4f" % basic_psnr)
    print("\tPSNR of EDSR model prediction and high resolution is %.4f" % edsr_psnr)
    print()

In [None]:
print("Avg. PSNR of lowres images is %.4f" % np.mean(total_bicubic_psnr))
print("Avg. PSNR of baseline is %.4f" % np.mean(total_baseline_psnr))
print("Avg. PSNR of reconstructions using basic model is %.4f" % np.mean(total_basic_psnr))
print("Avg. PSNR of reconstructions using EDSR model is %.4f" % np.mean(total_edsr_psnr))

In [None]:
x_arr = list(range(1, len(total_bicubic_psnr) + 1))  # one x position per test image

fig = plt.figure(figsize=(30, 10))

plt.title("Comparison of PSNR between models")

plt.plot(x_arr, total_bicubic_psnr, label="lowres")
plt.plot(x_arr, total_baseline_psnr, label="baseline")
plt.plot(x_arr, total_basic_psnr, label="basic")
plt.plot(x_arr, total_edsr_psnr, label="edsr")

plt.xticks(x_arr)

plt.legend()
plt.show()

As we can see from the metrics, the baseline method produced the lowest-quality images of all the methods presented. Its PSNR is even lower than the PSNR of the low-resolution image (resized back to the original size) compared to the original. This is probably because the upscaling introduces more noise into the image.

Surprisingly, the basic model achieved a better PSNR than the more complex EDSR model. However, if we compare the resulting images, we can see that the EDSR model produces better images than the basic model: the lines are much sharper and the image overall feels less blurry. The likely reason for the basic model scoring better is that PSNR usually favors smeared predictions.
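
The preference of PSNR for smearing can be illustrated with a toy one-dimensional edge. The signals below are made-up values chosen for the illustration, not data from the experiments: a crisp edge that is misplaced by one pixel is compared with a blurred edge that sits in the right place.

```python
import numpy as np


def psnr(y_true, y_pred, max_val=1.0):
    """PSNR in dB from the mean squared error (assumes the signals differ)."""
    mse = np.mean((y_true - y_pred) ** 2)
    return float(10.0 * np.log10(max_val ** 2 / mse))


edge = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=np.float64)           # ground truth
sharp_shifted = np.array([0, 0, 0, 0, 0, 1, 1, 1], dtype=np.float64)   # crisp, one pixel off
blurred = np.array([0, 0, 0, 0.25, 0.75, 1, 1, 1], dtype=np.float64)   # smeared, well placed

psnr_sharp = psnr(edge, sharp_shifted)   # MSE = 1/8
psnr_blurred = psnr(edge, blurred)       # MSE = 1/64
print(f"sharp but shifted: {psnr_sharp:.2f} dB")
print(f"blurred:           {psnr_blurred:.2f} dB")
```

The blurred signal scores roughly 9 dB higher even though the shifted one has the crisper edge, which is consistent with a smoother model winning on PSNR while looking worse.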

## Problems and challenges

First, writing a functional utility for downloading the data: I got stuck on this for a long time for no apparent reason.

Designing how to use the dataset: DIV2K comes with pre-downscaled images, but I had difficulties getting them to work with my model. Since all inputs need to be the same size for batching, I could not simply open the images as they are and needed proper preprocessing. I therefore decided to skip the predefined low-resolution sets and downscale the images myself, which made the work more bearable (and also means I can train on any provided dataset).
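
One possible way to batch the pre-downscaled images, which this notebook does not use, is to cut aligned fixed-size patches from each low-resolution/high-resolution pair. The sketch below is a hypothetical illustration under the assumption that the high-resolution image is exactly `scale` times larger than the low-resolution one; `paired_random_crop` and its parameters are names introduced here.

```python
import numpy as np


def paired_random_crop(lr, hr, scale, lr_patch, rng):
    """Cut aligned patches from a low-res image and its high-res counterpart."""
    h, w = lr.shape[:2]
    assert hr.shape[0] == h * scale and hr.shape[1] == w * scale, "sizes must match the scale"
    # Pick the top-left corner in low-res coordinates.
    top = int(rng.integers(0, h - lr_patch + 1))
    left = int(rng.integers(0, w - lr_patch + 1))
    lr_crop = lr[top:top + lr_patch, left:left + lr_patch]
    # The same region in high-res coordinates is scaled by `scale`.
    hr_crop = hr[top * scale:(top + lr_patch) * scale,
                 left * scale:(left + lr_patch) * scale]
    return lr_crop, hr_crop


rng = np.random.default_rng(0)
lr = rng.random((90, 120, 3))
# Stand-in for a real high-res image: nearest-neighbour upscaling of lr.
hr = np.repeat(np.repeat(lr, 4, axis=0), 4, axis=1)
lr_crop, hr_crop = paired_random_crop(lr, hr, scale=4, lr_patch=32, rng=rng)
print(lr_crop.shape, hr_crop.shape)  # (32, 32, 3) (128, 128, 3)
```

Every pair then has the same shape regardless of the source image size, so the patches can be stacked into batches.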

Another problem was defining the more complex model: I got stuck on the explanation in the paper, and it took time and careful thinking to work out how to connect it all together.

## Future work

Maybe try additional models, play with GANs and Autoencoders.