# Computer Vision Project #3 - Super Resolution Problem

## Table of contents
1. [The Dataset](#dataset)
2. [The Problem](#problem)
3. [Used Architectures](#architecture)
4. [Runtime Environment](#env)
5. [Model analysis](#model)
6. [Training](#training)
7. [Points Table](#point-table)
8. [Bibliography](#bib)

## Required packages

In [None]:
%load_ext autoreload
%autoreload 2
%matplotlib inline

import json
import random
from collections import Counter

import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

from sklearn.metrics import confusion_matrix, precision_score, recall_score

from src.utils import imshow, scaled_imshow, seed_everything, plot_model, plot_history
from src.blocks.layers import EncoderBlock, DecoderBlock

## The Dataset <a class="anchor" id="dataset"/>

### Main dataset

The dataset was uploaded to Kaggle [here](https://www.kaggle.com/datasets/saputrahas/dataset-image-super-resolution) by user *saputra has*. It is intended for super-resolution and consists of over 5000 images, all of the same size: 1024 x 720.

In [None]:
IMG_SHAPE = (400,400)
#IMG_SHAPE = (64,128)

In [None]:
train_datagen = ImageDataGenerator(
        rescale=1./255,
        zoom_range=0.2,
        shear_range=0.2,
        #rotation_range=45,
        horizontal_flip = True
    )
train_generator = train_datagen.flow_from_directory(
    "./data/finished/train/dataraw/",
    target_size=IMG_SHAPE,
    batch_size=32,
    class_mode='input',
)
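# Validation images are only rescaled; random augmentation would make validation metrics noisy
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    "./data/finished/valid/dataraw/",
    target_size=IMG_SHAPE,
    batch_size=32,
    class_mode='input',
)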

In [None]:
x = next(train_generator)[0]

In [None]:
def draw_mosaic(batch, dsize=IMG_SHAPE):
    """Show the first nine images of a batch as a 3 x 3 mosaic."""
    img_shape = dsize + (3,)
    mosaic = np.zeros((img_shape[0] * 3, img_shape[1] * 3, img_shape[2]))
    # Copy image number i * 3 + j into tile (i, j) of the mosaic
    for i in range(3):
        for j in range(3):
            index = i * 3 + j
            mosaic[i * img_shape[0]:(i + 1) * img_shape[0], j * img_shape[1]:(j + 1) * img_shape[1], :] = batch[index]
    # Pixel values are in [0, 1]: convert to 8-bit and swap RGB to BGR before display
    scaled_imshow(cv.cvtColor((mosaic * 255).astype(np.uint8), cv.COLOR_RGB2BGR))

In [None]:
draw_mosaic(x)

The images show a wide variety of subjects, including people, nature, places, and objects. People make up a substantial part of the dataset.

### Additional dataset

A second dataset was created for this project. Unlike the main dataset, it contains almost no people.


In [None]:
custom = ImageDataGenerator(
    rescale=1./255
)
custom_data = custom.flow_from_directory(
    "./data/custom/",
    target_size=IMG_SHAPE,
    batch_size=32,
    class_mode='input',
)

In [None]:
draw_mosaic(next(custom_data)[0])

## The Problem <a class="anchor" id="problem"/>

Super-resolution refers to the task of enhancing the resolution or level of detail in an image, typically by increasing its pixel count. The goal is to generate a high-resolution (HR) image from a low-resolution (LR) or degraded version of the same image. This problem is particularly important in computer vision and image processing, where obtaining high-quality images is crucial for various applications.

In this project the model doubles the resolution along each axis: a 200 x 200 input is upscaled to a 400 x 400 output.
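
As a rough illustration of the task setup, the sketch below builds a low-resolution/high-resolution pair with NumPy and upscales the low-resolution image by pixel repetition, the simplest possible 2x baseline. The helper names `downsample_2x` and `upsample_nearest_2x` are hypothetical, and 2x2 average pooling is an assumption made for the example; the notebook itself shrinks the images with a `layers.Resizing` layer.

```python
import numpy as np


def downsample_2x(img):
    """Halve height and width by averaging each 2x2 block (assumed degradation)."""
    h, w, c = img.shape
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))


def upsample_nearest_2x(img):
    """Double height and width by repeating every pixel (naive baseline)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)


rng = np.random.default_rng(0)
hr = rng.random((400, 400, 3))        # stand-in for a high-resolution image in [0, 1]
lr = downsample_2x(hr)                # 200 x 200 low-resolution input
baseline = upsample_nearest_2x(lr)    # 400 x 400 again, but without any new detail

print(lr.shape, baseline.shape)
```

A super-resolution model is useful only if its output is closer to `hr` than such a baseline is.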

## Used Architectures <a class="anchor" id="architecture"/>
Description of the architectures used, with a diagram showing the layers. For large models built from blocks, the blocks and the connections between them can be shown separately.

A basic autoencoder consists of an encoder and a decoder. The encoder is a stack of convolution layers with an increasing number of filters that reduce the spatial size of the image through pooling or strided convolutions. The decoder does the opposite: it reduces the number of channels back to 3 while restoring the spatial size.
The encoder and decoder meet at a bottleneck, which represents the image in a latent space using fewer values than the original image.
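
The size bookkeeping can be sketched in plain Python. This is a minimal sketch that assumes every encoder block halves the spatial size and every decoder block doubles it; the actual `EncoderBlock` and `DecoderBlock` implementations live in `src.blocks.layers` and are not shown in this notebook, and `trace_spatial_size` is a hypothetical helper.

```python
def trace_spatial_size(size, n_encoder_blocks, n_decoder_blocks):
    """Return the side length after every block.

    Assumes each encoder block halves the size and each decoder block
    doubles it; the real blocks may behave differently.
    """
    sizes = [size]
    for _ in range(n_encoder_blocks):
        size //= 2
        sizes.append(size)
    for _ in range(n_decoder_blocks):
        size *= 2
        sizes.append(size)
    return sizes


# Three encoder blocks and four decoder blocks, starting from a 200-pixel side.
print(trace_spatial_size(200, n_encoder_blocks=3, n_decoder_blocks=4))
# → [200, 100, 50, 25, 50, 100, 200, 400]
```

Under this assumption, using one more decoder block than encoder blocks is what makes the output twice as large as the input.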



In [None]:
IMG_SHAPE = (400,400)
INPUT_SHAPE = (200,200)

In [None]:
model = models.Sequential()

model.add(layers.InputLayer(IMG_SHAPE+(3,)))
model.add(layers.Resizing(*INPUT_SHAPE))
model.add(EncoderBlock(8))
model.add(EncoderBlock(16))
model.add(EncoderBlock(32))
# model.add(layers.GlobalAveragePooling2D())
# model.add(layers.Dense(
# model.add(layers.Reshape(target_shape=(45, 64, 128)))
model.add(DecoderBlock(32))
model.add(DecoderBlock(16))
model.add(DecoderBlock(8))
model.add(DecoderBlock(8))
model.add(layers.Conv2D(8,kernel_size=(3, 3),strides=(1, 1), padding='same'))
model.add(layers.BatchNormalization())
model.add(layers.Activation('relu'))
model.add(layers.Conv2D(3,kernel_size=(3, 3),strides=(1, 1), padding='same'))
model.add(layers.Activation('sigmoid'))

model.build()
model.compile(optimizer='adam',
              loss='mean_absolute_error',
              metrics=['mean_squared_error'])

model.summary()

In [None]:
plot_model(model)

## Runtime Environment <a class="anchor" id="env"/>
- maybe run in docker
  

## Model analysis <a class="anchor" id="model"/>
- size in memory, number of parameters
- comparison of models
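
To show where parameter counts and memory figures come from, the sketch below counts the parameters of a standard 2D convolution by hand and converts the total to float32 storage. The helpers `conv2d_params` and `float32_mebibytes` are hypothetical, and the input channel count of 8 assumes that the last `DecoderBlock(8)` outputs 8 channels. The exact totals for the whole model are the ones printed by `model.summary()`.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters, use_bias=True):
    """Trainable parameters of a standard 2D convolution layer."""
    weights = kernel_h * kernel_w * in_channels * filters
    return weights + (filters if use_bias else 0)


def float32_mebibytes(n_params):
    """Memory needed to store n_params 32-bit floats, in MiB."""
    return n_params * 4 / 1024 ** 2


# The two final convolutions of the model, assuming 8 input channels each
# (the internals of DecoderBlock are not shown in this notebook).
hidden = conv2d_params(3, 3, in_channels=8, filters=8)   # 3*3*8*8 + 8 = 584
output = conv2d_params(3, 3, in_channels=8, filters=3)   # 3*3*8*3 + 3 = 219

print(hidden, output, float32_mebibytes(hidden + output))
```

The weights alone understate memory use during training, since activations, gradients, and optimizer state are stored as well.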


## Training <a class="anchor" id="training"/>
- training and inference time,
- description of the training and the required commands to run it
- description of used metrics, loss, and evaluation
- plots: training and validation loss, metrics
- used hyperparameters along with an explanation of each why such value was chosen
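
For reference, the loss and metric passed to `model.compile` (mean absolute error and mean squared error) can be written out in NumPy as below. PSNR is included as an extra: it is a metric commonly reported for super-resolution but is not computed anywhere in this notebook, and `max_val=1.0` assumes pixel values scaled to [0, 1]. All function names here are illustrative.

```python
import numpy as np


def mean_absolute_error(y_true, y_pred):
    """Average absolute pixel difference (the training loss)."""
    return float(np.mean(np.abs(y_true - y_pred)))


def mean_squared_error(y_true, y_pred):
    """Average squared pixel difference (the tracked metric)."""
    return float(np.mean((y_true - y_pred) ** 2))


def psnr(y_true, y_pred, max_val=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the target."""
    mse = mean_squared_error(y_true, y_pred)
    return float("inf") if mse == 0 else float(10.0 * np.log10(max_val ** 2 / mse))


# Tiny hand-checkable example with values in [0, 1]
y_true = np.array([[0.0, 0.5], [1.0, 0.25]])
y_pred = np.array([[0.1, 0.5], [0.8, 0.25]])

print(mean_absolute_error(y_true, y_pred))   # (0.1 + 0.2) / 4, about 0.075
print(mean_squared_error(y_true, y_pred))    # (0.01 + 0.04) / 4, about 0.0125
print(psnr(y_true, y_pred))                  # 10 * log10(1 / 0.0125), about 19.03 dB
```

Mean absolute error penalizes all errors linearly, while mean squared error weights large errors more heavily, so tracking both shows whether the model makes a few large mistakes or many small ones.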

In [None]:
# The generators already yield batches of 32, so batch_size is not passed to fit()
history = model.fit(train_generator, epochs=5, validation_data=validation_generator)

In [None]:
plot_history(history)

In [None]:
model.evaluate(custom_data)

In [None]:
x = next(custom_data)[0]
y_pred = model.predict(x)

In [None]:
draw_mosaic(x)

In [None]:
draw_mosaic(y_pred)

In [None]:
imshow(x[0]*255)

In [None]:
imshow(y_pred[0]*255)

## Points Table <a class="anchor" id="point-table"/>

| Type    | Item                 | Points |
|---------|----------------------|--------|
| Problem | Super Resolution     | 3      |
| Model   | Autoencoder          | 1      |
| Dataset | Our own dataset      | 1      |
| Sum     | sum of points        | 5      |

## Bibliography <a class="anchor" id="bib"/>
preparation of a bibliography - the bibliography should contain references to the data set (preferably the article in which the collection was presented) and all scientific works and studies, including websites with tips on the solution.
