# Convolutional Neural Network Assignment
The goal of this project is to train a regression CNN to predict the resolution of each input image. We
will use the regularized squared loss on the training examples $(\mathbf{x}_i, y_i),i=1, \dots,n$:

$$S(\mathbf{w})=\frac{1}{n}\sum_{i=1}^n(y_i-f_\mathbf{w}(\mathbf{x}_i))^2 + \lambda \|\mathbf{w}\|^2$$

Besides the loss function, we will measure the coefficient of determination $R^2$:

$$R^2=1-\frac{\sum_{i=1}^n(y_i-\hat{y}_i)^2}{\sum_{i=1}^n(y_i-\bar{y})^2}$$

where $\hat{y}_i=f_\mathbf{w}(\mathbf{x}_i)$ and $\bar{y}=\frac{1}{n}\sum_{i=1}^n y_i$.

Experiment with different CNN architectures to obtain a good result. One example of a CNN you could use contains five convolutional layers, all with stride $1$ and zero
padding. The first four use filters of size $5 \times 5$, with or without holes (atrous, i.e. dilated, convolution), and the
last uses a filter of the appropriate size to obtain a $1 \times 1$ output. The first two convolutions have
$16$ filters, the next two have $32$ filters, and the last has one filter. Each of the first three convolutions is followed by $2 \times 2$ max pooling with stride $2$. The fourth
convolution layer is followed by a ReLU.
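
The example architecture described above could be sketched in Keras roughly as follows. This is a minimal sketch under several assumptions that the text does not state: inputs are square $64 \times 64$ RGB images (matching `target_size=(64,64)` in the data loader), "zero padding" is read as `padding="same"`, and the penalty $\lambda \|\mathbf{w}\|^2$ is approximated by an L2 regularizer on the convolution kernels only. The names `conv_out`, `final_kernel_size`, `build_cnn` and the default value of `lmda` are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding="same", dilation=1):
    """Output length along one spatial axis for a conv or pooling layer."""
    if padding == "same":
        return -(-size // stride)              # ceil(size / stride)
    effective = dilation * (kernel - 1) + 1    # footprint of a dilated kernel
    return (size - effective) // stride + 1    # 'valid': no padding


def final_kernel_size(input_size=64):
    """Spatial size reaching the fifth convolution (the kernel it needs for 1 x 1)."""
    size = input_size
    for _ in range(3):
        size = conv_out(size, 5, padding="same")             # 5x5 convolution
        size = conv_out(size, 2, stride=2, padding="valid")  # 2x2 max pooling
    return conv_out(size, 5, padding="same")                 # fourth convolution


def build_cnn(input_shape=(64, 64, 3), lmda=1e-4, dilation=1):
    # Imported lazily so the shape helpers above work without Keras installed.
    from keras import layers, models, regularizers

    def conv(filters, kernel, **kwargs):
        # Assumption: lambda * ||w||^2 is applied to kernels via an L2 regularizer.
        return layers.Conv2D(filters, kernel, strides=1,
                             kernel_regularizer=regularizers.l2(lmda), **kwargs)

    k = final_kernel_size(input_shape[0])  # assumes a square input
    return models.Sequential([
        layers.Input(shape=input_shape),
        conv(16, 5, padding="same", dilation_rate=dilation),
        layers.MaxPooling2D(pool_size=2, strides=2),
        conv(16, 5, padding="same", dilation_rate=dilation),
        layers.MaxPooling2D(pool_size=2, strides=2),
        conv(32, 5, padding="same", dilation_rate=dilation),
        layers.MaxPooling2D(pool_size=2, strides=2),
        conv(32, 5, padding="same", dilation_rate=dilation),
        layers.ReLU(),                 # only the fourth convolution gets a ReLU
        conv(1, k, padding="valid"),   # k x k kernel collapses the map to 1 x 1
        layers.Flatten(),              # one scalar prediction per image
    ])


print(final_kernel_size(64))  # 8
```

Under these assumptions the feature map shrinks from 64 to 32, 16, and 8 through the three pooling layers, so the last convolution would use an $8 \times 8$ filter. Setting `dilation` to a value greater than 1 gives the atrous variant without changing the spatial sizes, because `"same"` padding preserves them.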


#### Import dependencies

In [4]:
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from sklearn.metrics import accuracy_score
from scipy.special import expit
from keras.preprocessing.image import ImageDataGenerator
import os

#### Data loading function

In [5]:
def get_data():
    data_path = 'data/cnn'
    train_path = os.path.join(data_path,'cnntrain')
    test_path = os.path.join(data_path,'cnntest')

    data_gen = ImageDataGenerator()

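    # Build the training table from the training directory; labels come from
    # the first two characters of each file name (e.g. '02_...' -> 0.02).
    train_dataframe = pd.DataFrame(
        zip([os.path.join(train_path, image) for image in os.listdir(train_path)],
            [int(filename[:2])/100 for filename in os.listdir(train_path)]),
        columns=['path', 'resolution'])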

    test_dataframe = pd.DataFrame(
        zip([os.path.join(test_path, image) for image in os.listdir(test_path)],
            [int(filename[:2])/100 for filename in os.listdir(test_path)]),
        columns=['path', 'resolution'])

    train_data = data_gen.flow_from_dataframe(train_dataframe,
        x_col='path', y_col='resolution', class_mode='raw', target_size=(64,64))
    test_data = data_gen.flow_from_dataframe(test_dataframe,
        x_col='path', y_col='resolution', class_mode='raw', target_size=(64,64))
    print(train_dataframe)
    print(test_dataframe)
    return train_data, test_data
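
The label extraction used in `get_data` can be illustrated on its own. The sketch below assumes, as the loader does, that the first two characters of each file name encode the resolution in hundredths; the helper name `resolution_from_filename` is hypothetical, and the file names are taken from the printed dataframe.

```python
import os


def resolution_from_filename(filename):
    """Read the two-digit prefix of a file name as a resolution in hundredths."""
    # basename() drops any directory part so full paths work too.
    return int(os.path.basename(filename)[:2]) / 100


print(resolution_from_filename("02_136201.jpg_01.jpg"))  # 0.02
print(resolution_from_filename("96_133900.jpg_50.jpg"))  # 0.96
```

Any file whose name does not start with two digits would raise a `ValueError` here, so the data directories are assumed to contain only labeled images.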

In [6]:
for i in get_data():
    print(i)

Found 0 validated image filenames.
Found 2261 validated image filenames.
                                        path  resolution
0     data/cnn\cnntrain\02_136201.jpg_01.jpg        0.02
1     data/cnn\cnntrain\02_136201.jpg_02.jpg        0.02
2     data/cnn\cnntrain\02_136201.jpg_03.jpg        0.02
3     data/cnn\cnntrain\02_136201.jpg_04.jpg        0.02
4     data/cnn\cnntrain\02_136201.jpg_05.jpg        0.02
...                                      ...         ...
2256  data/cnn\cnntrain\96_133900.jpg_46.jpg        0.96
2257  data/cnn\cnntrain\96_133900.jpg_47.jpg        0.96
2258  data/cnn\cnntrain\96_133900.jpg_48.jpg        0.96
2259  data/cnn\cnntrain\96_133900.jpg_49.jpg        0.96
2260  data/cnn\cnntrain\96_133900.jpg_50.jpg        0.96

[2261 rows x 2 columns]
                                       path  resolution
0     data/cnn\cnntest\02_136201.jpg_01.jpg        0.02
1     data/cnn\cnntest\02_136201.jpg_02.jpg        0.02
2     data/cnn\cnntest\02_136201.jpg_03.jpg       



#### Loss function

In [None]:
def loss(x, y, w, f_w, lmda):
    # Mean squared residual plus the L2 penalty lmda * ||w||^2, matching S(w).
    return np.mean((y - f_w(x))**2) + lmda*np.linalg.norm(w)**2
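
As a quick numeric check of $S(\mathbf{w})$, the sketch below evaluates the regularized squared loss on two toy examples. The linear model, the data, and the names `squared_loss` and `linear_model` are assumptions made purely for illustration; they stand in for the CNN $f_\mathbf{w}$.

```python
import numpy as np


def squared_loss(x, y, w, f_w, lmda):
    """Mean squared residual plus lmda * ||w||^2."""
    residuals = y - f_w(x)
    return np.mean(residuals ** 2) + lmda * np.sum(w ** 2)


def linear_model(w):
    """Hypothetical stand-in for f_w: a plain linear map x @ w."""
    return lambda x: x @ w


x = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([1.0, 4.0])
w = np.array([0.5, 0.5])

# Predictions are [1.5, 3.5], so both residuals have magnitude 0.5.
value = squared_loss(x, y, w, linear_model(w), lmda=0.1)
print(value)  # approximately 0.3
```

Here the data term is $0.25$ and the penalty is $0.1 \cdot 0.5 = 0.05$, giving a total of about $0.3$.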

#### $R^2$ function

In [None]:
def R2(y_true, y_predicted):
    return 1 - np.sum((y_true-y_predicted)**2) / np.sum((y_true-np.mean(y_true))**2)
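
Two sanity checks help when reading $R^2$ values: perfect predictions give $1$, and always predicting the mean $\bar{y}$ gives $0$. The sketch below repeats the definition so that it runs on its own; the sample values are made up for illustration.

```python
import numpy as np


def R2(y_true, y_predicted):
    return 1 - np.sum((y_true - y_predicted) ** 2) / np.sum((y_true - np.mean(y_true)) ** 2)


y_true = np.array([0.02, 0.5, 0.96])

perfect = R2(y_true, y_true)                                  # exact predictions
baseline = R2(y_true, np.full_like(y_true, np.mean(y_true)))  # always predict the mean
reversed_order = R2(y_true, y_true[::-1])                     # worse than the mean

print(perfect, baseline, reversed_order)
```

A negative value, as in the last case, means the model fits worse than the constant predictor $\bar{y}$.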