# Autoencoder Neural Networks

## Introduction

An autoencoder is an artificial neural network, trained in an unsupervised way, that learns to represent data in a lower-dimensional space: it encodes the input into a reduced representation and then reconstructs the input from that representation. The autoencoder is trained so that its output is as close as possible to its input.

An autoencoder neural network consists of three main parts:

1. **Encoder**: Reduces the dimensions of the input data and encodes them into a lower-dimensional representation.
2. **Bottleneck**: The layer that holds the lowest-dimensional representation of the input data in the network.
3. **Decoder**: Reconstructs the original data from the encoded representation.
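
The three parts listed above can be sketched with the Keras functional API. This is only a minimal illustration, not one of the architectures evaluated in this notebook: the layer sizes, the 784-dimensional input, and the random placeholder data are all assumptions made for the example.

```python
import numpy as np
import keras.layers as layers
from keras.models import Model

input_dim = 784      # assumption: e.g. a flattened 28x28 image
bottleneck_dim = 32  # assumption: size of the compressed representation

# Encoder: reduce the input dimensions.
inputs = layers.Input(shape=(input_dim,))
x = layers.Dense(128, activation="relu")(inputs)

# Bottleneck: the lowest-dimensional layer of the network.
code = layers.Dense(bottleneck_dim, activation="relu", name="bottleneck")(x)

# Decoder: expand the code back to the input dimensions.
x = layers.Dense(128, activation="relu")(code)
outputs = layers.Dense(input_dim, activation="sigmoid")(x)

autoencoder = Model(inputs, outputs)
encoder = Model(inputs, code)  # shares its layers with the autoencoder

# The training target is the input itself, so no labels are needed.
autoencoder.compile(optimizer="adam", loss="mse")
data = np.random.rand(256, input_dim).astype("float32")  # placeholder data
autoencoder.fit(data, data, epochs=1, batch_size=64, verbose=0)

codes = encoder.predict(data, verbose=0)
reconstruction = autoencoder.predict(data, verbose=0)
```

Here `codes` holds one 32-dimensional vector per sample, while `reconstruction` has the same shape as the input.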

A simple schema of an autoencoder neural network where its input is an image is presented below:

<figure>
  <img src="https://blog.keras.io/img/ae/autoencoder_schema.jpg">
  <figcaption>Fig.1 - Autoencoder [<a href="https://blog.keras.io/building-autoencoders-in-keras.html">Keras</a>].</figcaption>
</figure> 

The encoder and the decoder of an autoencoder can be viewed as an encoder function $\textbf{h} = f(\textbf{x})$ and a decoder function $\textbf{r} = g(\textbf{h})$ that produces a reconstruction $\textbf{r}$ of the input. An autoencoder that copied its input perfectly would therefore satisfy $g(f(\textbf{x})) = \textbf{x}$. However, according to [Goodfellow et al. 2016](https://www.deeplearningbook.org/contents/autoencoders.html), autoencoders generalise the idea of an encoder and a decoder beyond deterministic functions to stochastic mappings $p_{encoder}(\textbf{h}|\textbf{x})$ and $p_{decoder}(\textbf{x}|\textbf{h})$. Autoencoders are also designed to be unable to copy perfectly: they are restricted in ways that allow them to copy the input data only approximately. During training, the model is therefore forced to prioritise which aspects of the input to learn, and in doing so it often learns useful properties of the data.
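
In the same notation, training can be summarised as minimising a loss that penalises $g(f(\textbf{x}))$ for differing from $\textbf{x}$. As one common choice (an assumption for illustration, not necessarily the loss used later in this notebook), the mean squared error over an input with $n$ components, where $n$ and the index $i$ are introduced only for this example, reads:

$$
L\big(\textbf{x}, g(f(\textbf{x}))\big) = \frac{1}{n} \sum_{i=1}^{n} \Big( x_i - \big[g(f(\textbf{x}))\big]_i \Big)^2
$$

The loss is zero only when the reconstruction matches the input exactly, which the restrictions on the network are meant to make impossible in general.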

Traditionally, autoencoders have been used for dimensionality reduction and feature learning. More recently, theoretical connections between autoencoders and latent variable models have made autoencoders a competitive approach to generative modelling.

### Architectures

Autoencoder architectures are quite diverse. Any network that sequentially reduces the dimensions of the input and then expands them back to the original dimensions can serve as an autoencoder. Autoencoders are therefore built from fully-connected, convolutional, or LSTM layers, or a combination of them. The choice of layer type depends on the problem being addressed.
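
As an illustration of the reduce-then-expand pattern with convolutional layers, the following sketch halves the spatial dimensions twice and then doubles them twice. The input shape, filter counts, and kernel sizes are assumptions chosen for the example and do not correspond to the architectures evaluated later.

```python
import numpy as np
import keras.layers as layers
from keras.models import Model

inputs = layers.Input(shape=(28, 28, 1))  # assumption: grayscale 28x28 image

# Encoder: each pooling step halves the spatial dimensions (28 -> 14 -> 7).
x = layers.Conv2D(16, 3, activation="relu", padding="same")(inputs)
x = layers.MaxPooling2D(2)(x)
x = layers.Conv2D(8, 3, activation="relu", padding="same")(x)
encoded = layers.MaxPooling2D(2)(x)

# Decoder: each upsampling step doubles them again (7 -> 14 -> 28).
x = layers.Conv2D(8, 3, activation="relu", padding="same")(encoded)
x = layers.UpSampling2D(2)(x)
x = layers.Conv2D(16, 3, activation="relu", padding="same")(x)
x = layers.UpSampling2D(2)(x)
outputs = layers.Conv2D(1, 3, activation="sigmoid", padding="same")(x)

conv_autoencoder = Model(inputs, outputs)
conv_encoder = Model(inputs, encoded)

# Check the shapes on a dummy batch of two images.
batch = np.zeros((2, 28, 28, 1), dtype="float32")
encoded_shape = conv_encoder.predict(batch, verbose=0).shape
output_shape = conv_autoencoder.predict(batch, verbose=0).shape
```

The dummy batch is reduced to a 7x7 feature map with 8 channels and then expanded back to the original 28x28x1 shape.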

Architectures based on the Variational Autoencoder (VAE) have received considerable attention recently. These architectures are generative models and, as mentioned by [Goodfellow et al. 2016](https://www.deeplearningbook.org/contents/autoencoders.html), they have theoretical connections with latent variable models such as Generative Adversarial Networks (GAN).

### Applications

Autoencoders have a wide variety of applications, such as dimensionality reduction [[Wang, et al. 2014](https://www.cv-foundation.org/openaccess/content_cvpr_workshops_2014/W15/papers/Wang_Generalized_Autoencoder_A_2014_CVPR_paper.pdf)], feature learning, anomaly detection [[Sakurada and Yairi 2014](https://dl.acm.org/doi/pdf/10.1145/2689746.2689747)], and data denoising [[Vincent, et al. 2010](http://www.jmlr.org/papers/volume11/vincent10a/vincent10a.pdf)].
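
To make the anomaly detection use case concrete, the sketch below flags samples whose reconstruction error is unusually high. To stay short, it swaps the neural network for a linear autoencoder fitted with an SVD (equivalent to PCA) instead of gradient descent. The synthetic data, the helper names, and the 99th-percentile threshold are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder "normal" data: 500 points close to a 2-D plane in 10-D space.
basis = rng.normal(size=(2, 10))
normal = rng.normal(size=(500, 2)) @ basis + 0.01 * rng.normal(size=(500, 10))

# Fit the linear autoencoder: the top right singular vectors span the code space.
mean = normal.mean(axis=0)
_, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
components = vt[:2]  # shape (2, 10)


def encode(x):
    """Project the data onto the 2-D code space."""
    return (x - mean) @ components.T


def decode(h):
    """Map codes back to the original 10-D space."""
    return h @ components + mean


def reconstruction_error(x):
    """Mean squared reconstruction error per sample."""
    return np.mean((x - decode(encode(x))) ** 2, axis=1)


# Assumption: the threshold is a high percentile of the error on normal data.
threshold = np.percentile(reconstruction_error(normal), 99)

# Points far from the learned plane reconstruct poorly and are flagged.
anomalies = 3.0 * rng.normal(size=(5, 10))
is_anomaly = reconstruction_error(anomalies) > threshold
```

The same thresholding idea applies when `encode` and `decode` are replaced by the encoder and decoder of a trained neural autoencoder.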

Another application of autoencoders is generative modelling, where, for instance, images are generated by sampling from the lower-dimensional space and decoding the samples.

## Development

Neural network development is well known to require hyperparameter optimisation. However, training a single neural network can take hours, depending on the architecture and the amount of training data. Therefore, this notebook quickly assesses different autoencoder architectures of similar size to select the most promising ones for hyperparameter optimisation.

For the sake of simplicity, a separate Jupyter notebook was created for each architecture, where the training process can be followed. In this notebook, the architectures are loaded and their results are compared.

The architectures evaluated were: 
1. [Convolutional autoencoder](./tests/jupyter-notebooks/train_ae_conv.ipynb);
1. [Convolutional autoencoder with dropout](./tests/jupyter-notebooks/train_ae_conv_drop.ipynb);
1. [VQ-VAE-2 based autoencoder](./tests/jupyter-notebooks/train_ae_add.ipynb);
1. [Depthwise convolutional autoencoder](./tests/jupyter-notebooks/train_ae_depth.ipynb);
1. [Dual convolutional autoencoder](./tests/jupyter-notebooks/train_dual_model.ipynb).
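
A quick assessment of this kind could look roughly like the sketch below, which trains two small candidate models for a few epochs with early stopping and records their parameter count and best validation loss. It is not the procedure used in the linked notebooks: the helper `build_dense_ae`, the candidate names, the layer sizes, and the random placeholder data are all hypothetical.

```python
import numpy as np
import keras.layers as layers
from keras.callbacks import EarlyStopping
from keras.models import Model


def build_dense_ae(input_dim, hidden_dim, code_dim):
    """Hypothetical helper: a small fully-connected autoencoder."""
    inputs = layers.Input(shape=(input_dim,))
    x = layers.Dense(hidden_dim, activation="relu")(inputs)
    code = layers.Dense(code_dim, activation="relu")(x)
    x = layers.Dense(hidden_dim, activation="relu")(code)
    outputs = layers.Dense(input_dim, activation="linear")(x)
    return Model(inputs, outputs)


# Placeholder data standing in for the real dataset.
rng = np.random.default_rng(0)
data = rng.random((512, 64)).astype("float32")

candidates = {
    "wide": build_dense_ae(64, 48, 8),
    "narrow": build_dense_ae(64, 32, 8),
}

results = {}
for name, model in candidates.items():
    model.compile(optimizer="adam", loss="mse")
    # Stop as soon as the validation loss stops improving.
    stopper = EarlyStopping(monitor="val_loss", patience=3, restore_best_weights=True)
    history = model.fit(
        data,
        data,
        validation_split=0.2,
        epochs=20,
        batch_size=64,
        callbacks=[stopper],
        verbose=0,
    )
    results[name] = {
        "params": model.count_params(),
        "best_val_loss": min(history.history["val_loss"]),
    }
```

Comparing `results` side by side gives a rough ranking of the candidates before committing to a full hyperparameter search.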

### Preliminary steps

Import modules and load the data.

```python
# Import modules
import h5py
import keras.layers as layers
import numpy as np
import pandas as pd
import tensorflow as tf
from keras.callbacks import EarlyStopping
from keras.models import Model

from utils import plot_red_comp, slicer, split
from utils_keras import loss_norm_error, loss_norm_error_np
```

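```python
# Datasets to load
dt_fl = "nn_data.h5"
dt_dst = "scaled_data"

# Open data file (kept open so the dataset can be read later on)
f = h5py.File(dt_fl, "r")
dt = f[dt_dst]  # h5py Dataset: data is read from disk only when sliced
```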
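
When the file only needs to stay open for a short time, a context manager is an alternative to the explicit open above. The sketch below is self-contained, so it first writes a small temporary file. The file name `example.h5` and the random contents are assumptions; only the dataset name `scaled_data` is taken from this notebook.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical stand-in for the real data file.
path = os.path.join(tempfile.mkdtemp(), "example.h5")
with h5py.File(path, "w") as f:
    f.create_dataset("scaled_data", data=np.random.rand(100, 8).astype("float32"))

# The file is closed automatically when the block ends.
with h5py.File(path, "r") as f:
    dset = f["scaled_data"]
    shape = dset.shape      # metadata only, no data is read
    first_rows = dset[:10]  # slicing reads just these rows into a NumPy array
```

After the `with` block, `first_rows` remains usable because it is an in-memory copy, whereas `dset` can no longer be read.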

### Convolutional autoencoder

### Convolutional autoencoder with dropout

### VQ-VAE-2 based autoencoder

### Depthwise convolutional autoencoder

### Dual convolutional autoencoder