# Building an Autoencoder

An **autoencoder** is a neural network with the task of reconstructing its own input. 

The network architecture is composed of two parts: the **encoder** and the **decoder**. Because the layers narrow toward the middle and widen again, it can be thought of as having a bowtie shape.

![autoencoder](https://upload.wikimedia.org/wikipedia/commons/thumb/3/37/Autoencoder_schema.png/528px-Autoencoder_schema.png)

The input is a data vector of dimension $n$ that is given to an **input layer**. It is then processed through one or more layers until it arrives at a **hidden layer** with fewer neurons, which outputs a vector of dimension $k < n$. This part of the network is the **encoder**, and the narrow hidden layer is also called the latent or "bottleneck" layer.

The layers that follow make up the **decoder**. It receives the vector of size $k$ as input and is trained to reconstruct the original input of size $n$ from it.
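
To make the flow of dimensions concrete, here is a minimal NumPy sketch of the encoder and decoder shapes. It is not the Keras model built later in this notebook: the weights are random and untrained, and the names `encode`, `decode`, `W_enc` and `W_dec` as well as the values of `n` and `k` are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n, k = 784, 32  # input and latent dimensions (illustrative values)

# Random, untrained weights: this only demonstrates the shapes,
# not a useful encoding.
W_enc = rng.normal(scale=0.01, size=(n, k))
W_dec = rng.normal(scale=0.01, size=(k, n))


def encode(x):
    """Map an input vector of size n to a latent vector of size k."""
    return np.maximum(x @ W_enc, 0.0)  # ReLU activation


def decode(z):
    """Map a latent vector of size k back to a vector of size n."""
    return 1.0 / (1.0 + np.exp(-(z @ W_dec)))  # sigmoid activation


x = rng.random(n)    # stand-in for one flattened input
z = encode(x)        # compressed representation
x_hat = decode(z)    # reconstruction with the original dimension

print(x.shape, z.shape, x_hat.shape)
```

Training consists of adjusting the weights so that `x_hat` becomes as close as possible to `x`.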



In this example we are going to see how to construct and train an autoencoder.

## Preamble

In [None]:
import matplotlib.pyplot as plt
import numpy as np

In [None]:
from tensorflow import keras

## Data Preprocessing

We use the MNIST dataset in this example:

In [None]:
(X_train, y_train),(X_test, y_test) = keras.datasets.mnist.load_data()

Since neural networks are sensitive to the scale of the input values, we start by scaling all pixel values to the interval $[0,1]$:

In [None]:
X_train, X_test = X_train / 255.0, X_test / 255.0
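
As a small sanity check of what this division does, the following sketch applies the same scaling to a tiny hand-made array. The array `images` is a hypothetical stand-in for a batch of MNIST images, which are stored as unsigned 8-bit integers between 0 and 255.

```python
import numpy as np

# Hypothetical stand-in for a small batch of images: uint8 values in 0..255.
images = np.array([[[0, 128], [255, 64]]], dtype=np.uint8)

# True division by a float converts the integers to floating point.
scaled = images / 255.0

print(scaled.dtype, scaled.min(), scaled.max())
```

The same check (`X_train.min()`, `X_train.max()`) can be run on the real data after scaling.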

## Network Architecture

The key property of our network architecture is the size of the latent layer: the number of neurons in this "bottleneck" layer determines how strongly the input is compressed and, in turn, how difficult it will be for the decoder to reconstruct the image.

We start with a guess of 32 neurons in the latent layer.

In [None]:
img_width = 28
input_dim = img_width * img_width
latent_dim = 32
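
To get a feeling for how strong this bottleneck is, the compression ratio can be worked out directly from the values above. This is plain arithmetic; the variable `compression_ratio` is introduced here for illustration only.

```python
img_width = 28
input_dim = img_width * img_width  # 784 pixel values per image
latent_dim = 32

# How many input values are squeezed into each latent value on average.
compression_ratio = input_dim / latent_dim

print(f"{input_dim} inputs -> {latent_dim} latent values "
      f"(ratio {compression_ratio:.1f}:1)")
```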

We choose the following architecture for the encoder part of the network:

* a `Flatten` layer turns the image into a 1-dimensional vector of $n$ grayscale values
* two `Dense` layers implement the reduction from $n$ to $k$ values: the first keeps $n$ units, the second narrows down to the $k$ latent units

In [None]:
encoder_layers = [
    keras.layers.Flatten(),                               # 28x28 image -> vector of n values
    keras.layers.Dense(input_dim, activation="sigmoid"),  # n -> n
    keras.layers.Dense(latent_dim, activation="relu")     # n -> k (latent layer)
]

For the decoder part, we add:

* another `Dense` layer that expands the $k$ latent values back to $n$ values
* a `Reshape` layer that undoes the `Flatten` operation and outputs a 2-dimensional image again

In [None]:
decoder_layers = [
    keras.layers.Dense(input_dim, activation="sigmoid"),  # k -> n, outputs in (0, 1)
    keras.layers.Reshape((img_width, img_width))          # vector of n values -> 28x28 image
]

We assemble encoder and decoder layers into a sequential model:

In [None]:
autoencoder = keras.models.Sequential(encoder_layers + decoder_layers)

The model is compiled with the _Mean Squared Error_ as the loss function, since our objective is to minimize the difference between the pixel values of the input and those of the reconstruction.

In [None]:
autoencoder.compile(
    optimizer='adam',
    loss='mse'
)
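
For intuition, the mean squared error between an image and its reconstruction can be computed by hand. The sketch below uses NumPy on two tiny made-up arrays; the helper `mean_squared_error` is written here for illustration and is not the Keras implementation, although it should yield the same overall average.

```python
import numpy as np


def mean_squared_error(original, reconstruction):
    """Average of the squared pixel-wise differences."""
    original = np.asarray(original, dtype=float)
    reconstruction = np.asarray(reconstruction, dtype=float)
    return np.mean((original - reconstruction) ** 2)


# Made-up 2x2 "images" with pixel values in [0, 1].
original = np.array([[0.0, 1.0], [0.5, 0.0]])
reconstruction = np.array([[0.1, 0.8], [0.5, 0.2]])

# Squared differences: 0.01, 0.04, 0.00, 0.04 -> mean is 0.0225
mse = mean_squared_error(original, reconstruction)
print(mse)
```

A perfect reconstruction gives an error of exactly 0, and larger pixel deviations are penalized quadratically.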

Now to training: since an autoencoder is trained to approximate the _identity_ function (that is, a function whose output equals its input), the feature and target tensors we pass are one and the same.

In [None]:
autoencoder.fit(
    X_train,  # inputs
    X_train,  # targets: the network should reproduce its own input
    epochs=5,
    batch_size=16
)

In [None]:
autoencoder.summary()
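
The parameter counts reported in the summary can be verified by hand: a fully connected layer has one weight per input-unit pair plus one bias per unit, while `Flatten` and `Reshape` have no trainable parameters. The helper `dense_params` below is an illustrative function, not part of Keras, and the numbers assume the layer sizes used in this notebook.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit of a fully connected layer."""
    return n_inputs * n_units + n_units


input_dim, latent_dim = 784, 32

layer_params = {
    "encoder Dense (784 -> 784)": dense_params(input_dim, input_dim),
    "encoder Dense (784 -> 32)": dense_params(input_dim, latent_dim),
    "decoder Dense (32 -> 784)": dense_params(latent_dim, input_dim),
}
total = sum(layer_params.values())

for name, count in layer_params.items():
    print(f"{name}: {count:,}")
print(f"total: {total:,}")
```

Most of the parameters sit in the first `Dense` layer, which does not reduce the dimension at all.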

## Applying the Autoencoder

After training is finished, we feed the test set of images to the autoencoder in order to see if it can accurately reconstruct the images.

In [None]:
decoded = autoencoder.predict(X_test)

In [None]:
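# Show each original test image (left) next to its reconstruction (right)
for i in range(25):
    f, axarr = plt.subplots(nrows=1, ncols=2)
    axarr[0].imshow(X_test[i], cmap="binary")
    axarr[0].set_title("Original")
    axarr[1].imshow(decoded[i], cmap="binary")
    axarr[1].set_title("Reconstruction")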
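
Besides inspecting the images visually, the reconstruction quality can be quantified per image. The following is a minimal sketch that uses random stand-in arrays in place of the real test images and reconstructions; the helper `per_image_mse` is a hypothetical name introduced here.

```python
import numpy as np


def per_image_mse(originals, reconstructions):
    """Mean squared error per image, averaged over all pixel axes."""
    diff = (np.asarray(originals, dtype=float)
            - np.asarray(reconstructions, dtype=float))
    pixel_axes = tuple(range(1, diff.ndim))  # every axis except the batch axis
    return np.mean(diff ** 2, axis=pixel_axes)


# Stand-in data: 5 random "images" and slightly noisy "reconstructions".
rng = np.random.default_rng(0)
originals = rng.random((5, 28, 28))
noise = rng.normal(scale=0.05, size=originals.shape)
reconstructions = np.clip(originals + noise, 0.0, 1.0)

errors = per_image_mse(originals, reconstructions)
worst = int(np.argmax(errors))  # index of the worst-reconstructed image

print(errors.shape, worst)
```

With the variables from this notebook, the call would be `per_image_mse(X_test, decoded)`. Sorting the resulting errors points to the digits the autoencoder struggles with most.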


## Exercise: Experiment with the Architecture

Try experimenting with the autoencoder's network architecture. In particular, try changing the size of the latent layer. How small can you make it? Can you still achieve good reconstruction quality with fewer neurons?

In [None]:
# Your code here

## Exercise: Fashion Autoencoder

Try building an autoencoder for the _Fashion MNIST_ dataset!


In [None]:
# Your code here

---
_This notebook is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)](https://creativecommons.org/licenses/by-nc-sa/4.0/). Copyright © 2018-2025 [Point 8 GmbH](https://point-8.de)_