In [1]:
import keras
import tensorflow as tf

print(keras.__version__)
print(tf.__version__)

Using TensorFlow backend.


2.1.3
1.5.0


## PCA Autoencoder

We all know what PCA (Principal Component Analysis) is. All PCA does is find an optimal hyperplane onto which higher-dimensional data can be projected in order to reduce the dimension of the original data. That hyperplane is spanned by the leading eigenvectors of the data's covariance matrix, so finding it requires some old-school linear algebra.
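
To make the eigenvector remark concrete, here is a minimal NumPy sketch of classical PCA, assuming each row of `X` is one sample. The helper name `pca_project` and the random toy data are made up for illustration and are not part of this notebook.

```python
import numpy as np


def pca_project(X, n_components):
    """Project the rows of X onto the top principal directions (illustrative helper)."""
    mean = X.mean(axis=0)
    X_centered = X - mean

    # Covariance matrix of the features, shape (n_features, n_features)
    cov = np.cov(X_centered, rowvar=False)

    # eigh returns eigenvalues in ascending order, so sort them descending
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]  # (n_features, n_components)

    codes = X_centered @ components               # lower-dimensional representation
    reconstruction = codes @ components.T + mean  # back in the original space
    return codes, reconstruction


# Toy data: 200 samples with 5 correlated features (assumption for the demo)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))

codes, X_hat = pca_project(X, n_components=2)
print(codes.shape)                 # (200, 2)
print(np.mean((X - X_hat) ** 2))   # reconstruction error of the 2-D projection
```

Keeping all 5 components would reproduce `X` exactly; dropping components is what makes the projection lossy.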

However, the same can be done with a neural network, which reduces the dimension by encoding higher-dimensional data into lower-dimensional features. This technique is called **encoding**.  

The reverse of encoding is **decoding**.


### Example
Say we have a **64 x 64** image.  

Number of pixels = 4096  
Number of channels = 1 (for simplicity)

When we try to build a model for an image with such a (relatively) large number of pixels, the machine learning model becomes computationally expensive. So, we encode the pixels into a smaller representation for further use. Further uses can be:
- image labelling
- image captioning
- semantic segmentation
- ...
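
As a quick check of the numbers above, the sketch below counts the input values of the example image and compares them with a code size. The code size of 32 is an assumption chosen only for illustration; the text does not fix one.

```python
# Image dimensions from the example above
height, width, channels = 64, 64, 1

# Every pixel value becomes one input feature after flattening
n_inputs = height * width * channels
print(n_inputs)  # 4096

# Hypothetical code size (assumption, not given in the text)
code_size = 32

# How many times smaller the encoded vector is than the raw image
compression_ratio = n_inputs / code_size
print(compression_ratio)  # 128.0
```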

### Architecture
The architecture that **encodes** and **decodes** is an **autoencoder** - as simple as that. :D

**Encoder**
- Flatten the original image (2D to 1D)
- Add a dense (fully connected) layer
- This dense layer's size is the size of our encoding (the code size)

**Decoder**
- Accept a 1D vector (the encoded vector)
- Add a dense layer whose size equals the number of values in the original image (including the number of channels)
- Reshape the output to the image shape to get an approximation of the original image
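
The encoder and decoder steps listed above can be sketched in plain NumPy to show how the shapes change. This is only an illustration with untrained random weights: the names `W_enc`, `W_dec`, `encode`, and `decode` are hypothetical, and the code size of 32 is an assumption.

```python
import numpy as np

img_shape = (64, 64, 1)   # height, width, channels
code_size = 32            # assumed code size, for illustration only
n_pixels = int(np.prod(img_shape))

rng = np.random.default_rng(0)

# A dense layer without activation is a matrix multiplication plus a bias
W_enc = rng.normal(scale=0.01, size=(n_pixels, code_size))
b_enc = np.zeros(code_size)
W_dec = rng.normal(scale=0.01, size=(code_size, n_pixels))
b_dec = np.zeros(n_pixels)


def encode(image):
    """Flatten the image and map it to a code vector."""
    return image.reshape(-1) @ W_enc + b_enc


def decode(code):
    """Map a code vector back to pixel values and restore the image shape."""
    return (code @ W_dec + b_dec).reshape(img_shape)


image = rng.random(img_shape)
code = encode(image)
restored = decode(code)

print(code.shape)      # (32,)
print(restored.shape)  # (64, 64, 1)
```

With random weights `restored` looks nothing like `image`; training is what brings the two close together.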

Although we can technically compress an image (or remove irrelevant features - in the case of an image, pixels), 
the reconstruction is lossy. While encoding the image, we lose some information that the decoder can never recover.
So, the best we can do is minimize this reconstruction error.
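
One common way to measure reconstruction error is the mean squared error between the original and the reconstructed pixels. The sketch below assumes that metric; the text does not name a specific loss, and the helper `reconstruction_mse` and the tiny 2 x 2 "image" are made up for illustration.

```python
import numpy as np


def reconstruction_mse(original, reconstructed):
    """Mean squared difference between two arrays of the same shape."""
    original = np.asarray(original, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    return float(np.mean((original - reconstructed) ** 2))


# A tiny 2 x 2 grayscale "image" with values in [0, 1]
image = np.array([[0.0, 0.5],
                  [1.0, 0.25]])

# A perfect reconstruction has zero error
perfect = image.copy()
print(reconstruction_mse(image, perfect))  # 0.0

# A reconstruction that keeps only the average brightness loses detail
blurry = np.full_like(image, image.mean())
print(reconstruction_mse(image, blurry))
```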

So, an autoencoder is nothing but a neural network of the form:

input -> encoder -> [encoded output] -> decoder -> [decoded output as near to the input]


In [3]:
import numpy as np
from keras import layers as L


def build_pca_autoencoder(img_shape, code_size):
    """
    Build a simple linear autoencoder as described above.
    The image is flattened before encoding and reshaped after decoding,
    so both models work directly with image-shaped data.
    """
    
    encoder = keras.models.Sequential()
    
    # accept image
    encoder.add(L.InputLayer(img_shape))
    
    # flatten pixels
    encoder.add(L.Flatten())
    
    # add a dense layer to encode the pixels
    encoder.add(L.Dense(code_size))

    decoder = keras.models.Sequential()
    # accept encoded input
    decoder.add(L.InputLayer((code_size,)))
    
    # add a dense layer with one unit per value in the original image (all channels);
    # int() converts the NumPy integer from np.prod to a plain Python int
    decoder.add(L.Dense(int(np.prod(img_shape))))
    
    # reshape as 2d image (h, w, c)
    decoder.add(L.Reshape(img_shape))
    
    return encoder, decoder
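
To get a feel for the size of the model built above, the sketch below counts its trainable parameters by hand. It assumes the 64 x 64 x 1 image from the example, a code size of 32 (an assumption, not a value from this notebook), and the Keras default of one bias per dense unit. The helper `dense_params` is hypothetical.

```python
def dense_params(n_in, n_out):
    """Trainable parameters of a dense layer: weight matrix plus bias vector."""
    return n_in * n_out + n_out


img_shape = (64, 64, 1)
code_size = 32  # assumed code size, for illustration only

# Number of values in one image after flattening
n_pixels = 1
for dim in img_shape:
    n_pixels *= dim

encoder_params = dense_params(n_pixels, code_size)   # Flatten -> Dense(code_size)
decoder_params = dense_params(code_size, n_pixels)   # Dense(n_pixels) -> Reshape

print(encoder_params)                   # 131104
print(decoder_params)                   # 135168
print(encoder_params + decoder_params)  # 266272
```

The flatten and reshape layers add no parameters, so the two dense layers account for the whole model.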