See `dataAnalysis/main.ipynb` for the preceding and subsequent stages.

This file implements an unsupervised deep embedding process for clustering analysis.
<br>
> A. Constructing an unsupervised deep learning model (e.g. autoencoders).
<br>
> B. Training the deep embedding model to minimise a reconstruction error.
<br>
> C. Embedding generation. 
<br>
> D. Clustering in the embedding space.
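Stages A to D can be sketched end to end on toy data. The block below is only an illustration under loud assumptions: it swaps the Keras model for a tiny *linear* autoencoder trained with hand-written gradient descent, uses a minimal k-means loop for stage D (the notebook does not say which clustering method is used), and runs on made-up data (two synthetic blobs), not the star catalogue. All names in it are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (made up): two well-separated blobs in a 4-dimensional input space.
X = np.vstack([rng.normal(-2.0, 0.3, size=(50, 4)),
               rng.normal(2.0, 0.3, size=(50, 4))])

# A. Construct the model: a linear encoder (4 -> 2) and a linear decoder (2 -> 4).
W_enc = rng.normal(0.0, 0.1, size=(4, 2))
W_dec = rng.normal(0.0, 0.1, size=(2, 4))

def reconstruction_error(X, W_enc, W_dec):
    """Mean squared error between X and its reconstruction."""
    return np.mean((X @ W_enc @ W_dec - X) ** 2)

initial_error = reconstruction_error(X, W_enc, W_dec)

# B. Train by gradient descent on the reconstruction error.
learning_rate = 0.01
for _ in range(500):
    H = X @ W_enc                      # latent codes
    R = H @ W_dec - X                  # reconstruction residual
    grad_dec = H.T @ R * (2 / R.size)
    grad_enc = X.T @ (R @ W_dec.T) * (2 / R.size)
    W_dec -= learning_rate * grad_dec
    W_enc -= learning_rate * grad_enc

final_error = reconstruction_error(X, W_enc, W_dec)

# C. Generate embeddings with the trained encoder only.
H = X @ W_enc

# D. Cluster in the embedding space with a minimal k-means (k = 2).
centres = H[[0, -1]]                   # seed with one point from each end of the data
for _ in range(10):
    distances = np.linalg.norm(H[:, None, :] - centres[None, :, :], axis=2)
    labels = distances.argmin(axis=1)
    centres = np.array([H[labels == k].mean(axis=0) for k in range(2)])

print(f"reconstruction error: {initial_error:.3f} -> {final_error:.3f}")
print("cluster sizes:", np.bincount(labels))
```

A real run would replace the linear maps with the non-linear Keras model and the toy blobs with the catalogue features; the structure of the four stages stays the same.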


Prerequisite knowledge: simple feedforward perceptrons.

In [21]:
# Importing NN libraries
import tensorflow as tf
from tensorflow import keras
import numpy as np

__A. Constructing an Autoencoder.__
<br>
Autoencoders consist of two parts: an `encoder` and a `decoder`.
<br>
The autoencoder is trained to minimise a reconstruction loss, which can be expressed as

> `L(x, g(f(x)))`,

for an input `x` and non-linear encoder and decoder functions `f` and `g`. The lower-dimensional latent representation is often denoted `h = f(x)`.
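As a worked example of the loss above, the sketch below evaluates `L(x, g(f(x)))` by hand. The encoder `f`, decoder `g` and the choice of mean squared error for `L` are toy stand-ins assumed purely for illustration (they are fixed, linear, and not learned), so this is not the model built in this notebook.

```python
def f(x):
    """Hypothetical encoder: compress 4 values to 2 by averaging adjacent pairs."""
    return [(x[0] + x[1]) / 2, (x[2] + x[3]) / 2]

def g(h):
    """Hypothetical decoder: expand 2 values back to 4 by repeating each one."""
    return [h[0], h[0], h[1], h[1]]

def L(x, x_hat):
    """Reconstruction loss, assumed here to be the mean squared error."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

x = [1.0, 3.0, 2.0, 2.0]
h = f(x)          # latent representation h = f(x)
x_hat = g(h)      # reconstruction g(f(x))

print(h)          # [2.0, 2.0]
print(x_hat)      # [2.0, 2.0, 2.0, 2.0]
print(L(x, x_hat))  # 0.5
```

The loss is non-zero because the 2-dimensional bottleneck cannot store the difference between the first two inputs. An input such as `[5.0, 5.0, 7.0, 7.0]`, whose structure matches what the bottleneck can represent, is reconstructed exactly.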

In [22]:
import import_ipynb
from DataAnalysis import DataAnalysis

class undercompleteAE():
    def __init__(self):
        data = DataAnalysis("data/star_data.fits")
        # Input space x: one numCol-dimensional feature vector per sample, with numCol taken from DataAnalysis.
        encoderInputLayer = keras.Input(shape=(data.numCol(),), name="img")
        # Encoder f: compress the input space into a lower-dimensional latent space, i.e. h = f(x).
        # Dense expects an integer number of units, so use integer division instead of np.floor (which returns a float).
        encoderOutputLayer = keras.layers.Dense(data.numCol() // 2, activation="relu")(encoderInputLayer)

        # Decoder g: take the latent space h and (try to) reconstruct the input space x.
        decoderInputLayer = keras.layers.Dense(data.numCol(), activation="relu")(encoderOutputLayer)
        # Reshape the reconstruction g(f(x)) to (numCol, 1); this is the output compared against x in L(x, g(f(x))).
        decoderOutputLayer = keras.layers.Reshape((data.numCol(),1))(decoderInputLayer)

        # Optimiser intended for the training stage (B). It is not attached to the model here; the model is left uncompiled.
        opt = keras.optimizers.Adam(learning_rate=0.001)
        self.autoencoder = keras.Model(encoderInputLayer, decoderOutputLayer, name="autoencoder")

    def summary(self):
        self.autoencoder.summary()


In [23]:
e = undercompleteAE()
e.summary()

24
Model: "autoencoder"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 img (InputLayer)            [(None, 23)]              0         
                                                                 
 dense_10 (Dense)            (None, 11)                264       
                                                                 
 dense_11 (Dense)            (None, 23)                276       
                                                                 
 reshape_4 (Reshape)         (None, 23, 1)             0         
                                                                 
Total params: 540 (2.11 KB)
Trainable params: 540 (2.11 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
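The parameter counts in the summary can be checked by hand: a fully connected layer has one weight per input-unit pair plus one bias per unit. The short sketch below reproduces the numbers reported above; the helper name `dense_params` is hypothetical, and the size estimate assumes 4-byte (float32) parameters.

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters of a dense layer: weights (n_inputs * n_units) plus biases (n_units)."""
    return n_inputs * n_units + n_units

encoder_params = dense_params(23, 11)   # dense_10: 23 inputs -> 11 units
decoder_params = dense_params(11, 23)   # dense_11: 11 inputs -> 23 units
total_params = encoder_params + decoder_params

# Assuming float32 storage, each parameter takes 4 bytes.
size_kb = total_params * 4 / 1024

print(encoder_params, decoder_params, total_params)  # 264 276 540
print(round(size_kb, 2))                             # 2.11
```

The input and reshape layers contribute no parameters, which is why the total equals the sum of the two dense layers.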
