Adversarial autoencoders (as explained [here][1]) use unsupervised learning to find latent space representations of a given dataset and to generate images similar to that dataset. In the following, I will briefly explain the principle behind AAEs and give the code for building one in Keras.

The autoencoder takes an image x and "encodes" it into a latent vector z. This is done by the encoder part of the network. The decoder then takes the vector z and tries to restore the original image as closely as possible (x'). This means the encoder imperfectly "compresses" the image x and the decoder imperfectly restores the image from z.

Training an autoencoder is fairly easy: take the decoder network, put it directly on top of the encoder network and then train with x as input and x as desired output. After training, the encoder can be used to find the representation z of a given image. If we then change z a little, the decoded result will likely not be an image similar to the dataset we started out with (e.g. not a valid digit in the case of MNIST). We would like to be able to vary z and get meaningful changes in x', though.

This problem is solved by forcing z to follow a chosen probability distribution (e.g. a normal distribution). I know two ways to do that:
a) Use the Kullback–Leibler divergence as an additional loss term during training (this is called a variational autoencoder), or
b) Use an adversarial network as an additional training partner for the encoder.
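
For option a, the extra loss term has a closed form when the encoder outputs a diagonal Gaussian and the target is a standard normal distribution. The sketch below is plain Python and is not part of this kernel's model; `mu` and `log_var` are hypothetical names for what a variational encoder would output per latent dimension.

```python
import math

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, exp(log_var)) || N(0, 1) ), summed over the latent dimensions."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv   # sigma^2 + mu^2 - 1 - log(sigma^2)
        for m, lv in zip(mu, log_var)
    )

# A latent code that already matches the standard normal costs nothing
print(kl_to_standard_normal([0.0, 0.0], [0.0, 0.0]))   # 0.0
# Shifting the mean away from zero is penalised
print(kl_to_standard_normal([1.0, -1.0], [0.0, 0.0]))  # 1.0
```

Option b replaces this analytic penalty with a learned one: the discriminator's judgement takes the role of the divergence term.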

In this kernel, I opted for option b and thus built an adversarial autoencoder (AAE).

Here are the generated images after running the code below for 100 epochs:
http://imgur.com/a/1pNdM

  [1]: http://hjweide.github.io/adversarial-autoencoders

First, the MNIST dataset is imported:

In [None]:
import numpy as np
import keras as ke
import pandas as pd
import matplotlib.pyplot as plt

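train = pd.read_csv("../input/train.csv").values

# Skip the first column (the label), reshape each row to a 28x28x1 image
# and scale the pixel values to [0, 1]
x_train = train[:, 1:].reshape(train.shape[0], 28, 28, 1)
x_train = x_train.astype(float)
x_train /= 255.0

# Only a small subset is used here; use the entire training set
# when running this code on your own machine
x_train = x_train[:2500]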

Next, the models for Encoder, Decoder and Discriminator (the adversarial opponent) are defined:

In [None]:
def build_model_enc():
    model = ke.models.Sequential()
    model.add(ke.layers.Conv2D(32, (5,5), padding="same", activation="relu", input_shape=(28, 28, 1)))
    model.add(ke.layers.Conv2D(64, (5,5), strides=(2,2), activation="relu", padding="same"))
    model.add(ke.layers.Conv2D(128, (5,5), strides=(2,2), activation="relu", padding="same"))
    model.add(ke.layers.Flatten())
    model.add(ke.layers.Dense(2, activation="linear"))

    return model

def build_model_dec():
    model = ke.models.Sequential()
    model.add(ke.layers.Dense(6272, input_shape=(2,)))
    model.add(ke.layers.Reshape((7, 7, 128)))
    model.add(ke.layers.Conv2D(64, (5,5), activation="relu", padding="same"))
    model.add(ke.layers.UpSampling2D())
    model.add(ke.layers.Conv2D(32, (5,5), activation="relu", padding="same"))
    model.add(ke.layers.UpSampling2D())
    model.add(ke.layers.Conv2D(1, (5,5), activation="sigmoid", padding="same"))

    return model

def build_model_disc():
    model = ke.models.Sequential()
    model.add(ke.layers.Dense(32, activation="relu", input_shape=(2,)))
    model.add(ke.layers.Dense(32, activation="relu"))
    model.add(ke.layers.Dense(1, activation="sigmoid"))
    return model
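
The layer sizes above are tied together by simple arithmetic: with `padding="same"`, a convolution with stride s maps a spatial size n to ceil(n / s), and the decoder's first `Dense` layer has to match the flattened encoder feature map. The following is a small sanity-check sketch in plain Python (the helper name `same_conv_output` is made up for illustration):

```python
import math

def same_conv_output(size, stride=1):
    """Spatial output size of a convolution with padding="same"."""
    return math.ceil(size / stride)

size = 28
encoder_sizes = [size]
for stride in (1, 2, 2):               # strides of the three encoder Conv2D layers
    size = same_conv_output(size, stride)
    encoder_sizes.append(size)

flattened = size * size * 128          # 128 filters in the last encoder Conv2D
decoder_size = 7 * 2 * 2               # two UpSampling2D layers, each doubling 7x7

print(encoder_sizes)   # [28, 28, 14, 7]
print(flattened)       # 6272
print(decoder_size)    # 28
```

This is why the decoder starts with `Dense(6272)` followed by `Reshape((7, 7, 128))` and ends up at 28x28 again.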

The models are put together in the required combinations and compiled:

In [None]:
def build_model_aae():
    model_enc = build_model_enc()
    model_dec = build_model_dec()
    model_disc = build_model_disc()
    
    model_ae = ke.models.Sequential()
    model_ae.add(model_enc)
    model_ae.add(model_dec)
    
    model_enc_disc = ke.models.Sequential()
    model_enc_disc.add(model_enc)
    model_enc_disc.add(model_disc)
    
    return model_enc, model_dec, model_disc, model_ae, model_enc_disc

model_enc, model_dec, model_disc, model_ae, model_enc_disc = build_model_aae()

model_enc.summary()
model_dec.summary()
model_disc.summary()
model_ae.summary()
model_enc_disc.summary()

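# Note: newer Keras releases name the `lr` argument `learning_rate`.
# Compile the discriminator on its own while its weights are trainable.
model_disc.compile(optimizer=ke.optimizers.Adam(lr=1e-4), loss="binary_crossentropy")

# Freeze the discriminator before compiling the encoder+discriminator stack.
# Some Keras versions fix the set of trainable weights at compile time, so
# toggling `trainable` only inside the training loop may not take effect there.
model_disc.trainable = False
model_enc_disc.compile(optimizer=ke.optimizers.Adam(lr=1e-4), loss="binary_crossentropy")
model_ae.compile(optimizer=ke.optimizers.Adam(lr=1e-3), loss="binary_crossentropy")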

Some helper functions are defined to facilitate training and to give an overview of the two-dimensional latent space.

In [None]:

def imagegrid(dec, epochnumber):
    # Decode a 10x10 grid of latent points and plot the resulting images
    fig = plt.figure(figsize=[20, 20])

    for i in range(-5, 5):
        for j in range(-5, 5):
            # Latent coordinates in steps of 0.5
            topred = np.array((i*0.5, j*0.5))
            topred = topred.reshape((1, 2))
            img = dec.predict(topred)
            img = img.reshape((28, 28))
            ax = fig.add_subplot(10, 10, (i+5)*10+j+5+1)
            ax.set_axis_off()
            ax.imshow(img, cmap="gray")

    # Save one overview image per epoch
    fig.savefig(str(epochnumber)+".png")
    plt.show()
    plt.close(fig)
        
def settrainable(model, toset):
    for layer in model.layers:
        layer.trainable = toset
    model.trainable = toset
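
To see which part of the latent space `imagegrid` covers, the loop indices can be expanded on their own. This is only an illustrative sketch of the index arithmetic, with no Keras or matplotlib involved; the names `coords` and `subplot_index` are not used elsewhere in the kernel.

```python
# Latent coordinates visited by the two nested loops in imagegrid
coords = [(i * 0.5, j * 0.5) for i in range(-5, 5) for j in range(-5, 5)]

# Matching 1-based subplot positions in the 10x10 figure
subplot_index = [(i + 5) * 10 + j + 5 + 1 for i in range(-5, 5) for j in range(-5, 5)]

print(len(coords))                          # 100
print(coords[0], coords[-1])                # (-2.5, -2.5) (2.0, 2.0)
print(subplot_index[0], subplot_index[-1])  # 1 100
```

The grid therefore spans -2.5 to 2.0 in both latent dimensions, which covers most of the mass of a standard normal distribution but is not exactly centred on zero.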

The model is trained in the following steps for each minibatch:

1) The autoencoder is trained
2) The discriminator is trained to differentiate between z from images and the distribution we want z to have (normal distribution in this case)
3) The encoder is trained to fool the parameter-fixed discriminator into thinking the z are real
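
Steps 2 and 3 use the same binary cross-entropy loss but with opposite targets for the encoded z. The sketch below illustrates this in plain Python with made-up discriminator outputs (`d_on_encoded` and `d_on_prior` are hypothetical values, not results from this kernel):

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over a batch of predictions in (0, 1)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)   # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)

# Hypothetical outputs of a discriminator that currently does its job well
d_on_encoded = [0.2, 0.1]   # z produced by the encoder
d_on_prior = [0.9, 0.8]     # z sampled from the normal distribution

# Step 2: the discriminator is trained with target 0 for encoded z, 1 for prior samples
disc_loss = binary_crossentropy([0, 0, 1, 1], d_on_encoded + d_on_prior)

# Step 3: the encoder is trained with target 1 for its own z
enc_loss = binary_crossentropy([1, 1], d_on_encoded)

print(disc_loss < enc_loss)   # True
```

When the discriminator separates the two sources well, its own loss is low and the encoder's loss is high, which is what pushes the encoder to move its z towards the target distribution.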

In [None]:
batchsize = 50
# Set the number of epochs to 10-20 or higher.
for epochnumber in range(1):
    np.random.shuffle(x_train)
    
    for i in range(int(len(x_train) / batchsize)):
        # Step 1: train the autoencoder to reconstruct its input
        settrainable(model_ae, True)
        settrainable(model_enc, True)
        settrainable(model_dec, True)
        
        batch = x_train[i*batchsize:i*batchsize+batchsize]
        model_ae.train_on_batch(batch, batch)
        
        # Step 2: train the discriminator.
        # Label 0 = z encoded from images, label 1 = samples from the normal
        # distribution (despite its name, fakepred holds the target samples)
        settrainable(model_disc, True)
        batchpred = model_enc.predict(batch)
        fakepred = np.random.standard_normal((batchsize,2))
        discbatch_x = np.concatenate([batchpred, fakepred])
        discbatch_y = np.concatenate([np.zeros(batchsize), np.ones(batchsize)])
        model_disc.train_on_batch(discbatch_x, discbatch_y)
        
        # Step 3: train the encoder so that the frozen discriminator
        # outputs 1 for encoded images
        settrainable(model_enc_disc, True)
        settrainable(model_enc, True)
        settrainable(model_disc, False)
        model_enc_disc.train_on_batch(batch, np.ones(batchsize))
    
    print("Reconstruction Loss:", model_ae.evaluate(x_train, x_train, verbose=0))
    print("Adversarial Loss:", model_enc_disc.evaluate(x_train, np.ones(len(x_train)), verbose=0))
    
    
    imagegrid(model_dec, epochnumber)     
        