# Data Set Information

This radar data was collected by a system in Goose Bay, Labrador. This system consists of a phased array of 16 high-frequency antennas with a total transmitted power on the order of 6.4 kilowatts. See the paper for more details. The targets were free electrons in the ionosphere. "Good" radar returns are those showing evidence of some type of structure in the ionosphere. "Bad" returns are those that do not; their signals pass through the ionosphere. 

Received signals were processed using an autocorrelation function whose arguments are the time of a pulse and the pulse number. There were 17 pulse numbers for the Goose Bay system. Instances in this database are described by 2 attributes per pulse number (17 × 2 = 34 attributes in total), corresponding to the complex values returned by the function resulting from the complex electromagnetic signal.
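As a rough illustration of "2 attributes per pulse number", the sketch below turns one instance of 34 values into 17 complex numbers. The interleaved (real, imaginary) column order is an assumption made for this example, and the instance values are made up rather than real radar data.

```python
import numpy as np

# One made-up instance with 34 attribute values (not real radar data).
instance = np.arange(34, dtype=float)

# ASSUMPTION: attributes are stored as interleaved (real, imaginary) pairs,
# one pair per pulse number.
pairs = instance.reshape(17, 2)
complex_values = pairs[:, 0] + 1j * pairs[:, 1]

print(complex_values.shape)   # one complex value per pulse number
print(complex_values[:2])
```

If the file stores the parts in a different order, only the `reshape` step would need to change.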

## Attribute Information

- All 34 attributes are continuous.
- The 35th attribute is either "good" or "bad" according to the definition summarized above. This is a binary classification task.
- Source: https://archive.ics.uci.edu/ml/machine-learning-databases/ionosphere/ionosphere.names
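The attribute layout above can be sketched with pandas as follows. The two rows are randomly generated stand-ins in the same shape as the real file (34 numeric columns followed by a `g`/`b` label); the names `features` and `labels` are illustrative and not part of the data set.

```python
import io

import numpy as np
import pandas as pd

# Two made-up rows in the same shape as ionosphere.data:
# 34 numeric attributes followed by a "g"/"b" class label.
rng = np.random.default_rng(0)
rows = []
for label in ("g", "b"):
    values = rng.uniform(-1.0, 1.0, size=34)
    rows.append(",".join(f"{v:.5f}" for v in values) + "," + label)
sample_csv = io.StringIO("\n".join(rows))

data = pd.read_csv(sample_csv, sep=",", header=None)

features = data.iloc[:, :34]                     # columns 0..33: the 34 attributes
labels = (data.iloc[:, 34] == "g").astype(int)   # column 34: "g" -> 1, "b" -> 0

print(features.shape)
print(labels.tolist())
```

Encoding the label as 0/1 keeps it usable as a target for a binary classifier.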

## Data Import and preprocessing

In [14]:
# the file is comma-separated; the non-numeric label column is read as nan
data = np.genfromtxt("data/ionosphere.data", delimiter=",")

In [19]:
data = pd.read_csv('data/ionosphere.data', sep=",", header=None)

In [28]:
# columns 0..33 hold the 34 attributes; column 34 is the class label
df_x = data.iloc[:,:34]

In [30]:
df_y = data.iloc[:,-1]

In [40]:
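# delimiter="," is required for this comma-separated file; without it every line
# is read as a single column, which causes
# "IndexError: list assignment index out of range" for the converter.
# The label is in column 34 (0-based). Depending on the NumPy version the
# converter receives str or bytes, so both are handled.
np.loadtxt("data/ionosphere.data", delimiter=",", converters={34: lambda s: 1.0 if s in ("g", b"g") else 0.0})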

In [37]:
# ** is exponentiation (^ is bitwise XOR); map() is lazy, so wrap it in list()
list(map(lambda x: x**2, [1,2,3]))

[1, 4, 9]

## Modules

In [17]:
from __future__ import print_function  # must be the first statement in the cell
from keras.layers import Input, Dense, Dropout
from keras.models import Model
from keras.datasets import mnist
from keras.models import Sequential, load_model
from keras.optimizers import RMSprop
from keras.callbacks import TensorBoard
from keras.utils import plot_model
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot

import keras
import matplotlib.pyplot as plt
import numpy as np
import math
import pydot
import graphviz
import pandas as pd

# Single layer autoencoder

## Data import and preprocessing

All pixel values are normalized to the range [0, 1], and each 28×28 image is flattened into a vector of 784 values.

In [None]:
(x_train, y_train), (x_val, y_val) = mnist.load_data()

x_train = x_train.astype('float32') / 255.
x_val = x_val.astype('float32') / 255.

x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_val = x_val.reshape((len(x_val), np.prod(x_val.shape[1:])))

print(x_train.shape)
print(x_val.shape)

## Model Definitions

Using the Keras module, with compression to 36 floats.
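To put the compression into numbers, here is a small back-of-the-envelope sketch using the dimensions of this model (`input_dim = 784`, `encoding_dim = 36`). It only does arithmetic and does not touch Keras; the variable names other than those two are illustrative.

```python
input_dim = 784      # 28 * 28 pixels
encoding_dim = 36    # size of the encoded representation

compression_factor = input_dim / encoding_dim

# A Dense layer has (inputs * outputs) weights plus one bias per output.
encoder_params = input_dim * encoding_dim + encoding_dim
decoder_params = encoding_dim * input_dim + input_dim

print(round(compression_factor, 1))
print(encoder_params, decoder_params, encoder_params + decoder_params)
```

The parameter counts should match what `single_autoencoder.summary()` reports for the two `Dense` layers.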

In [None]:
######## constants for autoencoder ############
# this is the size of our encoded representations
encoding_dim = 36
input_dim = 784
epochs = 50
batch_size = 256

In [None]:
# input placeholder
input_img = Input(shape=(input_dim,))
encoded = Dense(encoding_dim, activation='relu')(input_img)
decoded = Dense(input_dim, activation='sigmoid')(encoded)

single_autoencoder = Model(input_img, decoded)

Encoder Model:

In [None]:
# this model maps an input to its encoded representation
single_encoder = Model(input_img, encoded)

Decoder Model:

In [None]:
encoded_input = Input(shape=(encoding_dim,))
# retrieve the last layer of the autoencoder model
decoder_layer = single_autoencoder.layers[-1]
# create the decoder model
single_decoder = Model(encoded_input, decoder_layer(encoded_input))

First, we'll configure our model to use a per-pixel binary crossentropy loss, and the Adadelta optimizer:

Binary Cross Entropy = Binomial Cross Entropy = Special Case of Multinomial Cross Entropy 
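To make that relation concrete, here is a small NumPy sketch (not Keras's internal implementation) that computes per-pixel binary cross-entropy and checks that it equals the two-class categorical cross-entropy. The function names and sample values are illustrative.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over all entries."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-(y_true * np.log(y_pred)
                           + (1.0 - y_true) * np.log(1.0 - y_pred))))

def categorical_crossentropy(p_true, p_pred, eps=1e-7):
    """Mean categorical cross-entropy; the last axis indexes the classes."""
    p_pred = np.clip(p_pred, eps, 1.0)
    return float(np.mean(-np.sum(p_true * np.log(p_pred), axis=-1)))

y_true = np.array([0.0, 1.0, 1.0, 0.0])   # e.g. four target pixel values
y_pred = np.array([0.1, 0.8, 0.6, 0.3])   # reconstructed pixel values

# Treat each pixel as a two-class problem: (class "on", class "off").
p_true = np.stack([y_true, 1.0 - y_true], axis=-1)
p_pred = np.stack([y_pred, 1.0 - y_pred], axis=-1)

bce = binary_crossentropy(y_true, y_pred)
cce = categorical_crossentropy(p_true, p_pred)
print(bce, cce)
```

Both values agree because the binary loss is the categorical loss with exactly two classes whose probabilities sum to 1.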

In [None]:
single_autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

In [None]:
single_autoencoder.summary()

### Train or load single autoencoder model

In [2]:
single_autoencoder = keras.models.load_model('models/single_autoencoder.h5')
# single_autoencoder.fit(x_train, x_train,
#                 epochs=epochs,
#                 batch_size=batch_size,
#                 shuffle=True,
#                 validation_data=(x_val, x_val),
#                 callbacks=[TensorBoard(log_dir='/tmp/autoencoder')])

### Save Models

In [None]:
# single_autoencoder.save('models/single_autoencoder.h5')

In [None]:
score = single_autoencoder.evaluate(x_val, x_val, verbose=0)
print(score)

In [11]:
plot_model(single_autoencoder, to_file='images/single_autoencoder.png', show_shapes=True, show_layer_names=True, rankdir='LR')

![single_autoencoder](images/single_autoencoder.png)

After 50 epochs, the autoencoder seems to reach a stable train/test loss value of about {{score}}. We can try to visualize the reconstructed inputs and the encoded representations. We will use Matplotlib.

In [None]:
encoded_imgs = single_encoder.predict(x_val)
# decoded_imgs = single_decoder.predict(encoded_imgs)
decoded_imgs = single_autoencoder.predict(x_val)

In [None]:
n = 10
plt.figure(figsize=(20, 6))
for i in range(n):
    # display original
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_val[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    
    # display reconstruction
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()

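The following plot is only meaningful if the autoencoder was built and trained during this session: `single_encoder` is wired to the layers defined above, not to a model loaded from `models/single_autoencoder.h5`, so after loading from disk it would encode with untrained weights.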

In [None]:
n = 10
plt.figure(figsize=(40, 10))
for i in range(n):  
    # display encoded
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(encoded_imgs[i].reshape(6, 6))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

plt.show()

Some neurons are always 0. Is further investigation needed?
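One way to start that investigation is to count the units that never activate across the validation set. The sketch below uses a made-up activation matrix as a stand-in for `encoded_imgs` (shape `(n_samples, encoding_dim)`); the name `encoded_demo` and its values are illustrative. A unit that is zero for every input is often called a "dead" ReLU unit, though whether that is the cause here has not been verified.

```python
import numpy as np

# Stand-in for encoded_imgs: 5 samples, 6 hidden units (made-up values).
encoded_demo = np.array([
    [0.0, 1.2, 0.0, 0.3, 0.0, 2.1],
    [0.0, 0.7, 0.0, 0.0, 0.0, 1.4],
    [0.0, 0.0, 0.0, 0.9, 0.0, 0.2],
    [0.0, 2.3, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.4, 0.0, 0.0, 0.0, 3.0],
])

# A unit is "always zero" if its activation is 0 for every sample.
always_zero = np.all(encoded_demo == 0, axis=0)
dead_units = np.flatnonzero(always_zero)

print("always-zero units:", dead_units.tolist())
print("fraction:", always_zero.mean())
```

Replacing `encoded_demo` with `encoded_imgs` would give the indices of the always-zero units in the 6×6 plots above.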

# Stacked Autoencoder

In [None]:
######## constants for stacked autoencoder ############
input_dim = 784
encoding_dim1 = 128
encoding_dim2 = 64
encoding_dim3 = 32
decoding_dim1 = 64
decoding_dim2 = 128
decoding_dim3 = input_dim
epochs = 100
batch_size = 256

In [None]:
input_img = Input(shape=(input_dim,))
encoded = Dense(encoding_dim1, activation='relu')(input_img)
encoded = Dense(encoding_dim2, activation='relu')(encoded)
encoded = Dense(encoding_dim3, activation='relu')(encoded)

decoded = Dense(decoding_dim1, activation='relu')(encoded)
decoded = Dense(decoding_dim2, activation='relu')(decoded)
decoded = Dense(decoding_dim3, activation='sigmoid')(decoded)

In [None]:
stacked_autoencoder = Model(input_img, decoded)
stacked_autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
stacked_autoencoder.summary()

In [6]:
stacked_autoencoder = keras.models.load_model('models/stacked_autoencoder.h5')
# stacked_autoencoder.fit(x_train, x_train,
#                 epochs=epochs,
#                 batch_size=batch_size,
#                 shuffle=True,
#                 validation_data=(x_val, x_val))

Save the model

In [None]:
# stacked_autoencoder.save('models/stacked_autoencoder.h5')

In [None]:
score = stacked_autoencoder.evaluate(x_val, x_val, verbose=0)
print(score)

In [12]:
plot_model(stacked_autoencoder, to_file='images/stacked_autoencoder.png', show_shapes=True, show_layer_names=True, rankdir='LR')

![stacked_autoencoder](images/stacked_autoencoder.png)

In [None]:
decoded_imgs = stacked_autoencoder.predict(x_val)

In [None]:
n = 10  # how many digits we will display
plt.figure(figsize=(20, 4))
for i in range(n):
    # display original
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_val[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # display reconstruction
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()

# Denoising Data

In [None]:
noise_factor = 0.5

In [None]:
x_train_noisy = x_train + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_train.shape) 
x_val_noisy = x_val + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_val.shape) 

# re-normalize by clipping to the interval [0, 1]
x_train_noisy = np.clip(x_train_noisy, 0., 1.)
x_val_noisy = np.clip(x_val_noisy, 0., 1.)

In [None]:
n = 10
plt.figure(figsize=(20, 2))
for i in range(n):
    ax = plt.subplot(1, n, i + 1)
    plt.imshow(x_val_noisy[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()

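We stick with the stacked architecture for the denoising autoencoder. Note that `Model(input_img, decoded)` reuses the layers defined above, so it shares its weights with any other model built from them.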

In [None]:
denoising_autoencoder = Model(input_img, decoded)
denoising_autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
denoising_autoencoder.summary()

### Train or load a stacked denoising autoencoder

In [None]:
# denoising_autoencoder = keras.models.load_model('models/denoising_autoencoder.h5')
denoising_autoencoder.fit(x_train_noisy, x_train,
                epochs=epochs,
                batch_size=batch_size,
                shuffle=True,
                validation_data=(x_val_noisy, x_val),
                callbacks=[TensorBoard(log_dir='/tmp/autoencoder')])

Save trained model

In [None]:
# denoising_autoencoder.save('models/denoising_autoencoder.h5')

In [None]:
decoded_imgs = denoising_autoencoder.predict(x_val_noisy)

In [None]:
n = 10  # how many digits we will display
plt.figure(figsize=(20, 4))
for i in range(n):
    # display original
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_val[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # display reconstruction
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()

# Train Neural Net to recognize MNIST digits

In [None]:
# constants
batch_size = 128
num_classes = 10
epochs = 20
hidden1_dim = 512
hidden2_dim = 512

## Preprocess input data

In [None]:
y_train = keras.utils.to_categorical(y_train, num_classes)
y_val = keras.utils.to_categorical(y_val, num_classes)
print(y_train[0])

In [None]:
model = Sequential()
model.add(Dense(hidden1_dim, activation='relu', input_shape=(input_dim,)))
model.add(Dropout(0.2))
model.add(Dense(hidden2_dim, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(num_classes, activation='softmax'))

In [None]:
model.summary()

In [None]:
model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])

### Train or load Model

In [13]:
model = keras.models.load_model('models/model.h5')
# model.fit(x_train, y_train, 
#           batch_size=batch_size,
#           epochs=epochs,
#           verbose=1,
#           validation_data=(x_val, y_val))

Save trained model

In [None]:
# model.save('models/model.h5')

In [14]:
plot_model(model, to_file='images/mnist_nn.png', show_shapes=True, show_layer_names=True, rankdir='LR')

![MNIST Neural Net](images/mnist_nn.png)

In [None]:
score = model.evaluate(x_val, y_val, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

# Compare results

## Classification of noisy data

In [None]:
score = model.evaluate(x_val_noisy, y_val, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

## Classification of denoised data

In [None]:
score = model.evaluate(decoded_imgs, y_val, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

# Additional Information and Footage

## Cross Entropy

![Cross Entropy](images/2017-12-03 10_42_19-Machine Learning_ Should I use a categorical cross entropy or binary cross entro.png)

## Batch Size

The batch size defines the number of samples that are propagated through the network at once.

For instance, say you have 1050 training samples and set `batch_size` to 100. The algorithm takes the first 100 samples (1st to 100th) from the training dataset and trains the network. Next it takes the second 100 samples (101st to 200th) and trains the network again. This procedure continues until all samples have been propagated through the network. A problem usually arises with the last set of samples: in this example, 1050 is not divisible by 100 without remainder. The simplest solution is to take the final 50 samples and train the network on them.
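The batching described above can be sketched in a few lines. This is a plain NumPy illustration of the slicing, not how Keras implements it internally; `samples` is a stand-in for the training set.

```python
import numpy as np

n_samples = 1050
batch_size = 100
samples = np.arange(n_samples)  # stand-in for the training set

batch_sizes = []
for start in range(0, n_samples, batch_size):
    batch = samples[start:start + batch_size]  # the last slice is shorter
    batch_sizes.append(len(batch))

print(len(batch_sizes))   # number of batches per epoch
print(batch_sizes[-1])    # size of the final batch
```

Python slicing simply stops at the end of the array, which is why the final batch of 50 needs no special handling.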

Advantages:

* It requires less memory. Since the network is trained on fewer samples at a time, the overall training procedure requires less memory. This is especially important if the whole dataset does not fit in memory.
* Networks typically train faster with mini-batches, because the weights are updated after each propagation. In the example above, 11 batches are propagated (10 with 100 samples and 1 with 50 samples), and the network's parameters are updated after each of them. If all samples were used in a single propagation, there would be only 1 update of the network's parameters.

Disadvantages:

* The smaller the batch, the less accurate the estimate of the gradient. In the figure below, the direction of the mini-batch gradient (green) fluctuates compared to the full-batch gradient (blue).

![Mini-batch vs. full-batch gradient direction](images/lU3sx.png)

Source: https://stats.stackexchange.com/questions/153531/what-is-batch-size-in-neural-network
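The fluctuation can also be reproduced numerically without a network. The sketch below uses a made-up one-parameter least-squares problem (all data and names are illustrative) and compares how much the mini-batch gradient scatters for two batch sizes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up regression data: y = 3*x + noise, with model y_hat = w*x.
x = rng.normal(size=1000)
y = 3.0 * x + rng.normal(scale=0.5, size=1000)
w = 0.0  # current parameter value

def gradient(idx):
    """Gradient of the mean squared error w.r.t. w on the samples in idx."""
    return float(np.mean(2.0 * (w * x[idx] - y[idx]) * x[idx]))

full_gradient = gradient(np.arange(len(x)))

def gradient_spread(batch_size, n_trials=500):
    """Standard deviation of the mini-batch gradient over random batches."""
    grads = [gradient(rng.choice(len(x), size=batch_size, replace=False))
             for _ in range(n_trials)]
    return float(np.std(grads))

spread_small = gradient_spread(10)
spread_large = gradient_spread(100)
print(full_gradient, spread_small, spread_large)
```

The spread for the batch of 10 should come out clearly larger than for the batch of 100, which is the noisier gradient estimate that the disadvantage above refers to.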

To be done:

- check various encoders with different numbers of hidden layers
- feature extraction