# MNIST multilayer network

## Exercise - Load and preprocess data

> **Exercise**: Load the MNIST data. Split it into train, validation and test sets. Standardize the images. Define a `get_batches(X, y, batch_size)` function to generate random X/y batches of size `batch_size` using a Python generator.

In [5]:
# Choose which mnist-<k>k.npz file to load (k = number of images, in thousands)
number_of_k=6
data_filename='mnist-{}k.npz'.format(number_of_k)
print('Using file:',data_filename)

Using file: mnist-6k.npz


In [6]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import numpy as np


In [7]:
# Load data
with np.load(data_filename, allow_pickle=False) as npz_file:
    # Load items into a dictionary
    mnist = dict(npz_file.items())

In [9]:
# Create train, test and validation sets
X_train, X_test, y_train, y_test = train_test_split(
    # Convert uint8 pixel values to float
    mnist['data'].astype(np.float32),
    mnist['labels'],
    test_size=1000, random_state=0)

# Split the 1,000 held-out samples into 500 test and 500 validation samples
X_test, X_valid, y_test, y_valid = train_test_split(
    X_test,
    y_test,
    test_size=500, random_state=0)

print("Train:", X_train.shape, y_train.shape)
print("Test :", X_test.shape, y_test.shape)
print("Valid:", X_valid.shape, y_valid.shape)

# Standardize: fit the scaler on the training set only, then reuse it
# unchanged on the test and validation sets
scaler = StandardScaler()
X_train_rescaled = scaler.fit_transform(X_train)
X_test_rescaled = scaler.transform(X_test)
X_valid_rescaled = scaler.transform(X_valid)



Train: (5000, 784) (5000,)
Test : (500, 784) (500,)
Valid: (500, 784) (500,)
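
The standardization statistics should come from the training set alone, so that the test and validation sets are transformed with the same mean and scale the network sees during training. The sketch below reproduces that idea in plain NumPy on made-up data. `fit_standardizer` and `standardize` are hypothetical helper names, not scikit-learn functions. Constant columns (for example pixels that stay black in every image) get a scale of 1 to avoid a division by zero, which is also how `StandardScaler` treats them.

```python
import numpy as np

def fit_standardizer(X_train):
    """Return the per-feature mean and scale computed on the training set."""
    mean = X_train.mean(axis=0)
    scale = X_train.std(axis=0)
    scale[scale == 0] = 1.0  # constant features: avoid division by zero
    return mean, scale

def standardize(X, mean, scale):
    """Apply previously fitted statistics to any data set."""
    return (X - mean) / scale

# Made-up data standing in for pixel values (assumption: 4 features only)
rng = np.random.RandomState(0)
X_train_toy = rng.uniform(0, 255, size=(100, 4)).astype(np.float32)
X_valid_toy = rng.uniform(0, 255, size=(20, 4)).astype(np.float32)
X_train_toy[:, 0] = 0.0  # a pixel that is always black
X_valid_toy[:, 0] = 0.0

# Fit on the training set, then reuse the same statistics everywhere
mean, scale = fit_standardizer(X_train_toy)
X_train_std = standardize(X_train_toy, mean, scale)
X_valid_std = standardize(X_valid_toy, mean, scale)

print(X_train_std.mean(axis=0).round(3))  # close to 0 for every feature
print(X_valid_std.mean(axis=0).round(3))  # close to, but not exactly, 0
```

The validation means are not exactly zero, which is expected: the validation set is described in the units of the training distribution rather than its own.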


In [10]:
# Batch generator
def get_batches(X, y, batch_size):
    # Shuffle the sample indexes (X and y are indexed with them below)
    shuffled_idx = np.arange(len(y)) # 0,1,...,n-1
    np.random.shuffle(shuffled_idx)

    # Enumerate indexes by steps of batch_size
    # i: 0, b, 2b, 3b, 4b, .. where b is the batch size
    for i in range(0, len(y), batch_size):
        # Batch indexes
        batch_idx = shuffled_idx[i:i+batch_size]
        yield X[batch_idx], y[batch_idx]
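
As a quick sanity check, the generator can be run on a toy data set to confirm that every sample appears exactly once per pass and that the last batch is smaller when `batch_size` does not divide the number of samples. The arrays `X_toy` and `y_toy` are made up for this illustration, and the function is repeated only so that the snippet runs on its own.

```python
import numpy as np

def get_batches(X, y, batch_size):
    # Same generator as above, repeated so that this snippet is self-contained
    shuffled_idx = np.arange(len(y))
    np.random.shuffle(shuffled_idx)
    for i in range(0, len(y), batch_size):
        batch_idx = shuffled_idx[i:i+batch_size]
        yield X[batch_idx], y[batch_idx]

np.random.seed(0)
X_toy = np.arange(20).reshape(10, 2)  # row i is [2i, 2i+1]
y_toy = np.arange(10)                 # label i belongs to row i

batch_sizes = []
seen_labels = []
for X_batch, y_batch in get_batches(X_toy, y_toy, batch_size=4):
    batch_sizes.append(len(y_batch))
    seen_labels.extend(y_batch.tolist())
    # Rows of X and entries of y stay aligned after shuffling
    assert np.array_equal(X_batch[:, 0], 2 * y_batch)

print(batch_sizes)          # [4, 4, 2]
print(sorted(seen_labels))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```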

## Exercise - Create and train a multilayer network

> **Exercise:** Create a multilayer neural network and train it using your batch generator. Evaluate the accuracy on the validation set after each epoch. Test different architectures and parameters. Evaluate your best network on the test set. Save the trained weights of the first fully connected layer in a variable.
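
Evaluating the accuracy after each epoch comes down to comparing the class with the largest logit to the true label. A minimal NumPy sketch of that computation is shown below. The `accuracy` helper and the example values are hypothetical and independent of any particular framework.

```python
import numpy as np

def accuracy(logits, labels):
    """Fraction of samples whose largest logit matches the true label."""
    predictions = np.argmax(logits, axis=1)
    return np.mean(predictions == labels)

# Made-up logits for 4 samples and 3 classes
logits = np.array([
    [2.0, 0.1, -1.0],  # predicted class 0
    [0.3, 0.2, 1.5],   # predicted class 2
    [0.0, 3.0, 0.5],   # predicted class 1
    [1.0, 0.9, 0.8],   # predicted class 0
])
labels = np.array([0, 2, 1, 2])

print(accuracy(logits, labels))  # 0.75
```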

In [11]:
# Definition of the different networks to try
neural_networks={
    '2-layer-32': {
        'hidden': [32],
        'graph': None,
    },
    '2-layer-64': {
        'hidden': [64],
        'graph': None,
    }
}

In [19]:
import tensorflow as tf

for network_name in neural_networks.keys():
    print("Working with network:", network_name)
    
    # Create a new, empty graph for this network
    neural_networks[network_name]['graph'] = tf.Graph()

    with neural_networks[network_name]['graph'].as_default():
        # Create placeholders
        X = tf.placeholder(dtype=tf.float32, shape=[None, 784])
        y = tf.placeholder(dtype=tf.int32, shape=[None])

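        # Stack the hidden layers: each one takes the previous layer's output
        hidden = X
        for n, units in enumerate(neural_networks[network_name]['hidden']):
            print('  Adding hidden layer with {} neurons'.format(units))
            # Hidden layer with `units` units
            hidden = tf.layers.dense(
                hidden, units, activation=tf.nn.relu, # ReLU
                kernel_initializer=tf.variance_scaling_initializer(scale=2, seed=0),
                bias_initializer=tf.zeros_initializer(),
                # Unique name per layer; the first one stays 'hidden'
                name='hidden' if n == 0 else 'hidden_{}'.format(n + 1)
            )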

        # Output layer
        logits = tf.layers.dense(
            hidden, 10, activation=None, # No activation function
            kernel_initializer=tf.variance_scaling_initializer(scale=1, seed=0),
            bias_initializer=tf.zeros_initializer(),
            name='output'
        )
    
        

Working with network: 2-layer-32
  Adding hidden layer with 32 neurons
Working with network: 2-layer-64
  Adding hidden layer with 64 neurons
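
For reference, each graph built above computes a ReLU hidden layer followed by a linear output layer. The NumPy sketch below mirrors that computation for the 32-unit network on random inputs. The initialization only approximates `tf.variance_scaling_initializer`: it assumes the default `fan_in` mode and draws from a plain (not truncated) normal distribution with standard deviation sqrt(scale / fan_in). `init_weights` and `forward` are hypothetical names.

```python
import numpy as np

def init_weights(fan_in, fan_out, scale, rng):
    # Approximation of variance scaling: stddev = sqrt(scale / fan_in)
    stddev = np.sqrt(scale / fan_in)
    return rng.normal(0.0, stddev, size=(fan_in, fan_out)).astype(np.float32)

def forward(X, W1, b1, W2, b2):
    hidden = np.maximum(X @ W1 + b1, 0)  # ReLU hidden layer
    logits = hidden @ W2 + b2            # linear output layer, no activation
    return hidden, logits

rng = np.random.RandomState(0)
W1 = init_weights(784, 32, scale=2, rng=rng)
b1 = np.zeros(32, dtype=np.float32)
W2 = init_weights(32, 10, scale=1, rng=rng)
b2 = np.zeros(10, dtype=np.float32)

# Random stand-in for a batch of 5 standardized images
X_batch = rng.normal(size=(5, 784)).astype(np.float32)
hidden, logits = forward(X_batch, W1, b1, W2, b2)

print(hidden.shape, logits.shape)  # (5, 32) (5, 10)
```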


In [21]:
with neural_networks['2-layer-64']['graph'].as_default():
    # Get variables in the graph
    for v in tf.trainable_variables():
        print(v)


<tf.Variable 'hidden/kernel:0' shape=(784, 64) dtype=float32_ref>
<tf.Variable 'hidden/bias:0' shape=(64,) dtype=float32_ref>
<tf.Variable 'output/kernel:0' shape=(64, 10) dtype=float32_ref>
<tf.Variable 'output/bias:0' shape=(10,) dtype=float32_ref>
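
The shapes listed above determine the number of trainable parameters: each dense layer has one weight per input/unit pair plus one bias per unit. The short calculation below adds them up for the 64-unit network, with the shapes copied by hand from the printed variables.

```python
import numpy as np

# Shapes copied from the variable listing of the '2-layer-64' network
shapes = {
    'hidden/kernel': (784, 64),
    'hidden/bias': (64,),
    'output/kernel': (64, 10),
    'output/bias': (10,),
}

# Number of entries in each variable
counts = {name: int(np.prod(shape)) for name, shape in shapes.items()}
total = sum(counts.values())

for name, count in counts.items():
    print('{:<14} {:>6}'.format(name, count))
print('total', total)  # total 50890
```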


## Exercise - Visualize weights

> **Exercise**: Plot the weights from the first fully connected layer (the templates) with the `imshow()` function.
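
One possible approach, sketched under assumptions: each column of the first layer's kernel holds the 784 weights feeding one hidden unit, so it can be reshaped into a 28x28 image and shown with `imshow()`. The array `W_hidden` below is a random placeholder for the saved weights, and `weights_to_templates` and `plot_templates` are hypothetical helper names. The sketch also assumes the images were flattened row by row.

```python
import numpy as np

def weights_to_templates(W, side=28):
    """Turn a (side*side, n_units) kernel into n_units images of side x side."""
    return W.T.reshape(-1, side, side)

def plot_templates(templates, n_cols=8):
    # Imported here so that the reshaping part runs without matplotlib
    import matplotlib.pyplot as plt
    n_rows = int(np.ceil(len(templates) / n_cols))
    fig, axes = plt.subplots(n_rows, n_cols, figsize=(n_cols, n_rows),
                             squeeze=False)
    for ax in axes.ravel():
        ax.axis('off')  # hide ticks, including on unused grid cells
    for ax, template in zip(axes.ravel(), templates):
        ax.imshow(template, cmap='gray')
    return fig

# Placeholder for the trained kernel of the first layer (assumption)
rng = np.random.RandomState(0)
W_hidden = rng.normal(size=(784, 32))

templates = weights_to_templates(W_hidden)
print(templates.shape)  # (32, 28, 28)

# In a notebook, the grid would then be drawn with:
# fig = plot_templates(templates)
```

With trained weights in place of the random placeholder, each panel shows which pixels push a hidden unit's activation up or down.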

In [None]:
???