In [1]:
import numpy as np
from keras.datasets import mnist 
from keras import Model
from keras.layers import Input, Dense
from keras.optimizers import SGD, Adam

import keras

import matplotlib.pyplot as plt
%matplotlib inline

Using TensorFlow backend.


## Loading data

In [2]:
(X_train, y_train), (X_test, y_test) = mnist.load_data()

In [3]:
X_train = X_train.reshape(60000, 784).astype('float32') / 255
X_test = X_test.reshape(10000, 784).astype('float32') / 255
y_train = keras.utils.to_categorical(y_train)
y_test = keras.utils.to_categorical(y_test)
n_classes = 10

print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

60000 train samples
10000 test samples
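The call to `keras.utils.to_categorical` above turns each integer label into a one-hot row, which is why `y_train` ends up with 10 columns. A minimal NumPy sketch of the same idea follows; the helper `one_hot` and the sample labels are made up for illustration and are not part of Keras.

```python
import numpy as np

def one_hot(labels, n_classes):
    """Return a (len(labels), n_classes) float32 matrix with a single 1 per row."""
    encoded = np.zeros((len(labels), n_classes), dtype='float32')
    # Row i gets a 1 in the column given by labels[i]
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

example_labels = np.array([5, 0, 4])        # hypothetical labels
example_encoded = one_hot(example_labels, 10)
print(example_encoded.shape)                # (3, 10)
```

Taking `np.argmax(example_encoded, axis=-1)` recovers the original integer labels.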


# Input as a matrix
Our input is stored as a matrix of shape **(n_samples, n_features)**: one row per image and one column per pixel.

In [4]:
X_train.shape

(60000, 784)

In [5]:
y_train.shape

(60000, 10)

In [6]:
batch_size = 128

# Slicing Data to have a batch

In [7]:
batch = X_train[:batch_size]
batch.shape

(128, 784)
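The slice above gives only the first batch. To walk through the whole dataset, the same slicing can be repeated with a moving start index. This is only a sketch: `iterate_batches` and `fake_data` are hypothetical names, and a small zero array stands in for `X_train`.

```python
import numpy as np

def iterate_batches(data, batch_size):
    """Yield consecutive slices of at most batch_size rows."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]

# Stand-in for X_train, small enough to show the shorter last batch
fake_data = np.zeros((300, 784), dtype='float32')
batch_shapes = [b.shape for b in iterate_batches(fake_data, 128)]
print(batch_shapes)  # [(128, 784), (128, 784), (44, 784)]
```

The last batch is shorter whenever the number of samples is not a multiple of the batch size.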

# Creating Input for NN
Let's create an input for the NN and check its shape

In [8]:
input = Input(shape=(784,))
input.shape

TensorShape([Dimension(None), Dimension(784)])

You see **Dimension(None)** as the first dimension because Keras assumes that the input will arrive in batches, and **None** is how it represents a batch size that is not known yet.

But you can create an input with a predefined batch size.

In [9]:
input_with_predefined_batch_size = Input(batch_shape=(batch_size, 784))
input_with_predefined_batch_size.shape

TensorShape([Dimension(128), Dimension(784)])
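Leaving the batch dimension as **None** works because operations applied per sample only care about the last dimension (784). A small NumPy sketch of that idea, where `projection` is a hypothetical stand-in for any such operation:

```python
import numpy as np

# Hypothetical stand-in for a per-sample operation: a (784, 20) matrix product
projection = np.ones((784, 20), dtype='float32')

output_shapes = []
for n_rows in (1, 32, 128):
    fake_batch = np.zeros((n_rows, 784), dtype='float32')
    # The same matrix handles every batch size; only the first dimension changes
    output_shapes.append((fake_batch @ projection).shape)

print(output_shapes)  # [(1, 20), (32, 20), (128, 20)]
```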

# Creating a Dense layer
Dense is a Python object that holds parameters such as `activation` and `weights`, among others.

Let's create one

In [10]:
dense_layer = Dense(20, activation='sigmoid')

In [11]:
dense_layer.activation

<function keras.activations.sigmoid>

**The weights don't exist yet because Keras doesn't know the size of the input**

In [12]:
dense_layer.get_weights()

[]

**But if we call the layer on our *input*, it will immediately create the weights**

In [13]:
output = dense_layer(input) 
W, b = dense_layer.get_weights()
print(W.shape)
print(b.shape)

(784, 20)
(20,)
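The shapes follow from what a dense layer computes: `activation(x @ W + b)`, so 784 input features and 20 units need a (784, 20) matrix and a bias of length 20. A minimal NumPy sketch of that forward pass, using random stand-in values rather than the layer's actual weights:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense_forward(x, W, b):
    """Affine map followed by the sigmoid activation."""
    return sigmoid(x @ W + b)

rng = np.random.default_rng(0)
x_demo = rng.random((128, 784)).astype('float32')                    # stand-in batch
W_demo = rng.uniform(-0.05, 0.05, size=(784, 20)).astype('float32')  # stand-in weights
b_demo = np.zeros(20, dtype='float32')

out_demo = dense_forward(x_demo, W_demo, b_demo)
print(out_demo.shape)  # (128, 20)
```

Each of the 128 rows is mapped to 20 values between 0 and 1.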


Let's check out the weights

In [14]:
W

array([[-0.06482614,  0.05635738,  0.06714468, ...,  0.07981512,
         0.05947004, -0.01975584],
       [-0.05659121,  0.05504735, -0.06946041, ...,  0.04504037,
         0.01825145,  0.07140395],
       [ 0.06022672, -0.06013019, -0.05267657, ...,  0.0480587 ,
         0.08198392, -0.08600754],
       ...,
       [ 0.02683932, -0.02114712, -0.07852899, ...,  0.03451909,
         0.06873423,  0.07862003],
       [-0.02791116,  0.01348917,  0.0096719 , ...,  0.02525312,
         0.00074495, -0.00530349],
       [ 0.0080116 ,  0.0766926 ,  0.0352729 , ..., -0.00964007,
        -0.07203107, -0.0193848 ]], dtype=float32)

In [15]:
b

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0.], dtype=float32)

The way **W** and **b** are initialized is specified by `kernel_initializer` and `bias_initializer` parameters

In [16]:
dense_layer = Dense(20, kernel_initializer='zeros')
output = dense_layer(input)
W, b = dense_layer.get_weights()

In [17]:
W

array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32)
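When no initializer is passed, `Dense` uses `kernel_initializer='glorot_uniform'` and `bias_initializer='zeros'`, which matches the first layer above: a bias of all zeros and small weights of both signs. The sketch below reproduces the Glorot uniform rule in NumPy; `glorot_uniform` here is a hand-written helper for illustration, not the Keras implementation.

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, rng):
    """Sample uniformly from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out))."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out)).astype('float32')

rng = np.random.default_rng(0)
W_glorot = glorot_uniform(784, 20, rng)
glorot_limit = np.sqrt(6.0 / (784 + 20))
print(round(float(glorot_limit), 4))  # 0.0864
```

The weights printed for the first layer all lie within roughly ±0.0864, consistent with this bound.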

# Keras Model
In Keras, to train a neural network we need to create a model from the network's inputs and outputs

In [18]:
input = Input(shape=(784,))

# creating the dense layer object
dense_layer = Dense(10, kernel_initializer='zeros', activation='softmax')

# applying dense layer to input
output = dense_layer(input)

model = Model(inputs=input, outputs=output)

## Keras model main methods

We can use even an untrained NN to predict something, but the predictions will be no better than chance

In [19]:
y_prediction = model.predict(X_train)
y_prediction.shape

(60000, 10)

Let's check how accurate that was

In [20]:
'Accuracy', np.mean(np.equal(np.argmax(y_prediction, axis=-1), np.argmax(y_train, axis=-1)))

('Accuracy', 0.09871666666666666)
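The accuracy is close to 1/10 for a specific reason. With `kernel_initializer='zeros'` and a zero bias, the softmax receives all zeros and returns the same probability for every class, so `np.argmax` falls back to the first index and the model always predicts class 0. The score is then presumably just the share of training images labeled 0. A NumPy sketch with random stand-in inputs:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

x_rand = np.random.default_rng(0).random((4, 784))   # stand-in inputs
# Zero weights and zero bias, as in the model above
probs = softmax(x_rand @ np.zeros((784, 10)) + np.zeros(10))
predicted = np.argmax(probs, axis=-1)

print(probs[0])    # ten equal probabilities of 0.1
print(predicted)   # [0 0 0 0]
```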

To use the model more thoroughly we need to compile it with **model.compile()**

<img src="modelcompile.jpg"/>

In [21]:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

Once the model is compiled, Keras can compute the loss and the accuracy directly with **model.evaluate**, which returns them in that order

In [22]:
model.evaluate(X_train, y_train)



[2.3025853633880615, 0.09871666666666666]
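The loss value 2.3026 is ln(10): categorical cross-entropy is the negative log of the probability assigned to the true class, and the untrained model assigns 0.1 to every class. A small NumPy sketch of the loss, written by hand for illustration (the clipping constant `eps` is an assumption, not taken from the notebook):

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample cross-entropy between one-hot targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)   # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

y_true_demo = np.eye(10)[[3, 7]]     # two hypothetical one-hot targets
y_uniform = np.full((2, 10), 0.1)    # uniform predictions
loss_demo = categorical_crossentropy(y_true_demo, y_uniform).mean()
print(loss_demo)  # approximately 2.302585, i.e. ln(10)
```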

# Training the model

We can train our model using the **model.fit** function

<img src="modelfit.png"/>

In [23]:
model.fit(x=X_train, y=y_train, batch_size=32, epochs=1, validation_split=0.2, shuffle=True)

Train on 48000 samples, validate on 12000 samples
Epoch 1/1


<keras.callbacks.History at 0x6e54a84e80>
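The sample counts in the log follow from the arguments: `validation_split=0.2` sets aside the last 20% of the rows (taken before any shuffling) for validation, and the rest is consumed in batches of 32. The arithmetic can be checked with a short sketch; the variable names are chosen here for illustration.

```python
import math

n_samples = 60000
validation_split = 0.2
fit_batch_size = 32

n_val = int(n_samples * validation_split)   # rows held out for validation
n_train = n_samples - n_val                 # rows used for weight updates
steps_per_epoch = math.ceil(n_train / fit_batch_size)

print(n_train, n_val, steps_per_epoch)  # 48000 12000 1500
```

So one epoch here corresponds to 1500 weight updates.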