## Loading the MNIST dataset in Keras

In [3]:
from keras.datasets import mnist

In [6]:
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

### Training data

In [7]:
train_images.shape

(60000, 28, 28)

In [8]:
len(train_labels)

60000

### Test data

In [9]:
test_images.shape

(10000, 28, 28)

In [10]:
len(test_labels)

10000

## Define network architecture

In [11]:
from keras import models
from keras import layers

In [12]:
network = models.Sequential()
network.add(layers.Dense(512, activation='relu', input_shape=(28*28,)))
network.add(layers.Dense(10, activation='softmax'))
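
As a rough check on the size of this network, the number of trainable parameters can be worked out by hand. This is a sketch that assumes each Dense layer holds one weight per input/unit pair plus one bias per unit; `dense_param_count` is a hypothetical helper, not a Keras function.

```python
# Hypothetical helper (not part of Keras): parameters of a Dense layer
# with n_in inputs and n_out units, assuming a weight matrix of shape
# (n_in, n_out) plus one bias per unit.
def dense_param_count(n_in, n_out):
    return n_in * n_out + n_out

hidden_params = dense_param_count(28 * 28, 512)  # first Dense layer
output_params = dense_param_count(512, 10)       # softmax output layer
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)
```

If the assumption holds, these figures should match what `network.summary()` reports.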

To make the network ready for training, we need to pick three more things as part of the compilation step:
1. A loss function: how the network measures its performance on the training data, and thus how it steers itself in the right direction.
2. An optimizer: the mechanism through which the network updates itself based on the data it sees and its loss function.
3. Metrics to monitor during training and testing. Here we only track accuracy, the fraction of images that are correctly classified.
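
To make the loss and the metric more concrete, here is a minimal NumPy sketch of categorical crossentropy and accuracy. The prediction and target values are made up for illustration, and Keras's own implementation may differ in numerical details.

```python
import numpy as np

# Hypothetical softmax outputs for 3 samples over 4 classes (each row sums to 1).
predictions = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.2, 0.3, 0.4],
])

# One-hot targets: the true classes are 0, 1 and 2.
targets = np.array([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
], dtype="float32")

# Categorical crossentropy: negative log of the probability assigned
# to the true class, averaged over the samples.
per_sample_loss = -np.sum(targets * np.log(predictions), axis=1)
loss = per_sample_loss.mean()

# Accuracy: fraction of samples whose most probable class is the true class.
accuracy = np.mean(predictions.argmax(axis=1) == targets.argmax(axis=1))

print(loss, accuracy)
```

The third sample is misclassified (class 3 is predicted instead of class 2), so the accuracy is 2/3 and that sample contributes the largest share of the loss.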

## Compilation

In [13]:
network.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

### Preparing the data

In [14]:
train_images = train_images.reshape((60000, 28*28))
train_images = train_images.astype('float32')/255

In [15]:
test_images = test_images.reshape((10000, 28*28))
test_images = test_images.astype('float32')/255
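
The effect of the reshape and scaling steps can be checked on a small stand-in array. This sketch uses random data instead of MNIST; `fake_images` is an assumption made for illustration.

```python
import numpy as np

# Stand-in for a small batch of MNIST images (an assumption for illustration):
# 4 grayscale images of 28x28 pixels, stored as uint8 values in [0, 255].
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Flatten each 28x28 image into a vector of 784 values.
flat = fake_images.reshape((4, 28 * 28))

# Convert to float32 and rescale the values from [0, 255] to [0, 1].
scaled = flat.astype("float32") / 255

print(flat.shape, scaled.dtype, scaled.min(), scaled.max())
```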

### Preparing the labels

In [16]:
from keras.utils import to_categorical

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
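
Categorical encoding turns each integer label into a vector with a 1 at the index of the class and 0 everywhere else. The following NumPy sketch builds the same kind of encoding by hand; the label values are made up for illustration.

```python
import numpy as np

# Hypothetical digit labels, as integers in [0, 9].
labels = np.array([5, 0, 4, 1])
num_classes = 10

# One row per label; set a single 1.0 in the column matching the label.
one_hot = np.zeros((len(labels), num_classes), dtype="float32")
one_hot[np.arange(len(labels)), labels] = 1.0

print(one_hot.shape)
print(one_hot[0])
```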

## Training the network

In [17]:
network.fit(train_images, train_labels, epochs=5, batch_size=128)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0xb24963da0>

## Evaluating the network

In [18]:
test_loss, test_acc = network.evaluate(test_images, test_labels)
print('test_acc : ', test_acc)

test_acc :  0.9799


# What is a tensor?

At its core, a tensor is a container for data, almost always numerical data. So it is a
container for numbers. You may already be familiar with matrices, which are 2D tensors:
tensors are a generalization of matrices to an arbitrary number of dimensions.

A tensor is defined by three key attributes:
1. The number of axes it has, also called its rank. For instance, a 3D tensor has 3 axes, and a matrix has
2 axes. This is also called the tensor's ndim in Python libraries such as NumPy.
2. Its shape. This is a tuple of integers that describes the size of the tensor
along each axis. For instance, a matrix with 3 rows and 5 columns has shape (3, 5), and a 3D
tensor made of 3 such matrices has shape (3, 3, 5). A vector has a shape with a single element,
such as (5,), while a scalar has an empty shape, ().
3. Its data type (usually called dtype in Python libraries). This is the type of the
data contained in the tensor; for instance, a tensor's type could be float32, uint8,
float64, and so on. On rare occasions you may see a char tensor. Note that string tensors
don't exist in NumPy (or in most other libraries), because tensors live in pre-allocated,
contiguous memory segments, and strings, being variable-length, would preclude the use
of this implementation.
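
These three attributes can be inspected directly on NumPy arrays. A minimal sketch, with array contents made up for illustration:

```python
import numpy as np

scalar = np.array(12)                          # 0 axes
vector = np.array([12, 3, 6, 14, 7])           # 1 axis
matrix = np.zeros((3, 5), dtype="float32")     # 2 axes
tensor3d = np.zeros((3, 3, 5), dtype="uint8")  # 3 axes

# Print the rank (ndim), shape and data type of each tensor.
for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("tensor3d", tensor3d)]:
    print(name, t.ndim, t.shape, t.dtype)
```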

### Slicing a tensor

In [20]:
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

In [21]:
my_slice = train_images[10:100]
print(my_slice.shape)

(90, 28, 28)


In [22]:
my_slice = train_images[10:100, :, :]
my_slice.shape

(90, 28, 28)
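
Slicing works the same way along every axis, not only the first one. The sketch below uses an all-zero stand-in array in place of the MNIST images (an assumption so that it runs without downloading the dataset) and selects image patches and a batch of samples.

```python
import numpy as np

# Stand-in for the image tensor (an assumption): 300 images of 28x28 pixels.
images = np.zeros((300, 28, 28), dtype=np.uint8)

# Bottom-right 14x14 patch of every image.
bottom_right = images[:, 14:, 14:]

# 14x14 patch centered in the middle of every image,
# using negative indices relative to the end of each axis.
center = images[:, 7:-7, 7:-7]

# A batch of 128 samples taken along the first axis.
batch = images[128:256]

print(bottom_right.shape, center.shape, batch.shape)
```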

# Keras

- It allows the same code to run on CPU or on GPU, seamlessly.
- It has a user-friendly API which makes it easy to quickly prototype deep learning models.
- It has built-in support for convolutional networks (for computer vision), recurrent
networks (for sequence processing), and any combination of both.
- It supports arbitrary network architectures: multi-input or multi-output models, layer
sharing, model sharing, etc. This means that Keras is appropriate for building essentially
any deep learning model, from a generative adversarial network to a neural Turing
machine.

### Keras workflow

- Define your training data: input tensors and target tensors.
- Define a network of layers (or model) that maps your inputs to your targets.
- Configure the learning process by picking a loss function, an optimizer, and some metrics
to monitor.
- Iterate on your training data by calling the fit method of your model.

In Keras, a network can be defined in two ways: with the Sequential model or with the functional API.

In [24]:
from keras import models
from keras import layers

Using the Sequential model

In [25]:
model = models.Sequential()
model.add(layers.Dense(32, activation='relu', input_shape=(784,)))
model.add(layers.Dense(10, activation='softmax'))

Using the functional API

In [26]:
input_tensor = layers.Input(shape=(784,))
x = layers.Dense(32, activation='relu')(input_tensor)
output_tensor = layers.Dense(10, activation='softmax')(x)

model = models.Model(inputs=input_tensor, outputs=output_tensor)



### Defining loss function and an optimizer

In [27]:
from keras import optimizers

# Recent Keras versions name this argument learning_rate; older releases used lr.
model.compile(optimizer=optimizers.RMSprop(learning_rate=0.01), loss='mse', metrics=['accuracy'])

Training a model

In [None]:
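# Here input_tensor and target_tensor stand for NumPy arrays of input samples and targets,
# not the symbolic layers.Input tensor defined earlier.
model.fit(input_tensor, target_tensor, batch_size=128, epochs=10)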