# <center> Keras </center>
## <center>1.7.1 Batch size and Epochs</center>

# Batch Size and Epochs

The batch size is the number of training samples used in one iteration, that is, in one update of the network's weights.

For instance, suppose you have 1000 training samples and set `batch_size` to 100. The algorithm takes the first 100 samples (1-100) from the training dataset and trains the network on them. It then takes the next 100 samples (101-200) and trains the network again. This procedure repeats until all samples have been used for training.
The weights are adjusted after each batch.
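
The batching described above can be sketched in plain Python. This is only an illustration of how a dataset is cut into batches, not how Keras implements it internally. The helper name `iterate_batches` is hypothetical, and indices start at 0 here, so the first batch covers samples 0-99 rather than 1-100.

```python
def iterate_batches(num_samples, batch_size):
    """Yield (start, stop) index pairs, one pair per batch."""
    for start in range(0, num_samples, batch_size):
        # The last batch is smaller if batch_size does not divide num_samples evenly
        stop = min(start + batch_size, num_samples)
        yield start, stop

batches = list(iterate_batches(num_samples=1000, batch_size=100))
print(batches[0])   # (0, 100)
print(batches[1])   # (100, 200)
print(batches[-1])  # (900, 1000)
```

Each `(start, stop)` pair could be used to slice the training arrays, for example `train_images[start:stop]`.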

An epoch is one complete pass over the training dataset. The number of epochs therefore defines how many times each sample is used for training.
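
To make the relation between batches and epochs concrete, here is a minimal sketch of the two nested loops, assuming a tiny dataset of 10 samples. The function `count_sample_visits` is a hypothetical helper that only counts how often each sample is used; a real training loop would update the weights where the comment indicates. Note that Keras `fit` also shuffles the training data before each epoch by default, which this sketch leaves out.

```python
from collections import Counter

def count_sample_visits(num_samples, batch_size, epochs):
    """Count how many times each sample index is used during training."""
    visits = Counter()
    for epoch in range(epochs):                           # one full pass per epoch
        for start in range(0, num_samples, batch_size):   # one batch per iteration
            stop = min(start + batch_size, num_samples)
            # A real training loop would compute the loss on this batch
            # and adjust the weights here.
            for sample_index in range(start, stop):
                visits[sample_index] += 1
    return visits

visits = count_sample_visits(num_samples=10, batch_size=4, epochs=3)
print(set(visits.values()))  # {3}
```

Every sample is used exactly once per epoch, regardless of the batch size.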

# Code

Let us run the code from the previous section first.

In [None]:
# Importing the MNIST dataset
from keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Processing the input data
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype('float32') / 255

# Processing the output data
from keras.utils import to_categorical

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

# Build a network
from keras import models
from keras import layers

network = models.Sequential()
network.add(layers.Dense(units=512, activation='relu', input_shape=(28 * 28,)))
network.add(layers.Dense(units=10, activation='softmax'))

# Compile the network
network.compile(optimizer='rmsprop',
                loss='categorical_crossentropy',
                metrics=['accuracy'])

Let us now examine the code related to training a network. 

In [None]:
# Train the network
history = network.fit(train_images, train_labels, epochs=5, batch_size=128, 
                      verbose=1, validation_data=(test_images, test_labels))

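Here we train for `5` epochs with a batch size of `128`. The test set is passed as `validation_data`, so the loss and accuracy on it are reported after every epoch.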

In [None]:
import matplotlib.pyplot as plt
def plot_training_history(history):
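    # Newer Keras versions name the metric 'accuracy'; older ones use 'acc'
    acc_key = 'accuracy' if 'accuracy' in history.history else 'acc'
    plt.plot(history.history[acc_key])
    plt.plot(history.history['val_' + acc_key])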
    plt.title('model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()
    # Plot training and validation loss
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()

In [None]:
# Plot the training results
plot_training_history(history)

# Task
- Find an equation that computes the number of updates performed by the optimizer, using the batch size `bs` and the number of epochs `es` as parameters. The number of samples is given by `numSamples`.

- Change the batch size to 1 and observe what happens.
- Try out different combinations of batch size and number of epochs. Document your findings.

# Summary
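In this section we learned how the batch size and the number of epochs control training: the batch size sets how many samples are used per weight update, and the number of epochs sets how many times the whole training dataset is passed through the network.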

# Feedback
<a href = "http://goto/ml101_doc/Keras12">Feedback: Batch Size and Epochs</a> <br>

# Navigation

<div>
<span> <h3 style="display:inline">&lt;&lt; Prev: <a href = "keras11.ipynb">Train a network</a></h3> </span>
<span style="float: right"><h3 style="display:inline">Next: <a href = "keras13.ipynb">Overfitting</a> &gt;&gt; </h3></span>
</div>