In [1]:
%config InlineBackend.figure_format = 'retina'
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np
np.set_printoptions(precision=3)
np.set_printoptions(suppress=True)

# Creating Mini-batches For Neural Network Training

#### Combining data points with their labels

We have two numpy arrays `dpoints` and `labels`. Let's imagine that `dpoints` contains our data points, and `labels` the corresponding labels:

In [16]:
dpoints = np.array([1,2,3,4,5,6,7,8,9,10,11,12,13])
labels = np.array(['A','B','C','D','E','F','G','H','I','J','K','L','M'])

We can use the `zip` function to combine them and then iterate through the zipped object:

In [17]:
for dp, l in zip(dpoints, labels):
    print (dp, l)

1 A
2 B
3 C
4 D
5 E
6 F
7 G
8 H
9 I
10 J
11 K
12 L
13 M


However, if we try to display the zip object itself, we only see its representation, because `zip` returns a lazy iterator rather than a list:

In [18]:
zip(dpoints, labels)

<zip at 0x7fbe41998320>
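
One consequence of `zip` returning an iterator is that it can only be consumed once. The following minimal sketch illustrates this with shortened versions of the arrays; the names `pairs`, `first_pass` and `second_pass` are introduced here for illustration only.

```python
import numpy as np

dpoints = np.array([1, 2, 3])
labels = np.array(['A', 'B', 'C'])

pairs = zip(dpoints, labels)   # lazy iterator: nothing is paired up yet
first_pass = list(pairs)       # consumes the iterator
second_pass = list(pairs)      # the iterator is already exhausted

print(len(first_pass))   # 3
print(second_pass)       # []
```

If the pairs are needed more than once, it is safer to store the result of `list(zip(...))` in a variable.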

To convert the zip object into a list, we can use the `list` function:

In [19]:
list(zip(dpoints, labels))

[(1, 'A'),
 (2, 'B'),
 (3, 'C'),
 (4, 'D'),
 (5, 'E'),
 (6, 'F'),
 (7, 'G'),
 (8, 'H'),
 (9, 'I'),
 (10, 'J'),
 (11, 'K'),
 (12, 'L'),
 (13, 'M')]

To turn the zip object into a numpy array, we can wrap the result of `list` in `np.array()`:

In [20]:
dataset = np.array(list(zip(dpoints, labels)))
dataset

array([['1', 'A'],
       ['2', 'B'],
       ['3', 'C'],
       ['4', 'D'],
       ['5', 'E'],
       ['6', 'F'],
       ['7', 'G'],
       ['8', 'H'],
       ['9', 'I'],
       ['10', 'J'],
       ['11', 'K'],
       ['12', 'L'],
       ['13', 'M']], dtype='<U21')

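The variable `dataset` now contains both data points and labels. Note that the numbers are now stored as strings (dtype `<U21`), because a numpy array holds a single data type.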
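
The dtype shown in the output above can be checked directly, and the numeric values can be recovered from the first column when they are needed. This is a small sketch on a shortened dataset; the name `values` is an assumption made for the example.

```python
import numpy as np

dpoints = np.array([1, 2, 3])
labels = np.array(['A', 'B', 'C'])
dataset = np.array(list(zip(dpoints, labels)))

# Mixing integers and strings makes numpy choose one common dtype: a string type.
print(dataset.dtype.kind)   # U (unicode string)

# Convert the first column back to integers when numeric values are required.
values = dataset[:, 0].astype(int)
print(values)               # [1 2 3]
```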

## Creating mini-batches

### Shuffling the dataset

The idea of mini-batches is to take a large data set (like MNIST) and split it into smaller, manageable chunks that our training algorithm can process one at a time. Let's see how this can be accomplished using `dataset`:

In [21]:
dataset

array([['1', 'A'],
       ['2', 'B'],
       ['3', 'C'],
       ['4', 'D'],
       ['5', 'E'],
       ['6', 'F'],
       ['7', 'G'],
       ['8', 'H'],
       ['9', 'I'],
       ['10', 'J'],
       ['11', 'K'],
       ['12', 'L'],
       ['13', 'M']], dtype='<U21')

First we define the mini-batch size `mb_size`, which is the number of data points in each chunk. Then we store the number of data points in `data_len`:

In [23]:
mb_size = 3
data_len = len(dataset)
print (f'chunk size: {mb_size}, number of data points {data_len}')

chunk size: 3, number of data points 13


In every epoch, we should first shuffle our data set as if it were a deck of cards. If our dataset is a numpy array, we can shuffle its rows in place with the command `np.random.shuffle`:

In [24]:
np.random.shuffle(dataset)
print(dataset)

[['9' 'I']
 ['7' 'G']
 ['1' 'A']
 ['5' 'E']
 ['4' 'D']
 ['13' 'M']
 ['6' 'F']
 ['3' 'C']
 ['11' 'K']
 ['12' 'L']
 ['10' 'J']
 ['8' 'H']
 ['2' 'B']]


Every time we run this command we get a different ordering. The reason why we 'zipped' our data points and labels together is so that they are shuffled together. If we shuffled them separately, the correspondence between a data point and its label would be lost.

In [25]:
np.random.shuffle(dataset)
print(dataset)

[['12' 'L']
 ['8' 'H']
 ['1' 'A']
 ['11' 'K']
 ['9' 'I']
 ['3' 'C']
 ['13' 'M']
 ['5' 'E']
 ['4' 'D']
 ['10' 'J']
 ['7' 'G']
 ['2' 'B']
 ['6' 'F']]


Notice that the label `A` is still associated with the data point 1.
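
Zipping is not the only way to keep points and labels aligned. As an alternative sketch (not the approach used in this notebook), one random permutation of the indices can be applied to both arrays, which also keeps the original dtypes. The names `rng`, `perm`, `shuffled_points` and `shuffled_labels` are assumptions made for this example, and the seed is fixed only for reproducibility.

```python
import numpy as np

dpoints = np.array([1, 2, 3, 4, 5])
labels = np.array(['A', 'B', 'C', 'D', 'E'])

rng = np.random.default_rng(0)          # seeded generator, for reproducibility only
perm = rng.permutation(len(dpoints))    # a random ordering of the indices 0..4

# Indexing both arrays with the same permutation keeps each pair together.
shuffled_points = dpoints[perm]
shuffled_labels = labels[perm]

for dp, label in zip(shuffled_points, shuffled_labels):
    print(dp, label)
```

Here `shuffled_points` is still an integer array, so no conversion back from strings is needed.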

### Producing mini-batches

The Python function `range` accepts up to three arguments. The first one defines the starting value of the range. The second defines the end value, which is not included in the range. The third defines the step, that is, the increment between consecutive values.

In [34]:
for i in range(0,10,2):
    print (i)

0
2
4
6
8


Again, we use the command `list` to display the range.

In [35]:
list(range(0,10,2))

[0, 2, 4, 6, 8]

In our case, we want the range to go from the first data point (index 0) to the end of our data set, in increments equal to the mini-batch size:

In [36]:
list(range(0, data_len, mb_size))

[0, 3, 6, 9, 12]
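
The number of start indexes, and therefore of mini-batches, is the data length divided by the mini-batch size, rounded up. A short sketch of that arithmetic for the values used here; the names `starts`, `n_batches` and `last_batch_size` are introduced only for this example.

```python
import math

data_len = 13
mb_size = 3

starts = list(range(0, data_len, mb_size))
n_batches = math.ceil(data_len / mb_size)   # 13 / 3 rounded up
last_batch_size = data_len - starts[-1]     # points left for the final batch

print(starts)            # [0, 3, 6, 9, 12]
print(n_batches)         # 5
print(last_batch_size)   # 1
```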

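Then we can slice our data starting at each of these indexes, taking `mb_size` rows at a time. A slice that runs past the end of the array simply stops at the last row, so the final mini-batch may be smaller than the others.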

In [38]:
mini_batches = [dataset[a:a+mb_size] for a in range(0, data_len, mb_size)]

Now we can iterate through the mini-batches stored in the variable `mini_batches`:

In [40]:
for i, batch in enumerate(mini_batches):
    print (f'batch {i+1}:\n{batch}\n')

batch 1:
[['12' 'L']
 ['8' 'H']
 ['1' 'A']]

batch 2:
[['11' 'K']
 ['9' 'I']
 ['3' 'C']]

batch 3:
[['13' 'M']
 ['5' 'E']
 ['4' 'D']]

batch 4:
[['10' 'J']
 ['7' 'G']
 ['2' 'B']]

batch 5:
[['6' 'F']]
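
The last batch above contains a single row because 13 is not divisible by 3. The slicing step can also be wrapped in a generator, with an option to skip an incomplete final batch. This is a hedged sketch: the function `iter_minibatches` and its `drop_last` parameter are hypothetical names, not part of numpy.

```python
import numpy as np

def iter_minibatches(data, mb_size, drop_last=False):
    """Yield consecutive slices of `data` with at most `mb_size` rows each."""
    for start in range(0, len(data), mb_size):
        batch = data[start:start + mb_size]
        # Only the final slice can be shorter than mb_size.
        if drop_last and len(batch) < mb_size:
            return
        yield batch

data = np.arange(13)
all_batches = list(iter_minibatches(data, 3))
full_batches = list(iter_minibatches(data, 3, drop_last=True))

print(len(all_batches), len(full_batches))   # 5 4
```

Whether to keep or drop the smaller batch is a design choice that depends on the training algorithm.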



Now we can use these small datasets as the input of the neural network algorithm in one epoch. In the next epoch, we start by reshuffling our data set:

In [41]:
np.random.shuffle(dataset)
print(dataset)

[['8' 'H']
 ['4' 'D']
 ['11' 'K']
 ['2' 'B']
 ['12' 'L']
 ['6' 'F']
 ['9' 'I']
 ['10' 'J']
 ['13' 'M']
 ['3' 'C']
 ['5' 'E']
 ['7' 'G']
 ['1' 'A']]


Then we create new mini-batches:

In [43]:
mini_batches = [dataset[a:a+mb_size] for a in range(0, data_len, mb_size)]
for i, batch in enumerate(mini_batches):
    print (f'batch {i+1}:\n{batch}\n')

batch 1:
[['8' 'H']
 ['4' 'D']
 ['11' 'K']]

batch 2:
[['2' 'B']
 ['12' 'L']
 ['6' 'F']]

batch 3:
[['9' 'I']
 ['10' 'J']
 ['13' 'M']]

batch 4:
[['3' 'C']
 ['5' 'E']
 ['7' 'G']]

batch 5:
[['1' 'A']]



This procedure is repeated in every epoch of our neural network training.
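
The whole procedure can be sketched as a loop over epochs. The training step is left as a comment, and `n_epochs`, `rng` and `seen_per_epoch` are assumptions introduced for this example; the seeded generator is used only to make the run reproducible.

```python
import numpy as np

# Rebuild the 13-row dataset of (number, letter) string pairs.
dataset = np.array([[str(i), chr(ord('A') + i - 1)] for i in range(1, 14)])
mb_size = 3
n_epochs = 2
rng = np.random.default_rng(42)

seen_per_epoch = []
for epoch in range(n_epochs):
    rng.shuffle(dataset)   # shuffles the rows in place
    mini_batches = [dataset[a:a + mb_size]
                    for a in range(0, len(dataset), mb_size)]
    seen = 0
    for batch in mini_batches:
        # A real training step on `batch` would go here.
        seen += len(batch)
    seen_per_epoch.append(seen)
    print(f'epoch {epoch + 1}: {len(mini_batches)} batches, {seen} samples')
```

Every data point is visited exactly once per epoch, but in a different order each time.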