# 305 Batch Train

For more, visit my tutorial page: https://morvanzhou.github.io/tutorials/
My Youtube Channel: https://www.youtube.com/user/MorvanZhou

https://www.quora.com/What-is-an-epoch-in-deep-learning

- 1 Epoch = 1 Forward pass + 1 Backward pass for ALL training samples.
- Batch Size = Number of training samples in 1 forward/backward pass. (Memory use grows as the batch size increases.)
- Number of iterations = Number of passes, where 1 pass = 1 forward pass + 1 backward pass on one batch. (The forward pass and backward pass are not counted separately.)
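
The definitions above imply a simple relationship between dataset size, batch size, and iterations per epoch. The following is a minimal sketch of that arithmetic; the helper name `iterations_per_epoch` is hypothetical and not part of PyTorch, and it assumes the last, smaller batch is kept rather than dropped.

```python
import math


def iterations_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once.

    Assumes an incomplete final batch still counts as one iteration.
    """
    return math.ceil(num_samples / batch_size)


num_samples = 10  # same dataset size used in this notebook
for batch_size in (5, 3):
    steps = iterations_per_epoch(num_samples, batch_size)
    print(f"batch_size={batch_size}: {steps} iterations per epoch, "
          f"{3 * steps} iterations over 3 epochs")
```

With 10 samples, a batch size of 5 gives 2 iterations per epoch, while a batch size of 3 gives 4, the last of which holds only 1 sample.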

In [13]:
import torch
import torch.utils.data as Data

torch.manual_seed(1)    # reproducible

<torch._C.Generator at 0x10805b1b0>

In [14]:
BATCH_SIZE = 5
# BATCH_SIZE = 8

In [15]:
# torch.linspace(start, end, steps): returns a one-dimensional tensor of `steps` equally spaced points from start to end (both inclusive).
x = torch.linspace(1, 10, 10)       # this is x data (torch tensor)
y = torch.linspace(10, 1, 10)       # this is y data (torch tensor)
print(x, y)

tensor([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.]) tensor([10.,  9.,  8.,  7.,  6.,  5.,  4.,  3.,  2.,  1.])


In [16]:
torch_dataset = Data.TensorDataset(x, y)
loader = Data.DataLoader(
    dataset=torch_dataset,      # torch TensorDataset format
    batch_size=BATCH_SIZE,      # mini batch size
    shuffle=True,               # random shuffle for training
    num_workers=2,              # subprocesses for loading data
)

In [17]:
for epoch in range(3):   # train entire dataset 3 times
    for step, (batch_x, batch_y) in enumerate(loader):  # for each training step
        # train your data...
        print('Epoch: ', epoch, '| Step: ', step, '| batch x: ',
              batch_x.numpy(), '| batch y: ', batch_y.numpy())


Epoch:  0 | Step:  0 | batch x:  [ 5.  7. 10.  3.  4.] | batch y:  [6. 4. 1. 8. 7.]
Epoch:  0 | Step:  1 | batch x:  [2. 1. 8. 9. 6.] | batch y:  [ 9. 10.  3.  2.  5.]
Epoch:  1 | Step:  0 | batch x:  [ 4.  6.  7. 10.  8.] | batch y:  [7. 5. 4. 1. 3.]
Epoch:  1 | Step:  1 | batch x:  [5. 3. 2. 1. 9.] | batch y:  [ 6.  8.  9. 10.  2.]
Epoch:  2 | Step:  0 | batch x:  [ 4.  2.  5.  6. 10.] | batch y:  [7. 9. 6. 5. 1.]
Epoch:  2 | Step:  1 | batch x:  [3. 9. 1. 8. 7.] | batch y:  [ 8.  2. 10.  3.  4.]


### Suppose a different batch size that does not evenly divide the number of data entries:

In [19]:
BATCH_SIZE = 3
loader = Data.DataLoader(
    dataset=torch_dataset,      # torch TensorDataset format
    batch_size=BATCH_SIZE,      # mini batch size
    shuffle=True,               # random shuffle for training
    num_workers=2,              # subprocesses for loading data
)
for epoch in range(3):   # train entire dataset 3 times
    for step, (batch_x, batch_y) in enumerate(loader):  # for each training step
        # train your data...
        print('Epoch: ', epoch, '| Step: ', step, '| batch x: ',
              batch_x.numpy(), '| batch y: ', batch_y.numpy())

Epoch:  0 | Step:  0 | batch x:  [3. 9. 4.] | batch y:  [8. 2. 7.]
Epoch:  0 | Step:  1 | batch x:  [1. 7. 8.] | batch y:  [10.  4.  3.]
Epoch:  0 | Step:  2 | batch x:  [ 5.  6. 10.] | batch y:  [6. 5. 1.]
Epoch:  0 | Step:  3 | batch x:  [2.] | batch y:  [9.]
Epoch:  1 | Step:  0 | batch x:  [1. 5. 3.] | batch y:  [10.  6.  8.]
Epoch:  1 | Step:  1 | batch x:  [2. 8. 6.] | batch y:  [9. 3. 5.]
Epoch:  1 | Step:  2 | batch x:  [ 9.  7. 10.] | batch y:  [2. 4. 1.]
Epoch:  1 | Step:  3 | batch x:  [4.] | batch y:  [7.]
Epoch:  2 | Step:  0 | batch x:  [1. 6. 4.] | batch y:  [10.  5.  7.]
Epoch:  2 | Step:  1 | batch x:  [3. 8. 5.] | batch y:  [8. 3. 6.]
Epoch:  2 | Step:  2 | batch x:  [10.  7.  9.] | batch y:  [1. 4. 2.]
Epoch:  2 | Step:  3 | batch x:  [2.] | batch y:  [9.]
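
In the output above, the last step of each epoch returns only the 1 leftover sample. If a smaller final batch is undesirable, `DataLoader` accepts a `drop_last` argument that discards it. The sketch below is an illustration rather than part of the original tutorial; it rebuilds the same 10-sample dataset and uses `num_workers=0` (loading in the main process) purely to keep the example simple.

```python
import torch
import torch.utils.data as Data

# Same toy data as above: 10 (x, y) pairs.
x = torch.linspace(1, 10, 10)
y = torch.linspace(10, 1, 10)
dataset = Data.TensorDataset(x, y)

loader = Data.DataLoader(
    dataset=dataset,
    batch_size=3,
    shuffle=True,
    drop_last=True,   # discard the final batch if it has fewer than 3 samples
    num_workers=0,    # assumption: load in the main process for simplicity
)

# Collect the size of every batch in one epoch.
batch_sizes = [len(batch_x) for batch_x, batch_y in loader]
print(len(loader), batch_sizes)  # 3 [3, 3, 3]
```

With `drop_last=True`, one sample is skipped in every epoch; because `shuffle=True` reorders the data each time, the skipped sample generally differs between epochs.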
