# **Dataset, DataLoader, and batch size**

One hyperparameter in a neural network that we have not considered yet is the batch size. Batch size refers to the number of data points considered to calculate the loss value or update weights.

This hyperparameter is especially useful when there are millions of data points: using all of them for a single weight update is impractical, because there may not be enough memory to hold so much information. In addition, a sample can be representative enough of the data. Batch size lets us fetch a subset of samples that is representative enough, though not necessarily 100% representative, of the total data.

In this section, we will specify the batch size to be used when calculating the gradients of the loss with respect to the weights. These gradients are used to update the weights, which are in turn used to calculate the updated loss value.
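As a quick illustration of what the batch size controls, the sketch below counts how many weight updates happen in one pass over the data. The helper name `updates_per_epoch` is hypothetical and is not part of PyTorch; it only assumes that the last batch may be smaller than the rest.

```python
import math

def updates_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of weight updates in one pass over the data.

    Assumes the final batch is kept even if it is smaller than batch_size.
    """
    return math.ceil(num_samples / batch_size)

print(updates_per_epoch(4, 2))           # 2 updates: the toy dataset used in this section
print(updates_per_epoch(4, 4))           # 1 update: the whole dataset in a single batch
print(updates_per_epoch(1_000_000, 32))  # 31250 updates
```

A smaller batch size means more, cheaper updates per pass; a larger one means fewer updates, each needing more memory.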

# **Import the methods that help in loading data and dealing with datasets:**

In [1]:
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

# Define the input (x) and output (y) data points

In [2]:
x = [[1,2],[3,4],[5,6],[7,8]]
y = [[3],[7],[11],[15]]

# Convert the data into floating-point numbers

In [3]:
X = torch.tensor(x).float()
Y = torch.tensor(y).float()

# Register the data to the device: 'cuda' if a GPU is available, otherwise 'cpu'.

In [4]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'
X = X.to(device)
Y = Y.to(device)

# Define the dataset class, MyDataset, and instantiate it:

Within the MyDataset class, we store the information needed to fetch one data point at a time, so that a batch of data points can be bundled together (using DataLoader) and sent through one forward pass and one backpropagation pass to update the weights:

*   Define an `__init__` method that takes input and output pairs and converts them into Torch float tensors.
*   Specify the length (`__len__`) of the input dataset.
*   Finally, define the `__getitem__` method, which fetches a specific row. In the code that follows, ix refers to the index of the row to be fetched from the dataset.


In [5]:
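class MyDataset(Dataset):
    def __init__(self, x, y):
        # as_tensor accepts lists or existing tensors and avoids the
        # copy-construct warning that torch.tensor(tensor) raises
        self.x = torch.as_tensor(x).float()
        self.y = torch.as_tensor(y).float()
    def __len__(self):
        # number of data points in the dataset
        return len(self.x)
    def __getitem__(self, ix):
        # return the (input, output) pair at row ix
        return self.x[ix], self.y[ix]
ds = MyDataset(X, Y)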

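To see what `__len__` and `__getitem__` provide, here is a minimal, self-contained sketch that builds the same kind of dataset on the CPU and indexes into it. It passes plain Python lists and uses `torch.as_tensor`, which is an assumption made for this illustration rather than the exact construction used above.

```python
import torch
from torch.utils.data import Dataset

class MyDataset(Dataset):
    def __init__(self, x, y):
        # convert inputs and outputs to float tensors
        self.x = torch.as_tensor(x, dtype=torch.float32)
        self.y = torch.as_tensor(y, dtype=torch.float32)
    def __len__(self):
        return len(self.x)
    def __getitem__(self, ix):
        return self.x[ix], self.y[ix]

ds = MyDataset([[1, 2], [3, 4], [5, 6], [7, 8]], [[3], [7], [11], [15]])

print(len(ds))   # 4, provided by __len__
x0, y0 = ds[0]   # indexing calls __getitem__(0)
print(x0, y0)    # tensor([1., 2.]) tensor([3.])
```

DataLoader relies on exactly these two methods: it asks for the length to know how many rows exist, then fetches rows by index and stacks them into a batch.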


# Pass the dataset instance defined previously through DataLoader to fetch batch_size data points at a time from the original input and output tensors:

In [6]:
dl = DataLoader(ds, batch_size=2, shuffle=True)

# **To fetch the batches from dl, we loop through it:**

In [7]:
# NOTE - This loop is not part of model building;
# it only illustrates how to print the input and output batches of data.
# The names x_batch and y_batch avoid overwriting the lists x and y defined earlier.
for x_batch, y_batch in dl:
    print(x_batch, y_batch)

tensor([[7., 8.],
        [5., 6.]], device='cuda:0') tensor([[15.],
        [11.]], device='cuda:0')
tensor([[3., 4.],
        [1., 2.]], device='cuda:0') tensor([[7.],
        [3.]], device='cuda:0')
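The dataset above divides evenly into batches of two. As a sketch of what happens when it does not, the example below uses five hypothetical rows (not the data from this section) and `TensorDataset`, a built-in dataset wrapper, to keep the code short. `drop_last` is a standard DataLoader argument that discards the final incomplete batch.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Five hypothetical rows, so batch_size=2 leaves one row over
X = torch.arange(10, dtype=torch.float32).reshape(5, 2)
Y = X.sum(dim=1, keepdim=True)
ds = TensorDataset(X, Y)

# Default behaviour: the last batch is smaller
dl = DataLoader(ds, batch_size=2, shuffle=False)
sizes = [xb.shape[0] for xb, yb in dl]
print(len(dl), sizes)            # 3 [2, 2, 1]

# drop_last=True discards the incomplete final batch
dl_drop = DataLoader(ds, batch_size=2, shuffle=False, drop_last=True)
sizes_drop = [xb.shape[0] for xb, yb in dl_drop]
print(len(dl_drop), sizes_drop)  # 2 [2, 2]
```

With `shuffle=True`, as used in this section, the rows assigned to each batch change on every pass, but the batch sizes follow the same pattern.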


# Now, we define the neural network class as we defined in the previous section:

In [8]:
class MyNeuralNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.input_to_hidden_layer = nn.Linear(2,8)
        self.hidden_layer_activation = nn.ReLU()
        self.hidden_to_output_layer = nn.Linear(8,1)
    def forward(self, x):
        x = self.input_to_hidden_layer(x)
        x = self.hidden_layer_activation(x)
        x = self.hidden_to_output_layer(x)
        return x

# Next, we define the model object (mynet), loss function (loss_func), and optimizer (opt) too, as defined in the previous section:

In [9]:
from torch.optim import SGD
mynet = MyNeuralNet().to(device)
loss_func = nn.MSELoss()
opt = SGD(mynet.parameters(), lr = 0.001)

# Finally, loop through the batches of data points to minimize the loss value, just like we did in step 6 in the previous section:

In [10]:
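import time
loss_history = []
start = time.time()
for _ in range(50):              # 50 passes (epochs) over the dataset
    for x, y in dl:              # one batch of 2 data points at a time
        opt.zero_grad()          # reset gradients from the previous batch
        loss_value = loss_func(mynet(x), y)
        loss_value.backward()    # compute gradients for this batch
        opt.step()               # update the weights
        # store a plain float so the computation graph is not kept alive
        loss_history.append(loss_value.item())
end = time.time()
print(end - start)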

0.10752296447753906
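Because the weights are updated once per batch, the loop above records one loss value per batch: 50 epochs with 2 batches each gives 100 entries. One possible way to get a smoother curve is to average the batch losses within each epoch. The sketch below is a self-contained CPU variant under that assumption; `nn.Sequential` and `TensorDataset` stand in for MyNeuralNet and MyDataset, and `epoch_losses` is a name introduced here for illustration.

```python
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, DataLoader

torch.manual_seed(0)  # fixed seed so repeated runs start from the same weights
X = torch.tensor([[1, 2], [3, 4], [5, 6], [7, 8]]).float()
Y = torch.tensor([[3], [7], [11], [15]]).float()
dl = DataLoader(TensorDataset(X, Y), batch_size=2, shuffle=True)

# Same architecture as MyNeuralNet, written compactly
model = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1))
loss_func = nn.MSELoss()
opt = torch.optim.SGD(model.parameters(), lr=0.001)

epoch_losses = []
for _ in range(50):
    batch_losses = []
    for xb, yb in dl:
        opt.zero_grad()
        loss = loss_func(model(xb), yb)
        loss.backward()
        opt.step()
        batch_losses.append(loss.item())  # plain Python float
    # mean of the batch losses seen during this epoch
    epoch_losses.append(sum(batch_losses) / len(batch_losses))

print(len(epoch_losses))  # one entry per epoch
```

The resulting list has one value per epoch, which is convenient when comparing runs that use different batch sizes.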


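# Finally, predict on a new data point. Each training output is the sum of its two inputs, so the prediction for [10, 11] should be close to 21:

In [11]: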
val_x = [[10,11]]

In [12]:
val_x = torch.tensor(val_x).float().to(device)

In [13]:
mynet(val_x)

tensor([[20.1701]], device='cuda:0', grad_fn=<AddmmBackward>)
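The `grad_fn` in the output above shows that PyTorch tracked this prediction for backpropagation, which is not needed when the model is only being evaluated. A common pattern, sketched here with an untrained stand-in model rather than the trained mynet, is to wrap inference in `torch.no_grad()`:

```python
import torch
import torch.nn as nn

# Untrained stand-in with the same architecture as mynet (assumption for illustration)
model = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1))
val_x = torch.tensor([[10, 11]]).float()

model.eval()                 # switch layers such as dropout to inference behaviour
with torch.no_grad():        # do not build a computation graph
    pred = model(val_x)

print(pred.shape)            # torch.Size([1, 1])
print(pred.requires_grad)    # False
```

The predicted value is the same as without `torch.no_grad()`; only the gradient bookkeeping is skipped, which saves memory on larger models.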