# Basic practices of Deep Learning using PyTorch

## I. Example 1 for Automatic Differentiation (autograd)

Automatic Differentiation in PyTorch

Autograd is the PyTorch system that automatically computes gradients of expressions with respect to their input variables. These gradients are what optimization techniques such as gradient descent use to update a neural network's parameters during training.

In [None]:
import torch
import torchvision
import torch.nn as nn
import torchvision.transforms as transforms

First things first: linear equations, creating tensors, making predictions, and so on. But what is nn? It stands for neural network. To understand the idea, refer to the neural network videos on YouTube by 3Blue1Brown (my favourite channel); they build strong intuition for neural networks, weights and biases, the reason for calculating losses, gradient descent, and much more.

In [None]:
x = torch.tensor(1., requires_grad=True)
w = torch.tensor(2., requires_grad=True)
b = torch.tensor(3., requires_grad=True)

requires_grad=True means that autograd is going to track the operations performed on or with x, w and b.

In [None]:
y = w * x + b

This is a simple linear equation.
Because x, w and b have requires_grad=True, PyTorch has tracked the operations and built a computation graph.
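
To make the tracking visible, the short sketch below inspects the tensors after the forward computation. It repeats the same x, w, b and y as above; `is_leaf` and `grad_fn` are standard tensor attributes.

```python
import torch

x = torch.tensor(1., requires_grad=True)
w = torch.tensor(2., requires_grad=True)
b = torch.tensor(3., requires_grad=True)
y = w * x + b

# Tensors created directly by the user are leaves of the graph and have no grad_fn.
print(x.is_leaf, x.grad_fn)

# y was produced by tracked operations, so it requires grad and
# remembers the function that created it (the last node of the graph).
print(y.requires_grad, y.grad_fn)
```

The `grad_fn` of y is the entry point that `backward()` later walks through to reach x, w and b.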

In [None]:
y.backward()

The backward pass computes the gradients of y with respect to x, w and b, and stores them in the .grad attribute of each tensor.

In [None]:
print(x.grad)
print(w.grad)
print(b.grad)

tensor(2.)
tensor(1.)
tensor(1.)
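
These values can be checked by hand: for y = w * x + b, the derivatives are dy/dx = w, dy/dw = x and dy/db = 1. The sketch below repeats the example and compares the autograd results with those analytic values.

```python
import torch

x = torch.tensor(1., requires_grad=True)
w = torch.tensor(2., requires_grad=True)
b = torch.tensor(3., requires_grad=True)

y = w * x + b
y.backward()

# Analytic derivatives of y = w * x + b.
analytic = {'dy/dx': w.item(), 'dy/dw': x.item(), 'dy/db': 1.0}
computed = {'dy/dx': x.grad.item(), 'dy/dw': w.grad.item(), 'dy/db': b.grad.item()}

for name in analytic:
    print(name, 'analytic:', analytic[name], 'autograd:', computed[name])
```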


## II. Example 2

In [None]:
x = torch.randn(10, 3)
y = torch.randn(10,2)

In [None]:
linear = nn.Linear(3, 2)
print('w', linear.weight)
print('b', linear.bias)

w Parameter containing:
tensor([[-0.1423,  0.2529, -0.0654],
        [ 0.5331, -0.4084,  0.4343]], requires_grad=True)
b Parameter containing:
tensor([-0.3153, -0.4255], requires_grad=True)


In [None]:
y

tensor([[-7.6141e-01,  1.0809e-04],
        [-1.3537e+00,  1.7671e+00],
        [-1.0346e+00, -2.4248e-01],
        [ 9.9441e-01,  6.7304e-01],
        [ 4.7188e-01, -2.1565e+00],
        [ 2.3015e-01, -6.5607e-01],
        [ 2.3410e-01, -4.8376e-01],
        [ 1.5548e+00, -1.3485e-01],
        [ 1.3650e-02,  2.0139e+00],
        [ 7.7922e-02, -9.2036e-02]])

In [None]:
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(linear.parameters(), lr=0.01)

In [None]:
pred = linear(x)

In [None]:
pred

tensor([[-0.2893, -0.2346],
        [-0.6698,  0.4193],
        [-0.4674,  0.0723],
        [-0.2020, -0.5157],
        [-0.3000, -0.4427],
        [-0.2821, -1.0992],
        [-0.0706, -0.0329],
        [-0.6721,  1.0849],
        [-0.2303, -0.7887],
        [-0.2509, -0.4587]], grad_fn=<AddmmBackward0>)

In [None]:
loss = criterion(pred, y)
print('loss: ', loss.item())


loss:  1.2359490394592285


In [None]:
loss.backward()

In [None]:
print('dL/dw: ', linear.weight.grad)
print('dL/db: ', linear.bias.grad)

dL/dw:  tensor([[ 0.1418, -0.0057, -0.6323],
        [ 0.2210,  0.1109,  0.3854]])
dL/db:  tensor([-0.3862, -0.2684])
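
As a sanity check, the gradients of the MSE loss for a linear layer can also be computed by hand. The sketch below uses fresh random data (with an arbitrary seed, so the numbers differ from the output above) and assumes the default `reduction='mean'` of `nn.MSELoss`, which averages over all 10 * 2 elements.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility
x = torch.randn(10, 3)
y = torch.randn(10, 2)
linear = nn.Linear(3, 2)

pred = linear(x)
loss = nn.MSELoss()(pred, y)
loss.backward()

with torch.no_grad():
    # dL/dpred for a mean over all elements: 2 * (pred - y) / N
    grad_pred = 2 * (pred - y) / pred.numel()
    # pred = x @ W^T + b, so dL/dW = grad_pred^T @ x and dL/db sums over the batch.
    manual_dw = grad_pred.t() @ x
    manual_db = grad_pred.sum(dim=0)

print(torch.allclose(manual_dw, linear.weight.grad, atol=1e-6))
print(torch.allclose(manual_db, linear.bias.grad, atol=1e-6))
```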


In [None]:
optimizer.step()
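
For plain SGD (no momentum or weight decay, as configured here), one step amounts to `param = param - lr * param.grad`. The sketch below, again on fresh random data with an arbitrary seed, computes that update by hand and compares it with what `optimizer.step()` produces.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility
x = torch.randn(10, 3)
y = torch.randn(10, 2)
linear = nn.Linear(3, 2)
lr = 0.01
optimizer = torch.optim.SGD(linear.parameters(), lr=lr)

loss = nn.MSELoss()(linear(x), y)
loss.backward()

# Expected parameters after one plain SGD step, computed by hand.
expected_w = linear.weight.detach() - lr * linear.weight.grad
expected_b = linear.bias.detach() - lr * linear.bias.grad

optimizer.step()  # updates linear.weight and linear.bias in place

print(torch.allclose(linear.weight, expected_w))
print(torch.allclose(linear.bias, expected_b))
```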

In [None]:
pred = linear(x)


In [None]:
pred

tensor([[-0.2863, -0.2343],
        [-0.6646,  0.4207],
        [-0.4707,  0.0741],
        [-0.1961, -0.5147],
        [-0.2926, -0.4407],
        [-0.2873, -1.0892],
        [-0.0519, -0.0411],
        [-0.6533,  1.0778],
        [-0.2235, -0.7845],
        [-0.2486, -0.4568]], grad_fn=<AddmmBackward0>)

In [None]:
loss = criterion(pred, y)
print('loss after 1 step optimization: ', loss.item())

loss after 1 step optimization:  1.22749924659729


In [None]:
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(linear.parameters(), lr=0.1)

In [None]:
pred = linear(x)

# Compute loss.
loss = criterion(pred, y)
print('loss: ', loss.item())

loss:  1.22749924659729


In [None]:
loss.backward()

In [None]:
optimizer.step()

In [None]:
pred = linear(x)
loss = criterion(pred, y)
print('loss after 1 step optimization: ', loss.item())

loss after 1 step optimization:  1.0838451385498047


In [None]:
pred

tensor([[-0.2259, -0.2270],
        [-0.5618,  0.4473],
        [-0.5356,  0.1099],
        [-0.0794, -0.4947],
        [-0.1443, -0.3995],
        [-0.3912, -0.8909],
        [ 0.3200, -0.2051],
        [-0.2812,  0.9368],
        [-0.0890, -0.7011],
        [-0.2021, -0.4187]], grad_fn=<AddmmBackward0>)

In [None]:
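# Summary
# - created random tensors x (10 x 3) and y (10 x 2)
# - built a linear layer with 3 inputs and 2 outputs
# - chose the loss function: MSELoss()
# - chose the optimizer: SGD with lr = 0.01
# - computed pred = linear(x)
# - calculated the loss
# - called loss.backward() to compute the gradients
# - called optimizer.step() for one step of gradient descent
# - computed linear(x) and the loss again, and printed the loss
# - repeated one step with lr = 0.1: the loss dropped more (1.2275 -> 1.0838)
#   than with lr = 0.01 (1.2359 -> 1.2275), because a larger learning rate takes a bigger step
# Note: the gradients were not cleared between the two backward() calls, so the second
# step used accumulated gradients; normally optimizer.zero_grad() is called before each backward().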
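
The individual steps of this section are usually combined into a loop. The following is a minimal sketch, assuming the same kind of random data (with an arbitrary seed), a learning rate of 0.1 and 20 steps; `optimizer.zero_grad()` clears the gradients at the start of each iteration, because `backward()` otherwise adds to the values already stored in `.grad`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility
x = torch.randn(10, 3)
y = torch.randn(10, 2)

linear = nn.Linear(3, 2)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(linear.parameters(), lr=0.1)

losses = []
for step in range(20):
    optimizer.zero_grad()        # clear gradients from the previous step
    pred = linear(x)             # forward pass
    loss = criterion(pred, y)    # compute loss
    loss.backward()              # backward pass
    optimizer.step()             # one step of gradient descent
    losses.append(loss.item())

print(f'first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}')
```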

## III. Loading data from NumPy

In [None]:
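import numpy as np

x = np.array([[1, 2], [3, 4]])   # create a NumPy array
y = torch.from_numpy(x)          # convert the NumPy array to a torch tensor
z = y.numpy()                    # convert the torch tensor back to a NumPy array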
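
One detail worth knowing: `torch.from_numpy` and `Tensor.numpy()` do not copy the data, they share memory with the original array. The sketch below shows this, and contrasts it with `torch.tensor(...)`, which makes a copy.

```python
import numpy as np
import torch

x = np.array([[1, 2], [3, 4]])
y = torch.from_numpy(x)   # shares memory with x
z = y.numpy()             # shares memory with y (and therefore with x)
c = torch.tensor(x)       # independent copy of the data

x[0, 0] = -1              # modify the original array in place

print(y[0, 0].item())     # follows the change made to x
print(z[0, 0])            # follows the change made to x
print(c[0, 0].item())     # keeps the original value
```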

## IV. Input pipeline

Now I'm downloading the CIFAR-10 dataset.

In [None]:
train_dataset = torchvision.datasets.CIFAR10(root='../data',
                                             train=True,
                                             transform=transforms.ToTensor(),
                                             download=True)

Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ../data/cifar-10-python.tar.gz


100%|██████████| 170498071/170498071 [00:01<00:00, 94630117.09it/s]


Extracting ../data/cifar-10-python.tar.gz to ../data


In [None]:
image, label = train_dataset[0]
print(image.size())
print(label)
# image is a tensor of shape [3, 32, 32]: 3 color channels, 32 x 32 pixels.
# Here we fetch just the first sample (index 0) of the dataset.

torch.Size([3, 32, 32])
6


The DataLoader loads batches of data for the training loop. With the default num_workers=0, batches are loaded in the main process; with a positive num_workers, worker processes prepare batches in the background and queue them while the model trains. This is especially important when data loading is slower than a training step, because the model can keep receiving batches without waiting for each one to be loaded.

So let's create the train_loader, which handles batching and shuffling and can load batches in parallel.


In [None]:
import torch.utils.data
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                      batch_size=64,
                                      shuffle=True)
# Shuffling randomizes the order in which samples are batched, so the model cannot
# learn patterns tied to the sample order, which carries no meaning for this data.

After this, wrap train_loader in an iterator (data_iter) so that you can fetch the next batch with next(). This is usually done to inspect batches manually; for training, we normally iterate over the loader in a loop.

In [None]:
data_iter = iter(train_loader)



In [None]:
images, labels = next(data_iter)
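
To see what a batch looks like without downloading anything, the sketch below uses a hypothetical stand-in dataset: 200 random tensors with the shape of CIFAR-10 images and random labels in [0, 10), wrapped in a `TensorDataset`.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data (an assumption for illustration, not real CIFAR-10 images).
fake_images = torch.randn(200, 3, 32, 32)
fake_labels = torch.randint(0, 10, (200,))
dataset = TensorDataset(fake_images, fake_labels)

loader = DataLoader(dataset, batch_size=64, shuffle=True)

# One batch: images are stacked along a new first (batch) dimension.
batch_images, batch_labels = next(iter(loader))
print(batch_images.shape)
print(batch_labels.shape)

# 200 samples with batch_size=64 give three full batches and a smaller last one.
batch_sizes = [len(labels) for _, labels in loader]
print(batch_sizes)
```

The last batch is smaller because 200 is not a multiple of 64; passing `drop_last=True` to the DataLoader would discard it.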

In a training loop, iterate over the loader directly:


In [None]:
for images, labels in train_loader:
    # Training code should be written here.
    pass

```
Components of an input pipeline:
1. Data Loading: Loading data from storage.
2. Data Preprocessing: Transforming the raw data into a format suitable for training.
3. Batching: Grouping the data into batches, so the model is trained on batches of data rather than individual samples.
4. Shuffling: Randomizing the order of data samples.
5. Prefetching: Overlapping data loading and model training.

```

## V. Input pipeline for custom dataset
Custom Dataset: Created by the user to handle specific data formats, sources, or preprocessing steps. It is a class that inherits from torch.utils.data.Dataset and implements the __getitem__ and __len__ methods; inside __getitem__ the user defines how each sample is loaded from files, databases, or any other source.

Pre-built Dataset: Provided by libraries like torchvision and ready-made for common tasks. These datasets are often well known and used for benchmarking and experimentation; their data loading logic is predefined and often includes mechanisms for downloading, extracting, and organizing data.

In [None]:
class CustomDataset(torch.utils.data.Dataset):
    def __init__(self):
        # 1. Initialize file paths or a list of file names.
        pass

    def __getitem__(self, index):
        # 1. Read one sample from file (e.g. using numpy.fromfile, PIL.Image.open).
        # 2. Preprocess the data (e.g. with torchvision.transforms).
        # 3. Return a data pair (e.g. image and label).
        pass

    def __len__(self):
        # Placeholder: change 1 to the total size of your dataset.
        return 1

custom_dataset = CustomDataset()
train_loader = torch.utils.data.DataLoader(dataset=custom_dataset,
                                           batch_size=64,
                                           shuffle=True)
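
The template above leaves `__getitem__` empty, so here is a minimal filled-in sketch. Everything in it is hypothetical and chosen only for illustration: the class name `RandomPairsDataset`, the in-memory random data instead of files, and the doubling `transform`.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class RandomPairsDataset(Dataset):
    """Hypothetical in-memory dataset of (feature, label) pairs."""

    def __init__(self, num_samples=100, num_features=3, transform=None):
        generator = torch.Generator().manual_seed(0)  # arbitrary seed
        self.features = torch.randn(num_samples, num_features, generator=generator)
        self.labels = torch.randint(0, 2, (num_samples,), generator=generator)
        self.transform = transform

    def __getitem__(self, index):
        # 1. Read one sample, 2. preprocess it, 3. return a (data, label) pair.
        feature = self.features[index]
        if self.transform is not None:
            feature = self.transform(feature)
        return feature, self.labels[index]

    def __len__(self):
        # Total number of samples in the dataset.
        return len(self.features)


dataset = RandomPairsDataset(transform=lambda t: t * 2.0)
loader = DataLoader(dataset, batch_size=64, shuffle=True)

batch_shapes = [(tuple(f.shape), tuple(l.shape)) for f, l in loader]
print(batch_shapes)
```

A dataset that reads from disk would keep only file paths in `__init__` and open the file inside `__getitem__`, so that samples are loaded lazily.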

## VI. How to download and load the pretrained model.

Here I'm using ResNet-18, an image recognition model pretrained on ImageNet with 1000 classes/categories.
Setting requires_grad = False on every parameter freezes the pretrained layers, so they will not be updated during training and learn nothing new from my data.
Then the final fully connected layer (fc) is replaced with a new nn.Linear layer with 100 outputs, because I have just 100 categories. The parameters of this new layer are trainable by default, so fine-tuning only trains this layer on top of the frozen pretrained features.
Finally, a random batch of 64 images of size 3 x 224 x 224 is passed through the model to check that the output has size [64, 100].

In [None]:
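# Download and load the pretrained ResNet-18.
# (Newer torchvision releases prefer the `weights=` argument over `pretrained=True`.)
resnet = torchvision.models.resnet18(pretrained=True)

# Freeze all pretrained parameters so they are not updated during training.
for param in resnet.parameters():
    param.requires_grad = False

# Replace the top layer for fine-tuning; its new parameters are trainable by default.
resnet.fc = nn.Linear(resnet.fc.in_features, 100)

# Forward pass on a random batch of 64 RGB images of size 224 x 224.
images = torch.randn(64, 3, 224, 224)
outputs = resnet(images)
print(outputs.size())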


torch.Size([64, 100])


```
Input Layer      Hidden Layer      Output Layer
  o o o             o o o o          o o o
   \ \ \           / / / /           \ \ \
    \ \ \         / / / /             \ \ \
     \ \ \       / / / /               \ \ \
      \ \ \     / / / /                 \ \ \
       \ \ \   / / / /                   \ \ \
        \ \ \ / / / /                     \ \ \
```

```
Input Layer      Hidden Layer      New Output Layer
  o o o             o o o o               o o o
   \ \ \           / / / / \              | | |
    \ \ \         / / / /   \             | | |
     \ \ \       / / / /     \            | | |
      \ \ \     / / / /       \           | | |
       \ \ \   / / / /         \          | | |
        \ \ \ / / / /           \         | | |
```



## VII. Save and load the model.

In [62]:
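# Save and load the entire model.
# (Recent PyTorch releases default torch.load to weights_only=True, so loading a
# whole pickled model may require torch.load('model.ckpt', weights_only=False).)
torch.save(resnet, 'model.ckpt')
model = torch.load('model.ckpt')

# We can also save and load just the model parameters (the state_dict).

torch.save(resnet.state_dict(), 'params.ckpt')
resnet.load_state_dict(torch.load('params.ckpt'))

# ckpt stands for checkpoint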

<All keys matched successfully>
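
The state_dict round trip can be verified on a small model. The sketch below is an illustration under assumptions: it uses a tiny `nn.Linear` instead of ResNet and a temporary directory for the checkpoint file, then checks that the restored parameters are identical to the saved ones.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(3, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'params.ckpt')
    torch.save(model.state_dict(), path)

    # A fresh model with the same architecture but different random weights.
    restored = nn.Linear(3, 2)
    result = restored.load_state_dict(torch.load(path))

print(result)

# Every tensor in the restored state_dict should equal the saved one.
same = all(
    torch.equal(saved, loaded)
    for saved, loaded in zip(model.state_dict().values(),
                             restored.state_dict().values())
)
print(same)
```

Saving only the state_dict keeps the checkpoint independent of how the model class is pickled, at the cost of having to rebuild the architecture in code before loading.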