## PyTorch
PyTorch is an open-source machine learning framework used primarily for developing and training deep learning models. It was developed by Facebook's AI Research lab (FAIR) and released in 2016. PyTorch provides a flexible, dynamic approach to building neural networks, which has made it a popular choice among researchers and developers.

The framework is built around a dynamic computational graph: the graph is built and modified on the fly as the program runs. This makes model development more intuitive, because you can use standard Python control flow statements and debug the model with ordinary Python tools.

PyTorch supports automatic differentiation, which enables efficient computation of gradients for training neural networks with backpropagation. It also provides tools and libraries for tasks such as data loading, model building, optimization, and evaluation.

One of the key advantages of PyTorch is its support for GPU acceleration, which lets you train models on GPUs to significantly speed up computation. It also has a large and active community, so plenty of resources, tutorials, and pretrained models are available.

PyTorch is often compared to TensorFlow, another popular deep learning framework. TensorFlow was originally designed around static computation graphs, while PyTorch emphasizes dynamic computation graphs. This difference in design philosophy is often cited as giving PyTorch an edge in flexibility and ease of use.

Overall, PyTorch is widely used in the research community and is gaining popularity in industry applications as well. It provides a powerful and user-friendly platform for building and training deep learning models.
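
To make the idea of a dynamic graph concrete, here is a minimal sketch. The function name `repeated_double` and the threshold `limit` are illustrative and not part of PyTorch. The number of loop iterations depends on the tensor's value at run time, and the gradient is still computed through whichever operations actually ran.

```python
import torch

def repeated_double(x, limit=100.0):
    # Hypothetical helper: keep doubling until the magnitude reaches `limit`.
    # The loop is plain Python, so the graph is recorded as the loop runs.
    y = x
    while y.abs() < limit:
        y = y * 2
    return y

x = torch.tensor(3.0, requires_grad=True)
y = repeated_double(x)   # 3 -> 6 -> 12 -> 24 -> 48 -> 96 -> 192 (six doublings)
y.backward()             # y = 64 * x, so dy/dx = 64

print(y.item(), x.grad.item())
```

A different starting value would take a different number of iterations, and the recorded graph would change accordingly.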

In [46]:
import torch

### Tensors
- At its core, PyTorch is a library for processing tensors. A tensor is a number, vector, matrix, or any n-dimensional array.
- Let's create a tensor with a single number.

In [47]:
# Create a tensor with a single number
t1 = torch.tensor(7.)
t1

tensor(7.)

In [48]:
t1.dtype

torch.float32

In [49]:
# Vector
t2 = torch.tensor([1., 2, 3, 4])
t2

tensor([1., 2., 3., 4.])

In [50]:
# 2D tensor
t3 = torch.tensor([[5., 6], 
                   [7, 8], 
                   [9, 10]])

In [51]:
# 3D tensor
t4 = torch.tensor([[[11, 12, 13],
                    [13, 14, 15]],
                   [[15, 16, 17],
                    [17, 18, 19]]])
t4

tensor([[[11, 12, 13],
         [13, 14, 15]],

        [[15, 16, 17],
         [17, 18, 19]]])

- Tensors can have any number of dimensions and different lengths along each dimension.
- We can inspect the length along each dimension using the `.shape` property of a tensor.

In [52]:
print(t1)
t1.shape

tensor(7.)


torch.Size([])

In [53]:
print(t2)
t2.shape

tensor([1., 2., 3., 4.])


torch.Size([4])

In [54]:
print(t3)
t3.shape

tensor([[ 5.,  6.],
        [ 7.,  8.],
        [ 9., 10.]])


torch.Size([3, 2])

In [55]:
print(t4)
t4.shape

tensor([[[11, 12, 13],
         [13, 14, 15]],

        [[15, 16, 17],
         [17, 18, 19]]])


torch.Size([2, 2, 3])

In [56]:
# Rows of unequal length cannot form a tensor, so this call raises an error
t5 = torch.tensor([[5., 6, 11],
                   [7, 8, 12],
                   [9, 10]])

ValueError: expected sequence of length 3 at dim 1 (got 2)
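
The error above arises because a tensor must be rectangular. The sketch below catches the error and then shows one possible workaround, padding the short row with zeros using `torch.nn.utils.rnn.pad_sequence`. Whether zero-padding is appropriate depends on the data, so treat this as an illustration rather than a recommendation.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

rows = [[5., 6., 11.], [7., 8., 12.], [9., 10.]]  # last row is shorter

try:
    torch.tensor(rows)
    ragged_ok = True
except ValueError as err:
    ragged_ok = False
    print("Cannot build tensor:", err)

# Assumption for illustration: pad short rows with zeros to make the data rectangular
padded = pad_sequence(
    [torch.tensor(r, dtype=torch.float32) for r in rows],
    batch_first=True,
)
print(padded.shape)
```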

### Tensor operations and gradients
We can combine tensors with the usual arithmetic operations. Let's look at an example:

In [57]:
# Create tensors
x = torch.tensor(3.)
w = torch.tensor(4., requires_grad=True)
b = torch.tensor(5., requires_grad=True)
x, w, b

(tensor(3.), tensor(4., requires_grad=True), tensor(5., requires_grad=True))

In [58]:
y = w * x + b
y

tensor(17., grad_fn=<AddBackward0>)

- As expected, y is a tensor with the value 3 * 4 + 5 = 17.
- A key feature of PyTorch is that we can automatically compute the derivative of y w.r.t. the tensors that have requires_grad set to True, i.e. w and b.
- This feature of PyTorch is called autograd (automatic gradients).
- To compute the derivatives, we can invoke the `.backward` method on our result y.

In [59]:
y.backward()

In [60]:
print('dy/dx:', x.grad) # None, because x was created without requires_grad=True
print('dy/dw:', w.grad) # 3
print('dy/db:', b.grad) # 1

dy/dx: None
dy/dw: tensor(3.)
dy/db: tensor(1.)
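
One detail worth seeing in isolation is that `.backward()` adds to the `.grad` attribute instead of overwriting it. The sketch below reuses the same values of x, w and b and calls `.backward()` twice, so each gradient is doubled until it is reset.

```python
import torch

x = torch.tensor(3.)
w = torch.tensor(4., requires_grad=True)
b = torch.tensor(5., requires_grad=True)

# Two forward/backward passes without resetting the gradients in between
for _ in range(2):
    y = w * x + b
    y.backward()

accumulated_w = w.grad.item()   # 3 + 3
accumulated_b = b.grad.item()   # 1 + 1
print(accumulated_w, accumulated_b)

# Reset in place so the next backward pass starts from zero
w.grad.zero_()
b.grad.zero_()
print(w.grad, b.grad)
```

This is why training loops reset gradients before or after every update step.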


### Tensor Functions

Apart from arithmetic operations, the torch module also contains many functions for creating and manipulating tensors. Let's look at some
examples.

In [61]:
t6 = torch.full((3, 2), 42)
t6

tensor([[42, 42],
        [42, 42],
        [42, 42]])

In [62]:
t7 = torch.cat((t3, t6))
t7

tensor([[ 5.,  6.],
        [ 7.,  8.],
        [ 9., 10.],
        [42., 42.],
        [42., 42.],
        [42., 42.]])

In [63]:
t8 = torch.sin(t7) 
t8

tensor([[-0.9589, -0.2794],
        [ 0.6570,  0.9894],
        [ 0.4121, -0.5440],
        [-0.9165, -0.9165],
        [-0.9165, -0.9165],
        [-0.9165, -0.9165]])

In [64]:
t9 = t8.reshape(3, 2, 2)
t9

tensor([[[-0.9589, -0.2794],
         [ 0.6570,  0.9894]],

        [[ 0.4121, -0.5440],
         [-0.9165, -0.9165]],

        [[-0.9165, -0.9165],
         [-0.9165, -0.9165]]])
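
The reshape above works because 6 * 2 and 3 * 2 * 2 both equal 12 elements. As a small sketch on a fresh tensor (the variable names are illustrative), the following shows that one dimension can be inferred with -1 and that a shape with the wrong element count is rejected.

```python
import torch

t = torch.arange(12.)          # 12 elements: 0., 1., ..., 11.
a = t.reshape(3, 2, 2)         # 3 * 2 * 2 = 12
b = t.reshape(-1, 4)           # -1 is inferred as 12 / 4 = 3
print(a.shape, b.shape)

try:
    t.reshape(5, 3)            # 5 * 3 = 15 does not match 12 elements
    mismatch_ok = True
except RuntimeError as err:
    mismatch_ok = False
    print("reshape failed:", err)
```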

### Interoperability with NumPy

In [65]:
import numpy as np
x = np.array([[1, 2], [3, 4.]])
x

array([[1., 2.],
       [3., 4.]])

In [66]:
y = torch.from_numpy(x)
y

tensor([[1., 2.],
        [3., 4.]], dtype=torch.float64)

In [67]:
x.dtype, y.dtype    

(dtype('float64'), torch.float64)

We can convert a PyTorch tensor to a NumPy array using the `.numpy` method of a tensor.

In [68]:
y.numpy()

array([[1., 2.],
       [3., 4.]])
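
A point that is easy to miss is that `torch.from_numpy` and `.numpy()` share memory with the original array, whereas `torch.tensor` makes a copy. The short sketch below (variable names are illustrative) modifies the NumPy array and checks which tensor sees the change.

```python
import numpy as np
import torch

x = np.array([[1, 2], [3, 4.]])

shared = torch.from_numpy(x)   # shares memory with x
copied = torch.tensor(x)       # independent copy of the data

x[0, 0] = 100.                 # modify the NumPy array in place

print(shared[0, 0].item())     # reflects the change
print(copied[0, 0].item())     # still the original value

back = shared.numpy()          # converting back also shares memory
print(np.shares_memory(back, x))
```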

### Linear Regression

In [97]:
import torch
import numpy as np

In [98]:
# Input (temp, rainfall, humidity)
input = np.array([[73, 67, 43], 
                   [91, 88, 64], 
                   [87, 134, 58], 
                   [102, 43, 37], 
                   [69, 96, 70]], dtype='float32')

In [99]:
# Targets (apples, oranges)
target = np.array([[56, 70],
                    [81, 101],
                    [119, 133],
                    [22, 37],
                    [103, 119]], dtype='float32')

In [100]:
inputs = torch.from_numpy(input)
targets = torch.from_numpy(target)
print(inputs)
print(targets)

tensor([[ 73.,  67.,  43.],
        [ 91.,  88.,  64.],
        [ 87., 134.,  58.],
        [102.,  43.,  37.],
        [ 69.,  96.,  70.]])
tensor([[ 56.,  70.],
        [ 81., 101.],
        [119., 133.],
        [ 22.,  37.],
        [103., 119.]])


In [101]:
w = torch.randn(2, 3, requires_grad=True)
b = torch.randn(2, requires_grad=True)
print(w)
print(b)

tensor([[ 0.0161, -1.4949, -0.4853],
        [-0.6346,  0.5469,  1.6303]], requires_grad=True)
tensor([-1.2032,  0.1876], requires_grad=True)


In [102]:
def model(x):
    return x @ w.t() + b

In [103]:
preds = model(inputs)
preds

tensor([[-121.0542,   60.6050],
        [-162.3482,   94.9028],
        [-228.2649,  112.8183],
        [ -81.8000,   19.2927],
        [-177.5721,  123.0218]], grad_fn=<AddBackward0>)
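
The shapes in `x @ w.t() + b` are worth spelling out: a (5, 3) input times a (3, 2) transposed weight matrix gives (5, 2), and the (2,) bias is broadcast across the five rows. The sketch below uses random stand-in data and an arbitrary seed to check this against a per-sample computation.

```python
import torch

torch.manual_seed(0)            # arbitrary seed, only for repeatability
inputs = torch.rand(5, 3)       # 5 samples, 3 features (stand-in data)
w = torch.randn(2, 3)           # 2 outputs, 3 features
b = torch.randn(2)              # one bias per output

preds = inputs @ w.t() + b      # (5, 3) @ (3, 2) -> (5, 2); b broadcasts over rows

# The same result computed one sample at a time
manual = torch.stack([w @ row + b for row in inputs])

print(preds.shape)
print(torch.allclose(preds, manual, atol=1e-6))
```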

In [104]:
print(targets)

tensor([[ 56.,  70.],
        [ 81., 101.],
        [119., 133.],
        [ 22.,  37.],
        [103., 119.]])


In [105]:
def MSE(actual, target):
    # Mean squared error over all elements; the result is symmetric in its two arguments
    diff = actual - target
    return torch.sum(diff * diff) / diff.numel()

In [106]:
loss = MSE(targets, preds)
print(loss)

tensor(30151.6992, grad_fn=<DivBackward0>)
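
As a sanity check, the hand-written `MSE` can be compared with PyTorch's built-in `torch.nn.functional.mse_loss`, which averages over all elements by default. The sketch below uses random stand-in tensors, not the data from this notebook.

```python
import torch
import torch.nn.functional as F

def MSE(actual, target):
    diff = actual - target
    return torch.sum(diff * diff) / diff.numel()

torch.manual_seed(0)                    # arbitrary seed
preds = torch.randn(5, 2)               # stand-in predictions
targets = torch.randn(5, 2)             # stand-in targets

ours = MSE(preds, targets)
builtin = F.mse_loss(preds, targets)    # default reduction is the mean
swapped = MSE(targets, preds)           # argument order does not change the value

print(ours.item(), builtin.item(), swapped.item())
```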


In [107]:
# Compute gradients
loss.backward()

In [108]:
print(w.grad)

tensor([[-19045.7500, -22241.8164, -13361.9238],
        [  -905.0275,   -849.1395,   -467.6789]])


In [109]:
b.grad

tensor([-230.4079,   -9.8719])
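
The gradients that autograd reports can be checked by hand. With N = number of elements in the prediction matrix, the loss is L = sum((preds - targets)^2) / N, which gives dL/dw = (2/N) * (preds - targets)^T @ inputs and dL/db = (2/N) * the column sums of (preds - targets). The following sketch verifies this on random stand-in data in float64.

```python
import torch

torch.manual_seed(0)                                  # arbitrary seed
inputs = torch.rand(5, 3, dtype=torch.float64)        # stand-in data
targets = torch.rand(5, 2, dtype=torch.float64)
w = torch.randn(2, 3, dtype=torch.float64, requires_grad=True)
b = torch.randn(2, dtype=torch.float64, requires_grad=True)

preds = inputs @ w.t() + b
diff = preds - targets
loss = torch.sum(diff * diff) / diff.numel()
loss.backward()

# Analytic gradients derived from the loss definition
n = diff.numel()
with torch.no_grad():
    grad_w = (2.0 / n) * diff.t() @ inputs            # shape (2, 3), same as w
    grad_b = (2.0 / n) * diff.sum(dim=0)              # shape (2,), same as b

print(torch.allclose(w.grad, grad_w), torch.allclose(b.grad, grad_b))
```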

In [110]:
# reset the gradients to zero
w.grad.zero_()
b.grad.zero_()

print(w.grad)
print(b.grad)   

tensor([[0., 0., 0.],
        [0., 0., 0.]])
tensor([0., 0.])


In [111]:
preds = model(inputs)
loss = MSE(targets, preds)
loss

tensor(30151.6992, grad_fn=<DivBackward0>)

In [112]:
loss.backward()

In [113]:
# Adjust weights & reset gradients
with torch.no_grad():
    w -= w.grad * 1e-5
    b -= b.grad * 1e-5
    w.grad.zero_()
    b.grad.zero_()

In [114]:
print(w)
print(b)

tensor([[ 0.2065, -1.2725, -0.3517],
        [-0.6256,  0.5554,  1.6350]], requires_grad=True)
tensor([-1.2009,  0.1877], requires_grad=True)


In [115]:
preds = model(inputs)
loss = MSE(targets, preds)  
loss

tensor(20701.7910, grad_fn=<DivBackward0>)

In [116]:
for i in range(400):
    preds = model(inputs)
    loss = MSE(targets, preds)
    loss.backward()
    with torch.no_grad():
        w -= w.grad * 1e-5
        b -= b.grad * 1e-5
        w.grad.zero_()
        b.grad.zero_()
    print("Epoch:", i, "Loss:", loss)

Epoch: 0 Loss: tensor(20701.7910, grad_fn=<DivBackward0>)
Epoch: 1 Loss: tensor(14328.9238, grad_fn=<DivBackward0>)
Epoch: 2 Loss: tensor(10029.6953, grad_fn=<DivBackward0>)
Epoch: 3 Loss: tensor(7127.9126, grad_fn=<DivBackward0>)
Epoch: 4 Loss: tensor(5167.9092, grad_fn=<DivBackward0>)
Epoch: 5 Loss: tensor(3842.6140, grad_fn=<DivBackward0>)
Epoch: 6 Loss: tensor(2945.0977, grad_fn=<DivBackward0>)
Epoch: 7 Loss: tensor(2335.9094, grad_fn=<DivBackward0>)
Epoch: 8 Loss: tensor(1921.0756, grad_fn=<DivBackward0>)
Epoch: 9 Loss: tensor(1637.2703, grad_fn=<DivBackward0>)
Epoch: 10 Loss: tensor(1441.8156, grad_fn=<DivBackward0>)
Epoch: 11 Loss: tensor(1305.9521, grad_fn=<DivBackward0>)
Epoch: 12 Loss: tensor(1210.2986, grad_fn=<DivBackward0>)
Epoch: 13 Loss: tensor(1141.7937, grad_fn=<DivBackward0>)
Epoch: 14 Loss: tensor(1091.6343, grad_fn=<DivBackward0>)
Epoch: 15 Loss: tensor(1053.8883, grad_fn=<DivBackward0>)
Epoch: 16 Loss: tensor(1024.5565, grad_fn=<DivBackward0>)
...
Epoch: 392 Loss: tensor(25.7482, grad_fn=<DivBackward0>)
Epoch: 393 Loss: tensor(25.6148, grad_fn=<DivBackward0>)
Epoch: 394 Loss: tensor(25.4827, grad_fn=<DivBackward0>)
Epoch: 395 Loss: tensor(25.3519, grad_fn=<DivBackward0>)
Epoch: 396 Loss: tensor(25.2224, grad_fn=<DivBackward0>)
Epoch: 397 Loss: tensor(25.0942, grad_fn=<DivBackward0>)
Epoch: 398 Loss: tensor(24.9673, grad_fn=<DivBackward0>)
Epoch: 399 Loss: tensor(24.8416, grad_fn=<DivBackward0>)


In [117]:
loss

tensor(24.8416, grad_fn=<DivBackward0>)

In [118]:
preds, targets

(tensor([[ 58.3406,  69.7955],
         [ 81.7443, 104.4120],
         [117.8515, 125.2747],
         [ 28.9534,  34.2601],
         [ 96.2854, 127.2921]], grad_fn=<AddBackward0>),
 tensor([[ 56.,  70.],
         [ 81., 101.],
         [119., 133.],
         [ 22.,  37.],
         [103., 119.]]))
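
The manual update loop above can also be written with PyTorch's built-in pieces. The sketch below is one possible equivalent, assuming the same data and learning rate, using `torch.optim.SGD` for the update and `torch.nn.MSELoss` for the loss. The seed is arbitrary, so the exact loss values will differ from the run shown above.

```python
import torch

inputs = torch.tensor([[73., 67., 43.],
                       [91., 88., 64.],
                       [87., 134., 58.],
                       [102., 43., 37.],
                       [69., 96., 70.]])
targets = torch.tensor([[56., 70.],
                        [81., 101.],
                        [119., 133.],
                        [22., 37.],
                        [103., 119.]])

torch.manual_seed(0)                                  # arbitrary seed
w = torch.randn(2, 3, requires_grad=True)
b = torch.randn(2, requires_grad=True)

loss_fn = torch.nn.MSELoss()
optimizer = torch.optim.SGD([w, b], lr=1e-5)          # plain gradient descent

losses = []
for epoch in range(400):
    preds = inputs @ w.t() + b
    loss = loss_fn(preds, targets)
    optimizer.zero_grad()      # reset gradients from the previous step
    loss.backward()            # compute new gradients
    optimizer.step()           # w -= lr * w.grad, b -= lr * b.grad
    losses.append(loss.item())

print("first loss:", losses[0], "last loss:", losses[-1])
```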

### Neural Network using PyTorch

In [119]:
! nvidia-smi

Wed Mar 13 16:30:02 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 551.76                 Driver Version: 551.76         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce GTX 1650      WDDM  |   00000000:01:00.0 Off |                  N/A |
| N/A   46C    P3             12W /   30W |       0MiB /   4096MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

In [1]:
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda, Compose
import matplotlib.pyplot as plt

In [2]:
# Download training data from open datasets.
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor()
)

# Download test data from open datasets.
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor()
)

In [3]:
type(training_data)

torchvision.datasets.mnist.FashionMNIST

In [5]:
batch_size = 64
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

for X, y in test_dataloader:
    print("Shape of X [N, C, H, W]: ", X.shape)
    print("Shape of y: ", y.shape, y.dtype)
    break

Shape of X [N, C, H, W]:  torch.Size([64, 1, 28, 28])
Shape of y:  torch.Size([64]) torch.int64
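
To see how `DataLoader` splits a dataset into batches without downloading anything, here is a small sketch on synthetic data. The dataset size of 150 and the random tensors are made up for illustration. It also passes `shuffle=True`, which is commonly used for training data, although the loaders above do not set it.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for image data: 150 grayscale 28x28 images with labels 0..9
images = torch.rand(150, 1, 28, 28)
labels = torch.randint(0, 10, (150,))
dataset = TensorDataset(images, labels)

loader = DataLoader(dataset, batch_size=64, shuffle=True)

# 150 samples in batches of 64 -> two full batches and one partial batch
batch_sizes = [len(X) for X, y in loader]
print(batch_sizes)
print(len(loader))
```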


In [6]:
device = "cuda" if torch.cuda.is_available() else "cpu"
print("Using {} device".format(device))

Using cpu device


In [127]:
# Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
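        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
            # Note: a ReLU after the output layer clamps negative logits to zero;
            # it is usually omitted when training with nn.CrossEntropyLoss.
            nn.ReLU()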
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

In [128]:
model = NeuralNetwork().to(device)
print(model)

NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
    (5): ReLU()
  )
)
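
A quick way to check a model like this is to count its parameters and push a dummy batch through it. The sketch below rebuilds the same layer stack with `nn.Sequential` so that it is self-contained; the dummy batch of random values is an assumption for illustration.

```python
import torch
from torch import nn

# Same layer stack as printed above, rebuilt so this block stands alone
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 10), nn.ReLU(),
)

# (784*512 + 512) + (512*512 + 512) + (512*10 + 10) weights and biases
n_params = sum(p.numel() for p in model.parameters())

dummy = torch.rand(64, 1, 28, 28)      # one batch shaped like FashionMNIST images
with torch.no_grad():
    out = model(dummy)

print(n_params, out.shape)
```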


In [129]:
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
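
`nn.CrossEntropyLoss` expects raw scores (logits) and integer class indices, and applies log-softmax internally. The sketch below checks this on made-up logits and labels by comparing it with an explicit log-softmax followed by the negative log-likelihood.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)                          # arbitrary seed
logits = torch.randn(4, 10)                   # 4 samples, 10 classes (made-up scores)
labels = torch.tensor([9, 2, 1, 1])           # integer class indices

ce = F.cross_entropy(logits, labels)

# The same quantity in two explicit steps
log_probs = F.log_softmax(logits, dim=1)
manual = F.nll_loss(log_probs, labels)

# And fully by hand: mean of -log p(correct class)
by_hand = -log_probs[torch.arange(4), labels].mean()

print(ce.item(), manual.item(), by_hand.item())
```

Because the softmax is built in, the model should output unnormalized scores and not probabilities.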

In [133]:
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Reset gradients left over from the previous batch
        optimizer.zero_grad()

        # Backpropagation: compute gradients
        loss.backward()
        # Adjust weights
        optimizer.step()

        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")

In [134]:
def test(dataloader, model):
    size = len(dataloader.dataset)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
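    # Note: test_loss sums per-batch mean losses, so dividing by the number of samples
    # (not the number of batches) shrinks the reported value by roughly the batch size.
    test_loss /= size
    correct /= size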
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")
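
The `test` function above adds up per-batch mean losses and divides by the dataset size. If a true per-sample average is wanted, one option is to weight each batch loss by its batch size, as in the sketch below. The linear model and random data are stand-ins chosen only to keep the example self-contained.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)                           # arbitrary seed
X = torch.randn(150, 20)                       # stand-in features
y = torch.randint(0, 10, (150,))               # stand-in labels
model = nn.Linear(20, 10)                      # stand-in model
loss_fn = nn.CrossEntropyLoss()
loader = DataLoader(TensorDataset(X, y), batch_size=64)

total = 0.0
model.eval()
with torch.no_grad():
    for xb, yb in loader:
        # loss_fn returns the batch mean, so multiply by the batch size to get a sum
        total += loss_fn(model(xb), yb).item() * len(xb)
    per_sample = total / len(loader.dataset)
    full = loss_fn(model(X), y).item()         # loss over the whole dataset at once

print(per_sample, full)
```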

In [135]:
epochs = 5
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model)

Epoch 1
-------------------------------
loss: 1.587989  [    0/60000]
loss: 1.620953  [ 6400/60000]
loss: 1.532960  [12800/60000]
loss: 1.653096  [19200/60000]
loss: 1.403436  [25600/60000]
loss: 1.470638  [32000/60000]
loss: 1.481000  [38400/60000]
loss: 1.456102  [44800/60000]
loss: 1.430298  [51200/60000]
loss: 1.409010  [57600/60000]
Test Error: 
 Accuracy: 53.1%, Avg loss: 0.023358 

Epoch 2
-------------------------------
loss: 1.483725  [    0/60000]
loss: 1.532291  [ 6400/60000]
loss: 1.431131  [12800/60000]
loss: 1.579913  [19200/60000]
loss: 1.310635  [25600/60000]
loss: 1.393952  [32000/60000]
loss: 1.400206  [38400/60000]
loss: 1.377579  [44800/60000]
loss: 1.354373  [51200/60000]
loss: 1.347777  [57600/60000]
Test Error: 
 Accuracy: 53.8%, Avg loss: 0.022282 

Epoch 3
-------------------------------
loss: 1.408212  [    0/60000]
loss: 1.468591  [ 6400/60000]
loss: 1.351567  [12800/60000]
loss: 1.524507  [19200/60000]
loss: 1.246955  [25600/60000]
loss: 1.337704  [32000/60000]
...

In [136]:
# Save the model
torch.save(model.state_dict(), "model.pth")
print("Saved PyTorch Model State to model.pth")

Saved PyTorch Model State to model.pth


In [137]:
# Load the model
model = NeuralNetwork()
model.load_state_dict(torch.load("model.pth"))

<All keys matched successfully>

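Saving and loading a `state_dict` can be tried end to end on a tiny model. The sketch below writes to a temporary directory and passes `map_location="cpu"` to `torch.load`, which is useful when a checkpoint saved on a GPU is loaded on a machine without one. The small `nn.Linear` model is a stand-in, not the network defined above.

```python
import os
import tempfile

import torch
from torch import nn

model = nn.Linear(4, 2)                                 # stand-in model

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pth")
    torch.save(model.state_dict(), path)                # save parameters only
    state = torch.load(path, map_location="cpu")        # load tensors onto the CPU

restored = nn.Linear(4, 2)                              # same architecture, new random weights
result = restored.load_state_dict(state)                # overwrite with the saved weights

print(result)
print(torch.equal(model.weight, restored.weight))
```

The architecture has to be rebuilt in code before loading, because the file stores only the parameter tensors.
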
In [139]:
classes = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]

In [140]:
model.eval()
x, y = test_data[0][0], test_data[0][1]
with torch.no_grad():
    pred = model(x)
    predicted, actual = classes[pred[0].argmax(0)], classes[y]
    print(f'Predicted: "{predicted}", Actual: "{actual}"')

Predicted: "Ankle boot", Actual: "Ankle boot"
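
The prediction above takes the index of the largest output score. To turn scores into probabilities, a softmax can be applied first. The logits in the sketch below are made up for illustration and are not outputs of the trained model.

```python
import torch

classes = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

# Made-up scores for a single image (batch of 1, 10 classes)
logits = torch.tensor([[0.1, 0.0, 0.3, 0.0, 0.2, 1.5, 0.0, 2.0, 0.4, 3.0]])

probs = torch.softmax(logits, dim=1)       # each row sums to 1
top_prob, top_idx = probs.max(dim=1)       # most likely class and its probability

print(classes[top_idx.item()], round(top_prob.item(), 3))
```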


In [None]:
# matplotlib.pyplot has no heatmap function; imshow displays the 28x28 image
plt.imshow(test_data[0][0][0], cmap='gray')
plt.show()