<a href="https://colab.research.google.com/github/cosraj/learning_keras_with_tensorflow/blob/main/mnistpytorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Run `!pip list` to see the installed Python packages. The list includes torchvision 0.16.x, which was the latest version at the time of writing, so we can continue with it. If an upgrade is needed, uninstall all the torch packages (torch, torchaudio, torchvision, torchtext, etc.) and then reinstall them together with `pip install` so that their versions stay compatible with each other.
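
Since `!pip list` prints every installed package, it can be easier to query only the packages of interest. The sketch below uses the standard-library `importlib.metadata` module; the helper name `installed_version` is an illustrative assumption, not part of pip or PyTorch.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Report only the torch-related packages mentioned above.
for name in ("torch", "torchvision", "torchaudio", "torchtext"):
    print(f"{name:12s} {installed_version(name) or 'not installed'}")
```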

In [1]:
!pip list


Package                          Version
-------------------------------- ---------------------
absl-py                          1.4.0
aiohttp                          3.8.6
aiosignal                        1.3.1
alabaster                        0.7.13
albumentations                   1.3.1
altair                           4.2.2
anyio                            3.7.1
appdirs                          1.4.4
argon2-cffi                      23.1.0
argon2-cffi-bindings             21.2.0
array-record                     0.5.0
arviz                            0.15.1
astropy                          5.3.4
astunparse                       1.6.3
async-timeout                    4.0.3
atpublic                         4.0
attrs                            23.1.0
audioread                        3.0.1
autograd                         1.6.2
Babel                            2.13.1
backcall                         0.2.0
beautifulsoup4                   4.11.2
bidict                           0.22.1
b

PyTorch has two primitives to work with data:  **torch.utils.data.DataLoader** and **torch.utils.data.Dataset**. Dataset stores the samples and their corresponding labels, and DataLoader wraps an iterable around the Dataset.
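
To make the division of labor concrete, here is a minimal sketch of a custom `Dataset` wrapped in a `DataLoader`. The `ToyDataset` class and its synthetic data are assumptions made up for illustration; FashionMNIST itself is loaded with the ready-made torchvision class further down.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyDataset(Dataset):
    """Hypothetical dataset: each sample is one number, each label is its parity."""

    def __init__(self, n_samples=10):
        self.features = torch.arange(n_samples, dtype=torch.float32).unsqueeze(1)
        self.labels = torch.arange(n_samples) % 2

    def __len__(self):
        # Number of samples in the dataset.
        return len(self.features)

    def __getitem__(self, idx):
        # One (sample, label) pair; the DataLoader stacks these into batches.
        return self.features[idx], self.labels[idx]


toy_dataset = ToyDataset(10)
toy_loader = DataLoader(toy_dataset, batch_size=4)

batch_shapes = [tuple(X.shape) for X, y in toy_loader]
print(batch_shapes)  # [(4, 1), (4, 1), (2, 1)]
```

The Dataset only knows how to return one sample at a time; the DataLoader handles grouping the samples into batches, including the smaller final batch.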

In TensorFlow, I had to split the training and test data explicitly. Here, the torchvision dataset classes take a `train` argument: `train=True` returns the training split and `train=False` returns the test split.

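The `transform` argument takes a callable that is applied to each sample as it is loaded. Here, `ToTensor()` converts a PIL (Python Imaging Library) image into a float tensor of shape C x H x W with pixel values scaled to the range [0.0, 1.0].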

In [2]:
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

# Download training data from open datasets.
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
)

# Download test data from open datasets.
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to data/FashionMNIST/raw/train-images-idx3-ubyte.gz


100%|██████████| 26421880/26421880 [00:01<00:00, 20780536.70it/s]


Extracting data/FashionMNIST/raw/train-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw/train-labels-idx1-ubyte.gz


100%|██████████| 29515/29515 [00:00<00:00, 347118.30it/s]


Extracting data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz


100%|██████████| 4422102/4422102 [00:00<00:00, 6209358.71it/s]


Extracting data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz


100%|██████████| 5148/5148 [00:00<00:00, 5105764.24it/s]

Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw







In [3]:
type(training_data)
print(training_data)

Dataset FashionMNIST
    Number of datapoints: 60000
    Root location: data
    Split: Train
    StandardTransform
Transform: ToTensor()


PyTorch image batches are conventionally stored as four-dimensional tensors with the dimensions N, C, H, W:

*   N: The number of data samples or the batch size
*   C: The number of channels, often the color channels of an image or the feature maps inside a network.
*   H: The height of the data, typically the number of rows in an image.
*   W: The width of the data, typically the number of columns in an image.


In the example below, we set the batch size to 64, meaning that 64 samples are processed in every forward pass.

For grayscale images, the number of color channels (C) is 1.
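
To make the N, C, H, W layout concrete, the sketch below builds two placeholder batches of zeros (the RGB batch is a hypothetical contrast case, since FashionMNIST is grayscale) and shows what flattening does to the shape:

```python
import torch

gray_batch = torch.zeros(64, 1, 28, 28)  # 64 grayscale 28x28 images
rgb_batch = torch.zeros(64, 3, 28, 28)   # hypothetical RGB batch of the same size

n, c, h, w = gray_batch.shape
print(n, c, h, w)  # 64 1 28 28

# Flattening keeps the batch dimension and merges C, H and W into one axis.
flat = torch.flatten(gray_batch, start_dim=1)
print(flat.shape)  # torch.Size([64, 784])
```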

A Dataset represents the actual data, while a DataLoader is an iterable utility class that serves the data in mini-batches, with one full pass over the data per epoch.





In [4]:
batch_size = 64

# Create data loaders.
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

for X, y in test_dataloader:
    print(f"Shape of X [N, C, H, W]: {X.shape}")
    print(f"Shape of y: {y.shape} {y.dtype}")
    break


for X, y in train_dataloader:
    print(f"Shape of X [N, C, H, W]: {X.shape}")
    print(f"Shape of y: {y.shape} {y.dtype}")
    break

Shape of X [N, C, H, W]: torch.Size([64, 1, 28, 28])
Shape of y: torch.Size([64]) torch.int64
Shape of X [N, C, H, W]: torch.Size([64, 1, 28, 28])
Shape of y: torch.Size([64]) torch.int64
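
The batch size also determines how many batches make up one epoch. A quick back-of-the-envelope check in plain Python, assuming the 60,000 training samples reported above and a progress message every 100 batches (as in the training loop later in this notebook):

```python
import math

num_samples = 60_000
batch_size = 64

# The DataLoader keeps the final partial batch by default, so round up.
num_batches = math.ceil(num_samples / batch_size)
last_batch_size = num_samples - (num_batches - 1) * batch_size
print(num_batches, last_batch_size)  # 938 32

# Sample counters that a log line every 100 batches would report.
logged_counts = [(b + 1) * batch_size for b in range(0, num_batches, 100)]
print(logged_counts[0], logged_counts[-1])  # 64 57664
```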


In TensorFlow, we built a Sequential model with a Flatten layer for the 28x28 input, followed by a hidden layer and finally an output layer of size 10.

Similarly, in PyTorch we subclass `nn.Module` to do the same. The semantics are slightly different, but the idea is the same.

[Building models in PyTorch](https://pytorch.org/tutorials/beginner/basics/buildmodel_tutorial.html)

Subclass `nn.Module` and implement the `__init__` and `forward()` methods: `__init__` defines the layers, and `forward()` defines how the data flows through them.

In [5]:
# Get cpu, gpu or mps device for training.
device = (
    "cuda"
    if torch.cuda.is_available()
    else "mps"
    if torch.backends.mps.is_available()
    else "cpu"
)
print(f"Using {device} device")

# Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(device)
print(model)

Using cpu device
NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)
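
As a sanity check on the architecture, the sketch below rebuilds the same layer stack as a plain `nn.Sequential` (an equivalent stand-in for the class above, closer to the Keras style), counts its trainable parameters, and pushes a random dummy batch through it. The names `model_sketch` and `dummy_batch` are illustrative assumptions.

```python
import torch
from torch import nn

# Same layers as NeuralNetwork, written as a single Sequential container.
model_sketch = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 512),
    nn.ReLU(),
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
)

# Weights plus biases: (784*512 + 512) + (512*512 + 512) + (512*10 + 10)
num_params = sum(p.numel() for p in model_sketch.parameters())
print(num_params)  # 669706

# Three random "images" in N, C, H, W layout produce three rows of 10 logits.
dummy_batch = torch.rand(3, 1, 28, 28)
logits = model_sketch(dummy_batch)
print(logits.shape)  # torch.Size([3, 10])
```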


Now that we have a model, we need to define the loss function and the optimizer.

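Here, we use the cross-entropy loss function and the SGD optimizer.
Cross-entropy measures the difference between two probability distributions: the predicted class probabilities and the true labels. `nn.CrossEntropyLoss` expects raw logits and applies the softmax internally, which is why the model has no softmax layer.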
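
A short sketch of what the loss computes for a single sample: the negative log of the softmax probability assigned to the correct class. The logits and targets here are made-up values for illustration.

```python
import math

import torch
from torch import nn

ce_loss = nn.CrossEntropyLoss()

# One sample, three classes; the correct class is index 0.
example_logits = torch.tensor([[2.0, 0.5, -1.0]])
example_target = torch.tensor([0])

library_loss = ce_loss(example_logits, example_target)
manual_loss = -torch.log_softmax(example_logits, dim=1)[0, 0]
print(torch.isclose(library_loss, manual_loss))  # tensor(True)

# An untrained 10-class model that assigns equal scores to every class
# has a loss of ln(10), roughly 2.3026.
uniform_loss = ce_loss(torch.zeros(1, 10), torch.tensor([3]))
print(uniform_loss.item(), math.log(10))
```

The second result is a useful reference point: a loss near 2.30 on a 10-class problem means the model is still close to guessing uniformly.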

In [8]:
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

#train method
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

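        # Backpropagation: compute the gradients, update the weights, then
        # reset the gradients so they do not accumulate into the next batch
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()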

        if batch % 100 == 0:
            loss, current = loss.item(), (batch + 1) * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")

#test method
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")


#run epochs
epochs = 5
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)
print("Done!")

Epoch 1
-------------------------------
loss: 2.318966  [   64/60000]
loss: 2.299660  [ 6464/60000]
loss: 2.280388  [12864/60000]
loss: 2.262654  [19264/60000]
loss: 2.248044  [25664/60000]
loss: 2.227991  [32064/60000]
loss: 2.226712  [38464/60000]
loss: 2.198313  [44864/60000]
loss: 2.195837  [51264/60000]
loss: 2.151132  [57664/60000]
Test Error: 
 Accuracy: 42.3%, Avg loss: 2.149891 

Epoch 2
-------------------------------
loss: 2.172243  [   64/60000]
loss: 2.153461  [ 6464/60000]
loss: 2.097438  [12864/60000]
loss: 2.107549  [19264/60000]
loss: 2.051408  [25664/60000]
loss: 2.001539  [32064/60000]
loss: 2.023330  [38464/60000]
loss: 1.948368  [44864/60000]
loss: 1.957653  [51264/60000]
loss: 1.865679  [57664/60000]
Test Error: 
 Accuracy: 58.2%, Avg loss: 1.873688 

Epoch 3
-------------------------------
loss: 1.918185  [   64/60000]
loss: 1.876420  [ 6464/60000]
loss: 1.765798  [12864/60000]
loss: 1.801090  [19264/60000]
loss: 1.682454  [25664/60000]
loss: 1.649310  [32064/600
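
The training loop calls `optimizer.zero_grad()` after every step because PyTorch adds new gradients to whatever is already stored in `.grad`. A minimal sketch of that behavior on a single scalar parameter (the variable `w` is purely illustrative, and `w.grad.zero_()` stands in for the reset that the optimizer performs across all parameters):

```python
import torch

w = torch.tensor(1.0, requires_grad=True)

# d(3w)/dw = 3
(3 * w).backward()
first_grad = w.grad.item()
print(first_grad)  # 3.0

# Without a reset, the next backward pass adds to the stored gradient.
(3 * w).backward()
accumulated_grad = w.grad.item()
print(accumulated_grad)  # 6.0

# After a reset, the gradient reflects only the latest backward pass.
w.grad.zero_()
(3 * w).backward()
fresh_grad = w.grad.item()
print(fresh_grad)  # 3.0
```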

Now, let's save the model's learned parameters (its `state_dict`) so that we can reuse it.

In [9]:
torch.save(model.state_dict(), "model.pth")
print("Saved PyTorch Model State to model.pth")

Saved PyTorch Model State to model.pth
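
A `state_dict` is a dictionary that maps parameter names to tensors, so loading it requires a model with the same architecture. The sketch below shows the round trip on a tiny stand-in layer, saving to an in-memory buffer instead of `model.pth` so that no file is written; the layer sizes and variable names are assumptions for illustration.

```python
import io

import torch
from torch import nn

source_layer = nn.Linear(4, 2)
state_keys = list(source_layer.state_dict().keys())
print(state_keys)  # ['weight', 'bias']

# torch.save and torch.load accept file-like objects as well as paths.
buffer = io.BytesIO()
torch.save(source_layer.state_dict(), buffer)
buffer.seek(0)

# A freshly initialized layer has different random weights until we load.
restored_layer = nn.Linear(4, 2)
restored_layer.load_state_dict(torch.load(buffer))

weights_match = torch.equal(source_layer.weight, restored_layer.weight)
print(weights_match)  # True
```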


Load the saved parameters into a fresh model and make a sample prediction.

In [17]:
model = NeuralNetwork().to(device)
model.load_state_dict(torch.load("model.pth"))

classes = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]

model.eval()
x, y = test_data[0][0], test_data[0][1]

# torch.no_grad() disables gradient tracking: inference needs no gradients, so this saves memory and compute
with torch.no_grad():
    x = x.to(device)
    pred = model(x)
    print(type(pred))
    print(pred[0])
    print(pred[0].argmax(0))
    print(y)
    predicted, actual = classes[pred[0].argmax(0)], classes[y]
    print(f'Predicted: "{predicted}", Actual: "{actual}"')

<class 'torch.Tensor'>
tensor([-2.2547, -2.7040, -0.8660, -2.3885, -0.9598,  2.6316, -0.9585,  2.8076,
         1.9656,  3.2555])
tensor(9)
9
Predicted: "Ankle boot", Actual: "Ankle boot"
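
Two details of the prediction cell can be checked in isolation. The sketch below first shows the effect of `torch.no_grad()` on a small stand-in layer (`demo_layer` is an assumption, not the trained model), then converts the logits printed above into probabilities with softmax.

```python
import torch
from torch import nn

demo_layer = nn.Linear(10, 10)
demo_input = torch.rand(1, 10)

# Normally the output is attached to the autograd graph...
tracked_output = demo_layer(demo_input)
print(tracked_output.requires_grad)  # True

# ...but inside no_grad() no graph is recorded.
with torch.no_grad():
    untracked_output = demo_layer(demo_input)
print(untracked_output.requires_grad)  # False

# The logits reported in the output above, copied by hand.
reported_logits = torch.tensor([-2.2547, -2.7040, -0.8660, -2.3885, -0.9598,
                                2.6316, -0.9585, 2.8076, 1.9656, 3.2555])
probabilities = torch.softmax(reported_logits, dim=0)
predicted_index = probabilities.argmax().item()
print(predicted_index)  # 9
```

Softmax does not change which class wins; it only rescales the logits so that they sum to 1 and can be read as confidence scores.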
