<a href="https://colab.research.google.com/github/Kai0421/PyTorchLearning/blob/main/LearningPyTorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [7]:
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

In [8]:
# Download training data from the open datasets
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor()
)
# ToTensor converts a PIL (Python Imaging Library) image or NumPy ndarray to a tensor and scales uint8 pixel values to the range [0.0, 1.0]
# Note: this transform does not support TorchScript: https://pytorch.org/vision/main/generated/torchvision.transforms.ToTensor.html

# Download test data from open datasets
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor()
)

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to data/FashionMNIST/raw/train-images-idx3-ubyte.gz


  0%|          | 0/26421880 [00:00<?, ?it/s]

Extracting data/FashionMNIST/raw/train-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw/train-labels-idx1-ubyte.gz


  0%|          | 0/29515 [00:00<?, ?it/s]

Extracting data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz


  0%|          | 0/4422102 [00:00<?, ?it/s]

Extracting data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz


  0%|          | 0/5148 [00:00<?, ?it/s]

Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw



# [Datasets and Dataloaders](https://pytorch.org/tutorials/beginner/basics/data_tutorial.html)

Passing a dataset to the DataLoader wraps an iterable around it that supports
batching, sampling, shuffling, and multiprocess data loading.

In [9]:
batch_size = 64

# Create data loaders
# more information on dataloaders - https://pytorch.org/tutorials/beginner/basics/data_tutorial.html
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

# print data shape
train_features, train_label = next(iter(train_dataloader))
print(f"Feature batch shape: {train_features.size()}")
print(f"Labels batch shape: {train_label.size()}")
print(f"Feature batch shape: {train_features.shape}")
print(f"Shape of y: {train_label.shape} {train_label.dtype}")

#for X, y in test_dataloader:
#  print(f"Shape of X [N, C, H, W]: {X.shape}")
#  print(f"Shape of y: {y.shape} {y.dtype}")


Feature batch shape: torch.Size([64, 1, 28, 28])
Labels batch shape: torch.Size([64])
Feature batch shape: torch.Size([64, 1, 28, 28])
Shape of y: torch.Size([64]) torch.int64
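
The batching and shuffling behaviour can be seen more easily on a tiny synthetic dataset. This is only a sketch: `toy_dataset` and `toy_loader` are hypothetical names, and `shuffle=True` is an option the cell above does not use (without it, samples come out in dataset order).

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy dataset: 10 samples with 3 features each, labelled 0..9.
features = torch.arange(30, dtype=torch.float32).reshape(10, 3)
labels = torch.arange(10)
toy_dataset = TensorDataset(features, labels)

# shuffle=True draws the samples in a new random order on every pass.
toy_loader = DataLoader(toy_dataset, batch_size=4, shuffle=True)

batch_sizes = []
seen_labels = []
for X, y in toy_loader:
    batch_sizes.append(len(X))
    seen_labels.extend(y.tolist())

print(batch_sizes)          # the last batch is smaller: 10 = 4 + 4 + 2
print(sorted(seen_labels))  # every sample appears exactly once per pass
```

The same arithmetic explains the 157 test batches reported later: 10000 samples at 64 per batch give 156 full batches plus one partial batch.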


# Create Models
## nn.Flatten
We initialize the nn.Flatten layer to convert each 2D 28x28 pixel image into a contiguous array of 784 pixel values (the minibatch dimension, at dim=0, is maintained).
```
input_image = torch.rand(3,28,28)
print(input_image.size())
# torch.Size([3, 28, 28])

flatten = nn.Flatten()
flat_image = flatten(input_image)
print(flat_image.size())
# torch.Size([3, 784])
```
---
## nn.Linear
The linear layer is a module that applies a linear transformation to the input using its stored weights and biases.
For example:
```
layer1 = nn.Linear(in_features=28*28, out_features=20)
hidden1 = layer1(flat_image)
print(hidden1.size())

# torch.Size([3, 20])
```
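
As a small check of what "linear transformation" means here, the sketch below recomputes the layer output by hand as `x @ W.T + b`. The layer sizes (4 inputs, 2 outputs) are arbitrary values chosen for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.Linear(in_features=4, out_features=2)
x = torch.rand(3, 4)  # a batch of 3 samples with 4 features each

# nn.Linear stores its weight with shape (out_features, in_features),
# so the manual version multiplies by the transposed weight.
manual = x @ layer.weight.T + layer.bias
out = layer(x)

print(out.shape)
print(torch.allclose(out, manual, atol=1e-6))
```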


---


## nn.Sequential
nn.Sequential is an ordered container of modules. The data is passed through all the modules in the same order as defined. You can use a sequential container to put together a quick network like seq_modules.

```
seq_modules = nn.Sequential(
    flatten,
    layer1,
    nn.ReLU(),
    nn.Linear(20, 10)
)
input_image = torch.rand(3,28,28)
logits = seq_modules(input_image)
```
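
One way to convince yourself that the container only chains the modules in order is to apply the same modules by hand and compare. The sketch below assumes the same layer sizes as seq_modules; the variable names are illustrative.

```python
import torch
from torch import nn

flatten = nn.Flatten()
hidden = nn.Linear(28 * 28, 20)
relu = nn.ReLU()
output = nn.Linear(20, 10)

# The container reuses the very same module objects, in this order.
seq = nn.Sequential(flatten, hidden, relu, output)

x = torch.rand(3, 28, 28)
manual_logits = output(relu(hidden(flatten(x))))
seq_logits = seq(x)

print(seq_logits.shape)
print(torch.allclose(seq_logits, manual_logits))
```
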
---
## nn.Softmax
The last linear layer of the neural network returns logits, raw values in [-infty, infty], which are passed to the nn.Softmax module. The logits are scaled to values in [0, 1] representing the model's predicted probabilities for each class. The *dim* parameter indicates the dimension along which the values must sum to 1.

```
softmax = nn.Softmax(dim=1)
pred_probab = softmax(logits)
```
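
A short sketch with hand-picked logits shows both properties: each row sums to 1 along dim=1, and the largest logit becomes the most probable class. The numbers are made up for illustration.

```python
import torch
from torch import nn

# Two samples, three classes; the values are arbitrary example logits.
logits = torch.tensor([[2.0, 1.0, 0.1],
                       [0.1, 3.0, 0.2]])

softmax = nn.Softmax(dim=1)
probs = softmax(logits)

print(probs.sum(dim=1))     # each row sums to 1
print(probs.argmax(dim=1))  # index of the most probable class per sample
```

Because softmax preserves the ordering of the logits, taking argmax on the raw logits gives the same predicted class, which is what the test function later in this notebook does.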

In [10]:
device = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"
print(f"Using {device} device")

# Create model
class NeuralNetwork(nn.Module):
  def __init__(self):
    super().__init__()
    self.flatten = nn.Flatten()
    self.linear_relu_stack = nn.Sequential(
        nn.Linear(28*28, 512),
        nn.ReLU(),
        nn.Linear(512, 512),
        nn.ReLU(),
        nn.Linear(512, 10)  # 10 outputs because there are 10 categories
    )
  def forward(self, x):
    x = self.flatten(x)
    # Feed the flattened input through the layer stack; the output of the last layer is the logits
    logits = self.linear_relu_stack(x)
    return logits

model = NeuralNetwork().to(device)
print(model)


Using cpu device
NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)


# Print Model Parameters and Values

In [11]:
# Iterate over each parameter in the model, print its size, and preview its values
for name, param in model.named_parameters():
  print(f"Layer: {name} | Size: {param.size()} | Values: {param[:2]}\n")

Layer: linear_relu_stack.0.weight | Size: torch.Size([512, 784]) | Values: tensor([[-0.0036,  0.0256, -0.0222,  ...,  0.0300,  0.0234,  0.0160],
        [ 0.0146, -0.0202,  0.0134,  ..., -0.0094,  0.0061,  0.0269]],
       grad_fn=<SliceBackward0>)

Layer: linear_relu_stack.0.bias | Size: torch.Size([512]) | Values: tensor([ 0.0096, -0.0305], grad_fn=<SliceBackward0>)

Layer: linear_relu_stack.2.weight | Size: torch.Size([512, 512]) | Values: tensor([[ 0.0241,  0.0050,  0.0256,  ..., -0.0415, -0.0232, -0.0198],
        [ 0.0027,  0.0052, -0.0441,  ...,  0.0301,  0.0142, -0.0354]],
       grad_fn=<SliceBackward0>)

Layer: linear_relu_stack.2.bias | Size: torch.Size([512]) | Values: tensor([0.0250, 0.0011], grad_fn=<SliceBackward0>)

Layer: linear_relu_stack.4.weight | Size: torch.Size([10, 512]) | Values: tensor([[-0.0408, -0.0325, -0.0411,  ..., -0.0020, -0.0421, -0.0348],
        [ 0.0240,  0.0286, -0.0438,  ...,  0.0299,  0.0079, -0.0129]],
       grad_fn=<SliceBackward0>)

Layer: li
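
The sizes printed above also determine how many trainable values the network has. As a rough sketch, the snippet below rebuilds the same layer stack (without the surrounding class, so the parameter names lack the `linear_relu_stack.` prefix) and counts the elements of each parameter tensor.

```python
from torch import nn

# Same layer sizes as linear_relu_stack in the model above.
stack = nn.Sequential(
    nn.Linear(28 * 28, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 10),
)

per_tensor = {name: p.numel() for name, p in stack.named_parameters()}
total = sum(per_tensor.values())

for name, count in per_tensor.items():
    print(f"{name}: {count}")
print(f"total: {total}")  # 401920 + 262656 + 5130 = 669706
```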

# Optimizing the Model Parameters

## Create Loss function and optimizer
### [loss function](https://pytorch.org/docs/stable/nn.html#loss-functions)
- A loss function compares the target and predicted output values; it measures how well the neural network models the training data. When training, we aim to minimize this loss between the predicted and target outputs.
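
For the cross-entropy loss used later in this notebook, the comparison can be written out by hand: take the log-softmax of the logits, pick the entry of the true class, negate, and average over the batch. The sketch below uses made-up logits and targets to check this against nn.CrossEntropyLoss.

```python
import torch
from torch import nn
import torch.nn.functional as F

# Arbitrary example: 2 samples, 3 classes.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])

loss_fn = nn.CrossEntropyLoss()
loss = loss_fn(logits, targets)

# Manual version: mean negative log-probability of the true class.
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(len(targets)), targets].mean()

# Targets that disagree with the largest logits give a larger loss.
wrong_loss = loss_fn(logits, torch.tensor([2, 0]))

print(loss.item(), manual.item())
print(wrong_loss.item() > loss.item())
```

Note that the loss takes raw logits, which is why the model's forward method does not apply softmax itself.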

### [Optimizer](https://pytorch.org/docs/stable/optim.html)
- An optimizer is an algorithm that adapts the neural network's attributes, such as its weights and learning rate, in order to reduce the total loss and improve accuracy. Choosing appropriate weights for a model by hand is a daunting task, so the optimizer adjusts them step by step.

#### How to use an optimizer
- To use torch.optim you have to construct an optimizer object that will hold the current state and will update the parameters based on the computed gradients.

For example:
```
from torch import optim

optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# This one uses the Adam algorithm; var1 and var2 stand for any tensors that require gradients
optimizer = optim.Adam([var1, var2], lr=0.0001)
```
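
To see what "update the parameters based on the computed gradients" means for plain SGD (without momentum), the sketch below performs one step on a two-element tensor and compares it with the hand-computed rule `w - lr * grad`. The tensor `w` and the toy loss are illustrative assumptions, not part of the model.

```python
import torch

w = torch.tensor([1.0, -2.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

loss = (w ** 2).sum()  # toy loss whose gradient is 2 * w
optimizer.zero_grad()
loss.backward()

# Expected result of the update rule, computed before the step is taken.
expected = w.detach() - 0.1 * w.grad

optimizer.step()  # updates w in place

print(w.grad)                                # gradient 2 * w = [2, -4]
print(torch.allclose(w.detach(), expected))  # the step matches w - lr * grad
```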

In [12]:
# Create loss function
loss_function = nn.CrossEntropyLoss()

# Create optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# Train and Test data



In [13]:
# In a single training loop, the model makes predictions on the training dataset 
# (fed to it in batches), and backpropagates the prediction error to adjust the 
# model’s parameters.
def train(dataloader, model, loss_fn, optimizer):
  size = len(dataloader.dataset)
  model.train()

  for batch, (X, y) in enumerate(dataloader):
    X, y = X.to(device), y.to(device)

    # Compute prediction error
    prediction = model(X)
    loss = loss_fn(prediction, y)

    # Backpropagation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if batch % 100 == 0:
      loss, current = loss.item(), (batch+1) * len(X)
      print(f"loss: {loss:7f} [{current:>5d}|{size:>5d}]")
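
The call to optimizer.zero_grad() in the loop above matters because PyTorch accumulates gradients across backward passes instead of overwriting them. A minimal sketch with a single hypothetical parameter `w`:

```python
import torch

w = torch.tensor([3.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

(w * 2).sum().backward()
first = w.grad.clone()        # d(2w)/dw = 2

(w * 2).sum().backward()      # no reset in between: gradients add up
accumulated = w.grad.clone()  # 2 + 2 = 4

optimizer.zero_grad()         # clears the stored gradient
(w * 2).sum().backward()
after_reset = w.grad.clone()  # back to 2

print(first.item(), accumulated.item(), after_reset.item())
```

Without the reset, each batch would be updated with the sum of all gradients seen so far rather than its own gradient.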

In [14]:
# Test
def test(dataloader, model, loss_fn):
  size = len(dataloader.dataset)
  num_batches = len(dataloader)
  print(f"datasets size : {size} | num_batches: {num_batches}")

  model.eval()
  test_loss, correct = 0, 0
  with torch.no_grad():
    for X, y in dataloader:
      X, y = X.to(device), y.to(device)
      pred = model(X)
      test_loss += loss_fn(pred, y).item()
      correct += (pred.argmax(1) == y).type(torch.float).sum().item()

    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy {(100*correct):>0.1f}, Avg loss: {test_loss:>8f} \n")
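
The accuracy line in the test function is compact, so here is the same computation on a tiny made-up batch. The values in `pred` and `y` are illustrative only.

```python
import torch

# Hypothetical logits for 4 samples and 3 classes, with their true labels.
pred = torch.tensor([[0.1, 2.0, 0.3],
                     [1.5, 0.2, 0.1],
                     [0.0, 0.1, 0.9],
                     [0.3, 0.8, 0.2]])
y = torch.tensor([1, 0, 1, 1])

predicted_classes = pred.argmax(1)  # index of the largest logit in each row
correct = (predicted_classes == y).type(torch.float).sum().item()
accuracy = correct / len(y)

print(predicted_classes.tolist())  # [1, 0, 2, 1]
print(correct, accuracy)           # 3.0 correct out of 4, accuracy 0.75
```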


# Train and Test the Model


In [15]:
epochs = 5
for i in range(epochs):
  print(f"Epoch {i+1} \n--------------------------------------")
  train(train_dataloader, model, loss_function, optimizer)
  test(test_dataloader, model, loss_function)
print("Done")


Epoch 1 
--------------------------------------
loss: 2.302805 [   64|60000]
loss: 0.886775 [ 6464|60000]
loss: 0.577879 [12864|60000]
loss: 0.701909 [19264|60000]
loss: 0.592070 [25664|60000]
loss: 0.500234 [32064|60000]
loss: 0.532451 [38464|60000]
loss: 0.602687 [44864|60000]
loss: 0.600043 [51264|60000]
loss: 0.462254 [57664|60000]
datasets size : 10000 | num_batches: 157
Test Error: 
 Accuracy 79.5, Avg loss: 0.549159 

Epoch 2 
--------------------------------------
loss: 0.440747 [   64|60000]
loss: 0.451075 [ 6464|60000]
loss: 0.367039 [12864|60000]
loss: 0.437816 [19264|60000]
loss: 0.402630 [25664|60000]
loss: 0.434720 [32064|60000]
loss: 0.397726 [38464|60000]
loss: 0.473704 [44864|60000]
loss: 0.504045 [51264|60000]
loss: 0.429401 [57664|60000]
datasets size : 10000 | num_batches: 157
Test Error: 
 Accuracy 82.1, Avg loss: 0.478601 

Epoch 3 
--------------------------------------
loss: 0.348087 [   64|60000]
loss: 0.375330 [ 6464|60000]
loss: 0.303277 [12864|60000]
loss: 0

# Saving and Loading Models
Save the trained model's state (its learned parameters) to a file named model.pth. The .pth extension is a common convention for saved PyTorch model states.

In [16]:
torch.save(model.state_dict(), "model.pth")
print("Saved PyTorch Model State to model.pth")

Saved PyTorch Model State to model.pth
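
Saving a state dict and loading it into a fresh instance of the same architecture should reproduce the parameters exactly. The sketch below checks this round trip on a small hypothetical layer, using an in-memory buffer instead of a file so that nothing is written to disk.

```python
import io

import torch
from torch import nn

source = nn.Linear(4, 2)  # stands in for the trained model

# torch.save accepts a file-like object as well as a path.
buffer = io.BytesIO()
torch.save(source.state_dict(), buffer)
buffer.seek(0)

# The target must have the same architecture so that the keys match.
restored = nn.Linear(4, 2)
result = restored.load_state_dict(torch.load(buffer))

same = all(
    torch.equal(a, b)
    for a, b in zip(source.state_dict().values(), restored.state_dict().values())
)
print(result)
print(same)
```

Only the parameters are stored, not the class definition, which is why a new NeuralNetwork() has to be constructed before its state can be loaded.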


In [17]:
model = NeuralNetwork()
model.load_state_dict(torch.load("model.pth"))

<All keys matched successfully>

# Run the Model

In [77]:
classes = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]

model.eval()
# test_data[0][0] is the pixel input; test_data[0][1] is the category class assigned to it.

failed = 0
for index, (pixel_data, category_label) in enumerate(test_dataloader):
  # print(f"{pixel_data.shape} | {category_label}")
  with torch.no_grad():
    pred = model(pixel_data)
    # Only the first sample of each batch is compared, so the count below is per batch, not per image
    predicted, actual = classes[pred[0].argmax(0)], classes[category_label[0]]

    if predicted != actual:
      print(f"predicted: {predicted} | actual: {actual}")
      failed += 1

print(f"Failure Rate: {failed}/{len(test_dataloader)}")


predicted: Dress | actual: Trouser
predicted: T-shirt/top | actual: Shirt
predicted: Coat | actual: Pullover
predicted: Shirt | actual: Coat
predicted: Shirt | actual: Coat
predicted: Shirt | actual: Pullover
predicted: T-shirt/top | actual: Dress
predicted: Dress | actual: Bag
predicted: Shirt | actual: T-shirt/top
predicted: Coat | actual: Pullover
predicted: Shirt | actual: T-shirt/top
predicted: Sandal | actual: Sneaker
predicted: Coat | actual: Shirt
predicted: Shirt | actual: Coat
predicted: Dress | actual: Shirt
predicted: Pullover | actual: Shirt
predicted: Dress | actual: Shirt
predicted: Coat | actual: Pullover
predicted: Dress | actual: Coat
predicted: Dress | actual: Trouser
predicted: Coat | actual: Pullover
predicted: Shirt | actual: Pullover
predicted: Sneaker | actual: Ankle boot
predicted: Dress | actual: Trouser
predicted: Shirt | actual: T-shirt/top
Failure Rate: 25/157
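
Because the loop above inspects only the first image of each batch, 25/157 is a count over the 157 batches rather than over all 10000 test images. A per-image failure rate could be computed along the following lines. This is a sketch: `count_misclassified` is a hypothetical helper, and the toy "model" and batches exist only to make the example self-contained.

```python
import torch
from torch import nn

def count_misclassified(model, dataloader):
    """Hypothetical helper: count wrong predictions over every sample."""
    model.eval()
    wrong, total = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            pred = model(X).argmax(1)          # predicted class per sample
            wrong += (pred != y).sum().item()  # mismatches in this batch
            total += len(y)
    return wrong, total

# Toy stand-ins: an identity "model" that returns its input as logits,
# and a list of (inputs, labels) batches in place of a DataLoader.
toy_model = nn.Identity()
toy_batches = [
    (torch.tensor([[0.9, 0.1], [0.2, 0.8]]), torch.tensor([0, 0])),
    (torch.tensor([[0.3, 0.7]]), torch.tensor([1])),
]

wrong, total = count_misclassified(toy_model, toy_batches)
print(f"Failure Rate: {wrong}/{total}")  # 1 of the 3 toy samples is wrong
```

With the notebook's objects, the call would be `count_misclassified(model, test_dataloader)`.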
