# Working with data

PyTorch has two primitives to work with data:
  1. torch.utils.data.DataLoader
  2. torch.utils.data.Dataset

- Dataset: 
  - stores the samples and their corresponding labels
- DataLoader: 
  - wraps an iterable around the Dataset
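To make the split between the two primitives concrete, here is a minimal sketch of a custom Dataset wrapped in a DataLoader. `ToyDataset` and its random contents are hypothetical stand-ins, not part of the original notebook; a map-style Dataset only needs `__len__` and `__getitem__`.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyDataset(Dataset):
    """Hypothetical in-memory dataset: random 'images' with integer labels."""

    def __init__(self, num_samples=8):
        self.samples = torch.rand(num_samples, 1, 28, 28)  # [N, C, H, W]
        self.labels = torch.arange(num_samples) % 10       # labels 0..9

    def __len__(self):
        # Number of samples the dataset stores
        return len(self.samples)

    def __getitem__(self, index):
        # One (sample, label) pair
        return self.samples[index], self.labels[index]


toy_dataset = ToyDataset()
toy_loader = DataLoader(toy_dataset, batch_size=4)  # iterable over batches

first_X, first_y = next(iter(toy_loader))
print(first_X.shape, first_y.shape)
```

The Dataset answers "what is sample i?", while the DataLoader decides how samples are grouped and in which order they are served.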

In [31]:
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

PyTorch's domain-specific libraries (each of them includes datasets):
- TorchText
- TorchVision
  - every TorchVision Dataset accepts two arguments:
    1. transform: modifies the samples
    2. target_transform: modifies the labels
- TorchAudio
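As an illustration of `target_transform`, the sketch below defines a plain function that turns an integer label into a one-hot vector. The helper name `one_hot_label` is hypothetical; any callable of this shape could be passed as `target_transform=one_hot_label` when constructing a TorchVision dataset. Here it is applied by hand to a stand-in label so that nothing has to be downloaded.

```python
import torch

NUM_CLASSES = 10  # FashionMNIST has 10 classes


def one_hot_label(label):
    """Turn an integer class index into a one-hot float vector."""
    encoded = torch.zeros(NUM_CLASSES, dtype=torch.float)
    encoded[label] = 1.0
    return encoded


# A dataset would call the function on every label it returns;
# here we call it directly on a stand-in label.
label = 9
encoded_label = one_hot_label(label)
print(encoded_label)
```

Note that `nn.CrossEntropyLoss`, used later in this notebook, works directly with integer class indices, so this transform is not needed for the training code here.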


In [32]:
# Download training data from open datasets
training_data = datasets.FashionMNIST(
    root = 'data',
    train = True,
    download = True,
    transform = ToTensor(),
)

# Download test data from open datasets
test_data = datasets.FashionMNIST(
    root = 'data',
    train = False,
    download = True,
    transform = ToTensor(),
)

Passing the Dataset as an argument to DataLoader:
- wraps an iterable over the dataset
- supports automatic batching, sampling, shuffling and multiprocess data loading

In [33]:
batch_size = 64  # each batch holds 64 features and labels

# Create data loaders
train_dataloader = DataLoader(training_data, batch_size = batch_size)
test_dataloader = DataLoader(test_data, batch_size = batch_size)

for X, y in test_dataloader:
  print(f"Shape of X [N, C, H, W]: {X.shape}")
  print(f"Shape of y: {y.shape} {y.dtype}")
  break

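# N: batch size (number of samples in the batch)
# C: number of channels (1 for grayscale images)
# D: depth (only present for volumetric data, not used here)
# H: image height
# W: image width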

Shape of X [N, C, H, W]: torch.Size([64, 1, 28, 28])
Shape of y: torch.Size([64]) torch.int64
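The batching and shuffling behaviour of DataLoader can be checked on a tiny stand-in dataset. This sketch uses `TensorDataset` with ten made-up samples; the variable names are illustrative only.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

features = torch.arange(10, dtype=torch.float).unsqueeze(1)  # shape [10, 1]
labels = torch.arange(10)
dataset = TensorDataset(features, labels)

# Without shuffling, samples arrive in order; the last batch holds the remainder.
ordered_loader = DataLoader(dataset, batch_size=4)
batch_sizes = [len(y) for _, y in ordered_loader]
print(batch_sizes)  # [4, 4, 2]

# drop_last=True discards the incomplete final batch.
full_batches_only = DataLoader(dataset, batch_size=4, drop_last=True)
print(len(full_batches_only))  # 2

# shuffle=True draws a new random order each epoch;
# every sample is still visited exactly once per epoch.
shuffled_loader = DataLoader(dataset, batch_size=4, shuffle=True)
seen_labels = torch.cat([y for _, y in shuffled_loader])
print(sorted(seen_labels.tolist()))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The notebook's loaders do not pass `shuffle=True`; for training data, shuffling is a common choice, but that is a design decision rather than a requirement.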


# Creating Models

- **To define a neural network** in PyTorch, *we create a class that inherits from nn.Module*.

- We **define the layers of the network** in the `__init__` function and **specify how data will pass through the network** in the forward function.

- To accelerate operations in the neural network, we move it to the GPU if one is available.

In [34]:
# Get cpu or gpu device for training
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using {device} device")

Using cpu device


In [35]:
# Define model
class NeuralNetwork(nn.Module):
  def __init__(self):
    super().__init__()
    self.flatten = nn.Flatten()
    self.linear_relu_stack = nn.Sequential(
        nn.Linear(28*28, 512),
        nn.ReLU(),
        nn.Linear(512, 512),
        nn.ReLU(),
        nn.Linear(512, 10)
    )

  def forward(self, x):
    x = self.flatten(x)
    logits = self.linear_relu_stack(x)
    return logits

model = NeuralNetwork().to(device)
print(model)

NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)
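To see how the shapes change through the printed architecture, the sketch below pushes a random stand-in batch through the same layers, built here as free-standing modules rather than through the `NeuralNetwork` class, and counts the trainable parameters.

```python
import torch
from torch import nn

batch = torch.rand(64, 1, 28, 28)  # stand-in for one batch of images [N, C, H, W]

flatten = nn.Flatten()             # keeps dim 0 (the batch) and flattens the rest
flat = flatten(batch)
print(flat.shape)                  # torch.Size([64, 784])

stack = nn.Sequential(
    nn.Linear(28 * 28, 512),
    nn.ReLU(),
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
)
logits = stack(flat)
print(logits.shape)                # torch.Size([64, 10]): one score per class

# Weights plus biases of the three Linear layers
num_params = sum(p.numel() for p in stack.parameters())
print(num_params)                  # 669706
```

The final layer outputs raw scores (logits), not probabilities; the loss function used later expects exactly that.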


# Optimizing the Model Parameters

To train a model, we need 
- a loss function
- an optimizer

In [36]:
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr = 1e-3)  

# lr: learning rate
# SGD: Stochastic Gradient Descent; torch.optim.SGD -> implements SGD
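As a small check of what `nn.CrossEntropyLoss` computes, the sketch below compares it with a by-hand calculation on made-up logits. It takes raw logits and integer class indices, and by default averages the negative log-probability of the true class over the batch.

```python
import torch
import torch.nn.functional as F
from torch import nn

# Hypothetical raw scores: 2 samples x 3 classes
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])  # integer class indices, not one-hot vectors

loss_fn = nn.CrossEntropyLoss()
loss = loss_fn(logits, targets)

# Same value by hand: mean negative log-probability of the true class
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(2), targets].mean()

print(torch.allclose(loss, manual))  # True
```

Because the softmax is applied inside the loss, the model's forward function can return logits directly, as the `NeuralNetwork` class above does.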

In [37]:
# In a single training loop, the model makes predictions on the training dataset
# and backpropagates the prediction error to adjust the model's parameters

def train(dataloader, model, loss_fn, optimizer):
  size = len(dataloader.dataset)
  model.train()
  for batch, (X, y) in enumerate(dataloader):
    X, y = X.to(device), y.to(device)

    # Compute prediction error
    pred = model(X)
    loss = loss_fn(pred, y)

    # Backpropagation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if batch % 100 == 0:
      loss, current = loss.item(), batch * len(X)
      print(f"loss: {loss:>7f} [{current:>5d}/{size:>5d}]")

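- How are optimizer.step() and loss.backward() related? loss.backward() computes the gradient of the loss with respect to each parameter and stores it in that parameter's .grad attribute; optimizer.step() then uses those .grad values to update the parameters.
https://discuss.pytorch.org/t/how-are-optimizer-step-and-loss-backward-related/7350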
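The division of labour between the two calls can be seen on a single made-up parameter. In this sketch, `w` and the quadratic loss are illustrative assumptions; the point is that `backward()` only fills in `w.grad`, and `step()` is what changes `w`.

```python
import torch

w = torch.tensor([1.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

loss = (w ** 2).sum()        # d(loss)/dw = 2 * w = 2.0 at w = 1.0

optimizer.zero_grad()        # clear any gradient left from a previous step
loss.backward()              # fills w.grad, leaves w itself unchanged
grad_after_backward = w.grad.clone()
value_after_backward = w.detach().clone()

optimizer.step()             # plain SGD: w <- w - lr * w.grad = 1.0 - 0.1 * 2.0
value_after_step = w.detach().clone()

print(grad_after_backward.item())   # 2.0
print(value_after_backward.item())  # 1.0
print(round(value_after_step.item(), 4))  # 0.8
```

Gradients accumulate across `backward()` calls, which is why the training loop calls `optimizer.zero_grad()` once per batch.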

In [38]:
# Checking the model's performance against the test dataset to ensure it is learning

def test(dataloader, model, loss_fn):
  size = len(dataloader.dataset)
  num_batches = len(dataloader)
  model.eval()
  test_loss, correct = 0, 0
  with torch.no_grad():
    for X, y in dataloader:
      X, y = X.to(device), y.to(device)
      pred = model(X)
      test_loss += loss_fn(pred, y).item()
      correct += (pred.argmax(1) == y).type(torch.float).sum().item()
  test_loss /= num_batches
  correct /= size
  print(f"Test Error: \n Accuracy: {(100 * correct):>0.1f}%, Avg loss: {test_loss:>8f}\n")

- torch.no_grad():
  - context manager that disables gradient calculation
  - useful for inference, when you will not call Tensor.backward(); it reduces memory consumption for computations that would otherwise have requires_grad = True

- A blog post summarizing the difference between no_grad() and eval() in PyTorch:
  https://coffeedjimmy.github.io/pytorch/2019/11/05/pytorch_nograd_vs_train_eval/
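The two mechanisms are independent, which a short sketch can show. `eval()` switches layer behaviour (Dropout is used here as an example, although the notebook's model has no such layer), while `no_grad()` switches off autograd tracking. The layer sizes below are arbitrary.

```python
import torch
from torch import nn

# eval() changes how certain layers behave
dropout = nn.Dropout(p=0.5)
x = torch.ones(1000)

dropout.train()
train_out = dropout(x)   # roughly half the entries zeroed, the rest scaled to 2.0
dropout.eval()
eval_out = dropout(x)    # dropout is the identity in eval mode

# no_grad() changes whether autograd records the computation
linear = nn.Linear(4, 2)
inputs = torch.ones(3, 4)

linear.eval()
tracked = linear(inputs)          # eval() alone still builds a graph
with torch.no_grad():
    untracked = linear(inputs)    # no graph, so less memory is used

print(torch.equal(eval_out, x))   # True
print(tracked.requires_grad)      # True
print(untracked.requires_grad)    # False
```

This is why the test function above uses both: `model.eval()` for correct layer behaviour and `torch.no_grad()` to skip gradient bookkeeping.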

In [39]:
# The training process is conducted over several iterations (epochs)
# During each epoch, the model learns parameters to make better predictions

# Printing the model's accuracy and loss at each epoch:
epochs = 5
for t in range(epochs):
  print(f"Epoch {t+1}\n---------------------")
  train(train_dataloader, model, loss_fn, optimizer)
  test(test_dataloader, model, loss_fn)
print("Done!")

Epoch 1
---------------------
loss: 2.305748 [    0/60000]
loss: 2.286911 [ 6400/60000]
loss: 2.270999 [12800/60000]
loss: 2.258623 [19200/60000]
loss: 2.238723 [25600/60000]
loss: 2.218482 [32000/60000]
loss: 2.228829 [38400/60000]
loss: 2.193504 [44800/60000]
loss: 2.181669 [51200/60000]
loss: 2.149483 [57600/60000]
Test Error: 
 Accuracy: 42.4%, Avg loss: 2.148501

Epoch 2
---------------------
loss: 2.160897 [    0/60000]
loss: 2.142194 [ 6400/60000]
loss: 2.092786 [12800/60000]
loss: 2.100140 [19200/60000]
loss: 2.046415 [25600/60000]
loss: 1.995163 [32000/60000]
loss: 2.028558 [38400/60000]
loss: 1.951143 [44800/60000]
loss: 1.944458 [51200/60000]
loss: 1.872835 [57600/60000]
Test Error: 
 Accuracy: 54.5%, Avg loss: 1.876561

Epoch 3
---------------------
loss: 1.915221 [    0/60000]
loss: 1.872196 [ 6400/60000]
loss: 1.770276 [12800/60000]
loss: 1.796199 [19200/60000]
loss: 1.685273 [25600/60000]
loss: 1.652298 [32000/60000]
loss: 1.676118 [38400/60000]
loss: 1.587652 [44800/60000]
...

# Saving Models

In [40]:
torch.save(model.state_dict(), "model.pth")
print("Saved PyTorch Model state to model.pth")

Saved PyTorch Model state to model.pth


# Loading Models

In [41]:
# The process for loading a model includes 
# re-creating the model structure & loading the state dictionary into it
model = NeuralNetwork()
model.load_state_dict(torch.load("model.pth"))

<All keys matched successfully>
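The save and load round trip can be verified on a small stand-in model. In this sketch, `build_model` and the file name `toy_model.pth` are hypothetical. Passing `map_location="cpu"` is one way to load weights on a machine without a GPU even if they were saved from one.

```python
import os
import tempfile

import torch
from torch import nn


def build_model():
    # Hypothetical small network; the architecture must match on save and load.
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))


original = build_model()

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_model.pth")
    torch.save(original.state_dict(), path)       # saves tensors only, not the class

    restored = build_model()                      # fresh model with random weights
    state = torch.load(path, map_location="cpu")  # dict: parameter name -> tensor
    result = restored.load_state_dict(state)

print(list(state.keys()))  # ['0.weight', '0.bias', '2.weight', '2.bias']

inputs = torch.rand(3, 4)
same_output = torch.equal(original(inputs), restored(inputs))
print(same_output)         # True
```

The object returned by `load_state_dict` lists any missing or unexpected keys; both lists being empty corresponds to the `<All keys matched successfully>` message shown above.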

In [42]:
# This model can now be used to make predictions

classes = [
           "T-shirt/top",
           "Trouser",
           "Pullover",
           "Dress",
           "Coat",
           "Sandal",
           "Shirt",
           "Sneaker",
           "Bag",
           "Ankle boot"
]

model.eval()
X, y = test_data[0][0], test_data[0][1]
with torch.no_grad():
  pred = model(X)
  predicted, actual = classes[pred[0].argmax(0)], classes[y]
  print(f'Predicted: "{predicted}", Actual: "{actual}"')

Predicted: "Ankle boot", Actual: "Ankle boot"
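The prediction cell passes a single `[1, 28, 28]` image, so `nn.Flatten` treats the channel dimension as a batch of one. A slightly more explicit variant, sketched below with an untrained stand-in model and a random image, adds the batch dimension itself and converts the logits to probabilities with softmax.

```python
import torch
from torch import nn

classes = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
           "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Stand-in for a trained model; it is untrained, so its prediction is arbitrary.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
model.eval()

image = torch.rand(1, 28, 28)   # stand-in for one sample, shape [C, H, W]
batch = image.unsqueeze(0)      # add the batch dimension: [1, 1, 28, 28]

with torch.no_grad():
    logits = model(batch)                         # shape [1, 10]
    probabilities = torch.softmax(logits, dim=1)  # each row sums to 1

top_prob, top_index = probabilities[0].max(dim=0)
predicted = classes[top_index.item()]
print(f'Predicted: "{predicted}" with probability {top_prob.item():.3f}')
```

With the trained model from this notebook in place of the stand-in, the same steps would report how confident the network is in its "Ankle boot" prediction.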
