# Training with PyTorch
This notebook shows how to train a PyTorch model using images from `vl-datasets`. 

<!--<badge>--><a href="https://colab.research.google.com/github/visual-layer/vl-datasets/blob/main/notebooks/train-pytorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a><!--</badge>-->
<!--<badge>--><a href="https://kaggle.com/kernels/welcome?src=https://github.com/visual-layer/vl-datasets/blob/main/notebooks/train-pytorch.ipynb" target="_parent"><img src="https://kaggle.com/static/images/open-in-kaggle.svg" alt="Open In Colab"/></a><!--</badge>-->

## Installation & Setting Up

In [None]:
!pip install vl-datasets -Uq

In [1]:
import vl_datasets
vl_datasets.__version__

'0.0.8'

## Download the VL food-101 Dataset

![datasetimage](https://data.vision.ee.ethz.ch/cvl/datasets_extra/food-101/static/img/food-101.jpg)

A challenging data set of 101 food categories is introduced, consisting of 101,000 images. Each class includes 250 manually reviewed test images and 750 training images. The training images deliberately retain some noise, primarily intense colors and occasional incorrect labels. All images have been rescaled to a maximum side length of 512 pixels.


The easiest way to get the clean version of food-101 dataset is to use `vl_datasets` package.

In [2]:
from vl_datasets import VLFood101

train_dataset = VLFood101('./', split='train')
valid_dataset = VLFood101('./', split='test')

Downloading CSV file
Excluding churros/1440917.jpg
Excluding churros/1944265.jpg
Excluding churros/3867252.jpg
Excluding churros/644700.jpg
Excluding hot_and_sour_soup/1674538.jpg
Excluding hot_and_sour_soup/3918910.jpg
Excluding samosa/1727503.jpg
Excluding samosa/2754961.jpg
Excluding samosa/43271.jpg
Excluding samosa/529285.jpg
Excluding samosa/652074.jpg
Excluding samosa/918899.jpg
Excluding samosa/987023.jpg
Excluding sashimi/182979.jpg
Excluding sashimi/2160399.jpg
Excluding sashimi/241368.jpg
Excluding spring_rolls/1745022.jpg
Excluding spring_rolls/182658.jpg
Excluding spring_rolls/3149523.jpg
Excluding spring_rolls/3627865.jpg
Excluding spring_rolls/406134.jpg
Excluding panna_cotta/1112624.jpg
Excluding panna_cotta/2058134.jpg
Excluding panna_cotta/2469457.jpg
Excluding panna_cotta/273835.jpg
Excluding panna_cotta/2824744.jpg
Excluding panna_cotta/3011399.jpg
Excluding panna_cotta/3405936.jpg
Excluding panna_cotta/3421628.jpg
Excluding panna_cotta/379475.jpg
Excluding beef_tar

Excluding cup_cakes/1005580.jpg
Excluding cup_cakes/1082593.jpg
Excluding cup_cakes/112438.jpg
Excluding cup_cakes/2219167.jpg
Excluding cup_cakes/2590269.jpg
Excluding cup_cakes/451074.jpg
Excluding cup_cakes/63497.jpg
Excluding takoyaki/537390.jpg
Excluding chocolate_mousse/1653769.jpg
Excluding chocolate_mousse/1734966.jpg
Excluding chocolate_mousse/2177988.jpg
Excluding chocolate_mousse/2616372.jpg
Excluding chocolate_mousse/343137.jpg
Excluding chocolate_mousse/766461.jpg
Excluding breakfast_burrito/2182358.jpg
Excluding breakfast_burrito/2428601.jpg
Excluding breakfast_burrito/2845140.jpg
Excluding breakfast_burrito/462294.jpg
Excluding breakfast_burrito/662423.jpg
Excluding breakfast_burrito/662424.jpg
Excluding breakfast_burrito/805595.jpg
Excluding hot_dog/1051643.jpg
Excluding hot_dog/1282229.jpg
Excluding hot_dog/3050169.jpg
Excluding hot_dog/3222202.jpg
Excluding hot_dog/3336331.jpg
Excluding hot_dog/3497633.jpg
Excluding macarons/1627847.jpg
Excluding macarons/1671595.jpg


View the first five problematic images that are listed in the `.csv` file.

In [3]:
print(train_dataset.excluded_files[:5])

['churros/1440917.jpg', 'churros/1944265.jpg', 'churros/3867252.jpg', 'churros/644700.jpg', 'hot_and_sour_soup/1674538.jpg']


## Import PyTorch and Torchvision

In [4]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
import torchvision

We can also view the exclude files with:

In [5]:
train_loader = DataLoader(train_dataset, batch_size=256, shuffle=True)
valid_loader = DataLoader(valid_dataset, batch_size=256, shuffle=True)

Adjust the `batch_size` to a value that fits your hardware.

## Define the model architecture
Let's construct a basic convolutional model, `Resnet18` from `Torchvision`.

In [6]:
model = torchvision.models.resnet18(weights=torchvision.models.ResNet18_Weights.DEFAULT)
num_ftrs = model.fc.in_features
model.fc = nn.Linear(num_ftrs, len(train_dataset.classes))

## Define the loss function and optimizer

In [7]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

## Train the model
Now, let's write a simple training loop to train the model for 5 epochs on a GPU or CPU.

In [8]:
num_epochs = 5
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
model.to(device)

for epoch in range(num_epochs):
    running_loss = 0.0
    for i, data in enumerate(train_loader):
        inputs, labels = data
        inputs, labels = inputs.to(device), labels.to(device)

        optimizer.zero_grad()

        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    print(f"Epoch {epoch+1} - Loss: {running_loss/len(train_loader)}")


Using device: cuda
Epoch 1 - Loss: 2.3828936722319005
Epoch 2 - Loss: 1.8042252197103985
Epoch 3 - Loss: 1.606940585475857
Epoch 4 - Loss: 1.482399603876017
Epoch 5 - Loss: 1.3888445458169711


## Evaluate the model
Finally we evaluate the model on the validation set and prints it's accuracy.

In [9]:
correct = 0
total = 0
with torch.no_grad():
    for data in valid_loader:
        inputs, labels = data
        inputs, labels = inputs.to(device), labels.to(device)
        outputs = model(inputs)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f"Accuracy: {100 * correct / total}")


Accuracy: 72.83607013637629
