# Intel® oneAPI Hackathon for Open Innovation - Accelerating PyTorch Deep Learning Models on Intel XPUs - Hands-on Lab

## Use of Intel® Extension for PyTorch* for training

In this hands-on lab, we demonstrate how Intel® Extension for PyTorch* (**IPEX**) can be used to train a **resnet50** model in the PyTorch framework. The same process can be followed to apply IPEX optimizations to any other PyTorch model.

### Computer Vision Workload - Transfer learning with Resnet50 

This notebook performs image classification using transfer learning with the resnet50 model. ResNet-50 is a convolutional neural network that is 50 layers deep, and a pretrained version of the network, trained on more than a million images from the ImageNet database, can be loaded directly. In this notebook, every layer except the last is frozen, and the last layer is replaced with a head for two classes. Good accuracy can be achieved in very few epochs. The network takes 224-by-224 input images.
We use the **optimize** method from Intel® Extension for PyTorch* to apply the optimizations.

Refer to https://intel.github.io/intel-extension-for-pytorch/latest/tutorials/installation.html for the installation guide.

In [1]:
!lscpu


Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   46 bits physical, 48 bits virtual
CPU(s):                          24
On-line CPU(s) list:             0-23
Thread(s) per core:              2
Core(s) per socket:              6
Socket(s):                       2
NUMA node(s):                    2
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           85
Model name:                      Intel(R) Xeon(R) Gold 6128 CPU @ 3.40GHz
Stepping:                        4
CPU MHz:                         1200.191
CPU max MHz:                     3700.0000
CPU min MHz:                     1200.0000
BogoMIPS:                        6800.00
Virtualization:                  VT-x
L1d cache:                       384 KiB
L1i cache:                       384 KiB
L2 cache:                        12 MiB
L3 cache:                    
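
The `lscpu` fields above determine how many physical cores PyTorch can use for CPU training. As a rough sketch (the helper names `parse_lscpu` and `physical_cores` are hypothetical, and the sample text copies four fields from the output above), the core counts can be derived like this:

```python
def parse_lscpu(text):
    """Parse 'Key: value' lines from lscpu output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


def physical_cores(fields):
    """Physical cores = sockets x cores per socket."""
    return int(fields["Socket(s)"]) * int(fields["Core(s) per socket"])


# Four fields copied from the lscpu output shown in this notebook.
sample = """\
CPU(s):                          24
Thread(s) per core:              2
Core(s) per socket:              6
Socket(s):                       2
"""

fields = parse_lscpu(sample)
cores = physical_cores(fields)
logical = cores * int(fields["Thread(s) per core"])
print("physical cores:", cores)
print("logical CPUs:  ", logical)
```

On this machine the 24 logical CPUs correspond to 12 physical cores with two hardware threads each.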

In [2]:
#Installation steps 
!pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu
!python -m pip install intel_extension_for_pytorch

Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cpu
Defaulting to user installation because normal site-packages is not writeable


In [5]:
from zipfile import ZipFile

# Extract the training images; the context manager closes the archive automatically.
with ZipFile('./HE_Workshop/Pytorch/training_set.zip', 'r') as zf:
    zf.extractall('training_data')

In [7]:
from zipfile import ZipFile

# Extract the test images; the context manager closes the archive automatically.
with ZipFile('./HE_Workshop/Pytorch/test_set.zip', 'r') as zf:
    zf.extractall('test_data')
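
The `datasets.ImageFolder` class used later in this notebook expects one subdirectory per class and assigns label indices in sorted-name order. Before building the datasets, it can help to confirm that the extracted folders have that layout. The sketch below uses only the standard library; the helper name `list_classes` is hypothetical and simply mirrors that sorted-subdirectory convention.

```python
import os


def list_classes(root):
    """Return {class_name: index} for the immediate subdirectories of `root`,
    ordered by sorted name (the convention ImageFolder follows)."""
    names = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    return {name: index for index, name in enumerate(names)}


# Example usage once the archives are extracted:
# print(list_classes('training_data/training_set'))
# print(list_classes('test_data/test_set'))
```

If the two calls return different mappings, the training and test labels would not line up, so this is worth checking once after extraction.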

Let's start by importing all the necessary packages and modules.

In [2]:
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
from PIL import Image
import cv2
import os
import torch
import torchvision
from torchvision import datasets, models, transforms
import torch.nn as nn
from torch.nn import functional as F
import torch.optim as optim
import intel_extension_for_pytorch as ipex
import time
input_path = "./"

In [3]:
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

data_transforms = {
    'train':
    transforms.Compose([
        transforms.Resize((224,224)),
        transforms.RandomAffine(0, shear=10, scale=(0.8,1.2)),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        normalize
    ]),
    'validation':
    transforms.Compose([
        transforms.Resize((224,224)),
        transforms.ToTensor(),
        normalize
    ]),
}

image_datasets = {
    'train': 
    datasets.ImageFolder(input_path + 'training_data/training_set', data_transforms['train']),
    'validation': 
    datasets.ImageFolder(input_path + 'test_data/test_set', data_transforms['validation'])
}

dataloaders = {
    'train':
    torch.utils.data.DataLoader(image_datasets['train'],
                                batch_size=32,
                                shuffle=True,
                                num_workers=0),  # for Kaggle
    'validation':
    torch.utils.data.DataLoader(image_datasets['validation'],
                                batch_size=32,
                                shuffle=False,
                                num_workers=0)  # for Kaggle
}
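
To make the preprocessing concrete: `ToTensor` scales 8-bit pixel values to the range [0, 1], and `Normalize` then applies `(x - mean) / std` per channel using the ImageNet statistics above. The following is a minimal pure-Python sketch of that arithmetic for a single pixel; the helper name `normalize_pixel` is hypothetical and is not part of torchvision.

```python
# Per-channel ImageNet statistics, copied from the Normalize call above.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Map one 8-bit RGB pixel to the values the model receives."""
    scaled = [v / 255.0 for v in rgb]  # ToTensor: [0, 255] -> [0, 1]
    return [(s - m) / d for s, m, d in zip(scaled, MEAN, STD)]  # Normalize


black = normalize_pixel((0, 0, 0))
white = normalize_pixel((255, 255, 255))
print([round(v, 3) for v in black])
print([round(v, 3) for v in white])
```

A black pixel maps to negative values and a white pixel to positive values in every channel, which is the input distribution the pretrained weights were trained on.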

In [4]:
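# Load ResNet-50 with ImageNet weights. `weights=` replaces the deprecated
# `pretrained=True` argument (requires torchvision >= 0.13).
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

# Freeze the pretrained backbone so that only the new head is trained.
for param in model.parameters():
    param.requires_grad = False

# Replace the 1000-class classifier with a small two-class head.
# Newly created layers have requires_grad=True by default.
model.fc = nn.Sequential(
               nn.Linear(2048, 128),
               nn.ReLU(inplace=True),
               nn.Linear(128, 2))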



In [5]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.fc.parameters())
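
Because the optimizer receives only `model.fc.parameters()`, just the two new linear layers are updated during training. As a quick sanity check, the size of that head can be worked out by hand: a linear layer has `in_features * out_features` weights plus `out_features` biases. The helper `linear_params` below is a hypothetical name used only for this illustration.

```python
def linear_params(in_features, out_features):
    """Parameter count of a fully connected layer: weights plus biases."""
    return in_features * out_features + out_features


# The head defined above: Linear(2048, 128) -> ReLU -> Linear(128, 2).
head_params = linear_params(2048, 128) + linear_params(128, 2)
print("trainable head parameters:", head_params)
```

ReLU has no parameters, so the head adds up to 262,530 trainable values, a small fraction of the frozen backbone.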

In [6]:
def train_model(model, criterion, optimizer, num_epochs=3):
    """Train `model` for `num_epochs` and store the elapsed seconds in `train_model.ttime`."""
    since = time.time()
    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch+1, num_epochs))
        print('-' * 10)

        # Only the training phase runs here; add 'validation' to also evaluate each epoch.
        for phase in ['train']:
            if phase == 'train':
                model.train()
            else:
                model.eval()

            running_loss = 0.0
            running_corrects = 0

            for inputs, labels in dataloaders[phase]:
                outputs = model(inputs)
                loss = criterion(outputs, labels)

                if phase == 'train':
                    optimizer.zero_grad()
                    loss.backward()
                    optimizer.step()

                # The predicted class is the index of the largest logit.
                _, preds = torch.max(outputs, 1)
                # loss.item() is the batch mean, so weight it by the batch size.
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels)

            epoch_loss = running_loss / len(image_datasets[phase])
            epoch_acc = running_corrects.double() / len(image_datasets[phase])

            print('{} loss: {:.4f}, acc: {:.4f}'.format(phase,
                                                        epoch_loss,
                                                        epoch_acc))
    time_elapsed = time.time() - since
    # Expose the duration as a function attribute so the caller can compare runs.
    train_model.ttime = time_elapsed
    print('Training completed in {:.0f}m {:.0f}s \n \n'.format(time_elapsed // 60, time_elapsed % 60))
    return model
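
The training loop multiplies each batch's mean loss by the batch size before summing, then divides by the dataset size. This gives a true per-sample average even when the last batch is smaller than the rest. The sketch below illustrates the difference with made-up numbers; the helper `epoch_average` and the sample values are hypothetical and are not taken from the actual run.

```python
def epoch_average(batch_means, batch_sizes):
    """Per-sample average from per-batch means, weighted by batch size."""
    total = sum(mean * size for mean, size in zip(batch_means, batch_sizes))
    return total / sum(batch_sizes)


# Illustrative values only: two full batches of 32 and a final batch of 4.
batch_means = [0.50, 0.30, 0.10]
batch_sizes = [32, 32, 4]

weighted = epoch_average(batch_means, batch_sizes)
naive = sum(batch_means) / len(batch_means)
print("weighted: {:.4f}".format(weighted))
print("naive:    {:.4f}".format(naive))
```

The naive mean of batch means gives the small final batch the same influence as a full batch, which is why the loop accumulates `loss.item() * inputs.size(0)` instead.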

In [None]:
print("pytorch version: {}".format(torch.__version__))
print("ipex version：   {}".format(ipex.__version__))
print("Training with normal PyTorch")
model_trained = train_model(model, criterion, optimizer, num_epochs=3)
dur_n = train_model.ttime
print("Training with Intel Extension for PyTorch (IPEX)")
model, optimizer = ipex.optimize(model, optimizer=optimizer)
model_trained = train_model(model, criterion, optimizer, num_epochs=3)
dur_i = train_model.ttime
print('Training time (normal): {:.2f}sec'.format(dur_n))
print('Training time (ipex):   {:.2f}sec'.format(dur_i))
print('IPEX reduced training time by {:.2f}% compared to normal PyTorch, a speedup of {:.2f}x'.format((dur_n-dur_i)/dur_n*100, dur_n/dur_i))

plt.bar(["normal", "ipex"], [dur_n, dur_i], width=0.5 , color = ['red' , 'blue'])

pytorch version: 1.12.1+cpu
ipex version：   1.12.300
Training with normal PyTorch
Epoch 1/3
----------
train loss: 0.1321, acc: 0.9445
Epoch 2/3
----------
train loss: 0.0904, acc: 0.9645
Epoch 3/3
----------
train loss: 0.0852, acc: 0.9680
Training completed in 27m 25s 
 

Training with Intel Extension for PyTorch (IPEX)
Epoch 1/3
----------
