First, create the model. This must match the model used in the interactive training notebook.

In [1]:
import cv2
import torch
import torchvision

CATEGORIES = ['apex']

device = torch.device('cuda')
model = torchvision.models.resnet18(pretrained=False)
model.fc = torch.nn.Linear(512, 2 * len(CATEGORIES))
model = model.cuda().eval().half()

Next, import the modules needed to define the network and to convert it with ``torch2trt``.

In [2]:

import torch.nn as nn
import torch.nn.functional as F
from torch2trt import torch2trt, TRTModule


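Define the network architecture. It must match the architecture used in training so that the saved weights load correctly. The model is later converted and optimized using ``torch2trt`` for faster inference with TensorRT. Please see the [torch2trt](https://github.com/NVIDIA-AI-IOT/torch2trt) readme for more details.

> The conversion can take a couple of minutes to complete.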

In [None]:
class Net(nn.Module):
    def __init__(self, num_classes):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 60, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(60, 30, 5)
        self.fc1 = nn.Linear(30 * 5 * 5, 500)
        self.fc2 = nn.Linear(500, num_classes)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 30 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
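
The `30 * 5 * 5` input size of `fc1` follows from the image size and the layer parameters. Below is a minimal sketch of that arithmetic, assuming the 32x32 input that the conversion cell uses for its dummy tensor. `conv_out` is a helper introduced here for illustration only; it is not part of PyTorch.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a square convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


side = 32                     # assumed input height/width (matches the dummy input)
side = conv_out(side, 5)      # conv1: 5x5 kernel, stride 1
side = conv_out(side, 2, 2)   # pool: 2x2 window, stride 2
side = conv_out(side, 5)      # conv2: 5x5 kernel, stride 1
side = conv_out(side, 2, 2)   # pool: 2x2 window, stride 2

flat_features = 30 * side * side  # conv2 produces 30 channels
print(side, flat_features)
```

If the input resolution changes, the flattened size changes with it, and both `fc1` and the `view` call in `forward` would need the new value.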


Create the model and load the trained weights. Enter the model path you used to save.

In [None]:
num_classes = 5
model = Net(num_classes).cuda().eval()
model.load_state_dict(torch.load("sign_model.pt"))


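Convert and optimize the model with ``torch2trt`` by executing the cell below. The dummy input fixes the input shape the engine is built for, and `fp16_mode=True` enables half-precision inference.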

In [None]:
x = torch.randn(1, 3, 32, 32).cuda()  # dummy input
model_trt = torch2trt(model, [x], fp16_mode=True)
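
Because FP16 conversion can change the outputs slightly, it may be worth comparing the converted model against the original on the same input. The function below is a rough sketch of such a check; `max_abs_error` is a hypothetical helper name, and what counts as an acceptable error depends on the application.

```python
def max_abs_error(model, model_trt, x):
    """Largest absolute difference between the two models' outputs on x."""
    import torch  # imported here so the helper can be defined in any cell

    with torch.no_grad():
        y = model(x)
        y_trt = model_trt(x)
    # Cast both outputs to float32 so an FP16 output compares cleanly.
    return torch.max(torch.abs(y.float() - y_trt.float())).item()
```

In this notebook it could be called as `max_abs_error(model, model_trt, x)` with the dummy input `x` from the conversion cell.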


Save the optimized model using the cell below.

In [None]:
torch.save(model_trt.state_dict(), 'sign_model_trt.pth')
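
The saved file can later be reloaded without repeating the conversion. The sketch below follows the pattern shown in the torch2trt readme, using the `TRTModule` class imported earlier; `load_trt_model` is a hypothetical wrapper name, and the default path is assumed to be the one used in the save cell.

```python
def load_trt_model(path="sign_model_trt.pth"):
    """Rebuild a TensorRT-optimized module from a saved state dict."""
    # Imports live inside the function so that defining it needs no GPU.
    import torch
    from torch2trt import TRTModule

    model_trt = TRTModule()
    model_trt.load_state_dict(torch.load(path))
    return model_trt
```

A TensorRT engine is generally specific to the device and TensorRT version it was built with, so the file is best reloaded on the same machine that produced it.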


Finally, release the GPU memory cached by PyTorch by executing the cell below.

In [None]:
import torch
torch.cuda.empty_cache()
torch.cuda.ipc_collect()
