# Pretrained UNet Inference on Trn1 / Inf2

## Introduction

This notebook demonstrates how to compile and run a UNet model for accelerated inference on Neuron. It uses the [`UNet`](https://github.com/milesial/Pytorch-UNet) model from the Pytorch-UNet repository, which is primarily used for semantic image segmentation.

This Jupyter notebook should be run on a Trn1 or Inf2 instance (`trn1.2xlarge` or `inf2.xlarge` or larger).

## Install Dependencies
This tutorial requires the following pip packages:

- `torch-neuronx`
- `neuronx-cc`
- `torchvision`

Most of these packages are installed when you configure your environment using the Trn1 setup guide. The code below also imports `PIL` (Pillow) and `requests` to fetch the example image.
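
Before running the rest of the notebook, it can help to confirm that the required packages are importable. The following is a minimal sketch using only the standard library. The mapping from pip package names to import names (for example `neuronx-cc` to `neuronxcc`) and the helper `missing_packages` are assumptions for illustration, not part of the Neuron SDK.

```python
import importlib.util

# Assumed mapping from pip package name to importable module name.
REQUIRED_MODULES = {
    "torch-neuronx": "torch_neuronx",
    "neuronx-cc": "neuronxcc",
    "torchvision": "torchvision",
}


def missing_packages(modules: dict) -> list:
    """Return the pip names whose module cannot be located (hypothetical helper)."""
    return [
        pip_name
        for pip_name, module_name in modules.items()
        if importlib.util.find_spec(module_name) is None
    ]


missing = missing_packages(REQUIRED_MODULES)
print("Missing packages:", missing or "none")
```

If the list is not empty, install the reported packages by following the setup guide before continuing.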

## Compile the model into an AWS Neuron optimized TorchScript

In the following section, we load the model, get a sample input, run inference on CPU, compile the model for Neuron using `torch_neuronx.trace()`, and save the optimized model as `TorchScript`.

`torch_neuronx.trace()` expects a tensor or tuple of tensor inputs to use for tracing, so we convert the input image into a tensor.

In [None]:
from PIL import Image
import requests

import torch
import torch_neuronx
from torchvision.transforms import functional

# load the model
model = torch.hub.load('milesial/Pytorch-UNet', 'unet_carvana', pretrained=False)
# load the weights
state_dict = torch.hub.load_state_dict_from_url('https://github.com/milesial/Pytorch-UNet/releases/download/v3.0/unet_carvana_scale0.5_epoch2.pth', map_location="cpu")
model.load_state_dict(state_dict)
model.eval()

# Get an example input
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
image = image.convert('RGB')
image = functional.resize(image, (224, 224))
image = functional.to_tensor(image)
image = torch.unsqueeze(image, 0)

# Run inference on CPU (gradients are not needed for inference)
with torch.no_grad():
    output_cpu = model(image)

# Compile the model
model_neuron = torch_neuronx.trace(model, image)

# Save the TorchScript for inference deployment
filename = 'model.pt'
torch.jit.save(model_neuron, filename)

## Run inference and compare results

In this section we load the compiled model, run inference on Neuron, and compare the CPU and Neuron output tensors. Because UNet is a segmentation model, the output holds per-pixel class scores rather than image-level class labels.

In [None]:
# Load the TorchScript compiled model
model_neuron = torch.jit.load(filename)

# Run inference using the Neuron model
output_neuron = model_neuron(image)

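# Compare the results: the output is laid out as (batch, class, height, width),
# so print the first 10 scores of the first row in the first class channel
print(f"CPU tensor:    {output_cpu[0, 0, 0, :10]}")
print(f"Neuron tensor: {output_neuron[0, 0, 0, :10]}")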
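
Printing a few values gives only a rough impression of how close the two outputs are. A numeric summary can make the comparison more concrete. The sketch below assumes both outputs are tensors of shape `(N, C, H, W)` holding per-class scores; the helper `compare_outputs` and the stand-in tensors are hypothetical and exist only so the example runs without Neuron hardware. In the notebook, you would pass `output_cpu` and `output_neuron` instead.

```python
import torch


def compare_outputs(reference: torch.Tensor, candidate: torch.Tensor) -> dict:
    """Summarize agreement between two (N, C, H, W) score tensors (hypothetical helper)."""
    reference = reference.detach().float()
    candidate = candidate.detach().float()

    # Largest element-wise deviation between the two outputs.
    max_abs_diff = (reference - candidate).abs().max().item()

    # Predicted class per pixel: argmax over the channel dimension.
    mask_reference = reference.argmax(dim=1)
    mask_candidate = candidate.argmax(dim=1)

    # Fraction of pixels where both outputs predict the same class.
    mask_agreement = (mask_reference == mask_candidate).float().mean().item()

    return {"max_abs_diff": max_abs_diff, "mask_agreement": mask_agreement}


# Stand-in tensors (assumption): replace with output_cpu and output_neuron.
torch.manual_seed(0)
fake_cpu = torch.randn(1, 2, 8, 8)
fake_neuron = fake_cpu + 1e-3  # simulate a small numerical deviation

stats = compare_outputs(fake_cpu, fake_neuron)
print(stats)
```

Small differences in the raw scores are expected when a model is compiled for an accelerator, so the fraction of matching pixels in the predicted mask is often the more meaningful number for a segmentation model.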