# Kidney Segmentation with PyTorch Lightning and OpenVINO™

## **Part 3:** Show Live Inference

This tutorial demonstrates training and inference with a kidney segmentation model. For training, [PyTorch Lightning](https://www.pytorchlightning.ai/) is used with a [UNet](https://arxiv.org/abs/1505.04597) segmentation model. The model is converted to OpenVINO IR with [Model Optimizer](https://docs.openvinotoolkit.org/latest/openvino_docs_MO_DG_Deep_Learning_Model_Optimizer_DevGuide.html), and quantized with OpenVINO's [Post-Training Optimization Tool](https://docs.openvinotoolkit.org/latest/pot_compression_api_README.html) API. 

The tutorial is split into three parts. This is the third part, which shows how to view inference results and run performance benchmarks. The other parts will be added soon.

## Instructions

This notebook needs a quantized OpenVINO IR model. We provide a pretrained model, trained for 20 epochs on the full [Kits-19](https://github.com/neheller/kits19) frames dataset, which reaches an F1 score of 0.9 on the validation set. To learn how this model was quantized, see the [Quantization Notebook](03-kidney-openvino.ipynb). For a full end-to-end demo that starts with training, see the [Training Notebook](02-kidney-train.ipynb).

## Imports

In [None]:
import glob
import random
import time
import urllib.request
import zipfile
from operator import itemgetter
from pathlib import Path
from typing import List

import cv2
import matplotlib.pyplot as plt
import numpy as np
from IPython.display import Image, Markdown, display
from openvino.inference_engine import IECore

from async_inference import CTAsyncPipeline, SegModel
from omz_python.models import model as omz_model

## Settings

To use the pretrained model, set `MODEL_DIR` to `Path("pretrained_model")` in the cell below. This is the default. To use a model you trained yourself, set `MODEL_DIR` to `Path("model")` (assuming default settings in the training and quantization notebooks).

In [None]:
# Directory that contains the CT scan data. This directory should contain subdirectories
# case_00XXX where XXX is between 000 and 299, with data prepared according to #TODO
basedir = "kits19_frames"
# The CT scan case number. For example: 16 for data from the case_00016 directory
# Currently only 16 is supported
case = 16
# The directory that contains the IR model files. It should contain unet44.xml and
# unet44.bin, and quantized_unet44.xml and quantized_unet44.bin.
MODEL_DIR = Path("pretrained_model")
ir_path = MODEL_DIR / "unet44.xml"
compressed_model_path = MODEL_DIR / "quantized_unet44.xml"
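
Before running the cells that follow, it can help to confirm that the IR files are where these settings expect them. The sketch below uses a hypothetical helper, `missing_ir_files`, which is not part of this notebook or of OpenVINO. It assumes that each IR model consists of an `.xml` file with a `.bin` file of the same name next to it, as the comment in the settings cell states.

```python
from pathlib import Path
from typing import List


def missing_ir_files(model_dir: Path, model_names: List[str]) -> List[Path]:
    """Return the .xml/.bin paths that are missing from model_dir."""
    missing = []
    for name in model_names:
        for suffix in (".xml", ".bin"):
            candidate = model_dir / f"{name}{suffix}"
            if not candidate.exists():
                missing.append(candidate)
    return missing


# Example with the directory and file names used in this notebook
problems = missing_ir_files(Path("pretrained_model"), ["unet44", "quantized_unet44"])
for path in problems:
    print(f"Missing model file: {path}")
```

An empty list means all four files were found.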

## Download and Prepare Data

Download the frames of one validation case (a CT scan) for live inference.

In [None]:
if not Path(f"data/case_00{case:03d}").exists():
    filename, _ = urllib.request.urlretrieve(
        f"https://s3.us-west-1.amazonaws.com/openvino.notebooks/case_00{case:03d}.zip"
    )
    with zipfile.ZipFile(filename, "r") as zip_ref:
        zip_ref.extractall("data")
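
The download cell builds the case directory name as `case_00{case:03d}`, while `KitsDataset` uses `case_{i:05d}`. A quick check with a hypothetical helper, `case_dirname`, suggests that both produce the same five-digit, zero-padded name for case numbers below 1000:

```python
def case_dirname(number: int) -> str:
    """Zero-pad the case number to five digits, for example 16 -> case_00016."""
    return f"case_{number:05d}"


for number in (0, 16, 209):
    # Same format string as the download cell
    assert f"case_00{number:03d}" == case_dirname(number)
    print(case_dirname(number))
```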

In [None]:
class KitsDataset(object):
    def __init__(self, basedir: str, dataset_type: str, transforms=None):
        """
        Dataset class for prepared Kits19 data, for binary segmentation (background/kidney).

        :param basedir: Directory that contains the prepared CT scans, in subdirectories
                        case_00000 through case_00209
        :param dataset_type: either "train" or "val"
        :param transforms: Compose object with augmentations
        """
        allmasks = sorted(glob.glob(f"{basedir}/case_*/segmentation_frames/*png"))

        # Reserve 10% of the patients for the validation dataset
        # Set a random seed to ensure that this list is reproducible
        random.seed(2.71828)
        self.valpatients = sorted(random.choices(range(210), k=21))

        valcases = [f"case_{i:05d}" for i in self.valpatients]
        if dataset_type == "train":
            masks = [mask for mask in allmasks if Path(mask).parents[1].name not in valcases]
        elif dataset_type == "val":
            masks = [mask for mask in allmasks if Path(mask).parents[1].name in valcases]
        else:
            raise ValueError("Please choose train or val dataset split")

        if dataset_type == "train":
            random.shuffle(masks)
        self.basedir = basedir
        self.dataset_type = dataset_type
        self.dataset = masks
        self.transforms = transforms
        print(f"Created {dataset_type} dataset with {len(self.dataset)} items.")

    def __getitem__(self, index):
        """
        Get an item from the dataset at the specified index.
        Labels are converted to binary labels (background/kidney).

        :return: (annotation, input_image) where annotation is (index, segmentation_mask)
        """
        mask_path = self.dataset[index]
        mask = cv2.imread(mask_path, cv2.IMREAD_UNCHANGED)
        # The masks contain annotations for kidneys and tumors, in this tutorial we only segment
        # kidneys so we can set all pixels that contain a non-background value to 1.
        mask[mask > 0] = 1

        image_path = str(Path(mask_path.replace("segmentation", "imaging")).with_suffix(".jpg"))
        img = cv2.imread(image_path, cv2.IMREAD_UNCHANGED)

        if img.shape != (512, 512, 3):
            img = cv2.resize(img, (512, 512))
            mask = cv2.resize(mask, (512, 512))

        # TODO: add transforms with torchvision.transforms instead of albumentations
        # if self.transforms is not None:

        annotation = (index, mask.astype(np.uint8))
        input_image = np.expand_dims(img, axis=0).astype(np.float32)
        return annotation, input_image

    def __len__(self):
        return len(self.dataset)
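
The validation split in `KitsDataset` relies on `random.seed` followed by `random.choices`. Because `random.choices` samples with replacement, the 21 draws may contain repeated patient numbers, so the split may hold fewer than 21 distinct patients. The sketch below reproduces only those two lines so that the split can be inspected without any data on disk. `validation_patients` is a hypothetical helper. The `random.sample` variant is shown for comparison only: it yields 21 distinct patients, but also a different split from the one the class produces.

```python
import random


def validation_patients(seed: float = 2.71828, total: int = 210, k: int = 21) -> list:
    """Mirror the split logic of KitsDataset: seeded draws with replacement."""
    random.seed(seed)
    return sorted(random.choices(range(total), k=k))


first = validation_patients()
second = validation_patients()

# Same seed, same list: the split is reproducible across runs
print("Reproducible:", first == second)
# Draws are with replacement, so distinct patients can be fewer than k
print("Draws:", len(first), "distinct patients:", len(set(first)))

# For comparison only: sampling without replacement (a different split)
random.seed(2.71828)
unique_split = sorted(random.sample(range(210), k=21))
print("Distinct with random.sample:", len(set(unique_split)))
```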

## Show Results

Visualize the results of the model on four slices of the validation set. Compare the results of the FP16 IR model with the results of the quantized INT8 model and the reference segmentation annotation.

In [None]:
# The sigmoid function is used to transform the result of the network
# to binary segmentation masks
def sigmoid(x):
    return np.exp(-np.logaddexp(0, -x))
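
The expression `np.exp(-np.logaddexp(0, -x))` is algebraically the same as `1 / (1 + np.exp(-x))`, because `np.logaddexp(0, -x)` computes `log(1 + exp(-x))`. Written this way, it avoids the overflow warning that the direct form raises for large negative inputs. The sketch below checks both properties and shows the rounding step that turns probabilities into a binary mask. `naive_sigmoid` is included only for comparison.

```python
import numpy as np


def sigmoid(x):
    # log(1 + exp(-x)) is evaluated stably by logaddexp
    return np.exp(-np.logaddexp(0, -x))


def naive_sigmoid(x):
    # Direct form, used here only as a reference on moderate inputs
    return 1 / (1 + np.exp(-x))


logits = np.array([-5.0, -0.5, 0.5, 5.0])
# Both forms agree on moderate inputs
print(np.allclose(sigmoid(logits), naive_sigmoid(logits)))

# Extreme inputs stay finite and inside [0, 1]
extreme = sigmoid(np.array([-1000.0, 1000.0]))
print(extreme)

# Rounding thresholds the probabilities at 0.5 to get a binary mask
mask = sigmoid(logits).round().astype(np.uint8)
print(mask)
```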

In [None]:
data_loader = KitsDataset("data", "val", None)
num_images = 4
colormap = "gray"

ie = IECore()
net_ir = ie.read_network(ir_path)
net_pot = ie.read_network(compressed_model_path)

exec_net_ir = ie.load_network(network=net_ir, device_name="CPU")
exec_net_pot = ie.load_network(network=net_pot, device_name="CPU")
input_layer = next(iter(net_ir.input_info))
output_layer_ir = next(iter(net_ir.outputs))
output_layer_pot = next(iter(net_pot.outputs))

# data_subset = random.choices(data_loader, k=num_images)
data_subset = itemgetter(28, 30, 38, 60)(data_loader)

fig, ax = plt.subplots(nrows=num_images, ncols=4, figsize=(24, num_images * 4))

for i, (ma, im) in enumerate(data_subset):
    res_ir = exec_net_ir.infer(inputs={input_layer: im})
    res_pot = exec_net_pot.infer(inputs={input_layer: im})
    target_mask = ma[1].astype(np.uint8)

    result_mask_ir = sigmoid(res_ir[output_layer_ir]).round().astype(np.uint8)[0, 0, ::]
    result_mask_pot = sigmoid(res_pot[output_layer_pot]).round().astype(np.uint8)[0, 0, ::]

    ax[i, 0].imshow(im[0, ::], cmap=colormap)
    ax[i, 1].imshow(target_mask, cmap=colormap)
    ax[i, 2].imshow(result_mask_ir, cmap=colormap)
    ax[i, 3].imshow(result_mask_pot, cmap=colormap)

    ax[i, 1].set_title("Annotation")
    ax[i, 2].set_title("Prediction on FP16 model")
    ax[i, 3].set_title("Prediction on INT8 model")

### Compare Performance of the Original and Quantized Models
To measure the inference performance of the FP16 and INT8 models, we use the [Benchmark Tool](https://docs.openvinotoolkit.org/latest/openvino_inference_engine_tools_benchmark_tool_README.html), OpenVINO's inference performance measurement tool. The Benchmark Tool is a command line application that can be run in the notebook with `! benchmark_app` or `%sx benchmark_app`. We create a helper function that makes it easy to compare several configurations. It prints the `benchmark_app` command with the command line options for the chosen parameters.

> NOTE: For the most accurate performance estimation, we recommend running `benchmark_app` in a terminal/command prompt after closing other applications. Run `benchmark_app --help` to see all command line options.

In [None]:
def benchmark_model(model_xml, device="CPU", seconds=60, api="async", batch=1):
    ie = IECore()
    model_path = Path(model_xml)
    if ("GPU" in device) and ("GPU" not in ie.available_devices):
        raise ValueError(f"A GPU device is not available. Available devices are: {ie.available_devices}")
    else:
        benchmark_command = f"benchmark_app -m {model_path} -d {device} -t {seconds} -api {api} -b {batch} -cdir model_cache"
        display(Markdown(f"**Benchmark {model_path.name} with {device} for {seconds} seconds with {api} inference**"))
        display(Markdown(f"Benchmark command: `{benchmark_command}`"))

        benchmark_output = %sx $benchmark_command
        benchmark_result = [line for line in benchmark_output
                            if not (line.startswith(r"[") or line.startswith("  ") or line == "")]
        print("\n".join(benchmark_result))
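
The `%sx` magic only works inside IPython. To run the same benchmark from a plain Python script, the command can be built as an argument list and passed to `subprocess.run`. The sketch below uses only the options that appear in `benchmark_model`. `build_benchmark_command` and `filter_benchmark_output` are hypothetical helper names, and the lines fed to the filter are made-up placeholders, not real `benchmark_app` output.

```python
import subprocess
from pathlib import Path
from typing import List


def build_benchmark_command(
    model_xml, device: str = "CPU", seconds: int = 60, api: str = "async", batch: int = 1
) -> List[str]:
    """Build the benchmark_app argument list with the options used in this notebook."""
    return [
        "benchmark_app",
        "-m", str(Path(model_xml)),
        "-d", device,
        "-t", str(seconds),
        "-api", api,
        "-b", str(batch),
        "-cdir", "model_cache",
    ]


def filter_benchmark_output(lines: List[str]) -> List[str]:
    """Drop bracketed log lines, indented lines and empty lines, as benchmark_model does."""
    return [
        line for line in lines
        if not (line.startswith("[") or line.startswith("  ") or line == "")
    ]


command = build_benchmark_command("unet44.xml", seconds=15)
print(" ".join(command))

# Placeholder lines that only exercise the filter
print(filter_benchmark_output(["[ tag ] dropped", "  indented, dropped", "", "kept"]))

# Uncomment to run the benchmark; this requires benchmark_app on the PATH
# completed = subprocess.run(command, capture_output=True, text=True, check=True)
# print("\n".join(filter_benchmark_output(completed.stdout.splitlines())))
```

Passing a list instead of a single string avoids shell quoting problems when the model path contains spaces.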

In [None]:
# By default, benchmark on MULTI:CPU,GPU if a GPU is available, otherwise on CPU.
device = "MULTI:CPU,GPU" if "GPU" in ie.available_devices else "CPU"
# Uncomment one of the options below to benchmark on other devices
# device = "GPU"
# device = "CPU"
# device = "AUTO"
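
The default in this cell can be expressed as a small function of the device list, which makes the fallback easy to check without GPU hardware. `choose_device` is a hypothetical helper that mirrors the membership test used here.

```python
from typing import List


def choose_device(available_devices: List[str]) -> str:
    """Use CPU and GPU together when a GPU is listed, otherwise fall back to CPU."""
    return "MULTI:CPU,GPU" if "GPU" in available_devices else "CPU"


print(choose_device(["CPU"]))
print(choose_device(["CPU", "GPU"]))
```

In the notebook, the call would be `choose_device(ie.available_devices)`.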

In [None]:
benchmark_model(model_xml=ir_path, device=device, seconds=15)

In [None]:
benchmark_model(model_xml=compressed_model_path, device=device, seconds=15)

## Show Live Inference

To show live inference on the model in the notebook, we use the asynchronous processing feature of OpenVINO Inference Engine.
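
The idea behind asynchronous processing is to keep several inference requests in flight while still showing results in frame order. The sketch below illustrates that submit-then-collect pattern with a thread pool and a dummy workload. It is a stand-in written for this explanation, not the OpenVINO API and not the `CTAsyncPipeline` implementation; `fake_infer` and `run_in_order` are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_infer(frame_id: int) -> int:
    # Stand-in for inference; uneven sleep times let requests finish out of order
    time.sleep(0.01 * (3 - frame_id % 3))
    return frame_id * frame_id


def run_in_order(num_frames: int, max_requests: int = 2) -> list:
    """Keep up to max_requests jobs in flight and collect results in frame order."""
    shown = []
    pending = {}
    next_frame_id = 0
    next_frame_id_to_show = 0
    with ThreadPoolExecutor(max_workers=max_requests) as pool:
        while next_frame_id_to_show < num_frames:
            if next_frame_id < num_frames and len(pending) < max_requests:
                # A request slot is free: submit the next frame
                pending[next_frame_id] = pool.submit(fake_infer, next_frame_id)
                next_frame_id += 1
                continue
            # No free slot, or nothing left to submit: wait for the next frame to show
            future = pending.pop(next_frame_id_to_show)
            shown.append((next_frame_id_to_show, future.result()))
            next_frame_id_to_show += 1
    return shown


print(run_in_order(num_frames=5))
```

The two counters play the same role as `next_frame_id` and `next_frame_id_to_show` in the `do_inference` function of this notebook.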

#### Visualization Functions

In [None]:
def showarray(frame: np.ndarray, display_handle):
    """
    Display array `frame`. Replace information at `display_handle` with `frame`
    encoded as jpeg image.

    :param frame: image array to display
    :param display_handle: IPython display handle (not a string). Create one with
                           `display_handle = display(display_id=True)`
    """
    _, frame = cv2.imencode(ext=".jpeg", img=frame)
    display_handle.update(Image(data=frame.tobytes()))

In [None]:
def do_inference(imagelist: List, model: omz_model.Model, device: str):
    """
    Do inference of images in `imagelist` on `model` on the given `device`.

    :param imagelist: list of images/frames to do inference on
    :param model: Model instance for inference
    :param device: Name of device to perform inference on. For example: "CPU"
    """
    display_handle = display("", display_id=True)
    input_layer = next(iter(model.net.input_info))

    # Create asynchronous pipeline and print time it takes to load the model
    s = time.perf_counter()
    pipeline = CTAsyncPipeline(
        ie=ie, model=model, plugin_config={}, device=device, max_num_requests=0
    )
    e = time.perf_counter()
    start_time = time.perf_counter()

    # Perform asynchronous inference
    next_frame_id = 0
    next_frame_id_to_show = 0

    # Submit every frame, including the last one
    while next_frame_id < len(imagelist):
        results = pipeline.get_result(next_frame_id_to_show)

        if results:
            # Show next result from async pipeline
            result, meta = results
            showarray(result, display_handle)

            if next_frame_id_to_show == 0:
                print(f"Loaded model to {device} in {e-s:.2f} seconds.")

            next_frame_id_to_show += 1

        if pipeline.is_ready():
            # Submit new image to async pipeline
            image = imagelist[next_frame_id]
            pipeline.submit_data(
                inputs={input_layer: image}, id=next_frame_id, meta={"frame": image}
            )
            next_frame_id += 1
        else:
            # If the pipeline is not ready yet and there are no results: wait
            pipeline.await_any()

    pipeline.await_all()

    # Show all frames that are in the pipeline after all images have been submitted
    while pipeline.has_completed_request():
        results = pipeline.get_result(next_frame_id_to_show)
        if results:
            result, meta = results
            showarray(result, display_handle)
            next_frame_id_to_show += 1

    end_time = time.perf_counter()
    duration = end_time - start_time
    fps = len(imagelist) / duration
    # next_frame_id now equals the number of submitted frames
    print(f"Total time for {next_frame_id} frames: {duration:.2f} seconds, fps:{fps:.2f}")

#### Load Model and Images

In [None]:
ie = IECore()
segmentation_model = SegModel(ie=ie, model_path=Path(compressed_model_path))

In [None]:
demopattern = f"data/case_00{case:03d}/imaging_frames/*jpg"
imlist = sorted(glob.glob(demopattern))
images = [cv2.imread(im, cv2.IMREAD_UNCHANGED) for im in imlist]

#### Show Inference

In [None]:
# Possible options for device include "CPU", "GPU", "AUTO", and "MULTI:CPU,GPU"
device = "MULTI:CPU,GPU" if "GPU" in ie.available_devices else "CPU"
do_inference(imagelist=images, model=segmentation_model, device=device)