# Triton Inference Server

### Model Management - Examples
---

#### Classification

In [None]:
!tree examples/chestxray_cls/

In [None]:
!cat examples/chestxray_cls/config.pbtxt

---
The config above defines the following:
- Model name
- Platform: TensorRT (`tensorrt_plan`), PyTorch (`pytorch_libtorch`), TensorFlow (`tensorflow_graphdef`), or others
- Max batch size
- Input description: input node name, data type, array format, and array dimensions
- Output description: output node name, data type, and array dimensions
- Instance group: the number of model instances to serve on the GPUs
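
As a rough illustration of how the fields listed above fit together, the sketch below renders a minimal `config.pbtxt` from Python values. The helpers `render_tensor` and `render_config`, the model name `demo_model`, and the tensor names and shapes are all hypothetical and not part of Triton or of the examples in this notebook.

```python
def render_tensor(name, data_type, dims):
    """Format one input or output entry of a config.pbtxt."""
    dims_text = ", ".join(str(d) for d in dims)
    return (
        "    {\n"
        f'      name: "{name}"\n'
        f"      data_type: {data_type}\n"
        f"      dims: [ {dims_text} ]\n"
        "    }"
    )


def render_config(name, platform, max_batch_size, inputs, outputs, gpu_instances=1):
    """Build the text of a minimal config.pbtxt with the fields listed above."""
    input_text = ",\n".join(render_tensor(*t) for t in inputs)
    output_text = ",\n".join(render_tensor(*t) for t in outputs)
    return (
        f'name: "{name}"\n'
        f'platform: "{platform}"\n'
        f"max_batch_size: {max_batch_size}\n"
        f"input [\n{input_text}\n]\n"
        f"output [\n{output_text}\n]\n"
        "instance_group [\n"
        "    {\n"
        "      kind: KIND_GPU\n"
        f"      count: {gpu_instances}\n"
        "    }\n"
        "]\n"
    )


# Hypothetical model: every name and shape here is made up for illustration
config_text = render_config(
    name="demo_model",
    platform="tensorrt_plan",
    max_batch_size=8,
    inputs=[("INPUT__0", "TYPE_FP32", [3, 224, 224])],
    outputs=[("OUTPUT__0", "TYPE_FP32", [2])],
)
print(config_text)
```

Writing the file by hand is just as valid; a helper like this is mainly useful when several models share the same structure.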

#### Segmentation - TensorRT

In [None]:
!tree examples/covid19_seg/

In [None]:
!cat examples/covid19_seg/config.pbtxt

---

## Manage the Colonoscopy Segmentation Model

Create folders for the model

In [None]:
!mkdir -p ./triton_models/endo_seg
!mkdir -p ./triton_models/endo_seg/1

Copy the TensorRT engine into the version folder, renaming it to `model.plan`

In [None]:
!cp ../TensorRT/model_fp16.engine ./triton_models/endo_seg/1/model.plan
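
The steps above build the layout Triton looks for: `<repository>/<model>/config.pbtxt` plus `<repository>/<model>/<version>/model.plan` for a TensorRT model. The sketch below checks that layout before the server is started. The helpers `model_paths` and `missing_files` are hypothetical, and the demo runs in a temporary directory rather than in `./triton_models`.

```python
import tempfile
from pathlib import Path


def model_paths(repository, model_name, version=1):
    """Return the files Triton expects for one TensorRT model version."""
    model_dir = Path(repository) / model_name
    return {
        "config": model_dir / "config.pbtxt",
        "plan": model_dir / str(version) / "model.plan",
    }


def missing_files(repository, model_name, version=1):
    """List the expected files that do not exist yet."""
    paths = model_paths(repository, model_name, version)
    return [str(p) for p in paths.values() if not p.is_file()]


# Demo: create only the engine file, so the config is reported as missing
with tempfile.TemporaryDirectory() as tmp:
    paths = model_paths(tmp, "endo_seg")
    paths["plan"].parent.mkdir(parents=True)
    paths["plan"].write_bytes(b"")  # empty placeholder, not a real engine
    demo_missing = missing_files(tmp, "endo_seg")

print(demo_missing)
```

Calling `missing_files("./triton_models", "endo_seg")` at this point in the notebook would be expected to report only `config.pbtxt`, which is written in a later cell.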

Get the input and output node names and shapes by parsing the ONNX model

In [None]:
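import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Flag that creates the network in explicit-batch mode
EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(EXPLICIT_BATCH)
parser = trt.OnnxParser(network, TRT_LOGGER)

# Parse the ONNX model to populate the network; print parser errors on failure
with open('../MONAICore/model.onnx', 'rb') as model:
    if not parser.parse(model.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))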

In [None]:
inputs = network.get_input(0)
inputs.name, inputs.shape

In [None]:
outputs = network.get_output(0)
outputs.name, outputs.shape
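
The shapes printed above include the batch dimension, while a Triton config with `max_batch_size` greater than 0 lists `dims` without it. The sketch below shows that conversion. The helper `triton_dims` is hypothetical, and the tuples stand in for `inputs.shape` and `outputs.shape`, assuming the network reports shapes like `(1, 3, 256, 256)`.

```python
def triton_dims(full_shape, max_batch_size):
    """Convert a full tensor shape into the dims list for config.pbtxt.

    When max_batch_size > 0, Triton treats the first dimension as the batch
    dimension and expects it to be left out of dims.
    """
    shape = [int(d) for d in full_shape]
    if max_batch_size > 0:
        return shape[1:]
    return shape


# Assumed shapes; in the notebook they would come from tuple(inputs.shape)
# and tuple(outputs.shape)
input_dims = triton_dims((1, 3, 256, 256), max_batch_size=32)
output_dims = triton_dims((1, 1, 256, 256), max_batch_size=32)
print(input_dims, output_dims)
```

A dynamic dimension reported as `-1` passes through unchanged, which matches how Triton marks variable-size dimensions.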

Generate the config

In [None]:
%%writefile triton_models/endo_seg/config.pbtxt
name: "endo_seg"
platform: "tensorrt_plan"
max_batch_size: 32
input [
    {
      name: "input.1"
      data_type: TYPE_FP32
      dims: [ 3, 256, 256 ]
    }
]
output [
    {
      name: "495"
      data_type: TYPE_FP32
      dims: [ 1, 256, 256 ]
    }
]
instance_group [
    {
      kind: KIND_GPU
      count: 1
    }
]

---

## Run Triton Inference Server
Run the command below inside the Triton Inference Server container.

In [None]:
!tritonserver --model-store=/mount/src/Triton_Inference_Server/triton_models/ 
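
Once the server is running, its readiness can be probed over HTTP at `/v2/health/ready`. The sketch below uses only the standard library. The helpers `ready_url` and `server_is_ready` are hypothetical, and the base URL assumes the server listens on the default HTTP port 8000 on the same machine.

```python
import urllib.error
import urllib.request


def ready_url(base_url):
    """Return the readiness endpoint for a Triton HTTP base URL."""
    return base_url.rstrip("/") + "/v2/health/ready"


def server_is_ready(base_url="http://localhost:8000", timeout=2.0):
    """Return True if the server answers the readiness probe with HTTP 200."""
    try:
        with urllib.request.urlopen(ready_url(base_url), timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an error status: treat as not ready
        return False


print(ready_url("http://localhost:8000"))
```

Calling `server_is_ready()` from another notebook cell or terminal returns `False` until the server has finished loading its models.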