<img src="http://developer.download.nvidia.com/compute/machine-learning/frameworks/nvidia_logo.png" style="width: 90px; float: center;">

# Triton Client

In [1]:
!pip install tritonclient[http] --quiet



## Imports & Setup

In [2]:
import os
import pathlib

import numpy as np
import tritonclient.http as httpclient
from IPython.display import display, HTML

# Disable tokenizers parallelism to avoid the Hugging Face fork warning.
os.environ["TOKENIZERS_PARALLELISM"] = "false"

# Cap the height of scrolled cell outputs in JupyterLab.
display(HTML("<style>.jp-CodeCell.jp-mod-outputsScrolled .jp-Cell-outputArea { max-height: 24em; }</style>"))

## Check on Triton Server

We can check that the server is ready to accept requests by running `curl -v <IP of machine>:8000/v2/health/ready`, which should return `HTTP/1.1 200 OK`.

**NOTE:** In our case, Triton Server is running on `localhost`.

In [10]:
!curl -v localhost:8000/v2/health/ready

*   Trying 127.0.0.1:8000...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8000 (#0)
> GET /v2/health/ready HTTP/1.1
> Host: localhost:8000
> User-Agent: curl/7.68.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Content-Length: 0
< Content-Type: text/plain
< 
* Connection #0 to host localhost left intact
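
The same readiness check can be run from Python instead of the shell. The following is a minimal sketch using only the standard library; the helper name `triton_is_ready` and its `timeout` default are illustrative assumptions, not part of the Triton client API.

```python
import urllib.error
import urllib.request


def triton_is_ready(host="localhost:8000", timeout=2.0):
    """Return True if GET /v2/health/ready answers with HTTP 200.

    `triton_is_ready` is a hypothetical helper written for this notebook.
    """
    url = f"http://{host}/v2/health/ready"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx status.
        return False


print(triton_is_ready())
```

If `tritonclient` is already installed, `httpclient.InferenceServerClient(url="localhost:8000").is_server_ready()` performs an equivalent check through the client library.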


## Perform Sample Inference

In [11]:
from transformers import AutoTokenizer


def sample_inference(texts, model_variant, triton_model_name, triton_url="localhost:8000"):
    # Tokenize straight to NumPy (no PyTorch needed) and cast to int32,
    # the dtype declared for the "input_ids" tensor below.
    tokenizer = AutoTokenizer.from_pretrained(model_variant)
    tokenized_text = tokenizer(texts, padding=True, return_tensors="np")
    input_ids = tokenized_text["input_ids"].astype(np.int32)

    # Describe the request: one input tensor and the output to return.
    inputids_tensor = httpclient.InferInput("input_ids", input_ids.shape, datatype="INT32")
    inputids_tensor.set_data_from_numpy(input_ids, binary_data=True)
    outputs = [httpclient.InferRequestedOutput("output_ids", binary_data=True)]

    # Send the request to the Triton server over HTTP.
    triton_client = httpclient.InferenceServerClient(url=triton_url)
    results = triton_client.infer(model_name=triton_model_name,
                                  inputs=[inputids_tensor],
                                  outputs=outputs)

    # Decode the generated token ids back into text.
    output_ids = results.as_numpy("output_ids")
    output_text = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
    return output_text
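
To make the shape of the request tensor concrete, here is a small NumPy-only sketch of what `padding=True` followed by the `int32` cast produces: a 2-D array of shape `(batch, longest_sequence)`. The helper `pad_batch`, the token ids, and the pad id of `0` are illustrative assumptions; the real pad id comes from `tokenizer.pad_token_id`.

```python
import numpy as np


def pad_batch(token_id_lists, pad_id=0):
    """Right-pad lists of token ids into one int32 array.

    Hypothetical helper that mimics the array layout the tokenizer
    returns with padding=True; pad_id=0 is an assumption.
    """
    max_len = max(len(ids) for ids in token_id_lists)
    batch = np.full((len(token_id_lists), max_len), pad_id, dtype=np.int32)
    for row, ids in enumerate(token_id_lists):
        batch[row, :len(ids)] = ids
    return batch


# Made-up token ids, only to show the resulting layout.
example = pad_batch([[5, 6, 7], [8]])
print(example.shape, example.dtype)  # (2, 3) int32
```

A single input text, as used in the cells that follow, yields an array of shape `(1, sequence_length)`.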

In [12]:
texts1 = ["summarize: United States involvement in the Vietnam War began shortly after the end of World War II, first in an extremely limited capacity and escalated over a period of 20 years, peaking in April 1969 with 543,000 American combat troops stationed in Vietnam.[1] By the conclusion of the United States's involvement, over 3.1 million Americans had been stationed in the country. This involvement, along with hippie culture, played a key role in sparking the Civil Rights Movement and wide ranging changes in popular culture."]

In [14]:
output_texts = sample_inference(texts1, model_variant="t5-small", triton_model_name="trt_t5_bs1_beam2")

In [15]:
output_texts

["the involvement of the united states in the Vietnam war began shortly after the end of the world war II. the involvement began shortly after the end of the world war ii. by the conclusion of the United States's involvement, over 3.1 million americans had been stationed in the country. this involvement, along with hippie culture, played a key role in sparking the civil rights movement and wide ranging changes in popular culture."]

In [16]:
output_texts = sample_inference(texts1, model_variant="facebook/bart-base", triton_model_name="trt_bart_bs1_greedy")

In [17]:
output_texts

["summarize: United States involvement in the Vietnam War began shortly after the end of World War II, first in an extremely limited capacity and escalated over a period of 20 years, peaking in April 1969 with 543,000 American combat troops stationed in Vietnam.[1] By the conclusion of the United States's involvement, over 3.1 million Americans had been stationed in the country. This involvement, along with hippie culture, played a key role in sparking the Civil Rights Movement and wide ranging changes in popular culture."]