Commit
Triton naming (kserve#747)
* Triton renaming

* Update docs/samples/triton/simple_string/triton.yaml

Co-authored-by: Dan Sun <dsun20@bloomberg.net>

* Update test_triton.py

Co-authored-by: Dan Sun <dsun20@bloomberg.net>
deadeyegoodwin and yuzisun committed Jun 6, 2020
1 parent 5f1d8f4 commit 2deea7d
Showing 7 changed files with 24 additions and 24 deletions.
14 changes: 7 additions & 7 deletions docs/apis/README.md
@@ -821,15 +821,15 @@ TensorflowSpec
</tr>
<tr>
<td>
-<code>tensorrt</code></br>
+<code>triton</code></br>
<em>
-<a href="#serving.kubeflow.org/v1alpha2.TensorRTSpec">
-TensorRTSpec
+<a href="#serving.kubeflow.org/v1alpha2.TritonSpec">
+TritonSpec
</a>
</em>
</td>
<td>
-<p>Spec for TensorRT Inference Server (<a href="https://github.com/NVIDIA/tensorrt-inference-server">https://github.com/NVIDIA/tensorrt-inference-server</a>)</p>
+<p>Spec for Triton Inference Server (<a href="https://github.com/NVIDIA/triton-inference-server">https://github.com/NVIDIA/triton-inference-server</a>)</p>
</td>
</tr>
<tr>
@@ -927,7 +927,7 @@ PredictorConfig
</tr>
<tr>
<td>
-<code>tensorrt</code></br>
+<code>triton</code></br>
<em>
<a href="#serving.kubeflow.org/v1alpha2.PredictorConfig">
PredictorConfig
@@ -1153,14 +1153,14 @@ int
</tr>
</tbody>
</table>
-<h3 id="serving.kubeflow.org/v1alpha2.TensorRTSpec">TensorRTSpec
+<h3 id="serving.kubeflow.org/v1alpha2.TritonSpec">TritonSpec
</h3>
<p>
(<em>Appears on:</em>
<a href="#serving.kubeflow.org/v1alpha2.PredictorSpec">PredictorSpec</a>)
</p>
<p>
-<p>TensorRTSpec defines arguments for configuring TensorRT model serving.</p>
+<p>TritonSpec defines arguments for configuring Triton Inference Server.</p>
</p>
<table>
<thead>
2 changes: 1 addition & 1 deletion docs/predict-api/v2/required_api.md
@@ -7,7 +7,7 @@ By implementing this protocol both
inference clients and servers will increase their utility and
portability by being able to operate seamlessly on platforms that have
standardized around this API. This protocol is endorsed by NVIDIA
-TensorRT Inference Server, TensorFlow Serving, and ONNX Runtime
+Triton Inference Server, TensorFlow Serving, and ONNX Runtime
Server.

For an inference server to be compliant with this protocol the server
2 changes: 1 addition & 1 deletion docs/samples/azure/README.md
@@ -4,7 +4,7 @@
By default, KFServing uses anonymous client to download artifacts. To point to an Azure Blob, specify StorageUri to point to an Azure Blob Storage with the format:
```https://{$STORAGE_ACCOUNT_NAME}.blob.core.windows.net/{$CONTAINER}/{$PATH}```

-e.g. https://kfserving.blob.core.windows.net/tensorrt/simple_string/
+e.g. https://kfserving.blob.core.windows.net/triton/simple_string/
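
The URL format above can be read as three parts: storage account, container, and blob path. The following is a minimal sketch of that split using only the Python standard library. The helper name `split_azure_blob_uri` is hypothetical and is not taken from the KFServing code base; it only illustrates how the example URL decomposes.

```python
from urllib.parse import urlparse


def split_azure_blob_uri(uri):
    """Split an Azure Blob URI into (account, container, prefix).

    Hypothetical helper for illustration only, not KFServing's implementation.
    """
    parsed = urlparse(uri)
    # The storage account is the first label of the host name.
    account = parsed.netloc.split(".")[0]
    # The first path segment is the container; the remainder is the blob prefix.
    container, _, prefix = parsed.path.lstrip("/").partition("/")
    return account, container, prefix


print(split_azure_blob_uri(
    "https://kfserving.blob.core.windows.net/triton/simple_string/"))
```

For the sample URL, the account is `kfserving`, the container is `triton`, and the prefix is `simple_string/`.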

## Using Private Blobs
KFServing supports authenticating using an Azure Service Principle.
14 changes: 7 additions & 7 deletions docs/samples/triton/simple_string/README.md
@@ -1,28 +1,28 @@

-# Predict on a InferenceService using TensorRT Inference Server
+# Predict on a InferenceService using Triton Inference Server
## Setup
1. Your ~/.kube/config should point to a cluster with [KFServing installed](https://github.com/kubeflow/kfserving/blob/master/docs/DEVELOPER_GUIDE.md#deploy-kfserving).
2. Your cluster's Istio Ingress gateway must be network accessible.

## Create the InferenceService
Apply the CRD
```
-kubectl apply -f tensorrt.yaml
+kubectl apply -f triton.yaml
```

Expected Output
```
-inferenceservice.serving.kubeflow.org/tensorrt-simple-string created
+inferenceservice.serving.kubeflow.org/triton-simple-string created
```

## Run a prediction
-Uses the client at: https://docs.nvidia.com/deeplearning/sdk/tensorrt-inference-server-guide/docs/client.html#section-client-api
+Uses the client at: https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/client_example.html


1. setup vars

```
-SERVICE_HOSTNAME=$(kubectl get ksvc tensorrt-simple-string-predictor-default -o jsonpath='{.status.url}' | cut -d "/" -f 3)
+SERVICE_HOSTNAME=$(kubectl get ksvc triton-simple-string-predictor-default -o jsonpath='{.status.url}' | cut -d "/" -f 3)
INGRESS_GATEWAY=istio-ingressgateway
CLUSTER_IP=$(kubectl -n istio-system get service $INGRESS_GATEWAY -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo $CLUSTER_IP
@@ -31,7 +31,7 @@ echo $CLUSTER_IP
```
curl -H "Host: ${SERVICE_HOSTNAME}" http://${CLUSTER_IP}/api/status
```
-3. edit /etc/hosts to map the CLUSTER IP to tensorrt-simple-string-predictor-default.default.example.com
+3. edit /etc/hosts to map the CLUSTER IP to triton-simple-string-predictor-default.default.example.com
4. run the client
```
docker run -e SERVICE_HOSTNAME:$SERVICE_HOSTNAME -it --rm --net=host kcorer/tensorrtserver_client:19.05
@@ -40,7 +40,7 @@ docker run -e SERVICE_HOSTNAME:$SERVICE_HOSTNAME -it --rm --net=host kcorer/tens

You should see output like:
```
-root@trantor:/workspace# ./build/simple_string_client -u tensorrt-simple-string.default.example.com
+root@trantor:/workspace# ./build/simple_string_client -u triton-simple-string.default.example.com
0 + 1 = 1
0 - 1 = -1
1 + 1 = 2
4 changes: 2 additions & 2 deletions docs/samples/triton/simple_string/triton.yaml
@@ -1,9 +1,9 @@
apiVersion: "serving.kubeflow.org/v1alpha2"
kind: "InferenceService"
metadata:
-  name: "tensorrt-simple-string"
+  name: "triton-simple-string"
spec:
  default:
    predictor:
-      tensorrt:
+      triton:
        storageUri: "gs://kfserving-samples/models/tensorrt"
4 changes: 2 additions & 2 deletions python/kfserving/README.md
@@ -40,7 +40,7 @@ KFServing supports the following storage providers:
* By default, it uses `S3_ENDPOINT`, `AWS_ACCESS_KEY_ID`, and `AWS_SECRET_ACCESS_KEY` environment variables for user authentication.
* Azure Blob Storage with the format: "https://{$STORAGE_ACCOUNT_NAME}.blob.core.windows.net/{$CONTAINER}/{$PATH}"
* By default, it uses anonymous client to download the artifacts.
-* For e.g. https://kfserving.blob.core.windows.net/tensorrt/simple_string/
+* For e.g. https://kfserving.blob.core.windows.net/triton/simple_string/
* Local filesystem either without any prefix or with a prefix "file://". For example:
* Absolute path: `/absolute/path` or `file:///absolute/path`
* Relative path: `relative/path` or `file://relative/path`
@@ -94,7 +94,7 @@ Class | Method | Description
- [V1alpha2PyTorchSpec](docs/V1alpha2PyTorchSpec.md)
- [V1alpha2SKLearnSpec](docs/V1alpha2SKLearnSpec.md)
- [V1alpha2StatusConfigurationSpec](docs/V1alpha2StatusConfigurationSpec.md)
-- [V1alpha2TensorRTSpec](docs/V1alpha2TensorRTSpec.md)
+- [V1alpha2TritonSpec](docs/V1alpha2TritonSpec.md)
- [V1alpha2TensorflowSpec](docs/V1alpha2TensorflowSpec.md)
- [V1alpha2TransformerSpec](docs/V1alpha2TransformerSpec.md)
- [V1alpha2XGBoostSpec](docs/V1alpha2XGBoostSpec.md)
8 changes: 4 additions & 4 deletions python/kfserving/test/test_azure_storage.py
@@ -43,7 +43,7 @@ def get_call_args(call_args_list):
def test_blob(mock_storage, mock_makedirs):  # pylint: disable=unused-argument

    # given
-    blob_path = 'https://kfserving.blob.core.windows.net/tensorrt/simple_string/'
+    blob_path = 'https://kfserving.blob.core.windows.net/triton/simple_string/'
    paths = ['simple_string/1/model.graphdef', 'simple_string/config.pbtxt']
    mock_blob = create_mock_blob(mock_storage, paths)

@@ -53,8 +53,8 @@ def test_blob(mock_storage, mock_makedirs): # pylint: disable=unused-argument
    # then
    arg_list = get_call_args(mock_blob.get_blob_to_path.call_args_list)
    assert arg_list == [
-        ('tensorrt', 'simple_string/1/model.graphdef', 'dest_path/1/model.graphdef'),
-        ('tensorrt', 'simple_string/config.pbtxt', 'dest_path/config.pbtxt')
+        ('triton', 'simple_string/1/model.graphdef', 'dest_path/1/model.graphdef'),
+        ('triton', 'simple_string/config.pbtxt', 'dest_path/config.pbtxt')
    ]

    mock_storage.assert_called_with(account_name="kfserving")
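
The assertion in this hunk expects each blob under the `simple_string/` prefix to land at the same relative path under `dest_path`. A minimal sketch of that mapping follows, assuming the prefix is simply stripped from the blob name. `local_path_for_blob` is a hypothetical helper written for illustration and is not the function under test.

```python
import posixpath


def local_path_for_blob(blob_name, prefix, dest_dir):
    """Map a blob name under `prefix` to a path under `dest_dir`.

    Hypothetical helper for illustration only, not KFServing's implementation.
    """
    # Drop the container-relative prefix so only the model-relative part remains.
    if blob_name.startswith(prefix):
        relative = blob_name[len(prefix):]
    else:
        relative = blob_name
    # Join with forward slashes so the result matches the test's expected strings.
    return posixpath.join(dest_dir, relative.lstrip("/"))


for name in ["simple_string/1/model.graphdef", "simple_string/config.pbtxt"]:
    print(local_path_for_blob(name, "simple_string/", "dest_path"))
```

With the two blob names from the test, this yields `dest_path/1/model.graphdef` and `dest_path/config.pbtxt`, the destination values asserted above.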
@@ -65,7 +65,7 @@ def test_blob(mock_storage, mock_makedirs): # pylint: disable=unused-argument
def test_secure_blob(mock_storage, mock_get_token, mock_makedirs):  # pylint: disable=unused-argument

    # given
-    blob_path = 'https://kfsecured.blob.core.windows.net/tensorrt/simple_string/'
+    blob_path = 'https://kfsecured.blob.core.windows.net/triton/simple_string/'
    mock_blob = mock_storage.return_value
    mock_blob.list_blobs.side_effect = AzureMissingResourceHttpError("fail auth", 404)
    mock_get_token.return_value = "some_token"