Triton naming #747

Merged
merged 3 commits into from Jun 6, 2020
14 changes: 7 additions & 7 deletions docs/apis/README.md
@@ -821,15 +821,15 @@ TensorflowSpec
</tr>
<tr>
<td>
<code>tensorrt</code></br>
<code>triton</code></br>
<em>
<a href="#serving.kubeflow.org/v1alpha2.TensorRTSpec">
TensorRTSpec
<a href="#serving.kubeflow.org/v1alpha2.TritonSpec">
TritonSpec
</a>
</em>
</td>
<td>
<p>Spec for TensorRT Inference Server (<a href="https://github.com/NVIDIA/tensorrt-inference-server">https://github.com/NVIDIA/tensorrt-inference-server</a>)</p>
<p>Spec for Triton Inference Server (<a href="https://github.com/NVIDIA/triton-inference-server">https://github.com/NVIDIA/triton-inference-server</a>)</p>
</td>
</tr>
<tr>
@@ -927,7 +927,7 @@ PredictorConfig
</tr>
<tr>
<td>
<code>tensorrt</code></br>
<code>triton</code></br>
<em>
<a href="#serving.kubeflow.org/v1alpha2.PredictorConfig">
PredictorConfig
@@ -1153,14 +1153,14 @@ int
</tr>
</tbody>
</table>
<h3 id="serving.kubeflow.org/v1alpha2.TensorRTSpec">TensorRTSpec
<h3 id="serving.kubeflow.org/v1alpha2.TritonSpec">TritonSpec
</h3>
<p>
(<em>Appears on:</em>
<a href="#serving.kubeflow.org/v1alpha2.PredictorSpec">PredictorSpec</a>)
</p>
<p>
<p>TensorRTSpec defines arguments for configuring TensorRT model serving.</p>
<p>TritonSpec defines arguments for configuring Triton Inference Server.</p>
</p>
<table>
<thead>
2 changes: 1 addition & 1 deletion docs/predict-api/v2/required_api.md
@@ -7,7 +7,7 @@ By implementing this protocol both
inference clients and servers will increase their utility and
portability by being able to operate seamlessly on platforms that have
standardized around this API. This protocol is endorsed by NVIDIA
TensorRT Inference Server, TensorFlow Serving, and ONNX Runtime
Triton Inference Server, TensorFlow Serving, and ONNX Runtime
Server.
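
The standardized request shape this protocol defines can be sketched in a few lines; a minimal sketch, assuming the v2 `POST /v2/models/{model}/infer` endpoint, where the tensor name `INPUT0` and the values are illustrative, not taken from this repository:

```python
import json

def build_infer_request(name, shape, datatype, data):
    """Assemble a v2-protocol inference request body for
    POST /v2/models/{model}/infer. A sketch: field names follow the
    standardized prediction protocol; the values are illustrative."""
    return {"inputs": [{"name": name,
                        "shape": shape,
                        "datatype": datatype,
                        "data": data}]}

# Build a request for a hypothetical model taking a [1, 16] INT32 tensor.
body = build_infer_request("INPUT0", [1, 16], "INT32", list(range(16)))
print(json.dumps(body, indent=2))
```

Any client that emits this body, and any server that accepts it, can interoperate across the servers listed above.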

For an inference server to be compliant with this protocol the server
2 changes: 1 addition & 1 deletion docs/samples/azure/README.md
@@ -4,7 +4,7 @@
By default, KFServing uses an anonymous client to download artifacts. To use an Azure Blob, set StorageUri to an Azure Blob Storage URL with the format:
```https://{$STORAGE_ACCOUNT_NAME}.blob.core.windows.net/{$CONTAINER}/{$PATH}```

e.g. https://kfserving.blob.core.windows.net/tensorrt/simple_string/
e.g. https://kfserving.blob.core.windows.net/triton/simple_string/
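
Such a URL decomposes into an account name, a container, and a blob path; a small illustrative helper (not part of KFServing's API) shows the split:

```python
from urllib.parse import urlparse

def parse_blob_uri(uri):
    """Split an Azure Blob StorageUri of the form
    https://{account}.blob.core.windows.net/{container}/{path}
    into (account, container, path). Illustrative helper only."""
    parsed = urlparse(uri)
    # The account name is the first label of the hostname.
    account = parsed.netloc.split(".")[0]
    # The first path segment is the container; the rest is the blob path.
    container, _, path = parsed.path.lstrip("/").partition("/")
    return account, container, path

print(parse_blob_uri("https://kfserving.blob.core.windows.net/triton/simple_string/"))
# -> ('kfserving', 'triton', 'simple_string/')
```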

## Using Private Blobs
KFServing supports authenticating using an Azure Service Principal.
14 changes: 7 additions & 7 deletions docs/samples/triton/simple_string/README.md
@@ -1,28 +1,28 @@

# Predict on an InferenceService using TensorRT Inference Server
# Predict on an InferenceService using Triton Inference Server
## Setup
1. Your ~/.kube/config should point to a cluster with [KFServing installed](https://github.com/kubeflow/kfserving/blob/master/docs/DEVELOPER_GUIDE.md#deploy-kfserving).
2. Your cluster's Istio Ingress gateway must be network accessible.

## Create the InferenceService
Apply the CRD
```
kubectl apply -f tensorrt.yaml
kubectl apply -f triton.yaml
```

Expected Output
```
inferenceservice.serving.kubeflow.org/tensorrt-simple-string created
inferenceservice.serving.kubeflow.org/triton-simple-string created
```

## Run a prediction
Uses the client at: https://docs.nvidia.com/deeplearning/sdk/tensorrt-inference-server-guide/docs/client.html#section-client-api
Uses the client at: https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/client_example.html


1. setup vars

```
SERVICE_HOSTNAME=$(kubectl get ksvc tensorrt-simple-string-predictor-default -o jsonpath='{.status.url}' | cut -d "/" -f 3)
SERVICE_HOSTNAME=$(kubectl get ksvc triton-simple-string-predictor-default -o jsonpath='{.status.url}' | cut -d "/" -f 3)
INGRESS_GATEWAY=istio-ingressgateway
CLUSTER_IP=$(kubectl -n istio-system get service $INGRESS_GATEWAY -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo $CLUSTER_IP
@@ -31,7 +31,7 @@ echo $CLUSTER_IP
```
curl -H "Host: ${SERVICE_HOSTNAME}" http://${CLUSTER_IP}/api/status
```
3. edit /etc/hosts to map the CLUSTER IP to tensorrt-simple-string-predictor-default.default.example.com
3. edit /etc/hosts to map the CLUSTER IP to triton-simple-string-predictor-default.default.example.com
4. run the client
```
docker run -e SERVICE_HOSTNAME:$SERVICE_HOSTNAME -it --rm --net=host kcorer/tensorrtserver_client:19.05
@@ -40,7 +40,7 @@ docker run -e SERVICE_HOSTNAME:$SERVICE_HOSTNAME -it --rm --net=host kcorer/tens

You should see output like:
```
root@trantor:/workspace# ./build/simple_string_client -u tensorrt-simple-string.default.example.com
root@trantor:/workspace# ./build/simple_string_client -u triton-simple-string.default.example.com
0 + 1 = 1
0 - 1 = -1
1 + 1 = 2
@@ -1,9 +1,9 @@
apiVersion: "serving.kubeflow.org/v1alpha2"
kind: "InferenceService"
metadata:
name: "tensorrt-simple-string"
name: "triton-simple-string"
spec:
default:
predictor:
tensorrt:
triton:
storageUri: "gs://kfserving-samples/models/tensorrt"
4 changes: 2 additions & 2 deletions python/kfserving/README.md
@@ -40,7 +40,7 @@ KFServing supports the following storage providers:
* By default, it uses `S3_ENDPOINT`, `AWS_ACCESS_KEY_ID`, and `AWS_SECRET_ACCESS_KEY` environment variables for user authentication.
* Azure Blob Storage with the format: "https://{$STORAGE_ACCOUNT_NAME}.blob.core.windows.net/{$CONTAINER}/{$PATH}"
* By default, it uses an anonymous client to download the artifacts.
* e.g. https://kfserving.blob.core.windows.net/tensorrt/simple_string/
* e.g. https://kfserving.blob.core.windows.net/triton/simple_string/
* Local filesystem either without any prefix or with a prefix "file://". For example:
* Absolute path: `/absolute/path` or `file:///absolute/path`
* Relative path: `relative/path` or `file://relative/path`
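
The providers listed above can be told apart by URI scheme; a hedged sketch of how a downloader might route a StorageUri (the function name and return labels are illustrative, not KFServing's actual API):

```python
from urllib.parse import urlparse

def storage_provider(uri):
    """Route a StorageUri to a storage backend by its scheme.
    Illustrative sketch mirroring the provider list above."""
    parsed = urlparse(uri)
    if parsed.scheme == "gs":
        return "gcs"
    if parsed.scheme == "s3":
        return "s3"
    if parsed.scheme == "https" and parsed.netloc.endswith(".blob.core.windows.net"):
        return "azure-blob"
    # No prefix or an explicit file:// prefix means the local filesystem.
    if parsed.scheme in ("", "file"):
        return "local"
    raise ValueError(f"Unsupported storage uri: {uri}")

print(storage_provider("gs://kfserving-samples/models/tensorrt"))
# -> gcs
```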
@@ -94,7 +94,7 @@ Class | Method | Description
- [V1alpha2PyTorchSpec](docs/V1alpha2PyTorchSpec.md)
- [V1alpha2SKLearnSpec](docs/V1alpha2SKLearnSpec.md)
- [V1alpha2StatusConfigurationSpec](docs/V1alpha2StatusConfigurationSpec.md)
- [V1alpha2TensorRTSpec](docs/V1alpha2TensorRTSpec.md)
- [V1alpha2TritonSpec](docs/V1alpha2TritonSpec.md)
- [V1alpha2TensorflowSpec](docs/V1alpha2TensorflowSpec.md)
- [V1alpha2TransformerSpec](docs/V1alpha2TransformerSpec.md)
- [V1alpha2XGBoostSpec](docs/V1alpha2XGBoostSpec.md)
8 changes: 4 additions & 4 deletions python/kfserving/test/test_azure_storage.py
@@ -43,7 +43,7 @@ def get_call_args(call_args_list):
def test_blob(mock_storage, mock_makedirs): # pylint: disable=unused-argument

# given
blob_path = 'https://kfserving.blob.core.windows.net/tensorrt/simple_string/'
blob_path = 'https://kfserving.blob.core.windows.net/triton/simple_string/'
paths = ['simple_string/1/model.graphdef', 'simple_string/config.pbtxt']
mock_blob = create_mock_blob(mock_storage, paths)

@@ -53,8 +53,8 @@ def test_blob(mock_storage, mock_makedirs): # pylint: disable=unused-argument
# then
arg_list = get_call_args(mock_blob.get_blob_to_path.call_args_list)
assert arg_list == [
('tensorrt', 'simple_string/1/model.graphdef', 'dest_path/1/model.graphdef'),
('tensorrt', 'simple_string/config.pbtxt', 'dest_path/config.pbtxt')
('triton', 'simple_string/1/model.graphdef', 'dest_path/1/model.graphdef'),
('triton', 'simple_string/config.pbtxt', 'dest_path/config.pbtxt')
]

mock_storage.assert_called_with(account_name="kfserving")
@@ -65,7 +65,7 @@ def test_blob(mock_storage, mock_get_token, mock_makedirs): # pylint: disable=unused-argument
def test_secure_blob(mock_storage, mock_get_token, mock_makedirs): # pylint: disable=unused-argument

# given
blob_path = 'https://kfsecured.blob.core.windows.net/tensorrt/simple_string/'
blob_path = 'https://kfsecured.blob.core.windows.net/triton/simple_string/'
mock_blob = mock_storage.return_value
mock_blob.list_blobs.side_effect = AzureMissingResourceHttpError("fail auth", 404)
mock_get_token.return_value = "some_token"