Piotrm/resnet50 pyt triton perf fix by piotrm-nvidia · Pull Request #953 · NVIDIA/DeepLearningExamples

piotrm-nvidia · 2021-06-08T14:37:53Z

These two changes improve the performance of:

ONNX Runtime with the TensorRT execution provider.
TensorRT runtime.

The results in the Triton folder for ResNet50 PyTorch are still valid. The change makes it easier to reproduce them by following the Quick Start Guide.

The scripts were modified to set the previously missing ORT_TENSORRT_FP16_ENABLE flag for Triton Inference Server when it runs ONNX Runtime with the TensorRT execution provider.
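
As a rough illustration of the flag described above, the sketch below prepares an environment in which ORT_TENSORRT_FP16_ENABLE is set before Triton Inference Server is started. It is not the code from this pull request: the helper names `triton_env_with_ort_fp16` and `triton_command` and the `/models` path are hypothetical, and the sketch assumes the server is launched as a `tritonserver` process with `--model-repository`.

```python
import os


def triton_env_with_ort_fp16(base_env=None):
    """Return a copy of the environment with ORT_TENSORRT_FP16_ENABLE set to "1".

    Hypothetical helper for illustration; the scripts in this PR may set the
    variable differently (for example in a shell script or container launch).
    """
    env = dict(os.environ if base_env is None else base_env)
    # ONNX Runtime's TensorRT execution provider reads this variable;
    # "1" enables FP16 mode for the engines it builds.
    env["ORT_TENSORRT_FP16_ENABLE"] = "1"
    return env


def triton_command(model_repository):
    """Build the server command line; the repository path is an assumption."""
    return ["tritonserver", f"--model-repository={model_repository}"]


if __name__ == "__main__":
    env = triton_env_with_ort_fp16()
    cmd = triton_command("/models")  # hypothetical model repository path
    print("ORT_TENSORRT_FP16_ENABLE =", env["ORT_TENSORRT_FP16_ENABLE"])
    print("command:", " ".join(cmd))
    # To actually start the server, one could run:
    #   import subprocess; subprocess.run(cmd, env=env, check=True)
```

The variable has to be present in the environment of the server process itself, which is why the sketch passes `env` to the launch call rather than setting it on the client side.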

The ONNX to TensorRT converter was fixed to force FP16 precision for TensorRT networks.
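
For readers who want to try an FP16 conversion outside the repository's converter, the sketch below assembles a `trtexec` command line with the `--fp16` flag. This is a stand-in technique, not the converter fixed in this pull request: the function name `trtexec_fp16_command` and the file names are hypothetical, and it assumes the `trtexec` tool shipped with TensorRT is on the PATH.

```python
def trtexec_fp16_command(onnx_path, engine_path):
    """Build a trtexec command that converts an ONNX model with FP16 enabled.

    Hypothetical helper for illustration; the repository's converter may use
    the TensorRT Python API instead of the trtexec command-line tool.
    """
    return [
        "trtexec",
        f"--onnx={onnx_path}",          # input ONNX model
        f"--saveEngine={engine_path}",  # where to write the serialized engine
        "--fp16",                       # allow FP16 kernels during the build
    ]


if __name__ == "__main__":
    # File names are placeholders, not paths from the repository.
    cmd = trtexec_fp16_command("model.onnx", "model.plan")
    print(" ".join(cmd))
    # To run the conversion, one could use:
    #   import subprocess; subprocess.run(cmd, check=True)
```

Note that `--fp16` permits FP16 kernels but leaves the final per-layer choice to TensorRT, so it may not be equivalent to the forced precision this change describes.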

piotrm-nvidia added 2 commits June 8, 2021 16:25

ResNet50/PyT Triton ONNXruntime fix with env flag

e4b153c

Scripts were modified to fix missing ORT_TENSORRT_FP16_ENABLE flag for Triton Inference Server with ONNXRuntime and TensorRT execution provider.

ResNet50/PyT TensorRT FP16 support fixed

7f7986d

ONNX to TensorRT converter was fixed to force FP16 precision for TensorRT networks.

nv-kkudrynski merged commit 5c33a82 into NVIDIA:master Jun 16, 2021

nv-kkudrynski merged 2 commits into NVIDIA:master from piotrm-nvidia:piotrm/resnet50_pyt_triton_perf_fix
