Fix issues in TensorRT EP by stevenlix · Pull Request #8996 · microsoft/onnxruntime

stevenlix · 2021-09-08T03:29:57Z

Fix a bad_alloc error when a large TRT engine is loaded in the TRT EP.
The issue occurred with a QDQ BERT model whose TRT engine is larger than 3 GB.
Add cuda_cpu_allocator for OrtMemTypeCPUInput.
The issue was seen when a memcpy node is inserted before the TRT kernel.
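
To illustrate the first item, here is a minimal sketch of reading a serialized engine file into memory while keeping its size in `std::size_t`. The description does not state the root cause, so this assumes one plausible failure mode: a byte count above 2 GiB no longer fits in a 32-bit `int`, and a wrapped value passed to an allocation can surface as `std::bad_alloc`. `ExceedsInt32`, `EngineBlob`, and `ReadEngineFile` are hypothetical names for illustration, not identifiers from this PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <ios>
#include <limits>
#include <memory>
#include <string>

// Hypothetical helper (not from the PR): true when a byte count cannot be
// represented in a 32-bit signed int, e.g. a 3 GB engine file.
inline bool ExceedsInt32(std::size_t n) {
  return n > static_cast<std::size_t>(std::numeric_limits<std::int32_t>::max());
}

// Hypothetical container for the raw engine bytes and their length.
struct EngineBlob {
  std::unique_ptr<char[]> data;
  std::size_t size = 0;
};

// Hypothetical helper: read a whole file into a heap buffer. The size is held
// in std::size_t from start to finish, so it is never narrowed to int.
// Returns an empty blob (size == 0, data == nullptr) on any failure.
inline EngineBlob ReadEngineFile(const std::string& path) {
  EngineBlob blob;
  // Open at the end so tellg() reports the file size.
  std::ifstream file(path, std::ios::binary | std::ios::ate);
  if (!file) return blob;

  const std::streamoff end = file.tellg();
  if (end < 0) return blob;

  blob.size = static_cast<std::size_t>(end);
  blob.data = std::make_unique<char[]>(blob.size);  // requires C++14

  file.seekg(0, std::ios::beg);
  file.read(blob.data.get(), static_cast<std::streamsize>(blob.size));
  if (!file) {
    // Short read or I/O error: discard the partial buffer.
    blob.data.reset();
    blob.size = 0;
  }
  return blob;
}
```

A caller would pass `blob.data.get()` and `blob.size` to the deserialization call, as the diff under review does with `engine_buf.get()` and `engine_size`.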

jywu-msft · 2021-09-08T03:40:35Z

onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc

          trt_state->context->reset();
          trt_state->engine->reset();
          *(trt_state->engine) = tensorrt_ptr::unique_pointer<nvinfer1::ICudaEngine>(trt_state->runtime->deserializeCudaEngine(engine_buf.get(), engine_size, nullptr));
-          LOGS_DEFAULT(VERBOSE) << "[TensorRT EP] DeSerialized " + engine_cache_path;


Did you intend to remove this verbose log line?

This line duplicates the one at line 1495.

* fix big engine load issue and add cuda_cpu_alloc * remove redundancy * fix minor issues

* fast reduction for reducemean (#8976)
* Adding preprocessor checks for torch version during torch cpp extensions compilation (#8989)
* custom autograd func memory refinement (#8993)
  * Release torch tensor referenced by torch gradient graph (created in PythonOp)
  * Update orttraining/orttraining/python/training/ortmodule/torch_cpp_extensions/torch_interop_utils/torch_interop_utils.cc
  * refine with comments

  Co-authored-by: Wei-Sheng Chin <wschin@outlook.com>
* Fix issues in TensorRT EP (#8996)
  * fix big engine load issue and add cuda_cpu_alloc
  * remove redundancy
  * fix minor issues
* [js/web] fix karma launch with chrome headless (#8998)
* Update Nuget Packge Pipline to CUDA11.4 and TensorRT8 on Windows (#9000)
  * Update to CUDA11.4 and TensorRT-8.0.3.4
  * update trt pool, remove cudnn from setup_env_gpu.bat
  * revert pool
  * test gpu package pipeline on t4
  * back out changes
  * back out changes

  Co-authored-by: George Wu <jywu@microsoft.com>
* Fix fuzz testing build blocking release. (#9008)
* add model local function support (#8540)
  * updates for picking pnnx commit
  * add tests filter to c# tests
  * plus test fixes
  * fix versioning for contrib ops
  * fix tests
  * test filter for optional ops
  * more versioning related updates
  * fix test
  * fix layernorm spec
  * more updates
  * update docs
  * add more test filters
  * more filters
  * update binary size threshold
  * update docs
  * draft - enable model local function
  * enable model local functions in ORT
  * update to latest rel onnx commit
  * plus tests
  * plus more updates
  * plus updates
  * test updates
  * Fix for nested functions + shape inference
  * plus bug fix and updates per review
  * plus fixes per review
  * plus test updates
  * plus updates per review
  * plus fixes
  * fix a test

Co-authored-by: Vincent Wang <wangwchpku@outlook.com>
Co-authored-by: baijumeswani <bmeswani@microsoft.com>
Co-authored-by: pengwa <pengwa@microsoft.com>
Co-authored-by: Wei-Sheng Chin <wschin@outlook.com>
Co-authored-by: stevenlix <38092805+stevenlix@users.noreply.github.com>
Co-authored-by: Yulong Wang <yulongw@microsoft.com>
Co-authored-by: Chi Lo <54722500+chilo-ms@users.noreply.github.com>
Co-authored-by: George Wu <jywu@microsoft.com>
Co-authored-by: Pranav Sharma <prs@microsoft.com>
Co-authored-by: Ashwini Khade <askhade@microsoft.com>

stevenlix added 3 commits September 7, 2021 20:05

fix big engine load issue and add cuda_cpu_alloc

071089b

remove redundancy

38827ae

fix minor issues

86b950b

stevenlix requested a review from jywu-msft September 8, 2021 03:29

stevenlix requested a review from a team as a code owner September 8, 2021 03:29

stevenlix added the release:1.9 label Sep 8, 2021

jywu-msft reviewed Sep 8, 2021

View reviewed changes

jywu-msft approved these changes Sep 8, 2021

View reviewed changes

stevenlix merged commit 1c872f9 into master Sep 8, 2021

stevenlix deleted the stevenlix/trtengine branch September 8, 2021 17:28

faxu added the triage:approved label Sep 9, 2021

wangyems pushed a commit that referenced this pull request Sep 9, 2021

Fix issues in TensorRT EP (#8996)

b9e3b5c

* fix big engine load issue and add cuda_cpu_alloc * remove redundancy * fix minor issues

wangyems removed the release:1.9 label Sep 9, 2021

stevenlix merged 3 commits into master from stevenlix/trtengine

4 participants
