Fix issues in TensorRT EP #8996

Merged
stevenlix merged 3 commits into master from stevenlix/trtengine
Sep 8, 2021

Conversation

@stevenlix
Contributor

  1. Fix the bad_alloc error raised when a large TRT engine is loaded in the TRT EP.
    The issue occurred with a QDQ BERT model whose TRT engine is larger than 3 GB.
  2. Add cuda_cpu_allocator for OrtMemTypeCPUInput.
    The issue was seen when a memcpy node is inserted before the TRT kernel.
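
The description does not say what triggered the bad_alloc. One common cause when a file grows past 2 GiB is holding its size in a 32-bit signed integer, which cannot represent a 3 GB engine; whether that was the cause here is an assumption, not something this page confirms. The sketch below is only an illustration of reading a large serialized file with the byte count kept in `size_t`. `FileBuffer`, `ReadWholeFile`, and `FitsInInt32` are hypothetical names and are not part of ONNX Runtime or TensorRT.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <limits>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical holder for a file's bytes; not an ONNX Runtime type.
struct FileBuffer {
  std::unique_ptr<char[]> data;
  std::size_t size = 0;
};

// Returns true if a byte count can be stored in a 32-bit signed integer.
// A 3 GB engine fails this check, which is why an int-sized length is unsafe.
bool FitsInInt32(std::uint64_t bytes) {
  return bytes <= static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
}

// Hypothetical helper: reads a whole file into a heap buffer, keeping the
// length in size_t so sizes above INT32_MAX are not truncated.
FileBuffer ReadWholeFile(const std::string& path) {
  // Open at the end so tellg() reports the file size.
  std::ifstream file(path, std::ios::binary | std::ios::ate);
  if (!file) throw std::runtime_error("cannot open " + path);

  const std::streamoff end = file.tellg();
  if (end < 0) throw std::runtime_error("cannot determine size of " + path);
  if (static_cast<std::uint64_t>(end) > std::numeric_limits<std::size_t>::max()) {
    throw std::runtime_error("file too large for this platform: " + path);
  }

  FileBuffer buf;
  buf.size = static_cast<std::size_t>(end);
  buf.data = std::make_unique<char[]>(buf.size);  // requires C++14

  file.seekg(0, std::ios::beg);
  file.read(buf.data.get(), static_cast<std::streamsize>(buf.size));
  if (static_cast<std::size_t>(file.gcount()) != buf.size) {
    throw std::runtime_error("short read on " + path);
  }
  return buf;
}
```

Under these assumptions, a caller would pass `buf.data.get()` and `buf.size` on to whatever deserialization call it uses, without narrowing the size to `int` along the way.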

@stevenlix stevenlix requested a review from jywu-msft September 8, 2021 03:29
@stevenlix stevenlix requested a review from a team as a code owner September 8, 2021 03:29
trt_state->context->reset();
trt_state->engine->reset();
*(trt_state->engine) = tensorrt_ptr::unique_pointer<nvinfer1::ICudaEngine>(trt_state->runtime->deserializeCudaEngine(engine_buf.get(), engine_size, nullptr));
LOGS_DEFAULT(VERBOSE) << "[TensorRT EP] DeSerialized " + engine_cache_path;
Member

Did you intend to remove this verbose log line?

Contributor Author

@stevenlix stevenlix Sep 8, 2021

This line duplicates the log statement at line 1495.

@stevenlix stevenlix merged commit 1c872f9 into master Sep 8, 2021
@stevenlix stevenlix deleted the stevenlix/trtengine branch September 8, 2021 17:28
wangyems pushed a commit that referenced this pull request Sep 9, 2021
* fix big engine load issue and add cuda_cpu_alloc

* remove redundancy

* fix minor issues
wangyems added a commit that referenced this pull request Sep 9, 2021
* fast reduction for reducemean (#8976)

* Adding preprocessor checks for torch version during torch cpp extensions compilation (#8989)

* custom autograd func memory refinement  (#8993)

* Release torch tensor referenced by torch gradient graph (created in PythonOp)

* Update orttraining/orttraining/python/training/ortmodule/torch_cpp_extensions/torch_interop_utils/torch_interop_utils.cc

* refine with comments

Co-authored-by: Wei-Sheng Chin <wschin@outlook.com>

* Fix issues in TensorRT EP (#8996)

* fix big engine load issue and add cuda_cpu_alloc

* remove redundancy

* fix minor issues

* [js/web] fix karma launch with chrome headless (#8998)

* Update Nuget Packge Pipline to CUDA11.4 and TensorRT8 on Windows (#9000)

* Update to CUDA11.4 and TensorRT-8.0.3.4

* update trt pool, remove cudnn from setup_env_gpu.bat

* revert pool

* test gpu package pipeline on t4

* back out changes

* back out changes

Co-authored-by: George Wu <jywu@microsoft.com>

* Fix fuzz testing build blocking release. (#9008)

* add model local function support (#8540)

* updates for picking pnnx commit

* add tests filter to c# tests

* plus test fixes

* fix versioning for contrib ops

* fix tests

* test filter for optional ops

* more versioning related updates

* fix test

* fix layernorm spec

* more updates

* update docs

* add more test filters

* more filters

* update binary size threshold

* update docs

* draft - enable model local function

* enable model local functions in ORT

* update to latest rel onnx commit

* plus tests

* plus more updates

* plus updates

* test updates

* Fix for nested functions + shape inference

* plus bug fix and updates per review

* plus fixes per review

* plus test updates

* plus updates per review

* plus fixes

* fix a test

Co-authored-by: Vincent Wang <wangwchpku@outlook.com>
Co-authored-by: baijumeswani <bmeswani@microsoft.com>
Co-authored-by: pengwa <pengwa@microsoft.com>
Co-authored-by: Wei-Sheng Chin <wschin@outlook.com>
Co-authored-by: stevenlix <38092805+stevenlix@users.noreply.github.com>
Co-authored-by: Yulong Wang <yulongw@microsoft.com>
Co-authored-by: Chi Lo <54722500+chilo-ms@users.noreply.github.com>
Co-authored-by: George Wu <jywu@microsoft.com>
Co-authored-by: Pranav Sharma <prs@microsoft.com>
Co-authored-by: Ashwini Khade <askhade@microsoft.com>
4 participants