[To Disable LayerFusion] add copyOutputToHost function for only selected outputs #495

cathy-kim · 2020-04-15T01:52:36Z

When layer fusion has a bug (#380), or when you just want to debug layer fusion,
marking a tensor with mark_output is currently the only way to disable fusion (#252).
https://github.com/NVIDIA/TensorRT/issues/252#issuecomment-577468499

However, latency increases because copyOutputToHost then copies every output from device to host, including the outputs that were marked only to block fusion.
https://github.com/NVIDIA/TensorRT/issues/252#issuecomment-593716265

Therefore, this PR adds two functions to buffers.h that copy only the selected outputs (by binding index) to the host.
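
To make the latency argument concrete, here is a minimal host-only sketch; it is not the PR's code. `Binding`, `bytesCopiedAll`, and `bytesCopiedSelected` are hypothetical names, and the functions only count the bytes a device-to-host copy would move, assuming each marked tensor shows up as one more output binding.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one engine binding; not a TensorRT type.
struct Binding
{
    bool isInput;         // true for input bindings, false for outputs
    std::size_t byteSize; // size of the bound tensor in bytes
};

// Bytes moved when every output binding is copied to the host.
std::size_t bytesCopiedAll(std::vector<Binding> const& bindings)
{
    std::size_t total = 0;
    for (Binding const& b : bindings)
    {
        if (!b.isInput)
        {
            total += b.byteSize;
        }
    }
    return total;
}

// Bytes moved when only the requested bindings are copied; inputs are skipped.
std::size_t bytesCopiedSelected(
    std::vector<Binding> const& bindings, std::vector<int32_t> const& bindingIndices)
{
    std::size_t total = 0;
    for (int32_t i : bindingIndices)
    {
        Binding const& b = bindings.at(static_cast<std::size_t>(i));
        if (!b.isInput)
        {
            total += b.byteSize;
        }
    }
    return total;
}
```

If a large intermediate tensor was marked as an output only to block fusion, leaving its index out of `bindingIndices` removes its bytes from the transfer.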

nvpohanh · 2023-03-16T05:49:14Z

samples/common/buffers.h

+    //!
+    //! \brief Copy the selected contents of output device buffers by bidingIdx to output host buffers synchronously.
+    //!
+    void copyOutputToHost(const std::vector<int>& bindingIndices) 


Thanks for the contribution!

To follow our coding convention, it would be great if you can make the following modifications:

Use east-const style (i.e. const std::vector<int>& -> std::vector<int> const&) for all occurrences of const.

Use int32_t instead of int.

Use 4-whitespace indentation instead of 2-whitespace indentation.

We appreciate it!

nvpohanh · 2023-03-16T05:49:25Z

samples/common/buffers.h

+          const cudaMemcpyKind memcpyType = cudaMemcpyDeviceToHost;
+          if (!mEngine->bindingIsInput(i))
+          {
+            CHECK(cudaMemcpy(dstPtr, srcPtr, byteSize, memcpyType, stream));


cudaMemcpy should be cudaMemcpyAsync
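
For context: the quoted call passes five arguments, but `cudaMemcpy` takes four (`dst`, `src`, `count`, `kind`); the trailing stream parameter belongs to `cudaMemcpyAsync`. The sketch below mirrors the two parameter lists with host-only stand-ins so that it compiles without the CUDA toolkit. `FakeStream`, `FakeKind`, `fakeMemcpy`, `fakeMemcpyAsync`, and `fakeStreamSynchronize` are hypothetical names, not CUDA symbols, and the queue is only a rough model of stream ordering.

```cpp
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical stand-in for a CUDA stream: a queue of deferred operations.
struct FakeStream
{
    std::vector<std::function<void()>> pending;
};

// Hypothetical stand-in for cudaMemcpyKind.
enum class FakeKind
{
    kDeviceToHost
};

// Same parameter shape as cudaMemcpy: no stream, the copy is done on return.
void fakeMemcpy(void* dst, void const* src, std::size_t count, FakeKind /*kind*/)
{
    std::memcpy(dst, src, count);
}

// Same parameter shape as cudaMemcpyAsync: the copy is queued on the stream.
void fakeMemcpyAsync(
    void* dst, void const* src, std::size_t count, FakeKind /*kind*/, FakeStream& stream)
{
    stream.pending.push_back([=]() { std::memcpy(dst, src, count); });
}

// Runs everything queued on the stream, in order.
void fakeStreamSynchronize(FakeStream& stream)
{
    for (auto& op : stream.pending)
    {
        op();
    }
    stream.pending.clear();
}
```

With the real API, the host buffer should likewise be read only after the stream has been synchronized.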

nvpohanh · 2023-03-16T05:50:06Z

samples/common/buffers.h

+    { 
+        for (int i : bindingIndices)
+        {
+          void* dstPtr = mManagedBuffers[i]->hostBuffer.data();


Could we move these inside the if(){....} clause?
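
One reading of this request, sketched with hypothetical host-only types: the pointer lookups are needed only for output bindings, so they can be declared inside the `if` block. `FakeBuffer` and `copySelected` are illustrative names, and `std::copy_n` stands in for the device-to-host copy.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a managed device/host buffer pair.
struct FakeBuffer
{
    bool isInput;
    std::vector<std::uint8_t> device; // pretend device memory
    std::vector<std::uint8_t> host;   // host-side mirror
};

// Copies only the selected output buffers; input bindings are ignored.
void copySelected(std::vector<FakeBuffer>& buffers, std::vector<int32_t> const& bindingIndices)
{
    for (int32_t i : bindingIndices)
    {
        FakeBuffer& buf = buffers.at(static_cast<std::size_t>(i));
        if (!buf.isInput)
        {
            // The lookups live on the only path that uses them.
            buf.host.resize(buf.device.size());
            std::uint8_t* dstPtr = buf.host.data();
            std::uint8_t const* srcPtr = buf.device.data();
            std::size_t const byteSize = buf.device.size();
            std::copy_n(srcPtr, byteSize, dstPtr);
        }
    }
}
```

Scoping the variables this way avoids touching the buffers of input bindings and keeps each declaration next to its single use.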

add copyOutputToHost function for only selected outputs

a70f4c0

rajeevsrao assigned nvpohanh Mar 16, 2023

rajeevsrao added Samples Optimization labels Mar 16, 2023

rajeevsrao changed the base branch from master to release/8.6 March 16, 2023 05:41

nvpohanh requested changes Mar 16, 2023

rajeevsrao force-pushed the release/8.6 branch from c8d112b to c46089f Compare March 17, 2023 04:00
