This repository has been archived by the owner on Aug 22, 2024. It is now read-only.

body track v1.1.0 has slow memory leak with Tensor and Tensor-fp16 runtimes #1576

Closed
diablodale opened this issue Apr 20, 2021 · 10 comments
Labels: Body Tracking, Bug, External Issue

Comments

@diablodale

diablodale commented Apr 20, 2021

I measure a slow memory leak when BodyTrack v1.1.0 is used with the tensor and tensor+fp16 runtimes.
Memory in use grows by approximately 1 GB every 45 minutes of continuous 30 fps use.

Setup

  • Microsoft Windows [Version 10.0.19042.928]
  • Intel(R) Core(TM) i7-10875H CPU @ 2.30GHz, 8c/16t
  • NVIDIA GeForce RTX 2070 Super w/ 8GB, and integrated Intel UHD Graphics
  • NVIDIA graphics driver: Studio 462.31
  • 32GB ram, 1tb disk
  • VS2019 Community v16.9.3
  • K4A Sensor SDK v1.4.1 and BodyTrack v1.1.0

Repro

  1. Make an app that pulls/gets at 30fps all this data simultaneously: NFOV_UNBINNED depth, 4K color, ir, bodyindex, and skeletons with the regular non-lite onnx model.
  2. Enable the app for easy switching between all the runtimes: cuda, directml, tensor, and tensor+fp16
  3. Get all of the buffers and body data. Display something if you want.
  4. Make a release build
  5. Choose CUDA runtime.
  6. Run it continuously at 30fps for 90 minutes and watch Task Manager/performance/memory/in-use
  7. Repeat previous two steps for all the runtimes. Ensure DirectML uses the NVIDIA card.

Result

  • Tensor and tensor+fp16 leak about 1 GB every 45 minutes. The memory in use gradually increases. In my test of
    tensor+fp16 for 1.5 hours, I also noticed the frame rate dropped from 30 fps to 27 fps, with a noticeable latency of multiple frames.
  • CUDA shows no significant leak. In my test it was less than 0.1 GB which in this inexact test is insignificant.
  • DirectML (with nvidia card) shows no significant leak.
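
Memory in use (GB) by elapsed time; "-" marks a reading that was not recorded:

| Runtime                     | 0 min | 15 min | 30 min | 45 min | 90 min |
|-----------------------------|-------|--------|--------|--------|--------|
| cuda                        | 12.5  | -      | 12.6   | -      | -      |
| directml (with nvidia card) | 12.8  | 12.8   | 12.8   | -      | -      |
| tensor                      | 12.9  | 13.4   | -      | 14.1   | -      |
| tensor+fp16                 | 13.2  | -      | -      | -      | 15.9   |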
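
For a rough sense of scale, the readings above can be turned into an average leak rate. The sketch below is only back-of-the-envelope arithmetic: the function names are hypothetical, it assumes the frame rate stays constant (the tensor+fp16 run actually slowed to 27 fps), and it treats 1 GB as 1e6 KB, which may not match how Task Manager rounds its figures.

```c
#include <assert.h>

/* Average memory growth in GB per hour between two readings. */
double leak_gb_per_hour(double start_gb, double end_gb, double minutes)
{
    assert(minutes > 0.0);
    return (end_gb - start_gb) * 60.0 / minutes;
}

/* Average memory growth in KB per processed frame.
   Assumes a constant frame rate and 1 GB = 1e6 KB. */
double leak_kb_per_frame(double start_gb, double end_gb, double minutes, double fps)
{
    assert(minutes > 0.0 && fps > 0.0);
    double frames = minutes * 60.0 * fps;
    return (end_gb - start_gb) * 1e6 / frames;
}
```

Calling `leak_gb_per_hour(12.9, 14.1, 45.0)` for the tensor readings and `leak_kb_per_frame(13.2, 15.9, 90.0, 30.0)` for the tensor+fp16 readings gives averages on the order of 1.6 GB per hour and roughly 17 KB per frame, which is consistent with the "about 1 GB every 45 minutes" estimate.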

Expected

No leaks.

diablodale added the Body Tracking, Bug, and Triage Needed labels Apr 20, 2021
qm13 self-assigned this Apr 22, 2021
qm13 added the External Issue label and removed the Triage Needed label Apr 22, 2021
@diablodale
Author

diablodale commented Apr 22, 2021

Is it known if the leak is in ONNX code (Microsoft, opensource-ish on github), TensorRT open-source code on github, or TensorRT closed-source code owned by NVIDIA?

@qm13
Collaborator

qm13 commented Apr 23, 2021

Checking with the ONNX team.

@stevenlix

Could you try OnnxRuntime master? We've fixed a couple of memory leak issues lately.

@diablodale
Author

diablodale commented Apr 23, 2021

As a customer, I am not able to. The Azure Kinect Body Tracking code is closed-source and I have no source code or build scripts with which to compile any part of it. It is 2.5GB spread across 19 files.

Three of the files are named onnxruntime.dll, onnxruntime_providers_shared.dll, and onnxruntime_providers_tensorrt.dll. They appear to be custom builds with no version stamps. If someone from Microsoft shares instructions, I can likely build it and try.

@Chris45215

The multiple-frame latency when body tracking is something I have reported several times. There is a way to remove one of those frames, but at 30FPS there are two more frames of latency that I have not been able to remove. I recorded a video documenting it, showing that even in the 5FPS mode there are 2 frames of latency, at https://www.youtube.com/watch?v=7Jc7KhoPWdc. I consider that video to be indisputable.

To remove one of the extra frames, you need to deviate from Microsoft's example code. Microsoft's example code for realtime, lowest-latency-possible output has an error that causes it to buffer an extra frame and thus always be 1 frame further behind than it could be. I provide a fuller description at #816 (comment), along with a fix for the Unity/C# version of the code. In essence, regarding the code line:
k4a_wait_result_t pop_frame_result = k4abt_tracker_pop_result(tracker, &body_frame, 0);

you need to run that line an additional time (potentially several extra times) when you first start the program, as the queue actually gets 2 frames when it is first populated. If you don't clear out the extra frame(s), you're always pulling a slightly stale frame from the queue. That's not to mention any other frames that may accumulate in the queue over the course of the program. I suggest putting that call inside a while loop and checking for additional frames on every iteration, just to be sure.
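
The drain-the-queue idea described above can be sketched roughly as follows. This is a minimal illustration, not SDK code: `frame_t`, `pop_result_t`, `pop_fn`, `release_fn`, `drain_to_latest`, and the mock queue are all hypothetical stand-ins so the example compiles without the Body Tracking SDK. The assumption is that the pop callback behaves like `k4abt_tracker_pop_result` called with a timeout of 0, returning immediately when nothing is queued.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins so this sketch compiles without the SDK. */
typedef struct frame { int id; } frame_t;
typedef enum { POP_SUCCEEDED, POP_TIMEOUT } pop_result_t;

/* pop must not block: it returns POP_TIMEOUT as soon as the queue is empty. */
typedef pop_result_t (*pop_fn)(void *queue, frame_t **out);
typedef void (*release_fn)(frame_t *frame);

/* Pop until the queue is empty, keep only the newest frame, and release
   every stale frame. Returns NULL if nothing was queued. */
frame_t *drain_to_latest(void *queue, pop_fn pop, release_fn release, int *dropped)
{
    assert(pop != NULL && release != NULL && dropped != NULL);
    frame_t *latest = NULL;
    frame_t *next = NULL;
    *dropped = 0;
    while (pop(queue, &next) == POP_SUCCEEDED) {
        if (latest != NULL) {
            release(latest); /* an older frame was still waiting: discard it */
            (*dropped)++;
        }
        latest = next;
    }
    return latest;
}

/* Tiny array-backed queue used only to exercise the helper. */
typedef struct {
    frame_t *items[8];
    int head;
    int count;
} mock_queue_t;

pop_result_t mock_pop(void *queue, frame_t **out)
{
    mock_queue_t *q = (mock_queue_t *)queue;
    if (q->head >= q->count) {
        return POP_TIMEOUT;
    }
    *out = q->items[q->head++];
    return POP_SUCCEEDED;
}

void mock_release(frame_t *frame)
{
    frame->id = -1; /* mark as released so a caller can observe it */
}
```

With the real SDK, the pop callback would presumably wrap `k4abt_tracker_pop_result(tracker, &body_frame, 0)` and the release callback would call `k4abt_frame_release`, so that stale frames are freed rather than leaked. Running the drain on every iteration, not only at startup, also covers frames that accumulate later.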

@qm13
Collaborator

qm13 commented May 25, 2021

We have verified that the memory leak is in the ONNX Runtime. We have also verified that it is fixed in master. ORT v1.8 is expected to release in June and we plan to update the BT SDK to use ORT v1.8.

@diablodale
Author

No solution/alternative/fix/update to resolve the proven memory leak has been provided by Microsoft. This issue has been prematurely closed by the Kinect and Azure teams at Microsoft.

I will not be tracking this issue anymore due to 16 months of no progress or updates to the Kinect Azure platform.

qm13 reopened this Oct 7, 2021
@qm13
Collaborator

qm13 commented Oct 7, 2021

@diablodale yes, we have a solution. Working on releasing it. Sorry, did not mean to close this issue prematurely.

@qm13
Collaborator

qm13 commented Mar 21, 2022

Fixed in v1.1.1

qm13 closed this as completed Mar 21, 2022
@diablodale
Author

diablodale commented May 25, 2022

@qm13, Fixed in what version? For customers to verify unseeable code changes, we need to know the exact version in which it was potentially fixed so that customers can verify, escalate a non-fix, etc.
