
2.12.0: memory leak in TFLite's tflite::Interpreter::Invoke() #66736

Closed
gestalone opened this issue Apr 30, 2024 · 12 comments
Assignees
Labels: comp:lite (TF Lite related issues), TF 2.12 (for issues related to Tensorflow 2.12), TFLiteGpuDelegate (TFLite GPU delegate issue), type:bug (Bug)

Comments

@gestalone

Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

No

Source

binary

TensorFlow version

2.12.0

Custom code

Yes

OS platform and distribution

Cross-build from 'Windows:x86_64' to 'Android:armv8'

Mobile device

Android with Snapdragon 820

Python version

No response

Bazel version

No response

GCC/compiler version

CXX compiler identification is Clang 14.0.7

CUDA/cuDNN version

no

GPU model and memory

Snapdragon 820 with Adreno 530

Current behavior?

Running the invoke for a tflite model using the gpu delegate, with opencl backend.
It runs fast and well; the problem is that there is a memory leak that keeps increasing, and I am not sure how to fix it. I am not sure whether it is an error in the OpenCL implementation, in the Adreno GPU drivers, or in the delegate implementation.

Standalone code to reproduce the issue

#include <tensorflow/lite/model.h>
#include <tensorflow/lite/interpreter.h>
#include <tensorflow/lite/kernels/register.h>
#include <tensorflow/lite/delegates/gpu/delegate.h>
#include <tensorflow/lite/c/common.h>

std::unique_ptr<tflite::FlatBufferModel> m_model;
std::unique_ptr<tflite::Interpreter> m_interpreter;

// m_modelName points at the model's flatbuffer data; m_bufferSize is its size in bytes.
m_model = tflite::FlatBufferModel::BuildFromBuffer(m_modelName, m_bufferSize);

tflite::ops::builtin::BuiltinOpResolver resolver;
tflite::InterpreterBuilder(*m_model, resolver)(&m_interpreter);

TfLiteGpuDelegateOptionsV2 gpu_options = TfLiteGpuDelegateOptionsV2Default();
auto delegategpu = tflite::Interpreter::TfLiteDelegatePtr(
    TfLiteGpuDelegateV2Create(&gpu_options), &TfLiteGpuDelegateV2Delete);

m_interpreter->ModifyGraphWithDelegate(std::move(delegategpu));
m_interpreter->AllocateTensors();

for (int i = 0; i < 1000; i++)
{
  m_interpreter->Invoke();
}

Relevant log output

No response

@gestalone
Author

I am able to reproduce this with the android_aarch64_benchmark_model benchmark from the TF nightly build.

@gestalone
Author

@sushreebarsa can you take a look?

@sushreebarsa sushreebarsa added comp:lite TF Lite related issues TF 2.12 For issues related to Tensorflow 2.12 labels May 6, 2024
@sushreebarsa
Contributor

@gestalone Could you please upgrade to the latest TF version, as memory leak issues are often addressed in subsequent releases? Kindly let us know whether the leak still appears in the latest version, and please check whether your delegate setup supports alternative backends besides OpenCL.
Thank you!

@sushreebarsa sushreebarsa added the stat:awaiting response Status - Awaiting response from author label May 8, 2024
@gestalone
Author

Hi @sushreebarsa, I was able to reproduce it with the 2.16 version of the benchmark. I tried the OpenGL backend of the GPU delegate, but unfortunately it does not work.
It is quite easy to test: just run the Android benchmark with any TensorFlow-approved model using the GPU delegate.
The tutorial can be followed from here:
https://www.tensorflow.org/lite/performance/measurement
I think this should be fixed, as memory leaks cause quite a lot of issues.

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label May 13, 2024
@gestalone
Author

@sawantkumar, any help here?

@sawantkumar

Hi @gestalone ,

Sure thing, let me replicate your issue. However, GPU delegates today primarily use OpenCL as their backend instead of OpenGL. I will get back to you.

@gestalone
Author

gestalone commented May 21, 2024

Hi @sawantkumar!
The issue I had was with the OpenCL backend; I was not able to use OpenGL.

@sawantkumar

sawantkumar commented May 23, 2024

Hi @gestalone ,

I used android_aarch64_benchmark_model on a Pixel 6a to test a tflite model with the command below:

adb shell am start -S \
  -n org.tensorflow.lite.benchmark/.BenchmarkModelActivity \
  --es args '"--graph=/data/local/tmp/efficientnet.tflite \
              --use_gpu=true \
              --num_runs=1000 \
              --report_peak_memory_footprint=true \
              --max_secs=30 \
              --gpu_backend=cl \
              --num_threads=4"'

I used the Android Studio profiler to check the memory used by the "tflite benchmark activity" process, and it didn't show any memory leaks. Memory usage spiked up to 130 MB while the benchmark tool was running, but it came back to normal once the benchmarking was complete. Can you please try your code on a different phone and let me know whether you can replicate the issue there? Also, if possible, can you provide your tflite model to make debugging easier?

@sawantkumar sawantkumar added TFLiteGpuDelegate TFLite Gpu delegate issue stat:awaiting response Status - Awaiting response from author labels May 23, 2024
@gestalone
Author

Hi! I am using a different device and I am not able to reproduce, so I guess it is device related. Maybe the OpenCL drivers? I will check.
Both devices were run with the same command.

@google-ml-butler google-ml-butler bot removed the stat:awaiting response Status - Awaiting response from author label May 23, 2024
@gestalone
Author

I was looking into it a bit and I found this:
https://developer.qualcomm.com/forum/qdn-forums/software/adreno-gpu-sdk/35473
For the model in question, I will try to change the OpenCL build options for the Adreno 530.
I will let you know.

@gestalone
Author

@sawantkumar Hi!
I found the issue. It seems the Snapdragon Profiler program that I use to check the memory interacts badly with the OpenCL runtime. I checked with a different method of measuring memory and the leak is not reproducible. So I guess the issue was running the benchmark (and my code) under the Snapdragon Profiler.
A really strange interaction; it should be reported to Qualcomm.

Best, you can close this, and thanks for all the help.

3 participants