Eval function has two redundant memcpy #15

Open
ygch opened this issue Aug 4, 2023 · 2 comments

Comments

@ygch

ygch commented Aug 4, 2023

In the "Eval" function of neuron_delegate_kernel.cc, two memcpy calls run on every inference. The first, at line 1237, copies the input data from the tensor into the hardware buffer. The second, at line 1363, copies the inference result back into the tensor. The application already has to copy data before and after calling the TensorFlow Lite "Invoke" function, so these two copies inside the delegate look redundant. When we commented them out, CPU utilization dropped by half and power consumption also went down. Is it possible to optimize this part and eliminate the unnecessary copies?
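To make the cost being described concrete, here is a minimal sketch that counts the bytes moved by a staged path (tensor to hardware buffer, inference, hardware buffer to tensor) against a path that works on the tensor memory directly. Everything in it is a hypothetical stand-in: `Tensor`, `HardwareBuffer`, `CopyStats`, `RunInference`, `EvalWithStaging`, and `EvalDirect` are illustrative names, not the code in neuron_delegate_kernel.cc or any TensorFlow Lite or Neuron API, and the "inference" is a placeholder that adds 1 to each byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a tensor: just a block of owned bytes.
struct Tensor {
  std::vector<unsigned char> data;
};

// Hypothetical stand-in for the delegate's staging buffer.
struct HardwareBuffer {
  std::vector<unsigned char> data;
};

// Records how many bytes were moved by memcpy on a given path.
struct CopyStats {
  std::size_t bytes_copied = 0;
};

// memcpy wrapper that adds the copied size to the running total.
inline void CountedCopy(void* dst, const void* src, std::size_t n,
                        CopyStats& stats) {
  if (n == 0) return;  // avoid calling memcpy with possibly-null pointers
  std::memcpy(dst, src, n);
  stats.bytes_copied += n;
}

// Placeholder "inference": writes in[i] + 1 to out[i].
inline void RunInference(const unsigned char* in, unsigned char* out,
                         std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = static_cast<unsigned char>(in[i] + 1);
  }
}

// Staged path: one copy into the buffer before inference, one copy out after.
inline void EvalWithStaging(const Tensor& input, Tensor& output,
                            CopyStats& stats) {
  assert(input.data.size() == output.data.size());
  const std::size_t n = input.data.size();
  HardwareBuffer hw_in{std::vector<unsigned char>(n)};
  HardwareBuffer hw_out{std::vector<unsigned char>(n)};
  CountedCopy(hw_in.data.data(), input.data.data(), n, stats);
  RunInference(hw_in.data.data(), hw_out.data.data(), n);
  CountedCopy(output.data.data(), hw_out.data.data(), n, stats);
}

// Direct path: inference reads and writes the tensor memory, with no memcpy.
inline void EvalDirect(const Tensor& input, Tensor& output, CopyStats& stats) {
  assert(input.data.size() == output.data.size());
  (void)stats;  // nothing is copied on this path
  RunInference(input.data.data(), output.data.data(), input.data.size());
}
```

In this toy model the staged path moves twice the tensor size per call and the direct path moves nothing, while both produce the same output. Whether the real hardware can consume tensor memory directly is not something this sketch can show.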

@freedomtan
Collaborator

Thanks for reporting this. Yes, we are aware of the "extra" memory copies, which exist because of some legacy internal memory management issues. We are working on it.

@ygch
Author

ygch commented Aug 17, 2023

I tried an optimization that replaces the actual data copy with a pointer copy, which avoids large-scale data copying while still producing correct results. CPU utilization decreased, but power consumption did not drop significantly and the phone still heats up noticeably. Since our project is sensitive to power consumption, are there any further ways to reduce it?
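The difference between a data copy and a pointer copy can be sketched as follows. This is an illustration only, not the change that was actually made to the delegate: `InputBinding`, `BindByCopy`, `BindByPointer`, and `Checksum` are hypothetical names, and the sketch assumes the consumer can read the tensor's memory in place, which the thread does not confirm for the real hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical view of the bytes an accelerator would read.
// It either owns a private copy or borrows the caller's memory.
class InputBinding {
 public:
  // Data copy: duplicates n bytes into storage owned by the binding.
  void BindByCopy(const unsigned char* src, std::size_t n) {
    assert(src != nullptr || n == 0);
    owned_.assign(src, src + n);
    ptr_ = owned_.data();
    size_ = n;
  }

  // Pointer copy: stores only the address and length. The caller must keep
  // the source alive and unmodified until the consumer is done reading.
  void BindByPointer(const unsigned char* src, std::size_t n) {
    assert(src != nullptr || n == 0);
    owned_.clear();
    ptr_ = src;
    size_ = n;
  }

  const unsigned char* data() const { return ptr_; }
  std::size_t size() const { return size_; }
  bool owns_storage() const { return !owned_.empty(); }

 private:
  std::vector<unsigned char> owned_;    // used only by BindByCopy
  const unsigned char* ptr_ = nullptr;  // bytes the consumer reads
  std::size_t size_ = 0;
};

// Placeholder consumer: sums the bound bytes, whichever way they were bound.
inline unsigned Checksum(const InputBinding& binding) {
  unsigned sum = 0;
  for (std::size_t i = 0; i < binding.size(); ++i) {
    sum += binding.data()[i];
  }
  return sum;
}
```

Both bindings give the consumer the same bytes; the pointer version does constant work regardless of tensor size, at the price of a lifetime requirement on the source memory.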
