In the "Eval" function of neuron_delegate_kernel.cc, we observed two memcpy operations during execution: one at line 1237, which copies data from the tensor into the hardware buffer, and one at line 1363, which copies the inference result back into the tensor. Since additional copies already occur before and after the TensorFlow Lite "Invoke" call, these two copies appear redundant. After commenting them out, CPU utilization dropped by half and power consumption was also reduced. Is it possible to optimize this path and eliminate these unnecessary copies?
Thanks for reporting this. Yes, we were aware that there are "extra" memory copies because of some legacy internal memory management issues. We are working on it.
I tried an optimization that replaces the actual data copy with a pointer copy, which avoids the large-scale copying while still producing correct results. CPU utilization decreased, but power consumption did not drop significantly and the phone still heats up noticeably. Since our project is sensitive to power consumption, are there any further ways to reduce it?