How to reduce the time running invoke() #68424

Closed
abichoi opened this issue May 22, 2024 · 4 comments
Labels: stale · stat:awaiting response · TF 2.15 · type:support

Comments


abichoi commented May 22, 2024

Issue type

Support

Have you reproduced the bug with TensorFlow Nightly?

No

Source

source

TensorFlow version

tf 2.15

Custom code

Yes

OS platform and distribution

arm64 Debian GNU/Linux 11

Mobile device

No response

Python version

3.9.2

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

No response

Current behavior?

I have converted SSD MobileNet v2 320x320 and SSD MobileNet v2 FPNLite 640x640 into TFLite models, with IoU = 0.3 for both. On a Raspberry Pi 4B (4 GB), I found that invoke() takes 0.22 seconds for the 320x320 model and 2.5 seconds for the FPNLite 640x640 model.
Is there a way to reduce the time needed for invoke() to run?

Standalone code to reproduce the issue

import numpy as np
import tensorflow as tf
import cv2
import time

model_name = "ssd_mobilenet_v2_fpnlite_640x640_iou03"  # conf 0.3
model_path = "MODEL_PATH"

interpreter = tf.lite.Interpreter(model_path=model_path)
interpreter.allocate_tensors()

# Get model details
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
height = input_details[0]['shape'][1]
width = input_details[0]['shape'][2]

floating_model = (input_details[0]['dtype'] == np.float32)

input_mean = 127.5
input_std = 127.5

min_conf_threshold = 0.3
image_path = "IMAGE_PATH"
frame = cv2.imread(image_path)
imH, imW, channels = frame.shape

# OpenCV loads images as BGR; the model expects RGB input.
frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
frame_resized = cv2.resize(frame_rgb, (width, height))

input_data = np.expand_dims(frame_resized, axis=0)
if floating_model:
    input_data = (np.float32(input_data) - input_mean) / input_std

# Perform the actual detection by running the model with the image as input
interpreter.set_tensor(input_details[0]['index'], input_data)
invoke_start = time.time()
interpreter.invoke()
print(time.time() - invoke_start)

# Retrieve detection results (the output tensor order is model-specific)
boxes = interpreter.get_tensor(output_details[1]['index'])[0]    # Bounding box coordinates of detected objects
classes = interpreter.get_tensor(output_details[3]['index'])[0]  # Class index of detected objects
scores = interpreter.get_tensor(output_details[0]['index'])[0]   # Confidence of detected objects

# Draw a box for each detection above the confidence threshold
for i in range(len(scores)):
    if (scores[i] > min_conf_threshold) and (scores[i] <= 1.0):
        ymin = int(max(1, boxes[i][0] * imH))
        xmin = int(max(1, boxes[i][1] * imW))
        ymax = int(min(imH, boxes[i][2] * imH))
        xmax = int(min(imW, boxes[i][3] * imW))

        cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), (10, 255, 0), 2)

cv2.imshow("detections", frame)
cv2.waitKey(0)
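
One note on the timing: the first invoke() after allocation can include one-time setup work, so a warm-up run followed by an average over repeated calls gives a more stable number. A minimal sketch, reusing the interpreter and input_data from above:

# Warm-up: the first invoke() may include one-time initialization cost.
interpreter.invoke()

# Average over repeated runs for a stable latency estimate.
n_runs = 50
start = time.time()
for _ in range(n_runs):
    interpreter.invoke()
print("mean invoke() time: %.4f s" % ((time.time() - start) / n_runs))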

Relevant log output

No response

@sushreebarsa (Contributor)

@abichoi The large difference in inference time between your two models is expected: the FPNLite 640x640 model processes four times as many input pixels as the 320x320 model and adds a feature pyramid, so each invoke() is substantially more expensive. To reduce the invoke() time on your Raspberry Pi 4B, the most direct option is to run the interpreter with multiple threads, since the Pi 4 has four cores:

interpreter = tf.lite.Interpreter(
    model_path="ssd_mobilenet_v2_fpnlite_640x640.tflite",
    num_threads=4)
interpreter.allocate_tensors()

If that is still too slow, consider post-training quantization (see the sketch below) or staying with the 320x320 model if its accuracy is acceptable for your use case.

Thank you!
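
As a rough further option, here is a minimal sketch of post-training dynamic-range quantization, assuming the exported SavedModel directory is still available (the path "saved_model_dir" below is a placeholder). Storing weights as int8 shrinks the model and typically speeds up CPU inference on ARM:

import tensorflow as tf

# Convert the SavedModel with dynamic-range quantization:
# weights are stored as int8 and dequantized on the fly.
# "saved_model_dir" is a placeholder path for the exported model.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant_model = converter.convert()

# Write the quantized model next to the original one.
with open("ssd_mobilenet_v2_fpnlite_640x640_quant.tflite", "wb") as f:
    f.write(tflite_quant_model)

Detection accuracy should be re-checked after quantization, since detection models can be sensitive to it.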

sushreebarsa added the stat:awaiting response and TF 2.15 labels May 28, 2024

github-actions bot commented Jun 5, 2024

This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions bot added the stale label Jun 5, 2024

This issue was closed because it has been inactive for 7 days since being marked as stale. Please reopen if you'd like to work on this further.

