How to reduce the time running invoke() #68424

Closed
abichoi opened this issue May 22, 2024 · 4 comments
Labels: stale · stat:awaiting response · TF 2.15 · type:support

Comments


abichoi commented May 22, 2024

Issue type

Support

Have you reproduced the bug with TensorFlow Nightly?

No

Source

source

TensorFlow version

tf 2.15

Custom code

Yes

OS platform and distribution

arm64 Debian GNU/Linux 11

Mobile device

No response

Python version

3.9.2

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

No response

Current behavior?

I have converted SSD MobileNet v2 320x320 and SSD MobileNet v2 FPNLite 640x640 into TFLite models, with IoU = 0.3 for both. On a Raspberry Pi 4B (4 GB), I found that invoke() takes 0.22 seconds for the 320x320 model and 2.5 seconds for the FPNLite 640x640 model.
Is there a way to reduce the time needed for invoke() to run?

Standalone code to reproduce the issue

import numpy as np
import tensorflow as tf
import cv2
import time

model_name = "ssd_mobilenet_v2_fpnlite_640x640_iou03"  # conf 0.3
model_path = "MODEL_PATH"

interpreter = tf.lite.Interpreter(model_path=model_path)
interpreter.allocate_tensors()

# Get model details
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
height = input_details[0]['shape'][1]
width = input_details[0]['shape'][2]

floating_model = (input_details[0]['dtype'] == np.float32)

input_mean = 127.5
input_std = 127.5

min_conf_threshold = 0.3
image_path = "IMAGE_PATH"
frame = cv2.imread(image_path)
imH, imW, channels = frame.shape

# OpenCV loads images as BGR; the model expects RGB input.
frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
frame_resized = cv2.resize(frame_rgb, (width, height))

input_data = np.expand_dims(frame_resized, axis=0)
if floating_model:
    input_data = (np.float32(input_data) - input_mean) / input_std

# Perform the actual detection by running the model with the image as input
interpreter.set_tensor(input_details[0]['index'], input_data)
invoke_start = time.time()
interpreter.invoke()
print(time.time() - invoke_start)

# Retrieve detection results (the output tensor order is model-specific)
boxes = interpreter.get_tensor(output_details[1]['index'])[0]    # Bounding box coordinates of detected objects
classes = interpreter.get_tensor(output_details[3]['index'])[0]  # Class index of detected objects
scores = interpreter.get_tensor(output_details[0]['index'])[0]   # Confidence of detected objects

# Draw a box for each detection above the confidence threshold
for i in range(len(scores)):
    if (scores[i] > min_conf_threshold) and (scores[i] <= 1.0):
        ymin = int(max(1, boxes[i][0] * imH))
        xmin = int(max(1, boxes[i][1] * imW))
        ymax = int(min(imH, boxes[i][2] * imH))
        xmax = int(min(imW, boxes[i][3] * imW))

        cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), (10, 255, 0), 2)

cv2.imshow("detections", frame)
cv2.waitKey(0)
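
One note on the timing: the first invoke() after allocation can include one-time setup work, so a warm-up run followed by an average over repeated calls gives a more stable number. A minimal sketch, reusing the interpreter and input_data from above:

# Warm-up: the first invoke() may include one-time initialization cost.
interpreter.invoke()

# Average over repeated runs for a stable latency estimate.
n_runs = 50
start = time.time()
for _ in range(n_runs):
    interpreter.invoke()
print("mean invoke() time: %.4f s" % ((time.time() - start) / n_runs))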

Relevant log output

No response

@sushreebarsa (Contributor)

@abichoi The large difference in inference time between your two models is expected: the FPNLite 640x640 model processes four times as many input pixels as the 320x320 model and adds a feature pyramid, so each invoke() is substantially more expensive. To reduce the invoke() time on your Raspberry Pi 4B, the most direct option is to run the interpreter with multiple threads, since the Pi 4 has four cores:

interpreter = tf.lite.Interpreter(
    model_path="ssd_mobilenet_v2_fpnlite_640x640.tflite",
    num_threads=4)
interpreter.allocate_tensors()

If that is still too slow, consider post-training quantization (see the sketch below) or staying with the 320x320 model if its accuracy is acceptable for your use case.

Thank you!
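
As a rough further option, here is a minimal sketch of post-training dynamic-range quantization, assuming the exported SavedModel directory is still available (the path "saved_model_dir" below is a placeholder). Storing weights as int8 shrinks the model and typically speeds up CPU inference on ARM:

import tensorflow as tf

# Convert the SavedModel with dynamic-range quantization:
# weights are stored as int8 and dequantized on the fly.
# "saved_model_dir" is a placeholder path for the exported model.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant_model = converter.convert()

# Write the quantized model next to the original one.
with open("ssd_mobilenet_v2_fpnlite_640x640_quant.tflite", "wb") as f:
    f.write(tflite_quant_model)

Detection accuracy should be re-checked after quantization, since detection models can be sensitive to it.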

sushreebarsa added the stat:awaiting response and TF 2.15 labels May 28, 2024

github-actions bot commented Jun 5, 2024

This issue is stale because it has been open for 7 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions bot added the stale label Jun 5, 2024

This issue was closed because it has been inactive for 7 days since being marked as stale. Please reopen if you'd like to work on this further.

