
detection boxes are carried over when inferencing multiple videos #417

Closed
ulrichMarco opened this issue Jul 9, 2021 · 9 comments

@ulrichMarco

OS: Ubuntu 20.04
Edgetpu Device: Coral USB Accellerator
Python: 3.8.5
tflite runtime: 2.5.0

Hey,
I encountered a rather strange behaviour when trying to run inference on several videos/video streams. A detection box that is detected in one frame of one video is carried over to a frame of another video and displayed there as well. When processing only one video, everything is fine.
Example of my error:
error_example.zip

Here is a minimal working code example of what I'm doing:
detect.py:

import cv2
import numpy as np
import tflite_runtime.interpreter as tflite

def post(index, output_data, image):
    # Read the four SSD postprocess outputs; get_tensor() returns copies,
    # so writing into these arrays does not change the interpreter's state.
    boxes = interpreter.get_tensor(output_details[0]['index'])
    classes = interpreter.get_tensor(output_details[1]['index'])
    scores = interpreter.get_tensor(output_details[2]['index'])
    num = interpreter.get_tensor(output_details[3]['index'])  # read but unused

    image_h, image_w, image_c = image.shape
    for i, score in enumerate(scores[0]):
      if 0.4 <= score <= 1:
        # Boxes are normalized [ymin, xmin, ymax, xmax]; scale to pixel coordinates.
        (ymin, xmin, ymax, xmax) = boxes[0][i]
        (left, right, top, bottom) = (int(xmin*image_w), int(xmax*image_w), int(ymin*image_h), int(ymax*image_h))
        image = cv2.rectangle(image, (left, top), (right, bottom), colors[int(classes[0][i])], 3)
        image = cv2.putText(image, '%s  %.2f' % (labels[int(classes[0][i])].split()[1], scores[0][i]),
                            (left + 10, top + 30), cv2.FONT_HERSHEY_SIMPLEX, 1, colors[int(classes[0][i])], 2, cv2.LINE_AA)
        scores[0][i] = 0  # only zeroes the local copy, not the tensor itself
    image = cv2.resize(image, (640, 360))
    cv2.imshow(str(index), image)

def read_lines_from_file(filename):
  with open(filename, 'r') as f:
    return [line.strip() for line in f.readlines()]


input_adresses = read_lines_from_file('videos.txt')
labels = read_lines_from_file("labels.txt")
colors= [(255,0,0),(255,255,0),(0,255,0),(0,255,255),(0,0,255),(255,0,255)]
caps = []

for i, addr in enumerate(input_adresses):
    cap = cv2.VideoCapture(addr)
    caps.append(cap)
    
# Load the Edge TPU delegate so inference runs on the Coral USB Accelerator.
interpreter = tflite.Interpreter('output_tflite_graph_edgetpu6.tflite', experimental_delegates=[tflite.load_delegate('libedgetpu.so.1')])
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

in_height = input_details[0]['shape'][1]
in_width = input_details[0]['shape'][2]
    
while 1:
   for i, cap in enumerate(caps):
     ret0, img0 = cap.read()
     if ret0:
       # Resize the frame to the model's expected input size.
       img1 = cv2.resize(img0, (in_width, in_height))
       #cv2.imshow(str(i),img1)

       input_data = np.expand_dims(img1, axis=0)
       interpreter.set_tensor(input_details[0]['index'], input_data)
       interpreter.invoke()
       output_data = interpreter.get_tensor(output_details[0]['index'])
       post(i, output_data, img0)

   if cv2.waitKey(5) & 0xFF == ord('q'):
      break
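For completeness, the example skips cleanup on exit; releasing the captures and closing the windows afterwards would look like this:

for cap in caps:
    cap.release()
cv2.destroyAllWindows()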

Code ready to use, with some example videos, is available at https://drive.google.com/file/d/1tlWngd_hWXsWBpjk3Q6ST-3EIJHSOe0i/view?usp=sharing. Just download it and run detect.py.

The code is meant to process multiple RTSP streams, always using the newest frame (not implemented in the error example, to reduce confusion), which means I can't simply process the videos one after another.

Is there a way to clear the old detections before moving on to a new frame, without having to reload the model onto the USB Accelerator?
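For what it's worth, the only idea I have so far is a sketch (untested on the Edge TPU) based on the TFLite SSD postprocess convention that the fourth output, num in post() above, reports how many detections are valid for the current frame; entries beyond that index could hold leftovers from an earlier invoke():

# Sketch: only walk the first int(num[0]) entries of the output tensors;
# anything past that index may be stale.
for i in range(int(num[0])):
    if scores[0][i] >= 0.4:
        (ymin, xmin, ymax, xmax) = boxes[0][i]
        # ... draw the box exactly as in post() above ...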

@hjonnala hjonnala self-assigned this Jul 9, 2021
@hjonnala
Contributor

Hi, does adding the line interpreter.reset_all_variables() in the while loop, after the post function call, solve your issue?

while 1:
   for i,cap in enumerate(caps):
     ret0, img0 = cap.read()
     if ret0:
       img1 = cv2.resize(img0,(in_width,in_height))
       #cv2.imshow(str(i),img1)
     
       input_data = np.expand_dims(img1, axis=0)
       interpreter.set_tensor(input_details[0]['index'], input_data)
       interpreter.invoke()
       output_data = interpreter.get_tensor(output_details[0]['index'])
       post(i, output_data, img0)
       interpreter.reset_all_variables()  # reset variable tensors to their initial values

@ulrichMarco
Author

Hi, sadly this doesn't resolve the problem. Detections are still carried over to multiple frames.

@ulrichMarco
Author

@hjonnala
Anything new about this issue?
Do you see similar results when you run the code, just to confirm that this isn't due to defective hardware or anything else on my side?

@hjonnala
Contributor

hjonnala commented Jul 16, 2021

Hi,

I also see similar results, but I don't think it is due to the hardware. It looks more like an OpenCV issue.

For example, you can refer to these threads:
https://stackoverflow.com/questions/56002672/display-an-image-over-another-image-at-a-particular-co-ordinates-in-opencv

https://answers.opencv.org/question/173100/imshow-sometimes-dont-clear-previous-image/

@ulrichMarco
Author

@hjonnala
I don't think it's an OpenCV issue. I use OpenCV only for reading images, drawing the detection boxes, and showing the output images. If it were an OpenCV issue, I wouldn't get the exact same box count when I print the number of detection boxes directly from the output data, like so:

count = 0
for i, score in enumerate(scores[0]):
  if 0.4 <= score <= 1:
    count += 1
    (ymin, xmin, ymax, xmax) = boxes[0][i]
    (left, right, top, bottom) = (int(xmin*image_w), int(xmax*image_w), int(ymin*image_h), int(ymax*image_h))
    image = cv2.rectangle(image, (left, top), (right, bottom), colors[int(classes[0][i])], 3)
    image = cv2.putText(image, '%s  %.2f' % (labels[int(classes[0][i])].split()[1], scores[0][i]),
                        (left + 10, top + 30), cv2.FONT_HERSHEY_SIMPLEX, 1, colors[int(classes[0][i])], 2, cv2.LINE_AA)
    scores[0][i] = 0
# This count comes straight from the interpreter's output, before any drawing.
print("Boxes: " + str(count))
image = cv2.resize(image, (640, 360))
cv2.imshow(str(index), image)
cv2.waitKey()
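A small diagnostic along the same lines, assuming num is the model's num-detections output as read in post() above: if the thresholded count exceeds the reported count, the extra boxes are stale output entries, not anything OpenCV drew.

reported = int(num[0])  # detections the model says are valid this frame
print("Boxes above threshold: %d, reported by model: %d" % (count, reported))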

@hjonnala
Contributor

Hmm, I have also tried cv2.destroyWindow and it does not solve the issue.
I will let you know if I find a way to clear the old detections before moving on to a new frame.

@ulrichMarco
Author

Solved this issue by rewriting the complete project in C++.

@hjonnala
Contributor

Awesome. What do you think is causing the issue with Python? Is it tflite_runtime, OpenCV, or something else?

@ulrichMarco
Author

I don't think it was OpenCV; the images looked good before feeding them into the interpreter, and the drawn boxes matched the ones the interpreter returned.
So I assume it might be some tflite_runtime issue where the interpreter data is not cleared correctly before the next image. But I'm not deep enough into it to be sure.
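For anyone hitting the same thing in Python: one untested workaround, an assumption rather than anything confirmed in this thread, is to give each stream its own interpreter so no output state is shared between videos. Model name and variables are taken from the script above; on a single Edge TPU this may add model-swap overhead.

# Hypothetical workaround: one interpreter per video stream, so the outputs
# of one stream can never leak into another.
interpreters = []
for _ in input_adresses:
    it = tflite.Interpreter(
        'output_tflite_graph_edgetpu6.tflite',
        experimental_delegates=[tflite.load_delegate('libedgetpu.so.1')])
    it.allocate_tensors()
    interpreters.append(it)
# In the main loop, run caps[i] through interpreters[i] instead of the shared one.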
