
Incorrect tracking on video file #13

Closed
obendidi opened this issue Jul 19, 2017 · 19 comments


obendidi commented Jul 19, 2017

Hello, I've managed to run the tracker with video as input, using YOLOv2 to generate the box detections, but I got really bad tracking results. Here is a snippet of the code I'm using:

import cv2
import numpy as np

# deep_sort imports -- module paths assume the repo layout at the time,
# adjust to your checkout.
from deep_sort import nn_matching
from deep_sort.detection import Detection
from deep_sort.tracker import Tracker
from application_util import preprocessing as prep
import generate_detections

# self.sess, self.out, feed_dict and boxResults come from the surrounding
# darkflow (YOLOv2) code; file and nms_max_overlap are set elsewhere.

metric = nn_matching.NearestNeighborDistanceMetric("cosine", 0.2, 100)
tracker = Tracker(metric)
encoder = generate_detections.create_box_encoder(
    "resources/networks/mars-small128.ckpt-68577")

camera = cv2.VideoCapture(file)
while camera.isOpened():
    _, frame = camera.read()
    if frame is None:
        print('\nEnd of Video')
        break
    h, w, _ = frame.shape
    thick = int((h + w) // 300)

    ################## YOLO part to generate detections ##################
    detections = []
    scores = []
    boxes = self.sess.run(self.out, feed_dict)
    for b in boxes:
        left, right, top, bot, mess, max_indx, confidence = boxResults(b)
        detections.append(np.array([left, top, right - left, bot - top]))
        scores.append(float('%.2f' % confidence))
    detections = np.array(detections)
    ######################################################################

    # Compute appearance features and wrap everything in Detection objects.
    features = encoder(frame, detections)
    detections = [Detection(bbox, score, feature)
                  for bbox, score, feature in zip(detections, scores, features)]

    # Run non-maxima suppression.
    boxes = np.array([d.tlwh for d in detections])
    scores = np.array([d.confidence for d in detections])
    indices = prep.non_max_suppression(boxes, nms_max_overlap, scores)
    detections = [detections[i] for i in indices]

    tracker.predict()
    tracker.update(detections)

    for track, det in zip(tracker.tracks, detections):
        # Tracker output (white).
        bbox = track.to_tlbr()
        cv2.rectangle(frame, (int(bbox[0]), int(bbox[1])),
                      (int(bbox[2]), int(bbox[3])), (255, 255, 255), thick)
        # cv2.putText(frame, str(track.track_id),
        #             (int(bbox[0]), int(bbox[1]) - 12), 0,
        #             1e-3 * h, (255, 255, 255), thick // 3)

        # Detector output (blue).
        bbox = det.to_tlbr()
        cv2.rectangle(frame, (int(bbox[0]), int(bbox[1])),
                      (int(bbox[2]), int(bbox[3])), (255, 0, 0), thick)
        cv2.putText(frame, str(track.track_id),
                    (int(bbox[0]), int(bbox[1]) - 12), 0,
                    1e-3 * h, (255, 0, 0), thick // 3)

    cv2.imshow('', frame)

Here are the results I got after testing: the IDs were not stable, and the white boxes (generated by the tracker) were always thin (width = 0):
video in Google Drive (blue boxes are the YOLO detections, white boxes are the tracker output)
Can someone please help me find the problem and a possible solution?
Thank you!

@obendidi (Author)

I've found that track.is_confirmed() returns False most of the time; that is probably the source of the problem.

nwojke (Owner) commented Jul 19, 2017

Hi bendidi,

in your visualization you go over tracks and detections in a single loop:

for track,det in zip(tracker.tracks,detections):
    # Visualization

but the number of detections and tracks will usually not be the same. Also, the i-th detection at time k is by no means associated with the i-th track at time k. Use two separate loops to visualize detections and tracks, as done here. You might find the OpenCV visualization useful for your purposes as well; that class takes care of visualizing only confirmed tracks.
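
For illustration, here is a minimal sketch of that two-loop visualization, reusing the variable names from the snippet above (the time_since_update check mirrors what the repo's demo application does):

# Draw raw detections (blue in BGR).
for det in detections:
    bbox = det.to_tlbr()
    cv2.rectangle(frame, (int(bbox[0]), int(bbox[1])),
                  (int(bbox[2]), int(bbox[3])), (255, 0, 0), thick)

# Draw tracker output (white): confirmed, recently updated tracks only.
for track in tracker.tracks:
    if not track.is_confirmed() or track.time_since_update > 1:
        continue  # skip tentative tracks and stale, unmatched tracks
    bbox = track.to_tlbr()
    cv2.rectangle(frame, (int(bbox[0]), int(bbox[1])),
                  (int(bbox[2]), int(bbox[3])), (255, 255, 255), thick)
    cv2.putText(frame, str(track.track_id),
                (int(bbox[0]), int(bbox[1]) - 12), 0,
                1e-3 * h, (255, 255, 255), thick // 3)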

If this is not just a visualization error, then I'd guess something is wrong with the Kalman state. Double-check that the width coming out of the filter really is 0, as suggested by your visualization, and check the velocities (e.g., print the object states). If this doesn't help, it would be useful if you could dump the detections that you input to the tracker.

Edit: Also, double-check the bounding box format coming out of the detector is correct.
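
A quick way to print those states, as a sketch (in deep_sort each Track carries an 8-dimensional mean: center x, center y, aspect ratio, height, plus their velocities):

for track in tracker.tracks:
    # track.mean[:4] is the (center x, center y, aspect, height) estimate,
    # track.mean[4:] holds the corresponding velocities.
    print(track.track_id, track.to_tlwh(), track.mean[4:])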

obendidi (Author) commented Jul 20, 2017

I've added the visualization of the detections just to make sure the bounding boxes coming out of the detector are correct. As for the format, the boxes from the detector come as:
(left, right, top, bot) == (xmin, xmax, ymin, ymax)
and I convert them to:
(left, top, right-left, bot-top) == (x, y, w, h)

The width coming out of the tracker really is 0; I've checked with prints:

('is_confirmation', False)
('is_tentative', True)
('is_deleted', False)
('frame', (1024, 1280, 3))
('tracking', array([641,  40,   0,  97]))

('is_confirmation', False)
('is_tentative', True)
('is_deleted', False)
('frame', (1024, 1280, 3))
('tracking', array([615, 197,   0, 126]))

('is_confirmation', False)
('is_tentative', True)
('is_deleted', False)
('frame', (1024, 1280, 3))
('tracking', array([859, 486,   0, 234]))

('is_confirmation', False)
('is_tentative', True)
('is_deleted', False)
('frame', (1024, 1280, 3))
('tracking', array([531, 536,   0, 257]))

('is_confirmation', False)
('is_tentative', True)
('is_deleted', False)
('frame', (1024, 1280, 3))
('tracking', array([1002,  580,    0,  157]))

('is_confirmation', False)
('is_tentative', True)
('is_deleted', False)
('frame', (1024, 1280, 3))
('tracking', array([1084,  725,    0,  204]))

('is_confirmation', True)
('is_tentative', False)
('is_deleted', False)
('frame', (1024, 1280, 3))
('tracking', array([ 988.55564239,  672.82243722,  117.2925128 ,  117.2925128 ]))

frame is the shape of the image, and tracking is the output box of the tracker in (x, y, w, h) format.

Thank you for your help !

chengangfzu commented Jul 20, 2017 via email

nwojke (Owner) commented Jul 20, 2017

My next step in debugging this would be to set the --max_cosine_distance parameter to a very large number, to see how the Kalman filter performs on its own (i.e., whether it still produces zero-width tracks and no confirmed tracks).
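
In the snippet earlier in the thread, which constructs the metric directly rather than through the command-line flag, that corresponds to raising the matching threshold, e.g.:

# Effectively disable the appearance gate so that association is driven
# by the motion model alone (the exact value is arbitrary, just large).
metric = nn_matching.NearestNeighborDistanceMetric("cosine", 1e6, 100)
tracker = Tracker(metric)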

@bhavikajalli
I am facing the same problem. Let me know if you figure it out.

nwojke (Owner) commented Jul 20, 2017

Can one of you dump the detections that you input to the tracker (including the appearance descriptor), in native numpy format as described on the project page? The behavior is hard to debug without a working example.

@lg-code-repo
I also get a width of 0 when using YOLOv2 detections. The bounding boxes of your detection results must be of float type; the default is int. I get a correct result after changing this. Hopefully it helps. But I get a very slow speed with YOLOv2 detections.
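
Concretely, applied to the YOLO snippet earlier in the thread, the user-side fix would presumably look like:

# Build the (x, y, w, h) box as floats so the Kalman filter never
# sees integer arithmetic.
detections.append(np.array([left, top, right - left, bot - top],
                           dtype=float))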

nwojke (Owner) commented Jul 24, 2017

Right, integer types may break the computations inside the Kalman filter. I have committed a fix that forces floating-point types here. A pull from the current master should fix this.
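
The fix presumably amounts to coercing the inputs inside deep_sort/detection.py, along these lines:

class Detection(object):
    # A bounding box detection in (top-left x, top-left y, w, h) format.

    def __init__(self, tlwh, confidence, feature):
        self.tlwh = np.asarray(tlwh, dtype=float)  # force floating point
        self.confidence = float(confidence)
        self.feature = np.asarray(feature, dtype=np.float32)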

obendidi (Author) commented Jul 24, 2017

Thank you @KingInSky @nwojke, that was indeed the problem; I had to cast the detections to float. The results I get are better than the SORT algorithm, but not by much. I also noticed another problem (I don't know if it's normal or not): the width and height of the tracking boxes are half those of YOLO's detections. You can find the output video I tested on here.
PS: YOLO detections are in red, deep_sort output is in white.

Does this influence the tracking, or is it unrelated? And if so, what are possible ways to improve the tracking?
Thank you again :)

nwojke (Owner) commented Jul 25, 2017

If your boxes have the wrong size, then there is a good chance that the appearance descriptor is also computed from the wrong bounding boxes. This could degrade performance.

obendidi (Author) commented Jul 25, 2017

Any idea why this is happening?
I've put my code here if you're interested.

@obendidi (Author)

It seems that there is a problem in this line.

It should be:

ret[2:] = ret[:2] + ret[2:]

instead of:

ret[2:] = ret[:2] + ret[2:] / 2

I get correct tracking now; thanks for the help 👍
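
For context, the corrected Track.to_tlbr() (a sketch; it converts a (top-left x, top-left y, width, height) box to (min x, min y, max x, max y)) reads:

def to_tlbr(self):
    ret = self.to_tlwh()             # fresh (x, y, w, h) array
    ret[2:] = ret[:2] + ret[2:]      # bottom-right = top-left + size
    return ret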

nwojke (Owner) commented Jul 26, 2017

Good catch. It seems Track.to_tlbr() is not used inside the deep_sort code itself, and the visualization in application_util uses Track.to_tlwh(), so this went unnoticed. I have committed a fix.

ghost commented Sep 4, 2017

Does it run in near realtime?

nwojke (Owner) commented Sep 4, 2017

@harshmunshi03 the tracker runs in realtime if you have a decent graphics card for feature generation.

@groverpr
@Bendidi I tried darkflow with deep_sort for tracking, using your repo. For a video with a decent number of people in the scene (20-30), it performs even worse than YOLOv2 with simple IOU-based clustering. How were your final results?

@tonmoyborah
@Bendidi I looked at your warehouse video; it seems there are a lot of ID switches even without any occlusion. Did you find a solution for this?

@Qidian213
I combined deep_sort and YOLOv3; it can track in real time using my laptop's camera:

https://github.com/Qidian213/deep_sort_yolov3
