YOLOv3 odd detection behaviour vs YOLOv2 #17

Closed
abagshaw opened this issue May 1, 2018 · 6 comments


@abagshaw

abagshaw commented May 1, 2018

Here's the same image put through YOLOv2COCO and then YOLOv3COCO respectively (threshold set at 0.4 and input resolution at 416x416 for both):

[Image: YOLOv2 detection result]
[Image: YOLOv3 detection result]

YOLOv3 does pick up the smaller people (as one would expect), but for some reason it seems to be predicting a much smaller bounding box than it should. I've done some experimenting, and similar behavior shows up on quite a few images (sometimes the proper box appears along with a smaller, inner one; in this case, though, only the smaller one appears).

It's possible that this is just a poor prediction on the model's part, but my guess is that it isn't, and that it's due to some problem with the NMS function (almost as if it's doing non-minimal-suppression...if that's a thing 😄 ) or another post-processing step.

@taehoonlee
Owner

Yes, I suspect the post-processing too, because I had just translated the architecture of the original YOLOv3 and concatenated the result boxes from the three different scales (see the code). I think I should carefully investigate how the results from the three scales are combined. Thank you for the comments, @abagshaw.

@abagshaw
Author

abagshaw commented May 3, 2018

@taehoonlee This implementation may be helpful as a reference when fixing the post-processing.

Another thing I was a little confused about: I remember reading that YOLOv3 does not use softmax on the class outputs, but instead allows more than one class to be assigned to an object (should the confidence of more than one class exceed the threshold). I think the current post-processing implementation in tensornets only allows single-class predictions.
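
To make the difference concrete, here is a minimal sketch in plain Python. It is not the tensornets code or the original Darknet code; the function names, the class labels, the logits, and the way objectness is multiplied into each class score are all assumptions made for illustration. It contrasts a softmax-style single-class decision with independent per-class sigmoids checked against a threshold.

```python
import math


def softmax(logits):
    """Normalize logits so the class probabilities sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Independent per-class probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def single_label(logits, names):
    """Softmax-style decision: exactly one class per box."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return names[best]


def multi_label(logits, names, objectness=1.0, threshold=0.4):
    """Sigmoid-style decision: keep every class whose score passes the threshold.

    Assumption: the score is objectness * sigmoid(class logit).
    """
    return [n for n, x in zip(names, logits)
            if objectness * sigmoid(x) > threshold]


# Hypothetical labels and logits for one box (not real model output).
names = ["person", "woman", "dog"]
logits = [2.0, 1.5, -3.0]

print(single_label(logits, names))  # one label only
print(multi_label(logits, names))   # every label above the threshold
```

With these made-up logits, the softmax path can only report one label, while the sigmoid path reports both overlapping labels because each one is scored independently.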

@taehoonlee
Owner

@abagshaw, I got it. The Keras implementation you mentioned concatenates the result boxes of the three different scales, like TensorNets. But it does NMS after the concat, while TensorNets does the concat after NMS. I'll compare the two approaches and let you know.
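
For reference, the two orderings can be sketched in plain Python as below. This is only an illustration under assumptions, not the TensorNets or Keras code: the helper names, the greedy single-class NMS, the IoU threshold of 0.5, and the two sample boxes are all hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(dets, iou_threshold=0.5):
    """Greedy NMS over (box, score) pairs of a single class."""
    kept = []
    for box, score in sorted(dets, key=lambda d: d[1], reverse=True):
        # Keep a box only if it does not overlap too much with a kept one.
        if all(iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, score))
    return kept


def concat_then_nms(per_scale, iou_threshold=0.5):
    """Merge the boxes of all scales first, then suppress across scales."""
    merged = [d for scale in per_scale for d in scale]
    return nms(merged, iou_threshold)


def nms_then_concat(per_scale, iou_threshold=0.5):
    """Suppress within each scale only, then merge the survivors."""
    return [d for scale in per_scale for d in nms(scale, iou_threshold)]


# Hypothetical detections of the same object from two scales.
coarse = [((10.0, 10.0, 110.0, 210.0), 0.9)]  # full-size box
fine = [((20.0, 20.0, 105.0, 200.0), 0.6)]    # smaller inner box

print(len(concat_then_nms([coarse, fine])))  # inner box suppressed
print(len(nms_then_concat([coarse, fine])))  # both boxes survive
```

In this toy case the inner box overlaps the outer one heavily, so it is only removed when NMS sees the boxes from both scales together. Running NMS per scale leaves duplicates from different scales in the final output.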

@taehoonlee
Owner

@abagshaw, I confirmed the softmax issue and will revise the post-processing to match the original one. Thank you!

@taehoonlee
Owner

@abagshaw, I fixed the NMS issue first. The update improved the mAP on VOC2007 by 2%.
Could you check to see if the smaller bbox problem is still happening?

@abagshaw
Author

abagshaw commented May 9, 2018

@taehoonlee Problem solved! It's working beautifully now as far as I can see. Thanks so much!

abagshaw closed this as completed May 9, 2018