Evaluation output names for SipMask-VIS #11
@gauravmunjal13 Hi. Thanks for your interest. (1) The video ids used for evaluation and those in the saved results can be different, because the code that saves results was written by me while the evaluation code is provided by the official dataset. I don't think they need to match, since we only need to visualize the results in order; for evaluation, you should follow the official dataset. (2) With our provided model, we do not get an AP of 0, so I am not sure about the cause in your case. (3) It skips frames whose segmentation is None. If you want to save results for every image, you can set a lower score threshold in the config file.
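For point (3), the threshold lives in the mmdetection-style `test_cfg` of the config file. A minimal illustrative fragment (the field names follow mmdetection conventions, but the values here are examples, not the repo's defaults):

```python
# Illustrative mmdetection-style test_cfg fragment (example values only).
# Lowering score_thr keeps more low-confidence detections, so fewer
# frames end up with a None segmentation in the saved results.
test_cfg = dict(
    nms_pre=1000,                        # candidates kept before NMS
    score_thr=0.03,                      # lower this to keep weaker detections
    nms=dict(type='nms', iou_thr=0.5),   # standard NMS settings
    max_per_img=100,                     # detections kept per image
)
```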
Thanks, @JialeCao001, for your response! It's true that frames whose segmentations are None are ignored while saving. However, even among the saved frames (those with segmentations), not all contain a visible object. When I plotted these segmentations, I found they are small. Are they being ignored because of their small size? On that note, I would like to know your thoughts on using SipMask for detecting very small objects.

Could you please point me to where you are saving the frames/results? I came across the function show_results() in inference.py under mmdet/apis, but it doesn't save the results.

Meanwhile, I wrote a code snippet to plot and save results from the results.pkl.json output file so that I can filter results by score. The segmentations displayed by your code are correct, but the ones produced by my snippet are drifted (it looks like a size or scale issue). The same snippet works correctly with the results.pkl.json file from MaskTrack-RCNN. Does your output results file differ in some way from the one generated by MaskTrackRCNN? I am decoding the segmentations and applying them to the image as:
where apply_mask() is:
Lastly, the Readme suggests there are two versions of SipMask: High-Accuracy and Real-Time Fast. How do I know which one I am using, and how do I switch to the other?
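The two snippets referenced above were not preserved in this thread. As a hedged stand-in, here is a minimal, numpy-only sketch of the decode-and-overlay flow being described: `apply_mask` is a hypothetical helper (modeled on common Mask R-CNN visualization utilities, not the poster's actual code), and the RLE decode step that would normally use pycocotools is shown as a comment:

```python
import numpy as np

def apply_mask(image, mask, color, alpha=0.5):
    """Alpha-blend a binary mask onto an HxWx3 image.

    Hypothetical helper for illustration: color is an RGB triple in [0, 1],
    alpha controls overlay opacity.
    """
    out = image.copy().astype(np.float32)
    for c in range(3):
        out[:, :, c] = np.where(
            mask == 1,
            out[:, :, c] * (1 - alpha) + alpha * color[c] * 255,
            out[:, :, c],
        )
    return out.astype(np.uint8)

# The segmentations in results.pkl.json are COCO-style RLE dicts; they
# would normally be decoded first, e.g.:
#   from pycocotools import mask as maskUtils
#   binary_mask = maskUtils.decode(seg)   # HxW uint8 array
```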
@gauravmunjal13 Hi. I cannot follow everything you say, but I will try to answer your questions.
(4) Instance segmentation on images has two versions of SipMask. For video instance segmentation, we only provide SipMask-VIS.
Many thanks, @JialeCao001, for your comments! They are really useful and I appreciate your help. I may have figured out why my code for plotting and saving the results was not working properly, which may point to a discrepancy in the results.pkl.json file. The input to the model is images and segmentations of size (512, 512). The model produces results.pkl.json, in which the segmentations are of size (512, 512), and the masks obtained after decoding them are also (512, 512). However, these segmentations appear drifted towards the top left and smaller in size, while they are correct in your saved results. My analysis is that you are saving the images at size (360, 360) while the segmentations are of size (512, 512). Does this mean there is a discrepancy in producing the results file, or am I missing something? Thanks!
@gauravmunjal13 Hi. If you save images with the following code, do you get the right results?
Yes! But the resulting images are resized to (360, 360), while the input images were of size (512, 512). So if I take the results.pkl.json file and apply the segmentations to the input images, the result is not correct.
@gauravmunjal13 Okay, I see what you mean. The saved image is the same as the rescaled input image to the network, which may differ from the original image. When writing the json file, the code rescales the bounding boxes back.
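The rescale-back step described here can be sketched in one line. In mmdetection, `scale_factor` is the ratio of the resized network-input size to the original image size, so mapping predicted boxes back to original coordinates is a division (a sketch of the idea, not the repo's exact code):

```python
import numpy as np

def rescale_bboxes(bboxes, scale_factor):
    """Map boxes predicted at the resized network-input scale back to
    original image coordinates.

    scale_factor = resized_size / original_size, as in mmdetection.
    bboxes is an Nx4 array of [x1, y1, x2, y2] rows.
    """
    return np.asarray(bboxes, dtype=np.float32) / scale_factor
```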
Perhaps the rescaling is not happening correctly. I followed the following steps to apply the segmentations from the results.pkl.json file to input images of size (512, 512); let me know if I am wrong. I used your code (the show_result() method in base.py) as the reference.

However, I still need to solve the evaluation problem, since the AP is 0. In the method ytvos_eval() in coco_utils.py, the detections are loaded from results.pkl.json as predictions, while the ground-truth annotations are loaded as ytvos from the input annotation file, which are of size (512, 512). Since the predicted segmentations may not be rescaled correctly back to the original input size, the AP comes out as 0. What do you think? Many thanks, @JialeCao001, for your support!
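If the masks in results.pkl.json are indeed stored at the network-input resolution while the ground truth is (512, 512), evaluating them requires resizing each decoded mask back to the original size. A minimal sketch of that step, using a numpy nearest-neighbor resize as a dependency-free stand-in for `cv2.resize(..., interpolation=cv2.INTER_NEAREST)`:

```python
import numpy as np

def resize_mask_nearest(mask, out_h, out_w):
    """Nearest-neighbor resize of a binary HxW mask to (out_h, out_w).

    A stand-in for cv2.resize with INTER_NEAREST; nearest-neighbor keeps
    the mask binary, which bilinear interpolation would not.
    """
    in_h, in_w = mask.shape
    rows = np.arange(out_h) * in_h // out_h   # source row for each output row
    cols = np.arange(out_w) * in_w // out_w   # source col for each output col
    return mask[rows][:, cols]
```

For example, a mask decoded at (360, 360) would be passed through `resize_mask_nearest(mask, 512, 512)` before comparing against (512, 512) ground-truth annotations.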
Hi @JialeCao001, let me know if I can provide more information or if anything in my explanation is unclear. Thanks!
@gauravmunjal13 I am not sure about your problem. I do not get an mAP of 0 on the YouTube-VIS test set.
Hi @JialeCao001, but the following command doesn't produce the correct output file (results.pkl.json), in that the annotations aren't resized back to the original size: Thanks!
Hi Team,
I am reaching out regarding the output names during the evaluation of SipMask-VIS. The present code requires video ids to start from 1; however, it saves the output videos starting from 0, thereby mapping 1 to 0 and so on.
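The off-by-one described above is the classic mismatch between a 0-based `enumerate` in the save loop and 1-based dataset video ids. A hypothetical one-line fix (illustrative only, not the repo's code):

```python
def save_name(zero_based_idx):
    """Build an output name aligned with 1-based dataset video ids,
    given the 0-based index from an enumerate() over the videos."""
    return f"video_{zero_based_idx + 1}"
```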
Also, I am not sure whether the above mismatch is the reason the average precision comes out as 0 at the end, even though the model does a good job of detecting and tracking the objects in the output frames.
Second, the output in the save path misses some frames of the sequence. However, in the results.pkl.json file, the length of the segmentations corresponding to a video is correct. From my analysis, it is skipping frames whose segmentation value is None.
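The skipping behavior described here, which the maintainer later confirms, amounts to a filter like the following sketch: the saved sequence can be shorter than the per-video segmentation list in results.pkl.json because None entries are dropped (hypothetical helper names, for illustration only):

```python
def frames_to_save(frame_ids, segmentations):
    """Pair each frame id with its segmentation, dropping frames whose
    segmentation is None; only the surviving pairs get written to disk."""
    return [(f, s) for f, s in zip(frame_ids, segmentations)
            if s is not None]
```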
I would really appreciate your thoughts on whether I am missing anything.
Thanks!