COCO Detection Dataset Import/Export support #150
working for object detection
|
Is this supposed to support segmentation masks? It doesn't seem to be doing that for me; it only exports the bounding boxes. |
|
@yclicc Not yet, but it will be supported in the near future. Currently it supports only the object detection format. |
|
Hi @hardikdava 👋🏻 ! I was AFK at the CVPR conference last week. I'll do a code review today. |
|
@SkalskiP I also added support for loading instance segmentation masks. I tested it locally. Please also test with a COCO segmentation dataset. |
@yclicc You can try now. It should be working. |
|
@hardikdava that throws an error with my data on line 84 of dataset/formats/coco.py because "_polygons" is a dict. COCO format allows for segmentations either as a list of polygons or as a run length encoding (which is what I've got). |
|
@yclicc, have you used the latest code? Could you share your dataset if possible? I tried it with a segmentation dataset from Roboflow. |
|
This is the latest code, but your code is expecting a list of polygons (which is a possible segmentation format for COCO, just not the only option). See https://opencv.org/blog/2021/10/12/introduction-to-the-coco-dataset/ for details and examples of both possible types of mask format. |
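As a rough illustration of the two segmentation layouts discussed here (not code from this PR), a loader could first check which layout an annotation uses before parsing it. The sketch below assumes the usual COCO conventions: polygons are a list of flat coordinate lists, while RLE is a dict with "size" ([height, width]) and "counts", stored in column-major order with the first run counting background pixels. The function names segmentation_kind and decode_uncompressed_rle are hypothetical, and compressed RLE (where "counts" is a string) is only detected, not decoded.

```python
import numpy as np


def segmentation_kind(segmentation):
    """Classify a COCO `segmentation` field (hypothetical helper)."""
    if isinstance(segmentation, list):
        return "polygons"
    if isinstance(segmentation, dict) and "counts" in segmentation:
        # Uncompressed RLE stores run lengths as a list of ints;
        # compressed RLE stores them as an encoded string.
        if isinstance(segmentation["counts"], list):
            return "rle"
        return "compressed_rle"
    raise ValueError("Unrecognised COCO segmentation layout")


def decode_uncompressed_rle(rle):
    """Turn an uncompressed COCO RLE dict into a boolean (height, width) mask."""
    height, width = rle["size"]
    flat = np.zeros(height * width, dtype=bool)
    position = 0
    value = False  # assumption: runs alternate, starting with background
    for run in rle["counts"]:
        flat[position:position + run] = value
        position += run
        value = not value
    # Assumption: pixels are laid out column by column (Fortran order).
    return flat.reshape((height, width), order="F")


polygon_example = [[10.0, 10.0, 20.0, 10.0, 20.0, 20.0]]
rle_example = {"size": [2, 3], "counts": [1, 2, 3]}
mask = decode_uncompressed_rle(rle_example)
```

With a check like this, a polygon-only loader can at least fail with a clear message when it meets an RLE annotation instead of breaking while iterating over a dict.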
|
I can't share my dataset, but I'm generating it with CVAT's mask annotation mode, as opposed to the polygon creator. So perhaps unsurprisingly, it has exported a COCO format with Run Length Encoding instead of a list of polygons. |
|
@yclicc this PR only supports polygons. But I can take a look and check whether the functionality can be extended. |
|
I'll see if I can come up with a fix and then put in a pull request on your repo. |
|
@yclicc I tried exporting a dataset from CVAT by annotating an object with "Draw new Mask" and exporting in COCO format. It was still exported as polygons, which this PR supports. |
|
Hi @hardikdava 👋🏻 ! This is so awesome to see you contributing this functionality to
NOTE: I still need to test loading and saving on the example dataset. |
|
@SkalskiP I have committed the changes you requested. Also ran two commands with |
@SkalskiP I would like to work on new features. I know unit tests are important, but I am not a fan of implementing them ;) |
|
@hardikdava I did some tests and noticed two critical bugs. I marked them in the code. You can also use this Google Colab as reference: https://colab.research.google.com/drive/1spwwYyYO-LN3RDzp5s8whG1_7JzROm-m?usp=sharing |
No worries, I can take care of that. For now, let's make sure the simple tests in Google Colab work. When we are done with that, I'll write the unit tests ;) |
|
@SkalskiP please mark the critical parts in the code and I will take care of them tomorrow. It's already late. |
|
Hi @SkalskiP 👋🏻, thanks for merging my contribution. I am glad to see it in the community. Always happy to help! How should we discuss new features: on GitHub or somewhere else? |
|
|
I've found a problem with this code, in that the image_annotation["category_id"] might be indexed from 1 instead of from zero, or may even skip numbers (not sure if that is valid COCO, but it should be coped with properly). However, coco_categories_to_classes returns the class names as an (obviously zero indexed) list with no skips, but then coco_annotations_to_detections sets the class_ids to simply the image_annotation["category_id"] even though this may not be zero indexed. |
|
Hi @yclicc, I noticed that problem while reviewing one of the current PRs. The fix should be in |
|
Yep, that's fixed it. Thanks! |
|
@yclicc awesome! We will probably release those changes today or tomorrow. |
|
Ah, no I'm afraid this is still present. Suggest modifying coco_categories_to_classes to make it also return a lookup table from category["id"] to index in the new category list. |
|
I just tested it. It works for me. Making it a lookup table would require changing every other format we support. Can you give me any example of a dataset it doesn't work for? |
|
Any COCO dataset where the numbering of the category ids doesn't start at 0 and/or skips a number. E.g. when outputting to YOLO, the class indices in the rows of the YOLO dataset will be 1 and 3 instead of 0 and 1. With a lookup table, you don't need to modify anything except the COCO code. Have coco_categories_to_classes return a lookup table, then have coco_annotations_to_detections accept that lookup table, and when the class_ids variable is set, have it look up in that table for each
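A minimal sketch of the lookup-table idea described above, assuming COCO categories are dicts with "id" and "name" and annotations carry a "category_id". The helper names build_category_lookup and remap_class_ids are hypothetical and are not the functions in this PR; sorting by id is also an assumption made here so the class order is deterministic.

```python
def build_category_lookup(coco_categories):
    """Return (class names, {category id -> zero-based index}); hypothetical helper."""
    ordered = sorted(coco_categories, key=lambda category: category["id"])
    classes = [category["name"] for category in ordered]
    id_to_index = {category["id"]: index for index, category in enumerate(ordered)}
    return classes, id_to_index


def remap_class_ids(image_annotations, id_to_index):
    """Translate raw COCO category ids into indices of the class list."""
    return [id_to_index[annotation["category_id"]] for annotation in image_annotations]


# Category ids start at 1 and skip 2, as in the case described above.
categories = [{"id": 3, "name": "dog"}, {"id": 1, "name": "cat"}]
annotations = [{"category_id": 3}, {"category_id": 1}, {"category_id": 3}]

classes, id_to_index = build_category_lookup(categories)
class_ids = remap_class_ids(annotations, id_to_index)
```

Here the raw ids 1 and 3 become the indices 0 and 1, so they line up with the class name list, and only the COCO loading code needs to know about the table.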
|
So the ids of your input dataset are not ordered from In that case, we have two options:
|
|
Mine are actually consecutive from 1 onwards, which is the format CVAT output for me, but the possibility of a number being missed in general (rather than just 0) ought to be considered I think. |
|
@yclicc Yeah this is something that we need to take into consideration :/ |
|
@hardikdava could you take a look at the problem @yclicc described? |
|
@SkalskiP sure. I will take a look at it and try to fix it. |
|
@hardikdava awesome! |
|
@SkalskiP and @yclicc some of the things I found about coco dataset. Issue: Category map ids assignment
Issue: Segmentation polygon type
Let me know your views soon. |
|
Category map ids assignment Segmentation polygon type |
|
@yclicc my thoughts, Segmentation polygon type Category map ids assignment
|
|
Hi @hardikdava and @yclicc 👋🏻 How about we just remap class ids when we load COCO annotations? |
|
@yclicc @SkalskiP Found the solution. We should offset the |
|
Seems good to me, though it still fails as my dataset contains RLEs rather than polygons. |
|
Yep, all seems to work fine when I add my RLE code back in (which sadly I can't share). Thanks! |
@yclicc We will add
|
@hardikdava looks like we will move forward with #176. |

Description
COCO detection dataset import and export (json files)
Type of change
How has this change been tested?
supervision.DetectionDataset, supervision.BoxAnnotator, supervision.DetectionDataset
Docs