
How to convert from COCO instance segmentation format to YOLOv5 instance segmentation Without Roboflow? #10621

Closed
ichsan2895 opened this issue Dec 29, 2022 · 34 comments
Labels
question (Further information is requested), Stale (Stale and scheduled for closing soon)

Comments

@ichsan2895


Question

Hello, is it possible to convert a custom COCO instance segmentation dataset to the YOLOv5 instance segmentation format (without Roboflow), or perhaps to create one from scratch?

I have already checked the Train On Custom Data tutorial and the Format of YOLO annotations tutorial.

Most tutorials only show the bounding-box format and don't explain how to convert COCO to YOLO.
[screenshots omitted]

But I can't find any tutorial for converting COCO to YOLOv5 without Roboflow.

Can somebody help me?

Thanks for sharing


@ichsan2895 ichsan2895 added the question Further information is requested label Dec 29, 2022
@github-actions
Contributor

github-actions bot commented Dec 29, 2022

👋 Hello @ichsan2895, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.

For business inquiries or professional support requests please visit https://ultralytics.com or email support@ultralytics.com.

Requirements

Python>=3.7.0 with all requirements.txt installed including PyTorch>=1.7. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

YOLOv5 CI

If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

@ExtReMLapin

You can code it yourself in Python. Just keep in mind that COCO boxes are [x_min, y_min, width, height] in absolute pixels with the origin at the top-left, while YOLO boxes are [x_center, y_center, width, height] normalized to the image size.
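That coordinate shuffle can be sketched in a few lines (hypothetical helper name; assumes COCO's [x_min, y_min, width, height] pixel boxes and YOLO's normalized center-based boxes):

```python
def coco_bbox_to_yolo(bbox, img_w, img_h):
    """Convert a COCO bbox [x_min, y_min, width, height] (pixels, origin at
    top-left) to YOLO format [x_center, y_center, width, height], with every
    value normalized by the image width/height."""
    x_min, y_min, w, h = bbox
    return [
        (x_min + w / 2) / img_w,  # x_center
        (y_min + h / 2) / img_h,  # y_center
        w / img_w,
        h / img_h,
    ]

print(coco_bbox_to_yolo([50, 50, 100, 100], 200, 200))  # [0.5, 0.5, 0.5, 0.5]
```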

@ryouchinsa

ryouchinsa commented Dec 29, 2022

Using this script, you can convert the COCO segmentation format to the YOLO segmentation format.
https://github.com/ultralytics/JSON2YOLO

RectLabel is an offline image annotation tool for object detection and segmentation.
Although it is not an open-source program, with RectLabel you can import the COCO segmentation format and export to the YOLO segmentation format: https://rectlabel.com/help#xml_to_yolo

class_index x1 y1 x2 y2 x3 y3 ...
0 0.180027 0.287930 0.181324 0.280698 0.183726 0.270573 ...
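That label line is just the class index followed by the polygon vertices divided by the image size. A minimal sketch producing such a line (hypothetical helper; assumes a flat COCO-style [x1, y1, x2, y2, …] pixel list):

```python
def coco_polygon_to_yolo_line(class_index, polygon, img_w, img_h):
    """Turn a COCO segmentation polygon [x1, y1, x2, y2, ...] in pixels into a
    YOLO segmentation label line 'class x1 y1 x2 y2 ...' with coordinates
    normalized to 0-1 by the image width/height."""
    coords = []
    for i in range(0, len(polygon), 2):
        coords.append(polygon[i] / img_w)      # x, normalized by width
        coords.append(polygon[i + 1] / img_h)  # y, normalized by height
    return " ".join([str(class_index)] + [f"{c:.6f}" for c in coords])

line = coco_polygon_to_yolo_line(0, [100, 200, 300, 200, 200, 400], 1000, 1000)
print(line)  # 0 0.100000 0.200000 0.300000 0.200000 0.200000 0.400000
```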

[image: yolo_polygon]

@ichsan2895
Author

> Using this script, you can convert the COCO segmentation format to the YOLO segmentation format. https://github.com/ultralytics/JSON2YOLO […]

Thanks, I will check out the JSON2YOLO script and report back if I run into any trouble.

@Edohvin

Edohvin commented Jan 9, 2023

Did you see any trouble? @ichsan2895

@ichsan2895
Author

ichsan2895 commented Jan 13, 2023

Did you see any trouble? @ichsan2895

Sorry for the slow response.

Yes, the JSON2YOLO script (https://github.com/ultralytics/JSON2YOLO) failed to work on my computer.

[screenshot omitted]

The log looked successful, but the label/annotation .txt files were not created. I'm not sure what happened.

The COCO dataset was made with the labelme annotator, so the directory layout is:

COCO_Project
|-> JPEGImages\
     |-> img_01.jpg
     |-> img_02.jpg
|-> annotations.txt

Fortunately, after a week of debugging, I created a Jupyter notebook that mixes code from JSON2YOLO and Stack Overflow to convert COCO to YOLO; you can download it here:
https://drive.google.com/file/d/1xhBiWv_Y0HBZQHoWBwF7yjpRrDZhrk4f/view?usp=sharing

Just change the last cell to the desired output_path and json_file path. If you want bbox annotations instead, pass use_segment=False. Then run all cells from start to finish.

# the annot will be formatted as bbox, ideal for object detection task
convert_coco_json_to_yolo_txt("yolo_from_Project_1st", "COCO_Project_1st/annotations.json", use_segment=False)

# the annot will be formatted as polygon, ideal for instance segmentation task
convert_coco_json_to_yolo_txt("yolo_from_Project_1st", "COCO_Project_1st/annotations.json", use_segment=True)
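For readers who can't open the notebook, here is a minimal, self-contained sketch of what such a converter might do. This is not the notebook's actual code; it assumes a standard COCO JSON with polygon (non-RLE) segmentations, keeps only the first polygon of each annotation, and assumes 1-based category ids:

```python
import json
from collections import defaultdict
from pathlib import Path


def convert_coco_json_to_yolo_txt(output_path, json_file, use_segment=True):
    """Write one YOLO label .txt per image from a COCO annotations JSON.
    Sketch only: assumes polygon (non-RLE) segmentations and 1-based
    contiguous COCO category ids."""
    data = json.loads(Path(json_file).read_text())
    images = {im["id"]: im for im in data["images"]}
    anns_by_image = defaultdict(list)
    for ann in data["annotations"]:
        anns_by_image[ann["image_id"]].append(ann)

    out = Path(output_path)
    out.mkdir(parents=True, exist_ok=True)
    for img_id, anns in anns_by_image.items():
        im = images[img_id]
        w, h = im["width"], im["height"]
        lines = []
        for ann in anns:
            cls = ann["category_id"] - 1  # YOLO class ids are 0-based
            if use_segment:
                poly = ann["segmentation"][0]  # first polygon only
                coords = [v / (w if i % 2 == 0 else h) for i, v in enumerate(poly)]
            else:
                x, y, bw, bh = ann["bbox"]  # COCO: top-left corner + size
                coords = [(x + bw / 2) / w, (y + bh / 2) / h, bw / w, bh / h]
            lines.append(" ".join([str(cls)] + [f"{c:.6f}" for c in coords]))
        (out / f"{Path(im['file_name']).stem}.txt").write_text("\n".join(lines) + "\n")
```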

@iagorrr

iagorrr commented Jan 18, 2023

> […] I created a jupyter notebook that mixes code from JSON2YOLO & Stack Overflow to convert COCO to YOLO: https://drive.google.com/file/d/1xhBiWv_Y0HBZQHoWBwF7yjpRrDZhrk4f/view?usp=sharing

The python notebook worked perfectly for me, thank you !

@kadirnar

> […] I created a jupyter notebook that mixes code from JSON2YOLO & Stack Overflow to convert COCO to YOLO: https://drive.google.com/file/d/1xhBiWv_Y0HBZQHoWBwF7yjpRrDZhrk4f/view?usp=sharing

> The python notebook worked perfectly for me, thank you !

Can you share a sample COCO JSON file? This code didn't work:

---> 85     line = *(segments[last_iter] if use_segments else bboxes[last_iter]),  # cls, box or segments
     86     f.write(('%g ' * len(line)).rstrip() % line + '\n')
     87 print("that images contains class:",len(bboxes),"objects")

IndexError: list index out of range
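For what it's worth, this kind of IndexError usually means the bbox and segment lists got out of sync (for example, an annotation stored as RLE or marked iscrowd yields a bbox but no polygon, so segments ends up shorter than bboxes). A defensive sketch of the writing loop (hypothetical helper, not the notebook's code) that iterates the chosen list directly instead of indexing both lists with one shared counter:

```python
def write_label_lines(path, bboxes, segments, use_segments):
    """Write YOLO label lines to `path`. Iterating the chosen list directly
    (instead of indexing both lists with one counter) avoids 'list index out
    of range' when `segments` is shorter than `bboxes`."""
    rows = segments if use_segments else bboxes
    with open(path, "w") as f:
        for line in rows:
            # same formatting as the notebook's snippet: '%g' per value
            f.write(("%g " * len(line)).rstrip() % tuple(line) + "\n")
```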

@ichsan2895
Author

> Can you share sample coco json file? This code didn't work. […] IndexError: list index out of range

Sure, I made this sample COCO JSON with the labelme library.

Please take a look..
typical coco json to yolo segmentation.zip

@kadirnar

[screenshot omitted]
Why are there negative values?

@ichsan2895
Author

ichsan2895 commented Mar 31, 2023

> Why are there negative values?

Can you share your entire COCO dataset (images + coco_annot.json)? Negative values have never happened to me. For good results, I recommend creating the COCO annotations with labelme, then converting them to YOLO format with my notebook.

@almazgimaev

Hi, @ichsan2895,
You can do it in just a couple of clicks using apps from the Supervisely ecosystem:

  1. First, upload your COCO format data to Supervisely using the Import COCO app.
     You can also upload data in another format with one of the other import applications.

  2. Next, export the data from Supervisely in the YOLO v5/v8 format:

     • For polygons and masks (without internal cutouts), use the "Export to YOLOv8" app:
       class x1 y1 x2 y2 x3 y3 ...
       0 0.100417 0.654604 0.089646 0.662646 0.087561 0.666667 ...

     • For bounding boxes, the export uses the standard YOLO detection format:
       class x_center y_center width height
       0 0.16713 0.787696 0.207783 0.287495

I'm sure there are many apps in the Supervisely ecosystem that can help solve your tasks.

@github-actions
Contributor

github-actions bot commented Jun 8, 2023

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.


Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐

@github-actions github-actions bot added the Stale Stale and schedule for closing soon label Jun 8, 2023
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jun 19, 2023
@SkalskiP
Contributor

Hi 👋🏻 I'm probably late to the party, but you can convert between formats with supervision.

import supervision as sv

sv.DetectionDataset.from_coco(
    images_directory_path='...',
    annotations_path='...',
    force_masks=True
).as_yolo(
    images_directory_path='...',
    annotations_directory_path='...',
    data_yaml_path='...'
)

@glenn-jocher
Member

@SkalskiP thanks for sharing your solution! We appreciate your input and contribution to the YOLOv5 community. Your code snippet using the supervision library looks like a handy tool for converting between formats. It's great to see different approaches to tackling the problem. Keep up the good work!

@lonngxiang

> Hi 👋🏻 I'm probably late to the party, but you can convert between formats with supervision. […]

Is segmentation supported?

@glenn-jocher
Member

@lonngxiang yes, the supervision utility provides support for converting segmentation annotations in addition to detection annotations. You can use the force_masks=True argument in the from_coco method to ensure that masks are enforced during the conversion process. This enables seamless conversion between different annotation formats, allowing you to work with segmentation annotations as well.

@ryouchinsa

We updated our general_json2yolo.py script so that the RLE mask with holes can be converted to the YOLO segmentation format correctly.
ultralytics/ultralytics#917 (comment)

@glenn-jocher
Member

@ryouchinsa thank you for sharing the update to the general_json2yolo.py script! 💻 It's great to see the community working together to improve the conversion process for RLE masks with holes to the YOLO segmentation format. Your contribution will definitely benefit others who are facing similar challenges. Keep up the great work! If you have any further improvements or insights, feel free to share them.

@lonngxiang

> […] You can use the force_masks=True argument in the from_coco method to ensure that masks are enforced during the conversion process. […]

Thanks. I tried this dataset, which has 1 label: https://universe.roboflow.com/naumov-igor-segmentation/car-segmetarion

[screenshot omitted]

But when I used this COCO-to-YOLO script, I got 2 labels, and I don't know why:

import supervision as sv

sv.DetectionDataset.from_coco(
    images_directory_path= r"C:\Users\loong\Downloads\Car\valid",
    annotations_path=r"C:\Users\loong\Downloads\Car\valid\_annotations.coco.json",
    force_masks=True
).as_yolo(
    images_directory_path=r"C:\Users\loong\Downloads\Car_yolo\val\images",
    annotations_directory_path=r"C:\Users\loong\Downloads\Car_yolo\val\labels",
    data_yaml_path=r"C:\Users\loong\Downloads\Car_yolo\data.yaml"
)

[screenshot omitted]

@glenn-jocher
Member

@lonngxiang it looks like the issue might be related to the conversion process. One possibility is that the COCO dataset includes multiple categories, leading to the creation of multiple labels during the conversion. You may want to review the original COCO annotations and ensure that only the desired category (in this case, "car") is included in the annotations. Double-checking the original COCO annotations to ensure that only the "car" category is present could help resolve the issue.

Additionally, you might want to inspect the annotations.coco.json file to confirm the structure and contents of the annotations. This can help identify any unexpected data that might be causing the extra labels to appear during the conversion process.

Feel free to reach out if you have further questions or need additional assistance!

@lonngxiang

> […] One possibility is that the COCO dataset includes multiple categories, leading to the creation of multiple labels during the conversion. […]

Thanks, but how do I use this script with this downloaded dataset?
https://universe.roboflow.com/naumov-igor-segmentation/car-segmetarion

@glenn-jocher
Member

@lonngxiang i understand your question, but as an open-source contributor, I am unable to guide you on using specific third-party datasets, such as the one from Roboflow, as I am not associated with them. I recommend referencing the documentation or support resources provided by Roboflow for guidance on using their datasets with the supervision library for conversion. If you encounter any specific issues related to YOLOv5 or general conversion processes, I am here to assist. Additionally, feel free to consult the YOLOv5 documentation for further insights on dataset conversion.

@lonngxiang

lonngxiang commented Nov 22, 2023

> […] I recommend referencing the documentation or support resources provided by Roboflow for guidance on using their datasets with the supervision library for conversion. […]

Thanks. I used the supervision library to convert the COCO segmentation format to the YOLO format. However, when I ran the Ultralytics command, the results were not as expected:

yolo segment train model=yolov8m-seg.yaml data=/mnt/data/loong/segmetarion/Car_yolo/data.yaml epochs=100
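When training misbehaves after a conversion, it's worth first checking that data.yaml matches the converted directory layout. A hypothetical data.yaml for the Car_yolo directories mentioned earlier in this thread (paths and the single "car" class name are assumptions taken from the messages above, not a verified config):

```yaml
# Assumed layout based on the paths shown in this thread
path: /mnt/data/loong/segmetarion/Car_yolo
train: train/images
val: val/images
names:
  0: car
```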

@SkalskiP
Contributor

Hi @lonngxiang 👋🏻, I'm the creator of Supervision. Have you been able to solve your conversion problem?

@lonngxiang

Hi @lonngxiang 👋🏻, I'm the creator of Supervision. Have you been able to solve your conversion problem?

Yes, but sv.DetectionDataset.from_coco().as_yolo() did not work for the YOLO segmentation format. I finally fixed it by using this method: https://github.com/ultralytics/JSON2YOLO/blob/master/general_json2yolo.py

@glenn-jocher
Member

@lonngxiang it’s great to hear that you found a solution! If you have any other questions or encounter more issues in the future, feel free to ask. We’re here to help. Good luck with your project!

@ryouchinsa

Hi @SkalskiP,
Can supervision convert an annotation with multiple polygons in the COCO format to the YOLO segmentation format?

"annotations": [
{
    "area": 594425,
    "bbox": [328, 834, 780, 2250],
    "category_id": 1,
    "id": 1,
    "image_id": 1,
    "iscrowd": 0,
    "segmentation": [
        [495, 987, 497, 984, 501, 983, 500, 978, 498, 962, 503, 937, 503, 926, 532, 877, 569, 849, 620, 834, 701, 838, 767, 860, 790, 931, 803, 963, 802, 972, 846, 970, 896, 969, 896, 977, 875, 982, 847, 984, 793, 987, 791, 1001, 783, 1009, 785, 1022, 791, 1024, 787, 1027, 795, 1041, 804, 1059, 811, 1072, 810, 1081, 800, 1089, 788, 1092, 783, 1098, 784, 1115, 780, 1120, 774, 1123, 778, 1126, 778, 1136, 775, 1140, 767, 1140, 763, 1146, 767, 1164, 754, 1181, 759, 1212, 751, 1264, 815, 1283, 839, 1303, 865, 1362, 880, 1442, 902, 1525, 930, 1602, 953, 1640, 996, 1699, 1021, 1773, 1039, 1863, 1060, 1920, 1073, 1963, 1089, 1982, 1102, 2013, 1107, 2037, 1107, 2043, 1099, 2046, 1097, 2094, 1089, 2123, 1074, 2137, 1066, 2153, 1033, 2172, 1024, 2166, 1024, 2166, 1023, 2129, 1019, 2093, 1004, 2057, 996, 2016, 1000, 1979, 903, 1814, 860, 1727, 820, 1647, 772, 1547, 695, 1637, 625, 1736, 556, 1854, 495, 1986, 459, 2110, 446, 1998, 449, 1913, 401, 1819, 362, 1720, 342, 1575, 328, 1440, 335, 1382, 348, 1330, 366, 1294, 422, 1248, 437, 1222, 450, 1190, 466, 1147, 482, 1107, 495, 1076, 506, 1019, 497, 1016],
        [878, 2293, 868, 2335, 855, 2372, 843, 2413, 838, 2445, 820, 2497, 806, 2556, 805, 2589, 809, 2622, 810, 2663, 807, 2704, 793, 2785, 772, 2866, 742, 2956, 725, 3000, 724, 3013, 740, 3024, 757, 3029, 778, 3033, 795, 3033, 812, 3032, 812, 3046, 803, 3052, 791, 3063, 771, 3069, 745, 3070, 733, 3074, 719, 3077, 702, 3075, 680, 3083, 664, 3082, 631, 3072, 601, 3061, 558, 3058, 553, 3039, 558, 3023, 566, 3001, 568, 2983, 566, 2960, 572, 2912, 571, 2859, 567, 2781, 572, 2698, 576, 2643, 583, 2613, 604, 2568, 628, 2527, 637, 2500, 636, 2468, 629, 2445, 621, 2423, 673, 2409, 726, 2388, 807, 2344, 878, 2293]
    ]
}],

@YoungjaeDev

YoungjaeDev commented Feb 15, 2024

@ryouchinsa
I analyzed the json2yolo conversion code and am curious about the underlying principle.
Why do you split the work into k=0 and k=1 passes? In the end, the first and last points of each unit polygon are the same, so we can tell which instance it is, but can you explain why we need both a forward and a backward pass over a unit polygon?

for k in range(2):
    # forward connection
    if k == 0:
        # idx_list: [[5], [12, 0], [7]]
        for i, idx in enumerate(idx_list):
            # middle segments have two indexes;
            # reverse the index of middle segments.
            # Every segment except the first and last carries two indexes:
            # idx_list = [ [p], [p, q], [p, q], ... , [q]]
            if len(idx) == 2 and idx[0] > idx[1]:
                idx = idx[::-1]
                # segments[i] : (N, 2)
                segments[i] = segments[i][::-1, :]

            segments[i] = np.roll(segments[i], -idx[0], axis=0)
            segments[i] = np.concatenate([segments[i], segments[i][:1]])
            # deal with the first segment and the last one
            if i in [0, len(idx_list) - 1]:
                s.append(segments[i])
            else:
                idx = [0, idx[1] - idx[0]]
                s.append(segments[i][idx[0] : idx[1] + 1])

    # backward connection
    else:
        for i in range(len(idx_list) - 1, -1, -1):
            if i not in [0, len(idx_list) - 1]:
                idx = idx_list[i]
                nidx = abs(idx[1] - idx[0])
                s.append(segments[i][nidx:])
return s

@YoungjaeDev

@ryouchinsa

Well, in the end it depends on how the training code parses it, but I'm curious whether that method is efficient.

@ryouchinsa

Hi @youngjae-avikus,

For example, suppose we are going to merge 3 polygons into a single polygon.
To connect the 3 polygons with zero-width connecting lines, there is both a forward scan and a backward scan so that all points are appended:

k == 0: // forward
Append [0, 1, 2, 0] of left polygon.
Append [0, 1, 2] of center polygon.
Append [0, 1, 2, 3, 4, 0] of right polygon.

k == 1: // backward
Append [2, 3, 0] of center polygon.

If you have any questions, please let us know.

[Screenshot 2024-02-20]
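The same idea can be sketched for the simple two-polygon case (a hypothetical helper, not the JSON2YOLO implementation): connect the polygons at their closest pair of vertices with a zero-width line that is traversed once on the way out and once on the way back:

```python
import numpy as np


def merge_two_polygons(a, b):
    """Merge two (N, 2) polygons into one by connecting them with a
    zero-width line: walk all of `a` starting at its connection point,
    jump to `b`, walk all of `b`, then jump back to `a`'s start.  The
    connecting segment is traversed in both directions, so it has zero area."""
    # closest pair of vertices between the two polygons
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    i, j = np.unravel_index(np.argmin(d), d.shape)
    a = np.roll(a, -i, axis=0)  # start a at its connection point
    b = np.roll(b, -j, axis=0)  # start b at its connection point
    # a's points (closing back to its start), then b's points (closing back),
    # then return to a's start point
    return np.concatenate([a, a[:1], b, b[:1], a[:1]])
```

For two 4-vertex polygons this yields a single closed 11-point outline that training code can read as one instance.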

@glenn-jocher
Member

glenn-jocher commented Mar 4, 2024

@ryouchinsa looks correct to me

@ryouchinsa

@glenn-jocher, thanks for reviewing my explanation.

@glenn-jocher
Member

@ryouchinsa you're welcome! If you have any more questions or need further assistance, feel free to reach out. Happy to help!
