<a href="https://colab.research.google.com/github//pylabel-project/samples/blob/main/yolo2coco.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a> 

# Convert YOLOv5 Annotations (TXT Files) to COCO JSON Format
Converting from YOLO to another format is a little tricky because the YOLO format does not store the dimensions of the image, and most other formats need them. You therefore have to read each image file to get its height and width. The PyLabel package takes care of that. This notebook shows how you can import YOLOv5 annotations and export them to another format, such as COCO.
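
To make the role of the image dimensions concrete, here is a minimal sketch of the arithmetic involved. It assumes the usual YOLO convention of a normalized box (center x, center y, width, height, each between 0 and 1) and the usual COCO convention of an absolute box `[xmin, ymin, width, height]` in pixels. The function name `yolo_to_coco_bbox` is hypothetical and is not part of PyLabel.

```python
def yolo_to_coco_bbox(x_center, y_center, width, height, img_width, img_height):
    """Convert a normalized YOLO box to an absolute COCO box [xmin, ymin, w, h]."""
    # Scale the normalized width and height to pixels.
    box_w = width * img_width
    box_h = height * img_height
    # YOLO stores the box center; COCO stores the top-left corner.
    xmin = x_center * img_width - box_w / 2
    ymin = y_center * img_height - box_h / 2
    return [xmin, ymin, box_w, box_h]


# A box centered in a 640x480 image that covers half of each dimension.
print(yolo_to_coco_bbox(0.5, 0.5, 0.5, 0.5, img_width=640, img_height=480))
# → [160.0, 120.0, 320.0, 240.0]
```

Without `img_width` and `img_height` the pixel values cannot be computed, which is why the images have to be read during the conversion.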



In [1]:
import logging
logging.getLogger().setLevel(logging.CRITICAL)
%pip install pylabel > /dev/null

Note: you may need to restart the kernel to use updated packages.


In [2]:
from pylabel import importer

## Import YOLO annotations
First we will import annotations stored in YOLOv5 format. (This is a sample dataset. You can edit this part to point to your own dataset.)


There are two methods of importing YOLOv5 annotations. The method shown here, 'ImportYoloV5', reads the annotations, but you must also provide a list of the class names that map to the class ids. The other method, 'ImportYoloV5WithYaml', can read the class names from a YAML file, as shown in this notebook: [yolo_with_yaml_importer.ipynb](https://github.com/pylabel-project/samples/blob/main/yolo_with_yaml_importer.ipynb)
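
The need for a class name list can be illustrated with a small sketch. It assumes the common YOLO label layout of one annotation per line, `class_id x_center y_center width height`, where the class id is an index into the list of names. The helper `parse_yolo_line` and the class names are hypothetical. This is not how PyLabel implements its importer, only an illustration of the mapping.

```python
def parse_yolo_line(line, class_names):
    """Parse one line of a YOLO label file and attach a readable class name."""
    class_id, x_center, y_center, width, height = line.split()
    class_id = int(class_id)
    return {
        "cat_id": class_id,
        # The label file only stores the numeric id, so the name comes from the list.
        "cat_name": class_names[class_id],
        "bbox_normalized": [float(x_center), float(y_center), float(width), float(height)],
    }


class_names = ["cat", "dog"]  # hypothetical classes, ordered by class id
ann = parse_yolo_line("1 0.5 0.5 0.25 0.4", class_names)
print(ann["cat_id"], ann["cat_name"], ann["bbox_normalized"])
```

The keys `cat_id` and `cat_name` are chosen here only to mirror the column names that appear in the PyLabel dataframe.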

In [3]:
# Specify the path to the coco.json file
path_to_annotations = "./yolo/file.json"
# Specify the path to the images (if they are in a different folder than the annotations)
path_to_images = "./yolo/images/"

# Import the dataset into the PyLabel schema
dataset = importer.ImportCoco(path_to_annotations, path_to_images=path_to_images, name="BCCD_coco")
dataset.df


id,img_folder,img_filename,img_path,img_id,img_width,img_height,img_depth,ann_segmented,ann_bbox_xmin,ann_bbox_ymin,...,ann_segmentation,ann_iscrowd,ann_pose,ann_truncated,ann_difficult,cat_id,cat_name,cat_supercategory,split,annotated
0,./yolo/images/,P.Corn.Inv.MSS.A.101.XIII.jpg,,1,2524,2162,,,995,438,...,,1,,,,186,Δ,Greek,,1
1,./yolo/images/,P.Corn.Inv.MSS.A.101.XIII.jpg,,1,2524,2162,,,1905,1488,...,,1,,,,33,Κ,Greek,,1
2,./yolo/images/,P.Corn.Inv.MSS.A.101.XIII.jpg,,1,2524,2162,,,2259,417,...,,1,,,,23,Ε,Greek,,1
3,./yolo/images/,P.Corn.Inv.MSS.A.101.XIII.jpg,,1,2524,2162,,,1769,1006,...,,1,,,,23,Ε,Greek,,1
4,./yolo/images/,P.Corn.Inv.MSS.A.101.XIII.jpg,,1,2524,2162,,,1784,765,...,,1,,,,119,Ν,Greek,,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
35592,./yolo/images/,Sorbonne_inv_542.jpg,,5711,1036,1135,,,620,424,...,,1,,,,100,Ρ,Greek,,1
35593,./yolo/images/,Sorbonne_inv_542.jpg,,5711,1036,1135,,,734,415,...,,1,,,,8,Α,Greek,,1
35594,./yolo/images/,Sorbonne_inv_542.jpg,,5711,1036,1135,,,430,700,...,,1,,,,23,Ε,Greek,,1
35595,./yolo/images/,Sorbonne_inv_542.jpg,,5711,1036,1135,,,298,539,...,,1,,,,8,Α,Greek,,1


## Analyze annotations
PyLabel can calculate basic summary statistics about the dataset, such as the number of files and the classes.
The dataset is stored as a pandas DataFrame, so you can do additional exploratory analysis on it.

In [4]:
print(f"Number of images: {dataset.analyze.num_images}")
print(f"Number of classes: {dataset.analyze.num_classes}")
print(f"Classes:{dataset.analyze.classes}")
print(f"Class counts:\n{dataset.analyze.class_counts}")
print(f"Class name id map:\n{dataset.analyze.class_name_id_map}")

Number of images: 152
Number of classes: 25
Classes:['Θ', 'Α', 'Β', 'Τ', 'Ξ', 'Ε', 'Κ', 'Ω', 'Μ', 'Φ', 'Ρ', 'Π', 'Γ', 'Ν', 'Λ', 'Ζ', 'Η', 'Υ', 'Ψ', '.', 'Χ', 'Δ', 'Ο', 'Ι', 'Ϲ']
Class counts:
Ε    4081
Α    3783
Ο    3343
Ι    3134
Ν    3027
Ϲ    2552
Τ    2015
Ρ    1562
Η    1462
Υ    1350
Μ    1215
Π    1213
Λ    1212
Κ    1109
Δ    1100
Ω    1052
Θ     610
Γ     497
Χ     473
Φ     402
Β     179
Ξ     103
Ζ      86
Ψ      36
.       1
Name: cat_name, dtype: int64
Class name id map:
{'Δ': '186', 'Κ': '33', 'Ε': '23', 'Ν': '119', 'Ϲ': '225', 'Ο': '201', 'Π': '107', 'Λ': '120', 'Α': '8', 'Χ': '177', 'Ρ': '100', 'Τ': '14', 'Θ': '7', 'Φ': '77', 'Ι': '212', 'Η': '150', 'Μ': '59', 'Υ': '161', 'Ω': '45', 'Γ': '111', 'Β': '9', 'Ψ': '169', 'Ξ': '17', 'Ζ': '144', '.': '176'}


## Edit Annotations 
All of the annotations are stored in a Pandas dataframe that you can access directly as 'dataset.df'. Not only can you do your own custom queries of the dataset, but you can also manipulate the dataset by removing rows, changing labels, etc.  
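
As a rough sketch of what such queries and edits can look like, the example below builds a small stand-in DataFrame with three of the columns shown in the output above (the rows are made up for illustration) and then counts annotations per image, removes the rows of one class, and renames a label. With a real dataset the same pandas operations would presumably be applied to 'dataset.df' directly.

```python
import pandas as pd

# Class labels used below, defined once so every use refers to the same string.
epsilon, alpha, dot = "Ε", "Α", "."

# Stand-in for dataset.df with made-up rows; a real PyLabel dataframe has more columns.
df = pd.DataFrame({
    "img_filename": ["a.jpg", "a.jpg", "b.jpg", "b.jpg"],
    "cat_id": [23, 176, 8, 23],
    "cat_name": [epsilon, dot, alpha, epsilon],
})

# Custom query: count the annotations in each image.
annotations_per_image = df.groupby("img_filename").size()

# Remove rows: drop every annotation of the "." class.
df = df[df["cat_name"] != dot].reset_index(drop=True)

# Change labels: rename one class everywhere it appears.
df["cat_name"] = df["cat_name"].replace({epsilon: "EPSILON"})

print(annotations_per_image)
print(df)
```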

PyLabel also includes a lightweight annotation tool that you can use to create and edit bounding box annotations within a Jupyter notebook. You can see an example of that tool here: [pylabeler.ipynb](https://github.com/pylabel-project/samples/blob/main/pylabeler.ipynb)

## Visualize Annotations 
You can render the bounding boxes for your image to inspect them and confirm that they imported correctly.  

## Export to COCO JSON
The PyLabel exporter will export all of the annotations in the dataframe to the desired target format.
All annotations will be stored in a single JSON file.
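
For orientation, a COCO annotation file is typically a single JSON object with `images`, `annotations`, and `categories` lists. The sketch below builds a minimal example of that layout with the standard library. All values are made up for illustration, and the file PyLabel writes may contain additional fields.

```python
import json

coco = {
    "images": [
        # Unlike YOLO label files, COCO records the image dimensions.
        {"id": 1, "file_name": "example.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,      # refers to images[].id
            "category_id": 1,   # refers to categories[].id
            "bbox": [160.0, 120.0, 320.0, 240.0],  # [xmin, ymin, width, height] in pixels
            "area": 320.0 * 240.0,
            "iscrowd": 0,
        },
    ],
    "categories": [
        {"id": 1, "name": "example_class", "supercategory": "none"},
    ],
}

# Serialize everything into one JSON document, keeping non-ASCII class names readable.
coco_json = json.dumps(coco, indent=2, ensure_ascii=False)
print(coco_json)
```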

In [5]:
dataset.path_to_annotations = "data/yolo"
dataset.export.ExportToYoloV5()[0]


'training/dataset.yaml'

Thank you for trying PyLabel. If you had any issues running this notebook or have ideas for how to make it better, please submit an issue here https://github.com/pylabel-project/pylabel/issues. 