# Pylabel End to End Example 
## Yolo v5 text files to VOC XML
Use this notebook to try out importing, analyzing, and exporting datasets of image annotations. 

In [None]:
#!pip install -i https://test.pypi.org/simple/ pylabelalphatest==0.1.1 

In [2]:
from pylabelalpha import importer

## Import coco annotations 
First we will import annotations from the coco dataset, which are in coco json format. 

In [5]:
path_to_annotations = "/Users/alex/Google Drive/pylabel/datasets/wildlife/yolo_splits/val/labels"

#Identify the path to get from the annotations to the images 
path_to_images = "../images/"

#Import the dataset into the pylable schema 
voc_dataset = importer.ImportYoloV5(path=path_to_annotations, path_to_images=path_to_images, name="Wild life", img_height=329, img_width=329)
voc_dataset.df.head(5)



Unnamed: 0,id,img_folder,img_filename,img_path,img_id,img_width,img_height,img_depth,ann_segmented,ann_bbox_xmin,...,ann_area,ann_segmentation,ann_iscrowd,ann_pose,ann_truncated,ann_difficult,cat_id,cat_name,cat_supercategory,split
0,0,,2011341_2.jpg,,0,329,329,,,101.99987,...,10368.023154,,,,,,3,,,
1,1,,7985836_1.jpg,,1,329,329,,,22.999897,...,5858.988229,,,,,,5,,,
2,2,,3966713_1.jpg,,2,329,329,,,145.554864,...,384.333099,,,,,,1,,,
3,3,,2979383_1.jpg,,3,329,329,,,112.260228,...,248.934647,,,,,,9,,,
4,4,,2020781_2.jpg,,4,329,329,,,147.474908,...,3541.802023,,,,,,6,,,


In [None]:
#Confirm that there are images by displaying an image
from IPython.display import Image

first_row = voc_dataset.df.iloc[0]
full_image_path = voc_dataset.path_to_annotations + first_row.img_folder + first_row.img_filename
print(full_image_path)
Image(full_image_path) 

## Analyze annotations
Pylabel can calculate basic summary statisticts about the dataset such as the number of files and the classes. 
The dataset is stored as a pandas frame so the developer can do additional exploratory analysis on the dataset. 

In [18]:
print(f"Number of images: {coco_dataset.analyze.num_images}")
print(f"Number of classes: {coco_dataset.analyze.num_classes}")
print(f"Classes:{coco_dataset.analyze.classes}")
print(f"Class counts:\n{coco_dataset.analyze.class_counts}")

Number of images: 5000
Number of classes: 80
Classes:['bottle' 'dining table' 'person' 'knife' 'bowl' 'oven' 'cup' 'broccoli'
 'spoon' 'carrot' 'sink' 'potted plant' 'chair' 'refrigerator' 'banana'
 'orange' 'umbrella' 'handbag' 'traffic light' 'bicycle' 'skateboard'
 'car' 'truck' 'toilet' 'motorcycle' 'bird' 'keyboard' 'book' 'tv' 'vase'
 'couch' 'airplane' 'suitcase' 'giraffe' 'cow' 'boat' 'bench' 'sheep'
 'bus' 'backpack' nan 'train' 'stop sign' 'dog' 'cat' 'laptop' 'tie'
 'elephant' 'clock' 'frisbee' 'bear' 'zebra' 'horse' 'skis' 'sports ball'
 'baseball glove' 'donut' 'sandwich' 'cake' 'surfboard' 'bed' 'pizza'
 'tennis racket' 'toothbrush' 'remote' 'apple' 'snowboard' 'kite'
 'baseball bat' 'fire hydrant' 'mouse' 'teddy bear' 'cell phone'
 'scissors' 'wine glass' 'fork' 'microwave' 'hot dog' 'parking meter'
 'toaster' 'hair drier']
Class counts:
person        11004
car            1932
chair          1791
book           1161
bottle         1025
              ...  
microwave      

# Export to VOC XML
The PyLabel exporter will export all of the annotations in the dataframe to the desired target format.
VOC creates one XML for each Jpeg in the dataset. 

In [20]:
!mkdir test_output/
coco_dataset.export.ExportToVoc(coco_dataset.df, segmented_=False, path_=False, database_=False, folder_=False, occluded_=False, write_to_file_=True, output_file_path_ = 'test_output/')



mkdir: test_output/: File exists


()

In [21]:
# Inspect one of the files
!cat test_output/_000000000139_jpg.xml

<?xml version="1.0" ?>
<annotation>
	<filename>000000000139.jpg</filename>
	<size>
		<width>640</width>
		<height>426</height>
		<depth/>
	</size>
	<object>
		<name>potted plant</name>
		<pose/>
		<truncated/>
		<difficult/>
		<bndbox>
			<xmin>236.98</xmin>
			<xmax>261.68</xmax>
			<ymin>73.00999999999999</ymin>
			<ymax>142.51</ymax>
		</bndbox>
	</object>
	<object>
		<name>tv</name>
		<pose/>
		<truncated/>
		<difficult/>
		<bndbox>
			<xmin>7.03</xmin>
			<xmax>156.35</xmax>
			<ymin>72.88999999999999</ymin>
			<ymax>167.76</ymax>
		</bndbox>
	</object>
	<object>
		<name>tv</name>
		<pose/>
		<truncated/>
		<difficult/>
		<bndbox>
			<xmin>557.21</xmin>
			<xmax>638.5600000000001</xmax>
			<ymin>130.45999999999998</ymin>
			<ymax>209.19</ymax>
		</bndbox>
	</object>
	<object>
		<name>chair</name>
		<pose/>
		<truncated/>
		<difficult/>
		<bndbox>
			<xmin>358.98</xmin>
			<xmax>414.98</xmax>
			<ymin>115.22000000000001</ymin>
			<ymax>218.05</ymax>
		</bndbox>
	</object>
	<object>