Download data in different formats #675

hlydecker · 2022-04-27T01:58:49Z

It would be very useful to let users download data directly in an ML-ready format such as YOLO.

I could probably repurpose some of the dataset utilities I developed for camera traps to do this, since they already turn a COCO-like dataset into YOLO.

YOLO requires a dataset.yaml with:

path: weed_ai_dataset
train: train/images
val: val/images
test: test/images

nc: 2
names: ['Weed: Lolium rigidum','Weed: Sonchus oleraceus']

Here, path points to where the images are stored, relative to the YOLO model install. nc is the number of classes, and names lists the class names in class_id order.
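A rough sketch of how the dataset.yaml above could be generated from a list of category names. The function name `build_dataset_yaml` and the fixed `train/val/test` directory layout are assumptions for illustration, not existing Weed-AI code. The names are written with `json.dumps`, which produces a double-quoted flow sequence that is also valid YAML.

```python
import json


def build_dataset_yaml(root, class_names):
    """Return the text of a YOLO-style dataset.yaml.

    Hypothetical helper: assumes `root` is a simple path with no characters
    that need YAML quoting, and that images live in train/val/test
    subdirectories as in the example above.
    """
    lines = [
        f"path: {root}",
        "train: train/images",
        "val: val/images",
        "test: test/images",
        "",
        f"nc: {len(class_names)}",
        # A JSON list of strings is valid YAML flow-sequence syntax.
        f"names: {json.dumps(class_names)}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_dataset_yaml(
        "weed_ai_dataset",
        ["Weed: Lolium rigidum", "Weed: Sonchus oleraceus"],
    ))
```

Deriving nc from the length of the names list keeps the two fields from drifting apart.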

Individual annotations are stored for each image in text files with the same name as the image, but with the .txt extension. These are in the "labels" directory that sits adjacent to the relevant "images" directory.

These are space-separated text files, with one line per annotated object:
<class_id> <x_center> <y_center> <width> <height>

0 0.25 0.1 0.43 0.3

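Note that x, y, w, h are all relative to the image size: x_center and width are fractions of the image width, and y_center and height are fractions of the image height, so each value lies between 0 and 1.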
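For reference, a minimal sketch of the bounding box conversion, assuming the source annotations use the standard COCO bbox convention of `[x_min, y_min, width, height]` in pixels. The function names are hypothetical and this is not the camera trap utility mentioned above.

```python
def coco_bbox_to_yolo(bbox, img_width, img_height):
    """Convert a COCO pixel bbox [x_min, y_min, w, h] to normalised
    YOLO values (x_center, y_center, width, height)."""
    x_min, y_min, w, h = bbox
    x_center = (x_min + w / 2) / img_width
    y_center = (y_min + h / 2) / img_height
    return x_center, y_center, w / img_width, h / img_height


def yolo_label_line(class_id, bbox, img_width, img_height):
    """Format one line of a YOLO label .txt file."""
    values = coco_bbox_to_yolo(bbox, img_width, img_height)
    return " ".join([str(class_id)] + [f"{v:.6f}" for v in values])


if __name__ == "__main__":
    # A 50x100 px box with its top-left corner at (100, 200) in a 400x400 image.
    print(yolo_label_line(0, [100, 200, 50, 100], 400, 400))
    # prints: 0 0.312500 0.625000 0.125000 0.250000
```

Note that COCO category ids are arbitrary integers, while YOLO class ids are expected to be zero-based indices into names, so a real converter would also need an id mapping.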

This is not for us to deal with now, but would be useful in the future!

geezacoleman · 2022-04-27T04:24:26Z

This would be a great feature. I've been putting together a Google Colab notebook to train a YOLOv5 model with Weed-AI datasets, which could be a good interim solution. CVAT also offers export/upload in various formats, which might help the process.

Came across this converter from Ultralytics which might also help.

hlydecker · 2022-05-19T22:36:14Z

So some more thoughts:

Minimum viable product: a WeedCOCO -> COCO converter. A button on the dataset page, "Download Model Ready" (or something similar), would download a zip file containing the dataset in COCO format, with the AgContext object split off into a separate dataset description document (JSON, YAML, or md).
Full-featured version: selected dataset export in COCO, YOLOv5, or VOC format. We would need to build train/val/test split functionality for this to work with YOLO.
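A minimal sketch of what the train/val/test split could look like, assuming a simple seeded random split over image ids. The function name, default fractions, and seed are placeholders, and a real implementation might prefer a split stratified by class or by capture session.

```python
import random


def split_dataset(image_ids, val_fraction=0.2, test_fraction=0.1, seed=0):
    """Randomly assign image ids to train/val/test.

    Hypothetical helper: sorting first and using a seeded RNG makes the
    split reproducible regardless of the input order.
    """
    ids = sorted(image_ids)
    random.Random(seed).shuffle(ids)
    n_test = int(len(ids) * test_fraction)
    n_val = int(len(ids) * val_fraction)
    return {
        "test": ids[:n_test],
        "val": ids[n_test:n_test + n_val],
        "train": ids[n_test + n_val:],
    }


if __name__ == "__main__":
    splits = split_dataset(range(10))
    for name, ids in splits.items():
        print(name, len(ids))
```

Each image id lands in exactly one split, so the images and their label files can then be copied into the matching train, val, and test directories.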

hlydecker added the enhancement (New feature or request) label on Apr 27, 2022
