Preparing Data for YOLO-World

For pre-training YOLO-World, we adopt several datasets, as listed in the table below:

Data	Samples	Type	Boxes
Objects365v1	609k	detection	9,621k
GQA	621k	grounding	3,681k
Flickr	149k	grounding	641k
CC3M-Lite	245k	image-text	821k

For training YOLO-World, we mainly adopt two kinds of dataset classes:

Text JSON

The JSON file is formatted as follows:

[
    ["A_1", "A_2"],
    ["B"],
    ["C_1", "C_2", "C_3"],
    ...
]
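
As a rough illustration of the layout above, the following Python sketch writes and re-reads such a file using only the standard library. It assumes that each inner list holds one or more text strings for a single category, which is how we read the `A_1`, `A_2` pattern. The category names, the file name `class_texts.json`, and the helper `validate_class_texts` are hypothetical and not part of YOLO-World.

```python
import json
import tempfile
from pathlib import Path


def validate_class_texts(texts):
    """Check that `texts` is a non-empty list of non-empty lists of strings.

    Hypothetical helper for illustration; not part of YOLO-World.
    """
    if not isinstance(texts, list) or not texts:
        raise ValueError("expected a non-empty list of categories")
    for idx, names in enumerate(texts):
        if not isinstance(names, list) or not names:
            raise ValueError(f"category {idx} must be a non-empty list")
        if not all(isinstance(name, str) and name for name in names):
            raise ValueError(f"category {idx} must contain non-empty strings")
    return texts


# Hypothetical categories: each inner list is assumed to hold the
# text(s) describing one category.
class_texts = [
    ["person", "human"],
    ["bicycle"],
    ["car", "automobile", "vehicle"],
]

# Write to a temporary directory so the sketch has no side effects.
out_path = Path(tempfile.mkdtemp()) / "class_texts.json"
out_path.write_text(json.dumps(class_texts, indent=4), encoding="utf-8")

# json.dumps emits double-quoted strings, as the JSON format requires.
loaded = validate_class_texts(json.loads(out_path.read_text(encoding="utf-8")))
print(f"{len(loaded)} categories loaded from {out_path.name}")
```

Validating the structure before training may help catch an empty category or a stray non-string entry early, rather than at data-loading time.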
