
Xuxiaoxiaohaha/ImageNet2012ToCOCO


ImageNetToCOCO

This repository converts ImageNet annotations to COCO format (object detection only).

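This project uses ImageNet2012 as an example to generate the COCO files instances_train.json and instances_val.json. It supports object detection only; segmentation and image classification are not supported yet. For image classification, see this project: https://github.com/beerys/Convert_Imagenet_to_COCO_format. The repository layout is shown below.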

|—— resources
    |—— meta.mat
    |—— xmls
        |—— train
            |—— n01440764
            |—— n07579787
        |—— val
|—— example
    |—— resources
        |—— meta.mat
        |—— wnid_2_id_name_path.json
        |—— xmls
            |—— train
                |—— n01440764
                |—— n07579787
            |—— val
        |—— annotations
            |—— instances_ImageNet_val2012.json
            |—— instances_ImageNet_train2012.json
|—— convert2COCO.py
|—— preprocess.py
|—— run.sh

Quick Start

./run.sh

After the script finishes, the resources directory will look like example/resources, with three newly generated files: wnid_2_id_name_path.json, annotations/instances_ImageNet_val2012.json, and annotations/instances_ImageNet_train2012.json.

wnid_2_id_name_path.json is an intermediate file generated by preprocess.py. It records the mapping from each WNID to its ILSVRC2012_ID and class name.

import json
with open('wnid_2_id_name_path.json','r') as f:
    data = json.load(f)
print(data.keys()) # set of WNID
print(data['n01440764']) # a list containing ILSVRC2012_ID and class_name
print(data['n01440764'][0]) # ILSVRC2012_ID
print(data['n01440764'][1]) # class_name
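
When working with category IDs it can be handy to go the other way, from ILSVRC2012_ID back to the WNID and class name. The following is a minimal sketch that assumes only the structure described above (each value is a list whose first two items are the ILSVRC2012_ID and the class name). The helper name `invert_wnid_mapping` and the sample IDs and class names are hypothetical placeholders, not values taken from meta.mat.

```python
def invert_wnid_mapping(wnid_map):
    """Return {ILSVRC2012_ID: (wnid, class_name)} from the WNID-keyed mapping."""
    return {entry[0]: (wnid, entry[1]) for wnid, entry in wnid_map.items()}


# Placeholder data with the same shape as wnid_2_id_name_path.json.
# The IDs and class names below are made up for illustration.
sample = {
    "n01440764": [1, "placeholder_class_a"],
    "n07579787": [2, "placeholder_class_b"],
}

id_to_class = invert_wnid_mapping(sample)
print(id_to_class[1])
```

With the real file, pass the dictionary returned by `json.load` instead of `sample`.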

annotations/instances_ImageNet_val2012.json and annotations/instances_ImageNet_train2012.json are the JSON files that COCO tools require.

from pycocotools.coco import COCO
coco = COCO('instances_ImageNet_val2012.json')
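
If pycocotools is not installed, the generated file can still be sanity-checked with the standard library. The sketch below assumes the file follows the usual COCO detection layout with top-level `images`, `annotations`, and `categories` lists, and that each annotation carries an `image_id`. The function name `summarize_coco` and the tiny in-memory dataset are illustrative only.

```python
import json
from collections import Counter


def summarize_coco(dataset):
    """Count images, annotations, and categories in a COCO-style dict."""
    annotations = dataset.get("annotations", [])
    boxes_per_image = Counter(ann["image_id"] for ann in annotations)
    return {
        "num_images": len(dataset.get("images", [])),
        "num_annotations": len(annotations),
        "num_categories": len(dataset.get("categories", [])),
        "max_boxes_per_image": max(boxes_per_image.values(), default=0),
    }


# Tiny made-up dataset with the same top-level keys as a COCO file.
tiny = {
    "images": [{"id": 1, "file_name": "a.JPEG"}, {"id": 2, "file_name": "b.JPEG"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [0, 0, 10, 10]},
        {"id": 2, "image_id": 1, "category_id": 1, "bbox": [5, 5, 20, 20]},
        {"id": 3, "image_id": 2, "category_id": 1, "bbox": [1, 1, 4, 4]},
    ],
    "categories": [{"id": 1, "name": "placeholder"}],
}

summary = summarize_coco(tiny)
print(summary)

# For the real file (path is an assumption about where you run this):
# with open("instances_ImageNet_val2012.json") as f:
#     print(summarize_coco(json.load(f)))
```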

Next, if you want to read the images, arrange your image files as follows:

|—— your_images_path
    |—— train
        |—— n01440764
            |—— n01440764_10040.JPEG
            |—— n01440764_xxxxx.JPEG
        |—— nyyyyyyyy
            |—— nyyyyyyyy_xxxxx.JPEG
            |—— ...
        |—— ...
    |—— val
        |—— ILSVRC2012_val_00000001.JPEG
        |—— ILSVRC2012_val_00000002.JPEG
        |—— ILSVRC2012_val_xxxxxxxx.JPEG

You can then read an image like this:

from pycocotools.coco import COCO
import os
from PIL import Image
coco = COCO('instances_ImageNet_val2012.json')
img_id = 1
path = coco.loadImgs(img_id)[0]['file_name']
root = 'your_images_path/val' # or 'your_images_path/train'
img = Image.open(os.path.join(root, path)).convert('RGB')  # use the local variable root; there is no self here

Therefore, for your complete ImageNet2012 dataset, prepare the data in the following layout (the same layout as resources):

|—— your_path/resources
    |—— meta.mat  # Get it by unzipping the Development Kit(Task 1 & 2).
    |—— xmls 
        |—— train # Get it by unzipping the Training bounding box annotations(Task 1 & 2 only)
            |—— n01440764
                |—— n01440764_10040.xml
                |—— n01440764_xxxxx.xml
            |—— nyyyyyyyy
                |—— nyyyyyyyy_xxxxx.xml
                |—— ...
            |—— ...
        |—— val # Get it by unzipping the Validation bounding box annotations
            |—— ILSVRC2012_val_00000001.xml
            |—— ILSVRC2012_val_00000002.xml
            |—— ILSVRC2012_val_xxxxxxxx.xml
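
The ImageNet bounding box XML files use a Pascal VOC-style layout, where each `object` has a `bndbox` with `xmin`, `ymin`, `xmax`, and `ymax`. COCO stores boxes as `[x, y, width, height]` instead. The sketch below shows that coordinate conversion with the standard library. It is not the code in convert2COCO.py; the function name `voc_xml_to_coco_boxes` and the sample XML values are made up for illustration, and whether the repository applies any pixel offset convention is not covered here.

```python
import xml.etree.ElementTree as ET

# Made-up annotation in the VOC-style layout; the numbers are illustrative.
SAMPLE_XML = """<annotation>
  <filename>n01440764_10040</filename>
  <size><width>500</width><height>375</height></size>
  <object>
    <name>n01440764</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>"""


def voc_xml_to_coco_boxes(xml_text):
    """Convert each <object> box from corner format to COCO [x, y, w, h]."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bnd = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(bnd.findtext(key)) for key in ("xmin", "ymin", "xmax", "ymax")
        )
        width, height = xmax - xmin, ymax - ymin
        boxes.append({
            "wnid": obj.findtext("name"),  # class label is the WNID
            "bbox": [xmin, ymin, width, height],
            "area": width * height,
        })
    return boxes


boxes = voc_xml_to_coco_boxes(SAMPLE_XML)
print(boxes)
```

The WNID in each box can then be mapped to a numeric category ID through wnid_2_id_name_path.json.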

Then copy the three files convert2COCO.py, preprocess.py, and run.sh into the your_path directory:

|—— your_path
    |—— resources
        |—— ...
    |—— convert2COCO.py
    |—— preprocess.py
    |—— run.sh

Next, set the path variable in run.sh to your_path and run ./run.sh. This produces three files: your_path/resources/annotations/instances_ImageNet_train2012.json, your_path/resources/annotations/instances_ImageNet_val2012.json, and your_path/resources/wnid_2_id_name_path.json.

To read the images, arrange your image data as in the earlier example:

|—— your_images_path
    |—— train  # Get it by unzipping the Training images (Task 1 & 2), then unpack the archives inside this directory to get the JPEG files
        |—— n01440764
            |—— n01440764_10040.JPEG
            |—— n01440764_xxxxx.JPEG
        |—— nyyyyyyyy
            |—— nyyyyyyyy_xxxxx.JPEG
            |—— ...
        |—— ...
    |—— val  # Get it by unzipping the Validation images
        |—— ILSVRC2012_val_00000001.JPEG
        |—— ILSVRC2012_val_00000002.JPEG
        |—— ILSVRC2012_val_xxxxxxxx.JPEG

Then read the files as in the following example:

from pycocotools.coco import COCO
import os
from PIL import Image
coco = COCO('instances_ImageNet_val2012.json')
img_id = 1
path = coco.loadImgs(img_id)[0]['file_name']
root = 'your_images_path/val' # or 'your_images_path/train'
img = Image.open(os.path.join(root, path)).convert('RGB')  # use the local variable root; there is no self here
