# Parsing the COCO dataset

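## The COCO dataset

Size: 25 GB (compressed)

Contents: 330K images, 80 object categories, 5 captions per image, and 250K people annotated with keypoints.

The dataset was released in two parts, the first in 2014 and the second in 2015.
- 2014 release: 82,783 training, 40,504 validation, and 40,775 testing images, with 270K segmented people and 886K segmented objects.
- 2015 release: 165,482 training, 81,208 validation, and 81,434 testing images.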


### Class labels

The 80 class names below use the darknet `coco.names` spelling. The official COCO names differ for a few classes (for example `motorcycle`, `airplane`, `couch`, `potted plant`, `dining table`, `tv`).

- person, bicycle, car, motorbike, aeroplane, bus, train, truck, boat
- traffic light, fire hydrant, stop sign, parking meter, bench
- bird, cat, dog, horse, sheep, cow, elephant, bear, zebra, giraffe
- backpack, umbrella, handbag, tie, suitcase
- frisbee, skis, snowboard, sports ball, kite, baseball bat, baseball glove, skateboard, surfboard, tennis racket
- bottle, wine glass, cup, fork, knife, spoon, bowl
- banana, apple, sandwich, orange, broccoli, carrot, hot dog, pizza, donut, cake
- chair, sofa, pottedplant, bed, diningtable, toilet, tvmonitor
- laptop, mouse, remote, keyboard, cell phone
- microwave, oven, toaster, sink, refrigerator
- book, clock, vase, scissors, teddy bear, hair drier, toothbrush


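### Annotations

There are three annotation types, each stored as JSON files, with separate files for the training and validation sets:
- instances (object instances), used for object detection
- keypoints (keypoints on the objects)
- captions (image captioning)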

## Reading the annotations

In [2]:
import glob
import os
import shutil
import json
from tqdm import tqdm

In [4]:
annotation_path = r"H:\deepLearning\dataset\COCO2014\annotations_trainval2014\annotations"
json_file = os.path.join(annotation_path, 'instances_train2014.json')
with open(json_file) as f:
    data = json.load(f)
print(type(data))

<class 'dict'>


Inspect the contents of the annotation file.

In [5]:
print(len(data))

5


In [6]:
for k in data.keys():
    print(k)

info
images
licenses
annotations
categories


`images` stores each image's file name, size, and ID.

In [8]:
print(data['images'][0])

{'license': 5, 'file_name': 'COCO_train2014_000000057870.jpg', 'coco_url': 'http://images.cocodataset.org/train2014/COCO_train2014_000000057870.jpg', 'height': 480, 'width': 640, 'date_captured': '2013-11-14 16:28:13', 'flickr_url': 'http://farm4.staticflickr.com/3153/2970773875_164f0c0b83_z.jpg', 'id': 57870}


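`annotations` stores the IDs and bounding boxes. Two points deserve attention:
- The entries in `annotations` are not stored in the same order as the entries in `images`; the two are matched by ID (`image_id` in an annotation, `id` in an image).
- Each entry in `annotations` describes only one object in one image, so an image with several objects has its information spread over several entries.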
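
Since each entry covers a single object, it can help to group the entries by `image_id` before working per image. The following is a minimal sketch on a small hand-written sample shaped like `data['annotations']`; the sample values are made up for illustration and `group_by_image` is a hypothetical helper name, not part of any COCO tooling.

```python
from collections import defaultdict

def group_by_image(annotations):
    """Return {image_id: [annotation, ...]} for a list of COCO-style annotations."""
    grouped = defaultdict(list)
    for ann in annotations:
        grouped[ann['image_id']].append(ann)
    return dict(grouped)

# Made-up sample: two objects in image 1, one object in image 2
sample_annotations = [
    {'id': 10, 'image_id': 1, 'category_id': 58, 'bbox': [10.0, 20.0, 30.0, 40.0]},
    {'id': 11, 'image_id': 2, 'category_id': 1, 'bbox': [5.0, 5.0, 50.0, 60.0]},
    {'id': 12, 'image_id': 1, 'category_id': 18, 'bbox': [0.0, 0.0, 15.0, 25.0]},
]

grouped = group_by_image(sample_annotations)
for image_id, anns in grouped.items():
    print(image_id, [a['id'] for a in anns])
```

With the real file, the same call would be `group_by_image(data['annotations'])`.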

In [14]:
print(data['annotations'][0])

{'segmentation': [[312.29, 562.89, 402.25, 511.49, 400.96, 425.38, 398.39, 372.69, 388.11, 332.85, 318.71, 325.14, 295.58, 305.86, 269.88, 314.86, 258.31, 337.99, 217.19, 321.29, 182.49, 343.13, 141.37, 348.27, 132.37, 358.55, 159.36, 377.83, 116.95, 421.53, 167.07, 499.92, 232.61, 560.32, 300.72, 571.89]], 'area': 54652.9556, 'iscrowd': 0, 'image_id': 480023, 'bbox': [116.95, 305.86, 285.3, 266.03], 'category_id': 58, 'id': 86}


## Converting COCO annotations to YOLO labels
Reference: https://github.com/ultralytics/JSON2YOLO

In [1]:
import glob
import os
import shutil
import json
from tqdm import tqdm
import numpy as np

In [2]:
def make_folders(path='../out/'):
    # Create a clean output folder with 'labels' and 'images' subfolders.
    # Note: an existing folder at `path` is deleted first.
    if os.path.exists(path):
        shutil.rmtree(path)  # delete output folder
    os.makedirs(path)  # make new output folder
    os.makedirs(os.path.join(path, 'labels'))  # make new labels folder
    os.makedirs(os.path.join(path, 'images'))  # make new images folder
    return path

In [3]:
def coco91_to_coco80_class():  # converts 91-index to 80-index (val2014)  (paper)
    # https://tech.amikelive.com/node-718/what-object-categories-labels-are-in-coco-dataset/
    # a = np.loadtxt('data/coco.names', dtype='str', delimiter='\n')
    # b = np.loadtxt('data/coco_paper.names', dtype='str', delimiter='\n')
    # x1 = [list(a[i] == b).index(True) + 1 for i in range(80)]  # darknet to coco
    # x2 = [list(b[i] == a).index(True) if any(b[i] == a) else None for i in range(91)]  # coco to darknet
    x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, None, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, None, 24, 25, None,
         None, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, None, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
         51, 52, 53, 54, 55, 56, 57, 58, 59, None, 60, None, None, 61, None, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72,
         None, 73, 74, 75, 76, 77, 78, 79, None]
    return x
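
The table above maps a 1-based COCO category ID (with gaps, up to 90) to a contiguous 0-based class index. As a sketch of an alternative, the same kind of mapping can be derived from the `categories` list by sorting the IDs and enumerating them. The sample below is hand-written in the shape of `data['categories']` and contains only four entries, so the indices it prints are not the real COCO indices; `build_category_index` is a hypothetical helper name.

```python
def build_category_index(categories):
    """Map each category id to a contiguous 0-based index, ordered by id."""
    ids = sorted(c['id'] for c in categories)
    return {cat_id: idx for idx, cat_id in enumerate(ids)}

# Hand-written subset for illustration (note the gap: there is no id 12)
sample_categories = [
    {'supercategory': 'outdoor', 'id': 13, 'name': 'stop sign'},
    {'supercategory': 'person', 'id': 1, 'name': 'person'},
    {'supercategory': 'outdoor', 'id': 11, 'name': 'fire hydrant'},
    {'supercategory': 'vehicle', 'id': 2, 'name': 'bicycle'},
]

index = build_category_index(sample_categories)
print(index)  # {1: 0, 2: 1, 11: 2, 13: 3}
```

Applied to the full 80-entry `data['categories']`, this should give the same indices as the hard-coded table, assuming the file lists exactly the 80 detection categories.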

In [4]:
def convert_coco_json(path, file):
    '''
    path: directory that contains the annotation files
    file: name of one JSON file inside path
    Note: the output folder path/out is recreated on every call, so labels
    written by a previous call are deleted.
    '''
    out_dir = make_folders(os.path.join(path, 'out'))  # output directory
    label_dir = os.path.join(out_dir, 'labels')

    json_file = os.path.join(path, file)
    coco80 = coco91_to_coco80_class()

    with open(json_file) as f:
        data = json.load(f)

    # Create image dict keyed by image ID,
    # e.g. images[57870]['file_name'] == 'COCO_train2014_000000057870.jpg'
    # (integer keys avoid '%g', which keeps only 6 significant digits)
    images = {x['id']: x for x in data['images']}

    # Write labels file
    for x in tqdm(data['annotations'], desc='Annotations %s' % json_file):
        # iscrowd=1 means the bbox encloses a group of objects, not a single object; skip those
        if x['iscrowd']:
            continue

        img = images[x['image_id']]
        h, w, file_name = img['height'], img['width'], img['file_name']

        # The COCO bounding box format is [top left x, top left y, width, height]
        box = np.array(x['bbox'], dtype=np.float64)
        box[:2] += box[2:] / 2  # xy top-left corner to center
        box[[0, 2]] /= w  # normalize x
        box[[1, 3]] /= h  # normalize y

        if (box[2] > 0.) and (box[3] > 0.):  # if w > 0 and h > 0
            label_name = os.path.splitext(file_name)[0] + '.txt'
            # Append mode: an image with several objects gets one line per object
            with open(os.path.join(label_dir, label_name), 'a') as label_file:
                label_file.write('%g %.6f %.6f %.6f %.6f\n' % (coco80[x['category_id'] - 1], *box))
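
To make the box arithmetic concrete, here is a small standalone sketch of the same conversion in plain Python, applied to the `bbox` of the annotation printed earlier. The image size for `image_id` 480023 is not shown in this notebook, so the 480x640 size used here is an assumption made only for illustration; `coco_bbox_to_yolo` is a hypothetical helper name.

```python
def coco_bbox_to_yolo(bbox, img_w, img_h):
    """Convert [x_min, y_min, width, height] in pixels to
    normalized [x_center, y_center, width, height]."""
    x_min, y_min, bw, bh = bbox
    x_center = (x_min + bw / 2) / img_w  # shift to box center, then normalize by image width
    y_center = (y_min + bh / 2) / img_h  # same for y, normalized by image height
    return [x_center, y_center, bw / img_w, bh / img_h]

# bbox taken from data['annotations'][0]; img_w and img_h are ASSUMED values
yolo_box = coco_bbox_to_yolo([116.95, 305.86, 285.3, 266.03], img_w=480, img_h=640)
print(' '.join('%.6f' % v for v in yolo_box))
# 0.540833 0.685742 0.594375 0.415672
```

All four values fall in the range 0 to 1, which is what a YOLO label line expects after the class index.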


### train2014

In [41]:
annotation_path = r"H:\deepLearning\dataset\COCO2014\annotations_trainval2014\annotations"
train_json = 'instances_train2014.json'
convert_coco_json(annotation_path, train_json)


Annotations H:\deepLearning\dataset\COCO2014\annotations_trainval2014\annotations\instances_train2014.json:   2%| | 907


KeyboardInterrupt: 

### val2014

In [6]:
annotation_path = r"H:\deepLearning\dataset\COCO2014\annotations_trainval2014\annotations"
val_json = 'instances_val2014.json'
convert_coco_json(annotation_path, val_json)


Annotations H:\deepLearning\dataset\COCO2014\annotations_trainval2014\annotations\instances_val2014.json: 100%|█| 29187


## Generating the image-list .txt files

In [9]:
def generate_file(path_images, txt_name):
    '''
    Write the full path of every file in the image folder to a txt file, one path per line
    '''
    file_imgs = os.listdir(path_images)
    with open(txt_name, 'w') as f:
        for img in tqdm(file_imgs):
            f.write(f"{os.path.join(path_images, img)}\n")

In [10]:
path_images = r"H:\deepLearning\dataset\COCO2014\val2014\val2014"
txt_name    = r"H:\deepLearning\dataset\COCO2014\valid.txt"
generate_file(path_images, txt_name)

100%|████████████████████████████████████████████████████████████████████████| 40504/40504 [00:00<00:00, 288182.84it/s]


In [11]:
path_images = r"H:\deepLearning\dataset\COCO2014\train2014"
txt_name    = r"H:\deepLearning\dataset\COCO2014\train.txt"
generate_file(path_images, txt_name)

100%|████████████████████████████████████████████████████████████████████████| 82783/82783 [00:00<00:00, 287011.76it/s]
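
`os.listdir` returns every entry in the folder, in arbitrary order, including any non-image files that happen to be there. A slightly more defensive variant is sketched below: it keeps only files with an image extension and sorts the result. The extension set and the name `list_image_paths` are assumptions for illustration, and the demonstration runs on a temporary folder rather than the dataset.

```python
import os
import tempfile

IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png'}  # assumption: adjust to the dataset

def list_image_paths(path_images):
    """Return the sorted full paths of image files directly inside path_images."""
    return sorted(
        os.path.join(path_images, name)
        for name in os.listdir(path_images)
        if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS
    )

# Demonstration on a throwaway folder with two images and one unrelated file
with tempfile.TemporaryDirectory() as tmp:
    for name in ('b.jpg', 'a.JPG', 'notes.txt'):
        open(os.path.join(tmp, name), 'w').close()
    demo_names = [os.path.basename(p) for p in list_image_paths(tmp)]

print(demo_names)  # ['a.JPG', 'b.jpg']
```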


## Test

### Splitting file names

In [16]:
file_path = 'a/b.jpg'
os.path.split(file_path)

('a', 'b.jpg')

In [18]:
file_path = 'b.jpg'
os.path.splitext(file_path)

('b', '.jpg')

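### String formatting
- `%g`: general number format; trailing zeros and an unneeded decimal point are dropped
- `%f`: fixed-point format; always shows decimals (six by default)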

In [24]:
a = [1.0,2,3]
b = ['%g'%x  for x in a]
b

['1', '2', '3']

In [25]:
a = [1.0,2,3]
b = ['%f'%x  for x in a]
b

['1.000000', '2.000000', '3.000000']
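
One caveat worth keeping in mind: `%g` keeps only six significant digits by default and switches to exponent notation beyond that, so it is safe for small class indices but not for arbitrary large integers. A short sketch of the behavior:

```python
six_digits = '%g' % 480023     # six significant digits still fit
seven_digits = '%g' % 1234567  # precision is lost and exponent notation is used
as_integer = '%d' % 1234567    # %d keeps every digit of an integer
fraction = '%g' % 0.5

print(six_digits)    # 480023
print(seven_digits)  # 1.23457e+06
print(as_integer)    # 1234567
print(fraction)      # 0.5
```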