<a href="https://colab.research.google.com/github/YeongRoYun/BearTeam/blob/dev/data/COCO_Converter.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=614

This notebook restructures the dataset format linked above into the COCO format:

```
{
"info": info, "images": [image], "annotations": [annotation], "licenses": [license],
}

info{
"year": int, "version": str, "description": str, "contributor": str, "url": str, "date_created": datetime,
}

image{
"id": int, "width": int, "height": int, "file_name": str, "license": int, "flickr_url": str, "coco_url": str, "date_captured": datetime,
}

license{
"id": int, "name": str, "url": str,
}
```

```
annotation{
"id": int, "image_id": int, "category_id": int, "segmentation": RLE or [polygon], "area": float, "bbox": [x,y,width,height], "iscrowd": 0 or 1,
}

categories[{
"id": int, "name": str, "supercategory": str,
}]
```
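The sections of a COCO file refer to each other by ID: an annotation points at an image and a category, and an image points at a license. The sketch below builds a tiny COCO dictionary and checks those references. All values, including the `PM` supercategory and the helper name `check_references`, are placeholders chosen for illustration and are not part of the dataset.

```python
import json

# Hypothetical minimal COCO document: one image, one annotation, one category.
coco = {
    "info": {"year": 2022, "version": "0.0.1", "description": "example",
             "contributor": "example", "url": "", "date_created": "2022-08-19"},
    "licenses": [{"id": 0, "name": "example", "url": ""}],
    "images": [{"id": 0, "width": 1920, "height": 1080,
                "file_name": "example.jpg", "license": 0,
                "flickr_url": "", "coco_url": "", "date_captured": ""}],
    "annotations": [{"id": 0, "image_id": 0, "category_id": 0,
                     "segmentation": [], "area": 5000.0,
                     "bbox": [100.0, 200.0, 50.0, 100.0], "iscrowd": 0}],
    "categories": [{"id": 0, "name": "bicycle", "supercategory": "PM"}],
}


def check_references(coco):
    """Return a list of messages for IDs that point at missing entries."""
    image_ids = {img["id"] for img in coco["images"]}
    category_ids = {cat["id"] for cat in coco["categories"]}
    license_ids = {lic["id"] for lic in coco["licenses"]}

    problems = []
    for img in coco["images"]:
        if img["license"] not in license_ids:
            problems.append(f"image {img['id']}: unknown license {img['license']}")
    for ann in coco["annotations"]:
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann['id']}: unknown image {ann['image_id']}")
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category {ann['category_id']}")
    return problems


problems = check_references(coco)
serialized = json.dumps(coco)  # every value is plain JSON, so this succeeds
```

Running a check like this before writing the converted file can catch an annotation whose `image_id` was never emitted.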

In [1]:
import os
from google.colab import drive

drive.mount('/content/drive')

Mounted at /content/drive


In [15]:
import re
import json
from pathlib import Path

## Remapping PM classes to bicycle, motorcycle, and kickboard categories

In [11]:
from enum import Enum, auto
from collections import defaultdict
class Category(Enum):
    BICYCLE = auto()
    MOTORCYCLE = auto()
    KICKBOARD = auto()

    def __str__(self):
        if self == Category.BICYCLE:
            return 'bicycle'
        elif self == Category.MOTORCYCLE:
            return 'motorcycle'
        elif self == Category.KICKBOARD:
            return 'kickboard'
        else:
            return 'Not defined'
    
    def __int__(self):
        if self == Category.BICYCLE:
            return 0
        elif self == Category.MOTORCYCLE:
            return 1
        elif self == Category.KICKBOARD:
            return 2
        else:
            return -1

# The source PM class IDs run from 1 to 21; entry i of this list is the category for PM class i + 1.
PMs = [
    Category.MOTORCYCLE,
    Category.BICYCLE,
    Category.KICKBOARD,
    Category.KICKBOARD,
    Category.MOTORCYCLE,
    Category.BICYCLE,
    Category.KICKBOARD,
    Category.MOTORCYCLE,
    Category.BICYCLE,
    Category.KICKBOARD,
    Category.MOTORCYCLE,
    Category.BICYCLE,
    Category.KICKBOARD,
    Category.MOTORCYCLE,
    Category.BICYCLE,
    Category.KICKBOARD,
    Category.MOTORCYCLE,
    Category.BICYCLE,
    Category.MOTORCYCLE,
    Category.BICYCLE,
    Category.KICKBOARD,
]

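# Despite its name, PMtoCategory maps each Category to the list of PM class IDs assigned to it.
PMtoCategory = defaultdict(list)

for idx, pm in enumerate(PMs):
    PMtoCategory[pm].append(idx + 1)  # PM class IDs are 1-based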

print(PMtoCategory)

defaultdict(<class 'list'>, {<Category.MOTORCYCLE: 2>: [1, 5, 8, 11, 14, 17, 19], <Category.BICYCLE: 1>: [2, 6, 9, 12, 15, 18, 20], <Category.KICKBOARD: 3>: [3, 4, 7, 10, 13, 16, 21]})
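When converting annotations, the lookup usually runs the other way: from a PM class ID to its category. The sketch below inverts the grouping printed above and derives a COCO `categories` list from it. It is self-contained, so it redefines a compact `Category` enum whose values mirror what `int(Category.X)` returns in this notebook. The names `category_to_pms` and `pm_to_category` and the `PM` supercategory are assumptions made for illustration.

```python
from enum import Enum


class Category(Enum):
    # Values mirror the notebook's __int__ mapping.
    BICYCLE = 0
    MOTORCYCLE = 1
    KICKBOARD = 2


# Grouping copied from the printed PMtoCategory output.
category_to_pms = {
    Category.MOTORCYCLE: [1, 5, 8, 11, 14, 17, 19],
    Category.BICYCLE: [2, 6, 9, 12, 15, 18, 20],
    Category.KICKBOARD: [3, 4, 7, 10, 13, 16, 21],
}

# Invert the grouping: PM class ID -> Category.
pm_to_category = {
    pm_id: category
    for category, pm_ids in category_to_pms.items()
    for pm_id in pm_ids
}

# COCO "categories" section; the supercategory name is a placeholder.
categories = [
    {"id": category.value, "name": category.name.lower(), "supercategory": "PM"}
    for category in Category
]

print(pm_to_category[4].name.lower())
```

With the inverted dictionary, the `category_id` for a source object with PM class `k` would be `pm_to_category[k].value`.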


## Linking images to annotations

- Annotation Format
```
"info": {
    "video_id": "T006075",
    "clip_id": "001",
    "device": "B",
    "time": "D",
    "weather": "F",
    "is_scripted": "0"
},
"description": {
    "frame_id": "0126",
    "imageWidth": 1920,
    "imageHeight": 1080
},
```
- Image name
```
T006075_001_0126_B_D_F_0.jpg
```
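The image name above is the annotation fields joined with underscores in the order `video_id`, `clip_id`, `frame_id`, `device`, `time`, `weather`, `is_scripted`. As a sketch of the reverse direction, the following parses a file name back into those fields. It assumes that no field contains an underscore and that images always end in `.jpg`; the helper name `parse_img_name` is hypothetical.

```python
import re

# One named group per field, in the order the fields appear in the file name.
IMAGE_NAME_RE = re.compile(
    r"^(?P<video_id>[^_]+)_(?P<clip_id>[^_]+)_(?P<frame_id>[^_]+)_"
    r"(?P<device>[^_]+)_(?P<time>[^_]+)_(?P<weather>[^_]+)_"
    r"(?P<is_scripted>[^_.]+)\.jpg$"
)


def parse_img_name(file_name):
    """Split an image file name into the annotation fields it encodes."""
    match = IMAGE_NAME_RE.match(file_name)
    if match is None:
        raise ValueError(f"unexpected image name: {file_name!r}")
    return match.groupdict()


fields = parse_img_name("T006075_001_0126_B_D_F_0.jpg")
print(fields["video_id"], fields["frame_id"])
```

A parser like this can help find images that have no matching annotation file, or the other way round.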

In [45]:
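def get_img_name(annoPath):
    """
    Build the image file name from the annotation JSON at annoPath.
    """
    assert os.path.exists(annoPath)

    # Use a context manager so the file handle is always closed.
    with open(annoPath, 'r') as fd:
        anno = json.load(fd)

    # Field order follows the image naming scheme:
    # video_id, clip_id, frame_id, device, time, weather, is_scripted
    chunks = [anno['info']['video_id'], anno['info']['clip_id'],
              anno['description']['frame_id'], anno['info']['device'],
              anno['info']['time'], anno['info']['weather'],
              anno['info']['is_scripted'],]
    return '_'.join(chunks) + '.jpg'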


def test():
    annoPath = Path('/content') / 'sample.json'
    assert get_img_name(annoPath) == 'T006075_001_0126_B_D_F_0.jpg'
    return True

test()

True

## Filling in Info

In [43]:
from datetime import datetime

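annoPath = Path('/content') / 'sample.json'
assert os.path.exists(annoPath)
# Use a context manager so the file handle is always closed.
with open(annoPath, 'r') as fd:
    anno = json.load(fd)

info = {
    # NOTE: COCO specifies "year" as an int; this stores the raw date string
    # and falls back to '19990308' when the annotation has no 'date' field.
    "year": anno['info'].get('date', '19990308'), 
    "version": '0.0.1', 
    "description": 'Created by AI-HUB', 
    "contributor": 'BearTeam', 
    "url": 'https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=614', 
    "date_created": str(datetime.now())
}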
print(info)

{'year': '19990308', 'version': '0.0.1', 'description': 'Created by AI-HUB', 'contributor': 'BearTeam', 'url': 'https://www.aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=614', 'date_created': '2022-08-19 05:57:56.900273'}


## Filling in image

In [88]:
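def get_img_id_wrapper():
    # Closure-based counter: each call to get_img_id() returns the next unused ID.
    # The counter is named next_id to avoid shadowing the built-in id().
    next_id = 0
    def get_img_id():
        nonlocal next_id
        tmp = next_id
        next_id += 1
        return tmp
    return get_img_id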

get_img_id = get_img_id_wrapper()

width = anno['description']['imageWidth']
height = anno['description']['imageHeight']

fileName = get_img_name(annoPath)
license = 0
date_captured = anno['info'].get('date', '19990308')


def get_image(annoPath, get_img_id):
    assert annoPath
    with open(annoPath, 'r') as fd:
        anno = json.load(fd)
        width = anno['description']['imageWidth']
        height = anno['description']['imageHeight']
        fileName = get_img_name(annoPath)
        dateCaptured = anno['info'].get('date', '19990308')
        return {
            "id": get_img_id(), 
            "width": width, 
            "height": height, 
            "file_name": fileName, 
            "license": 0, 
            "flickr_url": '', 
            "coco_url": '', 
            "date_captured": dateCaptured,
        }


get_image(annoPath, get_img_id)

{'id': 1,
 'width': 1920,
 'height': 1080,
 'file_name': 'T006075_001_0126_B_D_F_0.jpg',
 'license': 0,
 'flickr_url': '',
 'coco_url': '',
 'date_captured': '19990308'}
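The closure-based counter above hands out one new ID per call. The same behaviour can be sketched with `itertools.count`, which also makes it easy to keep separate counters for images and annotations. The names `make_id_generator`, `next_img_id`, and `next_anno_id` are hypothetical and not part of the notebook.

```python
from itertools import count


def make_id_generator(start=0):
    """Return a function that yields start, start + 1, ... on successive calls."""
    counter = count(start)
    return lambda: next(counter)


# Independent counters: image IDs and annotation IDs do not share state.
next_img_id = make_id_generator()
next_anno_id = make_id_generator()

img_ids = [next_img_id() for _ in range(3)]
first_anno_id = next_anno_id()
print(img_ids, first_anno_id)
```

Because each generator holds its own state, re-creating it resets the sequence. In a notebook, re-running only the cell that calls the generator continues from the last value instead.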

## Filling in License

In [62]:
license = {
    'id': 0,
    'name': 'aihub',
    'url': 'https://www.aihub.or.kr/intrcn/guid/usagepolicy.do?currMenu=151&topMenu=105'
}


## BBOX

In [92]:
PMtoCategory
int(Category.BICYCLE)
print(str(Category.BICYCLE))
iscrowd = 0

bicycle
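COCO stores a box as `[x, y, width, height]`, where `x, y` is the top-left corner. The notebook does not show how boxes are stored in the source annotations, so the sketch below assumes corner coordinates `(x1, y1, x2, y2)`. That layout, the helper names, and the use of the box area for `area` are all assumptions made for illustration.

```python
def corners_to_coco_bbox(x1, y1, x2, y2):
    """Convert two opposite corners into COCO's [x, y, width, height]."""
    left, top = min(x1, x2), min(y1, y2)
    width, height = abs(x2 - x1), abs(y2 - y1)
    return [left, top, width, height]


def make_annotation(anno_id, image_id, category_id, corners):
    """Build a COCO annotation dict for a box-only object (no segmentation)."""
    bbox = corners_to_coco_bbox(*corners)
    return {
        "id": anno_id,
        "image_id": image_id,
        "category_id": category_id,
        "segmentation": [],                # assumption: no mask is available
        "area": float(bbox[2] * bbox[3]),  # assumption: box area stands in for mask area
        "bbox": bbox,
        "iscrowd": 0,
    }


# Hypothetical box with corners (100, 200) and (150, 300).
annotation = make_annotation(anno_id=0, image_id=0, category_id=0,
                             corners=(100, 200, 150, 300))
print(annotation["bbox"], annotation["area"])
```

If the source boxes turn out to be stored as `[x, y, width, height]` already, the corner conversion can be dropped and the values passed through unchanged.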
