# Reformat the ground truth annos from CSV to JSON

This notebook reformats the annos I created for my ground-truth trailcam images using VOTT, converting them from CSV to a flattened JSON format.

In [1]:
from utils import *
import pandas as pd

TEST_RAW = BASE_PATH / "test_annos_raw.csv"  # copied from Dropbox
TEST     = BASE_PATH / "test_annos.json"     # final product will be saved here

width, height  = 1920, 1440

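Each flattened JSON record is keyed by an image id and holds that image's `path`, `width`, `height`, `cat_ids`, `cats`, and `bboxes`. The output of cell 8 shows real examples.

Here's the raw CSV of VOTT annos: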

In [2]:
annos = pd.read_csv(TEST_RAW)
annos.head()

Unnamed: 0.1,Unnamed: 0,image,xmin,ymin,xmax,ymax,label
0,0,1637205386_SYFW0754.jpg,1384.664845,611.141671,1566.972281,1002.33043,person
1,2,1637205534_SYFW0757.jpg,638.714341,641.144444,728.363968,907.807043,person
2,4,1637205701_SYFW0760.jpg,0.0,266.741845,1920.0,1376.946401,person
3,6,1637205809_SYFW0762.jpg,933.709539,632.93923,1023.301293,916.858712,person
4,8,1637206366_SYFW0768.jpg,296.078133,527.310674,553.570951,1005.684123,person


Distinct labels from my VOTT data:

In [3]:
labels = annos['label'].unique().tolist()
labels

['person',
 'deer',
 'railcar',
 'car',
 'train',
 'squirrel',
 'bird',
 'bear',
 'my_car',
 'turkey']

In [4]:
label2clean = {
    'person'   : 'person',
    'deer'     : 'deer',
    'railcar'  : 'railcar',
    'car'      : 'truck',
    'train'    : 'train',
    'squirrel' : 'squirrel',
    'bird'     : 'bird',
    'bear'     : 'bear',
    'my_car'   : 'car',
    'turkey'   : 'turkey'
}

I must add `turkey` to `cat2id`.

In [5]:
cat2id = {clean:i for i,clean in enumerate(label2clean.values())}
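
To make the result of that comprehension concrete, here is a minimal sketch that rebuilds `cat2id` from the same `label2clean` mapping. The `dict.fromkeys` de-duplication step is an addition for illustration, not part of the cell above. It only matters if two raw labels ever map to the same clean name, in which case enumerating the raw values would leave gaps in the ids.

```python
# Same mapping as cell 4.
label2clean = {
    'person': 'person', 'deer': 'deer', 'railcar': 'railcar', 'car': 'truck',
    'train': 'train', 'squirrel': 'squirrel', 'bird': 'bird', 'bear': 'bear',
    'my_car': 'car', 'turkey': 'turkey',
}

# Assumption (not in the original cell): drop repeated clean names first.
# dict.fromkeys keeps first-seen order, so ids stay contiguous.
clean_labels = list(dict.fromkeys(label2clean.values()))
cat2id = {clean: i for i, clean in enumerate(clean_labels)}

print(cat2id)
```

With the ten labels above every clean name is already distinct, so this gives the same ids as the original comprehension, with `turkey` picking up the last id.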

In [6]:
annos = annos.drop(columns=['Unnamed: 0'])                                      # remove unnamed col
annos['bbox'] = annos.apply(lambda r: [r.xmin, r.ymin, r.xmax, r.ymax], axis=1) # create bbox col
annos['category'] = annos.apply(lambda r: label2clean[r.label], axis=1)          # create category col

In [7]:
res = {}

for fn in annos['image'].unique():
    
    # flat_annos has an int key b/c it was needed for the joins
    #  between train_json['images'] and train_json['annotations'].
    #  Here the key is a digit-only string: the filename with '_SYFW'
    #  and '.jpg' stripped. It should be unique per trailcam image,
    #  but that hasn't been tested.
    
    img_id = fn.replace('_SYFW', '').replace('.jpg', '')
    
    rec = {}
    rec['path']    = str(TRAILCAM/fn)
    rec['width']   = width
    rec['height']  = height
    rec['cat_ids'] = []
    rec['cats']    = []
    rec['bboxes']  = []
    
    for idx, row in annos[annos['image']==fn].iterrows():
        rec['bboxes'].append(row.bbox)
        rec['cats'].append(row.category)
        rec['cat_ids'].append(cat2id[row.category])
    
    rec['bboxes'] = to_xywh(rec['bboxes'])
    
    res[img_id] = rec
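
The comment in the loop notes that the uniqueness of `img_id` hasn't been tested. One way to check it is sketched below. The helper names `make_img_id` and `find_id_collisions` are hypothetical (they are not in `utils`), and the sample filenames are taken from the CSV preview above.

```python
def make_img_id(fn):
    """Apply the same string transformation used in the loop above."""
    return fn.replace('_SYFW', '').replace('.jpg', '')

def find_id_collisions(filenames):
    """Return {img_id: [filenames]} for every id shared by more than one file."""
    by_id = {}
    for fn in sorted(set(filenames)):
        by_id.setdefault(make_img_id(fn), []).append(fn)
    return {img_id: fns for img_id, fns in by_id.items() if len(fns) > 1}

sample = [
    '1637205386_SYFW0754.jpg',
    '1637205534_SYFW0757.jpg',
    '1637205701_SYFW0760.jpg',
]
collisions = find_id_collisions(sample)
print(collisions or 'no collisions')
```

On the real data this could be called as `find_id_collisions(annos['image'])`. An empty result means no two filenames collapse to the same key, so no record in `res` gets silently overwritten.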

In [8]:
res

{'16372053860754': {'path': '/home/rory/data/trailcam/1637205386_SYFW0754.jpg',
  'width': 1920,
  'height': 1440,
  'cat_ids': [0],
  'cats': ['person'],
  'bboxes': [[1384.6648454157782,
    611.1416709863745,
    182.3074360341152,
    391.18875888625587]]},
 '16372055340757': {'path': '/home/rory/data/trailcam/1637205534_SYFW0757.jpg',
  'width': 1920,
  'height': 1440,
  'cat_ids': [0],
  'cats': ['person'],
  'bboxes': [[638.7143405932655,
    641.1444435473093,
    89.64962753874931,
    266.6625993963827]]},
 '16372057010760': {'path': '/home/rory/data/trailcam/1637205701_SYFW0760.jpg',
  'width': 1920,
  'height': 1440,
  'cat_ids': [0],
  'cats': ['person'],
  'bboxes': [[0.0, 266.7418450418745, 1920.0, 1110.2045560851745]]},
 '16372058090762': {'path': '/home/rory/data/trailcam/1637205809_SYFW0762.jpg',
  'width': 1920,
  'height': 1440,
  'cat_ids': [0],
  'cats': ['person'],
  'bboxes': [[933.7095386825226,
    632.9392303880078,
    89.59175407536077,
    283.919481969440
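
The `bboxes` in this output are in `[x, y, width, height]` form, while the CSV stores corners as `[xmin, ymin, xmax, ymax]`. The conversion is done by `to_xywh` from `utils`, whose body isn't shown in this notebook. The sketch below is an assumption about what it does, inferred from the numbers above, and is named `to_xywh_sketch` to keep it separate from the real helper.

```python
def to_xywh_sketch(bboxes):
    """Convert [xmin, ymin, xmax, ymax] boxes to [x, y, width, height].

    Hypothetical stand-in for utils.to_xywh, inferred from the output above.
    """
    return [
        [xmin, ymin, xmax - xmin, ymax - ymin]
        for xmin, ymin, xmax, ymax in bboxes
    ]

# Third row of the CSV preview: a box spanning the full image width.
corners = [[0.0, 266.741845, 1920.0, 1376.946401]]
print(to_xywh_sketch(corners))
```

The top-left corner is kept as is, and the width and height are the differences between the opposite corners, which agrees with the third record shown above (width 1920.0, height about 1110.2).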

In [10]:
save_json(res, TEST)

Saved to /home/rory/repos/trailcam/test_annos.json
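
`save_json` is imported from `utils` and its body isn't shown here. A minimal sketch of such a helper, assuming it simply wraps `json.dump` and prints the destination, might look like the following. The name `save_json_sketch`, the `indent` setting, and the temporary directory are all illustrative choices. The record is copied from the output of cell 8.

```python
import json
import tempfile
from pathlib import Path

def save_json_sketch(obj, path):
    """Hypothetical stand-in for utils.save_json: dump obj and report the path."""
    path = Path(path)
    with path.open('w') as f:
        json.dump(obj, f, indent=2)
    print(f'Saved to {path}')

record = {
    '16372057010760': {
        'path': '/home/rory/data/trailcam/1637205701_SYFW0760.jpg',
        'width': 1920,
        'height': 1440,
        'cat_ids': [0],
        'cats': ['person'],
        'bboxes': [[0.0, 266.7418450418745, 1920.0, 1110.2045560851745]],
    }
}

# Round trip through a temporary file to confirm nothing is lost on save.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / 'test_annos.json'
    save_json_sketch(record, out)
    reloaded = json.loads(out.read_text())
```

Reloading the file and comparing it with the in-memory dict is a cheap way to confirm that the string keys and nested lists survive serialization unchanged.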
