## Load caption annotations

The ImageNetTraining dataset contains 1200 images, each annotated with 5 captions.

In [1]:
import numpy as np
import pandas as pd
from pprint import pprint

df = pd.read_json("data/ImageNetTraining_captions.jsonl", orient="records", lines=True)

# Load image ID list (1200 images)
image_id_list = sorted(df["image_id"].unique())
print("Image ID examples")
print(image_id_list[:10])

# Load category ID list (150 unique categories)
category_id_list = sorted(df["category_id"].unique())
print("Category ID examples")
print(category_id_list[:10])

Image ID examples
['n01518878_10042', 'n01518878_12028', 'n01518878_14075', 'n01518878_14910', 'n01518878_5958', 'n01518878_7346', 'n01518878_7579', 'n01518878_8432', 'n01639765_22407', 'n01639765_32862']
Category ID examples
['n01518878', 'n01639765', 'n01645776', 'n01664990', 'n01704323', 'n01726692', 'n01768244', 'n01770393', 'n01772222', 'n01784675']
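
In the example output above, each image ID appears to consist of its category ID, an underscore, and a numeric suffix. This is an observation from the printed examples, not a documented guarantee. Under that assumption, a minimal sketch for recovering the category from an image ID and counting images per category could look like this (`category_of` is a hypothetical helper, and the IDs are copied from the output above):

```python
from collections import Counter

# Image IDs copied from the "Image ID examples" output above.
image_ids = [
    "n01518878_10042", "n01518878_12028", "n01518878_14075", "n01518878_14910",
    "n01518878_5958", "n01518878_7346", "n01518878_7579", "n01518878_8432",
    "n01639765_22407", "n01639765_32862",
]


def category_of(image_id):
    """Return the part of an image ID before the first underscore.

    Assumes the '<category_id>_<suffix>' pattern seen in the examples.
    """
    return image_id.partition("_")[0]


# Count how many of the example images fall into each category.
images_per_category = Counter(category_of(i) for i in image_ids)
print(images_per_category)
```

With the real data, the same count is available from `df["category_id"].value_counts()`, which does not rely on the naming pattern. The eight `n01518878` images in the example are consistent with 1200 images spread over 150 categories.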


In [2]:
# Load captions (6000 captions)
for image_id in image_id_list[:3]: # only 3
    print("Image ID:", image_id)
    # 5 captions for each image
    captions = df.query("image_id == '{}'".format(image_id))["captions"].values[0]
    pprint(captions)


Image ID: n01518878_10042
['A brown and tan ostridge walking near a metal building,',
 'Brown feathered ostrich with white head and neck facing to the left, '
 'standing near a metal or aluminum type fencing.',
 'An ostrich with ruffled feathers looks in the direction of a green building.',
 'A brown ostrich is looking the direction of a green wall.',
 'a ostrich standing near a fence and looking at something']
Image ID: n01518878_12028
['The head of an ostridge looking over a metal fence.',
 'An emu or ostrich with an un-amused expression is looking in the direction '
 'of the camera.',
 'An ostrich with fuzzy hair on its head looks over a wire fence.',
 'A serious looking ostrich peers out over a wire fence.',
 'A close up image of a disappointed looking ostrich.']
Image ID: n01518878_14075
['A brown ostridge with its beak open walking in a field of dead grass.',
 'A large bird holds its beak open and stands in some brown grass.',
 'An ostrich with its beak open walks through long, d
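
For some analyses it may be more convenient to have one row per caption rather than one list per image. The sketch below is one way to do that with the standard library only. It assumes each line of the JSON Lines file is an object with `image_id`, `category_id`, and `captions` fields, as the cells above suggest. The two records are stand-ins built from captions shown in the output (two captions each instead of five), and `iter_caption_rows` is a hypothetical helper.

```python
import json

# Stand-in records in the assumed JSON Lines layout (captions shortened to two per image).
jsonl_text = "\n".join([
    json.dumps({
        "image_id": "n01518878_10042",
        "category_id": "n01518878",
        "captions": [
            "A brown ostrich is looking the direction of a green wall.",
            "a ostrich standing near a fence and looking at something",
        ],
    }),
    json.dumps({
        "image_id": "n01518878_12028",
        "category_id": "n01518878",
        "captions": [
            "A serious looking ostrich peers out over a wire fence.",
            "A close up image of a disappointed looking ostrich.",
        ],
    }),
])


def iter_caption_rows(text):
    """Yield (image_id, caption_index, caption) for every caption in a JSONL string."""
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        for index, caption in enumerate(record["captions"]):
            yield record["image_id"], index, caption


rows = list(iter_caption_rows(jsonl_text))
for row in rows:
    print(row)
```

With the DataFrame loaded above, `df.explode("captions")` should give a comparable one-row-per-caption table.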

## Load category synonyms

The ImageNetTraining image stimuli were obtained from [ImageNet](https://www.image-net.org/), whose ImageNet IDs are derived from [WordNet](https://wordnet.princeton.edu/) synsets.
You can read the corresponding WordNet synonyms for the 150 categories included in ImageNetTraining. (If you would like to access more detailed information, please use the `nltk` library.)

In [3]:
df_cat = pd.read_json("data/ImageNetTraining_category.jsonl", orient="records", lines=True)

for category_id in category_id_list[:3]: # only 3
    print("Category ID:", category_id)
    synonyms = df_cat.query("synset == '{}'".format(category_id))["synonym"].values[0]
    print("Synonym:", synonyms)


Category ID: n01518878
Synonym: ostrich
Category ID: n01639765
Synonym: frog
Category ID: n01645776
Synonym: true toad
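
For the more detailed WordNet information mentioned above, one possible approach with `nltk` is sketched below. It assumes that an ImageNet ID is a part-of-speech letter followed by the WordNet synset offset, and that the WordNet corpus installed with `nltk` matches the WordNet version ImageNet was built on. `parse_wnid` and `lookup_synset` are hypothetical helper names, not part of this dataset or of `nltk`.

```python
def parse_wnid(wnid):
    """Split an ImageNet ID such as 'n01518878' into (pos, offset).

    Assumes the ID is a part-of-speech letter followed by the synset offset.
    """
    return wnid[0], int(wnid[1:])


def lookup_synset(wnid):
    """Return the nltk WordNet synset for an ImageNet ID.

    Requires nltk and its WordNet corpus (nltk.download("wordnet")).
    """
    # Imported lazily so that parse_wnid works without nltk installed.
    from nltk.corpus import wordnet as wn

    pos, offset = parse_wnid(wnid)
    return wn.synset_from_pos_and_offset(pos, offset)


print(parse_wnid("n01518878"))
```

Once the corpus is downloaded, `lookup_synset("n01518878")` returns a synset object whose `lemma_names()`, `definition()`, and `hypernyms()` methods give the synonyms, gloss, and parent categories.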
