# Open Images data preparation

The goal of this notebook is to format the public Open Images data so that it can be used with DETR.

To download Open Images, we're using the following script:
- https://github.com/spmallick/learnopencv/blob/master/downloadOpenImages/downloadOI.py


In [36]:
import sys
import numpy as np

In [1]:
# Download required meta-files
!wget https://storage.googleapis.com/openimages/2018_04/class-descriptions-boxable.csv
 
!wget https://storage.googleapis.com/openimages/2018_04/train/train-annotations-bbox.csv
 
!wget https://storage.googleapis.com/openimages/2018_04/validation/validation-annotations-bbox.csv
 
!wget https://storage.googleapis.com/openimages/2018_04/test/test-annotations-bbox.csv 

--2020-08-15 02:23:36--  https://storage.googleapis.com/openimages/2018_04/class-descriptions-boxable.csv
Resolving storage.googleapis.com... 172.217.14.208, 172.217.14.240, 216.58.193.80, ...
Connecting to storage.googleapis.com|172.217.14.208|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 11255 (11K) [text/csv]
Saving to: ‘class-descriptions-boxable.csv’


2020-08-15 02:23:36 (24.2 MB/s) - ‘class-descriptions-boxable.csv’ saved [11255/11255]

--2020-08-15 02:23:36--  https://storage.googleapis.com/openimages/2018_04/train/train-annotations-bbox.csv
Resolving storage.googleapis.com... 216.58.217.48, 216.58.193.80, 172.217.14.240, ...
Connecting to storage.googleapis.com|216.58.217.48|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1194033454 (1.1G) [text/csv]
Saving to: ‘train-annotations-bbox.csv’


2020-08-15 02:23:54 (63.0 MB/s) - ‘train-annotations-bbox.csv’ saved [1194033454/1194033454]

--2020-08-15 02:23:54--  https://storage

### Testing with a small sample of Google Open Images.

In [11]:
!python3 downloadOI.py --mode train --classes 'Ice_cream,Cookie' 

#Download bathtub and toilet images
!python3 downloadOI.py --classes 'Bathtub,Toilet' --mode validation

Class 0 : Ice_cream
Class 1 : Cookie
Annotation Count : 6992
Number of images to be downloaded : 2452
100%|███████████████████████████████████████| 2452/2452 [10:00<00:00,  4.08it/s]
Class 0 : Bathtub
Class 1 : Toilet
Annotation Count : 43
Number of images to be downloaded : 39
100%|███████████████████████████████████████████| 39/39 [00:11<00:00,  3.36it/s]


In [41]:
# Classes of amenities Airbnb mostly cares about
subset = ["Toilet",
          "Swimming_pool",
          "Bed",
          "Billiard_table",
          "Sink",
          "Fountain",
          "Oven",
          "Ceiling_fan",
          "Television",
          "Microwave_oven",
          "Gas_stove",
          "Refrigerator",
          "Kitchen_&_dining_room_table",
          "Washing_machine",
          "Bathtub",
          "Stairs",
          "Fireplace",
          "Pillow",
          "Mirror",
          "Shower",
          "Couch",
          "Countertop",
          "Coffeemaker",
          "Dishwasher",
          "Sofa_bed",
          "Tree_house",
          "Towel",
          "Porch",
          "Wine_rack",
          "Jacuzzi"]

# Converting Images/labels to COCO

Convert the Google Open Images images and labels to the COCO annotation format described at http://cocodataset.org. We expect the directory structure to be the following:
 ```
 path/to/coco/
  annotations/  # annotation json files
  train2017/    # train images
  val2017/      # val images
 ```

The expected COCO dataset format:

```
annotation{
    "id": int, 
    "image_id": int, 
    "category_id": int, 
    "segmentation": RLE or [polygon], 
    "area": float, 
    "bbox": [x,y,width,height], 
    "iscrowd": 0 or 1,
}

categories[{
    "id": int, 
    "name": str, 
    "supercategory": str,
}]
```
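
Open Images stores each box as normalised `XMin, XMax, YMin, YMax` fractions of the image size, whereas the COCO `bbox` above is `[x, y, width, height]` in pixels. A minimal sketch of the conversion, assuming the image width and height in pixels are already known (the function name `open_images_to_coco_bbox` is hypothetical, not part of any library):

```python
def open_images_to_coco_bbox(xmin, xmax, ymin, ymax, img_width, img_height):
    """Convert normalised Open Images corners to a COCO [x, y, width, height] box in pixels."""
    x = xmin * img_width
    y = ymin * img_height
    width = (xmax - xmin) * img_width
    height = (ymax - ymin) * img_height
    return [x, y, width, height]


# Hypothetical 800x400 image with a box covering its central region
bbox = open_images_to_coco_bbox(0.25, 0.75, 0.125, 0.625, img_width=800, img_height=400)
area = bbox[2] * bbox[3]  # for box-only annotations, area is commonly set to width * height
print(bbox, area)
```

Note the argument order: Open Images lists both X values before both Y values, so passing the four numbers straight through as a COCO box would silently produce wrong boxes.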

Current format of the downloaded data:
- Filepath: `validation/Toilet/0d0719cfd8e417b7.jpg`
- Label filename: `0d0719cfd8e417b7.txt`
- Image filename: `0d0719cfd8e417b7.jpg`
- Contents of label file: `Toilet,0.093371,0.986232,0.189984,0.965806`
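
A label file can hold more than one line when an image contains several boxes, so it is safer to parse it line by line. A minimal sketch, assuming each line follows the `ClassName,XMin,XMax,YMin,YMax` layout shown above (the helper name `parse_label_text` is made up for illustration):

```python
def parse_label_text(text):
    """Parse label-file text into a list of (class_name, [xmin, xmax, ymin, ymax]) tuples."""
    boxes = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        class_name, *coords = line.split(",")
        boxes.append((class_name, [float(c) for c in coords]))
    return boxes


sample = "Toilet,0.093371,0.986232,0.189984,0.965806\n"
parsed = parse_label_text(sample)
print(parsed)
```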

I'll need a way to:
1. Traverse different file paths
2. Gather filenames and explore their text contents
3. Seek duplicates and make sure they contain multiple labels

What I should end up with is:
- A single folder of images (this can be split into train/test later)
- A dictionary of all of the different parameters for each image path
    - Perhaps I could do this with a pandas DataFrame, so it's visual, and then convert it to JSON?
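
Whichever intermediate structure is used, the end product has to be a JSON file with `images`, `annotations` and `categories` lists. A rough sketch with the standard `json` module, under the assumption that each record already carries pixel-space COCO boxes; the record layout and the `build_coco_dict` name are illustrative only, and the image size and box below are placeholders:

```python
import json


def build_coco_dict(records, class_names):
    """Build a COCO-style dict from records of the form
    {"file_name": str, "width": int, "height": int, "boxes": [(class_name, [x, y, w, h]), ...]}."""
    category_ids = {name: idx for idx, name in enumerate(class_names)}
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": idx, "name": name, "supercategory": "none"}
                       for name, idx in category_ids.items()],
    }
    for image_id, record in enumerate(records):
        # COCO expects integer image IDs, so the hex Open Images ID is kept only in file_name
        coco["images"].append({"id": image_id, "file_name": record["file_name"],
                               "width": record["width"], "height": record["height"]})
        for class_name, (x, y, w, h) in record["boxes"]:
            coco["annotations"].append({
                "id": len(coco["annotations"]),  # running annotation counter
                "image_id": image_id,
                "category_id": category_ids[class_name],
                "bbox": [x, y, w, h],
                "area": w * h,
                "iscrowd": 0,
            })
    return coco


records = [{"file_name": "0d0719cfd8e417b7.jpg", "width": 1024, "height": 768,  # placeholder size
            "boxes": [("Toilet", [96.0, 146.0, 914.0, 596.0])]}]  # made-up pixel box
coco = build_coco_dict(records, ["Bathtub", "Toilet"])
coco_json = json.dumps(coco, indent=2)
```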

In [2]:
import pandas as pd
df = pd.DataFrame()
df

In [13]:
import os
file_ids = [os.path.splitext(file)[0] for file in os.listdir("validation/Bathtub")]
len(file_ids), len(set(file_ids))       

(28, 14)

In [15]:
fi_id = [os.path.splitext(file)[0] for file in os.listdir('validation/Bathtub')]

In [16]:
unique_ids = list(set(file_ids))
#unique_ids

In [17]:
unique_ids

['2f2039140c8b1f2b',
 'cd727c5e0d2cfc1b',
 '11dcac4ca5923a58',
 '789cc0283c0e18af',
 '2f0a18a409d5a769',
 '8e562fee6ff208f7',
 '837b3d11ff02f116',
 '784bba5f12ee7c35',
 '3f66fdf9688514d4',
 '15348e4f2c7ebe0f',
 'eccb080e57b2aac5',
 '822f20d881eebeb9',
 'd7d469cb4c7e8cd2',
 'ddd3f96ff7f0bc78']

In [18]:
df["image_id"] = unique_ids

In [20]:
df.head(5)

Unnamed: 0,image_id
0,2f2039140c8b1f2b
1,cd727c5e0d2cfc1b
2,11dcac4ca5923a58
3,789cc0283c0e18af
4,2f0a18a409d5a769


In [21]:
%%time
img_files = []
label_files = []
for path, dirnames, filenames in os.walk("validation"):
    for file in filenames:
        file_path = os.path.join(path, file)
        if file.endswith(".jpg"):
            img_files.append(file_path)
        elif file.endswith(".txt"):
            # Label files share the image's name but end in .txt
            label_files.append(file_path)
img_files, label_files

CPU times: user 0 ns, sys: 2.5 ms, total: 2.5 ms
Wall time: 1.37 ms


(['validation/Bathtub/3f66fdf9688514d4.jpg',
  'validation/Bathtub/2f2039140c8b1f2b.jpg',
  'validation/Bathtub/cd727c5e0d2cfc1b.jpg',
  'validation/Bathtub/15348e4f2c7ebe0f.jpg',
  'validation/Bathtub/789cc0283c0e18af.jpg',
  'validation/Bathtub/784bba5f12ee7c35.jpg',
  'validation/Bathtub/eccb080e57b2aac5.jpg',
  'validation/Bathtub/2f0a18a409d5a769.jpg',
  'validation/Bathtub/8e562fee6ff208f7.jpg',
  'validation/Bathtub/822f20d881eebeb9.jpg',
  'validation/Bathtub/d7d469cb4c7e8cd2.jpg',
  'validation/Bathtub/11dcac4ca5923a58.jpg',
  'validation/Bathtub/ddd3f96ff7f0bc78.jpg',
  'validation/Bathtub/837b3d11ff02f116.jpg',
  'validation/Toilet/b2dc8e2437e8803f.jpg',
  'validation/Toilet/36d8c654fba5f337.jpg',
  'validation/Toilet/217b95a71cb220f8.jpg',
  'validation/Toilet/3a511ef0f68cc438.jpg',
  'validation/Toilet/42298c9659ce6603.jpg',
  'validation/Toilet/17f698ec871569ca.jpg',
  'validation/Toilet/c7312a8b436f5e90.jpg',
  'validation/Toilet/b3a65783709539de.jpg',
  'validation/Toil

In [22]:
len(img_files)

39

In [23]:
len(label_files)

39

In [24]:
os.path.splitext(img_files[0])[0].split("/")

['validation', 'Bathtub', '3f66fdf9688514d4']

In [25]:
%%time
img_details = []
for img_path in img_files:
    dataset, label, img_id = os.path.splitext(img_path)[0].split("/")
    img_detail = {
        "dataset": dataset,
        "label": label,
        "id": img_id,
        "file_name": img_path
    }
    img_details.append(img_detail)
img_details

CPU times: user 127 µs, sys: 0 ns, total: 127 µs
Wall time: 132 µs


[{'dataset': 'validation',
  'label': 'Bathtub',
  'id': '3f66fdf9688514d4',
  'file_name': 'validation/Bathtub/3f66fdf9688514d4.jpg'},
 {'dataset': 'validation',
  'label': 'Bathtub',
  'id': '2f2039140c8b1f2b',
  'file_name': 'validation/Bathtub/2f2039140c8b1f2b.jpg'},
 {'dataset': 'validation',
  'label': 'Bathtub',
  'id': 'cd727c5e0d2cfc1b',
  'file_name': 'validation/Bathtub/cd727c5e0d2cfc1b.jpg'},
 {'dataset': 'validation',
  'label': 'Bathtub',
  'id': '15348e4f2c7ebe0f',
  'file_name': 'validation/Bathtub/15348e4f2c7ebe0f.jpg'},
 {'dataset': 'validation',
  'label': 'Bathtub',
  'id': '789cc0283c0e18af',
  'file_name': 'validation/Bathtub/789cc0283c0e18af.jpg'},
 {'dataset': 'validation',
  'label': 'Bathtub',
  'id': '784bba5f12ee7c35',
  'file_name': 'validation/Bathtub/784bba5f12ee7c35.jpg'},
 {'dataset': 'validation',
  'label': 'Bathtub',
  'id': 'eccb080e57b2aac5',
  'file_name': 'validation/Bathtub/eccb080e57b2aac5.jpg'},
 {'dataset': 'validation',
  'label': 'Bathtub',

In [26]:
# Now let's get bounding box information
label_details = []
for label_path in label_files:
    dataset, label, img_id = os.path.splitext(label_path)[0].split("/")
    with open(label_path, "r") as file:
        # A label file can hold several lines (one per box); only the first is used here.
        # Reading the whole file and stripping newlines would fuse the lines together.
        data = file.readline().strip().split(",")
        # bbox dimensions = XMin, XMax, YMin, YMax (kept as strings for now)
        XMin, XMax, YMin, YMax = data[1], data[2], data[3], data[4]

    label_detail = {
        "dataset": dataset,
        "label": label,
        "id": img_id,
        "bbox": [XMin, XMax, YMin, YMax]
    }
    label_details.append(label_detail)
label_details

[{'dataset': 'validation',
  'label': 'Bathtub',
  'id': '784bba5f12ee7c35',
  'bbox': ['0.000000', '1.000000', '0.000000', '1.000000']},
 {'dataset': 'validation',
  'label': 'Bathtub',
  'id': '789cc0283c0e18af',
  'bbox': ['0.039686', '0.948718', '0.119581', '1.000000']},
 {'dataset': 'validation',
  'label': 'Bathtub',
  'id': '2f2039140c8b1f2b',
  'bbox': ['0.000000', '1.000000', '0.000000', '1.000000']},
 {'dataset': 'validation',
  'label': 'Bathtub',
  'id': 'cd727c5e0d2cfc1b',
  'bbox': ['0.048442', '0.980496', '0.427911', '0.806723']},
 {'dataset': 'validation',
  'label': 'Bathtub',
  'id': '822f20d881eebeb9',
  'bbox': ['0.000000', '1.000000', '0.181437', '0.963113']},
 {'dataset': 'validation',
  'label': 'Bathtub',
  'id': '8e562fee6ff208f7',
  'bbox': ['0.013755', '0.513952', '0.476118', '0.999980']},
 {'dataset': 'validation',
  'label': 'Bathtub',
  'id': 'eccb080e57b2aac5',
  'bbox': ['0.303341', '0.905829', '0.470852', '0.999948']},
 {'dataset': 'validation',


In [27]:
df = pd.DataFrame(img_details)
df

Unnamed: 0,dataset,label,id,file_name
0,validation,Bathtub,3f66fdf9688514d4,validation/Bathtub/3f66fdf9688514d4.jpg
1,validation,Bathtub,2f2039140c8b1f2b,validation/Bathtub/2f2039140c8b1f2b.jpg
2,validation,Bathtub,cd727c5e0d2cfc1b,validation/Bathtub/cd727c5e0d2cfc1b.jpg
3,validation,Bathtub,15348e4f2c7ebe0f,validation/Bathtub/15348e4f2c7ebe0f.jpg
4,validation,Bathtub,789cc0283c0e18af,validation/Bathtub/789cc0283c0e18af.jpg
5,validation,Bathtub,784bba5f12ee7c35,validation/Bathtub/784bba5f12ee7c35.jpg
6,validation,Bathtub,eccb080e57b2aac5,validation/Bathtub/eccb080e57b2aac5.jpg
7,validation,Bathtub,2f0a18a409d5a769,validation/Bathtub/2f0a18a409d5a769.jpg
8,validation,Bathtub,8e562fee6ff208f7,validation/Bathtub/8e562fee6ff208f7.jpg
9,validation,Bathtub,822f20d881eebeb9,validation/Bathtub/822f20d881eebeb9.jpg


I should write a function which:
- Gets my image IDs
- Matches them with their details in the annotations csv 
- Saves their file name
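
A sketch of that function using only the standard `csv` module, assuming the column names of `validation-annotations-bbox.csv` as displayed in this notebook. The two inline rows are copied from the validation annotations shown here, with the unused flag columns dropped; `annotations_for_images` is a hypothetical name.

```python
import csv
import io
from collections import defaultdict


def annotations_for_images(csv_file, image_ids):
    """Return {image_id: [(label_code, [xmin, xmax, ymin, ymax]), ...]} for the requested IDs."""
    wanted = set(image_ids)
    matches = defaultdict(list)
    for row in csv.DictReader(csv_file):
        if row["ImageID"] in wanted:
            box = [float(row[key]) for key in ("XMin", "XMax", "YMin", "YMax")]
            matches[row["ImageID"]].append((row["LabelName"], box))
    return dict(matches)


demo_csv = io.StringIO(
    "ImageID,Source,LabelName,Confidence,XMin,XMax,YMin,YMax\n"
    "0001eeaf4aed83f9,freeform,/m/0cmf2,1,0.022464,0.964178,0.070656,0.800164\n"
    "0d0719cfd8e417b7,freeform,/m/09g1w,1,0.093371,0.986232,0.189984,0.965806\n"
)
found = annotations_for_images(demo_csv, ["0d0719cfd8e417b7"])
print(found)
```

The image file name then follows from the ID by appending `.jpg`.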

In [6]:
val_annot = pd.read_csv("validation-annotations-bbox.csv")
val_annot

Unnamed: 0,ImageID,Source,LabelName,Confidence,XMin,XMax,YMin,YMax,IsOccluded,IsTruncated,IsGroupOf,IsDepiction,IsInside
0,0001eeaf4aed83f9,freeform,/m/0cmf2,1,0.022464,0.964178,0.070656,0.800164,0,0,0,0,0
1,000595fe6fee6369,freeform,/m/02wbm,1,0.000000,1.000000,0.000233,1.000000,0,0,1,0,0
2,000595fe6fee6369,freeform,/m/02xwb,1,0.141030,0.180277,0.676262,0.732455,0,0,0,0,0
3,000595fe6fee6369,freeform,/m/02xwb,1,0.213781,0.253028,0.298764,0.354956,1,0,0,0,0
4,000595fe6fee6369,freeform,/m/02xwb,1,0.232926,0.288447,0.488954,0.545146,1,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...
204616,ffff21932da3ed01,freeform,/m/03fp41,1,0.177790,0.710296,0.412302,0.578570,0,0,1,0,0
204617,ffff21932da3ed01,freeform,/m/05s2s,1,0.000000,0.031963,0.502994,0.562275,1,1,0,0,0
204618,ffff21932da3ed01,freeform,/m/0c9ph5,1,0.323775,0.409382,0.464495,0.554111,0,0,1,0,0
204619,ffff21932da3ed01,freeform,/m/0c9ph5,1,0.540223,0.624863,0.493633,0.577892,1,0,1,0,0


In [7]:
classes = pd.read_csv("class-descriptions-boxable.csv", names=["id", "ClassName"])
classes

Unnamed: 0,id,ClassName
0,/m/011k07,Tortoise
1,/m/011q46kg,Container
2,/m/012074,Magpie
3,/m/0120dh,Sea turtle
4,/m/01226z,Football
...,...,...
596,/m/0qmmr,Wheelchair
597,/m/0wdt60w,Rugby ball
598,/m/0xfy,Armadillo
599,/m/0xzly,Maracas


In [8]:
%%time
# Create classname column on annotations which converts label codes to string labels
val_annot["ClassName"] = val_annot["LabelName"].map(classes.set_index("id")["ClassName"])
val_annot["ClassName"]

CPU times: user 24.3 ms, sys: 3.51 ms, total: 27.8 ms
Wall time: 26.4 ms


0           Airplane
1               Food
2              Fruit
3              Fruit
4              Fruit
             ...    
204616    Houseplant
204617         Plant
204618        Flower
204619        Flower
204620      Building
Name: ClassName, Length: 204621, dtype: object

# Copy all images into one folder

In [28]:
# Take these and copy them into one folder
img_files[:3]

['validation/Bathtub/3f66fdf9688514d4.jpg',
 'validation/Bathtub/2f2039140c8b1f2b.jpg',
 'validation/Bathtub/cd727c5e0d2cfc1b.jpg']

In [29]:
# Make directory for the images (no error if it already exists)
os.makedirs("images", exist_ok=True)

In [30]:
# Copy image files into images/ (files with the same name overwrite each other)
import shutil
for image in img_files:
    shutil.copy2(image, "images")

In [31]:
len(os.listdir("images")), len(img_files)

(38, 39)

In [33]:
len([i.split("/")[2] for i in img_files])

39

In [38]:
# 39 paths but only 38 unique file names: one image sits in both class folders
len(np.unique([i.split("/")[2] for i in img_files]))

38

In [39]:
# List the IDs of the images we have so the annotations dataframe can be filtered down to them
my_images = [os.path.splitext(img_name)[0] for img_name in os.listdir("images")]
my_images

['b2dc8e2437e8803f',
 '36d8c654fba5f337',
 '217b95a71cb220f8',
 '3f66fdf9688514d4',
 '3a511ef0f68cc438',
 '42298c9659ce6603',
 '17f698ec871569ca',
 'c7312a8b436f5e90',
 'b3a65783709539de',
 '2f2039140c8b1f2b',
 'cd727c5e0d2cfc1b',
 '47ab0e73c33bdefd',
 '0d0719cfd8e417b7',
 '15348e4f2c7ebe0f',
 'd60e1c25e87d8b45',
 '789cc0283c0e18af',
 'f90eac5444f62b19',
 '2ff2bb609a057f7a',
 'be84a3da54658167',
 '784bba5f12ee7c35',
 'eccb080e57b2aac5',
 '2f0a18a409d5a769',
 'c1eefc708000b69f',
 '8e562fee6ff208f7',
 '7c2825a3d8e0ed29',
 '822f20d881eebeb9',
 'd7d469cb4c7e8cd2',
 '40a69f79da2aeaaa',
 '85ccc93d1f941931',
 '11dcac4ca5923a58',
 'eaa75967cbc70ac1',
 'ddd3f96ff7f0bc78',
 '3454c87a14067798',
 'e108d68162990dfb',
 'e6e05b56799fefba',
 '539e0871494dea5d',
 'b821cc12eede040e',
 '837b3d11ff02f116']

In [42]:
subset

['Toilet',
 'Swimming_pool',
 'Bed',
 'Biliard_table',
 'Sink',
 'Fountain',
 'Oven',
 'Ceiling_fan',
 'Television',
 'Microwave_oven',
 'Gas_stove',
 'Refrigerator',
 'Kitchen_&_dining_room_table',
 'Washing_machine',
 'Bathtub',
 'Stairs',
 'Fireplace',
 'Pillow',
 'Mirror',
 'Shower',
 'Couch',
 'Countertop',
 'Coffeemaker',
 'Dishwasher',
 'Sofa_bed',
 'Tree_house',
 'Towel',
 'Porch',
 'Wine_rack',
 'Jacuzzi']

In [43]:
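# Make sure we only get the images we're concerned about
# (class names in the annotations use spaces, whereas subset uses underscores)
subset_names = [name.replace("_", " ") for name in subset]
my_images_df = val_annot[val_annot["ImageID"].isin(my_images) & val_annot["ClassName"].isin(subset_names)]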
my_images_df.head()

Unnamed: 0,ImageID,Source,LabelName,Confidence,XMin,XMax,YMin,YMax,IsOccluded,IsTruncated,IsGroupOf,IsDepiction,IsInside,ClassName
10797,0d0719cfd8e417b7,freeform,/m/09g1w,1,0.093371,0.986232,0.189984,0.965806,0,0,0,0,0,Toilet
14593,11dcac4ca5923a58,freeform,/m/03dnzn,1,0.360645,0.999988,0.0,1.0,0,1,0,0,0,Bathtub
17325,15348e4f2c7ebe0f,freeform,/m/03dnzn,1,0.08733,0.999973,0.600616,0.999962,0,0,0,0,0,Bathtub
17329,15348e4f2c7ebe0f,freeform,/m/065h6l,1,0.090833,1.0,0.600928,1.0,0,0,0,0,0,Jacuzzi
19543,17f698ec871569ca,freeform,/m/09g1w,1,0.160562,0.475538,0.18189,0.847251,0,0,0,0,0,Toilet


In [45]:
my_images_df.columns

Index(['ImageID', 'Source', 'LabelName', 'Confidence', 'XMin', 'XMax', 'YMin',
       'YMax', 'IsOccluded', 'IsTruncated', 'IsGroupOf', 'IsDepiction',
       'IsInside', 'ClassName'],
      dtype='object')

In [47]:
pd.Categorical(my_images_df.ClassName).codes

array([3, 0, 0, 1, 3, 3, 3, 0, 3, 0, 3, 3, 3, 3, 2, 3, 0, 3, 3, 3, 3, 0,
       0, 3, 0, 1, 0, 3, 0, 0, 3, 3, 3, 3, 3, 2, 2, 2, 3, 3, 0, 2, 3, 0,
       1, 0, 3, 3, 3, 0, 1, 3], dtype=int8)
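
`pd.Categorical(...).codes` numbers only the classes present in the dataframe, in sorted order, which is why `Toilet` is 3 here with just four classes in the sample. Once more classes are downloaded, the codes would shift. One possible way to keep IDs stable, sketched here, is to derive them from the full list of target classes instead (`make_category_map` is an illustrative name, and the short list stands in for the notebook's `subset`):

```python
def make_category_map(class_names):
    """Assign each class a fixed integer ID based on its position in the sorted class list."""
    return {name: idx for idx, name in enumerate(sorted(class_names))}


# Stand-in for the full `subset` list defined earlier in the notebook
target_classes = ["Toilet", "Swimming pool", "Bed", "Sink", "Bathtub", "Jacuzzi"]
category_map = make_category_map(target_classes)

labels = ["Toilet", "Bathtub", "Bathtub", "Jacuzzi", "Toilet"]
category_ids = [category_map[label] for label in labels]
print(category_ids)
```

The same map can be reused for the train, validation and test splits, so a given class always gets the same ID.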

In [48]:
my_images_df[my_images_df["ImageID"] == "c7312a8b436f5e90"]["ClassName"].values

array(['Sink', 'Sink', 'Sink', 'Toilet', 'Toilet'], dtype=object)

In [49]:
my_images_df[my_images_df["ImageID"] == "c7312a8b436f5e90"][["XMin", "XMax", "YMin", "YMax"]].values

array([[0.238995, 0.372361, 0.778648, 0.871384],
       [0.461353, 0.615221, 0.322836, 0.441359],
       [0.782433, 0.921636, 0.850595, 0.936918],
       [0.483064, 0.58323 , 0.148464, 0.254009],
       [0.756245, 0.872021, 0.377751, 0.570644]])

In [50]:
len(my_images_df)

52

In [51]:
my_images_df["ImageID"].value_counts()

c7312a8b436f5e90    5
17f698ec871569ca    2
8e562fee6ff208f7    2
d60e1c25e87d8b45    2
2f0a18a409d5a769    2
3454c87a14067798    2
eccb080e57b2aac5    2
3a511ef0f68cc438    2
822f20d881eebeb9    2
15348e4f2c7ebe0f    2
d7d469cb4c7e8cd2    2
85ccc93d1f941931    1
b821cc12eede040e    1
11dcac4ca5923a58    1
217b95a71cb220f8    1
36d8c654fba5f337    1
47ab0e73c33bdefd    1
eaa75967cbc70ac1    1
2ff2bb609a057f7a    1
3f66fdf9688514d4    1
0d0719cfd8e417b7    1
ddd3f96ff7f0bc78    1
c1eefc708000b69f    1
f90eac5444f62b19    1
2f2039140c8b1f2b    1
be84a3da54658167    1
42298c9659ce6603    1
cd727c5e0d2cfc1b    1
b3a65783709539de    1
7c2825a3d8e0ed29    1
e108d68162990dfb    1
789cc0283c0e18af    1
e6e05b56799fefba    1
784bba5f12ee7c35    1
40a69f79da2aeaaa    1
837b3d11ff02f116    1
b2dc8e2437e8803f    1
539e0871494dea5d    1
Name: ImageID, dtype: int64

In [52]:
%%time
# First pass: collect the file path, labels and boxes for each image
img_dicts = []
for img in my_images:
    file_path = "images/" + img + ".jpg"
    img_data = my_images_df[my_images_df["ImageID"] == img]
    img_label = img_data["ClassName"].values
    bboxes = img_data[["XMin", "XMax", "YMin", "YMax"]].values
    img_dict = {"file_path": file_path,
                "img_label": img_label,
                "bboxes": bboxes}
    img_dicts.append(img_dict)
img_dicts

CPU times: user 63.7 ms, sys: 7.1 ms, total: 70.8 ms
Wall time: 66.6 ms


[{'file_path': 'images/b2dc8e2437e8803f.jpg',
  'img_label': array(['Toilet'], dtype=object),
  'bboxes': array([[0.337954, 0.695044, 0.258797, 0.857483]])},
 {'file_path': 'images/36d8c654fba5f337.jpg',
  'img_label': array(['Toilet'], dtype=object),
  'bboxes': array([[0.063034, 0.805088, 0.      , 0.697051]])},
 {'file_path': 'images/217b95a71cb220f8.jpg',
  'img_label': array(['Toilet'], dtype=object),
  'bboxes': array([[0.438669, 0.602654, 0.449081, 0.532643]])},
 {'file_path': 'images/3f66fdf9688514d4.jpg',
  'img_label': array(['Bathtub'], dtype=object),
  'bboxes': array([[0.050368, 0.933495, 0.250992, 1.      ]])},
 {'file_path': 'images/3a511ef0f68cc438.jpg',
  'img_label': array(['Sink', 'Toilet'], dtype=object),
  'bboxes': array([[0.22748 , 0.493562, 0.625067, 0.78379 ],
         [0.124107, 0.338616, 0.755953, 0.978802]])},
 {'file_path': 'images/42298c9659ce6603.jpg',
  'img_label': array(['Toilet'], dtype=object),
  'bboxes': array([[0.301737, 0.792065, 0.190453, 0.8385

In [53]:
%%time
multiple = my_images_df[my_images_df["ImageID"] == "d60e1c25e87d8b45"].reset_index()
for i in range(len(multiple)):
    print(multiple.loc[i])

index                    172106
ImageID        d60e1c25e87d8b45
Source                 freeform
LabelName             /m/0130jx
Confidence                    1
XMin                          0
XMax                   0.823265
YMin                   0.315739
YMax                   0.975746
IsOccluded                    0
IsTruncated                   0
IsGroupOf                     0
IsDepiction                   0
IsInside                      0
ClassName                  Sink
Name: 0, dtype: object
index                    172112
ImageID        d60e1c25e87d8b45
Source                 freeform
LabelName              /m/09g1w
Confidence                    1
XMin                   0.405653
XMax                          1
YMin                   0.146613
YMax                   0.626838
IsOccluded                    1
IsTruncated                   0
IsGroupOf                     0
IsDepiction                   0
IsInside                      0
ClassName                Toilet
Name: 1, dtype: o

In [54]:
%%time
# Work on an explicit copy so the new column isn't assigned to a slice of val_annot
my_images_df = my_images_df.copy()
category_ids = pd.Categorical(my_images_df["ClassName"]).codes
my_images_df["ClassID"] = category_ids
my_images_df.head()

CPU times: user 2.4 ms, sys: 4.48 ms, total: 6.88 ms
Wall time: 6.35 ms


Unnamed: 0,ImageID,Source,LabelName,Confidence,XMin,XMax,YMin,YMax,IsOccluded,IsTruncated,IsGroupOf,IsDepiction,IsInside,ClassName,ClassID
10797,0d0719cfd8e417b7,freeform,/m/09g1w,1,0.093371,0.986232,0.189984,0.965806,0,0,0,0,0,Toilet,3
14593,11dcac4ca5923a58,freeform,/m/03dnzn,1,0.360645,0.999988,0.0,1.0,0,1,0,0,0,Bathtub,0
17325,15348e4f2c7ebe0f,freeform,/m/03dnzn,1,0.08733,0.999973,0.600616,0.999962,0,0,0,0,0,Bathtub,0
17329,15348e4f2c7ebe0f,freeform,/m/065h6l,1,0.090833,1.0,0.600928,1.0,0,0,0,0,0,Jacuzzi,1
19543,17f698ec871569ca,freeform,/m/09g1w,1,0.160562,0.475538,0.18189,0.847251,0,0,0,0,0,Toilet,3


In [55]:
img_dicts[0]

{'file_path': 'images/b2dc8e2437e8803f.jpg',
 'img_label': array(['Toilet'], dtype=object),
 'bboxes': array([[0.337954, 0.695044, 0.258797, 0.857483]])}

In [56]:
%%time
# Create images labels setup in detectron2 style
img_dicts = []
for img in my_images:
    record = {}
    # Add image metadata
    filename = "images/" + img + ".jpg"
    img_data = my_images_df[my_images_df["ImageID"] == img].reset_index()
    
    record["file_name"] = filename
    record["image_id"] = img
    # TODO - use cv2
    record["height"] = 0
    record["width"] = 0
    
    # Create annotations list (contains labels of images)
    annotations = []
    for i in range(len(img_data)):
        img_label = img_data.loc[i]["ClassName"]
        category_id = img_data.loc[i]["ClassID"]
        #print(f"label: {img_label}")
        bbox = [img_data.loc[i][["XMin", "XMax", "YMin", "YMax"]].values]
        obj = {
            "bbox": bbox,
            "bbox_mode": 0, # TODO - 0 is BoxMode.XYXY_ABS, but these boxes are relative and ordered XMin, XMax, YMin, YMax
            #"img_label": img_label, # not needed
            "category_id": category_id, 
            #"segmentation": ["poly"], # not needed for bounding boxes
            #"iscrowd": 0 # left out for now, not sure whether detectron2 requires it
        }
        annotations.append(obj)
        #print(annotations)
    record["annotations"] = annotations
    img_dicts.append(record)
img_dicts

CPU times: user 170 ms, sys: 0 ns, total: 170 ms
Wall time: 166 ms


[{'file_name': 'images/b2dc8e2437e8803f.jpg',
  'image_id': 'b2dc8e2437e8803f',
  'height': 0,
  'width': 0,
  'annotations': [{'bbox': [array([0.337954, 0.695044, 0.258797, 0.857483], dtype=object)],
    'bbox_mode': 0,
    'category_id': 3}]},
 {'file_name': 'images/36d8c654fba5f337.jpg',
  'image_id': '36d8c654fba5f337',
  'height': 0,
  'width': 0,
  'annotations': [{'bbox': [array([0.063034, 0.805088, 0.0, 0.697051], dtype=object)],
    'bbox_mode': 0,
    'category_id': 3}]},
 {'file_name': 'images/217b95a71cb220f8.jpg',
  'image_id': '217b95a71cb220f8',
  'height': 0,
  'width': 0,
  'annotations': [{'bbox': [array([0.438669, 0.602654, 0.44908100000000006, 0.532643], dtype=object)],
    'bbox_mode': 0,
    'category_id': 3}]},
 {'file_name': 'images/3f66fdf9688514d4.jpg',
  'image_id': '3f66fdf9688514d4',
  'height': 0,
  'width': 0,
  'annotations': [{'bbox': [array([0.050368, 0.9334950000000001, 0.250992, 1.0], dtype=object)],
    'bbox_mode': 0,
    'category_id': 0}]},
 {'fi