Generating label.tsv and feature.tsv from image #33
The information is somewhat dispersed across the issues, so I will summarize it here for anyone looking in the future. The features are extracted using the bottom-up attention model from https://github.com/peteanderson80/bottom-up-attention. I am attaching the file that I used for this purpose; it generates label.tsv as well. You may have to change the code depending on your data location and format. I still had some issues because csv.DictWriter emits strings with single quotes while json.loads in run_captioning.py requires double quotes, so I modified run_captioning.py to make it work. If you have a better solution, let me know. Finally, to generate label.lineidx and feature.lineidx, make use of the following function
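The attached function is not reproduced here, but a minimal sketch of a lineidx generator, assuming each line of the .lineidx file stores the byte offset of the corresponding line in the .tsv (so the .tsv can be read with random access via seek()), could look like:

```python
def generate_lineidx(tsv_path, lineidx_path):
    """Write the byte offset of every line in tsv_path, one offset per line."""
    with open(tsv_path, "rb") as fin, open(lineidx_path, "w") as fout:
        offset = 0
        for line in fin:
            fout.write(str(offset) + "\n")
            offset += len(line)
```

You would call it once per file, e.g. `generate_lineidx("label.tsv", "label.lineidx")` and the same for feature.tsv.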
Thanks!
@shravan1394, what is the command line you used to generate the captions after having the right features? Also, could you share the modifications to run_captioning.py?
After using this script to generate the feature and label tsv files, and after resolving the issue with single quotes, I received the following error.
I solved it by removing
@EByrdS, you can convert the single quotes to double quotes following #49 (comment) or #49 (comment)
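One illustrative workaround for the quoting mismatch (a sketch, not necessarily the exact fix from the linked comments) is to parse the single-quoted string as a Python literal with `ast.literal_eval` and re-serialize it as proper JSON:

```python
import ast
import json

def to_valid_json(s):
    """Re-serialize a Python-repr string (single quotes) as valid JSON."""
    return json.dumps(ast.literal_eval(s))

row = "{'class': 'dog', 'conf': 0.98}"  # the single-quoted form csv.DictWriter tends to emit
fixed = to_valid_json(row)              # '{"class": "dog", "conf": 0.98}'
json.loads(fixed)                       # now parses without error
```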
Thanks for the summary of information here! To anyone wishing to extract features on a custom dataset who stumbled on this thread and is potentially struggling with the Caffe environment, I'd recommend using the Docker environment built for LXMERT. Follow its instructions to set up the environment, then rewrite the
Hi guys, I am trying to generate my own features.tsv and labels.tsv for my dataset, but I am stuck on the following:
I am slightly confused about what exactly these features are. From reading the Oscar paper, I understand that each bounding box has a feature of the form (v', z), where v' is P-dimensional (2048) and z is 6-dimensional (position).
I have difficulty understanding where these 2048 features come from. Initially, I thought they came from the FC layer of Faster R-CNN, but upon checking, the FC layer size in Faster R-CNN is 4096.
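For concreteness, here is a sketch of how the (v', z) region feature described above could be assembled. The exact 6-d position encoding (normalized corners plus relative width/height) is an assumption based on common practice, not quoted from the paper:

```python
import numpy as np

def region_feature(v, box, img_w, img_h):
    """Concatenate a 2048-d visual feature v with a 6-d position encoding
    z = (x1/W, y1/H, x2/W, y2/H, w/W, h/H), giving a 2054-d vector."""
    x1, y1, x2, y2 = box
    z = np.array([x1 / img_w, y1 / img_h, x2 / img_w, y2 / img_h,
                  (x2 - x1) / img_w, (y2 - y1) / img_h])
    return np.concatenate([v, z])

feat = region_feature(np.random.rand(2048), (10, 20, 110, 220), 640, 480)
assert feat.shape == (2054,)
```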
The Oscar paper says: "Specifically, v and q are generated as follows. Given an image with K regions of objects (normally over-sampled and noisy), Faster R-CNN [28] is used to extract the visual semantics of each region." I am slightly confused about how these K regions are determined. Are they the bounding boxes output by Faster R-CNN?
I am relatively new to this area. Any help would be appreciated.