This repository has been archived by the owner on Jul 22, 2024. It is now read-only.

Generating label.tsv and feature.tsv from image #33

Open
sameerpande12 opened this issue Sep 30, 2020 · 6 comments

sameerpande12 commented Sep 30, 2020

Hi guys, I am trying to generate my own features.tsv and labels.tsv for my dataset, but I am stuck at the following:

  1. I am confused about what exactly these features are. From the "Oscar" paper, I understand that each bounding box has a feature vector of the form (v', z), where v' is P-dimensional (P = 2048) and z is 6-dimensional (position).
    I have difficulty understanding where these 2048 features come from. Initially, I thought they came from the FC layer of Faster R-CNN, but on checking, the FC layer size in Faster R-CNN is 4096.

  2. The Oscar paper says, "Specifically, v and q are generated as follows. Given an image with K regions
    of objects (normally over-sampled and noisy), Faster R-CNN [28] is used to extract the visual semantics of each region."
    I am confused about how these K regions are determined. Are they the bounding boxes output by Faster R-CNN?

I am relatively new to this area. Any help would be appreciated.


shravan1394 commented Oct 2, 2020

The information is scattered across several issues, so I will summarize it here for anyone looking in the future.

The features are extracted using the bottom-up attention model from https://github.com/peteanderson80/bottom-up-attention.
You need to slightly modify tools/generate_tsv.py to produce label.tsv and feature.tsv. The following code must be added to that file to create the exact format of feature.tsv:
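# Needs base64, json and numpy (as np) imported at the top of generate_tsv.py.
# boxes: (num_boxes, 4) array of [x1, y1, x2, y2] in pixels; features: (num_boxes, 2048).
box_width = boxes[:, 2] - boxes[:, 0]
box_height = boxes[:, 3] - boxes[:, 1]
# Scale positions and sizes by the image dimensions.
scaled_width = box_width / image_width
scaled_height = box_height / image_height
scaled_x = boxes[:, 0] / image_width
scaled_y = boxes[:, 1] / image_height
# Add a trailing axis so each value becomes a (num_boxes, 1) column.
scaled_width = scaled_width[..., np.newaxis]
scaled_height = scaled_height[..., np.newaxis]
scaled_x = scaled_x[..., np.newaxis]
scaled_y = scaled_y[..., np.newaxis]
# Six position values per box: x1, y1, x2, y2, width, height (all scaled).
spatial_features = np.concatenate((scaled_x, scaled_y, scaled_x + scaled_width, scaled_y + scaled_height, scaled_width, scaled_height), axis=1)
# Append the position values to the visual features, then base64-encode the raw array.
full_features = np.concatenate((features, spatial_features), axis=1)
fea_base64 = base64.b64encode(full_features).decode('utf-8')
fea_info = {'num_boxes': boxes.shape[0], 'feature': fea_base64}
row = [image_key, json.dumps(fea_info)]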

I am attaching the file that I used for this purpose and to generate label.tsv as well. You might have to change the code depending on your data location and format.
tsv_gen.py.zip
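
To check that a generated row has the expected layout, the encoding can be reversed. The sketch below is only an illustration: the random features, the three boxes, the image size, and the helper name decode_features are made up, and the float32 dtype is an assumption that should be checked against the reader in your copy of Oscar.

```python
import base64
import json

import numpy as np

# Hypothetical stand-ins for the extractor outputs (not taken from the thread).
num_boxes, feat_dim = 3, 2048
features = np.random.rand(num_boxes, feat_dim).astype(np.float32)
boxes = np.array([[10, 20, 110, 220],
                  [0, 0, 640, 480],
                  [300, 100, 500, 400]], dtype=np.float32)
image_width, image_height = 640.0, 480.0

# Same six position values as the snippet above: x1, y1, x2, y2, w, h (scaled).
x1 = boxes[:, 0] / image_width
y1 = boxes[:, 1] / image_height
x2 = boxes[:, 2] / image_width
y2 = boxes[:, 3] / image_height
spatial = np.stack([x1, y1, x2, y2, x2 - x1, y2 - y1], axis=1)

# Assumption: the reader expects float32 values.
full_features = np.concatenate([features, spatial], axis=1).astype(np.float32)

fea_info = {
    "num_boxes": int(num_boxes),
    "feature": base64.b64encode(full_features.tobytes()).decode("utf-8"),
}
encoded = json.dumps(fea_info)


def decode_features(encoded_json, dtype=np.float32):
    """Reverse the encoding: JSON -> base64 -> flat buffer -> (num_boxes, dim)."""
    info = json.loads(encoded_json)
    raw = base64.b64decode(info["feature"])
    return np.frombuffer(raw, dtype=dtype).reshape(info["num_boxes"], -1)


decoded = decode_features(encoded)
print(decoded.shape)
```

If the dtype used for decoding differs from the one used for encoding, the reshape gives a different number of columns, which makes this a quick sanity check before training.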

I still had an issue with csv.DictWriter writing strings with single quotes, while json.loads in run_captioning.py requires double quotes. I modified run_captioning.py to make it work. If you have a better solution, let me know.
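
One way to avoid the quoting problem at the source is to make sure the JSON column is produced by json.dumps and not by the dict's string representation. The sketch below is not the modification made to run_captioning.py (which the thread does not show). It swaps in plain tab-joined lines in place of csv.DictWriter, and format_tsv_row and parse_tsv_row are hypothetical helper names.

```python
import json


def format_tsv_row(image_key, payload):
    """Build one TSV line whose second column is valid, double-quoted JSON."""
    # json.dumps escapes tabs and newlines, so the column cannot break the row.
    return "{}\t{}\n".format(image_key, json.dumps(payload))


def parse_tsv_row(line):
    """Split a line written by format_tsv_row back into key and payload."""
    image_key, payload_json = line.rstrip("\n").split("\t", 1)
    return image_key, json.loads(payload_json)


payload = {"num_boxes": 2, "feature": "AAAA"}  # hypothetical payload
line = format_tsv_row("img_0001", payload)
key, restored = parse_tsv_row(line)

# For contrast: the dict's repr uses single quotes and json.loads rejects it.
try:
    json.loads(str(payload))
    repr_is_valid_json = True
except json.JSONDecodeError:
    repr_is_valid_json = False

print(key, restored, repr_is_valid_json)
```

Writing the rows this way means the reader needs no quote replacement at all.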

Finally to generate label.lineidx and feature.lineidx, make use of the following function
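
The function referred to above did not survive in this copy of the thread. As a rough sketch only, and assuming a .lineidx file simply lists the byte offset at which each row of the matching .tsv file starts (one offset per line), such an index could be built as follows. The names generate_lineidx and read_row are hypothetical.

```python
import os
import tempfile


def generate_lineidx(tsv_path, lineidx_path):
    """Write the starting byte offset of every TSV row, one offset per line."""
    tmp_path = lineidx_path + ".tmp"
    with open(tsv_path, "rb") as tsv_in, open(tmp_path, "w") as idx_out:
        offset = 0
        for line in tsv_in:
            idx_out.write("{}\n".format(offset))
            offset += len(line)  # length in bytes, since the file is binary
    # Rename only after the index is complete.
    os.replace(tmp_path, lineidx_path)


def read_row(tsv_path, lineidx_path, row):
    """Jump straight to a row using the index and return its columns."""
    with open(lineidx_path) as idx_in:
        offsets = [int(value) for value in idx_in]
    with open(tsv_path, "rb") as tsv_in:
        tsv_in.seek(offsets[row])
        return tsv_in.readline().decode("utf-8").rstrip("\n").split("\t")


# Small demonstration on a throwaway file.
with tempfile.TemporaryDirectory() as tmp_dir:
    tsv_path = os.path.join(tmp_dir, "custom.feature.tsv")
    idx_path = os.path.join(tmp_dir, "custom.feature.lineidx")
    with open(tsv_path, "w", newline="\n") as tsv_out:
        tsv_out.write("img_a\t{}\nimg_b\t{}\n")
    generate_lineidx(tsv_path, idx_path)
    with open(idx_path) as idx_in:
        offsets = idx_in.read().split()
    second_row = read_row(tsv_path, idx_path, 1)

print(offsets, second_row)
```

Offsets are counted in bytes, not characters, so the TSV is opened in binary mode on both the writing and the reading side of the index.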

@sameerpande12

Thanks!


EByrdS commented Nov 11, 2020

@shravan1394, what is the command line you used to generate the caption after having the right features?

Also, could you share the modifications to run_captioning.py to fix the problem with json loads?

The generated label.lineidx and feature.lineidx need to be in the same folder as custom.feature.tsv and custom.label.tsv, right?

@zamanmub

After using @shravan1394's script above to generate the feature and label TSV files, and after resolving the single-quote issue, I received the following error:

JSONDecodeError: Expecting value: line 1 column 14 (char 13)

I solved it by removing .decode('utf-8') from base64.b64encode(full_features).decode('utf-8') in the bottom-up-attention-based extractor script.
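
The thread does not say why removing .decode('utf-8') helped. One possible explanation, offered here only as an unconfirmed guess, is that under Python 2 the decoded value is a unicode string whose repr carries a u prefix, and that prefix ends up in the written row. The sketch below only shows that such a prefix makes json.loads fail with the same "Expecting value" message. The payload strings and the helper try_parse are hypothetical.

```python
import json

# Hypothetical row text: valid JSON except for a Python 2 style u"" prefix.
bad_payload = '{"feature": u"AAAA"}'
good_payload = '{"feature": "AAAA"}'


def try_parse(text):
    """Return (value, None) on success or (None, error message) on failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as err:
        return None, err.msg


bad_result, bad_error = try_parse(bad_payload)
good_result, good_error = try_parse(good_payload)

print(bad_error)
print(good_result)
```

If a row in your own file fails in this way, printing the characters around the reported position usually shows what the parser tripped on.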

@zamanmub

@EByrdS you can convert the single quotes to double quotes following #49 (comment) or #49 (comment)

@Cuberick-Orion

Thanks for the summary of information here!

To anyone who wants to extract features on custom datasets, has stumbled on this thread, and is struggling with the Caffe environment: I'd recommend using the Docker environment built for LXMERT.

Follow the instructions to set up the environment, then rewrite the import part of the script following this (at the top of the file).
