Do you release the feature representations (both image and text)?
And how can we tell that [person2] in the question text corresponds to the man wearing a hat (Fig. 2 in your paper)?
Sorry for the late reply.
A1: To leave more room for trying different feature-extraction methods, we do not provide pre-extracted feature representations. Besides, for an end-to-end model like the baseline we provide, features do not need to be extracted explicitly.
A2: We provide the alignment information in *.json files that share the same names as the corresponding images. The fields in the JSON files are as follows:
{
'boxes':[], # list of [x1,y1,x2,y2,area percentage] for each box
'segms':[], # list of edge lines for each box
'names':[], # label name of each box
'width': int, # width of the whole image
'height': int # height of the whole image
}
The entries in names are kept in a fixed order, so objects sharing a label are counted by their position.
e.g. for a names list like ['person', 'person', 'person', 'person', 'bottle', 'spoon', 'chair', 'chair', 'chair', 'chair', 'chair'], [person1] refers to the first object in the subset ['person', 'person', 'person', 'person'], and analogously [chair2] refers to the second object in the subset ['chair', 'chair', 'chair', 'chair', 'chair'].
An index into names also indexes boxes, whose entries are regions in the image; a minimal sketch of this lookup follows.
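In case it helps, here is a minimal Python sketch of that lookup, assuming the JSON layout above, the 1-based tag numbering described, and a tag written like [person2]; the file path, helper name, and tag-parsing regex are only illustrative:

```python
import json
import re


def resolve_tag(annotation_path, tag):
    """Map a referring tag such as '[person2]' to its bounding box."""
    with open(annotation_path) as f:
        ann = json.load(f)

    # Split the tag into its label and 1-based index, e.g. 'person' and 2.
    label, k = re.match(r"\[?([a-z ]+?)(\d+)\]?$", tag.strip()).groups()
    k = int(k)

    # Positions of this label in `names`; the k-th occurrence is the target.
    positions = [i for i, name in enumerate(ann["names"]) if name == label]
    box_index = positions[k - 1]

    # The same index selects the region in `boxes`: [x1, y1, x2, y2, area%].
    x1, y1, x2, y2, area_pct = ann["boxes"][box_index]
    return box_index, (x1, y1, x2, y2)


# Hypothetical usage: look up the second 'person' in names, i.e. [person2].
# box_idx, box = resolve_tag("some_image.json", "[person2]")
```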