Is BertEncoder fine-tuned on the visual grounding tasks? #4

Closed

PlumedSerpent opened this issue Aug 15, 2021 · 2 comments

@PlumedSerpent

It seems that BertEncoder is fine-tuned and updated during training, which makes the comparison unfair against "Improving One-Stage Visual Grounding by Recursive Sub-query Construction" and "A Fast and Accurate One-Stage Approach to Visual Grounding".
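(A quick way to verify this is to count how many of the language encoder's parameters require gradients. A minimal PyTorch sketch; the helper and the `model.bert` attribute name below are hypothetical, not taken from this repo:)

```python
import torch.nn as nn

def count_trainable(encoder: nn.Module):
    """Return (trainable, total) parameter counts for a module."""
    trainable = sum(p.numel() for p in encoder.parameters() if p.requires_grad)
    total = sum(p.numel() for p in encoder.parameters())
    return trainable, total

# Usage with a hypothetical attribute name:
# trainable, total = count_trainable(model.bert)
```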

@hbb1
Collaborator

hbb1 commented Aug 26, 2021

@PlumedSerpent
Thanks for your attention. We fine-tuned BERT to compare with SQC-Large (see their codebase) on RefCOCOg.
In fact, we achieve comparable or even better results with an LSTM than SQC-Large (which uses BertEncoder) on RefCOCO, ReferIt, and RefCOCO+.
A completely fair comparison with their algorithm (SQC) is not practical; for example, they use more parameters, larger-resolution inputs, etc. Furthermore, our contribution does not lie on the language side, so our paper is orthogonal to SQC.
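(For readers comparing the two setups, here is a minimal sketch of a frozen vs. a fine-tuned BERT encoder, assuming PyTorch and the HuggingFace `transformers` library; this is an illustration, not this repo's actual training code:)

```python
import torch
from transformers import BertModel

bert = BertModel.from_pretrained("bert-base-uncased")

# Setup A: frozen encoder, as in the compared one-stage methods.
# BERT receives no gradient updates during grounding training.
for p in bert.parameters():
    p.requires_grad = False

# Setup B: fine-tuned encoder, as described above for RefCOCOg.
# BERT stays trainable, typically with a small learning rate.
for p in bert.parameters():
    p.requires_grad = True
optimizer = torch.optim.AdamW(bert.parameters(), lr=1e-5)
```

(Freezing removes bert-base's roughly 110M parameters from the optimizer, which is the gap the fairness question above is pointing at.)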

@hbb1
Collaborator

hbb1 commented Sep 12, 2021

I am going to close this issue. Please open a new one if you have any further comments.

hbb1 closed this as completed Sep 12, 2021