Retrieval results much lower than expected on COCO 5K test set and Flickr30K 1K test set #78

juliuswang0728 · 2020-12-11T13:27:33Z

Hi!

Has anyone tried running the evaluation with the provided multi_task_model.bin model?

I obtained

COCO (5K test set), R@{1 | 5 | 10}, IR: image retrieval, TR: text retrieval

IR: 32.979 | 61.911 | 74.082
TR: 14.62 | 32.18 | 39.76

Flickr30K

IR: 52.84 | 79.54 | 87.18
TR: 69.3 | 89 | 94

These are well below the numbers reported in the 12-in-1 paper.
I understand that there's room for improvement on TR, since there's no hard negative mining for texts, but the IR results also look unsatisfactory. I'm wondering if something is missing here.
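
For reference, here is a minimal sketch of how R@K is commonly computed from a similarity matrix, in case the gap comes from a difference in the metric definition. It assumes the usual convention that a query counts as a hit if any of its ground-truth items appears in the top K; I have not confirmed that the repository's evaluation script does exactly this. The function name `recall_at_k` and the toy scores are hypothetical.

```python
import numpy as np

def recall_at_k(sim, gt, ks=(1, 5, 10)):
    """Recall@K in percent.

    sim: (num_queries, num_candidates) array of similarity scores.
    gt:  list of sets; gt[i] holds the correct candidate indices for query i.
    A query is a hit at K if any correct candidate is among its top K.
    """
    order = np.argsort(-sim, axis=1)  # candidates sorted best-first per query
    results = {}
    for k in ks:
        hits = [
            len(gt[i].intersection(order[i, :k].tolist())) > 0
            for i in range(sim.shape[0])
        ]
        results[k] = 100.0 * float(np.mean(hits))
    return results

# Toy data (made up): 2 images x 4 captions, captions 0-1 describe image 0,
# captions 2-3 describe image 1.
scores = np.array([[0.9, 0.1, 0.8, 0.3],
                   [0.2, 0.7, 0.4, 0.6]])

# TR: each image is a query over captions.
tr = recall_at_k(scores, [{0, 1}, {2, 3}], ks=(1, 2))
# IR: each caption is a query over images, so transpose the matrix.
ir = recall_at_k(scores.T, [{0}, {0}, {1}, {1}], ks=(1, 2))
print("TR:", tr)
print("IR:", ir)
```

On the real test sets, TR would use one row per image (1K for Flickr30K, 5K for COCO) and IR one row per caption, so a transposed matrix or a wrong caption-to-image mapping would change both numbers.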

Thanks!

shivangibithel · 2021-08-12T22:16:31Z

Hi,
I recently tried IR on the Flickr30K 1K test set and got the following results.

Can you tell me whether you ever got similar results?
