
Unable to replicate results after retraining #5

Closed
aemrey opened this issue Feb 12, 2020 · 5 comments

aemrey commented Feb 12, 2020

Hello and thank you for this fantastic repo!

I am trying to retrain your model on COCO features that I extracted myself with the bottom-up attention repo, as you suggested in #2. I am currently on epoch 15 and the highest CIDEr score on the test set has been 1.13, which is much lower than the 1.31 I get with your pretrained model. Other than the new features, I am using your default values for all hyperparameters.

Could you give me some guidance in order to better replicate your results?

@baraldilorenzo (Member)

Dear @aemrey,

thanks for your interest in our code. From what you mention, it seems that the RL training stage has not even started yet, so it is fairly normal that you are getting lower results. Please keep the training running until it stops by itself.

Lorenzo.
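For readers unfamiliar with the schedule Lorenzo refers to, here is a minimal sketch of the usual two-stage recipe: cross-entropy pre-training first, then self-critical RL fine-tuning once validation CIDEr stops improving. The helper functions, the patience value of 5, and the dummy CIDEr numbers below are placeholders for illustration, not this repository's actual code.

```python
import random

# Placeholder stubs: a real run would train the captioning model and
# evaluate CIDEr on the validation split here.
def train_one_epoch(model, use_rl):
    pass  # XE loss when use_rl is False, SCST with a CIDEr reward otherwise

def validate_cider(model):
    return random.random()  # dummy validation score

model = object()  # stand-in for the captioning model
use_rl, best_cider, patience = False, 0.0, 0

for epoch in range(100):
    train_one_epoch(model, use_rl)
    cider = validate_cider(model)
    if cider > best_cider:
        best_cider, patience = cider, 0
    else:
        patience += 1
    if patience == 5:                       # validation CIDEr stopped improving
        if not use_rl:
            use_rl, patience, best_cider = True, 0, 0.0  # switch XE -> RL
        else:
            break                           # training "stops by itself"

print('finished after', epoch + 1, 'epochs')
```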

alesolano commented Mar 10, 2020

Hi @aemrey, I'm following your steps and right now I'm extracting the features of my set of images. I'm using this model loaded with these weights.

Now, how do you pack those features into an Nx2048 tensor? I understood "features" to mean the output of the cls_prob blob here, but that returns an Nx1601 tensor. I'm sure I'm missing something, maybe reading the wrong blob or using the wrong model.

Thanks!

P.S.: I don't know if I should open a new issue for this.

EDIT: Well, I now understand that I should take the res5c blob, maybe? Its output is Nx2048x14x14, though, so I don't really know what to do with the 14x14 spatial grid. And I still guess I need cls_prob to sort the array.
I'll keep updating, but if you see something odd in what I'm doing and have a quick hint, please let me know.

EDIT2: Moved to #7
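Although the discussion continued in #7, here is a minimal numpy sketch of one plausible way to handle the shapes alesolano mentions, assuming the 14x14 spatial grid of res5c should be average-pooled into a single 2048-d vector per box and that cls_prob is only used to rank the boxes. The blob choice, the background-class convention, and the random data are assumptions, not the official extraction procedure.

```python
import numpy as np

# Fake stand-ins for the Caffe blobs: N candidate boxes with res5c
# activations (N x 2048 x 14 x 14) and class probabilities (N x 1601).
n_boxes = 36
res5c = np.random.rand(n_boxes, 2048, 14, 14).astype(np.float32)
cls_prob = np.random.rand(n_boxes, 1601).astype(np.float32)

# Average-pool the 14x14 spatial grid: one 2048-d feature per box.
pooled = res5c.mean(axis=(2, 3))              # shape (N, 2048)

# Rank boxes by their highest non-background class confidence
# (column 0 is assumed to be the background class).
scores = cls_prob[:, 1:].max(axis=1)
order = np.argsort(scores)[::-1]
features = pooled[order]                      # (N, 2048), most confident first

print(features.shape)                         # (36, 2048)
```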

@ruotianluo

I ran your code a couple of times.

[screenshot of test CIDEr results]

The test CIDEr is always below 1.30. Any clues?

marcellacornia (Member) commented Mar 19, 2020

Hi @ruotianluo,
thanks for your interest in our work.

We noticed that the results and the training behavior can differ slightly depending on the underlying architecture. For this reason, we also provided the weights of our final model.

In our experiments, we used an NVIDIA 2080 Ti GPU. The other settings are the ones we reported in our repository.

@wanboyang

> I ran your code a couple of times.
>
> [screenshot of test CIDEr results]
>
> The test CIDEr is always below 1.30. Any clues?

I think it is caused by the difference between the bottom-up features provided in [1] and those provided in this project. In the test results shown in the image, the curve named m2_transformer_wan used the features provided in [1], while the curve named m2_transformer_wan n used the features provided in this project.
I compared the features from [1] with those from this project and found that the number of boxes differs for some image_ids.
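A quick way to check this yourself is to compare the number of boxes per image in the two detection files. The sketch below assumes the '<image_id>_features' key layout of the hdf5 file distributed with this project and uses placeholder file names; adjust both if your files are laid out differently.

```python
import h5py

# Placeholder paths: features extracted with [1] vs. the ones from this project.
own_path = 'my_detections.hdf5'
repo_path = 'coco_detections.hdf5'

mismatched = []
with h5py.File(own_path, 'r') as own, h5py.File(repo_path, 'r') as repo:
    for key in own.keys():
        # Assumed key layout: '<image_id>_features' holding an (N, 2048) array.
        if not key.endswith('_features') or key not in repo:
            continue
        n_own, n_repo = own[key].shape[0], repo[key].shape[0]
        if n_own != n_repo:
            mismatched.append((key, n_own, n_repo))

print(f'{len(mismatched)} images have a different number of boxes')
for key, n_own, n_repo in mismatched[:10]:
    print(key, n_own, n_repo)
```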
