Unable to replicate results after retraining #5
Comments
Dear @aemrey, thanks for your interest in our code. From what you describe, it seems that the RL training stage has not even started yet, so it is fairly normal that you are getting lower results. Please keep it running until the training stops by itself. Lorenzo.
Hi @aemrey, I'm following your steps and right now I'm extracting the features of my set of images. I'm using this model loaded with these weights. Now, how do you pack those features into an Thanks! P.S.: Don't know if I should maybe open a new issue EDIT: Well, I understand now that I should take the blob EDIT2: Moved to #7
Hi @ruotianluo, we noticed that the results and the training behavior can differ slightly when the underlying architecture changes. For this reason, we also provided the weights of our final model. In our experiments, we used an NVIDIA 2080 Ti GPU. The other settings are the ones we reported in our repository.
Hello and thank you for this fantastic repo!
I am trying to retrain your model using COCO features I have extracted myself using the bottom-up attention repo as you have suggested in #2. I am currently on epoch 15 and the highest CIDEr score on the test set has been 1.13. This is much less than the 1.31 that I get when using your pretrained model. Other than the new features, I am using your default values for all hyperparameters.
Could you give me some guidance in order to better replicate your results?