
loss decreasing is very slow #20

Closed
eustcPL opened this issue Oct 31, 2017 · 8 comments

eustcPL commented Oct 31, 2017

I tried to use a single LSTM and a classifier to train a question-only model, but the loss decreases very slowly and the validation acc1 stays under 30 even after 40 epochs.
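
For reference, a question-only baseline of this shape can be sketched roughly as below in PyTorch. This is a minimal sketch under assumed dimensions; `QuestionOnlyModel` and all sizes are illustrative, not the repo's actual code:

```python
import torch
import torch.nn as nn

class QuestionOnlyModel(nn.Module):
    """Question-only VQA baseline: embedding matrix + LSTM + linear classifier."""
    def __init__(self, vocab_size, emb_dim=300, hidden_dim=512, num_answers=2000):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_answers)

    def forward(self, questions):
        # questions: (batch, seq_len) token ids
        embedded = self.embedding(questions)
        _, (h_n, _) = self.lstm(embedded)
        # classify from the final hidden state of the last LSTM layer
        return self.classifier(h_n[-1])
```

Training this with cross-entropy over the candidate answers is the usual setup for such a baseline.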


eustcPL commented Oct 31, 2017

Can you help me? @Cadene


Cadene commented Nov 1, 2017

Is it the open-ended accuracy?

With the VQA 1.0 dataset the question model achieves 40% open ended accuracy.

With VQA 2.0 dataset, it achieves 44%.

It could be a problem of overfitting, underfitting, preprocessing, or a bug.

Did you try changing the number of parameters in your LSTM and plotting the accuracy curves?


Cadene commented Nov 1, 2017

Accuracy != Open Ended Accuracy (which is calculated using the eval code)


eustcPL commented Nov 1, 2017

It is the open-ended accuracy on the validation set that stays under 30 during training. Is that normal?


Cadene commented Nov 1, 2017

I did not try to train an embedding matrix + LSTM.
Send me a link to your repo here, or email me your code ;)


eustcPL commented Nov 1, 2017

It's so weird. When I use Skip-Thoughts, I get a much better result. Could you tell me what's wrong with the embedding matrix + LSTM?
Thank you very much!


Cadene commented Nov 1, 2017

I just saw in your mail that you are using a dropout of 0.5 for your LSTM.

The cuDNN backend that PyTorch uses doesn't include a sequential dropout. That is why I made a custom API for the GRU.
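
For context, "sequential dropout" here refers to reusing a single dropout mask across all timesteps of a sequence (often called variational or locked dropout), rather than sampling a fresh mask at every step as plain `nn.Dropout` would. A rough sketch of the idea, unrolling an `nn.GRUCell` by hand (illustrative only, not the repo's actual custom API):

```python
import torch
import torch.nn as nn

class SequentialDropoutGRU(nn.Module):
    """GRU unrolled with a GRUCell so one dropout mask, sampled once per
    sequence, is applied to the hidden state at every timestep."""
    def __init__(self, input_size, hidden_size, dropout=0.5):
        super().__init__()
        self.cell = nn.GRUCell(input_size, hidden_size)
        self.hidden_size = hidden_size
        self.dropout = dropout

    def forward(self, x):
        # x: (batch, seq_len, input_size)
        batch, seq_len, _ = x.shape
        h = x.new_zeros(batch, self.hidden_size)
        # Sample ONE mask and reuse it at every step (inverted dropout scaling).
        mask = None
        if self.training and self.dropout > 0:
            keep = 1.0 - self.dropout
            mask = x.new_empty(batch, self.hidden_size).bernoulli_(keep) / keep
        outputs = []
        for t in range(seq_len):
            h = self.cell(x[:, t], h)
            if mask is not None:
                h = h * mask  # same mask on the recurrent state each step
            outputs.append(h)
        return torch.stack(outputs, dim=1)  # (batch, seq_len, hidden_size)
```

Unrolling with a cell forgoes the fused cuDNN kernel, so it is slower, but it gives full control over where dropout is applied.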

I don't know what else to tell you besides: you should use the pretrained Skip-Thoughts model as your language-only model if you want a strong baseline.

What do you want to achieve?


eustcPL commented Nov 2, 2017

Okay, thank you again! I will close this issue.

@eustcPL eustcPL closed this as completed Nov 2, 2017