Cannot reach GroundHog performance #71
Comments
We were getting comparable scores for cs-en when the initial PR was made, around August, so the issues in the NMT repo might be outdated. IIRC there were fixes to the beam search, which uses the generate computational graph (the same one we use to generate samples). Have you checked whether the cost computational graphs produce the same cost (using the same batch and initial parameters)?
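The check suggested above can be sketched roughly as follows. This is illustrative only: `blocks_cost` and `groundhog_cost` are hypothetical wrappers around each framework's compiled cost function, and the tolerance is an assumption.

```python
def compare_costs(blocks_cost, groundhog_cost, batch, atol=1e-4):
    """Evaluate both cost graphs on the same batch (with identical
    initial parameters loaded into both models) and report the gap."""
    c_blocks = float(blocks_cost(batch))
    c_gh = float(groundhog_cost(batch))
    diff = abs(c_blocks - c_gh)
    return c_blocks, c_gh, diff <= atol
```

If the two costs already disagree on a single batch with shared parameters, the problem is in the model/graph rather than in the search.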
Thanks for your fast response,
Thanks, keep us posted
Henry Choi told me that he was able to reproduce the English-to-French results. (In reply to Orhan Firat's comment of 8 January 2016 at 15:32.)
@critias Hi, I am wondering whether you reached the GroundHog performance. If you did, how did you do it? I am trying the example as well and cannot reach that performance.
Hi,
It turned out the problem was on our side. We had changed some minor parts of the code, which caused a mismatch between the encoding used to create the vocabulary (plain bytes) and the encoding used during training/translation (Unicode).
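A minimal illustration of that kind of mismatch (not the project's actual code): if the vocabulary is keyed by UTF-8 byte strings but lookups at training time pass unicode strings, every non-trivial token silently falls back to the unknown-word index.

```python
# Vocabulary built from raw bytes read off disk.
vocab = {}
for i, word in enumerate([b"danke", b"sch\xc3\xb6n", b"."]):
    vocab[word] = i
UNK = len(vocab)  # index used for out-of-vocabulary tokens

def lookup(token):
    # A unicode str never equals a bytes key, so this quietly returns UNK.
    return vocab.get(token, UNK)

print(lookup("schön"))                   # unicode token: maps to UNK
print(lookup("schön".encode("utf-8")))   # encodings agree: correct index
```

The model then trains mostly on UNK tokens, which degrades BLEU without raising any error, matching the symptom described above.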
Hi,
we are having a hard time reproducing the results we got with GroundHog using Blocks.
Given the exact same training data, vocabulary, test set, and settings, we are 3 BLEU points behind GroundHog on a German-to-English translation task. We have tried many different setups and numbers of iterations, but we cannot close the gap.
The GroundHog translation costs also seem to correlate better with translation quality than the Blocks costs do. For example:
"vielen Dank ." translated to "thank you ."
a perfect translation and a common phrase which should have a low cost.
GroundHog cost: 0.000250929
Blocks cost: 0.357417
"fliegende Katze ." is translated to "fly away , cat .", which is not wrong but a rather strange/unusual sentence.
GroundHog cost: 0.280177
Blocks cost: 0.267061
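As an aside, raw sentence costs are easier to compare across sentences when normalized by length; a minimal sketch, assuming the numbers above are total per-sentence negative log-likelihoods:

```python
def per_token_cost(total_cost, n_tokens):
    """Average negative log-likelihood per token of the output sentence."""
    return total_cost / n_tokens

# "thank you ." has 3 tokens; costs quoted above.
blocks = per_token_cost(0.357417, 3)
groundhog = per_token_cost(0.000250929, 3)
print(blocks, groundhog)
```

Even after normalization, the ordering of the two Blocks examples stays the same, so the oddity below is not just a length effect.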
Blocks gives "thank you ." a higher cost than "fly away , cat .", which seems strange to me. I take this as a hint that the problem is mainly in the model, not in the search. The last comment here:
kyunghyuncho/NMT#21
seems to have the same issue. Has there been any progress on this?
Any tips on where the Blocks computation graph differs from the GroundHog graph (it is too large to just look at and spot a difference)? Or other hints as to what the problem could be?
Thanks,