OpenNMT benchmarking versus other pytorch libraries. #139
Comments
With such a small dataset the comparison is not really meaningful. I would say you would need a dataset of at least 100K sentences to get any indication of performance (this is for translation; for other tasks you may be able to get away with less).
Thank you for the comparison though. We will set this up as a benchmark to make sure it does at least this well. We have been using this data for comparison; it is about 200k sentences: https://github.com/OpenNMT/IntegrationTesting/tree/master/data Would you be willing to run the pytorch-seq2seq model on it for comparison's sake and post the ppl?
@Deepblue129 I agree that newstest2013 is too small for benchmarking effectiveness. I used it mainly for evaluating efficiency.
Any update on this? Would love to post these benchmarks now that our code is more stable. |
I haven't run our experiment on the 200k-sentence data mentioned above, but we have improved our speed. I will have a look at your integration test data and get back to you soon.
I have similar problems reproducing results with PyOpenNMT. I trained my English-to-German NMT model on Europarl v7, and almost exactly copied the parameter settings and corpus mentioned at http://opennmt.net/Models (but I didn't use preprocess.lua or the aggressive tokenizer, which are also mentioned). Even after many tries, I cannot get the perplexity below 15 on the validation set. I wonder why this happens? Have you tested PyOpenNMT on Europarl v7 and achieved 7.19 PPL on newstest2013.deen? Would you mind releasing benchmarks that we can refer to?
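For reference, the perplexity numbers being compared in this thread are exp of the mean per-token negative log-likelihood over the validation set. A minimal sketch of that conversion (the token counts and loss values below are illustrative, not from any log in this thread):

```python
import math

def perplexity(total_nll: float, num_tokens: int) -> float:
    """Corpus-level perplexity: exp of the mean negative
    log-likelihood (in nats) per target token."""
    return math.exp(total_nll / num_tokens)

# A hypothetical validation set of 10,000 target tokens with a summed
# NLL of 27,080 nats gives a mean NLL of 2.708 nats per token,
# i.e. perplexity just under 15 -- the threshold discussed above.
print(round(perplexity(27080.0, 10000), 2))
```

Note that implementations differ on whether padding tokens are excluded from `num_tokens`; counting pads deflates perplexity, which is one common source of mismatched numbers between libraries.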
We haven't tried this yet, but we have replicated other similar benchmarks, so I am surprised it is so far off. Can you send over your logs and command?
Sorry, I missed your message. Sure, here are my logs, an Adam version and an SGD version. The Adam version has larger ppl on the validation set but better accuracy, while the SGD version has smaller ppl but worse accuracy. Neither of them can achieve <15 ppl on the validation set. The logs only show the results of 13 epochs, but even when I run many more epochs there is no sign that they could reach that goal. I only used the default parameters and default learning rate scheduling, except for changing the initial learning rate and adding
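The Adam/SGD observation above (larger ppl yet better accuracy) is not contradictory: per-token accuracy only checks the argmax, while perplexity penalizes low confidence on the gold token. A toy illustration of the two metrics diverging (the distributions are made up, not from the thread's models):

```python
import math

def token_metrics(probs, targets):
    """probs: one probability distribution per token; targets: gold indices.
    Returns (accuracy, perplexity) over the token stream."""
    correct = sum(1 for p, t in zip(probs, targets)
                  if max(range(len(p)), key=p.__getitem__) == t)
    nll = -sum(math.log(p[t]) for p, t in zip(probs, targets))
    return correct / len(targets), math.exp(nll / len(targets))

gold = [0, 0, 0, 0]
# Model A: always ranks the gold token first, but with low confidence
a = [[0.4, 0.3, 0.3]] * 4
# Model B: very confident, but wrong on one token out of four
b = [[0.9, 0.05, 0.05]] * 3 + [[0.05, 0.9, 0.05]]

print(token_metrics(a, gold))  # accuracy 1.0, but worse (higher) ppl
print(token_metrics(b, gold))  # accuracy 0.75, but better (lower) ppl
```

So comparing the two optimizers on ppl alone, or accuracy alone, can rank them in opposite orders, exactly as the logs show.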
Here is the log of the latest (0.1.4) pytorch-seq2seq running on your integration test data with its default settings. Looks like
closing this, quite old. |
There is certainly a lot of great work going into OpenNMT. But with every new feature added from some paper, do we have any sense of whether those features, in tandem, are helping OpenNMT?
This gist compares OpenNMT-py vs. pytorch-seq2seq on newstest2013.
pytorch-seq2seq has a simple LSTM + attention model, and it achieves:
While OpenNMT, a much more mature library, does worse:
What are the goals of this library?