WMT15 BaryScore #3
Comments
Hi,
I didn't change the model, layer, or epsilon; I just used the default settings from the command_line.sh in your latest commit. The only things I changed were adding the lines for the Pearson correlation and putting the WMT15 newstest de-en data into the samples folder.
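For readers following along, here is a minimal sketch of the kind of Pearson correlation check described above. It is not the code that was added to command_line.sh; the function name `pearson` and the toy score lists are hypothetical, and the values are made up purely for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: one metric score and one human score per segment.
metric_scores = [0.9, 0.7, 0.4, 0.2]
human_scores = [0.1, 0.3, 0.6, 0.8]

r = pearson(metric_scores, human_scores)
# Negating the metric scores flips the sign of the correlation.
r_negated = pearson([-s for s in metric_scores], human_scores)
print(r, r_negated)
```

The sign of the coefficient depends on the orientation of the score: a score where lower means better would correlate negatively with human judgments where higher means better. Whether that plays any role in the numbers reported in this thread is not established here.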
Can you run it using the last 3 or 5 layers and check the different div/metric settings? You should be able to reproduce exactly the same results (a co-author did), but I cannot tell you the exact parameters right now. I will try to get access to the file.
OK, I just tried various layers: the last 5 layers give -0.3559000480869218, the last 3 layers give -0.44103185124368643, and some other values were similar. If I run DepthScore on the same data, I get a similar correlation. By the way, DepthScore logs some warnings:
and InfoLM even stops with an error:
Yes. For BERTScore, for example, I get the results that the authors reported in their paper.
Or could you please tell me the results for the examples in the samples folder? Maybe that would help in finding the solution.
There are no results on WMT15 in the BERTScore paper. Can you tell me the correlations you get with BLEU and BERTScore?
BERTScore published results on WMT18, and their GitHub repo links to a list of how BERTScore performs with various models on WMT16: https://docs.google.com/spreadsheets/d/1RKOVpselB98Nnh_EOC4A2BYn8_201tmPODpNWu4w7xI/edit#gid=0 BERTScore gives me a correlation of 0.7485 on WMT15. I haven't tried BLEU yet; I'll send it later.
Thanks a lot for these scores. That looks really helpful. Do you have the script that created these files in your repo? I couldn't configure score_cli.py to report these models. For example, does roberta-base_lm_wsw_nbarycentersTrue_range(8, 13) mean that you use roberta-base in BaryScore? Then I guess range(8, 13) means last_layers=5? Does nbarycentersTrue mean that you take 'baryscore_W' from the score dictionary? And what do lm and wsw mean?
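As a quick check of the arithmetic behind the range(8, 13) guess, here is a small sketch. It assumes that layers are indexed over 13 hidden states (the embedding output plus the 12 transformer layers of roberta-base); that indexing convention is an assumption and is not confirmed by the maintainers in this thread.

```python
# Layer indices named in the file name "..._range(8, 13)".
layers = list(range(8, 13))

# Assumption: 13 hidden states = embedding output + 12 transformer layers.
num_hidden_states = 13
last_five = list(range(num_hidden_states))[-5:]

print(layers)               # the five indices from 8 up to and including 12
print(layers == last_five)  # whether they coincide with the last five states
```

Under that assumption, range(8, 13) selects the same indices as taking the last 5 hidden states, which is consistent with the last_layers=5 reading.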
I also have a problem reproducing your results. If I run your command_line.sh on WMT15 de-en, I get a Pearson correlation of -0.3559000480869218, but in your paper you reported 75.9. How did you run these experiments?