[Doc] explaining romanian postprocessing for MBART BLEU hacking #5943

Merged
sshleifer merged 1 commit into huggingface:master from ro-pro on Jul 21, 2020


sshleifer
Contributor

No description provided.

@sshleifer sshleifer requested a review from sgugger July 21, 2020 16:22
Collaborator

@sgugger sgugger left a comment

LGTM

@codecov

codecov bot commented Jul 21, 2020

Codecov Report

Merging #5943 into master will decrease coverage by 0.00%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##           master    #5943      +/-   ##
==========================================
- Coverage   77.31%   77.31%   -0.01%     
==========================================
  Files         146      146              
  Lines       26214    26214              
==========================================
- Hits        20268    20267       -1     
- Misses       5946     5947       +1     
| Impacted Files | Coverage Δ | |
|---|---|---|
| src/transformers/file_utils.py | 81.19% <0.00%> (-0.30%) | ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ccbf74a...163ba92.

@vince62s

@sshleifer just a quick comment on this, from here: facebookresearch/fairseq#1758 (comment)

For EN to RO, there is no point in maximising BLEU by removing diacritics when, in the real world (WMT human evaluation + SMT Matrix), the output is clearly compared against a reference which HAS diacritics. But a lot of papers do not do this right.
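
To make the point above concrete, here is a minimal sketch of why stripping diacritics changes the score. It is not the postprocessing script this PR documents: `strip_diacritics` and `unigram_precision` are hypothetical helpers, the sentence pair is invented, and unigram precision is only a toy stand-in for BLEU.

```python
import unicodedata
from collections import Counter


def strip_diacritics(text: str) -> str:
    """Remove combining marks (hypothetical helper, not the PR's script)."""
    # NFD splits e.g. "ș" into "s" plus a combining comma below.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop every nonspacing mark (Unicode category "Mn").
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def unigram_precision(hyp: str, ref: str) -> float:
    """Clipped unigram precision: a toy stand-in for BLEU, not BLEU itself."""
    hyp_tokens = hyp.split()
    ref_counts = Counter(ref.split())
    matched = 0
    for tok in hyp_tokens:
        if ref_counts[tok] > 0:
            matched += 1
            ref_counts[tok] -= 1
    return matched / len(hyp_tokens) if hyp_tokens else 0.0


# Invented example: a reference with diacritics and a system output without them.
reference = "Știința și tehnologia"
hypothesis = "Stiinta si tehnologia"

score_real = unigram_precision(hypothesis, reference)
score_stripped = unigram_precision(hypothesis, strip_diacritics(reference))
print(score_real, score_stripped)
```

Scored against the real reference, only one of the three tokens matches; once the reference is stripped as well, all three match. The higher number comes from changing the reference, not from a better translation.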

@sshleifer
Contributor Author

sshleifer commented Jul 21, 2020

That makes sense @vince62s! This is mostly so I can have a link to paste into a GitHub issue when people ask me why their BLEU score is 27 :)

@sshleifer sshleifer changed the title [Doc] explaining romanian postprocessing to get high BLEU for en-ro [Doc] explaining romanian postprocessing for MBART BLEU hacking Jul 21, 2020
@sshleifer sshleifer merged commit 95d1962 into huggingface:master Jul 21, 2020
@sshleifer sshleifer deleted the ro-pro branch July 21, 2020 18:12