
How to reproduce the result on WMT14 En-De #202

Closed
ustctf-zz opened this issue Jun 28, 2018 · 10 comments

ustctf-zz commented Jun 28, 2018

Hi,

Thank you for providing such an impressive toolkit!

To replicate the WMT14 En-De translation result, I followed the instructions here, but after running on 8 M40 GPUs for 5.5 days, the test-set BLEU (<27) does not match the one reported in the paper, or even the original T2T paper (28.4). Could you tell me what is wrong on my side? Here is the training script:

model=transformer
PROBLEM=WMT14_ENDE
SETTING=transformer_vaswani_wmt_en_de_big

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python train.py ${REMOTE_DATA_PATH}/wmt14_en_de_joined_dict \
  --arch $SETTING --share-all-embeddings \
  --optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 \
  --lr-scheduler inverse_sqrt --warmup-init-lr 1e-07 --warmup-updates 4000 \
  --lr 0.001 --min-lr 1e-09 --update-freq 16 \
  --dropout 0.3 --weight-decay 0.0 --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
  --max-tokens 4096 --no-progress-bar --save-dir ${REMOTE_MODEL_PATH}/$model/$PROBLEM/$SETTING

(I do not use --fp16, and I slightly enlarged the per-GPU batch size from 3584 to 4096 tokens.)
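For reference, the effective batch size implied by these flags works out roughly as below (a sketch, not an exact count: fairseq fills batches by token count, so the real number per update varies):

GPUS=8           # CUDA_VISIBLE_DEVICES=0..7
MAX_TOKENS=4096  # --max-tokens (per GPU)
UPDATE_FREQ=16   # --update-freq (gradient accumulation steps)
echo $((GPUS * MAX_TOKENS * UPDATE_FREQ))  # 524288 tokens per update,
                                           # i.e. about 128 GPUs x 4096 tokens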

Here is the test script:

python generate.py ${REMOTE_DATA_PATH}/wmt14_en_de_joined_dict --path ${REMOTE_MODEL_PATH}/${model}/${PROBLEM}/${SETTING}/checkpoint_best.pt --batch-size 128 --beam 4 --lenpen 0.6 --quiet --remove-bpe --no-progress-bar

It outputs (after training for 5.5 days): Generate test with beam=4: BLEU4 = 26.66, 57.9/32.3/20.4/13.2 (BP=1.000, ratio=1.013, syslen=66179, reflen=65346)

BTW, it seems the dataset generated by prepare-wmt14en2de.sh has fewer than 4M training pairs, rather than the expected ~4.5M. Could that be the reason?
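A quick way to check the corpus size is something like the following (a sketch; the directory and the train.en/train.de file names are assumptions about the usual prepare-wmt14en2de.sh output, adjust as needed):

TEXT=wmt14_en_de   # assumed output directory of prepare-wmt14en2de.sh
wc -l $TEXT/train.en $TEXT/train.de   # both should report the same count; ~4.5M lines expected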

Thanks a lot.

myleott commented Jun 28, 2018

Yes, you are right. Originally I used the Google dataset [1], but I was hoping to reproduce the results with our own script, since it's not clear how the Google version was preprocessed.

I'm working on an updated preprocessing script that should better match the Google version (~4.5M pairs). I'll post it here and update the README shortly.

[1] https://github.com/tensorflow/tensor2tensor/blob/6a7ef7f79f56fdcb1b16ae76d7e61cb09033dc4f/tensor2tensor/data_generators/translate_ende.py#L60-L61

myleott commented Jun 28, 2018

Please try this dataset: #203

I just ran it on 128 GPUs and now get the same results as (actually slightly better than) the paper.

@ustctf-zz (Author)

Thanks @myleott!

I'm training on the new dataset (with 8 GPUs) and will report back with the latest results.

@ustctf-zz (Author)

Hi @myleott, after running on 8 M40 GPUs for about 5 days, I obtained a BLEU of 28.77 on WMT14 En-De. Thanks again for the code and help!

BTW, do you plan to provide a detailed config/command to reproduce the result on WMT14 En-Fr? Thanks!

myleott commented Jul 11, 2018

For En-Fr you can use the transformer_vaswani_wmt_en_fr_big architecture. It's nearly identical to the En-De architecture except that we use a smaller dropout value: https://github.com/pytorch/fairseq/blob/f26b6affdaf67d271e0d39f4c4c8384c4e8160d9/fairseq/models/transformer.py#L467-L470

I used the standard fairseq En-Fr dataset with 40k BPE tokens, available here: https://github.com/pytorch/fairseq/blob/master/examples/translation/prepare-wmt14en2fr.sh. For preprocessing, make sure to add the --joined-dictionary flag.
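Roughly, the En-Fr preprocessing and training commands could look like the sketch below; the paths, output directories and save dir are placeholders rather than an official recipe, and the only substantive changes from the En-De command above are the architecture and the smaller dropout:

TEXT=wmt14_en_fr   # assumed output directory of prepare-wmt14en2fr.sh
python preprocess.py --source-lang en --target-lang fr \
  --trainpref $TEXT/train --validpref $TEXT/valid --testpref $TEXT/test \
  --destdir data-bin/wmt14_en_fr_joined_dict --joined-dictionary

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python train.py data-bin/wmt14_en_fr_joined_dict \
  --arch transformer_vaswani_wmt_en_fr_big --share-all-embeddings \
  --optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 \
  --lr-scheduler inverse_sqrt --warmup-init-lr 1e-07 --warmup-updates 4000 \
  --lr 0.001 --min-lr 1e-09 --update-freq 16 \
  --dropout 0.1 --weight-decay 0.0 --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
  --max-tokens 4096 --no-progress-bar --save-dir checkpoints/wmt14_en_fr_big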

myleott closed this as completed Jul 11, 2018
@ustctf-zz (Author)

Thanks!

@wangqiangneu

> Hi @myleott, after running on 8 M40 GPUs for about 5 days, I obtained a BLEU of 28.77 on WMT14 En-De. Thanks again for the code and help!
>
> BTW, do you plan to provide a detailed config/command to reproduce the result on WMT14 En-Fr? Thanks!

Hi @myleott @ustctf, if I use the newly processed WMT14 En-De data provided by Google, should I also do some post-processing (like get_ende_bleu.sh in tensor2tensor) to get a good BLEU?
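For context, the kind of post-processing get_ende_bleu.sh does is roughly the following (a sketch with placeholder file names, not the actual tensor2tensor script): tokenize both sides with Moses, rewrite hyphenated compounds into the ##AT##-##AT## form, then score with multi-bleu.perl.

MOSES=~/mosesdecoder          # placeholder path to a Moses checkout
REF=newstest2014.de           # placeholder: detokenized reference
HYP=system_output.de          # placeholder: detokenized, de-BPE'd system output

# Tokenize hypothesis and reference.
perl $MOSES/scripts/tokenizer/tokenizer.perl -l de < $HYP > hyp.tok
perl $MOSES/scripts/tokenizer/tokenizer.perl -l de < $REF > ref.tok

# Split hyphenated compounds ("foo-bar" -> "foo ##AT##-##AT## bar") on both sides.
perl -ple 's{(\S)-(\S)}{$1 ##AT##-##AT## $2}g' < hyp.tok > hyp.tok.atat
perl -ple 's{(\S)-(\S)}{$1 ##AT##-##AT## $2}g' < ref.tok > ref.tok.atat

# Score with multi-bleu.
perl $MOSES/scripts/generic/multi-bleu.perl ref.tok.atat < hyp.tok.atat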

@kalyangvs (Contributor)

Hi @ustctf, can you provide the BLEU score you got for En-Fr using this script: https://github.com/pytorch/fairseq/blob/master/examples/translation/prepare-wmt14en2fr.sh?
If you also trained the base transformer, please provide those scores as well. Thanks.

@ustctf-zz (Author)

@gvskalyan Sorry, I have no records of that. Maybe you can ask the maintainers for official help.

@kalyangvs (Contributor)

> @gvskalyan Sorry, I have no records of that. Maybe you can ask the maintainers for official help.

Yeah, thank you.
