Hello, I have the same question: which of the three should be used in the training script of CMLM+DSLP, "cmlm_sd", "cmlm_sd_ss", or "cmlm_transformer"? Is it clear to you now? Thanks!
Hi, thank you for releasing the code!
I have a question about the provided bash scripts for training and inference.
The training script for CMLM+DSLP is:
python3 train.py data-bin/wmt14.en-de_kd --source-lang en --target-lang de --save-dir checkpoints --eval-tokenized-bleu \
    --keep-interval-updates 5 --save-interval-updates 500 --validate-interval-updates 500 --maximize-best-checkpoint-metric \
    --eval-bleu-remove-bpe --eval-bleu-print-samples --best-checkpoint-metric bleu --log-format simple --log-interval 100 \
    --eval-bleu --eval-bleu-detok space --keep-last-epochs 5 --keep-best-checkpoints 5 --fixed-validation-seed 7 --ddp-backend=no_c10d \
    --share-all-embeddings --decoder-learned-pos --encoder-learned-pos --optimizer adam --adam-betas "(0.9,0.98)" --lr 0.0005 \
    --lr-scheduler inverse_sqrt --stop-min-lr 1e-09 --warmup-updates 10000 --warmup-init-lr 1e-07 --apply-bert-init --weight-decay 0.01 \
    --fp16 --clip-norm 2.0 --max-update 300000 --task translation_lev --criterion nat_loss --arch glat_sd --noise full_mask \
    --concat-yhat --concat-dropout 0.0 --label-smoothing 0.1 \
    --activation-fn gelu --dropout 0.1 --max-tokens 8192 \
    --length-loss-factor 0.1 --pred-length-offset
The "--arch glat_sd" flag looks odd here. Should it be "cmlm_sd" or "cmlm_transformer"?
Another question: could you please share the generation scripts for CMLM with more than one iteration, i.e., with "--iter-decode-max-iter 5" or "--iter-decode-max-iter 10"? I find that the BLEU score with iter=5/10 is much worse than with iter=1.
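For reference, this is the generation command I am currently using. It is not from this repo's scripts; it is a sketch based on fairseq's standard recipe for iterative-refinement (Levenshtein/CMLM) models, so the checkpoint path and batch size are placeholders:

```shell
# Hypothetical generation command for iterative decoding (iter > 1).
# checkpoints/checkpoint_best.pt and --batch-size are assumptions; adjust as needed.
fairseq-generate data-bin/wmt14.en-de_kd \
    --gen-subset test \
    --task translation_lev \
    --path checkpoints/checkpoint_best.pt \
    --iter-decode-max-iter 10 \
    --iter-decode-eos-penalty 0 \
    --beam 1 --remove-bpe \
    --batch-size 128
```

If the authors' script differs (e.g., uses --iter-decode-with-beam or --iter-decode-force-max-iter), that might explain the BLEU gap I am seeing at iter=5/10.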