different batch_size lead to different results #9

Open
JackHorse opened this issue Oct 11, 2019 · 6 comments

Comments

@JackHorse

Hi, I have been trying to reproduce your IWSLT-16 En-De results using the pre-trained NAT models. However, I get different results when I use different values of batch_size.

  • When batch_size = 1:

[Image: WeChat screenshot_20191011163447]

  • But when batch_size = 1600:

[Image: WeChat screenshot_20191011162118]

Can you tell me why?

@jaseleephd
Collaborator

Hmm, that is weird. Can you post the exact decoding script you used? If you're using length prediction, can you try disabling it and running again?

@JackHorse
Author

When I disable length prediction, the problem is solved.
But why is the BLEU score higher than before?

[Image: Capture]

The script is:

--batch_size 1 --load_vocab --dataset iwslt-ende --vocab_size 40000 --ffw_block highway --params small --lr_schedule anneal --fast --valid_repeat_dec 20 --use_argmax --next_dec_input both --mode test --remove_repeats --debug --load_from 02.08_20.10.ptrn_model_voc40k_2048_5_278_507_2_drop_0.1_drop_len_pred_0.3_0.0003_anne_anneal_steps_250000_high_tr4_2decs__pred_both_copy_argmax_
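One way to narrow this down is to compare the decoded outputs of the two runs sentence by sentence and check whether the differing hypotheses also differ in length, which would point at length prediction. The following is only a minimal sketch: it assumes each run's hypotheses are available as a list of strings in the same order, and the function name `differing_sentences` is hypothetical, not part of this repository.

```python
from typing import List, Tuple


def differing_sentences(
    hyps_a: List[str], hyps_b: List[str]
) -> List[Tuple[int, int, int]]:
    """Return (index, length_a, length_b) for every hypothesis pair that differs.

    hyps_a and hyps_b are assumed to hold the decoded sentences of two runs
    (for example batch_size 1 and batch_size 1600) in the same order.
    Lengths are counted in whitespace-separated tokens.
    """
    if len(hyps_a) != len(hyps_b):
        raise ValueError("both runs must decode the same number of sentences")

    diffs = []
    for index, (a, b) in enumerate(zip(hyps_a, hyps_b)):
        a, b = a.strip(), b.strip()
        if a != b:
            diffs.append((index, len(a.split()), len(b.split())))
    return diffs


if __name__ == "__main__":
    # Toy data standing in for the outputs of two decoding runs.
    run_small_batch = ["das ist ein Test", "guten Morgen"]
    run_large_batch = ["das ist ein Test", "guten Morgen Morgen"]
    for index, len_a, len_b in differing_sentences(run_small_batch, run_large_batch):
        print(f"sentence {index}: {len_a} tokens vs {len_b} tokens")
```

If most of the differing pairs have different token counts, the predicted target length is a likely source of the mismatch; if the lengths match but the tokens differ, the cause is probably elsewhere.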

@jaseleephd
Collaborator

Hmm, are you using the pretrained length prediction model? @mansimov might know more.

@JackHorse
Author

Thank you for your reply!

The pretrained model I use is the one you released at https://drive.google.com/open?id=1N8tfU5ttnov2jWk3-PHVMJClQA0pKXoN
But what is the pretrained length prediction model?

@mansimov
Collaborator

mansimov commented Oct 16, 2019

Thanks for raising an issue!

Are you using the same setup as we did in the paper (PyTorch 0.3 or 0.4)?
I will try running the pretrained models myself as well.

@JackHorse
Author

I use the same setup as in your requirements (PyTorch 0.4, Python 3.6.4, torchtext 0.2).
