
How to deal with sequences with different lengths? #41

Open

sunhaozhepy opened this issue Oct 27, 2022 · 1 comment

Comments

@sunhaozhepy

Thank you for your great work! I've read your paper, but I'm having trouble understanding how sequences of different lengths are generated. It seems to me that since you fix n = 64 in the experiments, it can't be changed afterwards, as the hidden size d' = n*d in the Transformer is fixed. As a result, it should be impossible to generate sequences with a length other than 64 at inference time...?

@XiangLi1999
Owner

Hi,

Thanks for the question!

We can generate sentences shorter than length 64 via padding. If the training script sets --padding_mode pad, then the format will be [BOS][SENTENCE][EOS][PAD][PAD][PAD]...
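For illustration, here is a minimal sketch of that fixed-length padding scheme (the special-token ids and the helper name are hypothetical placeholders, not Diffusion-LM's actual vocabulary or API):

```python
BOS, EOS, PAD = 0, 1, 2  # hypothetical special-token ids
MAX_LEN = 64             # the fixed training length n

def pad_to_fixed_length(token_ids, max_len=MAX_LEN):
    """Wrap a sentence as [BOS][SENTENCE][EOS][PAD]... of length max_len."""
    seq = [BOS] + token_ids[: max_len - 2] + [EOS]  # truncate overly long sentences
    seq += [PAD] * (max_len - len(seq))             # pad the remainder to max_len
    return seq

# A 5-token sentence becomes a length-64 sequence with 57 trailing PADs.
print(pad_to_fixed_length([10, 11, 12, 13, 14]))
```

Because [EOS] marks the effective end of the sentence, the trailing [PAD] tokens can simply be stripped at decode time to recover outputs shorter than 64.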

You could try decoding from the model at https://drive.google.com/drive/folders/110CA22rwu_3EcllPYGhql0TnYeOBY77d by running batch_decode.py, and you will observe this padding pattern.
