
Why take the first element of a batch after padding RNN output? #20

Closed
shengyuzhang opened this issue Apr 1, 2019 · 2 comments

@shengyuzhang

From my understanding, the code indicates that:

After padding the RNN output "out" into "padded" with batch_first=True, the first dimension of "padded" should be batch_size, so the operation "padded[0]" takes the first element of the batch. This operation seems unusual and is hard to understand. Am I wrong? Could someone explain the purpose of this code?

Thanks in advance.

@shengyuzhang (Author)

It seems that pad_packed_sequence returns a tuple rather than a tensor of shape [batch_size, seq_len, embed_size].

@fartashf (Owner) commented Apr 2, 2019

As you correctly pointed out, pad_packed_sequence returns a tuple, where the first element is the padded sequence.
https://pytorch.org/docs/stable/nn.html#torch.nn.utils.rnn.pad_packed_sequence
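
A minimal sketch of what the call returns (the RNN type, sizes, and lengths here are only illustrative, not the values used in this repository):

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

rnn = nn.GRU(input_size=8, hidden_size=16, batch_first=True)

x = torch.randn(3, 5, 8)   # batch of 3 sequences, padded to length 5
lengths = [5, 3, 2]        # true lengths, sorted in decreasing order

packed = pack_padded_sequence(x, lengths, batch_first=True)
out, hidden = rnn(packed)

# pad_packed_sequence returns a (padded_tensor, lengths) tuple,
# so indexing the result with [0] selects the padded tensor,
# not the first sequence of the batch.
result = pad_packed_sequence(out, batch_first=True)
padded = result[0]

print(padded.shape)  # torch.Size([3, 5, 16]) -> [batch, seq_len, hidden]
print(result[1])     # tensor([5, 3, 2])
```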

This is from an old PyTorch tutorial that I can no longer find. The purpose is to handle variable-length sequences. Nowadays, it is more common to pad sequences to equal length at batch-creation time, as the official PyTorch examples do:
https://github.com/pytorch/examples/tree/master/word_language_model
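
A rough sketch of that batch-time padding approach (the dataset and token ids below are made up for illustration, not taken from the linked example):

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader

# Illustrative variable-length sequences of token ids.
sequences = [torch.tensor([1, 2, 3, 4]),
             torch.tensor([5, 6]),
             torch.tensor([7, 8, 9])]

def collate_fn(batch):
    lengths = torch.tensor([len(s) for s in batch])
    # Pad every sequence in the batch to the length of the longest one.
    padded = pad_sequence(batch, batch_first=True, padding_value=0)
    return padded, lengths

loader = DataLoader(sequences, batch_size=3, collate_fn=collate_fn)
for padded, lengths in loader:
    print(padded.shape, lengths)  # torch.Size([3, 4]) tensor([4, 2, 3])
```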

fartashf closed this as completed on Apr 2, 2019.