Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sort sequences internally in pack_padded_sequence #3584

Closed
colesbury opened this issue Nov 9, 2017 · 13 comments
Closed

Sort sequences internally in pack_padded_sequence #3584

colesbury opened this issue Nov 9, 2017 · 13 comments
Assignees

Comments

@colesbury
Copy link
Member

Currently, pack_padded_sequence requires the caller to sort inputs by the length (descending). We should sort and reorder the input internally if it's not already sorted. We can then restore the original order in pad_packed_sequence.

cc @kyunghyuncho

@hhsecond
Copy link
Contributor

hhsecond commented Nov 9, 2017

I am working on 1128. I can add this to my code as well

@apaszke
Copy link
Contributor

apaszke commented Nov 9, 2017

This will make it impossible to do the batched filling we do now

@colesbury
Copy link
Member Author

@apaszke, you can sort and then do the batched filling

@apaszke
Copy link
Contributor

apaszke commented Nov 9, 2017

but how do you sort a tensor? with a series of selects and cat? it's expensive

@colesbury
Copy link
Member Author

colesbury commented Nov 9, 2017 via email

@sanyam5
Copy link

sanyam5 commented Jan 24, 2018

Would really love to have this feature! Has this been implemented yet?

@elanmart
Copy link
Contributor

But you'd still have to manually sort all other tensors in a batch, no?

@sanyam5
Copy link

sanyam5 commented Jan 25, 2018

Well, I was hoping for a method that would first sort the inputs, then compute the results and then restore the original order before returning the results. This way everything would happen under-the-hood and the user need not worry about sorting anything else in their code.

@junpeiz
Copy link

junpeiz commented Feb 3, 2018

If we manually do the sort before RNN and unsort after the RNN, would it influence the backpropagation? For example,

def forward(self, input, lengths, hidden):
    # Sort the input and lengths as the descending order
    lengths, perm_index = lengths.sort(0, descending=True)
    input = input[perm_index]

    packed_input = pack(input, list(lengths.data), batch_first=True)
    output, hidden = self.rnn(packed_input, hidden)
    output = unpack(output, batch_first=True)[0]
    
    # restore the sorting
    odx = perm_index.view(-1, 1).unsqueeze(1).expand(output.size(0), output.size(1), output.size(2))
    decoded = output.gather(0, odx)
    return decoded, hidden

@RadZaeem
Copy link

RadZaeem commented Feb 12, 2018

@zhoujunpei
in my experience, I managed to do backprop by also sorting the labels with the same ordering as the lengths list.
labels = labels[perm_index]
loss = CrossEntropy(out, labels)
loss.backward()

if not, the backprop will fit into garbage (random labels)

maybe your method also works. the point is the ouput and label index is same

@cdluminate
Copy link
Contributor

second this proposal

@soumith soumith closed this as completed May 30, 2018
@soumith soumith reopened this May 30, 2018
@airsplay
Copy link

airsplay commented Oct 6, 2018

For efficiency, is it possible to check whether the sequence is sorted inside the function "pack_padded_sequence"?
It would be much convenient if the descending requirement is not mandatory.

@zou3519 zou3519 self-assigned this Dec 13, 2018
zou3519 added a commit to zou3519/pytorch that referenced this issue Dec 20, 2018
… in RNNs

Fixes pytorch#3584.

Motivation: manually sorting sequences, packing them, and then unsorting them
is something a lot of users have complained about doing, especially when we can
offer library support for them.

Overview: we internally sort sequences before packing them and store a list of
`unsorted_indices` that represent how to unsort the sequences inside
PackedSequence. The packing helper functions return PackedSequence with the
`permutation` field and the unpacking helper functions use it to unsort.

To implement this, the following changes were made:
- PackedSequence now keeps `sorted_indices` and `unsorted_indices`.
  These two can be thought of as permutations and are inverses of each other.
  `sorted_indices` is how the sequences were sorted; `unsorted_indices` is how
  to unsort the sequences.
- Added an `enforce_sorted` argument to pack_sequence and pack_padded_sequence
  that maintains the legacy behavior of error-ing out on unsorted-sequences.
  When `enforce_sorted=True`, these functions maintain their ONNX exportability.
- pack_sequence(sequences, enforce_sorted) takes in unsorted sequences.
- pack_padded_sequence can take in a padded tensor that represents padded,
  unsorted sequences.
- pad_packed_sequence unsorts the PackedSequence such that it is still the
  inverse operation of packed_padded_sequence.
- RNNs apply `sort_indices` to their input hidden state and apply
  `unsort_indices` to their output hidden state. This is to ensure that the
  hidden state batches correspond to the user's ordering of input sequences.

NOT BC-Breaking
- The default for pack_sequence and pack_padded_sequence is
  `enforce_sorted=True` to avoid breaking ONNX export. To use the new
  functionality, pass in `enforce_sorted=False`

Testing Plan
- Modified TestNN.test_pack_sequence, TestNN.test_packed_padded_sequence,
  and TestNN.test_variable_sequence (RNN test) to check the behavior
  of unsorted sequences, sorted sequences, and sorted sequences with
  enforce_sorted=True
@cdluminate
Copy link
Contributor

Excellent!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.