<div class="alert alert-block alert-info">
<b>Number of points for this notebook:</b> 4
<br>
<b>Deadline:</b> March 30, 2020 (Monday). 23:00
</div>

# Exercise 5. Sequence-to-sequence modeling with recurrent neural networks

The goals of this exercise are
* to get familiar with recurrent neural networks used for sequential data processing
* to get familiar with the sequence-to-sequence model for machine translation
* to learn PyTorch tools for batch processing of sequences with varying lengths
* to learn how to write a custom `DataLoader`

You may find it useful to look at this tutorial:
* [Translation with a Sequence to Sequence Network and Attention](https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html)

In [1]:
skip_training = True# Set this flag to True before validation and submission

In [2]:
# During evaluation, this cell sets skip_training to True
# skip_training = True

In [3]:
import os
import random
import numpy as np

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

import tools
import tests

In [4]:
# When running on your own computer, you can specify the data directory by:
# data_dir = tools.select_data_dir('/your/local/data/directory')
data_dir = tools.select_data_dir()

The data directory is /coursedata


In [5]:
# Select the device for training (use GPU if you have one)
#device = torch.device('cuda:0')
device = torch.device('cpu')

In [6]:
if skip_training:
    # The models are always evaluated on CPU
    device = torch.device("cpu")

## Data

The dataset that we are going to use consists of pairs of sentences in French and English.

In [7]:
from data import TranslationDataset, MAX_LENGTH, SOS_token, EOS_token

trainset = TranslationDataset(data_dir, train=True)

* `TranslationDataset` supports indexing as required by `torch.utils.data.Dataset`.
* Sentences are tensors of maximum length `MAX_LENGTH`.
* Words in a (sentence) tensor are represented as an index (integer) in a language vocabulary.
* The string representation of a word from the source language can be obtained from index `i` with `dataset.input_lang.index2word[i]`.
* Similarly for the target language `dataset.output_lang.index2word[j]`.

Let us look at samples from that dataset.

In [8]:
src_sentence, tgt_sentence = trainset[np.random.choice(len(trainset))]
print('Source sentence: "%s"' % ' '.join(trainset.input_lang.index2word[i.item()] for i in src_sentence))
print('Sentence as tensor of word indices:')
print(src_sentence)

print('Target sentence: "%s"' % ' '.join(trainset.output_lang.index2word[i.item()] for i in tgt_sentence))
print('Sentence as tensor of word indices:')
print(tgt_sentence)

Source sentence: "c est que nous appelons un pionnier . EOS"
Sentence as tensor of word indices:
tensor([ 145,   25,  914,  123, 3629,   66, 3628,    5,    1])
Target sentence: "he is what we call a pioneer . EOS"
Sentence as tensor of word indices:
tensor([  14,   40, 1471,   77, 1389,   42, 2266,    4,    1])


In [9]:
print('Number of source-target pairs in the training set: ', len(trainset))

Number of source-target pairs in the training set:  8682


## Sequence-to-sequence model for machine translation

In this exercise, we are going to build a machine translation system which transforms a sentence in one language into a sentence in another one. The computational graph of the translation model is shown below:

<img src="seq2seq.png" width=900>

We are going to use a simplified model without the dotted connections.

## Custom DataLoader

We would like to train the sequence-to-sequence model using mini-batch training.
One difficulty of mini-batch training in this case is that sequences may have varying lengths and this has to be taken into account when building the computational graph. Luckily, PyTorch has tools to support batch processing of such sequences.
To use those tools, we need to write a custom data loader which puts sequences of varying lengths in the same tensor. We can customize the data loader by providing a custom `collate_fn` as explained [here](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader).

Our collate function:
- combines sequences from the source language in a single tensor with extra values (at the end) filled with `padding_value=0`.
- combines sequences from the target language in a single tensor with extra values (at the end) filled with `padding_value=0`.

**Important**:
- Late in the code (not in this `collate` function), we will convert source sequences to objects of class [`PackedSequence`](https://pytorch.org/docs/stable/nn.html?highlight=packedsequence#torch.nn.utils.rnn.PackedSequence) which can be processed by recurrent units such as `GRU` or `LSTM`. `PackedSequence` requires sequences to be sorted by their lengths.
**Therefore, the returned source sequences should be sorted by length in a decreasing order.**
* The target sequences need not be sorted by their lengths because we have to keep the same order of sequences in the source and target tensors.

Your task is to implement the collate function.

In [10]:
padding_value = 0

In [11]:
from torch.nn.utils.rnn import pad_sequence

def collate(list_of_samples):
    """Merges a list of samples to form a mini-batch.

    Args:
      list_of_samples is a list of tuples (src_seq, tgt_seq):
          src_seq is of shape (src_seq_length,)
          tgt_seq is of shape (tgt_seq_length,)

    Returns:
      src_seqs of shape (max_src_seq_length, batch_size): Tensor of padded source sequences.
          The sequences should be sorted by length in a decreasing order, that is src_seqs[:,0] should be
          the longest sequence, and src_seqs[:,-1] should be the shortest.
      src_seq_lengths: List of lengths of source sequences.
      tgt_seqs of shape (max_tgt_seq_length, batch_size): Tensor of padded target sequences.
    """
    # YOUR CODE HERE
    try:
        list_of_samples = sorted(list_of_samples, key=lambda x: len(x[0]), reverse=True)
        src_seqs, tgt_seqs = zip(*list_of_samples)
        src_seq_length = [len(x) for x in src_seqs]
        
        src_seqs = pad_sequence(src_seqs, padding_value=padding_value)
        tgt_seqs = pad_sequence(tgt_seqs, padding_value=padding_value)
        return src_seqs, src_seq_length, tgt_seqs
    except:
        raise NotImplementedError()

In [12]:
def test_collate_shapes():
    pairs = [
        (torch.LongTensor([1, 2]), torch.LongTensor([3, 4, 5])),
        (torch.LongTensor([6, 7, 8]), torch.LongTensor([9, 10])),
    ]
    pad_src_seqs, src_seq_lengths, pad_tgt_seqs = collate(pairs)
    assert pad_src_seqs.shape == torch.Size([3, 2]), f"Bad pad_src_seqs.shape: {pad_src_seqs.shape}"
    assert pad_src_seqs.dtype == torch.long
    assert pad_tgt_seqs.shape == torch.Size([3, 2]), f"Bad pad_tgt_seqs.shape: {pad_tgt_seqs.shape}"
    assert pad_tgt_seqs.dtype == torch.long
    print('Success')

test_collate_shapes()

Success


In [13]:
# This cell tests collate() function

In [14]:
# We create custom DataLoader using the implemented collate function
# We are going to process 64 sequences at the same time (batch_size=64)
from torch.utils.data import DataLoader
trainloader = DataLoader(dataset=trainset, batch_size=64, shuffle=True, collate_fn=collate, pin_memory=True)

## Encoder

The encoder encodes a source sequence $(x_1, x_2, ..., x_T)$ into a single vector $h_T$ using the following recursion:
$$
  h_{t} = f(h_{t-1}, x_t) \qquad t = 1, \ldots, T
$$
where:
* intial state $h_0$ is often chosen arbitrarily (we choose it to be zero)
* function $f$ is defined by the type of the RNN cell (in our experiments, we will use [GRU](https://pytorch.org/docs/stable/nn.html#torch.nn.GRU))
* $x_t$ is a vector that represents the $t$-th word in the source sentence.

A common practice in natural language processing is to _learn_ the word representations $x_t$ (instead of, for example, using one-hot coded vectors). In PyTorch, this is supported by class [Embedding](https://pytorch.org/docs/stable/nn.html#torch.nn.Embedding) which we are going to use.

The computational graph of the encoder is shown below:

<img src="seq2seq_encoder.png" width=500>

Your task is to implement the `forward` function of the encoder. It should contain the following steps:
* Embed the words of the source sequences.
* Pack source sequences using [`pack_padded_sequence`](https://pytorch.org/docs/stable/nn.html?highlight=pack_padded_sequence#torch.nn.utils.rnn.pack_padded_sequence). This converts padded source sequences into an object that can be processed by PyTorch recurrent units such as `nn.GRU` or `nn.LSTM`.
* Apply GRU computations to packed sequences obtained in the previous step
* Convert packed sequence of GRU outputs into padded representation with [`pad_packed_sequence`](https://pytorch.org/docs/stable/nn.html?highlight=pad_packed_sequence#torch.nn.utils.rnn.pad_packed_sequence).

In [15]:
class Encoder(nn.Module):
    def __init__(self, src_dictionary_size, embed_size, hidden_size):
        """
        Args:
          src_dictionary_size: The number of words in the source dictionary.
          embed_size: The number of dimensions in the word embeddings.
          hidden_size: The number of features in the hidden state of GRU.
        """
        super(Encoder, self).__init__()
        self.hidden_size = hidden_size
        self.embedding = nn.Embedding(src_dictionary_size, embed_size)
        self.gru = nn.GRU(input_size=embed_size, hidden_size=hidden_size)

    def forward(self, pad_seqs, seq_lengths, hidden):
        """
        Args:
          pad_seqs of shape (max_seq_length, batch_size): Padded source sequences.
          seq_lengths: List of sequence lengths.
          hidden of shape (1, batch_size, hidden_size): Initial states of the GRU.

        Returns:
          outputs of shape (max_seq_length, batch_size, hidden_size): Padded outputs of GRU at every step.
          hidden of shape (1, batch_size, hidden_size): Updated states of the GRU.
        """
        # YOUR CODE HERE
        try:
            embedding = self.embedding(pad_seqs)
            packed = pack_padded_sequence(embedding, seq_lengths)
            print(packed)
            outputs, hidden = self.gru(packed, hidden)
            outputs, _ = pad_packed_sequence(outputs)
            return outputs, hidden
        except:
            raise NotImplementedError()
    
    def init_hidden(self, batch_size=1):
        return torch.zeros(1, batch_size, self.hidden_size)

In [16]:
def test_Encoder_shapes():
    hidden_size = 3
    encoder = Encoder(src_dictionary_size=5, embed_size=10, hidden_size=hidden_size)

    max_seq_length = 4
    batch_size = 2
    hidden = encoder.init_hidden(batch_size=batch_size)
    pad_seqs = torch.tensor([
        [1, 2],
        [2, 3],
        [3, 0],
        [4, 0]
    ])

    outputs, new_hidden = encoder.forward(pad_seqs=pad_seqs, seq_lengths=[4, 2], hidden=hidden)
    assert outputs.shape == torch.Size([4, batch_size, hidden_size]), f"Bad outputs.shape: {outputs.shape}"
    assert new_hidden.shape == torch.Size([1, batch_size, hidden_size]), f"Bad new_hidden.shape: {new_hidden.shape}"
    print('Success')

test_Encoder_shapes()

PackedSequence(data=tensor([[-1.2423, -0.0124,  0.5017,  0.8370, -1.4228, -0.2938,  2.0453, -0.3844,
          0.8609,  0.9017],
        [-0.6772,  2.3649, -0.5501, -2.3647, -1.0711,  1.6568, -0.9419,  0.9434,
          0.3803,  1.3666],
        [-0.6772,  2.3649, -0.5501, -2.3647, -1.0711,  1.6568, -0.9419,  0.9434,
          0.3803,  1.3666],
        [ 1.0180, -1.1960, -0.0071,  0.1918, -2.6715,  0.9023, -0.0368,  1.5802,
          0.1921, -0.6606],
        [ 1.0180, -1.1960, -0.0071,  0.1918, -2.6715,  0.9023, -0.0368,  1.5802,
          0.1921, -0.6606],
        [-1.9189, -1.3079,  1.8937, -0.8635,  0.3058,  0.5225, -0.5711, -0.3023,
         -0.8606,  0.0335]], grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([2, 2, 1, 1]), sorted_indices=None, unsorted_indices=None)
Success


In [17]:
tests.test_Encoder(Encoder)

PackedSequence(data=tensor([[-0.5000,  0.2000],
        [ 0.1000, -0.4000],
        [ 0.1000, -0.4000],
        [-0.3000,  0.4000],
        [-0.3000,  0.4000],
        [-0.1000,  0.8000]]), batch_sizes=tensor([2, 2, 1, 1]), sorted_indices=None, unsorted_indices=None)
outputs[:, 0, :]:
 tensor([[ 0.0000, -0.0150],
        [ 0.0004, -0.0221],
        [ 0.0007, -0.0055],
        [ 0.0005,  0.0323]])
expected:
 tensor([[ 0.0000, -0.0150],
        [ 0.0004, -0.0221],
        [ 0.0007, -0.0055],
        [ 0.0005,  0.0323]])
outputs[:2, 1, :]:
 tensor([[ 0.0000, -0.0150],
        [ 0.0004, -0.0021]])
expected:
 tensor([[ 0.0000, -0.0150],
        [ 0.0004, -0.0021]])
new_hidden:
 tensor([[[ 0.0005,  0.0323],
         [ 0.0004, -0.0021]]])
expected:
 tensor([[[ 0.0005,  0.0323],
         [ 0.0004, -0.0021]]])
Success


## Decoder

The decoder takes as input the representation computed by the encoder and transforms it into a sentence in the target language. The computational graph of the decoder is shown below:

<img src="seq2seq_decoder.png" width=500 align="top">

* $z_0$ is the output of the encoder, that is $z_0 = h_5$, thus `hidden_size` of the decoder should be the same as `hidden_size` of the encoder.
* $y_{i}$ are the log-probabilities of the words in the target language, the dimensionality of $y_{i}$ is the size of the target dictionary.
* $z_{i}$ is mapped to $y_{i}$ using a linear layer `self.out` followed by `F.log_softmax` (because we use `nn.NLLLoss` loss for training).
* Each cell of the decoder is a GRU, it receives as inputs the previous state $z_{i-1}$ and relu of the **embedding** of the previous word. Thus, you need to embed the words of the target language as well. The previous word is taken as the word with the maximum log-probability.

Note that the decoder outputs a word at every step and the same word is used as the input to the recurrent unit at the next step. At the beginning of decoding, the previous word input is fed with a special word SOS which stands for "start of a sentence". During training, we know the target sentence for decoding, therefore we can feed the correct words $y_i$ as inputs to the recurrent unit.

There is one extra thing that it is wise to take care of. When the target sentence is fed to the decoder during training, the decoder learns to generate only the next word (this scenario is called "teacher forcing"). At test time, the decoder works differently: it generates the whole sequence using its own predictions as inputs at each step. Therefore, it makes sense to train the decoder to produce full sentences. In order to do that, we will alternate between two modes during training:
* "teacher forcing": the decoder is fed with the words in the target sequence
* no "teacher forcing": the decoder generates the output sequence using its own predictions. We will limit the maximum length of generated sequences to `MAX_LENGTH`.

You need to implement the decoder which has the structure shown in the figure above.

In [18]:
class Decoder(nn.Module):
    def __init__(self, tgt_dictionary_size, embed_size, hidden_size):
        """
        Args:
          tgt_dictionary_size: The number of words in the target dictionary.
          embed_size: The number of dimensions in the word embeddings.
          hidden_size: The number of features in the hidden state.
        """
        super(Decoder, self).__init__()
        self.hidden_size = hidden_size

        self.embedding = nn.Embedding(tgt_dictionary_size, embed_size)
        self.gru = nn.GRU(input_size=embed_size, hidden_size=hidden_size)
        self.out = nn.Linear(hidden_size, tgt_dictionary_size)

    def forward(self, hidden, pad_tgt_seqs=None, teacher_forcing=False):
        """
        Args:
          hidden of shape (1, batch_size, hidden_size): States of the GRU.
          pad_tgt_seqs of shape (max_out_seq_length, batch_size): Tensor of words (word indices) of the
              target sentence. If None, the output sequence is generated by feeding the decoder's outputs
              (teacher_forcing has to be False).
          teacher_forcing (bool): Whether to use teacher forcing or not.

        Returns:
          outputs of shape (max_out_seq_length, batch_size, tgt_dictionary_size): Tensor of log-probabilities
              of words in the target language.
          hidden of shape (1, batch_size, hidden_size): New states of the GRU.

        Note: Do not forget to transfer tensors that you may want to create in this function to the device
        specified by `hidden.device`.
        """
        if pad_tgt_seqs is None:
            assert not teacher_forcing, 'Cannot use teacher forcing without a target sequence.'

        # YOUR CODE HERE
        try:
            out_length = pad_tgt_seqs.shape[0] if pad_tgt_seqs is not None else MAX_LENGTH 
            input_ = torch.tensor(SOS_token * np.ones((1, hidden.size(1))), device=hidden.device, dtype=torch.int64)
            outputs = []
            for i in range(out_length):
                embedding = self.embedding(input_.view(1,-1))
                output, hidden = self.gru(F.relu(embedding), hidden)
                output = self.out(output)
                output = F.log_softmax(output, dim=2)
                outputs.append(output)
                if teacher_forcing:
                    input_ = pad_tgt_seqs[i]
                else:
                    input_ = torch.max(output,dim=2)[1].detach()
            outputs = torch.cat(outputs, dim=0)
            return outputs, hidden
        except:
            raise NotImplementedError()

In [19]:
def test_Decoder_shapes():
    hidden_size = 2
    tgt_dictionary_size = 5
    test_decoder = Decoder(tgt_dictionary_size, embed_size=10, hidden_size=hidden_size)

    max_seq_length = 4
    batch_size = 2
    pad_tgt_seqs = torch.tensor([
        [1, 2],
        [2, 3],
        [3, 0],
        [4, 0]
    ])  # [max_seq_length, batch_size]

    hidden = torch.zeros(1, batch_size, hidden_size)
    outputs, new_hidden = test_decoder.forward(hidden, pad_tgt_seqs, teacher_forcing=False)

    assert outputs.size(0) <= 4, f"Too long output sequence: outputs.size(0)={outputs.size(0)}"
    assert outputs.shape[1:] == torch.Size([batch_size, tgt_dictionary_size]), \
        f"Bad outputs.shape[1:]={outputs.shape[1:]}"
    assert new_hidden.shape == torch.Size([1, batch_size, hidden_size]), f"Bad new_hidden.shape={new_hidden.shape}"

    outputs, new_hidden = test_decoder.forward(hidden, pad_tgt_seqs, teacher_forcing=True)
    assert outputs.shape == torch.Size([4, batch_size, tgt_dictionary_size]), \
        f"Bad shape outputs.shape={outputs.shape}"
    assert new_hidden.shape == torch.Size([1, batch_size, hidden_size]), f"Bad new_hidden.shape={new_hidden.shape}"

    # Generation mode
    outputs, new_hidden = test_decoder.forward(hidden, None, teacher_forcing=False)
    assert outputs.shape[1:] == torch.Size([batch_size, tgt_dictionary_size]), \
        f"Bad outputs.shape[1:]={outputs.shape[1:]}"
    assert new_hidden.shape == torch.Size([1, batch_size, hidden_size]), f"Bad new_hidden.shape={new_hidden.shape}"

    print('Success')

test_Decoder_shapes()

Success


In [20]:
tests.test_Decoder_no_forcing(Decoder)
tests.test_Decoder_with_forcing(Decoder)
tests.test_Decoder_generation(Decoder)

outputs[:, 0, :]:
 tensor([[-1.1366, -2.1924, -1.4361, -1.9640, -1.6645],
        [-1.3540, -1.8630, -1.5249, -1.7793, -1.6085],
        [-1.4899, -1.7024, -1.5838, -1.6901, -1.5962],
        [-1.5665, -1.6246, -1.6166, -1.6457, -1.5956]])
expected:
 tensor([[-1.1366, -2.1924, -1.4361, -1.9640, -1.6645],
        [-1.3540, -1.8630, -1.5249, -1.7793, -1.6085],
        [-1.4899, -1.7024, -1.5838, -1.6901, -1.5962],
        [-1.5665, -1.6246, -1.6166, -1.6457, -1.5956]])
outputs[:, 1, :]:
 tensor([[-1.1366, -2.1924, -1.4361, -1.9640, -1.6645],
        [-1.3540, -1.8630, -1.5249, -1.7793, -1.6085],
        [-1.4899, -1.7024, -1.5838, -1.6901, -1.5962],
        [-1.5665, -1.6246, -1.6166, -1.6457, -1.5956]])
expected:
 tensor([[-1.1366, -2.1924, -1.4361, -1.9640, -1.6645],
        [-1.3540, -1.8630, -1.5249, -1.7793, -1.6085],
        [-1.4899, -1.7024, -1.5838, -1.6901, -1.5962],
        [-1.5665, -1.6246, -1.6166, -1.6457, -1.5956]])
new_hidden:
 tensor([[[0.1003, 0.0421],
         [0.1003

## Training of sequence-to-sequence model using mini-batches

Now we are going to train the sequence-to-sequence model on the toy translation dataset.

In [21]:
# Create the seq2seq model
hidden_size = embed_size = 256
encoder = Encoder(trainset.input_lang.n_words, embed_size, hidden_size).to(device)
decoder = Decoder(trainset.output_lang.n_words, embed_size, hidden_size).to(device)

In [22]:
teacher_forcing_ratio = 0.5

Implement the training loop in the cell below. In the training loop, we first encode source sequences using the encoder, then we decode the encoded state using the decoder. The decoder outputs log-probabilities of words in the target language. We need to use these log-probabilities and the indexes of the words in the target sequences to compute the loss.

Recommended hyperparameters:
- Encoder optimizer: Adam with learning rate 0.001
- Decoder optimizer: Adam with learning rate 0.001
- Number of epochs: 30
- Toggle `teacher_forcing` on and off (for each mini-batch) according to the `teacher_forcing_ratio` specified above.

Hints:
- Training should proceed relatively fast.
- If you do well, the training loss should reach 0.1 in 30 epochs.
- **Important:** When computing the loss, you need to ignore the padded values. This can easily be done by using argument `ignore_index` of function [`nll_loss`](
https://pytorch.org/docs/stable/nn.functional.html#torch.nn.functional.nll_loss).

In [23]:
if not skip_training:
    # YOUR CODE HERE
    try:
        import random
        ENoptimizer = torch.optim.Adam(encoder.parameters(), lr=0.001)
        DEoptimizer = torch.optim.Adam(decoder.parameters(), lr=0.001)
        
        for epoch in range(30):
            for i, batch in enumerate(trainloader):
                ENoptimizer.zero_grad()
                DEoptimizer.zero_grad()
                
                input_seq, input_seq_len, target_seq = batch 
                input_seq, target_seq = input_seq.to(device), target_seq.to(device)
                batch_size = input_seq.shape[1]
                
                ENhidden = encoder.init_hidden(batch_size).to(device)
                _, ENhidden = encoder(input_seq, input_seq_len, ENhidden)
                teacher_forcing = True if random.uniform(0,1) < teacher_forcing_ratio else False
                DEoutputs, DEhidden = decoder(ENhidden, target_seq, teacher_forcing=teacher_forcing)
                
                loss = 0
                for i in range(DEoutputs.shape[0]):
                    loss = loss + F.nll_loss(DEoutputs[i], target_seq[i], ignore_index=0) 
                loss = loss/len(DEoutputs[0])
                loss.backward()
                
                ENoptimizer.step()
                DEoptimizer.step()
    except:
        raise NotImplementedError()

PackedSequence(data=tensor([[ 0.8802, -0.6379,  0.6623,  ...,  0.2814, -1.4391, -0.9124],
        [ 0.4771,  1.7098, -0.1462,  ...,  1.3521, -0.0677,  0.5578],
        [ 0.8802, -0.6379,  0.6623,  ...,  0.2814, -1.4391, -0.9124],
        ...,
        [-1.5637,  0.1523,  1.4797,  ..., -0.8766, -0.2085, -0.3420],
        [-1.5637,  0.1523,  1.4797,  ..., -0.8766, -0.2085, -0.3420],
        [-1.5637,  0.1523,  1.4797,  ..., -0.8766, -0.2085, -0.3420]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 55, 37, 19, 13, 11]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.7000, -1.0763, -1.8274,  ...,  0.1813, -0.2524, -1.3005],
        [ 0.7000, -1.0763, -1.8274,  ...,  0.1813, -0.2524, -1.3005],
        [-0.5359,  0.6534, -0.5080,  ...,  0.0445,  0.1146, -0.3603],
        ...,
        [-1.5647,  0.1533,  1.4787,  ..., -0.8776, -0.2095, -0.3410],
        [-1.5647,  0.1533,  1.4787,  ..., -0.8776, -0.2095, -0.3410],
        [-1

PackedSequence(data=tensor([[ 0.8671, -0.6260,  0.6570,  ...,  0.2710, -1.4527, -0.8993],
        [ 0.8671, -0.6260,  0.6570,  ...,  0.2710, -1.4527, -0.8993],
        [ 0.9411,  0.7290,  0.4508,  ...,  1.7479,  0.1384,  1.2227],
        ...,
        [-1.5769,  0.1645,  1.4901,  ..., -0.8896, -0.2214, -0.3320],
        [-1.5769,  0.1645,  1.4901,  ..., -0.8896, -0.2214, -0.3320],
        [-1.5769,  0.1645,  1.4901,  ..., -0.8896, -0.2214, -0.3320]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 56, 38, 24, 16,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.9404,  0.7290,  0.4516,  ...,  1.7471,  0.1383,  1.2227],
        [ 0.8661, -0.6251,  0.6577,  ...,  0.2700, -1.4537, -0.8984],
        [ 0.8661, -0.6251,  0.6577,  ...,  0.2700, -1.4537, -0.8984],
        ...,
        [-1.5777,  0.1652,  1.4909,  ..., -0.8903, -0.2221, -0.3317],
        [-1.5777,  0.1652,  1.4909,  ..., -0.8903, -0.2221, -0.3317],
        [-1

PackedSequence(data=tensor([[ 0.5033,  1.2756, -1.1483,  ...,  1.1805,  2.6638,  0.7072],
        [ 0.8536, -0.6122,  0.6702,  ...,  0.2567, -1.4679, -0.8930],
        [ 0.8536, -0.6122,  0.6702,  ...,  0.2567, -1.4679, -0.8930],
        ...,
        [-1.5831,  0.1686,  1.4968,  ..., -0.8954, -0.2273, -0.3302],
        [-1.5831,  0.1686,  1.4968,  ..., -0.8954, -0.2273, -0.3302],
        [-1.5831,  0.1686,  1.4968,  ..., -0.8954, -0.2273, -0.3302]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 63, 63, 54, 37, 25, 13,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8526, -0.6110,  0.6714,  ...,  0.2558, -1.4692, -0.8929],
        [ 0.5036,  1.2748, -1.1491,  ...,  1.1813,  2.6647,  0.7068],
        [ 0.8526, -0.6110,  0.6714,  ...,  0.2558, -1.4692, -0.8929],
        ...,
        [-1.5833,  0.1684,  1.4970,  ..., -0.8955, -0.2276, -0.3302],
        [-1.5833,  0.1684,  1.4970,  ..., -0.8955, -0.2276, -0.3302],
        [-1

PackedSequence(data=tensor([[ 0.4048,  0.0760, -1.2922,  ...,  0.2585, -1.6433,  0.5268],
        [ 0.8428, -0.5969,  0.6875,  ...,  0.2422, -1.4860, -0.8918],
        [ 0.9244,  0.7402,  0.4672,  ...,  1.7428,  0.1339,  1.2351],
        ...,
        [-1.5841,  0.1646,  1.4986,  ..., -0.8959, -0.2303, -0.3306],
        [-1.5841,  0.1646,  1.4986,  ..., -0.8959, -0.2303, -0.3306],
        [-1.5841,  0.1646,  1.4986,  ..., -0.8959, -0.2303, -0.3306]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 51, 43, 29, 17,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.7213, -1.0684, -1.7859,  ...,  0.1488, -0.2407, -1.2897],
        [ 0.8428, -0.5963,  0.6885,  ...,  0.2414, -1.4873, -0.8915],
        [ 0.8428, -0.5963,  0.6885,  ...,  0.2414, -1.4873, -0.8915],
        ...,
        [-1.5841,  0.1644,  1.4987,  ..., -0.8959, -0.2305, -0.3307],
        [-1.5841,  0.1644,  1.4987,  ..., -0.8959, -0.2305, -0.3307],
        [-1

PackedSequence(data=tensor([[ 0.4043,  0.0725, -1.3084,  ...,  0.2704, -1.6287,  0.5179],
        [-0.4191,  1.8537, -0.9554,  ...,  0.2580, -2.0506,  1.5709],
        [ 0.8496, -0.5891,  0.6985,  ...,  0.2360, -1.5016, -0.8815],
        ...,
        [-1.5845,  0.1630,  1.4993,  ..., -0.8961, -0.2312, -0.3314],
        [-1.5845,  0.1630,  1.4993,  ..., -0.8961, -0.2312, -0.3314],
        [-1.5845,  0.1630,  1.4993,  ..., -0.8961, -0.2312, -0.3314]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 58, 46, 35, 19, 10]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8505, -0.5888,  0.6989,  ...,  0.2359, -1.5024, -0.8806],
        [-1.4784,  0.7732, -1.7460,  ..., -0.5081,  0.1024,  0.0909],
        [ 0.8505, -0.5888,  0.6989,  ...,  0.2359, -1.5024, -0.8806],
        ...,
        [-1.5846,  0.1630,  1.4993,  ..., -0.8961, -0.2312, -0.3314],
        [-1.5846,  0.1630,  1.4993,  ..., -0.8961, -0.2312, -0.3314],
        [-1

PackedSequence(data=tensor([[ 0.8637, -0.5892,  0.6975,  ...,  0.2355, -1.5107, -0.8721],
        [ 0.7420, -1.0956, -1.7694,  ...,  0.1340, -0.2166, -1.3050],
        [ 0.8637, -0.5892,  0.6975,  ...,  0.2355, -1.5107, -0.8721],
        ...,
        [-1.5847,  0.1620,  1.4991,  ..., -0.8961, -0.2314, -0.3318],
        [-1.5847,  0.1620,  1.4991,  ..., -0.8961, -0.2314, -0.3318],
        [-1.5847,  0.1620,  1.4991,  ..., -0.8961, -0.2314, -0.3318]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 54, 36, 24, 15,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8640, -0.5886,  0.6971,  ...,  0.2351, -1.5111, -0.8716],
        [ 0.7428, -1.0965, -1.7690,  ...,  0.1337, -0.2162, -1.3060],
        [ 0.5059,  1.2712, -1.1871,  ...,  1.1968,  2.7037,  0.6952],
        ...,
        [-1.5847,  0.1616,  1.4991,  ..., -0.8961, -0.2315, -0.3319],
        [-1.5847,  0.1616,  1.4991,  ..., -0.8961, -0.2315, -0.3319],
        [-1

PackedSequence(data=tensor([[ 0.8709, -0.5917,  0.6870,  ...,  0.2354, -1.5115, -0.8753],
        [ 0.5022,  1.2720, -1.1943,  ...,  1.1955,  2.7102,  0.6961],
        [ 0.9195,  0.7554,  0.4719,  ...,  1.7398,  0.1335,  1.2657],
        ...,
        [-1.5838,  0.1629,  1.4991,  ..., -0.8962, -0.2315, -0.3325],
        [-1.5838,  0.1629,  1.4991,  ..., -0.8962, -0.2315, -0.3325],
        [-1.5838,  0.1629,  1.4991,  ..., -0.8962, -0.2315, -0.3325]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 49, 40, 25, 14,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.5024,  1.2720, -1.1946,  ...,  1.1954,  2.7104,  0.6963],
        [ 0.8714, -0.5922,  0.6861,  ...,  0.2357, -1.5112, -0.8761],
        [ 0.8714, -0.5922,  0.6861,  ...,  0.2357, -1.5112, -0.8761],
        ...,
        [-1.5837,  0.1632,  1.4990,  ..., -0.8962, -0.2315, -0.3326],
        [-1.5837,  0.1632,  1.4990,  ..., -0.8962, -0.2315, -0.3326],
        [-1

PackedSequence(data=tensor([[ 0.8714, -0.5919,  0.6817,  ...,  0.2350, -1.5101, -0.8796],
        [-0.3993,  1.8605, -0.9440,  ...,  0.2574, -2.0747,  1.5565],
        [ 0.7506, -1.1073, -1.7706,  ...,  0.1338, -0.2123, -1.3323],
        ...,
        [-1.5828,  0.1644,  1.4985,  ..., -0.8961, -0.2323, -0.3333],
        [-1.5828,  0.1644,  1.4985,  ..., -0.8961, -0.2323, -0.3333],
        [-1.5828,  0.1644,  1.4985,  ..., -0.8961, -0.2323, -0.3333]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 59, 44, 26, 14,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8718, -0.5923,  0.6811,  ...,  0.2356, -1.5095, -0.8803],
        [ 0.7514, -1.1081, -1.7715,  ...,  0.1345, -0.2118, -1.3331],
        [ 1.0121, -2.0374,  0.2357,  ...,  0.6477,  1.3614,  1.4056],
        ...,
        [-1.5828,  0.1648,  1.4986,  ..., -0.8961, -0.2323, -0.3333],
        [-1.5828,  0.1648,  1.4986,  ..., -0.8961, -0.2323, -0.3333],
        [-1

PackedSequence(data=tensor([[-0.2732,  0.1002, -0.0487,  ..., -0.4867, -0.6588,  1.5856],
        [ 0.9335,  0.7556,  0.4652,  ...,  1.7603,  0.1342,  1.2782],
        [ 0.4019,  0.0642, -1.3329,  ...,  0.2684, -1.6080,  0.4950],
        ...,
        [-1.5827,  0.1675,  1.4985,  ..., -0.8966, -0.2321, -0.3337],
        [-1.5827,  0.1675,  1.4985,  ..., -0.8966, -0.2321, -0.3337],
        [-1.5827,  0.1675,  1.4985,  ..., -0.8966, -0.2321, -0.3337]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 53, 38, 27, 18,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.7627, -1.1189, -1.7815,  ...,  0.1442, -0.2070, -1.3465],
        [ 0.8659, -0.5872,  0.6808,  ...,  0.2324, -1.5073, -0.8843],
        [ 0.8659, -0.5872,  0.6808,  ...,  0.2324, -1.5073, -0.8843],
        ...,
        [-1.5825,  0.1676,  1.4985,  ..., -0.8966, -0.2322, -0.3338],
        [-1.5825,  0.1676,  1.4985,  ..., -0.8966, -0.2322, -0.3338],
        [-1

PackedSequence(data=tensor([[ 0.8648, -0.5847,  0.6804,  ...,  0.2326, -1.5071, -0.8899],
        [-0.3910,  1.8590, -0.9449,  ...,  0.2641, -2.0760,  1.5490],
        [-0.2656,  0.0947, -0.0535,  ..., -0.4806, -0.6609,  1.5927],
        ...,
        [-1.5829,  0.1640,  1.4972,  ..., -0.8958, -0.2312, -0.3331],
        [-1.5829,  0.1640,  1.4972,  ..., -0.8958, -0.2312, -0.3331],
        [-1.5829,  0.1640,  1.4972,  ..., -0.8958, -0.2312, -0.3331]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 59, 50, 36, 15,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.9409,  0.7581,  0.4648,  ...,  1.7673,  0.1306,  1.2767],
        [ 0.7596, -1.1150, -1.7801,  ...,  0.1408, -0.2130, -1.3567],
        [-0.0518, -0.6426,  0.4567,  ...,  0.0089, -0.6084, -0.5322],
        ...,
        [-1.5830,  0.1642,  1.4973,  ..., -0.8958, -0.2311, -0.3331],
        [-1.5830,  0.1642,  1.4973,  ..., -0.8958, -0.2311, -0.3331],
        [-1

PackedSequence(data=tensor([[ 0.8623, -0.5817,  0.6787,  ...,  0.2327, -1.5038, -0.8976],
        [ 0.7589, -1.1138, -1.7809,  ...,  0.1387, -0.2157, -1.3624],
        [ 1.0244, -0.1838,  0.2180,  ..., -1.0459,  0.3103,  0.4647],
        ...,
        [-1.5832,  0.1685,  1.4975,  ..., -0.8963, -0.2313, -0.3342],
        [-1.5832,  0.1685,  1.4975,  ..., -0.8963, -0.2313, -0.3342],
        [-1.5832,  0.1685,  1.4975,  ..., -0.8963, -0.2313, -0.3342]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 57, 41, 30, 14,  1]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8622, -0.5816,  0.6785,  ...,  0.2329, -1.5035, -0.8984],
        [ 0.1329, -0.7620,  0.2552,  ...,  1.1375, -0.5880,  0.8907],
        [ 0.8622, -0.5816,  0.6785,  ...,  0.2329, -1.5035, -0.8984],
        ...,
        [-1.5832,  0.1687,  1.4975,  ..., -0.8964, -0.2313, -0.3343],
        [-1.5832,  0.1687,  1.4975,  ..., -0.8964, -0.2313, -0.3343],
        [-1

PackedSequence(data=tensor([[ 0.8660, -0.5849,  0.6744,  ...,  0.2388, -1.4999, -0.9073],
        [ 0.8660, -0.5849,  0.6744,  ...,  0.2388, -1.4999, -0.9073],
        [ 0.7613, -1.1173, -1.7841,  ...,  0.1420, -0.2122, -1.3717],
        ...,
        [-1.5836,  0.1706,  1.4990,  ..., -0.8964, -0.2299, -0.3340],
        [-1.5836,  0.1706,  1.4990,  ..., -0.8964, -0.2299, -0.3340],
        [-1.5836,  0.1706,  1.4990,  ..., -0.8964, -0.2299, -0.3340]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 56, 46, 36, 19,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8663, -0.5852,  0.6740,  ...,  0.2393, -1.4997, -0.9077],
        [ 0.8663, -0.5852,  0.6740,  ...,  0.2393, -1.4997, -0.9077],
        [ 0.8663, -0.5852,  0.6740,  ...,  0.2393, -1.4997, -0.9077],
        ...,
        [-1.5836,  0.1702,  1.4989,  ..., -0.8963, -0.2299, -0.3340],
        [-1.5836,  0.1702,  1.4989,  ..., -0.8963, -0.2299, -0.3340],
        [-1

PackedSequence(data=tensor([[ 0.8665, -0.5845,  0.6725,  ...,  0.2397, -1.4993, -0.9084],
        [-0.5417,  0.6270, -0.5062,  ...,  0.0433,  0.1347, -0.3508],
        [ 0.9603,  0.7460,  0.4575,  ...,  1.7894,  0.1309,  1.2763],
        ...,
        [-1.5838,  0.1692,  1.4985,  ..., -0.8962, -0.2311, -0.3330],
        [-1.5838,  0.1692,  1.4985,  ..., -0.8962, -0.2311, -0.3330],
        [-1.5838,  0.1692,  1.4985,  ..., -0.8962, -0.2311, -0.3330]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 63, 62, 56, 45, 31, 11,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.3886,  1.8593, -0.9422,  ...,  0.2632, -2.0811,  1.5337],
        [ 0.4071,  0.0608, -1.3367,  ...,  0.2684, -1.6036,  0.4980],
        [ 0.9607,  0.7459,  0.4574,  ...,  1.7897,  0.1307,  1.2761],
        ...,
        [-1.5839,  0.1700,  1.4987,  ..., -0.8964, -0.2314, -0.3331],
        [-1.5839,  0.1700,  1.4987,  ..., -0.8964, -0.2314, -0.3331],
        [-1

PackedSequence(data=tensor([[ 0.4110,  0.0594, -1.3356,  ...,  0.2685, -1.6106,  0.4953],
        [ 0.8654, -0.5836,  0.6701,  ...,  0.2386, -1.4980, -0.9081],
        [ 0.5031,  1.2781, -1.2044,  ...,  1.1882,  2.7178,  0.6841],
        ...,
        [-1.5818,  0.1690,  1.4985,  ..., -0.8951, -0.2285, -0.3303],
        [-1.5818,  0.1690,  1.4985,  ..., -0.8951, -0.2285, -0.3303],
        [-1.5818,  0.1690,  1.4985,  ..., -0.8951, -0.2285, -0.3303]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 55, 42, 32, 20,  9]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.5034,  1.2780, -1.2045,  ...,  1.1884,  2.7177,  0.6837],
        [ 0.8652, -0.5835,  0.6698,  ...,  0.2385, -1.4979, -0.9082],
        [ 0.9612,  0.7468,  0.4582,  ...,  1.7889,  0.1284,  1.2687],
        ...,
        [-1.5817,  0.1687,  1.4983,  ..., -0.8949, -0.2283, -0.3301],
        [-1.5817,  0.1687,  1.4983,  ..., -0.8949, -0.2283, -0.3301],
        [-1

PackedSequence(data=tensor([[-0.5460,  0.6268, -0.5079,  ...,  0.0409,  0.1417, -0.3487],
        [ 0.8610, -0.5811,  0.6680,  ...,  0.2363, -1.4927, -0.9073],
        [ 0.8610, -0.5811,  0.6680,  ...,  0.2363, -1.4927, -0.9073],
        ...,
        [-1.5812,  0.1695,  1.4977,  ..., -0.8935, -0.2279, -0.3309],
        [-1.5812,  0.1695,  1.4977,  ..., -0.8935, -0.2279, -0.3309],
        [-1.5812,  0.1695,  1.4977,  ..., -0.8935, -0.2279, -0.3309]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 55, 37, 23, 13,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.7659, -1.1214, -1.7890,  ...,  0.1462, -0.2197, -1.3757],
        [ 0.8607, -0.5809,  0.6680,  ...,  0.2361, -1.4925, -0.9071],
        [ 0.8607, -0.5809,  0.6680,  ...,  0.2361, -1.4925, -0.9071],
        ...,
        [-1.5811,  0.1698,  1.4978,  ..., -0.8936, -0.2279, -0.3310],
        [-1.5811,  0.1698,  1.4978,  ..., -0.8936, -0.2279, -0.3310],
        [-1

PackedSequence(data=tensor([[ 0.8583, -0.5790,  0.6679,  ...,  0.2351, -1.4916, -0.9066],
        [-0.2609,  0.0924, -0.0661,  ..., -0.4795, -0.6524,  1.5806],
        [ 0.8583, -0.5790,  0.6679,  ...,  0.2351, -1.4916, -0.9066],
        ...,
        [-1.5792,  0.1683,  1.4983,  ..., -0.8961, -0.2292, -0.3293],
        [-1.5792,  0.1683,  1.4983,  ..., -0.8961, -0.2292, -0.3293],
        [-1.5792,  0.1683,  1.4983,  ..., -0.8961, -0.2292, -0.3293]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 58, 43, 33, 16,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.3781,  1.8502, -0.9445,  ...,  0.2704, -2.0848,  1.5271],
        [ 0.8584, -0.5791,  0.6674,  ...,  0.2353, -1.4914, -0.9068],
        [ 0.7673, -1.1212, -1.7870,  ...,  0.1468, -0.2232, -1.3751],
        ...,
        [-1.5791,  0.1680,  1.4984,  ..., -0.8963, -0.2294, -0.3291],
        [-1.5791,  0.1680,  1.4984,  ..., -0.8963, -0.2294, -0.3291],
        [-1

PackedSequence(data=tensor([[ 0.8622, -0.5831,  0.6616,  ...,  0.2410, -1.4916, -0.9054],
        [ 0.9664,  0.7376,  0.4498,  ...,  1.7968,  0.1362,  1.2572],
        [ 0.7705, -1.1238, -1.7872,  ...,  0.1503, -0.2221, -1.3728],
        ...,
        [-1.5780,  0.1673,  1.4999,  ..., -0.8981, -0.2297, -0.3285],
        [-1.5780,  0.1673,  1.4999,  ..., -0.8981, -0.2297, -0.3285],
        [-1.5780,  0.1673,  1.4999,  ..., -0.8981, -0.2297, -0.3285]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 59, 45, 31, 15,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8622, -0.5831,  0.6617,  ...,  0.2410, -1.4917, -0.9051],
        [ 0.8622, -0.5831,  0.6617,  ...,  0.2410, -1.4917, -0.9051],
        [-0.3717,  1.8467, -0.9447,  ...,  0.2748, -2.0890,  1.5235],
        ...,
        [-1.5779,  0.1675,  1.5000,  ..., -0.8982, -0.2297, -0.3285],
        [-1.5779,  0.1675,  1.5000,  ..., -0.8982, -0.2297, -0.3285],
        [-1

PackedSequence(data=tensor([[ 0.8623, -0.5839,  0.6597,  ...,  0.2426, -1.4903, -0.9031],
        [ 0.8623, -0.5839,  0.6597,  ...,  0.2426, -1.4903, -0.9031],
        [ 0.9677,  0.7370,  0.4488,  ...,  1.7965,  0.1361,  1.2566],
        ...,
        [-1.5777,  0.1672,  1.4990,  ..., -0.8987, -0.2317, -0.3292],
        [-1.5777,  0.1672,  1.4990,  ..., -0.8987, -0.2317, -0.3292],
        [-1.5777,  0.1672,  1.4990,  ..., -0.8987, -0.2317, -0.3292]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 55, 42, 34, 20,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.7711, -1.1245, -1.7861,  ...,  0.1512, -0.2200, -1.3708],
        [ 0.8621, -0.5838,  0.6598,  ...,  0.2425, -1.4901, -0.9029],
        [ 0.7711, -1.1245, -1.7861,  ...,  0.1512, -0.2200, -1.3708],
        ...,
        [-1.5777,  0.1671,  1.4988,  ..., -0.8987, -0.2319, -0.3291],
        [-1.5777,  0.1671,  1.4988,  ..., -0.8987, -0.2319, -0.3291],
        [-1

PackedSequence(data=tensor([[-0.5468,  0.6201, -0.5087,  ...,  0.0458,  0.1475, -0.3504],
        [ 0.9664,  0.7398,  0.4515,  ...,  1.7947,  0.1334,  1.2590],
        [ 0.8605, -0.5847,  0.6574,  ...,  0.2427, -1.4840, -0.9012],
        ...,
        [-1.5774,  0.1665,  1.4997,  ..., -0.8984, -0.2305, -0.3276],
        [-1.5774,  0.1665,  1.4997,  ..., -0.8984, -0.2305, -0.3276],
        [-1.5774,  0.1665,  1.4997,  ..., -0.8984, -0.2305, -0.3276]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 57, 39, 27, 15,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8605, -0.5849,  0.6572,  ...,  0.2428, -1.4834, -0.9011],
        [ 0.8605, -0.5849,  0.6572,  ...,  0.2428, -1.4834, -0.9011],
        [ 0.4119,  0.0585, -1.3412,  ...,  0.2698, -1.6060,  0.4985],
        ...,
        [-1.5774,  0.1662,  1.4997,  ..., -0.8983, -0.2304, -0.3275],
        [-1.5774,  0.1662,  1.4997,  ..., -0.8983, -0.2304, -0.3275],
        [-1

PackedSequence(data=tensor([[ 0.7729, -1.1268, -1.7880,  ...,  0.1541, -0.2206, -1.3686],
        [-0.3714,  1.8453, -0.9389,  ...,  0.2702, -2.0956,  1.5159],
        [-0.2589,  0.0979, -0.0697,  ..., -0.4793, -0.6635,  1.5743],
        ...,
        [-1.5789,  0.1660,  1.4998,  ..., -0.9002, -0.2330, -0.3267],
        [-1.5789,  0.1660,  1.4998,  ..., -0.9002, -0.2330, -0.3267],
        [-1.5789,  0.1660,  1.4998,  ..., -0.9002, -0.2330, -0.3267]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 54, 39, 26, 18,  9]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2590,  0.0981, -0.0697,  ..., -0.4794, -0.6635,  1.5744],
        [ 0.7729, -1.1268, -1.7882,  ...,  0.1541, -0.2207, -1.3682],
        [-0.3714,  1.8454, -0.9384,  ...,  0.2699, -2.0961,  1.5162],
        ...,
        [-1.5791,  0.1664,  1.4998,  ..., -0.9005, -0.2331, -0.3265],
        [-1.5791,  0.1664,  1.4998,  ..., -0.9005, -0.2331, -0.3265],
        [-1

PackedSequence(data=tensor([[ 0.8620, -0.5866,  0.6533,  ...,  0.2434, -1.4728, -0.9028],
        [ 0.8620, -0.5866,  0.6533,  ...,  0.2434, -1.4728, -0.9028],
        [ 0.8620, -0.5866,  0.6533,  ...,  0.2434, -1.4728, -0.9028],
        ...,
        [-1.5788,  0.1648,  1.5004,  ..., -0.8995, -0.2309, -0.3225],
        [-1.5788,  0.1648,  1.5004,  ..., -0.8995, -0.2309, -0.3225],
        [-1.5788,  0.1648,  1.5004,  ..., -0.8995, -0.2309, -0.3225]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 51, 36, 26, 14,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8622, -0.5869,  0.6527,  ...,  0.2438, -1.4725, -0.9027],
        [-0.3723,  1.8466, -0.9339,  ...,  0.2671, -2.0994,  1.5162],
        [ 0.8622, -0.5869,  0.6527,  ...,  0.2438, -1.4725, -0.9027],
        ...,
        [-1.5789,  0.1650,  1.5006,  ..., -0.8996, -0.2309, -0.3224],
        [-1.5789,  0.1650,  1.5006,  ..., -0.8996, -0.2309, -0.3224],
        [-1

PackedSequence(data=tensor([[ 0.5087,  1.2817, -1.2032,  ...,  1.1818,  2.7080,  0.6630],
        [ 0.8652, -0.5907,  0.6478,  ...,  0.2471, -1.4733, -0.9009],
        [ 0.8652, -0.5907,  0.6478,  ...,  0.2471, -1.4733, -0.9009],
        ...,
        [-1.5791,  0.1678,  1.4979,  ..., -0.9000, -0.2333, -0.3224],
        [-1.5791,  0.1678,  1.4979,  ..., -0.9000, -0.2333, -0.3224],
        [-1.5791,  0.1678,  1.4979,  ..., -0.9000, -0.2333, -0.3224]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 55, 44, 32, 17,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.4123,  0.0597, -1.3468,  ...,  0.2735, -1.5960,  0.5040],
        [ 0.8651, -0.5906,  0.6481,  ...,  0.2469, -1.4737, -0.9013],
        [ 0.8651, -0.5906,  0.6481,  ...,  0.2469, -1.4737, -0.9013],
        ...,
        [-1.5791,  0.1680,  1.4980,  ..., -0.9000, -0.2333, -0.3224],
        [-1.5791,  0.1680,  1.4980,  ..., -0.9000, -0.2333, -0.3224],
        [-1

PackedSequence(data=tensor([[-1.7873, -1.3421, -0.8144,  ...,  0.6427,  0.0091,  0.7110],
        [ 0.8656, -0.5921,  0.6470,  ...,  0.2475, -1.4730, -0.9021],
        [ 0.8656, -0.5921,  0.6470,  ...,  0.2475, -1.4730, -0.9021],
        ...,
        [-1.5818,  0.1695,  1.4997,  ..., -0.9017, -0.2323, -0.3202],
        [-1.5818,  0.1695,  1.4997,  ..., -0.9017, -0.2323, -0.3202],
        [-1.5818,  0.1695,  1.4997,  ..., -0.9017, -0.2323, -0.3202]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 53, 41, 27, 12,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8658, -0.5923,  0.6467,  ...,  0.2478, -1.4727, -0.9020],
        [ 0.9674,  0.7394,  0.4467,  ...,  1.7995,  0.1212,  1.2848],
        [ 0.7742, -1.1294, -1.7873,  ...,  0.1557, -0.2247, -1.3581],
        ...,
        [-1.5820,  0.1696,  1.4996,  ..., -0.9017, -0.2324, -0.3199],
        [-1.5820,  0.1696,  1.4996,  ..., -0.9017, -0.2324, -0.3199],
        [-1

PackedSequence(data=tensor([[ 0.7718, -1.1278, -1.7857,  ...,  0.1550, -0.2258, -1.3571],
        [ 0.9688,  0.7385,  0.4447,  ...,  1.8007,  0.1210,  1.2846],
        [ 0.8649, -0.5918,  0.6487,  ...,  0.2466, -1.4746, -0.9050],
        ...,
        [-1.5825,  0.1706,  1.5022,  ..., -0.9022, -0.2311, -0.3175],
        [-1.5825,  0.1706,  1.5022,  ..., -0.9022, -0.2311, -0.3175],
        [-1.5825,  0.1706,  1.5022,  ..., -0.9022, -0.2311, -0.3175]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 52, 39, 28, 16, 10]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.7721, -1.1280, -1.7859,  ...,  0.1551, -0.2257, -1.3569],
        [ 0.8649, -0.5918,  0.6487,  ...,  0.2466, -1.4746, -0.9051],
        [ 0.8649, -0.5918,  0.6487,  ...,  0.2466, -1.4746, -0.9051],
        ...,
        [-1.5827,  0.1709,  1.5023,  ..., -0.9023, -0.2311, -0.3175],
        [-1.5827,  0.1709,  1.5023,  ..., -0.9023, -0.2311, -0.3175],
        [-1

PackedSequence(data=tensor([[ 0.8648, -0.5931,  0.6462,  ...,  0.2484, -1.4737, -0.9026],
        [ 0.8648, -0.5931,  0.6462,  ...,  0.2484, -1.4737, -0.9026],
        [ 0.9714,  0.7353,  0.4407,  ...,  1.8028,  0.1228,  1.2834],
        ...,
        [-1.5846,  0.1746,  1.5027,  ..., -0.9040, -0.2322, -0.3166],
        [-1.5846,  0.1746,  1.5027,  ..., -0.9040, -0.2322, -0.3166],
        [-1.5846,  0.1746,  1.5027,  ..., -0.9040, -0.2322, -0.3166]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 50, 39, 25, 16,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.3670,  1.8406, -0.9382,  ...,  0.2714, -2.0938,  1.5173],
        [ 0.9714,  0.7353,  0.4406,  ...,  1.8029,  0.1228,  1.2833],
        [ 0.8648, -0.5932,  0.6460,  ...,  0.2486, -1.4736, -0.9022],
        ...,
        [-1.5847,  0.1746,  1.5026,  ..., -0.9039, -0.2322, -0.3165],
        [-1.5847,  0.1746,  1.5026,  ..., -0.9039, -0.2322, -0.3165],
        [-1

PackedSequence(data=tensor([[ 0.8621, -0.5924,  0.6469,  ...,  0.2472, -1.4727, -0.9047],
        [ 0.1075, -0.7571,  0.2538,  ...,  1.1235, -0.5718,  0.9278],
        [ 1.0245, -2.0481,  0.2321,  ...,  0.6523,  1.3649,  1.4020],
        ...,
        [-1.5855,  0.1719,  1.5007,  ..., -0.9024, -0.2312, -0.3143],
        [-1.5855,  0.1719,  1.5007,  ..., -0.9024, -0.2312, -0.3143],
        [-1.5855,  0.1719,  1.5007,  ..., -0.9024, -0.2312, -0.3143]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 55, 45, 36, 26, 11]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2581,  0.0981, -0.0862,  ..., -0.4784, -0.6751,  1.5795],
        [-0.2581,  0.0981, -0.0862,  ..., -0.4784, -0.6751,  1.5795],
        [ 0.7742, -1.1301, -1.7869,  ...,  0.1556, -0.2271, -1.3590],
        ...,
        [-1.5855,  0.1718,  1.5005,  ..., -0.9025, -0.2313, -0.3144],
        [-1.5855,  0.1718,  1.5005,  ..., -0.9025, -0.2313, -0.3144],
        [-1

PackedSequence(data=tensor([[-0.2571,  0.0970, -0.0881,  ..., -0.4768, -0.6750,  1.5803],
        [ 0.5072,  1.2870, -1.2072,  ...,  1.1834,  2.7145,  0.6632],
        [-0.3601,  1.8373, -0.9444,  ...,  0.2738, -2.0990,  1.5097],
        ...,
        [-1.5847,  0.1735,  1.4997,  ..., -0.9046, -0.2328, -0.3165],
        [-1.5847,  0.1735,  1.4997,  ..., -0.9046, -0.2328, -0.3165],
        [-1.5847,  0.1735,  1.4997,  ..., -0.9046, -0.2328, -0.3165]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 50, 35, 19, 10,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8612, -0.5919,  0.6455,  ...,  0.2472, -1.4728, -0.9056],
        [ 0.8612, -0.5919,  0.6455,  ...,  0.2472, -1.4728, -0.9056],
        [ 0.8612, -0.5919,  0.6455,  ...,  0.2472, -1.4728, -0.9056],
        ...,
        [-1.5850,  0.1739,  1.4996,  ..., -0.9050, -0.2331, -0.3169],
        [-1.5850,  0.1739,  1.4996,  ..., -0.9050, -0.2331, -0.3169],
        [-1

PackedSequence(data=tensor([[ 0.5113,  1.2842, -1.2125,  ...,  1.1859,  2.7157,  0.6640],
        [-0.2561,  0.0959, -0.0916,  ..., -0.4759, -0.6755,  1.5792],
        [ 0.8623, -0.5930,  0.6432,  ...,  0.2488, -1.4733, -0.9022],
        ...,
        [-1.5874,  0.1753,  1.5000,  ..., -0.9056, -0.2330, -0.3153],
        [-1.5874,  0.1753,  1.5000,  ..., -0.9056, -0.2330, -0.3153],
        [-1.5874,  0.1753,  1.5000,  ..., -0.9056, -0.2330, -0.3153]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 54, 45, 32, 16,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0230, -2.0481,  0.2286,  ...,  0.6544,  1.3664,  1.4034],
        [ 0.9763,  0.7316,  0.4381,  ...,  1.8053,  0.1237,  1.2808],
        [ 0.8621, -0.5930,  0.6433,  ...,  0.2488, -1.4735, -0.9020],
        ...,
        [-1.5872,  0.1750,  1.5000,  ..., -0.9054, -0.2328, -0.3150],
        [-1.5872,  0.1750,  1.5000,  ..., -0.9054, -0.2328, -0.3150],
        [-1

PackedSequence(data=tensor([[ 0.8612, -0.5912,  0.6435,  ...,  0.2487, -1.4749, -0.9031],
        [-0.2562,  0.3849,  0.0383,  ...,  0.5411,  2.2456,  0.8878],
        [ 0.8612, -0.5912,  0.6435,  ...,  0.2487, -1.4749, -0.9031],
        ...,
        [-1.5861,  0.1702,  1.4971,  ..., -0.9043, -0.2366, -0.3166],
        [-1.5861,  0.1702,  1.4971,  ..., -0.9043, -0.2366, -0.3166],
        [-1.5861,  0.1702,  1.4971,  ..., -0.9043, -0.2366, -0.3166]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 51, 37, 27, 12,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8613, -0.5911,  0.6434,  ...,  0.2487, -1.4752, -0.9034],
        [ 0.8613, -0.5911,  0.6434,  ...,  0.2487, -1.4752, -0.9034],
        [ 0.8613, -0.5911,  0.6434,  ...,  0.2487, -1.4752, -0.9034],
        ...,
        [-1.5861,  0.1699,  1.4969,  ..., -0.9043, -0.2369, -0.3168],
        [-1.5861,  0.1699,  1.4969,  ..., -0.9043, -0.2369, -0.3168],
        [-1

PackedSequence(data=tensor([[ 0.8639, -0.5933,  0.6419,  ...,  0.2496, -1.4726, -0.9013],
        [ 0.8639, -0.5933,  0.6419,  ...,  0.2496, -1.4726, -0.9013],
        [ 0.7795, -1.1389, -1.7987,  ...,  0.1598, -0.2193, -1.3464],
        ...,
        [-1.5873,  0.1732,  1.4996,  ..., -0.9070, -0.2346, -0.3151],
        [-1.5873,  0.1732,  1.4996,  ..., -0.9070, -0.2346, -0.3151],
        [-1.5873,  0.1732,  1.4996,  ..., -0.9070, -0.2346, -0.3151]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 50, 35, 22, 14,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8642, -0.5934,  0.6416,  ...,  0.2497, -1.4723, -0.9010],
        [ 0.8642, -0.5934,  0.6416,  ...,  0.2497, -1.4723, -0.9010],
        [ 0.7795, -1.1389, -1.7987,  ...,  0.1597, -0.2193, -1.3464],
        ...,
        [-1.5876,  0.1737,  1.5001,  ..., -0.9074, -0.2345, -0.3150],
        [-1.5876,  0.1737,  1.5001,  ..., -0.9074, -0.2345, -0.3150],
        [-1

PackedSequence(data=tensor([[ 0.8645, -0.5941,  0.6408,  ...,  0.2490, -1.4693, -0.8978],
        [ 0.5088,  1.2856, -1.2066,  ...,  1.1838,  2.7155,  0.6694],
        [ 0.7774, -1.1351, -1.7946,  ...,  0.1561, -0.2220, -1.3488],
        ...,
        [-1.5889,  0.1772,  1.5023,  ..., -0.9099, -0.2336, -0.3137],
        [-1.5889,  0.1772,  1.5023,  ..., -0.9099, -0.2336, -0.3137],
        [-1.5889,  0.1772,  1.5023,  ..., -0.9099, -0.2336, -0.3137]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 63, 63, 54, 38, 30, 15,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.9787,  0.7312,  0.4397,  ...,  1.8057,  0.1233,  1.2753],
        [ 0.8646, -0.5942,  0.6408,  ...,  0.2490, -1.4694, -0.8976],
        [ 0.8646, -0.5942,  0.6408,  ...,  0.2490, -1.4694, -0.8976],
        ...,
        [-1.5889,  0.1773,  1.5021,  ..., -0.9101, -0.2339, -0.3139],
        [-1.5889,  0.1773,  1.5021,  ..., -0.9101, -0.2339, -0.3139],
        [-1

PackedSequence(data=tensor([[ 0.8659, -0.5944,  0.6417,  ...,  0.2477, -1.4745, -0.8988],
        [ 0.9835,  0.7257,  0.4352,  ...,  1.8092,  0.1261,  1.2696],
        [ 0.8659, -0.5944,  0.6417,  ...,  0.2477, -1.4745, -0.8988],
        ...,
        [-1.5865,  0.1763,  1.5003,  ..., -0.9095, -0.2352, -0.3138],
        [-1.5865,  0.1763,  1.5003,  ..., -0.9095, -0.2352, -0.3138],
        [-1.5865,  0.1763,  1.5003,  ..., -0.9095, -0.2352, -0.3138]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 51, 41, 29, 14,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2564,  0.1010, -0.0981,  ..., -0.4746, -0.6766,  1.5758],
        [ 0.8661, -0.5945,  0.6416,  ...,  0.2477, -1.4750, -0.8990],
        [ 0.7793, -1.1365, -1.7941,  ...,  0.1565, -0.2210, -1.3494],
        ...,
        [-1.5863,  0.1763,  1.5001,  ..., -0.9093, -0.2352, -0.3140],
        [-1.5863,  0.1763,  1.5001,  ..., -0.9093, -0.2352, -0.3140],
        [-1

PackedSequence(data=tensor([[ 0.8673, -0.5954,  0.6403,  ...,  0.2478, -1.4770, -0.8996],
        [ 0.4200,  0.0552, -1.3590,  ...,  0.2796, -1.5874,  0.5198],
        [-0.3530,  1.8382, -0.9415,  ...,  0.2726, -2.1069,  1.4963],
        ...,
        [-1.5860,  0.1743,  1.5013,  ..., -0.9078, -0.2325, -0.3119],
        [-1.5860,  0.1743,  1.5013,  ..., -0.9078, -0.2325, -0.3119],
        [-1.5860,  0.1743,  1.5013,  ..., -0.9078, -0.2325, -0.3119]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 57, 41, 29, 18, 10]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8673, -0.5955,  0.6401,  ...,  0.2478, -1.4769, -0.8995],
        [ 0.9852,  0.7241,  0.4341,  ...,  1.8104,  0.1264,  1.2676],
        [ 0.8673, -0.5955,  0.6401,  ...,  0.2478, -1.4769, -0.8995],
        ...,
        [-1.5860,  0.1741,  1.5014,  ..., -0.9077, -0.2324, -0.3116],
        [-1.5860,  0.1741,  1.5014,  ..., -0.9077, -0.2324, -0.3116],
        [-1

PackedSequence(data=tensor([[ 0.8672, -0.5979,  0.6380,  ...,  0.2484, -1.4746, -0.8951],
        [ 0.8672, -0.5979,  0.6380,  ...,  0.2484, -1.4746, -0.8951],
        [-0.0476, -0.6553,  0.4327,  ...,  0.0244, -0.5993, -0.5289],
        ...,
        [-1.5872,  0.1745,  1.5004,  ..., -0.9077, -0.2330, -0.3106],
        [-1.5872,  0.1745,  1.5004,  ..., -0.9077, -0.2330, -0.3106],
        [-1.5872,  0.1745,  1.5004,  ..., -0.9077, -0.2330, -0.3106]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 51, 38, 24, 14,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8674, -0.5983,  0.6376,  ...,  0.2485, -1.4742, -0.8946],
        [ 0.3833, -0.4372,  1.4809,  ...,  1.4230,  0.0092,  0.2945],
        [-0.3530,  1.8361, -0.9420,  ...,  0.2735, -2.1059,  1.4988],
        ...,
        [-1.5873,  0.1745,  1.5003,  ..., -0.9077, -0.2331, -0.3106],
        [-1.5873,  0.1745,  1.5003,  ..., -0.9077, -0.2331, -0.3106],
        [-1

PackedSequence(data=tensor([[ 0.8666, -0.5987,  0.6368,  ...,  0.2481, -1.4721, -0.8943],
        [ 0.8666, -0.5987,  0.6368,  ...,  0.2481, -1.4721, -0.8943],
        [ 0.8666, -0.5987,  0.6368,  ...,  0.2481, -1.4721, -0.8943],
        ...,
        [-1.5881,  0.1753,  1.4994,  ..., -0.9078, -0.2348, -0.3127],
        [-1.5881,  0.1753,  1.4994,  ..., -0.9078, -0.2348, -0.3127],
        [-1.5881,  0.1753,  1.4994,  ..., -0.9078, -0.2348, -0.3127]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 49, 35, 27, 11,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8664, -0.5987,  0.6368,  ...,  0.2481, -1.4721, -0.8943],
        [ 1.0231, -2.0493,  0.2329,  ...,  0.6541,  1.3666,  1.4064],
        [ 0.8664, -0.5987,  0.6368,  ...,  0.2481, -1.4721, -0.8943],
        ...,
        [-1.5879,  0.1754,  1.4992,  ..., -0.9078, -0.2347, -0.3126],
        [-1.5879,  0.1754,  1.4992,  ..., -0.9078, -0.2347, -0.3126],
        [-1

PackedSequence(data=tensor([[ 0.8611, -0.5967,  0.6375,  ...,  0.2455, -1.4701, -0.8929],
        [ 0.7826, -1.1426, -1.7951,  ...,  0.1592, -0.2178, -1.3533],
        [ 0.5122,  1.2871, -1.2016,  ...,  1.1812,  2.7100,  0.6662],
        ...,
        [-1.5882,  0.1757,  1.4982,  ..., -0.9066, -0.2329, -0.3119],
        [-1.5882,  0.1757,  1.4982,  ..., -0.9066, -0.2329, -0.3119],
        [-1.5882,  0.1757,  1.4982,  ..., -0.9066, -0.2329, -0.3119]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 52, 38, 27, 15,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8611, -0.5968,  0.6373,  ...,  0.2456, -1.4699, -0.8927],
        [ 0.4235,  0.0446, -1.3712,  ...,  0.2855, -1.5763,  0.5091],
        [ 0.4235,  0.0446, -1.3712,  ...,  0.2855, -1.5763,  0.5091],
        ...,
        [-1.5883,  0.1762,  1.4981,  ..., -0.9068, -0.2331, -0.3123],
        [-1.5883,  0.1762,  1.4981,  ..., -0.9068, -0.2331, -0.3123],
        [-1

PackedSequence(data=tensor([[-0.0426, -0.6541,  0.4293,  ...,  0.0249, -0.6066, -0.5277],
        [ 0.9807,  0.7292,  0.4398,  ...,  1.8069,  0.1177,  1.2733],
        [ 0.8653, -0.6014,  0.6331,  ...,  0.2485, -1.4692, -0.8913],
        ...,
        [-1.5874,  0.1788,  1.4973,  ..., -0.9084, -0.2354, -0.3139],
        [-1.5874,  0.1788,  1.4973,  ..., -0.9084, -0.2354, -0.3139],
        [-1.5874,  0.1788,  1.4973,  ..., -0.9084, -0.2354, -0.3139]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 48, 42, 35, 15,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.4246,  0.0435, -1.3739,  ...,  0.2867, -1.5754,  0.5089],
        [ 0.7798, -1.1399, -1.7912,  ...,  0.1572, -0.2236, -1.3565],
        [ 0.9810,  0.7288,  0.4396,  ...,  1.8071,  0.1174,  1.2732],
        ...,
        [-1.5872,  0.1786,  1.4976,  ..., -0.9084, -0.2352, -0.3135],
        [-1.5872,  0.1786,  1.4976,  ..., -0.9084, -0.2352, -0.3135],
        [-1

PackedSequence(data=tensor([[ 0.9883,  0.7211,  0.4326,  ...,  1.8128,  0.1163,  1.2703],
        [ 0.8655, -0.6004,  0.6350,  ...,  0.2481, -1.4738, -0.8954],
        [ 0.7803, -1.1406, -1.7902,  ...,  0.1590, -0.2235, -1.3560],
        ...,
        [-1.5884,  0.1787,  1.4985,  ..., -0.9095, -0.2367, -0.3119],
        [-1.5884,  0.1787,  1.4985,  ..., -0.9095, -0.2367, -0.3119],
        [-1.5884,  0.1787,  1.4985,  ..., -0.9095, -0.2367, -0.3119]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 56, 47, 29, 16,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.3545,  1.8293, -0.9421,  ...,  0.2728, -2.0875,  1.5080],
        [-0.5588,  0.6369, -0.5089,  ...,  0.0287,  0.1285, -0.3240],
        [ 0.9881,  0.7213,  0.4325,  ...,  1.8126,  0.1163,  1.2706],
        ...,
        [-1.5886,  0.1786,  1.4981,  ..., -0.9096, -0.2372, -0.3122],
        [-1.5886,  0.1786,  1.4981,  ..., -0.9096, -0.2372, -0.3122],
        [-1

PackedSequence(data=tensor([[ 0.8652, -0.6014,  0.6343,  ...,  0.2499, -1.4720, -0.8955],
        [ 0.8652, -0.6014,  0.6343,  ...,  0.2499, -1.4720, -0.8955],
        [-0.2557,  0.0952, -0.1021,  ..., -0.4682, -0.6679,  1.5820],
        ...,
        [-1.5902,  0.1776,  1.4979,  ..., -0.9094, -0.2351, -0.3086],
        [-1.5902,  0.1776,  1.4979,  ..., -0.9094, -0.2351, -0.3086],
        [-1.5902,  0.1776,  1.4979,  ..., -0.9094, -0.2351, -0.3086]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([42, 42, 42, 42, 41, 29, 20, 16,  9,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2557,  0.0952, -0.1019,  ..., -0.4679, -0.6679,  1.5815],
        [ 0.8654, -0.6017,  0.6342,  ...,  0.2501, -1.4717, -0.8953],
        [ 0.8654, -0.6017,  0.6342,  ...,  0.2501, -1.4717, -0.8953],
        ...,
        [-1.5906,  0.1779,  1.4982,  ..., -0.9096, -0.2347, -0.3082],
        [-1.5906,  0.1779,  1.4982,  ..., -0.9096, -0.2347, -0.3082],
        [-1

PackedSequence(data=tensor([[ 0.8673, -0.6025,  0.6330,  ...,  0.2507, -1.4714, -0.8974],
        [ 0.9863,  0.7226,  0.4285,  ...,  1.8126,  0.1167,  1.2750],
        [ 0.9863,  0.7226,  0.4285,  ...,  1.8126,  0.1167,  1.2750],
        ...,
        [-1.5947,  0.1825,  1.4987,  ..., -0.9117, -0.2339, -0.3102],
        [-1.5947,  0.1825,  1.4987,  ..., -0.9117, -0.2339, -0.3102],
        [-1.5947,  0.1825,  1.4987,  ..., -0.9117, -0.2339, -0.3102]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 51, 37, 25, 13,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8672, -0.6024,  0.6330,  ...,  0.2505, -1.4714, -0.8976],
        [-0.0378, -0.6576,  0.4215,  ...,  0.0270, -0.6014, -0.5185],
        [ 0.7822, -1.1446, -1.7916,  ...,  0.1596, -0.2276, -1.3517],
        ...,
        [-1.5950,  0.1825,  1.4990,  ..., -0.9119, -0.2336, -0.3099],
        [-1.5950,  0.1825,  1.4990,  ..., -0.9119, -0.2336, -0.3099],
        [-1

PackedSequence(data=tensor([[ 0.5128,  1.2869, -1.2062,  ...,  1.1781,  2.7128,  0.6786],
        [ 1.0396, -2.0630,  0.2266,  ...,  0.6637,  1.3627,  1.4048],
        [ 0.7807, -1.1437, -1.7923,  ...,  0.1580, -0.2256, -1.3493],
        ...,
        [-1.5933,  0.1772,  1.4984,  ..., -0.9113, -0.2330, -0.3067],
        [-1.5933,  0.1772,  1.4984,  ..., -0.9113, -0.2330, -0.3067],
        [-1.5933,  0.1772,  1.4984,  ..., -0.9113, -0.2330, -0.3067]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 51, 39, 29, 15,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2558,  0.0912, -0.1093,  ..., -0.4638, -0.6640,  1.5813],
        [ 0.8676, -0.6029,  0.6307,  ...,  0.2512, -1.4696, -0.8973],
        [ 0.7806, -1.1437, -1.7924,  ...,  0.1580, -0.2254, -1.3490],
        ...,
        [-1.5933,  0.1769,  1.4980,  ..., -0.9112, -0.2332, -0.3066],
        [-1.5933,  0.1769,  1.4980,  ..., -0.9112, -0.2332, -0.3066],
        [-1

PackedSequence(data=tensor([[ 0.8666, -0.6016,  0.6316,  ...,  0.2492, -1.4693, -0.8990],
        [ 0.7827, -1.1452, -1.7951,  ...,  0.1596, -0.2231, -1.3440],
        [ 0.9894,  0.7193,  0.4231,  ...,  1.8160,  0.1162,  1.2802],
        ...,
        [-1.5939,  0.1780,  1.4987,  ..., -0.9126, -0.2331, -0.3062],
        [-1.5939,  0.1780,  1.4987,  ..., -0.9126, -0.2331, -0.3062],
        [-1.5939,  0.1780,  1.4987,  ..., -0.9126, -0.2331, -0.3062]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 51, 40, 27, 18,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.7830, -1.1452, -1.7955,  ...,  0.1598, -0.2224, -1.3438],
        [ 0.9891,  0.7195,  0.4232,  ...,  1.8154,  0.1157,  1.2802],
        [ 0.8667, -0.6016,  0.6316,  ...,  0.2491, -1.4694, -0.8992],
        ...,
        [-1.5941,  0.1784,  1.4989,  ..., -0.9129, -0.2331, -0.3063],
        [-1.5941,  0.1784,  1.4989,  ..., -0.9129, -0.2331, -0.3063],
        [-1

PackedSequence(data=tensor([[ 0.8681, -0.6023,  0.6309,  ...,  0.2499, -1.4717, -0.8983],
        [ 0.8681, -0.6023,  0.6309,  ...,  0.2499, -1.4717, -0.8983],
        [ 0.7842, -1.1442, -1.7975,  ...,  0.1609, -0.2178, -1.3425],
        ...,
        [-1.5940,  0.1757,  1.4982,  ..., -0.9118, -0.2331, -0.3073],
        [-1.5940,  0.1757,  1.4982,  ..., -0.9118, -0.2331, -0.3073],
        [-1.5940,  0.1757,  1.4982,  ..., -0.9118, -0.2331, -0.3073]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 55, 42, 29, 15,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8681, -0.6024,  0.6309,  ...,  0.2500, -1.4718, -0.8978],
        [ 0.8681, -0.6024,  0.6309,  ...,  0.2500, -1.4718, -0.8978],
        [ 0.9874,  0.7210,  0.4280,  ...,  1.8109,  0.1095,  1.2788],
        ...,
        [-1.5941,  0.1760,  1.4981,  ..., -0.9118, -0.2329, -0.3074],
        [-1.5941,  0.1760,  1.4981,  ..., -0.9118, -0.2329, -0.3074],
        [-1

PackedSequence(data=tensor([[ 0.9877,  0.7203,  0.4274,  ...,  1.8110,  0.1118,  1.2820],
        [ 0.8690, -0.6038,  0.6314,  ...,  0.2509, -1.4741, -0.8973],
        [ 0.7824, -1.1422, -1.7960,  ...,  0.1592, -0.2181, -1.3431],
        ...,
        [-1.5939,  0.1784,  1.4999,  ..., -0.9121, -0.2295, -0.3044],
        [-1.5939,  0.1784,  1.4999,  ..., -0.9121, -0.2295, -0.3044],
        [-1.5939,  0.1784,  1.4999,  ..., -0.9121, -0.2295, -0.3044]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 50, 41, 30, 18,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8688, -0.6037,  0.6316,  ...,  0.2507, -1.4744, -0.8976],
        [-0.2599,  0.0992, -0.1197,  ..., -0.4695, -0.6671,  1.5828],
        [ 0.9878,  0.7203,  0.4274,  ...,  1.8110,  0.1119,  1.2816],
        ...,
        [-1.5938,  0.1783,  1.5000,  ..., -0.9120, -0.2293, -0.3040],
        [-1.5938,  0.1783,  1.5000,  ..., -0.9120, -0.2293, -0.3040],
        [-1

PackedSequence(data=tensor([[ 0.8685, -0.6034,  0.6335,  ...,  0.2505, -1.4723, -0.8974],
        [ 0.8685, -0.6034,  0.6335,  ...,  0.2505, -1.4723, -0.8974],
        [ 0.8685, -0.6034,  0.6335,  ...,  0.2505, -1.4723, -0.8974],
        ...,
        [-1.5959,  0.1796,  1.5009,  ..., -0.9141, -0.2310, -0.3026],
        [-1.5959,  0.1796,  1.5009,  ..., -0.9141, -0.2310, -0.3026],
        [-1.5959,  0.1796,  1.5009,  ..., -0.9141, -0.2310, -0.3026]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 54, 34, 23, 10,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8686, -0.6035,  0.6335,  ...,  0.2506, -1.4722, -0.8972],
        [ 0.7820, -1.1425, -1.7946,  ...,  0.1604, -0.2138, -1.3420],
        [ 0.9913,  0.7171,  0.4234,  ...,  1.8137,  0.1148,  1.2781],
        ...,
        [-1.5959,  0.1797,  1.5008,  ..., -0.9141, -0.2311, -0.3027],
        [-1.5959,  0.1797,  1.5008,  ..., -0.9141, -0.2311, -0.3027],
        [-1

PackedSequence(data=tensor([[-0.2567,  0.0982, -0.1245,  ..., -0.4678, -0.6695,  1.5834],
        [ 1.0224, -2.0398,  0.2308,  ...,  0.6420,  1.3598,  1.4156],
        [ 0.8684, -0.6028,  0.6359,  ...,  0.2493, -1.4742, -0.8941],
        ...,
        [-1.5966,  0.1802,  1.4986,  ..., -0.9136, -0.2350, -0.3039],
        [-1.5966,  0.1802,  1.4986,  ..., -0.9136, -0.2350, -0.3039],
        [-1.5966,  0.1802,  1.4986,  ..., -0.9136, -0.2350, -0.3039]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 61, 50, 32, 15,  8,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.0408, -0.6562,  0.4236,  ...,  0.0308, -0.5965, -0.5067],
        [ 0.9935,  0.7157,  0.4225,  ...,  1.8148,  0.1168,  1.2718],
        [ 0.8688, -0.6033,  0.6355,  ...,  0.2496, -1.4741, -0.8936],
        ...,
        [-1.5970,  0.1798,  1.4984,  ..., -0.9136, -0.2353, -0.3042],
        [-1.5970,  0.1798,  1.4984,  ..., -0.9136, -0.2353, -0.3042],
        [-1

PackedSequence(data=tensor([[ 0.7843, -1.1447, -1.7961,  ...,  0.1639, -0.2129, -1.3410],
        [ 0.8709, -0.6070,  0.6305,  ...,  0.2534, -1.4708, -0.8904],
        [ 0.9948,  0.7152,  0.4218,  ...,  1.8148,  0.1153,  1.2736],
        ...,
        [-1.6019,  0.1796,  1.4968,  ..., -0.9132, -0.2328, -0.3046],
        [-1.6019,  0.1796,  1.4968,  ..., -0.9132, -0.2328, -0.3046],
        [-1.6019,  0.1796,  1.4968,  ..., -0.9132, -0.2328, -0.3046]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 61, 54, 39, 27, 14,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8707, -0.6069,  0.6305,  ...,  0.2534, -1.4708, -0.8904],
        [-0.2552,  0.0992, -0.1271,  ..., -0.4687, -0.6704,  1.5824],
        [ 0.7848, -1.1451, -1.7965,  ...,  0.1643, -0.2130, -1.3410],
        ...,
        [-1.6025,  0.1797,  1.4967,  ..., -0.9132, -0.2325, -0.3046],
        [-1.6025,  0.1797,  1.4967,  ..., -0.9132, -0.2325, -0.3046],
        [-1

PackedSequence(data=tensor([[ 0.7846, -1.1440, -1.7952,  ...,  0.1625, -0.2165, -1.3428],
        [ 0.8702, -0.6079,  0.6296,  ...,  0.2545, -1.4680, -0.8904],
        [ 0.8702, -0.6079,  0.6296,  ...,  0.2545, -1.4680, -0.8904],
        ...,
        [-1.6040,  0.1805,  1.4986,  ..., -0.9144, -0.2305, -0.3028],
        [-1.6040,  0.1805,  1.4986,  ..., -0.9144, -0.2305, -0.3028],
        [-1.6040,  0.1805,  1.4986,  ..., -0.9144, -0.2305, -0.3028]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 52, 33, 23, 17,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.7846, -1.1439, -1.7950,  ...,  0.1624, -0.2167, -1.3430],
        [ 0.8701, -0.6080,  0.6296,  ...,  0.2545, -1.4679, -0.8908],
        [ 0.8701, -0.6080,  0.6296,  ...,  0.2545, -1.4679, -0.8908],
        ...,
        [-1.6042,  0.1805,  1.4986,  ..., -0.9144, -0.2304, -0.3026],
        [-1.6042,  0.1805,  1.4986,  ..., -0.9144, -0.2304, -0.3026],
        [-1

PackedSequence(data=tensor([[ 0.4405,  0.0427, -1.4028,  ...,  0.2882, -1.5729,  0.5097],
        [ 0.8688, -0.6074,  0.6299,  ...,  0.2533, -1.4687, -0.8930],
        [ 0.9883,  0.7176,  0.4191,  ...,  1.8097,  0.1092,  1.2825],
        ...,
        [-1.5998,  0.1792,  1.4994,  ..., -0.9151, -0.2297, -0.3007],
        [-1.5998,  0.1792,  1.4994,  ..., -0.9151, -0.2297, -0.3007],
        [-1.5998,  0.1792,  1.4994,  ..., -0.9151, -0.2297, -0.3007]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 54, 39, 26, 14,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.7819, -1.1420, -1.7936,  ...,  0.1603, -0.2191, -1.3421],
        [ 0.9883,  0.7175,  0.4190,  ...,  1.8097,  0.1092,  1.2824],
        [ 0.5138,  1.2839, -1.2001,  ...,  1.1834,  2.7120,  0.6816],
        ...,
        [-1.5998,  0.1793,  1.4996,  ..., -0.9153, -0.2298, -0.3007],
        [-1.5998,  0.1793,  1.4996,  ..., -0.9153, -0.2298, -0.3007],
        [-1

PackedSequence(data=tensor([[ 0.9905,  0.7151,  0.4172,  ...,  1.8104,  0.1089,  1.2790],
        [ 0.8695, -0.6073,  0.6302,  ...,  0.2537, -1.4715, -0.8924],
        [-0.9263, -1.2301, -1.0283,  ...,  0.3943, -2.2807,  2.3030],
        ...,
        [-1.5976,  0.1766,  1.4986,  ..., -0.9154, -0.2288, -0.3030],
        [-1.5976,  0.1766,  1.4986,  ..., -0.9154, -0.2288, -0.3030],
        [-1.5976,  0.1766,  1.4986,  ..., -0.9154, -0.2288, -0.3030]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 54, 42, 32, 13,  1]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.4416,  0.0405, -1.4020,  ...,  0.2902, -1.5700,  0.5101],
        [ 0.7838, -1.1456, -1.7974,  ...,  0.1644, -0.2119, -1.3373],
        [ 0.8694, -0.6074,  0.6300,  ...,  0.2537, -1.4717, -0.8922],
        ...,
        [-1.5980,  0.1763,  1.4981,  ..., -0.9152, -0.2286, -0.3032],
        [-1.5980,  0.1763,  1.4981,  ..., -0.9152, -0.2286, -0.3032],
        [-1

PackedSequence(data=tensor([[ 0.8660, -0.6084,  0.6310,  ...,  0.2520, -1.4669, -0.8854],
        [ 0.7846, -1.1462, -1.7983,  ...,  0.1643, -0.2128, -1.3386],
        [ 0.8660, -0.6084,  0.6310,  ...,  0.2520, -1.4669, -0.8854],
        ...,
        [-1.6010,  0.1789,  1.4968,  ..., -0.9149, -0.2296, -0.3017],
        [-1.6010,  0.1789,  1.4968,  ..., -0.9149, -0.2296, -0.3017],
        [-1.6010,  0.1789,  1.4968,  ..., -0.9149, -0.2296, -0.3017]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 52, 43, 33, 19,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8659, -0.6085,  0.6310,  ...,  0.2520, -1.4665, -0.8852],
        [ 0.8659, -0.6085,  0.6310,  ...,  0.2520, -1.4665, -0.8852],
        [ 0.8659, -0.6085,  0.6310,  ...,  0.2520, -1.4665, -0.8852],
        ...,
        [-1.6009,  0.1788,  1.4971,  ..., -0.9150, -0.2297, -0.3012],
        [-1.6009,  0.1788,  1.4971,  ..., -0.9150, -0.2297, -0.3012],
        [-1

PackedSequence(data=tensor([[ 0.8619, -0.6078,  0.6293,  ...,  0.2484, -1.4684, -0.8802],
        [ 0.8619, -0.6078,  0.6293,  ...,  0.2484, -1.4684, -0.8802],
        [ 0.8619, -0.6078,  0.6293,  ...,  0.2484, -1.4684, -0.8802],
        ...,
        [-1.6006,  0.1779,  1.5002,  ..., -0.9177, -0.2319, -0.3005],
        [-1.6006,  0.1779,  1.5002,  ..., -0.9177, -0.2319, -0.3005],
        [-1.6006,  0.1779,  1.5002,  ..., -0.9177, -0.2319, -0.3005]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 54, 40, 26, 16,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.0412, -0.6601,  0.4161,  ...,  0.0374, -0.5939, -0.5017],
        [ 0.5084,  1.2903, -1.2044,  ...,  1.1792,  2.7123,  0.6814],
        [-0.2585,  0.1020, -0.1224,  ..., -0.4713, -0.6650,  1.5843],
        ...,
        [-1.6008,  0.1780,  1.5005,  ..., -0.9180, -0.2320, -0.3007],
        [-1.6008,  0.1780,  1.5005,  ..., -0.9180, -0.2320, -0.3007],
        [-1

PackedSequence(data=tensor([[ 0.8665, -0.6105,  0.6257,  ...,  0.2509, -1.4686, -0.8795],
        [ 0.8665, -0.6105,  0.6257,  ...,  0.2509, -1.4686, -0.8795],
        [ 0.4465,  0.0341, -1.4031,  ...,  0.2964, -1.5655,  0.5061],
        ...,
        [-1.6017,  0.1739,  1.4968,  ..., -0.9173, -0.2311, -0.2997],
        [-1.6017,  0.1739,  1.4968,  ..., -0.9173, -0.2311, -0.2997],
        [-1.6017,  0.1739,  1.4968,  ..., -0.9173, -0.2311, -0.2997]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 53, 40, 28, 20,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.7900, -1.1520, -1.7997,  ...,  0.1693, -0.2123, -1.3475],
        [ 1.0210, -2.0384,  0.2282,  ...,  0.6376,  1.3627,  1.4238],
        [ 0.8668, -0.6107,  0.6255,  ...,  0.2512, -1.4685, -0.8794],
        ...,
        [-1.6021,  0.1742,  1.4967,  ..., -0.9173, -0.2309, -0.2993],
        [-1.6021,  0.1742,  1.4967,  ..., -0.9173, -0.2309, -0.2993],
        [-1

PackedSequence(data=tensor([[-0.0400, -0.6644,  0.4174,  ...,  0.0438, -0.5893, -0.4959],
        [ 0.8679, -0.6121,  0.6245,  ...,  0.2520, -1.4677, -0.8782],
        [ 0.8679, -0.6121,  0.6245,  ...,  0.2520, -1.4677, -0.8782],
        ...,
        [-1.6041,  0.1771,  1.4976,  ..., -0.9164, -0.2283, -0.2947],
        [-1.6041,  0.1771,  1.4976,  ..., -0.9164, -0.2283, -0.2947],
        [-1.6041,  0.1771,  1.4976,  ..., -0.9164, -0.2283, -0.2947]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 48, 33, 28, 16, 11]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.7897, -1.1530, -1.8014,  ...,  0.1680, -0.2100, -1.3484],
        [ 0.5060,  1.2879, -1.2061,  ...,  1.1872,  2.7111,  0.6913],
        [ 0.8680, -0.6121,  0.6245,  ...,  0.2520, -1.4679, -0.8779],
        ...,
        [-1.6040,  0.1771,  1.4980,  ..., -0.9165, -0.2281, -0.2944],
        [-1.6040,  0.1771,  1.4980,  ..., -0.9165, -0.2281, -0.2944],
        [-1

PackedSequence(data=tensor([[ 0.8667, -0.6095,  0.6285,  ...,  0.2513, -1.4688, -0.8805],
        [ 0.7893, -1.1506, -1.8022,  ...,  0.1655, -0.2095, -1.3461],
        [ 0.8667, -0.6095,  0.6285,  ...,  0.2513, -1.4688, -0.8805],
        ...,
        [-1.6045,  0.1785,  1.4974,  ..., -0.9190, -0.2299, -0.2981],
        [-1.6045,  0.1785,  1.4974,  ..., -0.9190, -0.2299, -0.2981],
        [-1.6045,  0.1785,  1.4974,  ..., -0.9190, -0.2299, -0.2981]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 55, 44, 29, 14,  2]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8666, -0.6093,  0.6289,  ...,  0.2510, -1.4696, -0.8810],
        [-0.2579,  0.1012, -0.1309,  ..., -0.4700, -0.6565,  1.5900],
        [ 0.8666, -0.6093,  0.6289,  ...,  0.2510, -1.4696, -0.8810],
        ...,
        [-1.6045,  0.1784,  1.4971,  ..., -0.9189, -0.2300, -0.2982],
        [-1.6045,  0.1784,  1.4971,  ..., -0.9189, -0.2300, -0.2982],
        [-1

PackedSequence(data=tensor([[ 0.8624, -0.6071,  0.6301,  ...,  0.2479, -1.4731, -0.8843],
        [ 0.9937,  0.7126,  0.4160,  ...,  1.8098,  0.1060,  1.2893],
        [ 0.8624, -0.6071,  0.6301,  ...,  0.2479, -1.4731, -0.8843],
        ...,
        [-1.6060,  0.1782,  1.4957,  ..., -0.9195, -0.2319, -0.2974],
        [-1.6060,  0.1782,  1.4957,  ..., -0.9195, -0.2319, -0.2974],
        [-1.6060,  0.1782,  1.4957,  ..., -0.9195, -0.2319, -0.2974]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 50, 37, 21, 15,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.7890, -1.1490, -1.8044,  ...,  0.1613, -0.2099, -1.3411],
        [ 0.8623, -0.6070,  0.6301,  ...,  0.2478, -1.4732, -0.8844],
        [ 0.8623, -0.6070,  0.6301,  ...,  0.2478, -1.4732, -0.8844],
        ...,
        [-1.6058,  0.1783,  1.4956,  ..., -0.9194, -0.2318, -0.2975],
        [-1.6058,  0.1783,  1.4956,  ..., -0.9194, -0.2318, -0.2975],
        [-1

PackedSequence(data=tensor([[-0.5696,  0.6455, -0.5045,  ...,  0.0366,  0.1147, -0.3180],
        [ 0.8611, -0.6055,  0.6305,  ...,  0.2466, -1.4719, -0.8840],
        [ 0.8611, -0.6055,  0.6305,  ...,  0.2466, -1.4719, -0.8840],
        ...,
        [-1.5991,  0.1756,  1.4974,  ..., -0.9191, -0.2283, -0.2988],
        [-1.5991,  0.1756,  1.4974,  ..., -0.9191, -0.2283, -0.2988],
        [-1.5991,  0.1756,  1.4974,  ..., -0.9191, -0.2283, -0.2988]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 52, 40, 27, 14,  9]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8614, -0.6055,  0.6306,  ...,  0.2466, -1.4715, -0.8839],
        [ 1.0212, -2.0404,  0.2181,  ...,  0.6385,  1.3584,  1.4149],
        [ 0.8614, -0.6055,  0.6306,  ...,  0.2466, -1.4715, -0.8839],
        ...,
        [-1.5989,  0.1756,  1.4976,  ..., -0.9191, -0.2287, -0.2985],
        [-1.5989,  0.1756,  1.4976,  ..., -0.9191, -0.2287, -0.2985],
        [-1

PackedSequence(data=tensor([[ 0.5056,  1.2906, -1.1910,  ...,  1.1884,  2.7113,  0.6729],
        [-0.2549,  0.0974, -0.1406,  ..., -0.4693, -0.6571,  1.5828],
        [-0.3147,  1.8176, -0.9537,  ...,  0.2969, -2.1040,  1.4992],
        ...,
        [-1.5969,  0.1738,  1.4940,  ..., -0.9202, -0.2371, -0.2977],
        [-1.5969,  0.1738,  1.4940,  ..., -0.9202, -0.2371, -0.2977],
        [-1.5969,  0.1738,  1.4940,  ..., -0.9202, -0.2371, -0.2977]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 54, 38, 30, 13,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.3147,  1.8174, -0.9540,  ...,  0.2969, -2.1038,  1.4997],
        [ 1.0222, -2.0406,  0.2186,  ...,  0.6368,  1.3551,  1.4118],
        [-0.2548,  0.0972, -0.1410,  ..., -0.4691, -0.6570,  1.5828],
        ...,
        [-1.5970,  0.1735,  1.4936,  ..., -0.9203, -0.2374, -0.2978],
        [-1.5970,  0.1735,  1.4936,  ..., -0.9203, -0.2374, -0.2978],
        [-1

PackedSequence(data=tensor([[ 0.8599, -0.6069,  0.6334,  ...,  0.2460, -1.4665, -0.8807],
        [ 0.8599, -0.6069,  0.6334,  ...,  0.2460, -1.4665, -0.8807],
        [ 0.7939, -1.1524, -1.8082,  ...,  0.1625, -0.2171, -1.3314],
        ...,
        [-1.5989,  0.1737,  1.4971,  ..., -0.9211, -0.2353, -0.2966],
        [-1.5989,  0.1737,  1.4971,  ..., -0.9211, -0.2353, -0.2966],
        [-1.5989,  0.1737,  1.4971,  ..., -0.9211, -0.2353, -0.2966]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 56, 39, 26, 11,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0227, -2.0412,  0.2153,  ...,  0.6368,  1.3553,  1.4113],
        [-0.3133,  1.8147, -0.9577,  ...,  0.2995, -2.1048,  1.5031],
        [ 0.4372,  0.0457, -1.3963,  ...,  0.2904, -1.5585,  0.5072],
        ...,
        [-1.5991,  0.1738,  1.4973,  ..., -0.9211, -0.2351, -0.2964],
        [-1.5991,  0.1738,  1.4973,  ..., -0.9211, -0.2351, -0.2964],
        [-1

PackedSequence(data=tensor([[ 0.8597, -0.6061,  0.6345,  ...,  0.2438, -1.4675, -0.8840],
        [ 0.8597, -0.6061,  0.6345,  ...,  0.2438, -1.4675, -0.8840],
        [ 0.8597, -0.6061,  0.6345,  ...,  0.2438, -1.4675, -0.8840],
        ...,
        [-1.6030,  0.1745,  1.4951,  ..., -0.9198, -0.2345, -0.2941],
        [-1.6030,  0.1745,  1.4951,  ..., -0.9198, -0.2345, -0.2941],
        [-1.6030,  0.1745,  1.4951,  ..., -0.9198, -0.2345, -0.2941]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 50, 35, 28, 16, 10]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8598, -0.6060,  0.6345,  ...,  0.2437, -1.4675, -0.8840],
        [ 0.4368,  0.0457, -1.3976,  ...,  0.2884, -1.5585,  0.5111],
        [ 0.9969,  0.7089,  0.4152,  ...,  1.8109,  0.1139,  1.2761],
        ...,
        [-1.6032,  0.1742,  1.4947,  ..., -0.9198, -0.2348, -0.2944],
        [-1.6032,  0.1742,  1.4947,  ..., -0.9198, -0.2348, -0.2944],
        [-1

PackedSequence(data=tensor([[ 0.8621, -0.6089,  0.6304,  ...,  0.2463, -1.4657, -0.8845],
        [ 0.8621, -0.6089,  0.6304,  ...,  0.2463, -1.4657, -0.8845],
        [-0.3104,  1.8153, -0.9550,  ...,  0.3014, -2.1116,  1.4974],
        ...,
        [-1.6050,  0.1789,  1.4950,  ..., -0.9206, -0.2295, -0.2970],
        [-1.6050,  0.1789,  1.4950,  ..., -0.9206, -0.2295, -0.2970],
        [-1.6050,  0.1789,  1.4950,  ..., -0.9206, -0.2295, -0.2970]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 54, 39, 32, 20,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8621, -0.6091,  0.6298,  ...,  0.2465, -1.4654, -0.8844],
        [ 0.9950,  0.7100,  0.4194,  ...,  1.8092,  0.1075,  1.2732],
        [ 1.0207, -2.0398,  0.2168,  ...,  0.6380,  1.3582,  1.4119],
        ...,
        [-1.6053,  0.1790,  1.4953,  ..., -0.9208, -0.2293, -0.2968],
        [-1.6053,  0.1790,  1.4953,  ..., -0.9208, -0.2293, -0.2968],
        [-1

PackedSequence(data=tensor([[-0.2568,  0.0926, -0.1418,  ..., -0.4647, -0.6550,  1.5826],
        [-0.2568,  0.0926, -0.1418,  ..., -0.4647, -0.6550,  1.5826],
        [ 0.8610, -0.6102,  0.6274,  ...,  0.2462, -1.4656, -0.8834],
        ...,
        [-1.6059,  0.1813,  1.4992,  ..., -0.9223, -0.2284, -0.2952],
        [-1.6059,  0.1813,  1.4992,  ..., -0.9223, -0.2284, -0.2952],
        [-1.6059,  0.1813,  1.4992,  ..., -0.9223, -0.2284, -0.2952]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 52, 38, 29, 14,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8609, -0.6101,  0.6275,  ...,  0.2461, -1.4656, -0.8831],
        [ 0.7901, -1.1540, -1.8052,  ...,  0.1558, -0.2140, -1.3273],
        [ 0.9971,  0.7079,  0.4186,  ...,  1.8107,  0.1048,  1.2715],
        ...,
        [-1.6060,  0.1809,  1.4992,  ..., -0.9220, -0.2282, -0.2949],
        [-1.6060,  0.1809,  1.4992,  ..., -0.9220, -0.2282, -0.2949],
        [-1

PackedSequence(data=tensor([[ 0.8632, -0.6115,  0.6250,  ...,  0.2497, -1.4668, -0.8796],
        [ 0.7873, -1.1545, -1.8033,  ...,  0.1532, -0.2086, -1.3203],
        [ 0.8632, -0.6115,  0.6250,  ...,  0.2497, -1.4668, -0.8796],
        ...,
        [-1.6066,  0.1798,  1.4975,  ..., -0.9229, -0.2319, -0.2945],
        [-1.6066,  0.1798,  1.4975,  ..., -0.9229, -0.2319, -0.2945],
        [-1.6066,  0.1798,  1.4975,  ..., -0.9229, -0.2319, -0.2945]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 48, 37, 27, 11,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.7872, -1.1544, -1.8033,  ...,  0.1531, -0.2086, -1.3199],
        [-0.3087,  1.8212, -0.9490,  ...,  0.2992, -2.1152,  1.4952],
        [ 0.8631, -0.6114,  0.6251,  ...,  0.2497, -1.4670, -0.8797],
        ...,
        [-1.6067,  0.1801,  1.4974,  ..., -0.9228, -0.2319, -0.2946],
        [-1.6067,  0.1801,  1.4974,  ..., -0.9228, -0.2319, -0.2946],
        [-1

PackedSequence(data=tensor([[ 0.8636, -0.6116,  0.6253,  ...,  0.2475, -1.4706, -0.8781],
        [ 0.8636, -0.6116,  0.6253,  ...,  0.2475, -1.4706, -0.8781],
        [-0.3116,  1.8202, -0.9535,  ...,  0.2997, -2.1151,  1.4983],
        ...,
        [-1.6065,  0.1778,  1.4967,  ..., -0.9209, -0.2316, -0.2930],
        [-1.6065,  0.1778,  1.4967,  ..., -0.9209, -0.2316, -0.2930],
        [-1.6065,  0.1778,  1.4967,  ..., -0.9209, -0.2316, -0.2930]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 52, 38, 28, 12,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8635, -0.6117,  0.6252,  ...,  0.2473, -1.4708, -0.8780],
        [ 0.8635, -0.6117,  0.6252,  ...,  0.2473, -1.4708, -0.8780],
        [ 0.7861, -1.1545, -1.8031,  ...,  0.1537, -0.2101, -1.3188],
        ...,
        [-1.6064,  0.1778,  1.4967,  ..., -0.9208, -0.2314, -0.2929],
        [-1.6064,  0.1778,  1.4967,  ..., -0.9208, -0.2314, -0.2929],
        [-1

PackedSequence(data=tensor([[ 0.8644, -0.6129,  0.6232,  ...,  0.2464, -1.4710, -0.8747],
        [-0.2582,  0.0946, -0.1404,  ..., -0.4718, -0.6555,  1.5813],
        [-0.3104,  1.8161, -0.9599,  ...,  0.3054, -2.1112,  1.5049],
        ...,
        [-1.6078,  0.1799,  1.4961,  ..., -0.9230, -0.2307, -0.2959],
        [-1.6078,  0.1799,  1.4961,  ..., -0.9230, -0.2307, -0.2959],
        [-1.6078,  0.1799,  1.4961,  ..., -0.9230, -0.2307, -0.2959]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 56, 41, 29, 15,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8645, -0.6130,  0.6231,  ...,  0.2465, -1.4709, -0.8744],
        [ 1.0013,  0.7041,  0.4145,  ...,  1.8117,  0.1010,  1.2656],
        [ 0.8645, -0.6130,  0.6231,  ...,  0.2465, -1.4709, -0.8744],
        ...,
        [-1.6077,  0.1795,  1.4964,  ..., -0.9231, -0.2306, -0.2955],
        [-1.6077,  0.1795,  1.4964,  ..., -0.9231, -0.2306, -0.2955],
        [-1

PackedSequence(data=tensor([[ 0.5090,  1.2803, -1.1992,  ...,  1.1937,  2.7011,  0.6686],
        [ 0.5090,  1.2803, -1.1992,  ...,  1.1937,  2.7011,  0.6686],
        [ 0.8648, -0.6128,  0.6211,  ...,  0.2470, -1.4710, -0.8748],
        ...,
        [-1.6079,  0.1785,  1.4975,  ..., -0.9244, -0.2307, -0.2927],
        [-1.6079,  0.1785,  1.4975,  ..., -0.9244, -0.2307, -0.2927],
        [-1.6079,  0.1785,  1.4975,  ..., -0.9244, -0.2307, -0.2927]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 50, 40, 24, 11,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8646, -0.6126,  0.6210,  ...,  0.2469, -1.4711, -0.8750],
        [ 0.7899, -1.1524, -1.8039,  ...,  0.1584, -0.2115, -1.3243],
        [-0.0306, -0.6661,  0.4001,  ...,  0.0580, -0.5758, -0.4943],
        ...,
        [-1.6080,  0.1784,  1.4976,  ..., -0.9247, -0.2309, -0.2927],
        [-1.6080,  0.1784,  1.4976,  ..., -0.9247, -0.2309, -0.2927],
        [-1

PackedSequence(data=tensor([[ 8.6422e-01, -6.1207e-01,  6.1970e-01,  ...,  2.4711e-01,
         -1.4729e+00, -8.7606e-01],
        [ 8.6422e-01, -6.1207e-01,  6.1970e-01,  ...,  2.4711e-01,
         -1.4729e+00, -8.7606e-01],
        [-2.4410e-01,  3.7183e-01, -1.0878e-03,  ...,  5.7425e-01,
          2.2584e+00,  9.4111e-01],
        ...,
        [-1.6117e+00,  1.8098e-01,  1.4982e+00,  ..., -9.2688e-01,
         -2.2777e-01, -2.9098e-01],
        [-1.6117e+00,  1.8098e-01,  1.4982e+00,  ..., -9.2688e-01,
         -2.2777e-01, -2.9098e-01],
        [-1.6117e+00,  1.8098e-01,  1.4982e+00,  ..., -9.2688e-01,
         -2.2777e-01, -2.9098e-01]], grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 50, 39, 27, 12,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.4460,  0.0388, -1.3975,  ...,  0.2928, -1.5506,  0.5019],
        [ 0.8643, -0.6120,  0.6198,  ...,  0.2471, -1.4731, -0.8761],
        [ 0.7888, -1.1526, -1.8042,  ...,  

PackedSequence(data=tensor([[ 0.8623, -0.6116,  0.6226,  ...,  0.2456, -1.4758, -0.8795],
        [ 0.5098,  1.2802, -1.1927,  ...,  1.1904,  2.6936,  0.6690],
        [ 0.7903, -1.1545, -1.8097,  ...,  0.1619, -0.2047, -1.3306],
        ...,
        [-1.6112,  0.1798,  1.4996,  ..., -0.9256, -0.2294, -0.2925],
        [-1.6112,  0.1798,  1.4996,  ..., -0.9256, -0.2294, -0.2925],
        [-1.6112,  0.1798,  1.4996,  ..., -0.9256, -0.2294, -0.2925]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 55, 35, 27, 15,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.5889,  0.6488, -0.4993,  ...,  0.0390,  0.1063, -0.3083],
        [ 0.5098,  1.2802, -1.1925,  ...,  1.1902,  2.6932,  0.6693],
        [ 0.8622, -0.6116,  0.6226,  ...,  0.2455, -1.4759, -0.8796],
        ...,
        [-1.6112,  0.1799,  1.4995,  ..., -0.9257, -0.2297, -0.2927],
        [-1.6112,  0.1799,  1.4995,  ..., -0.9257, -0.2297, -0.2927],
        [-1

PackedSequence(data=tensor([[ 0.8612, -0.6120,  0.6217,  ...,  0.2445, -1.4773, -0.8805],
        [-1.4509,  0.7508, -1.7743,  ..., -0.4671,  0.0764,  0.1531],
        [ 0.4479,  0.0342, -1.3944,  ...,  0.2937, -1.5385,  0.5059],
        ...,
        [-1.6105,  0.1809,  1.4970,  ..., -0.9277, -0.2319, -0.2930],
        [-1.6105,  0.1809,  1.4970,  ..., -0.9277, -0.2319, -0.2930],
        [-1.6105,  0.1809,  1.4970,  ..., -0.9277, -0.2319, -0.2930]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 52, 37, 26, 12,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8606, -0.6118,  0.6216,  ...,  0.2442, -1.4771, -0.8804],
        [ 0.7894, -1.1539, -1.8104,  ...,  0.1600, -0.2012, -1.3314],
        [ 0.8606, -0.6118,  0.6216,  ...,  0.2442, -1.4771, -0.8804],
        ...,
        [-1.6106,  0.1808,  1.4971,  ..., -0.9280, -0.2320, -0.2928],
        [-1.6106,  0.1808,  1.4971,  ..., -0.9280, -0.2320, -0.2928],
        [-1

PackedSequence(data=tensor([[ 0.8603, -0.6126,  0.6205,  ...,  0.2449, -1.4733, -0.8787],
        [ 0.5028,  1.2852, -1.1866,  ...,  1.1825,  2.6877,  0.6747],
        [ 0.7891, -1.1536, -1.8085,  ...,  0.1611, -0.2013, -1.3320],
        ...,
        [-1.6139,  0.1825,  1.4959,  ..., -0.9297, -0.2318, -0.2896],
        [-1.6139,  0.1825,  1.4959,  ..., -0.9297, -0.2318, -0.2896],
        [-1.6139,  0.1825,  1.4959,  ..., -0.9297, -0.2318, -0.2896]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 50, 28, 22, 11,  2]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.5029,  1.2852, -1.1866,  ...,  1.1823,  2.6875,  0.6750],
        [ 0.4477,  0.0341, -1.3956,  ...,  0.2925, -1.5356,  0.4986],
        [-0.2542,  0.0937, -0.1477,  ..., -0.4718, -0.6570,  1.5790],
        ...,
        [-1.6146,  0.1835,  1.4957,  ..., -0.9297, -0.2312, -0.2897],
        [-1.6146,  0.1835,  1.4957,  ..., -0.9297, -0.2312, -0.2897],
        [-1

PackedSequence(data=tensor([[ 0.8635, -0.6166,  0.6214,  ...,  0.2487, -1.4734, -0.8810],
        [ 0.8635, -0.6166,  0.6214,  ...,  0.2487, -1.4734, -0.8810],
        [ 0.8635, -0.6166,  0.6214,  ...,  0.2487, -1.4734, -0.8810],
        ...,
        [-1.6170,  0.1847,  1.4957,  ..., -0.9318, -0.2333, -0.2927],
        [-1.6170,  0.1847,  1.4957,  ..., -0.9318, -0.2333, -0.2927],
        [-1.6170,  0.1847,  1.4957,  ..., -0.9318, -0.2333, -0.2927]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 54, 39, 23, 10,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8634, -0.6166,  0.6214,  ...,  0.2486, -1.4736, -0.8814],
        [ 0.8634, -0.6166,  0.6214,  ...,  0.2486, -1.4736, -0.8814],
        [ 0.8634, -0.6166,  0.6214,  ...,  0.2486, -1.4736, -0.8814],
        ...,
        [-1.6175,  0.1845,  1.4958,  ..., -0.9318, -0.2333, -0.2925],
        [-1.6175,  0.1845,  1.4958,  ..., -0.9318, -0.2333, -0.2925],
        [-1

PackedSequence(data=tensor([[ 1.0024,  0.7006,  0.4125,  ...,  1.8142,  0.1029,  1.2833],
        [ 0.8630, -0.6165,  0.6215,  ...,  0.2481, -1.4737, -0.8817],
        [ 1.0024,  0.7006,  0.4125,  ...,  1.8142,  0.1029,  1.2833],
        ...,
        [-1.6239,  0.1877,  1.4990,  ..., -0.9318, -0.2279, -0.2865],
        [-1.6239,  0.1877,  1.4990,  ..., -0.9318, -0.2279, -0.2865],
        [-1.6239,  0.1877,  1.4990,  ..., -0.9318, -0.2279, -0.2865]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 55, 44, 30, 13,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8630, -0.6162,  0.6217,  ...,  0.2478, -1.4737, -0.8821],
        [ 0.8630, -0.6162,  0.6217,  ...,  0.2478, -1.4737, -0.8821],
        [ 0.8630, -0.6162,  0.6217,  ...,  0.2478, -1.4737, -0.8821],
        ...,
        [-1.6233,  0.1874,  1.4995,  ..., -0.9319, -0.2280, -0.2860],
        [-1.6233,  0.1874,  1.4995,  ..., -0.9319, -0.2280, -0.2860],
        [-1

PackedSequence(data=tensor([[ 0.4918,  1.2914, -1.1823,  ...,  1.1830,  2.6864,  0.6807],
        [ 0.4918,  1.2914, -1.1823,  ...,  1.1830,  2.6864,  0.6807],
        [ 0.8628, -0.6153,  0.6241,  ...,  0.2459, -1.4704, -0.8815],
        ...,
        [-1.6233,  0.1849,  1.5008,  ..., -0.9319, -0.2321, -0.2842],
        [-1.6233,  0.1849,  1.5008,  ..., -0.9319, -0.2321, -0.2842],
        [-1.6233,  0.1849,  1.5008,  ..., -0.9319, -0.2321, -0.2842]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 59, 48, 35, 26, 13]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.4917,  1.2914, -1.1817,  ...,  1.1832,  2.6861,  0.6808],
        [ 0.7915, -1.1627, -1.8098,  ...,  0.1656, -0.2060, -1.3281],
        [-0.2459,  0.3758, -0.0055,  ...,  0.5702,  2.2481,  0.9415],
        ...,
        [-1.6232,  0.1848,  1.5006,  ..., -0.9320, -0.2325, -0.2843],
        [-1.6232,  0.1848,  1.5006,  ..., -0.9320, -0.2325, -0.2843],
        [-1

PackedSequence(data=tensor([[ 0.4903,  1.2908, -1.1749,  ...,  1.1860,  2.6838,  0.6813],
        [ 0.8709, -0.6202,  0.6222,  ...,  0.2494, -1.4698, -0.8806],
        [ 0.8709, -0.6202,  0.6222,  ...,  0.2494, -1.4698, -0.8806],
        ...,
        [-1.6212,  0.1876,  1.4992,  ..., -0.9364, -0.2350, -0.2855],
        [-1.6212,  0.1876,  1.4992,  ..., -0.9364, -0.2350, -0.2855],
        [-1.6212,  0.1876,  1.4992,  ..., -0.9364, -0.2350, -0.2855]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 54, 44, 22,  9,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8712, -0.6204,  0.6219,  ...,  0.2495, -1.4697, -0.8806],
        [ 0.8712, -0.6204,  0.6219,  ...,  0.2495, -1.4697, -0.8806],
        [ 0.8712, -0.6204,  0.6219,  ...,  0.2495, -1.4697, -0.8806],
        ...,
        [-1.6212,  0.1878,  1.4996,  ..., -0.9368, -0.2349, -0.2855],
        [-1.6212,  0.1878,  1.4996,  ..., -0.9368, -0.2349, -0.2855],
        [-1

PackedSequence(data=tensor([[ 0.8723, -0.6208,  0.6198,  ...,  0.2496, -1.4680, -0.8808],
        [ 0.8723, -0.6208,  0.6198,  ...,  0.2496, -1.4680, -0.8808],
        [ 1.0069,  0.6961,  0.4143,  ...,  1.8137,  0.0981,  1.2776],
        ...,
        [-1.6211,  0.1898,  1.5000,  ..., -0.9375, -0.2344, -0.2868],
        [-1.6211,  0.1898,  1.5000,  ..., -0.9375, -0.2344, -0.2868],
        [-1.6211,  0.1898,  1.5000,  ..., -0.9375, -0.2344, -0.2868]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 56, 42, 31, 20,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.0325, -0.6757,  0.4003,  ...,  0.0743, -0.5790, -0.4898],
        [ 0.8724, -0.6207,  0.6198,  ...,  0.2495, -1.4680, -0.8810],
        [ 0.8724, -0.6207,  0.6198,  ...,  0.2495, -1.4680, -0.8810],
        ...,
        [-1.6209,  0.1896,  1.4999,  ..., -0.9377, -0.2346, -0.2870],
        [-1.6209,  0.1896,  1.4999,  ..., -0.9377, -0.2346, -0.2870],
        [-1

PackedSequence(data=tensor([[ 0.8739, -0.6213,  0.6166,  ...,  0.2492, -1.4677, -0.8826],
        [ 0.8739, -0.6213,  0.6166,  ...,  0.2492, -1.4677, -0.8826],
        [ 0.8739, -0.6213,  0.6166,  ...,  0.2492, -1.4677, -0.8826],
        ...,
        [-1.6247,  0.1893,  1.4982,  ..., -0.9327, -0.2301, -0.2864],
        [-1.6247,  0.1893,  1.4982,  ..., -0.9327, -0.2301, -0.2864],
        [-1.6247,  0.1893,  1.4982,  ..., -0.9327, -0.2301, -0.2864]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 55, 42, 27, 17,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8743, -0.6212,  0.6160,  ...,  0.2492, -1.4672, -0.8827],
        [ 1.0049,  0.6975,  0.4136,  ...,  1.8131,  0.0962,  1.2815],
        [ 0.8743, -0.6212,  0.6160,  ...,  0.2492, -1.4672, -0.8827],
        ...,
        [-1.6249,  0.1894,  1.4988,  ..., -0.9328, -0.2301, -0.2862],
        [-1.6249,  0.1894,  1.4988,  ..., -0.9328, -0.2301, -0.2862],
        [-1

PackedSequence(data=tensor([[ 0.4438,  0.0274, -1.4089,  ...,  0.2840, -1.5427,  0.4994],
        [ 0.8795, -0.6226,  0.6097,  ...,  0.2505, -1.4654, -0.8850],
        [-0.2466,  0.3723, -0.0055,  ...,  0.5734,  2.2462,  0.9455],
        ...,
        [-1.6242,  0.1897,  1.5017,  ..., -0.9347, -0.2306, -0.2833],
        [-1.6242,  0.1897,  1.5017,  ..., -0.9347, -0.2306, -0.2833],
        [-1.6242,  0.1897,  1.5017,  ..., -0.9347, -0.2306, -0.2833]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 53, 39, 31, 15,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.9309,  0.9624, -0.5288,  ...,  3.1014, -1.4733, -0.5595],
        [-0.3029,  1.7950, -0.9611,  ...,  0.3227, -2.0950,  1.5131],
        [ 0.8798, -0.6230,  0.6093,  ...,  0.2507, -1.4654, -0.8854],
        ...,
        [-1.6243,  0.1898,  1.5015,  ..., -0.9344, -0.2305, -0.2832],
        [-1.6243,  0.1898,  1.5015,  ..., -0.9344, -0.2305, -0.2832],
        [-1

PackedSequence(data=tensor([[ 0.8801, -0.6252,  0.6062,  ...,  0.2513, -1.4649, -0.8881],
        [ 0.7963, -1.1673, -1.8151,  ...,  0.1730, -0.2144, -1.3262],
        [ 0.8801, -0.6252,  0.6062,  ...,  0.2513, -1.4649, -0.8881],
        ...,
        [-1.6268,  0.1856,  1.4999,  ..., -0.9337, -0.2318, -0.2822],
        [-1.6268,  0.1856,  1.4999,  ..., -0.9337, -0.2318, -0.2822],
        [-1.6268,  0.1856,  1.4999,  ..., -0.9337, -0.2318, -0.2822]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 53, 39, 31, 18,  9]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8799, -0.6252,  0.6062,  ...,  0.2511, -1.4651, -0.8880],
        [ 0.4447,  0.0249, -1.4110,  ...,  0.2857, -1.5425,  0.5022],
        [ 0.8799, -0.6252,  0.6062,  ...,  0.2511, -1.4651, -0.8880],
        ...,
        [-1.6270,  0.1851,  1.4998,  ..., -0.9343, -0.2325, -0.2824],
        [-1.6270,  0.1851,  1.4998,  ..., -0.9343, -0.2325, -0.2824],
        [-1

PackedSequence(data=tensor([[ 0.8770, -0.6250,  0.6065,  ...,  0.2504, -1.4659, -0.8859],
        [ 0.4464,  0.0250, -1.4103,  ...,  0.2865, -1.5437,  0.5026],
        [ 0.7945, -1.1641, -1.8133,  ...,  0.1701, -0.2151, -1.3195],
        ...,
        [-1.6291,  0.1911,  1.5019,  ..., -0.9366, -0.2328, -0.2825],
        [-1.6291,  0.1911,  1.5019,  ..., -0.9366, -0.2328, -0.2825],
        [-1.6291,  0.1911,  1.5019,  ..., -0.9366, -0.2328, -0.2825]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 52, 40, 28, 14,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0082,  0.6939,  0.4120,  ...,  1.8159,  0.0985,  1.2782],
        [ 1.0082,  0.6939,  0.4120,  ...,  1.8159,  0.0985,  1.2782],
        [ 0.8769, -0.6250,  0.6065,  ...,  0.2504, -1.4660, -0.8856],
        ...,
        [-1.6290,  0.1915,  1.5019,  ..., -0.9361, -0.2323, -0.2820],
        [-1.6290,  0.1915,  1.5019,  ..., -0.9361, -0.2323, -0.2820],
        [-1

PackedSequence(data=tensor([[-0.3021,  1.7983, -0.9608,  ...,  0.3240, -2.0955,  1.5143],
        [-0.3021,  1.7983, -0.9608,  ...,  0.3240, -2.0955,  1.5143],
        [ 0.7942, -1.1612, -1.8095,  ...,  0.1671, -0.2108, -1.3205],
        ...,
        [-1.6247,  0.1853,  1.4993,  ..., -0.9338, -0.2346, -0.2790],
        [-1.6247,  0.1853,  1.4993,  ..., -0.9338, -0.2346, -0.2790],
        [-1.6247,  0.1853,  1.4993,  ..., -0.9338, -0.2346, -0.2790]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 58, 44, 29, 14,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2531,  0.1069, -0.1363,  ..., -0.4726, -0.6649,  1.5826],
        [ 0.8771, -0.6231,  0.6069,  ...,  0.2502, -1.4668, -0.8855],
        [ 0.4470,  0.0238, -1.4124,  ...,  0.2868, -1.5440,  0.5014],
        ...,
        [-1.6244,  0.1850,  1.4993,  ..., -0.9337, -0.2349, -0.2792],
        [-1.6244,  0.1850,  1.4993,  ..., -0.9337, -0.2349, -0.2792],
        [-1

PackedSequence(data=tensor([[ 0.8777, -0.6220,  0.6078,  ...,  0.2499, -1.4662, -0.8892],
        [ 1.0125,  0.6914,  0.4079,  ...,  1.8188,  0.0984,  1.2809],
        [ 0.8777, -0.6220,  0.6078,  ...,  0.2499, -1.4662, -0.8892],
        ...,
        [-1.6277,  0.1887,  1.4996,  ..., -0.9312, -0.2308, -0.2807],
        [-1.6277,  0.1887,  1.4996,  ..., -0.9312, -0.2308, -0.2807],
        [-1.6277,  0.1887,  1.4996,  ..., -0.9312, -0.2308, -0.2807]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 57, 43, 33, 20, 13]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8774, -0.6219,  0.6080,  ...,  0.2499, -1.4660, -0.8894],
        [ 0.8774, -0.6219,  0.6080,  ...,  0.2499, -1.4660, -0.8894],
        [-0.2986,  1.7989, -0.9604,  ...,  0.3248, -2.0982,  1.5123],
        ...,
        [-1.6281,  0.1888,  1.4991,  ..., -0.9311, -0.2307, -0.2811],
        [-1.6281,  0.1888,  1.4991,  ..., -0.9311, -0.2307, -0.2811],
        [-1

PackedSequence(data=tensor([[ 0.8741, -0.6192,  0.6115,  ...,  0.2483, -1.4637, -0.8897],
        [ 0.8741, -0.6192,  0.6115,  ...,  0.2483, -1.4637, -0.8897],
        [ 0.4416,  0.0322, -1.4102,  ...,  0.2790, -1.5470,  0.5024],
        ...,
        [-1.6255,  0.1898,  1.4973,  ..., -0.9341, -0.2342, -0.2804],
        [-1.6255,  0.1898,  1.4973,  ..., -0.9341, -0.2342, -0.2804],
        [-1.6255,  0.1898,  1.4973,  ..., -0.9341, -0.2342, -0.2804]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 53, 43, 26, 15,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8742, -0.6192,  0.6115,  ...,  0.2483, -1.4635, -0.8897],
        [ 1.0109,  0.6936,  0.4095,  ...,  1.8187,  0.0965,  1.2777],
        [ 0.8742, -0.6192,  0.6115,  ...,  0.2483, -1.4635, -0.8897],
        ...,
        [-1.6254,  0.1899,  1.4976,  ..., -0.9345, -0.2348, -0.2803],
        [-1.6254,  0.1899,  1.4976,  ..., -0.9345, -0.2348, -0.2803],
        [-1

PackedSequence(data=tensor([[ 0.8812, -0.6230,  0.6073,  ...,  0.2534, -1.4618, -0.8903],
        [ 1.0120,  0.6919,  0.4055,  ...,  1.8212,  0.0978,  1.2785],
        [ 0.8812, -0.6230,  0.6073,  ...,  0.2534, -1.4618, -0.8903],
        ...,
        [-1.6262,  0.1914,  1.4992,  ..., -0.9376, -0.2328, -0.2788],
        [-1.6262,  0.1914,  1.4992,  ..., -0.9376, -0.2328, -0.2788],
        [-1.6262,  0.1914,  1.4992,  ..., -0.9376, -0.2328, -0.2788]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 55, 44, 35, 22, 10]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8817, -0.6233,  0.6072,  ...,  0.2536, -1.4619, -0.8907],
        [ 0.8057, -1.1615, -1.8161,  ...,  0.1703, -0.2254, -1.3226],
        [ 0.8817, -0.6233,  0.6072,  ...,  0.2536, -1.4619, -0.8907],
        ...,
        [-1.6263,  0.1911,  1.4994,  ..., -0.9377, -0.2329, -0.2784],
        [-1.6263,  0.1911,  1.4994,  ..., -0.9377, -0.2329, -0.2784],
        [-1

PackedSequence(data=tensor([[ 0.8845, -0.6260,  0.6066,  ...,  0.2552, -1.4627, -0.8899],
        [ 0.8845, -0.6260,  0.6066,  ...,  0.2552, -1.4627, -0.8899],
        [ 0.8109, -1.1647, -1.8169,  ...,  0.1770, -0.2244, -1.3272],
        ...,
        [-1.6274,  0.1880,  1.4977,  ..., -0.9369, -0.2334, -0.2788],
        [-1.6274,  0.1880,  1.4977,  ..., -0.9369, -0.2334, -0.2788],
        [-1.6274,  0.1880,  1.4977,  ..., -0.9369, -0.2334, -0.2788]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 56, 40, 29, 15,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8848, -0.6262,  0.6064,  ...,  0.2554, -1.4627, -0.8897],
        [ 0.4432,  0.0278, -1.4107,  ...,  0.2831, -1.5473,  0.5011],
        [ 1.0101,  0.6945,  0.4044,  ...,  1.8206,  0.0989,  1.2861],
        ...,
        [-1.6271,  0.1880,  1.4974,  ..., -0.9368, -0.2332, -0.2790],
        [-1.6271,  0.1880,  1.4974,  ..., -0.9368, -0.2332, -0.2790],
        [-1

PackedSequence(data=tensor([[ 0.8097, -1.1632, -1.8167,  ...,  0.1780, -0.2190, -1.3272],
        [ 1.0100,  0.6942,  0.4090,  ...,  1.8242,  0.1020,  1.2864],
        [ 0.8097, -1.1632, -1.8167,  ...,  0.1780, -0.2190, -1.3272],
        ...,
        [-1.6318,  0.1927,  1.5009,  ..., -0.9371, -0.2287, -0.2754],
        [-1.6318,  0.1927,  1.5009,  ..., -0.9371, -0.2287, -0.2754],
        [-1.6318,  0.1927,  1.5009,  ..., -0.9371, -0.2287, -0.2754]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 49, 32, 22,  9,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2596,  0.1033, -0.1419,  ..., -0.4703, -0.6701,  1.5904],
        [ 0.8843, -0.6240,  0.6106,  ...,  0.2546, -1.4646, -0.8906],
        [ 0.8843, -0.6240,  0.6106,  ...,  0.2546, -1.4646, -0.8906],
        ...,
        [-1.6322,  0.1927,  1.5008,  ..., -0.9377, -0.2294, -0.2756],
        [-1.6322,  0.1927,  1.5008,  ..., -0.9377, -0.2294, -0.2756],
        [-1

PackedSequence(data=tensor([[ 0.4876,  1.2831, -1.1771,  ...,  1.1862,  2.6723,  0.6919],
        [ 1.0084,  0.6950,  0.4075,  ...,  1.8209,  0.0976,  1.2887],
        [ 0.4876,  1.2831, -1.1771,  ...,  1.1862,  2.6723,  0.6919],
        ...,
        [-1.6321,  0.1925,  1.5004,  ..., -0.9375, -0.2299, -0.2752],
        [-1.6321,  0.1925,  1.5004,  ..., -0.9375, -0.2299, -0.2752],
        [-1.6321,  0.1925,  1.5004,  ..., -0.9375, -0.2299, -0.2752]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 51, 37, 25, 11,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8029, -1.1552, -1.8182,  ...,  0.1720, -0.2180, -1.3132],
        [ 0.8794, -0.6210,  0.6166,  ...,  0.2504, -1.4718, -0.8945],
        [ 0.8794, -0.6210,  0.6166,  ...,  0.2504, -1.4718, -0.8945],
        ...,
        [-1.6319,  0.1924,  1.5003,  ..., -0.9369, -0.2295, -0.2751],
        [-1.6319,  0.1924,  1.5003,  ..., -0.9369, -0.2295, -0.2751],
        [-1

PackedSequence(data=tensor([[ 0.8023, -1.1542, -1.8219,  ...,  0.1697, -0.2181, -1.3068],
        [ 0.8781, -0.6197,  0.6161,  ...,  0.2481, -1.4747, -0.8959],
        [ 1.0093,  0.6936,  0.4075,  ...,  1.8204,  0.0943,  1.2894],
        ...,
        [-1.6371,  0.1927,  1.4991,  ..., -0.9330, -0.2289, -0.2761],
        [-1.6371,  0.1927,  1.4991,  ..., -0.9330, -0.2289, -0.2761],
        [-1.6371,  0.1927,  1.4991,  ..., -0.9330, -0.2289, -0.2761]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 58, 43, 31, 14,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.4381,  0.0308, -1.4100,  ...,  0.2856, -1.5415,  0.5050],
        [ 0.8782, -0.6199,  0.6160,  ...,  0.2482, -1.4748, -0.8959],
        [ 0.8782, -0.6199,  0.6160,  ...,  0.2482, -1.4748, -0.8959],
        ...,
        [-1.6371,  0.1925,  1.4987,  ..., -0.9330, -0.2291, -0.2760],
        [-1.6371,  0.1925,  1.4987,  ..., -0.9330, -0.2291, -0.2760],
        [-1

PackedSequence(data=tensor([[ 0.8034, -1.1579, -1.8243,  ...,  0.1697, -0.2186, -1.2992],
        [ 0.8774, -0.6212,  0.6154,  ...,  0.2491, -1.4742, -0.8942],
        [ 0.4887,  1.2797, -1.1766,  ...,  1.1905,  2.6740,  0.6931],
        ...,
        [-1.6370,  0.1916,  1.5002,  ..., -0.9342, -0.2288, -0.2733],
        [-1.6370,  0.1916,  1.5002,  ..., -0.9342, -0.2288, -0.2733],
        [-1.6370,  0.1916,  1.5002,  ..., -0.9342, -0.2288, -0.2733]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 52, 40, 30, 14,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8772, -0.6212,  0.6154,  ...,  0.2493, -1.4742, -0.8942],
        [ 0.8034, -1.1583, -1.8244,  ...,  0.1698, -0.2184, -1.2985],
        [ 0.8772, -0.6212,  0.6154,  ...,  0.2493, -1.4742, -0.8942],
        ...,
        [-1.6372,  0.1912,  1.5003,  ..., -0.9346, -0.2291, -0.2732],
        [-1.6372,  0.1912,  1.5003,  ..., -0.9346, -0.2291, -0.2732],
        [-1

PackedSequence(data=tensor([[ 0.8751, -0.6204,  0.6198,  ...,  0.2484, -1.4750, -0.8965],
        [-0.2584,  0.1044, -0.1395,  ..., -0.4698, -0.6732,  1.5968],
        [ 0.8751, -0.6204,  0.6198,  ...,  0.2484, -1.4750, -0.8965],
        ...,
        [-1.6369,  0.1942,  1.5012,  ..., -0.9391, -0.2296, -0.2750],
        [-1.6369,  0.1942,  1.5012,  ..., -0.9391, -0.2296, -0.2750],
        [-1.6369,  0.1942,  1.5012,  ..., -0.9391, -0.2296, -0.2750]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 50, 35, 22, 12,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0267, -2.0424,  0.2032,  ...,  0.6411,  1.3810,  1.4000],
        [ 0.4341,  0.0281, -1.4172,  ...,  0.2846, -1.5429,  0.5089],
        [-0.2582,  0.1039, -0.1397,  ..., -0.4695, -0.6730,  1.5970],
        ...,
        [-1.6368,  0.1939,  1.5011,  ..., -0.9396, -0.2298, -0.2750],
        [-1.6368,  0.1939,  1.5011,  ..., -0.9396, -0.2298, -0.2750],
        [-1

PackedSequence(data=tensor([[ 0.8751, -0.6179,  0.6233,  ...,  0.2480, -1.4738, -0.8950],
        [-0.2579,  0.0998, -0.1452,  ..., -0.4674, -0.6726,  1.5947],
        [ 0.8751, -0.6179,  0.6233,  ...,  0.2480, -1.4738, -0.8950],
        ...,
        [-1.6403,  0.1969,  1.4999,  ..., -0.9411, -0.2278, -0.2734],
        [-1.6403,  0.1969,  1.4999,  ..., -0.9411, -0.2278, -0.2734],
        [-1.6403,  0.1969,  1.4999,  ..., -0.9411, -0.2278, -0.2734]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 59, 51, 39, 15,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8752, -0.6177,  0.6234,  ...,  0.2480, -1.4737, -0.8946],
        [ 0.8752, -0.6177,  0.6234,  ...,  0.2480, -1.4737, -0.8946],
        [-0.2581,  0.0996, -0.1455,  ..., -0.4674, -0.6725,  1.5946],
        ...,
        [-1.6400,  0.1969,  1.5005,  ..., -0.9413, -0.2279, -0.2729],
        [-1.6400,  0.1969,  1.5005,  ..., -0.9413, -0.2279, -0.2729],
        [-1

PackedSequence(data=tensor([[ 0.8781, -0.6173,  0.6233,  ...,  0.2486, -1.4709, -0.8936],
        [ 0.8781, -0.6173,  0.6233,  ...,  0.2486, -1.4709, -0.8936],
        [-0.2792,  1.7902, -0.9641,  ...,  0.3271, -2.1067,  1.5269],
        ...,
        [-1.6395,  0.1958,  1.5006,  ..., -0.9377, -0.2262, -0.2704],
        [-1.6395,  0.1958,  1.5006,  ..., -0.9377, -0.2262, -0.2704],
        [-1.6395,  0.1958,  1.5006,  ..., -0.9377, -0.2262, -0.2704]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 52, 39, 24, 14,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.4921,  1.2684, -1.1800,  ...,  1.1936,  2.6821,  0.6823],
        [ 0.8781, -0.6181,  0.6225,  ...,  0.2491, -1.4711, -0.8928],
        [ 0.4331,  0.0189, -1.4244,  ...,  0.2897, -1.5367,  0.5069],
        ...,
        [-1.6387,  0.1957,  1.5008,  ..., -0.9379, -0.2262, -0.2707],
        [-1.6387,  0.1957,  1.5008,  ..., -0.9379, -0.2262, -0.2707],
        [-1

PackedSequence(data=tensor([[ 0.8746, -0.6215,  0.6188,  ...,  0.2499, -1.4665, -0.8870],
        [ 0.8746, -0.6215,  0.6188,  ...,  0.2499, -1.4665, -0.8870],
        [ 1.0316, -2.0478,  0.1904,  ...,  0.6433,  1.3808,  1.4021],
        ...,
        [-1.6377,  0.1940,  1.5020,  ..., -0.9416, -0.2310, -0.2752],
        [-1.6377,  0.1940,  1.5020,  ..., -0.9416, -0.2310, -0.2752],
        [-1.6377,  0.1940,  1.5020,  ..., -0.9416, -0.2310, -0.2752]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 53, 40, 30, 15,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0046,  0.6945,  0.4154,  ...,  1.8156,  0.0919,  1.2927],
        [ 0.8741, -0.6213,  0.6190,  ...,  0.2497, -1.4657, -0.8870],
        [ 0.8741, -0.6213,  0.6190,  ...,  0.2497, -1.4657, -0.8870],
        ...,
        [-1.6380,  0.1944,  1.5021,  ..., -0.9415, -0.2306, -0.2753],
        [-1.6380,  0.1944,  1.5021,  ..., -0.9415, -0.2306, -0.2753],
        [-1

PackedSequence(data=tensor([[ 1.0107,  0.6890,  0.4142,  ...,  1.8218,  0.0965,  1.2910],
        [ 0.4296,  0.0218, -1.4173,  ...,  0.2906, -1.5356,  0.4973],
        [ 0.8710, -0.6191,  0.6213,  ...,  0.2469, -1.4615, -0.8901],
        ...,
        [-1.6402,  0.1953,  1.5000,  ..., -0.9424, -0.2282, -0.2712],
        [-1.6402,  0.1953,  1.5000,  ..., -0.9424, -0.2282, -0.2712],
        [-1.6402,  0.1953,  1.5000,  ..., -0.9424, -0.2282, -0.2712]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 55, 46, 29, 16,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8710, -0.6191,  0.6212,  ...,  0.2468, -1.4614, -0.8901],
        [ 0.8710, -0.6191,  0.6212,  ...,  0.2468, -1.4614, -0.8901],
        [ 0.8710, -0.6191,  0.6212,  ...,  0.2468, -1.4614, -0.8901],
        ...,
        [-1.6406,  0.1955,  1.4998,  ..., -0.9421, -0.2283, -0.2710],
        [-1.6406,  0.1955,  1.4998,  ..., -0.9421, -0.2283, -0.2710],
        [-1

PackedSequence(data=tensor([[ 0.8098, -1.1696, -1.8161,  ...,  0.1734, -0.2226, -1.2936],
        [ 0.8764, -0.6184,  0.6191,  ...,  0.2465, -1.4646, -0.8832],
        [ 0.4277,  0.0225, -1.4177,  ...,  0.2916, -1.5364,  0.4954],
        ...,
        [-1.6412,  0.1952,  1.4996,  ..., -0.9426, -0.2305, -0.2704],
        [-1.6412,  0.1952,  1.4996,  ..., -0.9426, -0.2305, -0.2704],
        [-1.6412,  0.1952,  1.4996,  ..., -0.9426, -0.2305, -0.2704]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 48, 36, 29, 16,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2599,  0.1008, -0.1436,  ..., -0.4657, -0.6685,  1.5950],
        [ 0.8767, -0.6183,  0.6188,  ...,  0.2465, -1.4647, -0.8829],
        [-0.6079,  0.6620, -0.4914,  ...,  0.0347,  0.0946, -0.3007],
        ...,
        [-1.6409,  0.1955,  1.5000,  ..., -0.9429, -0.2303, -0.2705],
        [-1.6409,  0.1955,  1.5000,  ..., -0.9429, -0.2303, -0.2705],
        [-1

PackedSequence(data=tensor([[ 0.8775, -0.6185,  0.6164,  ...,  0.2458, -1.4633, -0.8828],
        [ 0.8775, -0.6185,  0.6164,  ...,  0.2458, -1.4633, -0.8828],
        [ 0.4242,  0.0234, -1.4190,  ...,  0.2912, -1.5343,  0.4943],
        ...,
        [-1.6397,  0.1961,  1.5032,  ..., -0.9429, -0.2275, -0.2689],
        [-1.6397,  0.1961,  1.5032,  ..., -0.9429, -0.2275, -0.2689],
        [-1.6397,  0.1961,  1.5032,  ..., -0.9429, -0.2275, -0.2689]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 53, 37, 30, 14,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8093, -1.1702, -1.8168,  ...,  0.1746, -0.2227, -1.2942],
        [-0.2772,  1.7940, -0.9675,  ...,  0.3290, -2.1088,  1.5269],
        [ 0.8093, -1.1702, -1.8168,  ...,  0.1746, -0.2227, -1.2942],
        ...,
        [-1.6402,  0.1960,  1.5033,  ..., -0.9430, -0.2275, -0.2685],
        [-1.6402,  0.1960,  1.5033,  ..., -0.9430, -0.2275, -0.2685],
        [-1

PackedSequence(data=tensor([[-0.2754,  1.7962, -0.9669,  ...,  0.3280, -2.1132,  1.5215],
        [ 0.8091, -1.1678, -1.8186,  ...,  0.1750, -0.2232, -1.2923],
        [ 0.8754, -0.6185,  0.6155,  ...,  0.2466, -1.4615, -0.8825],
        ...,
        [-1.6401,  0.1939,  1.5029,  ..., -0.9437, -0.2323, -0.2672],
        [-1.6401,  0.1939,  1.5029,  ..., -0.9437, -0.2323, -0.2672],
        [-1.6401,  0.1939,  1.5029,  ..., -0.9437, -0.2323, -0.2672]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 54, 36, 27, 10,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8752, -0.6186,  0.6156,  ...,  0.2466, -1.4615, -0.8825],
        [ 0.8092, -1.1675, -1.8187,  ...,  0.1749, -0.2233, -1.2922],
        [ 0.4203, -0.3666,  1.5275,  ...,  1.4385,  0.0301,  0.4584],
        ...,
        [-1.6402,  0.1937,  1.5032,  ..., -0.9437, -0.2324, -0.2667],
        [-1.6402,  0.1937,  1.5032,  ..., -0.9437, -0.2324, -0.2667],
        [-1

PackedSequence(data=tensor([[ 0.8736, -0.6191,  0.6173,  ...,  0.2458, -1.4639, -0.8826],
        [-0.2445,  0.1013, -0.1287,  ..., -0.4633, -0.6709,  1.5939],
        [ 0.8736, -0.6191,  0.6173,  ...,  0.2458, -1.4639, -0.8826],
        ...,
        [-1.6431,  0.1950,  1.5026,  ..., -0.9423, -0.2285, -0.2650],
        [-1.6431,  0.1950,  1.5026,  ..., -0.9423, -0.2285, -0.2650],
        [-1.6431,  0.1950,  1.5026,  ..., -0.9423, -0.2285, -0.2650]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 54, 39, 31, 16, 10]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8735, -0.6191,  0.6174,  ...,  0.2456, -1.4642, -0.8827],
        [ 0.8088, -1.1659, -1.8202,  ...,  0.1747, -0.2232, -1.2914],
        [ 0.8735, -0.6191,  0.6174,  ...,  0.2456, -1.4642, -0.8827],
        ...,
        [-1.6432,  0.1950,  1.5029,  ..., -0.9423, -0.2283, -0.2646],
        [-1.6432,  0.1950,  1.5029,  ..., -0.9423, -0.2283, -0.2646],
        [-1

PackedSequence(data=tensor([[ 0.8759, -0.6189,  0.6174,  ...,  0.2452, -1.4651, -0.8842],
        [ 0.8759, -0.6189,  0.6174,  ...,  0.2452, -1.4651, -0.8842],
        [-0.2440,  0.0982, -0.1298,  ..., -0.4600, -0.6660,  1.5955],
        ...,
        [-1.6462,  0.1959,  1.5035,  ..., -0.9439, -0.2300, -0.2645],
        [-1.6462,  0.1959,  1.5035,  ..., -0.9439, -0.2300, -0.2645],
        [-1.6462,  0.1959,  1.5035,  ..., -0.9439, -0.2300, -0.2645]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 51, 40, 31, 17,  9]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8765, -0.6189,  0.6170,  ...,  0.2453, -1.4649, -0.8838],
        [ 1.0113,  0.6874,  0.4048,  ...,  1.8209,  0.1004,  1.3010],
        [ 0.8765, -0.6189,  0.6170,  ...,  0.2453, -1.4649, -0.8838],
        ...,
        [-1.6463,  0.1960,  1.5034,  ..., -0.9439, -0.2303, -0.2647],
        [-1.6463,  0.1960,  1.5034,  ..., -0.9439, -0.2303, -0.2647],
        [-1

PackedSequence(data=tensor([[ 0.8777, -0.6167,  0.6100,  ...,  0.2425, -1.4700, -0.8845],
        [-0.2405,  0.3666, -0.0177,  ...,  0.5917,  2.2300,  0.9780],
        [ 1.0123,  0.6862,  0.4056,  ...,  1.8208,  0.0989,  1.3009],
        ...,
        [-1.6466,  0.1947,  1.5013,  ..., -0.9442, -0.2286, -0.2642],
        [-1.6466,  0.1947,  1.5013,  ..., -0.9442, -0.2286, -0.2642],
        [-1.6466,  0.1947,  1.5013,  ..., -0.9442, -0.2286, -0.2642]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 57, 40, 26,  9,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0304, -2.0464,  0.1837,  ...,  0.6471,  1.3843,  1.4041],
        [ 0.8776, -0.6165,  0.6098,  ...,  0.2422, -1.4703, -0.8846],
        [ 0.8776, -0.6165,  0.6098,  ...,  0.2422, -1.4703, -0.8846],
        ...,
        [-1.6463,  0.1947,  1.5014,  ..., -0.9441, -0.2285, -0.2640],
        [-1.6463,  0.1947,  1.5014,  ..., -0.9441, -0.2285, -0.2640],
        [-1

PackedSequence(data=tensor([[ 0.8770, -0.6164,  0.6079,  ...,  0.2404, -1.4726, -0.8852],
        [ 0.8770, -0.6164,  0.6079,  ...,  0.2404, -1.4726, -0.8852],
        [ 0.4920,  1.2654, -1.1866,  ...,  1.1923,  2.6700,  0.6789],
        ...,
        [-1.6455,  0.1937,  1.5029,  ..., -0.9432, -0.2280, -0.2620],
        [-1.6455,  0.1937,  1.5029,  ..., -0.9432, -0.2280, -0.2620],
        [-1.6455,  0.1937,  1.5029,  ..., -0.9432, -0.2280, -0.2620]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 49, 37, 24,  9,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2423,  0.1031, -0.1382,  ..., -0.4657, -0.6700,  1.5960],
        [-0.2423,  0.1031, -0.1382,  ..., -0.4657, -0.6700,  1.5960],
        [ 1.0302, -2.0443,  0.1801,  ...,  0.6482,  1.3875,  1.4054],
        ...,
        [-1.6456,  0.1933,  1.5029,  ..., -0.9434, -0.2278, -0.2616],
        [-1.6456,  0.1933,  1.5029,  ..., -0.9434, -0.2278, -0.2616],
        [-1

PackedSequence(data=tensor([[-0.3108, -0.7713, -1.8332,  ..., -0.9178, -0.9134,  0.0487],
        [ 0.8760, -0.6172,  0.6088,  ...,  0.2413, -1.4718, -0.8850],
        [ 0.8760, -0.6172,  0.6088,  ...,  0.2413, -1.4718, -0.8850],
        ...,
        [-1.6480,  0.1980,  1.5004,  ..., -0.9425, -0.2271, -0.2648],
        [-1.6480,  0.1980,  1.5004,  ..., -0.9425, -0.2271, -0.2648],
        [-1.6480,  0.1980,  1.5004,  ..., -0.9425, -0.2271, -0.2648]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 57, 35, 24, 12,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0149,  0.6864,  0.4073,  ...,  1.8230,  0.0953,  1.3005],
        [ 0.8758, -0.6172,  0.6090,  ...,  0.2413, -1.4715, -0.8851],
        [ 0.4872,  1.2677, -1.1795,  ...,  1.1896,  2.6736,  0.6799],
        ...,
        [-1.6480,  0.1982,  1.5004,  ..., -0.9426, -0.2274, -0.2653],
        [-1.6480,  0.1982,  1.5004,  ..., -0.9426, -0.2274, -0.2653],
        [-1

PackedSequence(data=tensor([[ 0.8037, -1.1666, -1.8202,  ...,  0.1710, -0.2173, -1.2859],
        [ 0.4241,  0.0343, -1.4354,  ...,  0.2938, -1.5379,  0.5144],
        [ 0.8770, -0.6177,  0.6097,  ...,  0.2417, -1.4672, -0.8871],
        ...,
        [-1.6482,  0.1978,  1.5015,  ..., -0.9434, -0.2287, -0.2657],
        [-1.6482,  0.1978,  1.5015,  ..., -0.9434, -0.2287, -0.2657],
        [-1.6482,  0.1978,  1.5015,  ..., -0.9434, -0.2287, -0.2657]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 56, 40, 24, 15,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2460,  0.1055, -0.1428,  ..., -0.4669, -0.6691,  1.5991],
        [ 0.4243,  0.0343, -1.4355,  ...,  0.2941, -1.5382,  0.5144],
        [ 0.8771, -0.6176,  0.6097,  ...,  0.2417, -1.4664, -0.8876],
        ...,
        [-1.6484,  0.1982,  1.5019,  ..., -0.9433, -0.2287, -0.2661],
        [-1.6484,  0.1982,  1.5019,  ..., -0.9433, -0.2287, -0.2661],
        [-1

PackedSequence(data=tensor([[ 0.8781, -0.6183,  0.6082,  ...,  0.2412, -1.4614, -0.8917],
        [ 0.8781, -0.6183,  0.6082,  ...,  0.2412, -1.4614, -0.8917],
        [ 0.4896,  1.2653, -1.1779,  ...,  1.1873,  2.6813,  0.6774],
        ...,
        [-1.6471,  0.1972,  1.5047,  ..., -0.9433, -0.2274, -0.2663],
        [-1.6471,  0.1972,  1.5047,  ..., -0.9433, -0.2274, -0.2663],
        [-1.6471,  0.1972,  1.5047,  ..., -0.9433, -0.2274, -0.2663]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 55, 40, 28, 13,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8064, -1.1697, -1.8228,  ...,  0.1717, -0.2159, -1.2846],
        [ 0.8778, -0.6183,  0.6081,  ...,  0.2411, -1.4610, -0.8919],
        [-0.2485,  0.1029, -0.1422,  ..., -0.4649, -0.6702,  1.6008],
        ...,
        [-1.6471,  0.1977,  1.5049,  ..., -0.9432, -0.2275, -0.2663],
        [-1.6471,  0.1977,  1.5049,  ..., -0.9432, -0.2275, -0.2663],
        [-1

PackedSequence(data=tensor([[ 0.4195,  0.0358, -1.4354,  ...,  0.2930, -1.5398,  0.5141],
        [ 0.8752, -0.6163,  0.6094,  ...,  0.2382, -1.4580, -0.8897],
        [ 0.8752, -0.6163,  0.6094,  ...,  0.2382, -1.4580, -0.8897],
        ...,
        [-1.6468,  0.1964,  1.5046,  ..., -0.9469, -0.2279, -0.2651],
        [-1.6468,  0.1964,  1.5046,  ..., -0.9469, -0.2279, -0.2651],
        [-1.6468,  0.1964,  1.5046,  ..., -0.9469, -0.2279, -0.2651]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 55, 36, 27, 18,  9]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8752, -0.6162,  0.6097,  ...,  0.2380, -1.4579, -0.8894],
        [ 1.0129,  0.6854,  0.4092,  ...,  1.8266,  0.0963,  1.2975],
        [ 0.4196,  0.0361, -1.4355,  ...,  0.2928, -1.5402,  0.5141],
        ...,
        [-1.6467,  0.1969,  1.5047,  ..., -0.9471, -0.2277, -0.2653],
        [-1.6467,  0.1969,  1.5047,  ..., -0.9471, -0.2277, -0.2653],
        [-1

PackedSequence(data=tensor([[ 0.8744, -0.6180,  0.6120,  ...,  0.2377, -1.4562, -0.8859],
        [ 0.4875,  1.2677, -1.1727,  ...,  1.1795,  2.6795,  0.6784],
        [ 0.8744, -0.6180,  0.6120,  ...,  0.2377, -1.4562, -0.8859],
        ...,
        [-1.6485,  0.1963,  1.5051,  ..., -0.9482, -0.2304, -0.2652],
        [-1.6485,  0.1963,  1.5051,  ..., -0.9482, -0.2304, -0.2652],
        [-1.6485,  0.1963,  1.5051,  ..., -0.9482, -0.2304, -0.2652]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 52, 40, 27, 13,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8745, -0.6184,  0.6120,  ...,  0.2380, -1.4559, -0.8856],
        [-0.2502,  0.1009, -0.1416,  ..., -0.4659, -0.6742,  1.6111],
        [ 0.8745, -0.6184,  0.6120,  ...,  0.2380, -1.4559, -0.8856],
        ...,
        [-1.6488,  0.1959,  1.5053,  ..., -0.9487, -0.2306, -0.2655],
        [-1.6488,  0.1959,  1.5053,  ..., -0.9487, -0.2306, -0.2655],
        [-1

PackedSequence(data=tensor([[ 1.0262, -2.0399,  0.1758,  ...,  0.6464,  1.3862,  1.4031],
        [ 1.0102,  0.6866,  0.4077,  ...,  1.8266,  0.0977,  1.2987],
        [ 1.0262, -2.0399,  0.1758,  ...,  0.6464,  1.3862,  1.4031],
        ...,
        [-1.6531,  0.1952,  1.5078,  ..., -0.9481, -0.2248, -0.2650],
        [-1.6531,  0.1952,  1.5078,  ..., -0.9481, -0.2248, -0.2650],
        [-1.6531,  0.1952,  1.5078,  ..., -0.9481, -0.2248, -0.2650]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 52, 37, 23, 10,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8746, -0.6215,  0.6106,  ...,  0.2416, -1.4555, -0.8832],
        [ 0.8045, -1.1643, -1.8210,  ...,  0.1635, -0.2217, -1.2993],
        [ 0.8045, -1.1643, -1.8210,  ...,  0.1635, -0.2217, -1.2993],
        ...,
        [-1.6529,  0.1949,  1.5080,  ..., -0.9479, -0.2245, -0.2648],
        [-1.6529,  0.1949,  1.5080,  ..., -0.9479, -0.2245, -0.2648],
        [-1

PackedSequence(data=tensor([[-0.2433,  0.0983, -0.1464,  ..., -0.4618, -0.6749,  1.6053],
        [-0.6021,  0.6520, -0.4811,  ...,  0.0377,  0.0862, -0.2917],
        [ 0.8043, -1.1651, -1.8217,  ...,  0.1638, -0.2216, -1.3001],
        ...,
        [-1.6540,  0.1974,  1.5053,  ..., -0.9465, -0.2250, -0.2667],
        [-1.6540,  0.1974,  1.5053,  ..., -0.9465, -0.2250, -0.2667],
        [-1.6540,  0.1974,  1.5053,  ..., -0.9465, -0.2250, -0.2667]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 50, 33, 24, 15,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8752, -0.6209,  0.6111,  ...,  0.2404, -1.4529, -0.8848],
        [ 1.0087,  0.6866,  0.4083,  ...,  1.8219,  0.0993,  1.3005],
        [-0.2433,  0.0982, -0.1466,  ..., -0.4616, -0.6749,  1.6053],
        ...,
        [-1.6539,  0.1975,  1.5053,  ..., -0.9467, -0.2253, -0.2668],
        [-1.6539,  0.1975,  1.5053,  ..., -0.9467, -0.2253, -0.2668],
        [-1

PackedSequence(data=tensor([[ 0.4128,  0.0435, -1.4360,  ...,  0.2899, -1.5463,  0.5163],
        [ 1.0073,  0.6867,  0.4095,  ...,  1.8185,  0.1000,  1.2986],
        [ 0.8739, -0.6206,  0.6102,  ...,  0.2405, -1.4521, -0.8860],
        ...,
        [-1.6537,  0.1978,  1.5043,  ..., -0.9456, -0.2270, -0.2674],
        [-1.6537,  0.1978,  1.5043,  ..., -0.9456, -0.2270, -0.2674],
        [-1.6537,  0.1978,  1.5043,  ..., -0.9456, -0.2270, -0.2674]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 54, 42, 23, 10,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0069,  0.6869,  0.4096,  ...,  1.8179,  0.1001,  1.2986],
        [ 0.8741, -0.6205,  0.6102,  ...,  0.2403, -1.4524, -0.8863],
        [-0.2756,  1.7947, -0.9808,  ...,  0.3442, -2.1110,  1.5389],
        ...,
        [-1.6533,  0.1977,  1.5047,  ..., -0.9456, -0.2270, -0.2675],
        [-1.6533,  0.1977,  1.5047,  ..., -0.9456, -0.2270, -0.2675],
        [-1

PackedSequence(data=tensor([[ 0.4272, -0.1568, -1.3743,  ...,  1.2780, -0.0840,  0.4361],
        [ 1.0262, -2.0346,  0.1681,  ...,  0.6484,  1.3955,  1.4049],
        [-0.2418,  0.0977, -0.1441,  ..., -0.4626, -0.6699,  1.6016],
        ...,
        [-1.6524,  0.1953,  1.5080,  ..., -0.9472, -0.2274, -0.2659],
        [-1.6524,  0.1953,  1.5080,  ..., -0.9472, -0.2274, -0.2659],
        [-1.6524,  0.1953,  1.5080,  ..., -0.9472, -0.2274, -0.2659]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 53, 35, 25, 14,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.4099,  0.0444, -1.4338,  ...,  0.2893, -1.5386,  0.5165],
        [ 0.8743, -0.6217,  0.6096,  ...,  0.2423, -1.4543, -0.8868],
        [-0.2418,  0.0975, -0.1443,  ..., -0.4626, -0.6698,  1.6014],
        ...,
        [-1.6525,  0.1953,  1.5082,  ..., -0.9468, -0.2272, -0.2658],
        [-1.6525,  0.1953,  1.5082,  ..., -0.9468, -0.2272, -0.2658],
        [-1

PackedSequence(data=tensor([[ 1.0069,  0.6850,  0.4060,  ...,  1.8209,  0.1053,  1.2969],
        [-0.2743,  1.7928, -0.9793,  ...,  0.3480, -2.1103,  1.5484],
        [ 0.8741, -0.6220,  0.6106,  ...,  0.2415, -1.4563, -0.8894],
        ...,
        [-1.6533,  0.1950,  1.5069,  ..., -0.9473, -0.2264, -0.2670],
        [-1.6533,  0.1950,  1.5069,  ..., -0.9473, -0.2264, -0.2670],
        [-1.6533,  0.1950,  1.5069,  ..., -0.9473, -0.2264, -0.2670]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 57, 48, 36, 19,  9]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8741, -0.6219,  0.6108,  ...,  0.2413, -1.4565, -0.8899],
        [ 0.8741, -0.6219,  0.6108,  ...,  0.2413, -1.4565, -0.8899],
        [ 0.4137,  0.0399, -1.4356,  ...,  0.2944, -1.5366,  0.5179],
        ...,
        [-1.6529,  0.1950,  1.5071,  ..., -0.9470, -0.2264, -0.2670],
        [-1.6529,  0.1950,  1.5071,  ..., -0.9470, -0.2264, -0.2670],
        [-1

PackedSequence(data=tensor([[ 0.8747, -0.6212,  0.6127,  ...,  0.2388, -1.4580, -0.8928],
        [ 0.4140,  0.0376, -1.4374,  ...,  0.2967, -1.5356,  0.5191],
        [ 0.4140,  0.0376, -1.4374,  ...,  0.2967, -1.5356,  0.5191],
        ...,
        [-1.6524,  0.1932,  1.5092,  ..., -0.9457, -0.2262, -0.2633],
        [-1.6524,  0.1932,  1.5092,  ..., -0.9457, -0.2262, -0.2633],
        [-1.6524,  0.1932,  1.5092,  ..., -0.9457, -0.2262, -0.2633]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 53, 47, 37, 19,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2383,  0.1013, -0.1413,  ..., -0.4651, -0.6707,  1.6012],
        [ 0.8748, -0.6211,  0.6127,  ...,  0.2386, -1.4581, -0.8930],
        [ 0.8748, -0.6211,  0.6127,  ...,  0.2386, -1.4581, -0.8930],
        ...,
        [-1.6523,  0.1935,  1.5092,  ..., -0.9457, -0.2263, -0.2634],
        [-1.6523,  0.1935,  1.5092,  ..., -0.9457, -0.2263, -0.2634],
        [-1

PackedSequence(data=tensor([[ 1.0108,  0.6819,  0.4029,  ...,  1.8241,  0.1052,  1.2967],
        [ 1.0194, -2.0350,  0.1745,  ...,  0.6475,  1.4000,  1.3947],
        [ 0.8038, -1.1643, -1.8219,  ...,  0.1633, -0.2217, -1.3113],
        ...,
        [-1.6556,  0.1930,  1.5074,  ..., -0.9492, -0.2274, -0.2623],
        [-1.6556,  0.1930,  1.5074,  ..., -0.9492, -0.2274, -0.2623],
        [-1.6556,  0.1930,  1.5074,  ..., -0.9492, -0.2274, -0.2623]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 52, 39, 26, 13,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.4136,  0.0371, -1.4382,  ...,  0.2990, -1.5356,  0.5194],
        [ 0.8037, -1.1642, -1.8218,  ...,  0.1634, -0.2215, -1.3118],
        [ 0.8781, -0.6205,  0.6147,  ...,  0.2406, -1.4590, -0.8932],
        ...,
        [-1.6549,  0.1932,  1.5076,  ..., -0.9492, -0.2272, -0.2621],
        [-1.6549,  0.1932,  1.5076,  ..., -0.9492, -0.2272, -0.2621],
        [-1

PackedSequence(data=tensor([[-0.2768,  1.7973, -0.9714,  ...,  0.3404, -2.1132,  1.5472],
        [-0.2422,  0.0991, -0.1390,  ..., -0.4625, -0.6752,  1.5994],
        [ 0.0754, -0.7565,  0.2372,  ...,  1.1373, -0.6393,  0.9290],
        ...,
        [-1.6556,  0.1957,  1.5079,  ..., -0.9524, -0.2258, -0.2635],
        [-1.6556,  0.1957,  1.5079,  ..., -0.9524, -0.2258, -0.2635],
        [-1.6556,  0.1957,  1.5079,  ..., -0.9524, -0.2258, -0.2635]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 48, 36, 25, 13,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8806, -0.6205,  0.6111,  ...,  0.2417, -1.4585, -0.8921],
        [-1.4475,  0.7657, -1.7930,  ..., -0.4504,  0.0627,  0.1834],
        [ 0.8806, -0.6205,  0.6111,  ...,  0.2417, -1.4585, -0.8921],
        ...,
        [-1.6559,  0.1955,  1.5079,  ..., -0.9523, -0.2258, -0.2634],
        [-1.6559,  0.1955,  1.5079,  ..., -0.9523, -0.2258, -0.2634],
        [-1

PackedSequence(data=tensor([[-0.2412,  0.0975, -0.1389,  ..., -0.4643, -0.6792,  1.5957],
        [ 0.4875,  1.2848, -1.1673,  ...,  1.1807,  2.6704,  0.6951],
        [ 0.8059, -1.1668, -1.8233,  ...,  0.1660, -0.2149, -1.3102],
        ...,
        [-1.6580,  0.1959,  1.5090,  ..., -0.9498, -0.2294, -0.2642],
        [-1.6580,  0.1959,  1.5090,  ..., -0.9498, -0.2294, -0.2642],
        [-1.6580,  0.1959,  1.5090,  ..., -0.9498, -0.2294, -0.2642]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 52, 39, 26, 13,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8060, -1.1669, -1.8232,  ...,  0.1660, -0.2147, -1.3099],
        [-1.8446, -1.3517, -0.7537,  ...,  0.5774,  0.0259,  0.7042],
        [ 0.8797, -0.6206,  0.6081,  ...,  0.2416, -1.4551, -0.8899],
        ...,
        [-1.6581,  0.1954,  1.5088,  ..., -0.9500, -0.2293, -0.2640],
        [-1.6581,  0.1954,  1.5088,  ..., -0.9500, -0.2293, -0.2640],
        [-1

PackedSequence(data=tensor([[-0.2407,  0.0976, -0.1407,  ..., -0.4647, -0.6807,  1.5951],
        [ 0.4127,  0.0441, -1.4348,  ...,  0.2971, -1.5357,  0.5154],
        [ 1.0145,  0.6839,  0.3995,  ...,  1.8278,  0.1004,  1.3122],
        ...,
        [-1.6579,  0.1956,  1.5094,  ..., -0.9464, -0.2250, -0.2593],
        [-1.6579,  0.1956,  1.5094,  ..., -0.9464, -0.2250, -0.2593],
        [-1.6579,  0.1956,  1.5094,  ..., -0.9464, -0.2250, -0.2593]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 55, 43, 30, 15,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.6060,  0.6489, -0.4769,  ...,  0.0287,  0.0808, -0.2917],
        [ 0.8805, -0.6227,  0.6078,  ...,  0.2437, -1.4560, -0.8890],
        [ 0.8805, -0.6227,  0.6078,  ...,  0.2437, -1.4560, -0.8890],
        ...,
        [-1.6580,  0.1956,  1.5096,  ..., -0.9464, -0.2246, -0.2590],
        [-1.6580,  0.1956,  1.5096,  ..., -0.9464, -0.2246, -0.2590],
        [-1

PackedSequence(data=tensor([[ 0.8032, -1.1653, -1.8166,  ...,  0.1603, -0.2216, -1.3065],
        [ 0.8737, -0.6214,  0.6094,  ...,  0.2427, -1.4545, -0.8903],
        [ 0.8032, -1.1653, -1.8166,  ...,  0.1603, -0.2216, -1.3065],
        ...,
        [-1.6583,  0.1948,  1.5085,  ..., -0.9476, -0.2252, -0.2597],
        [-1.6583,  0.1948,  1.5085,  ..., -0.9476, -0.2252, -0.2597],
        [-1.6583,  0.1948,  1.5085,  ..., -0.9476, -0.2252, -0.2597]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 55, 37, 26,  9,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8734, -0.6212,  0.6097,  ...,  0.2426, -1.4544, -0.8904],
        [ 1.0234, -2.0397,  0.1738,  ...,  0.6497,  1.3969,  1.3909],
        [ 0.8734, -0.6212,  0.6097,  ...,  0.2426, -1.4544, -0.8904],
        ...,
        [-1.6585,  0.1947,  1.5083,  ..., -0.9474, -0.2251, -0.2594],
        [-1.6585,  0.1947,  1.5083,  ..., -0.9474, -0.2251, -0.2594],
        [-1

PackedSequence(data=tensor([[ 0.8722, -0.6197,  0.6124,  ...,  0.2417, -1.4542, -0.8916],
        [ 0.4110,  0.0435, -1.4340,  ...,  0.2985, -1.5348,  0.5191],
        [ 0.8722, -0.6197,  0.6124,  ...,  0.2417, -1.4542, -0.8916],
        ...,
        [-1.6622,  0.1960,  1.5058,  ..., -0.9451, -0.2236, -0.2590],
        [-1.6622,  0.1960,  1.5058,  ..., -0.9451, -0.2236, -0.2590],
        [-1.6622,  0.1960,  1.5058,  ..., -0.9451, -0.2236, -0.2590]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 52, 39, 27, 12,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.4881,  1.2851, -1.1704,  ...,  1.1846,  2.6669,  0.6988],
        [ 1.0247,  0.6754,  0.3906,  ...,  1.8385,  0.1104,  1.3060],
        [ 0.8724, -0.6197,  0.6124,  ...,  0.2418, -1.4542, -0.8917],
        ...,
        [-1.6622,  0.1963,  1.5057,  ..., -0.9452, -0.2236, -0.2594],
        [-1.6622,  0.1963,  1.5057,  ..., -0.9452, -0.2236, -0.2594],
        [-1

PackedSequence(data=tensor([[ 1.0181,  0.6817,  0.3949,  ...,  1.8346,  0.1066,  1.3033],
        [ 1.0181,  0.6817,  0.3949,  ...,  1.8346,  0.1066,  1.3033],
        [ 0.8042, -1.1698, -1.8134,  ...,  0.1655, -0.2192, -1.3013],
        ...,
        [-1.6592,  0.1952,  1.5085,  ..., -0.9476, -0.2260, -0.2627],
        [-1.6592,  0.1952,  1.5085,  ..., -0.9476, -0.2260, -0.2627],
        [-1.6592,  0.1952,  1.5085,  ..., -0.9476, -0.2260, -0.2627]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 51, 39, 27, 13,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8721, -0.6219,  0.6134,  ...,  0.2437, -1.4521, -0.8912],
        [ 0.8721, -0.6219,  0.6134,  ...,  0.2437, -1.4521, -0.8912],
        [ 0.8721, -0.6219,  0.6134,  ...,  0.2437, -1.4521, -0.8912],
        ...,
        [-1.6592,  0.1946,  1.5085,  ..., -0.9478, -0.2262, -0.2624],
        [-1.6592,  0.1946,  1.5085,  ..., -0.9478, -0.2262, -0.2624],
        [-1

PackedSequence(data=tensor([[ 0.8121, -1.1757, -1.8096,  ...,  0.1716, -0.2300, -1.2947],
        [-0.0338, -0.6806,  0.3904,  ...,  0.0997, -0.5510, -0.4661],
        [ 0.8767, -0.6235,  0.6155,  ...,  0.2453, -1.4497, -0.8869],
        ...,
        [-1.6604,  0.1979,  1.5099,  ..., -0.9469, -0.2257, -0.2630],
        [-1.6604,  0.1979,  1.5099,  ..., -0.9469, -0.2257, -0.2630],
        [-1.6604,  0.1979,  1.5099,  ..., -0.9469, -0.2257, -0.2630]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 58, 44, 24, 11,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0169,  0.6836,  0.3934,  ...,  1.8314,  0.1034,  1.3040],
        [ 0.4841,  1.2860, -1.1693,  ...,  1.1787,  2.6609,  0.6971],
        [ 0.8770, -0.6236,  0.6155,  ...,  0.2456, -1.4495, -0.8866],
        ...,
        [-1.6601,  0.1978,  1.5103,  ..., -0.9469, -0.2255, -0.2622],
        [-1.6601,  0.1978,  1.5103,  ..., -0.9469, -0.2255, -0.2622],
        [-1

PackedSequence(data=tensor([[-0.2347,  0.0975, -0.1413,  ..., -0.4651, -0.6762,  1.6048],
        [ 0.8763, -0.6259,  0.6180,  ...,  0.2480, -1.4473, -0.8832],
        [ 0.8763, -0.6259,  0.6180,  ...,  0.2480, -1.4473, -0.8832],
        ...,
        [-1.6555,  0.1977,  1.5082,  ..., -0.9483, -0.2255, -0.2593],
        [-1.6555,  0.1977,  1.5082,  ..., -0.9483, -0.2255, -0.2593],
        [-1.6555,  0.1977,  1.5082,  ..., -0.9483, -0.2255, -0.2593]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 52, 35, 24, 14,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8199, -1.1807, -1.8082,  ...,  0.1770, -0.2360, -1.2938],
        [ 1.0185,  0.6803,  0.3918,  ...,  1.8324,  0.1078,  1.3047],
        [ 1.0263, -2.0376,  0.1707,  ...,  0.6512,  1.4030,  1.3972],
        ...,
        [-1.6552,  0.1975,  1.5081,  ..., -0.9482, -0.2257, -0.2593],
        [-1.6552,  0.1975,  1.5081,  ..., -0.9482, -0.2257, -0.2593],
        [-1

PackedSequence(data=tensor([[ 0.8750, -0.6272,  0.6149,  ...,  0.2471, -1.4476, -0.8832],
        [ 0.8750, -0.6272,  0.6149,  ...,  0.2471, -1.4476, -0.8832],
        [-1.8396, -1.3499, -0.7624,  ...,  0.5824,  0.0346,  0.6973],
        ...,
        [-1.6552,  0.1961,  1.5120,  ..., -0.9521, -0.2262, -0.2594],
        [-1.6552,  0.1961,  1.5120,  ..., -0.9521, -0.2262, -0.2594],
        [-1.6552,  0.1961,  1.5120,  ..., -0.9521, -0.2262, -0.2594]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 53, 37, 28,  9,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8751, -0.6273,  0.6148,  ...,  0.2470, -1.4476, -0.8834],
        [ 0.8751, -0.6273,  0.6148,  ...,  0.2470, -1.4476, -0.8834],
        [ 0.8231, -1.1819, -1.8105,  ...,  0.1797, -0.2372, -1.2942],
        ...,
        [-1.6553,  0.1961,  1.5125,  ..., -0.9523, -0.2259, -0.2590],
        [-1.6553,  0.1961,  1.5125,  ..., -0.9523, -0.2259, -0.2590],
        [-1

PackedSequence(data=tensor([[ 1.0144,  0.6816,  0.3934,  ...,  1.8310,  0.1041,  1.3056],
        [-0.2582,  1.7909, -0.9790,  ...,  0.3540, -2.1187,  1.5534],
        [ 0.1058,  2.0329,  1.3633,  ..., -1.2242, -0.6901,  1.2503],
        ...,
        [-1.6616,  0.1972,  1.5126,  ..., -0.9508, -0.2254, -0.2557],
        [-1.6616,  0.1972,  1.5126,  ..., -0.9508, -0.2254, -0.2557],
        [-1.6616,  0.1972,  1.5126,  ..., -0.9508, -0.2254, -0.2557]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 53, 31, 24, 12,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0237, -2.0367,  0.1706,  ...,  0.6549,  1.4099,  1.4060],
        [ 0.8768, -0.6277,  0.6126,  ...,  0.2468, -1.4469, -0.8871],
        [ 0.8768, -0.6277,  0.6126,  ...,  0.2468, -1.4469, -0.8871],
        ...,
        [-1.6621,  0.1972,  1.5124,  ..., -0.9506, -0.2258, -0.2558],
        [-1.6621,  0.1972,  1.5124,  ..., -0.9506, -0.2258, -0.2558],
        [-1

PackedSequence(data=tensor([[ 0.8761, -0.6306,  0.6116,  ...,  0.2476, -1.4422, -0.8839],
        [ 0.8761, -0.6306,  0.6116,  ...,  0.2476, -1.4422, -0.8839],
        [-0.2570,  1.7893, -0.9791,  ...,  0.3568, -2.1221,  1.5548],
        ...,
        [-1.6619,  0.1964,  1.5113,  ..., -0.9516, -0.2247, -0.2553],
        [-1.6619,  0.1964,  1.5113,  ..., -0.9516, -0.2247, -0.2553],
        [-1.6619,  0.1964,  1.5113,  ..., -0.9516, -0.2247, -0.2553]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 51, 42, 31, 16,  9]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8762, -0.6308,  0.6116,  ...,  0.2476, -1.4420, -0.8836],
        [ 0.8762, -0.6308,  0.6116,  ...,  0.2476, -1.4420, -0.8836],
        [ 0.8762, -0.6308,  0.6116,  ...,  0.2476, -1.4420, -0.8836],
        ...,
        [-1.6621,  0.1966,  1.5115,  ..., -0.9514, -0.2245, -0.2553],
        [-1.6621,  0.1966,  1.5115,  ..., -0.9514, -0.2245, -0.2553],
        [-1

PackedSequence(data=tensor([[ 0.8751, -0.6324,  0.6110,  ...,  0.2469, -1.4446, -0.8838],
        [ 0.8208, -1.1797, -1.8106,  ...,  0.1769, -0.2375, -1.2906],
        [ 0.4118,  0.0385, -1.4375,  ...,  0.2977, -1.5274,  0.5105],
        ...,
        [-1.6624,  0.1980,  1.5112,  ..., -0.9505, -0.2272, -0.2575],
        [-1.6624,  0.1980,  1.5112,  ..., -0.9505, -0.2272, -0.2575],
        [-1.6624,  0.1980,  1.5112,  ..., -0.9505, -0.2272, -0.2575]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 57, 47, 40, 19,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8749, -0.6327,  0.6107,  ...,  0.2472, -1.4448, -0.8836],
        [-0.2570,  1.7889, -0.9800,  ...,  0.3572, -2.1266,  1.5549],
        [ 0.8210, -1.1800, -1.8108,  ...,  0.1770, -0.2373, -1.2897],
        ...,
        [-1.6623,  0.1977,  1.5108,  ..., -0.9509, -0.2272, -0.2575],
        [-1.6623,  0.1977,  1.5108,  ..., -0.9509, -0.2272, -0.2575],
        [-1

PackedSequence(data=tensor([[-0.2429,  0.0920, -0.1449,  ..., -0.4576, -0.6798,  1.6018],
        [ 0.8724, -0.6343,  0.6091,  ...,  0.2487, -1.4461, -0.8839],
        [ 0.8724, -0.6343,  0.6091,  ...,  0.2487, -1.4461, -0.8839],
        ...,
        [-1.6610,  0.1992,  1.5106,  ..., -0.9507, -0.2247, -0.2576],
        [-1.6610,  0.1992,  1.5106,  ..., -0.9507, -0.2247, -0.2576],
        [-1.6610,  0.1992,  1.5106,  ..., -0.9507, -0.2247, -0.2576]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 59, 46, 36, 24, 16,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8208, -1.1811, -1.8112,  ...,  0.1767, -0.2355, -1.2841],
        [ 0.8722, -0.6345,  0.6091,  ...,  0.2489, -1.4462, -0.8841],
        [ 0.8722, -0.6345,  0.6091,  ...,  0.2489, -1.4462, -0.8841],
        ...,
        [-1.6613,  0.1990,  1.5103,  ..., -0.9510, -0.2249, -0.2575],
        [-1.6613,  0.1990,  1.5103,  ..., -0.9510, -0.2249, -0.2575],
        [-1

PackedSequence(data=tensor([[ 0.8733, -0.6352,  0.6083,  ...,  0.2517, -1.4454, -0.8841],
        [ 0.8733, -0.6352,  0.6083,  ...,  0.2517, -1.4454, -0.8841],
        [-0.2582,  1.7881, -0.9830,  ...,  0.3557, -2.1296,  1.5559],
        ...,
        [-1.6617,  0.1991,  1.5109,  ..., -0.9518, -0.2216, -0.2545],
        [-1.6617,  0.1991,  1.5109,  ..., -0.9518, -0.2216, -0.2545],
        [-1.6617,  0.1991,  1.5109,  ..., -0.9518, -0.2216, -0.2545]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 51, 40, 31, 17,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0130,  0.6816,  0.3863,  ...,  1.8308,  0.1072,  1.3188],
        [ 1.0130,  0.6816,  0.3863,  ...,  1.8308,  0.1072,  1.3188],
        [ 0.8735, -0.6352,  0.6081,  ...,  0.2520, -1.4453, -0.8839],
        ...,
        [-1.6617,  0.1993,  1.5111,  ..., -0.9521, -0.2216, -0.2545],
        [-1.6617,  0.1993,  1.5111,  ..., -0.9521, -0.2216, -0.2545],
        [-1

PackedSequence(data=tensor([[ 1.0259, -2.0392,  0.1627,  ...,  0.6620,  1.4245,  1.4064],
        [ 0.8750, -0.6346,  0.6050,  ...,  0.2526, -1.4450, -0.8824],
        [ 0.8750, -0.6346,  0.6050,  ...,  0.2526, -1.4450, -0.8824],
        ...,
        [-1.6618,  0.1988,  1.5109,  ..., -0.9539, -0.2255, -0.2546],
        [-1.6618,  0.1988,  1.5109,  ..., -0.9539, -0.2255, -0.2546],
        [-1.6618,  0.1988,  1.5109,  ..., -0.9539, -0.2255, -0.2546]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 56, 41, 28, 13,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8752, -0.6347,  0.6044,  ...,  0.2526, -1.4451, -0.8818],
        [ 0.4136,  0.0385, -1.4409,  ...,  0.3070, -1.5217,  0.5095],
        [ 0.8145, -1.1771, -1.8168,  ...,  0.1700, -0.2389, -1.2779],
        ...,
        [-1.6619,  0.1988,  1.5105,  ..., -0.9541, -0.2256, -0.2547],
        [-1.6619,  0.1988,  1.5105,  ..., -0.9541, -0.2256, -0.2547],
        [-1

PackedSequence(data=tensor([[ 0.8789, -0.6339,  0.6003,  ...,  0.2525, -1.4434, -0.8801],
        [ 0.8789, -0.6339,  0.6003,  ...,  0.2525, -1.4434, -0.8801],
        [ 1.9318,  1.0012, -0.5314,  ...,  3.1548, -1.5546, -0.5369],
        ...,
        [-1.6604,  0.1996,  1.5106,  ..., -0.9525, -0.2240, -0.2538],
        [-1.6604,  0.1996,  1.5106,  ..., -0.9525, -0.2240, -0.2538],
        [-1.6604,  0.1996,  1.5106,  ..., -0.9525, -0.2240, -0.2538]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 55, 45, 30, 15,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2394,  0.0986, -0.1465,  ..., -0.4586, -0.6761,  1.6056],
        [ 1.0151,  0.6818,  0.3837,  ...,  1.8346,  0.1144,  1.3196],
        [ 1.0151,  0.6818,  0.3837,  ...,  1.8346,  0.1144,  1.3196],
        ...,
        [-1.6607,  0.1998,  1.5104,  ..., -0.9521, -0.2238, -0.2539],
        [-1.6607,  0.1998,  1.5104,  ..., -0.9521, -0.2238, -0.2539],
        [-1

PackedSequence(data=tensor([[ 0.8128, -1.1743, -1.8163,  ...,  0.1693, -0.2381, -1.2775],
        [ 0.8756, -0.6304,  0.6004,  ...,  0.2495, -1.4437, -0.8811],
        [ 1.0147,  0.6820,  0.3828,  ...,  1.8348,  0.1132,  1.3203],
        ...,
        [-1.6615,  0.1958,  1.5100,  ..., -0.9550, -0.2262, -0.2547],
        [-1.6615,  0.1958,  1.5100,  ..., -0.9550, -0.2262, -0.2547],
        [-1.6615,  0.1958,  1.5100,  ..., -0.9550, -0.2262, -0.2547]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 44, 37, 24, 12,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8751, -0.6302,  0.6008,  ...,  0.2492, -1.4440, -0.8810],
        [ 0.8751, -0.6302,  0.6008,  ...,  0.2492, -1.4440, -0.8810],
        [ 0.8751, -0.6302,  0.6008,  ...,  0.2492, -1.4440, -0.8810],
        ...,
        [-1.6615,  0.1958,  1.5100,  ..., -0.9551, -0.2261, -0.2549],
        [-1.6615,  0.1958,  1.5100,  ..., -0.9551, -0.2261, -0.2549],
        [-1

PackedSequence(data=tensor([[ 0.8724, -0.6308,  0.6026,  ...,  0.2492, -1.4486, -0.8818],
        [ 1.0127,  0.6834,  0.3846,  ...,  1.8341,  0.1093,  1.3221],
        [ 0.8724, -0.6308,  0.6026,  ...,  0.2492, -1.4486, -0.8818],
        ...,
        [-1.6635,  0.1976,  1.5140,  ..., -0.9520, -0.2207, -0.2533],
        [-1.6635,  0.1976,  1.5140,  ..., -0.9520, -0.2207, -0.2533],
        [-1.6635,  0.1976,  1.5140,  ..., -0.9520, -0.2207, -0.2533]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 55, 42, 26,  8,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2509,  1.7878, -0.9846,  ...,  0.3523, -2.1297,  1.5501],
        [ 0.8112, -1.1744, -1.8191,  ...,  0.1704, -0.2409, -1.2770],
        [ 1.0127,  0.6834,  0.3847,  ...,  1.8341,  0.1090,  1.3222],
        ...,
        [-1.6636,  0.1972,  1.5144,  ..., -0.9522, -0.2208, -0.2528],
        [-1.6636,  0.1972,  1.5144,  ..., -0.9522, -0.2208, -0.2528],
        [-1

PackedSequence(data=tensor([[ 1.0374, -2.0373,  0.1623,  ...,  0.6559,  1.4244,  1.4024],
        [-0.2372,  0.3495,  0.0031,  ...,  0.6142,  2.2036,  0.9689],
        [ 0.8064, -1.1703, -1.8194,  ...,  0.1655, -0.2431, -1.2809],
        ...,
        [-1.6654,  0.1953,  1.5118,  ..., -0.9541, -0.2235, -0.2530],
        [-1.6654,  0.1953,  1.5118,  ..., -0.9541, -0.2235, -0.2530],
        [-1.6654,  0.1953,  1.5118,  ..., -0.9541, -0.2235, -0.2530]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 58, 36, 30, 17, 12]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.4882,  1.2736, -1.1528,  ...,  1.1850,  2.6704,  0.6811],
        [-0.2512,  1.7886, -0.9835,  ...,  0.3521, -2.1285,  1.5509],
        [-0.3255, -0.7677, -1.8457,  ..., -0.9261, -0.9287,  0.0479],
        ...,
        [-1.6657,  0.1958,  1.5116,  ..., -0.9537, -0.2238, -0.2532],
        [-1.6657,  0.1958,  1.5116,  ..., -0.9537, -0.2238, -0.2532],
        [-1

PackedSequence(data=tensor([[ 0.8076, -1.1735, -1.8250,  ...,  0.1662, -0.2441, -1.2797],
        [ 0.8076, -1.1735, -1.8250,  ...,  0.1662, -0.2441, -1.2797],
        [ 0.4159,  0.0338, -1.4377,  ...,  0.3073, -1.5271,  0.5141],
        ...,
        [-1.6677,  0.1999,  1.5108,  ..., -0.9544, -0.2229, -0.2532],
        [-1.6677,  0.1999,  1.5108,  ..., -0.9544, -0.2229, -0.2532],
        [-1.6677,  0.1999,  1.5108,  ..., -0.9544, -0.2229, -0.2532]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 50, 34, 25, 11,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2398,  0.0949, -0.1420,  ..., -0.4576, -0.6768,  1.6041],
        [ 1.0139,  0.6808,  0.3875,  ...,  1.8383,  0.1101,  1.3212],
        [-0.2398,  0.0949, -0.1420,  ..., -0.4576, -0.6768,  1.6041],
        ...,
        [-1.6682,  0.2002,  1.5105,  ..., -0.9541, -0.2231, -0.2531],
        [-1.6682,  0.2002,  1.5105,  ..., -0.9541, -0.2231, -0.2531],
        [-1

PackedSequence(data=tensor([[ 0.4831,  1.2756, -1.1514,  ...,  1.1824,  2.6728,  0.6755],
        [ 0.4137,  0.0324, -1.4396,  ...,  0.3069, -1.5193,  0.5122],
        [ 0.4831,  1.2756, -1.1514,  ...,  1.1824,  2.6728,  0.6755],
        ...,
        [-1.6664,  0.1965,  1.5085,  ..., -0.9561, -0.2276, -0.2535],
        [-1.6664,  0.1965,  1.5085,  ..., -0.9561, -0.2276, -0.2535],
        [-1.6664,  0.1965,  1.5085,  ..., -0.9561, -0.2276, -0.2535]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 54, 41, 30, 20, 12]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8707, -0.6298,  0.6001,  ...,  0.2529, -1.4454, -0.8753],
        [ 0.8707, -0.6298,  0.6001,  ...,  0.2529, -1.4454, -0.8753],
        [ 0.8117, -1.1773, -1.8322,  ...,  0.1680, -0.2336, -1.2840],
        ...,
        [-1.6661,  0.1962,  1.5089,  ..., -0.9565, -0.2278, -0.2535],
        [-1.6661,  0.1962,  1.5089,  ..., -0.9565, -0.2278, -0.2535],
        [-1

PackedSequence(data=tensor([[ 0.8691, -0.6267,  0.6053,  ...,  0.2524, -1.4465, -0.8756],
        [ 0.8135, -1.1804, -1.8316,  ...,  0.1676, -0.2253, -1.2730],
        [ 0.8691, -0.6267,  0.6053,  ...,  0.2524, -1.4465, -0.8756],
        ...,
        [-1.6653,  0.1973,  1.5129,  ..., -0.9548, -0.2253, -0.2536],
        [-1.6653,  0.1973,  1.5129,  ..., -0.9548, -0.2253, -0.2536],
        [-1.6653,  0.1973,  1.5129,  ..., -0.9548, -0.2253, -0.2536]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 52, 36, 26, 14,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8690, -0.6269,  0.6058,  ...,  0.2523, -1.4469, -0.8758],
        [-0.6016,  0.6491, -0.4720,  ...,  0.0517,  0.0709, -0.2894],
        [ 0.8690, -0.6269,  0.6058,  ...,  0.2523, -1.4469, -0.8758],
        ...,
        [-1.6655,  0.1972,  1.5131,  ..., -0.9547, -0.2251, -0.2535],
        [-1.6655,  0.1972,  1.5131,  ..., -0.9547, -0.2251, -0.2535],
        [-1

PackedSequence(data=tensor([[ 0.8146, -1.1808, -1.8314,  ...,  0.1691, -0.2248, -1.2697],
        [ 0.8146, -1.1808, -1.8314,  ...,  0.1691, -0.2248, -1.2697],
        [ 0.8682, -0.6289,  0.6113,  ...,  0.2514, -1.4493, -0.8785],
        ...,
        [-1.6659,  0.1980,  1.5104,  ..., -0.9540, -0.2247, -0.2510],
        [-1.6659,  0.1980,  1.5104,  ..., -0.9540, -0.2247, -0.2510],
        [-1.6659,  0.1980,  1.5104,  ..., -0.9540, -0.2247, -0.2510]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 55, 36, 27, 14,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8680, -0.6288,  0.6117,  ...,  0.2509, -1.4493, -0.8786],
        [-0.0106, -0.6893,  0.4016,  ...,  0.1073, -0.5432, -0.4639],
        [ 0.8680, -0.6288,  0.6117,  ...,  0.2509, -1.4493, -0.8786],
        ...,
        [-1.6657,  0.1980,  1.5103,  ..., -0.9539, -0.2247, -0.2509],
        [-1.6657,  0.1980,  1.5103,  ..., -0.9539, -0.2247, -0.2509],
        [-1

PackedSequence(data=tensor([[ 0.8666, -0.6285,  0.6132,  ...,  0.2471, -1.4485, -0.8813],
        [ 0.8666, -0.6285,  0.6132,  ...,  0.2471, -1.4485, -0.8813],
        [ 0.8666, -0.6285,  0.6132,  ...,  0.2471, -1.4485, -0.8813],
        ...,
        [-1.6644,  0.1973,  1.5118,  ..., -0.9562, -0.2251, -0.2512],
        [-1.6644,  0.1973,  1.5118,  ..., -0.9562, -0.2251, -0.2512],
        [-1.6644,  0.1973,  1.5118,  ..., -0.9562, -0.2251, -0.2512]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 61, 53, 45, 35, 20,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8666, -0.6286,  0.6132,  ...,  0.2469, -1.4485, -0.8815],
        [ 0.8666, -0.6286,  0.6132,  ...,  0.2469, -1.4485, -0.8815],
        [ 0.8666, -0.6286,  0.6132,  ...,  0.2469, -1.4485, -0.8815],
        ...,
        [-1.6645,  0.1980,  1.5116,  ..., -0.9558, -0.2254, -0.2517],
        [-1.6645,  0.1980,  1.5116,  ..., -0.9558, -0.2254, -0.2517],
        [-1

PackedSequence(data=tensor([[-0.2378,  0.1083, -0.1444,  ..., -0.4674, -0.6861,  1.6047],
        [ 0.8672, -0.6291,  0.6097,  ...,  0.2450, -1.4478, -0.8810],
        [ 0.8672, -0.6291,  0.6097,  ...,  0.2450, -1.4478, -0.8810],
        ...,
        [-1.6671,  0.1976,  1.5119,  ..., -0.9535, -0.2245, -0.2499],
        [-1.6671,  0.1976,  1.5119,  ..., -0.9535, -0.2245, -0.2499],
        [-1.6671,  0.1976,  1.5119,  ..., -0.9535, -0.2245, -0.2499]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 53, 42, 26, 19,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8133, -1.1828, -1.8330,  ...,  0.1716, -0.2231, -1.2643],
        [-0.2378,  0.3557, -0.0025,  ...,  0.6168,  2.1978,  0.9773],
        [ 0.8672, -0.6292,  0.6094,  ...,  0.2449, -1.4475, -0.8810],
        ...,
        [-1.6672,  0.1975,  1.5120,  ..., -0.9533, -0.2246, -0.2498],
        [-1.6672,  0.1975,  1.5120,  ..., -0.9533, -0.2246, -0.2498],
        [-1

PackedSequence(data=tensor([[ 1.0120,  0.6809,  0.3735,  ...,  1.8460,  0.0953,  1.3184],
        [ 0.4813,  1.2809, -1.1461,  ...,  1.1833,  2.6573,  0.6898],
        [ 0.8671, -0.6303,  0.6071,  ...,  0.2452, -1.4463, -0.8802],
        ...,
        [-1.6691,  0.1982,  1.5101,  ..., -0.9536, -0.2268, -0.2526],
        [-1.6691,  0.1982,  1.5101,  ..., -0.9536, -0.2268, -0.2526],
        [-1.6691,  0.1982,  1.5101,  ..., -0.9536, -0.2268, -0.2526]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 56, 39, 25, 12,  1]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0117,  0.6810,  0.3737,  ...,  1.8459,  0.0949,  1.3183],
        [ 0.8671, -0.6305,  0.6071,  ...,  0.2455, -1.4462, -0.8800],
        [ 1.0117,  0.6810,  0.3737,  ...,  1.8459,  0.0949,  1.3183],
        ...,
        [-1.6694,  0.1981,  1.5101,  ..., -0.9537, -0.2267, -0.2524],
        [-1.6694,  0.1981,  1.5101,  ..., -0.9537, -0.2267, -0.2524],
        [-1

PackedSequence(data=tensor([[ 0.8656, -0.6333,  0.6066,  ...,  0.2468, -1.4442, -0.8756],
        [ 0.8656, -0.6333,  0.6066,  ...,  0.2468, -1.4442, -0.8756],
        [ 1.0135,  0.6805,  0.3713,  ...,  1.8428,  0.0908,  1.3167],
        ...,
        [-1.6675,  0.2006,  1.5106,  ..., -0.9526, -0.2261, -0.2518],
        [-1.6675,  0.2006,  1.5106,  ..., -0.9526, -0.2261, -0.2518],
        [-1.6675,  0.2006,  1.5106,  ..., -0.9526, -0.2261, -0.2518]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 50, 45, 35, 22,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8653, -0.6333,  0.6066,  ...,  0.2467, -1.4440, -0.8754],
        [ 0.4806,  1.2817, -1.1436,  ...,  1.1826,  2.6570,  0.6898],
        [ 0.8653, -0.6333,  0.6066,  ...,  0.2467, -1.4440, -0.8754],
        ...,
        [-1.6672,  0.2006,  1.5106,  ..., -0.9533, -0.2265, -0.2521],
        [-1.6672,  0.2006,  1.5106,  ..., -0.9533, -0.2265, -0.2521],
        [-1

PackedSequence(data=tensor([[ 0.8642, -0.6344,  0.6036,  ...,  0.2464, -1.4466, -0.8774],
        [ 0.8642, -0.6344,  0.6036,  ...,  0.2464, -1.4466, -0.8774],
        [ 0.4136,  0.0323, -1.4476,  ...,  0.3092, -1.5076,  0.5117],
        ...,
        [-1.6684,  0.1985,  1.5092,  ..., -0.9590, -0.2264, -0.2491],
        [-1.6684,  0.1985,  1.5092,  ..., -0.9590, -0.2264, -0.2491],
        [-1.6684,  0.1985,  1.5092,  ..., -0.9590, -0.2264, -0.2491]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 53, 39, 30, 18,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8639, -0.6346,  0.6035,  ...,  0.2462, -1.4471, -0.8782],
        [ 0.4792,  1.2821, -1.1450,  ...,  1.1838,  2.6606,  0.6905],
        [ 0.4792,  1.2821, -1.1450,  ...,  1.1838,  2.6606,  0.6905],
        ...,
        [-1.6686,  0.1987,  1.5092,  ..., -0.9588, -0.2258, -0.2487],
        [-1.6686,  0.1987,  1.5092,  ..., -0.9588, -0.2258, -0.2487],
        [-1

PackedSequence(data=tensor([[ 0.8609, -0.6366,  0.6037,  ...,  0.2450, -1.4516, -0.8821],
        [ 0.8609, -0.6366,  0.6037,  ...,  0.2450, -1.4516, -0.8821],
        [ 0.8098, -1.1708, -1.8334,  ...,  0.1695, -0.2387, -1.2686],
        ...,
        [-1.6741,  0.2008,  1.5100,  ..., -0.9547, -0.2254, -0.2491],
        [-1.6741,  0.2008,  1.5100,  ..., -0.9547, -0.2254, -0.2491],
        [-1.6741,  0.2008,  1.5100,  ..., -0.9547, -0.2254, -0.2491]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 57, 40, 27, 15,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8606, -0.6368,  0.6038,  ...,  0.2451, -1.4518, -0.8822],
        [ 0.8606, -0.6368,  0.6038,  ...,  0.2451, -1.4518, -0.8822],
        [ 0.8094, -1.1704, -1.8332,  ...,  0.1692, -0.2386, -1.2688],
        ...,
        [-1.6740,  0.2006,  1.5101,  ..., -0.9548, -0.2258, -0.2493],
        [-1.6740,  0.2006,  1.5101,  ..., -0.9548, -0.2258, -0.2493],
        [-1

PackedSequence(data=tensor([[ 1.0181,  0.6741,  0.3635,  ...,  1.8460,  0.0979,  1.3187],
        [ 0.8598, -0.6387,  0.6032,  ...,  0.2448, -1.4527, -0.8832],
        [ 0.8089, -1.1682, -1.8342,  ...,  0.1682, -0.2378, -1.2700],
        ...,
        [-1.6680,  0.2003,  1.5080,  ..., -0.9576, -0.2288, -0.2519],
        [-1.6680,  0.2003,  1.5080,  ..., -0.9576, -0.2288, -0.2519],
        [-1.6680,  0.2003,  1.5080,  ..., -0.9576, -0.2288, -0.2519]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 59, 47, 36, 14,  9]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0175,  0.6746,  0.3636,  ...,  1.8456,  0.0973,  1.3195],
        [ 0.8599, -0.6386,  0.6034,  ...,  0.2447, -1.4529, -0.8834],
        [ 0.8599, -0.6386,  0.6034,  ...,  0.2447, -1.4529, -0.8834],
        ...,
        [-1.6679,  0.2005,  1.5082,  ..., -0.9570, -0.2285, -0.2518],
        [-1.6679,  0.2005,  1.5082,  ..., -0.9570, -0.2285, -0.2518],
        [-1

PackedSequence(data=tensor([[ 0.4077,  0.0307, -1.4466,  ...,  0.3208, -1.5073,  0.5118],
        [ 1.0179,  0.6750,  0.3632,  ...,  1.8443,  0.0904,  1.3222],
        [-0.2411,  0.3451,  0.0085,  ...,  0.6176,  2.2047,  0.9805],
        ...,
        [-1.6687,  0.2012,  1.5089,  ..., -0.9564, -0.2276, -0.2507],
        [-1.6687,  0.2012,  1.5089,  ..., -0.9564, -0.2276, -0.2507],
        [-1.6687,  0.2012,  1.5089,  ..., -0.9564, -0.2276, -0.2507]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 54, 38, 23, 13,  2]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.4076,  0.0309, -1.4463,  ...,  0.3207, -1.5072,  0.5118],
        [ 0.4076,  0.0309, -1.4463,  ...,  0.3207, -1.5072,  0.5118],
        [ 0.8611, -0.6352,  0.6048,  ...,  0.2452, -1.4535, -0.8791],
        ...,
        [-1.6686,  0.2009,  1.5091,  ..., -0.9567, -0.2275, -0.2506],
        [-1.6686,  0.2009,  1.5091,  ..., -0.9567, -0.2275, -0.2506],
        [-1

PackedSequence(data=tensor([[-0.2416,  0.3481,  0.0086,  ...,  0.6176,  2.2034,  0.9805],
        [ 0.8627, -0.6301,  0.6062,  ...,  0.2435, -1.4540, -0.8773],
        [ 0.4048,  0.0398, -1.4408,  ...,  0.3133, -1.5155,  0.5261],
        ...,
        [-1.6676,  0.2012,  1.5100,  ..., -0.9595, -0.2275, -0.2475],
        [-1.6676,  0.2012,  1.5100,  ..., -0.9595, -0.2275, -0.2475],
        [-1.6676,  0.2012,  1.5100,  ..., -0.9595, -0.2275, -0.2475]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 52, 37, 27, 14,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8627, -0.6299,  0.6061,  ...,  0.2433, -1.4540, -0.8775],
        [ 0.8627, -0.6299,  0.6061,  ...,  0.2433, -1.4540, -0.8775],
        [ 1.0200,  0.6713,  0.3600,  ...,  1.8449,  0.0954,  1.3231],
        ...,
        [-1.6673,  0.2009,  1.5101,  ..., -0.9597, -0.2278, -0.2474],
        [-1.6673,  0.2009,  1.5101,  ..., -0.9597, -0.2278, -0.2474],
        [-1

PackedSequence(data=tensor([[ 0.8646, -0.6307,  0.6043,  ...,  0.2421, -1.4527, -0.8791],
        [ 0.8646, -0.6307,  0.6043,  ...,  0.2421, -1.4527, -0.8791],
        [ 0.8646, -0.6307,  0.6043,  ...,  0.2421, -1.4527, -0.8791],
        ...,
        [-1.6676,  0.1999,  1.5108,  ..., -0.9595, -0.2279, -0.2470],
        [-1.6676,  0.1999,  1.5108,  ..., -0.9595, -0.2279, -0.2470],
        [-1.6676,  0.1999,  1.5108,  ..., -0.9595, -0.2279, -0.2470]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 54, 48, 29, 19,  9]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0394, -2.0385,  0.1531,  ...,  0.6681,  1.4389,  1.4022],
        [ 0.8647, -0.6307,  0.6043,  ...,  0.2417, -1.4528, -0.8792],
        [-0.3293, -0.7670, -1.8334,  ..., -0.9198, -0.9195,  0.0677],
        ...,
        [-1.6679,  0.1997,  1.5105,  ..., -0.9594, -0.2282, -0.2472],
        [-1.6679,  0.1997,  1.5105,  ..., -0.9594, -0.2282, -0.2472],
        [-1

PackedSequence(data=tensor([[ 0.4036,  0.0428, -1.4413,  ...,  0.3143, -1.5177,  0.5315],
        [ 0.4036,  0.0428, -1.4413,  ...,  0.3143, -1.5177,  0.5315],
        [ 0.8190, -1.1699, -1.8317,  ...,  0.1698, -0.2378, -1.2686],
        ...,
        [-1.6698,  0.1978,  1.5103,  ..., -0.9604, -0.2295, -0.2473],
        [-1.6698,  0.1978,  1.5103,  ..., -0.9604, -0.2295, -0.2473],
        [-1.6698,  0.1978,  1.5103,  ..., -0.9604, -0.2295, -0.2473]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 53, 43, 30, 12,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2422,  0.3508,  0.0086,  ...,  0.6197,  2.2041,  0.9836],
        [ 0.8657, -0.6332,  0.6020,  ...,  0.2400, -1.4564, -0.8807],
        [ 0.8190, -1.1699, -1.8315,  ...,  0.1696, -0.2379, -1.2687],
        ...,
        [-1.6699,  0.1981,  1.5106,  ..., -0.9605, -0.2298, -0.2477],
        [-1.6699,  0.1981,  1.5106,  ..., -0.9605, -0.2298, -0.2477],
        [-1

PackedSequence(data=tensor([[ 0.8652, -0.6328,  0.6025,  ...,  0.2404, -1.4587, -0.8859],
        [ 0.8182, -1.1729, -1.8292,  ...,  0.1668, -0.2362, -1.2635],
        [ 0.8652, -0.6328,  0.6025,  ...,  0.2404, -1.4587, -0.8859],
        ...,
        [-1.6680,  0.1976,  1.5109,  ..., -0.9596, -0.2313, -0.2459],
        [-1.6680,  0.1976,  1.5109,  ..., -0.9596, -0.2313, -0.2459],
        [-1.6680,  0.1976,  1.5109,  ..., -0.9596, -0.2313, -0.2459]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 51, 33, 24, 17,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8654, -0.6327,  0.6026,  ...,  0.2406, -1.4586, -0.8865],
        [ 0.8654, -0.6327,  0.6026,  ...,  0.2406, -1.4586, -0.8865],
        [ 0.4777,  1.2818, -1.1493,  ...,  1.1771,  2.6638,  0.6957],
        ...,
        [-1.6675,  0.1976,  1.5108,  ..., -0.9596, -0.2311, -0.2457],
        [-1.6675,  0.1976,  1.5108,  ..., -0.9596, -0.2311, -0.2457],
        [-1

PackedSequence(data=tensor([[ 0.8629, -0.6297,  0.6043,  ...,  0.2380, -1.4586, -0.8888],
        [ 0.8629, -0.6297,  0.6043,  ...,  0.2380, -1.4586, -0.8888],
        [ 0.8629, -0.6297,  0.6043,  ...,  0.2380, -1.4586, -0.8888],
        ...,
        [-1.6639,  0.1973,  1.5111,  ..., -0.9605, -0.2303, -0.2449],
        [-1.6639,  0.1973,  1.5111,  ..., -0.9605, -0.2303, -0.2449],
        [-1.6639,  0.1973,  1.5111,  ..., -0.9605, -0.2303, -0.2449]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 56, 45, 29, 12,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8152, -1.1754, -1.8306,  ...,  0.1627, -0.2309, -1.2641],
        [ 1.0407, -2.0365,  0.1557,  ...,  0.6610,  1.4376,  1.3969],
        [ 0.8627, -0.6296,  0.6043,  ...,  0.2379, -1.4587, -0.8889],
        ...,
        [-1.6639,  0.1973,  1.5109,  ..., -0.9604, -0.2302, -0.2450],
        [-1.6639,  0.1973,  1.5109,  ..., -0.9604, -0.2302, -0.2450],
        [-1

PackedSequence(data=tensor([[ 0.8673, -0.6312,  0.6050,  ...,  0.2407, -1.4619, -0.8930],
        [ 0.8673, -0.6312,  0.6050,  ...,  0.2407, -1.4619, -0.8930],
        [ 0.8673, -0.6312,  0.6050,  ...,  0.2407, -1.4619, -0.8930],
        ...,
        [-1.6676,  0.2017,  1.5102,  ..., -0.9594, -0.2266, -0.2461],
        [-1.6676,  0.2017,  1.5102,  ..., -0.9594, -0.2266, -0.2461],
        [-1.6676,  0.2017,  1.5102,  ..., -0.9594, -0.2266, -0.2461]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 56, 40, 30, 14,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0225,  0.6707,  0.3563,  ...,  1.8459,  0.0925,  1.3238],
        [ 0.8675, -0.6313,  0.6051,  ...,  0.2408, -1.4622, -0.8931],
        [ 0.8121, -1.1747, -1.8317,  ...,  0.1574, -0.2291, -1.2669],
        ...,
        [-1.6675,  0.2017,  1.5103,  ..., -0.9593, -0.2268, -0.2459],
        [-1.6675,  0.2017,  1.5103,  ..., -0.9593, -0.2268, -0.2459],
        [-1

PackedSequence(data=tensor([[-0.2443,  1.7889, -0.9812,  ...,  0.3507, -2.1472,  1.5528],
        [ 0.8675, -0.6331,  0.6050,  ...,  0.2412, -1.4622, -0.8962],
        [ 1.0208,  0.6708,  0.3584,  ...,  1.8461,  0.0929,  1.3236],
        ...,
        [-1.6689,  0.2007,  1.5100,  ..., -0.9603, -0.2279, -0.2458],
        [-1.6689,  0.2007,  1.5100,  ..., -0.9603, -0.2279, -0.2458],
        [-1.6689,  0.2007,  1.5100,  ..., -0.9603, -0.2279, -0.2458]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 55, 34, 29, 14,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8076, -1.1758, -1.8338,  ...,  0.1562, -0.2280, -1.2678],
        [ 0.8676, -0.6332,  0.6050,  ...,  0.2413, -1.4622, -0.8963],
        [ 0.8676, -0.6332,  0.6050,  ...,  0.2413, -1.4622, -0.8963],
        ...,
        [-1.6692,  0.2008,  1.5097,  ..., -0.9604, -0.2280, -0.2459],
        [-1.6692,  0.2008,  1.5097,  ..., -0.9604, -0.2280, -0.2459],
        [-1

PackedSequence(data=tensor([[ 0.8667, -0.6339,  0.6051,  ...,  0.2424, -1.4622, -0.8958],
        [ 0.8667, -0.6339,  0.6051,  ...,  0.2424, -1.4622, -0.8958],
        [ 1.0201,  0.6721,  0.3583,  ...,  1.8461,  0.0921,  1.3264],
        ...,
        [-1.6736,  0.2019,  1.5071,  ..., -0.9584, -0.2285, -0.2462],
        [-1.6736,  0.2019,  1.5071,  ..., -0.9584, -0.2285, -0.2462],
        [-1.6736,  0.2019,  1.5071,  ..., -0.9584, -0.2285, -0.2462]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 56, 45, 33, 19,  9]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8665, -0.6340,  0.6051,  ...,  0.2423, -1.4621, -0.8957],
        [ 0.4874,  1.2837, -1.1589,  ...,  1.1873,  2.6707,  0.6881],
        [ 0.8665, -0.6340,  0.6051,  ...,  0.2423, -1.4621, -0.8957],
        ...,
        [-1.6742,  0.2015,  1.5076,  ..., -0.9580, -0.2283, -0.2454],
        [-1.6742,  0.2015,  1.5076,  ..., -0.9580, -0.2283, -0.2454],
        [-1

PackedSequence(data=tensor([[-0.2353,  0.1000, -0.1449,  ..., -0.4573, -0.6726,  1.6069],
        [ 0.8130, -1.1791, -1.8329,  ...,  0.1638, -0.2320, -1.2692],
        [-0.2353,  0.1000, -0.1449,  ..., -0.4573, -0.6726,  1.6069],
        ...,
        [-1.6805,  0.2009,  1.5074,  ..., -0.9575, -0.2274, -0.2436],
        [-1.6805,  0.2009,  1.5074,  ..., -0.9575, -0.2274, -0.2436],
        [-1.6805,  0.2009,  1.5074,  ..., -0.9575, -0.2274, -0.2436]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 53, 39, 30, 17,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2506,  1.7934, -0.9777,  ...,  0.3457, -2.1438,  1.5502],
        [-0.2506,  1.7934, -0.9777,  ...,  0.3457, -2.1438,  1.5502],
        [ 0.8649, -0.6329,  0.6042,  ...,  0.2408, -1.4617, -0.8946],
        ...,
        [-1.6804,  0.2010,  1.5073,  ..., -0.9579, -0.2276, -0.2439],
        [-1.6804,  0.2010,  1.5073,  ..., -0.9579, -0.2276, -0.2439],
        [-1

PackedSequence(data=tensor([[-0.2371,  0.3447,  0.0058,  ...,  0.6312,  2.2072,  0.9868],
        [ 0.8667, -0.6365,  0.6028,  ...,  0.2437, -1.4598, -0.8921],
        [ 0.8667, -0.6365,  0.6028,  ...,  0.2437, -1.4598, -0.8921],
        ...,
        [-1.6787,  0.2005,  1.5078,  ..., -0.9620, -0.2262, -0.2431],
        [-1.6787,  0.2005,  1.5078,  ..., -0.9620, -0.2262, -0.2431],
        [-1.6787,  0.2005,  1.5078,  ..., -0.9620, -0.2262, -0.2431]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 53, 38, 29, 12,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2502,  1.7940, -0.9816,  ...,  0.3457, -2.1436,  1.5470],
        [-0.2345,  0.1028, -0.1445,  ..., -0.4594, -0.6743,  1.6056],
        [-0.2345,  0.1028, -0.1445,  ..., -0.4594, -0.6743,  1.6056],
        ...,
        [-1.6790,  0.2005,  1.5078,  ..., -0.9617, -0.2260, -0.2429],
        [-1.6790,  0.2005,  1.5078,  ..., -0.9617, -0.2260, -0.2429],
        [-1

PackedSequence(data=tensor([[ 0.8687, -0.6379,  0.6016,  ...,  0.2463, -1.4571, -0.8932],
        [ 0.8148, -1.1775, -1.8292,  ...,  0.1624, -0.2347, -1.2724],
        [-0.0032, -0.6912,  0.3956,  ...,  0.1139, -0.5500, -0.4724],
        ...,
        [-1.6808,  0.2015,  1.5082,  ..., -0.9588, -0.2264, -0.2456],
        [-1.6808,  0.2015,  1.5082,  ..., -0.9588, -0.2264, -0.2456],
        [-1.6808,  0.2015,  1.5082,  ..., -0.9588, -0.2264, -0.2456]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 55, 42, 27, 15,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8688, -0.6380,  0.6015,  ...,  0.2464, -1.4569, -0.8933],
        [ 0.4014,  0.0546, -1.4283,  ...,  0.3054, -1.5118,  0.5177],
        [ 0.8688, -0.6380,  0.6015,  ...,  0.2464, -1.4569, -0.8933],
        ...,
        [-1.6806,  0.2011,  1.5084,  ..., -0.9589, -0.2264, -0.2454],
        [-1.6806,  0.2011,  1.5084,  ..., -0.9589, -0.2264, -0.2454],
        [-1

PackedSequence(data=tensor([[ 0.4012,  0.0565, -1.4292,  ...,  0.3070, -1.5117,  0.5169],
        [ 1.0212,  0.6715,  0.3566,  ...,  1.8511,  0.0910,  1.3350],
        [ 0.8699, -0.6376,  0.6012,  ...,  0.2473, -1.4564, -0.8923],
        ...,
        [-1.6775,  0.2012,  1.5110,  ..., -0.9593, -0.2289, -0.2453],
        [-1.6775,  0.2012,  1.5110,  ..., -0.9593, -0.2289, -0.2453],
        [-1.6775,  0.2012,  1.5110,  ..., -0.9593, -0.2289, -0.2453]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 55, 38, 28, 10,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8700, -0.6377,  0.6012,  ...,  0.2474, -1.4565, -0.8923],
        [ 0.8700, -0.6377,  0.6012,  ...,  0.2474, -1.4565, -0.8923],
        [ 0.8700, -0.6377,  0.6012,  ...,  0.2474, -1.4565, -0.8923],
        ...,
        [-1.6775,  0.2011,  1.5110,  ..., -0.9596, -0.2291, -0.2453],
        [-1.6775,  0.2011,  1.5110,  ..., -0.9596, -0.2291, -0.2453],
        [-1

PackedSequence(data=tensor([[ 0.8696, -0.6378,  0.6004,  ...,  0.2459, -1.4606, -0.8915],
        [ 0.8133, -1.1721, -1.8273,  ...,  0.1618, -0.2360, -1.2734],
        [-0.2475,  1.7976, -0.9753,  ...,  0.3456, -2.1431,  1.5499],
        ...,
        [-1.6750,  0.2015,  1.5087,  ..., -0.9612, -0.2292, -0.2438],
        [-1.6750,  0.2015,  1.5087,  ..., -0.9612, -0.2292, -0.2438],
        [-1.6750,  0.2015,  1.5087,  ..., -0.9612, -0.2292, -0.2438]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 51, 39, 28,  9,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8130, -1.1719, -1.8275,  ...,  0.1621, -0.2358, -1.2734],
        [-0.2475,  1.7977, -0.9753,  ...,  0.3459, -2.1431,  1.5499],
        [ 0.8130, -1.1719, -1.8275,  ...,  0.1621, -0.2358, -1.2734],
        ...,
        [-1.6752,  0.2016,  1.5086,  ..., -0.9610, -0.2293, -0.2439],
        [-1.6752,  0.2016,  1.5086,  ..., -0.9610, -0.2293, -0.2439],
        [-1

PackedSequence(data=tensor([[ 0.8745, -0.6359,  0.5991,  ...,  0.2487, -1.4565, -0.8950],
        [ 1.0300, -2.0351,  0.1539,  ...,  0.6552,  1.4302,  1.3941],
        [ 0.8107, -1.1701, -1.8304,  ...,  0.1635, -0.2341, -1.2736],
        ...,
        [-1.6767,  0.2024,  1.5082,  ..., -0.9607, -0.2283, -0.2449],
        [-1.6767,  0.2024,  1.5082,  ..., -0.9607, -0.2283, -0.2449],
        [-1.6767,  0.2024,  1.5082,  ..., -0.9607, -0.2283, -0.2449]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 53, 44, 30, 13,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2280,  0.1059, -0.1445,  ..., -0.4612, -0.6753,  1.6005],
        [ 0.8751, -0.6356,  0.5990,  ...,  0.2492, -1.4554, -0.8960],
        [ 0.8105, -1.1698, -1.8304,  ...,  0.1634, -0.2337, -1.2738],
        ...,
        [-1.6766,  0.2025,  1.5084,  ..., -0.9610, -0.2285, -0.2453],
        [-1.6766,  0.2025,  1.5084,  ..., -0.9610, -0.2285, -0.2453],
        [-1

PackedSequence(data=tensor([[ 0.8817, -0.6342,  0.6009,  ...,  0.2523, -1.4471, -0.9035],
        [ 0.8817, -0.6342,  0.6009,  ...,  0.2523, -1.4471, -0.9035],
        [ 0.8817, -0.6342,  0.6009,  ...,  0.2523, -1.4471, -0.9035],
        ...,
        [-1.6777,  0.2019,  1.5083,  ..., -0.9620, -0.2283, -0.2453],
        [-1.6777,  0.2019,  1.5083,  ..., -0.9620, -0.2283, -0.2453],
        [-1.6777,  0.2019,  1.5083,  ..., -0.9620, -0.2283, -0.2453]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 52, 35, 23, 13,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2244,  0.1066, -0.1457,  ..., -0.4599, -0.6745,  1.5995],
        [ 0.8822, -0.6341,  0.6012,  ...,  0.2525, -1.4468, -0.9036],
        [ 0.8822, -0.6341,  0.6012,  ...,  0.2525, -1.4468, -0.9036],
        ...,
        [-1.6776,  0.2020,  1.5082,  ..., -0.9620, -0.2280, -0.2452],
        [-1.6776,  0.2020,  1.5082,  ..., -0.9620, -0.2280, -0.2452],
        [-1

PackedSequence(data=tensor([[-0.6078,  0.6497, -0.4698,  ...,  0.0392,  0.0534, -0.2860],
        [ 0.8835, -0.6375,  0.6047,  ...,  0.2565, -1.4457, -0.9027],
        [ 1.0204,  0.6701,  0.3614,  ...,  1.8502,  0.0968,  1.3343],
        ...,
        [-1.6779,  0.2006,  1.5100,  ..., -0.9635, -0.2266, -0.2433],
        [-1.6779,  0.2006,  1.5100,  ..., -0.9635, -0.2266, -0.2433],
        [-1.6779,  0.2006,  1.5100,  ..., -0.9635, -0.2266, -0.2433]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 53, 44, 28, 14,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8833, -0.6375,  0.6048,  ...,  0.2567, -1.4456, -0.9028],
        [ 0.8833, -0.6375,  0.6048,  ...,  0.2567, -1.4456, -0.9028],
        [ 0.8833, -0.6375,  0.6048,  ...,  0.2567, -1.4456, -0.9028],
        ...,
        [-1.6784,  0.2005,  1.5101,  ..., -0.9631, -0.2265, -0.2429],
        [-1.6784,  0.2005,  1.5101,  ..., -0.9631, -0.2265, -0.2429],
        [-1

PackedSequence(data=tensor([[ 0.8825, -0.6375,  0.6074,  ...,  0.2598, -1.4442, -0.9033],
        [ 0.8825, -0.6375,  0.6074,  ...,  0.2598, -1.4442, -0.9033],
        [ 0.8078, -1.1601, -1.8301,  ...,  0.1558, -0.2346, -1.2750],
        ...,
        [-1.6752,  0.1991,  1.5130,  ..., -0.9668, -0.2282, -0.2428],
        [-1.6752,  0.1991,  1.5130,  ..., -0.9668, -0.2282, -0.2428],
        [-1.6752,  0.1991,  1.5130,  ..., -0.9668, -0.2282, -0.2428]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 56, 47, 31, 12,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8825, -0.6376,  0.6074,  ...,  0.2600, -1.4442, -0.9033],
        [ 0.8825, -0.6376,  0.6074,  ...,  0.2600, -1.4442, -0.9033],
        [ 0.8825, -0.6376,  0.6074,  ...,  0.2600, -1.4442, -0.9033],
        ...,
        [-1.6751,  0.1989,  1.5131,  ..., -0.9671, -0.2280, -0.2428],
        [-1.6751,  0.1989,  1.5131,  ..., -0.9671, -0.2280, -0.2428],
        [-1

PackedSequence(data=tensor([[ 0.8815, -0.6375,  0.6059,  ...,  0.2610, -1.4415, -0.9014],
        [ 0.8815, -0.6375,  0.6059,  ...,  0.2610, -1.4415, -0.9014],
        [ 0.8815, -0.6375,  0.6059,  ...,  0.2610, -1.4415, -0.9014],
        ...,
        [-1.6774,  0.2015,  1.5100,  ..., -0.9638, -0.2277, -0.2433],
        [-1.6774,  0.2015,  1.5100,  ..., -0.9638, -0.2277, -0.2433],
        [-1.6774,  0.2015,  1.5100,  ..., -0.9638, -0.2277, -0.2433]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 54, 39, 30, 14,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8816, -0.6373,  0.6058,  ...,  0.2609, -1.4414, -0.9015],
        [ 0.4977,  1.2752, -1.1475,  ...,  1.1921,  2.6594,  0.6954],
        [ 0.8101, -1.1598, -1.8297,  ...,  0.1565, -0.2369, -1.2728],
        ...,
        [-1.6779,  0.2016,  1.5099,  ..., -0.9637, -0.2277, -0.2431],
        [-1.6779,  0.2016,  1.5099,  ..., -0.9637, -0.2277, -0.2431],
        [-1

PackedSequence(data=tensor([[ 0.8822, -0.6344,  0.6055,  ...,  0.2616, -1.4401, -0.9011],
        [ 0.4046,  0.0582, -1.4352,  ...,  0.3130, -1.5088,  0.5176],
        [ 0.8822, -0.6344,  0.6055,  ...,  0.2616, -1.4401, -0.9011],
        ...,
        [-1.6789,  0.2009,  1.5111,  ..., -0.9679, -0.2282, -0.2408],
        [-1.6789,  0.2009,  1.5111,  ..., -0.9679, -0.2282, -0.2408],
        [-1.6789,  0.2009,  1.5111,  ..., -0.9679, -0.2282, -0.2408]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 51, 41, 22, 15,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8821, -0.6343,  0.6055,  ...,  0.2617, -1.4400, -0.9011],
        [ 0.8120, -1.1584, -1.8287,  ...,  0.1552, -0.2372, -1.2703],
        [ 0.8120, -1.1584, -1.8287,  ...,  0.1552, -0.2372, -1.2703],
        ...,
        [-1.6787,  0.2005,  1.5114,  ..., -0.9684, -0.2287, -0.2409],
        [-1.6787,  0.2005,  1.5114,  ..., -0.9684, -0.2287, -0.2409],
        [-1

PackedSequence(data=tensor([[-0.2407,  0.1085, -0.1374,  ..., -0.4619, -0.6641,  1.6040],
        [-0.2407,  0.1085, -0.1374,  ..., -0.4619, -0.6641,  1.6040],
        [ 0.4917,  1.2720, -1.1363,  ...,  1.1994,  2.6560,  0.7008],
        ...,
        [-1.6789,  0.2007,  1.5111,  ..., -0.9658, -0.2289, -0.2400],
        [-1.6789,  0.2007,  1.5111,  ..., -0.9658, -0.2289, -0.2400],
        [-1.6789,  0.2007,  1.5111,  ..., -0.9658, -0.2289, -0.2400]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 56, 45, 30, 17,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8822, -0.6313,  0.6080,  ...,  0.2607, -1.4424, -0.9015],
        [ 0.8822, -0.6313,  0.6080,  ...,  0.2607, -1.4424, -0.9015],
        [ 0.8822, -0.6313,  0.6080,  ...,  0.2607, -1.4424, -0.9015],
        ...,
        [-1.6790,  0.2006,  1.5114,  ..., -0.9654, -0.2286, -0.2396],
        [-1.6790,  0.2006,  1.5114,  ..., -0.9654, -0.2286, -0.2396],
        [-1

PackedSequence(data=tensor([[ 0.8824, -0.6302,  0.6113,  ...,  0.2602, -1.4454, -0.9004],
        [ 0.8824, -0.6302,  0.6113,  ...,  0.2602, -1.4454, -0.9004],
        [ 0.8824, -0.6302,  0.6113,  ...,  0.2602, -1.4454, -0.9004],
        ...,
        [-1.6757,  0.1999,  1.5119,  ..., -0.9677, -0.2263, -0.2394],
        [-1.6757,  0.1999,  1.5119,  ..., -0.9677, -0.2263, -0.2394],
        [-1.6757,  0.1999,  1.5119,  ..., -0.9677, -0.2263, -0.2394]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 52, 39, 26, 15,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8111, -1.1569, -1.8279,  ...,  0.1548, -0.2361, -1.2669],
        [ 0.4069,  0.0572, -1.4391,  ...,  0.3144, -1.5079,  0.5182],
        [ 0.4069,  0.0572, -1.4391,  ...,  0.3144, -1.5079,  0.5182],
        ...,
        [-1.6756,  0.1998,  1.5122,  ..., -0.9678, -0.2261, -0.2393],
        [-1.6756,  0.1998,  1.5122,  ..., -0.9678, -0.2261, -0.2393],
        [-1

PackedSequence(data=tensor([[ 1.0244,  0.6665,  0.3529,  ...,  1.8573,  0.1078,  1.3351],
        [ 0.8821, -0.6316,  0.6134,  ...,  0.2612, -1.4461, -0.8992],
        [ 0.8106, -1.1568, -1.8286,  ...,  0.1552, -0.2343, -1.2668],
        ...,
        [-1.6762,  0.2014,  1.5137,  ..., -0.9646, -0.2261, -0.2403],
        [-1.6762,  0.2014,  1.5137,  ..., -0.9646, -0.2261, -0.2403],
        [-1.6762,  0.2014,  1.5137,  ..., -0.9646, -0.2261, -0.2403]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 60, 52, 39, 30, 13,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.4085,  0.0577, -1.4400,  ...,  0.3125, -1.5081,  0.5174],
        [ 0.8102, -1.1566, -1.8283,  ...,  0.1551, -0.2339, -1.2671],
        [-0.2398,  0.1065, -0.1478,  ..., -0.4601, -0.6634,  1.6074],
        ...,
        [-1.6761,  0.2015,  1.5136,  ..., -0.9644, -0.2260, -0.2405],
        [-1.6761,  0.2015,  1.5136,  ..., -0.9644, -0.2260, -0.2405],
        [-1

PackedSequence(data=tensor([[ 1.0246,  0.6674,  0.3536,  ...,  1.8562,  0.1046,  1.3357],
        [ 0.8834, -0.6328,  0.6120,  ...,  0.2594, -1.4491, -0.9053],
        [-1.4247,  0.7798, -1.8079,  ..., -0.4591,  0.0364,  0.2358],
        ...,
        [-1.6819,  0.1994,  1.5164,  ..., -0.9636, -0.2218, -0.2372],
        [-1.6819,  0.1994,  1.5164,  ..., -0.9636, -0.2218, -0.2372],
        [-1.6819,  0.1994,  1.5164,  ..., -0.9636, -0.2218, -0.2372]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 53, 40, 25, 14,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8837, -0.6330,  0.6119,  ...,  0.2591, -1.4493, -0.9059],
        [ 0.8837, -0.6330,  0.6119,  ...,  0.2591, -1.4493, -0.9059],
        [ 0.8837, -0.6330,  0.6119,  ...,  0.2591, -1.4493, -0.9059],
        ...,
        [-1.6818,  0.1994,  1.5169,  ..., -0.9638, -0.2216, -0.2371],
        [-1.6818,  0.1994,  1.5169,  ..., -0.9638, -0.2216, -0.2371],
        [-1

PackedSequence(data=tensor([[ 0.8786, -0.6396,  0.6122,  ...,  0.2625, -1.4465, -0.9068],
        [ 0.8786, -0.6396,  0.6122,  ...,  0.2625, -1.4465, -0.9068],
        [ 0.8786, -0.6396,  0.6122,  ...,  0.2625, -1.4465, -0.9068],
        ...,
        [-1.6840,  0.2001,  1.5175,  ..., -0.9660, -0.2235, -0.2371],
        [-1.6840,  0.2001,  1.5175,  ..., -0.9660, -0.2235, -0.2371],
        [-1.6840,  0.2001,  1.5175,  ..., -0.9660, -0.2235, -0.2371]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 57, 46, 35, 20,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8785, -0.6397,  0.6116,  ...,  0.2625, -1.4463, -0.9070],
        [ 1.0263,  0.6662,  0.3530,  ...,  1.8578,  0.1059,  1.3339],
        [-0.2372,  1.7960, -0.9739,  ...,  0.3471, -2.1483,  1.5456],
        ...,
        [-1.6836,  0.1998,  1.5176,  ..., -0.9663, -0.2235, -0.2371],
        [-1.6836,  0.1998,  1.5176,  ..., -0.9663, -0.2235, -0.2371],
        [-1

PackedSequence(data=tensor([[ 1.0247,  0.6654,  0.3512,  ...,  1.8580,  0.1082,  1.3376],
        [ 0.8069, -1.1563, -1.8371,  ...,  0.1563, -0.2342, -1.2704],
        [ 0.8805, -0.6386,  0.6066,  ...,  0.2601, -1.4460, -0.9116],
        ...,
        [-1.6816,  0.2028,  1.5165,  ..., -0.9647, -0.2230, -0.2394],
        [-1.6816,  0.2028,  1.5165,  ..., -0.9647, -0.2230, -0.2394],
        [-1.6816,  0.2028,  1.5165,  ..., -0.9647, -0.2230, -0.2394]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 53, 39, 23, 10,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2414,  0.1088, -0.1475,  ..., -0.4771, -0.6644,  1.5926],
        [ 0.8808, -0.6385,  0.6067,  ...,  0.2598, -1.4461, -0.9119],
        [ 0.8808, -0.6385,  0.6067,  ...,  0.2598, -1.4461, -0.9119],
        ...,
        [-1.6816,  0.2026,  1.5169,  ..., -0.9645, -0.2226, -0.2389],
        [-1.6816,  0.2026,  1.5169,  ..., -0.9645, -0.2226, -0.2389],
        [-1

PackedSequence(data=tensor([[ 0.8832, -0.6403,  0.6055,  ...,  0.2575, -1.4487, -0.9140],
        [ 1.0244,  0.6638,  0.3489,  ...,  1.8589,  0.1095,  1.3393],
        [ 0.8832, -0.6403,  0.6055,  ...,  0.2575, -1.4487, -0.9140],
        ...,
        [-1.6797,  0.2014,  1.5144,  ..., -0.9626, -0.2240, -0.2370],
        [-1.6797,  0.2014,  1.5144,  ..., -0.9626, -0.2240, -0.2370],
        [-1.6797,  0.2014,  1.5144,  ..., -0.9626, -0.2240, -0.2370]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 52, 35, 27, 14,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8830, -0.6404,  0.6054,  ...,  0.2573, -1.4489, -0.9138],
        [ 0.8830, -0.6404,  0.6054,  ...,  0.2573, -1.4489, -0.9138],
        [ 0.4917,  1.2680, -1.1358,  ...,  1.2015,  2.6595,  0.6939],
        ...,
        [-1.6792,  0.2014,  1.5146,  ..., -0.9624, -0.2243, -0.2371],
        [-1.6792,  0.2014,  1.5146,  ..., -0.9624, -0.2243, -0.2371],
        [-1

PackedSequence(data=tensor([[ 0.8791, -0.6390,  0.6033,  ...,  0.2554, -1.4512, -0.9140],
        [-0.2319,  1.7914, -0.9812,  ...,  0.3496, -2.1445,  1.5510],
        [ 0.8791, -0.6390,  0.6033,  ...,  0.2554, -1.4512, -0.9140],
        ...,
        [-1.6793,  0.2022,  1.5151,  ..., -0.9641, -0.2238, -0.2376],
        [-1.6793,  0.2022,  1.5151,  ..., -0.9641, -0.2238, -0.2376],
        [-1.6793,  0.2022,  1.5151,  ..., -0.9641, -0.2238, -0.2376]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 54, 40, 27, 15, 10]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0303, -2.0427,  0.1438,  ...,  0.6658,  1.4461,  1.3967],
        [ 1.0283,  0.6611,  0.3492,  ...,  1.8601,  0.1072,  1.3371],
        [-0.6142,  0.6560, -0.4805,  ...,  0.0566,  0.0670, -0.2878],
        ...,
        [-1.6795,  0.2026,  1.5150,  ..., -0.9638, -0.2238, -0.2379],
        [-1.6795,  0.2026,  1.5150,  ..., -0.9638, -0.2238, -0.2379],
        [-1

PackedSequence(data=tensor([[ 1.0298,  0.6598,  0.3487,  ...,  1.8605,  0.1084,  1.3360],
        [ 0.8761, -0.6373,  0.6052,  ...,  0.2546, -1.4522, -0.9161],
        [ 0.4927,  1.2666, -1.1376,  ...,  1.2053,  2.6620,  0.6901],
        ...,
        [-1.6799,  0.2023,  1.5143,  ..., -0.9639, -0.2251, -0.2380],
        [-1.6799,  0.2023,  1.5143,  ..., -0.9639, -0.2251, -0.2380],
        [-1.6799,  0.2023,  1.5143,  ..., -0.9639, -0.2251, -0.2380]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 51, 42, 31, 17,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8761, -0.6373,  0.6052,  ...,  0.2546, -1.4522, -0.9162],
        [ 0.8761, -0.6373,  0.6052,  ...,  0.2546, -1.4522, -0.9162],
        [ 0.4927,  1.2667, -1.1377,  ...,  1.2054,  2.6622,  0.6900],
        ...,
        [-1.6795,  0.2019,  1.5145,  ..., -0.9643, -0.2254, -0.2378],
        [-1.6795,  0.2019,  1.5145,  ..., -0.9643, -0.2254, -0.2378],
        [-1

PackedSequence(data=tensor([[ 1.0310,  0.6588,  0.3505,  ...,  1.8617,  0.1125,  1.3363],
        [ 0.8750, -0.6378,  0.6051,  ...,  0.2549, -1.4517, -0.9144],
        [ 0.8750, -0.6378,  0.6051,  ...,  0.2549, -1.4517, -0.9144],
        ...,
        [-1.6782,  0.2041,  1.5143,  ..., -0.9653, -0.2267, -0.2376],
        [-1.6782,  0.2041,  1.5143,  ..., -0.9653, -0.2267, -0.2376],
        [-1.6782,  0.2041,  1.5143,  ..., -0.9653, -0.2267, -0.2376]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 56, 43, 33, 20,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.4922,  1.2682, -1.1390,  ...,  1.2045,  2.6636,  0.6903],
        [ 0.4040,  0.0680, -1.4291,  ...,  0.3091, -1.5128,  0.5196],
        [ 0.4922,  1.2682, -1.1390,  ...,  1.2045,  2.6636,  0.6903],
        ...,
        [-1.6778,  0.2038,  1.5143,  ..., -0.9656, -0.2267, -0.2372],
        [-1.6778,  0.2038,  1.5143,  ..., -0.9656, -0.2267, -0.2372],
        [-1

PackedSequence(data=tensor([[ 0.4908,  1.2686, -1.1385,  ...,  1.2027,  2.6630,  0.6910],
        [ 0.4045,  0.0686, -1.4288,  ...,  0.3098, -1.5131,  0.5193],
        [-0.2395,  1.7937, -0.9751,  ...,  0.3425, -2.1459,  1.5515],
        ...,
        [-1.6784,  0.2019,  1.5101,  ..., -0.9632, -0.2286, -0.2365],
        [-1.6784,  0.2019,  1.5101,  ..., -0.9632, -0.2286, -0.2365],
        [-1.6784,  0.2019,  1.5101,  ..., -0.9632, -0.2286, -0.2365]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 52, 37, 22, 14,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8764, -0.6371,  0.6066,  ...,  0.2558, -1.4529, -0.9109],
        [ 0.8764, -0.6371,  0.6066,  ...,  0.2558, -1.4529, -0.9109],
        [ 0.8764, -0.6371,  0.6066,  ...,  0.2558, -1.4529, -0.9109],
        ...,
        [-1.6784,  0.2020,  1.5099,  ..., -0.9629, -0.2290, -0.2368],
        [-1.6784,  0.2020,  1.5099,  ..., -0.9629, -0.2290, -0.2368],
        [-1

PackedSequence(data=tensor([[ 0.8124, -1.1590, -1.8352,  ...,  0.1583, -0.2350, -1.2665],
        [ 0.8734, -0.6346,  0.6123,  ...,  0.2540, -1.4542, -0.9067],
        [ 0.8734, -0.6346,  0.6123,  ...,  0.2540, -1.4542, -0.9067],
        ...,
        [-1.6781,  0.2023,  1.5132,  ..., -0.9651, -0.2280, -0.2360],
        [-1.6781,  0.2023,  1.5132,  ..., -0.9651, -0.2280, -0.2360],
        [-1.6781,  0.2023,  1.5132,  ..., -0.9651, -0.2280, -0.2360]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 47, 37, 25, 14,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8125, -1.1590, -1.8353,  ...,  0.1584, -0.2351, -1.2664],
        [ 0.4902,  1.2692, -1.1388,  ...,  1.2020,  2.6629,  0.6916],
        [ 0.4042,  0.0684, -1.4325,  ...,  0.3095, -1.5126,  0.5172],
        ...,
        [-1.6785,  0.2028,  1.5131,  ..., -0.9649, -0.2276, -0.2362],
        [-1.6785,  0.2028,  1.5131,  ..., -0.9649, -0.2276, -0.2362],
        [-1

PackedSequence(data=tensor([[ 0.8128, -1.1593, -1.8376,  ...,  0.1596, -0.2353, -1.2650],
        [ 0.1027,  2.0280,  1.3582,  ..., -1.2247, -0.6968,  1.2698],
        [ 0.8704, -0.6391,  0.6139,  ...,  0.2558, -1.4583, -0.9078],
        ...,
        [-1.6803,  0.2000,  1.5118,  ..., -0.9639, -0.2294, -0.2353],
        [-1.6803,  0.2000,  1.5118,  ..., -0.9639, -0.2294, -0.2353],
        [-1.6803,  0.2000,  1.5118,  ..., -0.9639, -0.2294, -0.2353]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 48, 34, 21, 12,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8705, -0.6392,  0.6138,  ...,  0.2559, -1.4582, -0.9080],
        [ 0.8128, -1.1594, -1.8378,  ...,  0.1598, -0.2352, -1.2649],
        [ 0.8705, -0.6392,  0.6138,  ...,  0.2559, -1.4582, -0.9080],
        ...,
        [-1.6802,  0.1999,  1.5121,  ..., -0.9636, -0.2290, -0.2351],
        [-1.6802,  0.1999,  1.5121,  ..., -0.9636, -0.2290, -0.2351],
        [-1

PackedSequence(data=tensor([[ 0.8697, -0.6428,  0.6080,  ...,  0.2585, -1.4528, -0.9053],
        [-0.2375,  1.7923, -0.9788,  ...,  0.3444, -2.1498,  1.5526],
        [ 0.4903,  1.2724, -1.1408,  ...,  1.1989,  2.6647,  0.6917],
        ...,
        [-1.6782,  0.2020,  1.5128,  ..., -0.9635, -0.2304, -0.2377],
        [-1.6782,  0.2020,  1.5128,  ..., -0.9635, -0.2304, -0.2377],
        [-1.6782,  0.2020,  1.5128,  ..., -0.9635, -0.2304, -0.2377]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 57, 47, 35, 17,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8695, -0.6424,  0.6071,  ...,  0.2589, -1.4530, -0.9053],
        [ 1.0254,  0.6591,  0.3528,  ...,  1.8616,  0.1141,  1.3390],
        [ 0.8695, -0.6424,  0.6071,  ...,  0.2589, -1.4530, -0.9053],
        ...,
        [-1.6780,  0.2019,  1.5121,  ..., -0.9638, -0.2312, -0.2380],
        [-1.6780,  0.2019,  1.5121,  ..., -0.9638, -0.2312, -0.2380],
        [-1

PackedSequence(data=tensor([[ 0.8146, -1.1594, -1.8388,  ...,  0.1610, -0.2361, -1.2653],
        [ 0.8674, -0.6396,  0.6022,  ...,  0.2598, -1.4535, -0.9060],
        [ 0.8674, -0.6396,  0.6022,  ...,  0.2598, -1.4535, -0.9060],
        ...,
        [-1.6768,  0.2003,  1.5122,  ..., -0.9662, -0.2294, -0.2371],
        [-1.6768,  0.2003,  1.5122,  ..., -0.9662, -0.2294, -0.2371],
        [-1.6768,  0.2003,  1.5122,  ..., -0.9662, -0.2294, -0.2371]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 53, 41, 26, 11,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8147, -1.1595, -1.8389,  ...,  0.1612, -0.2361, -1.2653],
        [ 0.8673, -0.6395,  0.6021,  ...,  0.2600, -1.4534, -0.9060],
        [ 0.4092,  0.0624, -1.4289,  ...,  0.3068, -1.5116,  0.5132],
        ...,
        [-1.6767,  0.2002,  1.5123,  ..., -0.9663, -0.2295, -0.2369],
        [-1.6767,  0.2002,  1.5123,  ..., -0.9663, -0.2295, -0.2369],
        [-1

PackedSequence(data=tensor([[ 0.4942,  1.2746, -1.1394,  ...,  1.1999,  2.6728,  0.6890],
        [ 0.8167, -1.1651, -1.8431,  ...,  0.1684, -0.2358, -1.2750],
        [ 0.8673, -0.6397,  0.6026,  ...,  0.2629, -1.4516, -0.9074],
        ...,
        [-1.6772,  0.2003,  1.5132,  ..., -0.9667, -0.2323, -0.2368],
        [-1.6772,  0.2003,  1.5132,  ..., -0.9667, -0.2323, -0.2368],
        [-1.6772,  0.2003,  1.5132,  ..., -0.9667, -0.2323, -0.2368]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 53, 44, 31, 18,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.4942,  1.2747, -1.1395,  ...,  1.2001,  2.6728,  0.6891],
        [ 0.8673, -0.6397,  0.6028,  ...,  0.2629, -1.4515, -0.9075],
        [-0.2418,  1.7920, -0.9776,  ...,  0.3438, -2.1527,  1.5567],
        ...,
        [-1.6769,  0.2000,  1.5136,  ..., -0.9669, -0.2324, -0.2367],
        [-1.6769,  0.2000,  1.5136,  ..., -0.9669, -0.2324, -0.2367],
        [-1

PackedSequence(data=tensor([[ 0.8679, -0.6380,  0.6041,  ...,  0.2626, -1.4507, -0.9084],
        [ 0.8679, -0.6380,  0.6041,  ...,  0.2626, -1.4507, -0.9084],
        [ 1.0371, -2.0550,  0.1403,  ...,  0.6755,  1.4637,  1.3948],
        ...,
        [-1.6755,  0.1989,  1.5161,  ..., -0.9663, -0.2316, -0.2350],
        [-1.6755,  0.1989,  1.5161,  ..., -0.9663, -0.2316, -0.2350],
        [-1.6755,  0.1989,  1.5161,  ..., -0.9663, -0.2316, -0.2350]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 54, 35, 27, 16,  9]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8192, -1.1662, -1.8435,  ...,  0.1700, -0.2337, -1.2791],
        [ 0.8679, -0.6378,  0.6041,  ...,  0.2625, -1.4507, -0.9084],
        [ 0.8192, -1.1662, -1.8435,  ...,  0.1700, -0.2337, -1.2791],
        ...,
        [-1.6755,  0.1989,  1.5166,  ..., -0.9662, -0.2313, -0.2348],
        [-1.6755,  0.1989,  1.5166,  ..., -0.9662, -0.2313, -0.2348],
        [-1

PackedSequence(data=tensor([[ 0.8694, -0.6354,  0.6025,  ...,  0.2659, -1.4480, -0.9081],
        [-0.2448,  0.1094, -0.1385,  ..., -0.4739, -0.6561,  1.5955],
        [ 0.8694, -0.6354,  0.6025,  ...,  0.2659, -1.4480, -0.9081],
        ...,
        [-1.6799,  0.1973,  1.5190,  ..., -0.9645, -0.2306, -0.2329],
        [-1.6799,  0.1973,  1.5190,  ..., -0.9645, -0.2306, -0.2329],
        [-1.6799,  0.1973,  1.5190,  ..., -0.9645, -0.2306, -0.2329]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 50, 37, 24, 10,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8695, -0.6352,  0.6022,  ...,  0.2660, -1.4476, -0.9082],
        [-0.2380,  1.7873, -0.9806,  ...,  0.3446, -2.1538,  1.5567],
        [ 1.0280,  0.6587,  0.3631,  ...,  1.8603,  0.1170,  1.3326],
        ...,
        [-1.6800,  0.1970,  1.5190,  ..., -0.9648, -0.2307, -0.2330],
        [-1.6800,  0.1970,  1.5190,  ..., -0.9648, -0.2307, -0.2330],
        [-1

PackedSequence(data=tensor([[ 0.8720, -0.6345,  0.6018,  ...,  0.2688, -1.4473, -0.9107],
        [ 0.5027,  1.2722, -1.1366,  ...,  1.2061,  2.6647,  0.6968],
        [ 0.8268, -1.1640, -1.8355,  ...,  0.1707, -0.2346, -1.2781],
        ...,
        [-1.6771,  0.1998,  1.5172,  ..., -0.9641, -0.2303, -0.2385],
        [-1.6771,  0.1998,  1.5172,  ..., -0.9641, -0.2303, -0.2385],
        [-1.6771,  0.1998,  1.5172,  ..., -0.9641, -0.2303, -0.2385]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([42, 42, 42, 42, 41, 37, 25, 17, 11,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8268, -1.1640, -1.8356,  ...,  0.1708, -0.2347, -1.2779],
        [ 0.8723, -0.6348,  0.6019,  ...,  0.2689, -1.4473, -0.9108],
        [ 0.8723, -0.6348,  0.6019,  ...,  0.2689, -1.4473, -0.9108],
        ...,
        [-1.6770,  0.1994,  1.5174,  ..., -0.9641, -0.2298, -0.2381],
        [-1.6770,  0.1994,  1.5174,  ..., -0.9641, -0.2298, -0.2381],
        [-1

PackedSequence(data=tensor([[ 1.0246,  0.6613,  0.3564,  ...,  1.8572,  0.1145,  1.3356],
        [ 0.8745, -0.6379,  0.6020,  ...,  0.2697, -1.4462, -0.9120],
        [ 0.8745, -0.6379,  0.6020,  ...,  0.2697, -1.4462, -0.9120],
        ...,
        [-1.6806,  0.1980,  1.5141,  ..., -0.9620, -0.2285, -0.2355],
        [-1.6806,  0.1980,  1.5141,  ..., -0.9620, -0.2285, -0.2355],
        [-1.6806,  0.1980,  1.5141,  ..., -0.9620, -0.2285, -0.2355]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 55, 43, 27, 17,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2463,  0.1058, -0.1366,  ..., -0.4751, -0.6587,  1.5967],
        [ 1.0247,  0.6612,  0.3563,  ...,  1.8571,  0.1145,  1.3356],
        [ 1.0247,  0.6612,  0.3563,  ...,  1.8571,  0.1145,  1.3356],
        ...,
        [-1.6808,  0.1980,  1.5142,  ..., -0.9619, -0.2286, -0.2355],
        [-1.6808,  0.1980,  1.5142,  ..., -0.9619, -0.2286, -0.2355],
        [-1

PackedSequence(data=tensor([[ 0.5167,  1.2574, -1.1290,  ...,  1.2035,  2.6597,  0.6967],
        [ 0.8747, -0.6379,  0.6016,  ...,  0.2683, -1.4454, -0.9105],
        [ 0.8747, -0.6379,  0.6016,  ...,  0.2683, -1.4454, -0.9105],
        ...,
        [-1.6798,  0.1981,  1.5176,  ..., -0.9636, -0.2270, -0.2348],
        [-1.6798,  0.1981,  1.5176,  ..., -0.9636, -0.2270, -0.2348],
        [-1.6798,  0.1981,  1.5176,  ..., -0.9636, -0.2270, -0.2348]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 59, 47, 32, 18, 10]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8749, -0.6380,  0.6015,  ...,  0.2683, -1.4455, -0.9105],
        [ 1.0254,  0.6611,  0.3554,  ...,  1.8578,  0.1134,  1.3365],
        [ 0.8749, -0.6380,  0.6015,  ...,  0.2683, -1.4455, -0.9105],
        ...,
        [-1.6799,  0.1983,  1.5180,  ..., -0.9636, -0.2265, -0.2345],
        [-1.6799,  0.1983,  1.5180,  ..., -0.9636, -0.2265, -0.2345],
        [-1

PackedSequence(data=tensor([[ 0.8720, -0.6387,  0.6013,  ...,  0.2672, -1.4490, -0.9090],
        [ 0.8720, -0.6387,  0.6013,  ...,  0.2672, -1.4490, -0.9090],
        [ 0.3936,  0.0653, -1.4420,  ...,  0.3131, -1.5131,  0.5091],
        ...,
        [-1.6858,  0.1977,  1.5212,  ..., -0.9645, -0.2272, -0.2352],
        [-1.6858,  0.1977,  1.5212,  ..., -0.9645, -0.2272, -0.2352],
        [-1.6858,  0.1977,  1.5212,  ..., -0.9645, -0.2272, -0.2352]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 56, 42, 33, 14,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8719, -0.6386,  0.6013,  ...,  0.2673, -1.4491, -0.9090],
        [ 0.8719, -0.6386,  0.6013,  ...,  0.2673, -1.4491, -0.9090],
        [ 0.8719, -0.6386,  0.6013,  ...,  0.2673, -1.4491, -0.9090],
        ...,
        [-1.6864,  0.1974,  1.5212,  ..., -0.9646, -0.2273, -0.2352],
        [-1.6864,  0.1974,  1.5212,  ..., -0.9646, -0.2273, -0.2352],
        [-1

PackedSequence(data=tensor([[ 0.8723, -0.6381,  0.6007,  ...,  0.2652, -1.4495, -0.9120],
        [ 0.8256, -1.1574, -1.8425,  ...,  0.1642, -0.2411, -1.2840],
        [ 0.8723, -0.6381,  0.6007,  ...,  0.2652, -1.4495, -0.9120],
        ...,
        [-1.6864,  0.1969,  1.5224,  ..., -0.9606, -0.2283, -0.2362],
        [-1.6864,  0.1969,  1.5224,  ..., -0.9606, -0.2283, -0.2362],
        [-1.6864,  0.1969,  1.5224,  ..., -0.9606, -0.2283, -0.2362]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 54, 41, 30, 18,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8726, -0.6381,  0.6007,  ...,  0.2651, -1.4496, -0.9124],
        [ 0.8726, -0.6381,  0.6007,  ...,  0.2651, -1.4496, -0.9124],
        [ 0.5097,  1.2585, -1.1296,  ...,  1.2025,  2.6591,  0.6958],
        ...,
        [-1.6866,  0.1971,  1.5222,  ..., -0.9606, -0.2281, -0.2360],
        [-1.6866,  0.1971,  1.5222,  ..., -0.9606, -0.2281, -0.2360],
        [-1

PackedSequence(data=tensor([[ 0.8723, -0.6409,  0.6007,  ...,  0.2666, -1.4474, -0.9139],
        [ 0.8216, -1.1544, -1.8464,  ...,  0.1629, -0.2391, -1.2882],
        [ 0.8723, -0.6409,  0.6007,  ...,  0.2666, -1.4474, -0.9139],
        ...,
        [-1.6841,  0.1984,  1.5214,  ..., -0.9612, -0.2288, -0.2362],
        [-1.6841,  0.1984,  1.5214,  ..., -0.9612, -0.2288, -0.2362],
        [-1.6841,  0.1984,  1.5214,  ..., -0.9612, -0.2288, -0.2362]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 54, 43, 34, 18,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8215, -1.1543, -1.8466,  ...,  0.1629, -0.2390, -1.2884],
        [ 0.8723, -0.6409,  0.6009,  ...,  0.2664, -1.4479, -0.9135],
        [ 1.0301,  0.6550,  0.3570,  ...,  1.8657,  0.1020,  1.3294],
        ...,
        [-1.6839,  0.1985,  1.5217,  ..., -0.9612, -0.2290, -0.2364],
        [-1.6839,  0.1985,  1.5217,  ..., -0.9612, -0.2290, -0.2364],
        [-1

PackedSequence(data=tensor([[ 0.8708, -0.6424,  0.6018,  ...,  0.2654, -1.4500, -0.9115],
        [ 0.8708, -0.6424,  0.6018,  ...,  0.2654, -1.4500, -0.9115],
        [ 0.8708, -0.6424,  0.6018,  ...,  0.2654, -1.4500, -0.9115],
        ...,
        [-1.6862,  0.1994,  1.5199,  ..., -0.9608, -0.2282, -0.2346],
        [-1.6862,  0.1994,  1.5199,  ..., -0.9608, -0.2282, -0.2346],
        [-1.6862,  0.1994,  1.5199,  ..., -0.9608, -0.2282, -0.2346]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 57, 47, 30, 16,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.3917,  0.0675, -1.4438,  ...,  0.3131, -1.5145,  0.5135],
        [ 1.0352, -2.0527,  0.1399,  ...,  0.6714,  1.4753,  1.3965],
        [ 1.0307,  0.6532,  0.3559,  ...,  1.8665,  0.1007,  1.3273],
        ...,
        [-1.6860,  0.1994,  1.5195,  ..., -0.9607, -0.2285, -0.2348],
        [-1.6860,  0.1994,  1.5195,  ..., -0.9607, -0.2285, -0.2348],
        [-1

PackedSequence(data=tensor([[-0.2378,  0.1068, -0.1387,  ..., -0.4721, -0.6572,  1.5979],
        [-0.2378,  0.1068, -0.1387,  ..., -0.4721, -0.6572,  1.5979],
        [ 0.8715, -0.6466,  0.5991,  ...,  0.2670, -1.4502, -0.9129],
        ...,
        [-1.6849,  0.1979,  1.5145,  ..., -0.9624, -0.2311, -0.2352],
        [-1.6849,  0.1979,  1.5145,  ..., -0.9624, -0.2311, -0.2352],
        [-1.6849,  0.1979,  1.5145,  ..., -0.9624, -0.2311, -0.2352]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 55, 41, 28, 12,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8718, -0.6470,  0.5990,  ...,  0.2672, -1.4500, -0.9129],
        [ 1.0308,  0.6541,  0.3556,  ...,  1.8651,  0.1006,  1.3294],
        [ 0.8718, -0.6470,  0.5990,  ...,  0.2672, -1.4500, -0.9129],
        ...,
        [-1.6849,  0.1983,  1.5142,  ..., -0.9624, -0.2313, -0.2355],
        [-1.6849,  0.1983,  1.5142,  ..., -0.9624, -0.2313, -0.2355],
        [-1

PackedSequence(data=tensor([[ 1.0316,  0.6547,  0.3537,  ...,  1.8652,  0.1026,  1.3322],
        [ 1.0316,  0.6547,  0.3537,  ...,  1.8652,  0.1026,  1.3322],
        [ 0.5075,  1.2638, -1.1343,  ...,  1.1965,  2.6529,  0.6988],
        ...,
        [-1.6817,  0.1987,  1.5158,  ..., -0.9638, -0.2362, -0.2359],
        [-1.6817,  0.1987,  1.5158,  ..., -0.9638, -0.2362, -0.2359],
        [-1.6817,  0.1987,  1.5158,  ..., -0.9638, -0.2362, -0.2359]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 51, 38, 25, 13,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-1.4404,  0.7847, -1.8121,  ..., -0.4666,  0.0339,  0.2464],
        [ 0.8755, -0.6479,  0.5937,  ...,  0.2673, -1.4457, -0.9088],
        [ 0.8755, -0.6479,  0.5937,  ...,  0.2673, -1.4457, -0.9088],
        ...,
        [-1.6818,  0.1985,  1.5162,  ..., -0.9636, -0.2360, -0.2355],
        [-1.6818,  0.1985,  1.5162,  ..., -0.9636, -0.2360, -0.2355],
        [-1

PackedSequence(data=tensor([[ 0.8754, -0.6459,  0.5909,  ...,  0.2672, -1.4440, -0.9071],
        [ 0.8754, -0.6459,  0.5909,  ...,  0.2672, -1.4440, -0.9071],
        [ 0.8754, -0.6459,  0.5909,  ...,  0.2672, -1.4440, -0.9071],
        ...,
        [-1.6847,  0.1986,  1.5174,  ..., -0.9606, -0.2329, -0.2322],
        [-1.6847,  0.1986,  1.5174,  ..., -0.9606, -0.2329, -0.2322],
        [-1.6847,  0.1986,  1.5174,  ..., -0.9606, -0.2329, -0.2322]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 51, 40, 24, 14, 10]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0325, -2.0540,  0.1392,  ...,  0.6755,  1.4788,  1.3940],
        [ 0.8752, -0.6457,  0.5909,  ...,  0.2675, -1.4439, -0.9069],
        [ 0.8752, -0.6457,  0.5909,  ...,  0.2675, -1.4439, -0.9069],
        ...,
        [-1.6851,  0.1988,  1.5174,  ..., -0.9601, -0.2326, -0.2320],
        [-1.6851,  0.1988,  1.5174,  ..., -0.9601, -0.2326, -0.2320],
        [-1

PackedSequence(data=tensor([[-0.2253,  1.7784, -0.9925,  ...,  0.3462, -2.1506,  1.5618],
        [ 0.8729, -0.6447,  0.5922,  ...,  0.2708, -1.4430, -0.9023],
        [ 0.3875,  0.0656, -1.4403,  ...,  0.3160, -1.5176,  0.5252],
        ...,
        [-1.6851,  0.1964,  1.5173,  ..., -0.9575, -0.2318, -0.2314],
        [-1.6851,  0.1964,  1.5173,  ..., -0.9575, -0.2318, -0.2314],
        [-1.6851,  0.1964,  1.5173,  ..., -0.9575, -0.2318, -0.2314]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 54, 37, 25, 14,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0303,  0.6561,  0.3535,  ...,  1.8608,  0.1021,  1.3383],
        [ 0.0221, -0.7770,  0.1993,  ...,  1.1520, -0.6091,  0.9313],
        [ 1.0303,  0.6561,  0.3535,  ...,  1.8608,  0.1021,  1.3383],
        ...,
        [-1.6851,  0.1962,  1.5175,  ..., -0.9576, -0.2317, -0.2312],
        [-1.6851,  0.1962,  1.5175,  ..., -0.9576, -0.2317, -0.2312],
        [-1

PackedSequence(data=tensor([[-0.2447,  0.1125, -0.1383,  ..., -0.4797, -0.6495,  1.6012],
        [-0.6302,  0.6415, -0.4961,  ...,  0.0741,  0.0442, -0.2883],
        [ 0.4704,  1.7093, -0.1656,  ...,  1.3524, -0.0295,  0.6978],
        ...,
        [-1.6868,  0.1971,  1.5173,  ..., -0.9602, -0.2302, -0.2319],
        [-1.6868,  0.1971,  1.5173,  ..., -0.9602, -0.2302, -0.2319],
        [-1.6868,  0.1971,  1.5173,  ..., -0.9602, -0.2302, -0.2319]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 54, 37, 24, 13,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.5037,  1.2671, -1.1352,  ...,  1.1924,  2.6500,  0.7043],
        [-0.2446,  0.1127, -0.1385,  ..., -0.4798, -0.6496,  1.6010],
        [ 0.8172, -1.1517, -1.8497,  ...,  0.1574, -0.2313, -1.2859],
        ...,
        [-1.6866,  0.1972,  1.5170,  ..., -0.9601, -0.2306, -0.2322],
        [-1.6866,  0.1972,  1.5170,  ..., -0.9601, -0.2306, -0.2322],
        [-1

PackedSequence(data=tensor([[ 0.8168, -1.1497, -1.8511,  ...,  0.1559, -0.2322, -1.2871],
        [ 1.0319,  0.6529,  0.3453,  ...,  1.8627,  0.1112,  1.3479],
        [ 1.0319,  0.6529,  0.3453,  ...,  1.8627,  0.1112,  1.3479],
        ...,
        [-1.6871,  0.1950,  1.5169,  ..., -0.9602, -0.2323, -0.2304],
        [-1.6871,  0.1950,  1.5169,  ..., -0.9602, -0.2323, -0.2304],
        [-1.6871,  0.1950,  1.5169,  ..., -0.9602, -0.2323, -0.2304]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 57, 41, 30, 12,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.0043, -0.6923,  0.3812,  ...,  0.1232, -0.5476, -0.4753],
        [ 0.8780, -0.6370,  0.5984,  ...,  0.2650, -1.4552, -0.9006],
        [ 1.0321,  0.6526,  0.3448,  ...,  1.8631,  0.1114,  1.3480],
        ...,
        [-1.6872,  0.1950,  1.5170,  ..., -0.9603, -0.2323, -0.2303],
        [-1.6872,  0.1950,  1.5170,  ..., -0.9603, -0.2323, -0.2303],
        [-1

PackedSequence(data=tensor([[ 0.5004,  1.2657, -1.1382,  ...,  1.1922,  2.6480,  0.7048],
        [ 0.8801, -0.6372,  0.5989,  ...,  0.2660, -1.4557, -0.9014],
        [ 0.8801, -0.6372,  0.5989,  ...,  0.2660, -1.4557, -0.9014],
        ...,
        [-1.6864,  0.1968,  1.5160,  ..., -0.9585, -0.2324, -0.2300],
        [-1.6864,  0.1968,  1.5160,  ..., -0.9585, -0.2324, -0.2300],
        [-1.6864,  0.1968,  1.5160,  ..., -0.9585, -0.2324, -0.2300]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 55, 43, 32, 11,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8804, -0.6374,  0.5989,  ...,  0.2661, -1.4555, -0.9016],
        [ 0.5004,  1.2658, -1.1384,  ...,  1.1922,  2.6478,  0.7050],
        [ 0.8804, -0.6374,  0.5989,  ...,  0.2661, -1.4555, -0.9016],
        ...,
        [-1.6867,  0.1969,  1.5158,  ..., -0.9584, -0.2324, -0.2301],
        [-1.6867,  0.1969,  1.5158,  ..., -0.9584, -0.2324, -0.2301],
        [-1

PackedSequence(data=tensor([[ 1.0310,  0.6512,  0.3419,  ...,  1.8618,  0.1094,  1.3494],
        [ 0.3845,  0.0634, -1.4433,  ...,  0.3250, -1.5192,  0.5239],
        [ 1.0310,  0.6512,  0.3419,  ...,  1.8618,  0.1094,  1.3494],
        ...,
        [-1.6855,  0.1970,  1.5154,  ..., -0.9605, -0.2308, -0.2293],
        [-1.6855,  0.1970,  1.5154,  ..., -0.9605, -0.2308, -0.2293],
        [-1.6855,  0.1970,  1.5154,  ..., -0.9605, -0.2308, -0.2293]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 49, 35, 30, 14,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8829, -0.6374,  0.5982,  ...,  0.2675, -1.4518, -0.9029],
        [ 0.8829, -0.6374,  0.5982,  ...,  0.2675, -1.4518, -0.9029],
        [ 0.8169, -1.1433, -1.8505,  ...,  0.1515, -0.2340, -1.2869],
        ...,
        [-1.6857,  0.1969,  1.5153,  ..., -0.9607, -0.2307, -0.2291],
        [-1.6857,  0.1969,  1.5153,  ..., -0.9607, -0.2307, -0.2291],
        [-1

PackedSequence(data=tensor([[ 0.8173, -1.1427, -1.8503,  ...,  0.1519, -0.2328, -1.2868],
        [ 0.3845,  0.0630, -1.4446,  ...,  0.3273, -1.5176,  0.5241],
        [ 0.8831, -0.6397,  0.5953,  ...,  0.2681, -1.4505, -0.9048],
        ...,
        [-1.6866,  0.1951,  1.5167,  ..., -0.9617, -0.2307, -0.2284],
        [-1.6866,  0.1951,  1.5167,  ..., -0.9617, -0.2307, -0.2284],
        [-1.6866,  0.1951,  1.5167,  ..., -0.9617, -0.2307, -0.2284]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 56, 35, 23, 11,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8173, -1.1426, -1.8503,  ...,  0.1519, -0.2329, -1.2869],
        [ 0.8832, -0.6400,  0.5950,  ...,  0.2683, -1.4504, -0.9049],
        [ 0.8832, -0.6400,  0.5950,  ...,  0.2683, -1.4504, -0.9049],
        ...,
        [-1.6865,  0.1950,  1.5167,  ..., -0.9621, -0.2307, -0.2284],
        [-1.6865,  0.1950,  1.5167,  ..., -0.9621, -0.2307, -0.2284],
        [-1

PackedSequence(data=tensor([[ 0.8189, -1.1440, -1.8514,  ...,  0.1542, -0.2331, -1.2877],
        [ 0.5027,  1.2617, -1.1438,  ...,  1.1924,  2.6444,  0.7055],
        [ 0.8827, -0.6457,  0.5949,  ...,  0.2689, -1.4496, -0.9047],
        ...,
        [-1.6857,  0.1959,  1.5180,  ..., -0.9644, -0.2301, -0.2298],
        [-1.6857,  0.1959,  1.5180,  ..., -0.9644, -0.2301, -0.2298],
        [-1.6857,  0.1959,  1.5180,  ..., -0.9644, -0.2301, -0.2298]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 61, 49, 41, 24, 13,  1]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0281,  0.6535,  0.3378,  ...,  1.8587,  0.1177,  1.3522],
        [ 0.8823, -0.6461,  0.5950,  ...,  0.2690, -1.4497, -0.9045],
        [-0.2423,  0.1112, -0.1309,  ..., -0.4807, -0.6494,  1.6003],
        ...,
        [-1.6860,  0.1960,  1.5183,  ..., -0.9641, -0.2301, -0.2298],
        [-1.6860,  0.1960,  1.5183,  ..., -0.9641, -0.2301, -0.2298],
        [-1

PackedSequence(data=tensor([[-0.2419,  0.1136, -0.1359,  ..., -0.4798, -0.6512,  1.6016],
        [-0.2335,  1.7817, -0.9837,  ...,  0.3457, -2.1590,  1.5728],
        [ 0.3913,  0.0617, -1.4499,  ...,  0.3321, -1.5170,  0.5245],
        ...,
        [-1.6904,  0.1967,  1.5154,  ..., -0.9602, -0.2312, -0.2304],
        [-1.6904,  0.1967,  1.5154,  ..., -0.9602, -0.2312, -0.2304],
        [-1.6904,  0.1967,  1.5154,  ..., -0.9602, -0.2312, -0.2304]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 54, 34, 19, 12,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2419,  0.1137, -0.1362,  ..., -0.4798, -0.6513,  1.6016],
        [ 0.8781, -0.6495,  0.5949,  ...,  0.2685, -1.4501, -0.9018],
        [ 0.3920,  0.0618, -1.4497,  ...,  0.3320, -1.5169,  0.5247],
        ...,
        [-1.6904,  0.1964,  1.5155,  ..., -0.9600, -0.2313, -0.2301],
        [-1.6904,  0.1964,  1.5155,  ..., -0.9600, -0.2313, -0.2301],
        [-1

PackedSequence(data=tensor([[ 0.8763, -0.6497,  0.5971,  ...,  0.2662, -1.4494, -0.9008],
        [ 0.8763, -0.6497,  0.5971,  ...,  0.2662, -1.4494, -0.9008],
        [ 0.8763, -0.6497,  0.5971,  ...,  0.2662, -1.4494, -0.9008],
        ...,
        [-1.6873,  0.1953,  1.5195,  ..., -0.9593, -0.2282, -0.2275],
        [-1.6873,  0.1953,  1.5195,  ..., -0.9593, -0.2282, -0.2275],
        [-1.6873,  0.1953,  1.5195,  ..., -0.9593, -0.2282, -0.2275]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 53, 43, 27, 17,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8760, -0.6498,  0.5977,  ...,  0.2658, -1.4492, -0.9005],
        [ 0.8760, -0.6498,  0.5977,  ...,  0.2658, -1.4492, -0.9005],
        [ 0.8760, -0.6498,  0.5977,  ...,  0.2658, -1.4492, -0.9005],
        ...,
        [-1.6870,  0.1953,  1.5197,  ..., -0.9593, -0.2283, -0.2276],
        [-1.6870,  0.1953,  1.5197,  ..., -0.9593, -0.2283, -0.2276],
        [-1

PackedSequence(data=tensor([[ 0.8763, -0.6518,  0.6003,  ...,  0.2635, -1.4489, -0.9011],
        [ 0.8763, -0.6518,  0.6003,  ...,  0.2635, -1.4489, -0.9011],
        [ 1.0334,  0.6507,  0.3329,  ...,  1.8657,  0.1161,  1.3464],
        ...,
        [-1.6852,  0.1974,  1.5176,  ..., -0.9627, -0.2296, -0.2299],
        [-1.6852,  0.1974,  1.5176,  ..., -0.9627, -0.2296, -0.2299],
        [-1.6852,  0.1974,  1.5176,  ..., -0.9627, -0.2296, -0.2299]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 53, 41, 22, 12,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0334,  0.6506,  0.3328,  ...,  1.8657,  0.1159,  1.3463],
        [ 0.4989,  1.2646, -1.1424,  ...,  1.1926,  2.6527,  0.7055],
        [ 0.8224, -1.1492, -1.8537,  ...,  0.1610, -0.2287, -1.2820],
        ...,
        [-1.6851,  0.1973,  1.5176,  ..., -0.9631, -0.2297, -0.2299],
        [-1.6851,  0.1973,  1.5176,  ..., -0.9631, -0.2297, -0.2299],
        [-1

PackedSequence(data=tensor([[-0.2357,  1.7754, -0.9815,  ...,  0.3382, -2.1546,  1.5633],
        [ 0.8775, -0.6523,  0.6001,  ...,  0.2637, -1.4485, -0.8991],
        [ 0.8775, -0.6523,  0.6001,  ...,  0.2637, -1.4485, -0.8991],
        ...,
        [-1.6844,  0.1966,  1.5184,  ..., -0.9662, -0.2306, -0.2295],
        [-1.6844,  0.1966,  1.5184,  ..., -0.9662, -0.2306, -0.2295],
        [-1.6844,  0.1966,  1.5184,  ..., -0.9662, -0.2306, -0.2295]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 59, 43, 35, 19,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.4988,  1.2653, -1.1439,  ...,  1.1901,  2.6513,  0.7077],
        [ 0.4988,  1.2653, -1.1439,  ...,  1.1901,  2.6513,  0.7077],
        [ 1.0335,  0.6499,  0.3320,  ...,  1.8656,  0.1157,  1.3458],
        ...,
        [-1.6845,  0.1964,  1.5184,  ..., -0.9664, -0.2308, -0.2295],
        [-1.6845,  0.1964,  1.5184,  ..., -0.9664, -0.2308, -0.2295],
        [-1

PackedSequence(data=tensor([[ 0.3954,  0.0676, -1.4440,  ...,  0.3259, -1.5168,  0.5256],
        [ 0.8780, -0.6539,  0.5995,  ...,  0.2649, -1.4478, -0.8999],
        [ 0.3954,  0.0676, -1.4440,  ...,  0.3259, -1.5168,  0.5256],
        ...,
        [-1.6865,  0.1957,  1.5172,  ..., -0.9654, -0.2303, -0.2281],
        [-1.6865,  0.1957,  1.5172,  ..., -0.9654, -0.2303, -0.2281],
        [-1.6865,  0.1957,  1.5172,  ..., -0.9654, -0.2303, -0.2281]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 55, 43, 30, 18,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8209, -1.1495, -1.8570,  ...,  0.1566, -0.2281, -1.2783],
        [ 0.8780, -0.6540,  0.5993,  ...,  0.2650, -1.4478, -0.9001],
        [ 0.8780, -0.6540,  0.5993,  ...,  0.2650, -1.4478, -0.9001],
        ...,
        [-1.6865,  0.1957,  1.5169,  ..., -0.9655, -0.2304, -0.2282],
        [-1.6865,  0.1957,  1.5169,  ..., -0.9655, -0.2304, -0.2282],
        [-1

PackedSequence(data=tensor([[ 0.8811, -0.6533,  0.5946,  ...,  0.2699, -1.4482, -0.9029],
        [ 0.0058, -0.6946,  0.3839,  ...,  0.1294, -0.5498, -0.4654],
        [ 0.8811, -0.6533,  0.5946,  ...,  0.2699, -1.4482, -0.9029],
        ...,
        [-1.6868,  0.1949,  1.5157,  ..., -0.9660, -0.2307, -0.2275],
        [-1.6868,  0.1949,  1.5157,  ..., -0.9660, -0.2307, -0.2275],
        [-1.6868,  0.1949,  1.5157,  ..., -0.9660, -0.2307, -0.2275]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 54, 37, 30, 14,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2363,  0.1139, -0.1407,  ..., -0.4791, -0.6520,  1.5991],
        [ 0.8813, -0.6532,  0.5942,  ...,  0.2703, -1.4481, -0.9032],
        [ 0.3955,  0.0683, -1.4433,  ...,  0.3262, -1.5172,  0.5246],
        ...,
        [-1.6869,  0.1951,  1.5159,  ..., -0.9658, -0.2306, -0.2274],
        [-1.6869,  0.1951,  1.5159,  ..., -0.9658, -0.2306, -0.2274],
        [-1

PackedSequence(data=tensor([[ 0.8858, -0.6500,  0.5895,  ...,  0.2716, -1.4473, -0.9057],
        [ 0.8261, -1.1532, -1.8542,  ...,  0.1607, -0.2325, -1.2841],
        [ 0.5027,  1.2613, -1.1441,  ...,  1.1957,  2.6481,  0.7103],
        ...,
        [-1.6904,  0.1963,  1.5164,  ..., -0.9648, -0.2321, -0.2277],
        [-1.6904,  0.1963,  1.5164,  ..., -0.9648, -0.2321, -0.2277],
        [-1.6904,  0.1963,  1.5164,  ..., -0.9648, -0.2321, -0.2277]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 49, 40, 29, 15,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8861, -0.6497,  0.5895,  ...,  0.2718, -1.4471, -0.9059],
        [ 0.8262, -1.1532, -1.8541,  ...,  0.1607, -0.2324, -1.2841],
        [ 0.3957,  0.0675, -1.4440,  ...,  0.3276, -1.5165,  0.5239],
        ...,
        [-1.6904,  0.1964,  1.5166,  ..., -0.9649, -0.2322, -0.2277],
        [-1.6904,  0.1964,  1.5166,  ..., -0.9649, -0.2322, -0.2277],
        [-1

PackedSequence(data=tensor([[ 0.3940,  0.0679, -1.4435,  ...,  0.3275, -1.5152,  0.5244],
        [-0.2265,  1.7775, -0.9846,  ...,  0.3452, -2.1663,  1.5655],
        [-0.2265,  1.7775, -0.9846,  ...,  0.3452, -2.1663,  1.5655],
        ...,
        [-1.6908,  0.1950,  1.5174,  ..., -0.9659, -0.2316, -0.2282],
        [-1.6908,  0.1950,  1.5174,  ..., -0.9659, -0.2316, -0.2282],
        [-1.6908,  0.1950,  1.5174,  ..., -0.9659, -0.2316, -0.2282]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 55, 44, 32, 18,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8269, -1.1549, -1.8544,  ...,  0.1623, -0.2321, -1.2829],
        [ 0.8867, -0.6473,  0.5889,  ...,  0.2722, -1.4456, -0.9073],
        [ 0.5041,  1.2615, -1.1385,  ...,  1.1979,  2.6472,  0.7105],
        ...,
        [-1.6908,  0.1950,  1.5173,  ..., -0.9657, -0.2315, -0.2282],
        [-1.6908,  0.1950,  1.5173,  ..., -0.9657, -0.2315, -0.2282],
        [-1

PackedSequence(data=tensor([[-0.2276,  1.7779, -0.9838,  ...,  0.3442, -2.1679,  1.5666],
        [ 0.8832, -0.6467,  0.5916,  ...,  0.2715, -1.4470, -0.9047],
        [ 1.0339,  0.6470,  0.3339,  ...,  1.8701,  0.1228,  1.3428],
        ...,
        [-1.6920,  0.1942,  1.5170,  ..., -0.9631, -0.2310, -0.2280],
        [-1.6920,  0.1942,  1.5170,  ..., -0.9631, -0.2310, -0.2280],
        [-1.6920,  0.1942,  1.5170,  ..., -0.9631, -0.2310, -0.2280]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 57, 45, 27, 16,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2277,  1.7778, -0.9838,  ...,  0.3442, -2.1679,  1.5667],
        [-0.2354,  0.1091, -0.1394,  ..., -0.4735, -0.6485,  1.5976],
        [ 0.5041,  1.2623, -1.1370,  ...,  1.1977,  2.6456,  0.7108],
        ...,
        [-1.6921,  0.1943,  1.5171,  ..., -0.9630, -0.2310, -0.2282],
        [-1.6921,  0.1943,  1.5171,  ..., -0.9630, -0.2310, -0.2282],
        [-1

PackedSequence(data=tensor([[ 0.8757, -0.6459,  0.5975,  ...,  0.2742, -1.4423, -0.9075],
        [-0.2350,  0.1079, -0.1390,  ..., -0.4725, -0.6480,  1.5955],
        [ 0.8757, -0.6459,  0.5975,  ...,  0.2742, -1.4423, -0.9075],
        ...,
        [-1.6922,  0.1973,  1.5178,  ..., -0.9633, -0.2281, -0.2279],
        [-1.6922,  0.1973,  1.5178,  ..., -0.9633, -0.2281, -0.2279],
        [-1.6922,  0.1973,  1.5178,  ..., -0.9633, -0.2281, -0.2279]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 54, 40, 28, 12,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.4013,  0.0621, -1.4422,  ...,  0.3294, -1.5107,  0.5279],
        [ 0.8752, -0.6459,  0.5977,  ...,  0.2741, -1.4420, -0.9077],
        [ 0.8251, -1.1508, -1.8510,  ...,  0.1631, -0.2308, -1.2809],
        ...,
        [-1.6923,  0.1973,  1.5176,  ..., -0.9632, -0.2283, -0.2279],
        [-1.6923,  0.1973,  1.5176,  ..., -0.9632, -0.2283, -0.2279],
        [-1

PackedSequence(data=tensor([[ 1.0354, -2.0760,  0.1381,  ...,  0.6731,  1.4818,  1.3691],
        [ 1.0369,  0.6465,  0.3332,  ...,  1.8717,  0.1223,  1.3458],
        [ 0.5034,  1.2649, -1.1358,  ...,  1.1951,  2.6462,  0.7093],
        ...,
        [-1.6911,  0.1957,  1.5167,  ..., -0.9640, -0.2292, -0.2269],
        [-1.6911,  0.1957,  1.5167,  ..., -0.9640, -0.2292, -0.2269],
        [-1.6911,  0.1957,  1.5167,  ..., -0.9640, -0.2292, -0.2269]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 56, 43, 27, 10,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8253, -1.1517, -1.8511,  ...,  0.1646, -0.2315, -1.2807],
        [ 0.8728, -0.6472,  0.5980,  ...,  0.2754, -1.4415, -0.9119],
        [ 0.8728, -0.6472,  0.5980,  ...,  0.2754, -1.4415, -0.9119],
        ...,
        [-1.6908,  0.1956,  1.5169,  ..., -0.9640, -0.2290, -0.2266],
        [-1.6908,  0.1956,  1.5169,  ..., -0.9640, -0.2290, -0.2266],
        [-1

PackedSequence(data=tensor([[ 0.8726, -0.6472,  0.5984,  ...,  0.2754, -1.4429, -0.9144],
        [ 0.8726, -0.6472,  0.5984,  ...,  0.2754, -1.4429, -0.9144],
        [ 0.8726, -0.6472,  0.5984,  ...,  0.2754, -1.4429, -0.9144],
        ...,
        [-1.6903,  0.1961,  1.5182,  ..., -0.9661, -0.2274, -0.2252],
        [-1.6903,  0.1961,  1.5182,  ..., -0.9661, -0.2274, -0.2252],
        [-1.6903,  0.1961,  1.5182,  ..., -0.9661, -0.2274, -0.2252]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 58, 48, 27, 14,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8726, -0.6471,  0.5985,  ...,  0.2753, -1.4430, -0.9146],
        [ 0.8726, -0.6471,  0.5985,  ...,  0.2753, -1.4430, -0.9146],
        [ 1.0378,  0.6467,  0.3322,  ...,  1.8724,  0.1214,  1.3473],
        ...,
        [-1.6905,  0.1957,  1.5184,  ..., -0.9663, -0.2274, -0.2248],
        [-1.6905,  0.1957,  1.5184,  ..., -0.9663, -0.2274, -0.2248],
        [-1

PackedSequence(data=tensor([[ 0.8723, -0.6475,  0.5990,  ...,  0.2763, -1.4419, -0.9158],
        [ 0.8723, -0.6475,  0.5990,  ...,  0.2763, -1.4419, -0.9158],
        [ 0.8248, -1.1517, -1.8508,  ...,  0.1662, -0.2301, -1.2787],
        ...,
        [-1.6907,  0.1935,  1.5207,  ..., -0.9648, -0.2303, -0.2265],
        [-1.6907,  0.1935,  1.5207,  ..., -0.9648, -0.2303, -0.2265],
        [-1.6907,  0.1935,  1.5207,  ..., -0.9648, -0.2303, -0.2265]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 55, 42, 31, 16,  9]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2421,  0.1096, -0.1380,  ..., -0.4755, -0.6455,  1.6024],
        [ 1.0388,  0.6482,  0.3397,  ...,  1.8682,  0.1163,  1.3401],
        [ 0.8724, -0.6476,  0.5989,  ...,  0.2766, -1.4417, -0.9159],
        ...,
        [-1.6907,  0.1934,  1.5207,  ..., -0.9648, -0.2304, -0.2268],
        [-1.6907,  0.1934,  1.5207,  ..., -0.9648, -0.2304, -0.2268],
        [-1

PackedSequence(data=tensor([[ 0.8745, -0.6496,  0.5963,  ...,  0.2795, -1.4404, -0.9168],
        [ 0.5048,  1.2664, -1.1398,  ...,  1.1947,  2.6475,  0.7113],
        [-0.2420,  0.1083, -0.1380,  ..., -0.4753, -0.6486,  1.6035],
        ...,
        [-1.6901,  0.1939,  1.5184,  ..., -0.9643, -0.2305, -0.2280],
        [-1.6901,  0.1939,  1.5184,  ..., -0.9643, -0.2305, -0.2280],
        [-1.6901,  0.1939,  1.5184,  ..., -0.9643, -0.2305, -0.2280]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 61, 56, 35, 24, 15,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8747, -0.6497,  0.5960,  ...,  0.2796, -1.4404, -0.9170],
        [-0.2419,  0.1083, -0.1381,  ..., -0.4753, -0.6488,  1.6035],
        [ 0.8241, -1.1520, -1.8489,  ...,  0.1662, -0.2297, -1.2789],
        ...,
        [-1.6902,  0.1938,  1.5183,  ..., -0.9643, -0.2302, -0.2277],
        [-1.6902,  0.1938,  1.5183,  ..., -0.9643, -0.2302, -0.2277],
        [-1

PackedSequence(data=tensor([[ 0.8241, -1.1534, -1.8490,  ...,  0.1668, -0.2304, -1.2800],
        [ 0.8757, -0.6505,  0.5955,  ...,  0.2830, -1.4381, -0.9155],
        [ 1.0378,  0.6499,  0.3398,  ...,  1.8680,  0.1169,  1.3414],
        ...,
        [-1.6876,  0.1944,  1.5168,  ..., -0.9662, -0.2288, -0.2282],
        [-1.6876,  0.1944,  1.5168,  ..., -0.9662, -0.2288, -0.2282],
        [-1.6876,  0.1944,  1.5168,  ..., -0.9662, -0.2288, -0.2282]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 61, 53, 41, 17,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8241, -1.1535, -1.8491,  ...,  0.1669, -0.2304, -1.2800],
        [ 0.3958,  0.0623, -1.4407,  ...,  0.3268, -1.5094,  0.5249],
        [ 0.3958,  0.0623, -1.4407,  ...,  0.3268, -1.5094,  0.5249],
        ...,
        [-1.6876,  0.1945,  1.5166,  ..., -0.9663, -0.2291, -0.2285],
        [-1.6876,  0.1945,  1.5166,  ..., -0.9663, -0.2291, -0.2285],
        [-1

PackedSequence(data=tensor([[ 0.8248, -1.1563, -1.8505,  ...,  0.1685, -0.2287, -1.2797],
        [ 0.3950,  0.0601, -1.4403,  ...,  0.3297, -1.5081,  0.5271],
        [-0.2266,  1.7819, -0.9789,  ...,  0.3446, -2.1710,  1.5607],
        ...,
        [-1.6876,  0.1956,  1.5177,  ..., -0.9641, -0.2293, -0.2296],
        [-1.6876,  0.1956,  1.5177,  ..., -0.9641, -0.2293, -0.2296],
        [-1.6876,  0.1956,  1.5177,  ..., -0.9641, -0.2293, -0.2296]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 52, 40, 30, 14,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8250, -1.1566, -1.8505,  ...,  0.1687, -0.2288, -1.2797],
        [ 0.8777, -0.6488,  0.5951,  ...,  0.2826, -1.4379, -0.9133],
        [ 0.5084,  1.2643, -1.1408,  ...,  1.1993,  2.6504,  0.7074],
        ...,
        [-1.6881,  0.1960,  1.5175,  ..., -0.9635, -0.2291, -0.2295],
        [-1.6881,  0.1960,  1.5175,  ..., -0.9635, -0.2291, -0.2295],
        [-1

PackedSequence(data=tensor([[ 0.8777, -0.6470,  0.5943,  ...,  0.2813, -1.4377, -0.9117],
        [ 0.8777, -0.6470,  0.5943,  ...,  0.2813, -1.4377, -0.9117],
        [ 0.8269, -1.1553, -1.8491,  ...,  0.1681, -0.2288, -1.2769],
        ...,
        [-1.6902,  0.1947,  1.5170,  ..., -0.9624, -0.2292, -0.2282],
        [-1.6902,  0.1947,  1.5170,  ..., -0.9624, -0.2292, -0.2282],
        [-1.6902,  0.1947,  1.5170,  ..., -0.9624, -0.2292, -0.2282]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 52, 40, 27, 13,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8269, -1.1552, -1.8491,  ...,  0.1679, -0.2289, -1.2768],
        [ 0.8776, -0.6469,  0.5942,  ...,  0.2811, -1.4377, -0.9117],
        [ 1.0385,  0.6480,  0.3329,  ...,  1.8741,  0.1160,  1.3446],
        ...,
        [-1.6897,  0.1945,  1.5171,  ..., -0.9629, -0.2293, -0.2281],
        [-1.6897,  0.1945,  1.5171,  ..., -0.9629, -0.2293, -0.2281],
        [-1

PackedSequence(data=tensor([[ 0.5060,  1.2646, -1.1378,  ...,  1.1997,  2.6478,  0.7080],
        [ 0.3960,  0.0598, -1.4416,  ...,  0.3320, -1.5052,  0.5272],
        [ 0.0173, -0.7742,  0.1946,  ...,  1.1556, -0.6078,  0.9334],
        ...,
        [-1.6872,  0.1970,  1.5158,  ..., -0.9628, -0.2292, -0.2291],
        [-1.6872,  0.1970,  1.5158,  ..., -0.9628, -0.2292, -0.2291],
        [-1.6872,  0.1970,  1.5158,  ..., -0.9628, -0.2292, -0.2291]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 60, 43, 30, 11,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8772, -0.6473,  0.5940,  ...,  0.2800, -1.4393, -0.9116],
        [-0.2295,  1.7845, -0.9746,  ...,  0.3405, -2.1710,  1.5644],
        [-0.2295,  1.7845, -0.9746,  ...,  0.3405, -2.1710,  1.5644],
        ...,
        [-1.6871,  0.1966,  1.5161,  ..., -0.9629, -0.2289, -0.2287],
        [-1.6871,  0.1966,  1.5161,  ..., -0.9629, -0.2289, -0.2287],
        [-1

PackedSequence(data=tensor([[-0.2294,  1.7855, -0.9719,  ...,  0.3406, -2.1674,  1.5655],
        [ 0.8775, -0.6478,  0.5937,  ...,  0.2778, -1.4403, -0.9130],
        [-0.2294,  1.7855, -0.9719,  ...,  0.3406, -2.1674,  1.5655],
        ...,
        [-1.6884,  0.1960,  1.5159,  ..., -0.9612, -0.2276, -0.2265],
        [-1.6884,  0.1960,  1.5159,  ..., -0.9612, -0.2276, -0.2265],
        [-1.6884,  0.1960,  1.5159,  ..., -0.9612, -0.2276, -0.2265]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 52, 39, 31, 11,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.3942,  0.0591, -1.4403,  ...,  0.3326, -1.5055,  0.5261],
        [ 0.8776, -0.6478,  0.5937,  ...,  0.2777, -1.4403, -0.9131],
        [ 1.0457, -2.0821,  0.1252,  ...,  0.6801,  1.4930,  1.3747],
        ...,
        [-1.6884,  0.1959,  1.5159,  ..., -0.9611, -0.2275, -0.2263],
        [-1.6884,  0.1959,  1.5159,  ..., -0.9611, -0.2275, -0.2263],
        [-1

PackedSequence(data=tensor([[ 0.3942,  0.0596, -1.4404,  ...,  0.3320, -1.5064,  0.5247],
        [ 0.8280, -1.1531, -1.8500,  ...,  0.1646, -0.2294, -1.2745],
        [-0.2393,  0.1109, -0.1433,  ..., -0.4786, -0.6413,  1.5990],
        ...,
        [-1.6886,  0.1941,  1.5170,  ..., -0.9614, -0.2274, -0.2244],
        [-1.6886,  0.1941,  1.5170,  ..., -0.9614, -0.2274, -0.2244],
        [-1.6886,  0.1941,  1.5170,  ..., -0.9614, -0.2274, -0.2244]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 59, 45, 34, 17,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.3944,  0.0598, -1.4407,  ...,  0.3320, -1.5064,  0.5245],
        [ 1.0429, -2.0804,  0.1233,  ...,  0.6784,  1.4948,  1.3773],
        [ 0.8780, -0.6486,  0.5920,  ...,  0.2780, -1.4386, -0.9135],
        ...,
        [-1.6887,  0.1943,  1.5170,  ..., -0.9613, -0.2273, -0.2245],
        [-1.6887,  0.1943,  1.5170,  ..., -0.9613, -0.2273, -0.2245],
        [-1

PackedSequence(data=tensor([[ 0.8801, -0.6465,  0.5898,  ...,  0.2782, -1.4365, -0.9116],
        [ 0.8801, -0.6465,  0.5898,  ...,  0.2782, -1.4365, -0.9116],
        [ 0.8294, -1.1531, -1.8503,  ...,  0.1635, -0.2292, -1.2714],
        ...,
        [-1.6878,  0.1928,  1.5158,  ..., -0.9622, -0.2283, -0.2251],
        [-1.6878,  0.1928,  1.5158,  ..., -0.9622, -0.2283, -0.2251],
        [-1.6878,  0.1928,  1.5158,  ..., -0.9622, -0.2283, -0.2251]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 54, 43, 29, 16,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.3959,  0.0607, -1.4429,  ...,  0.3323, -1.5067,  0.5225],
        [ 0.8804, -0.6463,  0.5897,  ...,  0.2782, -1.4364, -0.9115],
        [ 0.8293, -1.1531, -1.8502,  ...,  0.1634, -0.2290, -1.2709],
        ...,
        [-1.6877,  0.1925,  1.5158,  ..., -0.9625, -0.2283, -0.2248],
        [-1.6877,  0.1925,  1.5158,  ..., -0.9625, -0.2283, -0.2248],
        [-1

PackedSequence(data=tensor([[-0.2303,  1.7866, -0.9747,  ...,  0.3376, -2.1630,  1.5611],
        [ 0.8830, -0.6470,  0.5890,  ...,  0.2779, -1.4348, -0.9136],
        [ 0.8830, -0.6470,  0.5890,  ...,  0.2779, -1.4348, -0.9136],
        ...,
        [-1.6878,  0.1940,  1.5136,  ..., -0.9604, -0.2287, -0.2246],
        [-1.6878,  0.1940,  1.5136,  ..., -0.9604, -0.2287, -0.2246],
        [-1.6878,  0.1940,  1.5136,  ..., -0.9604, -0.2287, -0.2246]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 50, 40, 29, 14,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.5082,  1.2641, -1.1280,  ...,  1.2011,  2.6473,  0.7060],
        [ 0.5082,  1.2641, -1.1280,  ...,  1.2011,  2.6473,  0.7060],
        [ 1.0340,  0.6424,  0.3301,  ...,  1.8724,  0.1153,  1.3398],
        ...,
        [-1.6877,  0.1938,  1.5136,  ..., -0.9607, -0.2286, -0.2245],
        [-1.6877,  0.1938,  1.5136,  ..., -0.9607, -0.2286, -0.2245],
        [-1

PackedSequence(data=tensor([[ 0.8874, -0.6467,  0.5884,  ...,  0.2797, -1.4320, -0.9138],
        [ 0.8874, -0.6467,  0.5884,  ...,  0.2797, -1.4320, -0.9138],
        [ 0.8874, -0.6467,  0.5884,  ...,  0.2797, -1.4320, -0.9138],
        ...,
        [-1.6890,  0.1941,  1.5125,  ..., -0.9616, -0.2308, -0.2267],
        [-1.6890,  0.1941,  1.5125,  ..., -0.9616, -0.2308, -0.2267],
        [-1.6890,  0.1941,  1.5125,  ..., -0.9616, -0.2308, -0.2267]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 52, 40, 31, 16,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.5063,  1.2638, -1.1291,  ...,  1.2005,  2.6455,  0.7086],
        [ 0.3943,  0.0603, -1.4423,  ...,  0.3325, -1.5066,  0.5218],
        [ 0.8876, -0.6469,  0.5885,  ...,  0.2798, -1.4318, -0.9139],
        ...,
        [-1.6893,  0.1941,  1.5126,  ..., -0.9615, -0.2311, -0.2268],
        [-1.6893,  0.1941,  1.5126,  ..., -0.9615, -0.2311, -0.2268],
        [-1

PackedSequence(data=tensor([[ 1.0403,  0.6419,  0.3270,  ...,  1.8733,  0.1152,  1.3335],
        [ 1.0403,  0.6419,  0.3270,  ...,  1.8733,  0.1152,  1.3335],
        [ 0.8881, -0.6481,  0.5892,  ...,  0.2805, -1.4322, -0.9131],
        ...,
        [-1.6913,  0.1930,  1.5141,  ..., -0.9610, -0.2285, -0.2241],
        [-1.6913,  0.1930,  1.5141,  ..., -0.9610, -0.2285, -0.2241],
        [-1.6913,  0.1930,  1.5141,  ..., -0.9610, -0.2285, -0.2241]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 54, 41, 27, 14,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8880, -0.6482,  0.5893,  ...,  0.2803, -1.4324, -0.9131],
        [ 0.8880, -0.6482,  0.5893,  ...,  0.2803, -1.4324, -0.9131],
        [ 0.8880, -0.6482,  0.5893,  ...,  0.2803, -1.4324, -0.9131],
        ...,
        [-1.6912,  0.1930,  1.5139,  ..., -0.9611, -0.2286, -0.2241],
        [-1.6912,  0.1930,  1.5139,  ..., -0.9611, -0.2286, -0.2241],
        [-1

PackedSequence(data=tensor([[ 1.0343, -2.0786,  0.1235,  ...,  0.6750,  1.4970,  1.3771],
        [ 0.3942,  0.0591, -1.4430,  ...,  0.3329, -1.5059,  0.5223],
        [ 1.0416,  0.6422,  0.3263,  ...,  1.8729,  0.1151,  1.3335],
        ...,
        [-1.6908,  0.1911,  1.5149,  ..., -0.9615, -0.2287, -0.2215],
        [-1.6908,  0.1911,  1.5149,  ..., -0.9615, -0.2287, -0.2215],
        [-1.6908,  0.1911,  1.5149,  ..., -0.9615, -0.2287, -0.2215]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 54, 40, 25, 14,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0419,  0.6421,  0.3261,  ...,  1.8730,  0.1152,  1.3335],
        [ 0.8251, -1.1559, -1.8505,  ...,  0.1667, -0.2273, -1.2692],
        [ 0.8853, -0.6486,  0.5924,  ...,  0.2775, -1.4344, -0.9144],
        ...,
        [-1.6909,  0.1911,  1.5147,  ..., -0.9615, -0.2288, -0.2217],
        [-1.6909,  0.1911,  1.5147,  ..., -0.9615, -0.2288, -0.2217],
        [-1

PackedSequence(data=tensor([[ 0.8848, -0.6508,  0.5935,  ...,  0.2788, -1.4333, -0.9151],
        [ 0.8239, -1.1552, -1.8504,  ...,  0.1672, -0.2258, -1.2691],
        [ 0.8848, -0.6508,  0.5935,  ...,  0.2788, -1.4333, -0.9151],
        ...,
        [-1.6887,  0.1940,  1.5116,  ..., -0.9614, -0.2309, -0.2253],
        [-1.6887,  0.1940,  1.5116,  ..., -0.9614, -0.2309, -0.2253],
        [-1.6887,  0.1940,  1.5116,  ..., -0.9614, -0.2309, -0.2253]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 56, 40, 31, 10,  2]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8848, -0.6509,  0.5935,  ...,  0.2790, -1.4332, -0.9153],
        [ 1.0440,  0.6414,  0.3241,  ...,  1.8742,  0.1164,  1.3319],
        [ 0.8239, -1.1551, -1.8505,  ...,  0.1673, -0.2257, -1.2691],
        ...,
        [-1.6883,  0.1939,  1.5116,  ..., -0.9615, -0.2310, -0.2253],
        [-1.6883,  0.1939,  1.5116,  ..., -0.9615, -0.2310, -0.2253],
        [-1

PackedSequence(data=tensor([[ 0.3924,  0.0565, -1.4440,  ...,  0.3347, -1.5044,  0.5183],
        [ 0.8852, -0.6521,  0.5931,  ...,  0.2801, -1.4329, -0.9155],
        [ 0.8852, -0.6521,  0.5931,  ...,  0.2801, -1.4329, -0.9155],
        ...,
        [-1.6876,  0.1925,  1.5136,  ..., -0.9612, -0.2291, -0.2230],
        [-1.6876,  0.1925,  1.5136,  ..., -0.9612, -0.2291, -0.2230],
        [-1.6876,  0.1925,  1.5136,  ..., -0.9612, -0.2291, -0.2230]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 54, 42, 24, 14, 10]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8853, -0.6522,  0.5930,  ...,  0.2802, -1.4329, -0.9154],
        [ 0.8213, -1.1557, -1.8496,  ...,  0.1679, -0.2282, -1.2672],
        [ 0.8213, -1.1557, -1.8496,  ...,  0.1679, -0.2282, -1.2672],
        ...,
        [-1.6878,  0.1924,  1.5133,  ..., -0.9609, -0.2289, -0.2229],
        [-1.6878,  0.1924,  1.5133,  ..., -0.9609, -0.2289, -0.2229],
        [-1

PackedSequence(data=tensor([[ 0.8204, -1.1554, -1.8470,  ...,  0.1666, -0.2272, -1.2635],
        [ 0.8848, -0.6518,  0.5922,  ...,  0.2802, -1.4333, -0.9158],
        [ 0.3917,  0.0586, -1.4447,  ...,  0.3365, -1.5050,  0.5176],
        ...,
        [-1.6892,  0.1922,  1.5148,  ..., -0.9597, -0.2276, -0.2226],
        [-1.6892,  0.1922,  1.5148,  ..., -0.9597, -0.2276, -0.2226],
        [-1.6892,  0.1922,  1.5148,  ..., -0.9597, -0.2276, -0.2226]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 53, 36, 26, 13,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8847, -0.6517,  0.5923,  ...,  0.2801, -1.4334, -0.9159],
        [ 0.8847, -0.6517,  0.5923,  ...,  0.2801, -1.4334, -0.9159],
        [ 0.4254, -0.3180,  1.5685,  ...,  1.4139,  0.0062,  0.5333],
        ...,
        [-1.6892,  0.1926,  1.5148,  ..., -0.9595, -0.2277, -0.2228],
        [-1.6892,  0.1926,  1.5148,  ..., -0.9595, -0.2277, -0.2228],
        [-1

PackedSequence(data=tensor([[ 0.8836, -0.6524,  0.5916,  ...,  0.2808, -1.4355, -0.9156],
        [ 0.3916,  0.0603, -1.4449,  ...,  0.3362, -1.5059,  0.5185],
        [ 0.8225, -1.1554, -1.8453,  ...,  0.1660, -0.2265, -1.2629],
        ...,
        [-1.6895,  0.1939,  1.5132,  ..., -0.9633, -0.2269, -0.2226],
        [-1.6895,  0.1939,  1.5132,  ..., -0.9633, -0.2269, -0.2226],
        [-1.6895,  0.1939,  1.5132,  ..., -0.9633, -0.2269, -0.2226]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 57, 40, 26, 15,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0460,  0.6402,  0.3196,  ...,  1.8746,  0.1184,  1.3358],
        [ 0.8835, -0.6525,  0.5916,  ...,  0.2809, -1.4356, -0.9155],
        [ 0.5129,  1.2640, -1.1399,  ...,  1.2049,  2.6415,  0.7050],
        ...,
        [-1.6895,  0.1938,  1.5133,  ..., -0.9634, -0.2270, -0.2225],
        [-1.6895,  0.1938,  1.5133,  ..., -0.9634, -0.2270, -0.2225],
        [-1

PackedSequence(data=tensor([[ 0.8842, -0.6548,  0.5885,  ...,  0.2831, -1.4353, -0.9136],
        [ 0.8842, -0.6548,  0.5885,  ...,  0.2831, -1.4353, -0.9136],
        [ 0.8842, -0.6548,  0.5885,  ...,  0.2831, -1.4353, -0.9136],
        ...,
        [-1.6882,  0.1918,  1.5166,  ..., -0.9636, -0.2264, -0.2209],
        [-1.6882,  0.1918,  1.5166,  ..., -0.9636, -0.2264, -0.2209],
        [-1.6882,  0.1918,  1.5166,  ..., -0.9636, -0.2264, -0.2209]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 61, 47, 36, 23, 11,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.3881,  0.0681, -1.4425,  ...,  0.3265, -1.5130,  0.5242],
        [-0.2421,  0.1093, -0.1376,  ..., -0.4768, -0.6466,  1.6008],
        [ 0.8842, -0.6548,  0.5884,  ...,  0.2830, -1.4354, -0.9136],
        ...,
        [-1.6876,  0.1920,  1.5164,  ..., -0.9638, -0.2262, -0.2210],
        [-1.6876,  0.1920,  1.5164,  ..., -0.9638, -0.2262, -0.2210],
        [-1

PackedSequence(data=tensor([[ 0.8858, -0.6532,  0.5889,  ...,  0.2838, -1.4366, -0.9140],
        [-0.2414,  0.1087, -0.1381,  ..., -0.4780, -0.6474,  1.5976],
        [ 1.0447,  0.6416,  0.3212,  ...,  1.8725,  0.1164,  1.3369],
        ...,
        [-1.6864,  0.1922,  1.5146,  ..., -0.9638, -0.2272, -0.2228],
        [-1.6864,  0.1922,  1.5146,  ..., -0.9638, -0.2272, -0.2228],
        [-1.6864,  0.1922,  1.5146,  ..., -0.9638, -0.2272, -0.2228]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 54, 44, 29, 18,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8859, -0.6531,  0.5889,  ...,  0.2838, -1.4367, -0.9140],
        [ 1.0445,  0.6417,  0.3213,  ...,  1.8724,  0.1164,  1.3369],
        [ 0.3850,  0.0690, -1.4424,  ...,  0.3253, -1.5157,  0.5251],
        ...,
        [-1.6865,  0.1922,  1.5146,  ..., -0.9637, -0.2272, -0.2228],
        [-1.6865,  0.1922,  1.5146,  ..., -0.9637, -0.2272, -0.2228],
        [-1

PackedSequence(data=tensor([[ 0.8864, -0.6520,  0.5892,  ...,  0.2857, -1.4375, -0.9126],
        [ 1.0437,  0.6424,  0.3226,  ...,  1.8727,  0.1173,  1.3368],
        [ 0.8864, -0.6520,  0.5892,  ...,  0.2857, -1.4375, -0.9126],
        ...,
        [-1.6890,  0.1949,  1.5130,  ..., -0.9631, -0.2259, -0.2233],
        [-1.6890,  0.1949,  1.5130,  ..., -0.9631, -0.2259, -0.2233],
        [-1.6890,  0.1949,  1.5130,  ..., -0.9631, -0.2259, -0.2233]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 53, 39, 25, 13,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8864, -0.6519,  0.5893,  ...,  0.2858, -1.4375, -0.9124],
        [ 0.8864, -0.6519,  0.5893,  ...,  0.2858, -1.4375, -0.9124],
        [ 0.3842,  0.0702, -1.4424,  ...,  0.3240, -1.5157,  0.5244],
        ...,
        [-1.6891,  0.1948,  1.5131,  ..., -0.9634, -0.2258, -0.2232],
        [-1.6891,  0.1948,  1.5131,  ..., -0.9634, -0.2258, -0.2232],
        [-1

PackedSequence(data=tensor([[ 1.0452,  0.6413,  0.3211,  ...,  1.8732,  0.1196,  1.3387],
        [ 1.0452,  0.6413,  0.3211,  ...,  1.8732,  0.1196,  1.3387],
        [ 1.0405, -2.0790,  0.1264,  ...,  0.6754,  1.4984,  1.3696],
        ...,
        [-1.6893,  0.1940,  1.5128,  ..., -0.9629, -0.2252, -0.2212],
        [-1.6893,  0.1940,  1.5128,  ..., -0.9629, -0.2252, -0.2212],
        [-1.6893,  0.1940,  1.5128,  ..., -0.9629, -0.2252, -0.2212]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 51, 36, 26, 15,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8861, -0.6508,  0.5893,  ...,  0.2867, -1.4376, -0.9114],
        [ 0.8861, -0.6508,  0.5893,  ...,  0.2867, -1.4376, -0.9114],
        [ 0.8861, -0.6508,  0.5893,  ...,  0.2867, -1.4376, -0.9114],
        ...,
        [-1.6893,  0.1940,  1.5128,  ..., -0.9628, -0.2251, -0.2211],
        [-1.6893,  0.1940,  1.5128,  ..., -0.9628, -0.2251, -0.2211],
        [-1

PackedSequence(data=tensor([[ 0.8883, -0.6503,  0.5868,  ...,  0.2861, -1.4359, -0.9118],
        [ 0.8883, -0.6503,  0.5868,  ...,  0.2861, -1.4359, -0.9118],
        [ 0.8883, -0.6503,  0.5868,  ...,  0.2861, -1.4359, -0.9118],
        ...,
        [-1.6897,  0.1937,  1.5129,  ..., -0.9624, -0.2234, -0.2201],
        [-1.6897,  0.1937,  1.5129,  ..., -0.9624, -0.2234, -0.2201],
        [-1.6897,  0.1937,  1.5129,  ..., -0.9624, -0.2234, -0.2201]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 55, 38, 23,  9,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.3890,  0.0685, -1.4447,  ...,  0.3251, -1.5148,  0.5234],
        [ 0.8885, -0.6503,  0.5866,  ...,  0.2860, -1.4358, -0.9119],
        [ 0.8234, -1.1568, -1.8489,  ...,  0.1640, -0.2326, -1.2672],
        ...,
        [-1.6896,  0.1936,  1.5127,  ..., -0.9622, -0.2232, -0.2200],
        [-1.6896,  0.1936,  1.5127,  ..., -0.9622, -0.2232, -0.2200],
        [-1

PackedSequence(data=tensor([[ 0.8881, -0.6503,  0.5897,  ...,  0.2834, -1.4359, -0.9123],
        [ 0.5090,  1.2629, -1.1381,  ...,  1.2032,  2.6421,  0.7040],
        [ 0.8881, -0.6503,  0.5897,  ...,  0.2834, -1.4359, -0.9123],
        ...,
        [-1.6865,  0.1920,  1.5150,  ..., -0.9631, -0.2223, -0.2193],
        [-1.6865,  0.1920,  1.5150,  ..., -0.9631, -0.2223, -0.2193],
        [-1.6865,  0.1920,  1.5150,  ..., -0.9631, -0.2223, -0.2193]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 53, 40, 20,  9,  1]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0456,  0.6412,  0.3191,  ...,  1.8734,  0.1206,  1.3405],
        [ 1.0456,  0.6412,  0.3191,  ...,  1.8734,  0.1206,  1.3405],
        [ 0.8880, -0.6504,  0.5898,  ...,  0.2833, -1.4359, -0.9123],
        ...,
        [-1.6865,  0.1921,  1.5149,  ..., -0.9632, -0.2225, -0.2194],
        [-1.6865,  0.1921,  1.5149,  ..., -0.9632, -0.2225, -0.2194],
        [-1

PackedSequence(data=tensor([[ 0.8870, -0.6505,  0.5913,  ...,  0.2832, -1.4355, -0.9126],
        [ 0.3890,  0.0690, -1.4475,  ...,  0.3259, -1.5141,  0.5226],
        [ 0.8870, -0.6505,  0.5913,  ...,  0.2832, -1.4355, -0.9126],
        ...,
        [-1.6878,  0.1945,  1.5131,  ..., -0.9638, -0.2233, -0.2213],
        [-1.6878,  0.1945,  1.5131,  ..., -0.9638, -0.2233, -0.2213],
        [-1.6878,  0.1945,  1.5131,  ..., -0.9638, -0.2233, -0.2213]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 60, 45, 30, 17,  5,  2]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8870, -0.6504,  0.5914,  ...,  0.2832, -1.4356, -0.9127],
        [-0.6326,  0.6465, -0.4743,  ...,  0.0573,  0.0389, -0.2913],
        [ 0.8870, -0.6504,  0.5914,  ...,  0.2832, -1.4356, -0.9127],
        ...,
        [-1.6879,  0.1944,  1.5131,  ..., -0.9637, -0.2231, -0.2211],
        [-1.6879,  0.1944,  1.5131,  ..., -0.9637, -0.2231, -0.2211],
        [-1

PackedSequence(data=tensor([[ 0.3885,  0.0676, -1.4484,  ...,  0.3268, -1.5143,  0.5232],
        [ 0.8863, -0.6491,  0.5911,  ...,  0.2825, -1.4356, -0.9129],
        [ 0.8863, -0.6491,  0.5911,  ...,  0.2825, -1.4356, -0.9129],
        ...,
        [-1.6888,  0.1938,  1.5148,  ..., -0.9632, -0.2215, -0.2194],
        [-1.6888,  0.1938,  1.5148,  ..., -0.9632, -0.2215, -0.2194],
        [-1.6888,  0.1938,  1.5148,  ..., -0.9632, -0.2215, -0.2194]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 57, 41, 28, 15,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8863, -0.6491,  0.5910,  ...,  0.2824, -1.4355, -0.9129],
        [ 0.8863, -0.6491,  0.5910,  ...,  0.2824, -1.4355, -0.9129],
        [-0.2421,  0.1113, -0.1322,  ..., -0.4791, -0.6440,  1.5944],
        ...,
        [-1.6887,  0.1938,  1.5149,  ..., -0.9636, -0.2214, -0.2194],
        [-1.6887,  0.1938,  1.5149,  ..., -0.9636, -0.2214, -0.2194],
        [-1

PackedSequence(data=tensor([[ 0.8267, -1.1557, -1.8513,  ...,  0.1640, -0.2294, -1.2602],
        [-0.2413,  0.1123, -0.1321,  ..., -0.4792, -0.6456,  1.5965],
        [ 0.5103,  1.2616, -1.1376,  ...,  1.2039,  2.6419,  0.7046],
        ...,
        [-1.6889,  0.1938,  1.5144,  ..., -0.9625, -0.2261, -0.2224],
        [-1.6889,  0.1938,  1.5144,  ..., -0.9625, -0.2261, -0.2224],
        [-1.6889,  0.1938,  1.5144,  ..., -0.9625, -0.2261, -0.2224]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 55, 35, 25, 20,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8856, -0.6485,  0.5918,  ...,  0.2830, -1.4363, -0.9123],
        [ 0.8856, -0.6485,  0.5918,  ...,  0.2830, -1.4363, -0.9123],
        [ 0.8856, -0.6485,  0.5918,  ...,  0.2830, -1.4363, -0.9123],
        ...,
        [-1.6889,  0.1936,  1.5144,  ..., -0.9622, -0.2264, -0.2224],
        [-1.6889,  0.1936,  1.5144,  ..., -0.9622, -0.2264, -0.2224],
        [-1

PackedSequence(data=tensor([[-0.2231,  1.7862, -0.9767,  ...,  0.3331, -2.1782,  1.5555],
        [ 0.8844, -0.6493,  0.5919,  ...,  0.2836, -1.4368, -0.9120],
        [ 1.0382, -2.0877,  0.1138,  ...,  0.6802,  1.5136,  1.3804],
        ...,
        [-1.6865,  0.1928,  1.5145,  ..., -0.9604, -0.2260, -0.2231],
        [-1.6865,  0.1928,  1.5145,  ..., -0.9604, -0.2260, -0.2231],
        [-1.6865,  0.1928,  1.5145,  ..., -0.9604, -0.2260, -0.2231]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 53, 38, 22,  9,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8274, -1.1557, -1.8514,  ...,  0.1659, -0.2299, -1.2613],
        [ 0.8843, -0.6493,  0.5919,  ...,  0.2835, -1.4368, -0.9120],
        [ 0.8843, -0.6493,  0.5919,  ...,  0.2835, -1.4368, -0.9120],
        ...,
        [-1.6862,  0.1926,  1.5147,  ..., -0.9606, -0.2256, -0.2229],
        [-1.6862,  0.1926,  1.5147,  ..., -0.9606, -0.2256, -0.2229],
        [-1

PackedSequence(data=tensor([[ 1.0436,  0.6420,  0.3184,  ...,  1.8752,  0.1234,  1.3459],
        [ 0.8271, -1.1558, -1.8517,  ...,  0.1666, -0.2312, -1.2616],
        [ 0.8840, -0.6496,  0.5913,  ...,  0.2824, -1.4372, -0.9128],
        ...,
        [-1.6866,  0.1931,  1.5144,  ..., -0.9606, -0.2260, -0.2224],
        [-1.6866,  0.1931,  1.5144,  ..., -0.9606, -0.2260, -0.2224],
        [-1.6866,  0.1931,  1.5144,  ..., -0.9606, -0.2260, -0.2224]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 55, 45, 34, 17,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2225,  1.7850, -0.9796,  ...,  0.3341, -2.1785,  1.5552],
        [ 1.0376, -2.0890,  0.1111,  ...,  0.6821,  1.5160,  1.3833],
        [-0.2225,  1.7850, -0.9796,  ...,  0.3341, -2.1785,  1.5552],
        ...,
        [-1.6867,  0.1930,  1.5143,  ..., -0.9606, -0.2262, -0.2224],
        [-1.6867,  0.1930,  1.5143,  ..., -0.9606, -0.2262, -0.2224],
        [-1

PackedSequence(data=tensor([[ 0.8849, -0.6486,  0.5902,  ...,  0.2816, -1.4369, -0.9119],
        [ 0.8291, -1.1561, -1.8498,  ...,  0.1672, -0.2312, -1.2615],
        [ 0.8849, -0.6486,  0.5902,  ...,  0.2816, -1.4369, -0.9119],
        ...,
        [-1.6860,  0.1911,  1.5138,  ..., -0.9624, -0.2279, -0.2226],
        [-1.6860,  0.1911,  1.5138,  ..., -0.9624, -0.2279, -0.2226],
        [-1.6860,  0.1911,  1.5138,  ..., -0.9624, -0.2279, -0.2226]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 51, 40, 30, 11,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.4223, -0.3062,  1.5800,  ...,  1.4055,  0.0062,  0.5408],
        [ 0.8292, -1.1561, -1.8498,  ...,  0.1673, -0.2314, -1.2615],
        [-0.2213,  1.7857, -0.9791,  ...,  0.3340, -2.1776,  1.5537],
        ...,
        [-1.6856,  0.1912,  1.5140,  ..., -0.9626, -0.2277, -0.2226],
        [-1.6856,  0.1912,  1.5140,  ..., -0.9626, -0.2277, -0.2226],
        [-1

PackedSequence(data=tensor([[ 0.8856, -0.6480,  0.5897,  ...,  0.2812, -1.4370, -0.9107],
        [ 0.8856, -0.6480,  0.5897,  ...,  0.2812, -1.4370, -0.9107],
        [ 0.3864,  0.0673, -1.4446,  ...,  0.3312, -1.5107,  0.5276],
        ...,
        [-1.6851,  0.1920,  1.5151,  ..., -0.9632, -0.2255, -0.2217],
        [-1.6851,  0.1920,  1.5151,  ..., -0.9632, -0.2255, -0.2217],
        [-1.6851,  0.1920,  1.5151,  ..., -0.9632, -0.2255, -0.2217]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 52, 40, 28, 12,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.5076,  1.2661, -1.1348,  ...,  1.2100,  2.6378,  0.7064],
        [ 0.8856, -0.6480,  0.5896,  ...,  0.2811, -1.4370, -0.9106],
        [ 0.8276, -1.1546, -1.8476,  ...,  0.1664, -0.2313, -1.2607],
        ...,
        [-1.6853,  0.1917,  1.5150,  ..., -0.9631, -0.2253, -0.2214],
        [-1.6853,  0.1917,  1.5150,  ..., -0.9631, -0.2253, -0.2214],
        [-1

PackedSequence(data=tensor([[ 0.8860, -0.6481,  0.5887,  ...,  0.2804, -1.4363, -0.9108],
        [ 1.0433,  0.6428,  0.3243,  ...,  1.8739,  0.1230,  1.3445],
        [ 0.8860, -0.6481,  0.5887,  ...,  0.2804, -1.4363, -0.9108],
        ...,
        [-1.6870,  0.1904,  1.5141,  ..., -0.9607, -0.2265, -0.2199],
        [-1.6870,  0.1904,  1.5141,  ..., -0.9607, -0.2265, -0.2199],
        [-1.6870,  0.1904,  1.5141,  ..., -0.9607, -0.2265, -0.2199]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 55, 43, 30, 20, 10]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.5075,  1.2661, -1.1356,  ...,  1.2083,  2.6369,  0.7052],
        [ 0.8861, -0.6482,  0.5886,  ...,  0.2804, -1.4363, -0.9109],
        [ 0.8259, -1.1549, -1.8489,  ...,  0.1676, -0.2328, -1.2593],
        ...,
        [-1.6869,  0.1906,  1.5141,  ..., -0.9607, -0.2264, -0.2202],
        [-1.6869,  0.1906,  1.5141,  ..., -0.9607, -0.2264, -0.2202],
        [-1

PackedSequence(data=tensor([[ 1.0408,  0.6446,  0.3283,  ...,  1.8720,  0.1204,  1.3461],
        [ 0.8876, -0.6497,  0.5882,  ...,  0.2819, -1.4343, -0.9118],
        [ 0.8876, -0.6497,  0.5882,  ...,  0.2819, -1.4343, -0.9118],
        ...,
        [-1.6896,  0.1920,  1.5115,  ..., -0.9598, -0.2248, -0.2217],
        [-1.6896,  0.1920,  1.5115,  ..., -0.9598, -0.2248, -0.2217],
        [-1.6896,  0.1920,  1.5115,  ..., -0.9598, -0.2248, -0.2217]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 51, 37, 23, 14,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8253, -1.1545, -1.8504,  ...,  0.1697, -0.2356, -1.2575],
        [-0.2363,  0.1103, -0.1363,  ..., -0.4772, -0.6405,  1.6015],
        [ 0.5077,  1.2665, -1.1363,  ...,  1.2077,  2.6350,  0.7056],
        ...,
        [-1.6896,  0.1919,  1.5113,  ..., -0.9599, -0.2247, -0.2216],
        [-1.6896,  0.1919,  1.5113,  ..., -0.9599, -0.2247, -0.2216],
        [-1

PackedSequence(data=tensor([[-0.2362,  0.1099, -0.1362,  ..., -0.4767, -0.6409,  1.6017],
        [ 1.0402,  0.6448,  0.3291,  ...,  1.8715,  0.1202,  1.3465],
        [ 0.8239, -1.1542, -1.8516,  ...,  0.1694, -0.2358, -1.2562],
        ...,
        [-1.6887,  0.1919,  1.5106,  ..., -0.9597, -0.2266, -0.2228],
        [-1.6887,  0.1919,  1.5106,  ..., -0.9597, -0.2266, -0.2228],
        [-1.6887,  0.1919,  1.5106,  ..., -0.9597, -0.2266, -0.2228]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 56, 42, 26, 11,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8873, -0.6504,  0.5901,  ...,  0.2825, -1.4319, -0.9116],
        [ 0.8237, -1.1544, -1.8517,  ...,  0.1695, -0.2358, -1.2561],
        [ 0.8873, -0.6504,  0.5901,  ...,  0.2825, -1.4319, -0.9116],
        ...,
        [-1.6888,  0.1921,  1.5105,  ..., -0.9595, -0.2267, -0.2230],
        [-1.6888,  0.1921,  1.5105,  ..., -0.9595, -0.2267, -0.2230],
        [-1

PackedSequence(data=tensor([[ 1.0358, -2.0915,  0.1131,  ...,  0.6820,  1.5102,  1.3781],
        [ 1.0406,  0.6436,  0.3277,  ...,  1.8727,  0.1214,  1.3454],
        [ 0.8874, -0.6511,  0.5899,  ...,  0.2832, -1.4311, -0.9121],
        ...,
        [-1.6879,  0.1922,  1.5116,  ..., -0.9606, -0.2254, -0.2215],
        [-1.6879,  0.1922,  1.5116,  ..., -0.9606, -0.2254, -0.2215],
        [-1.6879,  0.1922,  1.5116,  ..., -0.9606, -0.2254, -0.2215]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 52, 43, 28, 13,  9]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8875, -0.6511,  0.5899,  ...,  0.2833, -1.4311, -0.9121],
        [ 0.8222, -1.1549, -1.8530,  ...,  0.1695, -0.2357, -1.2542],
        [-0.2387,  0.1091, -0.1362,  ..., -0.4775, -0.6428,  1.6041],
        ...,
        [-1.6879,  0.1920,  1.5116,  ..., -0.9606, -0.2252, -0.2213],
        [-1.6879,  0.1920,  1.5116,  ..., -0.9606, -0.2252, -0.2213],
        [-1

PackedSequence(data=tensor([[ 0.8891, -0.6524,  0.5889,  ...,  0.2843, -1.4329, -0.9130],
        [ 0.5092,  1.2603, -1.1346,  ...,  1.2118,  2.6374,  0.7084],
        [ 1.0402,  0.6428,  0.3261,  ...,  1.8730,  0.1230,  1.3442],
        ...,
        [-1.6876,  0.1913,  1.5103,  ..., -0.9601, -0.2240, -0.2206],
        [-1.6876,  0.1913,  1.5103,  ..., -0.9601, -0.2240, -0.2206],
        [-1.6876,  0.1913,  1.5103,  ..., -0.9601, -0.2240, -0.2206]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 53, 40, 22,  9,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8226, -1.1548, -1.8524,  ...,  0.1698, -0.2355, -1.2548],
        [ 1.0402,  0.6428,  0.3259,  ...,  1.8731,  0.1232,  1.3442],
        [ 0.5093,  1.2601, -1.1346,  ...,  1.2120,  2.6375,  0.7085],
        ...,
        [-1.6875,  0.1915,  1.5103,  ..., -0.9600, -0.2238, -0.2206],
        [-1.6875,  0.1915,  1.5103,  ..., -0.9600, -0.2238, -0.2206],
        [-1

PackedSequence(data=tensor([[ 0.8892, -0.6508,  0.5891,  ...,  0.2842, -1.4362, -0.9173],
        [ 0.3881,  0.0659, -1.4504,  ...,  0.3339, -1.5077,  0.5238],
        [ 1.0422,  0.6417,  0.3228,  ...,  1.8734,  0.1246,  1.3441],
        ...,
        [-1.6866,  0.1933,  1.5083,  ..., -0.9600, -0.2240, -0.2208],
        [-1.6866,  0.1933,  1.5083,  ..., -0.9600, -0.2240, -0.2208],
        [-1.6866,  0.1933,  1.5083,  ..., -0.9600, -0.2240, -0.2208]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 58, 40, 28, 18,  9]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8220, -1.1578, -1.8539,  ...,  0.1709, -0.2342, -1.2553],
        [ 0.8892, -0.6507,  0.5890,  ...,  0.2842, -1.4363, -0.9175],
        [ 0.0154, -0.7739,  0.1878,  ...,  1.1410, -0.6112,  0.9392],
        ...,
        [-1.6866,  0.1935,  1.5083,  ..., -0.9599, -0.2242, -0.2210],
        [-1.6866,  0.1935,  1.5083,  ..., -0.9599, -0.2242, -0.2210],
        [-1

PackedSequence(data=tensor([[ 0.8896, -0.6499,  0.5877,  ...,  0.2840, -1.4375, -0.9181],
        [ 0.8896, -0.6499,  0.5877,  ...,  0.2840, -1.4375, -0.9181],
        [ 0.8221, -1.1585, -1.8552,  ...,  0.1716, -0.2328, -1.2565],
        ...,
        [-1.6875,  0.1934,  1.5101,  ..., -0.9609, -0.2262, -0.2228],
        [-1.6875,  0.1934,  1.5101,  ..., -0.9609, -0.2262, -0.2228],
        [-1.6875,  0.1934,  1.5101,  ..., -0.9609, -0.2262, -0.2228]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 53, 40, 25, 15,  9]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8223, -1.1583, -1.8550,  ...,  0.1716, -0.2328, -1.2566],
        [-0.2253,  1.7823, -0.9772,  ...,  0.3357, -2.1764,  1.5628],
        [ 0.8896, -0.6500,  0.5875,  ...,  0.2838, -1.4375, -0.9182],
        ...,
        [-1.6878,  0.1935,  1.5097,  ..., -0.9603, -0.2263, -0.2230],
        [-1.6878,  0.1935,  1.5097,  ..., -0.9603, -0.2263, -0.2230],
        [-1

PackedSequence(data=tensor([[ 0.8891, -0.6502,  0.5830,  ...,  0.2828, -1.4342, -0.9195],
        [ 0.8237, -1.1567, -1.8529,  ...,  0.1710, -0.2333, -1.2575],
        [ 0.5112,  1.2638, -1.1323,  ...,  1.2128,  2.6390,  0.7035],
        ...,
        [-1.6882,  0.1924,  1.5081,  ..., -0.9592, -0.2242, -0.2214],
        [-1.6882,  0.1924,  1.5081,  ..., -0.9592, -0.2242, -0.2214],
        [-1.6882,  0.1924,  1.5081,  ..., -0.9592, -0.2242, -0.2214]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 51, 38, 24, 13,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.5109,  1.2640, -1.1321,  ...,  1.2128,  2.6389,  0.7034],
        [ 0.8892, -0.6501,  0.5827,  ...,  0.2826, -1.4338, -0.9194],
        [ 1.0424,  0.6419,  0.3200,  ...,  1.8736,  0.1232,  1.3463],
        ...,
        [-1.6886,  0.1929,  1.5078,  ..., -0.9590, -0.2245, -0.2218],
        [-1.6886,  0.1929,  1.5078,  ..., -0.9590, -0.2245, -0.2218],
        [-1

PackedSequence(data=tensor([[ 0.5102,  1.2626, -1.1359,  ...,  1.2181,  2.6385,  0.7021],
        [ 0.5102,  1.2626, -1.1359,  ...,  1.2181,  2.6385,  0.7021],
        [ 0.8856, -0.6498,  0.5827,  ...,  0.2806, -1.4323, -0.9171],
        ...,
        [-1.6874,  0.1925,  1.5112,  ..., -0.9611, -0.2247, -0.2224],
        [-1.6874,  0.1925,  1.5112,  ..., -0.9611, -0.2247, -0.2224],
        [-1.6874,  0.1925,  1.5112,  ..., -0.9611, -0.2247, -0.2224]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 52, 44, 27, 17,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0454,  0.6374,  0.3178,  ...,  1.8755,  0.1255,  1.3430],
        [ 0.3898,  0.0705, -1.4461,  ...,  0.3325, -1.5108,  0.5252],
        [ 0.8856, -0.6498,  0.5826,  ...,  0.2805, -1.4323, -0.9171],
        ...,
        [-1.6875,  0.1925,  1.5112,  ..., -0.9612, -0.2249, -0.2225],
        [-1.6875,  0.1925,  1.5112,  ..., -0.9612, -0.2249, -0.2225],
        [-1

PackedSequence(data=tensor([[ 0.8244, -1.1570, -1.8544,  ...,  0.1703, -0.2326, -1.2578],
        [-0.2370,  0.1122, -0.1296,  ..., -0.4773, -0.6376,  1.6011],
        [ 0.3908,  0.0702, -1.4454,  ...,  0.3321, -1.5113,  0.5250],
        ...,
        [-1.6878,  0.1919,  1.5119,  ..., -0.9600, -0.2259, -0.2229],
        [-1.6878,  0.1919,  1.5119,  ..., -0.9600, -0.2259, -0.2229],
        [-1.6878,  0.1919,  1.5119,  ..., -0.9600, -0.2259, -0.2229]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 56, 45, 32, 18,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2250,  1.7827, -0.9783,  ...,  0.3359, -2.1782,  1.5644],
        [-0.2370,  0.1122, -0.1297,  ..., -0.4773, -0.6377,  1.6011],
        [ 0.8836, -0.6497,  0.5809,  ...,  0.2782, -1.4321, -0.9165],
        ...,
        [-1.6880,  0.1919,  1.5117,  ..., -0.9598, -0.2257, -0.2229],
        [-1.6880,  0.1919,  1.5117,  ..., -0.9598, -0.2257, -0.2229],
        [-1

PackedSequence(data=tensor([[ 0.8812, -0.6492,  0.5812,  ...,  0.2775, -1.4307, -0.9173],
        [ 1.0464,  0.6356,  0.3155,  ...,  1.8770,  0.1273,  1.3423],
        [ 0.3907,  0.0703, -1.4453,  ...,  0.3318, -1.5113,  0.5258],
        ...,
        [-1.6883,  0.1913,  1.5110,  ..., -0.9594, -0.2225, -0.2209],
        [-1.6883,  0.1913,  1.5110,  ..., -0.9594, -0.2225, -0.2209],
        [-1.6883,  0.1913,  1.5110,  ..., -0.9594, -0.2225, -0.2209]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 52, 38, 19,  9,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8810, -0.6491,  0.5814,  ...,  0.2774, -1.4307, -0.9172],
        [ 1.0387, -2.0996,  0.1094,  ...,  0.6856,  1.5094,  1.3751],
        [ 0.8810, -0.6491,  0.5814,  ...,  0.2774, -1.4307, -0.9172],
        ...,
        [-1.6881,  0.1913,  1.5111,  ..., -0.9594, -0.2223, -0.2207],
        [-1.6881,  0.1913,  1.5111,  ..., -0.9594, -0.2223, -0.2207],
        [-1

PackedSequence(data=tensor([[ 1.0446,  0.6361,  0.3157,  ...,  1.8773,  0.1269,  1.3423],
        [ 0.8216, -1.1561, -1.8534,  ...,  0.1661, -0.2348, -1.2578],
        [ 0.8786, -0.6485,  0.5856,  ...,  0.2781, -1.4298, -0.9165],
        ...,
        [-1.6848,  0.1929,  1.5103,  ..., -0.9617, -0.2233, -0.2208],
        [-1.6848,  0.1929,  1.5103,  ..., -0.9617, -0.2233, -0.2208],
        [-1.6848,  0.1929,  1.5103,  ..., -0.9617, -0.2233, -0.2208]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 57, 41, 31, 18,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8787, -0.6489,  0.5860,  ...,  0.2782, -1.4291, -0.9170],
        [-0.2379,  0.1104, -0.1305,  ..., -0.4748, -0.6403,  1.6049],
        [ 0.3897,  0.0699, -1.4456,  ...,  0.3316, -1.5115,  0.5262],
        ...,
        [-1.6849,  0.1926,  1.5104,  ..., -0.9620, -0.2234, -0.2207],
        [-1.6849,  0.1926,  1.5104,  ..., -0.9620, -0.2234, -0.2207],
        [-1

PackedSequence(data=tensor([[-0.2405,  0.1098, -0.1313,  ..., -0.4750, -0.6412,  1.6047],
        [ 0.8816, -0.6519,  0.5895,  ...,  0.2786, -1.4244, -0.9205],
        [ 0.8816, -0.6519,  0.5895,  ...,  0.2786, -1.4244, -0.9205],
        ...,
        [-1.6853,  0.1930,  1.5095,  ..., -0.9620, -0.2256, -0.2212],
        [-1.6853,  0.1930,  1.5095,  ..., -0.9620, -0.2256, -0.2212],
        [-1.6853,  0.1930,  1.5095,  ..., -0.9620, -0.2256, -0.2212]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 51, 39, 24, 18,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8817, -0.6518,  0.5899,  ...,  0.2787, -1.4244, -0.9207],
        [ 0.8817, -0.6518,  0.5899,  ...,  0.2787, -1.4244, -0.9207],
        [ 0.8817, -0.6518,  0.5899,  ...,  0.2787, -1.4244, -0.9207],
        ...,
        [-1.6853,  0.1929,  1.5098,  ..., -0.9619, -0.2256, -0.2211],
        [-1.6853,  0.1929,  1.5098,  ..., -0.9619, -0.2256, -0.2211],
        [-1

PackedSequence(data=tensor([[ 1.0352, -2.1040,  0.1073,  ...,  0.6887,  1.5123,  1.3788],
        [ 0.0119, -0.7819,  0.1791,  ...,  1.1422, -0.6101,  0.9304],
        [ 0.8830, -0.6512,  0.5937,  ...,  0.2782, -1.4235, -0.9239],
        ...,
        [-1.6866,  0.1925,  1.5110,  ..., -0.9599, -0.2262, -0.2197],
        [-1.6866,  0.1925,  1.5110,  ..., -0.9599, -0.2262, -0.2197],
        [-1.6866,  0.1925,  1.5110,  ..., -0.9599, -0.2262, -0.2197]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 53, 35, 30, 14,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8830, -0.6513,  0.5938,  ...,  0.2783, -1.4233, -0.9239],
        [ 0.3903,  0.0697, -1.4497,  ...,  0.3318, -1.5120,  0.5272],
        [ 0.8830, -0.6513,  0.5938,  ...,  0.2783, -1.4233, -0.9239],
        ...,
        [-1.6866,  0.1923,  1.5110,  ..., -0.9601, -0.2260, -0.2194],
        [-1.6866,  0.1923,  1.5110,  ..., -0.9601, -0.2260, -0.2194],
        [-1

PackedSequence(data=tensor([[ 0.3914,  0.0696, -1.4508,  ...,  0.3310, -1.5118,  0.5260],
        [ 0.8193, -1.1527, -1.8502,  ...,  0.1631, -0.2347, -1.2602],
        [ 0.8852, -0.6506,  0.5970,  ...,  0.2781, -1.4252, -0.9246],
        ...,
        [-1.6865,  0.1917,  1.5114,  ..., -0.9618, -0.2255, -0.2186],
        [-1.6865,  0.1917,  1.5114,  ..., -0.9618, -0.2255, -0.2186],
        [-1.6865,  0.1917,  1.5114,  ..., -0.9618, -0.2255, -0.2186]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 61, 53, 42, 31, 19,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2166,  1.7729, -0.9747,  ...,  0.3266, -2.1842,  1.5550],
        [ 0.8853, -0.6505,  0.5971,  ...,  0.2781, -1.4252, -0.9245],
        [ 0.8853, -0.6505,  0.5971,  ...,  0.2781, -1.4252, -0.9245],
        ...,
        [-1.6863,  0.1918,  1.5113,  ..., -0.9621, -0.2254, -0.2187],
        [-1.6863,  0.1918,  1.5113,  ..., -0.9621, -0.2254, -0.2187],
        [-1

PackedSequence(data=tensor([[-0.2076,  1.7739, -0.9717,  ...,  0.3334, -2.1959,  1.5428],
        [ 0.8202, -1.1522, -1.8475,  ...,  0.1629, -0.2346, -1.2606],
        [ 0.8877, -0.6507,  0.5963,  ...,  0.2790, -1.4260, -0.9252],
        ...,
        [-1.6845,  0.1924,  1.5105,  ..., -0.9629, -0.2243, -0.2186],
        [-1.6845,  0.1924,  1.5105,  ..., -0.9629, -0.2243, -0.2186],
        [-1.6845,  0.1924,  1.5105,  ..., -0.9629, -0.2243, -0.2186]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 59, 44, 27, 12,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8879, -0.6508,  0.5960,  ...,  0.2791, -1.4261, -0.9253],
        [ 0.3916,  0.0705, -1.4514,  ...,  0.3308, -1.5111,  0.5264],
        [-0.2661,  0.3818,  0.0159,  ...,  0.6567,  2.2231,  1.0732],
        ...,
        [-1.6850,  0.1928,  1.5103,  ..., -0.9624, -0.2245, -0.2188],
        [-1.6850,  0.1928,  1.5103,  ..., -0.9624, -0.2245, -0.2188],
        [-1

PackedSequence(data=tensor([[ 0.8918, -0.6475,  0.5921,  ...,  0.2795, -1.4280, -0.9221],
        [ 0.8213, -1.1531, -1.8444,  ...,  0.1650, -0.2358, -1.2634],
        [ 1.0402, -2.1056,  0.1015,  ...,  0.6901,  1.5105,  1.3766],
        ...,
        [-1.6869,  0.1932,  1.5098,  ..., -0.9609, -0.2219, -0.2168],
        [-1.6869,  0.1932,  1.5098,  ..., -0.9609, -0.2219, -0.2168],
        [-1.6869,  0.1932,  1.5098,  ..., -0.9609, -0.2219, -0.2168]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 50, 41, 29, 14,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.5143,  1.2667, -1.1427,  ...,  1.2183,  2.6330,  0.7042],
        [ 0.8921, -0.6472,  0.5919,  ...,  0.2795, -1.4281, -0.9217],
        [ 0.8921, -0.6472,  0.5919,  ...,  0.2795, -1.4281, -0.9217],
        ...,
        [-1.6883,  0.1947,  1.5107,  ..., -0.9594, -0.2225, -0.2177],
        [-1.6883,  0.1947,  1.5107,  ..., -0.9594, -0.2225, -0.2177],
        [-1

PackedSequence(data=tensor([[-0.2364,  0.1076, -0.1333,  ..., -0.4794, -0.6404,  1.6042],
        [ 0.5157,  1.2640, -1.1460,  ...,  1.2156,  2.6310,  0.7096],
        [-0.2077,  1.7780, -0.9656,  ...,  0.3347, -2.1964,  1.5368],
        ...,
        [-1.6836,  0.1928,  1.5113,  ..., -0.9639, -0.2256, -0.2174],
        [-1.6836,  0.1928,  1.5113,  ..., -0.9639, -0.2256, -0.2174],
        [-1.6836,  0.1928,  1.5113,  ..., -0.9639, -0.2256, -0.2174]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 51, 42, 32, 14,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8952, -0.6486,  0.5877,  ...,  0.2811, -1.4309, -0.9196],
        [ 0.8952, -0.6486,  0.5877,  ...,  0.2811, -1.4309, -0.9196],
        [ 0.8952, -0.6486,  0.5877,  ...,  0.2811, -1.4309, -0.9196],
        ...,
        [-1.6827,  0.1927,  1.5108,  ..., -0.9646, -0.2257, -0.2176],
        [-1.6827,  0.1927,  1.5108,  ..., -0.9646, -0.2257, -0.2176],
        [-1

PackedSequence(data=tensor([[ 0.8974, -0.6496,  0.5863,  ...,  0.2830, -1.4312, -0.9159],
        [ 0.8974, -0.6496,  0.5863,  ...,  0.2830, -1.4312, -0.9159],
        [ 0.4313, -0.1692, -1.3846,  ...,  1.3028, -0.0330,  0.4026],
        ...,
        [-1.6830,  0.1948,  1.5072,  ..., -0.9625, -0.2300, -0.2215],
        [-1.6830,  0.1948,  1.5072,  ..., -0.9625, -0.2300, -0.2215],
        [-1.6830,  0.1948,  1.5072,  ..., -0.9625, -0.2300, -0.2215]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 58, 44, 29, 14,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.5150,  1.2582, -1.1468,  ...,  1.2139,  2.6292,  0.7189],
        [ 1.0542,  0.6331,  0.3101,  ...,  1.8808,  0.1281,  1.3421],
        [ 0.8973, -0.6492,  0.5862,  ...,  0.2827, -1.4314, -0.9156],
        ...,
        [-1.6828,  0.1947,  1.5069,  ..., -0.9624, -0.2299, -0.2213],
        [-1.6828,  0.1947,  1.5069,  ..., -0.9624, -0.2299, -0.2213],
        [-1

PackedSequence(data=tensor([[ 0.8972, -0.6466,  0.5847,  ...,  0.2799, -1.4329, -0.9090],
        [ 0.8118, -1.1632, -1.8545,  ...,  0.1801, -0.2297, -1.2630],
        [ 0.8972, -0.6466,  0.5847,  ...,  0.2799, -1.4329, -0.9090],
        ...,
        [-1.6842,  0.1941,  1.5050,  ..., -0.9606, -0.2290, -0.2200],
        [-1.6842,  0.1941,  1.5050,  ..., -0.9606, -0.2290, -0.2200],
        [-1.6842,  0.1941,  1.5050,  ..., -0.9606, -0.2290, -0.2200]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 54, 42, 27, 11,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8972, -0.6466,  0.5845,  ...,  0.2800, -1.4329, -0.9086],
        [ 0.8972, -0.6466,  0.5845,  ...,  0.2800, -1.4329, -0.9086],
        [ 0.3970,  0.0694, -1.4578,  ...,  0.3370, -1.5067,  0.5301],
        ...,
        [-1.6842,  0.1943,  1.5056,  ..., -0.9604, -0.2291, -0.2199],
        [-1.6842,  0.1943,  1.5056,  ..., -0.9604, -0.2291, -0.2199],
        [-1

PackedSequence(data=tensor([[ 0.8117, -1.1623, -1.8528,  ...,  0.1801, -0.2270, -1.2618],
        [ 0.8117, -1.1623, -1.8528,  ...,  0.1801, -0.2270, -1.2618],
        [ 0.0174, -0.6921,  0.3945,  ...,  0.1305, -0.5323, -0.4546],
        ...,
        [-1.6866,  0.1945,  1.5064,  ..., -0.9581, -0.2304, -0.2190],
        [-1.6866,  0.1945,  1.5064,  ..., -0.9581, -0.2304, -0.2190],
        [-1.6866,  0.1945,  1.5064,  ..., -0.9581, -0.2304, -0.2190]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 55, 42, 30, 14,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.3984,  0.0698, -1.4598,  ...,  0.3364, -1.5074,  0.5276],
        [ 0.0173, -0.6921,  0.3945,  ...,  0.1305, -0.5323, -0.4547],
        [ 0.8971, -0.6485,  0.5826,  ...,  0.2806, -1.4325, -0.9059],
        ...,
        [-1.6867,  0.1942,  1.5063,  ..., -0.9583, -0.2308, -0.2188],
        [-1.6867,  0.1942,  1.5063,  ..., -0.9583, -0.2308, -0.2188],
        [-1

PackedSequence(data=tensor([[ 0.8998, -0.6525,  0.5804,  ...,  0.2830, -1.4327, -0.9087],
        [ 0.4000,  0.0701, -1.4588,  ...,  0.3365, -1.5080,  0.5259],
        [ 0.8998, -0.6525,  0.5804,  ...,  0.2830, -1.4327, -0.9087],
        ...,
        [-1.6878,  0.1960,  1.5035,  ..., -0.9585, -0.2347, -0.2222],
        [-1.6878,  0.1960,  1.5035,  ..., -0.9585, -0.2347, -0.2222],
        [-1.6878,  0.1960,  1.5035,  ..., -0.9585, -0.2347, -0.2222]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 56, 47, 34, 15,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8092, -1.1604, -1.8543,  ...,  0.1775, -0.2266, -1.2549],
        [ 1.0598,  0.6293,  0.3094,  ...,  1.8808,  0.1295,  1.3405],
        [-0.2375,  0.1105, -0.1369,  ..., -0.4814, -0.6419,  1.6010],
        ...,
        [-1.6878,  0.1960,  1.5033,  ..., -0.9582, -0.2343, -0.2220],
        [-1.6878,  0.1960,  1.5033,  ..., -0.9582, -0.2343, -0.2220],
        [-1

PackedSequence(data=tensor([[ 0.5244,  1.2537, -1.1407,  ...,  1.2209,  2.6400,  0.7183],
        [ 0.9000, -0.6550,  0.5818,  ...,  0.2849, -1.4360, -0.9111],
        [ 0.9000, -0.6550,  0.5818,  ...,  0.2849, -1.4360, -0.9111],
        ...,
        [-1.6869,  0.1934,  1.5022,  ..., -0.9551, -0.2286, -0.2174],
        [-1.6869,  0.1934,  1.5022,  ..., -0.9551, -0.2286, -0.2174],
        [-1.6869,  0.1934,  1.5022,  ..., -0.9551, -0.2286, -0.2174]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 52, 38, 23, 11,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8997, -0.6549,  0.5819,  ...,  0.2848, -1.4363, -0.9111],
        [ 0.8997, -0.6549,  0.5819,  ...,  0.2848, -1.4363, -0.9111],
        [ 0.8082, -1.1646, -1.8585,  ...,  0.1787, -0.2278, -1.2528],
        ...,
        [-1.6865,  0.1937,  1.5025,  ..., -0.9546, -0.2282, -0.2176],
        [-1.6865,  0.1937,  1.5025,  ..., -0.9546, -0.2282, -0.2176],
        [-1

PackedSequence(data=tensor([[ 0.8995, -0.6549,  0.5837,  ...,  0.2849, -1.4384, -0.9091],
        [ 1.0572,  0.6299,  0.3071,  ...,  1.8801,  0.1280,  1.3416],
        [ 0.8995, -0.6549,  0.5837,  ...,  0.2849, -1.4384, -0.9091],
        ...,
        [-1.6815,  0.1930,  1.5048,  ..., -0.9579, -0.2257, -0.2187],
        [-1.6815,  0.1930,  1.5048,  ..., -0.9579, -0.2257, -0.2187],
        [-1.6815,  0.1930,  1.5048,  ..., -0.9579, -0.2257, -0.2187]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 50, 40, 25, 16,  9]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0571,  0.6301,  0.3070,  ...,  1.8801,  0.1277,  1.3416],
        [-1.8895, -1.3038, -0.7297,  ...,  0.5751,  0.0728,  0.7205],
        [ 0.8995, -0.6550,  0.5838,  ...,  0.2850, -1.4386, -0.9090],
        ...,
        [-1.6816,  0.1932,  1.5045,  ..., -0.9578, -0.2256, -0.2190],
        [-1.6816,  0.1932,  1.5045,  ..., -0.9578, -0.2256, -0.2190],
        [-1

PackedSequence(data=tensor([[ 0.3961,  0.0712, -1.4566,  ...,  0.3322, -1.5054,  0.5285],
        [ 1.0262, -2.1068,  0.0991,  ...,  0.6838,  1.5242,  1.3667],
        [ 0.9000, -0.6566,  0.5838,  ...,  0.2859, -1.4375, -0.9078],
        ...,
        [-1.6821,  0.1923,  1.5032,  ..., -0.9569, -0.2274, -0.2164],
        [-1.6821,  0.1923,  1.5032,  ..., -0.9569, -0.2274, -0.2164],
        [-1.6821,  0.1923,  1.5032,  ..., -0.9569, -0.2274, -0.2164]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 48, 39, 30, 19,  8]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.9001, -0.6566,  0.5838,  ...,  0.2858, -1.4375, -0.9078],
        [ 0.9001, -0.6566,  0.5838,  ...,  0.2858, -1.4375, -0.9078],
        [ 1.0266, -2.1073,  0.0989,  ...,  0.6840,  1.5240,  1.3666],
        ...,
        [-1.6823,  0.1923,  1.5031,  ..., -0.9568, -0.2274, -0.2162],
        [-1.6823,  0.1923,  1.5031,  ..., -0.9568, -0.2274, -0.2162],
        [-1

PackedSequence(data=tensor([[ 1.0546,  0.6325,  0.3064,  ...,  1.8822,  0.1264,  1.3424],
        [ 0.8996, -0.6577,  0.5836,  ...,  0.2858, -1.4362, -0.9073],
        [ 0.8996, -0.6577,  0.5836,  ...,  0.2858, -1.4362, -0.9073],
        ...,
        [-1.6854,  0.1933,  1.5043,  ..., -0.9561, -0.2271, -0.2168],
        [-1.6854,  0.1933,  1.5043,  ..., -0.9561, -0.2271, -0.2168],
        [-1.6854,  0.1933,  1.5043,  ..., -0.9561, -0.2271, -0.2168]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 51, 37, 21, 12,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0289, -2.1117,  0.0961,  ...,  0.6842,  1.5236,  1.3648],
        [ 0.8995, -0.6576,  0.5837,  ...,  0.2858, -1.4361, -0.9072],
        [ 0.8995, -0.6576,  0.5837,  ...,  0.2858, -1.4361, -0.9072],
        ...,
        [-1.6858,  0.1933,  1.5045,  ..., -0.9560, -0.2272, -0.2171],
        [-1.6858,  0.1933,  1.5045,  ..., -0.9560, -0.2272, -0.2171],
        [-1

PackedSequence(data=tensor([[-0.2374,  0.1149, -0.1387,  ..., -0.4812, -0.6447,  1.5987],
        [ 1.0302, -2.1136,  0.0946,  ...,  0.6860,  1.5251,  1.3665],
        [ 0.0235, -0.6957,  0.3937,  ...,  0.1327, -0.5358, -0.4608],
        ...,
        [-1.6859,  0.1942,  1.5071,  ..., -0.9574, -0.2262, -0.2188],
        [-1.6859,  0.1942,  1.5071,  ..., -0.9574, -0.2262, -0.2188],
        [-1.6859,  0.1942,  1.5071,  ..., -0.9574, -0.2262, -0.2188]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 56, 41, 25, 15,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.5166,  1.2558, -1.1433,  ...,  1.2219,  2.6355,  0.7173],
        [ 0.8977, -0.6575,  0.5830,  ...,  0.2872, -1.4344, -0.9087],
        [ 0.8977, -0.6575,  0.5830,  ...,  0.2872, -1.4344, -0.9087],
        ...,
        [-1.6858,  0.1941,  1.5070,  ..., -0.9574, -0.2259, -0.2184],
        [-1.6858,  0.1941,  1.5070,  ..., -0.9574, -0.2259, -0.2184],
        [-1

PackedSequence(data=tensor([[ 1.0311, -2.1156,  0.0928,  ...,  0.6890,  1.5274,  1.3688],
        [ 0.8965, -0.6548,  0.5811,  ...,  0.2866, -1.4335, -0.9109],
        [ 1.0558,  0.6312,  0.3061,  ...,  1.8832,  0.1308,  1.3455],
        ...,
        [-1.6871,  0.1946,  1.5032,  ..., -0.9560, -0.2270, -0.2203],
        [-1.6871,  0.1946,  1.5032,  ..., -0.9560, -0.2270, -0.2203],
        [-1.6871,  0.1946,  1.5032,  ..., -0.9560, -0.2270, -0.2203]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 57, 45, 32, 23, 12]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0559,  0.6311,  0.3061,  ...,  1.8832,  0.1309,  1.3457],
        [ 0.5138,  1.2557, -1.1417,  ...,  1.2231,  2.6355,  0.7169],
        [-0.2181,  1.7776, -0.9822,  ...,  0.3366, -2.1981,  1.5424],
        ...,
        [-1.6870,  0.1944,  1.5032,  ..., -0.9560, -0.2269, -0.2200],
        [-1.6870,  0.1944,  1.5032,  ..., -0.9560, -0.2269, -0.2200],
        [-1

PackedSequence(data=tensor([[-0.2332,  0.1200, -0.1384,  ..., -0.4774, -0.6495,  1.6070],
        [ 1.0313, -2.1162,  0.0926,  ...,  0.6913,  1.5292,  1.3710],
        [ 0.8966, -0.6560,  0.5776,  ...,  0.2883, -1.4336, -0.9120],
        ...,
        [-1.6846,  0.1932,  1.5043,  ..., -0.9583, -0.2265, -0.2180],
        [-1.6846,  0.1932,  1.5043,  ..., -0.9583, -0.2265, -0.2180],
        [-1.6846,  0.1932,  1.5043,  ..., -0.9583, -0.2265, -0.2180]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 53, 40, 33, 17,  4]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0566,  0.6301,  0.3043,  ...,  1.8835,  0.1303,  1.3453],
        [ 0.5122,  1.2554, -1.1405,  ...,  1.2230,  2.6368,  0.7158],
        [ 0.8968, -0.6561,  0.5770,  ...,  0.2888, -1.4337, -0.9119],
        ...,
        [-1.6846,  0.1932,  1.5042,  ..., -0.9584, -0.2263, -0.2179],
        [-1.6846,  0.1932,  1.5042,  ..., -0.9584, -0.2263, -0.2179],
        [-1

PackedSequence(data=tensor([[ 0.8992, -0.6560,  0.5716,  ...,  0.2922, -1.4354, -0.9105],
        [ 0.8143, -1.1681, -1.8551,  ...,  0.1816, -0.2268, -1.2598],
        [-0.2347,  0.1241, -0.1356,  ..., -0.4791, -0.6459,  1.6045],
        ...,
        [-1.6859,  0.1938,  1.5038,  ..., -0.9584, -0.2256, -0.2171],
        [-1.6859,  0.1938,  1.5038,  ..., -0.9584, -0.2256, -0.2171],
        [-1.6859,  0.1938,  1.5038,  ..., -0.9584, -0.2256, -0.2171]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 62, 53, 43, 23,  8,  2]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8143, -1.1683, -1.8550,  ...,  0.1816, -0.2268, -1.2597],
        [ 0.8994, -0.6561,  0.5714,  ...,  0.2921, -1.4356, -0.9106],
        [ 0.3866,  0.0728, -1.4508,  ...,  0.3268, -1.5113,  0.5314],
        ...,
        [-1.6858,  0.1940,  1.5038,  ..., -0.9584, -0.2256, -0.2172],
        [-1.6858,  0.1940,  1.5038,  ..., -0.9584, -0.2256, -0.2172],
        [-1

PackedSequence(data=tensor([[ 0.9021, -0.6577,  0.5717,  ...,  0.2927, -1.4351, -0.9096],
        [ 0.9021, -0.6577,  0.5717,  ...,  0.2927, -1.4351, -0.9096],
        [ 0.9021, -0.6577,  0.5717,  ...,  0.2927, -1.4351, -0.9096],
        ...,
        [-1.6850,  0.1964,  1.5018,  ..., -0.9563, -0.2282, -0.2209],
        [-1.6850,  0.1964,  1.5018,  ..., -0.9563, -0.2282, -0.2209],
        [-1.6850,  0.1964,  1.5018,  ..., -0.9563, -0.2282, -0.2209]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 56, 42, 28, 12,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2334,  0.1247, -0.1365,  ..., -0.4786, -0.6440,  1.6043],
        [ 0.9023, -0.6576,  0.5720,  ...,  0.2929, -1.4349, -0.9096],
        [ 0.8138, -1.1686, -1.8541,  ...,  0.1819, -0.2259, -1.2593],
        ...,
        [-1.6853,  0.1963,  1.5017,  ..., -0.9562, -0.2285, -0.2211],
        [-1.6853,  0.1963,  1.5017,  ..., -0.9562, -0.2285, -0.2211],
        [-1

PackedSequence(data=tensor([[ 1.0571,  0.6267,  0.3051,  ...,  1.8837,  0.1331,  1.3416],
        [ 0.9034, -0.6565,  0.5744,  ...,  0.2954, -1.4336, -0.9086],
        [ 0.9034, -0.6565,  0.5744,  ...,  0.2954, -1.4336, -0.9086],
        ...,
        [-1.6844,  0.1935,  1.5018,  ..., -0.9558, -0.2258, -0.2185],
        [-1.6844,  0.1935,  1.5018,  ..., -0.9558, -0.2258, -0.2185],
        [-1.6844,  0.1935,  1.5018,  ..., -0.9558, -0.2258, -0.2185]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 61, 48, 33, 17,  3]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8112, -1.1689, -1.8562,  ...,  0.1814, -0.2256, -1.2558],
        [ 0.3881,  0.0726, -1.4530,  ...,  0.3277, -1.5126,  0.5256],
        [-0.2390,  0.1232, -0.1215,  ..., -0.4827, -0.6549,  1.6081],
        ...,
        [-1.6839,  0.1934,  1.5014,  ..., -0.9556, -0.2255, -0.2186],
        [-1.6839,  0.1934,  1.5014,  ..., -0.9556, -0.2255, -0.2186],
        [-1

PackedSequence(data=tensor([[ 1.0567,  0.6272,  0.3046,  ...,  1.8833,  0.1315,  1.3422],
        [ 0.9039, -0.6563,  0.5752,  ...,  0.2956, -1.4334, -0.9076],
        [ 0.9039, -0.6563,  0.5752,  ...,  0.2956, -1.4334, -0.9076],
        ...,
        [-1.6843,  0.1946,  1.5053,  ..., -0.9569, -0.2255, -0.2181],
        [-1.6843,  0.1946,  1.5053,  ..., -0.9569, -0.2255, -0.2181],
        [-1.6843,  0.1946,  1.5053,  ..., -0.9569, -0.2255, -0.2181]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 53, 42, 34, 18,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0566,  0.6272,  0.3048,  ...,  1.8833,  0.1313,  1.3421],
        [ 0.9039, -0.6563,  0.5752,  ...,  0.2957, -1.4333, -0.9075],
        [ 0.9039, -0.6563,  0.5752,  ...,  0.2957, -1.4333, -0.9075],
        ...,
        [-1.6844,  0.1947,  1.5054,  ..., -0.9572, -0.2257, -0.2182],
        [-1.6844,  0.1947,  1.5054,  ..., -0.9572, -0.2257, -0.2182],
        [-1

PackedSequence(data=tensor([[ 0.9030, -0.6566,  0.5744,  ...,  0.2965, -1.4334, -0.9072],
        [ 0.9030, -0.6566,  0.5744,  ...,  0.2965, -1.4334, -0.9072],
        [ 0.9030, -0.6566,  0.5744,  ...,  0.2965, -1.4334, -0.9072],
        ...,
        [-1.6837,  0.1935,  1.5051,  ..., -0.9595, -0.2266, -0.2153],
        [-1.6837,  0.1935,  1.5051,  ..., -0.9595, -0.2266, -0.2153],
        [-1.6837,  0.1935,  1.5051,  ..., -0.9595, -0.2266, -0.2153]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 53, 42, 29, 19,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.9029, -0.6566,  0.5742,  ...,  0.2964, -1.4334, -0.9072],
        [ 0.9029, -0.6566,  0.5742,  ...,  0.2964, -1.4334, -0.9072],
        [ 0.3922,  0.0718, -1.4542,  ...,  0.3299, -1.5128,  0.5264],
        ...,
        [-1.6839,  0.1934,  1.5051,  ..., -0.9595, -0.2268, -0.2152],
        [-1.6839,  0.1934,  1.5051,  ..., -0.9595, -0.2268, -0.2152],
        [-1

PackedSequence(data=tensor([[ 1.0542,  0.6289,  0.3089,  ...,  1.8845,  0.1280,  1.3396],
        [ 0.4119, -0.2988,  1.5828,  ...,  1.3888,  0.0099,  0.5607],
        [ 0.5055,  1.2607, -1.1351,  ...,  1.2132,  2.6310,  0.7261],
        ...,
        [-1.6857,  0.1941,  1.5047,  ..., -0.9587, -0.2281, -0.2156],
        [-1.6857,  0.1941,  1.5047,  ..., -0.9587, -0.2281, -0.2156],
        [-1.6857,  0.1941,  1.5047,  ..., -0.9587, -0.2281, -0.2156]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 57, 44, 25, 14,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.9014, -0.6571,  0.5733,  ...,  0.2957, -1.4341, -0.9066],
        [ 0.9014, -0.6571,  0.5733,  ...,  0.2957, -1.4341, -0.9066],
        [ 1.0250, -2.1130,  0.0972,  ...,  0.6893,  1.5326,  1.3709],
        ...,
        [-1.6858,  0.1942,  1.5048,  ..., -0.9586, -0.2281, -0.2155],
        [-1.6858,  0.1942,  1.5048,  ..., -0.9586, -0.2281, -0.2155],
        [-1

PackedSequence(data=tensor([[ 0.5060,  1.2617, -1.1326,  ...,  1.2094,  2.6306,  0.7266],
        [ 0.9004, -0.6567,  0.5738,  ...,  0.2967, -1.4351, -0.9052],
        [-0.2132,  1.7795, -0.9808,  ...,  0.3405, -2.2056,  1.5407],
        ...,
        [-1.6888,  0.1926,  1.5052,  ..., -0.9581, -0.2285, -0.2146],
        [-1.6888,  0.1926,  1.5052,  ..., -0.9581, -0.2285, -0.2146],
        [-1.6888,  0.1926,  1.5052,  ..., -0.9581, -0.2285, -0.2146]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 51, 38, 23, 11,  2]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.5061,  1.2616, -1.1326,  ...,  1.2092,  2.6306,  0.7267],
        [ 0.8101, -1.1665, -1.8520,  ...,  0.1800, -0.2282, -1.2562],
        [ 0.9004, -0.6566,  0.5740,  ...,  0.2967, -1.4352, -0.9050],
        ...,
        [-1.6890,  0.1924,  1.5051,  ..., -0.9580, -0.2287, -0.2147],
        [-1.6890,  0.1924,  1.5051,  ..., -0.9580, -0.2287, -0.2147],
        [-1

PackedSequence(data=tensor([[ 0.8107, -1.1622, -1.8489,  ...,  0.1779, -0.2269, -1.2557],
        [ 0.5066,  1.2616, -1.1326,  ...,  1.2078,  2.6302,  0.7258],
        [ 0.9001, -0.6543,  0.5747,  ...,  0.2951, -1.4347, -0.9050],
        ...,
        [-1.6880,  0.1904,  1.5038,  ..., -0.9568, -0.2273, -0.2132],
        [-1.6880,  0.1904,  1.5038,  ..., -0.9568, -0.2273, -0.2132],
        [-1.6880,  0.1904,  1.5038,  ..., -0.9568, -0.2273, -0.2132]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 52, 42, 34, 21,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.5066,  1.2615, -1.1327,  ...,  1.2079,  2.6301,  0.7257],
        [ 0.9001, -0.6543,  0.5747,  ...,  0.2950, -1.4347, -0.9050],
        [ 0.9001, -0.6543,  0.5747,  ...,  0.2950, -1.4347, -0.9050],
        ...,
        [-1.6879,  0.1904,  1.5041,  ..., -0.9567, -0.2270, -0.2131],
        [-1.6879,  0.1904,  1.5041,  ..., -0.9567, -0.2270, -0.2131],
        [-1

PackedSequence(data=tensor([[ 0.8986, -0.6532,  0.5754,  ...,  0.2934, -1.4338, -0.9054],
        [ 1.0553,  0.6276,  0.3081,  ...,  1.8857,  0.1266,  1.3420],
        [ 0.8100, -1.1619, -1.8504,  ...,  0.1759, -0.2263, -1.2565],
        ...,
        [-1.6860,  0.1921,  1.5054,  ..., -0.9566, -0.2263, -0.2153],
        [-1.6860,  0.1921,  1.5054,  ..., -0.9566, -0.2263, -0.2153],
        [-1.6860,  0.1921,  1.5054,  ..., -0.9566, -0.2263, -0.2153]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 52, 35, 22, 13,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 1.0553,  0.6276,  0.3082,  ...,  1.8856,  0.1265,  1.3420],
        [ 0.8100, -1.1619, -1.8505,  ...,  0.1758, -0.2263, -1.2565],
        [ 0.8984, -0.6533,  0.5755,  ...,  0.2934, -1.4337, -0.9054],
        ...,
        [-1.6860,  0.1922,  1.5055,  ..., -0.9566, -0.2262, -0.2155],
        [-1.6860,  0.1922,  1.5055,  ..., -0.9566, -0.2262, -0.2155],
        [-1

PackedSequence(data=tensor([[ 1.0550,  0.6275,  0.3078,  ...,  1.8846,  0.1266,  1.3414],
        [-0.6414,  0.6406, -0.4751,  ...,  0.0552,  0.0430, -0.2875],
        [ 0.0215, -0.7010,  0.3968,  ...,  0.1331, -0.5183, -0.4569],
        ...,
        [-1.6842,  0.1921,  1.5066,  ..., -0.9575, -0.2230, -0.2147],
        [-1.6842,  0.1921,  1.5066,  ..., -0.9575, -0.2230, -0.2147],
        [-1.6842,  0.1921,  1.5066,  ..., -0.9575, -0.2230, -0.2147]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 54, 40, 30, 18,  9]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8971, -0.6550,  0.5761,  ...,  0.2940, -1.4328, -0.9048],
        [ 0.8971, -0.6550,  0.5761,  ...,  0.2940, -1.4328, -0.9048],
        [ 1.0550,  0.6274,  0.3077,  ...,  1.8845,  0.1266,  1.3412],
        ...,
        [-1.6843,  0.1922,  1.5064,  ..., -0.9574, -0.2231, -0.2149],
        [-1.6843,  0.1922,  1.5064,  ..., -0.9574, -0.2231, -0.2149],
        [-1

PackedSequence(data=tensor([[-0.2080,  1.7772, -0.9830,  ...,  0.3415, -2.2072,  1.5425],
        [ 0.8979, -0.6557,  0.5753,  ...,  0.2940, -1.4318, -0.9052],
        [ 0.5048,  1.2585, -1.1356,  ...,  1.2080,  2.6234,  0.7167],
        ...,
        [-1.6859,  0.1920,  1.5046,  ..., -0.9569, -0.2265, -0.2167],
        [-1.6859,  0.1920,  1.5046,  ..., -0.9569, -0.2265, -0.2167],
        [-1.6859,  0.1920,  1.5046,  ..., -0.9569, -0.2265, -0.2167]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 49, 39, 30, 10,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8980, -0.6557,  0.5753,  ...,  0.2940, -1.4317, -0.9053],
        [ 0.5048,  1.2585, -1.1356,  ...,  1.2078,  2.6233,  0.7165],
        [-0.6417,  0.6423, -0.4757,  ...,  0.0552,  0.0421, -0.2877],
        ...,
        [-1.6858,  0.1919,  1.5045,  ..., -0.9569, -0.2265, -0.2166],
        [-1.6858,  0.1919,  1.5045,  ..., -0.9569, -0.2265, -0.2166],
        [-1

PackedSequence(data=tensor([[ 1.0333, -2.1176,  0.0955,  ...,  0.6879,  1.5298,  1.3693],
        [ 1.0333, -2.1176,  0.0955,  ...,  0.6879,  1.5298,  1.3693],
        [ 0.3904,  0.0698, -1.4541,  ...,  0.3327, -1.5143,  0.5287],
        ...,
        [-1.6868,  0.1923,  1.5056,  ..., -0.9572, -0.2245, -0.2160],
        [-1.6868,  0.1923,  1.5056,  ..., -0.9572, -0.2245, -0.2160],
        [-1.6868,  0.1923,  1.5056,  ..., -0.9572, -0.2245, -0.2160]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 58, 42, 30, 14,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8986, -0.6555,  0.5743,  ...,  0.2941, -1.4307, -0.9056],
        [ 0.8077, -1.1632, -1.8567,  ...,  0.1748, -0.2274, -1.2559],
        [ 0.8986, -0.6555,  0.5743,  ...,  0.2941, -1.4307, -0.9056],
        ...,
        [-1.6866,  0.1925,  1.5055,  ..., -0.9572, -0.2244, -0.2161],
        [-1.6866,  0.1925,  1.5055,  ..., -0.9572, -0.2244, -0.2161],
        [-1

PackedSequence(data=tensor([[ 0.8990, -0.6570,  0.5741,  ...,  0.2938, -1.4307, -0.9060],
        [ 1.0559,  0.6285,  0.3062,  ...,  1.8861,  0.1321,  1.3481],
        [ 1.0559,  0.6285,  0.3062,  ...,  1.8861,  0.1321,  1.3481],
        ...,
        [-1.6869,  0.1925,  1.5044,  ..., -0.9563, -0.2265, -0.2169],
        [-1.6869,  0.1925,  1.5044,  ..., -0.9563, -0.2265, -0.2169],
        [-1.6869,  0.1925,  1.5044,  ..., -0.9563, -0.2265, -0.2169]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 51, 38, 25, 15,  5]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[-0.2433,  0.1031, -0.1231,  ..., -0.4842, -0.6542,  1.5913],
        [ 0.8990, -0.6572,  0.5743,  ...,  0.2939, -1.4307, -0.9062],
        [ 0.8990, -0.6572,  0.5743,  ...,  0.2939, -1.4307, -0.9062],
        ...,
        [-1.6869,  0.1923,  1.5045,  ..., -0.9563, -0.2265, -0.2167],
        [-1.6869,  0.1923,  1.5045,  ..., -0.9563, -0.2265, -0.2167],
        [-1

PackedSequence(data=tensor([[ 0.8989, -0.6594,  0.5745,  ...,  0.2947, -1.4314, -0.9079],
        [ 0.0223, -0.6995,  0.3946,  ...,  0.1325, -0.5183, -0.4565],
        [ 0.8989, -0.6594,  0.5745,  ...,  0.2947, -1.4314, -0.9079],
        ...,
        [-1.6866,  0.1916,  1.5044,  ..., -0.9562, -0.2247, -0.2145],
        [-1.6866,  0.1916,  1.5044,  ..., -0.9562, -0.2247, -0.2145],
        [-1.6866,  0.1916,  1.5044,  ..., -0.9562, -0.2247, -0.2145]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 51, 40, 23, 12,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8988, -0.6595,  0.5744,  ...,  0.2947, -1.4315, -0.9080],
        [ 0.8988, -0.6595,  0.5744,  ...,  0.2947, -1.4315, -0.9080],
        [ 0.8988, -0.6595,  0.5744,  ...,  0.2947, -1.4315, -0.9080],
        ...,
        [-1.6868,  0.1916,  1.5041,  ..., -0.9560, -0.2249, -0.2146],
        [-1.6868,  0.1916,  1.5041,  ..., -0.9560, -0.2249, -0.2146],
        [-1

PackedSequence(data=tensor([[ 0.9010, -0.6601,  0.5744,  ...,  0.2954, -1.4325, -0.9072],
        [ 0.0020, -0.7734,  0.1870,  ...,  1.1507, -0.6012,  0.9378],
        [ 0.9010, -0.6601,  0.5744,  ...,  0.2954, -1.4325, -0.9072],
        ...,
        [-1.6891,  0.1931,  1.5047,  ..., -0.9571, -0.2248, -0.2166],
        [-1.6891,  0.1931,  1.5047,  ..., -0.9571, -0.2248, -0.2166],
        [-1.6891,  0.1931,  1.5047,  ..., -0.9571, -0.2248, -0.2166]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 64, 55, 40, 27, 15,  2]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.9010, -0.6600,  0.5744,  ...,  0.2954, -1.4325, -0.9070],
        [ 0.9010, -0.6600,  0.5744,  ...,  0.2954, -1.4325, -0.9070],
        [ 0.8077, -1.1643, -1.8552,  ...,  0.1762, -0.2235, -1.2517],
        ...,
        [-1.6890,  0.1931,  1.5048,  ..., -0.9572, -0.2249, -0.2167],
        [-1.6890,  0.1931,  1.5048,  ..., -0.9572, -0.2249, -0.2167],
        [-1

PackedSequence(data=tensor([[ 0.9008, -0.6593,  0.5748,  ...,  0.2952, -1.4326, -0.9048],
        [ 0.8084, -1.1660, -1.8543,  ...,  0.1780, -0.2217, -1.2522],
        [ 1.0585,  0.6290,  0.3009,  ...,  1.8911,  0.1334,  1.3480],
        ...,
        [-1.6879,  0.1918,  1.5050,  ..., -0.9560, -0.2242, -0.2150],
        [-1.6879,  0.1918,  1.5050,  ..., -0.9560, -0.2242, -0.2150],
        [-1.6879,  0.1918,  1.5050,  ..., -0.9560, -0.2242, -0.2150]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 60, 52, 42, 25, 14,  6]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.9008, -0.6593,  0.5750,  ...,  0.2953, -1.4325, -0.9047],
        [-0.2438,  0.1067, -0.1211,  ..., -0.4866, -0.6529,  1.5966],
        [ 0.8086, -1.1660, -1.8541,  ...,  0.1780, -0.2217, -1.2522],
        ...,
        [-1.6877,  0.1917,  1.5052,  ..., -0.9560, -0.2239, -0.2147],
        [-1.6877,  0.1917,  1.5052,  ..., -0.9560, -0.2239, -0.2147],
        [-1

PackedSequence(data=tensor([[ 0.8991, -0.6601,  0.5759,  ...,  0.2957, -1.4322, -0.9034],
        [-0.6449,  0.6366, -0.4804,  ...,  0.0581,  0.0429, -0.2882],
        [ 0.8991, -0.6601,  0.5759,  ...,  0.2957, -1.4322, -0.9034],
        ...,
        [-1.6874,  0.1922,  1.5052,  ..., -0.9561, -0.2239, -0.2160],
        [-1.6874,  0.1922,  1.5052,  ..., -0.9561, -0.2239, -0.2160],
        [-1.6874,  0.1922,  1.5052,  ..., -0.9561, -0.2239, -0.2160]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([64, 64, 64, 64, 63, 51, 40, 30, 19,  7]), sorted_indices=None, unsorted_indices=None)
PackedSequence(data=tensor([[ 0.8990, -0.6601,  0.5758,  ...,  0.2958, -1.4322, -0.9033],
        [ 0.8990, -0.6601,  0.5758,  ...,  0.2958, -1.4322, -0.9033],
        [ 0.8990, -0.6601,  0.5758,  ...,  0.2958, -1.4322, -0.9033],
        ...,
        [-1.6875,  0.1920,  1.5050,  ..., -0.9561, -0.2240, -0.2161],
        [-1.6875,  0.1920,  1.5050,  ..., -0.9561, -0.2240, -0.2161],
        [-1

In [24]:
# Save the model to disk (the pth-files will be submitted automatically together with your notebook)
if not skip_training:
    tools.save_model(encoder, '5_encoder.pth')
    tools.save_model(decoder, '5_decoder.pth')
else:
    hidden_size = 256
    encoder = Encoder(trainset.input_lang.n_words, embed_size, hidden_size)
    tools.load_model(encoder, '5_encoder.pth', device)
    
    decoder = Decoder(trainset.output_lang.n_words, embed_size, hidden_size)
    tools.load_model(decoder, '5_decoder.pth', device)

Do you want to save the model (type yes to confirm)? yes
Model saved to 5_encoder.pth.
Do you want to save the model (type yes to confirm)? yes
Model saved to 5_decoder.pth.


In [25]:
# This cell tests training accuracy

## Evaluation

Next we need to implement a function that converts a source sequence to an output sequence using the trained sequence-to-sequence model.

In [29]:
def translate(encoder, decoder, src_seq):
    """Translate given sentence src_seq using trained encoder and decoder.
    
    Args:
      encoder (Encoder): Trained encoder.
      decoder (Decoder): Trained decoder.
      src_seq of shape (src_seq_length,): LongTensor of word indices of the source sequence.
    
    Returns:
      out_seq of shape (out_seq_length,): LongTensor of word indices of the output sequence.
    """
    # YOUR CODE HERE
    try:
        hidden = encoder.init_hidden().to(device)
        src_seq = src_seq.view(-1, 1).to(device)
        out , hidden = encoder(src_seq,[src_seq.size(0)] , hidden)
        out , hidden = decoder(hidden)
        decoder_outputs = torch.max(out,dim=2)[1]
        return decoder_outputs
    except:
        raise NotImplementedError()

In [30]:
def test_translate_shapes():
    src_seq = torch.tensor([1, 2, 3, 4]).to(device)
    out_seq = translate(encoder, decoder, src_seq)
    assert out_seq.shape[0] <= MAX_LENGTH, \
        f"Too long output sequence: tgt_seq.shape[0]={tgt_seq.shape[0]}"
    print('Success')

test_translate_shapes()

PackedSequence(data=tensor([[-1.6890,  0.1918,  1.5031,  ..., -0.9552, -0.2277, -0.2183],
        [-0.2122,  1.7797, -0.9777,  ...,  0.3391, -2.2032,  1.5433],
        [ 1.5658,  1.0050, -0.6939,  ...,  0.4823,  0.1171,  0.9921],
        [-0.5672, -0.6882,  0.2482,  ...,  0.9172, -1.3292,  2.3397]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([1, 1, 1, 1]), sorted_indices=None, unsorted_indices=None)
Success


Let us now translate random sentences from the training set and print the source, target, and produced output.

If you trained the model well enough, the model should memorize the training data well.

In [31]:
# Translate random sentences from the training set
print('Translate training data:')
print('-----------------------------')
for i in range(5):
    src_sentence, tgt_sentence = trainset[np.random.choice(len(trainset))]
    print('SRC:', ' '.join(trainset.input_lang.index2word[i.item()] for i in src_sentence))
    print('TGT:', ' '.join(trainset.output_lang.index2word[i.item()] for i in tgt_sentence))
    out_sentence = translate(encoder, decoder, src_sentence)
    print('OUT:', ' '.join(trainset.output_lang.index2word[i.item()] for i in out_sentence))
    print('')

Translate training data:
-----------------------------
SRC: tu es si belle dans cette robe ! EOS
TGT: you re so beautiful in that dress . EOS
PackedSequence(data=tensor([[ 0.3936,  0.0738, -1.4573,  ...,  0.3330, -1.5065,  0.5300],
        [-0.7665,  1.1029, -0.3305,  ..., -1.2278, -1.0094, -0.6278],
        [-0.1730,  0.4422,  0.2513,  ...,  1.6282,  0.0563, -0.8459],
        ...,
        [ 1.6057, -0.0306, -0.1469,  ..., -1.6759,  0.0730,  0.3636],
        [ 1.0496,  0.1356,  0.3527,  ...,  0.0754,  0.2407, -1.6080],
        [-1.6890,  0.1918,  1.5031,  ..., -0.9552, -0.2277, -0.2183]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([1, 1, 1, 1, 1, 1, 1, 1, 1]), sorted_indices=None, unsorted_indices=None)
OUT: you re so beautiful in that dress . EOS EOS

SRC: vous etes des cretins . EOS
TGT: you are morons . EOS
PackedSequence(data=tensor([[ 0.5080,  1.2527, -1.1351,  ...,  1.2083,  2.6243,  0.7065],
        [-1.2052,  0.1859, -2.4632,  ...,  0.6781,  0.9663, -0.238

Now we translate random sentences from the test set. A well-trained model should output sentences that look similar to the target ones. The mistakes are usually done for words that were rare in the training set.

In [32]:
testset = TranslationDataset(data_dir, train=False)

In [33]:
print('Translate test data:')
print('-----------------------------')
for i in range(5):
    src_sentence, tgt_sentence = testset[np.random.choice(len(testset))]
    print('SRC:', ' '.join(testset.input_lang.index2word[i.item()] for i in src_sentence))
    print('TGT:', ' '.join(testset.output_lang.index2word[i.item()] for i in tgt_sentence))
    out_sentence = translate(encoder, decoder, src_sentence)
    print('OUT:', ' '.join(testset.output_lang.index2word[i.item()] for i in out_sentence))
    print('')

Translate test data:
-----------------------------
SRC: je ne suis pas occupe . EOS
TGT: i m not busy . EOS
PackedSequence(data=tensor([[ 0.8991, -0.6610,  0.5752,  ...,  0.2959, -1.4321, -0.9027],
        [-0.0271,  0.7982,  1.6409,  ..., -1.0739,  0.6021, -0.2495],
        [ 2.0962,  0.3000, -0.4227,  ..., -0.5595, -1.4396, -0.6483],
        ...,
        [-1.7943, -1.3819,  0.9851,  ...,  0.1078, -0.1593,  0.2307],
        [-1.9663,  1.8077,  0.2535,  ..., -0.7386,  0.7112,  0.1785],
        [-1.6890,  0.1918,  1.5031,  ..., -0.9552, -0.2277, -0.2183]],
       grad_fn=<PackPaddedSequenceBackward>), batch_sizes=tensor([1, 1, 1, 1, 1, 1, 1]), sorted_indices=None, unsorted_indices=None)
OUT: i m not busy busy . EOS EOS EOS EOS

SRC: il est etudiant . EOS
TGT: he is a student . EOS
PackedSequence(data=tensor([[ 0.8107, -1.1672, -1.8538,  ...,  0.1790, -0.2224, -1.2545],
        [ 0.1443,  0.0887,  1.3498,  ...,  0.4902,  1.9784, -0.7329],
        [ 1.0223, -0.0500,  0.3705,  ...,  0.9585