# PyTorch PackedSequence Tutorial
---

This article is best viewed on [nbviewer](https://nbviewer.jupyter.org/github/simonjisu/pytorch_tutorials/blob/master/00_Basic/PackedSequence/PackedSequence_Tutorial.ipynb); alternatively, clone this [repo](https://github.com/simonjisu/pytorch_tutorials.git).

## Contents

1. [Preprocessing](#1.-Preprocessing)
2. [How to use PackedSequence object in pytorch](#2.-How-to-use-PackedSequence-object-in-pytorch)

---

![fig1](./figs/0705img1.png)

figure from: https://medium.com/huggingface/understanding-emotions-from-keras-to-pytorch-3ccb61d5a983 

## 1. Preprocessing

This preprocessing is needed for almost every NLP task:

* Build a vocabulary in which each token maps to a single unique index.
* Add a `<pad>` token.
* Convert every token to its vocabulary index.

In [1]:
import torch
import torch.nn as nn
import numpy as np
np.random.seed(123)
batch_data = ["I love Mom ' s cooking", "I love you too !", "No way", "This is the shit", "Yes"]
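# tokenize on whitespace
input_seq = [s.split() for s in batch_data]
# length of the longest sentence in the batch
max_len = max(len(s) for s in input_seq)
# real tokens get indices starting at 1; index 0 is reserved for <pad>
# (iteration order of a set of strings can differ between runs, so the indices may too)
vocab = {w: i for i, w in enumerate(set(t for s in input_seq for t in s), 1)}
vocab["<pad>"] = 0
# right-pad every sentence to max_len, then map each token to its index
input_seq = [s + ["<pad>"] * (max_len - len(s)) for s in input_seq]
input_seq2idx = torch.LongTensor([[vocab[t] for t in s] for s in input_seq])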

In [2]:
input_seq

[['I', 'love', 'Mom', "'", 's', 'cooking'],
 ['I', 'love', 'you', 'too', '!', '<pad>'],
 ['No', 'way', '<pad>', '<pad>', '<pad>', '<pad>'],
 ['This', 'is', 'the', 'shit', '<pad>', '<pad>'],
 ['Yes', '<pad>', '<pad>', '<pad>', '<pad>', '<pad>']]

In [3]:
input_seq2idx

tensor([[14,  8,  7, 16,  1, 15],
        [14,  8,  4,  2, 10,  0],
        [ 5, 11,  0,  0,  0,  0],
        [ 3,  9, 12, 13,  0,  0],
        [ 6,  0,  0,  0,  0,  0]])
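
The indices shown above come from iterating over a Python `set`, whose order for strings can change between interpreter runs, so re-running the cell may produce different numbers. Below is a minimal sketch of a reproducible alternative, assuming first-seen order is acceptable. `build_vocab` and `pad_and_index` are hypothetical helper names and are not part of the original notebook.

```python
def build_vocab(tokenized, pad_token="<pad>"):
    """Map each token to a unique index in first-seen order; index 0 is the pad token."""
    vocab = {pad_token: 0}
    for sentence in tokenized:
        for token in sentence:
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab


def pad_and_index(tokenized, vocab, pad_token="<pad>"):
    """Right-pad every sentence to the longest length and convert tokens to indices."""
    max_len = max(len(s) for s in tokenized)
    return [[vocab[t] for t in s + [pad_token] * (max_len - len(s))] for s in tokenized]


batch_data = ["I love Mom ' s cooking", "I love you too !", "No way", "This is the shit", "Yes"]
tokenized = [s.split() for s in batch_data]
vocab = build_vocab(tokenized)
indexed = pad_and_index(tokenized, vocab)

print(len(vocab))   # 17 (16 distinct tokens plus <pad>)
print(indexed[2])   # [10, 11, 0, 0, 0, 0]
```

The resulting nested list can be passed to `torch.LongTensor` exactly as in the cell above; only the concrete index values differ.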

---

## 2. How to use PackedSequence object in pytorch

1. [using pack_padded_sequence](#2.1-using-pack_padded_sequence)
2. [usage in RNN](#2.2-usage-in-RNN)
3. [unpack to get output](#2.3-unpack-to-get-output)
4. [last hidden state mapped to output](#2.4-last-hidden-state-mapped-to-output)

### 2.1 using pack_padded_sequence

Sort the rows of the batch matrix in decreasing order of sentence length. By default, `pack_padded_sequence` expects the batch to be sorted this way.

![fig2](./figs/0705img2.png)

figure from: https://medium.com/huggingface/understanding-emotions-from-keras-to-pytorch-3ccb61d5a983 

In [4]:
from torch.nn.utils.rnn import pack_padded_sequence

In [5]:
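# length of a row = position of its last non-<pad> (non-zero) index + 1
input_lengths = torch.LongTensor([torch.max(input_seq2idx[i, :].nonzero()) + 1
                                  for i in range(input_seq2idx.size(0))])
# sort the lengths in decreasing order and reorder the batch rows to match
input_lengths, sorted_idx = input_lengths.sort(0, descending=True)
input_seq2idx = input_seq2idx[sorted_idx]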

In [6]:
input_seq2idx

tensor([[14,  8,  7, 16,  1, 15],
        [14,  8,  4,  2, 10,  0],
        [ 3,  9, 12, 13,  0,  0],
        [ 5, 11,  0,  0,  0,  0],
        [ 6,  0,  0,  0,  0,  0]])

In [7]:
input_lengths  # length of each sentence in the batch

tensor([6, 5, 4, 2, 1])
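
As an alternative to sorting by hand, recent PyTorch versions let `pack_padded_sequence` do the sorting itself through its `enforce_sorted=False` argument. The sketch below starts from the unsorted batch of section 1 and assumes that index 0 appears only as trailing padding, so the lengths can be read off with a mask. It is an illustration, not the approach used in the rest of this tutorial.

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Padded batch in its ORIGINAL (unsorted) order, copied from the preprocessing output.
padded = torch.tensor([[14,  8,  7, 16,  1, 15],
                       [14,  8,  4,  2, 10,  0],
                       [ 5, 11,  0,  0,  0,  0],
                       [ 3,  9, 12, 13,  0,  0],
                       [ 6,  0,  0,  0,  0,  0]])

# Count the non-pad entries per row (assumes 0 is used only for trailing padding).
lengths = (padded != 0).sum(dim=1)

# enforce_sorted=False: PyTorch sorts internally and remembers the permutation.
packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=False)
print(packed.batch_sizes)      # tensor([5, 4, 3, 3, 2, 1])
print(packed.sorted_indices)   # tensor([0, 1, 3, 2, 4])

# Unpacking restores the original batch order automatically.
restored, restored_lengths = pad_packed_sequence(packed, batch_first=True)
print(torch.equal(restored, padded))
```

The packed data and `batch_sizes` are the same as with manual sorting; the difference is that the permutation is stored on the `PackedSequence`, so nothing has to be un-sorted by hand afterwards.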

In [8]:
packed_input = pack_padded_sequence(input_seq2idx, input_lengths.tolist(), batch_first=True)

In [9]:
print(type(packed_input))
print(packed_input[0])  # packed data
print(packed_input[1])  # batch_sizes

<class 'torch.nn.utils.rnn.PackedSequence'>
tensor([14, 14,  3,  5,  6,  8,  8,  9, 11,  7,  4, 12, 16,  2, 13,  1, 10, 15])
tensor([5, 4, 3, 3, 2, 1])
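
To see where these two tensors come from, the packing can be reproduced by hand: the batch is read one time step at a time, and only the sequences that are still running at that step contribute a value. The following is a pure-Python sketch of that idea, with `pack_by_hand` as a hypothetical helper name; the real implementation lives inside PyTorch and is not written this way.

```python
def pack_by_hand(padded_rows, lengths):
    """Illustrative packing; rows must already be sorted by decreasing length."""
    data, batch_sizes = [], []
    for t in range(max(lengths)):
        # A sequence is still "alive" at time step t if it is longer than t.
        alive = [row[t] for row, n in zip(padded_rows, lengths) if n > t]
        data.extend(alive)
        batch_sizes.append(len(alive))
    return data, batch_sizes


padded_rows = [[14,  8,  7, 16,  1, 15],
               [14,  8,  4,  2, 10,  0],
               [ 3,  9, 12, 13,  0,  0],
               [ 5, 11,  0,  0,  0,  0],
               [ 6,  0,  0,  0,  0,  0]]
lengths = [6, 5, 4, 2, 1]

data, batch_sizes = pack_by_hand(padded_rows, lengths)
print(data)         # [14, 14, 3, 5, 6, 8, 8, 9, 11, 7, 4, 12, 16, 2, 13, 1, 10, 15]
print(batch_sizes)  # [5, 4, 3, 3, 2, 1]
```

So `batch_sizes[t]` tells the RNN how many sequences to process at time step `t`, which is how the `<pad>` positions are skipped entirely.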


### 2.2 usage in RNN

It does not matter which RNN type (RNN, LSTM, GRU) you use.

Also, we normally use an `Embedding layer` to map every token into a real-valued vector space; during training, the network learns a space suited to the task. If you are not familiar with the `Embedding layer`, see the references below.

* PyTorch documentation: https://pytorch.org/docs/stable/nn.html?highlight=embedding#torch.nn.Embedding
* My blog post (in Korean) with some figures showing how embeddings work: https://simonjisu.github.io/nlp/2018/04/20/allaboutwv2.html
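
As a small illustration of what the embedding layer does with padding, the sketch below looks up one padded sentence. The sizes (17 vocabulary entries, 5 dimensions) mirror this tutorial, and the token indices are taken from the example batch. With `padding_idx=0`, the row for `<pad>` is initialized to zeros.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
embed = nn.Embedding(num_embeddings=17, embedding_dim=5, padding_idx=0)

# "No way" followed by four <pad> indices.
tokens = torch.tensor([[5, 11, 0, 0, 0, 0]])
vectors = embed(tokens)   # shape: (batch=1, seq_len=6, embedding_dim=5)

print(vectors.shape)      # torch.Size([1, 6, 5])
print(vectors[0, 2:])     # the four <pad> rows are all zeros
```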

In [10]:
vocab_size = len(vocab)
hidden_size = 1
embedding_size = 5
num_layers = 3

In [11]:
embed = nn.Embedding(vocab_size, embedding_size, padding_idx=0)
# a vanilla RNN is used here; nn.GRU or nn.LSTM accept the same arguments
rnn = nn.RNN(input_size=embedding_size, hidden_size=hidden_size, num_layers=num_layers,
             bidirectional=False, batch_first=True)

In [12]:
embedded = embed(input_seq2idx)
packed_input = pack_padded_sequence(embedded, input_lengths.tolist(), batch_first=True)
packed_output, hidden = rnn(packed_input)

In [13]:
packed_output[0].size(), packed_output[1]

(torch.Size([18, 1]), tensor([5, 4, 3, 3, 2, 1], grad_fn=<PackPaddedBackward>))
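
The same packed input can be fed to the other recurrent layers as well. The sketch below runs `nn.RNN`, `nn.GRU` and `nn.LSTM` on a random stand-in for the embedded batch (the sizes are arbitrary assumptions, not the ones used above) and compares the shapes they return. The one difference to watch for is that `nn.LSTM` returns a tuple `(h_n, c_n)` as its second output.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence

torch.manual_seed(0)
lengths = [6, 5, 4, 2, 1]      # already sorted in decreasing order
x = torch.randn(5, 6, 5)       # (batch, max_len, embedding_size), random stand-in
packed = pack_padded_sequence(x, lengths, batch_first=True)

shapes = {}
for layer_cls in (nn.RNN, nn.GRU, nn.LSTM):
    layer = layer_cls(input_size=5, hidden_size=3, num_layers=2, batch_first=True)
    packed_out, state = layer(packed)
    # nn.LSTM returns (h_n, c_n); nn.RNN and nn.GRU return h_n only.
    h_n = state[0] if isinstance(state, tuple) else state
    shapes[layer_cls.__name__] = (tuple(packed_out.data.shape), tuple(h_n.shape))

print(shapes)   # every layer: packed data (18, 3), h_n (2, 5, 3)
```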

### 2.3 unpack to get output

In [14]:
from torch.nn.utils.rnn import pad_packed_sequence

In [15]:
output, output_lengths = pad_packed_sequence(packed_output, batch_first=True)

In [16]:
output.size(), output_lengths

(torch.Size([5, 6, 1]), tensor([6, 5, 4, 2, 1]))

`pad_packed_sequence` fills the output at every `<pad>` position with zeros.

In [17]:
packed_output[0]

tensor([[0.5452],
        [0.5452],
        [0.5571],
        [0.5942],
        [0.5959],
        [0.4812],
        [0.4812],
        [0.4254],
        [0.4137],
        [0.5322],
        [0.5655],
        [0.5119],
        [0.4390],
        [0.4866],
        [0.5362],
        [0.4834],
        [0.5585],
        [0.5477]], grad_fn=<CatBackward>)

In [18]:
output

tensor([[[0.5452],
         [0.4812],
         [0.5322],
         [0.4390],
         [0.4834],
         [0.5477]],

        [[0.5452],
         [0.4812],
         [0.5655],
         [0.4866],
         [0.5585],
         [0.0000]],

        [[0.5571],
         [0.4254],
         [0.5119],
         [0.5362],
         [0.0000],
         [0.0000]],

        [[0.5942],
         [0.4137],
         [0.0000],
         [0.0000],
         [0.0000],
         [0.0000]],

        [[0.5959],
         [0.0000],
         [0.0000],
         [0.0000],
         [0.0000],
         [0.0000]]], grad_fn=<TransposeBackward0>)
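
The relation between the packed output and the padded output can also be reproduced by hand. The pure-Python sketch below walks through `batch_sizes` and writes each packed value back to its (batch, time step) position, leaving every other cell at the padding value. `unpack_by_hand` is a hypothetical helper name, and the size-1 hidden dimension is dropped for readability.

```python
def unpack_by_hand(data, batch_sizes, pad_value=0.0):
    """Illustrative unpacking into a (batch, max_len) grid of Python floats."""
    batch, max_len = batch_sizes[0], len(batch_sizes)
    padded = [[pad_value] * max_len for _ in range(batch)]
    lengths = [0] * batch
    pos = 0
    for t, size in enumerate(batch_sizes):
        # Only the first `size` sequences are still running at time step t.
        for b in range(size):
            padded[b][t] = data[pos]
            lengths[b] += 1
            pos += 1
    return padded, lengths


# Values copied from packed_output[0] and packed_output[1] above.
data = [0.5452, 0.5452, 0.5571, 0.5942, 0.5959, 0.4812, 0.4812, 0.4254, 0.4137,
        0.5322, 0.5655, 0.5119, 0.4390, 0.4866, 0.5362, 0.4834, 0.5585, 0.5477]
batch_sizes = [5, 4, 3, 3, 2, 1]

padded, lengths = unpack_by_hand(data, batch_sizes)
print(padded[3])   # [0.5942, 0.4137, 0.0, 0.0, 0.0, 0.0]
print(lengths)     # [6, 5, 4, 2, 1]
```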

### 2.4 last hidden state mapped to output

In [19]:
import pandas as pd

In [20]:
def color_white(val):
    # hide zero (padded) cells by drawing them in white
    color = 'white' if val == 0 else 'black'
    return 'color: {}'.format(color)
def color_red(data):
    # color the last non-zero cell of a row (the last valid time step) red
    max_len = len(data)
    fmt = 'color: red'
    lst = []
    for i, v in enumerate(data):
        if (v != 0) and (i == max_len-1):
            lst.append(fmt)
        elif (v != 0) and (data[i+1] == 0):
            lst.append(fmt)
        else:
            lst.append('')
    return lst

In [21]:
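# one row per batch element, one column per time step (hidden_size is 1 here)
df = pd.DataFrame(np.concatenate([o.detach().numpy() for o in output.transpose(0, 1)], axis=1).round(4))
df.index.name = 'batch'
df.columns.name = 'hidden_step'
# note: on pandas >= 2.1, Styler.applymap is deprecated in favor of Styler.map
df.style.applymap(color_white).apply(color_red, axis=1)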

| batch / hidden_step | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| 0 | 0.5452 | 0.4812 | 0.5322 | 0.4390 | 0.4834 | **0.5477** |
| 1 | 0.5452 | 0.4812 | 0.5655 | 0.4866 | **0.5585** | 0.0 |
| 2 | 0.5571 | 0.4254 | 0.5119 | **0.5362** | 0.0 | 0.0 |
| 3 | 0.5942 | **0.4137** | 0.0 | 0.0 | 0.0 | 0.0 |
| 4 | **0.5959** | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |


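The **red vectors** (the last non-zero entry in each row) are the last hidden vectors of each sequence. They match the final hidden state of the last layer, `hidden[-1]`: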

In [22]:
hidden[-1]  # final hidden state of the last (top) layer, one row per sequence

tensor([[0.5477],
        [0.5585],
        [0.5362],
        [0.4137],
        [0.5959]], grad_fn=<SelectBackward>)
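
The same correspondence can be checked programmatically instead of by eye. The sketch below picks the output at the last valid time step of every sequence with `gather` and compares it with `hidden[-1]`. It uses a random stand-in for the embedded batch and an arbitrary `hidden_size` of 3, and it assumes a unidirectional RNN; for a bidirectional one, the backward direction finishes at time step 0 rather than at the last valid step.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)
lengths = torch.tensor([6, 5, 4, 2, 1])
x = torch.randn(5, 6, 5)   # (batch, max_len, embedding_size), random stand-in
rnn = nn.RNN(input_size=5, hidden_size=3, num_layers=3, batch_first=True)

packed_output, hidden = rnn(pack_padded_sequence(x, lengths, batch_first=True))
output, _ = pad_packed_sequence(packed_output, batch_first=True)   # (5, 6, 3)

# Index of the last valid time step per sequence, expanded to (batch, 1, hidden_size).
last_idx = (lengths - 1).view(-1, 1, 1).expand(-1, 1, output.size(2))
last_outputs = output.gather(1, last_idx).squeeze(1)               # (5, 3)

print(torch.allclose(last_outputs, hidden[-1]))   # True
```

Because the input was packed, `hidden[-1]` already holds these vectors, so the `gather` step is only needed when working from the padded output alone.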

In [23]:
packed_output[0], packed_output[1]

(tensor([[0.5452],
         [0.5452],
         [0.5571],
         [0.5942],
         [0.5959],
         [0.4812],
         [0.4812],
         [0.4254],
         [0.4137],
         [0.5322],
         [0.5655],
         [0.5119],
         [0.4390],
         [0.4866],
         [0.5362],
         [0.4834],
         [0.5585],
         [0.5477]], grad_fn=<CatBackward>),
 tensor([5, 4, 3, 3, 2, 1], grad_fn=<PackPaddedBackward>))