# RNN POS Tagger

A *part-of-speech tagger* (POS tagger) takes a sentence and labels each token (a word or punctuation mark) with its part of speech (POS).

For example:
- **Input:** The cat saw the dog on the bench.
- **Output:** DET NOUN VERB DET NOUN ADP DET NOUN PUNCT
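
As a quick illustration of this input/output format, the sketch below pairs each token of the example sentence with its tag. The tokenization is an assumption made here: the final period is split off as its own token, which is what makes the nine tags line up with the sentence.

```python
# Tokens and tags from the example above. Splitting off the final period
# as a separate token is an assumption made for illustration.
tokens = ["The", "cat", "saw", "the", "dog", "on", "the", "bench", "."]
tags = ["DET", "NOUN", "VERB", "DET", "NOUN", "ADP", "DET", "NOUN", "PUNCT"]

# A tagger emits exactly one tag per token.
assert len(tokens) == len(tags)

tagged = list(zip(tokens, tags))
for token, tag in tagged:
    print(f"{token}\t{tag}")
```

POS tagging is therefore a sequence labeling task: the output sequence always has the same length as the input sequence.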

In this part of the lecture, we will build a POS tagger using a simple recurrent network (SRN) architecture.

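Import the following. The `Field` and `BucketIterator` classes used below belong to torchtext's legacy API: they were moved to `torchtext.legacy` in torchtext 0.9 and removed in later releases, so this code needs an older torchtext version.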

In [1]:
import torch
from torch import nn
from torch import optim
from torchtext import data
from torchtext.datasets import UDPOS

## Model Definition

<img src="images/srn.png" style="width: 50%; height: 50%;"/>

In [2]:
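class RNNPOSTagger(nn.Module):
    """SRN tagger: embedding layer -> simple RNN -> linear layer over tags."""

    def __init__(self, x_vocab, y_vocab, embedding_size, hidden_size):
        super(RNNPOSTagger, self).__init__()
        self._embedding = nn.Embedding(len(x_vocab), embedding_size)
        self._rnn = nn.RNN(embedding_size, hidden_size)
        self._linear = nn.Linear(hidden_size, len(y_vocab))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: word indices of shape (seq_len, batch_size)
        embeddings = self._embedding(x)
        # rnn_output holds the hidden state at every time step
        rnn_output, _ = self._rnn(embeddings)
        # One score per tag at each position: (seq_len, batch_size, len(y_vocab))
        return self._linear(rnn_output)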
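
To make the data flow through the three layers concrete, here is a minimal sketch that traces tensor shapes using the same kinds of layers. All sizes are hypothetical toy values chosen for illustration, and the input is random rather than real data.

```python
import torch
from torch import nn

# Hypothetical toy sizes, chosen only for illustration.
vocab_size, num_tags = 20, 5
embedding_size, hidden_size = 8, 16
seq_len, batch_size = 7, 3

embedding = nn.Embedding(vocab_size, embedding_size)
rnn = nn.RNN(embedding_size, hidden_size)  # batch_first=False by default
linear = nn.Linear(hidden_size, num_tags)

# A batch of random word indices, laid out as (seq_len, batch_size).
x = torch.randint(0, vocab_size, (seq_len, batch_size))

embeddings = embedding(x)          # (seq_len, batch_size, embedding_size)
rnn_output, h_n = rnn(embeddings)  # (seq_len, batch_size, hidden_size)
logits = linear(rnn_output)        # (seq_len, batch_size, num_tags)

print(embeddings.shape, rnn_output.shape, h_n.shape, logits.shape)
```

Because the linear layer is applied to the hidden state at every time step, the model produces one vector of tag scores per input position.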

## Setup

Prepare the data by downloading the UDPOS dataset and building vocabularies for the words and the tags:

In [3]:
# Load the data
UDPOS.download("data")
x_field = data.Field()  # words
y_field = data.Field()  # POS tags
fields = [("x_field", x_field), ("y_field", y_field)]
train_data, val_data, test_data = UDPOS.splits(fields, root="data")

# Build the vocab (from all three splits, so no test word maps to <unk>)
x_field.build_vocab(train_data, val_data, test_data)
y_field.build_vocab(train_data, val_data, test_data)

downloading en-ud-v2.zip
en-ud-v2.zip: 100%|██████████| 688k/688k [00:00<00:00, 738kB/s]
extracting
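
To give a rough idea of what building a vocab involves, here is a simplified pure-Python sketch. It is not torchtext's implementation: the helper names `build_toy_vocab` and `encode` are hypothetical. It does follow the default convention of placing `<unk>` at index 0 and `<pad>` at index 1 before the regular tokens.

```python
from collections import Counter

def build_toy_vocab(sentences, specials=("<unk>", "<pad>")):
    """Map each token to an index: special tokens first, then by frequency."""
    counts = Counter(token for sentence in sentences for token in sentence)
    # Sort by descending frequency, breaking ties alphabetically.
    ordered = sorted(counts, key=lambda tok: (-counts[tok], tok))
    itos = list(specials) + ordered
    stoi = {token: index for index, token in enumerate(itos)}
    return itos, stoi

sentences = [["the", "cat", "saw", "the", "dog"], ["the", "dog", "barked"]]
itos, stoi = build_toy_vocab(sentences)

def encode(sentence, length):
    """Convert tokens to indices and pad the result to a fixed length."""
    ids = [stoi.get(tok, stoi["<unk>"]) for tok in sentence]
    return ids + [stoi["<pad>"]] * (length - len(ids))

print(itos)
print(encode(["the", "bird", "barked"], 5))
```

The unseen word "bird" maps to index 0 (`<unk>`), and the two padding positions map to index 1 (`<pad>`).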


Set up the model:

In [4]:
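# Embedding size and hidden size are both 50
model = RNNPOSTagger(x_field.vocab, y_field.vocab, 50, 50)
# Index 1 is the <pad> token in the vocab, so padded positions are ignored
loss_function = nn.CrossEntropyLoss(ignore_index=1)
optimizer = optim.Adam(model.parameters(), lr=.01)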
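
The effect of `ignore_index` can be checked on a small example. The sketch below uses random scores and made-up targets (both are assumptions for illustration) and compares the built-in masking against averaging the loss over the non-padding positions by hand.

```python
import torch
from torch import nn

torch.manual_seed(0)
PAD = 1  # assumed padding index, matching ignore_index=1 above

logits = torch.randn(4, 5)                # 4 positions, 5 tag classes
targets = torch.tensor([2, 3, PAD, PAD])  # the last two positions are padding

masked_loss = nn.CrossEntropyLoss(ignore_index=PAD)(logits, targets)

# The same quantity by hand: average only over the non-padding positions.
keep = targets != PAD
manual_loss = nn.CrossEntropyLoss()(logits[keep], targets[keep])

print(masked_loss.item(), manual_loss.item())
```

The two values agree, so padded positions contribute neither to the loss nor to its gradient.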

## Training

In [5]:
ctr = 0
for batch in data.BucketIterator(train_data, 32):
    # Forward pass: sum the loss over all time steps
    y_hat = model(batch.x_field)
    loss = 0
    for t in range(len(y_hat)):
        loss += loss_function(y_hat[t], batch.y_field[t])

    # Backward pass
    model.zero_grad()
    loss.backward()
    optimizer.step()

    # Display progress
    ctr += 1
    if ctr % 10 != 0:
        continue

    # Count correct predictions at non-padding positions only
    mask = batch.y_field != 1
    num_correct = ((y_hat.argmax(dim=2) == batch.y_field) & mask).sum()
    num_total = mask.sum()
    acc = float(num_correct) / float(num_total)
    print("Batch {} Loss: {:.3f}, Accuracy: {:.3f}".format(ctr, loss.item(), acc))

Batch 10 Loss: 77.078, Accuracy: 0.350
Batch 20 Loss: 59.508, Accuracy: 0.504
Batch 30 Loss: 64.290, Accuracy: 0.549
Batch 40 Loss: 55.345, Accuracy: 0.623
Batch 50 Loss: 118.026, Accuracy: 0.611
Batch 60 Loss: 68.761, Accuracy: 0.613
Batch 70 Loss: 53.593, Accuracy: 0.634
Batch 80 Loss: 70.788, Accuracy: 0.728
Batch 90 Loss: 62.213, Accuracy: 0.647
Batch 100 Loss: 90.538, Accuracy: 0.690
Batch 110 Loss: 31.041, Accuracy: 0.657
Batch 120 Loss: 140.014, Accuracy: 0.701
Batch 130 Loss: 44.190, Accuracy: 0.747
Batch 140 Loss: 47.666, Accuracy: 0.711
Batch 150 Loss: 44.179, Accuracy: 0.712
Batch 160 Loss: 51.545, Accuracy: 0.762
Batch 170 Loss: 31.979, Accuracy: 0.758
Batch 180 Loss: 49.518, Accuracy: 0.755
Batch 190 Loss: 29.433, Accuracy: 0.776
Batch 200 Loss: 35.392, Accuracy: 0.792
Batch 210 Loss: 29.950, Accuracy: 0.820
Batch 220 Loss: 28.512, Accuracy: 0.804
Batch 230 Loss: 43.355, Accuracy: 0.774
Batch 240 Loss: 64.275, Accuracy: 0.823
Batch 250 Loss: 29.085, Accuracy: 0.814
Batch 2
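
The loop over time steps is one way to accumulate the loss. A possible alternative, sketched below on random toy tensors, is to flatten the time and batch dimensions and call the loss once. Note that this sketch uses `reduction="sum"`, which is an assumption that differs from the default mean reduction used above, so its numbers are not directly comparable to the losses printed during training.

```python
import torch
from torch import nn

torch.manual_seed(0)
PAD = 1  # assumed padding index
seq_len, batch_size, num_tags = 6, 4, 5

# Random stand-ins for the model output and the gold tags.
y_hat = torch.randn(seq_len, batch_size, num_tags)
y = torch.randint(2, num_tags, (seq_len, batch_size))
y[4:, 0] = PAD  # pretend the first sentence is two tokens shorter

sum_loss = nn.CrossEntropyLoss(ignore_index=PAD, reduction="sum")

# Option 1: loop over time steps, as in the training code.
loop_total = sum(sum_loss(y_hat[t], y[t]) for t in range(seq_len))

# Option 2: flatten (seq_len, batch_size) into one dimension and call once.
flat_total = sum_loss(y_hat.reshape(-1, num_tags), y.reshape(-1))

# Dividing by the number of real tokens gives an average loss per token.
num_tokens = (y != PAD).sum()
per_token = flat_total / num_tokens

print(loop_total.item(), flat_total.item(), per_token.item())
```

With a summed loss, both options give the same total. Dividing by the token count yields a value that does not grow with sentence length or batch size, which may make progress easier to read.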

## Testing

In [6]:
loss = 0
num_correct = 0
num_total = 0
model.eval()
with torch.no_grad():  # no gradients are needed for evaluation
    for batch in data.BucketIterator(test_data, 32):
        y_hat = model(batch.x_field)
        for t in range(len(y_hat)):
            loss += loss_function(y_hat[t], batch.y_field[t]).item()
        # Count correct predictions at non-padding positions only
        mask = batch.y_field != 1
        num_correct += ((y_hat.argmax(dim=2) == batch.y_field) & mask).sum().item()
        num_total += mask.sum().item()

acc = num_correct / num_total
print("Testing Loss: {:.3f}, Testing Accuracy: {:.3f}".format(loss, acc))

Testing Loss: 1579.206, Testing Accuracy: 0.819
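
Accuracy is measured per token, so padding positions should be excluded from both the numerator and the denominator. The sketch below shows one way to do this on a hand-made toy batch. The helper name `masked_accuracy` and the toy tensors are hypothetical and not part of the lecture code.

```python
import torch

PAD = 1  # assumed padding index

def masked_accuracy(logits, targets, pad_index=PAD):
    """Fraction of non-padding positions whose argmax matches the target."""
    predictions = logits.argmax(dim=-1)
    mask = targets != pad_index
    num_correct = ((predictions == targets) & mask).sum().item()
    num_total = mask.sum().item()
    return num_correct / num_total

# Gold tags with shape (seq_len=3, batch_size=2); one position is padding.
targets = torch.tensor([[2, 3], [4, 2], [3, PAD]])

# Build scores whose argmax equals a chosen set of predictions.
predictions = torch.tensor([[2, 3], [4, 3], [2, PAD]])
logits = torch.nn.functional.one_hot(predictions, num_classes=5).float()

acc = masked_accuracy(logits, targets)
print(acc)
```

Three of the five real tokens are tagged correctly here. The padding position, where the prediction happens to equal the padding index, is not counted as a correct answer.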
