In the earlier section, we used classical machine learning techniques to build our text classifiers. In this chapter, we replace those with two popular deep learning architectures: Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs).

We will build the simplest possible architectures. We assume general familiarity with CNNs and RNNs and do not reintroduce them here. Along the way, we share some best practices for building these deep networks.

Lastly, we use one popular architecture, the Bi-LSTM, to perform some of the linguistic tasks we covered earlier.

Skills learned:
- SKILL : Programming comfortably in PyTorch
- SKILL : Tokenizing text and using the word embeddings that we saw earlier
- SKILL : Using CNNs for text classification
- SKILL : Understanding what recurrent networks are and using them for text classification; stacking RNN layers and using bidirectional RNNs to build more powerful sequence-processing models
- SKILL : Using Bi-LSTM models for linguistic tasks

# PyTorch Introduction
- The three main parts: the model architecture, the loss function and the training strategy
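
The three parts listed above can be sketched in a few lines. This is a minimal illustration on made-up toy data, not the chapter's classifier: the layer sizes, learning rate, and number of epochs are all assumptions chosen only to keep the example small.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy data (made up for illustration): 64 examples, 10 features, 2 classes
X = torch.randn(64, 10)
y = (X[:, 0] > 0).long()

# 1. Model architecture
model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 2))

# 2. Loss function (expects raw logits and integer class labels)
loss_fn = nn.CrossEntropyLoss()

# 3. Training strategy: an optimizer plus a loop over the data
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for epoch in range(20):
    optimizer.zero_grad()        # clear gradients from the previous step
    logits = model(X)            # forward pass
    loss = loss_fn(logits, y)    # compare predictions with labels
    loss.backward()              # backpropagate
    optimizer.step()             # update the weights
    losses.append(loss.item())
```

Swapping any one of the three parts (a different network, loss, or optimizer) leaves the other two unchanged, which is why it helps to think of them separately.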

# CNN Classifiers

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F

In [2]:
use_gpu = True
if use_gpu:
    assert torch.cuda.is_available()

In [3]:
torch.cuda.device_count()

1

In [4]:
class CharCNN(nn.Module):
    """Character-level CNN: six 1-D conv blocks followed by three fully connected layers."""
    def __init__(self, args):
        super().__init__()
        self.conv1 = nn.Sequential(
            nn.Conv1d(args.num_features, 256, kernel_size=7, stride=1),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=3, stride=3)
        )

        self.conv2 = nn.Sequential(
            nn.Conv1d(256, 256, kernel_size=7, stride=1),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=3, stride=3)
        )            
            
        self.conv3 = nn.Sequential(
            nn.Conv1d(256, 256, kernel_size=3, stride=1),
            nn.ReLU()
        )

        self.conv4 = nn.Sequential(
            nn.Conv1d(256, 256, kernel_size=3, stride=1),
            nn.ReLU()    
        )
        
        self.conv5 = nn.Sequential(
            nn.Conv1d(256, 256, kernel_size=3, stride=1),
            nn.ReLU()
        )

        self.conv6 = nn.Sequential(
            nn.Conv1d(256, 256, kernel_size=3, stride=1),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=3, stride=3)
        )
            
        
        # 8704 = 256 channels x 34 positions left after conv6's pooling
        self.fc1 = nn.Sequential(
            nn.Linear(8704, 1024),
            nn.ReLU(),
            nn.Dropout(p=args.dropout)
        )
        
        self.fc2 = nn.Sequential(
            nn.Linear(1024, 1024),
            nn.ReLU(),
            nn.Dropout(p=args.dropout)
        )

        self.fc3 = nn.Linear(1024, 4)
        self.log_softmax = nn.LogSoftmax(dim=1)  # normalize over the class dimension

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.conv3(x)
        x = self.conv4(x)
        x = self.conv5(x)
        x = self.conv6(x)

        x = x.view(x.size(0), -1)  # flatten to (batch, 8704) for the linear layers
        x = self.fc1(x)
        x = self.fc2(x)
        x = self.fc3(x)
        x = self.log_softmax(x)
        
        return x
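
The hard-coded `8704` in `fc1` ties the network to a particular input length. The sketch below traces the sequence length through the conv and pooling layers defined above, using the standard output-length formula for unpadded 1-D layers. The input length of 1014 characters is an assumption (the notebook does not state it); it is one length that is consistent with 8704. The helper names are hypothetical.

```python
def conv_out_len(length, kernel_size, stride=1):
    """Output length of an unpadded 1-D convolution or max-pool layer."""
    return (length - kernel_size) // stride + 1


def charcnn_flat_features(input_len, channels=256):
    """Trace the sequence length through the six conv blocks of CharCNN."""
    # (kernel_size, stride) for each conv / pool layer, in order
    layers = [
        (7, 1), (3, 3),   # conv1 + max-pool
        (7, 1), (3, 3),   # conv2 + max-pool
        (3, 1),           # conv3
        (3, 1),           # conv4
        (3, 1),           # conv5
        (3, 1), (3, 3),   # conv6 + max-pool
    ]
    length = input_len
    for kernel_size, stride in layers:
        length = conv_out_len(length, kernel_size, stride)
    return channels * length


# 1014 is an assumed input length, not taken from the notebook
flat_features = charcnn_flat_features(1014)
print(flat_features)  # prints 8704
```

If the input length changes, the first linear layer's input size has to change with it, so computing it this way is safer than hard-coding the number.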

In [5]:
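# Note: Field and datasets.IMDB.splits belong to the legacy torchtext API
# (moved to torchtext.legacy in 0.9 and removed in 0.12).
from torchtext import data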
from torchtext import datasets
from torchtext.vocab import GloVe

In [6]:
TEXT = data.Field(lower=True, include_lengths=True, batch_first=True)
LABEL = data.Field(sequential=False)

In [8]:
train, test = datasets.IMDB.splits(TEXT, LABEL)

UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 803: character maps to <undefined>
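
The `'charmap'` codec in this error suggests the dataset files were read with a Windows code page such as cp1252 rather than UTF-8. That is an inference from the message, assuming the notebook was run on Windows. The sketch below reproduces the same failure with a made-up string: the UTF-8 encoding of a right curly quote ends in byte `0x9d`, which cp1252 leaves undefined.

```python
# Made-up sample text containing curly quotes, encoded as UTF-8
raw = "He said \u201cgreat film\u201d".encode("utf-8")

try:
    raw.decode("cp1252")           # what a cp1252 default encoding would attempt
    decoded_ok = True
except UnicodeDecodeError as err:
    decoded_ok = False
    failing_byte = raw[err.start]  # the byte the codec could not map

# Decoding with the encoding the data was actually written in succeeds
text = raw.decode("utf-8")
print(decoded_ok, hex(failing_byte))  # prints: False 0x9d
```

One possible workaround, offered as an assumption rather than a tested fix for this notebook, is to start Python in UTF-8 mode (for example by setting the `PYTHONUTF8=1` environment variable) so that files opened without an explicit encoding default to UTF-8.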

# Bi-LSTM Classifiers
- What is a Bi-LSTM? 
- What is an RNN?
- Implement LSTM-only classification example
- Implement Bi-LSTM classification example
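
As a starting point for the two classification examples in the outline above, here is a minimal sketch of a classifier that can run as either an LSTM or a Bi-LSTM. The class name `LSTMClassifier` and all sizes are hypothetical. It also assumes token ids have already been produced by some tokenizer and ignores padding for simplicity (with padded batches, packed sequences would be needed for the final states to be meaningful).

```python
import torch
import torch.nn as nn


class LSTMClassifier(nn.Module):
    """Hypothetical minimal text classifier; all sizes are illustrative."""

    def __init__(self, vocab_size, embed_dim, hidden_dim, num_classes,
                 bidirectional=False):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=bidirectional)
        num_directions = 2 if bidirectional else 1
        self.fc = nn.Linear(hidden_dim * num_directions, num_classes)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)   # (batch, seq_len, embed_dim)
        _, (h_n, _) = self.lstm(embedded)      # h_n: (num_directions, batch, hidden_dim)
        # Concatenate the final hidden state of each direction
        features = torch.cat([h_n[i] for i in range(h_n.size(0))], dim=1)
        return self.fc(features)               # raw logits, one row per example


# Random token ids standing in for a tokenized batch: 4 texts, 12 tokens each
tokens = torch.randint(0, 100, (4, 12))

lstm_model = LSTMClassifier(100, 32, 64, 2)
bilstm_model = LSTMClassifier(100, 32, 64, 2, bidirectional=True)

lstm_logits = lstm_model(tokens)
bilstm_logits = bilstm_model(tokens)
```

The only difference between the two variants is the `bidirectional` flag: the Bi-LSTM reads the sequence in both directions, so the linear layer sees twice as many features.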

# Bi-LSTM for Linguistic Tasks