# CNN-based Text Classification

## Imports

Here are the packages we need to import.

In [5]:
from nlpmodels.models import text_cnn
from nlpmodels.utils import train, utils, text_cnn_dataset
from argparse import Namespace

# Seed the random number generators so the run is repeatable.
utils.set_seed_everywhere()


## Sentiment Analysis with CNNs

Following Kim's paper ("Convolutional Neural Networks for Sentence Classification", 2014), we feed word embeddings
into a convolutional layer to perform sentiment analysis.
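
To make the convolution-plus-pooling idea concrete, here is a minimal pure-Python sketch of one filter sliding over a sequence of embeddings, followed by ReLU and max-over-time pooling. It is not the `nlpmodels` implementation: the function name `conv_max_pool`, the toy embeddings, and the kernel values are all made up for illustration, and it assumes one scalar feature per filter, as described in Kim's paper.

```python
def conv_max_pool(embedded, kernel, bias=0.0):
    """Slide one filter over the sequence, apply ReLU, keep the largest activation."""
    window = len(kernel)  # number of consecutive tokens the filter covers
    activations = []
    for start in range(len(embedded) - window + 1):
        total = bias
        for offset in range(window):
            token_vec = embedded[start + offset]
            total += sum(e * k for e, k in zip(token_vec, kernel[offset]))
        activations.append(max(0.0, total))  # ReLU
    return max(activations)  # max-over-time pooling


# Toy input: 5 tokens, each with a 2-dimensional embedding (hypothetical values).
embedded = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [2.0, 1.0]]

# One hypothetical filter per window size, keyed by window size.
kernels = {
    2: [[1.0, 0.0], [0.0, 1.0]],
    3: [[1.0, 1.0], [0.0, 0.0], [1.0, 0.0]],
}

# Concatenating the pooled values gives the feature vector for the classifier.
features = [conv_max_pool(embedded, kernel) for kernel in kernels.values()]
print(features)  # [2.0, 4.0]
```

Each filter contributes one number regardless of sequence length, so, assuming the pooled outputs are concatenated as in the paper, a model with 3 filters for each of 3 window sizes would pass a 9-dimensional vector to the final classification layer.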

### Hyper-parameters

These are the data-processing and model-training hyper-parameters for this run. Note that we train a smaller model
than the one in the paper, for fewer iterations, on a CPU. The run is only meant to demonstrate that the pipeline works.

In [6]:
args = Namespace(
    # Model hyper-parameters
    max_sequence_length=175,
    dim_model=100,
    num_filters=3,
    window_sizes=[3, 4, 5],
    num_classes=2,
    dropout=0.5,  # from paper
    # Training hyper-parameters
    num_epochs=30,  # 30 from original implementation
    learning_rate=1e-3,
    batch_size=50,
)
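
A fixed `max_sequence_length` usually means every review is clipped or padded to the same number of tokens before batching. The sketch below shows one common way to do that. It is an assumption about the general technique, not a description of what `TextCNNDataset` does internally; the helper `pad_or_truncate` and the padding index `0` are hypothetical.

```python
def pad_or_truncate(token_ids, max_length, pad_index=0):
    """Return exactly max_length ids: clip long inputs, right-pad short ones."""
    clipped = token_ids[:max_length]
    return clipped + [pad_index] * (max_length - len(clipped))


short = pad_or_truncate([7, 8, 9], max_length=5)
long = pad_or_truncate(list(range(10)), max_length=5)
print(short)  # [7, 8, 9, 0, 0]
print(long)   # [0, 1, 2, 3, 4]
```

With equal-length sequences, a batch can be stacked into a single rectangular tensor, and the padding index can be ignored when computing the loss or embeddings.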

In [7]:
# Build the training data loader and the vocabulary from the raw data.
train_loader, vocab = text_cnn_dataset.TextCNNDataset.get_training_dataloader(args)

# Instantiate the model with the hyper-parameters defined above.
model = text_cnn.TextCNN(
    vocab_size=len(vocab),
    dim_model=args.dim_model,
    num_filters=args.num_filters,
    window_sizes=args.window_sizes,
    num_classes=args.num_classes,
    dropout=args.dropout,
)

trainer = train.TextCNNTrainer(args, vocab.mask_index, model, train_loader, vocab)

25000lines [00:01, 18234.04lines/s]


Let's run this.

In [8]:
trainer.run()

[Epoch 0]: 100%|██████████| 263/263 [00:49<00:00,  5.37it/s, accuracy=4.67, loss=0]       
[Epoch 1]: 100%|██████████| 263/263 [00:48<00:00,  5.38it/s, accuracy=4.67, loss=2.57e-5] 
[Epoch 2]: 100%|██████████| 263/263 [00:49<00:00,  5.32it/s, accuracy=4.67, loss=0]       
[Epoch 3]: 100%|██████████| 263/263 [00:49<00:00,  5.27it/s, accuracy=4.67, loss=0]       
[Epoch 4]: 100%|██████████| 263/263 [00:49<00:00,  5.28it/s, accuracy=4.67, loss=0]       
[Epoch 5]: 100%|██████████| 263/263 [00:50<00:00,  5.22it/s, accuracy=4.67, loss=0]       
[Epoch 6]: 100%|██████████| 263/263 [00:50<00:00,  5.23it/s, accuracy=4.67, loss=0]       
[Epoch 7]: 100%|██████████| 263/263 [00:51<00:00,  5.13it/s, accuracy=4.67, loss=0.272]   
[Epoch 8]: 100%|██████████| 263/263 [00:50<00:00,  5.16it/s, accuracy=4.67, loss=0]       
[Epoch 9]: 100%|██████████| 263/263 [00:51<00:00,  5.14it/s, accuracy=4.67, loss=0]       
[Epoch 10]: 100%|██████████| 263/263 [00:51<00:00,  5.11it/s, accuracy=4.67, loss=0]      

Finished Training...


The goal is only to show how this works; feel free to experiment with the hyper-parameters.

Ideally, we would evaluate the trained model on a held-out validation or test set to diagnose its performance.
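
A held-out check of this kind could look roughly like the sketch below. It uses only the standard library and does not touch `nlpmodels`: `split_train_val`, `accuracy`, the toy examples, and the constant `predict` function are all hypothetical stand-ins for a real dataset and a trained model.

```python
import random


def split_train_val(examples, val_fraction=0.1, seed=0):
    """Shuffle a copy of the examples and hold out a fraction for validation."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def accuracy(predictions, labels):
    """Fraction of predictions that match their labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Toy labelled data: (text, label) pairs with alternating labels.
examples = [(f"review {i}", i % 2) for i in range(20)]
train_set, val_set = split_train_val(examples, val_fraction=0.2)


def predict(text):
    """Stand-in for a trained model: always predicts the positive class."""
    return 1


val_labels = [label for _, label in val_set]
val_predictions = [predict(text) for text, _ in val_set]
print(len(train_set), len(val_set))  # 16 4
print(accuracy(val_predictions, val_labels))
```

The key point is that the validation examples are never seen during training, so the accuracy measured on them reflects generalization rather than memorization.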
