# IMDb Reviews (Classical)

This notebook trains and evaluates classical baselines for the IMDb Reviews sentiment classification task, a binary text classification problem.
You can find information about the dataset at https://www.tensorflow.org/datasets/catalog/imdb_reviews.

In [1]:
import jax
import tensorflow as tf
tf.config.set_visible_devices([], device_type='GPU')  # Hide GPUs from TF so that it does not grab all the GPU memory.
tf.random.set_seed(42)  # For reproducibility.

from quantum_transformers.datasets import get_imdb_dataloaders
from quantum_transformers.training import train_and_evaluate
from quantum_transformers.transformers import Transformer

data_dir = '/global/cfs/cdirs/m4392/salcc/data'

2023-11-03 03:06:28.166936: E tensorflow/compiler/xla/stream_executor/cuda/cuda_dnn.cc:9342] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-11-03 03:06:28.166966: E tensorflow/compiler/xla/stream_executor/cuda/cuda_fft.cc:609] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-11-03 03:06:28.166993: E tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:1518] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
Please first ``pip install -U cirq`` to enable related functionality in translation module


The models are trained using the following devices:

In [2]:
for d in jax.devices():
    print(d, d.device_kind)

gpu:0 NVIDIA A100-SXM4-40GB


Let's check how big the vocabulary is and look at one example review, in both tokenized and raw form.

In [3]:
(imdb_train_dataloader, imdb_val_dataloader, imdb_test_dataloader), vocab, tokenizer = get_imdb_dataloaders(batch_size=32, data_dir=data_dir, max_vocab_size=20_000, max_seq_len=512)
print(f"Vocabulary size: {len(vocab)}")
first_batch = next(iter(imdb_train_dataloader))
print(first_batch[0][0])
print(' '.join(map(bytes.decode, tokenizer.detokenize(first_batch[0])[0].numpy().tolist())))

Cardinalities (train, val, test): 22500 2500 25000
Vocabulary size: 19769
[  140   198  2023    98  2191   313   113  3086   658    16     5  6662
     5    99   536   120    97   237   198    17    95   317  1105    98
  1520   376   175    42   836    16  4251  3272    15   110   300   319
   101  3642    17    95  2266    97   103   783    99   114    98  2362
    98  1224   147    42   908   317    15   110   341    98  6505  5022
    95  1471    16   851  2063   739    17    31   100    18    33    31
   100    18    33   106   247    15   668  3681   106    42   146  7141
   186    97   864   117   246    17   181    97    95   194   116 12160
   113  1404    15    96   499   176   123    10   248   341   272  1691
  3271    17  8331  8412   137  2422   392  3179    98   119    42  9449
    15   588    98  9690    96  1223    17   150  8941    98  8412    15
   110   117   405    97  3179   281   155   588    98  1223    95  9229
   908    16  3244    17  3143    15   101    10  

Next, we train a relatively large Transformer that reaches a test AUC of about 93% (hyperparameters found by random search). Note, however, that a model of this size is currently too large to be replicated on a quantum computer.

In [4]:
model = Transformer(num_tokens=len(vocab), max_seq_len=512, num_classes=2, hidden_size=64, num_heads=2, num_transformer_blocks=4, mlp_hidden_size=32)
train_and_evaluate(model, imdb_train_dataloader, imdb_val_dataloader, imdb_test_dataloader, num_classes=2, num_epochs=30)

Number of parameters = 1382594


Epoch   1/30: 100%|██████████| 703/703 [00:10<00:00, 67.04batch/s, Loss = 0.5424, AUC = 81.85%] 
Epoch   2/30: 100%|██████████| 703/703 [00:04<00:00, 158.05batch/s, Loss = 0.3482, AUC = 92.99%]
Epoch   3/30: 100%|██████████| 703/703 [00:04<00:00, 160.02batch/s, Loss = 0.3267, AUC = 94.52%]
Epoch   4/30: 100%|██████████| 703/703 [00:04<00:00, 157.67batch/s, Loss = 0.3122, AUC = 94.88%]
Epoch   5/30: 100%|██████████| 703/703 [00:04<00:00, 157.65batch/s, Loss = 0.3847, AUC = 94.34%]
Epoch   6/30: 100%|██████████| 703/703 [00:04<00:00, 157.69batch/s, Loss = 0.4091, AUC = 94.04%]
Epoch   7/30: 100%|██████████| 703/703 [00:04<00:00, 158.37batch/s, Loss = 0.4692, AUC = 93.88%]
Epoch   8/30: 100%|██████████| 703/703 [00:04<00:00, 158.76batch/s, Loss = 0.5167, AUC = 94.00%]
Epoch   9/30: 100%|██████████| 703/703 [00:04<00:00, 152.83batch/s, Loss = 0.5313, AUC = 94.17%]
Epoch  10/30: 100%|██████████| 703/703 [00:04<00:00, 159.44batch/s, Loss = 0.6408, AUC = 93.74%]
Epoch  11/30: 100%|██████████|

Total training time = 139.41s, best validation AUC = 94.88% at epoch 4


Testing: 100%|██████████| 781/781 [00:05<00:00, 147.19batch/s, Loss = 0.3632, AUC = 93.18%]


(Array(0.36320385, dtype=float32),
 93.1777571685361,
 array([0.        , 0.        , 0.        , ..., 0.99959994, 0.99975996,
        1.        ]),
 array([0.00000000e+00, 8.00384184e-05, 7.20345766e-04, ...,
        1.00000000e+00, 1.00000000e+00, 1.00000000e+00]))
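
The tuple above appears to hold the test loss, the test AUC (in percent), and two arrays that look like the false-positive and true-positive rates of the ROC curve. That reading is an assumption based on the printed values, not on the `train_and_evaluate` source. Under that assumption, the AUC can be recovered from the two arrays with the trapezoidal rule. The sketch below uses a small made-up ROC curve (the full arrays are elided in the output), and `auc_from_roc` is a hypothetical helper, not part of `quantum_transformers`.

```python
def auc_from_roc(fpr, tpr):
    """Area under an ROC curve via the trapezoidal rule, returned in percent.

    fpr and tpr are equal-length sequences sorted by increasing fpr.
    """
    if len(fpr) != len(tpr) or len(fpr) < 2:
        raise ValueError("fpr and tpr must have the same length (at least 2)")
    area = 0.0
    for i in range(1, len(fpr)):
        width = fpr[i] - fpr[i - 1]              # step along the FPR axis
        mean_height = (tpr[i] + tpr[i - 1]) / 2  # average TPR over that step
        area += width * mean_height
    return 100.0 * area


# Toy ROC curve for illustration only (not taken from the run above).
toy_fpr = [0.0, 0.0, 0.5, 1.0]
toy_tpr = [0.0, 0.5, 1.0, 1.0]
toy_auc = auc_from_roc(toy_fpr, toy_tpr)
print(f"Toy AUC = {toy_auc:.2f}%")
```

With the real arrays, `auc_from_roc(result[2], result[3])` should land close to the second element of the tuple if the assumed ordering is right.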

Now let's train a smaller model that could be run on a quantum computer. It has 163,866 parameters, roughly 8 times fewer than the 1,382,594 of the previous model.

In [5]:
model = Transformer(num_tokens=len(vocab), max_seq_len=512, num_classes=2, hidden_size=8, num_heads=2, num_transformer_blocks=4, mlp_hidden_size=4)
train_and_evaluate(model, imdb_train_dataloader, imdb_val_dataloader, imdb_test_dataloader, num_classes=2, num_epochs=30)

Number of parameters = 163866


Epoch   1/30: 100%|██████████| 703/703 [00:08<00:00, 81.53batch/s, Loss = 0.6908, AUC = 53.89%] 
Epoch   2/30: 100%|██████████| 703/703 [00:03<00:00, 184.13batch/s, Loss = 0.6870, AUC = 57.98%]
Epoch   3/30: 100%|██████████| 703/703 [00:03<00:00, 184.61batch/s, Loss = 0.5736, AUC = 79.28%]
Epoch   4/30: 100%|██████████| 703/703 [00:03<00:00, 182.53batch/s, Loss = 0.4744, AUC = 85.89%]
Epoch   5/30: 100%|██████████| 703/703 [00:03<00:00, 182.43batch/s, Loss = 0.4941, AUC = 89.21%]
Epoch   6/30: 100%|██████████| 703/703 [00:03<00:00, 184.32batch/s, Loss = 0.3875, AUC = 90.96%]
Epoch   7/30: 100%|██████████| 703/703 [00:03<00:00, 185.33batch/s, Loss = 0.3723, AUC = 92.12%]
Epoch   8/30: 100%|██████████| 703/703 [00:03<00:00, 184.15batch/s, Loss = 0.4038, AUC = 91.96%]
Epoch   9/30: 100%|██████████| 703/703 [00:03<00:00, 184.59batch/s, Loss = 0.3936, AUC = 93.07%]
Epoch  10/30: 100%|██████████| 703/703 [00:03<00:00, 185.92batch/s, Loss = 0.4683, AUC = 93.18%]
Epoch  11/30: 100%|██████████|

Total training time = 119.52s, best validation AUC = 93.18% at epoch 10


Testing: 100%|██████████| 781/781 [00:04<00:00, 169.00batch/s, Loss = 0.5277, AUC = 91.54%]


(Array(0.5276519, dtype=float32),
 91.53631861392364,
 array([0.        , 0.        , 0.        , ..., 0.99695951, 0.99711954,
        1.        ]),
 array([0.00000000e+00, 8.00384184e-05, 3.28157516e-03, ...,
        1.00000000e+00, 1.00000000e+00, 1.00000000e+00]))
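
To see where the parameters of the two models go, the sketch below counts them for an assumed architecture: learned token and position embeddings, blocks made of self-attention (four dense projections with biases), a two-layer MLP and two layer norms, followed by a final layer norm and a dense classifier. This layout is an assumption, not read from the `quantum_transformers` source, and `transformer_param_count` is a hypothetical helper. Under this assumption the totals match the counts printed above, and nearly all parameters sit in the embedding tables.

```python
def transformer_param_count(vocab_size, max_seq_len, hidden_size,
                            mlp_hidden_size, num_blocks, num_classes):
    """Parameter count of an ASSUMED encoder layout (see the text above)."""
    # Learned token and position embedding tables.
    embeddings = vocab_size * hidden_size + max_seq_len * hidden_size
    # Query, key, value and output projections, each with a bias.
    # The number of heads does not change this count.
    attention = 4 * (hidden_size * hidden_size + hidden_size)
    # Two dense layers: hidden -> mlp_hidden -> hidden, with biases.
    mlp = (hidden_size * mlp_hidden_size + mlp_hidden_size) \
        + (mlp_hidden_size * hidden_size + hidden_size)
    # Two layer norms per block, each with a scale and a bias vector.
    layer_norms = 2 * 2 * hidden_size
    blocks = num_blocks * (attention + mlp + layer_norms)
    # Final layer norm plus the dense classification layer.
    head = 2 * hidden_size + (hidden_size * num_classes + num_classes)
    return {"embeddings": embeddings, "blocks": blocks, "head": head,
            "total": embeddings + blocks + head}


# Vocabulary size and hyperparameters as used in the cells above.
big = transformer_param_count(19769, 512, 64, 32, 4, 2)
small = transformer_param_count(19769, 512, 8, 4, 4, 2)
for name, counts in (("big", big), ("small", small)):
    share = 100 * counts["embeddings"] / counts["total"]
    print(f"{name}: total = {counts['total']}, embeddings = {share:.1f}%")
```

If the assumed layout holds, shrinking `hidden_size` from 64 to 8 reduces the Transformer blocks far more than the overall count suggests, since the vocabulary embedding dominates both totals.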