# Classical classifiers

This notebook shows how well classical classifiers solve the sentence classification problem chosen for the pre-alpha prototype. Its main purpose is to provide a baseline against which the quantum solution can be compared.

We first load our library files and the required packages.

In [81]:
import sys
sys.path.append("../../my_lib/src/ClassicalNLP/Classifier/")

In [82]:
import os
import json
import pandas as pd
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

from NNClassifier import (
    loadData,
    evaluate,
    NNClassifier,
    prepareTrainTestXYWords,
    prepareTrainTestXYSentence,
)

Data vectorization is handled by separate services that run as Docker containers. To keep this notebook simple, we just load data that has already been vectorized.

We have implemented sentence vectorization using the pretrained _BERT_ (see [Git](https://github.com/google-research/bert), [arXiv](https://arxiv.org/abs/1810.04805)) base model (each sentence is represented as a 768-dimensional real-valued vector),
as well as word-level vectorization using the [_fastText_](https://fasttext.cc/) model [pretrained on English Wikipedia](https://fasttext.cc/docs/en/pretrained-vectors.html) (each word in a sentence is represented as a 300-dimensional real-valued vector).

We will start with the BERT sentence-level vectorization.

In [83]:
data = loadData("../../my_lib/src/ClassicalNLP/Datasets/dataset_vectorized_bert_uncased.json")
print(f"Training samples: {len(data['train_data'])}, test samples: {len(data['test_data'])}")
print(f"An example sentence: {data['train_data'][2]['sentence']}, type: {data['train_data'][2]['sentence_type']}, truth value: {data['train_data'][2]['truth_value']}")
print(f"Vectorized sentence dimension: {len(data['train_data'][0]['sentence_vectorized'][0])}")

Training samples: 89, test samples: 23
An example sentence: chicken eats fox, type: NOUN-TVERB-NOUN, truth value: False
Vectorized sentence dimension: 768


We reformat the data as NumPy arrays for classifier training.

In [84]:
trainX, trainY, testX, testY = prepareTrainTestXYSentence(data)
print(f"{trainX.shape}")
print(f"{trainY.shape}")
print(f"{testX.shape}")
print(f"{testY.shape}")

(89, 768)
(89, 2)
(23, 768)
(23, 2)
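
The shapes above suggest one 768-dimensional row per sentence and a two-column one-hot label. The following is a minimal sketch of such a conversion, not the actual `prepareTrainTestXYSentence` implementation. The helper name `records_to_arrays`, the column order (column 0 for `False`, column 1 for `True`), and the assumption that `truth_value` is stored as a boolean are all hypothetical; only the field names `sentence_vectorized` and `truth_value` come from the dataset printed above.

```python
import numpy as np

def records_to_arrays(records):
    """Stack sentence vectors into X and one-hot encode truth values into Y."""
    # Each record wraps its vector in a one-element list, which is why the
    # cell above indexes ['sentence_vectorized'][0] to get the dimension.
    X = np.array([r["sentence_vectorized"][0] for r in records], dtype=float)
    # Assumption: truth_value is a boolean; column 0 = False, column 1 = True.
    truth = np.array([bool(r["truth_value"]) for r in records])
    Y = np.stack([~truth, truth], axis=1).astype(float)
    return X, Y

# Toy records with 4-dimensional vectors instead of 768-dimensional ones.
toy = [
    {"sentence_vectorized": [[0.1, 0.2, 0.3, 0.4]], "truth_value": False},
    {"sentence_vectorized": [[0.5, 0.6, 0.7, 0.8]], "truth_value": True},
    {"sentence_vectorized": [[0.9, 1.0, 1.1, 1.2]], "truth_value": True},
]
X, Y = records_to_arrays(toy)
print(X.shape, Y.shape)  # (3, 4) (3, 2)
```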


As a first baseline, we classify each test sentence with the label of its closest sentence in the training data (1-nearest neighbor) and measure the accuracy.

In [85]:
neigh = KNeighborsClassifier(n_neighbors=1)
neigh.fit(trainX, trainY)
res = neigh.predict(testX)
# Fraction of test rows whose one-hot label is predicted exactly
score = np.mean(np.all(res == testY, axis=1))
print(score)

0.6956521739130435
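
To make the 1-nearest-neighbor rule explicit, here is a small NumPy-only sketch on toy data. It assumes Euclidean distance, which is the default metric of `KNeighborsClassifier`; the helper names `one_nn_predict` and `row_accuracy` and the toy points are illustrative and not part of the project library.

```python
import numpy as np

def one_nn_predict(train_x, train_y, test_x):
    """Return, for every test row, the label row of its nearest training row."""
    # Pairwise squared Euclidean distances, shape (n_test, n_train).
    diffs = test_x[:, None, :] - train_x[None, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    return train_y[np.argmin(dists, axis=1)]

def row_accuracy(pred, truth):
    """Fraction of rows whose one-hot label matches in every column."""
    return float(np.mean(np.all(pred == truth, axis=1)))

# Toy 2-dimensional data with one-hot labels.
train_x = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])
train_y = np.array([[1, 0], [1, 0], [0, 1]])
test_x = np.array([[0.2, 0.1], [3.5, 4.2], [1.2, 0.9], [3.0, 3.0]])
test_y = np.array([[1, 0], [0, 1], [0, 1], [0, 1]])

pred = one_nn_predict(train_x, train_y, test_x)
print(row_accuracy(pred, test_y))  # 0.75
```

With two-column one-hot labels, a wrong prediction mismatches in both columns, so the row-wise accuracy equals the element-wise match count divided by two and by the number of test rows.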


Now we train a feedforward neural network classifier.

In [86]:
classifier = NNClassifier()
classifier.train(trainX, trainY)

Epoch 1/100
3/3 - 0s - loss: 0.7827 - accuracy: 0.4607
Epoch 2/100
3/3 - 0s - loss: 0.7454 - accuracy: 0.4494
Epoch 3/100
3/3 - 0s - loss: 0.7247 - accuracy: 0.4831
Epoch 4/100
3/3 - 0s - loss: 0.7062 - accuracy: 0.5730
Epoch 5/100
3/3 - 0s - loss: 0.6878 - accuracy: 0.5955
Epoch 6/100
3/3 - 0s - loss: 0.6733 - accuracy: 0.5843
Epoch 7/100
3/3 - 0s - loss: 0.6607 - accuracy: 0.5955
Epoch 8/100
3/3 - 0s - loss: 0.6481 - accuracy: 0.5955
Epoch 9/100
3/3 - 0s - loss: 0.6383 - accuracy: 0.5730
Epoch 10/100
3/3 - 0s - loss: 0.6295 - accuracy: 0.6067
Epoch 11/100
3/3 - 0s - loss: 0.6226 - accuracy: 0.6067
Epoch 12/100
3/3 - 0s - loss: 0.6118 - accuracy: 0.6180
Epoch 13/100
3/3 - 0s - loss: 0.6031 - accuracy: 0.6629
Epoch 14/100
3/3 - 0s - loss: 0.5962 - accuracy: 0.6854
Epoch 15/100
3/3 - 0s - loss: 0.5897 - accuracy: 0.6966
Epoch 16/100
3/3 - 0s - loss: 0.5837 - accuracy: 0.6966
Epoch 17/100
3/3 - 0s - loss: 0.5769 - accuracy: 0.7079
Epoch 18/100
3/3 - 0s - loss: 0.5725 - accuracy: 0.6966
...

We then measure its accuracy on the test set.

In [87]:
res = classifier.predict(testX)
score = evaluate(res, testY)
print(f"FFNN accuracy: {score}")

FFNN accuracy: 0.6521739130434783


Because training is randomized (for example, in the weight initialization), the result varies between runs: the NN classifier may perform better or worse than the 1-nearest neighbor algorithm.
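
With only 23 test sentences, small accuracy differences carry little weight. As a rough illustration, the sketch below recovers the number of correctly classified sentences from the two accuracies reported above and estimates a binomial standard error. The helper `accuracy_summary` is hypothetical, and treating test sentences as independent trials is a simplifying assumption.

```python
import math

def accuracy_summary(accuracy, n_samples):
    """Recover the correct-prediction count and the binomial standard error."""
    correct = round(accuracy * n_samples)
    std_err = math.sqrt(accuracy * (1.0 - accuracy) / n_samples)
    return correct, std_err

n_test = 23  # test-set size reported when the data was loaded
for name, acc in [("1-NN", 0.6956521739130435), ("FFNN", 0.6521739130434783)]:
    correct, std_err = accuracy_summary(acc, n_test)
    print(f"{name}: {correct}/{n_test} correct, standard error ~ {std_err:.2f}")
```

Under these assumptions the two classifiers differ by a single test sentence (16 versus 15 correct), which is well within one standard error of about 0.10.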

Now we load fastText word embeddings and train a convolutional network on them.

In [88]:
data = loadData("../../my_lib/src/ClassicalNLP/Datasets/dataset_vectorized_fasttext.json")

maxLen = 5
trainX, trainY, testX, testY = prepareTrainTestXYWords(data, maxLen)

classifier = NNClassifier(model="CNN", vectorSpaceSize=300)
classifier.train(trainX, trainY)

Epoch 1/30
3/3 - 0s - loss: 0.8288 - accuracy: 0.5169
Epoch 2/30
3/3 - 0s - loss: 0.6508 - accuracy: 0.5730
Epoch 3/30
3/3 - 0s - loss: 0.5874 - accuracy: 0.6966
Epoch 4/30
3/3 - 0s - loss: 0.5726 - accuracy: 0.6854
Epoch 5/30
3/3 - 0s - loss: 0.6016 - accuracy: 0.6742
Epoch 6/30
3/3 - 0s - loss: 0.5121 - accuracy: 0.7865
Epoch 7/30
3/3 - 0s - loss: 0.4671 - accuracy: 0.7865
Epoch 8/30
3/3 - 0s - loss: 0.4244 - accuracy: 0.8202
Epoch 9/30
3/3 - 0s - loss: 0.4001 - accuracy: 0.8315
Epoch 10/30
3/3 - 0s - loss: 0.4194 - accuracy: 0.8427
Epoch 11/30
3/3 - 0s - loss: 0.4203 - accuracy: 0.8202
Epoch 12/30
3/3 - 0s - loss: 0.3070 - accuracy: 0.8876
Epoch 13/30
3/3 - 0s - loss: 0.3846 - accuracy: 0.7865
Epoch 14/30
3/3 - 0s - loss: 0.3470 - accuracy: 0.8202
Epoch 15/30
3/3 - 0s - loss: 0.3679 - accuracy: 0.8202
Epoch 16/30
3/3 - 0s - loss: 0.3052 - accuracy: 0.9213
Epoch 17/30
3/3 - 0s - loss: 0.2882 - accuracy: 0.8876
Epoch 18/30
3/3 - 0s - loss: 0.2617 - accuracy: 0.9326
Epoch 19/30
...
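
A convolutional network needs inputs of a fixed shape, which is presumably why `maxLen = 5` is passed to `prepareTrainTestXYWords`. The sketch below shows one common way to obtain such a shape: zero-padding short sentences and truncating long ones. Whether the library pads with zeros or handles long sentences this way is an assumption, and `pad_word_vectors` is a hypothetical helper.

```python
import numpy as np

def pad_word_vectors(word_vectors, max_len, dim):
    """Zero-pad (or truncate) a list of word vectors to shape (max_len, dim)."""
    padded = np.zeros((max_len, dim), dtype=float)
    kept = word_vectors[:max_len]  # drop words beyond max_len
    if len(kept) > 0:
        padded[: len(kept)] = np.asarray(kept, dtype=float)
    return padded

# A three-word sentence such as "chicken eats fox", using toy 4-dimensional
# vectors in place of the 300-dimensional fastText vectors.
sentence = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
x = pad_word_vectors(sentence, max_len=5, dim=4)
print(x.shape)  # (5, 4)
```

Stacking such arrays for all sentences would give a tensor of shape (number of sentences, `maxLen`, 300) for the fastText data.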

In [77]:
res = classifier.predict(testX)
score = evaluate(res, testY)
print(score)

0.8260869565217391


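Typically, the convolutional network performs better than the previous algorithms. In this run it reaches a test accuracy of about 0.83, compared with about 0.70 for the 1-nearest neighbor classifier and 0.65 for the feedforward network.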

A promising direction for future work is to use BERT to produce an embedding for each word in a sentence and to train a convolutional network on top of these word embeddings.