# Classical classifiers

This notebook shows how well classical classifiers solve the sentence classification problem chosen for the pre-alpha prototype. Its main purpose is to provide a baseline against which the quantum solution can be compared.

We first load our library files and the required packages.

In [1]:
import sys
import os
import json
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

sys.path.append("../../../my_lib/src/ClassicalNLP/Classifier/")

from NNClassifier import (loadData, evaluate, NNClassifier,
                          prepareTrainTestXYWords, prepareTrainTestXYSentence)

Data vectorization is handled by separate services that run as Docker containers. To keep this Jupyter notebook simple, we load already vectorized data for the demonstration.

We have implemented sentence vectorization using the pretrained _BERT_ base model (see [Git](https://github.com/google-research/bert), [arXiv](https://arxiv.org/abs/1810.04805)), where each sentence is represented as a 768-dimensional real-valued vector,
as well as word-level vectorization using the [_fastText_](https://fasttext.cc/) model [pretrained on English Wikipedia](https://fasttext.cc/docs/en/pretrained-vectors.html), where each word in a sentence is represented as a 300-dimensional real-valued vector.

We will start with the BERT sentence-level vectorization.

In [2]:
data = loadData("../../../my_lib/src/ClassicalNLP/Datasets/dataset_vectorized_bert_uncased.json")
print(f"Training samples: {len(data['train_data'])}, test samples: {len(data['test_data'])}")
print(f"An example sentence: {data['train_data'][2]['sentence']}, type: {data['train_data'][2]['sentence_type']}, truth value: {data['train_data'][2]['truth_value']}")
print(f"Vectorized sentence dimension: {len(data['train_data'][0]['sentence_vectorized'][0])}")

Training samples: 89, test samples: 23
An example sentence: chicken eats fox, type: NOUN-TVERB-NOUN, truth value: False
Vectorized sentence dimension: 768


We reformat the data as NumPy arrays for classifier training.

In [3]:
trainX, trainY, testX, testY = prepareTrainTestXYSentence(data)
print(trainX.shape)
print(trainY.shape)
print(testX.shape)
print(testY.shape)

(89, 768)
(89, 2)
(23, 768)
(23, 2)
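
The shapes above suggest that each sentence vector becomes one row of `X` and each truth value becomes a two-entry label row of `Y`. The following is a minimal sketch of such a conversion, not the library's actual `prepareTrainTestXYSentence`: the helper name `to_arrays`, the one-hot column order, the boolean type of `truth_value`, and the toy records (4-dimensional vectors, made-up values) are all assumptions for illustration.

```python
import numpy as np

def to_arrays(records):
    """Stack sentence vectors into X and one-hot encode truth values into Y."""
    # Each record keeps its embedding as a one-element list of vectors,
    # mirroring the ['sentence_vectorized'][0] access used in the notebook.
    X = np.array([r["sentence_vectorized"][0] for r in records], dtype=np.float32)
    # Assumed column order: column 0 = False, column 1 = True.
    Y = np.array(
        [[0.0, 1.0] if r["truth_value"] else [1.0, 0.0] for r in records],
        dtype=np.float32,
    )
    return X, Y

# Toy records with 4-dimensional vectors instead of 768-dimensional ones.
toy = [
    {"sentence": "fox eats chicken", "truth_value": True,
     "sentence_vectorized": [[0.1, 0.2, 0.3, 0.4]]},
    {"sentence": "chicken eats fox", "truth_value": False,
     "sentence_vectorized": [[0.4, 0.3, 0.2, 0.1]]},
]
X, Y = to_arrays(toy)
print(X.shape, Y.shape)
```

With the real data, the same layout would give `X` one 768-entry row per sentence and `Y` one two-entry row per sentence.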


As a first baseline, we assign each test sentence the label of its closest sentence in the training data (1-nearest neighbor) and measure the resulting accuracy.

In [4]:
neigh = KNeighborsClassifier(n_neighbors=1)
neigh.fit(trainX, trainY)
res = neigh.predict(testX)
# Labels have two entries per sample, so halve the element-wise match count.
score = np.sum(res == testY) / 2 / len(testY)
print(score)

0.6956521739130435
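
To make the 1-nearest neighbor rule and the score formula concrete, here is a small hand-written sketch on made-up 2-dimensional points. It assumes Euclidean distance (the default metric of `KNeighborsClassifier`) and one-hot label rows; the function `one_nn_predict` and the toy arrays are illustrative only. It also checks that the element-wise formula divided by 2 agrees with a per-sample accuracy computed via `argmax`.

```python
import numpy as np

def one_nn_predict(train_x, train_y, test_x):
    """Copy the label row of the nearest training vector (Euclidean distance)."""
    # Pairwise squared distances, shape (n_test, n_train).
    diff = test_x[:, None, :] - train_x[None, :, :]
    dist2 = np.sum(diff ** 2, axis=2)
    nearest = np.argmin(dist2, axis=1)
    return train_y[nearest]

# Made-up toy data with one-hot labels.
train_x = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
train_y = np.array([[1, 0], [0, 1], [0, 1]])
test_x = np.array([[0.1, 0.1], [0.9, 0.8], [0.2, 0.2]])
test_y = np.array([[1, 0], [0, 1], [0, 1]])

pred = one_nn_predict(train_x, train_y, test_x)

# A one-hot prediction matches its target in both entries or in neither,
# so halving the element-wise match count gives the per-sample accuracy.
elementwise = np.sum(pred == test_y) / 2 / len(test_y)
per_sample = np.mean(np.argmax(pred, axis=1) == np.argmax(test_y, axis=1))
print(elementwise, per_sample)
```

In this toy case two of the three test points receive the correct label, and both ways of computing the score give the same value.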


Now we train a feedforward neural network classifier.

In [5]:
classifier = NNClassifier()
classifier.train(trainX, trainY)

Epoch 1/100
3/3 - 0s - loss: 0.7482 - accuracy: 0.5618
Epoch 2/100
3/3 - 0s - loss: 0.7056 - accuracy: 0.4944
Epoch 3/100
3/3 - 0s - loss: 0.6973 - accuracy: 0.5618
Epoch 4/100
3/3 - 0s - loss: 0.6674 - accuracy: 0.5955
Epoch 5/100
3/3 - 0s - loss: 0.6564 - accuracy: 0.6292
Epoch 6/100
3/3 - 0s - loss: 0.6465 - accuracy: 0.5955
Epoch 7/100
3/3 - 0s - loss: 0.6313 - accuracy: 0.6517
Epoch 8/100
3/3 - 0s - loss: 0.6203 - accuracy: 0.6404
Epoch 9/100
3/3 - 0s - loss: 0.6089 - accuracy: 0.7191
Epoch 10/100
3/3 - 0s - loss: 0.5972 - accuracy: 0.7303
Epoch 11/100
3/3 - 0s - loss: 0.5891 - accuracy: 0.7416
Epoch 12/100
3/3 - 0s - loss: 0.5808 - accuracy: 0.7191
Epoch 13/100
3/3 - 0s - loss: 0.5763 - accuracy: 0.7303
Epoch 14/100
3/3 - 0s - loss: 0.5670 - accuracy: 0.7528
Epoch 15/100
3/3 - 0s - loss: 0.5609 - accuracy: 0.7528
Epoch 16/100
3/3 - 0s - loss: 0.5561 - accuracy: 0.7191
Epoch 17/100
3/3 - 0s - loss: 0.5501 - accuracy: 0.7528
Epoch 18/100
3/3 - 0s - loss: 0.5443 - accuracy: 0.7753
...

We then measure its accuracy on the test set.

In [6]:
res = classifier.predict(testX)
score = evaluate(res, testY)
print(f"FFNN accuracy: {score}")

FFNN accuracy: 0.6956521739130435


Because training involves randomness, the NN classifier may perform better than, worse than, or the same as the 1-nearest neighbor algorithm from run to run.
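
Since the outcome of a single run depends on chance, one common way to compare such classifiers is to repeat training with several seeds and report the mean and spread of the test accuracy. The sketch below shows only that bookkeeping. `train_and_score` is a hypothetical callable that is not part of the notebook's library, and `fake_train_and_score` is a made-up stand-in that returns placeholder accuracies, not real results.

```python
import random
import statistics

def summarize_runs(train_and_score, seeds):
    """Run one training per seed and return (mean, sample std) of the scores."""
    scores = [train_and_score(seed) for seed in seeds]
    return statistics.mean(scores), statistics.stdev(scores)

def fake_train_and_score(seed):
    # Hypothetical stand-in: a real version would seed the framework, train a
    # fresh classifier, and return its accuracy on the 23 test sentences.
    rng = random.Random(seed)
    return rng.choice([15, 16, 17]) / 23  # placeholder values, not measurements

mean_acc, std_acc = summarize_runs(fake_train_and_score, seeds=range(5))
print(f"accuracy: {mean_acc:.3f} +/- {std_acc:.3f} over 5 runs")
```

Replacing the stand-in with a function that trains and evaluates a real classifier would give a comparison that is less sensitive to a single lucky or unlucky run.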

Now we load fastText word embeddings and train a convolutional network on them.

In [7]:
data = loadData("../../../my_lib/src/ClassicalNLP/Datasets/dataset_vectorized_fasttext.json")

maxLen = 5
trainX, trainY, testX, testY = prepareTrainTestXYWords(data, maxLen)

classifier = NNClassifier(model="CNN", vectorSpaceSize=300)
classifier.train(trainX, trainY)

Epoch 1/30
3/3 - 0s - loss: 0.7933 - accuracy: 0.5393
Epoch 2/30
3/3 - 0s - loss: 0.7259 - accuracy: 0.5281
Epoch 3/30
3/3 - 0s - loss: 0.6248 - accuracy: 0.6629
Epoch 4/30
3/3 - 0s - loss: 0.5158 - accuracy: 0.7978
Epoch 5/30
3/3 - 0s - loss: 0.5534 - accuracy: 0.7416
Epoch 6/30
3/3 - 0s - loss: 0.4603 - accuracy: 0.8202
Epoch 7/30
3/3 - 0s - loss: 0.4278 - accuracy: 0.7865
Epoch 8/30
3/3 - 0s - loss: 0.4139 - accuracy: 0.8427
Epoch 9/30
3/3 - 0s - loss: 0.4094 - accuracy: 0.8764
Epoch 10/30
3/3 - 0s - loss: 0.3949 - accuracy: 0.8315
Epoch 11/30
3/3 - 0s - loss: 0.3751 - accuracy: 0.8202
Epoch 12/30
3/3 - 0s - loss: 0.3409 - accuracy: 0.8764
Epoch 13/30
3/3 - 0s - loss: 0.4087 - accuracy: 0.8315
Epoch 14/30
3/3 - 0s - loss: 0.3475 - accuracy: 0.8764
Epoch 15/30
3/3 - 0s - loss: 0.2947 - accuracy: 0.9101
Epoch 16/30
3/3 - 0s - loss: 0.2909 - accuracy: 0.8652
Epoch 17/30
3/3 - 0s - loss: 0.2731 - accuracy: 0.8876
Epoch 18/30
3/3 - 0s - loss: 0.2674 - accuracy: 0.9326
Epoch 19/30
...
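
The word-level input differs from the sentence-level one: each sentence is a sequence of 300-dimensional word vectors, and a convolutional network generally expects every sequence to have the same length. The following sketch shows one plausible way to bring sentences to a fixed length of `maxLen = 5`. Zero-padding at the end, truncating longer sentences, and the helper name `pad_sentence` are assumptions here, not a description of what `prepareTrainTestXYWords` actually does.

```python
import numpy as np

def pad_sentence(word_vectors, max_len, dim):
    """Return a (max_len, dim) array: word vectors first, zero rows after."""
    padded = np.zeros((max_len, dim), dtype=np.float32)
    n = min(len(word_vectors), max_len)  # longer sentences are truncated
    if n:
        padded[:n] = np.asarray(word_vectors[:n], dtype=np.float32)
    return padded

dim = 300
# Made-up vectors standing in for the fastText embeddings of a 3-word sentence.
three_words = [np.full(dim, i + 1.0) for i in range(3)]

batch = np.stack([pad_sentence(three_words, max_len=5, dim=dim)])
print(batch.shape)  # (1, 5, 300)
```

Under these assumptions, the full training set would form an array of shape `(89, 5, 300)`.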

In [8]:
res = classifier.predict(testX)
score = evaluate(res, testY)
print(score)

0.7391304347826086


Typically, the convolutional network performs better than the two previous classifiers.
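
When reading these numbers, it helps to convert the accuracies back into counts. With 23 test sentences, 0.6957 corresponds to 16 correct predictions and 0.7391 to 17, so the runs shown here differ by a single sentence. The short calculation below recovers the counts and adds a rough binomial standard error, which assumes the test sentences are independent.

```python
import math

n_test = 23
reported = [
    ("1-NN / FFNN", 0.6956521739130435),
    ("CNN", 0.7391304347826086),
]

summary = []
for name, acc in reported:
    correct = round(acc * n_test)                 # accuracy back to a count
    stderr = math.sqrt(acc * (1 - acc) / n_test)  # binomial standard error
    summary.append((name, correct, stderr))
    print(f"{name}: {correct}/{n_test} correct, standard error ~ {stderr:.3f}")
```

The standard errors come out between roughly 0.09 and 0.10, which is larger than the gap between the two accuracies, so a larger test set or repeated runs would be needed to confirm the ranking.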

A promising future direction seems to be vectorizing with BERT so that it outputs an embedding for each word in a sentence, and then training a convolutional network on top of these word embeddings.