# TensorFlow vs IBM Watson Comparison
This notebook compares the labeling accuracy of TensorFlow and IBM Watson models trained on scientific paper titles.

## Setup Libraries

In [9]:
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tensorflow as tf

## Import and Process IBM Watson Results
Two data files are used in this step:

1. arxiv_classifier_test_set.csv, which contains the titles and correct labels
2. watson_classifier_test_results_small.csv, which contains the labels predicted by Watson across the 13 different subjects

In [10]:
# Correct labels
correct_data = pd.read_csv('arxiv_classifier_test_set.csv', encoding='latin1')
titles = correct_data['Titles']
correct_labels = correct_data['Labels']

# Predicted labels
predicted_labels = pd.read_csv('watson_classifier_test_results_small.csv',
                              header=None, encoding='latin1')
# Keep only column 1, which is used as the predicted label
predicted_labels = predicted_labels[1]

In [11]:
# Function to convert correct and predicted labels to integers
def label_to_int(labels):
    # Map each label to a class ID by substring match. The checks run in order,
    # so the first matching substring wins; labels matching none map to 0.
    labels_class = np.array([1 if 'astro' in label else 2 if 'physics' in label
                            else 3 if 'gr-' in label else 4 if 'hep' in label
                            else 5 if 'math' in label else 6 if 'nlin' in label
                            else 7 if 'nucl' in label else 8 if 'cond-mat' in label
                            else 9 if 'q-bio' in label else 10 if 'q-fin' in label
                            else 11 if 'quant-ph' in label else 12 if 'stat' in label
                            else 0 for label in labels])
    return labels_class

In [12]:
# Convert labels to integers
correct_labels_int = label_to_int(correct_labels)
predicted_labels_int = label_to_int(predicted_labels)
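
Because `label_to_int` relies on ordered substring checks, the order decides how ambiguous labels are resolved. The sketch below restates the same mapping as a lookup table so the precedence is easier to inspect. The name `label_to_int_table` and the sample labels are illustrative assumptions in arXiv category style; they are not taken from the data files.

```python
import numpy as np

# (substring, class ID) pairs in the same order as the checks in label_to_int
SUBJECT_PATTERNS = [
    ('astro', 1), ('physics', 2), ('gr-', 3), ('hep', 4),
    ('math', 5), ('nlin', 6), ('nucl', 7), ('cond-mat', 8),
    ('q-bio', 9), ('q-fin', 10), ('quant-ph', 11), ('stat', 12),
]

def label_to_int_table(labels):
    """Return the ID of the first matching substring for each label, or 0 if none match."""
    return np.array([
        next((class_id for pattern, class_id in SUBJECT_PATTERNS if pattern in label), 0)
        for label in labels
    ])

# Illustrative labels only (assumed format, not read from the CSV files)
sample_labels = ['astro-ph', 'hep-th', 'math-ph', 'cond-mat.stat-mech', 'cs.LG']
sample_ids = label_to_int_table(sample_labels)
print(sample_ids)  # [1 4 5 8 0]
```

Here `math-ph` resolves to class 5 because `'math'` matches, `cond-mat.stat-mech` resolves to class 8 because `'cond-mat'` is checked before `'stat'`, and a label with no matching substring falls into class 0.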

## Calculate IBM Watson Subject Prediction Accuracy

In [13]:
# Create function to check matches
def check_value_match(array1, array2):
    matches = [1 if array1[ii] == array2[ii] else 0 for ii in range(len(array1))]
    return matches

In [14]:
# Calculate accuracy
labels_matched = check_value_match(correct_labels_int, predicted_labels_int)

# Print IBM Watson accuracy
print("IBM Watson accuracy on titles: {:.6f}".format(np.mean(labels_matched)))

IBM Watson accuracy on titles: 0.679389
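
A single overall number hides which subjects are predicted well and which are not. As a minimal sketch, assuming integer labels shaped like `correct_labels_int` and `predicted_labels_int`, the accuracy can also be broken down per subject. The arrays below are toy values chosen for illustration, not results from the Watson test set.

```python
import numpy as np
import pandas as pd

# Toy integer labels standing in for correct_labels_int and predicted_labels_int
correct = np.array([1, 1, 5, 5, 5, 8, 0])
predicted = np.array([1, 4, 5, 5, 2, 8, 0])

# Element-wise comparison gives a boolean array; its mean is the overall accuracy
overall_accuracy = np.mean(correct == predicted)

# Group the match flags by the correct class to get one accuracy per subject
per_class_accuracy = (
    pd.DataFrame({"correct": correct, "match": correct == predicted})
    .groupby("correct")["match"]
    .mean()
)

print("Overall accuracy: {:.3f}".format(overall_accuracy))
print(per_class_accuracy)
```

On the toy data, class 1 is matched half of the time and class 5 two thirds of the time, while the overall accuracy is 5/7. The same grouping applied to the real label arrays would show which subjects drive the overall score.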


## Comparison
IBM Watson outperforms the TensorFlow model by about 12 percentage points (68% vs. 56%). This may be due to the following:

- Algorithm(s): Watson may be using something other than RNNs or LSTMs
- Neural network architecture (if applicable): If Watson is using RNNs and LSTMs, then its architecture may be different
- Supplementary data: In addition to the training data, Watson may be supplementing its models with other NLP knowledge

In any case, it may be possible to improve the TensorFlow model by increasing the size of the dataset, applying more thorough pre-processing, trying other neural network architectures, and supplementing the model with pre-trained word embeddings.
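
The last suggestion, starting from existing word embeddings, can be sketched with NumPy alone. Everything in the block is an assumption made for illustration: the `pretrained` vectors, the `vocab` mapping, and the three-dimensional embedding size are hypothetical, and a real setup would load vectors from a published embedding file and use the vocabulary built from the titles.

```python
import numpy as np

# Hypothetical pre-trained vectors (in practice loaded from an embedding file)
pretrained = {
    "quantum": np.array([0.1, 0.2, 0.3]),
    "galaxy": np.array([0.4, 0.5, 0.6]),
}

# Hypothetical vocabulary built from the titles; index 0 is reserved for padding
vocab = {"<pad>": 0, "quantum": 1, "galaxy": 2, "entanglement": 3}

embedding_dim = 3
rng = np.random.default_rng(0)

# Start from small random vectors so words without a pre-trained vector still get a row
embedding_matrix = rng.normal(scale=0.1, size=(len(vocab), embedding_dim))
embedding_matrix[vocab["<pad>"]] = 0.0

# Overwrite the rows of words that have a pre-trained vector
for word, idx in vocab.items():
    if word in pretrained:
        embedding_matrix[idx] = pretrained[word]

print(embedding_matrix.shape)  # (4, 3)
```

The resulting matrix could then be used as the initial weights of the model's embedding layer, either frozen or fine-tuned during training. Which option works better for the title data is an open question that would need to be tested.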