# Testing Alternative Classifiers for NLP

The classifiers produced broadly similar results on the 200 test reviews, with accuracies between 66% and 78%. That said, the only classifier that was on par with the Naive Bayes classifier is the _Support Vector Classifier_, with an accuracy of _78%_ (the `0.78` printed by its cell). This is most likely because neither classifier is strongly influenced by outlier data points: training reviews that share no relevant words with a test review have little effect on how that review is classified.
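
As a quick consistency check, each accuracy can be recomputed from the confusion matrices printed by the cells in this notebook: accuracy is the sum of the diagonal divided by the total number of test reviews. The sketch below hard-codes those five matrices; the names `matrices` and `accuracies` are illustrative and not part of the original notebook.

```python
import numpy as np

# Confusion matrices copied from the cell outputs in this notebook
# (rows = true class, columns = predicted class, per scikit-learn's convention).
matrices = {
    "Logistic Regression": np.array([[80, 17], [28, 75]]),
    "K Nearest Neighbors": np.array([[74, 23], [45, 58]]),
    "Decision Tree": np.array([[78, 19], [31, 72]]),
    "Random Forest": np.array([[87, 10], [45, 58]]),
    "Support Vector": np.array([[89, 8], [36, 67]]),
}

# Accuracy = correctly classified reviews (diagonal) / all test reviews.
accuracies = {name: float(np.trace(cm) / cm.sum()) for name, cm in matrices.items()}

# Print the classifiers from most to least accurate.
for name, acc in sorted(accuracies.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<20} {acc:.3f}")
```

Every matrix sums to 200 test reviews, so a difference of 0.005 in accuracy corresponds to a single review.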

<hr>

__Initializing Variables & Importing Libraries:__

In [1]:
import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score

# Load the pre-split features and labels, stored as plain-text integer arrays
X_train = np.loadtxt('X_train', dtype = 'int')
X_test = np.loadtxt('X_test', dtype = 'int')
y_train = np.loadtxt('y_train', dtype = 'int')
y_test = np.loadtxt('y_test', dtype = 'int')

<hr>

__Logistic Regression Classification:__<br>

In [2]:
# Fitting classifier to the Training set
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(solver = 'lbfgs', random_state = 0)
classifier.fit(X_train, y_train)

# Predicting the Test set results
y_pred = classifier.predict(X_test)

# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix, accuracy_score
print(confusion_matrix(y_test, y_pred))
print(accuracy_score(y_test, y_pred))

[[80 17]
 [28 75]]
0.775
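
Accuracy alone hides how the errors are split between the two classes. Below is a minimal sketch that derives precision and recall from the matrix printed above. It assumes label 1 is the positive class, since the notebook does not say which label is which.

```python
import numpy as np

# Logistic regression confusion matrix from the output above.
# scikit-learn lays it out as [[TN, FP], [FN, TP]] for labels 0 and 1.
cm = np.array([[80, 17], [28, 75]])
tn, fp, fn, tp = cm.ravel()

# Assumption: label 1 is the "positive" class.
precision = tp / (tp + fp)        # share of predicted 1s that are truly 1
recall = tp / (tp + fn)           # share of true 1s that were predicted as 1
accuracy = (tp + tn) / cm.sum()   # share of all test reviews classified correctly

print(f"precision={precision:.3f} recall={recall:.3f} accuracy={accuracy:.3f}")
```

With these numbers the model misses more true 1s (28) than it produces false 1s (17), so recall is lower than precision.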


<hr>

__K Nearest Neighbors Classification:__<br>

In [3]:
# Fitting classifier to the Training set
# (the Minkowski metric with p = 2 is the ordinary Euclidean distance)
from sklearn.neighbors import KNeighborsClassifier
classifier = KNeighborsClassifier(n_neighbors = 5, metric = 'minkowski', p = 2)
classifier.fit(X_train, y_train)

# Predicting the Test set results
y_pred = classifier.predict(X_test)

# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix, accuracy_score
print(confusion_matrix(y_test, y_pred))
print(accuracy_score(y_test, y_pred))

[[74 23]
 [45 58]]
0.66


<hr>

__Decision Tree Classifier:__<br>

In [4]:
# Fitting classifier to the Training set
from sklearn.tree import DecisionTreeClassifier
classifier = DecisionTreeClassifier(criterion = 'entropy', random_state = 0)
classifier.fit(X_train, y_train)

# Predicting the Test set results
y_pred = classifier.predict(X_test)

# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix, accuracy_score
print(confusion_matrix(y_test, y_pred))
print(accuracy_score(y_test, y_pred))

[[78 19]
 [31 72]]
0.75


<hr>

__Random Forest Classification:__<br>

In [5]:
# Fitting classifier to the Training set
from sklearn.ensemble import RandomForestClassifier
classifier = RandomForestClassifier(n_estimators = 10, criterion = 'entropy', random_state = 0)
classifier.fit(X_train, y_train)

# Predicting the Test set results
y_pred = classifier.predict(X_test)

# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix, accuracy_score
print(confusion_matrix(y_test, y_pred))
print(accuracy_score(y_test, y_pred))

[[87 10]
 [45 58]]
0.725


<hr>

__Support Vector Classification:__<br>

In [6]:
# Fitting classifier to the Training set
from sklearn.svm import SVC
classifier = SVC(kernel = 'rbf', random_state = 0)
classifier.fit(X_train, y_train)

# Predicting the Test set results
y_pred = classifier.predict(X_test)

# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix, accuracy_score
print(confusion_matrix(y_test, y_pred))
print(accuracy_score(y_test, y_pred))

[[89  8]
 [36 67]]
0.78
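
The five cells above repeat the same fit, predict, and score steps, so they could be collapsed into a single loop. The sketch below is one way to do that; it is not the notebook's own code. Because the `X_train`/`X_test` files are not available outside the notebook, it uses synthetic data from `make_classification` as a stand-in (an assumption), so its numbers will not match the outputs above. The names `models` and `results` are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, accuracy_score

# Stand-in data (assumption): in the notebook, use the arrays loaded
# with np.loadtxt in the first cell instead of these four lines.
X, y = make_classification(n_samples=1000, n_features=50, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Same hyperparameters as the individual cells above.
models = {
    "Logistic Regression": LogisticRegression(solver="lbfgs", random_state=0),
    "K Nearest Neighbors": KNeighborsClassifier(n_neighbors=5, metric="minkowski", p=2),
    "Decision Tree": DecisionTreeClassifier(criterion="entropy", random_state=0),
    "Random Forest": RandomForestClassifier(
        n_estimators=10, criterion="entropy", random_state=0
    ),
    "Support Vector": SVC(kernel="rbf", random_state=0),
}

# Fit each model on the training set and score it on the test set.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = accuracy_score(y_test, y_pred)
    print(name)
    print(confusion_matrix(y_test, y_pred))
    print(results[name])
```

Keeping the accuracies in one dictionary makes it easy to rank the classifiers afterwards, for example with `sorted(results.items(), key=lambda item: item[1], reverse=True)`.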
