## Importing libraries

In [428]:
import numpy as np
import pandas as pd
import nltk
import sys

In [429]:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

In [430]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.metrics import classification_report, confusion_matrix 

In [431]:
pd.set_option('display.max_colwidth', None)
np.set_printoptions(threshold=sys.maxsize)

## Task 1: Train and evaluate a unigram-based baseline classifier

In [432]:
chatgpt_train = pd.read_csv("chatgpt_train.csv")
chatgpt_train.head()

Unnamed: 0,date,title,review,rating
0,5/21/2023 16:42,Much more accessible for blind users than the web version,"Up to this point I?€?ve mostly been using ChatGPT on my windows desktop using Google Chrome. While it?€?s doable, screen reader navigation is pretty difficult on the desktop site and you really have to be an advanced user to find your way through it. I have submitted numerous feedbacks to open AI about this but nothing has changed on that front.\nWell, the good news ?€? the iOS app pretty much addresses all of those problems. The UI seems really clean, uncluttered and designed well to be compatible with voiceover, the screen reader built into iOS. I applaud the inclusivity of this design ?€? I only wish they would give the same attention and care to the accessibility experience of the desktop app.\nI would have given this review five stars but I have just a couple minor quibbles. First, once I submit my prompt, voiceover starts to read aloud ChatGPT?€?s response before that response is finished, so I will hear the first few words of the response followed by voiceover reading aloud the ?€?stop generating?€? button, which isn?€?t super helpful. It would be great if you could better coordinate this alert so that it didn?€?t start reading the message until it had been fully generated. The other thing I?€?d like is a Feedback button easily accessible from within the main screen of the app, to make it as easy as possible to get continuing suggestions and feedback from your users.\nOtherwise, fantastic app so far!",4
1,7/11/2023 12:24,"Much anticipated, wasn?€?t let down.","I?€?ve been a user since it?€?s initial roll out and have been waiting for a mobile application ever since using the web app. For reference I?€?m a software engineering student while working in IT full time. I have to say GPT is an crucial tool. It takes far less time to get information quickly that you?€?d otherwise have to source from stack-overflow, various red-hat articles, Ubuntu articles, searching through software documentation, Microsoft documentation ect. Typically chat gpt can find the answer in a fraction of a second that google can. Obviously it is wrong, a lot. But to have the ability to get quick information on my phone like I can in the web browser I?€?m super excited about and have already been using the mobile app since download constantly. And I?€?m excited for the future of this program becoming more accurate and it seems to be getting more and more precise with every roll out. Gone are the days scouring the internet for obscure pieces of information, chat gpt can find it for you with 2 or 3 prompts. I love this app and I?€?m so happy it?€?s on mobile now. The UI is also very sleek, easy to use. My only complaint with the interface is the history tab at the top right. I actually prefer the conversation tabs on the left in the web app but I understand it would make the app kind of clunky especially on mobile since the screen size is smaller. Anyway, awesome app 5 stars.",4
2,5/19/2023 10:16,"Almost 5 stars, but?€? no search function","This app would almost be perfect if it wasn?€?t for ONE little thing: a ?€?search in?€? function. As anyone can imagine, these AI chats can get quuuuite long, and quite lengthy. And sometimes I wanna go into a chat & look up a specific part or parts of that particular chat by using a search function to look up key words within that chat & track down whatever part I was looking for. For example, in a chat, if I had searched way early into the chat ?€?how do lions hunt??€? And say days later, I wanted to revisit that particular response, I wanna be able to go into the actual chat go to a ?€?search in?€? function and be able to type in key words like ?€?lions?€? or ?€?hunt?€? to be able to automatically find that part in the chat instead of having to scroll through a massive chat to find that part. Similar to what you can do in Microsoft Word docs, or even on web browsers. I think the app already kind of has this, but it doesn?€?t work exactly like that. I tested it out, and all it does is, you type in key words, and it shows you that those words are present in the chat or chats, but it doesn?€?t actually track it down or take you to that specific part. If this app can have that? It?€?s an absolute massive perfect winner. Addendum - I also noticed that my phone (iPhone 14 Pro Max) tends to run a little hotter, which I?€?m sure will affect battery life, when using the app. Just wanted to add that too.",4
3,5/27/2023 21:57,"4.5 stars, here?€?s why","I recently downloaded the app and overall, it's a great platform with excellent potential. However, I did encounter a couple of issues with logging in that I feel need to be addressed. Firstly, the login process was somewhat cumbersome. It took me a few attempts to successfully log in, as the app didn't always recognize my credentials right away. This could be improved by streamlining the login flow and ensuring a smoother user experience. Secondly, the app occasionally experienced login glitches, where it would unexpectedly log me out without any apparent reason. It was frustrating to have to re-enter my login information repeatedly, disrupting my usage. Despite these issues, I must say that once I managed to log in successfully, the app itself was fantastic. It offers a wide range of features and a sleek user interface that is visually appealing. The content available on the platform is diverse and engaging, keeping me entertained for hours. I do hope the developers take note of the login challenges and work on improving this aspect of the app. With a more reliable login system, this app has the potential to become a top-tier platform. In conclusion, while the app has its share of login issues, it still holds promise. I'm optimistic that with some updates and improvements, it can provide an even better user experience. I look forward to seeing future enhancements and continued growth of the app.",4
4,6/9/2023 7:49,"Good, but Siri support would take it to the next level","I appreciate the devs implementing Siri support?€?it is already enhancing the usefulness of the app, despite being a little clunky. I?€?d prefer if it were possible to make a query in one fell swoop, however. Currently, it seems like I have to say ?€?ask ChatGPT?€? then wait to be asked what my query is before saying the actual query. I know that it?€?s possible in other contexts to submit a request to a third-party app through Siri in a single query?€?I?€?m able to say ?€?Create a reminder for XYZ in Things,?€? for example. \n\nIn addition, the responses produced when querying Siri seem identical to those you would get in-app. This isn?€?t actually ideal in a scenario when you?€?re getting a spoken response?€?they just go on for way too long. I wonder if the app could be smart enough to recognize that a request was coming via Siri and then effectively append an invisible ?€?Please answer in less than 50 words?€? to those requests. \n\nOtherwise, a solid but basic chat app that closely replicates the web experience. Aside from being uncannily good at punctuating my speech, the dictation feature is not really any different to iOS?€?s stock dictation and is not all that useful.",4


In [433]:
chatgpt_train.columns

Index(['date', 'title', 'review', 'rating'], dtype='object')

In [434]:
chatgpt_test = pd.read_csv("chatgpt_test.csv")
chatgpt_test.head()

Unnamed: 0,date,title,review,rating
0,5/19/2023 6:09,error unsupported country,cant login,2
1,5/19/2023 9:39,Hype junk,More harm than help.,1
2,5/19/2023 4:12,your gpt 4 is fake,Fix it,1
3,5/20/2023 3:01,Please impose IPadOS,We need IPadOS!!!,5
4,5/19/2023 20:49,Amazing,Great,5


### Number of instances in the training dataset (including blank reviews)

In [435]:
len(chatgpt_train)

1834

### Number of instances in the test dataset (including blank reviews)

In [436]:
len(chatgpt_test)

458

In [437]:
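# drop rows with a missing review; .copy() avoids chained-assignment warnings when columns are added later
chatgpt_train = chatgpt_train[chatgpt_train["review"].notnull()].copy()
chatgpt_test = chatgpt_test[chatgpt_test["review"].notnull()].copy()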

### Number of instances in the training dataset without blank reviews

In [438]:
len(chatgpt_train)

1829

### Number of instances in the test dataset without blank reviews

In [439]:
len(chatgpt_test)

458
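
The counts above imply that five training rows (1834 to 1829) and no test rows had a missing review. A minimal sketch of the same filtering step on a toy frame, assuming missing reviews are stored as NaN/None; `toy`, `n_blank` and `toy_clean` are illustrative names, not part of the notebook:

```python
import pandas as pd

# Toy data standing in for the review CSVs (hypothetical values).
toy = pd.DataFrame({"review": ["Great", None, "Fix it"], "rating": [5, 3, 1]})

# Count missing reviews before dropping them.
n_blank = int(toy["review"].isna().sum())

# Keep only rows that have a review; copy so later column assignments are safe.
toy_clean = toy.dropna(subset=["review"]).copy()

print(n_blank, len(toy), len(toy_clean))
```

Counting the missing values first makes the before/after row counts easy to cross-check.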

In [440]:
def three_way_classes(rating):
    if rating == 1 or rating == 2:
        val = 'negative'
    elif rating == 4 or rating == 5:
        val = 'positive'
    else:
        val = 'neutral'
        
    return val
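
The same rating-to-class mapping can also be sketched as a dictionary lookup with `Series.map`. This is an alternative illustration, not the notebook's method, and it differs in one respect: `three_way_classes` labels every value other than 1, 2, 4 and 5 as neutral, whereas the lookup below leaves unmapped values as NaN. `RATING_TO_CLASS` is a hypothetical name.

```python
import pandas as pd

# Hypothetical lookup table mirroring three_way_classes for ratings 1 to 5.
RATING_TO_CLASS = {
    1: "negative",
    2: "negative",
    3: "neutral",
    4: "positive",
    5: "positive",
}

ratings = pd.Series([1, 2, 3, 4, 5])
classes = ratings.map(RATING_TO_CLASS)
print(classes.tolist())
```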

In [441]:
chatgpt_train["3_way_class"] = chatgpt_train["rating"].apply(three_way_classes)
chatgpt_train.head()

Unnamed: 0,date,title,review,rating,3_way_class
0,5/21/2023 16:42,Much more accessible for blind users than the web version,"Up to this point I?€?ve mostly been using ChatGPT on my windows desktop using Google Chrome. While it?€?s doable, screen reader navigation is pretty difficult on the desktop site and you really have to be an advanced user to find your way through it. I have submitted numerous feedbacks to open AI about this but nothing has changed on that front.\nWell, the good news ?€? the iOS app pretty much addresses all of those problems. The UI seems really clean, uncluttered and designed well to be compatible with voiceover, the screen reader built into iOS. I applaud the inclusivity of this design ?€? I only wish they would give the same attention and care to the accessibility experience of the desktop app.\nI would have given this review five stars but I have just a couple minor quibbles. First, once I submit my prompt, voiceover starts to read aloud ChatGPT?€?s response before that response is finished, so I will hear the first few words of the response followed by voiceover reading aloud the ?€?stop generating?€? button, which isn?€?t super helpful. It would be great if you could better coordinate this alert so that it didn?€?t start reading the message until it had been fully generated. The other thing I?€?d like is a Feedback button easily accessible from within the main screen of the app, to make it as easy as possible to get continuing suggestions and feedback from your users.\nOtherwise, fantastic app so far!",4,positive
1,7/11/2023 12:24,"Much anticipated, wasn?€?t let down.","I?€?ve been a user since it?€?s initial roll out and have been waiting for a mobile application ever since using the web app. For reference I?€?m a software engineering student while working in IT full time. I have to say GPT is an crucial tool. It takes far less time to get information quickly that you?€?d otherwise have to source from stack-overflow, various red-hat articles, Ubuntu articles, searching through software documentation, Microsoft documentation ect. Typically chat gpt can find the answer in a fraction of a second that google can. Obviously it is wrong, a lot. But to have the ability to get quick information on my phone like I can in the web browser I?€?m super excited about and have already been using the mobile app since download constantly. And I?€?m excited for the future of this program becoming more accurate and it seems to be getting more and more precise with every roll out. Gone are the days scouring the internet for obscure pieces of information, chat gpt can find it for you with 2 or 3 prompts. I love this app and I?€?m so happy it?€?s on mobile now. The UI is also very sleek, easy to use. My only complaint with the interface is the history tab at the top right. I actually prefer the conversation tabs on the left in the web app but I understand it would make the app kind of clunky especially on mobile since the screen size is smaller. Anyway, awesome app 5 stars.",4,positive
2,5/19/2023 10:16,"Almost 5 stars, but?€? no search function","This app would almost be perfect if it wasn?€?t for ONE little thing: a ?€?search in?€? function. As anyone can imagine, these AI chats can get quuuuite long, and quite lengthy. And sometimes I wanna go into a chat & look up a specific part or parts of that particular chat by using a search function to look up key words within that chat & track down whatever part I was looking for. For example, in a chat, if I had searched way early into the chat ?€?how do lions hunt??€? And say days later, I wanted to revisit that particular response, I wanna be able to go into the actual chat go to a ?€?search in?€? function and be able to type in key words like ?€?lions?€? or ?€?hunt?€? to be able to automatically find that part in the chat instead of having to scroll through a massive chat to find that part. Similar to what you can do in Microsoft Word docs, or even on web browsers. I think the app already kind of has this, but it doesn?€?t work exactly like that. I tested it out, and all it does is, you type in key words, and it shows you that those words are present in the chat or chats, but it doesn?€?t actually track it down or take you to that specific part. If this app can have that? It?€?s an absolute massive perfect winner. Addendum - I also noticed that my phone (iPhone 14 Pro Max) tends to run a little hotter, which I?€?m sure will affect battery life, when using the app. Just wanted to add that too.",4,positive
3,5/27/2023 21:57,"4.5 stars, here?€?s why","I recently downloaded the app and overall, it's a great platform with excellent potential. However, I did encounter a couple of issues with logging in that I feel need to be addressed. Firstly, the login process was somewhat cumbersome. It took me a few attempts to successfully log in, as the app didn't always recognize my credentials right away. This could be improved by streamlining the login flow and ensuring a smoother user experience. Secondly, the app occasionally experienced login glitches, where it would unexpectedly log me out without any apparent reason. It was frustrating to have to re-enter my login information repeatedly, disrupting my usage. Despite these issues, I must say that once I managed to log in successfully, the app itself was fantastic. It offers a wide range of features and a sleek user interface that is visually appealing. The content available on the platform is diverse and engaging, keeping me entertained for hours. I do hope the developers take note of the login challenges and work on improving this aspect of the app. With a more reliable login system, this app has the potential to become a top-tier platform. In conclusion, while the app has its share of login issues, it still holds promise. I'm optimistic that with some updates and improvements, it can provide an even better user experience. I look forward to seeing future enhancements and continued growth of the app.",4,positive
4,6/9/2023 7:49,"Good, but Siri support would take it to the next level","I appreciate the devs implementing Siri support?€?it is already enhancing the usefulness of the app, despite being a little clunky. I?€?d prefer if it were possible to make a query in one fell swoop, however. Currently, it seems like I have to say ?€?ask ChatGPT?€? then wait to be asked what my query is before saying the actual query. I know that it?€?s possible in other contexts to submit a request to a third-party app through Siri in a single query?€?I?€?m able to say ?€?Create a reminder for XYZ in Things,?€? for example. \n\nIn addition, the responses produced when querying Siri seem identical to those you would get in-app. This isn?€?t actually ideal in a scenario when you?€?re getting a spoken response?€?they just go on for way too long. I wonder if the app could be smart enough to recognize that a request was coming via Siri and then effectively append an invisible ?€?Please answer in less than 50 words?€? to those requests. \n\nOtherwise, a solid but basic chat app that closely replicates the web experience. Aside from being uncannily good at punctuating my speech, the dictation feature is not really any different to iOS?€?s stock dictation and is not all that useful.",4,positive


In [442]:
chatgpt_test["3_way_class"] = chatgpt_test["rating"].apply(three_way_classes)
chatgpt_test.head()

Unnamed: 0,date,title,review,rating,3_way_class
0,5/19/2023 6:09,error unsupported country,cant login,2,negative
1,5/19/2023 9:39,Hype junk,More harm than help.,1,negative
2,5/19/2023 4:12,your gpt 4 is fake,Fix it,1,negative
3,5/20/2023 3:01,Please impose IPadOS,We need IPadOS!!!,5,positive
4,5/19/2023 20:49,Amazing,Great,5,positive


### Extract unigram features

In [443]:
# use the review text as the model input
train_text = chatgpt_train["review"]
test_text = chatgpt_test["review"]

# set the unigram range
vectorizer = CountVectorizer(ngram_range = (1,1))

# create training data representation
train_data_count = vectorizer.fit_transform(train_text.values.astype('U'))
print("NUMBER OF FEATURES IN TRAINING DATASET WITH UNIGRAMS")
print(train_data_count.shape,"\n") 

# create test data representation
test_data_count = vectorizer.transform(test_text.values.astype('U'))
print("NUMBER OF FEATURES IN TEST DATASET WITH UNIGRAMS")
print(test_data_count.shape,"\n") 

NUMBER OF FEATURES IN TRAINING DATASET WITH UNIGRAMS
(1829, 5551) 

NUMBER OF FEATURES IN TEST DATASET WITH UNIGRAMS
(458, 5551) 
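
Both matrices have 5551 columns because the vocabulary is learned from the training reviews only; words that appear only in the test reviews are ignored by `transform`. A small sketch of that behaviour on made-up documents (`train_docs` and `test_docs` are hypothetical):

```python
from sklearn.feature_extraction.text import CountVectorizer

# Made-up documents for illustration only.
train_docs = ["great app", "app keeps crashing", "great great update"]
test_docs = ["great widget"]  # "widget" never occurs in the training docs

vectorizer = CountVectorizer(ngram_range=(1, 1))
train_counts = vectorizer.fit_transform(train_docs)  # learns the vocabulary
test_counts = vectorizer.transform(test_docs)        # reuses it, drops unseen words

vocabulary = sorted(vectorizer.vocabulary_)  # columns are in alphabetical order
print(vocabulary)
print(train_counts.toarray())
print(test_counts.toarray())
```

Here the test row only has a count for "great"; "widget" contributes nothing because it has no column.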



In [444]:
# define features and true labels for the train and test sets

X_train = train_data_count
y_train = chatgpt_train["3_way_class"]
X_test = test_data_count
y_test = chatgpt_test["3_way_class"]

In [445]:
unigram_model = MultinomialNB()
unigram_model.fit(X_train, y_train)

In [446]:
# predict the labels for the test data
predictions = unigram_model.predict(X_test)
predictions[:5]

array(['negative', 'positive', 'negative', 'positive', 'positive'],
      dtype='<U8')

In [447]:
print ("Overall Accuracy score: ", accuracy_score(y_test, predictions), "\n")
print ("Overall Macro Recall score: ", recall_score(y_test, predictions, average='macro'))
print ("Overall Macro Precision score: ", precision_score(y_test, predictions, average='macro'))
print ("Overall Macro F1 score: ", f1_score(y_test, predictions, average='macro'), "\n")
print ("Overall Micro Recall score: ", recall_score(y_test, predictions, average='micro'))
print ("Overall Micro Precision score: ", precision_score(y_test, predictions, average='micro'))
print ("Overall Micro F1 score: ", f1_score(y_test, predictions, average='micro'), "\n")
print ("Overall Weighted Recall score: ", recall_score(y_test, predictions, average='weighted'))
print ("Overall Weighted Precision score: ", precision_score(y_test, predictions, average='weighted'))
print ("Overall Weighted F1 score: ", f1_score(y_test, predictions, average='weighted'), "\n")
print ("Individual label performance: ")
print (classification_report(y_test, predictions))
print ("Confusion Matrix: ")
print (confusion_matrix(y_test, predictions))

Overall Accuracy score:  0.7358078602620087 

Overall Macro Recall score:  0.5007752841737103
Overall Macro Precision score:  0.5420844714259355
Overall Macro F1 score:  0.5116057233704292 

Overall Micro Recall score:  0.7358078602620087
Overall Micro Precision score:  0.7358078602620087
Overall Micro F1 score:  0.7358078602620087 

Overall Weighted Recall score:  0.7358078602620087
Overall Weighted Precision score:  0.7128098567692405
Overall Weighted F1 score:  0.7176894078769239 

Individual label performance: 
              precision    recall  f1-score   support

    negative       0.69      0.53      0.60       141
     neutral       0.17      0.08      0.11        25
    positive       0.77      0.89      0.83       292

    accuracy                           0.74       458
   macro avg       0.54      0.50      0.51       458
weighted avg       0.71      0.74      0.72       458

Confusion Matrix: 
[[ 75   4  62]
 [  8   2  15]
 [ 26   6 260]]
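
The averaged scores can be recovered from the confusion matrix itself, which helps explain why accuracy (0.74) sits well above macro recall (0.50): the small neutral class counts as much as the others in the macro average. A sketch using the matrix printed above, assuming rows are true labels and columns are predicted labels in the order negative, neutral, positive:

```python
import numpy as np

# Confusion matrix copied from the baseline output
# (rows = true class, columns = predicted class).
cm = np.array([[75, 4, 62],
               [8, 2, 15],
               [26, 6, 260]])

correct = np.diag(cm)          # correctly classified instances per class
support = cm.sum(axis=1)       # true instances per class
predicted = cm.sum(axis=0)     # predicted instances per class

recall = correct / support
precision = correct / predicted

accuracy = correct.sum() / cm.sum()
macro_recall = recall.mean()                          # every class weighted equally
macro_precision = precision.mean()
weighted_recall = (recall * support).sum() / support.sum()  # equals accuracy

print(accuracy, macro_recall, macro_precision, weighted_recall)
```

For single-label multiclass problems the micro-averaged precision, recall and F1 all equal accuracy, which is why those three lines repeat the same value.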


## Task 2 (feature selection 1): Remove features with low variance

In [448]:
from sklearn.feature_selection import VarianceThreshold

In [449]:
low_threshold = 0.001
high_threshold = 0.005

In [450]:
feature_selector = VarianceThreshold(threshold = low_threshold)

In [451]:
# Low variance threshold = 0.001

feature_selector = VarianceThreshold(threshold = low_threshold)

X_train_low_variance_features_1 = feature_selector.fit(train_data_count).transform(train_data_count)
print ("Train feature space before low variance filtering (threshold=0.001): ", train_data_count.shape)
print ("Train feature space after low variance filtering (threshold=0.001): ", X_train_low_variance_features_1.shape)

print()

# the selector is already fitted on the training data, so only transform the test data
X_test_low_variance_features_1 = feature_selector.transform(test_data_count)
print ("Test feature space before low variance filtering (threshold=0.001): ", test_data_count.shape)
print ("Test feature space after low variance filtering (threshold=0.001): ", X_test_low_variance_features_1.shape)

Train feature space before low variance filtering (threshold=0.001):  (1829, 5551)
Train feature space after low variance filtering (threshold=0.001):  (1829, 3054)

Test feature space before low variance filtering (threshold=0.001):  (458, 5551)
Test feature space after low variance filtering (threshold=0.001):  (458, 3054)
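
`VarianceThreshold` computes the variance of each column on the data it is fitted to and keeps only columns whose variance exceeds the threshold. The sketch below shows this on a toy count matrix, then works out the variance of a term that occurs once in exactly one or two of the 1829 training reviews. The second part is an assumption-laden back-of-the-envelope check (counts of 0 or 1 only), not something the notebook computes.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Toy count matrix: 4 documents x 3 terms (illustrative values).
X = np.array([[0, 1, 2],
              [0, 1, 0],
              [0, 1, 1],
              [0, 0, 1]])

selector = VarianceThreshold(threshold=0.2)
X_reduced = selector.fit_transform(X)
print(selector.variances_)     # per-column variance
print(selector.get_support())  # True for columns that are kept

# Variance of a 0/1 column that is 1 in k of n documents is p * (1 - p), p = k / n.
n_docs = 1829
var_one_doc = (1 / n_docs) * (1 - 1 / n_docs)
var_two_docs = (2 / n_docs) * (1 - 2 / n_docs)
print(var_one_doc, var_two_docs)
```

Under that assumption, a term seen once in a single review falls below the 0.001 threshold while a term seen once in each of two reviews stays above it, which is consistent with the large drop from 5551 to 3054 features.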


In [452]:
# define features and true labels for the train and test sets

X_train = X_train_low_variance_features_1
y_train = chatgpt_train["3_way_class"]
X_test = X_test_low_variance_features_1
y_test = chatgpt_test["3_way_class"]

In [453]:
low_variance_0001_model = MultinomialNB()
low_variance_0001_model.fit(X_train, y_train)

In [454]:
predictions = low_variance_0001_model.predict(X_test)
predictions[:5]

array(['negative', 'positive', 'negative', 'positive', 'positive'],
      dtype='<U8')

In [455]:
print ("Overall Accuracy score: ", accuracy_score(y_test, predictions), "\n")
print ("Overall Macro Recall score: ", recall_score(y_test, predictions, average='macro'))
print ("Overall Macro Precision score: ", precision_score(y_test, predictions, average='macro'))
print ("Overall Macro F1 score: ", f1_score(y_test, predictions, average='macro'), "\n")
print ("Overall Micro Recall score: ", recall_score(y_test, predictions, average='micro'))
print ("Overall Micro Precision score: ", precision_score(y_test, predictions, average='micro'))
print ("Overall Micro F1 score: ", f1_score(y_test, predictions, average='micro'), "\n")
print ("Overall Weighted Recall score: ", recall_score(y_test, predictions, average='weighted'))
print ("Overall Weighted Precision score: ", precision_score(y_test, predictions, average='weighted'))
print ("Overall Weighted F1 score: ", f1_score(y_test, predictions, average='weighted'), "\n")
print ("Individual label performance: ")
print (classification_report(y_test, predictions))
print ("Confusion Matrix: ")
print (confusion_matrix(y_test, predictions))

Overall Accuracy score:  0.7270742358078602 

Overall Macro Recall score:  0.4974315878104861
Overall Macro Precision score:  0.5286054212610907
Overall Macro F1 score:  0.5065362940263409 

Overall Micro Recall score:  0.7270742358078602
Overall Micro Precision score:  0.7270742358078602
Overall Micro F1 score:  0.7270742358078602 

Overall Weighted Recall score:  0.7270742358078602
Overall Weighted Precision score:  0.7060217680304899
Overall Weighted F1 score:  0.7117444835328599 

Individual label performance: 
              precision    recall  f1-score   support

    negative       0.67      0.54      0.60       141
     neutral       0.14      0.08      0.10        25
    positive       0.77      0.87      0.82       292

    accuracy                           0.73       458
   macro avg       0.53      0.50      0.51       458
weighted avg       0.71      0.73      0.71       458

Confusion Matrix: 
[[ 76   4  61]
 [  8   2  15]
 [ 29   8 255]]
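
The reporting cell is repeated verbatim for every model. One way to avoid the repetition is a small helper that returns the scores as a dictionary; `summarize_metrics` is a hypothetical name, and `zero_division=0` is an assumption made here to silence warnings for classes that are never predicted (the notebook relies on the default setting).

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support


def summarize_metrics(y_true, y_pred):
    """Return accuracy plus macro/micro/weighted precision, recall and F1."""
    summary = {"accuracy": accuracy_score(y_true, y_pred)}
    for average in ("macro", "micro", "weighted"):
        precision, recall, f1, _ = precision_recall_fscore_support(
            y_true, y_pred, average=average, zero_division=0
        )
        summary[average] = {"precision": precision, "recall": recall, "f1": f1}
    return summary


# Toy labels for illustration; in the notebook these would be y_test and predictions.
y_true = ["negative", "neutral", "positive", "positive"]
y_pred = ["negative", "positive", "positive", "positive"]
summary = summarize_metrics(y_true, y_pred)
print(summary)
```

Collecting one such dictionary per model would also make it straightforward to compare the baseline and the feature-selected variants side by side.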


In [456]:
# High variance threshold = 0.005

feature_selector = VarianceThreshold(threshold = high_threshold)

X_train_low_variance_features_5 = feature_selector.fit(train_data_count).transform(train_data_count)
print ("Train feature space before low variance filtering (threshold=0.005): ", train_data_count.shape)
print ("Train feature space after low variance filtering (threshold=0.005): ", X_train_low_variance_features_5.shape)

print()

# the selector is already fitted on the training data, so only transform the test data
X_test_low_variance_features_5 = feature_selector.transform(test_data_count)
print ("Test feature space before low variance filtering (threshold=0.005): ", test_data_count.shape)
print ("Test feature space after low variance filtering (threshold=0.005): ", X_test_low_variance_features_5.shape)

Train feature space before low variance filtering (threshold=0.005):  (1829, 5551)
Train feature space after low variance filtering (threshold=0.005):  (1829, 980)

Test feature space before low variance filtering (threshold=0.005):  (458, 5551)
Test feature space after low variance filtering (threshold=0.005):  (458, 980)


In [457]:
# define features and true labels for the train and test sets

X_train = X_train_low_variance_features_5
y_train = chatgpt_train["3_way_class"]
X_test = X_test_low_variance_features_5
y_test = chatgpt_test["3_way_class"]

In [458]:
low_variance_0005_model = MultinomialNB()
low_variance_0005_model.fit(X_train, y_train)

In [459]:
predictions = low_variance_0005_model.predict(X_test)
predictions[:5]

array(['negative', 'positive', 'negative', 'positive', 'positive'],
      dtype='<U8')

In [460]:
print ("Overall Accuracy score: ", accuracy_score(y_test, predictions), "\n")
print ("Overall Macro Recall score: ", recall_score(y_test, predictions, average='macro'))
print ("Overall Macro Precision score: ", precision_score(y_test, predictions, average='macro'))
print ("Overall Macro F1 score: ", f1_score(y_test, predictions, average='macro'), "\n")
print ("Overall Micro Recall score: ", recall_score(y_test, predictions, average='micro'))
print ("Overall Micro Precision score: ", precision_score(y_test, predictions, average='micro'))
print ("Overall Micro F1 score: ", f1_score(y_test, predictions, average='micro'), "\n")
print ("Overall Weighted Recall score: ", recall_score(y_test, predictions, average='weighted'))
print ("Overall Weighted Precision score: ", precision_score(y_test, predictions, average='weighted'))
print ("Overall Weighted F1 score: ", f1_score(y_test, predictions, average='weighted'), "\n")
print ("Individual label performance: ")
print (classification_report(y_test, predictions))
print ("Confusion Matrix: ")
print (confusion_matrix(y_test, predictions))

Overall Accuracy score:  0.7358078602620087 

Overall Macro Recall score:  0.4995527704912724
Overall Macro Precision score:  0.5362426035502958
Overall Macro F1 score:  0.5100713456114153 

Overall Micro Recall score:  0.7358078602620087
Overall Micro Precision score:  0.7358078602620087
Overall Micro F1 score:  0.7358078602620087 

Overall Weighted Recall score:  0.7358078602620087
Overall Weighted Precision score:  0.7181904214361385
Overall Weighted F1 score:  0.7195576238803115 

Individual label performance: 
              precision    recall  f1-score   support

    negative       0.71      0.52      0.60       141
     neutral       0.12      0.08      0.10        25
    positive       0.77      0.89      0.83       292

    accuracy                           0.74       458
   macro avg       0.54      0.50      0.51       458
weighted avg       0.72      0.74      0.72       458

Confusion Matrix: 
[[ 74   4  63]
 [  9   2  14]
 [ 21  10 261]]


## Task 3 (feature selection 2): Select top k-best features using information gain (mutual information)

In [461]:
from sklearn.feature_selection import SelectKBest, mutual_info_classif

In [462]:
# k = 1000

selector = SelectKBest(mutual_info_classif, k=1000)
X_train_mutual_info_features_1000 = selector.fit_transform(train_data_count, y_train)
print ("Train feature space before filtering with 1000 best features: ", train_data_count.shape)
print ("Train feature space after filtering with 1000 best features: ", X_train_mutual_info_features_1000.shape)

print()

X_test_mutual_info_features_1000 = selector.transform(test_data_count)
print ("Test feature space before filtering with 1000 best features: ", test_data_count.shape)
print ("Test feature space after filtering with 1000 best features: ", X_test_mutual_info_features_1000.shape)

Train feature space before filtering with 1000 best features:  (1829, 5551)
Train feature space after filtering with 1000 best features:  (1829, 1000)

Test feature space before filtering with 1000 best features:  (458, 5551)
Test feature space after filtering with 1000 best features:  (458, 1000)
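
`SelectKBest` scores every feature against the training labels and keeps the `k` highest-scoring columns; the test matrix is then reduced to the same columns with `transform`. A toy sketch, assuming a sparse count matrix as produced by `CountVectorizer` (for sparse input `mutual_info_classif` treats the features as discrete); the data and labels are made up:

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.feature_selection import SelectKBest, mutual_info_classif

# Toy sparse count matrix: 4 documents x 3 terms (illustrative values).
X = csr_matrix(np.array([[1, 0, 1],
                         [1, 0, 0],
                         [0, 1, 1],
                         [0, 1, 0]]))
y = np.array([1, 1, 0, 0])  # toy class labels

selector = SelectKBest(mutual_info_classif, k=2)
X_selected = selector.fit_transform(X, y)

print(selector.scores_)        # mutual information between each term and the label
print(selector.get_support())  # True for the k columns that are kept
print(X_selected.shape)
```

The first two terms determine the label completely and receive the highest scores, while the third term is independent of the label and is dropped.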


In [463]:
# define features and true labels for the train and test sets

X_train = X_train_mutual_info_features_1000
y_train = chatgpt_train["3_way_class"]
X_test = X_test_mutual_info_features_1000
y_test = chatgpt_test["3_way_class"]

In [464]:
best_features_1000_model = MultinomialNB()
best_features_1000_model.fit(X_train, y_train)

In [465]:
predictions = best_features_1000_model.predict(X_test_mutual_info_features_1000)
predictions[:5]

array(['negative', 'positive', 'negative', 'positive', 'positive'],
      dtype='<U8')

In [466]:
print ("Overall Accuracy score: ", accuracy_score(y_test, predictions), "\n")
print ("Overall Macro Recall score: ", recall_score(y_test, predictions, average='macro'))
print ("Overall Macro Precision score: ", precision_score(y_test, predictions, average='macro'))
print ("Overall Macro F1 score: ", f1_score(y_test, predictions, average='macro'), "\n")
print ("Overall Micro Recall score: ", recall_score(y_test, predictions, average='micro'))
print ("Overall Micro Precision score: ", precision_score(y_test, predictions, average='micro'))
print ("Overall Micro F1 score: ", f1_score(y_test, predictions, average='micro'), "\n")
print ("Overall Weighted Recall score: ", recall_score(y_test, predictions, average='weighted'))
print ("Overall Weighted Precision score: ", precision_score(y_test, predictions, average='weighted'))
print ("Overall Weighted F1 score: ", f1_score(y_test, predictions, average='weighted'), "\n")
print ("Individual label performance: ")
print (classification_report(y_test, predictions))
print ("Confusion Matrix: ")
print (confusion_matrix(y_test, predictions))

Overall Accuracy score:  0.740174672489083 

Overall Macro Recall score:  0.5030583891965413
Overall Macro Precision score:  0.5389112618341105
Overall Macro F1 score:  0.5134618118836196 

Overall Micro Recall score:  0.740174672489083
Overall Micro Precision score:  0.740174672489083
Overall Micro F1 score:  0.740174672489083 

Overall Weighted Recall score:  0.740174672489083
Overall Weighted Precision score:  0.7223889137762837
Overall Weighted F1 score:  0.7241716069662414 

Individual label performance: 
              precision    recall  f1-score   support

    negative       0.71      0.53      0.61       141
     neutral       0.12      0.08      0.10        25
    positive       0.78      0.90      0.83       292

    accuracy                           0.74       458
   macro avg       0.54      0.50      0.51       458
weighted avg       0.72      0.74      0.72       458

Confusion Matrix: 
[[ 75   5  61]
 [  9   2  14]
 [ 21   9 262]]


In [467]:
# k = 2000

selector = SelectKBest(mutual_info_classif, k=2000)
X_train_mutual_info_features_2000 = selector.fit_transform(train_data_count, y_train)
print ("Train feature space before filtering with 2000 best features: ", train_data_count.shape)
print ("Train feature space after filtering with 2000 best features: ", X_train_mutual_info_features_2000.shape)

print()

X_test_mutual_info_features_2000 = selector.transform(test_data_count)
print ("Test feature space before filtering with 2000 best features: ", test_data_count.shape)
print ("Test feature space after filtering with 2000 best features: ", X_test_mutual_info_features_2000.shape)

Train feature space before filtering with 2000 best features:  (1829, 5551)
Train feature space after filtering with 2000 best features:  (1829, 2000)

Test feature space before filtering with 2000 best features:  (458, 5551)
Test feature space after filtering with 2000 best features:  (458, 2000)


In [468]:
# define feature matrices and true labels for the train and test sets

X_train = X_train_mutual_info_features_2000
y_train = chatgpt_train["3_way_class"]
X_test = X_test_mutual_info_features_2000
y_test = chatgpt_test["3_way_class"]

In [469]:
best_features_2000_model = MultinomialNB()
best_features_2000_model.fit(X_train, y_train)

In [470]:
predictions = best_features_2000_model.predict(X_test_mutual_info_features_2000)
predictions[:5]

array(['negative', 'positive', 'negative', 'positive', 'positive'],
      dtype='<U8')

In [471]:
print ("Overall Accuracy score: ", accuracy_score(y_test, predictions), "\n")
print ("Overall Macro Recall score: ", recall_score(y_test, predictions, average='macro'))
print ("Overall Macro Precision score: ", precision_score(y_test, predictions, average='macro'))
print ("Overall Macro F1 score: ", f1_score(y_test, predictions, average='macro'), "\n")
print ("Overall Micro Recall score: ", recall_score(y_test, predictions, average='micro'))
print ("Overall Micro Precision score: ", precision_score(y_test, predictions, average='micro'))
print ("Overall Micro F1 score: ", f1_score(y_test, predictions, average='micro'), "\n")
print ("Overall Weighted Recall score: ", recall_score(y_test, predictions, average='weighted'))
print ("Overall Weighted Precision score: ", precision_score(y_test, predictions, average='weighted'))
print ("Overall Weighted F1 score: ", f1_score(y_test, predictions, average='weighted'), "\n")
print ("Individual label performance: ")
print (classification_report(y_test, predictions))
print ("Confusion Matrix: ")
print (confusion_matrix(y_test, predictions))

Overall Accuracy score:  0.7379912663755459 

Overall Macro Recall score:  0.49094756954564595
Overall Macro Precision score:  0.5183691642643176
Overall Macro F1 score:  0.4976540194190983 

Overall Micro Recall score:  0.7379912663755459
Overall Micro Precision score:  0.7379912663755459
Overall Micro F1 score:  0.7379912663755459 

Overall Weighted Recall score:  0.7379912663755459
Overall Weighted Precision score:  0.7146149936633656
Overall Weighted F1 score:  0.7199030187879334 

Individual label performance: 
              precision    recall  f1-score   support

    negative       0.70      0.54      0.61       141
     neutral       0.08      0.04      0.05        25
    positive       0.77      0.89      0.83       292

    accuracy                           0.74       458
   macro avg       0.52      0.49      0.50       458
weighted avg       0.71      0.74      0.72       458

Confusion Matrix: 
[[ 76   5  60]
 [  8   1  16]
 [ 24   7 261]]
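
The same block of metric calls is repeated for every model in this notebook. A minimal sketch of a helper that wraps it is shown here; `report_metrics` and the toy labels are hypothetical, and `zero_division=0` is an assumption that scores a never-predicted label as 0 instead of raising a warning.

```python
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    confusion_matrix,
    precision_recall_fscore_support,
)


def report_metrics(y_true, y_pred):
    """Print the notebook's evaluation block and return the averaged scores."""
    print("Overall Accuracy score: ", accuracy_score(y_true, y_pred), "\n")
    summary = {}
    for average in ("macro", "micro", "weighted"):
        # zero_division=0 is an assumption: labels that are never predicted score 0
        precision, recall, f1, _ = precision_recall_fscore_support(
            y_true, y_pred, average=average, zero_division=0
        )
        summary[average] = {"recall": recall, "precision": precision, "f1": f1}
        name = average.capitalize()
        print(f"Overall {name} Recall score: ", recall)
        print(f"Overall {name} Precision score: ", precision)
        print(f"Overall {name} F1 score: ", f1, "\n")
    print("Individual label performance: ")
    print(classification_report(y_true, y_pred, zero_division=0))
    print("Confusion Matrix: ")
    print(confusion_matrix(y_true, y_pred))
    return summary


# Hypothetical toy labels, only to show the call.
toy_true = ["negative", "positive", "positive", "neutral"]
toy_pred = ["negative", "positive", "negative", "positive"]
toy_summary = report_metrics(toy_true, toy_pred)
```

With such a helper, each evaluation cell would reduce to a single call such as `report_metrics(y_test, predictions)`.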


## Task 4: (feature selection 3): Lexicon-based feature selection

In [523]:
# read the positive lexicon; the context manager closes the file afterwards
with open("positive_words_list.txt", "r") as f:
    positive_contents = f.read()

positive_words = positive_contents.split("\n")
positive_words

['a+',
 'abound',
 'abounds',
 'abundance',
 'abundant',
 'accessable',
 'accessible',
 'acclaim',
 'acclaimed',
 'acclamation',
 'accolade',
 'accolades',
 'accommodative',
 'accomodative',
 'accomplish',
 'accomplished',
 'accomplishment',
 'accomplishments',
 'accurate',
 'accurately',
 'achievable',
 'achievement',
 'achievements',
 'achievible',
 'acumen',
 'adaptable',
 'adaptive',
 'adequate',
 'adjustable',
 'admirable',
 'admirably',
 'admiration',
 'admire',
 'admirer',
 'admiring',
 'admiringly',
 'adorable',
 'adore',
 'adored',
 'adorer',
 'adoring',
 'adoringly',
 'adroit',
 'adroitly',
 'adulate',
 'adulation',
 'adulatory',
 'advanced',
 'advantage',
 'advantageous',
 'advantageously',
 'advantages',
 'adventuresome',
 'adventurous',
 'advocate',
 'advocated',
 'advocates',
 'affability',
 'affable',
 'affably',
 'affectation',
 'affection',
 'affectionate',
 'affinity',
 'affirm',
 'affirmation',
 'affirmative',
 'affluence',
 'affluent',
 'afford',
 'affordable',
 'af

In [524]:
# read the negative lexicon; the context manager closes the file afterwards
with open("negative_words_list.txt", "r") as f:
    negative_contents = f.read()

negative_words = negative_contents.split("\n")
negative_words

['2-faced',
 '2-faces',
 'abnormal',
 'abolish',
 'abominable',
 'abominably',
 'abominate',
 'abomination',
 'abort',
 'aborted',
 'aborts',
 'abrade',
 'abrasive',
 'abrupt',
 'abruptly',
 'abscond',
 'absence',
 'absent-minded',
 'absentee',
 'absurd',
 'absurdity',
 'absurdly',
 'absurdness',
 'abuse',
 'abused',
 'abuses',
 'abusive',
 'abysmal',
 'abysmally',
 'abyss',
 'accidental',
 'accost',
 'accursed',
 'accusation',
 'accusations',
 'accuse',
 'accuses',
 'accusing',
 'accusingly',
 'acerbate',
 'acerbic',
 'acerbically',
 'ache',
 'ached',
 'aches',
 'achey',
 'aching',
 'acrid',
 'acridly',
 'acridness',
 'acrimonious',
 'acrimoniously',
 'acrimony',
 'adamant',
 'adamantly',
 'addict',
 'addicted',
 'addicting',
 'addicts',
 'admonish',
 'admonisher',
 'admonishingly',
 'admonishment',
 'admonition',
 'adulterate',
 'adulterated',
 'adulteration',
 'adulterier',
 'adversarial',
 'adversary',
 'adverse',
 'adversity',
 'afflict',
 'affliction',
 'afflictive',
 'affront',


In [525]:
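# Function to extract lexicon features

# build the lookup sets once; membership tests on a list scan the whole lexicon
positive_lexicon = set(positive_words)
negative_lexicon = set(negative_words)

def extract_lexicon_features(text):
    # keep only the whitespace-separated tokens that occur in either lexicon
    words = set(text.split())
    positive = words & positive_lexicon
    negative = words & negative_lexicon
    joined = positive.union(negative)
    return " ".join(joined)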

In [526]:
chatgpt_train['lexicon_features'] = chatgpt_train['review'].apply(extract_lexicon_features)
chatgpt_test['lexicon_features'] = chatgpt_test['review'].apply(extract_lexicon_features)
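
The extraction above matches whitespace-separated tokens exactly, so a capitalised or punctuated token such as `Great` or `slow!` does not match a lower-case lexicon entry. The following is a minimal sketch of a normalising variant, not the method used in this notebook; the miniature lexicons, the regular expression and the function name are all hypothetical.

```python
import re

# Hypothetical miniature lexicons standing in for positive_words / negative_words.
demo_positive = {"great", "love", "clean"}
demo_negative = {"crash", "slow", "terrible"}
demo_lexicon = demo_positive | demo_negative

# Lower-case tokens that may contain digits, apostrophes, '+' or '-' (e.g. "a+", "2-faced").
TOKEN_PATTERN = re.compile(r"[a-z0-9][a-z0-9'+-]*")


def extract_lexicon_features_normalized(text):
    """Return the lexicon words found in text, ignoring case and punctuation."""
    tokens = set(TOKEN_PATTERN.findall(str(text).lower()))
    # sorted() makes the output order deterministic
    return " ".join(sorted(tokens & demo_lexicon))


sample = "Great app, but SLOW!"
exact_matches = set(sample.split()) & demo_lexicon
normalized_matches = extract_lexicon_features_normalized(sample)
print(exact_matches)
print(normalized_matches)
```

Whether this normalisation helps on the review data is an empirical question; it would change the number of lexicon features and therefore the scores reported in this task.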

In [527]:
# define lexicon feature text and true labels for the train and test sets

X_train = chatgpt_train['lexicon_features']
y_train = chatgpt_train["3_way_class"]
X_test = chatgpt_test['lexicon_features']
y_test = chatgpt_test["3_way_class"]

In [532]:
# Using CountVectorizer
vectorizer = CountVectorizer(ngram_range = (1,1))

# create training data representation
X_train_lexicon_features_count_vec = vectorizer.fit_transform(X_train.values.astype('U'))
print("Number of features in training dataset with lexicons")
print(X_train_lexicon_features_count_vec.shape,"\n") 

# create test data representation
X_test_data_count_vec = vectorizer.transform(X_test.values.astype('U'))
print("Number of features in test dataset with lexicons")
print(X_test_data_count_vec.shape,"\n") 

Number of features in training dataset with lexicons
(1829, 915) 

Number of features in test dataset with lexicons
(458, 915) 



In [529]:
lexicon_based_model = MultinomialNB()
lexicon_based_model.fit(X_train_lexicon_features_count_vec, y_train)

In [530]:
predictions = lexicon_based_model.predict(X_test_data_count_vec)
predictions[:5]

array(['positive', 'positive', 'positive', 'positive', 'positive'],
      dtype='<U8')

In [531]:
print ("Overall Accuracy score: ", accuracy_score(y_test, predictions), "\n")
print ("Overall Macro Recall score: ", recall_score(y_test, predictions, average='macro'))
print ("Overall Macro Precision score: ", precision_score(y_test, predictions, average='macro'))
print ("Overall Macro F1 score: ", f1_score(y_test, predictions, average='macro'), "\n")
print ("Overall Micro Recall score: ", recall_score(y_test, predictions, average='micro'))
print ("Overall Micro Precision score: ", precision_score(y_test, predictions, average='micro'))
print ("Overall Micro F1 score: ", f1_score(y_test, predictions, average='micro'), "\n")
print ("Overall Weighted Recall score: ", recall_score(y_test, predictions, average='weighted'))
print ("Overall Weighted Precision score: ", precision_score(y_test, predictions, average='weighted'))
print ("Overall Weighted F1 score: ", f1_score(y_test, predictions, average='weighted'), "\n")
print ("Individual label performance: ")
print (classification_report(y_test, predictions))
print ("Confusion Matrix: ")
print (confusion_matrix(y_test, predictions))

Overall Accuracy score:  0.6615720524017468 

Overall Macro Recall score:  0.36422811619547263
Overall Macro Precision score:  0.44745621351125936
Overall Macro F1 score:  0.3250859569877975 

Overall Micro Recall score:  0.6615720524017468
Overall Micro Precision score:  0.6615720524017468
Overall Micro F1 score:  0.6615720524017468 

Overall Weighted Recall score:  0.6615720524017468
Overall Weighted Precision score:  0.631041697775803
Overall Weighted F1 score:  0.5611001905641175 

Individual label performance: 
              precision    recall  f1-score   support

    negative       0.68      0.11      0.18       141
     neutral       0.00      0.00      0.00        25
    positive       0.66      0.99      0.79       292

    accuracy                           0.66       458
   macro avg       0.45      0.36      0.33       458
weighted avg       0.63      0.66      0.56       458

Confusion Matrix: 
[[ 15   0 126]
 [  3   0  22]
 [  4   0 288]]
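
No test review is predicted as neutral (the middle column of the confusion matrix is all zeros), so the neutral precision is a 0/0 division that scikit-learn reports as 0.0. A small sketch of making that choice explicit with the `zero_division` argument follows; the toy labels are hypothetical.

```python
from sklearn.metrics import precision_score

# Hypothetical toy labels in which "neutral" is never predicted.
toy_true = ["negative", "neutral", "positive", "positive"]
toy_pred = ["negative", "positive", "positive", "positive"]
label_order = ["negative", "neutral", "positive"]

# average=None returns one precision value per label, in label_order.
per_label_precision = precision_score(
    toy_true, toy_pred, labels=label_order, average=None, zero_division=0
)
macro_precision = precision_score(
    toy_true, toy_pred, labels=label_order, average="macro", zero_division=0
)
print(dict(zip(label_order, per_label_precision)))
print(macro_precision)
```

The undefined precision still counts as 0 in the macro average, which is why the macro scores of the lexicon model sit far below its weighted scores.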




## Task 6: Extract and select best unigrams

In [481]:
# use review for model
train_text = chatgpt_train["review"]
test_text = chatgpt_test["review"]

# set the unigram range
vectorizer = CountVectorizer(ngram_range = (1,1))

# create training data representation
train_data_count_vec = vectorizer.fit_transform(train_text.values.astype('U'))
print("NUMBER OF FEATURES IN TRAINING DATASET WITH UNIGRAMS")
print(train_data_count_vec.shape,"\n") 

# create test data representation
test_data_count_vec = vectorizer.transform(test_text.values.astype('U'))
print("NUMBER OF FEATURES IN TEST DATASET WITH UNIGRAMS")
print(test_data_count_vec.shape,"\n") 

NUMBER OF FEATURES IN TRAINING DATASET WITH UNIGRAMS
(1829, 5551) 

NUMBER OF FEATURES IN TEST DATASET WITH UNIGRAMS
(458, 5551) 



In [482]:
# k = 2000

selector = SelectKBest(mutual_info_classif, k=2000)

# fit the selector on the unigram counts created in the previous cell
X_train_mutual_info_best_2000_features = selector.fit_transform(train_data_count_vec, y_train)
print ("Train feature space before filtering with 2000 best features: ", train_data_count_vec.shape)
print ("Train feature space after filtering with 2000 best features: ", X_train_mutual_info_best_2000_features.shape)

print()

# apply the fitted selector to the test counts (transform only, no refit)
X_test_mutual_info_best_2000_features = selector.transform(test_data_count_vec)
print ("Test feature space before filtering with 2000 best features: ", test_data_count_vec.shape)
print ("Test feature space after filtering with 2000 best features: ", X_test_mutual_info_best_2000_features.shape)

Train feature space before filtering with 2000 best features:  (1829, 5551)
Train feature space after filtering with 2000 best features:  (1829, 2000)

Test feature space before filtering with 2000 best features:  (458, 5551)
Test feature space after filtering with 2000 best features:  (458, 2000)


## Task 7: Train and evaluate a Naive Bayes classifier using cross-validation

In [483]:
from sklearn.model_selection import StratifiedKFold, cross_val_score

In [484]:
cross_validation = StratifiedKFold(n_splits=5)
cross_validation

StratifiedKFold(n_splits=5, random_state=None, shuffle=False)

In [485]:
# get the training labels as a NumPy array so they can be indexed by fold

y_label = chatgpt_train['3_way_class']
y_label = np.array(y_label)

print("Train feature space with 2000 best features: ", X_train_mutual_info_best_2000_features.shape)
print("Train labels shape: ", y_label.shape) 

Train feature space with 2000 best features:  (1829, 2000)
Train labels shape:  (1829,)


In [486]:
# counter i
i = 0
f1_nb = []
accuracy_nb = []

for train_index, test_index in cross_validation.split(X_train_mutual_info_best_2000_features, y_label):
    
    print(f"CV with Multinomial Naive Bayes: {i+1}")
    print(train_index)

    X_train = X_train_mutual_info_best_2000_features[train_index]
    X_test = X_train_mutual_info_best_2000_features[test_index]
    y_train = y_label[train_index]
    y_test = y_label[test_index]

    multinomial_naive_bayes = MultinomialNB()

    multinomial_naive_bayes.fit(X_train, y_train)

    # blank line between the fold header and its scores
    print()
    
    predictions = multinomial_naive_bayes.predict(X_test)

    print ("Overall Accuracy score: ", accuracy_score(y_test, predictions))
    print ("Overall Weighted Recall score: ", recall_score(y_test, predictions, average='weighted'))
    print ("Overall Weighted Precision score: ", precision_score(y_test, predictions, average='weighted'))
    print ("Overall Weighted F1 score: ", f1_score(y_test, predictions, average='weighted'), "\n")
    f1_nb.append(f1_score(y_test, predictions, average='weighted'))
    accuracy_nb.append(accuracy_score(y_test, predictions))
    print ("..................................................\n\n")
    i += 1

CV with Multinomial Naive Bayes: 1
[ 280  282  284  285  286  287  291  293  294  295  296  298  300  301
  302  303  304  306  307  308  309  312  315  316  318  319  320  321
  322  323  324  326  327  328  330  331  332  333  334  336  338  339
  341  343  344  346  347  349  353  354  355  356  357  358  361  363
  364  365  366  367  368  369  370  371  372  373  376  377  378  379
  381  382  383  384  385  386  387  388  389  390  391  392  393  394
  395  396  397  398  399  400  401  403  404  405  408  410  411  412
  413  417  418  420  423  424  425  426  427  428  429  430  431  432
  433  434  435  436  437  438  439  440  443  444  445  446  447  448
  449  450  452  453  454  455  458  461  462  463  464  465  466  467
  468  469  470  471  472  473  474  475  476  477  478  479  480  481
  482  483  484  485  486  487  489  490  491  493  494  495  498  499
  500  501  502  503  504  505  506  507  509  510  511  512  513  514
  515  516  517  518  520  522  523  524  

In [487]:
from scipy.stats import sem, t

In [488]:
confidence_interval = 0.95

In [489]:
accuracy_nb

[0.7622950819672131,
 0.7404371584699454,
 0.7404371584699454,
 0.7650273224043715,
 0.8]
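
In the loop above the 2000 features were selected on all training rows before the folds were split, so each held-out fold also influenced the feature selection. A minimal sketch of an alternative is to put the vectorizer, the selector and the classifier in a `Pipeline` and pass it to `cross_val_score` (imported earlier but not used), which refits every step on the training part of each fold. The toy corpus, the step names and `k=5` are hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy corpus standing in for chatgpt_train["review"].
toy_texts = [
    "love the clean design and fast answers",
    "great tool for writing and coding help",
    "really helpful and accurate most days",
    "fantastic app that saves me time",
    "works great and the answers are useful",
    "love how fast and simple it feels",
    "the app keeps crashing on login",
    "terrible update with slow and wrong answers",
    "cannot log in and support is useless",
    "slow buggy and full of errors",
    "crashing again after the latest update",
    "wrong answers and constant errors",
]
toy_labels = ["positive"] * 6 + ["negative"] * 6

toy_pipeline = Pipeline([
    ("vectorizer", CountVectorizer(ngram_range=(1, 1))),
    ("selector", SelectKBest(mutual_info_classif, k=5)),
    ("classifier", MultinomialNB()),
])

# Every step is fitted on the training part of each fold only.
toy_cv = StratifiedKFold(n_splits=3)
fold_accuracy = cross_val_score(toy_pipeline, toy_texts, toy_labels, cv=toy_cv, scoring="accuracy")
print(fold_accuracy)
```

Scores obtained this way are usually somewhat lower than scores computed on pre-selected features, because the selector no longer sees the held-out reviews.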

In [490]:
import math
import scipy.stats as st 

In [491]:
from scipy.stats import norm

def calculate_confidence_interval(accuracy_scores, confidence_interval):
    
    mean_accuracy = np.mean(accuracy_scores)
    standard_error = sem(accuracy_scores)

    # two-sided normal critical value for the requested level (1.959964 for 0.95)
    z_score = norm.ppf((1 + confidence_interval) / 2)
    margin_of_error = z_score * standard_error

    # Bounds
    lower_bound = mean_accuracy - margin_of_error
    upper_bound = mean_accuracy + margin_of_error

    print(f"Mean Accuracy: {mean_accuracy:.4f}")
    print(f"{confidence_interval:.0%} Confidence Interval: [{lower_bound:.4f}, {upper_bound:.4f}]")

In [492]:
calculate_confidence_interval(accuracy_nb, confidence_interval)

Mean Accuracy: 0.7616
95% Confidence Interval: [0.7402, 0.7830]
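
The interval above uses a normal critical value. With only five fold scores, a Student's t critical value with four degrees of freedom is a common alternative and gives a wider interval. The sketch below compares the two on the Naive Bayes fold accuracies listed earlier; `mean_confidence_interval` is a hypothetical helper, not part of the notebook.

```python
import numpy as np
from scipy.stats import norm, sem, t

# Fold accuracies copied from the accuracy_nb output above.
accuracy_nb_scores = [
    0.7622950819672131,
    0.7404371584699454,
    0.7404371584699454,
    0.7650273224043715,
    0.8,
]


def mean_confidence_interval(scores, confidence=0.95, use_t=True):
    """Return (mean, lower, upper) using a t or a normal critical value."""
    scores = np.asarray(scores, dtype=float)
    mean = scores.mean()
    standard_error = sem(scores)
    quantile = (1 + confidence) / 2
    if use_t:
        critical = t.ppf(quantile, df=len(scores) - 1)
    else:
        critical = norm.ppf(quantile)
    margin = critical * standard_error
    return mean, mean - margin, mean + margin


z_mean, z_lower, z_upper = mean_confidence_interval(accuracy_nb_scores, use_t=False)
t_mean, t_lower, t_upper = mean_confidence_interval(accuracy_nb_scores, use_t=True)
print(f"normal: [{z_lower:.4f}, {z_upper:.4f}]")
print(f"t:      [{t_lower:.4f}, {t_upper:.4f}]")
```

Both variants treat the fold scores as independent draws, which is only an approximation because the training sets of the folds overlap.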


## Task 8: Train a linear SVM classifier

In [493]:
from sklearn.svm import LinearSVC

In [494]:
# counter i
i = 0
f1_svc = []
accuracy_svc = []

for train_index, test_index in cross_validation.split(X_train_mutual_info_best_2000_features, y_label):
    
    print(f"CV with Linear SVC: {i+1}")

    X_train = X_train_mutual_info_best_2000_features[train_index]
    X_test = X_train_mutual_info_best_2000_features[test_index]
    y_train = y_label[train_index]
    y_test = y_label[test_index]

    linear_svc = LinearSVC(dual=True)

    linear_svc.fit(X_train, y_train)

    print()
    
    predictions = linear_svc.predict(X_test)

    print ("Overall Accuracy score: ", accuracy_score(y_test, predictions))
    print ("Overall Weighted Recall score: ", recall_score(y_test, predictions, average='weighted'))
    print ("Overall Weighted Precision score: ", precision_score(y_test, predictions, average='weighted'))
    print ("Overall Weighted F1 score: ", f1_score(y_test, predictions, average='weighted'), "\n")
    f1_svc.append(f1_score(y_test, predictions, average='weighted'))
    accuracy_svc.append(accuracy_score(y_test, predictions))
    print ("..................................................\n\n")
    i += 1

CV with Linear SVC: 1

Overall Accuracy score:  0.6475409836065574
Overall Weighted Recall score:  0.6475409836065574
Overall Weighted Precision score:  0.6656190102027453
Overall Weighted F1 score:  0.6558938057219318 

..................................................


CV with Linear SVC: 2

Overall Accuracy score:  0.674863387978142
Overall Weighted Recall score:  0.674863387978142
Overall Weighted Precision score:  0.7037675219767167
Overall Weighted F1 score:  0.685178843364506 

..................................................


CV with Linear SVC: 3

Overall Accuracy score:  0.7021857923497268
Overall Weighted Recall score:  0.7021857923497268
Overall Weighted Precision score:  0.6846943624206632
Overall Weighted F1 score:  0.6913392989892845 

..................................................


CV with Linear SVC: 4

Overall Accuracy score:  0.7349726775956285
Overall Weighted Recall score:  0.7349726775956285
Overall Weighted Precision score:  0.7104527451716034
Overall W

In [495]:
accuracy_svc

[0.6475409836065574,
 0.674863387978142,
 0.7021857923497268,
 0.7349726775956285,
 0.7205479452054795]

In [496]:
calculate_confidence_interval(accuracy_svc, confidence_interval)

Mean Accuracy: 0.6960
95% Confidence Interval: [0.6652, 0.7269]
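
The cross-validation loop is repeated for each classifier with only the estimator changed. A minimal sketch of collecting the same per-fold accuracy and weighted F1 with `cross_validate` is shown below; the synthetic data from `make_classification` and the helper name `evaluate_with_cv` are hypothetical stand-ins for the selected unigram features and the notebook's loops.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.svm import LinearSVC

# Synthetic three-class data (hypothetical), used only to make the sketch runnable.
X_demo, y_demo = make_classification(
    n_samples=150, n_features=20, n_informative=6, n_classes=3, random_state=0
)


def evaluate_with_cv(model, X, y, n_splits=5):
    """Return per-fold accuracy and weighted F1 for one estimator."""
    cv = StratifiedKFold(n_splits=n_splits)
    results = cross_validate(model, X, y, cv=cv, scoring=["accuracy", "f1_weighted"])
    return results["test_accuracy"], results["test_f1_weighted"]


demo_models = {
    "Linear SVC": LinearSVC(dual=True, max_iter=50000),
    "Logistic Regression": LogisticRegression(max_iter=50000),
}
demo_scores = {
    name: evaluate_with_cv(model, X_demo, y_demo) for name, model in demo_models.items()
}
for name, (accuracy, f1) in demo_scores.items():
    print(name, accuracy.mean(), f1.mean())
```

Because every model is scored on the same `StratifiedKFold` splits, the resulting score arrays line up fold by fold.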


## Task 9: Train a logistic regression classifier

In [497]:
from sklearn.linear_model import LogisticRegression

In [498]:
# counter i
i = 0
f1_lr = []
accuracy_lr = []

for train_index, test_index in cross_validation.split(X_train_mutual_info_best_2000_features, y_label):
    
    print(f"CV with Logistic Regression: {i+1}")

    X_train = X_train_mutual_info_best_2000_features[train_index]
    X_test = X_train_mutual_info_best_2000_features[test_index]
    y_train = y_label[train_index]
    y_test = y_label[test_index]

    logistic_regression = LogisticRegression(max_iter=50000)

    logistic_regression.fit(X_train, y_train)

    print()
    
    predictions = logistic_regression.predict(X_test)

    print ("Overall Accuracy score: ", accuracy_score(y_test, predictions))
    print ("Overall Weighted Recall score: ", recall_score(y_test, predictions, average='weighted'))
    print ("Overall Weighted Precision score: ", precision_score(y_test, predictions, average='weighted'))
    print ("Overall Weighted F1 score: ", f1_score(y_test, predictions, average='weighted'), "\n")
    f1_lr.append(f1_score(y_test, predictions, average='weighted'))
    accuracy_lr.append(accuracy_score(y_test, predictions))
    print ("..................................................\n\n")
    i += 1

CV with Logistic Regression: 1

Overall Accuracy score:  0.6721311475409836
Overall Weighted Recall score:  0.6721311475409836
Overall Weighted Precision score:  0.686377586245145
Overall Weighted F1 score:  0.6788006388200585 

..................................................


CV with Logistic Regression: 2

Overall Accuracy score:  0.7158469945355191
Overall Weighted Recall score:  0.7158469945355191
Overall Weighted Precision score:  0.7116041781054723
Overall Weighted F1 score:  0.7133182685241422 

..................................................


CV with Logistic Regression: 3

Overall Accuracy score:  0.7131147540983607
Overall Weighted Recall score:  0.7131147540983607
Overall Weighted Precision score:  0.6759404979844112
Overall Weighted F1 score:  0.6899667912456394 

..................................................


CV with Logistic Regression: 4

Overall Accuracy score:  0.7513661202185792
Overall Weighted Recall score:  0.7513661202185792
Overall Weighted Precisio

In [499]:
accuracy_lr

[0.6721311475409836,
 0.7158469945355191,
 0.7131147540983607,
 0.7513661202185792,
 0.7205479452054795]

In [500]:
calculate_confidence_interval(accuracy_lr, confidence_interval)

Mean Accuracy: 0.7146
95% Confidence Interval: [0.6898, 0.7394]


## Task 10: Model comparison using paired t-test

In [501]:
from scipy import stats

In [502]:
f1_nb

[0.7353849453441034,
 0.7316207081945378,
 0.7344702865676813,
 0.7573191566719791,
 0.7603869670326174]

In [503]:
f1_svc

[0.6558938057219318,
 0.685178843364506,
 0.6913392989892845,
 0.7195614618227334,
 0.6733759130628014]

In [504]:
f1_lr

[0.6788006388200585,
 0.7133182685241422,
 0.6899667912456394,
 0.7253747229354577,
 0.6672201439683422]

In [505]:
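# One-sided Student's t-test: is the mean fold F1 of Naive Bayes greater than that of Linear SVC?
# Note: ttest_ind treats the two score lists as independent samples; it does not pair the folds.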

mnb_svc_ttest = stats.ttest_ind(f1_nb, f1_svc, alternative='greater')
print("Student's t-test result for Linear SVC vs Naive Bayes: ", mnb_svc_ttest) 

Student's t-test result for Linear SVC vs Naive Bayes:  Ttest_indResult(statistic=4.813487870897465, pvalue=0.0006661935588968813)


In [506]:
# Student's t-test for Naive Bayes vs Logistic Regression

mnb_lr_ttest = stats.ttest_ind(f1_nb, f1_lr, alternative='greater')
print("Student's t-test result for Naive Bayes vs Logistic Regression: ", mnb_lr_ttest) 

Student's t-test result for Naive Bayes vs Logistic Regression:  Ttest_indResult(statistic=3.9394014895599385, pvalue=0.002149754619768332)


In [507]:
# Student's t-test for Linear SVC vs Logistic Regression

lvc_lr_ttest = stats.ttest_ind(f1_svc, f1_lr, alternative='greater')
print("Student's t-test result for Linear SVC vs Logistic Regression: ", lvc_lr_ttest) 

Student's t-test result for Linear SVC vs Logistic Regression:  Ttest_indResult(statistic=-0.6552346866913336, pvalue=0.7346487315731108)
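
The three models were scored on the same five folds, so the scores can also be compared fold by fold. The following is a sketch of the paired variant using `scipy.stats.ttest_rel`, assuming SciPy 1.6 or later for the `alternative` argument; the score lists are copied from the outputs above and the result variable names are hypothetical.

```python
from scipy import stats

# Weighted F1 per fold, copied from the f1_nb, f1_svc and f1_lr outputs above.
f1_nb_scores = [0.7353849453441034, 0.7316207081945378, 0.7344702865676813,
                0.7573191566719791, 0.7603869670326174]
f1_svc_scores = [0.6558938057219318, 0.685178843364506, 0.6913392989892845,
                 0.7195614618227334, 0.6733759130628014]
f1_lr_scores = [0.6788006388200585, 0.7133182685241422, 0.6899667912456394,
                0.7253747229354577, 0.6672201439683422]

# ttest_rel works on the per-fold differences; alternative="greater" tests
# whether the first model's mean F1 exceeds the second model's.
paired_nb_svc = stats.ttest_rel(f1_nb_scores, f1_svc_scores, alternative="greater")
paired_nb_lr = stats.ttest_rel(f1_nb_scores, f1_lr_scores, alternative="greater")
paired_svc_lr = stats.ttest_rel(f1_svc_scores, f1_lr_scores, alternative="greater")

print("Naive Bayes > Linear SVC:          ", paired_nb_svc)
print("Naive Bayes > Logistic Regression: ", paired_nb_lr)
print("Linear SVC > Logistic Regression:  ", paired_svc_lr)
```

With five folds the test has only four degrees of freedom, so the p-values are best read as rough evidence rather than precise probabilities.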


## Task 11: Hyperparameter tuning

In [508]:
from sklearn.model_selection import GridSearchCV

In [509]:
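# reuse the split of the last (fifth) CV fold: train_index and test_index
# still hold the values from the final iteration of the loop above
X_train = X_train_mutual_info_best_2000_features[train_index]
X_test = X_train_mutual_info_best_2000_features[test_index]
y_train = y_label[train_index]
y_test = y_label[test_index]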

In [510]:
svc_C_params = [
    {'C': [0.01, 0.1, 1, 10, 100, 1000]}
]

In [511]:
linear_svc_raw = LinearSVC(dual=True, max_iter=50000)

In [512]:
grid_cv = GridSearchCV(estimator=linear_svc_raw, param_grid=svc_C_params)
grid_cv.fit(X_train, y_train)

In [513]:
grid_cv.best_params_

{'C': 0.1}

In [514]:
grid_cv.best_score_

0.7261045397166768
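
By default `GridSearchCV` refits the best configuration on all rows passed to `fit` (`refit=True`), so the tuned model is available as `best_estimator_` without constructing a second `LinearSVC` by hand. A minimal sketch on synthetic data follows; the data, the reduced grid and the variable names are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import LinearSVC

# Synthetic three-class data (hypothetical), split into a fitting and a held-out part.
X_demo, y_demo = make_classification(
    n_samples=150, n_features=20, n_informative=6, n_classes=3, random_state=0
)
X_fit, y_fit = X_demo[:120], y_demo[:120]
X_holdout = X_demo[120:]

candidate_C = [0.01, 0.1, 1, 10]
search = GridSearchCV(
    estimator=LinearSVC(dual=True, max_iter=50000),
    param_grid={"C": candidate_C},
    cv=StratifiedKFold(n_splits=5),  # explicit inner folds
    scoring="accuracy",              # explicit selection metric
)
search.fit(X_fit, y_fit)

# best_estimator_ is already refitted on X_fit with the winning C.
tuned_model = search.best_estimator_
holdout_predictions = tuned_model.predict(X_holdout)
print(search.best_params_, search.best_score_)
```

Stating `cv` and `scoring` explicitly documents which metric chose `C`; a weighted F1 scorer (`scoring="f1_weighted"`) would match the metric compared in the t-tests.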

In [515]:
linear_SVC_tuned = LinearSVC(dual=True, max_iter=50000, C=0.1)

In [516]:
linear_SVC_tuned.fit(X_train, y_train)

In [517]:
predictions = linear_SVC_tuned.predict(X_test)
predictions[:5]

array(['negative', 'positive', 'positive', 'positive', 'positive'],
      dtype=object)

In [518]:
print ("Overall Accuracy score: ", accuracy_score(y_test, predictions), "\n")
print ("Overall Weighted Recall score: ", recall_score(y_test, predictions, average='weighted'))
print ("Overall Weighted Precision score: ", precision_score(y_test, predictions, average='weighted'))
print ("Overall Weighted F1 score: ", f1_score(y_test, predictions, average='weighted'), "\n")
print ("Individual label performance: ")
print (classification_report(y_test, predictions))
print ("Confusion Matrix: ")
print (confusion_matrix(y_test, predictions))

Overall Accuracy score:  0.7123287671232876 

Overall Weighted Recall score:  0.7123287671232876
Overall Weighted Precision score:  0.6526870389884089
Overall Weighted F1 score:  0.6524200913242011 

Individual label performance: 
              precision    recall  f1-score   support

    negative       0.78      0.39      0.52        98
     neutral       0.00      0.00      0.00        39
    positive       0.71      0.97      0.82       228

    accuracy                           0.71       365
   macro avg       0.50      0.45      0.45       365
weighted avg       0.65      0.71      0.65       365

Confusion Matrix: 
[[ 38   1  59]
 [  8   0  31]
 [  3   3 222]]
