# Model Performance

This notebook uses the classes created for ETL, text processing, modeling, and metrics to measure the performance of various models. We have tried the following models:

<ul>
    <li>
        <b>Twitter:</b> TBD
    </li>
</ul>

In [1]:
!pip install --upgrade pip
!pip install nltk
!pip install contractions
!pip install inflect
!pip install numpy 
!pip install scikit-learn 
!pip install gensim
!pip uninstall -y tensorflow
!pip install torch
!pip install transformers



In [2]:
# Set up the notebook to import modules from relative paths
import os, sys

#'/home/user/example/parent/child'
current_path = os.path.abspath('.')

#'/home/user/example/parent'
parent_path = os.path.dirname(current_path)

sys.path.append(parent_path)

In [3]:
from model import Sentiment_Analysis_Model

twitter_model = Sentiment_Analysis_Model()

Some weights of the model checkpoint at cardiffnlp/twitter-roberta-base-sentiment-latest were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
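
The checkpoint named in the warning above, `cardiffnlp/twitter-roberta-base-sentiment-latest`, is a three-way sentiment classifier. The internals of `Sentiment_Analysis_Model` are not shown in this notebook, so the following is only a minimal sketch of how such a wrapper might convert classifier output into the integer classes used later. The names `LABEL_TO_CLASS`, `predict_classes`, and `stub_classifier` are hypothetical, and the mapping negative=0, neutral=1, positive=2 is an assumption.

```python
# Assumed mapping from sentiment label to the integer classes used in this notebook.
LABEL_TO_CLASS = {"negative": 0, "neutral": 1, "positive": 2}


def predict_classes(texts, classify):
    """Run `classify` on the texts and convert its labels to integer classes.

    `classify` is any callable that takes a list of strings and returns one
    dict per string with a "label" key, which is the shape a Hugging Face
    text-classification pipeline returns.
    """
    results = classify(list(texts))
    return [LABEL_TO_CLASS[result["label"].lower()] for result in results]


def stub_classifier(texts):
    """Hypothetical stand-in for the real model so the sketch runs offline."""
    return [
        {"label": "positive" if "love" in text.lower() else "neutral", "score": 1.0}
        for text in texts
    ]


predicted = predict_classes(
    ["My Kiddos LOVE this show!!", "It was fine."], stub_classifier
)
print(predicted)
```

With `transformers` installed, the stub could be replaced by `pipeline("text-classification", model="cardiffnlp/twitter-roberta-base-sentiment-latest")`. Long reviews may need to be truncated to the model's maximum input length.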


First, provide the base directory that holds the Amazon Reviews dataset in CSV format, along with the exact file name. We use the <b>ETL_Pipeline</b> class to load this raw file and apply the transformations the model needs. We also save the transformed file, which speeds up later runs.

In [4]:
from etl_pipeline import ETL_Pipeline 

# Initialize the reviews
base_dir = "/Users/shaileshhemdev/ai/ai-enabledsystems/workspace/"
source_file = "amazon_movie_reviews.csv"
path = base_dir + source_file

dp = ETL_Pipeline(base_dir)
transformed_df = dp.process(source_file)
transformed_df.head()

Unnamed: 0,rating,review_title,text,helpful_vote,main_category,categories,tags,class
0,5.0,Five Stars,"Amazon, please buy the show! I'm hooked!",0,Prime Video,Suspense Drama,Nudity violence substance use alcohol use smok...,2
1,5.0,Five Stars,My Kiddos LOVE this show!!,0,Prime Video,Kids,,2
2,3.0,Some decent moments...but...,Annabella Sciorra did her character justice wi...,0,Prime Video,,Violence substance use foul language sexual co...,1
3,4.0,"Decent Depiction of Lower-Functioning Autism, ...",...there should be more of a range of characte...,1,Prime Video,,Violence alcohol use foul language sexual content,2
4,5.0,What Love Is...,"...isn't always how you expect it to be, but w...",0,Prime Video,,,2
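
In the preview above, ratings of 4.0 and 5.0 carry class 2 and a rating of 3.0 carries class 1. The `ETL_Pipeline` code is not shown here, so the following is only a sketch of a mapping that is consistent with those rows. The function name `rating_to_class` is hypothetical, and the treatment of ratings below 3 as class 0 is an assumption, since no such row appears in the preview.

```python
def rating_to_class(rating):
    """Map a 1-5 star rating to a sentiment class (assumed thresholds)."""
    if rating >= 4:
        return 2  # positive
    if rating == 3:
        return 1  # neutral
    return 0  # negative (assumed; not visible in the preview rows)


# Ratings of the five preview rows shown above.
ratings = [5.0, 5.0, 3.0, 4.0, 5.0]
classes = [rating_to_class(r) for r in ratings]
print(classes)
```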


Load the dataset and get the training and testing datasets for a randomly chosen fold.

In [5]:
from dataset import Sentiment_Analysis_Dataset
import random

# Initialize the Sentiment Analysis Dataset
dataset = Sentiment_Analysis_Dataset(transformed_df, 'class')

# Get a random fold
random_fold = random.randint(0, 4)

# Get Training and Test datasets for this fold
train = dataset.get_training_dataset(random_fold)
test = dataset.get_testing_dataset(random_fold)

# Print the sizes to see if you get a good split
print(f" Fold = {random_fold+1} yields Training Dataset with {len(train[0])} records and Testing with {len(test[0])} records.")

 Fold = 4 yields Training Dataset with 800000 records and Testing with 200000 records.
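
The 800000/200000 split reported above matches what one fold of a five-fold split over 1,000,000 records would produce. How `Sentiment_Analysis_Dataset` actually builds its folds (shuffling, stratification, and how the validation set is carved out) is not shown in this notebook. The sketch below assumes contiguous, unshuffled folds, and the function name `fold_indices` is hypothetical.

```python
def fold_indices(n_records, n_folds, fold):
    """Return (train_idx, test_idx) for one fold of a contiguous k-fold split."""
    fold_size = n_records // n_folds
    start = fold * fold_size
    # The last fold absorbs any remainder when n_records is not divisible.
    stop = n_records if fold == n_folds - 1 else start + fold_size
    test_idx = list(range(start, stop))
    train_idx = list(range(0, start)) + list(range(stop, n_records))
    return train_idx, test_idx


# Scaled-down example: 1000 records, 5 folds, fold index 3 (printed as "Fold = 4").
train_idx, test_idx = fold_indices(n_records=1000, n_folds=5, fold=3)
print(len(train_idx), len(test_idx))
```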


In [6]:
import pandas as pd
import numpy as np
import sklearn
import seaborn as sns

from IPython.display import display, HTML

# Display properties
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', 1000)
pd.set_option('display.colheader_justify', 'center')
pd.set_option('display.precision', 2)
pd.set_option('display.float_format', lambda x: '%.3f' % x)

sns.set(style="ticks", color_codes=True)

We first verify the model's results using an earlier run that produced 76,000 predictions and took several hours.

In [7]:
model_run_df = pd.read_csv("../sentiments-twitter-model.csv")
model_run_df.head()

Unnamed: 0,text,rating,class,predicted_class
0,"Amazon, please buy the show! I'm hooked!",5.0,2,1
1,My Kiddos LOVE this show!!,5.0,2,2
2,Annabella Sciorra did her character justice wi...,3.0,1,1
3,...there should be more of a range of characte...,4.0,2,2
4,"...isn't always how you expect it to be, but w...",5.0,2,1


In [15]:
from metrics import Metrics
from sklearn.metrics import confusion_matrix

metrics = Metrics()

y_vals = model_run_df["class"].values
y_pred = model_run_df["predicted_class"].values

acc, acc_bal, prec, recall, f1 = metrics.run(y_vals, y_pred)
accs, acc_bals, precs, recalls, f1s = [], [], [], [], []

accs += [acc]
acc_bals += [acc_bal]
precs += [prec]
recalls += [recall]
f1s += [f1]

metrics.generate_report(accs, acc_bals, precs, recalls, f1s, ['twitter-roberta-base'],'../results/model-run1.txt')
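
The `Metrics` class is not shown in this notebook, so how it computes its five values is not confirmed here. As a rough illustration of what accuracy, balanced accuracy, precision, recall, and F1 mean for a multi-class problem, the sketch below computes them in plain Python on the five preview rows of `model_run_df`. The function name `classification_metrics` is hypothetical, and the choice of macro averaging is an assumption.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, balanced accuracy, and macro precision, recall, F1."""
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        n_predicted = sum(p == label for p in y_pred)
        n_actual = sum(t == label for t in y_true)
        precision = tp / n_predicted if n_predicted else 0.0
        recall = tp / n_actual if n_actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    # Balanced accuracy is the mean recall over classes that occur in y_true.
    true_labels = set(y_true)
    present = [r for label, r in zip(labels, recalls) if label in true_labels]
    balanced = sum(present) / len(present)

    n = len(labels)
    return accuracy, balanced, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# "class" and "predicted_class" from the five preview rows only.
y_true = [2, 2, 1, 2, 2]
y_hat = [1, 2, 1, 2, 1]
acc, acc_bal, prec, rec, f1 = classification_metrics(y_true, y_hat)
print(round(acc, 3), round(acc_bal, 3), round(prec, 3), round(rec, 3), round(f1, 3))
```

On these five rows the accuracy is 0.6 while the balanced accuracy is 0.75, because the single class-1 row is classified correctly and counts as much as the four class-2 rows.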

Next, we run the model in a loop to generate results from random samples extracted from each fold.

In [30]:
import random

# List of classifiers
classifiers = ['twitter-roberta-base']

for classifier in classifiers:
    for fold in range(5):
        # Reset the metric lists for this fold
        accs, acc_bals, precs, recalls, f1s = [], [], [], [], []

        # Get the training, testing and validation datasets (evaluated in this order)
        splits = [
            dataset.get_training_dataset(fold),
            dataset.get_testing_dataset(fold),
            dataset.get_validation_dataset(fold),
        ]

        for X_split, y_split in splits:
            # Draw 200 random row indices from this split
            sublist = random.sample(range(len(X_split)), 200)

            # Keep only column 2 of the features, plus the matching labels
            X = X_split[sublist][:, 2]
            y = y_split[sublist]

            # Obtain the metrics for this subset
            acc, acc_bal, prec, recall, f1 = twitter_model.test(X, y)

            # Collect the metrics
            accs += [acc]
            acc_bals += [acc_bal]
            precs += [prec]
            recalls += [recall]
            f1s += [f1]

        # Generate the report for this fold
        metrics.generate_report(accs, acc_bals, precs, recalls, f1s, [classifier], '../results/model-run' + str(fold) + '.txt')

I liked the premise but this movie was exhaustingly slower than it  needed to be. By the time it was at the end I didn't care what was happening.
Upfront: it's missing the sweet, innocent, magical and heartwarming charm of &#34;The Fellowship.&#34; I saw all those familiar places and familiar faces: Gandalf, Bag End, The Shire, Elrond, Rivendell. I missed Strider and the original hobbits. I missed the danger they were running from, the reason for their journey. The tone of this film is different, which is perfectly fine, I just wasn't really expecting that difference. It was harder to reconcile in this film. In the 2nd and 3rd films, when we get away from all the old and familiar and into the new and different they were much easier to get into and enjoy on their own.<br /><br />Everyone complains that it's long, but I don't have a problem with length if it fleshes out the story. OTOH, contriving shots to stuff into already over-extended battle/fight scenes definitely adds to the weight