<a href="https://colab.research.google.com/github/VighneshS/sentiment_prediction/blob/master/sentiment_prediction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Sentiment Prediction using Naive Bayes Classifier (NBC)
This notebook shows how a Naive Bayes Classifier (NBC) works and how it can be used to classify text by sentiment.

We will also see how well it copes with missing data, that is, with words that do not appear in the training reviews of a class.

## Settings
Fraction of the data used for training.

In [755]:
training_ratio = 80 / 100

## Importing the Data
We used the [kaggle dataset](https://storage.googleapis.com/kagglesdsdata/datasets/22169/30047/sentiment%20labelled%20sentences/imdb_labelled.txt?X-Goog-Algorithm=GOOG4-RSA-SHA256&X-Goog-Credential=gcp-kaggle-com%40kaggle-161607.iam.gserviceaccount.com%2F20210425%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20210425T202010Z&X-Goog-Expires=259199&X-Goog-SignedHeaders=host&X-Goog-Signature=6133706ef10bc2dcd0b58f8398b4d73ab9e9d788de1718b07334df91f6007e1e4ca0b78e3176f95b8250e0c4535ce1633528f4fabffeb7e4124af3ee3f895ac34c03044fca9b23b23c4ddb8fa90d84dfc14869ff4806f03783cafad53b19445b3c3052983fdf1ca4384257eac1bc0a4270d238a1ea89d1289866c7a0ea7ad7c97a76f2e142c148019e39cc5a1295f92650747ac5ea5946b026f7ad6d5d262d4c4a370aee6bc1f5d5b445bb6d93692debe678a79e5e1c1fe3d3e68ea4f2fad3115795d3361e0626e98156fbc7f5967beb7cf0f00e07351d23a00d8677ebb75e3e13b1bfa07762266efabf6f6f9d53206be31b7623cf3614f60f8cf5011cf23def) to get the ground truth of sample IMDB reviews.

In [756]:
import numpy as np  # linear algebra
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)
import math

data = pd.read_csv(
    r"http://storage.googleapis.com/kagglesdsdata/datasets/22169/30047/sentiment%20labelled%20sentences/imdb_labelled.txt?X-Goog-Algorithm=GOOG4-RSA-SHA256&X-Goog-Credential=gcp-kaggle-com%40kaggle-161607.iam.gserviceaccount.com%2F20210425%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20210425T202010Z&X-Goog-Expires=259199&X-Goog-SignedHeaders=host&X-Goog-Signature=6133706ef10bc2dcd0b58f8398b4d73ab9e9d788de1718b07334df91f6007e1e4ca0b78e3176f95b8250e0c4535ce1633528f4fabffeb7e4124af3ee3f895ac34c03044fca9b23b23c4ddb8fa90d84dfc14869ff4806f03783cafad53b19445b3c3052983fdf1ca4384257eac1bc0a4270d238a1ea89d1289866c7a0ea7ad7c97a76f2e142c148019e39cc5a1295f92650747ac5ea5946b026f7ad6d5d262d4c4a370aee6bc1f5d5b445bb6d93692debe678a79e5e1c1fe3d3e68ea4f2fad3115795d3361e0626e98156fbc7f5967beb7cf0f00e07351d23a00d8677ebb75e3e13b1bfa07762266efabf6f6f9d53206be31b7623cf3614f60f8cf5011cf23def",
    delimiter="\t", header=None, names=["IMDB Review", "Sentiment"])
data = data.sample(frac=1).reset_index(drop=True)
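
One review shown later in the notebook contains literal `\t1\n` sequences, which suggests that several lines of the file were merged during parsing. A likely cause, assuming the file contains reviews that open with a double quote, is the parser's default quote handling. The sketch below uses the standard-library `csv` module on a small made-up sample (the `raw` string and the `read_rows` helper are hypothetical, not taken from the dataset) to show what `csv.QUOTE_NONE` changes.

```python
import csv
import io

# Hypothetical sample: the second review opens with a double quote that is
# only closed two lines later.
raw = (
    'A great film.\t1\n'
    '"Loved it.\t1\n'
    'This movie is BAD.\t0\n'
    'Awful."\t0\n'
    'Not recommended.\t0\n'
)


def read_rows(text, quoting):
    """Parse tab-separated text into a list of rows using the given quoting mode."""
    return list(csv.reader(io.StringIO(text), delimiter="\t", quoting=quoting))


# Default mode: a field that starts with a quote runs until the closing quote,
# even across line breaks, so several reviews collapse into one row.
default_rows = read_rows(raw, csv.QUOTE_MINIMAL)

# QUOTE_NONE: quote characters are treated as ordinary text, one row per line.
plain_rows = read_rows(raw, csv.QUOTE_NONE)

print(len(default_rows), len(plain_rows))
```

`pd.read_csv` accepts a `quoting` argument that takes the same constants, so passing `quoting=csv.QUOTE_NONE` in the cell above would be the analogous change. The row counts and accuracies shown in this notebook would then differ.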

### Split Data
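We split the shuffled data into training, development, and test sets: the first 80% of the rows go to training, and the remaining rows are split evenly between development and test.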

In [757]:
# Copy the slice so that adding columns later does not trigger SettingWithCopyWarning.
train = data[:math.floor(data.shape[0] * training_ratio)].copy()

In [758]:
validation = data[math.floor(data.shape[0] * training_ratio):].sample(frac=1).reset_index(drop=True)
dev, test = np.array_split(validation, 2)
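
The same three-way split can be sketched on a plain list to check the resulting sizes. This is only an illustration: `three_way_split` is a hypothetical helper, and the figure of 748 rows is an assumption inferred from the 598 training rows shown in the output. When the held-out part has an odd length, the sketch gives the extra row to the development set, which matches how `np.array_split` favours the earlier chunks.

```python
import math


def three_way_split(rows, training_ratio):
    """Return (train, dev, test) slices of an already shuffled sequence."""
    cut = math.floor(len(rows) * training_ratio)
    train, held_out = rows[:cut], rows[cut:]
    # The first half receives the extra element when the length is odd.
    middle = (len(held_out) + 1) // 2
    return train, held_out[:middle], held_out[middle:]


# 748 is an assumed row count, not a value stated in the notebook.
train_rows, dev_rows, test_rows = three_way_split(list(range(748)), 80 / 100)
print(len(train_rows), len(dev_rows), len(test_rows))
```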

In [759]:
train, dev, test

(                                           IMDB Review  Sentiment
 0    It was clear that she had the range and abilit...          1
 1    I was deeply impressed with the character he p...          1
 2          This scene is very strong and unpleasant.            0
 3                   Everything about it is just bad.            0
 4                                   Not recommended.            0
 ..                                                 ...        ...
 593  Crash is a depressing little nothing, that pro...          0
 594                   Definitely worth checking out.            1
 595  I guess I liked the details of his dysfunction...          1
 596                   It will drive you barking mad!            0
 597  The warmth it generates is in contrast to its ...          1
 
 [598 rows x 2 columns],
                                           IMDB Review  Sentiment
 0   Ursula Burton's portrayal of the nun is both t...          1
 1                                   

## Generation of Vocabulary list

In [760]:
def split_words(review):
    # Lowercase, strip punctuation and possessive 's, then split on whitespace.
    return (review.lower()
            .replace(',', '').replace('"', '').replace('(', '').replace(')', '')
            .replace('\'s', '').replace('.', '').replace('!', '')
            .replace('-', ' ').replace('/', ' ')
            .split())


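def get_word_count(review_data_frame: pd.DataFrame, column_name: str):
    # Count each word per review, then sum the per-review counts over all reviews.
    # pd.Series(...).value_counts() replaces the top-level pd.value_counts helper.
    vocab = review_data_frame["IMDB Review"].apply(
        lambda review: pd.Series(split_words(review)).value_counts()).sum(axis=0).to_frame()
    vocab.columns = [column_name]
    # Move the words out of the index into a regular 'Word' column.
    vocab.reset_index(inplace=True)
    vocab = vocab.rename(columns={'index': 'Word'})
    return vocab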

In [761]:
vocabulary = get_word_count(train, "Frequency")
vocabulary

Unnamed: 0,Word,Frequency
0,had,13.0
1,the,668.0
2,was,157.0
3,it,257.0
4,she,8.0
...,...,...
2761,mad,1.0
2762,barking,1.0
2763,austere,1.0
2764,generates,1.0
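
The same counts can be built without a wide intermediate DataFrame. The sketch below is a minimal alternative using `collections.Counter`; `tokenize` is a hypothetical stand-in that roughly mirrors `split_words`, and the two sample reviews are made up.

```python
from collections import Counter


def tokenize(review):
    """Hypothetical stand-in for split_words: lowercase and strip punctuation."""
    text = review.lower().replace("'s", "")
    for char in ',"().!':
        text = text.replace(char, "")
    for char in "-/":
        text = text.replace(char, " ")
    return text.split()


def count_words(reviews):
    """Accumulate word frequencies over an iterable of review strings."""
    counts = Counter()
    for review in reviews:
        counts.update(tokenize(review))
    return counts


counts = count_words(["The cast was great.", "The plot was not great!"])
print(counts.most_common(3))
```

A `Counter` keeps one integer per distinct word, whereas the `apply` approach creates one column per word for every review before summing.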


### Probability of the word
P(word) = frequency of the word across all training reviews / total number of words in the training reviews
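
As a quick check of this ratio, the sketch below recomputes it for three words using the frequencies and the total of 11738 words reported in this notebook's outputs. The dictionary names are illustrative only.

```python
# Frequencies and total taken from the outputs shown in this notebook.
frequency = {"had": 13, "the": 668, "was": 157}
total_words = 11738

# P(word) = frequency of the word / total number of words
word_probability = {word: count / total_words for word, count in frequency.items()}

for word, probability in word_probability.items():
    print(word, round(probability, 6))
```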

### Total Number of words

In [762]:
total_words = vocabulary["Frequency"].sum(axis=0)
total_words

11738.0

In [763]:
total_sentiments = train.count(axis=0)['Sentiment']
total_sentiments

598

In [764]:
vocabulary['Word Probability'] = vocabulary["Frequency"].div(total_words)
vocabulary

Unnamed: 0,Word,Frequency,Word Probability
0,had,13.0,0.001108
1,the,668.0,0.056909
2,was,157.0,0.013375
3,it,257.0,0.021895
4,she,8.0,0.000682
...,...,...,...
2761,mad,1.0,0.000085
2762,barking,1.0,0.000085
2763,austere,1.0,0.000085
2764,generates,1.0,0.000085


### Conditional Probability based on sentiment
That is, P(word | sentiment), where sentiment is either positive (1) or negative (0).
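
For reference, the naive Bayes decision rule that these conditional probabilities feed into can be written as below. The symbols $s$ (sentiment), $w_i$ (the $i$-th word of a review), and $n$ (the number of words in the review) are notation introduced here for illustration, not names used in the notebook.

$$
\hat{s} = \arg\max_{s \in \{0, 1\}} \; P(s) \prod_{i=1}^{n} P(w_i \mid s)
$$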


In [765]:
positive_sentiments = train[train['Sentiment'] == 1]
positive_vocabulary = get_word_count(positive_sentiments, "Positive Sentiment Count")
vocabulary = vocabulary.merge(positive_vocabulary, how='left', on='Word')
vocabulary

Unnamed: 0,Word,Frequency,Word Probability,Positive Sentiment Count
0,had,13.0,0.001108,8.0
1,the,668.0,0.056909,288.0
2,was,157.0,0.013375,58.0
3,it,257.0,0.021895,114.0
4,she,8.0,0.000682,5.0
...,...,...,...,...
2761,mad,1.0,0.000085,
2762,barking,1.0,0.000085,
2763,austere,1.0,0.000085,1.0
2764,generates,1.0,0.000085,1.0


In [766]:
# Note: despite the name, this is the number of positive reviews, not the number of words in them.
total_positive_words = positive_sentiments.count(axis=0)['Sentiment']
total_positive_words

304

In [767]:
probability_of_positive_sentiments = total_positive_words / total_sentiments
probability_of_positive_sentiments

0.5083612040133779

In [768]:
vocabulary['Positive Sentiments Probability'] = vocabulary['Positive Sentiment Count'].div(total_positive_words)
vocabulary

Unnamed: 0,Word,Frequency,Word Probability,Positive Sentiment Count,Positive Sentiments Probability
0,had,13.0,0.001108,8.0,0.026316
1,the,668.0,0.056909,288.0,0.947368
2,was,157.0,0.013375,58.0,0.190789
3,it,257.0,0.021895,114.0,0.375000
4,she,8.0,0.000682,5.0,0.016447
...,...,...,...,...,...
2761,mad,1.0,0.000085,,
2762,barking,1.0,0.000085,,
2763,austere,1.0,0.000085,1.0,0.003289
2764,generates,1.0,0.000085,1.0,0.003289


In [769]:
negative_sentiments = train[train['Sentiment'] == 0]
negative_vocabulary = get_word_count(negative_sentiments, "Negative Sentiment Count")
vocabulary = vocabulary.merge(negative_vocabulary, how='left', on='Word')
vocabulary

Unnamed: 0,Word,Frequency,Word Probability,Positive Sentiment Count,Positive Sentiments Probability,Negative Sentiment Count
0,had,13.0,0.001108,8.0,0.026316,5.0
1,the,668.0,0.056909,288.0,0.947368,380.0
2,was,157.0,0.013375,58.0,0.190789,99.0
3,it,257.0,0.021895,114.0,0.375000,143.0
4,she,8.0,0.000682,5.0,0.016447,3.0
...,...,...,...,...,...,...
2761,mad,1.0,0.000085,,,1.0
2762,barking,1.0,0.000085,,,1.0
2763,austere,1.0,0.000085,1.0,0.003289,
2764,generates,1.0,0.000085,1.0,0.003289,


In [770]:
# Note: despite the name, this is the number of negative reviews, not the number of words in them.
total_negative_words = negative_sentiments.count(axis=0)['Sentiment']
total_negative_words

294

In [771]:
probability_of_negative_sentiments = total_negative_words / total_sentiments
probability_of_negative_sentiments



0.4916387959866221

In [772]:
vocabulary['Negative Sentiments Probability'] = vocabulary['Negative Sentiment Count'].div(total_negative_words)
vocabulary

Unnamed: 0,Word,Frequency,Word Probability,Positive Sentiment Count,Positive Sentiments Probability,Negative Sentiment Count,Negative Sentiments Probability
0,had,13.0,0.001108,8.0,0.026316,5.0,0.017007
1,the,668.0,0.056909,288.0,0.947368,380.0,1.292517
2,was,157.0,0.013375,58.0,0.190789,99.0,0.336735
3,it,257.0,0.021895,114.0,0.375000,143.0,0.486395
4,she,8.0,0.000682,5.0,0.016447,3.0,0.010204
...,...,...,...,...,...,...,...
2761,mad,1.0,0.000085,,,1.0,0.003401
2762,barking,1.0,0.000085,,,1.0,0.003401
2763,austere,1.0,0.000085,1.0,0.003289,,
2764,generates,1.0,0.000085,1.0,0.003289,,
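
Some values in the table above exceed 1 (for example 1.292517 for "the"), because each count is divided by the number of reviews in the class rather than by the number of words. A more common estimate divides by the total number of word tokens in the class, so that the values form a probability distribution. The sketch below illustrates that alternative; `class_word_probabilities` and the sample reviews are hypothetical and are not the method used in the cells above.

```python
from collections import Counter


def class_word_probabilities(tokenized_reviews):
    """Estimate P(word | class) as word count / total tokens in the class."""
    counts = Counter()
    for tokens in tokenized_reviews:
        counts.update(tokens)
    total_tokens = sum(counts.values())
    return {word: count / total_tokens for word, count in counts.items()}


# Made-up negative reviews, already tokenized.
negative_reviews = [
    ["the", "plot", "was", "bad"],
    ["the", "the", "acting", "was", "bad"],
]

probabilities = class_word_probabilities(negative_reviews)

# Dividing by the number of reviews instead can give a ratio above 1.
per_review_ratio = 3 / len(negative_reviews)

print(probabilities["the"], per_review_ratio)
```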


In [773]:
def get_probabilities(review: str, sentiment: bool, smoothening: bool):
    # Score a review for one class: class prior * product of per-word probabilities.
    prob = 1
    prior = probability_of_positive_sentiments if sentiment else probability_of_negative_sentiments
    column_name = 'Positive Sentiments Probability' if sentiment else 'Negative Sentiments Probability'
    # Starting value, used when the first word is not in the vocabulary.
    individual_prob = 0 if not smoothening else 1 / prior
    for word in split_words(review):
        if word in vocabulary['Word'].values:
            individual_prob = vocabulary[vocabulary['Word'] == word].iloc[0][column_name]
        # Note: individual_prob is not reset per word, so an out-of-vocabulary word
        # reuses the previous word's value. NaN (word unseen in this class) counts as 0.
        prob *= 0 if math.isnan(individual_prob) else individual_prob
    return prob * prior
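
Multiplying many small probabilities can underflow to 0.0 for long reviews; scores as small as 7e-271 already appear later in this notebook. A common remedy is to add logarithms instead of multiplying. The sketch below compares the two on made-up probabilities; `product_score` and `log_score` are hypothetical helpers, and the log version assumes every probability is strictly positive (for example after smoothing).

```python
import math


def product_score(probabilities, prior):
    """Plain product, as in get_probabilities; may underflow to 0.0."""
    score = prior
    for probability in probabilities:
        score *= probability
    return score


def log_score(probabilities, prior):
    """Sum of logs; requires all probabilities and the prior to be > 0."""
    return math.log(prior) + sum(math.log(p) for p in probabilities)


# Two made-up 400-word reviews with different per-word probabilities.
likely = [2e-3] * 400
unlikely = [1e-3] * 400

# Both products underflow, so the comparison between classes is lost.
print(product_score(likely, 0.5), product_score(unlikely, 0.5))

# The log scores stay finite and keep their ordering.
print(log_score(likely, 0.5) > log_score(unlikely, 0.5))
```

Because the logarithm is monotonic, comparing log scores selects the same class as comparing the products would if they could be represented exactly.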

In [774]:
train["Conditional Positive Probability"] = train["IMDB Review"].apply(
    lambda review: get_probabilities(review, True, False))
train["Conditional Negative Probability"] = train["IMDB Review"].apply(
    lambda review: get_probabilities(review, False, False))
train["Predicted sentiment"] = train["Conditional Positive Probability"] > train["Conditional Negative Probability"]
print("Train Accuracy: ",
      train.loc[train["Predicted sentiment"] == train["Sentiment"]].count(axis=0)['Sentiment'] * 100 /
      train.count(axis=0)['Sentiment'])
train.loc[train["Predicted sentiment"] != train["Sentiment"]]

Train Accuracy:  96.98996655518394



Unnamed: 0,IMDB Review,Sentiment,Conditional Positive Probability,Conditional Negative Probability,Predicted sentiment
96,This movie is so awesome!,1,5.455198e-06,1.08888e-05,False
158,I keep watching it over and over.,1,1.002496e-09,5.733873e-09,False
168,Go watch it!,1,7.426069e-05,9.682962e-05,False
178,the cast was great.,1,0.000151127,0.0001782412,False
181,But this movie really got to me.,1,1.477665e-08,1.374912e-07,False
183,Everything from acting to cinematography was s...,1,1.223778e-10,3.204293e-10,False
349,I don't think you will be disappointed.,1,5.552148e-11,3.894015e-10,False
354,This is just a great movie.,1,1.677473e-05,9.68288e-05,False
367,This is actually a very smart movie.,1,4.259512e-08,4.911355e-08,False
368,It is a very well acted and done TV Movie.,1,1.422757e-11,2.284682e-11,False


In [775]:
dev["Conditional Positive Probability"] = dev["IMDB Review"].apply(
    lambda review: get_probabilities(review, True, False))
dev["Conditional Negative Probability"] = dev["IMDB Review"].apply(
    lambda review: get_probabilities(review, False, False))
dev["Predicted sentiment"] = dev["Conditional Positive Probability"] > dev["Conditional Negative Probability"]
print("Dev Accuracy: ",
      dev.loc[dev["Predicted sentiment"] == dev["Sentiment"]].count(axis=0)['Sentiment'] * 100 / dev.count(axis=0)[
          'Sentiment'])
dev.loc[dev["Predicted sentiment"] != dev["Sentiment"]]

Dev Accuracy:  54.666666666666664


Unnamed: 0,IMDB Review,Sentiment,Conditional Positive Probability,Conditional Negative Probability,Predicted sentiment
0,Ursula Burton's portrayal of the nun is both t...,1,0.0,0.0,False
1,Go rent it.,1,1.237678e-05,2.766561e-05,False
3,"Mark my words, this is one of those cult films...",1,0.0,0.0,False
11,Julian Fellowes has triumphed again.,1,0.0,0.0,False
12,It presents a idyllic yet serious portrayal of...,1,0.0,0.0,False
15,"If you have not seen this movie, I definitely ...",1,1.374982e-13,2.943093e-12,False
16,I loved this movie it was a great portrayal of...,1,0.0,5.761066e-42,False
17,Brilliance indeed.,1,5.500792e-06,5.687894e-06,False
19,I really hope the team behind this movie makes...,1,0.0,0.0,False
25,I liked this movie way too much.,1,3.298359e-10,5.084819e-09,False


In [776]:
test["Conditional Positive Probability"] = test["IMDB Review"].apply(
    lambda review: get_probabilities(review, True, False))
test["Conditional Negative Probability"] = test["IMDB Review"].apply(
    lambda review: get_probabilities(review, False, False))
test["Predicted sentiment"] = test["Conditional Positive Probability"] > test["Conditional Negative Probability"]
print("Test Accuracy: ",
      test.loc[test["Predicted sentiment"] == test["Sentiment"]].count(axis=0)['Sentiment'] * 100 / test.count(axis=0)[
          'Sentiment'])
test.loc[test["Predicted sentiment"] != test["Sentiment"]]

Test Accuracy:  60.0


Unnamed: 0,IMDB Review,Sentiment,Conditional Positive Probability,Conditional Negative Probability,Predicted sentiment
75,add betty white and jean smart and you have a ...,1,0.0,0.0,False
78,(My mother and brother had to do this)When I s...,1,0.0,0.0,False
80,Judith Light is one of my favorite actresses a...,1,0.0,0.0,False
83,":) Anyway, the plot flowed smoothly and the ma...",1,0.0,0.0,False
85,He's a national treasure.,1,0.00367568,0.007031718,False
87,Also notable is John Bailey's fine crisp beaut...,1,0.0,0.0,False
88,"An interesting premise, and Billy Drago is alw...",1,0.0,0.0,False
91,The interplay between Martin and Emilio contai...,1,0.0,0.0,False
93,"Just consider the excellent story, solid actin...",1,1.737537e-17,2.411006e-16,False
97,"I think the most wonderful parts (literally, f...",1,0.0,0.0,False
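
Many of the misclassified rows above have a score of 0.0 for both classes. Because the prediction is `positive > negative`, such a tie is always labelled negative. The sketch below computes accuracy and counts ties at the same time; `accuracy_and_ties` is a hypothetical helper and the rows are made up for illustration.

```python
def accuracy_and_ties(rows):
    """rows: iterable of (label, positive_score, negative_score) tuples.

    Returns (accuracy in percent, number of tied scores).
    """
    rows = list(rows)
    correct = 0
    ties = 0
    for label, positive_score, negative_score in rows:
        predicted = positive_score > negative_score  # a tie predicts negative
        correct += int(predicted == bool(label))
        ties += int(positive_score == negative_score)
    return 100 * correct / len(rows), ties


# Made-up rows: two ties (one of each label) and two clear decisions.
sample_rows = [
    (1, 0.0, 0.0),
    (0, 0.0, 0.0),
    (1, 2e-5, 1e-5),
    (0, 1e-6, 3e-6),
]

accuracy, ties = accuracy_and_ties(sample_rows)
print(accuracy, ties)
```

Tracking ties separately makes it easier to see how much of the error comes from reviews that contain a word missing from both classes.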


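## Smoothing
We add one to every count so that a word missing from one class no longer forces that class's product to zero.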

In [777]:
# Add one to every overall word count and adjust the totals.
vocabulary["Frequency"] += 1
total_words += 2
total_sentiments += 2
vocabulary['Word Probability'] = vocabulary["Frequency"].div(total_words)

# Words never seen in positive reviews have a NaN count; after smoothing they count as 1.
vocabulary["Positive Sentiment Count"] += 1
vocabulary["Positive Sentiment Count"] = vocabulary["Positive Sentiment Count"].fillna(value=1)

total_positive_words += 2

probability_of_positive_sentiments = total_positive_words / total_sentiments

vocabulary['Positive Sentiments Probability'] = vocabulary['Positive Sentiment Count'].div(total_positive_words)

# Same adjustment for the negative class.
vocabulary["Negative Sentiment Count"] += 1
vocabulary["Negative Sentiment Count"] = vocabulary["Negative Sentiment Count"].fillna(value=1)

total_negative_words += 2

probability_of_negative_sentiments = total_negative_words / total_sentiments

vocabulary['Negative Sentiments Probability'] = vocabulary['Negative Sentiment Count'].div(total_negative_words)
vocabulary

Unnamed: 0,Word,Frequency,Word Probability,Positive Sentiment Count,Positive Sentiments Probability,Negative Sentiment Count,Negative Sentiments Probability
0,had,14.0,0.001193,9.0,0.029412,6.0,0.020270
1,the,669.0,0.056985,289.0,0.944444,381.0,1.287162
2,was,158.0,0.013458,59.0,0.192810,100.0,0.337838
3,it,258.0,0.021976,115.0,0.375817,144.0,0.486486
4,she,9.0,0.000767,6.0,0.019608,4.0,0.013514
...,...,...,...,...,...,...,...
2761,mad,2.0,0.000170,1.0,0.003268,2.0,0.006757
2762,barking,2.0,0.000170,1.0,0.003268,2.0,0.006757
2763,austere,2.0,0.000170,2.0,0.006536,1.0,0.003378
2764,generates,2.0,0.000170,2.0,0.006536,1.0,0.003378
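
For comparison, the textbook form of add-one (Laplace) smoothing adds 1 to each word count and adds the vocabulary size to the denominator, which keeps the per-class values summing to 1. This differs from the cell above, which adds 2 to the totals. The sketch below is a minimal version under that assumption; `laplace_probabilities`, its `alpha` parameter, and the sample counts are hypothetical.

```python
def laplace_probabilities(class_counts, vocabulary_words, alpha=1.0):
    """Return smoothed P(word | class) for every word in the vocabulary.

    class_counts: dict mapping word -> count within one class.
    vocabulary_words: all distinct words seen in training (any class).
    alpha: pseudo-count added to each word (1.0 gives add-one smoothing).
    """
    total = sum(class_counts.get(word, 0) for word in vocabulary_words)
    denominator = total + alpha * len(vocabulary_words)
    return {
        word: (class_counts.get(word, 0) + alpha) / denominator
        for word in vocabulary_words
    }


# Made-up counts: "bad" never occurs in the positive class.
vocabulary_words = ["great", "bad", "the"]
positive_counts = {"great": 3, "the": 5}

smoothed = laplace_probabilities(positive_counts, vocabulary_words)
print(smoothed)
```

With this form an unseen word receives a small non-zero probability, so it lowers a class score without wiping it out.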


In [778]:
train["Conditional Positive Probability"] = train["IMDB Review"].apply(
    lambda review: get_probabilities(review, True, True))
train["Conditional Negative Probability"] = train["IMDB Review"].apply(
    lambda review: get_probabilities(review, False, True))
train["Predicted sentiment"] = train["Conditional Positive Probability"] > train["Conditional Negative Probability"]
print("Train Accuracy: ",
      train.loc[train["Predicted sentiment"] == train["Sentiment"]].count(axis=0)['Sentiment'] * 100 /
      train.count(axis=0)['Sentiment'])
train.loc[train["Predicted sentiment"] != train["Sentiment"]]


Train Accuracy:  84.94983277591973



Unnamed: 0,IMDB Review,Sentiment,Conditional Positive Probability,Conditional Negative Probability,Predicted sentiment
0,It was clear that she had the range and abilit...,1,7.543952e-20,1.165596e-19,False
16,Overall I rate this movie a 10 out of a 1-10 s...,1,6.329398e-15,1.813160e-14,False
17,The movie had you on the edge of your seat and...,1,1.562233e-30,7.639549e-30,False
41,You'll love it! \t1\nThis movie is BAD. \t0\...,1,7.097510e-271,7.053394e-251,False
47,The transfers are very good.,1,2.102529e-06,2.127954e-06,False
...,...,...,...,...,...
558,"This movie contained an all-star cast, and wha...",1,1.574783e-35,2.255959e-34,False
560,"This is a very ""right on case"" movie that deli...",1,9.397772e-21,3.681174e-20,False
575,PS the only scene in the movie that was cool i...,1,5.660866e-26,9.348785e-26,False
576,The cast is good.,1,4.523723e-04,5.578888e-04,False


In [779]:
dev["Conditional Positive Probability"] = dev["IMDB Review"].apply(lambda review: get_probabilities(review, True, True))
dev["Conditional Negative Probability"] = dev["IMDB Review"].apply(
    lambda review: get_probabilities(review, False, True))
dev["Predicted sentiment"] = dev["Conditional Positive Probability"] > dev["Conditional Negative Probability"]
print("Dev Accuracy: ",
      dev.loc[dev["Predicted sentiment"] == dev["Sentiment"]].count(axis=0)['Sentiment'] * 100 / dev.count(axis=0)[
          'Sentiment'])
dev.loc[dev["Predicted sentiment"] != dev["Sentiment"]]

Dev Accuracy:  53.333333333333336


Unnamed: 0,IMDB Review,Sentiment,Conditional Positive Probability,Conditional Negative Probability,Predicted sentiment
0,Ursula Burton's portrayal of the nun is both t...,1,5.401182e-24,1.6612160000000003e-23,False
1,Go rent it.,1,2.865707e-05,4.930606e-05,False
3,"Mark my words, this is one of those cult films...",1,3.583004e-52,4.424043e-48,False
8,The camera really likes her in this movie.,1,1.973474e-09,2.097245e-09,False
12,It presents a idyllic yet serious portrayal of...,1,1.477837e-17,4.971419e-17,False
15,"If you have not seen this movie, I definitely ...",1,2.908177e-13,6.657487e-12,False
16,I loved this movie it was a great portrayal of...,1,1.539521e-41,2.5533410000000003e-39,False
17,Brilliance indeed.,1,2.178649e-05,2.252252e-05,False
19,I really hope the team behind this movie makes...,1,6.810893e-32,7.858053e-32,False
25,I liked this movie way too much.,1,8.431562e-10,1.279204e-08,False


In [780]:
test["Conditional Positive Probability"] = test["IMDB Review"].apply(
    lambda review: get_probabilities(review, True, True))
test["Conditional Negative Probability"] = test["IMDB Review"].apply(
    lambda review: get_probabilities(review, False, True))
test["Predicted sentiment"] = test["Conditional Positive Probability"] > test["Conditional Negative Probability"]
print("Test Accuracy: ",
      test.loc[test["Predicted sentiment"] == test["Sentiment"]].count(axis=0)['Sentiment'] * 100 / test.count(axis=0)[
          'Sentiment'])
test.loc[test["Predicted sentiment"] != test["Sentiment"]]

Test Accuracy:  65.33333333333333


Unnamed: 0,IMDB Review,Sentiment,Conditional Positive Probability,Conditional Negative Probability,Predicted sentiment
78,(My mother and brother had to do this)When I s...,1,2.95775e-36,7.595290999999999e-36,False
79,Also great directing and photography.,1,2.767734e-08,2.905988e-08,False
83,":) Anyway, the plot flowed smoothly and the ma...",1,2.340475e-15,1.134437e-11,False
85,He's a national treasure.,1,0.003919472,0.007320397,False
88,"An interesting premise, and Billy Drago is alw...",1,1.178261e-51,1.164876e-49,False
93,"Just consider the excellent story, solid actin...",1,7.326456000000001e-17,1.549679e-15,False
104,I believe the screenwriter did a good job of t...,1,6.18535e-13,2.044146e-12,False
106,Nothing short of magnificent photography/cinem...,1,7.924633e-12,1.06775e-10,False
109,This is a stunning movie.,1,2.261582e-05,0.0001876493,False
110,The art style has the appearance of crayon/pen...,1,2.264951e-15,9.130451e-15,False
