### __Sentiment Analysis__
Sentiment analysis is the process of determining the emotional tone or opinion expressed in a piece of text. It involves classifying text into positive, negative, or neutral categories based on the sentiment conveyed by the words and phrases used.

### __Polarity__
Polarity measures how positive or negative a text is. Scores typically range from -1 to 1: -1 indicates a strongly negative sentiment, 0 a neutral sentiment, and 1 a strongly positive sentiment.

### __Subjectivity__
Subjectivity measures how much a text expresses personal opinions, beliefs, or feelings rather than factual information. Scores usually range from 0 to 1: 0 indicates a fully objective text and 1 a fully subjective text.
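
To make the two scores concrete, the sketch below averages per-word polarity and subjectivity values from a tiny hand-written lexicon. The lexicon values and the `score_text` helper are invented for illustration and are not TextBlob's actual data or API; TextBlob's real analyzer uses a much larger lexicon and also accounts for modifiers such as intensifiers and negation.

```python
# Toy lexicon: word -> (polarity in [-1, 1], subjectivity in [0, 1]).
# These values are made up for illustration only.
TOY_LEXICON = {
    "great": (0.8, 0.75),
    "terrible": (-1.0, 1.0),
    "stale": (-0.5, 0.5),
    "canned": (0.0, 0.0),
}

def score_text(text):
    """Average the scores of the words that appear in the toy lexicon."""
    hits = [TOY_LEXICON[w] for w in text.lower().split() if w in TOY_LEXICON]
    if not hits:
        # No known words: treat the text as neutral and objective.
        return 0.0, 0.0
    polarity = sum(p for p, _ in hits) / len(hits)
    subjectivity = sum(s for _, s in hits) / len(hits)
    return polarity, subjectivity

print(score_text("great taffy at a great price"))
print(score_text("the bags were stale and terrible"))
```

Words that are not in the lexicon contribute nothing, which is why short factual sentences tend to score close to 0 on both scales.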

In [2]:
# Install required libraries
!pip install spacytextblob



In [3]:
# Import necessary libraries
import spacy
import pandas as pd
from spacytextblob.spacytextblob import SpacyTextBlob

In [4]:
# Load the spaCy model
nlp = spacy.load('en_core_web_lg')
# nlp = spacy.load('en_core_web_sm')

In [5]:
# View current spacy pipeline
print(nlp.pipe_names)

['tok2vec', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner']


In [6]:
# Add SpacyTextBlob to the pipeline
nlp.add_pipe('spacytextblob')

<spacytextblob.spacytextblob.SpacyTextBlob at 0x324bde710>

In [7]:
print(nlp.pipe_names)

['tok2vec', 'tagger', 'parser', 'attribute_ruler', 'lemmatizer', 'ner', 'spacytextblob']


In [1]:
# Text Preprocessing
def preprocess_text(text):
    # Create a spaCy document
    doc = nlp(text)
    
    # Remove stopwords and punctuation
    filtered_tokens = [token for token in doc if not token.is_stop and not token.is_punct]
    
    # Lemmatize the tokens
    lemmatized_tokens = [token.lemma_ for token in filtered_tokens]
    
    # Join the lemmatized tokens back into a string
    preprocessed_text = ' '.join(lemmatized_tokens)
    
    return preprocessed_text
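
Some reviews in this dataset contain HTML such as `<br />`, which survives the preprocessing above. One possible fix, sketched here under the assumption that tags can simply be dropped, is to clean the raw text before it reaches spaCy. The `clean_markup` helper is hypothetical and uses only the standard library.

```python
import html
import re

def clean_markup(text):
    """Strip HTML tags, decode entities, and collapse repeated whitespace."""
    text = re.sub(r"<[^>]+>", " ", text)      # replace tags such as <br /> with a space
    text = html.unescape(text)                 # decode entities such as &amp;
    return re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace

print(clean_markup("great taste, thanks Amazon.<br /><br />Staral"))
```

It could be applied with `df['Text'].apply(clean_markup)` before calling `preprocess_text`. Adding `and not token.is_space` to the token filter would be a complementary way to avoid the doubled spaces seen in the preprocessed output.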

In [8]:
# Sentiment Analysis
def analyze_sentiment(text):
    # Create a spaCy document
    doc = nlp(text)
    
    # Get the sentiment polarity and subjectivity
    polarity = doc._.blob.polarity
    subjectivity = doc._.blob.subjectivity
    
    # Determine the sentiment label
    if polarity > 0:
        sentiment = 'Positive'
    elif polarity < 0:
        sentiment = 'Negative'
    else:
        sentiment = 'Neutral'
    
    return sentiment, polarity, subjectivity
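
The function above labels a review as neutral only when polarity is exactly 0, so a score of 0.01 counts as positive. A common variation is to treat a small band around zero as neutral. The sketch below shows that idea; the `label_from_polarity` name and the 0.1 threshold are arbitrary assumptions, not values taken from this notebook.

```python
def label_from_polarity(polarity, threshold=0.1):
    """Map a polarity score to a label, treating near-zero scores as neutral."""
    if polarity >= threshold:
        return "Positive"
    if polarity <= -threshold:
        return "Negative"
    return "Neutral"

for score in (0.7, 0.05, -0.5):
    print(score, label_from_polarity(score))
```

The threshold would normally be tuned against a sample of hand-labeled reviews rather than fixed in advance.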

In [9]:
# Load in the dataset
df = pd.read_csv('Reviews.csv')

In [10]:
# Preprocess the text column
df['preprocessed_text'] = df['Text'].apply(preprocess_text)

In [11]:
# Analyze the sentiment of the preprocessed text
df['sentiment'], df['polarity'], df['subjectivity'] = zip(*df['preprocessed_text'].apply(analyze_sentiment))
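
The `zip(*...)` idiom in the cell above transposes a sequence of `(sentiment, polarity, subjectivity)` tuples into three separate sequences, one per column. A minimal illustration with plain Python lists and invented sample values:

```python
# Each entry mimics one return value of analyze_sentiment (sample values only).
results = [
    ("Positive", 0.7, 0.6),
    ("Negative", -0.5, 0.5),
    ("Neutral", 0.0, 0.0),
]

# zip(*results) regroups the rows into one tuple per field.
sentiments, polarities, subjectivities = zip(*results)

print(sentiments)
print(polarities)
print(subjectivities)
```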

In [12]:
# Print the first few rows of the DataFrame
df.head()

| Unnamed: 0 | Id | Summary | Text | preprocessed_text | sentiment | polarity | subjectivity |
|---|---|---|---|---|---|---|---|
| 0 | 1 | Good Quality Dog Food | I have bought several of the Vitality canned d... | buy Vitality can dog food product find good qu... | Positive | 0.7 | 0.6 |
| 1 | 2 | Not as Advertised | Product arrived labeled as Jumbo Salted Peanut... | product arrived label Jumbo Salted Peanuts pea... | Positive | 0.216667 | 0.762963 |
| 2 | 3 | "Delight" says it all | This is a confection that has been around a fe... | confection century light pillowy citrus gela... | Positive | 0.187 | 0.548 |
| 3 | 4 | Cough Medicine | If you are looking for the secret ingredient i... | look secret ingredient Robitussin believe find... | Positive | 0.15 | 0.65 |
| 4 | 5 | Great taffy | Great taffy at a great price. There was a wid... | great taffy great price wide assortment yumm... | Positive | 0.458333 | 0.6 |


In [15]:
# Show text of top 5 most negative reviews along with their polarity and preprocessed text
for index, row in df.sort_values('polarity').head().iterrows():
    print(f"Review: {row['Text']}")
    print(f"Preprocessed Text: {row['preprocessed_text']}")
    print(f"Polarity: {row['polarity']}")
    print()

Review: I purchased the Mango flavor, and to me it doesn't take like Mango at all.  There is no hint of sweetness, and unfortunately there is a hint or aftertaste almost like licorice.  I've been consuming various sports nutrition products for decades, so I'm familiar and have come to like the taste of the most of the products I've tried.  The mango flavor is one of the least appealing I've tasted.  It's not terrible, but it's bad enough that I notice the bad taste every sip I take.
Preprocessed Text: purchase Mango flavor like Mango   hint sweetness unfortunately hint aftertaste like licorice   consume sport nutrition product decade familiar come like taste product try   mango flavor appeal taste   terrible bad notice bad taste sip
Polarity: -0.5049999999999999

Review: Arrived in 6 days and were so stale i could not eat any of the 6 bags!!
Preprocessed Text: arrive 6 day stale eat 6 bag
Polarity: -0.5

Review: The Strawberry Twizzlers are my guilty pleasure - yummy. Six pounds will b

In [17]:
# Show the top 5 most positive reviews along with their polarity and preprocessed text
for index, row in df.sort_values('polarity', ascending=False).head().iterrows():
    print(f"Review: {row['Text']}")
    print(f"Preprocessed Text: {row['preprocessed_text']}")
    print(f"Polarity: {row['polarity']}")
    print()

Review: I can remember buying this candy as a kid and the quality hasn't dropped in all these years. Still a superb product you won't be disappointed with.
Preprocessed Text: remember buy candy kid quality drop year superb product will disappoint
Polarity: 1.0

Review: This offer is a great price and a great taste, thanks Amazon for selling this product.<br /><br />Staral
Preprocessed Text: offer great price great taste thank Amazon sell product.<br /><br />Staral
Polarity: 0.8

Review: This is great dog food, my dog has severs allergies and this brand is the only one that we can feed him.
Preprocessed Text: great dog food dog sever allergy brand feed
Polarity: 0.8

Review: Great product, nice combination of chocolates and perfect size!  The bags had plenty, and they were shipped promptly.  The kids in the neighborhood liked our candies!
Preprocessed Text: great product nice combination chocolate perfect size   bag plenty ship promptly   kid neighborhood like candy
Polarity: 0.79999999

__Strengths__:
- The pipeline labels each review as positive, negative, or neutral without requiring any labeled training data.
- It returns numeric polarity and subjectivity scores, so reviews can be ranked and compared rather than only labeled.
- Because there is no training step, the same code can be applied to a large review dataset as-is.

__Limitations__:
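- The scores come from TextBlob's default lexicon-based analyzer, while the spaCy model handles tokenization and lemmatization. Word-level scores may not capture nuance or context-specific sentiment.
- Sarcasm, irony, and complex language structures can be challenging for the model to interpret correctly.
- Preprocessing can discard useful cues: in one review above, stopword removal reduced "won't be disappointed" to "will disappoint".
- Results depend on the coverage of TextBlob's sentiment lexicon; words that are missing from it do not contribute to the score.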
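
The review output above shows stopword removal turning "won't be disappointed" into "will disappoint", which drops the negation. One way to soften this, sketched here with a hypothetical mini stopword list rather than spaCy's real one, is to exempt negation words from the filter. Whether this improves the scores on this dataset is untested.

```python
# Hypothetical mini stopword list; spaCy's real list is much longer.
STOPWORDS = {"you", "be", "with", "a", "the", "not", "no", "n't", "never"}
# Negation words to keep even though they appear in the stopword list.
NEGATIONS = {"not", "no", "n't", "never"}

def filter_tokens(tokens, keep_negations=True):
    """Drop stopwords, optionally keeping negation words."""
    kept = []
    for tok in tokens:
        low = tok.lower()
        if low in STOPWORDS and not (keep_negations and low in NEGATIONS):
            continue
        kept.append(tok)
    return kept

tokens = ["you", "wo", "n't", "be", "disappointed"]
print(filter_tokens(tokens, keep_negations=False))  # negation is lost
print(filter_tokens(tokens))                         # negation survives
```

In the notebook's own `preprocess_text`, the equivalent change would be an extra condition in the list comprehension that keeps tokens whose lowercase text is in a negation set.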