## Boushra Almazroua - Sentiment Analysis 
***

### Project Description
For this project, we were asked to analyze a number of product reviews and categorize each one as positive, negative, or neutral. Below is a sentiment analysis tool, also known as a sentiment analyzer: a natural language processing (NLP) tool that takes in a piece of text and evaluates the sentiment or emotion associated with it.
***

### Dataset Used
The dataset that was sent to us is a CSV file that has customer reviews for a number of different products. It includes information on the products, including their brands, categories, manufacturers, dates of reviews, URLs, ratings, and textual descriptions of the reviews.
***

To analyze the data, I used the *TextBlob* library, which provides a range of NLP capabilities. The analysis works as follows:

1. A text input is passed to the sentiment_analyzer function, which analyzes it for sentiment.

2. The function creates a TextBlob object, which offers numerous ways to examine textual data.

3. The text's polarity is read from the sentiment property of the TextBlob object. The polarity scale goes from -1 (most negative) through 0 (neutral) to 1 (most positive).

4. Based on the polarity score, the sentiment is categorized as "positive" if the polarity is greater than 0, "negative" if the polarity is less than 0, and "neutral" if the polarity is exactly 0.


5. Once the sentiment_analyzer function has been defined, it is tested with example text inputs. The function receives each input text, processes it, and the resulting sentiment category is printed.


In [16]:
from textblob import TextBlob

def sentiment_analyzer(text):
    # Create a TextBlob object
    blob = TextBlob(text)
    
    # Get the sentiment polarity (-1 for most negative, 0 for neutral, 1 for most positive)
    polarity = blob.sentiment.polarity
    
    # Classify sentiment based on polarity
    if polarity > 0:
        return "positive"
    elif polarity < 0:
        return "negative"
    else:
        return "neutral"

# Test the sentiment analyzer
text = "I love this product! It's amazing."
print(sentiment_analyzer(text))  

text = "This product is terrible. I regret buying it."
print(sentiment_analyzer(text))  

text = "The product arrived on time."
print(sentiment_analyzer(text))  

positive
negative
neutral
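
Because the classifier above treats only a polarity of exactly 0 as neutral, even a very slightly positive or negative score is assigned to one of the other two classes. The following is a minimal sketch of one possible variation, not part of the original analysis: it takes a polarity score directly and treats a small band around 0 as neutral. The helper name `classify_polarity` and the `tolerance` value of 0.05 are assumptions chosen for illustration.

```python
def classify_polarity(polarity, tolerance=0.05):
    """Map a polarity score in [-1, 1] to a sentiment label.

    Scores within +/- tolerance of zero are treated as neutral.
    The default tolerance is an illustrative assumption, not a tuned value.
    """
    if polarity > tolerance:
        return "positive"
    if polarity < -tolerance:
        return "negative"
    return "neutral"


# Try a few hand-picked polarity scores
for score in (0.6, 0.02, -0.03, -0.4):
    print(score, classify_polarity(score))
```

With `tolerance=0`, this reduces to the same rule as sentiment_analyzer, so the band width can be adjusted without changing the rest of the notebook.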


In [17]:
import pandas as pd

# Specify the data types for columns with mixed types
dtype_dict = {"name": str, "reviews.didPurchase": str}

# Read the CSV file with specified data types
df = pd.read_csv("data.csv", dtype=dtype_dict)

# Filter the DataFrame for the specific product id
productDF = df[df["id"] == "AVpjEN4jLJeJML43rpUe"]

# Display the first few rows of the filtered DataFrame
productDF.head()


Unnamed: 0,id,name,asins,brand,categories,keys,manufacturer,reviews.date,reviews.dateAdded,reviews.dateSeen,...,reviews.doRecommend,reviews.id,reviews.numHelpful,reviews.rating,reviews.sourceURLs,reviews.text,reviews.title,reviews.userCity,reviews.userProvince,reviews.username
14733,AVpjEN4jLJeJML43rpUe,Brand New Amazon Kindle Fire 16gb 7 Ips Displa...,B018Y225IA,Amazon,"Computers/Tablets & Networking,Tablets & eBook...","841667103143,0841667103143,brandnewamazonkindl...",Amazon,2016-08-10T00:00:00.000Z,,"2017-08-27T00:00:00Z,2017-08-09T00:00:00Z,2017...",...,True,,0.0,4.0,http://reviews.bestbuy.com/3545/5025500/review...,My kids have enjoyed using this device. They h...,Perfect for my kids,,,Coach
14734,AVpjEN4jLJeJML43rpUe,Brand New Amazon Kindle Fire 16gb 7 Ips Displa...,B018Y225IA,Amazon,"Computers/Tablets & Networking,Tablets & eBook...","841667103143,0841667103143,brandnewamazonkindl...",Amazon,2016-08-10T00:00:00.000Z,,"2017-08-27T00:00:00Z,2017-08-09T00:00:00Z,2017...",...,True,,2.0,5.0,http://reviews.bestbuy.com/3545/5025500/review...,This tablet is the perfect size and so easy to...,Great Tablet,,,gracie
14735,AVpjEN4jLJeJML43rpUe,Brand New Amazon Kindle Fire 16gb 7 Ips Displa...,B018Y225IA,Amazon,"Computers/Tablets & Networking,Tablets & eBook...","841667103143,0841667103143,brandnewamazonkindl...",Amazon,2016-08-07T00:00:00.000Z,,"2017-08-27T00:00:00Z,2017-08-09T00:00:00Z,2017...",...,True,,2.0,5.0,http://reviews.bestbuy.com/3545/5025500/review...,"Bought this for vacation electronics. Music, m...",All in One,,,YippySkippy
14736,AVpjEN4jLJeJML43rpUe,Brand New Amazon Kindle Fire 16gb 7 Ips Displa...,B018Y225IA,Amazon,"Computers/Tablets & Networking,Tablets & eBook...","841667103143,0841667103143,brandnewamazonkindl...",Amazon,2016-08-06T00:00:00.000Z,,"2017-08-27T00:00:00Z,2017-08-09T00:00:00Z,2017...",...,True,,0.0,4.0,http://reviews.bestbuy.com/3545/5025500/review...,Easy access to book reader. Love watching my N...,Excellent Book Reader,,,dicksquared
14737,AVpjEN4jLJeJML43rpUe,Brand New Amazon Kindle Fire 16gb 7 Ips Displa...,B018Y225IA,Amazon,"Computers/Tablets & Networking,Tablets & eBook...","841667103143,0841667103143,brandnewamazonkindl...",Amazon,2016-08-06T00:00:00.000Z,,"2017-08-27T00:00:00Z,2017-08-09T00:00:00Z,2017...",...,True,,0.0,5.0,http://reviews.bestbuy.com/3545/5025500/review...,Upgraded; easy to use; thinner; very happy wit...,Kindle Fire 7,,,Papaw


### Product Summary: 

In [18]:
import pandas as pd

# Describe the data
print("Product Name: " + str(productDF["name"].iloc[0]))  # Access the first element of the "name" column
print("Number of reviews: " + str(len(productDF)))
print("Length of shortest review: " + str(productDF["reviews.text"].str.len().min()))  # Calculate the length of the shortest review


Product Name: Brand New Amazon Kindle Fire 16gb 7 Ips Display Tablet Wifi 16 Gb Blue,,,
Number of reviews: 1038
Length of shortest review: 50


In [19]:
# Keep the product name, product brand, review rating, review title, and review text

productDF = productDF.drop(["categories", "asins", "keys", "manufacturer", "reviews.date", "reviews.dateAdded",
                            "reviews.dateSeen", "reviews.didPurchase", "reviews.doRecommend", "reviews.id",
                            "reviews.numHelpful", "reviews.sourceURLs",
                            "reviews.userCity", "reviews.userProvince", "reviews.username", "id"], axis=1)
#productDF.head()

In [20]:
productDF.loc[:, "sentiment"] = productDF["reviews.text"].apply(sentiment_analyzer)
#productDF.head()

### Positive Reviews

In [21]:
num_positive_reviews = len(productDF[productDF["sentiment"] == "positive"])
total_reviews = len(productDF)
positive_review_rate = num_positive_reviews / total_reviews
print("Number of positive reviews: {}".format(num_positive_reviews))
print("Rate of positive reviews: {:.2%}".format(positive_review_rate))

productDF[productDF["sentiment"] == "positive"].head()

Number of positive reviews: 911
Rate of positive reviews: 87.76%


Unnamed: 0,name,brand,reviews.rating,reviews.text,reviews.title,sentiment
14733,Brand New Amazon Kindle Fire 16gb 7 Ips Displa...,Amazon,4.0,My kids have enjoyed using this device. They h...,Perfect for my kids,positive
14734,Brand New Amazon Kindle Fire 16gb 7 Ips Displa...,Amazon,5.0,This tablet is the perfect size and so easy to...,Great Tablet,positive
14736,Brand New Amazon Kindle Fire 16gb 7 Ips Displa...,Amazon,4.0,Easy access to book reader. Love watching my N...,Excellent Book Reader,positive
14737,Brand New Amazon Kindle Fire 16gb 7 Ips Displa...,Amazon,5.0,Upgraded; easy to use; thinner; very happy wit...,Kindle Fire 7,positive
14738,Brand New Amazon Kindle Fire 16gb 7 Ips Displa...,Amazon,4.0,I got this tablet on a deal and has good quali...,Good tablet to watch videos,positive


### Negative Reviews

In [22]:
num_negative_reviews = len(productDF[productDF["sentiment"] == "negative"])
negative_review_rate = num_negative_reviews / total_reviews
print("Number of negative reviews: {}".format(num_negative_reviews))
print("Rate of negative reviews: {:.2%}".format(negative_review_rate))

productDF[productDF["sentiment"] == "negative"].head()

Number of negative reviews: 59
Rate of negative reviews: 5.68%


Unnamed: 0,name,brand,reviews.rating,reviews.text,reviews.title,sentiment
14735,Brand New Amazon Kindle Fire 16gb 7 Ips Displa...,Amazon,5.0,"Bought this for vacation electronics. Music, m...",All in One,negative
14744,Brand New Amazon Kindle Fire 16gb 7 Ips Displa...,Amazon,3.0,Too slow for games and videos i bought it for ...,Ok for what you pay,negative
14747,Brand New Amazon Kindle Fire 16gb 7 Ips Displa...,Amazon,4.0,This was a birthday gift for a family member w...,Nice gift idea,negative
14773,Brand New Amazon Kindle Fire 16gb 7 Ips Displa...,Amazon,5.0,fits easily into my purse or bag. small compac...,love it!,negative
14803,Brand New Amazon Kindle Fire 16gb 7 Ips Displa...,Amazon,1.0,"Difficult to purchase. No stock in 6 stores, n...",Returned,negative
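
Several of the rows above pair a high star rating with a negative label. One way to surface such cases for manual inspection is to compare the label against the rating. The sketch below is an illustration only: the helper `flag_mismatches` and the cutoff of 4.0 stars for a "likely positive" review are assumptions, and the sample rows copy only the ratings, titles, and labels shown in the table above.

```python
def flag_mismatches(reviews, min_positive_rating=4.0):
    """Return reviews labelled negative despite a high star rating.

    min_positive_rating is an assumed cutoff, not a value from the dataset.
    """
    return [
        r for r in reviews
        if r["sentiment"] == "negative" and r["rating"] >= min_positive_rating
    ]


# Ratings, titles, and labels copied from the negative-review rows shown above
rows = [
    {"rating": 5.0, "title": "All in One", "sentiment": "negative"},
    {"rating": 3.0, "title": "Ok for what you pay", "sentiment": "negative"},
    {"rating": 4.0, "title": "Nice gift idea", "sentiment": "negative"},
    {"rating": 5.0, "title": "love it!", "sentiment": "negative"},
    {"rating": 1.0, "title": "Returned", "sentiment": "negative"},
]

# Print the reviews whose label disagrees with the star rating
for r in flag_mismatches(rows):
    print(r["rating"], r["title"])
```

Reviews flagged this way are candidates for a closer look; a high rating does not guarantee that the text itself is positive.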


### Neutral Reviews

In [23]:
num_neutral_reviews = len(productDF[productDF["sentiment"] == "neutral"])
neutral_review_rate = num_neutral_reviews / total_reviews
print("Number of neutral reviews: {}".format(num_neutral_reviews))
print("Rate of neutral reviews: {:.2%}".format(neutral_review_rate))

productDF[productDF["sentiment"] == "neutral"].head()


Fixes: some reviews that read as positive are classified as negative, for example the 5-star reviews titled "All in One" and "love it!" in the negative reviews table. The neutral count originally came out as 0 (0.00%) because the cell compared against the misspelled label "netural", while sentiment_analyzer returns "neutral", so the filter matched no rows; the cell above uses the corrected spelling. Because every review receives exactly one of the three labels, the neutral count should be 1038 - 911 - 59 = 68, or about 6.55% of the reviews.
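
Counting each class with its own filter repeats the label string in several places, which makes a misspelled label easy to miss. As a rough sketch of an alternative, the counts and rates for all three classes can be computed in one pass from a single list of labels. The helper `summarize_labels`, the `LABELS` tuple, and the sample data are hypothetical and are not taken from the dataset.

```python
from collections import Counter

# The three labels returned by sentiment_analyzer, defined once
LABELS = ("positive", "negative", "neutral")


def summarize_labels(labels):
    """Return {label: (count, rate)} for each expected label."""
    labels = list(labels)
    counts = Counter(labels)
    total = len(labels)
    return {
        label: (counts[label], counts[label] / total if total else 0.0)
        for label in LABELS
    }


# Small made-up sample; in the notebook, productDF["sentiment"] could be passed instead
sample = ["positive", "positive", "negative", "neutral"]
for label, (count, rate) in summarize_labels(sample).items():
    print("Number of {} reviews: {} ({:.2%})".format(label, count, rate))
```

Because the labels live in one tuple, a typo would affect every class at once and show up immediately as all-zero counts.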