## Overview
This project analyses the sentiment of Amazon reviewers towards 50 products in each of the following 3 product lines: phones, consoles and cameras, for reviews written between 2006 and 2018. It also measures the percentage of positive, negative and neutral sentiment towards specific product features and presents the results as a Tableau dashboard: https://public.tableau.com/profile/arya4413#!/vizhome/ReviewAnlayzerTool/ExecutiveSummary

Several functions are defined throughout the notebook to keep the code user-friendly. Each function has a specific task, such as importing the "camera", "console" or "phone" data, scoring sentiment, or exporting the results.

## Data Source
The data was extracted using existing APIs and ASINs, saved as a pickle file, and then converted to separate CSV files. Since data cleaning is not covered in this project, I have uploaded both the raw and the processed files for each of the 3 product lines in the attached Data.rar file. Users can run the notebook directly on the "...processed" files.
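
The pickle-to-CSV conversion mentioned above is not part of this notebook. A minimal sketch of how such a step might look, assuming the pickle holds a pandas DataFrame; the helper `pickle_to_csv`, the file names and the toy rows are all hypothetical:

```python
import os
import tempfile

import pandas as pd


def pickle_to_csv(pickle_path, csv_path):
    """Load a pickled DataFrame and write it out as CSV without the index."""
    df = pd.read_pickle(pickle_path)
    df.to_csv(csv_path, index=False)
    return df


# Round trip on a throwaway file so the sketch runs without the real data.
with tempfile.TemporaryDirectory() as tmp_dir:
    pickle_path = os.path.join(tmp_dir, "reviews_raw.pkl")  # hypothetical name
    csv_path = os.path.join(tmp_dir, "reviews_raw.csv")     # hypothetical name
    toy_reviews = pd.DataFrame({"asin": ["X1", "X2"], "rating": [5, 2]})  # made-up rows
    toy_reviews.to_pickle(pickle_path)
    pickle_to_csv(pickle_path, csv_path)
    round_trip = pd.read_csv(csv_path)

print(round_trip)
```

Writing with `index=False` keeps extra "Unnamed: 0" columns from appearing when the CSV is read back.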

## Import Packages

In [1]:
# import libraries
import pandas as pd
import os, time
from textblob import TextBlob
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

## Define Functions

In [5]:
# loading a product line's processed reviews and scoring them with VADER and TextBlob
def analyse_sentiments(filename):
    df = pd.read_csv(filename + '.csv')
    df = vader(df)
    df = textblob(df)
    return df

# analyzing sentiments through VADER (compound score, between -1 and 1)
def vader(df):
    vader_obj = SentimentIntensityAnalyzer()
    df['VADER_score'] = df['processed'].astype(str).apply(
        lambda text: vader_obj.polarity_scores(text)['compound'])
    return df

# analyzing polarity and subjectivity through TextBlob
def textblob(df):
    # compute each review's sentiment once and reuse it for both columns
    sentiments = df['processed'].astype(str).apply(lambda text: TextBlob(text).sentiment)
    df['TEXTBLOB_score'] = sentiments.apply(lambda s: s[0])      # polarity
    df['Subjectivity_score'] = sentiments.apply(lambda s: s[1])  # subjectivity
    return df

# combining the scores: use VADER unless it is exactly 0, then fall back to TextBlob
def process_sentiment(df):
    df['Combined_Sentiment_Score'] = df.apply(
        lambda x: x['TEXTBLOB_score'] if x['VADER_score'] == 0 else x['VADER_score'],
        axis=1)
    # labelling each review by the sign of the combined score
    df['Sentiment'] = df.apply(
        lambda x: 'Positive' if x['Combined_Sentiment_Score'] > 0 else
                  ('Negative' if x['Combined_Sentiment_Score'] < 0 else 'Neutral'),
        axis=1)
    return df

# exporting a dataframe to '<out_filename>.csv' (the index is written as an unnamed first column)
def export_to_csv(df, out_filename):
    df.to_csv(out_filename + '.csv')
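
The fallback rule in `process_sentiment` can be checked on a few hand-picked scores without loading any review data. A minimal sketch in plain Python; `combine_scores` and `label_sentiment` are hypothetical helper names that mirror the rule rather than functions from this notebook, and the score pairs are taken from the sample rows shown in the outputs further down:

```python
def combine_scores(vader_score, textblob_score):
    """Prefer the VADER score; fall back to TextBlob when VADER is exactly 0."""
    return textblob_score if vader_score == 0 else vader_score


def label_sentiment(score):
    """Map a combined score to a label by its sign."""
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


# (VADER_score, TEXTBLOB_score) pairs copied from the head() outputs
pairs = [(-0.5096, 0.6), (0.0, 0.1), (0.0, 0.0)]
labels = [label_sentiment(combine_scores(v, t)) for v, t in pairs]
print(labels)
```

Note that the two scorers can disagree: in the first pair VADER is negative while TextBlob is positive, and the rule always sides with VADER when it is non-zero.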

## Execute Functions

In [6]:
### Camera

# calculating sentiments of the product lines
camera_df = analyse_sentiments('Camera Final Processed Data')

# subsetting relevant columns to analyse the accuracy of sentiments
camera_subset_df = camera_df[['processed','VADER_score','TEXTBLOB_score']]

# processing final sentiment scores and sentiments
camera_df = process_sentiment(camera_df)

# exporting to csv
export_to_csv(camera_df,'camera')

### Consoles

# calculating sentiments of the product lines
console_df = analyse_sentiments('Console Final Processed Data')

# subsetting relevant columns to analyse the accuracy of sentiments
console_subset_df = console_df[['processed','VADER_score','TEXTBLOB_score']]

# processing final sentiment scores and sentiments
console_df = process_sentiment(console_df)

# exporting to csv
export_to_csv(console_df,'console')

### Phones

# calculating sentiments of the product lines
phone_df = analyse_sentiments('Phone Final Processed Data')

# subsetting relevant columns to analyse the accuracy of sentiments
phone_subset_df = phone_df[['processed','VADER_score','TEXTBLOB_score']]

# processing final sentiment scores and sentiments
phone_df = process_sentiment(phone_df)

# exporting to csv
export_to_csv(phone_df,'phone')

In [7]:
console_df.head()

Unnamed: 0.2,Unnamed: 0,index,Unnamed: 0.1,product_reviewer_index,product,brand,date,rating,reviewer_name,summary,...,Document_No,Dominant_Topic,Topic_Perc_Contrib,Keywords,Text,VADER_score,TEXTBLOB_score,Subjectivity_score,Combined_Sentiment_Score,Sentiment
0,0,1,0,0,Nintendo 2DS XL,Nintendo,2019-12-03,5.0,BCH,Love this thing!,...,0,5.0,0.4691,"easy, drive, amazon, fast, code, hard, set, ti...","['speaker', 'placement', 'sound']",-0.5096,0.6,0.575,-0.5096,Negative
1,1,3,1,1,Nintendo 2DS XL,Nintendo,2019-06-29,2.0,Dizzy Turney,More time charging than playing,...,1,3.0,0.62,"time, day, christmas, gift, problem, son, year...","['day', 'ago', 'decide', 'upgrade', 'original']",0.3182,0.375,0.75,0.3182,Positive
2,2,4,1,1,Nintendo 2DS XL,Nintendo,2019-06-29,2.0,Dizzy Turney,More time charging than playing,...,2,3.0,0.895,"time, day, christmas, gift, problem, son, year...","['cheaper', 'quality', 'original', 'far', 'spe...",0.3182,0.14375,0.5125,0.3182,Positive
3,3,5,1,1,Nintendo 2DS XL,Nintendo,2019-06-29,2.0,Dizzy Turney,More time charging than playing,...,3,2.0,0.8808,"controller, month, battery, return, wait, year...","['battery', 'die', 'warn', 'hour', 'lose', 'da...",-0.783,0.0,0.0,-0.783,Negative
4,4,8,3,3,Nintendo 2DS XL,Nintendo,2019-01-04,5.0,Amazon Customer,Would recommend! :),...,4,3.0,0.8301,"time, day, christmas, gift, problem, son, year...","['stay', 'charge', 'lengthy', 'time']",0.0,0.0,0.0,0.0,Neutral


In [10]:
camera_df.head()

Unnamed: 0.2,Unnamed: 0,index,Unnamed: 0.1,asin,product,brand,date,vote,rating,reviewer_id,...,Document_No,Dominant_Topic,Topic_Perc_Contrib,Keywords,Text,VADER_score,TEXTBLOB_score,Subjectivity_score,Combined_Sentiment_Score,Sentiment
0,0,0,1,B00I8BIBCW,DSCW800 Digital Camera Black,Sony,20/5/2014,244.0,5,A17WWPEMFJAX5E,...,0,0,0.4779,"video, card, link, iso, minute, 10, tz5, peopl...","['occasional', 'photographer', 'meet']",0.0,0.0,0.125,0.0,Neutral
1,1,2,1,B00I8BIBCW,DSCW800 Digital Camera Black,Sony,20/5/2014,244.0,5,A17WWPEMFJAX5E,...,1,3,0.4697,"battery, lens, wide, time, life, feature, angl...","['rechargeable', 'li', 'battery', 'improvement...",0.4588,0.1,0.2,0.4588,Positive
2,2,3,1,B00I8BIBCW,DSCW800 Digital Camera Black,Sony,20/5/2014,244.0,5,A17WWPEMFJAX5E,...,2,1,0.8964,"light, low, picture, zoom, flash, photo, focus...","['feature', 'corrects', 'shake', 'blurry', 'ph...",0.5106,0.15,0.2,0.5106,Positive
3,3,5,1,B00I8BIBCW,DSCW800 Digital Camera Black,Sony,20/5/2014,244.0,5,A17WWPEMFJAX5E,...,3,3,0.4698,"battery, lens, wide, time, life, feature, angl...","['expensive', 'case', 'drop', 'lake', 'bridge'...",-0.1621,-0.083333,0.422222,-0.1621,Negative
4,4,6,2,B00I8BIBCW,DSCW800 Digital Camera Black,Sony,14/6/2018,,5,A1R122NT8D7DHT,...,4,4,0.8387,"mode, manual, set, auto, flash, setting, contr...","['item', 'arrive', 'time', 'described']",0.0,0.0,0.0,0.0,Neutral


In [13]:
phone_df.head()

Unnamed: 0.2,Unnamed: 0,index,Unnamed: 0.1,product_reviewer_index,product,brand,date,rating,reviewer_name,summary,...,tokenised,Document_No,Dominant_Topic,Topic_Perc_Contrib,Keywords,VADER_score,TEXTBLOB_score,Subjectivity_score,Combined_Sentiment_Score,Sentiment
0,0,0,0,0,iPhone 7,Apple,11/5/2017,5,Sayed Elgamal,Good deal,...,"['cosmetically', 'phoe', 'condition', 'look']",0,3,0.8015,"screen, camera, scratch, el, android, la, qual...",0.4404,0.418182,0.527273,0.4404,Positive
1,1,3,2,2,iPhone 7,Apple,10/23/2019,5,Wil C.,"Second iPhone (had and 8), but this works just...",...,"['replan', 'company', 'turn']",1,5,0.7403,"sim, card, charge, day, charger, time, old, bo...",0.0,0.0,0.0,0.0,Neutral
2,2,10,3,3,iPhone 7,Apple,12/25/2018,5,Thop,Looks New - Works Fine,...,"['battery', '84', 'expect', 'refurbish', 'easi...",2,4,0.5098,"battery, life, update, fast, long, day, charge...",0.4767,0.216667,0.479167,0.4767,Positive
3,3,13,4,4,iPhone 7,Apple,3/19/2020,1,BigPappiG,NOT AS ADVERTISED,...,"['advertised', 'completely', 'unlocked', 'usea...",3,2,0.8943,"unlocked, verizon, amazon, service, mobile, ne...",0.0,0.1,0.4,0.1,Positive
4,4,15,4,4,iPhone 7,Apple,3/19/2020,1,BigPappiG,NOT AS ADVERTISED,...,"['spent', 'hour', 'set', 'tell', 'wrong']",4,1,0.5173,"price, old, fingerprint, return, year, reader,...",-0.4767,-0.3,0.5,-0.4767,Negative


Users of this notebook can use the "Sentiment" column in each of the 3 dataframes to analyse reviewers' sentiment towards the products, for example with a group by.
I have done similar processing in Tableau and published it on Tableau Public. The link is at the top of the notebook.
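
The group by suggested above might look like the following. This is a minimal sketch on a toy dataframe, not the notebook's own analysis: `toy_df` and its values are made up, and it only assumes the "product" and "Sentiment" columns that appear in the outputs above.

```python
import pandas as pd

# Toy stand-in for camera_df / console_df / phone_df (illustrative values only)
toy_df = pd.DataFrame({
    "product": ["A", "A", "A", "B", "B"],
    "Sentiment": ["Positive", "Negative", "Positive", "Neutral", "Positive"],
})

# Share of each sentiment label per product, as percentages
sentiment_pct = (
    toy_df.groupby("product")["Sentiment"]
    .value_counts(normalize=True)
    .mul(100)
    .unstack(fill_value=0)
)
print(sentiment_pct.round(1))
```

Replacing "product" with another column, such as "brand" or "Dominant_Topic", gives the same breakdown by brand or by topic.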