## Problem Definition

Customer reviews and feedback are crucial for any product on the market: they help sellers improve product quality and meet market expectations. When a product is sold in offline stores, a seller can collect feedback through one-to-one conversations with customers, but retrieving and analysing reviews is harder when the same product is sold online. E-commerce is a booming industry and a one-stop destination where sellers market and sell their products to a larger audience. Given a set of customer reviews for a mobile phone that is live on an e-commerce platform such as Flipkart or Amazon, with each review relating to a category (camera, battery, display, value for money, performance):

1) Categorize & analyse the reviews to calculate the percentage of positive & negative reviews.

2) Calculate the total rating on a scale of 5 for each category.

3) Create a ranking table of the mobile phones for each category, along with an overall ranking.
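
The three tasks above can be sketched on a toy table. This is only a minimal illustration, assuming each review already carries a category label, a 1 to 5 rating, and a binary sentiment flag. The column names (`product`, `category`, `rating`, `sentiment`) and the sample rows are hypothetical and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical toy data: one row per (review, category) pair.
toy = pd.DataFrame({
    "product":   ["A", "A", "A", "B", "B", "B"],
    "category":  ["camera", "camera", "battery", "camera", "battery", "battery"],
    "rating":    [5, 2, 4, 3, 1, 2],
    "sentiment": [1, 0, 1, 1, 0, 0],   # 1 = positive, 0 = negative
})

groups = toy.groupby(["product", "category"])

# Task 1: percentage of positive reviews per product and category.
pct_positive = groups["sentiment"].mean().mul(100).rename("pct_positive")

# Task 2: mean rating on the 5-point scale per product and category.
mean_rating = groups["rating"].mean().rename("mean_rating")

summary = pd.concat([pct_positive, mean_rating], axis=1).reset_index()

# Task 3: rank products within each category (1 = highest mean rating).
summary["category_rank"] = (
    summary.groupby("category")["mean_rating"]
    .rank(ascending=False, method="min")
    .astype(int)
)

# Overall ranking from the mean rating across all of a product's reviews.
overall = (
    toy.groupby("product")["rating"].mean()
    .rank(ascending=False, method="min")
    .astype(int)
)

print(summary)
print(overall)
```

The percentage of negative reviews is simply `100 - pct_positive`, since the sentiment flag is binary.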

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import re
from google.colab import drive
import nltk
#Ignore warnings
import warnings
warnings.filterwarnings('ignore')

from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report,ConfusionMatrixDisplay
from sklearn import naive_bayes
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.neighbors import KNeighborsClassifier
from nltk.corpus import stopwords
from sklearn.ensemble import RandomForestClassifier

In [None]:
data= pd.read_csv('/content/drive/MyDrive/Colab Notebooks/data/Flipkart_Amazon Mobile Reviews - Flipkart_Amazon Mobile Reviews.csv')
## print shape of dataset with rows and columns and information
print ("The shape of the  data is (row, column):"+ str(data.shape))
print (data.info())

The shape of the  data is (row, column):(23777, 5)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 23777 entries, 0 to 23776
Data columns (total 5 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   Unnamed: 0    23777 non-null  int64 
 1   Review-Title  23775 non-null  object
 2   rating        23777 non-null  object
 3   Review-Body   23160 non-null  object
 4   Product Name  23777 non-null  object
dtypes: int64(1), object(4)
memory usage: 928.9+ KB
None


In [None]:
data=data.drop(['Unnamed: 0'],axis=1)
data.head(10)

,Review-Title,rating,Review-Body,Product Name
0,Worst phone ever,1.0 out of 5 stars,Hang problem,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12..."
1,Ok !!! Not up to the mark,2.0 out of 5 stars,I'm writing this review after using 3days !!!B...,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12..."
2,Awesome look,5.0 out of 5 stars,Camera is so good n very fast phone back look ...,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12..."
3,One plus losing is originality!!!,3.0 out of 5 stars,The media could not be loaded.\n ...,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12..."
4,Read,1.0 out of 5 stars,I got my delivery on 23 feb when I unboxed the...,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12..."
5,Fantastic but some bug fixes required!,4.0 out of 5 stars,The media could not be loaded.\n ...,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12..."
6,A good choice for upgrade,5.0 out of 5 stars,Nord CE 2 is a decent choice for someone looki...,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12..."
7,Camera is not good... oppo is the best,3.0 out of 5 stars,Phone is over all good but some heating proble...,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12..."
8,****VERY DISPOINTED BY 1PLUS****. Sound and ba...,1.0 out of 5 stars,****Don't buy any phones from Amazon*** i real...,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12..."
9,Excellent all rounder!,5.0 out of 5 stars,The media could not be loaded.\n ...,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12..."


In [None]:
#Checking for null values
data.isnull().sum()

Review-Title      2
rating            0
Review-Body     617
Product Name      0
dtype: int64

In [None]:
# Replace null values with the placeholder string 'missing'
data['Review-Title']=data['Review-Title'].fillna('missing')
data['Review-Body']=data['Review-Body'].fillna('missing')
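
Because the title and body are later joined into one text column, the placeholder `'missing'` becomes a word that any downstream vectorizer will see. One possible alternative, sketched here on made-up rows, is to fill with an empty string and strip the leftover whitespace after joining. The toy frame and variable names are hypothetical.

```python
import pandas as pd

# Hypothetical rows: one review lacks a body, the other lacks a title.
toy = pd.DataFrame({
    "Review-Title": ["Great phone", None],
    "Review-Body":  [None, "Battery drains fast"],
})

# Fill gaps with an empty string so no placeholder word enters the text.
title = toy["Review-Title"].fillna("")
body = toy["Review-Body"].fillna("")

# Join the two parts and trim the stray space left by an empty side.
review = title.str.cat(body, sep=" ").str.strip()
print(review.tolist())
```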

In [None]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 23777 entries, 0 to 23776
Data columns (total 4 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   Review-Title  23777 non-null  object
 1   rating        23777 non-null  object
 2   Review-Body   23777 non-null  object
 3   Product Name  23777 non-null  object
dtypes: object(4)
memory usage: 743.2+ KB


In [None]:
# Confirm that no null values remain after filling
print(data.isnull().sum())

Review-Title    0
rating          0
Review-Body     0
Product Name    0
dtype: int64


In [None]:
data.duplicated().sum()

1162

In [None]:
# Drop rows that repeat the same title and body, keeping the first occurrence
data = data.drop_duplicates(subset=['Review-Title','Review-Body'],keep='first')

# Preview the de-duplicated data
data.head(10)

,Review-Title,rating,Review-Body,Product Name
0,Worst phone ever,1.0 out of 5 stars,Hang problem,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12..."
1,Ok !!! Not up to the mark,2.0 out of 5 stars,I'm writing this review after using 3days !!!B...,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12..."
2,Awesome look,5.0 out of 5 stars,Camera is so good n very fast phone back look ...,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12..."
3,One plus losing is originality!!!,3.0 out of 5 stars,The media could not be loaded.\n ...,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12..."
4,Read,1.0 out of 5 stars,I got my delivery on 23 feb when I unboxed the...,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12..."
5,Fantastic but some bug fixes required!,4.0 out of 5 stars,The media could not be loaded.\n ...,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12..."
6,A good choice for upgrade,5.0 out of 5 stars,Nord CE 2 is a decent choice for someone looki...,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12..."
7,Camera is not good... oppo is the best,3.0 out of 5 stars,Phone is over all good but some heating proble...,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12..."
8,****VERY DISPOINTED BY 1PLUS****. Sound and ba...,1.0 out of 5 stars,****Don't buy any phones from Amazon*** i real...,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12..."
9,Excellent all rounder!,5.0 out of 5 stars,The media could not be loaded.\n ...,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12..."


In [None]:
data.duplicated().sum()

0

In [None]:
data.shape

(21723, 4)
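
The count of 1162 reported by `data.duplicated().sum()` compares all four columns, whereas `drop_duplicates(subset=[...])` compares only the title and body. That is why the row count falls by 2054 (from 23777 to 21723) rather than by 1162. A small sketch on made-up rows shows the difference; the sample values are hypothetical.

```python
import pandas as pd

toy = pd.DataFrame({
    "Review-Title": ["Good", "Good", "Good", "Bad"],
    "Review-Body":  ["Nice phone", "Nice phone", "Nice phone", "Hangs"],
    "rating":       ["5.0", "5.0", "4.0", "1.0"],
})

# Full-row comparison: only the second row repeats an earlier row exactly.
full_row_dupes = toy.duplicated().sum()

# Text-only comparison: the second and third rows repeat the first row's text.
text_dupes = toy.duplicated(subset=["Review-Title", "Review-Body"]).sum()

deduped = toy.drop_duplicates(subset=["Review-Title", "Review-Body"], keep="first")
print(full_row_dupes, text_dupes, len(deduped))
```

Matching on text alone also merges identical short reviews that were left for different products or with different ratings, which may or may not be the intent.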

In [None]:
# Join title and body into a single 'Review' column with Series.str.cat()
data['Review'] = data['Review-Title'].str.cat(data['Review-Body'], sep = " ")
data=data.drop(['Review-Title','Review-Body'],axis=1)
data.head(10)

,rating,Product Name,Review
0,1.0 out of 5 stars,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Worst phone ever Hang problem
1,2.0 out of 5 stars,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Ok !!! Not up to the mark I'm writing this rev...
2,5.0 out of 5 stars,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Awesome look Camera is so good n very fast pho...
3,3.0 out of 5 stars,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",One plus losing is originality!!! The media co...
4,1.0 out of 5 stars,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Read I got my delivery on 23 feb when I unboxe...
5,4.0 out of 5 stars,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Fantastic but some bug fixes required! The med...
6,5.0 out of 5 stars,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",A good choice for upgrade Nord CE 2 is a decen...
7,3.0 out of 5 stars,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Camera is not good... oppo is the best Phone i...
8,1.0 out of 5 stars,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",****VERY DISPOINTED BY 1PLUS****. Sound and ba...
9,5.0 out of 5 stars,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Excellent all rounder! The media could not be ...


In [None]:
unwanted_string = "out of 5 stars"

# Remove the suffix from every rating, leaving only the number (e.g. "1.0")
data['rating'] = data['rating'].str.replace(unwanted_string, '', regex=False).str.strip()
# Preview the cleaned column
data.head()

,rating,Product Name,Review
0,1.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Worst phone ever Hang problem
1,2.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Ok !!! Not up to the mark I'm writing this rev...
2,5.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Awesome look Camera is so good n very fast pho...
3,3.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",One plus losing is originality!!! The media co...
4,1.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Read I got my delivery on 23 feb when I unboxe...
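
An alternative way to obtain the numeric rating in one step is to extract the leading number with a regular expression instead of deleting the suffix. This is a sketch only, assuming every rating follows the `N.N out of 5 stars` pattern seen in the output above; values that do not match become `NaN`. The sample series is hypothetical.

```python
import pandas as pd

ratings = pd.Series(["1.0 out of 5 stars", "4.0 out of 5 stars", "not rated"])

# Capture the number at the start of the string and convert it to float.
numeric = ratings.str.extract(r"^(\d+(?:\.\d+)?)", expand=False).astype(float)
print(numeric)
```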


In [None]:
data.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 21723 entries, 0 to 23776
Data columns (total 3 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   rating        21723 non-null  object
 1   Product Name  21723 non-null  object
 2   Review        21723 non-null  object
dtypes: object(3)
memory usage: 678.8+ KB


In [None]:
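# Convert ratings to numbers; values that cannot be parsed become NaN
data['rating'] = pd.to_numeric(data['rating'], errors='coerce')
# Label ratings of 3 and above as positive, everything else as negative
data["Sentiment"] = data["rating"].apply(lambda score: "positive" if score >= 3 else "negative")
# Encode the labels as 1 (positive) and 0 (negative)
data['Sentiment'] = data['Sentiment'].map({'positive':1, 'negative':0})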
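
One caveat of this labelling rule: `errors='coerce'` turns unparsable ratings into `NaN`, and `NaN >= 3` evaluates to false, so such rows would silently be labelled negative. The null check later in the notebook reports no missing ratings, so the result is unaffected here. A hedged sketch of a variant that keeps missing ratings as missing labels, on hypothetical sample values:

```python
import numpy as np
import pandas as pd

ratings = pd.Series([1.0, 3.0, 5.0, np.nan])

# Ratings of 3 and above count as positive (1.0), below 3 as negative (0.0);
# missing ratings stay NaN instead of defaulting to negative.
sentiment = ratings.ge(3).astype(float).where(ratings.notna())
print(sentiment)
```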

In [None]:
data.head(10)

,rating,Product Name,Review,Sentiment
0,1.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Worst phone ever Hang problem,0
1,2.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Ok !!! Not up to the mark I'm writing this rev...,0
2,5.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Awesome look Camera is so good n very fast pho...,1
3,3.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",One plus losing is originality!!! The media co...,1
4,1.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Read I got my delivery on 23 feb when I unboxe...,0
5,4.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Fantastic but some bug fixes required! The med...,1
6,5.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",A good choice for upgrade Nord CE 2 is a decen...,1
7,3.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Camera is not good... oppo is the best Phone i...,1
8,1.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",****VERY DISPOINTED BY 1PLUS****. Sound and ba...,0
9,5.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Excellent all rounder! The media could not be ...,1


In [None]:
data.isnull().sum()

rating          0
Product Name    0
Review          0
Sentiment       0
dtype: int64

In [None]:
data.shape

(21723, 4)

In [None]:
# Remove emoticons from the 'Review' column
data['Review'] = data['Review'].apply(lambda x: re.sub(r'[\U0001F600-\U0001F64F\U0001F300-\U0001F5FF\U0001F680-\U0001F6FF\U0001F700-\U0001F77F\U0001F780-\U0001F7FF\U0001F800-\U0001F8FF\U0001F900-\U0001F9FF\U0001FA00-\U0001FA6F\U0001FA70-\U0001FAFF\U0001FAB0-\U0001FABF\U0001FAC0-\U0001FAFF\U0001FAD0-\U0001FAD9\U0001F300-\U0001F5FF\U0001F004-\U0001F0CF]+', '', x))
data.head(10)

,rating,Product Name,Review,Sentiment
0,1.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Worst phone ever Hang problem,0
1,2.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Ok !!! Not up to the mark I'm writing this rev...,0
2,5.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Awesome look Camera is so good n very fast pho...,1
3,3.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",One plus losing is originality!!! The media co...,1
4,1.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Read I got my delivery on 23 feb when I unboxe...,0
5,4.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Fantastic but some bug fixes required! The med...,1
6,5.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",A good choice for upgrade Nord CE 2 is a decen...,1
7,3.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Camera is not good... oppo is the best Phone i...,1
8,1.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",****VERY DISPOINTED BY 1PLUS****. Sound and ba...,0
9,5.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Excellent all rounder! The media could not be ...,1
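
The character class in the emoticon-removal cell repeats several ranges: `\U0001F300-\U0001F5FF` appears twice, and `\U0001FA70-\U0001FAFF` already covers the three narrower ranges listed after it. A compact equivalent can be sketched with a precompiled pattern. The helper name `strip_emoji` is hypothetical, and the three merged ranges cover the same code points as the original class.

```python
import re

# Merged form of the ranges used in the notebook's character class.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F004-\U0001F0CF"  # mahjong, domino, and playing-card symbols
    "\U0001F300-\U0001F64F"  # miscellaneous pictographs and emoticons
    "\U0001F680-\U0001FAFF"  # transport symbols through extended pictographs
    "]+"
)

def strip_emoji(text: str) -> str:
    """Remove every character that falls in the ranges above."""
    return EMOJI_PATTERN.sub("", text)

cleaned = strip_emoji("Awesome look \U0001F600\U0001F44D camera")
print(repr(cleaned))
```

With pandas this could be applied as `data['Review'].apply(strip_emoji)`. Symbols outside these ranges, such as the heart at U+2764, are left in place by both versions.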


In [None]:
# Round-trip the text through UTF-8; valid Unicode strings come back unchanged
data['Review'] = data['Review'].str.encode('utf-8').str.decode('utf-8')

In [None]:
# Export the DataFrame to a .csv file
data.to_csv('amazon_reviews.csv', index=False, header=True)

In [None]:
import pandas as pd
from google.cloud import translate_v2 as translate
# Path to your service account key file
key_path = '/content/flash-audio-403004-9f7d4ec959aa.json'

# Initialize the translation client with the key file
client = translate.Client.from_service_account_json(key_path)

# Load the cleaned reviews exported in the previous cell
data = pd.read_csv('amazon_reviews.csv')

# Columns whose text needs translation
columns_to_translate = ['Review']

# Create a new column to store the translated text
data['translated_text'] = ""

# Loop through the data and perform translations for each row
for index, row in data.iterrows():
    for column in columns_to_translate:
        text_to_translate = row[column]
        if pd.notnull(text_to_translate):  # Check for non-null values
            # Translate the text to the target language (English)
            target_language = "en"
            translation = client.translate(text_to_translate, target_language=target_language)

            # Get the translated text and store it in the 'translated_text' column
            data.at[index, 'translated_text'] = translation["translatedText"]

# Save the translated data to a new CSV file
data.to_csv('translated_data.csv', index=False)
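
The loop above issues one request per row, so a single failed request part-way through would abort the run and lose the translations gathered so far. One possible safeguard, sketched here under assumptions, is to catch errors per row, keep the original text, and record which rows failed. The names `translate_all`, `translate_fn`, and `fake_translate` are hypothetical; the stub stands in for the real client call so the sketch runs without credentials.

```python
from typing import Callable, List, Tuple

def translate_all(
    texts: List[str], translate_fn: Callable[[str], str]
) -> Tuple[List[str], List[int]]:
    """Translate every text; on failure keep the original and record its position."""
    translated: List[str] = []
    failed: List[int] = []
    for i, text in enumerate(texts):
        try:
            translated.append(translate_fn(text))
        except Exception:
            # Fall back to the untranslated text so the output stays aligned.
            translated.append(text)
            failed.append(i)
    return translated, failed

def fake_translate(text: str) -> str:
    # Hypothetical stub standing in for the API call: fails on empty input.
    if not text:
        raise ValueError("empty text")
    return text.upper()

translated, failed = translate_all(["good", "", "bad"], fake_translate)
print(translated, failed)
```

With the real client, `translate_fn` could wrap the call already used in the cell above, for example `lambda t: client.translate(t, target_language="en")["translatedText"]`, and the rows listed in `failed` could be retried afterwards.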


In [None]:
data.head(20)

,rating,Product Name,Review,Sentiment,translated_text
0,1.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Worst phone ever Hang problem,0,Worst phone ever Hang problem
1,2.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Ok !!! Not up to the mark I'm writing this rev...,0,Ok !!! Not up to the mark I'm writing this rev...
2,5.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Awesome look Camera is so good n very fast pho...,1,Awesome look Camera is so good n very fast pho...
3,3.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",One plus losing is originality!!! The media co...,1,One plus losing is originality!!! The media co...
4,1.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Read I got my delivery on 23 feb when I unboxe...,0,Read I got my delivery on 23 feb when I unboxe...
5,4.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Fantastic but some bug fixes required! The med...,1,Fantastic but some bug fixes required! The med...
6,5.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",A good choice for upgrade Nord CE 2 is a decen...,1,A good choice for upgrade Nord CE 2 is a decen...
7,3.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Camera is not good... oppo is the best Phone i...,1,Camera is not good... oppo is the best Phone i...
8,1.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",****VERY DISPOINTED BY 1PLUS****. Sound and ba...,0,****VERY DISPOINTED BY 1PLUS****. Sound and ba...
9,5.0,"OnePlus Nord CE 2 5G (Gray Mirror, 8GB RAM, 12...",Excellent all rounder! The media could not be ...,1,Excellent all rounder! The media could not be ...
