---
# FeelFusion: Brand Sentiment Analyzer
---

### Introduction
---
In today's business world, knowing what customers think is vital. **FeelFusion: Brand Sentiment Analyzer** is here to help brands understand customer opinions by analyzing reviews and social media mentions.

Using advanced sentiment analysis, FeelFusion turns this data into useful insights. Brands can monitor their reputation, spot trends, and make decisions to improve customer satisfaction. Whether tracking a new product's success or understanding reactions to a marketing campaign, FeelFusion gives brands the edge they need.

FeelFusion isn't just about data; it's about grasping the emotions behind it. By capturing customer sentiment, FeelFusion offers brands a unique view of their performance, helping them navigate customer feedback confidently and clearly.

---

## 1.) Import Required Packages

#### Importing the Pandas, Matplotlib, Seaborn and Warnings libraries.

In [14]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
# Silence library warnings so they do not clutter the cell outputs
warnings.filterwarnings('ignore')

---
## 2.) Data Collection
- Dataset Source - https://www.kaggle.com/datasets/mahmoudshaheen1134/amazon-alexa-reviews-dataset

#### Import the TSV Data as a Pandas DataFrame

In [15]:
# quoting=3 is csv.QUOTE_NONE: quote characters in the reviews are kept as plain text
df = pd.read_csv('../data/raw_data/amazon_alexa.tsv', delimiter='\t', quoting=3)
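
The `quoting=3` argument is easy to misread, so here is a minimal sketch of what it does. The two-row TSV below is a hypothetical stand-in for the real file, and `sample_tsv` and `sample_df` are illustrative names. The second review opens with a double quote that is never closed, which is the kind of text that can confuse the default quote handling.

```python
import csv
import io

import pandas as pd

# Hypothetical two-row TSV standing in for the real file; the second review
# starts with a double quote that is never closed on that line.
sample_tsv = (
    "rating\tverified_reviews\n"
    "5\tLove my Echo!\n"
    "4\t\"Sometimes it mishears me\n"
)

# csv.QUOTE_NONE has the integer value 3, so this is equivalent to quoting=3.
sample_df = pd.read_csv(io.StringIO(sample_tsv), sep="\t", quoting=csv.QUOTE_NONE)

print(csv.QUOTE_NONE)
print(sample_df["verified_reviews"].tolist())
```

With quote handling switched off, each tab-separated field is read literally, so the leading quote stays part of the review text.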

#### Show Top 5 Records

In [16]:
df.head()

| | rating | date | variation | verified_reviews | feedback |
|---|---|---|---|---|---|
| 0 | 5 | 31-Jul-18 | Charcoal Fabric | Love my Echo! | 1 |
| 1 | 5 | 31-Jul-18 | Charcoal Fabric | Loved it! | 1 |
| 2 | 4 | 31-Jul-18 | Walnut Finish | "Sometimes while playing a game, you can answe... | 1 |
| 3 | 5 | 31-Jul-18 | Charcoal Fabric | "I have had a lot of fun with this thing. My 4... | 1 |
| 4 | 5 | 31-Jul-18 | Charcoal Fabric | Music | 1 |


#### Shape of the dataset

In [17]:
df.shape

(3150, 5)

### 2.1 Dataset information

- **Rating** : The star rating given by the customer, an integer from 1 (lowest) to 5 (highest).

- **Date** : The date when the review was posted.

- **Variation** : The specific variant or version of the product being reviewed (e.g., color, finish).

- **Verified_reviews** : The actual text of the review written by the customer.

- **Feedback** : A binary label for the overall sentiment of the review, with 1 meaning "positive" and 0 meaning "negative".
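
A quick way to check how `feedback` relates to `rating` is to cross-tabulate the two columns. The sketch below uses a small made-up frame named `toy` so that it runs on its own. In the notebook, the same `pd.crosstab` call can be pointed at `df`.

```python
import pandas as pd

# Made-up rows for illustration only; in the notebook, use `df` instead.
toy = pd.DataFrame({
    "rating":   [1, 2, 3, 4, 5, 5],
    "feedback": [0, 0, 1, 1, 1, 1],
})

# Rows are star ratings, columns are feedback values, cells are review counts.
table = pd.crosstab(toy["rating"], toy["feedback"])
print(table)
```

If every low rating lands in one `feedback` column and every high rating in the other, `feedback` is effectively derived from `rating`.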

---
## 3.) Data Checks to perform

- Check Missing values
- Check Duplicates
- Check data type
- Check the number of unique values of each column
- Check statistics of data set

### 3.1 Check Missing values

In [18]:
df.isna().sum()

rating              0
date                0
variation           0
verified_reviews    1
feedback            0
dtype: int64
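
Only `verified_reviews` has a missing value, and it is a single row. The following is a minimal sketch of one way to inspect and drop such rows. It assumes that a review without text is of no use for sentiment analysis, and it runs on a made-up frame (`toy`) rather than the real data.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration; in the notebook, apply this to `df`.
toy = pd.DataFrame({
    "rating": [5, 4, 2],
    "verified_reviews": ["Love my Echo!", np.nan, "Stopped working"],
})

# Show the rows whose review text is missing before deciding what to do.
missing_rows = toy[toy["verified_reviews"].isna()]
print(missing_rows)

# Assumption: a review without text cannot be scored for sentiment, so drop it.
cleaned = toy.dropna(subset=["verified_reviews"]).reset_index(drop=True)
print(cleaned.shape)
```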

### 3.2 Check Duplicates

In [19]:
df.duplicated().sum()

715
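
`df.duplicated()` marks only the second and later copies of a row, so the count above is the number of rows that a de-duplication step would remove. The sketch below illustrates this on a made-up frame (`toy`). Whether to drop duplicates in the real data is a judgment call, since two customers can legitimately leave the same short review for the same product on the same day.

```python
import pandas as pd

# Made-up rows for illustration; rows 0 and 1 are exact copies of each other.
toy = pd.DataFrame({
    "rating": [5, 5, 4, 5],
    "variation": ["Black Dot", "Black Dot", "Walnut Finish", "Black Dot"],
    "verified_reviews": ["Love it", "Love it", "Works well", "Great"],
})

# Default keep='first': only the repeated copies are flagged.
n_duplicates = toy.duplicated().sum()

# keep=False flags every member of a duplicate group, handy for inspection.
all_copies = toy[toy.duplicated(keep=False)]

# Drop the repeats, keeping the first occurrence of each row.
deduped = toy.drop_duplicates().reset_index(drop=True)

print(n_duplicates, len(all_copies), len(deduped))
```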

### 3.3 Check data types

In [20]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3150 entries, 0 to 3149
Data columns (total 5 columns):
 #   Column            Non-Null Count  Dtype 
---  ------            --------------  ----- 
 0   rating            3150 non-null   int64 
 1   date              3150 non-null   object
 2   variation         3150 non-null   object
 3   verified_reviews  3149 non-null   object
 4   feedback          3150 non-null   int64 
dtypes: int64(2), object(3)
memory usage: 123.2+ KB
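
The `date` column is stored as `object` (plain strings), which rules out date arithmetic and time-based grouping. A minimal sketch of a conversion with `pd.to_datetime` follows. The format string `%d-%b-%y` is an assumption based on values such as `31-Jul-18` shown in the outputs, and `dates` and `parsed` are illustrative names.

```python
import pandas as pd

# Two date strings in the same style as the notebook outputs.
dates = pd.Series(["31-Jul-18", "30-Jul-18"])

# %d = day of month, %b = abbreviated month name, %y = two-digit year.
parsed = pd.to_datetime(dates, format="%d-%b-%y")

print(parsed.dtype)
print(parsed.dt.year.tolist())
```

In the notebook, the equivalent step would be `df['date'] = pd.to_datetime(df['date'], format='%d-%b-%y')`.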


### 3.4 Checking the number of unique values of each column

In [21]:
df.nunique()

rating                 5
date                  77
variation             16
verified_reviews    2300
feedback               2
dtype: int64

In [22]:
df.columns

Index(['rating', 'date', 'variation', 'verified_reviews', 'feedback'], dtype='object')

### 3.5 Check statistics of data set

In [23]:
df.describe().T

| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| rating | 3150.0 | 4.463175 | 1.068506 | 1.0 | 4.0 | 5.0 | 5.0 | 5.0 |
| feedback | 3150.0 | 0.918413 | 0.273778 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 |


In [24]:
df.describe(include='object').T

| | count | unique | top | freq |
|---|---|---|---|---|
| date | 3150 | 77 | 30-Jul-18 | 1603 |
| variation | 3150 | 16 | Black Dot | 516 |
| verified_reviews | 3149 | 2300 | | 79 |
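
The `top` value for `verified_reviews` prints as empty, which suggests that the most frequent review text is blank or whitespace-only. Such rows are not counted by `isna()`. The sketch below shows one way to flag them, using a made-up frame (`toy`) and the illustrative names `is_blank` and `non_blank`.

```python
import pandas as pd

# Made-up rows for illustration; three of the five reviews carry no visible text.
toy = pd.DataFrame({
    "verified_reviews": ["Love my Echo!", " ", "", "Music", "   "],
})

# Strip surrounding whitespace, then flag strings that end up empty.
is_blank = toy["verified_reviews"].str.strip().eq("")
print(is_blank.sum())

# Keep only the reviews that contain some text.
non_blank = toy[~is_blank].reset_index(drop=True)
print(non_blank)
```

Rows that are actually missing (`NaN`) come out as `False` in `is_blank`, so they still need the separate missing-value check.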
