## 📊 Customer Sentiment Analysis Project

This project performs sentiment analysis on customer reviews of the **iPhone 15 128GB** collected from Flipkart. It uses Python libraries such as `Selenium`, `BeautifulSoup`, `pandas`, `TextBlob`, and `matplotlib`/`seaborn` to gather the data, score sentiment, and visualize the results, giving a clearer picture of customer perception.

We cover:
- Web scraping of customer reviews
- Data cleaning and preprocessing
- Sentiment analysis using polarity scores
- Visual analysis (sentiment distribution, word cloud, etc.)
- Reporting with recommendations

### 📦 Installing Required Libraries

In [1]:
pip install matplotlib

Note: you may need to restart the kernel to use updated packages.

In [2]:
pip install seaborn


In [3]:
pip install selenium


In [4]:
pip install nltk


In [5]:
pip install textblob


### 📚 Importing Libraries

In [6]:
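# Data handling and plotting
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Scraping
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup
import requests
import time

# Text processing and sentiment scoring
import re
import nltk
from textblob import TextBlob

# The Chrome driver is created in the scraping cell below, so no browser
# window is opened (and left running) by this cell.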

### 🗂️ Scraping the Reviews into a Dataset

In [7]:
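from selenium import webdriver
from selenium.webdriver.common.by import By
import time
import pandas as pd

# One list per column; entries are appended page by page.
name = []
rating = []
review = []

driver = webdriver.Chrome()

url = "https://www.flipkart.com/apple-iphone-15-blue-128-gb/product-reviews/itmbf14ef54f645d?pid=MOBGTAGPAQNVFZZY"

try:
    driver.get(url)
    time.sleep(2)  # crude fixed wait for the first page to render

    page = 1

    while len(review) < 300:
        print(f"Scraping Page {page}...")
        try:
            # Flipkart's CSS class names are auto-generated and can change at any
            # time; these are the ones in use when the notebook was run.
            # Caveat: "_2NsDsF" appears to match the review date as well as the
            # reviewer name (see the dates under CUSTOMER NAME in the preview
            # below), so zip() can pair a review with the wrong name.
            names = driver.find_elements(By.CLASS_NAME, "_2NsDsF")
            ratings = driver.find_elements(By.CLASS_NAME, "XQDdHH")
            reviews = driver.find_elements(By.CLASS_NAME, "ZmyHeo")

            for n, r, rv in zip(names, ratings, reviews):
                name.append(n.text)
                rating.append(r.text)
                review.append(rv.text)

            # Move to the next page of reviews.
            next_button = driver.find_element(By.XPATH, "//span[text()='Next']")
            next_button.click()
            time.sleep(2)
            page += 1

        except Exception as e:
            # Reached, for example, when no "Next" link is found on the page.
            print("❌ Stopped:", e)
            break
finally:
    driver.quit()  # always close the browser, even if scraping fails

# The last page may overshoot the target, so trim every column to 300 rows.
name = name[:300]
rating = rating[:300]
review = review[:300]

df = pd.DataFrame({"CUSTOMER NAME": name, "RATING": rating, "REVIEW": review})
print("✅ Scraped", len(df), "reviews.")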

Scraping Page 1...
Scraping Page 2...
Scraping Page 3...
Scraping Page 4...
Scraping Page 5...
Scraping Page 6...
Scraping Page 7...
Scraping Page 8...
Scraping Page 9...
Scraping Page 10...
Scraping Page 11...
Scraping Page 12...
Scraping Page 13...
Scraping Page 14...
Scraping Page 15...
Scraping Page 16...
Scraping Page 17...
Scraping Page 18...
Scraping Page 19...
Scraping Page 20...
Scraping Page 21...
Scraping Page 22...
Scraping Page 23...
Scraping Page 24...
Scraping Page 25...
Scraping Page 26...
Scraping Page 27...
Scraping Page 28...
Scraping Page 29...
Scraping Page 30...
✅ Scraped 300 reviews.


### 🔍 Exploratory Data Analysis (EDA)

In [8]:
import pandas as pd

# Show every row and column instead of pandas' truncated preview.
pd.set_option('display.max_rows', 300)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)

df.reset_index(drop=True, inplace=True)
df

|   | CUSTOMER NAME | RATING | REVIEW |
|---|---|---|---|
| 0 | bijaya mohanty | 4.6 | Just go for it.Amazing one.Beautiful camera wi... |
| 1 | May, 2024 | 5.0 | Awesome 🔥🔥☺️ |
| 2 | Rishabh Jha | 5.0 | High quality camera😍 |
| 3 | Apr, 2024 | 5.0 | Very nice |
| 4 | Ajin V | 4.0 | Camera Quality Is Improved Loving It |
| 5 | Oct, 2023 | 5.0 | Switch from OnePlus to iPhone I am stunned wit... |
| 6 | Mousam Guha Roy | 5.0 | Awesome photography experience. Battery backup... |
| 7 | Oct, 2023 | 5.0 | So beautiful, so elegant, just a vowww😍❤️ |
| 8 | Prithivi Boruah | 5.0 | Awesome product very happy to hold this. Bette... |
| 9 | Oct, 2023 | 5.0 | Best mobile phone\nCamera quality is very nice... |
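
In the preview above, every other CUSTOMER NAME entry is a review date such as "May, 2024", which suggests that names and dates were scraped into the same list. The sketch below shows one way such entries could be flagged and filtered. It assumes dates always look like a three-letter month, a comma, and a year (the only shape visible in the preview); `looks_like_date` and `drop_date_entries` are hypothetical helpers, not part of the original notebook.

```python
import re

# Assumption: date entries look like "May, 2024" (three letters, comma, year).
DATE_LIKE = re.compile(r"^[A-Za-z]{3},\s*\d{4}$")

def looks_like_date(text):
    """Return True when a scraped 'name' is really a review date."""
    return bool(DATE_LIKE.match(text.strip()))

def drop_date_entries(names):
    """Keep only the entries that do not look like dates."""
    return [n for n in names if not looks_like_date(n)]

scraped = ["bijaya mohanty", "May, 2024", "Rishabh Jha", "Apr, 2024"]
print(drop_date_entries(scraped))
```

Filtering the name list this way before pairing it with ratings and reviews would only realign the columns if each review contributes exactly one name and one date, which would need to be checked against the live page.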


### 🧹 Data Cleaning & Preprocessing

In [9]:
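import pandas as pd

# Normalize names: trim whitespace and title-case ("bijaya mohanty" -> "Bijaya Mohanty").
df["CUSTOMER NAME"] = df["CUSTOMER NAME"].astype(str).str.strip().str.title()
# Ratings are only trimmed here; they stay strings.
df["RATING"] = df["RATING"].astype(str).str.strip()

# Remove the "READ MORE" link text that Flipkart appends to long reviews,
# then collapse newlines and repeated whitespace into single spaces.
df["REVIEW"] = (
    df["REVIEW"]
    .astype(str)
    .str.replace("READ MORE", "", case=False)
    .str.replace(r"\s+", " ", regex=True)
    .str.strip()
)

pd.set_option("display.max_rows", 300)
pd.set_option("display.max_columns", None)
pd.set_option("display.width", None)

df.reset_index(drop=True, inplace=True)

df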

|   | CUSTOMER NAME | RATING | REVIEW |
|---|---|---|---|
| 0 | Bijaya Mohanty | 4.6 | Just go for it.Amazing one.Beautiful camera wi... |
| 1 | May, 2024 | 5.0 | Awesome 🔥🔥☺️ |
| 2 | Rishabh Jha | 5.0 | High quality camera😍 |
| 3 | Apr, 2024 | 5.0 | Very nice |
| 4 | Ajin V | 4.0 | Camera Quality Is Improved Loving It |
| 5 | Oct, 2023 | 5.0 | Switch from OnePlus to iPhone I am stunned wit... |
| 6 | Mousam Guha Roy | 5.0 | Awesome photography experience. Battery backup... |
| 7 | Oct, 2023 | 5.0 | So beautiful, so elegant, just a vowww😍❤️ |
| 8 | Prithivi Boruah | 5.0 | Awesome product very happy to hold this. Bette... |
| 9 | Oct, 2023 | 5.0 | Best mobile phone Camera quality is very nice ... |
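
The cleaning step leaves RATING as text, and row 0 carries 4.6, which cannot be an individual star rating if reviewers can only give whole stars. It may be the product-level average picked up by the same CSS class. The sketch below is one possible validation step under that assumption (whole stars from 1 to 5); `parse_star_rating` is a hypothetical helper.

```python
def parse_star_rating(text):
    """Return the rating as an int from 1 to 5, or None if it is not a whole star."""
    try:
        value = float(str(text).strip())
    except ValueError:
        return None  # not a number at all
    if value.is_integer() and 1 <= value <= 5:
        return int(value)
    return None  # fractional or out-of-range values are treated as invalid

for raw in ["5.0", "4", "4.6", "n/a"]:
    print(raw, "->", parse_star_rating(raw))
```

Applied to the column (for example with `df["RATING"].apply(parse_star_rating)`), rows that come back as `None` would be candidates for inspection rather than silent inclusion.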


### 📝 Text Processing & Preparation

In [10]:
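import re
import pandas as pd

def split_into_sentences(text):
    """Split text into chunks that each end in '.', '!' or '?'."""
    if not isinstance(text, str):
        return []
    # Limitation: text with no terminal punctuation matches nothing, so short
    # reviews such as "Very nice" yield an empty list (SENTENCE_COUNT 0 below).
    return re.findall(r'[^.!?]+[.!?]', text.strip())

df["REVIEW_SENTENCES"] = df["REVIEW"].apply(split_into_sentences)

# First sentence of each review, or "" when no sentence was detected.
df["FIRST_SENTENCE"] = df["REVIEW_SENTENCES"].apply(lambda x: x[0].strip() if x else "")

df["SENTENCE_COUNT"] = df["REVIEW_SENTENCES"].apply(len)

pd.set_option("display.max_rows", 300)
pd.set_option("display.max_columns", None)
pd.set_option("display.width", None)

df[["CUSTOMER NAME", "RATING", "REVIEW", "FIRST_SENTENCE", "SENTENCE_COUNT"]]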

|   | CUSTOMER NAME | RATING | REVIEW | FIRST_SENTENCE | SENTENCE_COUNT |
|---|---|---|---|---|---|
| 0 | Bijaya Mohanty | 4.6 | Just go for it.Amazing one.Beautiful camera wi... | Just go for it. | 2 |
| 1 | May, 2024 | 5.0 | Awesome 🔥🔥☺️ |  | 0 |
| 2 | Rishabh Jha | 5.0 | High quality camera😍 |  | 0 |
| 3 | Apr, 2024 | 5.0 | Very nice |  | 0 |
| 4 | Ajin V | 4.0 | Camera Quality Is Improved Loving It |  | 0 |
| 5 | Oct, 2023 | 5.0 | Switch from OnePlus to iPhone I am stunned wit... | Switch from OnePlus to iPhone I am stunned wit... | 2 |
| 6 | Mousam Guha Roy | 5.0 | Awesome photography experience. Battery backup... | Awesome photography experience. | 3 |
| 7 | Oct, 2023 | 5.0 | So beautiful, so elegant, just a vowww😍❤️ |  | 0 |
| 8 | Prithivi Boruah | 5.0 | Awesome product very happy to hold this. Bette... | Awesome product very happy to hold this. | 6 |
| 9 | Oct, 2023 | 5.0 | Best mobile phone Camera quality is very nice ... | Best mobile phone Camera quality is very nice ... | 1 |
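
Several rows above have SENTENCE_COUNT 0 because the splitter only keeps text that ends in `.`, `!` or `?`. As a sketch of one alternative (not the notebook's method, and not what produced the outputs shown here), the pattern below also keeps a trailing chunk that has no closing punctuation. `split_keep_tail` is a hypothetical name.

```python
import re

def split_keep_tail(text):
    """Split on ., ! or ? but also keep trailing text with no punctuation."""
    if not isinstance(text, str):
        return []
    # Each piece is a run of non-terminators followed by either one or more
    # terminators or the end of the string.
    pieces = re.findall(r"[^.!?]+(?:[.!?]+|$)", text.strip())
    return [p.strip() for p in pieces if p.strip()]

print(split_keep_tail("Very nice"))
print(split_keep_tail("Awesome photography experience. Battery backup is good"))
```

With this variant, a review like "Very nice" becomes a single sentence instead of an empty list, so it would be scored rather than defaulting to a polarity of 0.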


### 💬 Sentiment Analysis

In [11]:
import pandas as pd
from textblob import TextBlob

def get_polarity(sentences):
    """Return TextBlob polarity (-1 to 1) for each sentence in the list."""
    if not isinstance(sentences, list):
        return []
    return [TextBlob(sentence).sentiment.polarity for sentence in sentences]

df["Polarity_List"] = df["REVIEW_SENTENCES"].apply(get_polarity)

# Mean sentence polarity per review. Reviews with no detected sentences get 0,
# regardless of their wording.
df["Polarity_Avg"] = df["Polarity_List"].apply(lambda x: sum(x)/len(x) if x else 0)

# Subjectivity (0 = objective, 1 = subjective) is scored on the full review text.
df["Subjectivity"] = df["REVIEW"].apply(lambda x: TextBlob(x).sentiment.subjectivity)

pd.set_option("display.max_rows", 300)
pd.set_option("display.max_columns", None)
pd.set_option("display.width", None)

df[["CUSTOMER NAME", "REVIEW", "Polarity_Avg", "Subjectivity"]]

|   | CUSTOMER NAME | REVIEW | Polarity_Avg | Subjectivity |
|---|---|---|---|---|
| 0 | Bijaya Mohanty | Just go for it.Amazing one.Beautiful camera wi... | 0.3 | 0.633333 |
| 1 | May, 2024 | Awesome 🔥🔥☺️ | 0.0 | 1.0 |
| 2 | Rishabh Jha | High quality camera😍 | 0.0 | 0.54 |
| 3 | Apr, 2024 | Very nice | 0.0 | 1.0 |
| 4 | Ajin V | Camera Quality Is Improved Loving It | 0.0 | 0.95 |
| 5 | Oct, 2023 | Switch from OnePlus to iPhone I am stunned wit... | 0.5 | 1.0 |
| 6 | Mousam Guha Roy | Awesome photography experience. Battery backup... | 0.733333 | 0.7 |
| 7 | Oct, 2023 | So beautiful, so elegant, just a vowww😍❤️ | 0.0 | 1.0 |
| 8 | Prithivi Boruah | Awesome product very happy to hold this. Bette... | 0.427778 | 0.557407 |
| 9 | Oct, 2023 | Best mobile phone Camera quality is very nice ... | 0.738 | 0.676 |
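
Rows such as "Very nice" show a Polarity_Avg of 0.0 next to a Subjectivity of 1.0, because the average is taken over an empty sentence list while subjectivity is scored on the whole review. The sketch below illustrates one possible fallback: score the whole review when no sentences were found. The `score` argument is a stand-in; in this notebook it would presumably be `lambda s: TextBlob(s).sentiment.polarity`. `review_polarity` and `toy_score` are hypothetical and exist only to keep the example self-contained.

```python
from statistics import mean

def review_polarity(review, sentences, score):
    """Mean sentence polarity, falling back to the whole review if no sentences exist."""
    if sentences:
        return mean(score(s) for s in sentences)
    return score(review) if review else 0.0

def toy_score(text):
    """Hypothetical stand-in scorer: 1.0 if the text contains 'nice', else 0.0."""
    return 1.0 if "nice" in text.lower() else 0.0

print(review_polarity("Very nice", [], toy_score))
print(review_polarity("Nice. Bad.", ["Nice.", "Bad."], toy_score))
```

Under this scheme a short unpunctuated review contributes its own score to the overall mean instead of pulling it toward zero.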


In [12]:
def sentiment(pol):
    """Map a per-review polarity in [-1, 1] to one of five labels."""
    if pol >= 0.75:
        return "Extremely Positive"
    elif pol > 0:
        return "Positive"
    elif pol == 0:
        # Includes reviews with no detected sentences, whose Polarity_Avg is 0.
        return "Neutral"
    elif pol <= -0.75:
        return "Extremely Negative"
    else:
        return "Negative"

df["Sentiment"] = df["Polarity_Avg"].apply(sentiment)

pd.set_option("display.max_rows", 300)
pd.set_option("display.max_columns", None)
pd.set_option("display.width", None)

df[["CUSTOMER NAME", "RATING", "REVIEW", "Polarity_Avg", "Sentiment"]]

|   | CUSTOMER NAME | RATING | REVIEW | Polarity_Avg | Sentiment |
|---|---|---|---|---|---|
| 0 | Bijaya Mohanty | 4.6 | Just go for it.Amazing one.Beautiful camera wi... | 0.3 | Positive |
| 1 | May, 2024 | 5.0 | Awesome 🔥🔥☺️ | 0.0 | Neutral |
| 2 | Rishabh Jha | 5.0 | High quality camera😍 | 0.0 | Neutral |
| 3 | Apr, 2024 | 5.0 | Very nice | 0.0 | Neutral |
| 4 | Ajin V | 4.0 | Camera Quality Is Improved Loving It | 0.0 | Neutral |
| 5 | Oct, 2023 | 5.0 | Switch from OnePlus to iPhone I am stunned wit... | 0.5 | Positive |
| 6 | Mousam Guha Roy | 5.0 | Awesome photography experience. Battery backup... | 0.733333 | Positive |
| 7 | Oct, 2023 | 5.0 | So beautiful, so elegant, just a vowww😍❤️ | 0.0 | Neutral |
| 8 | Prithivi Boruah | 5.0 | Awesome product very happy to hold this. Bette... | 0.427778 | Positive |
| 9 | Oct, 2023 | 5.0 | Best mobile phone Camera quality is very nice ... | 0.738 | Positive |
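
To see how the labels are distributed, the shares per label can be tallied. The sketch below does this with the standard library for the ten labels shown in the table above; `label_share` is a hypothetical helper, and on the full DataFrame something like `df["Sentiment"].value_counts(normalize=True)` would presumably serve the same purpose.

```python
from collections import Counter

def label_share(labels):
    """Return the percentage of reviews carrying each sentiment label."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 1) for label, n in counts.most_common()}

# Labels of the ten rows shown in the table above.
preview_labels = [
    "Positive", "Neutral", "Neutral", "Neutral", "Neutral",
    "Positive", "Positive", "Neutral", "Positive", "Positive",
]
print(label_share(preview_labels))
```

In this ten-row preview half of the reviews are labelled Neutral, and each of those rows has SENTENCE_COUNT 0 in the earlier table.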


In [13]:
pol = df["Polarity_Avg"].mean()
print("📊 Overall Average Polarity Score:", round(pol, 3))

📊 Overall Average Polarity Score: 0.209


In [14]:
if pol <= -0.6:
    print("🧠 Overall Sentiment: Extremely Negative")
elif pol <= -0.2:
    print("🧠 Overall Sentiment: Negative")
elif pol < 0.2:
    print("🧠 Overall Sentiment: Neutral")
elif pol <= 0.6:
    print("🧠 Overall Sentiment: Positive")
else:
    print("🧠 Overall Sentiment: Extremely Positive")

🧠 Overall Sentiment: Positive
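
The overall label depends on fixed cutoffs, so it helps to know how close the score sits to one. The sketch below wraps the same bands in a function and reports the distance to the nearest cutoff; `overall_label`, `margin_to_cutoff`, and `CUTOFFS` are hypothetical names that mirror the thresholds used in the cell above.

```python
CUTOFFS = (-0.6, -0.2, 0.2, 0.6)  # band edges taken from the if/elif chain above

def overall_label(pol):
    """Apply the same bands as the overall-sentiment cell."""
    if pol <= -0.6:
        return "Extremely Negative"
    if pol <= -0.2:
        return "Negative"
    if pol < 0.2:
        return "Neutral"
    if pol <= 0.6:
        return "Positive"
    return "Extremely Positive"

def margin_to_cutoff(pol):
    """Distance from the score to the nearest band edge."""
    return min(abs(pol - c) for c in CUTOFFS)

score = 0.209  # overall average polarity reported above
print(overall_label(score), round(margin_to_cutoff(score), 3))
```

A margin of about 0.009 means a small shift in the average would move the overall label from Positive to Neutral.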


## 💡 Recommendations

Based on my sentiment analysis of 300 customer reviews for the iPhone 15 (128GB, Blue) on Flipkart, I propose the following actionable recommendations:

- **Leverage Positive Sentiment in Marketing**  
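  The overall average polarity score is 0.209, just above the 0.2 cutoff for the Positive band, so overall sentiment is positive but not strongly so. The score probably understates satisfaction, because short reviews without closing punctuation (such as "Very nice") produced no sentences and were scored 0.0. Many reviews praise the camera quality, smooth performance, and display. These strengths should be highlighted in promotional campaigns.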

- **Feature Real Customer Quotes**  
  Several first-sentence highlights from reviews are short, positive, and persuasive. Including these customer phrases in product banners or social media ads can enhance credibility and drive conversions.

- **Improve Delivery and Packaging Communication**  
  The few negative reviews mention issues around delivery experience and packaging. Flipkart can consider enhancing buyer communication post-purchase, including proactive updates and quality checks.

- **Encourage Reviews with Sentiment Anchoring**  
  Since most reviews scored high on subjectivity, Flipkart could nudge users to mention specific features (camera, battery, design) in their reviews. This could improve the quality of insights and help future buyers.

Overall, customer sentiment is positive. The product is well received, and minor improvements in logistics or service communication could push sentiment higher.