In [2]:
import pandas as pd
import numpy as np

In [3]:
df = pd.read_csv("Iphone_14_15.csv", encoding="ISO-8859-1")

In [4]:
df

Unnamed: 0,index,Model Name,Customer Name,Location,Posted Date,Review,Sentiment of Review
0,1,APPLE iPhone 14,ABHI GOWDA 76,Bengaluru,23 days ago,Unboxing overall good experience ðREAD MORE,Positive
1,2,APPLE iPhone 14,Sanket Hange,Latur,4 months ago,The best camera best design !!1st iphone 14 !!...,Positive
2,3,APPLE iPhone 14,Rahul Prasad,Debipur,10 months ago,Best smart phone under this price range compar...,Positive
3,4,APPLE iPhone 14,Avi Nash,Bengaluru,9 months ago,GoodREAD MORE,Positive
4,5,APPLE iPhone 14,Waqar ahmed,North Twenty Four Parganas District,"Oct, 2022","Amazing picture quality, awesome design, mind ...",Positive
...,...,...,...,...,...,...,...
2032,2033,APPLE iPhone 15 Pro Max,Pavan Kumar,Vijayawada,1 month ago,GoodREAD MORE,Positive
2033,2034,APPLE iPhone 15 Pro Max,Sagar Saurav Borah,North Lakhimpur,17 days ago,Absolutely loved this phone. It is very expens...,Positive
2034,2035,APPLE iPhone 15 Pro Max,Mandar Londhe,Pune,26 days ago,Best iPhone yet! Heating issue completely reso...,Positive
2035,2036,APPLE iPhone 15 Pro Max,Sunny rajan amrit,Amb Industrial Area,1 month ago,NiceREAD MORE,Positive


In [5]:
df["Model Name"].nunique()

8

In [6]:
# Count the sentiment labels that ship with the dataset
df["Sentiment of Review"].value_counts()

Sentiment of Review
Positive    1993
Negative      40
Neutral        4
Name: count, dtype: int64

In [7]:
from textblob import TextBlob

# Function to classify sentiment
def get_sentiment(text):
    if not isinstance(text, str) or text.strip() == "":
        return "Neutral"
    polarity = TextBlob(text).sentiment.polarity
    if polarity > 0.1:
        return "Positive"
    elif polarity < -0.1:
        return "Negative"
    else:
        return "Neutral"

# Apply to the Review column
df['Sentiment'] = df['Review'].apply(get_sentiment)

# Count sentiment labels
print(df['Sentiment'].value_counts())



Sentiment
Positive    1961
Neutral       47
Negative      29
Name: count, dtype: int64
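
The TextBlob labels do not match the dataset's own `Sentiment of Review` column (1993 / 40 / 4 there versus 1961 / 29 / 47 here for Positive / Negative / Neutral), so it may be worth checking where the two disagree before relying on either. The sketch below is one way to do that; the helper name `compare_labels` is hypothetical and the frame is toy data, not rows from the review file.

```python
import pandas as pd

def compare_labels(frame, original_col="Sentiment of Review", new_col="Sentiment"):
    """Cross-tabulate two label columns and return the table plus agreement rate."""
    table = pd.crosstab(frame[original_col], frame[new_col])
    agreement = (frame[original_col] == frame[new_col]).mean()
    return table, agreement

# Toy data for illustration only; not taken from Iphone_14_15.csv.
toy = pd.DataFrame({
    "Sentiment of Review": ["Positive", "Positive", "Negative", "Neutral"],
    "Sentiment": ["Positive", "Neutral", "Negative", "Neutral"],
})

table, agreement = compare_labels(toy)
print(table)
print(f"Agreement: {agreement:.2%}")
```

On the real data, calling `compare_labels(df)` after the cell above would show which original labels TextBlob most often reassigns.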


In [8]:
import re
from sklearn.feature_extraction import text

stop_words = text.ENGLISH_STOP_WORDS

# Text cleaning function
def clean_text(text_input):
    text_input = str(text_input).lower()
    text_input = re.sub(r"[^a-z\s]", "", text_input)
    words = text_input.split()
    words = [word for word in words if word not in stop_words]
    return ' '.join(words)

# Apply to Review column
df['Cleaned_Review'] = df['Review'].apply(clean_text)

# Preview cleaned text
print(df[['Review', 'Cleaned_Review']].head())


                                              Review  \
0     Unboxing overall good experience ðREAD MORE   
1  The best camera best design !!1st iphone 14 !!...   
2  Best smart phone under this price range compar...   
3                                      GoodREAD MORE   
4  Amazing picture quality, awesome design, mind ...   

                                      Cleaned_Review  
0              unboxing overall good experience read  
1  best camera best design st iphone tried condit...  
2  best smart phone price range compare phones ov...  
3                                           goodread  
4  amazing picture quality awesome design mind bl...  


In [9]:
from sklearn.feature_extraction.text import TfidfVectorizer

# Filter positive reviews
positive_reviews = df[df['Sentiment'] == 'Positive']['Cleaned_Review']

# Apply TF-IDF
vectorizer_pos = TfidfVectorizer(max_features=20)
tfidf_matrix_pos = vectorizer_pos.fit_transform(positive_reviews)

# Get top terms
tfidf_df_pos = pd.DataFrame(tfidf_matrix_pos.toarray(), columns=vectorizer_pos.get_feature_names_out())
average_tfidf_pos = tfidf_df_pos.mean().sort_values(ascending=False).reset_index()
average_tfidf_pos.columns = ['Term', 'TFIDF_Score']

print("Top Positive Keywords:")
print(average_tfidf_pos)


Top Positive Keywords:
           Term  TFIDF_Score
0          read     0.098762
1          good     0.098112
2         phone     0.086637
3        iphone     0.082369
4        camera     0.071611
5      goodread     0.067017
6       battery     0.061237
7          best     0.060422
8          nice     0.056476
9       awesome     0.056224
10      product     0.055049
11  productread     0.045620
12         just     0.042046
13      quality     0.041386
14  performance     0.040798
15    phoneread     0.040752
16        great     0.035589
17      amazing     0.034322
18        apple     0.027787
19      display     0.026818


###  What Exactly Are Customers Talking Positively About?

Based on the **TF-IDF analysis** of the positive reviews for iPhone 14/15, here are the top insights:

#### Top Positive Keywords and Their Interpretations:

| **Keyword**        | **What it Indicates**                                         |
|--------------------|---------------------------------------------------------------|
| `good`, `best`, `nice`, `awesome`, `amazing`, `great` | General **satisfaction** and **positive sentiment**        |
| `read`, `goodread`, `phoneread`, `productread`       | **Scraping artifacts** from the “READ MORE” link text appended to truncated reviews |
| `phone`, `iphone`, `apple`                           | References to the **device and brand**, expected in any iPhone review |
| `camera`, `display`, `quality`                       | Praise for the **camera and display features**               |
| `battery`, `performance`                             | Positive mentions of **battery life** and **device performance** |
| `product`                                             | Overall **positive opinion** about the product               |

#### Insights:
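- Positive reviews most often mention the **camera, battery, performance, and display**, which suggests these are the features customers are happiest with.
- Generic praise words (“good”, “best”, “awesome”, “great”, “amazing”) dominate the list. They indicate broad **satisfaction**, but on their own they do not demonstrate brand loyalty.
- `read`, `goodread`, `productread`, and `phoneread` are **text artifacts** from scraping: the “READ MORE” link text survives cleaning and fuses with the preceding word (for example, “GoodREAD MORE” becomes `goodread`). They carry no meaning and should be removed before the scores are interpreted.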
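
The `read` / `goodread`-style tokens can be avoided by stripping the trailing “READ MORE” text before the cleaning step. The sketch below is one possible approach, assuming the marker always appears at the very end of a review; the helper name `strip_read_more` is hypothetical and not part of the notebook.

```python
import re

# Assumption: truncated reviews end with the literal text "READ MORE",
# sometimes glued directly onto the last word (e.g. "GoodREAD MORE").
READ_MORE = re.compile(r"\s*READ MORE\s*$")

def strip_read_more(review):
    """Remove a trailing 'READ MORE' marker so it cannot fuse with the last word."""
    return READ_MORE.sub("", str(review))

print(strip_read_more("GoodREAD MORE"))
print(strip_read_more("Best iPhone yet READ MORE"))
```

Applied ahead of `clean_text`, for example `df['Review'].apply(strip_read_more).apply(clean_text)`, this would free four of the twenty keyword slots for real terms. The TF-IDF tables in this notebook were produced without this step.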


In [10]:
# Filter negative reviews
negative_reviews = df[df['Sentiment'] == 'Negative']['Cleaned_Review']

# Apply TF-IDF
vectorizer_neg = TfidfVectorizer(max_features=20)
tfidf_matrix_neg = vectorizer_neg.fit_transform(negative_reviews)

# Get top terms
tfidf_df_neg = pd.DataFrame(tfidf_matrix_neg.toarray(), columns=vectorizer_neg.get_feature_names_out())
average_tfidf_neg = tfidf_df_neg.mean().sort_values(ascending=False).reset_index()
average_tfidf_neg.columns = ['Term', 'TFIDF_Score']

print("Top Negative Keywords:")
print(average_tfidf_neg)


Top Negative Keywords:
            Term  TFIDF_Score
0          worst     0.204956
1            bad     0.156192
2        battery     0.134334
3          phone     0.122216
4         iphone     0.115450
5        product     0.103153
6       purchase     0.092793
7         camera     0.080659
8   disappointed     0.079183
9           days     0.075230
10    experience     0.066964
11         apple     0.066696
12          poor     0.063484
13        screen     0.060293
14      flipkart     0.060233
15          dont     0.058305
16       quality     0.053222
17   performance     0.051572
18        itread     0.045469
19       service     0.043924


##  What Exactly Are Customers Talking Negatively About?

Based on the TF-IDF analysis of negative reviews for iPhone 14 and 15 models, here are the key themes and insights derived from the most significant keywords.

---

### **Top Negative Keywords and What They Indicate**:

| **Keyword Group**                               | **Interpretation**                                                                 |
|--------------------------------------------------|-------------------------------------------------------------------------------------|
| `battery`, `performance`                         | Complaints about **battery** and **device performance**; the keywords do not show the specific fault |
| `worst`, `bad`, `poor`, `disappointed`           | Strong **emotional dissatisfaction** or frustration with the experience            |
| `phone`, `iphone`, `product`, `apple`            | General negative references to the device or the brand                             |
| `camera`, `quality`, `screen`                    | Criticism of **hardware quality**, particularly the camera and screen              |
| `flipkart`, `days`, `service`, `purchase`        | Possible frustration with the **purchase, delivery, or after-sales service**       |

---

### **Insights**:

- **Battery** is the highest-ranked feature term (third overall, after *worst* and *bad*), making it the most prominent product-specific complaint. The keywords alone do not show whether the problem is battery life, charging, or something else.
- Terms like *worst*, *bad*, *disappointed*, and *poor* show that **negative reviews are strongly worded rather than mild**.
- *camera*, *quality*, and *screen* also appear, so some complaints concern **hardware quality**.
- Mentions of **Flipkart**, **days**, and **service** suggest frustration not only with the device but also with **delivery or the buying experience**. Confirming this requires reading the individual reviews.
- The term **purchase** may point to regret or to **expectations set during the buying process**, but a single keyword cannot establish which.
- Only 29 reviews were labelled negative, so these rankings rest on a small sample and can shift with a handful of reviews.
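
Because mean TF-IDF scores can be driven by a few short reviews, a plain count of how many reviews mention each term is a useful cross-check on the interpretations above. The sketch below is one way to get that count; `mention_counts` is a hypothetical helper and the reviews are toy data, not rows from the dataset.

```python
import re
import pandas as pd

def mention_counts(reviews, terms):
    """Return how many reviews contain each term as a whole word."""
    counts = {}
    for term in terms:
        pattern = rf"\b{re.escape(term)}\b"
        counts[term] = int(reviews.str.contains(pattern, regex=True).sum())
    return pd.Series(counts, name="reviews_mentioning")

# Toy cleaned reviews for illustration only.
toy_negative = pd.Series([
    "worst battery backup",
    "bad purchase flipkart service",
    "battery drains days",
])

counts = mention_counts(toy_negative, ["battery", "flipkart", "camera"])
print(counts)
```

In the notebook, `mention_counts(negative_reviews, ["battery", "flipkart", "service"])` would show how many of the 29 negative reviews actually raise each theme.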


### Summary Insight: What Are Customers Really Saying About iPhone 14/15?

After analyzing both **positive** and **negative** reviews using sentiment classification and TF-IDF, several key themes emerged that highlight **what customers appreciate** and **what frustrates them**:

---

#### What Customers Really Like:
- **Camera & Display**: These features are frequently praised with words like *"camera"*, *"display"*, and *"quality"*. Customers describe their experiences as *"awesome"*, *"nice"*, and *"best"*.
- **Performance**: *"performance"* ranks among the top 20 positive keywords, indicating satisfaction with how the device operates in day-to-day use.
- **Battery & Design**: While battery appears on both sides, it ranks seventh among positive keywords. *Design* is praised in individual reviews (for example, "best design" and "awesome design") but does not reach the top 20 terms.
- **Brand Mentions**: *"apple"* and *"iphone"* appear frequently in positive reviews. This is consistent with brand affinity, although product names are expected in any review of the device.

---

#### What Customers Really Dislike:
- **Battery Life**: Despite praise from some users, *"battery"* was the highest-ranked feature term in negative reviews (third overall, behind *"worst"* and *"bad"*), making it the most common product-specific complaint.
- **Strong Negative Emotions**: Words like *"worst"*, *"bad"*, *"poor"*, and *"disappointed"* were prominent, indicating **deep dissatisfaction** among the small group of negative reviewers.
- **Camera & Performance Issues**: The appearance of *"camera"*, *"performance"*, and *"screen"* in negative reviews suggests **inconsistent hardware or software experiences**, depending on user expectations.
- **Shopping Experience Problems**: Keywords such as *"flipkart"*, *"purchase"*, *"service"*, and *"days"* suggest that **delivery delays or third-party platform issues** also contributed to negative opinions.

---

#### Overall Summary Insight:
- The iPhone 14 and 15 series receive strong praise for **camera quality, display, and performance**. Positive reviews outnumber negative ones 1961 to 29 under the TextBlob labels.
- **Battery issues**, **inconsistent experiences**, and **platform-related frustrations** (particularly with Flipkart) are the main sources of negative sentiment.
- The brand is well regarded overall, and the few negative reviews are strongly worded, which may reflect high expectations for **premium-priced** models.


In [15]:
# Count sentiment by model
product_sentiment = df.groupby(['Model Name', 'Sentiment']).size().unstack(fill_value=0)
product_sentiment['Total'] = product_sentiment.sum(axis=1)
product_sentiment['% Positive'] = round((product_sentiment['Positive'] / product_sentiment['Total']) * 100, 2)
product_sentiment['% Negative'] = round((product_sentiment['Negative'] / product_sentiment['Total']) * 100, 2)

print("Product-level Sentiment Summary:")
print(product_sentiment.sort_values('% Positive', ascending=False))


Product-level Sentiment Summary:
Sentiment                Negative  Neutral  Positive  Total  % Positive  \
Model Name                                                                
APPLE iPhone 15                 0        0       101    101      100.00   
APPLE iPhone 15 Pro             0        0        29     29      100.00   
APPLE iPhone 15 Pro Max         0        0        20     20      100.00   
APPLE iPhone 14 Plus            2        2       273    277       98.56   
APPLE iPhone 14 Pro             0        2        84     86       97.67   
APPLE iPhone 14 Pro Max         1        2       108    111       97.30   
APPLE iPhone 15 Plus            2        1        80     83       96.39   
APPLE iPhone 14                24       40      1266   1330       95.19   

Sentiment                % Negative  
Model Name                           
APPLE iPhone 15                0.00  
APPLE iPhone 15 Pro            0.00  
APPLE iPhone 15 Pro Max        0.00  
APPLE iPhone 14 Plus      

### Product Sentiment Summary – iPhone 14 & 15 Series

#### Products with the Most Positive Opinions
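The following models were labelled **100% positive** by the TextBlob classifier. The samples are small (101, 29, and 20 reviews respectively), so this should not be read as flawless satisfaction: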
- APPLE iPhone 15
- APPLE iPhone 15 Pro
- APPLE iPhone 15 Pro Max

#### Models with Very High Positive Sentiment (95%+)
- APPLE iPhone 14 Plus – 98.56% Positive
- APPLE iPhone 14 Pro – 97.67% Positive
- APPLE iPhone 14 Pro Max – 97.30% Positive
- APPLE iPhone 15 Plus – 96.39% Positive

#### Product with Highest Volume of Negative Reviews
Although still well-received, **iPhone 14** shows the **highest number of negative reviews (24)**, with:
- 95.19% Positive
- 1.80% Negative
- 1330 Total Reviews

This slightly higher volume of negative and neutral reviews may reflect its **wide adoption and exposure**, which naturally attracts more diverse feedback.

---

### Overall Insight
All models are positively received by users, with at least 95% of reviews labelled positive. Three of the four **iPhone 15** models (15, 15 Pro, and 15 Pro Max) have no negative or neutral reviews, while the iPhone 15 Plus sits at 96.39%.  
Meanwhile, the **iPhone 14**, while still favorable, reflects a **more diverse range of user opinions**, likely because it has far more reviews (1330) than any other model.
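
Since the 100% figures come from as few as 20 reviews, a confidence interval gives a better sense of how much the percentages can be trusted. The sketch below uses the Wilson score interval, which is an addition for illustration and not part of the original analysis. The counts are copied from the summary table above, and the labels are still the TextBlob ones.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z=1.96)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half), min(1.0, centre + half)

# (positive, total) pairs taken from the product-level summary table.
models = {
    "APPLE iPhone 15 Pro Max": (20, 20),
    "APPLE iPhone 15": (101, 101),
    "APPLE iPhone 14": (1266, 1330),
}

for name, (positive, total) in models.items():
    low, high = wilson_interval(positive, total)
    print(f"{name}: {low:.1%} to {high:.1%} positive")
```

The interval for a 20-review model is much wider than the one for the 1330-review iPhone 14, so the ranking of models by `% Positive` is less certain than the table suggests.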


In [22]:
#  Total positive and negative sentiments overall
total_positive = product_sentiment['Positive'].sum()
total_negative = product_sentiment['Negative'].sum()

# Get the single model with the most positive reviews
top_positive_models = product_sentiment.sort_values(by='Positive', ascending=False).head(1)
top_positive_contribution = (top_positive_models['Positive'].sum() / total_positive) * 100

# Get the single model with the most negative reviews
top_negative_models = product_sentiment.sort_values(by='Negative', ascending=False).head(1)
top_negative_contribution = (top_negative_models['Negative'].sum() / total_negative) * 100

# Print results
print(" Top 1 model Contribution to Total Positive Sentiment: {:.2f}%".format(top_positive_contribution))
print(" Top 1 model Contribution to Total Negative Sentiment: {:.2f}%".format(top_negative_contribution))


 Top 1 model Contribution to Total Positive Sentiment: 64.56%
 Top 1 model Contribution to Total Negative Sentiment: 82.76%
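
Both figures follow directly from the summary table: the iPhone 14 accounts for 1266 of 1961 positive reviews and 24 of 29 negative ones. The sketch below extends the same arithmetic to every model; the counts are copied by hand from the table above, which is an assumption of this example rather than how the notebook computes them.

```python
import pandas as pd

# Counts copied from the product-level sentiment summary.
counts = pd.DataFrame(
    {
        "Negative": [24, 2, 0, 1, 0, 2, 0, 0],
        "Positive": [1266, 273, 84, 108, 101, 80, 29, 20],
    },
    index=[
        "APPLE iPhone 14",
        "APPLE iPhone 14 Plus",
        "APPLE iPhone 14 Pro",
        "APPLE iPhone 14 Pro Max",
        "APPLE iPhone 15",
        "APPLE iPhone 15 Plus",
        "APPLE iPhone 15 Pro",
        "APPLE iPhone 15 Pro Max",
    ],
)

# Each model's share of all positive and all negative reviews, in percent.
share = (counts / counts.sum() * 100).round(2)
print(share.sort_values("Positive", ascending=False))
```

Because the iPhone 14 also contributes 1330 of the 2037 reviews overall, its large share of both classes mostly reflects review volume rather than a difference in quality.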


In [19]:
def get_pros_cons_by_product(group_column, top_n=10):
    from sklearn.feature_extraction.text import TfidfVectorizer

    unique_products = df[group_column].dropna().unique()

    for product in unique_products:
        print(f"\n Product: {product}")
        sub_df = df[df[group_column] == product]

        # POSITIVE
        pos_reviews = sub_df[sub_df['Sentiment'] == 'Positive']['Cleaned_Review']
        if not pos_reviews.empty:
            tfidf_pos = TfidfVectorizer(max_features=top_n)
            pos_matrix = tfidf_pos.fit_transform(pos_reviews)
            pos_scores = pos_matrix.mean(axis=0).A1
            pos_terms = tfidf_pos.get_feature_names_out()
            top_pos = sorted(zip(pos_terms, pos_scores), key=lambda x: -x[1])
            print(" Pros:", [term for term, score in top_pos])
        else:
            print(" Pros: No positive reviews")

        # NEGATIVE
        neg_reviews = sub_df[sub_df['Sentiment'] == 'Negative']['Cleaned_Review']
        if not neg_reviews.empty:
            tfidf_neg = TfidfVectorizer(max_features=top_n)
            neg_matrix = tfidf_neg.fit_transform(neg_reviews)
            neg_scores = neg_matrix.mean(axis=0).A1
            neg_terms = tfidf_neg.get_feature_names_out()
            top_neg = sorted(zip(neg_terms, neg_scores), key=lambda x: -x[1])
            print(" Cons:", [term for term, score in top_neg])
        else:
            print(" Cons: No negative reviews")

# Run the function
get_pros_cons_by_product('Model Name', top_n=10)



 Product: APPLE iPhone 14
 Pros: ['good', 'read', 'phone', 'iphone', 'camera', 'nice', 'battery', 'product', 'best', 'performance']
 Cons: ['bad', 'worst', 'battery', 'phone', 'product', 'iphone', 'flipkart', 'days', 'purchase', 'apple']

 Product: APPLE iPhone 14 Plus
 Pros: ['good', 'phone', 'read', 'battery', 'iphone', 'camera', 'awesome', 'product', 'best', 'plus']
 Cons: ['apple', 'poor', 'skin', 'problem', 'simultaneously', 'accurate', 'plus', 'product', 'showing', 'waste']

 Product: APPLE iPhone 14 Pro
 Pros: ['read', 'iphone', 'best', 'camera', 'phone', 'awesome', 'pro', 'just', 'quality', 'apple']
 Cons: No negative reviews

 Product: APPLE iPhone 14 Pro Max
 Pros: ['camera', 'awesome', 'phone', 'iphone', 'read', 'pro', 'max', 'got', 'just', 'quality']
 Cons: ['video', 'apples', 'buzzy', 'disappointed', 'eyeread', 'feel', 'galaxy', 'iphone', 'moveing', 'night']

 Product: APPLE iPhone 15
 Pros: ['read', 'phone', 'awesome', 'quality', 'performance', 'camera', 'good', 'battery

### Model-wise Pros, Cons & Insights (Based on Review TF-IDF)

---

#### APPLE iPhone 14
**Pros**: Users often described the iPhone 14 as “good,” “nice,” and “best,” with the camera, battery, and performance the most-mentioned features.  
**Cons**: The same core features were also the subject of complaints — particularly battery issues, delivery concerns via Flipkart, and general dissatisfaction with product quality.  
**Insight**: The iPhone 14 is a well-liked model, but inconsistent battery performance and purchase-related frustrations (e.g., delivery or platform experience) seem to bring sentiment down slightly.

---

#### APPLE iPhone 14 Plus  
**Pros**: Customers praised the battery and camera, frequently describing the device as “good,” “awesome,” and “best.”  
**Cons**: Only 2 reviews were labelled negative, so the cons list (“poor,” “problem,” “waste,” “skin,” “showing”) reflects individual complaints rather than a pattern.  
**Insight**: The iPhone 14 Plus is appreciated for its battery and camera. The few negative reviews are too sparse to identify a consistent weakness.

---

#### APPLE iPhone 14 Pro  
**Pros**: Customers praised the phone's camera and overall quality, using terms like “best,” “awesome,” and “quality.”  
**Cons**: No reviews were labelled negative for this model.  
**Insight**: The iPhone 14 Pro is well received, with no negative reviews among its 86 (2 were labelled neutral). The keywords point to the camera as its main strength.

---

#### APPLE iPhone 14 Pro Max  
**Pros**: The model was praised mainly for its camera, with words like “awesome,” “quality,” and “pro” appearing frequently.  
**Cons**: The cons list comes from a single negative review. Terms such as “disappointed,” “video,” “night,” and “galaxy” hint at dissatisfaction with low-light or video performance, possibly in comparison with a Samsung Galaxy phone.  
**Insight**: The iPhone 14 Pro Max is generally seen as a strong camera phone. One reviewer found its low-light or video performance lacking, which is not enough to generalize.

---

#### APPLE iPhone 15  
**Pros**: Users praised the camera, battery, performance, and overall quality. The experience was described as “awesome” and “good.”  
**Cons**: No reviews were labelled negative for this model.  
**Insight**: The iPhone 15 is well received across its 101 reviews, with the core features praised and no drawbacks reported.

---

#### APPLE iPhone 15 Plus  
**Pros**: Customers frequently praised the phone's camera, battery life, and performance, calling it “awesome,” “best,” and “amazing.”  
**Cons**: Some users mentioned terms like “purchase,” “request,” and “screen,” suggesting dissatisfaction related to ordering, screen usability, or support issues.  
**Insight**: The iPhone 15 Plus is seen as a strong offering technically, but some friction in the buying or support experience created minor negative sentiment.

---

#### APPLE iPhone 15 Pro  
**Pros**: Reviewers described this phone as “superb,” especially in terms of performance, camera quality, and build. Interestingly, even words like “heating” and “heats” appeared in positive reviews, likely framed in a non-critical way.  
**Cons**: No negative reviews were found.  
**Insight**: The iPhone 15 Pro is perceived as one of the best-performing models, offering speed, quality, and a premium feel. No drawbacks were reported, although the sample is small (29 reviews).
