
---

# **Sentiment Analysis of iPhone 15 Amazon Reviews**  
*An In-Depth NLP Exploration*

---

## 🚀 **1. Introduction**  
This project applies **sentiment analysis** to **Amazon reviews** of the **iPhone 15** using three NLP techniques:  
- **Lexicon-Based (NLTK VADER)**  
- **VADER Sentiment Analysis**  
- **TextBlob Sentiment Analysis**  

Our goal: **Identify customer sentiment, highlight key trends,** and **compare** the effectiveness of these methods.

---

## 🔍 **2. Data Loading & Initial Exploration**  
### **Dataset Overview**  
- **Source:** `amazon_reviews_all.csv`  
- **Size:** 494 reviews × 7 columns  
- **Key Features:**  
  - `Reviewer Name`  
  - `Stars` (rating)  
  - `Review Title`  
  - `Review` (text)  
  - `Review Date`  
  - `Verified Purchase`  
  - `Helpful Votes`

### **Initial Observations:**  
- **274 of the 494 reviews** had the placeholder **"Unknown"** in the `Review` column.  
- Some entries were **duplicates** or **incomplete**.
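
The checks behind these observations can be sketched roughly as follows, assuming pandas and the column names listed above. To stay runnable without the original file, the sketch reads a tiny made-up CSV from memory; on the real data, the same calls would take the path to `amazon_reviews_all.csv` instead.

```python
import io

import pandas as pd

# Tiny made-up sample standing in for amazon_reviews_all.csv (illustrative rows only).
SAMPLE_CSV = """Reviewer Name,Stars,Review Title,Review,Review Date,Verified Purchase,Helpful Votes
A,5,Great,Lightweight and fast,2024-01-01,True,3
B,4,Fine,Unknown,2024-01-02,True,0
B,4,Fine,Unknown,2024-01-02,True,0
C,2,Hot,,2024-01-03,False,1
"""

# Real run: pd.read_csv("amazon_reviews_all.csv")
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

shape = df.shape                                        # (rows, columns)
unknown_count = int((df["Review"] == "Unknown").sum())  # placeholder reviews
missing_count = int(df["Review"].isna().sum())          # rows with no review text
duplicate_count = int(df.duplicated().sum())            # exact duplicate rows

print(shape, unknown_count, missing_count, duplicate_count)
```
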

---

## 🧹 **3. Data Preprocessing**  
### **Cleaning Steps**  
- **Dropped Irrelevant Columns** → Focused on `Review`.  
- **Removed "Unknown" Reviews** → Filtered out 274 invalid entries.  
- **Handled Missing Data** → Dropped 5 NaN records.  
- **Final Dataset:** **215 reviews** (494 − 274 − 5) ready for analysis.
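
A minimal sketch of these cleaning steps, assuming pandas and a frame with a `Review` column. The rows below are made up for illustration, and the exact filtering calls used in the project may differ.

```python
import pandas as pd

# Made-up rows for illustration; the real frame comes from amazon_reviews_all.csv.
df = pd.DataFrame({
    "Stars": [5, 4, 2, 1],
    "Review": [
        "Lightweight, great battery life!",
        "Unknown",
        None,
        "Overheats while charging.",
    ],
})

reviews = df[["Review"]]                            # keep only the review text
reviews = reviews[reviews["Review"] != "Unknown"]   # drop placeholder reviews
reviews = reviews.dropna(subset=["Review"])         # drop rows with missing text
reviews = reviews.reset_index(drop=True)

print(len(reviews))
```

On this toy frame, two of the four rows survive. On the real data the same sequence would account for the drop from 494 rows to 215.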

### **Text Preprocessing Pipeline**  
- **Lowercased** → `"iPhone"` → `"iphone"`  
- **Removed Digits** → `"iphone 15"` → `"iphone "` (the trailing space is trimmed in the next step)  
- **Trimmed Extra Spaces** → `"good   phone"` → `"good phone"`  
- **Punctuation Removed** → `"Great!"` → `"great"`  
- **Converted Emojis** → `"👍"` → `":thumbs_up:"`  
- **Tokenized & Removed Stopwords** → `"the phone is good"` → `["phone", "good"]`  
- **Lemmatized** → `"running"` → `"run"`

The cleaned text is stored in a new `cleaned_Review` column.
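
The core of the pipeline above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's actual code: the stopword set is a small hand-picked stand-in for a full list such as NLTK's, whitespace is collapsed after punctuation removal rather than before, and emoji conversion and lemmatization are left out because they need third-party packages.

```python
import re
import string

# Small illustrative stopword set; a full list such as NLTK's would replace it.
STOPWORDS = {"the", "is", "a", "an", "and", "of", "to", "it"}

# Translation table that deletes every ASCII punctuation character.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text: str) -> list[str]:
    """Lowercase, strip digits and punctuation, collapse spaces, drop stopwords."""
    text = text.lower()                        # "iPhone" -> "iphone"
    text = re.sub(r"\d+", "", text)            # remove digits
    text = text.translate(PUNCT_TABLE)         # remove punctuation
    text = re.sub(r"\s+", " ", text).strip()   # collapse repeated whitespace
    return [tok for tok in text.split() if tok not in STOPWORDS]


print(preprocess("The iPhone 15   is good!"))
```

If lemmatization is added with NLTK's `WordNetLemmatizer`, note that it treats words as nouns by default, so mapping `"running"` to `"run"` requires passing `pos="v"`.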

---

## 🧠 **4. Sentiment Analysis Techniques**  
### **1. Lexicon-Based (NLTK VADER)**  
- Looks up each word in a lexicon of **pre-defined sentiment scores** and combines them into a compound score between -1 and 1.  
- **Classification Rule:**  
  - Positive if compound score ≥ **0.05**  
  - Negative if compound score ≤ **-0.05**  
  - Neutral otherwise
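
The rule above can be written as a small helper. The sketch below keeps the thresholding separate from the scoring so it runs without NLTK installed; `vader_label` is a hypothetical wrapper name, and it assumes NLTK with the `vader_lexicon` resource downloaded.

```python
def label_from_compound(compound: float) -> str:
    """Map a VADER compound score in [-1, 1] to a label using the thresholds above."""
    if compound >= 0.05:
        return "Positive"
    if compound <= -0.05:
        return "Negative"
    return "Neutral"


def vader_label(text: str) -> str:
    """Score text with NLTK's VADER (needs nltk and nltk.download("vader_lexicon"))."""
    # Imported lazily so label_from_compound works without NLTK installed.
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    compound = SentimentIntensityAnalyzer().polarity_scores(text)["compound"]
    return label_from_compound(compound)


print([label_from_compound(c) for c in (0.62, 0.05, 0.0, -0.05, -0.4)])
```
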

### **2. VADER Sentiment Analysis**  
- Optimized for **short social media-like texts**.  
- Reports the **intensity** of positive, negative, and neutral sentiment, plus an overall compound score.

### **3. TextBlob Sentiment Analysis**  
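- Offers two analyzers: the default **pattern-based** lexicon analyzer, which returns a polarity score between -1 and 1, and an optional **Naive Bayes** classifier. The rule below applies to the polarity score.  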
- **Classification Rule:**  
  - Positive if polarity > **0**  
  - Negative if polarity < **0**  
  - Neutral otherwise
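
A sketch of the same idea for TextBlob, with the sign-based rule kept apart from the scoring call. `textblob_label` is a hypothetical wrapper name, and it assumes the `textblob` package is installed and its default analyzer is used.

```python
def label_from_polarity(polarity: float) -> str:
    """Map a TextBlob polarity score in [-1, 1] to a label using the rule above."""
    if polarity > 0:
        return "Positive"
    if polarity < 0:
        return "Negative"
    return "Neutral"


def textblob_label(text: str) -> str:
    """Score text with TextBlob's default analyzer (needs the textblob package)."""
    # Imported lazily so label_from_polarity works without TextBlob installed.
    from textblob import TextBlob

    return label_from_polarity(TextBlob(text).sentiment.polarity)


print([label_from_polarity(p) for p in (0.5, 0.0, -0.25)])
```

Unlike the VADER rule, there is no dead zone around zero here: any nonzero polarity, however small, produces a Positive or Negative label.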

---

## 📊 **5. Results & Insights**  
### **Key Findings**  
- ✅ **Overwhelmingly Positive Sentiment** – Most reviews were labeled positive.  
- 🔄 **High Agreement** – VADER and TextBlob assigned the same label to most reviews.  
- ⚠️ **Neutral/Positive Discrepancies** – Where the two disagreed, TextBlob tended to label a review Neutral while VADER labeled it Positive.

### **Example Classifications**  
| **Review Excerpt**                  | **Lexicon**  | **VADER**    | **TextBlob** |  
|-------------------------------------|--------------|--------------|--------------|  
| *"Lightweight, great battery life!"*| Positive     | Positive     | Positive     |  
| *"Camera is just okay."*            | Positive     | Positive     | Neutral      |  
| *"Overheats while charging."*       | Negative     | Negative     | Negative     |  
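
Agreement between two methods can be quantified as the share of reviews that receive the same label. The sketch below uses only the three rows of the example table, so the resulting figure illustrates the calculation and says nothing about the full set of 215 reviews.

```python
from collections import Counter

# Labels copied from the example table above, in row order.
vader = ["Positive", "Positive", "Negative"]
textblob = ["Positive", "Neutral", "Negative"]

# Count each (VADER label, TextBlob label) pair.
pairs = Counter(zip(vader, textblob))

# Fraction of reviews where both methods agree.
agreement = sum(n for (v, t), n in pairs.items() if v == t) / len(vader)
print(f"agreement = {agreement:.2f}")

# List the disagreeing label pairs and how often each occurs.
for (v, t), n in sorted(pairs.items()):
    if v != t:
        print(f"VADER={v}, TextBlob={t}: {n}")
```
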

---

## 🔮 **6. Conclusion & Next Steps**  
### **Summary**  
- Analyzed **215 reviews** with **3 NLP techniques**.  
- Found **positive sentiment** dominating.  
- VADER & TextBlob showed high alignment.

### **Next Steps**  
🔹 **Topic Modeling** – Identify key discussion points like battery or camera.  
🔹 **Star Rating Correlation** – Compare sentiment vs. actual ratings.  
🔹 **BERT/Deep Learning** – Test advanced models for more accurate sentiment analysis.

---

### 🛠️ **Tools Used:**  
Python, Pandas, NLTK, VADER, TextBlob  

### **Author:** [Your Name]  
**Date:** [Current Date]  

---

**"Data tells a story—let's read it!"** 📊✨

---
