# 📱 iPhone Sales Analysis Project
This notebook analyzes iPhone sales, ratings, reviews, and pricing data to uncover trends and insights.
We follow a structured approach based on the Data Visualization Project rubric.

In [None]:
# Import necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style='whitegrid')
%matplotlib inline

In [None]:
# Load dataset
df = pd.read_csv('apple_products.csv')
df.head()

## 🔍 1. Data Cleaning and Handling Missing Values

In [None]:
# Check for missing values
df.isnull().sum()

In [None]:
# Visualize missing data
plt.figure(figsize=(10,6))
sns.heatmap(df.isnull(), cbar=False, cmap='viridis')
plt.title('Missing Values Heatmap')
plt.show()
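
The cells above only locate missing values; they do not yet handle them. One possible approach, sketched below on made-up toy data, is to drop rows that lack a key identifier and fill remaining numeric gaps with the column median. The helper name `handle_missing` and the choice of median imputation are assumptions for illustration, not part of the original analysis.

```python
import numpy as np
import pandas as pd

def handle_missing(frame, key_cols):
    """Drop rows missing any key column, then median-fill numeric gaps."""
    # Rows without an identifier cannot be analysed, so they are removed.
    cleaned = frame.dropna(subset=key_cols).copy()
    # Fill remaining numeric NaNs with each column's own median.
    numeric_cols = cleaned.select_dtypes(include="number").columns
    cleaned[numeric_cols] = cleaned[numeric_cols].fillna(cleaned[numeric_cols].median())
    return cleaned

# Toy data standing in for the real file (values are invented).
toy = pd.DataFrame({
    "Product Name": ["A", "B", None, "D"],
    "Sale Price": [100.0, np.nan, 300.0, 500.0],
})
cleaned_toy = handle_missing(toy, key_cols=["Product Name"])
print(cleaned_toy)
```

On the real data, the same call would be `handle_missing(df, key_cols=['Product Name'])`. Whether median imputation is appropriate depends on how many values are missing and why.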

## 🧠 2. Feature Selection and Engineering

In [None]:
# Example of new features
df['DiscountAmount'] = df['MRP'] - df['Sale Price']
df['DiscountPercent'] = (df['DiscountAmount'] / df['MRP']) * 100
df.head()
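
If any row has an `MRP` of zero, the percentage above divides by zero and yields `inf` or `NaN`. A slightly more defensive variant is sketched here on invented toy data; the helper name `add_discount_features` is hypothetical, and treating non-positive prices as missing is an assumption.

```python
import pandas as pd

def add_discount_features(frame, mrp_col="MRP", sale_col="Sale Price"):
    """Return a copy of frame with DiscountAmount and DiscountPercent added."""
    out = frame.copy()
    out["DiscountAmount"] = out[mrp_col] - out[sale_col]
    # Replace non-positive list prices with NaN so the division cannot blow up.
    safe_mrp = out[mrp_col].where(out[mrp_col] > 0)
    out["DiscountPercent"] = out["DiscountAmount"] / safe_mrp * 100
    return out

# Toy data (invented values); the last row has an unusable MRP of 0.
toy = pd.DataFrame({"MRP": [1000, 800, 0], "Sale Price": [900, 800, 0]})
toy_features = add_discount_features(toy)
print(toy_features)
```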

## 🔒 3. Data Integrity and Consistency

In [None]:
# Strip stray whitespace so identical names match, then count rows per product name
df['Product Name'] = df['Product Name'].str.strip()
df['Product Name'].value_counts().head()
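
Counting names shows repeated products, but it does not tell us whether entire rows are duplicated. A minimal sketch of a full-row duplicate check follows, using invented toy rows; collapsing internal whitespace before comparing is an extra normalization step assumed here, not something the original cell does.

```python
import pandas as pd

# Toy rows (invented): the first two differ only by leading whitespace.
toy = pd.DataFrame({
    "Product Name": ["  Phone X (Black, 64 GB)", "Phone X (Black, 64 GB)", "Phone Y (Red, 128 GB) "],
    "Sale Price": [49900, 49900, 59900],
})

# Normalize names: trim the ends and collapse runs of whitespace to one space.
toy["Product Name"] = (
    toy["Product Name"].str.strip().str.replace(r"\s+", " ", regex=True)
)

# duplicated() marks every repeat of an earlier, fully identical row.
n_duplicates = int(toy.duplicated().sum())
deduped = toy.drop_duplicates().reset_index(drop=True)
print(n_duplicates, len(deduped))
```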

## 📊 4. Summary Statistics and Insights

In [None]:
# Get summary statistics
df.describe()

## 🔍 5. Identifying Patterns, Trends, and Anomalies

In [None]:
# Plot ratings vs sale price
plt.figure(figsize=(10,6))
sns.scatterplot(data=df, x='Sale Price', y='Rating', hue='Product Name')
plt.title('Sale Price vs Rating')
plt.show()

## 🚨 6. Handling Outliers and Data Transformations

In [None]:
# Visualize potential outliers in Sale Price
plt.figure(figsize=(10,6))
sns.boxplot(x=df['Sale Price'])
plt.title('Outliers in Sale Price')
plt.show()
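
The boxplot shows outliers visually; to flag them in the data, one common option is the 1.5 × IQR rule that boxplot whiskers are based on. The sketch below applies it to invented prices and also shows a log transform as one possible way to compress a skewed price range. The helper name `iqr_outlier_mask` is hypothetical, and the choice of `k = 1.5` is a convention rather than a requirement.

```python
import numpy as np
import pandas as pd

def iqr_outlier_mask(values, k=1.5):
    """Return a boolean Series that is True where a value lies outside the IQR fences."""
    q1, q3 = values.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)

# Toy prices (invented) with one obviously extreme value.
toy_prices = pd.Series([100, 110, 120, 130, 140, 1000], name="Sale Price")
mask = iqr_outlier_mask(toy_prices)
print(toy_prices[mask])

# log1p keeps the ordering of prices while shrinking the gap to extreme values.
log_prices = np.log1p(toy_prices)
```

On the real data, `iqr_outlier_mask(df['Sale Price'])` would return the rows worth inspecting before deciding whether to keep, cap, or transform them.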

## 📈 7. Visual Representation of Key Findings

In [None]:
# Correlation heatmap (numeric columns only; text columns such as 'Product Name' cannot be correlated)
plt.figure(figsize=(10,6))
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap='coolwarm')
plt.title('Correlation Heatmap')
plt.show()


## 📘 8. Final Insights & Storytelling

Based on the visualized data, here are the key insights:

- 📉 **Sale Price vs. Number of Ratings**: Sale price is negatively correlated with the number of ratings, so lower-priced iPhones tend to receive more ratings. This is consistent with affordability boosting popularity and user engagement.
- 🎯 **Discount Percentage vs. Number of Ratings**: Discount percentage trends positively with the number of ratings, so iPhones with higher discounts tend to receive more ratings. Discounts appear to go hand in hand with stronger customer response.
- ⭐ **Strategic Insight**: To maximize customer engagement and reviews, mid-range or discounted iPhones could be promoted more actively.
- 💡 **Recommendation**: Future marketing strategies should leverage discounts and optimize pricing tiers for greater product visibility.

These findings can support business decisions for marketing, inventory, and pricing optimization.
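
The correlations cited in the insights above can be checked numerically rather than read off a chart. The sketch below computes a Spearman rank correlation as the Pearson correlation of ranks, on invented toy data. The helper name `rank_correlation` is hypothetical, and the column name `Number Of Ratings` is an assumption that should be matched against `df.columns`.

```python
import pandas as pd

def rank_correlation(frame, x_col, y_col):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    pair = frame[[x_col, y_col]].dropna()
    return pair[x_col].rank().corr(pair[y_col].rank())

# Toy data (invented): ratings fall steadily as price rises.
toy = pd.DataFrame({
    "Sale Price": [30000, 40000, 50000, 60000, 70000],
    "Number Of Ratings": [900, 700, 650, 300, 100],
})
rho = rank_correlation(toy, "Sale Price", "Number Of Ratings")
print(round(rho, 3))
```

A value near -1 or +1 indicates a strong monotonic relationship and a value near 0 indicates little or none. Either way, the coefficient describes association and does not show that price or discount causes the change in ratings.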



## 🌐 Interactive Visualization: Sale Price vs Number of Reviews

The following interactive chart uses Plotly to explore the relationship between **sale price** and **number of reviews** for various iPhone models. You can hover over points to view more detail.


In [None]:

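import plotly.express as px

# Reuses 'df' from the cells above, with the same column names used earlier in this notebook
fig = px.scatter(
    df,
    x='Sale Price',
    y='Number Of Reviews',  # assumed column name; confirm against df.columns
    color='Product Name',
    hover_data=['DiscountPercent'],  # created in the feature engineering cell
    title='📉 Sale Price vs Number of Reviews (Interactive)'
)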

fig.show()
