# 📊 iPhone Sales Business Insights Analysis
This notebook performs meaningful business analytics on iPhone sales data, including:
- Best-selling models
- Region-wise demand strength
- Low-demand iPhones
- Customer price preference
- Overpriced models detection
- Storage preference analysis
- Model-region-storage bestseller combos
- Price–demand sensitivity

Dataset required: `iphone_sales.csv` (generated earlier).

In [1]:

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

sns.set(style="whitegrid")

df = pd.read_csv("iphone_sales.csv")
df['Date'] = pd.to_datetime(df['Date'])
df.head()


Unnamed: 0,Model,Storage,Color,Region,Price,Units_Sold,Date,Month,Quarter,Year
0,iPhone 14,512,Gold,East,106820,152,2021-01-03,1,1,2021
1,iPhone XR,256,Blue,East,74131,409,2021-01-10,1,1,2021
2,iPhone 14 Pro,256,Silver,North,30769,393,2021-01-17,1,1,2021
3,iPhone 13 Pro,128,Red,North,113104,363,2021-01-24,1,1,2021
4,iPhone 13 Pro,64,Red,North,55658,219,2021-01-31,1,1,2021


## 1️⃣ Most Sold iPhone Model (Overall)

In [2]:

top_model = df.groupby("Model")["Units_Sold"].sum().sort_values(ascending=False)
print("📌 MOST SOLD iPhone model overall:")
display(top_model.head(1))


📌 MOST SOLD iPhone model overall:


Model
iPhone 12    11542
Name: Units_Sold, dtype: int64
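
A single top model says little about how far ahead it is. The sketch below shows one way to express each model's total as a share of all units sold. The `toy` frame and its values are made up for illustration and only borrow the `Model` and `Units_Sold` column names from the dataset.

```python
import pandas as pd

# Toy data standing in for iphone_sales.csv (values are invented).
toy = pd.DataFrame({
    "Model": ["iPhone 12", "iPhone 12", "iPhone 14", "iPhone SE"],
    "Units_Sold": [300, 200, 400, 100],
})

# Total units per model, largest first (same pattern as top_model).
units_by_model = toy.groupby("Model")["Units_Sold"].sum().sort_values(ascending=False)

# Each model's share of all units sold, as a percentage.
share_pct = (units_by_model / units_by_model.sum() * 100).round(1)
print(share_pct)
```

Applied to the real data, the same two lines would work on `top_model` directly.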

## 2️⃣ Best Region for Each iPhone Model

In [3]:

region_model_sales = df.groupby(["Model", "Region"])["Units_Sold"].sum().reset_index()

best_region_per_model = region_model_sales.loc[
    region_model_sales.groupby("Model")["Units_Sold"].idxmax()
]

print("📌 BEST REGION for each iPhone model:")
display(best_region_per_model)


📌 BEST REGION for each iPhone model:


Unnamed: 0,Model,Region,Units_Sold
3,iPhone 11,West,2520
6,iPhone 11 Pro,South,3353
11,iPhone 12,West,6311
12,iPhone 12 Pro,East,1792
18,iPhone 13,South,3658
21,iPhone 13 Pro,North,3792
24,iPhone 14,East,4399
29,iPhone 14 Pro,North,3251
35,iPhone SE,West,2670
36,iPhone XR,East,2589
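
The winning region does not show how dominant it is: for iPhone 12, West accounts for 6311 of 11542 units, more than half. A minimal sketch of the same `idxmax` pattern with a share column added follows. The toy data and the `Share_of_Model` column name are assumptions made for this example.

```python
import pandas as pd

# Toy data: two models sold in two regions (values are invented).
toy = pd.DataFrame({
    "Model": ["A", "A", "B", "B"],
    "Region": ["East", "West", "East", "West"],
    "Units_Sold": [10, 30, 25, 5],
})

# Share of each row within its model's total units.
model_total = toy.groupby("Model")["Units_Sold"].transform("sum")
toy["Share_of_Model"] = toy["Units_Sold"] / model_total

# idxmax gives, per model, the row label with the largest Units_Sold.
best_region = toy.loc[toy.groupby("Model")["Units_Sold"].idxmax()]
print(best_region)
```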


## 3️⃣ iPhones With Low Demand

In [4]:

low_demand = top_model[top_model < top_model.mean()]
print("📉 Low-demand iPhone models:")
display(low_demand)


📉 Low-demand iPhone models:


Model
iPhone 14        8395
iPhone SE        7684
iPhone XR        7603
iPhone 12 Pro    5445
Name: Units_Sold, dtype: int64
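
"Below the mean" is sensitive to one very strong seller, which pulls the mean up and marks most other models as low-demand. The sketch below compares that rule with a bottom-quartile cutoff on invented numbers. The quantile threshold is an alternative chosen for illustration, not something the notebook uses.

```python
import pandas as pd

# Invented totals: one dominant model and four similar ones.
units = pd.Series({"A": 200, "B": 60, "C": 55, "D": 50, "E": 45}, name="Units_Sold")

# Rule used in the notebook: anything below the mean (82 here).
below_mean = units[units < units.mean()]

# Alternative rule: anything at or below the 25th percentile (50 here).
bottom_quartile = units[units <= units.quantile(0.25)]

print("Below mean:", list(below_mean.index))
print("Bottom quartile:", list(bottom_quartile.index))
```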

## 4️⃣ Customer Price Range Preference

In [5]:

df["Price_Range"] = pd.cut(
    df["Price"],
    bins=[30000,50000,70000,90000,110000,130000],
    labels=["30–50k","50–70k","70–90k","90–110k","110–130k"]
)

# Price_Range is categorical; observed=False keeps every bin and avoids the pandas FutureWarning.
price_popularity = df.groupby("Price_Range", observed=False)["Units_Sold"].sum().sort_values(ascending=False)

print("💰 Price Range Most Preferred by Customers:")
display(price_popularity)


💰 Price Range Most Preferred by Customers:


Price_Range
30–50k      22196
90–110k     19645
50–70k      18211
70–90k      16873
110–130k     8900
Name: Units_Sold, dtype: int64
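
`pd.cut` builds right-closed intervals by default, so a price exactly equal to the lowest edge, or above the highest edge, receives no label and silently drops out of the grouped totals. The sketch below illustrates this with invented prices and the same bin edges. The ASCII-hyphen labels are a simplification for this example.

```python
import pandas as pd

bins = [30000, 50000, 70000, 90000, 110000, 130000]
labels = ["30-50k", "50-70k", "70-90k", "90-110k", "110-130k"]

# Invented prices sitting on or near the bin edges.
prices = pd.Series([30000, 30001, 50000, 50001, 130000, 135000])

# Default: intervals are (low, high], so 30000 itself is unlabeled.
default_cut = pd.cut(prices, bins=bins, labels=labels)

# include_lowest=True closes the first interval on the left as well.
inclusive_cut = pd.cut(prices, bins=bins, labels=labels, include_lowest=True)

print(pd.DataFrame({"price": prices, "default": default_cut, "inclusive": inclusive_cut}))
```

Checking `df["Price_Range"].isna().sum()` after binning is a cheap way to confirm that no rows fell outside the edges.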

## 5️⃣ Best-Selling Storage Variant

In [6]:

storage_sales = df.groupby("Storage")["Units_Sold"].sum().sort_values(ascending=False)

print("📦 Best-selling storage option:")
display(storage_sales)


📦 Best-selling storage option:


Storage
64     25790
256    20729
128    20003
512    19303
Name: Units_Sold, dtype: int64

## 6️⃣ Identify Overpriced Models (High Price, Low Sales)

In [7]:

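# Ratio of mean price to total units sold; a higher value means a high price relative to volume.
# Named aggregation replaces groupby.apply, which triggers a pandas DeprecationWarning here.
model_stats = df.groupby("Model").agg(
    avg_price=("Price", "mean"),
    total_units=("Units_Sold", "sum"),
)
overpriced = (model_stats["avg_price"] / model_stats["total_units"]).sort_values(ascending=False)

print("⚠️ Potentially overpriced iPhones:")
display(overpriced.head(5))


⚠️ Potentially overpriced iPhones: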


Model
iPhone 12 Pro    13.585772
iPhone XR         9.497875
iPhone 14         9.419718
iPhone 11         9.369712
iPhone SE         9.221913
dtype: float64
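
When mean prices are similar across models, a price-to-units ratio is driven mostly by the units in the denominator, so it largely repeats the low-demand ranking. One possible cross-check, sketched below on invented data, flags a model only if its mean price is above the median and its total units are below the median. The `overpriced_flag` column and the median thresholds are assumptions for this example, not part of the original analysis.

```python
import pandas as pd

# Invented rows for three models.
toy = pd.DataFrame({
    "Model": ["A", "A", "B", "B", "C", "C"],
    "Price": [100000, 110000, 40000, 50000, 90000, 95000],
    "Units_Sold": [50, 60, 300, 350, 400, 380],
})

model_stats = toy.groupby("Model").agg(
    avg_price=("Price", "mean"),
    total_units=("Units_Sold", "sum"),
)

# Flag: pricier than the median model AND selling less than the median model.
model_stats["overpriced_flag"] = (
    (model_stats["avg_price"] > model_stats["avg_price"].median())
    & (model_stats["total_units"] < model_stats["total_units"].median())
)
print(model_stats)
```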

## 7️⃣ Region With Highest Total Demand

In [8]:

region_demand = df.groupby("Region")["Units_Sold"].sum().sort_values(ascending=False)

print("🌍 Region with highest overall iPhone demand:")
display(region_demand)


🌍 Region with highest overall iPhone demand:


Region
West     23926
East     21710
North    20120
South    20069
Name: Units_Sold, dtype: int64

## 8️⃣ Bestselling Model–Storage–Region Combination

In [9]:

best_combo = df.groupby(["Model","Storage","Region"])["Units_Sold"].sum().sort_values(ascending=False)

print("🏆 Bestselling Model–Storage–Region Combinations:")
display(best_combo.head(10))


🏆 Bestselling Model–Storage–Region Combinations:


Model          Storage  Region
iPhone 12      512      West      2601
               256      West      2464
iPhone 14      64       East      2013
iPhone 13      256      South     1898
iPhone 13 Pro  64       North     1817
               128      South     1730
iPhone 11 Pro  512      South     1578
iPhone 14 Pro  128      West      1531
iPhone 11 Pro  256      North     1445
iPhone 14 Pro  256      North     1353
Name: Units_Sold, dtype: int64
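
A global top 10 can be dominated by a few models (several appear twice above). If the goal is the single best combination for every model, one option is to sort and keep the first row per model, as in this sketch on invented data.

```python
import pandas as pd

# Invented sales rows for two models.
toy = pd.DataFrame({
    "Model": ["A", "A", "A", "B", "B"],
    "Storage": [64, 128, 128, 64, 256],
    "Region": ["East", "East", "West", "North", "North"],
    "Units_Sold": [10, 40, 25, 5, 50],
})

# Total units per Model-Storage-Region combination, as a flat DataFrame.
combo = toy.groupby(["Model", "Storage", "Region"], as_index=False)["Units_Sold"].sum()

# Sort descending, then keep the top row within each model.
top_per_model = combo.sort_values("Units_Sold", ascending=False).groupby("Model").head(1)
print(top_per_model)
```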

## 9️⃣ Price–Demand Sensitivity

In [10]:

price_demand_corr = df["Price"].corr(df["Units_Sold"])
print(f"📈 Price–Demand Correlation: {price_demand_corr}")

# Thresholds are rules of thumb; |r| <= 0.1 is treated as no meaningful linear relationship.
if price_demand_corr < -0.3:
    print("➡️ Strong negative correlation: Higher prices reduce sales significantly.")
elif price_demand_corr < -0.1:
    print("➡️ Mild negative correlation: Higher prices are associated with somewhat lower sales.")
elif price_demand_corr <= 0.1:
    print("➡️ Negligible correlation: Price shows no meaningful linear relationship with units sold.")
else:
    print("➡️ Positive correlation: Higher-priced listings sell more units.")


📈 Price–Demand Correlation: -0.04244315910817519
➡️ Negligible correlation: Price shows no meaningful linear relationship with units sold.
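
A pooled correlation mixes cheap and expensive models, which can hide or even reverse the relationship that holds within each model. The sketch below, on invented numbers, computes the correlation per model next to the pooled value. The data is constructed to make the effect obvious and says nothing about the real dataset.

```python
import pandas as pd

# Invented data: within each model, units fall as price rises,
# but the pricier model B sells more overall.
toy = pd.DataFrame({
    "Model": ["A"] * 3 + ["B"] * 3,
    "Price": [40, 50, 60, 100, 110, 120],
    "Units_Sold": [30, 20, 10, 90, 80, 70],
})

# Pearson correlation over all rows together.
pooled = toy["Price"].corr(toy["Units_Sold"])

# Pearson correlation computed separately for each model.
per_model = pd.Series(
    {model: grp["Price"].corr(grp["Units_Sold"]) for model, grp in toy.groupby("Model")}
)

print(f"Pooled: {pooled:.3f}")
print(per_model)
```

Here the pooled value is positive while each model's own correlation is -1, so a per-model view is worth checking before drawing pricing conclusions.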


## 📝 Summary of Insights
This analysis covers:
- Best-selling models and weak performers
- Region-specific demand patterns
- The price range with the highest unit sales (30–50k)
- Models priced high relative to their sales volume
- Unit sales by storage option and price range
- Best-performing model–storage–region combinations
- Price–demand correlation, which is close to zero in this dataset (about -0.04)

These results are suitable for presentations, dashboards, or reports.