An exploratory data analysis and machine learning project investigating the US iPhone resale market post-iPhone 17 launch. This project leverages real marketplace data to identify pricing dynamics, dead inventory, and the key drivers of consumer demand.
Dataset sourced from Kaggle.
This project analyzes a unique dataset capturing active iPhone resale listings from publicly available e-commerce platforms in the USA in 2026.
Scope & Details:
- Timeframe: 2026 (Post-iPhone 17 launch window).
- Generations: iPhone 12 through iPhone 17.
- Variants: 23 distinct model configurations (Pro, Pro Max, Plus, Mini).
- Storage: Full range from 64GB to 1TB.
- Nature: Real marketplace listings with actual transaction pricing, not synthetic or estimated data.
Unique Value: Unlike standard pricing datasets, this one includes a rare demand signal (units_sold), revealing not just what sellers are asking but what buyers actually purchase. This makes it well suited for resale market analysis, depreciation modeling, and consumer electronics pricing research.
.
├── demand_analysis.py # Main Python script for analysis and modeling
├── ecommerce_iphone_resale_market_intelligence_usa_2026.csv # Source data
├── price_elasticity.png # Generated: Price vs. Demand analysis
├── demand_drivers.png # Generated: Feature importance chart
└── README.md # This file
The demand_analysis.py script performs the following operations:
- Data Cleaning & Engineering:
  - Cleans pricing data (handling international number formats).
  - Calculates Price Z-Scores to identify over/underpriced listings relative to identical models.
  - Calculates Supply Saturation (count of identical listings currently active).
  - Derives device Age based on generation number.
- Dead Inventory Detection:
  - Identifies "stale" listings (0 sales) in markets with high supply.
  - Finding: Identified iPhone 14 (128GB) and iPhone 12 (64GB) as the highest-risk inventory.
- Price Elasticity:
  - Analyzes how price tiers impact unit sales for specific high-volume models (e.g., iPhone 13 128GB).
- Predictive Modeling:
  - Target: is_high_demand (listings in the top 25% of sales, i.e. at or above the 75th percentile of units_sold).
  - Models: Logistic Regression vs. Random Forest Classifier.
  - Outcome: Random Forest achieved ~71% accuracy in identifying high-demand listings based on price, age, and supply.
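The feature engineering and dead-inventory steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in demand_analysis.py: the column names `model`, `storage_gb`, and `price` are hypothetical (only `units_sold` is named in this README), the age formula (17 minus the generation number) is an assumption, and the `min_supply` threshold for "high supply" is a placeholder the caller must choose.

```python
import re

import numpy as np
import pandas as pd

LATEST_GENERATION = 17  # assumption: age is measured relative to the iPhone 17


def clean_price(raw):
    """Parse a price written in US ("1,299.00") or European ("1.299,00") style."""
    s = re.sub(r"[^\d.,]", "", str(raw))  # drop currency symbols, spaces, letters
    if not any(ch.isdigit() for ch in s):
        return np.nan
    if "," in s and "." in s:
        # Whichever separator appears last is taken to be the decimal mark.
        if s.rfind(",") > s.rfind("."):
            s = s.replace(".", "").replace(",", ".")
        else:
            s = s.replace(",", "")
    elif "," in s:
        head, _, tail = s.rpartition(",")
        # A comma followed by exactly three digits is read as a thousands separator.
        s = s.replace(",", "") if len(tail) == 3 else head.replace(",", "") + "." + tail
    elif s.count(".") > 1:
        s = s.replace(".", "")  # several dots can only be thousands separators
    return float(s)


def engineer_features(df, keys=("model", "storage_gb")):
    """Add price_z, supply_saturation, age and is_high_demand columns to a copy of df."""
    out = df.copy()
    keys = list(keys)
    out["price"] = out["price"].map(clean_price)

    grouped = out.groupby(keys)["price"]
    mean = grouped.transform("mean")
    std = grouped.transform(lambda p: p.std(ddof=0))
    # Z-score within each group of identical listings; groups with no price
    # spread (e.g. a single listing) produce 0/0 and are set to 0.
    out["price_z"] = ((out["price"] - mean) / std).fillna(0.0)

    # Number of listings sharing the same model and storage configuration.
    out["supply_saturation"] = grouped.transform("size")

    # Assumed age formula: generations behind the latest model.
    generation = out["model"].str.extract(r"(\d+)", expand=False).astype(int)
    out["age"] = LATEST_GENERATION - generation

    # Top 25% of sales: at or above the 75th percentile of units_sold.
    cutoff = out["units_sold"].quantile(0.75)
    out["is_high_demand"] = (out["units_sold"] >= cutoff).astype(int)
    return out


def flag_dead_inventory(features, min_supply):
    """Return listings with zero sales in segments with at least min_supply listings."""
    stale = (features["units_sold"] == 0) & (features["supply_saturation"] >= min_supply)
    return features[stale]
```

A call such as `flag_dead_inventory(engineer_features(df), min_supply=10)` would then return the stale listings in crowded segments; the value 10 is only an example. Note that a price containing a single dot, such as "1.299", is read as a decimal number, which may be wrong for European-style listings.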
- Python 3.8+
- Virtual Environment (recommended)
- Clone the repository:
    git clone https://github.com/siyamthanda-code/iphone_classification.git
    cd iphone_classification
- Create and activate a virtual environment (Windows):
    python -m venv venv
    venv\Scripts\activate
- Install dependencies:
    pip install pandas numpy scikit-learn matplotlib seaborn
Ensure the dataset file is in the root directory, then run:
python demand_analysis.py
The script will output classification reports to the console and generate two visualization files (price_elasticity.png and demand_drivers.png).
- Supply Glut: The standard iPhone 14 (128GB) is currently the most saturated segment with the highest volume of unsold inventory.
- Demand Drivers: Age and supply saturation are the strongest negative predictors of demand, while competitive pricing (low Z-score) is the strongest positive predictor.
- Model Performance: The Random Forest model significantly outperformed Logistic Regression, suggesting non-linear relationships between price and demand in the resale market.
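The model comparison behind the findings above could look something like the sketch below. It is an illustration under stated assumptions, not the project's actual training code: the feature column names are hypothetical, the hyperparameters are arbitrary, and `make_synthetic_listings` generates purely synthetic placeholder data (with a deliberately non-linear rule) so the snippet runs without the CSV. Scores from it say nothing about the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

FEATURES = ["price_z", "age", "supply_saturation"]  # hypothetical column names


def make_synthetic_listings(n=600, seed=0):
    """Placeholder data only: high demand when a listing is both cheap and recent."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "price_z": rng.normal(size=n),
        "age": rng.integers(0, 6, size=n),
        "supply_saturation": rng.integers(1, 51, size=n),
    })
    rule = (df["price_z"] < 0) & (df["age"] <= 2)
    noise = rng.random(n) < 0.1  # flip roughly 10% of labels
    df["is_high_demand"] = (rule ^ noise).astype(int)
    return df


def compare_models(df, seed=42):
    """Fit both classifiers on one split; return test accuracies and forest importances."""
    X, y = df[FEATURES], df["is_high_demand"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    models = {
        # Scaling helps the linear model converge; the forest does not need it.
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "random_forest": RandomForestClassifier(n_estimators=200, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    importances = pd.Series(
        models["random_forest"].feature_importances_, index=FEATURES
    ).sort_values(ascending=False)
    return scores, importances
```

Calling `compare_models(make_synthetic_listings())` returns a dictionary of accuracies and a ranked importance series, which is the kind of data a chart like demand_drivers.png would plot. Random forest importances are unsigned, so the direction of each effect (positive or negative) has to come from elsewhere, for example the logistic regression coefficients.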

