- Clean and merge the provided dataset.
- Perform exploratory data analysis to understand patterns in customer behavior, products, and transactions.
- Derive at least 5 business insights based on the analysis.
- EDA Notebook: Likhith_Raj_EDA.ipynb
- Business Insights PDF: Likhith_Raj_EDA.pdf
- Most Active Region: South America had the highest number of orders (59).
- Top-Selling Category: Books emerged as the most sold product category in 2024.
- Top Customer: Customer C0109 was the most active, having placed 11 orders.
- Peak Transaction Month: January 2024 recorded the highest number of transactions.
- Revenue Trends: Total revenue in 2024 reached $332,669.30, with significant contributions from South America.
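
The merge and aggregation steps behind insights like these can be sketched with pandas. This is a minimal illustration, not the notebook's actual code: the three tables, their column names (`CustomerID`, `ProductID`, `TransactionID`, `TransactionDate`, `Quantity`, `TotalValue`, `Region`, `Category`), and the tiny inline data are all assumptions made so the example runs on its own.

```python
import pandas as pd

# Made-up stand-ins for the real tables; every column name here is an assumption.
customers = pd.DataFrame({
    "CustomerID": ["C1", "C2", "C3"],
    "Region": ["South America", "Europe", "South America"],
})
products = pd.DataFrame({
    "ProductID": ["P1", "P2"],
    "Category": ["Books", "Electronics"],
})
transactions = pd.DataFrame({
    "TransactionID": ["T1", "T2", "T3", "T4"],
    "CustomerID": ["C1", "C1", "C2", "C3"],
    "ProductID": ["P1", "P2", "P1", "P1"],
    "TransactionDate": ["2024-01-05", "2024-01-20", "2024-02-11", "2024-03-02"],
    "Quantity": [2, 1, 3, 1],
    "TotalValue": [40.0, 300.0, 60.0, 20.0],
})


def merge_tables(transactions, customers, products):
    """Clean the transaction table and attach customer and product attributes."""
    df = transactions.drop_duplicates(subset="TransactionID").copy()
    df["TransactionDate"] = pd.to_datetime(df["TransactionDate"])
    # validate= raises if a customer or product ID is duplicated in its lookup table.
    df = df.merge(customers, on="CustomerID", how="left", validate="many_to_one")
    df = df.merge(products, on="ProductID", how="left", validate="many_to_one")
    return df


merged = merge_tables(transactions, customers, products)

# One aggregation per insight type listed above.
orders_by_region = (
    merged.groupby("Region")["TransactionID"].nunique().sort_values(ascending=False)
)
units_by_category = (
    merged.groupby("Category")["Quantity"].sum().sort_values(ascending=False)
)
orders_by_customer = merged["CustomerID"].value_counts()
transactions_by_month = merged["TransactionDate"].dt.to_period("M").value_counts()
total_revenue = merged["TotalValue"].sum()

print(orders_by_region.head(1))
print(units_by_category.head(1))
print(orders_by_customer.head(1))
print(transactions_by_month.head(1))
print(f"Total revenue: {total_revenue:,.2f}")
```

Left joins keep every transaction even when a lookup row is missing, which makes gaps in the customer or product tables visible as `NaN` instead of silently dropping orders.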
- Build a model that takes customer information and recommends the 3 most similar customers based on transaction and profile data.
- Calculate similarity using cosine similarity.
- Output a CSV file (Lookalike.csv) containing recommended lookalikes for each customer.
- Lookalike CSV: Lookalike.csv
- Lookalike Model Notebook: Likhith_Raj_Lookalike.ipynb
- Lookalike 1: C0190, Similarity Score: 0.9546
- Lookalike 2: C0091, Similarity Score: 0.9086
- Lookalike 3: C0174, Similarity Score: 0.9045
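
A cosine-similarity lookalike search of this kind can be sketched as follows. This is a hedged illustration under stated assumptions, not the notebook's implementation: the function `top_lookalikes`, the two features (total spend and order count), the column standardization step, and the five sample customers are all hypothetical.

```python
import numpy as np


def top_lookalikes(ids, features, k=3):
    """Return the k most cosine-similar customers for each customer ID."""
    X = np.asarray(features, dtype=float)

    # Standardize each column so no single feature dominates (an assumed choice).
    std = X.std(axis=0)
    std[std == 0] = 1.0
    X = (X - X.mean(axis=0)) / std

    # Cosine similarity is the dot product of unit-length rows.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    unit = X / norms
    sim = unit @ unit.T

    # Exclude each customer from their own recommendation list.
    np.fill_diagonal(sim, -np.inf)

    result = {}
    for i, cid in enumerate(ids):
        order = np.argsort(sim[i])[::-1][:k]
        result[cid] = [(ids[j], round(float(sim[i, j]), 4)) for j in order]
    return result


# Hypothetical customers described by (total spend, number of orders).
ids = ["C1", "C2", "C3", "C4", "C5"]
features = [[100, 2], [110, 2], [900, 10], [950, 11], [500, 6]]

lookalikes = top_lookalikes(ids, features, k=3)
for cid, matches in lookalikes.items():
    print(cid, matches)
```

Each entry of `lookalikes` maps a customer ID to three `(lookalike ID, score)` pairs, which is the shape needed to write one row per customer to a CSV file.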
- Perform customer segmentation using clustering techniques.
- Use customer profile and transaction data for clustering.
- Choose a suitable clustering algorithm (e.g., KMeans) and evaluate the clustering using the Davies-Bouldin Index (DBI).
- Visualize the clusters and analyze the results.
- Clustering Report PDF: Likhith_Raj_Clustering.pdf
- Clustering Notebook: Likhith_Raj_Clustering.ipynb
- 3 Clusters: DBI = 0.89
- 4 Clusters: DBI = 1.03
- 9 Clusters: DBI = 0.721 (best configuration; a lower DBI indicates more compact, better-separated clusters)
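
A comparison of cluster counts by DBI, like the one above, can be sketched with scikit-learn. This is a minimal sketch under assumptions: the data is synthetic (three well-separated blobs in two dimensions), and the scaling step, `random_state`, `n_init`, and the range of `k` are illustrative choices, not the notebook's settings.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for per-customer features: three well-separated groups.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 9.0]])
features = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

# KMeans is distance-based, so features are put on a common scale first.
X = StandardScaler().fit_transform(features)

# Fit KMeans for several cluster counts and record the Davies-Bouldin Index.
dbi_by_k = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    dbi_by_k[k] = davies_bouldin_score(X, labels)

# The configuration with the lowest DBI is preferred.
best_k = min(dbi_by_k, key=dbi_by_k.get)

for k, dbi in dbi_by_k.items():
    print(f"{k} clusters: DBI = {dbi:.3f}")
print(f"Best configuration: {best_k} clusters")
```

On real customer data the minimum is rarely this clear-cut, so it is worth checking the cluster plots alongside the DBI values before settling on a configuration.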