# 🛒 **Customer Purchase Behavior Analysis** 🛍️

## 🔍 **Objective**
In this notebook, we explore customer purchase behavior using **Exploratory Data Analysis (EDA)** and **interactive visualizations with Plotly**. We analyze key insights into **demographics, product preferences, loyalty, payment methods, and customer satisfaction** to uncover hidden patterns in the data.  

---

## 🏗️ **Steps Performed in EDA**

| **Step** | **Description** |
|----------|---------------|
| 📌 **Data Loading & Cleaning** | Imported the dataset and handled missing values. |
| 📊 **Descriptive Statistics** | Analyzed numerical distributions (mean, median, standard deviation). |
| 📈 **Customer Demographics & Purchasing Behavior** | Explored spending trends by **age, gender, location, and subscription status**. |
| 🛍️ **Product Preferences & Purchase Trends** | Identified **top-selling products, seasonal trends, and color/size preferences**. |
| 🔁 **Customer Loyalty & Purchase Frequency** | Examined repeat vs. one-time buyers and the effect of purchase history. |
| 💳 **Discounts, Promotions & Payment Preferences** | Evaluated the impact of **discounts, promo codes, and payment methods** on spending. |
| ⭐ **Ratings & Customer Satisfaction Analysis** | Correlated **review ratings with spending patterns and subscription status**. |

---

## 📌 **Key Questions Explored**

✅ **Younger vs. older customers** – Who spends more?  
✅ **Gender-based spending patterns** – Which categories are most popular for men and women?  
✅ **Location-based shopping behavior** – Do customers in some states spend more than others?  
✅ **Impact of subscription status** – Do subscribers purchase more frequently?  
✅ **Product preferences** – Which products dominate sales? Are there seasonal spikes?  
✅ **Effectiveness of discounts & promotions** – Do promo users spend more or less?  
✅ **Preferred payment methods** – Are digital wallets more popular than credit cards?  
✅ **Review ratings vs. sales** – Do highly rated products have more purchases?  

---

## 🎨 **Visualizations & Analysis**

📌 **We used Plotly for interactive plots with a dark theme**, ensuring a visually appealing and insightful analysis. Key plots include:  
✔️ **Bar Charts** 📊 – To visualize **top-selling categories, gender-wise spending, and location-based trends**.  
✔️ **Box Plots** 📦 – To analyze **purchase frequency vs. review ratings**.  
✔️ **Heatmaps** 🔥 – To uncover correlations between **spending behavior and demographics**.  
✔️ **Time Series Plots** ⏳ – To track **seasonal trends and purchase patterns**.  

---

🚀 **Let's dive into the data and unlock actionable insights!**


# **📚 Importing Libraries**

In [1]:
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio
import warnings

# **⚙️ Basic Settings**

In [2]:
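# Silence library warnings to keep cell output readable (this also hides deprecation notices)
warnings.filterwarnings("ignore")
# Render Plotly figures through the iframe renderer
pio.renderers.default = 'iframe'

def update_layout(fig):
    """Apply the shared dark theme to a figure and return it."""
    fig.update_layout(
        template="plotly_dark",  # Dark theme
        plot_bgcolor="black",    # Black background
        paper_bgcolor="black"
    )
    return fig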

# **📥 Loading the Dataset**

In [3]:
df = pd.read_csv("/kaggle/input/shopping-trends-dataset/shopping_trends.csv")
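
Before any plotting, it helps to confirm how many values are missing and whether rows are duplicated. The helper below is a minimal sketch of such a check. `summarize_quality` is a hypothetical name, and `demo` is a tiny synthetic frame with gaps added on purpose, so it can run without the Kaggle file.

```python
import numpy as np
import pandas as pd

def summarize_quality(frame: pd.DataFrame) -> pd.DataFrame:
    """Return dtype, missing count, and missing percentage per column."""
    return pd.DataFrame({
        "dtype": frame.dtypes.astype(str),
        "missing": frame.isna().sum(),
        "missing_pct": (frame.isna().mean() * 100).round(2),
    })

# Synthetic stand-in for df (values are illustrative, with gaps added on purpose)
demo = pd.DataFrame({
    "Age": [55, 19, np.nan, 21],
    "Gender": ["Male", "Male", "Female", None],
    "Purchase Amount (USD)": [53, 64, 73, 90],
})

report = summarize_quality(demo)
n_duplicates = int(demo.duplicated().sum())  # fully identical rows
```

On the real data the call would be `summarize_quality(df)`.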

# **📊 Exploring the Dataset**

In [4]:
df.head()

Unnamed: 0,Customer ID,Age,Gender,Item Purchased,Category,Purchase Amount (USD),Location,Size,Color,Season,Review Rating,Subscription Status,Payment Method,Shipping Type,Discount Applied,Promo Code Used,Previous Purchases,Preferred Payment Method,Frequency of Purchases
0,1,55,Male,Blouse,Clothing,53,Kentucky,L,Gray,Winter,3.1,Yes,Credit Card,Express,Yes,Yes,14,Venmo,Fortnightly
1,2,19,Male,Sweater,Clothing,64,Maine,L,Maroon,Winter,3.1,Yes,Bank Transfer,Express,Yes,Yes,2,Cash,Fortnightly
2,3,50,Male,Jeans,Clothing,73,Massachusetts,S,Maroon,Spring,3.1,Yes,Cash,Free Shipping,Yes,Yes,23,Credit Card,Weekly
3,4,21,Male,Sandals,Footwear,90,Rhode Island,M,Maroon,Spring,3.5,Yes,PayPal,Next Day Air,Yes,Yes,49,PayPal,Weekly
4,5,45,Male,Blouse,Clothing,49,Oregon,M,Turquoise,Spring,2.7,Yes,Cash,Free Shipping,Yes,Yes,31,PayPal,Annually


In [5]:
df.tail()

Unnamed: 0,Customer ID,Age,Gender,Item Purchased,Category,Purchase Amount (USD),Location,Size,Color,Season,Review Rating,Subscription Status,Payment Method,Shipping Type,Discount Applied,Promo Code Used,Previous Purchases,Preferred Payment Method,Frequency of Purchases
3895,3896,40,Female,Hoodie,Clothing,28,Virginia,L,Turquoise,Summer,4.2,No,Cash,2-Day Shipping,No,No,32,Venmo,Weekly
3896,3897,52,Female,Backpack,Accessories,49,Iowa,L,White,Spring,4.5,No,PayPal,Store Pickup,No,No,41,Bank Transfer,Bi-Weekly
3897,3898,46,Female,Belt,Accessories,33,New Jersey,L,Green,Spring,2.9,No,Credit Card,Standard,No,No,24,Venmo,Quarterly
3898,3899,44,Female,Shoes,Footwear,77,Minnesota,S,Brown,Summer,3.8,No,PayPal,Express,No,No,24,Venmo,Weekly
3899,3900,52,Female,Handbag,Accessories,81,California,M,Beige,Spring,3.1,No,Bank Transfer,Store Pickup,No,No,33,Venmo,Quarterly


In [6]:
df.sample(10)

Unnamed: 0,Customer ID,Age,Gender,Item Purchased,Category,Purchase Amount (USD),Location,Size,Color,Season,Review Rating,Subscription Status,Payment Method,Shipping Type,Discount Applied,Promo Code Used,Previous Purchases,Preferred Payment Method,Frequency of Purchases
2892,2893,22,Female,Dress,Clothing,39,Nevada,M,Teal,Spring,3.2,No,Bank Transfer,Standard,No,No,47,Credit Card,Monthly
70,71,22,Male,Belt,Accessories,29,Alabama,M,Magenta,Fall,4.2,Yes,Credit Card,Express,Yes,Yes,32,Debit Card,Every 3 Months
1941,1942,55,Male,Shoes,Footwear,99,Vermont,XL,Peach,Winter,3.3,No,PayPal,Standard,No,No,46,Credit Card,Quarterly
2615,2616,49,Male,T-shirt,Clothing,62,Arizona,M,Maroon,Fall,3.1,No,Venmo,2-Day Shipping,No,No,6,Debit Card,Quarterly
3674,3675,35,Female,Handbag,Accessories,65,Washington,M,Blue,Winter,2.7,No,Venmo,2-Day Shipping,No,No,41,Venmo,Fortnightly
1849,1850,30,Male,Skirt,Clothing,40,New Mexico,M,Maroon,Winter,4.5,No,Credit Card,Store Pickup,No,No,35,Bank Transfer,Monthly
2079,2080,32,Male,Jeans,Clothing,24,Montana,L,Violet,Summer,3.1,No,Debit Card,Next Day Air,No,No,16,PayPal,Quarterly
2286,2287,67,Male,Socks,Clothing,38,South Dakota,L,Blue,Spring,2.5,No,PayPal,Store Pickup,No,No,35,Venmo,Quarterly
2652,2653,23,Female,Shorts,Clothing,20,Maryland,L,Cyan,Summer,3.3,No,Debit Card,2-Day Shipping,No,No,46,Credit Card,Monthly
1432,1433,34,Male,Hat,Accessories,33,Florida,L,Black,Winter,4.4,No,Cash,Express,Yes,Yes,32,Debit Card,Quarterly


In [7]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3900 entries, 0 to 3899
Data columns (total 19 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   Customer ID               3900 non-null   int64  
 1   Age                       3900 non-null   int64  
 2   Gender                    3900 non-null   object 
 3   Item Purchased            3900 non-null   object 
 4   Category                  3900 non-null   object 
 5   Purchase Amount (USD)     3900 non-null   int64  
 6   Location                  3900 non-null   object 
 7   Size                      3900 non-null   object 
 8   Color                     3900 non-null   object 
 9   Season                    3900 non-null   object 
 10  Review Rating             3900 non-null   float64
 11  Subscription Status       3900 non-null   object 
 12  Payment Method            3900 non-null   object 
 13  Shipping Type             3900 non-null   object 
 14  Discount
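
The descriptive-statistics step (mean, median, standard deviation) can be sketched with `DataFrame.agg`. The example uses only the five rows printed by `df.head()` above, so the numbers describe those rows, not the full dataset.

```python
import pandas as pd

# The five rows shown by df.head(), numeric columns only
demo = pd.DataFrame({
    "Age": [55, 19, 50, 21, 45],
    "Purchase Amount (USD)": [53, 64, 73, 90, 49],
    "Review Rating": [3.1, 3.1, 3.1, 3.5, 2.7],
})

# One row per column, one column per statistic
stats = demo.agg(["mean", "median", "std"]).T
```

For the whole dataset, `df.describe()` gives a similar summary, and `df.describe(include="object")` covers the categorical columns.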

---

# **🔍 Exploratory Data Analysis (EDA)**

---

# **1️⃣ Customer Demographics & Purchasing Behavior**

### **1️⃣ Age vs. Purchase Amount**

In [8]:
fig1 = px.scatter(df, x="Age", y="Purchase Amount (USD)", color="Gender",
                  title="Age vs. Purchase Amount",
                  labels={"Age": "Age", "Purchase Amount (USD)": "Purchase Amount ($)"},
                  opacity=0.7)
fig1 = update_layout(fig1)
fig1.show()
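
A scatter plot of individual ages can be hard to read, so one option is to bin ages and compare group averages. The sketch below assumes the bin edges and labels shown. They are illustrative choices, not values from this notebook. The rows are copied from the outputs printed earlier.

```python
import pandas as pd

# A few (Age, Purchase Amount) pairs copied from the outputs shown earlier
demo = pd.DataFrame({
    "Age": [19, 22, 30, 35, 45, 52, 55, 67],
    "Purchase Amount (USD)": [64, 39, 40, 65, 49, 81, 53, 38],
})

# Assumed bin edges; pd.cut uses right-closed intervals, e.g. (17, 25]
bins = [17, 25, 35, 50, 65, 120]
labels = ["18-25", "26-35", "36-50", "51-65", "66+"]
demo["Age Group"] = pd.cut(demo["Age"], bins=bins, labels=labels)

# Average spend and number of purchases per age group
by_age = (
    demo.groupby("Age Group", observed=True)["Purchase Amount (USD)"]
        .agg(["mean", "count"])
)
```

The resulting table can be passed to `px.bar` after `reset_index()` to compare younger and older customers directly.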

### **2️⃣ Gender-wise Spending Patterns**

In [9]:
fig2 = px.bar(df.groupby("Gender")["Purchase Amount (USD)"].mean().reset_index(),
              x="Gender", y="Purchase Amount (USD)",
              title="Average Spending by Gender",
              color="Gender")
fig2 = update_layout(fig2)
fig2.show()

In [10]:
fig3 = px.histogram(df, x="Category", color="Gender", barmode="group",
                    title="Category Preferences by Gender",
                    labels={"Category": "Product Category"})
fig3 = update_layout(fig3)
fig3.show()

### **3️⃣ Location-based Shopping Trends**

In [11]:
fig4 = px.bar(df.groupby("Location")["Purchase Amount (USD)"].sum().reset_index(),
              x="Location", y="Purchase Amount (USD)",
              title="Total Spending by Location",
              color="Location")
fig4 = update_layout(fig4)
fig4.show()
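
With many locations on one axis, an unsorted bar chart is hard to scan. A possible refinement is to rank locations and keep only the top few, as sketched below. The amounts are made up for illustration, and the cutoff of two rows is arbitrary.

```python
import pandas as pd

# Illustrative rows; the amounts are hypothetical
demo = pd.DataFrame({
    "Location": ["Kentucky", "Maine", "Kentucky", "Oregon", "Maine", "Kentucky"],
    "Purchase Amount (USD)": [53, 64, 73, 49, 90, 20],
})

# Total, average, and order count per location, largest total first
top_locations = (
    demo.groupby("Location")["Purchase Amount (USD)"]
        .agg(total="sum", average="mean", orders="count")
        .sort_values("total", ascending=False)
        .head(2)
)
```

Reporting the average next to the total matters because a location can rank high simply by having more orders.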

### **4️⃣ Subscription Status Impact on Spending**

In [12]:
fig5 = px.box(df, x="Subscription Status", y="Purchase Amount (USD)",
              title="Subscription Status vs. Spending",
              color="Subscription Status")
fig5 = update_layout(fig5)
fig5.show()

In [13]:
fig6 = px.histogram(df, x="Subscription Status", color="Frequency of Purchases",
                    title="Purchase Frequency by Subscription Status",
                    barmode="group")
fig6 = update_layout(fig6)
fig6.show()
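
Raw counts can mislead when subscribers and non-subscribers differ in number. One way to compare them fairly is a row-normalized crosstab, sketched here on a small hypothetical sample.

```python
import pandas as pd

# Hypothetical sample: 3 subscribers and 5 non-subscribers
demo = pd.DataFrame({
    "Subscription Status": ["Yes", "Yes", "Yes", "No", "No", "No", "No", "No"],
    "Frequency of Purchases": [
        "Weekly", "Weekly", "Annually",
        "Weekly", "Monthly", "Monthly", "Quarterly", "Quarterly",
    ],
})

# normalize="index" turns each row into shares that sum to 1
share = pd.crosstab(
    demo["Subscription Status"],
    demo["Frequency of Purchases"],
    normalize="index",
)
```

Each row of `share` then reads as "of all customers with this subscription status, what fraction buys at each frequency".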

# **2️⃣ Product Preferences & Purchase Trends**

### **5️⃣ Top-Selling Categories & Items**

In [14]:
category_counts = df["Category"].value_counts().reset_index()
category_counts.columns = ["Category", "Count"]  


fig7 = px.bar(category_counts, x="Category", y="Count",
              title="Top-Selling Categories", color="Category")
fig7 = update_layout(fig7)
fig7.show()

In [15]:
fig8 = px.histogram(df, x="Item Purchased", color="Gender", barmode="group",
                    title="Best-Selling Items by Gender")
fig8 = update_layout(fig8)
fig8.show()

### **6️⃣ Color & Size Preferences**

In [16]:
fig9 = px.histogram(df, x="Color", color="Category", barmode="group",
                    title="Color Preferences by Category")
fig9 = update_layout(fig9)
fig9.show()

In [17]:
fig10 = px.histogram(df, x="Size", color="Location", barmode="group",
                     title="Size Preferences by Location")
fig10 = update_layout(fig10)
fig10.show()

### **7️⃣ Seasonal Trends in Purchases**

In [18]:
fig11 = px.box(df, x="Season", y="Purchase Amount (USD)", color="Season",
               title="Purchase Amount by Season")
fig11 = update_layout(fig11)
fig11.show()

In [19]:
fig12 = px.histogram(df, x="Season", color="Item Purchased", barmode="group",
                     title="Seasonal Spikes in Purchases")
fig12 = update_layout(fig12)
fig12.show()

# **3️⃣ Customer Loyalty & Purchase Frequency**

### **1️⃣ Do customers with more past purchases spend more per transaction?**

In [20]:
# trendline="ols" fits a least-squares line and requires the statsmodels package
fig1 = px.scatter(df, x="Previous Purchases", y="Purchase Amount (USD)",
                  title="Impact of Previous Purchases on Spending Behavior",
                  color="Previous Purchases",
                  trendline="ols")
fig1 = update_layout(fig1)
fig1.show()
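
A trendline is easier to judge alongside a number. The sketch below computes Pearson and rank-based (Spearman) correlation with pandas only. `pearson_and_spearman` is a hypothetical helper name, and the sample is the five rows from `df.head()`, which is far too small to support a conclusion.

```python
import pandas as pd

def pearson_and_spearman(x: pd.Series, y: pd.Series) -> tuple:
    """Return (Pearson r, Spearman rho); rho is Pearson applied to the ranks."""
    pearson = x.corr(y)
    spearman = x.rank().corr(y.rank())
    return float(pearson), float(spearman)

# The five rows shown by df.head()
demo = pd.DataFrame({
    "Previous Purchases": [14, 2, 23, 49, 31],
    "Purchase Amount (USD)": [53, 64, 73, 90, 49],
})

r, rho = pearson_and_spearman(
    demo["Previous Purchases"], demo["Purchase Amount (USD)"]
)
```

Values near 0 for both coefficients would suggest that purchase history says little about the size of a single transaction.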

### **2️⃣ Customer Segmentation by Purchase Frequency**

In [21]:
fig2 = px.histogram(df, x="Frequency of Purchases", color="Frequency of Purchases",
                    title="Customer Segmentation by Purchase Frequency")
fig2 = update_layout(fig2)
fig2.show()

In [22]:
fig3 = px.box(df, x="Frequency of Purchases", y="Purchase Amount (USD)", 
              title="Spending Patterns by Purchase Frequency", color="Frequency of Purchases")
fig3 = update_layout(fig3)
fig3.show()
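
The sample rows above show labels such as `Fortnightly` and `Bi-Weekly`, and `Quarterly` and `Every 3 Months`, which appear to describe the same cadence. The sketch below merges them and imposes a sensible order. The synonym mapping is an assumption ("Bi-Weekly" can also mean twice a week), and `normalize_frequency` is a hypothetical helper.

```python
import pandas as pd

# Assumed synonyms; verify against the dataset documentation before relying on them
FREQ_MAP = {"Bi-Weekly": "Fortnightly", "Every 3 Months": "Quarterly"}

# Most frequent to least frequent
FREQ_ORDER = ["Weekly", "Fortnightly", "Monthly", "Quarterly", "Annually"]

def normalize_frequency(series: pd.Series) -> pd.Series:
    """Merge synonymous labels and return an ordered categorical Series."""
    cleaned = series.replace(FREQ_MAP)
    return cleaned.astype(pd.CategoricalDtype(FREQ_ORDER, ordered=True))

# Labels taken from the rows printed earlier in the notebook
demo = pd.Series([
    "Fortnightly", "Bi-Weekly", "Every 3 Months",
    "Quarterly", "Weekly", "Annually", "Monthly",
])
normalized = normalize_frequency(demo)
```

In Plotly Express, the same order can be applied to an axis with `category_orders={"Frequency of Purchases": FREQ_ORDER}`.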

# **4️⃣ Discounts, Promotions & Payment Preferences**

### **1️⃣ Do customers who use promo codes spend more or less?**

In [23]:
fig1 = px.box(df, x="Promo Code Used", y="Purchase Amount (USD)", 
              title="Effect of Promo Codes on Purchase Amount", color="Promo Code Used")

fig1 = update_layout(fig1)
fig1.show()

### **2️⃣ Compare average purchase amounts between discount users & non-users**

In [24]:
fig2 = px.box(df, x="Discount Applied", y="Purchase Amount (USD)", 
              title="Effect of Discounts on Purchase Amount", color="Discount Applied")

fig2 = update_layout(fig2)
fig2.show()
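
Box plots show medians, so a numeric summary helps when the question is about averages. In the rows printed earlier, `Discount Applied` and `Promo Code Used` always carry the same value, so it is also worth checking whether the two columns are redundant. The sketch below does both on six rows copied from those outputs. Whether the pattern holds for all 3,900 rows is not verified here.

```python
import pandas as pd

# Six rows copied from the head/tail/sample outputs shown earlier
demo = pd.DataFrame({
    "Discount Applied": ["Yes", "Yes", "No", "No", "Yes", "No"],
    "Promo Code Used": ["Yes", "Yes", "No", "No", "Yes", "No"],
    "Purchase Amount (USD)": [53, 64, 28, 49, 33, 77],
})

# Share of rows where the two flags agree (1.0 means identical columns)
agreement = (demo["Discount Applied"] == demo["Promo Code Used"]).mean()

# Mean, median, and group size for discount users vs. non-users
summary = (
    demo.groupby("Discount Applied")["Purchase Amount (USD)"]
        .agg(["mean", "median", "count"])
)
```

If `agreement` is 1.0 on the full data, the promo-code and discount plots answer the same question.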

### **3️⃣ Which payment methods are most commonly used?**

In [25]:
payment_counts = df["Payment Method"].value_counts().reset_index()
payment_counts.columns = ["Payment Method", "Count"]

fig3 = px.bar(payment_counts, x="Payment Method", y="Count", color="Payment Method",
              title="Most Common Payment Methods")

fig3 = update_layout(fig3)
fig3.show()

### **4️⃣ Do credit card users spend more than digital wallet users?**

In [26]:
fig4 = px.box(df, x="Payment Method", y="Purchase Amount (USD)", 
              title="Spending Behavior by Payment Method", color="Payment Method")
fig4 = update_layout(fig4)
fig4.show()

### **5️⃣ Do express shipping users spend more?**

In [27]:
fig5 = px.box(df, x="Shipping Type", y="Purchase Amount (USD)", 
              title="Spending Behavior by Shipping Type", color="Shipping Type")
fig5 = update_layout(fig5)
fig5.show()

# **5️⃣ Ratings & Customer Satisfaction Analysis**

### **1️⃣ Do higher-rated items have more sales?**

In [28]:
fig1 = px.scatter(df, x="Review Rating", y="Purchase Amount (USD)", 
                  title="Review Ratings vs. Purchase Amount",
                  color="Review Rating", size="Purchase Amount (USD)")

fig1 = update_layout(fig1)
fig1.show()
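
The scatter plot relates ratings to the amount of each purchase, while the question is about sales volume per item. One way to address it more directly is to aggregate per item, as in this sketch. The ratings are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import pandas as pd

# Hypothetical ratings for three items
demo = pd.DataFrame({
    "Item Purchased": ["Blouse", "Blouse", "Blouse", "Jeans", "Jeans", "Belt"],
    "Review Rating": [4.0, 4.5, 5.0, 3.0, 4.0, 2.0],
})

# Number of purchases and average rating per item, best sellers first
per_item = (
    demo.groupby("Item Purchased")
        .agg(sales=("Review Rating", "count"),
             avg_rating=("Review Rating", "mean"))
        .sort_values("sales", ascending=False)
)
```

Plotting `avg_rating` against `sales` (one point per item) would then show whether better-rated items sell more often.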

### **2️⃣ Are subscribers more likely to leave positive reviews?**

In [29]:
fig2 = px.box(df, x="Subscription Status", y="Review Rating", 
              title="Subscription Status vs. Review Ratings", color="Subscription Status")
fig2 = update_layout(fig2)
fig2.show()

### **3️⃣ Are frequent buyers more satisfied than new customers?**

In [30]:
fig3 = px.box(df, x="Frequency of Purchases", y="Review Rating", 
              title="Customer Purchase Frequency vs. Review Ratings", color="Frequency of Purchases")
fig3 = update_layout(fig3)
fig3.show()

---

## 🙌 **Thank You for Exploring This Analysis!** 🎉

Thank you for taking the time to explore this **Customer Purchase Behavior Analysis** notebook! 🚀 I hope the insights and visualizations provided valuable perspectives on **consumer trends, spending habits, and product preferences**.

If you found this analysis helpful, feel free to:  
✔️ **Share your feedback** in the comments below.  
✔️ **Connect with me** for more exciting data science projects!

### 🔗 **Stay Connected & Keep Learning!** 📊
💡 **Happy Analyzing!** 🛒📈
