### Objective

Analyze customer purchase behavior and segment customers by:

- Total spending
- Order frequency
- Average order value

This is a basic data analytics exercise and a foundation for ML-style feature thinking.

### Step 1: Generate Dataset Using NumPy

In [1]:
import numpy as np
import pandas as pd

In [6]:
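np.random.seed(42)  # fix the seed so the synthetic data is reproducible

n = 200  # number of customers

# randint's upper bound is exclusive: Orders is 1..19, Total_Spent is 1000..49999
data = {
    "Customer_ID": np.arange(1, n + 1),
    "Orders": np.random.randint(1, 20, n),
    "Total_Spent": np.random.randint(1000, 50000, n),
}

df = pd.DataFrame(data)
df.head()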


| | Customer_ID | Orders | Total_Spent |
|---|---|---|---|
| 0 | 1 | 7 | 2802 |
| 1 | 2 | 15 | 9155 |
| 2 | 3 | 11 | 9120 |
| 3 | 4 | 8 | 40384 |
| 4 | 5 | 7 | 48025 |


### Step 2: Create New Features

1-Average Order Value

In [9]:
# Orders is always >= 1 in this dataset, so the divisor is never zero
df["Avg_order_value"] = df["Total_Spent"] / df["Orders"]
df.head(3)

| | Customer_ID | Orders | Total_Spent | Avg_order_value |
|---|---|---|---|---|
| 0 | 1 | 7 | 2802 | 400.285714 |
| 1 | 2 | 15 | 9155 | 610.333333 |
| 2 | 3 | 11 | 9120 | 829.090909 |
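
The synthetic data never contains a customer with zero orders, but real data might. A minimal sketch of one way to guard the division, using a hypothetical `sample` frame that is not part of the notebook:

```python
import numpy as np
import pandas as pd

# Hypothetical rows, including one customer with zero orders
sample = pd.DataFrame({"Orders": [7, 0, 11], "Total_Spent": [2802, 0, 9120]})

# Mapping 0 to NaN before dividing makes the undefined average an explicit
# missing value instead of relying on how 0/0 or x/0 happens to be handled
safe_orders = sample["Orders"].replace(0, np.nan)
sample["Avg_order_value"] = sample["Total_Spent"] / safe_orders

print(sample)
```

Whether a missing value, zero, or dropping the row is the right treatment depends on the analysis; NaN is used here only because pandas aggregations skip it by default.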


2-Customer Category Based on Spending

In [16]:
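# np.select uses the first condition that matches, so order matters:
# the >= 40000 check must come before the >= 20000 check
conditions = [
    df["Total_Spent"] >= 40000,
    df["Total_Spent"] >= 20000,
    df["Total_Spent"] < 20000,
]

categories = ["Premium", "Regular", "Low Value"]

# default applies only when no condition matches (for example, a NaN amount)
df["Customer_Type"] = np.select(conditions, categories, default="Unknown")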
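
The same thresholds can also be expressed as bins. The following is a sketch of an alternative using `pd.cut`, not the notebook's method; the `spend` values are hypothetical and chosen to sit on each boundary:

```python
import numpy as np
import pandas as pd

# Hypothetical amounts that land on and around each threshold
spend = pd.Series([1000, 19999, 20000, 39999, 40000, 49999], name="Total_Spent")

# right=False closes each bin on the left, so [20000, 40000) is "Regular"
# and 40000 itself falls in "Premium", matching the >= comparisons
customer_type = pd.cut(
    spend,
    bins=[-np.inf, 20000, 40000, np.inf],
    labels=["Low Value", "Regular", "Premium"],
    right=False,
)

print(customer_type.tolist())
# ['Low Value', 'Low Value', 'Regular', 'Regular', 'Premium', 'Premium']
```

One difference worth noting: `pd.cut` returns an ordered categorical, so sorting or grouping follows the label order given here rather than alphabetical order.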

### Step 3: Analysis

1-Top 10 High-Spending Customers

In [18]:
df.sort_values("Total_Spent", ascending=False).head(10)

| | Customer_ID | Orders | Total_Spent | Avg_order_value | Customer_Type |
|---|---|---|---|---|---|
| 173 | 174 | 4 | 49747 | 12436.75 | Premium |
| 77 | 78 | 8 | 49404 | 6175.5 | Premium |
| 37 | 38 | 15 | 49354 | 3290.266667 | Premium |
| 169 | 170 | 19 | 49096 | 2584.0 | Premium |
| 20 | 21 | 16 | 48357 | 3022.3125 | Premium |
| 70 | 71 | 15 | 48333 | 3222.2 | Premium |
| 5 | 6 | 19 | 48254 | 2539.684211 | Premium |
| 96 | 97 | 5 | 48202 | 9640.4 | Premium |
| 196 | 197 | 7 | 48030 | 6861.428571 | Premium |
| 4 | 5 | 7 | 48025 | 6860.714286 | Premium |
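
For a top-N query like this one, `DataFrame.nlargest` is a shorter equivalent. A small sketch on a hypothetical `demo` frame with distinct amounts (tie handling can differ between the two approaches):

```python
import pandas as pd

# Hypothetical data with distinct Total_Spent values
demo = pd.DataFrame({
    "Customer_ID": [1, 2, 3, 4, 5],
    "Total_Spent": [2802, 9155, 9120, 40384, 48025],
})

# Sort-then-head, as in the cell above
top_sorted = demo.sort_values("Total_Spent", ascending=False).head(3)

# nlargest picks the top rows directly and keeps the original index labels
top_nlargest = demo.nlargest(3, "Total_Spent")

print(top_nlargest)
```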


2-Customer Type Distribution

In [19]:
df["Customer_Type"].value_counts()

Customer_Type
Low Value    80
Regular      74
Premium      46
Name: count, dtype: int64
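
Because `Total_Spent` is drawn uniformly from 1000 to 49999, the share of each category should roughly follow the width of its spending range. The sketch below compares the observed counts (copied from the output above) with those expected shares; the variable names are illustrative:

```python
import pandas as pd

# Counts copied from the value_counts() output
counts = pd.Series({"Low Value": 80, "Regular": 74, "Premium": 46})
observed_pct = (counts / counts.sum() * 100).round(1)

# Number of integer amounts in each category's range
range_width = pd.Series({
    "Low Value": 20000 - 1000,   # 1000..19999
    "Regular": 40000 - 20000,    # 20000..39999
    "Premium": 50000 - 40000,    # 40000..49999
})
expected_pct = (range_width / range_width.sum() * 100).round(1)

print(pd.DataFrame({"observed_%": observed_pct, "expected_%": expected_pct}))
```

On the notebook's own `df`, `df["Customer_Type"].value_counts(normalize=True)` gives the observed shares directly as fractions.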

3-Average Spending Per Category

In [20]:
df.groupby("Customer_Type")["Total_Spent"].mean()

Customer_Type
Low Value     9646.312500
Premium      45116.108696
Regular      29658.810811
Name: Total_Spent, dtype: float64
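
The mean alone hides how many customers are in each group and how spread out they are. A hedged sketch that rebuilds the Step 1 data (assuming the same seed and draw order) and reports several statistics in one `agg` call:

```python
import numpy as np
import pandas as pd

# Rebuild the synthetic data from Step 1 (same seed, same draw order)
np.random.seed(42)
n = 200
df = pd.DataFrame({
    "Customer_ID": np.arange(1, n + 1),
    "Orders": np.random.randint(1, 20, n),
    "Total_Spent": np.random.randint(1000, 50000, n),
})

# Same thresholds as Step 2; the final category is supplied via default
df["Customer_Type"] = np.select(
    [df["Total_Spent"] >= 40000, df["Total_Spent"] >= 20000],
    ["Premium", "Regular"],
    default="Low Value",
)

# One row per category, one column per statistic
summary = df.groupby("Customer_Type")["Total_Spent"].agg(
    ["count", "mean", "median", "min", "max"]
)
print(summary)
```

The `min` and `max` columns double as a sanity check that every row landed on the correct side of the 20000 and 40000 thresholds.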

4-Correlation Between Orders & Spending

In [21]:
df[["Orders","Total_Spent"]].corr()

| | Orders | Total_Spent |
|---|---|---|
| Orders | 1.0 | 0.021743 |
| Total_Spent | 0.021743 | 1.0 |
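
A coefficient this close to zero is expected here: `Orders` and `Total_Spent` are drawn independently in Step 1, so there is no real relationship to find. By contrast, `Avg_order_value` is computed from both columns, so it should correlate with each of them by construction. A sketch that rebuilds the data (assuming the same seed and draw order) and shows all three pairwise correlations:

```python
import numpy as np
import pandas as pd

# Rebuild the synthetic data from Step 1 (same seed, same draw order)
np.random.seed(42)
n = 200
df = pd.DataFrame({
    "Orders": np.random.randint(1, 20, n),
    "Total_Spent": np.random.randint(1000, 50000, n),
})
df["Avg_order_value"] = df["Total_Spent"] / df["Orders"]

# Pearson correlation between every pair of columns
corr = df[["Orders", "Total_Spent", "Avg_order_value"]].corr()
print(corr.round(3))
```

Correlations involving a derived feature like this say more about how the feature was built than about customer behavior, which is worth keeping in mind when the same analysis is run on real data.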
