## Business Questions

This analysis aims to answer the following business questions:

1. How does purchase behavior vary across different age groups?
2. Which age groups show higher likelihood of purchasing?
3. What customer segments should the business focus on for targeted marketing?


In [2]:
import pandas as pd
import numpy as np

pd.set_option("display.max_columns", None)

In [3]:
df = pd.read_csv("../data/processed/customer_behavior_cleaned.csv")
df.head()


,customer_id,gender,age,estimated_salary,purchased,purchase_status,age_group
0,15624510,Male,19,19000,0,No,18-25
1,15810944,Male,35,20000,0,No,26-35
2,15668575,Female,26,43000,0,No,26-35
3,15603246,Female,27,57000,0,No,26-35
4,15804002,Male,19,76000,0,No,18-25


## Business Questions

1. Which age groups are most common among customers?
2. Which age groups purchase more frequently?
3. Does gender influence purchase behavior?
4. How does estimated salary relate to purchasing?
5. Which customer segments should be targeted?


In [4]:
df["age_group"].value_counts().sort_index()


age_group
18-25     44
26-35    129
36-45    119
46-60    103
Name: count, dtype: int64

In [5]:
df.groupby("age_group")["purchased"].sum()

age_group
18-25     0
26-35    17
36-45    38
46-60    88
Name: purchased, dtype: int64

In [6]:
df.groupby("age_group")["purchased"].mean().round(2)

age_group
18-25    0.00
26-35    0.13
36-45    0.32
46-60    0.85
Name: purchased, dtype: float64
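
The three per-group results above (customer count, purchase count, purchase rate) can also be collected into one table, which makes it easier to compare segment size against conversion. The following is a minimal sketch using named aggregation; the helper name `summarize_by_group` and the toy rows are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd


def summarize_by_group(df: pd.DataFrame, group_col: str = "age_group") -> pd.DataFrame:
    """Return customer count, purchase count, and purchase rate per group."""
    summary = df.groupby(group_col, observed=True)["purchased"].agg(
        customers="count",      # rows in the group
        purchases="sum",        # purchased is 0/1, so the sum counts buyers
        purchase_rate="mean",   # the mean of a 0/1 column is the share of buyers
    )
    summary["purchase_rate"] = summary["purchase_rate"].round(2)
    return summary


# Toy rows for illustration only; these are NOT taken from the real dataset.
toy = pd.DataFrame({
    "age_group": ["18-25", "18-25", "26-35", "26-35", "46-60", "46-60"],
    "purchased": [0, 0, 0, 1, 1, 1],
})
summary = summarize_by_group(toy)
print(summary)
```

On the real data this would be called as `summarize_by_group(df)`.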

In [7]:
df.groupby("age_group")["estimated_salary"].mean().round(2)

age_group
18-25    54136.36
26-35    66790.70
36-45    75630.25
46-60    73466.02
Name: estimated_salary, dtype: float64
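
Mean salary per age group does not by itself show how salary relates to purchasing. One possible way to look at that directly is to compare the salary distribution of buyers and non-buyers. This is a sketch under assumptions: the helper name `salary_by_purchase` and the toy rows are hypothetical.

```python
import pandas as pd


def salary_by_purchase(df: pd.DataFrame) -> pd.DataFrame:
    """Summarize estimated_salary separately for purchased == 0 and purchased == 1."""
    return (
        df.groupby("purchased")["estimated_salary"]
        .agg(["count", "mean", "median"])
        .round(2)
    )


# Toy rows for illustration only; these are NOT taken from the real dataset.
toy = pd.DataFrame({
    "estimated_salary": [19000, 20000, 43000, 57000, 90000, 120000],
    "purchased": [0, 0, 0, 0, 1, 1],
})
salary_summary = salary_by_purchase(toy)
print(salary_summary)
```

A clear gap between the two rows on the real data would support a salary effect; similar values would suggest that age is the stronger driver.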

In [8]:
pd.crosstab(
    df["age_group"],
    df["gender"],
    values=df["purchased"],
    aggfunc="mean"
).round(2)


age_group,Female,Male
18-25,0.0,0.0
26-35,0.14,0.12
36-45,0.3,0.33
46-60,0.84,0.88
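
The crosstab compares genders within each age group but does not give an overall rate per gender. A minimal sketch of that check follows; the helper name `purchase_rate_by_gender` and the toy rows are illustrative assumptions.

```python
import pandas as pd


def purchase_rate_by_gender(df: pd.DataFrame) -> pd.DataFrame:
    """Return customer count and overall purchase rate for each gender."""
    return (
        df.groupby("gender")["purchased"]
        .agg(customers="count", purchase_rate="mean")
        .round(2)
    )


# Toy rows for illustration only; these are NOT taken from the real dataset.
toy = pd.DataFrame({
    "gender": ["Male", "Male", "Female", "Female", "Female"],
    "purchased": [0, 1, 0, 1, 1],
})
gender_summary = purchase_rate_by_gender(toy)
print(gender_summary)
```

Because the age mix can differ between genders, an overall gap should be read alongside the per-age-group crosstab rather than instead of it.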


In [9]:
df.to_csv("../data/processed/customer_behavior_final.csv", index=False)


## Key Business Insights

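- Customers aged 46–60 show the highest purchase rate (0.85, or 88 of 103); the rate rises steadily with age, from 0.00 for 18–25 to 0.13 for 26–35 and 0.32 for 36–45
- Within each age group, male and female purchase rates differ by at most 0.04, so gender appears to have little influence once age is accounted for
- Average estimated salary is higher in the 36–45 and 46–60 groups (about 75,600 and 73,500) than in the 18–25 group (about 54,100), but salary was not compared against purchasing directly
- Age segmentation enables targeted marketing; gender adds little on top of it
- Dataset is ready for MongoDB and Power BI dashboards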
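
For the MongoDB hand-off mentioned above, the rows usually need to be plain Python dictionaries without NumPy scalar types. The sketch below shows one way to produce them; the helper name `to_mongo_documents` is hypothetical, and the sample rows are the first two rows of the `df.head()` output. With `pymongo`, such a list could then be passed to a collection's `insert_many`; connection details are not given in this notebook, so that step is left out.

```python
import json

import pandas as pd


def to_mongo_documents(df: pd.DataFrame) -> list[dict]:
    """Convert rows to plain-Python dicts by round-tripping through JSON."""
    return json.loads(df.to_json(orient="records"))


# First two rows of the df.head() output shown earlier.
sample = pd.DataFrame({
    "customer_id": [15624510, 15810944],
    "gender": ["Male", "Male"],
    "age": [19, 35],
    "estimated_salary": [19000, 20000],
    "purchased": [0, 0],
    "purchase_status": ["No", "No"],
    "age_group": ["18-25", "26-35"],
})
documents = to_mongo_documents(sample)
print(documents[0])
```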


## Key Insights

1. Customers aged **26–35** (129) and **36–45** (119) form the largest customer segments in the dataset.
2. Segment size does not track purchasing: these two groups have purchase rates of only **0.13** and **0.32**.
3. The **18–25** age group has the lowest customer count (44) and recorded no purchases, suggesting either lower interest or weaker targeting.
4. Older customers (**46–60**) are the strongest segment: 103 customers, a purchase rate of **0.85**, and 88 of the 143 recorded purchases.


## Business Recommendations

1. Marketing campaigns should primarily target customers aged **46–60**, the most active purchasing segment (purchase rate 0.85, 88 of 143 purchases); personalized promotions could help retain them.
2. The **26–35** and **36–45** groups are the largest segments but convert at only 0.13 and 0.32, so campaigns there should aim to raise conversion rather than reach.
3. The **18–25** segment recorded no purchases and may require different strategies such as student discounts or social-media-focused campaigns.
4. Further analysis combining **salary and age group** could help refine pricing strategies.
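
The combined salary and age-group analysis suggested in the last recommendation could be sketched as a purchase-rate grid. The band edges in `SALARY_BINS`, the labels, the helper name `purchase_rate_grid`, and the toy rows are all hypothetical choices for illustration; real band edges should come from the business.

```python
import pandas as pd

# Hypothetical salary band edges and labels; adjust to the business context.
SALARY_BINS = [0, 50_000, 100_000, float("inf")]
SALARY_LABELS = ["<=50k", "50k-100k", ">100k"]


def purchase_rate_grid(df: pd.DataFrame) -> pd.DataFrame:
    """Return the purchase rate for each age group and salary band."""
    banded = df.assign(
        salary_band=pd.cut(df["estimated_salary"], bins=SALARY_BINS, labels=SALARY_LABELS)
    )
    return banded.pivot_table(
        index="age_group",
        columns="salary_band",
        values="purchased",
        aggfunc="mean",   # the mean of a 0/1 column is the purchase rate
        observed=True,    # keep only salary bands that actually occur
    ).round(2)


# Toy rows for illustration only; these are NOT taken from the real dataset.
toy = pd.DataFrame({
    "age_group": ["26-35", "26-35", "26-35", "46-60", "46-60", "46-60"],
    "estimated_salary": [20000, 43000, 120000, 30000, 80000, 130000],
    "purchased": [0, 0, 1, 1, 1, 1],
})
grid = purchase_rate_grid(toy)
print(grid)
```

Cells with no customers come out as `NaN`, and cells backed by only a few customers should be interpreted with caution.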
