# Online Sales Coupon Analysis

This project analyzes real-world sales data to understand **how customers use discount coupons** across different **demographics, campaigns, and product types**.

We'll use **Python** for data cleaning, merging, and visualization — focusing on insights instead of machine learning. The final goal is to prepare a **dashboard-ready dataset** for tools like Power BI.

---

## 🎯 Objectives

- Analyze coupon usage by customer demographics  
- Identify trends across campaigns, products, and brands  
- Summarize discount behavior for business insights  
- Prepare clean data for visualization tools

---

**Dataset Source**: Kaggle — *Predicting Coupon Redemption*

📁 Files used:  
- `customer_transaction_data.csv`  
- `customer_demographics.csv`  
- `coupon_item_mapping.csv`  
- `campaign_data.csv`


In [3]:
import pandas as pd
import matplotlib.pyplot as plt  # plotting interface; the old alias `plotly` is the name of a different library
import seaborn as sns

In [12]:
trans = pd.read_csv("customer_transaction_data.csv")
camp = pd.read_csv("campaign_data.csv")
customer = pd.read_csv("customer_demographics.csv")
coupon = pd.read_csv("coupon_item_mapping.csv")

In [24]:
# Exploratory data analysis (EDA) to get a better understanding of the datasets
customer.shape  # customer data has 760 rows and 7 columns
camp.shape  # 28 rows and 4 columns
trans.shape  # 1324566 rows and 7 columns
coupon.shape  # 92663 rows and 2 columns
# Check each table for null values
coupon.isnull().values.any()  # no null values
customer.isnull().values.any()  # contains null values
camp.isnull().values.any()  # no null values
trans.isnull().values.any()  # no null values

(92663, 2)
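
A notebook cell only displays its last expression, so the shape and null checks above are easier to compare when collected in one table. Here is a minimal sketch on small stand-in frames rather than the real CSVs. `summarize_frames` is a hypothetical helper name, not part of pandas, and the `coupon_id` column name is an assumption:

```python
import pandas as pd


def summarize_frames(frames):
    """Return one row per DataFrame: row count, column count, and null cells."""
    rows = []
    for name, frame in frames.items():
        rows.append({
            "table": name,
            "rows": frame.shape[0],
            "columns": frame.shape[1],
            "null_cells": int(frame.isnull().sum().sum()),
        })
    return pd.DataFrame(rows).set_index("table")


# Stand-in data for illustration only; pass the loaded CSV frames instead.
demo_customer = pd.DataFrame(
    {"customer_id": [1, 2], "marital_status": ["Married", None]}
)
demo_coupon = pd.DataFrame({"coupon_id": [10, 11], "item_id": [100, 101]})

summary = summarize_frames({"customer": demo_customer, "coupon": demo_coupon})
print(summary)
```

With the real data, the call would be along the lines of `summarize_frames({"customer": customer, "camp": camp, "trans": trans, "coupon": coupon})`.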

In [62]:
# Cleaning the datasets
customer.isnull().sum()  # marital_status has 329 null values and no_of_children has 538
customer.dropna(axis=0, inplace=True)  # drop every row that contains a null value

In [61]:
print("Are there any null values here?", customer.isnull().values.any())

Are there any null values here? False
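
Dropping every row with a missing value also discards customers whose other demographic fields are intact. One alternative, sketched here on stand-in data, keeps those rows and labels the gaps explicitly. The `"Unknown"` label and the sample values are assumptions made for illustration, not values taken from the dataset:

```python
import pandas as pd

# Stand-in data for illustration only.
demo_customer = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "marital_status": ["Married", None, "Single"],
    "no_of_children": ["1", None, None],
})

# Fill only the two columns known to contain nulls, on a copy of the frame.
filled = demo_customer.copy()
gap_columns = ["marital_status", "no_of_children"]
filled[gap_columns] = filled[gap_columns].fillna("Unknown")

print(len(demo_customer.dropna()), "rows kept by dropna")
print(len(filled), "rows kept by fillna")
```

Whether to drop or fill depends on the question being asked: filling keeps more customers in the later merge, but the "Unknown" group then has to be handled in every chart.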


In [66]:
# To get insights, we need to merge columns from these datasets.
# The transactions carry the coupon discount and the customer_id,
# so we can relate coupon usage to customers through customer_id.
# trans columns: 'date', 'customer_id', 'item_id', 'quantity', 'selling_price',
#                'other_discount', 'coupon_discount'
customer.columns

Index(['customer_id', 'age_range', 'marital_status', 'rented', 'family_size',
       'no_of_children', 'income_bracket'],
      dtype='object')

In [89]:
# Merge transactions with demographics to get more insight from the tables
df = trans.merge(customer, on='customer_id', how='left')
df.columns  # the new dataset keeps every transaction row (left side) and adds the matching customer columns (right side)

Index(['date', 'customer_id', 'item_id', 'quantity', 'selling_price',
       'other_discount', 'coupon_discount', 'age_range', 'marital_status',
       'rented', 'family_size', 'no_of_children', 'income_bracket'],
      dtype='object')
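
Before relying on a left merge like this one, it helps to see how many transactions actually found a matching customer. This sketch on stand-in data uses two standard `merge` options, `validate` and `indicator`. The sample values are made up for illustration:

```python
import pandas as pd

# Stand-in data for illustration only.
demo_trans = pd.DataFrame({
    "customer_id": [1, 1, 2, 3],
    "selling_price": [35.0, 26.5, 80.0, 12.0],
})
demo_customer = pd.DataFrame({
    "customer_id": [1, 3],
    "age_range": ["26-35", "46-55"],
})

checked = demo_trans.merge(
    demo_customer,
    on="customer_id",
    how="left",
    validate="many_to_one",  # raises MergeError if a customer_id repeats on the right
    indicator=True,          # adds a _merge column: 'both' or 'left_only' for a left join
)

# Count matched and unmatched transaction rows.
match_counts = checked["_merge"].value_counts()
print(match_counts)
```

Rows marked `left_only` are the ones that end up with null demographic columns after the merge.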

In [93]:
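df.isnull().values.any()  # check for null values in the merged table
df.dropna(axis=0, inplace=True)  # drop transactions that have no complete demographic record
df.isnull().values.any()  # confirm that no null values remain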

np.False_
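
Since the customer table was already cleaned and the transactions had no nulls, a left merge followed by `dropna` should keep the same transactions as an inner merge. Here is a small sketch to check that on stand-in data. The `txn` column is a hypothetical row id added only so the two results can be compared:

```python
import pandas as pd

# Stand-in data for illustration only; txn is a made-up row identifier.
demo_trans = pd.DataFrame({"txn": [0, 1, 2, 3], "customer_id": [1, 2, 3, 1]})
demo_customer = pd.DataFrame({"customer_id": [1, 3], "rented": [0, 1]})

# Pattern used in the notebook: left merge, then drop unmatched rows.
left_then_drop = demo_trans.merge(
    demo_customer, on="customer_id", how="left"
).dropna(axis=0)

# Single-step alternative: keep only rows with a match on both sides.
inner = demo_trans.merge(demo_customer, on="customer_id", how="inner")

# Compare by txn id and ignore row order, which can differ between pandas versions.
same_rows = sorted(left_then_drop["txn"]) == sorted(inner["txn"])
print(same_rows)
print(left_then_drop["rented"].dtype, inner["rented"].dtype)
```

One difference to watch for: in the left-merge version, integer columns from the right table are converted to float while the unmatched rows hold NaN, and they stay float after `dropna`.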

In [None]:
# Next: merge the remaining tables into the new df
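
One way the next merge could look, sketched on stand-in data. The column names `coupon_id` and `item_id` for the coupon mapping are assumptions (the notebook only shows that the table has two columns). Counting coupons per item before merging is a choice made here to keep one row per transaction. It is not necessarily the approach this project takes:

```python
import pandas as pd

# Stand-in data for illustration only; column names are assumed.
demo_df = pd.DataFrame({
    "customer_id": [1, 1, 2],
    "item_id": [100, 101, 100],
    "quantity": [1, 2, 1],
})
demo_coupon = pd.DataFrame({
    "coupon_id": [10, 11, 12],
    "item_id": [100, 100, 102],
})

# Collapse the mapping to one row per item so the merge cannot multiply rows.
coupons_per_item = (
    demo_coupon.groupby("item_id")["coupon_id"]
    .nunique()
    .rename("n_coupons")
    .reset_index()
)

enriched = demo_df.merge(
    coupons_per_item, on="item_id", how="left", validate="many_to_one"
)
# Items with no coupon get NaN from the left merge; treat that as zero coupons.
enriched["n_coupons"] = enriched["n_coupons"].fillna(0).astype(int)
print(enriched)
```

Merging the raw mapping directly would repeat a transaction once for every coupon that covers its item, which inflates quantities and discounts in any later sum.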