
# Independent Investigation: Coffee House Coupon Analysis

## Introduction

The goal of this independent investigation is to explore one coupon type in depth and identify the characteristics of drivers who are more likely to accept that coupon. After comparing acceptance rates across all coupon types, we selected the **Coffee House** coupon for deeper analysis.

### Why Coffee House?

- It has the **largest number of observations** (3,996 rows) of any coupon type, so its estimated acceptance rates are the most stable.
- The acceptance rate is **moderate (≈ 0.50)**, so accepted and rejected offers are nearly balanced and there is enough variation to detect meaningful behavioral patterns.
- Coffee-related behavior (frequency of visits, time of day, passenger type) offers rich variables for multi‑condition analysis.
- The other coupon types had acceptance rates that were either noticeably higher (Carry out & Take away, ≈ 0.74) or lower (Bar, ≈ 0.41), making them less suitable for discovering nuanced patterns.

This makes Coffee House the best candidate for a detailed behavioral investigation.
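
To put a rough number on the reliability claim, the sketch below computes the half-width of a 95% normal-approximation confidence interval for the Coffee House acceptance rate, using the row count and rate quoted above. The helper name `acceptance_margin` is hypothetical, and the calculation assumes the rows can be treated as independent observations, which this notebook does not verify.

```python
import math


def acceptance_margin(rate: float, n: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation confidence interval for a proportion.

    Assumes independent observations; z=1.96 corresponds to roughly 95% coverage.
    """
    return z * math.sqrt(rate * (1.0 - rate) / n)


# Figures quoted above for the Coffee House coupon.
coffee_rate, coffee_n = 0.50, 3996

margin = acceptance_margin(coffee_rate, coffee_n)
print(f"Coffee House acceptance: {coffee_rate:.2f} +/- {margin:.3f}")
```

Under these assumptions the margin comes out at roughly 1.6 percentage points, so differences between driver groups of only a few points should be read with the group sizes in mind.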



In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Load the dataset
data = pd.read_csv('sample_data/coupons.csv')
data.head()


Unnamed: 0,destination,passanger,weather,temperature,time,coupon,expiration,gender,age,maritalStatus,...,CoffeeHouse,CarryAway,RestaurantLessThan20,Restaurant20To50,toCoupon_GEQ5min,toCoupon_GEQ15min,toCoupon_GEQ25min,direction_same,direction_opp,Y
0,No Urgent Place,Alone,Sunny,55,2PM,Restaurant(<20),1d,Female,21,Unmarried partner,...,never,,4~8,1~3,1,0,0,0,1,1
1,No Urgent Place,Friend(s),Sunny,80,10AM,Coffee House,2h,Female,21,Unmarried partner,...,never,,4~8,1~3,1,0,0,0,1,0
2,No Urgent Place,Friend(s),Sunny,80,10AM,Carry out & Take away,2h,Female,21,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,1
3,No Urgent Place,Friend(s),Sunny,80,2PM,Coffee House,2h,Female,21,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,0
4,No Urgent Place,Friend(s),Sunny,80,2PM,Coffee House,1d,Female,21,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,0


## Why We Selected the Coffee House Coupon

To determine which coupon type would be most suitable for deeper analysis, we compared acceptance rates and sample sizes across all coupon categories. The ideal coupon type should:

- Have a **large number of observations** (for reliable statistical analysis)
- Have a **moderate acceptance rate** (not too high, not too low)
- Show **behavioral variation** across variables such as passenger type, time of day, income, and visit frequency

The table below shows the acceptance rate and count for each coupon type. Based on these results:

- **Carry out & Take away** (≈ 0.74) and **Restaurant(<20)** (≈ 0.71) have high acceptance rates, leaving less behavioral variation to explore.
- **Bar** (≈ 0.41; 2,017 rows) and **Restaurant(20-50)** (≈ 0.44; 1,492 rows) have lower acceptance rates and fewer observations.
- **Coffee House** has the **largest sample size (3,996 rows)** and a **balanced acceptance rate (~0.50)**, making it ideal for discovering meaningful patterns.

Therefore, we selected **Coffee House** for the independent investigation.


In [2]:
# ---------------------------------------------------------
# Point 4: Data-driven selection of the Coffee House coupon
# ---------------------------------------------------------

# Calculate acceptance rate for each coupon type
coupon_acceptance = data.groupby('coupon')['Y'].mean()

# Count number of observations for each coupon type
coupon_counts = data['coupon'].value_counts()

# Combine into a single summary table
coupon_summary = pd.DataFrame({
    'acceptance_rate': coupon_acceptance,
    'count': coupon_counts
}).sort_values(by='acceptance_rate', ascending=False)

# Display the summary
coupon_summary


coupon,acceptance_rate,count
Carry out & Take away,0.735478,2393
Restaurant(<20),0.707107,2786
Coffee House,0.499249,3996
Restaurant(20-50),0.441019,1492
Bar,0.410015,2017
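
One way to make the selection criteria explicit is to rank each coupon type on balance (distance of the acceptance rate from 0.5) and on sample size, then add the two ranks. This is only an illustrative sketch: the equal weighting of the two criteria and the column names `distance_from_half` and `combined_rank` are assumptions, and the numbers are copied from the summary output above.

```python
import pandas as pd

# Values copied from the coupon_summary output above.
summary = pd.DataFrame(
    {
        "acceptance_rate": [0.735478, 0.707107, 0.499249, 0.441019, 0.410015],
        "count": [2393, 2786, 3996, 1492, 2017],
    },
    index=pd.Index(
        [
            "Carry out & Take away",
            "Restaurant(<20)",
            "Coffee House",
            "Restaurant(20-50)",
            "Bar",
        ],
        name="coupon",
    ),
)

# Hypothetical balance measure: how far the rate is from an even 50/50 split.
summary["distance_from_half"] = (summary["acceptance_rate"] - 0.5).abs()

# Rank each criterion separately (1 = best), then add the ranks with equal weight.
balance_rank = summary["distance_from_half"].rank(method="min")
size_rank = summary["count"].rank(ascending=False, method="min")
summary["combined_rank"] = balance_rank + size_rank

ranked = summary.sort_values("combined_rank")
print(ranked[["acceptance_rate", "count", "combined_rank"]])
```

With these values Coffee House ranks first on both criteria, which matches the choice made in this notebook.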


In [4]:
# ---------------------------------------------------------
# Create Coffee House subset
# ---------------------------------------------------------

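# .copy() returns an independent frame, so later column edits do not trigger SettingWithCopyWarning
coffee_df = data[data['coupon'] == 'Coffee House'].copy()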

# Display first few rows to confirm
coffee_df.head()


Unnamed: 0,destination,passanger,weather,temperature,time,coupon,expiration,gender,age,maritalStatus,...,CoffeeHouse,CarryAway,RestaurantLessThan20,Restaurant20To50,toCoupon_GEQ5min,toCoupon_GEQ15min,toCoupon_GEQ25min,direction_same,direction_opp,Y
1,No Urgent Place,Friend(s),Sunny,80,10AM,Coffee House,2h,Female,21,Unmarried partner,...,never,,4~8,1~3,1,0,0,0,1,0
3,No Urgent Place,Friend(s),Sunny,80,2PM,Coffee House,2h,Female,21,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,0
4,No Urgent Place,Friend(s),Sunny,80,2PM,Coffee House,1d,Female,21,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,0
12,No Urgent Place,Kid(s),Sunny,55,6PM,Coffee House,2h,Female,21,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,1
15,Home,Alone,Sunny,80,6PM,Coffee House,2h,Female,21,Unmarried partner,...,never,,4~8,1~3,1,0,0,0,1,0
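
The `head()` preview only shows five rows, so a fuller check might compare the subset against figures computed from the full frame. The sketch below is a minimal version of that idea. The helper `check_subset` is hypothetical, and the `toy` frame is invented purely so the example runs on its own; in the notebook the call would presumably be `check_subset(data, coffee_df, 'Coffee House')`.

```python
import pandas as pd


def check_subset(full: pd.DataFrame, subset: pd.DataFrame, coupon: str) -> dict:
    """Compare a coupon subset against the count and rate computed from the full frame."""
    mask = full["coupon"] == coupon
    expected_rows = int(mask.sum())
    expected_rate = full.loc[mask, "Y"].mean()
    return {
        "only_this_coupon": bool((subset["coupon"] == coupon).all()),
        "row_count_matches": len(subset) == expected_rows,
        "rate_matches": abs(subset["Y"].mean() - expected_rate) < 1e-12,
    }


# Toy stand-in for `data`; these rows are invented for illustration only.
toy = pd.DataFrame(
    {
        "coupon": ["Coffee House", "Bar", "Coffee House", "Bar", "Coffee House"],
        "Y": [1, 0, 0, 1, 1],
    }
)
toy_coffee = toy[toy["coupon"] == "Coffee House"].copy()

checks = check_subset(toy, toy_coffee, "Coffee House")
print(checks)
```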


