### Will a Customer Accept the Coupon?

**Context**

Imagine driving through town when a coupon for a restaurant near where you are driving is delivered to your cell phone. Would you accept that coupon and take a short detour to the restaurant? Would you accept the coupon but use it on a subsequent trip? Would you ignore the coupon entirely? What if the coupon was for a bar instead of a restaurant? What about a coffee house? Would you accept a bar coupon with a minor passenger in the car? What about if it was just you and your partner in the car? Would weather impact the rate of acceptance? What about the time of day?

Obviously, proximity to the business is a factor in whether the coupon is delivered to the driver, but what are the factors that determine whether a driver accepts the coupon once it is delivered? How would you determine whether a driver is likely to accept a coupon?

**Overview**

The goal of this project is to use what you know about visualizations and probability distributions to distinguish between customers who accepted a driving coupon versus those that did not.

**Data**

This data comes from the UCI Machine Learning Repository and was collected via a survey on Amazon Mechanical Turk. The survey describes different driving scenarios, including the destination, current time, weather, passenger, etc., and then asks the respondent whether they would accept the coupon if they were the driver. Answers that the user will drive there ‘right away’ or ‘later before the coupon expires’ are labeled ‘Y = 1’, and answers of ‘no, I do not want the coupon’ are labeled ‘Y = 0’. There are five types of coupons: less expensive restaurants (under \\$20), coffee houses, carry out & take away, bars, and more expensive restaurants (\\$20 - \\$50).

**Deliverables**

Your final product should be a brief report that highlights the differences between customers who did and did not accept the coupons. To explore the data, you will use your knowledge of plotting, statistical summaries, and visualization in Python. You will publish your findings in a public-facing GitHub repository as your first portfolio piece.





### Data Description
Keep in mind that these values mentioned below are average values.

The attributes of this data set include:
1. User attributes
    -  Gender: male, female
    -  Age: below 21, 21 to 25, 26 to 30, etc.
    -  Marital Status: single, married partner, unmarried partner, or widowed
    -  Number of children: 0, 1, or more than 1
    -  Education: high school, bachelors degree, associates degree, or graduate degree
    -  Occupation: architecture & engineering, business & financial, etc.
    -  Annual income: less than \\$12500, \\$12500 - \\$24999, \\$25000 - \\$37499, etc.
    -  Number of times that he/she goes to a bar: 0, less than 1, 1 to 3, 4 to 8 or greater than 8
    -  Number of times that he/she buys takeaway food: 0, less than 1, 1 to 3, 4 to 8 or greater than 8
    -  Number of times that he/she goes to a coffee house: 0, less than 1, 1 to 3, 4 to 8 or greater than 8
    -  Number of times that he/she eats at a restaurant with average expense less than \\$20 per person: 0, less than 1, 1 to 3, 4 to 8 or greater than 8
    -  Number of times that he/she eats at a restaurant with average expense of \\$20 to \\$50 per person: 0, less than 1, 1 to 3, 4 to 8 or greater than 8
    

2. Contextual attributes
    - Driving destination: home, work, or no urgent destination
    - Location of user, coupon and destination: we provide a map to show the geographical location of the user, destination, and the venue, and we mark the distance between each two places with time of driving. The user can see whether the venue is in the same direction as the destination.
    - Weather: sunny, rainy, or snowy
    - Temperature: 30F, 55F, or 80F
    - Time: 10AM, 2PM, or 6PM
    - Passenger: alone, partner, kid(s), or friend(s)


3. Coupon attributes
    - time before it expires: 2 hours or one day

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
import plotly.express as px

### Problems

Use the prompts below to get started with your data analysis.  

1. Read in the `coupons.csv` file.




In [315]:
data = pd.read_csv('data/coupons.csv')

In [316]:
data.head()

Unnamed: 0,destination,passanger,weather,temperature,time,coupon,expiration,gender,age,maritalStatus,...,CoffeeHouse,CarryAway,RestaurantLessThan20,Restaurant20To50,toCoupon_GEQ5min,toCoupon_GEQ15min,toCoupon_GEQ25min,direction_same,direction_opp,Y
0,No Urgent Place,Alone,Sunny,55,2PM,Restaurant(<20),1d,Female,21,Unmarried partner,...,never,,4~8,1~3,1,0,0,0,1,1
1,No Urgent Place,Friend(s),Sunny,80,10AM,Coffee House,2h,Female,21,Unmarried partner,...,never,,4~8,1~3,1,0,0,0,1,0
2,No Urgent Place,Friend(s),Sunny,80,10AM,Carry out & Take away,2h,Female,21,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,1
3,No Urgent Place,Friend(s),Sunny,80,2PM,Coffee House,2h,Female,21,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,0
4,No Urgent Place,Friend(s),Sunny,80,2PM,Coffee House,1d,Female,21,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,0


2. Investigate the dataset for missing or problematic data.

In [317]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12684 entries, 0 to 12683
Data columns (total 26 columns):
 #   Column                Non-Null Count  Dtype 
---  ------                --------------  ----- 
 0   destination           12684 non-null  object
 1   passanger             12684 non-null  object
 2   weather               12684 non-null  object
 3   temperature           12684 non-null  int64 
 4   time                  12684 non-null  object
 5   coupon                12684 non-null  object
 6   expiration            12684 non-null  object
 7   gender                12684 non-null  object
 8   age                   12684 non-null  object
 9   maritalStatus         12684 non-null  object
 10  has_children          12684 non-null  int64 
 11  education             12684 non-null  object
 12  occupation            12684 non-null  object
 13  income                12684 non-null  object
 14  car                   108 non-null    object
 15  Bar                   12577 non-null

In [319]:
print(len(data))

# display the unique values and the value counts of every column
for column in data.columns:
    print(data[column].unique())
    print(data[column].value_counts())

data.info()

12684
['No Urgent Place' 'Home' 'Work']
No Urgent Place    6283
Home               3237
Work               3164
Name: destination, dtype: int64
['Alone' 'Friend(s)' 'Kid(s)' 'Partner']
Alone        7305
Friend(s)    3298
Partner      1075
Kid(s)       1006
Name: passanger, dtype: int64
['Sunny' 'Rainy' 'Snowy']
Sunny    10069
Snowy     1405
Rainy     1210
Name: weather, dtype: int64
[55 80 30]
80    6528
55    3840
30    2316
Name: temperature, dtype: int64
['2PM' '10AM' '6PM' '7AM' '10PM']
6PM     3230
7AM     3164
10AM    2275
2PM     2009
10PM    2006
Name: time, dtype: int64
['Restaurant(<20)' 'Coffee House' 'Carry out & Take away' 'Bar'
 'Restaurant(20-50)']
Coffee House             3996
Restaurant(<20)          2786
Carry out & Take away    2393
Bar                      2017
Restaurant(20-50)        1492
Name: coupon, dtype: int64
['1d' '2h']
1d    7091
2h    5593
Name: expiration, dtype: int64
['Female' 'Male']
Female    6511
Male      6173
Name: gender, dtype: int64
['21' '46' 
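A more direct way to quantify missing data is to count the null entries in each column. The sketch below uses a small made-up frame (the values are illustrative and not taken from `coupons.csv`); with the real data, the same calls could be applied to `data`.

```python
import pandas as pd

# Toy stand-in for the survey data; the values are illustrative only
toy = pd.DataFrame({
    "car": [None, None, "some car", None],
    "Bar": ["never", None, "1~3", "less1"],
    "Y": [1, 0, 1, 0],
})

# Number of missing values per column, largest first
missing_counts = toy.isna().sum().sort_values(ascending=False)

# Share of rows that are missing in each column, as a percentage
missing_pct = (toy.isna().mean() * 100).round(2)

print(missing_counts)
print(missing_pct)
```

Columns that are missing in most rows are candidates for being dropped entirely, while columns with only a few gaps can be handled row by row.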

3. Decide what to do about your missing data -- drop, replace, other...

In [321]:
# Note: convert_dtypes() returns a new DataFrame; the result is not assigned, so `data` keeps its original dtypes
data.convert_dtypes()

# rename columns: fix the misspelled `passanger` and use lower camel case for the frequency columns
data = data.rename(columns={"passanger": "passenger","Bar":"bar","CoffeeHouse":"coffeeHouse","CarryAway":"carryAway",
                           "RestaurantLessThan20":"restaurantLessThan20","Restaurant20To50":"restaurant20To50"})

# express the one-day expiration in hours
data['expiration'] = data['expiration'].str.replace('1d','24h')

# map each age code to the range it represents; other values are left unchanged
data['age'] = data['age'].replace({'21': '21-25', '26': '26-30', '31': '31-35',
                                   '36': '36-40', '41': '41-45', '46': '46-50'})

# replace '~' with '-' in the coffeeHouse and carryAway labels only
# (bar, restaurantLessThan20 and restaurant20To50 keep the '~' form used by the queries further down)
for column in ['coffeeHouse', 'carryAway']:
    data[column] = data[column].str.replace('1~3','1-3')
    data[column] = data[column].str.replace('4~8','4-8')

# Note: dropna() also returns a new DataFrame. None of the results below is assigned,
# so no rows are removed from `data`. The `car` column has only 108 non-null values,
# so dropping on it would discard almost every row.
data.dropna(subset=['car'])
data.dropna(subset=['bar'])
data.dropna(subset=['coffeeHouse'])
data.dropna(subset=['carryAway'])
data.dropna(subset=['restaurantLessThan20'])
data.dropna(subset=['restaurant20To50'])


Unnamed: 0,destination,passenger,weather,temperature,time,coupon,expiration,gender,age,maritalStatus,...,coffeeHouse,carryAway,restaurantLessThan20,restaurant20To50,toCoupon_GEQ5min,toCoupon_GEQ15min,toCoupon_GEQ25min,direction_same,direction_opp,Y
0,No Urgent Place,Alone,Sunny,55,2PM,Restaurant(<20),24h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,0,0,0,1,1
1,No Urgent Place,Friend(s),Sunny,80,10AM,Coffee House,2h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,0,0,0,1,0
2,No Urgent Place,Friend(s),Sunny,80,10AM,Carry out & Take away,2h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,1
3,No Urgent Place,Friend(s),Sunny,80,2PM,Coffee House,2h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,0
4,No Urgent Place,Friend(s),Sunny,80,2PM,Coffee House,24h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
12679,Home,Partner,Rainy,55,6PM,Carry out & Take away,24h,Male,26-30,Single,...,never,1-3,4~8,1~3,1,0,0,1,0,1
12680,Work,Alone,Rainy,55,7AM,Carry out & Take away,24h,Male,26-30,Single,...,never,1-3,4~8,1~3,1,0,0,0,1,1
12681,Work,Alone,Snowy,30,7AM,Coffee House,24h,Male,26-30,Single,...,never,1-3,4~8,1~3,1,0,0,1,0,0
12682,Work,Alone,Snowy,30,7AM,Bar,24h,Male,26-30,Single,...,never,1-3,4~8,1~3,1,1,1,0,1,0
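Keep in mind that `dropna` returns a new frame rather than modifying the original, so its result has to be assigned for any rows to be removed. One possible cleaning strategy, sketched here on a toy frame and offered only as an assumption about how one might proceed, is to drop a column that is almost entirely empty and then drop the few rows with gaps in the remaining columns.

```python
import pandas as pd

# Toy stand-in for the survey data; the values are illustrative only
toy = pd.DataFrame({
    "car": [None, None, None, "some car"],
    "bar": ["never", None, "1~3", "less1"],
    "coffeeHouse": ["less1", "1~3", None, "never"],
    "Y": [1, 0, 1, 0],
})

# Drop the mostly empty column instead of dropping rows on it
cleaned = toy.drop(columns=["car"])

# Assign the result: dropna() does not modify the frame in place
cleaned = cleaned.dropna(subset=["bar", "coffeeHouse"])

print(len(toy), len(cleaned))
```

Applying a step like this to the real data would change the row count, so every figure computed afterwards would need to be re-run.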


4. What proportion of the total observations chose to accept the coupon? 



In [322]:
# acceptance rate (%) = number of accepted coupons / total number of coupons * 100
data['Y'].value_counts()
coupon_accepted_proportion = round((len(data[data['Y']==1])/len(data))*100,2)
print(coupon_accepted_proportion)


56.84


5. Use a bar plot to visualize the `coupon` column.

In [323]:


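# Y is 0/1, so summing it per coupon type gives the number of accepted coupons of that type
px.bar(data.groupby('coupon')['Y'].sum().reset_index(),x="coupon",y="Y",
       title="Accepted coupons by coupon type",labels=dict(coupon="Coupon", Y="Accepted coupons"))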

6. Use a histogram to visualize the temperature column.

In [324]:
fig = px.histogram(data, x="temperature",nbins=50,title="Temperature count",labels=dict(temperature="Temperature", Y="Count"))
fig.show()

**Investigating the Bar Coupons**

Now, we will lead you through an exploration of just the bar related coupons.  

1. Create a new `DataFrame` that contains just the bar coupons.


In [339]:
df_bar_coupon = data.query('coupon=="Bar"')
print(df_bar_coupon)
df_bar_coupon

           destination  passenger weather  temperature  time coupon  \
9      No Urgent Place     Kid(s)   Sunny           80  10AM    Bar   
13                Home      Alone   Sunny           55   6PM    Bar   
17                Work      Alone   Sunny           55   7AM    Bar   
24     No Urgent Place  Friend(s)   Sunny           80  10AM    Bar   
35                Home      Alone   Sunny           55   6PM    Bar   
...                ...        ...     ...          ...   ...    ...   
12663  No Urgent Place  Friend(s)   Sunny           80  10PM    Bar   
12664  No Urgent Place  Friend(s)   Sunny           55  10PM    Bar   
12667  No Urgent Place      Alone   Rainy           55  10AM    Bar   
12670  No Urgent Place    Partner   Rainy           55   6PM    Bar   
12682             Work      Alone   Snowy           30   7AM    Bar   

      expiration  gender    age      maritalStatus  ...  coffeeHouse  \
9            24h  Female  21-25  Unmarried partner  ...        never   
13 

Unnamed: 0,destination,passenger,weather,temperature,time,coupon,expiration,gender,age,maritalStatus,...,coffeeHouse,carryAway,restaurantLessThan20,restaurant20To50,toCoupon_GEQ5min,toCoupon_GEQ15min,toCoupon_GEQ25min,direction_same,direction_opp,Y
9,No Urgent Place,Kid(s),Sunny,80,10AM,Bar,24h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,0
13,Home,Alone,Sunny,55,6PM,Bar,24h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,0,0,1,0,1
17,Work,Alone,Sunny,55,7AM,Bar,24h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,1,1,0,1,0
24,No Urgent Place,Friend(s),Sunny,80,10AM,Bar,24h,Male,21-25,Single,...,less1,4-8,4~8,less1,1,0,0,0,1,1
35,Home,Alone,Sunny,55,6PM,Bar,24h,Male,21-25,Single,...,less1,4-8,4~8,less1,1,0,0,1,0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
12663,No Urgent Place,Friend(s),Sunny,80,10PM,Bar,24h,Male,26-30,Single,...,never,1-3,4~8,1~3,1,1,0,0,1,0
12664,No Urgent Place,Friend(s),Sunny,55,10PM,Bar,2h,Male,26-30,Single,...,never,1-3,4~8,1~3,1,1,0,0,1,0
12667,No Urgent Place,Alone,Rainy,55,10AM,Bar,24h,Male,26-30,Single,...,never,1-3,4~8,1~3,1,1,0,0,1,0
12670,No Urgent Place,Partner,Rainy,55,6PM,Bar,2h,Male,26-30,Single,...,never,1-3,4~8,1~3,1,1,0,0,1,0


In [340]:
# coffee house coupons
df_coffee_coupon = data.query('coupon=="Coffee House"')
print(df_coffee_coupon)
df_coffee_coupon

           destination  passenger weather  temperature  time        coupon  \
1      No Urgent Place  Friend(s)   Sunny           80  10AM  Coffee House   
3      No Urgent Place  Friend(s)   Sunny           80   2PM  Coffee House   
4      No Urgent Place  Friend(s)   Sunny           80   2PM  Coffee House   
12     No Urgent Place     Kid(s)   Sunny           55   6PM  Coffee House   
15                Home      Alone   Sunny           80   6PM  Coffee House   
...                ...        ...     ...          ...   ...           ...   
12656             Home      Alone   Snowy           30  10PM  Coffee House   
12659             Work      Alone   Snowy           30   7AM  Coffee House   
12674             Home      Alone   Rainy           55  10PM  Coffee House   
12675             Home      Alone   Snowy           30  10PM  Coffee House   
12681             Work      Alone   Snowy           30   7AM  Coffee House   

      expiration  gender    age      maritalStatus  ...  coffee

Unnamed: 0,destination,passenger,weather,temperature,time,coupon,expiration,gender,age,maritalStatus,...,coffeeHouse,carryAway,restaurantLessThan20,restaurant20To50,toCoupon_GEQ5min,toCoupon_GEQ15min,toCoupon_GEQ25min,direction_same,direction_opp,Y
1,No Urgent Place,Friend(s),Sunny,80,10AM,Coffee House,2h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,0,0,0,1,0
3,No Urgent Place,Friend(s),Sunny,80,2PM,Coffee House,2h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,0
4,No Urgent Place,Friend(s),Sunny,80,2PM,Coffee House,24h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,0
12,No Urgent Place,Kid(s),Sunny,55,6PM,Coffee House,2h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,1
15,Home,Alone,Sunny,80,6PM,Coffee House,2h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,0,0,0,1,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
12656,Home,Alone,Snowy,30,10PM,Coffee House,2h,Male,31-35,Married partner,...,never,4-8,gt8,less1,1,1,0,0,1,0
12659,Work,Alone,Snowy,30,7AM,Coffee House,24h,Male,31-35,Married partner,...,never,4-8,gt8,less1,1,0,0,1,0,0
12674,Home,Alone,Rainy,55,10PM,Coffee House,2h,Male,26-30,Single,...,never,1-3,4~8,1~3,1,0,0,1,0,0
12675,Home,Alone,Snowy,30,10PM,Coffee House,2h,Male,26-30,Single,...,never,1-3,4~8,1~3,1,1,0,0,1,0


In [342]:
# carry out & take away coupons
df_carryout_coupon = data.query('coupon=="Carry out & Take away"')
print(df_carryout_coupon)
df_carryout_coupon

           destination  passenger weather  temperature  time  \
2      No Urgent Place  Friend(s)   Sunny           80  10AM   
6      No Urgent Place  Friend(s)   Sunny           55   2PM   
8      No Urgent Place     Kid(s)   Sunny           80  10AM   
19                Work      Alone   Sunny           80   7AM   
25     No Urgent Place  Friend(s)   Sunny           80  10AM   
...                ...        ...     ...          ...   ...   
12665  No Urgent Place  Friend(s)   Sunny           30  10AM   
12672             Home      Alone   Sunny           80   6PM   
12673             Home      Alone   Sunny           30   6PM   
12679             Home    Partner   Rainy           55   6PM   
12680             Work      Alone   Rainy           55   7AM   

                      coupon expiration  gender    age      maritalStatus  \
2      Carry out & Take away         2h  Female  21-25  Unmarried partner   
6      Carry out & Take away        24h  Female  21-25  Unmarried partner   


Unnamed: 0,destination,passenger,weather,temperature,time,coupon,expiration,gender,age,maritalStatus,...,coffeeHouse,carryAway,restaurantLessThan20,restaurant20To50,toCoupon_GEQ5min,toCoupon_GEQ15min,toCoupon_GEQ25min,direction_same,direction_opp,Y
2,No Urgent Place,Friend(s),Sunny,80,10AM,Carry out & Take away,2h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,1
6,No Urgent Place,Friend(s),Sunny,55,2PM,Carry out & Take away,24h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,1
8,No Urgent Place,Kid(s),Sunny,80,10AM,Carry out & Take away,2h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,1
19,Work,Alone,Sunny,80,7AM,Carry out & Take away,2h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,0,0,1,0,1
25,No Urgent Place,Friend(s),Sunny,80,10AM,Carry out & Take away,2h,Male,21-25,Single,...,less1,4-8,4~8,less1,1,1,0,0,1,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
12665,No Urgent Place,Friend(s),Sunny,30,10AM,Carry out & Take away,2h,Male,26-30,Single,...,never,1-3,4~8,1~3,1,0,0,0,1,1
12672,Home,Alone,Sunny,80,6PM,Carry out & Take away,2h,Male,26-30,Single,...,never,1-3,4~8,1~3,1,1,0,1,0,0
12673,Home,Alone,Sunny,30,6PM,Carry out & Take away,24h,Male,26-30,Single,...,never,1-3,4~8,1~3,1,0,0,0,1,0
12679,Home,Partner,Rainy,55,6PM,Carry out & Take away,24h,Male,26-30,Single,...,never,1-3,4~8,1~3,1,0,0,1,0,1


In [346]:
# restaurant coupons for less than $20
df_restaurant_lessthan20_coupon = data.query('coupon=="Restaurant(<20)"')
print(df_restaurant_lessthan20_coupon)
df_restaurant_lessthan20_coupon

           destination  passenger weather  temperature  time           coupon  \
0      No Urgent Place      Alone   Sunny           55   2PM  Restaurant(<20)   
5      No Urgent Place  Friend(s)   Sunny           80   6PM  Restaurant(<20)   
7      No Urgent Place     Kid(s)   Sunny           80  10AM  Restaurant(<20)   
10     No Urgent Place     Kid(s)   Sunny           80   2PM  Restaurant(<20)   
11     No Urgent Place     Kid(s)   Sunny           55   2PM  Restaurant(<20)   
...                ...        ...     ...          ...   ...              ...   
12666  No Urgent Place  Friend(s)   Snowy           30   2PM  Restaurant(<20)   
12668  No Urgent Place      Alone   Sunny           80  10AM  Restaurant(<20)   
12671  No Urgent Place    Partner   Snowy           30  10AM  Restaurant(<20)   
12677             Home    Partner   Sunny           30   6PM  Restaurant(<20)   
12678             Home    Partner   Sunny           30  10PM  Restaurant(<20)   

      expiration  gender   

Unnamed: 0,destination,passenger,weather,temperature,time,coupon,expiration,gender,age,maritalStatus,...,coffeeHouse,carryAway,restaurantLessThan20,restaurant20To50,toCoupon_GEQ5min,toCoupon_GEQ15min,toCoupon_GEQ25min,direction_same,direction_opp,Y
0,No Urgent Place,Alone,Sunny,55,2PM,Restaurant(<20),24h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,0,0,0,1,1
5,No Urgent Place,Friend(s),Sunny,80,6PM,Restaurant(<20),2h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,1
7,No Urgent Place,Kid(s),Sunny,80,10AM,Restaurant(<20),2h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,1
10,No Urgent Place,Kid(s),Sunny,80,2PM,Restaurant(<20),24h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,0,0,0,1,1
11,No Urgent Place,Kid(s),Sunny,55,2PM,Restaurant(<20),24h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
12666,No Urgent Place,Friend(s),Snowy,30,2PM,Restaurant(<20),24h,Male,26-30,Single,...,never,1-3,4~8,1~3,1,0,0,0,1,1
12668,No Urgent Place,Alone,Sunny,80,10AM,Restaurant(<20),2h,Male,26-30,Single,...,never,1-3,4~8,1~3,1,0,0,0,1,1
12671,No Urgent Place,Partner,Snowy,30,10AM,Restaurant(<20),24h,Male,26-30,Single,...,never,1-3,4~8,1~3,1,0,0,0,1,1
12677,Home,Partner,Sunny,30,6PM,Restaurant(<20),24h,Male,26-30,Single,...,never,1-3,4~8,1~3,1,1,1,0,1,1


In [347]:
#restaurant coupon for more than $20 and less than $50
df_restaurant_20_to_50_coupon = data.query('coupon=="Restaurant(20-50)"')
print(df_restaurant_20_to_50_coupon)
df_restaurant_20_to_50_coupon

           destination passenger weather  temperature  time  \
14                Home     Alone   Sunny           55   6PM   
18                Work     Alone   Sunny           80   7AM   
36                Home     Alone   Sunny           55   6PM   
40                Work     Alone   Sunny           80   7AM   
58                Home     Alone   Sunny           55   6PM   
...                ...       ...     ...          ...   ...   
12657             Home     Alone   Sunny           80   6PM   
12661             Work     Alone   Sunny           80   7AM   
12669  No Urgent Place   Partner   Sunny           30  10AM   
12676             Home     Alone   Sunny           80   6PM   
12683             Work     Alone   Sunny           80   7AM   

                  coupon expiration  gender    age      maritalStatus  ...  \
14     Restaurant(20-50)        24h  Female  21-25  Unmarried partner  ...   
18     Restaurant(20-50)        24h  Female  21-25  Unmarried partner  ...   
36     Re

Unnamed: 0,destination,passenger,weather,temperature,time,coupon,expiration,gender,age,maritalStatus,...,coffeeHouse,carryAway,restaurantLessThan20,restaurant20To50,toCoupon_GEQ5min,toCoupon_GEQ15min,toCoupon_GEQ25min,direction_same,direction_opp,Y
14,Home,Alone,Sunny,55,6PM,Restaurant(20-50),24h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,1
18,Work,Alone,Sunny,80,7AM,Restaurant(20-50),24h,Female,21-25,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,1
36,Home,Alone,Sunny,55,6PM,Restaurant(20-50),24h,Male,21-25,Single,...,less1,4-8,4~8,less1,1,1,0,0,1,0
40,Work,Alone,Sunny,80,7AM,Restaurant(20-50),24h,Male,21-25,Single,...,less1,4-8,4~8,less1,1,1,0,0,1,0
58,Home,Alone,Sunny,55,6PM,Restaurant(20-50),24h,Male,46-50,Single,...,4-8,1-3,1~3,never,1,1,0,0,1,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
12657,Home,Alone,Sunny,80,6PM,Restaurant(20-50),24h,Male,31-35,Married partner,...,never,4-8,gt8,less1,1,0,0,1,0,0
12661,Work,Alone,Sunny,80,7AM,Restaurant(20-50),2h,Male,31-35,Married partner,...,never,4-8,gt8,less1,1,0,0,1,0,0
12669,No Urgent Place,Partner,Sunny,30,10AM,Restaurant(20-50),24h,Male,26-30,Single,...,never,1-3,4~8,1~3,1,0,0,0,1,1
12676,Home,Alone,Sunny,80,6PM,Restaurant(20-50),24h,Male,26-30,Single,...,never,1-3,4~8,1~3,1,0,0,1,0,1


In [359]:
print(len(data[data['Y']==0]))

5474


2. What proportion of bar coupons were accepted?


In [343]:
bar_coupon_accepted_proportion = round((len(df_bar_coupon[df_bar_coupon['Y']==1])/len(df_bar_coupon))*100,2)
print(bar_coupon_accepted_proportion)

41.0


In [344]:
coffee_coupon_accepted_proportion = round((len(df_coffee_coupon[df_coffee_coupon['Y']==1])/len(df_coffee_coupon))*100,2)
print(coffee_coupon_accepted_proportion)

49.92


In [345]:
carryout_coupon_accepted_proportion = round((len(df_carryout_coupon[df_carryout_coupon['Y']==1])/len(df_carryout_coupon))*100,2)
print(carryout_coupon_accepted_proportion)

73.55


In [348]:
restaurant_less_than_20_coupon_accepted_proportion = round((len(df_restaurant_lessthan20_coupon[df_restaurant_lessthan20_coupon['Y']==1])/len(df_restaurant_lessthan20_coupon))*100,2)
print(restaurant_less_than_20_coupon_accepted_proportion)

70.71


In [349]:
restaurant_more_than_20_less_than_50_coupon_accepted_proportion = round((len(df_restaurant_20_to_50_coupon[df_restaurant_20_to_50_coupon['Y']==1])/len(df_restaurant_20_to_50_coupon))*100,2)
print(restaurant_more_than_20_less_than_50_coupon_accepted_proportion)

44.1
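Because `Y` is coded as 0 or 1, its mean within a group is that group's acceptance rate. The five per-coupon calculations above can therefore be expressed as a single `groupby`. The sketch below uses a toy frame with made-up values; on the real data the same expression could be applied to `data`.

```python
import pandas as pd

# Toy stand-in for the survey data; the values are illustrative only
toy = pd.DataFrame({
    "coupon": ["Bar", "Bar", "Bar", "Bar",
               "Coffee House", "Coffee House", "Coffee House", "Coffee House"],
    "Y": [1, 0, 0, 0, 1, 1, 1, 0],
})

# Mean of the 0/1 outcome per coupon type, expressed as a percentage
acceptance_by_coupon = (
    toy.groupby("coupon")["Y"]
    .mean()
    .mul(100)
    .round(2)
    .sort_values(ascending=False)
)

print(acceptance_by_coupon)
```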


3. Compare the acceptance rate between those who went to a bar 3 or fewer times a month to those who went more.


In [333]:
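# Ratio (in %) of drivers who go to a bar 'less1' or '1~3' times a month to drivers who go '4~8' or 'gt8' times.
# This compares group sizes across all coupon types; it is not an acceptance rate.
bar_acceptance = round((len(data[data['bar']=='less1']) + len(data[data['bar']=='1~3']))/(len(data[data['bar']=='4~8']) + len(data[data['bar']=='gt8']))*100,2)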
print(bar_acceptance)


417.89
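The figure above is a ratio of group sizes. To compare acceptance rates instead, one approach is to split the bar-coupon rows into two groups by visit frequency and take the mean of `Y` in each. The sketch below uses a toy frame with made-up values; the labels follow the `bar` values seen in this notebook, and counting `never` in the "3 or fewer" group is an assumption.

```python
import pandas as pd

# Toy stand-in for the bar-coupon rows; the values are illustrative only
toy_bar = pd.DataFrame({
    "bar": ["never", "less1", "1~3", "4~8", "gt8", "1~3", "gt8", "never", "4~8"],
    "Y":   [0,       0,       1,     1,     1,     0,     1,     0,       0],
})

# True for drivers who go to a bar 3 or fewer times a month
three_or_fewer = toy_bar["bar"].isin(["never", "less1", "1~3"])

# Acceptance rate (%) within each group
rate_three_or_fewer = round(toy_bar.loc[three_or_fewer, "Y"].mean() * 100, 2)
rate_more_than_three = round(toy_bar.loc[~three_or_fewer, "Y"].mean() * 100, 2)

print(rate_three_or_fewer, rate_more_than_three)
```

Rows with a missing `bar` value would fall into the second group under this mask, so they would need to be handled first on the real data.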


4. Compare the acceptance rate between drivers who go to a bar more than once a month and are over the age of 25 to the all others.  Is there a difference?


In [336]:
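# These are counts of drivers in each group (across all coupon types), not acceptance rates.
# Note: ages were recoded to '36-40' earlier, so the '36~40' label in the first query matches no rows.
drivers_bar_more_than_once_and_age_over_25=len(data.query(('(bar=="1~3" | bar=="4~8" | bar=="gt8") & (age=="26-30" | age=="31-35" | age=="36~40" | age=="41-45" | age=="46-50" | age=="50plus")')))
drivers_bar_less_than_once_and_age_less_than_25 = len(data.query(('(bar=="less1" | bar=="never") & (age=="below21" | age=="21-25")')))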

driver_bar_more_than_once_verses_less = round((drivers_bar_more_than_once_and_age_over_25/drivers_bar_less_than_once_and_age_less_than_25)*100,2)
print(driver_bar_more_than_once_verses_less)
if driver_bar_more_than_once_verses_less > 100:
      print("More drivers aged > 25 years go to bar more than once a month than less than 25 years ")
else:
      print("Less drivers aged > 25 years go to bar more than once a month than less than 25 years ")

121.32
More drivers aged > 25 years go to bar more than once a month than less than 25 years 


5. Use the same process to compare the acceptance rate between drivers who go to bars more than once a month and had passengers that were not a kid and had occupations other than farming, fishing, or forestry. 


In [337]:
drivers_bar_more_than_once_and_not_kid_and_not_farming=len(data.query(('(bar=="1~3" | bar=="4~8" | bar=="gt8") & \
(passenger!="Kid(s)") & \
(occupation!="Farming Fishing & Forestry")')))

drivers_bar_more_than_once_and_kid_and_farming=len(data.query(('(bar=="less1" | bar=="never") & \
(passenger=="Kid(s)") & \
(occupation=="Farming Fishing & Forestry")')))

acceptance_rate_drivers_bar_more_than_once = round((drivers_bar_more_than_once_and_not_kid_and_not_farming/drivers_bar_more_than_once_and_kid_and_farming)*100,2)
print(acceptance_rate_drivers_bar_more_than_once)

if acceptance_rate_drivers_bar_more_than_once > 100:
      print("More drivers without kid passengers and not employed in farming/fishing/forestry went to bar ")
else:
      print("Fewer drivers without kid passengers and not employed in farming/fishing/forestry went to bar")

36960.0
More drivers without kid passengers and not employed in farming/fishing/forestry went to bar 


6. Compare the acceptance rates between those drivers who:

- go to bars more than once a month, had passengers that were not a kid, and were not widowed *OR*
- go to bars more than once a month and are under the age of 30 *OR*
- go to cheap restaurants more than 4 times a month and income is less than 50K. 



In [338]:
bar_more_than_once_a_month_not_kid_passenger_not_widowed = len(data.query(('((bar=="1~3" | bar=="4~8" | bar=="gt8") & \
                             (passenger!="Kid(s)") & (maritalStatus!="Widowed")) | ((bar!="less1" | bar!="never") &  (age=="below21" | age=="21-25")) | ((restaurantLessThan20=="4~8" | restaurantLessThan20=="gt8") & (income=="Less than $12500" | income=="$12500 - $24999" | income=="$25000 - $37499" | income=="$37500 - $49999"))')))


bar_not_more_than_once_a_month_not_kid_passenger_not_widowed = len(data.query(('not ((bar=="1~3" | bar=="4~8" | bar=="gt8") & \
                             (passenger!="Kid(s)") & (maritalStatus!="Widowed")) | ((bar!="less1" | bar!="never") &  (age=="below21" | age=="21-25")) | ((restaurantLessThan20=="4~8" | restaurantLessThan20=="gt8") & (income=="Less than $12500" | income=="$12500 - $24999" | income=="$25000 - $37499" | income=="$37500 - $49999"))')))


acceptance_rate2 = round((bar_more_than_once_a_month_not_kid_passenger_not_widowed/bar_not_more_than_once_a_month_not_kid_passenger_not_widowed)*100,2)
print(acceptance_rate2)

63.04
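The value above is a ratio of group sizes. An alternative that yields acceptance rates is to build one boolean mask per condition, combine them with `|`, and compare the mean of `Y` inside and outside the combined group. The sketch below uses a toy frame with made-up values. Treating "under 30" as the `below21` and `21-25` bands follows the query above, and the `higher (placeholder)` income label is hypothetical.

```python
import pandas as pd

# Toy stand-in for the bar-coupon rows; the values are illustrative only
toy = pd.DataFrame({
    "bar": ["1~3", "never", "gt8", "less1", "4~8", "never"],
    "passenger": ["Alone", "Kid(s)", "Friend(s)", "Alone", "Kid(s)", "Partner"],
    "maritalStatus": ["Single", "Married partner", "Widowed", "Single", "Single", "Single"],
    "age": ["21-25", "31-35", "50plus", "26-30", "below21", "41-45"],
    "restaurantLessThan20": ["1~3", "4~8", "less1", "gt8", "never", "1~3"],
    "income": ["$25000 - $37499", "$12500 - $24999", "higher (placeholder)",
               "higher (placeholder)", "$37500 - $49999", "higher (placeholder)"],
    "Y": [1, 1, 0, 0, 0, 1],
})

under_50k = ["Less than $12500", "$12500 - $24999", "$25000 - $37499", "$37500 - $49999"]
frequent_bar = toy["bar"].isin(["1~3", "4~8", "gt8"])

# Condition 1: frequent bar visitor, no kid passenger, not widowed
group_a = frequent_bar & (toy["passenger"] != "Kid(s)") & (toy["maritalStatus"] != "Widowed")
# Condition 2: frequent bar visitor in the two youngest age bands
group_b = frequent_bar & toy["age"].isin(["below21", "21-25"])
# Condition 3: cheap restaurants more than 4 times a month and income under 50K
group_c = toy["restaurantLessThan20"].isin(["4~8", "gt8"]) & toy["income"].isin(under_50k)

# A driver is selected if any of the three conditions holds
selected = group_a | group_b | group_c

rate_selected = round(toy.loc[selected, "Y"].mean() * 100, 2)
rate_others = round(toy.loc[~selected, "Y"].mean() * 100, 2)

print(rate_selected, rate_others)
```

Building the masks separately keeps the grouping of each condition explicit, and `~selected` gives exactly the complement of the combined group.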


7.  Based on these observations, what do you hypothesize about drivers who accepted the bar coupons?

# OBSERVATIONS

The data set contains 12,684 coupon offers. Drivers accepted 7,210 of them (56.84%) and declined the remaining 5,474.
Carry-out coupons were accepted more often than any other category. The acceptance rate for each coupon category was:

- Carry out & Take away : 73.55% accepted
- Restaurant(<20)       : 70.71% accepted
- Coffee House          : 49.92% accepted
- Restaurant(20-50)     : 44.1% accepted
- Bar                   : 41% accepted

In effect, bar coupons were the least accepted by drivers.

### Independent Investigation

Using the bar coupon example as motivation, you are to explore one of the other coupon groups and try to determine the characteristics of passengers who accept the coupons.  