### Will a Customer Accept the Coupon?

**Context**

Imagine driving through town and a coupon is delivered to your cell phone for a restaurant near where you are driving. Would you accept that coupon and take a short detour to the restaurant? Would you accept the coupon but use it on a subsequent trip? Would you ignore the coupon entirely? What if the coupon was for a bar instead of a restaurant? What about a coffee house? Would you accept a bar coupon with a minor passenger in the car? What about if it was just you and your partner in the car? Would weather impact the rate of acceptance? What about the time of day?

Obviously, proximity to the business is a factor in whether the coupon is delivered to the driver or not, but what are the factors that determine whether a driver accepts the coupon once it is delivered to them? How would you determine whether a driver is likely to accept a coupon?

**Overview**

The goal of this project is to use what you know about visualizations and probability distributions to distinguish between customers who accepted a driving coupon versus those that did not.

**Data**

This data comes to us from the UCI Machine Learning repository and was collected via a survey on Amazon Mechanical Turk. The survey describes different driving scenarios including the destination, current time, weather, passenger, etc., and then asks the respondent whether they would accept the coupon if they were the driver. Answers that the user will drive there ‘right away’ or ‘later before the coupon expires’ are labeled as ‘Y = 1’ and answers ‘no, I do not want the coupon’ are labeled as ‘Y = 0’. There are five different types of coupons: less expensive restaurants (under \\$20), coffee houses, carry out & take away, bar, and more expensive restaurants (\\$20 - \\$50).

**Deliverables**

Your final product should be a brief report that highlights the differences between customers who did and did not accept the coupons. To explore the data you will utilize your knowledge of plotting, statistical summaries, and visualization using Python. You will publish your findings in a public-facing GitHub repository as your first portfolio piece.





### Data Description
Keep in mind that these values mentioned below are average values.

The attributes of this data set include:
1. User attributes
    -  Gender: male, female
    -  Age: below 21, 21 to 25, 26 to 30, etc.
    -  Marital Status: single, married partner, unmarried partner, or widowed
    -  Number of children: 0, 1, or more than 1
    -  Education: high school, bachelors degree, associates degree, or graduate degree
    -  Occupation: architecture & engineering, business & financial, etc.
    -  Annual income: less than \\$12500, \\$12500 - \\$24999, \\$25000 - \\$37499, etc.
    -  Number of times that he/she goes to a bar: 0, less than 1, 1 to 3, 4 to 8 or greater than 8
    -  Number of times that he/she buys takeaway food: 0, less than 1, 1 to 3, 4 to 8 or greater
    than 8
    -  Number of times that he/she goes to a coffee house: 0, less than 1, 1 to 3, 4 to 8 or
    greater than 8
    -  Number of times that he/she eats at a restaurant with average expense less than \\$20 per
    person: 0, less than 1, 1 to 3, 4 to 8 or greater than 8
    

2. Contextual attributes
    - Driving destination: home, work, or no urgent destination
    - Location of user, coupon and destination: we provide a map to show the geographical
    location of the user, destination, and the venue, and we mark the distance between each
    two places with time of driving. The user can see whether the venue is in the same
    direction as the destination.
    - Weather: sunny, rainy, or snowy
    - Temperature: 30F, 55F, or 80F
    - Time: 10AM, 2PM, or 6PM
    - Passenger: alone, partner, kid(s), or friend(s)


3. Coupon attributes
    - time before it expires: 2 hours or one day

In [1]:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
import plotly.express as px

### Problems

Use the prompts below to get started with your data analysis.  

1. Read in the `coupons.csv` file.




In [2]:
data = pd.read_csv('data/coupons.csv')

In [3]:
data.head()

Unnamed: 0,destination,passanger,weather,temperature,time,coupon,expiration,gender,age,maritalStatus,...,CoffeeHouse,CarryAway,RestaurantLessThan20,Restaurant20To50,toCoupon_GEQ5min,toCoupon_GEQ15min,toCoupon_GEQ25min,direction_same,direction_opp,Y
0,No Urgent Place,Alone,Sunny,55,2PM,Restaurant(<20),1d,Female,21,Unmarried partner,...,never,,4~8,1~3,1,0,0,0,1,1
1,No Urgent Place,Friend(s),Sunny,80,10AM,Coffee House,2h,Female,21,Unmarried partner,...,never,,4~8,1~3,1,0,0,0,1,0
2,No Urgent Place,Friend(s),Sunny,80,10AM,Carry out & Take away,2h,Female,21,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,1
3,No Urgent Place,Friend(s),Sunny,80,2PM,Coffee House,2h,Female,21,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,0
4,No Urgent Place,Friend(s),Sunny,80,2PM,Coffee House,1d,Female,21,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,0


2. Investigate the dataset for missing or problematic data.

I have divided the data into three categories:
1. Consumer demographics
2. Consumer current preferences and behaviours
3. Coupon acceptance rate

Step 0: Investigate the data
To get a quick look at the data, I used the filter function in Excel, which surfaces anomalies and the distinct values in each column. I also used the `value_counts` function in pandas to understand the individual attributes.

Step 1: Missing and problematic data
Based on my observations, the consumer demographic data needs no cleaning. The attributes describing consumers' current preferences have two kinds of issues: data type conversion and range values. To handle the range values, I replaced each range with a single representative number (roughly the average of the range) to simplify queries and graphing.

Step 2: Coding used for the range values
-   NaN = -1
-   never = 0
-   <1 (less1) = 0.5
-   1~3 = 1.5
-   4~8 = 6
-   >8 (gt8) = 8

In [4]:
# Demographic data (age, gender, marital status, etc.) has no missing or problematic values.
# The consumer-preference columns are recorded as ranges and need cleaning. They matter because
# they describe the consumer's current behaviour, which informs the decision to use a coupon,
# and they help identify the conditions under which a consumer is more likely to use one.
print(data.value_counts('CoffeeHouse'))
print(data.value_counts('CarryAway'))
print(data.value_counts('RestaurantLessThan20'))
print(data.value_counts('Restaurant20To50'))


print(data.value_counts('gender'))




CoffeeHouse
less1    3385
1~3      3225
never    2962
4~8      1784
gt8      1111
Name: count, dtype: int64
CarryAway
1~3      4672
4~8      4258
less1    1856
gt8      1594
never     153
Name: count, dtype: int64
RestaurantLessThan20
1~3      5376
4~8      3580
less1    2093
gt8      1285
never     220
Name: count, dtype: int64
Restaurant20To50
less1    6077
1~3      3290
never    2136
4~8       728
gt8       264
Name: count, dtype: int64
gender
Female    6511
Male      6173
Name: count, dtype: int64
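
`value_counts` drops missing values by default, so the counts above do not show how many answers are missing in each column. A minimal sketch of a per-column missing-value summary follows. The helper name `missing_summary` and the toy frame are illustrative assumptions, not part of the original notebook.

```python
import numpy as np
import pandas as pd


def missing_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Return the count and percentage of missing values for columns that have any."""
    counts = df.isna().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(df) * 100).round(2),
    })
    # Keep only columns with at least one missing value, largest first.
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)


# Toy frame standing in for `data` (hypothetical values).
toy = pd.DataFrame({
    "CoffeeHouse": ["never", "1~3", np.nan, "less1"],
    "CarryAway": [np.nan, np.nan, "4~8", "1~3"],
    "gender": ["Female", "Male", "Male", "Female"],
})
print(missing_summary(toy))
```

In the notebook this would be called as `missing_summary(data)`. Passing `dropna=False` to `value_counts` is another way to see the missing bucket for a single column.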


3. Decide what to do about your missing data -- drop, replace, other...

In [5]:
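# Replace the frequency ranges with numeric codes (see Step 2); 'less1' becomes 0.5.
df_y_1 = data.replace('1~3', '1.5', regex=True).replace('never', '0').replace('less1', '0.5').replace('gt8', '8').replace('4~8', '6')
# Make age numeric and mark missing values with -1.
df_y_1 = df_y_1.replace('50plus', 51).replace('below21', 19).fillna(-1)

# Convert data types so the recoded columns can be compared numerically.
convert_datatype = {'age': int, 'Bar': float, 'CarryAway': float, 'RestaurantLessThan20': float, 'Restaurant20To50': float}
df_y_1 = df_y_1.astype(convert_datatype)
df_y_1.head(10)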

Unnamed: 0,destination,passanger,weather,temperature,time,coupon,expiration,gender,age,maritalStatus,...,CoffeeHouse,CarryAway,RestaurantLessThan20,Restaurant20To50,toCoupon_GEQ5min,toCoupon_GEQ15min,toCoupon_GEQ25min,direction_same,direction_opp,Y
0,No Urgent Place,Alone,Sunny,55,2PM,Restaurant(<20),1d,Female,21,Unmarried partner,...,0,-1.0,6.0,1.5,1,0,0,0,1,1
1,No Urgent Place,Friend(s),Sunny,80,10AM,Coffee House,2h,Female,21,Unmarried partner,...,0,-1.0,6.0,1.5,1,0,0,0,1,0
2,No Urgent Place,Friend(s),Sunny,80,10AM,Carry out & Take away,2h,Female,21,Unmarried partner,...,0,-1.0,6.0,1.5,1,1,0,0,1,1
3,No Urgent Place,Friend(s),Sunny,80,2PM,Coffee House,2h,Female,21,Unmarried partner,...,0,-1.0,6.0,1.5,1,1,0,0,1,0
4,No Urgent Place,Friend(s),Sunny,80,2PM,Coffee House,1d,Female,21,Unmarried partner,...,0,-1.0,6.0,1.5,1,1,0,0,1,0
5,No Urgent Place,Friend(s),Sunny,80,6PM,Restaurant(<20),2h,Female,21,Unmarried partner,...,0,-1.0,6.0,1.5,1,1,0,0,1,1
6,No Urgent Place,Friend(s),Sunny,55,2PM,Carry out & Take away,1d,Female,21,Unmarried partner,...,0,-1.0,6.0,1.5,1,1,0,0,1,1
7,No Urgent Place,Kid(s),Sunny,80,10AM,Restaurant(<20),2h,Female,21,Unmarried partner,...,0,-1.0,6.0,1.5,1,1,0,0,1,1
8,No Urgent Place,Kid(s),Sunny,80,10AM,Carry out & Take away,2h,Female,21,Unmarried partner,...,0,-1.0,6.0,1.5,1,1,0,0,1,1
9,No Urgent Place,Kid(s),Sunny,80,10AM,Bar,1d,Female,21,Unmarried partner,...,0,-1.0,6.0,1.5,1,1,0,0,1,0
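
The recoding above can also be expressed as one explicit mapping applied only to the frequency columns, which avoids replacing matching strings elsewhere in the frame. This is a sketch under the coding listed in Step 2. The names `FREQ_CODES` and `encode_frequency` are hypothetical helpers, and the toy series stands in for a column such as `Bar`.

```python
import numpy as np
import pandas as pd

# Numeric coding used in this notebook; -1 marks a missing answer.
FREQ_CODES = {"never": 0.0, "less1": 0.5, "1~3": 1.5, "4~8": 6.0, "gt8": 8.0}


def encode_frequency(col: pd.Series) -> pd.Series:
    """Map frequency labels to numeric codes and fill missing answers with -1."""
    return col.map(FREQ_CODES).fillna(-1.0)


# Toy column with every label plus one missing answer (hypothetical values).
toy = pd.Series(["never", "less1", "1~3", "4~8", "gt8", np.nan], name="Bar")
print(encode_frequency(toy).tolist())
```

To recode several columns at once, something like `data[cols].apply(encode_frequency)` should work, where `cols` is the list of frequency columns.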


4. What proportion of the total observations chose to accept the coupon? 



Findings:  
            56.84% of observations accepted a coupon of some kind, while 43.16% did not.

In [6]:
# Share of rows with Y == 1 and Y == 0, as percentages of all observations.
coupon_accepted_percent = len(data.query("Y == 1")) / len(data) * 100
coupon_not_accepted_percent = len(data.query("Y == 0")) / len(data) * 100
print('coupon_accepted_percent=', coupon_accepted_percent)
print('coupon_not_accepted_percent=', coupon_not_accepted_percent)


coupon_accepted_percent= 56.84326710816777
coupon_not_accepted_percent= 43.15673289183223
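
Because `Y` is coded 0/1, the same proportion can be read off the column directly. The sketch below uses a small toy series in place of `data['Y']`; the values are hypothetical.

```python
import pandas as pd

# Toy outcome column standing in for data['Y'] (hypothetical values).
y = pd.Series([1, 0, 1, 1, 0], name="Y")

# The mean of a 0/1 column is the share of ones, i.e. accepted coupons.
accepted_share = y.mean()

# value_counts(normalize=True) gives the share of each outcome.
shares = y.value_counts(normalize=True)

print(f"accepted: {accepted_share:.2%}")
print(shares)
```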


5. Use a bar plot to visualize the `coupon` column.


Findings: 
        5474 consumers did not use the coupon; 7210 consumers used it.

In [7]:
coupon_cnt_by_usage = df_y_1.groupby('Y')['Y'].count().reset_index(name='Count')
coupon_cnt_by_usage = coupon_cnt_by_usage.rename(columns={'Y': 'Coupon Status'})
# Label the 0/1 outcome for display.
coupon_cnt_by_usage['Coupon Status'] = coupon_cnt_by_usage['Coupon Status'].map({0: 'Not Used', 1: 'Used'})
print(coupon_cnt_by_usage)
px.bar(coupon_cnt_by_usage, x='Coupon Status', y='Count')


  Coupon Status  Count
0      Not Used   5474
1          Used   7210
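
The prompt asks about the `coupon` column itself, so a count per coupon type is a natural companion to the used/not-used plot. The following is a sketch only: `coupon_type_counts` and `plot_coupon_types` are hypothetical helper names, and the toy frame stands in for `df_y_1`.

```python
import pandas as pd


def coupon_type_counts(df: pd.DataFrame) -> pd.DataFrame:
    """Count how many observations each coupon type has."""
    counts = df["coupon"].value_counts()
    return counts.rename_axis("Coupon Type").reset_index(name="Count")


def plot_coupon_types(df: pd.DataFrame):
    """Bar plot of coupon types; plotly is imported lazily so counting works without it."""
    import plotly.express as px
    return px.bar(coupon_type_counts(df), x="Coupon Type", y="Count")


# Toy frame standing in for df_y_1 (hypothetical values).
toy = pd.DataFrame({"coupon": ["Bar", "Coffee House", "Bar", "Restaurant(<20)", "Bar"]})
print(coupon_type_counts(toy))
```

In the notebook, `plot_coupon_types(df_y_1)` as the last line of a cell would display the figure.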


6. Use a histogram to visualize the temperature column.

Findings:
            Most of the observations were recorded when the temperature was 80 degrees.

In [8]:
px.histogram(df_y_1,x='temperature')
#print(df_y_1['temperature'].value_counts() )

Findings:
            The data shows that most accepted coupons were offered when the temperature was 80 degrees, but so were most of the coupons that were not accepted. So I don't see an effect of temperature on consumers' decisions to accept coupons.

In [9]:
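# Call .show() on each figure; otherwise only the last expression in the cell is displayed.
px.histogram(df_y_1.query("Y==1"), x='temperature', title='Accepted coupons').show()

px.histogram(df_y_1.query("Y==0"), x='temperature', title='Coupons not accepted').show()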
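
Raw counts are highest at 80 degrees for both outcomes simply because most observations were recorded at that temperature. Comparing the acceptance rate within each temperature removes that effect. A minimal sketch, assuming a toy frame in place of `df_y_1` (values are hypothetical):

```python
import pandas as pd

# Toy frame standing in for df_y_1 (hypothetical values).
toy = pd.DataFrame({
    "temperature": [30, 30, 55, 55, 80, 80, 80, 80],
    "Y":           [0,  1,  1,  0,  1,  1,  0,  1],
})

# The mean of the 0/1 outcome within each temperature is that group's acceptance rate.
rate_by_temp = (
    toy.groupby("temperature")["Y"]
       .agg(observations="count", acceptance_rate="mean")
       .reset_index()
)
print(rate_by_temp)
```

If the rates are close across temperatures, that supports the conclusion that temperature has little effect; if they differ, the raw histograms were hiding it.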


**Investigating the Bar Coupons**

Now, we will lead you through an exploration of just the bar related coupons.  

1. Create a new `DataFrame` that contains just the bar coupons.


In [10]:
bar_coupon=df_y_1.query("coupon == 'Bar'")
bar_coupon

Unnamed: 0,destination,passanger,weather,temperature,time,coupon,expiration,gender,age,maritalStatus,...,CoffeeHouse,CarryAway,RestaurantLessThan20,Restaurant20To50,toCoupon_GEQ5min,toCoupon_GEQ15min,toCoupon_GEQ25min,direction_same,direction_opp,Y
9,No Urgent Place,Kid(s),Sunny,80,10AM,Bar,1d,Female,21,Unmarried partner,...,0,-1.0,6.0,1.5,1,1,0,0,1,0
13,Home,Alone,Sunny,55,6PM,Bar,1d,Female,21,Unmarried partner,...,0,-1.0,6.0,1.5,1,0,0,1,0,1
17,Work,Alone,Sunny,55,7AM,Bar,1d,Female,21,Unmarried partner,...,0,-1.0,6.0,1.5,1,1,1,0,1,0
24,No Urgent Place,Friend(s),Sunny,80,10AM,Bar,1d,Male,21,Single,...,0.5,6.0,6.0,0.5,1,0,0,0,1,1
35,Home,Alone,Sunny,55,6PM,Bar,1d,Male,21,Single,...,0.5,6.0,6.0,0.5,1,0,0,1,0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
12663,No Urgent Place,Friend(s),Sunny,80,10PM,Bar,1d,Male,26,Single,...,0,1.5,6.0,1.5,1,1,0,0,1,0
12664,No Urgent Place,Friend(s),Sunny,55,10PM,Bar,2h,Male,26,Single,...,0,1.5,6.0,1.5,1,1,0,0,1,0
12667,No Urgent Place,Alone,Rainy,55,10AM,Bar,1d,Male,26,Single,...,0,1.5,6.0,1.5,1,1,0,0,1,0
12670,No Urgent Place,Partner,Rainy,55,6PM,Bar,2h,Male,26,Single,...,0,1.5,6.0,1.5,1,1,0,0,1,0


2. What proportion of bar coupons were accepted?


Finding: Of the 2017 bar coupons, 1190 were not used and 827 were used. About 59% of bar coupons were not accepted, whereas about 41% were accepted.

In [11]:
bar_cup_usage = bar_coupon.groupby('Y')['Y'].count().reset_index(name='Count')
bar_cup_usage = bar_cup_usage.rename(columns={'Y': 'Bar Coupon Usage'})
# Label the 0/1 outcome for display.
bar_cup_usage['Bar Coupon Usage'] = bar_cup_usage['Bar Coupon Usage'].map({0: 'Not Used', 1: 'Used'})
print(bar_cup_usage)
px.bar(bar_cup_usage, x='Bar Coupon Usage', y='Count')

# Total = 2017: 1190 bar coupons not used and 827 bar coupons used.

  Bar Coupon Usage  Count
0         Not Used   1190
1             Used    827


3. Compare the acceptance rate between those who went to a bar 3 or fewer times a month to those who went more.


Findings:
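        Drivers who go to a bar 3 or fewer times a month account for about 81% of the accepted bar coupons (666 of 819), and drivers who go more often account for about 19% (153 of 819). These figures are each group's share of all accepted bar coupons, not the acceptance rate within each group.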

<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>Bar</th>
      <th>Y</th>
      <th>Count</th>
      <th>Total</th>
      <th>Acceptance Rate</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>1</th>
      <td>0</td>
      <td>1</td>
      <td>156</td>
      <td>819</td>
      <td>19.047619</td>
    </tr>
    <tr>
      <th>3</th>
      <td>0-1</td>
      <td>1</td>
      <td>253</td>
      <td>819</td>
      <td>30.891331</td>
    </tr>
    <tr>
      <th>5</th>
      <td>1-3</td>
      <td>1</td>
      <td>257</td>
      <td>819</td>
      <td>31.379731</td>
    </tr>
    <tr>
      <th>7</th>
      <td>4-8</td>
      <td>1</td>
      <td>117</td>
      <td>819</td>
      <td>14.285714</td>
    </tr>
    <tr>
      <th>9</th>
      <td>>8</td>
      <td>1</td>
      <td>36</td>
      <td>819</td>
      <td>4.395604</td>
    </tr>
  </tbody>
</table>
</div>

In [12]:
usage_by_bar_visits = bar_coupon.groupby(['Bar','Y'])['Y'].count().reset_index(name='Count').query("Y==1")
total_bar_coupon_accepted = usage_by_bar_visits['Count'].sum()
print(total_bar_coupon_accepted)
usage_by_bar_visits['Total']=total_bar_coupon_accepted
usage_by_bar_visits['Acceptance Rate'] = (usage_by_bar_visits['Count']/usage_by_bar_visits['Total'])*100
usage_by_bar_visits



827


Unnamed: 0,Bar,Y,Count,Total,Acceptance Rate
1,-1.0,1,8,827,0.967352
3,0.0,1,156,827,18.863362
5,0.5,1,253,827,30.592503
7,1.5,1,257,827,31.076179
9,6.0,1,117,827,14.147521
11,8.0,1,36,827,4.353083
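
The `Acceptance Rate` column above divides each group's accepted count by the total number of accepted bar coupons, so it measures a share of acceptances. A within-group rate divides accepted coupons by all coupons offered to that group. The sketch below shows one way to compute it, assuming `Bar` holds the numeric codes from Step 2; the toy frame and the `visits` column are hypothetical.

```python
import pandas as pd

# Toy frame standing in for bar_coupon, with Bar already coded numerically
# (0, 0.5, 1.5, 6, 8; -1 = missing). Values are hypothetical.
toy = pd.DataFrame({
    "Bar": [0.0, 0.5, 1.5, 1.5, 6.0, 8.0, 6.0, -1.0],
    "Y":   [0,   0,   1,   0,   1,   1,   0,   1],
})

# Drop the missing-answer code before grouping.
known = toy[toy["Bar"] >= 0].copy()

# Codes 6 and 8 correspond to more than 3 visits a month.
known["visits"] = known["Bar"].gt(3).map({False: "3 or fewer", True: "more than 3"})

# Accepted / offered within each group, rather than accepted / all accepted.
rates = known.groupby("visits")["Y"].mean()
print(rates)
```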


4. Compare the acceptance rate between drivers who go to a bar more than once a month and are over the age of 25 to the all others.  Is there a difference?


Findings:  

Among drivers over the age of 25, those who visit a bar 1-3 times a month have the highest value in the Acceptance Rate column (32.69%). Drivers over 25 with 0-1 visits are close behind at 31.99%.

<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>Bar</th>
      <th>Y</th>
      <th>Count</th>
      <th>Total</th>
      <th>Acceptance Rate</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>1</th>
      <td>0</td>
      <td>1</td>
      <td>97</td>
      <td>572</td>
      <td>16.958042</td>
    </tr>
    <tr>
      <th>3</th>
      <td>0-1</td>
      <td>1</td>
      <td>183</td>
      <td>572</td>
      <td>31.993007</td>
    </tr>
    <tr>
      <th>5</th>
      <td>1-3</td>
      <td>1</td>
      <td>187</td>
      <td>572</td>
      <td>32.692308</td>
    </tr>
    <tr>
      <th>7</th>
      <td>4-8</td>
      <td>1</td>
      <td>84</td>
      <td>572</td>
      <td>14.685315</td>
    </tr>
    <tr>
      <th>9</th>
      <td>>8</td>
      <td>1</td>
      <td>21</td>
      <td>572</td>
      <td>3.671329</td>
    </tr>
  </tbody>
</table>
</div>

In [13]:
usage_by_bar_visits_25 = bar_coupon.query("age > 25").groupby(['Bar','Y'])['Y'].count().reset_index(name='Count').query("Y==1")
total_bar_coupon_accepted = usage_by_bar_visits_25['Count'].sum()
print(total_bar_coupon_accepted)
usage_by_bar_visits_25['Total']=total_bar_coupon_accepted
usage_by_bar_visits_25['Acceptance Rate'] = (usage_by_bar_visits_25['Count']/usage_by_bar_visits_25['Total'])*100
usage_by_bar_visits_25

580


Unnamed: 0,Bar,Y,Count,Total,Acceptance Rate
1,-1.0,1,8,580,1.37931
3,0.0,1,97,580,16.724138
5,0.5,1,183,580,31.551724
7,1.5,1,187,580,32.241379
9,6.0,1,84,580,14.482759
11,8.0,1,21,580,3.62069
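
The question compares one group (more than one bar visit a month and over 25) against everyone else, which can be expressed with a single boolean mask. This is a sketch under the Step 2 coding, where codes above 1 (1.5, 6, 8) are treated as more than one visit a month; the toy frame stands in for `bar_coupon` and its values are hypothetical.

```python
import pandas as pd

# Toy frame standing in for bar_coupon (hypothetical values).
toy = pd.DataFrame({
    "Bar": [1.5, 6.0, 0.5, 0.0, 8.0, 1.5],
    "age": [26,  31,  41,  21,  21,  36],
    "Y":   [1,   1,   0,   0,   1,   0],
})

# Element-wise AND of the two conditions; parentheses are required around each.
target = (toy["Bar"] > 1) & (toy["age"] > 25)

# Acceptance rate inside the target group and among all other drivers.
rate_target = toy.loc[target, "Y"].mean()
rate_others = toy.loc[~target, "Y"].mean()

print(f"bar > 1/month and age > 25: {rate_target:.2%}")
print(f"all others:                 {rate_others:.2%}")
```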


5. Use the same process to compare the acceptance rate between drivers who go to bars more than once a month and had passengers that were not a kid and had occupations other than farming, fishing, or forestry. 


Findings:

For this subset the table barely changes: drivers who visit a bar 3 or fewer times a month still make up close to 80% of the accepted bar coupons (662 of 815), while drivers who go to bars more than once a month (the 1-3, 4-8, and >8 rows) make up about 50% (410 of 815).

<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>Bar</th>
      <th>Y</th>
      <th>Count</th>
      <th>Total</th>
      <th>Acceptance Rate</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>1</th>
      <td>0</td>
      <td>1</td>
      <td>156</td>
      <td>815</td>
      <td>19.141104</td>
    </tr>
    <tr>
      <th>3</th>
      <td>0-1</td>
      <td>1</td>
      <td>249</td>
      <td>815</td>
      <td>30.552147</td>
    </tr>
    <tr>
      <th>5</th>
      <td>1-3</td>
      <td>1</td>
      <td>257</td>
      <td>815</td>
      <td>31.533742</td>
    </tr>
    <tr>
      <th>7</th>
      <td>4-8</td>
      <td>1</td>
      <td>117</td>
      <td>815</td>
      <td>14.355828</td>
    </tr>
    <tr>
      <th>9</th>
      <td>>8</td>
      <td>1</td>
      <td>36</td>
      <td>815</td>
      <td>4.417178</td>
    </tr>
  </tbody>
</table>
</div>

In [14]:
# Note: in Python, ("...") and ("...") evaluates to the second string, so only the occupation condition is passed to query().
usage_by_bar_visits_no_kids_occu = bar_coupon.query("occupation != 'Farming Fishing & Forestry'").groupby(['Bar','Y'])['Y'].count().reset_index(name='Count').query("Y==1")
total_bar_coupon_accepted = usage_by_bar_visits_no_kids_occu['Count'].sum()
print(total_bar_coupon_accepted)
usage_by_bar_visits_no_kids_occu['Total']=total_bar_coupon_accepted
usage_by_bar_visits_no_kids_occu['Acceptance Rate'] = (usage_by_bar_visits_no_kids_occu['Count']/usage_by_bar_visits_no_kids_occu['Total'])*100
usage_by_bar_visits_no_kids_occu

823


Unnamed: 0,Bar,Y,Count,Total,Acceptance Rate
1,-1.0,1,8,823,0.972053
3,0.0,1,156,823,18.955043
5,0.5,1,249,823,30.255164
7,1.5,1,257,823,31.227217
9,6.0,1,117,823,14.216282
11,8.0,1,36,823,4.374241
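
In Python, `("a") and ("b")` evaluates to the second string, so the cell above filters on occupation only. To apply several conditions at once, they can be combined inside one query string. The sketch below assumes the passenger information lives in the `passanger` column (the dataset's spelling) and that `Bar` holds the Step 2 codes; the toy rows are hypothetical.

```python
import pandas as pd

# Toy frame standing in for bar_coupon (hypothetical values).
toy = pd.DataFrame({
    "Bar":        [1.5, 6.0, 1.5, 0.5, 8.0],
    "passanger":  ["Alone", "Kid(s)", "Friend(s)", "Partner", "Alone"],
    "occupation": ["Sales & Related", "Student", "Farming Fishing & Forestry",
                   "Student", "Unemployed"],
    "Y":          [1, 0, 1, 0, 0],
})

# `and` inside the query string combines the conditions row by row.
condition = (
    "Bar > 1 and passanger != 'Kid(s)' "
    "and occupation != 'Farming Fishing & Forestry'"
)
subset = toy.query(condition)
rest = toy.drop(subset.index)

print("matching drivers:", subset["Y"].mean())
print("all others:      ", rest["Y"].mean())
```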


6. Compare the acceptance rates between those drivers who:

- go to bars more than once a month, had passengers that were not a kid, and were not widowed *OR*

<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>Bar</th>
      <th>Y</th>
      <th>Count</th>
      <th>Total</th>
      <th>Acceptance Rate</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>1</th>
      <td>0</td>
      <td>1</td>
      <td>149</td>
      <td>812</td>
      <td>18.349754</td>
    </tr>
    <tr>
      <th>3</th>
      <td>0.5</td>
      <td>1</td>
      <td>253</td>
      <td>812</td>
      <td>31.157635</td>
    </tr>
    <tr>
      <th>5</th>
      <td>1.5</td>
      <td>1</td>
      <td>257</td>
      <td>812</td>
      <td>31.650246</td>
    </tr>
    <tr>
      <th>7</th>
      <td>6</td>
      <td>1</td>
      <td>117</td>
      <td>812</td>
      <td>14.408867</td>
    </tr>
    <tr>
      <th>9</th>
      <td>8</td>
      <td>1</td>
      <td>36</td>
      <td>812</td>
      <td>4.433498</td>
    </tr>
  </tbody>
</table>
</div>

- go to bars more than once a month and are under the age of 30 *OR*

<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>Bar</th>
      <th>Y</th>
      <th>Count</th>
      <th>Total</th>
      <th>Acceptance Rate</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>1</th>
      <td>0</td>
      <td>1</td>
      <td>81</td>
      <td>437</td>
      <td>18.535469</td>
    </tr>
    <tr>
      <th>3</th>
      <td>0.5</td>
      <td>1</td>
      <td>107</td>
      <td>437</td>
      <td>24.485126</td>
    </tr>
    <tr>
      <th>5</th>
      <td>1.5</td>
      <td>1</td>
      <td>139</td>
      <td>437</td>
      <td>31.807780</td>
    </tr>
    <tr>
      <th>7</th>
      <td>6</td>
      <td>1</td>
      <td>80</td>
      <td>437</td>
      <td>18.306636</td>
    </tr>
    <tr>
      <th>9</th>
      <td>8</td>
      <td>1</td>
      <td>30</td>
      <td>437</td>
      <td>6.864989</td>
    </tr>
  </tbody>
</table>
</div>

- go to cheap restaurants more than 4 times a month and income is less than 50K. 

<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>RestaurantLessThan20</th>
      <th>Y</th>
      <th>Count</th>
      <th>Total</th>
      <th>Acceptance Rate</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>1</th>
      <td>6.0</td>
      <td>1</td>
      <td>924</td>
      <td>173966</td>
      <td>0.531138</td>
    </tr>
    <tr>
      <th>3</th>
      <td>8.0</td>
      <td>1</td>
      <td>445</td>
      <td>173966</td>
      <td>0.255797</td>
    </tr>
  </tbody>
</table>
</div>


Findings: The first two groups (drivers who go to bars more than once a month, had passengers that were not a kid, and were not widowed; and drivers who go to bars more than once a month and are under the age of 30) show similar patterns. In both, drivers who visit a bar 1-3 times a month have the highest value, about 32%. The third table (consumers who go to cheap restaurants more than 4 times a month with income less than 50K) shows much lower values, but its Total of 173966 comes from `DataFrame.size`, which counts rows times columns, so it is not directly comparable with the first two.

In [15]:
# go to bars more than once a month, had passengers that were not a kid, and were not widowed *OR*

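# Note: in Python, ("...") and ("...") evaluates to the second string, so only the marital-status condition is passed to query().
usage_by_bar_visits_no_kids_widow = bar_coupon.query("maritalStatus != 'Widowed'").groupby(['Bar','Y'])['Y'].count().reset_index(name='Count').query("Y==1")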
total_bar_coupon_accepted = usage_by_bar_visits_no_kids_widow['Count'].sum()
print(total_bar_coupon_accepted)
usage_by_bar_visits_no_kids_widow['Total']=total_bar_coupon_accepted
usage_by_bar_visits_no_kids_widow['Acceptance Rate'] = (usage_by_bar_visits_no_kids_widow['Count']/usage_by_bar_visits_no_kids_widow['Total'])*100
usage_by_bar_visits_no_kids_widow

820


Unnamed: 0,Bar,Y,Count,Total,Acceptance Rate
1,-1.0,1,8,820,0.97561
3,0.0,1,149,820,18.170732
5,0.5,1,253,820,30.853659
7,1.5,1,257,820,31.341463
9,6.0,1,117,820,14.268293
11,8.0,1,36,820,4.390244


In [16]:
# go to bars more than once a month and are under the age of 30 *OR*

usage_by_bar_visits_below_30 = bar_coupon.query("age < 30").groupby(['Bar','Y'])['Y'].count().reset_index(name='Count').query("Y==1")
total_bar_coupon_accepted = usage_by_bar_visits_below_30['Count'].sum()
print(total_bar_coupon_accepted)
usage_by_bar_visits_below_30['Total']=total_bar_coupon_accepted
usage_by_bar_visits_below_30['Acceptance Rate'] = (usage_by_bar_visits_below_30['Count']/usage_by_bar_visits_below_30['Total'])*100
usage_by_bar_visits_below_30

In [None]:
#go to cheap restaurants more than 4 times a month and income is less than 50K. 
cheap_res_income_less50k=df_y_1.query("income in ('$37500 - $49999','$12500 - $24999','Less than $12500','$25000 - $37499')")
Cheapres_plus_coupon_use=cheap_res_income_less50k.query("RestaurantLessThan20 > 4").groupby(['RestaurantLessThan20','Y'])['Y'].count().reset_index(name='Count').query("Y==1")

Cheapres_plus_coupon_use['Total']=cheap_res_income_less50k.size
Cheapres_plus_coupon_use['Acceptance Rate'] = (Cheapres_plus_coupon_use['Count']/Cheapres_plus_coupon_use['Total'])*100
Cheapres_plus_coupon_use



Unnamed: 0,RestaurantLessThan20,Y,Count,Total,Acceptance Rate
1,6.0,1,924,173966,0.531138
3,8.0,1,445,173966,0.255797
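
`DataFrame.size` returns the number of cells (rows times columns), so the `Total` of 173966 above is much larger than the number of observations. A rate per observation should divide by the row count instead. A minimal sketch for this group, assuming the Step 2 coding for `RestaurantLessThan20`; `LOW_INCOME` is a hypothetical constant and the toy rows are made up.

```python
import pandas as pd

# Income brackets below 50K, as used in the cell above.
LOW_INCOME = ["Less than $12500", "$12500 - $24999",
              "$25000 - $37499", "$37500 - $49999"]

# Toy frame standing in for df_y_1 (hypothetical values).
toy = pd.DataFrame({
    "RestaurantLessThan20": [6.0, 8.0, 1.5, 6.0, 0.5],
    "income": ["$12500 - $24999", "$37500 - $49999", "$25000 - $37499",
               "$100000 or More", "Less than $12500"],
    "Y": [1, 0, 1, 1, 0],
})

# Codes 6 and 8 correspond to more than 4 visits a month.
in_group = toy["income"].isin(LOW_INCOME) & (toy["RestaurantLessThan20"] > 4)
group = toy[in_group]

# len() counts rows; .size would count rows x columns.
acceptance_rate = group["Y"].sum() / len(group) * 100
print(f"rows: {len(group)}, cells: {group.size}, acceptance rate: {acceptance_rate:.1f}%")
```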


7.  Based on these observations, what do you hypothesize about drivers who accepted the bar coupons?

### Independent Investigation

Using the bar coupon example as motivation, you are to explore one of the other coupon groups and try to determine the characteristics of passengers who accept the coupons.  