### Will a Customer Accept the Coupon?

**Context**

Imagine driving through town and a coupon is delivered to your cell phone for a restaurant near where you are driving. Would you accept that coupon and take a short detour to the restaurant? Would you accept the coupon but use it on a subsequent trip? Would you ignore the coupon entirely? What if the coupon was for a bar instead of a restaurant? What about a coffee house? Would you accept a bar coupon with a minor passenger in the car? What if it was just you and your partner in the car? Would weather impact the rate of acceptance? What about the time of day?

Obviously, proximity to the business is a factor in whether the coupon is delivered to the driver, but what factors determine whether a driver accepts the coupon once it is delivered? How would you determine whether a driver is likely to accept a coupon?

**Overview**

The goal of this project is to use what you know about visualizations and probability distributions to distinguish between customers who accepted a driving coupon and those who did not.

**Data**

This data comes from the UCI Machine Learning Repository and was collected via a survey on Amazon Mechanical Turk. The survey describes different driving scenarios, including the destination, current time, weather, passenger, etc., and then asks respondents whether they would accept the coupon if they were the driver. Answers that the user will drive there ‘right away’ or ‘later before the coupon expires’ are labeled ‘Y = 1’, and answers of ‘no, I do not want the coupon’ are labeled ‘Y = 0’. There are five types of coupons: less expensive restaurants (under \$20), coffee houses, carry out & take away, bars, and more expensive restaurants (\$20 - \$50).

**Deliverables**

Your final product should be a brief report that highlights the differences between customers who did and did not accept the coupons. To explore the data, you will use your knowledge of plotting, statistical summaries, and visualization in Python. You will publish your findings in a public-facing GitHub repository as your first portfolio piece.





### Data Description
Keep in mind that these values mentioned below are average values.

The attributes of this data set include:
1. User attributes
    -  Gender: male, female
    -  Age: below 21, 21 to 25, 26 to 30, etc.
    -  Marital Status: single, married partner, unmarried partner, or widowed
    -  Number of children: 0, 1, or more than 1
    -  Education: high school, bachelors degree, associates degree, or graduate degree
    -  Occupation: architecture & engineering, business & financial, etc.
    -  Annual income: less than \$12500, \$12500 - \$24999, \$25000 - \$37499, etc.
    -  Number of times that he/she goes to a bar: 0, less than 1, 1 to 3, 4 to 8 or greater than 8
    -  Number of times that he/she buys takeaway food: 0, less than 1, 1 to 3, 4 to 8 or greater than 8
    -  Number of times that he/she goes to a coffee house: 0, less than 1, 1 to 3, 4 to 8 or greater than 8
    -  Number of times that he/she eats at a restaurant with average expense less than \$20 per person: 0, less than 1, 1 to 3, 4 to 8 or greater than 8
    -  Number of times that he/she eats at a restaurant with average expense of \$20 to \$50 per person: 0, less than 1, 1 to 3, 4 to 8 or greater than 8
    

2. Contextual attributes
    - Driving destination: home, work, or no urgent destination
    - Location of user, coupon and destination: we provide a map to show the geographical
    location of the user, destination, and the venue, and we mark the distance between each
    two places with time of driving. The user can see whether the venue is in the same
    direction as the destination.
    - Weather: sunny, rainy, or snowy
    - Temperature: 30F, 55F, or 80F
    - Time: 10AM, 2PM, or 6PM
    - Passenger: alone, partner, kid(s), or friend(s)


3. Coupon attributes
    - time before it expires: 2 hours or one day

In [304]:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
import plotly.express as px

## Problems

Use the prompts below to get started with your data analysis.  

### 1. Read in the `coupons.csv` file.




In [305]:
data = pd.read_csv('data/coupons.csv')

In [306]:
px.histogram(data,"temperature")

In [307]:
data.head(5)

Unnamed: 0,destination,passanger,weather,temperature,time,coupon,expiration,gender,age,maritalStatus,...,CoffeeHouse,CarryAway,RestaurantLessThan20,Restaurant20To50,toCoupon_GEQ5min,toCoupon_GEQ15min,toCoupon_GEQ25min,direction_same,direction_opp,Y
0,No Urgent Place,Alone,Sunny,55,2PM,Restaurant(<20),1d,Female,21,Unmarried partner,...,never,,4~8,1~3,1,0,0,0,1,1
1,No Urgent Place,Friend(s),Sunny,80,10AM,Coffee House,2h,Female,21,Unmarried partner,...,never,,4~8,1~3,1,0,0,0,1,0
2,No Urgent Place,Friend(s),Sunny,80,10AM,Carry out & Take away,2h,Female,21,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,1
3,No Urgent Place,Friend(s),Sunny,80,2PM,Coffee House,2h,Female,21,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,0
4,No Urgent Place,Friend(s),Sunny,80,2PM,Coffee House,1d,Female,21,Unmarried partner,...,never,,4~8,1~3,1,1,0,0,1,0


### 2. Investigate the dataset for missing or problematic data.

In [308]:
# We take a first look at the data structure.
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12684 entries, 0 to 12683
Data columns (total 26 columns):
 #   Column                Non-Null Count  Dtype 
---  ------                --------------  ----- 
 0   destination           12684 non-null  object
 1   passanger             12684 non-null  object
 2   weather               12684 non-null  object
 3   temperature           12684 non-null  int64 
 4   time                  12684 non-null  object
 5   coupon                12684 non-null  object
 6   expiration            12684 non-null  object
 7   gender                12684 non-null  object
 8   age                   12684 non-null  object
 9   maritalStatus         12684 non-null  object
 10  has_children          12684 non-null  int64 
 11  education             12684 non-null  object
 12  occupation            12684 non-null  object
 13  income                12684 non-null  object
 14  car                   108 non-null    object
 15  Bar                   12577 non-null

In [309]:
# Viewing data from the first 13 columns.
data.iloc[:,:13].sample(5)

Unnamed: 0,destination,passanger,weather,temperature,time,coupon,expiration,gender,age,maritalStatus,has_children,education,occupation
10721,Work,Alone,Rainy,55,7AM,Bar,1d,Female,41,Single,1,Bachelors degree,Healthcare Support
8307,No Urgent Place,Partner,Sunny,30,10AM,Bar,1d,Male,26,Married partner,0,Graduate degree (Masters or Doctorate),Architecture & Engineering
1668,Home,Alone,Sunny,55,6PM,Bar,1d,Male,50plus,Unmarried partner,1,Bachelors degree,Personal Care & Service
2247,Work,Alone,Sunny,55,7AM,Restaurant(<20),1d,Female,26,Married partner,0,Associates degree,Management
8619,Home,Alone,Snowy,30,6PM,Coffee House,1d,Female,50plus,Married partner,1,Associates degree,Office & Administrative Support


In [310]:
# We take a deeper look at the "age" column.
data["age"].unique()

array(['21', '46', '26', '31', '41', '50plus', '36', 'below21'],
      dtype=object)

In [311]:
# Viewing data from the last 13 columns.
data.iloc[:,13:].sample(5)

Unnamed: 0,income,car,Bar,CoffeeHouse,CarryAway,RestaurantLessThan20,Restaurant20To50,toCoupon_GEQ5min,toCoupon_GEQ15min,toCoupon_GEQ25min,direction_same,direction_opp,Y
11447,$75000 - $87499,,1~3,1~3,4~8,4~8,4~8,1,1,1,0,1,1
3366,$12500 - $24999,,never,never,4~8,1~3,never,1,1,0,0,1,1
10171,$100000 or More,,never,less1,4~8,1~3,less1,1,0,0,0,1,0
5174,$12500 - $24999,,never,never,4~8,4~8,less1,1,1,0,0,1,1
11920,$100000 or More,,never,less1,1~3,1~3,1~3,1,1,0,1,0,1


In [312]:
# We observe which attributes have missing data.
data.isnull().sum()

destination                 0
passanger                   0
weather                     0
temperature                 0
time                        0
coupon                      0
expiration                  0
gender                      0
age                         0
maritalStatus               0
has_children                0
education                   0
occupation                  0
income                      0
car                     12576
Bar                       107
CoffeeHouse               217
CarryAway                 151
RestaurantLessThan20      130
Restaurant20To50          189
toCoupon_GEQ5min            0
toCoupon_GEQ15min           0
toCoupon_GEQ25min           0
direction_same              0
direction_opp               0
Y                           0
dtype: int64

In [313]:
# We analyze what the attributes with null records contain.
data[["car","Bar", "CoffeeHouse","CarryAway","RestaurantLessThan20","Restaurant20To50"]].sample(10)

Unnamed: 0,car,Bar,CoffeeHouse,CarryAway,RestaurantLessThan20,Restaurant20To50
3967,,never,1~3,gt8,1~3,1~3
11385,,never,1~3,1~3,1~3,1~3
159,,gt8,gt8,gt8,gt8,gt8
8926,,never,less1,less1,4~8,less1
12385,,1~3,less1,4~8,1~3,less1
3778,,1~3,1~3,1~3,1~3,less1
6367,,less1,4~8,4~8,4~8,4~8
4625,,4~8,4~8,4~8,1~3,never
8297,Car that is too old to install Onstar :D,never,less1,1~3,less1,less1
8248,,never,4~8,1~3,4~8,less1


In [314]:
# We analyze the percentage of data that contains null values for the column "Car".
remove_null_car = data[data["car"].isnull()].shape
data_miss_percentage_car = (remove_null_car[0]/data.shape[0])*100
print("Percentage of null data with respect to the total for car column:")
print(str(data_miss_percentage_car)[:5] + "%")

Percentage of null data with respect to the total for car column:
99.14%


In [315]:
# We analyze the percentage of data that contains null values for the columns "Bar",
# "CoffeeHouse", "CarryAway", "RestaurantLessThan20", and "Restaurant20To50".
remove_null = data.drop("car", axis=1)
data_miss_percentage = ((data.shape[0] - remove_null.dropna().shape[0])/data.shape[0])*100
print("Percentage of null data with respect to the total for Bar, CoffeeHouse,\nCarryAway, RestaurantLessThan20 and Restaurant20To50 columns:")
print(str(data_miss_percentage)[:5] + "%")

Percentage of null data with respect to the total for Bar, CoffeeHouse,
CarryAway, RestaurantLessThan20 and Restaurant20To50 columns:
4.769%
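
The `str(value)[:5]` idiom used in the two cells above truncates the number instead of rounding it, and the number of decimals it keeps depends on how many digits come before the decimal point. The following minimal sketch uses row counts reported in this notebook (12,576 null `car` values, and 12,684 − 12,079 = 605 rows with a null in the frequency columns) to compare it with an f-string format. The variable names are illustrative and are not used elsewhere in the notebook.

```python
# Counts taken from the notebook outputs.
total_rows = 12684
null_car = 12576
rows_with_nulls = 12684 - 12079  # rows later removed by dropna()

car_pct = null_car / total_rows * 100
freq_pct = rows_with_nulls / total_rows * 100

# Slicing the string truncates and keeps a varying number of decimals.
car_truncated = str(car_pct)[:5]
freq_truncated = str(freq_pct)[:5]

# An f-string rounds to a fixed number of decimals.
car_rounded = f"{car_pct:.2f}"
freq_rounded = f"{freq_pct:.2f}"

print(car_truncated, car_rounded)
print(freq_truncated, freq_rounded)
```

Rounding with an f-string keeps the reported precision the same for every percentage, which makes the figures easier to compare across sections.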


In [316]:
# What data does the "car" column contain?
data["car"].unique()

array([nan, 'Scooter and motorcycle', 'crossover', 'Mazda5',
       'do not drive', 'Car that is too old to install Onstar :D'],
      dtype=object)

#### Conclusions for this section
- The "age" column stores ages as strings, and two of its labels, '50plus' and 'below21', are not numeric, so the column cannot be converted to an integer directly.
- The structure analysis shows that all text columns are typed as "object".
- Among the columns with null data, "car" stands out: 99.14% of its values are null.
- For the columns "Bar," "CoffeeHouse," "CarryAway," "RestaurantLessThan20," and "Restaurant20To50," the rows with at least one null value make up about 4.77% of the dataset.
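
The null counts and percentages above can also be gathered in a single table. The following is a minimal sketch, not the notebook's own method: `missing_summary` is a hypothetical helper, and the small `toy` frame is made-up data that only illustrates the shape of the result.

```python
import pandas as pd

def missing_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Return the null count and null percentage of every column that has nulls."""
    counts = df.isnull().sum()
    summary = pd.DataFrame({
        "null_count": counts,
        "null_pct": (counts / len(df) * 100).round(2),
    })
    # Keep only columns with at least one null, largest first.
    return summary[summary["null_count"] > 0].sort_values("null_count", ascending=False)

# Tiny made-up frame (not the coupon data).
toy = pd.DataFrame({
    "car": [None, None, None, "Mazda5"],
    "Bar": ["never", None, "1~3", "gt8"],
    "Y": [1, 0, 1, 0],
})
toy_summary = missing_summary(toy)
print(toy_summary)
```

Calling `missing_summary(data)` on the coupon data would list "car" first, followed by the five frequency columns.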

### 3. Decide what to do about your missing data -- drop, replace, other...

In [317]:
# In the "age" column, there are two unique data points that prevent it from being converted to an integer:
#'50plus' and 'below21'. What we will do is remove the non-integer part and transform all '50plus' values to just 50,
#and similarly, all 'below21' values to 21.
data["age"] = data["age"].str.replace(r"plus$","", regex=True)
data["age"] = data["age"].str.replace(r"^below","", regex=True)

In [318]:
data["age"].unique()

array(['21', '46', '26', '31', '41', '50', '36'], dtype=object)

In [319]:
# We convert object columns to pandas extension dtypes (string, Int64) and cast "age" to a plain integer.
data = data.convert_dtypes()
data["age"] = data["age"].astype("int64")
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12684 entries, 0 to 12683
Data columns (total 26 columns):
 #   Column                Non-Null Count  Dtype 
---  ------                --------------  ----- 
 0   destination           12684 non-null  string
 1   passanger             12684 non-null  string
 2   weather               12684 non-null  string
 3   temperature           12684 non-null  Int64 
 4   time                  12684 non-null  string
 5   coupon                12684 non-null  string
 6   expiration            12684 non-null  string
 7   gender                12684 non-null  string
 8   age                   12684 non-null  int64 
 9   maritalStatus         12684 non-null  string
 10  has_children          12684 non-null  Int64 
 11  education             12684 non-null  string
 12  occupation            12684 non-null  string
 13  income                12684 non-null  string
 14  car                   108 non-null    string
 15  Bar                   12577 non-null

In [320]:
# The "car" column contains very few values, so we can choose to remove it.
data.drop("car", axis=1, inplace=True)
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12684 entries, 0 to 12683
Data columns (total 25 columns):
 #   Column                Non-Null Count  Dtype 
---  ------                --------------  ----- 
 0   destination           12684 non-null  string
 1   passanger             12684 non-null  string
 2   weather               12684 non-null  string
 3   temperature           12684 non-null  Int64 
 4   time                  12684 non-null  string
 5   coupon                12684 non-null  string
 6   expiration            12684 non-null  string
 7   gender                12684 non-null  string
 8   age                   12684 non-null  int64 
 9   maritalStatus         12684 non-null  string
 10  has_children          12684 non-null  Int64 
 11  education             12684 non-null  string
 12  occupation            12684 non-null  string
 13  income                12684 non-null  string
 14  Bar                   12577 non-null  string
 15  CoffeeHouse           12467 non-null

In [321]:
#The columns "Bar," "CoffeeHouse," "CarryAway," "RestaurantLessThan20," and "Restaurant20To50"
#contain a total of 4.7% of null values. Since the percentage is low, we can choose to remove them.
data.dropna(inplace=True)
data.shape[0]

12079

In [322]:
# We verify that there are no null values.
data.isnull().sum()

destination             0
passanger               0
weather                 0
temperature             0
time                    0
coupon                  0
expiration              0
gender                  0
age                     0
maritalStatus           0
has_children            0
education               0
occupation              0
income                  0
Bar                     0
CoffeeHouse             0
CarryAway               0
RestaurantLessThan20    0
Restaurant20To50        0
toCoupon_GEQ5min        0
toCoupon_GEQ15min       0
toCoupon_GEQ25min       0
direction_same          0
direction_opp           0
Y                       0
dtype: int64

#### Conclusions for this section
- The values '50plus' and 'below21' in the "age" column were replaced with '50' and '21', respectively, which merges the 'below21' group into the '21' group. Then, the data type of the column was converted from object to integer.
- All object data types were transformed into strings, except for the "age" column.
- Originally, there were 12,684 records. After cleaning the null values from the "Bar," "CoffeeHouse," "CarryAway," "RestaurantLessThan20," and "Restaurant20To50" columns, there are 12,079 records remaining.
- As there were very few non-null values in the "car" column, the column was removed.
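
If keeping 'below21' separate from '21' matters for a later analysis, one alternative to the integer conversion is an ordered categorical. This is a sketch of a different technique from the one used above, and it would have to be applied to the raw labels, before the `str.replace` calls. `AGE_ORDER` and `age_as_ordered_category` are hypothetical names; the labels are the ones returned by `data["age"].unique()` earlier.

```python
import pandas as pd

# Raw age labels from the dataset, listed from youngest to oldest.
AGE_ORDER = ["below21", "21", "26", "31", "36", "41", "46", "50plus"]

def age_as_ordered_category(age: pd.Series) -> pd.Series:
    """Convert raw age labels to an ordered categorical without merging groups."""
    dtype = pd.CategoricalDtype(categories=AGE_ORDER, ordered=True)
    return age.astype(dtype)

# Made-up sample of raw labels.
toy_age = age_as_ordered_category(pd.Series(["21", "below21", "50plus", "26"]))

# Ordered categories sort and compare by position in AGE_ORDER.
print(toy_age.sort_values().tolist())
print((toy_age > "21").sum())
```

The trade-off is that comparisons must use labels (for example `> "21"`) instead of numbers, so a query such as `age > 25` would need to be rewritten.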

### 4. What proportion of the total observations chose to accept the coupon? 



In [408]:
# Percentage calculation.
percentage_accept = data["Y"].mean()*100
# Printing the percentage with a message.
print(str(percentage_accept)[:5]+"% of individuals responded that they would accept the coupon.")

56.93% of individuals responded that they would accept the coupon.


#### Conclusions for this section
- A little over half of the surveyed respondents (56.93%) would accept the coupon.

### 5. Use a bar plot to visualize the `coupon` column.

In [324]:
# I will create a count of the total data for each coupon
total_coupon = data["coupon"].value_counts().reset_index()

# I assign the column names
total_coupon.columns = ["coupon", "cumulative_total_coupons"]

# Plot it!
px.bar(total_coupon, x="coupon", y="cumulative_total_coupons", title="Total coupons by type",
       labels={"coupon":"Coupon Type","cumulative_total_coupons":"Total coupons delivered"})

#### Conclusions for this section
- In the bar chart, Coffee House coupons are the most common type, with roughly 1,000 more coupons than the next most common type.
- The fewest coupons were delivered for "expensive" restaurants.
- One possible explanation, which is an assumption and not something this data can confirm, is that there are more coffee houses per block than "expensive" restaurants.

### 6. Use a histogram to visualize the temperature column.

In [330]:
# Histogram of number of interviews vs Temperature
px.histogram(data,"temperature", title="Number of interviews vs Temperature", labels={"temperature":"Temperature (°F)"})

#### Conclusions for this section
- The number of coupons delivered increases with temperature.
- The temperature column takes only three values (30F, 55F, and 80F, as listed in the data description), so the empty ranges in the histogram, such as 60F to 80F, reflect the survey design and not missing data.
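
Because the temperature is a discrete variable, counting each distinct value gives the same information as the histogram without the empty bins. The following is a minimal sketch on made-up readings; `toy_temperature` and `temperature_counts` are illustrative names.

```python
import pandas as pd

# Made-up readings that use the three values listed in the data description.
toy_temperature = pd.Series([30, 55, 80, 80, 55, 80], name="temperature")

# Count each distinct value and order the result by temperature.
temperature_counts = toy_temperature.value_counts().sort_index()
print(temperature_counts)
```

Applied to the coupon data, `data["temperature"].value_counts().sort_index()` would return one count per temperature level.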

### **Investigating the Bar Coupons**

Now, we will lead you through an exploration of just the bar-related coupons.  

### 1. Create a new `DataFrame` that contains just the bar coupons.


In [336]:
# We use a query to isolate the rows containing the value "Bar" in the "coupon" column.
bar_df = data.query("coupon == 'Bar'")
bar_df.sample(5)

Unnamed: 0,destination,passanger,weather,temperature,time,coupon,expiration,gender,age,maritalStatus,...,CoffeeHouse,CarryAway,RestaurantLessThan20,Restaurant20To50,toCoupon_GEQ5min,toCoupon_GEQ15min,toCoupon_GEQ25min,direction_same,direction_opp,Y
10146,No Urgent Place,Alone,Snowy,30,2PM,Bar,1d,Male,31,Married partner,...,never,1~3,1~3,1~3,1,0,0,0,1,1
8945,Home,Kid(s),Sunny,30,6PM,Bar,2h,Female,31,Married partner,...,4~8,4~8,less1,less1,1,1,0,0,1,1
6729,Work,Alone,Sunny,55,7AM,Bar,1d,Female,31,Married partner,...,1~3,1~3,1~3,1~3,1,1,1,0,1,0
9679,Work,Alone,Sunny,30,7AM,Bar,1d,Male,21,Single,...,never,4~8,1~3,1~3,1,1,0,1,0,0
12325,Work,Alone,Snowy,30,7AM,Bar,1d,Female,36,Married partner,...,never,1~3,less1,less1,1,1,1,0,1,0


### 2. What proportion of bar coupons were accepted?


In [349]:
# Percentage calculation.
percentage_accept_bar = (bar_df["Y"].sum()/bar_df.shape[0])*100
# Print the percentage with a message.
print(str(percentage_accept_bar)[:5]+"% of individuals responded that they would accept the coupon for bars.")

41.19% of individuals responded that they would accept the coupon for bars.


#### Conclusions for this section
- About 41% of respondents would accept a bar coupon, which is below the 56.93% acceptance rate across all coupon types.

### 3. Compare the acceptance rate between those who went to a bar 3 or fewer times a month to those who went more.


In [428]:
# Sum the accepted bar coupons (Y == 1) for each bar-visit frequency.
total_bar_coupon = bar_df.groupby("Bar")["Y"].sum().reset_index()

# Sort the rows by frequency of bar visits, from lowest to highest
# (groupby returns the labels in alphabetical order).
bar_order = ["never", "less1", "1~3", "4~8", "gt8"]
total_bar_coupon_sort = total_bar_coupon.set_index("Bar").reindex(bar_order).reset_index()

# Plot it!
px.bar(total_bar_coupon_sort, x="Bar", y="Y", title="Accepted bar coupons by bar visits per month",
       labels={"Y":"Accepted coupons","Bar":"Bar visits per month"})

In [426]:
# Create two separate DataFrames based on bar visit frequency
less_3 = bar_df.query("Bar == 'never' | Bar == 'less1' | Bar == '1~3'")
more_3 = bar_df.query("Bar == '4~8' | Bar == 'gt8'")

# Percentage calculation.
less_3_rate = less_3["Y"].mean()*100
more_3_rate = more_3["Y"].mean()*100

# Print the results
print("- Acceptance Rate for 3 or Fewer Bar Visits:\n  Total data:" + str(less_3["Y"].shape[0]) +"\n  Acceptance:"+str(less_3["Y"].sum())+"\n  Rate:"+str(less_3_rate)[:5] + "%\n")
print("- Acceptance Rate for More Than 3 Bar Visits:\n  Total data:" + str(more_3["Y"].shape[0]) +"\n  Acceptance:"+str(more_3["Y"].sum())+"\n  Rate:"+str(more_3_rate)[:5] + "%\n")

- Acceptance Rate for 3 or Fewer Bar Visits:
  Total data:1720
  Acceptance:641
  Rate:37.26%

- Acceptance Rate for More Than 3 Bar Visits:
  Total data:193
  Acceptance:147
  Rate:76.16%



#### Conclusions for this section
- In the bar chart, most accepted coupons come from people who go to bars 3 or fewer times per month, because that group is far larger (1,720 responses versus 193).
- Comparing rates instead of counts shows the opposite pattern: people who go to bars more than 3 times a month accepted 76.16% of bar coupons, versus 37.26% for those who go 3 or fewer times.
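
Counts and rates can be read side by side by aggregating `Y` per bar-visit frequency. The following is a minimal sketch on made-up rows; `BAR_ORDER` and `rate_by_bar_frequency` are hypothetical names, while the frequency labels are the ones that appear in the "Bar" column.

```python
import pandas as pd

# Frequency labels from the "Bar" column, from least to most frequent.
BAR_ORDER = ["never", "less1", "1~3", "4~8", "gt8"]

def rate_by_bar_frequency(df: pd.DataFrame) -> pd.DataFrame:
    """Return total responses, accepted coupons, and acceptance rate (%) per level."""
    summary = df.groupby("Bar")["Y"].agg(total="count", accepted="sum", rate="mean")
    summary = summary.reindex(BAR_ORDER)  # logical order instead of alphabetical
    summary["rate"] = (summary["rate"] * 100).round(2)
    return summary

# Made-up rows (not the coupon data).
toy_bar = pd.DataFrame({
    "Bar": ["never", "never", "1~3", "gt8", "gt8", "less1", "4~8"],
    "Y":   [0, 1, 1, 1, 1, 0, 1],
})
toy_rates = rate_by_bar_frequency(toy_bar)
print(toy_rates)
```

Plotting the `rate` column instead of the raw sum would keep the larger low-frequency groups from dominating the chart.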

### 4. Compare the acceptance rate between drivers who go to a bar more than once a month and are over the age of 25 to the all others.  Is there a difference?


In [456]:
# Create two separate DataFrames based on conditions
more_1_age_25 = bar_df.query("(Bar == '1~3' | Bar == '4~8' | Bar == 'gt8') & age > 25")
not_more_1_age_2 = bar_df.query("not((Bar == '1~3' | Bar == '4~8' | Bar == 'gt8') & age > 25)")

# Percentage calculation
more_1_age_25_rate = more_1_age_25["Y"].mean()*100
not_more_1_age_2_rate = not_more_1_age_2["Y"].mean()*100

# Print the results
print("- Customers who go more than once a month and are above the age of 25:\n  Total data:" + str(more_1_age_25["Y"].shape[0]) +"\n  Acceptance:"+str(more_1_age_25["Y"].sum())+"\n  Rate:"+str(more_1_age_25_rate)[:5] + "%\n")
print("- Other customers who do not meet this condition:\n  Total data:" + str(not_more_1_age_2["Y"].shape[0]) +"\n  Acceptance:"+str(not_more_1_age_2["Y"].sum())+"\n  Rate:"+str(not_more_1_age_2_rate)[:5] + "%\n")

- Customers who go more than once a month and are above the age of 25:
  Total data:403
  Acceptance:278
  Rate:68.98%

- Other customers who do not meet this condition:
  Total data:1510
  Acceptance:510
  Rate:33.77%



#### Conclusions for this section
- Yes, there is a difference: drivers over the age of 25 who go to bars more than once a month accepted 68.98% of bar coupons, versus 33.77% for all other drivers.
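
The "group versus everyone else" comparison recurs in the following problems, so it can be wrapped in a small function that takes a boolean mask. This is a minimal sketch under assumptions: `compare_acceptance` is a hypothetical helper, and the rows are made-up data with the same column names as `bar_df`.

```python
import pandas as pd

def compare_acceptance(df: pd.DataFrame, mask: pd.Series) -> pd.DataFrame:
    """Summarize acceptance for rows where mask is True versus all other rows."""
    rows = []
    for label, part in (("group", df[mask]), ("others", df[~mask])):
        rows.append({
            "segment": label,
            "total": len(part),
            "accepted": int(part["Y"].sum()),
            "rate_pct": round(part["Y"].mean() * 100, 2),
        })
    return pd.DataFrame(rows).set_index("segment")

# Made-up rows (not the coupon data).
toy = pd.DataFrame({
    "Bar": ["1~3", "gt8", "never", "less1", "4~8", "1~3"],
    "age": [31, 26, 41, 21, 21, 36],
    "Y":   [1, 1, 0, 0, 1, 0],
})

# Same condition as the query above: frequent bar visitor and older than 25.
frequent = toy["Bar"].isin(["1~3", "4~8", "gt8"])
result = compare_acceptance(toy, frequent & (toy["age"] > 25))
print(result)
```

Using `~mask` for the complement guarantees that the two segments cover every row exactly once, which the paired `query` strings only achieve if both are kept in sync by hand.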

### 5. Use the same process to compare the acceptance rate between drivers who go to bars more than once a month and had passengers that were not a kid and had occupations other than farming, fishing, or forestry. 


In [461]:
# Create two separate DataFrames based on conditions
more_1_passenger_18_nfa_nfi_nfo = bar_df.query("(passanger != 'Kid(s)' & passanger != 'Alone' & (Bar == '1~3' | Bar == '4~8' | Bar == 'gt8') & occupation != 'Farming Fishing & Forestry')")
not_more_1_passenger_18_nfa_nfi_nfo = bar_df.query("not(passanger != 'Kid(s)' & passanger != 'Alone' & (Bar == '1~3' | Bar == '4~8' | Bar == 'gt8') & occupation != 'Farming Fishing & Forestry')")

# Percentage calculation
more_1_passenger_18_nfa_nfi_nfo_rate = more_1_passenger_18_nfa_nfi_nfo["Y"].mean()*100
not_more_1_passenger_18_nfa_nfi_nfo_rate = not_more_1_passenger_18_nfa_nfi_nfo["Y"].mean()*100

# Print the results
print("- Customers who go more than once a month, have passengers who are not children and are not engaged in farming, fishing, or forestry:\n  Total data:" + str(more_1_passenger_18_nfa_nfi_nfo["Y"].shape[0]) +"\n  Acceptance:"+str(more_1_passenger_18_nfa_nfi_nfo["Y"].sum())+"\n  Rate:"+str(more_1_passenger_18_nfa_nfi_nfo_rate)[:5] + "%\n")
print("- Other customers who do not meet this condition:\n  Total data:" + str(not_more_1_passenger_18_nfa_nfi_nfo["Y"].shape[0]) +"\n  Acceptance:"+str(not_more_1_passenger_18_nfa_nfi_nfo["Y"].sum())+"\n  Rate:"+str(not_more_1_passenger_18_nfa_nfi_nfo_rate)[:5] + "%\n")

- Customers who go more than once a month, have passengers who are not children and are not engaged in farming, fishing, or forestry:
  Total data:189
  Acceptance:135
  Rate:71.42%

- Other customers who do not meet this condition:
  Total data:1724
  Acceptance:653
  Rate:37.87%



### 6. Compare the acceptance rates between those drivers who:

- go to bars more than once a month, had passengers that were not a kid, and were not widowed *OR*
- go to bars more than once a month and are under the age of 30 *OR*
- go to cheap restaurants more than 4 times a month and income is less than 50K. 
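
One way to approach this comparison is to build one boolean mask per bullet and combine them with `|`. The sketch below makes several assumptions that should be checked against the real data: the income labels under 50K and the 'Widowed' label are taken from the data description and are not confirmed by the outputs above, "more than 4 times" is approximated by the '4~8' and 'gt8' levels, "had passengers that were not a kid" excludes 'Alone' as in the previous problem, and "age" is assumed to be numeric after cleaning. All names are hypothetical, and the rows are made up.

```python
import pandas as pd

FREQUENT_BAR = ["1~3", "4~8", "gt8"]
FREQUENT_CHEAP = ["4~8", "gt8"]  # approximation of "more than 4 times a month"
# Assumed income labels below 50K; verify with data["income"].unique().
INCOME_UNDER_50K = ["Less than $12500", "$12500 - $24999",
                    "$25000 - $37499", "$37500 - $49999"]

def problem6_mask(df: pd.DataFrame) -> pd.Series:
    """True for rows that satisfy at least one of the three conditions."""
    goes_to_bars = df["Bar"].isin(FREQUENT_BAR)
    cond_a = (goes_to_bars
              & ~df["passanger"].isin(["Kid(s)", "Alone"])
              & (df["maritalStatus"] != "Widowed"))
    cond_b = goes_to_bars & (df["age"] < 30)
    cond_c = (df["RestaurantLessThan20"].isin(FREQUENT_CHEAP)
              & df["income"].isin(INCOME_UNDER_50K))
    return cond_a | cond_b | cond_c

# Made-up rows (not the coupon data), one per case.
toy = pd.DataFrame({
    "Bar": ["1~3", "gt8", "never", "never"],
    "passanger": ["Friend(s)", "Alone", "Kid(s)", "Alone"],
    "maritalStatus": ["Single", "Widowed", "Married partner", "Single"],
    "age": [41, 26, 46, 50],
    "RestaurantLessThan20": ["never", "less1", "4~8", "1~3"],
    "income": ["$100000 or More", "$100000 or More",
               "$25000 - $37499", "$75000 - $87499"],
    "Y": [1, 1, 0, 0],
})
mask = problem6_mask(toy)
group_rate = toy.loc[mask, "Y"].mean() * 100
other_rate = toy.loc[~mask, "Y"].mean() * 100
print(mask.tolist(), round(group_rate, 2), round(other_rate, 2))
```

Keeping the three conditions as separate variables also makes it possible to report the acceptance rate of each condition on its own before combining them.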



### 7.  Based on these observations, what do you hypothesize about drivers who accepted the bar coupons?

### Independent Investigation

Using the bar coupon example as motivation, you are to explore one of the other coupon groups and try to determine the characteristics of passengers who accept the coupons.  