# RFP: Targeted Taco Bell Ads

## Project Overview
You are invited to submit a proposal that answers the following question:

### What ad will you create and why?

*Please submit your proposal by **1/30/25 at 11:59 PM**.*

## Required Proposal Components

### 1. Data Description
In the code cell below, read in the data you will need to train and test your model. Call `info()` once you have read the data into a dataframe. Consider using some or all of the following sources:
- [Customer Demographics](https://drive.google.com/file/d/1HK42Oa3bhhRDWR1y1wVBDAQ2tbNwg1gS/view?usp=sharing)
- [Ad Response Data](https://drive.google.com/file/d/1cuLqXPNKhP66m5BP9BAlci2G--Vopt-Z/view?usp=sharing)

*Note, a level 5 dataset combines these two data sets.*

In [8]:
# Read data into a dataframe(s).
import pandas as pd
customerdems=pd.read_csv('customer_data.csv')
addata=pd.read_csv('ad_data.csv')

print('\nCustomer Dems:')
customerdems.info()
df=customerdems.merge(addata, how='left', on='customer_id')
print('\nAdd info:')
addata.info()

print('\nMerged info:')
df.info()

df.head()
# Don't forget to call info()!


Customer Dems:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 7 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   customer_id  10000 non-null  int64  
 1   state        10000 non-null  object 
 2   sex          10000 non-null  object 
 3   age          10000 non-null  float64
 4   occupation   10000 non-null  object 
 5   family_size  10000 non-null  int64  
 6   income       10000 non-null  int64  
dtypes: float64(1), int64(3), object(3)
memory usage: 547.0+ KB

Add info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 6 columns):
 #   Column            Non-Null Count  Dtype 
---  ------            --------------  ----- 
 0   customer_id       10000 non-null  int64 
 1   ad_type           10000 non-null  object
 2   ad_medium         10000 non-null  object
 3   ad_response       10000 non-null  bool  
 4   items_purchased   10000 non-null  ob

Unnamed: 0,customer_id,state,sex,age,occupation,family_size,income,ad_type,ad_medium,ad_response,items_purchased,drinks_purchased
0,9167,MO,F,42.0,Food Service,1,40343,DISCOUNT-20%,Instagram photo ad,True,"['mexican pizza', 'chicken quesadilla']","['mountain dew', 'mug root beer']"
1,531,MI,F,36.0,Retail,4,41730,DISCOUNT-10%,Instagram photo ad,False,"['steak garlic nacho fries', 'crunchy taco', '...","['mug root beer', 'iced tea', 'starry', 'iced ..."
2,2265,CA,F,25.0,IT,0,84024,DISCOUNT-20%,15 sec YouTube ad,False,['chicken quesadilla'],['mug root beer']
3,7550,VA,M,38.0,Food Service,2,38990,BOGO - Garlic Steak Nacho Fries,15 sec YouTube ad,True,"['steak garlic nacho fries', 'steak garlic nac...","['pepsi', 'diet pepsi', 'diet pepsi']"
4,5334,MT,M,35.0,Food Service,1,33400,DISCOUNT-20%,15 sec YouTube ad,False,"['spicy potato soft taco', 'nachos bellgrande']","['gatorade', 'baja blast']"


### 2. Training Your Model
In the cell seen below, write the code you need to train a K-means clustering model. Make sure you describe the center of each cluster found.

*Note, level 5 work uses at least 3 features to train a K-means model using only the standard Python library and Pandas. A level 4 uses external libraries like scikit or numpy.*

In [18]:
# Train model here.

#Features: Ad_Medium, Ad_response, Ad_Type, Income, age


sdf=[]
for i, player in df.iterrows():
    drinks = player['drinks_purchased']
    if 'baja blast' in drinks:
        sdf.append(player)
sdf=pd.DataFrame(sdf)
sdf.info()
sdf.head()


badresponse=[]
for i, player in sdf.iterrows():
    response = player['ad_response']
    if response == 'False':
        badresponse.append(i)
sdf = sdf.drop(badresponse)
sdf.info()
sdf.head()


<class 'pandas.core.frame.DataFrame'>
Index: 4379 entries, 4 to 9998
Data columns (total 12 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   customer_id       4379 non-null   int64  
 1   state             4379 non-null   object 
 2   sex               4379 non-null   object 
 3   age               4379 non-null   float64
 4   occupation        4379 non-null   object 
 5   family_size       4379 non-null   int64  
 6   income            4379 non-null   int64  
 7   ad_type           4379 non-null   object 
 8   ad_medium         4379 non-null   object 
 9   ad_response       4379 non-null   bool   
 10  items_purchased   4379 non-null   object 
 11  drinks_purchased  4379 non-null   object 
dtypes: bool(1), float64(1), int64(3), object(7)
memory usage: 414.8+ KB
<class 'pandas.core.frame.DataFrame'>
Index: 4379 entries, 4 to 9998
Data columns (total 12 columns):
 #   Column            Non-Null Count  Dtype  
---  ------       

Unnamed: 0,customer_id,state,sex,age,occupation,family_size,income,ad_type,ad_medium,ad_response,items_purchased,drinks_purchased
4,5334,MT,M,35.0,Food Service,1,33400,DISCOUNT-20%,15 sec YouTube ad,False,"['spicy potato soft taco', 'nachos bellgrande']","['gatorade', 'baja blast']"
5,9168,FL,F,68.0,Retired,3,0,DISCOUNT-5%,30 sec cable TV ad,False,"['cinnamon twists', 'mexican pizza', 'soft tac...","['brisk', 'iced tea', 'baja blast', 'iced tea']"
7,5405,NC,M,45.0,Other,3,70379,DISCOUNT-20%,30 sec Hulu commercial,False,"['nachos bellgrande', 'spicy potato soft taco'...","['baja blast', 'baja blast', 'pepsi', 'pepsi']"
9,9385,NE,M,20.0,IT,0,77982,REWARD - Free Baja Blast with purchase of $20 ...,15 sec YouTube ad,True,"['chicken quesadilla', 'beefy 5 layer burrito'...",['baja blast']
11,3989,AL,M,46.0,Retail,1,47946,BOGO - Baja Blast,Static Facebook ad,True,"['spicy potato soft taco', 'beefy 5 layer burr...","['baja blast', 'baja blast']"


#### Don't forget to describe the centers of the clusters you found.

### 3. Testing Your Model
In the cell seen below, write the code you need to test your K-means model. Then, interpret your findings.

*Note, level 5 testing uses both an elbow plot and a silhouette score to evaluate your model. Level 4 uses one or the other.*

In [2]:
# Test model here.

#### Interpret your elbow plot and/or silhouette score here.

### 4. Final Answer

In the first cell seen below, describe the cluster you have chosen to target with your ad, making sure to describe the type of ad they were the most likely to respond to. Then, use software of your choosing to create the ad you will need to target this cluster. You do not need to create an ad for both the nacho fries and the Baja Blast. You can focus on one if that's what your cluster cares about most.

In the second cell seen below, include a link to your ad.

*Note, a level 5 ad uses the medium (static image or video) the cluster most likely responded to.* 

#### Describe the cluster you are targeting here.

#### Link your ad here.