
## Best ad group

*Data Analysis, Python, Pandas, Data Manipulation, External Dataset*

Given the following dataset, can you identify the top 3 performing ad groups? Explain how you evaluated the ad groups.

Note that this problem has several valid solutions; it is an example of the more open-ended, case-style question you might come across in an interview.

In [1]:
import pandas as pd

path = 'https://raw.githubusercontent.com/Data-Intelligence-Mastery/data_science_interview_questions/master/data/Ads%20Table.csv'
df = pd.read_csv(path)
df.head()

Unnamed: 0,date,shown,clicked,converted,avg_cost_per_click,total_revenue,ad
0,10/1/15,65877,2339,43,0.9,641.62,ad_group_1
1,10/2/15,65100,2498,38,0.94,756.37,ad_group_1
2,10/3/15,70658,2313,49,0.86,970.9,ad_group_1
3,10/4/15,69809,2833,51,1.01,907.39,ad_group_1
4,10/5/15,68186,2696,41,1.0,879.45,ad_group_1


### Performance based on `total_revenue`

First, let's sort the rows by `total_revenue`:

In [2]:
df.sort_values(['total_revenue'], ascending = False, inplace = True)
df.head()

Unnamed: 0,date,shown,clicked,converted,avg_cost_per_click,total_revenue,ad
678,11/13/15,158855,13805,1349,1.77,39623.71,ad_group_13
664,10/30/15,157960,12359,1273,1.6,35841.62,ad_group_13
635,10/1/15,162075,14470,1389,1.81,35232.44,ad_group_13
657,10/23/15,161809,15014,1224,1.89,34571.09,ad_group_13
687,11/22/15,165662,14084,1343,1.73,31849.14,ad_group_13


In [3]:
# numeric_only=True keeps the string `date` column out of the aggregation
df.groupby('ad').sum(numeric_only=True).sort_values('total_revenue', ascending=False).head()

ad,shown,clicked,converted,avg_cost_per_click,total_revenue
ad_group_13,8237478,705941,66741,88.76,1054962.03
ad_group_18,4634466,458621,32039,104.01,522716.78
ad_group_4,4799075,397757,25004,129.84,381221.11
ad_group_20,6273230,383486,17869,93.21,280928.66
ad_group_26,3820805,347642,17397,118.59,275222.54


`ad_group_13`, `ad_group_18` and `ad_group_4` rank top 3 based on total revenue.

In [4]:
df.groupby('ad').count().sort_values('date', ascending = False).head()

ad,date,shown,clicked,converted,avg_cost_per_click,total_revenue
ad_group_1,53,53,53,53,53,53
ad_group_37,53,53,53,53,53,53
ad_group_30,53,53,53,53,53,53
ad_group_31,53,53,53,53,53,53
ad_group_32,53,53,53,53,53,53


Each ad group shown has 53 rows. If that holds for every ad group, ranking by `total_revenue` and ranking by average revenue give the same order.
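
The `head()` output above lists only five groups, so the equal-count claim is worth checking across all groups. Below is a minimal sketch of such a check. It runs on a small made-up frame (`toy`) rather than the real `df`, and the helper name `rows_per_group_are_equal` is hypothetical.

```python
import pandas as pd

def rows_per_group_are_equal(frame, key="ad"):
    """Return True when every group in `key` has the same number of rows."""
    counts = frame.groupby(key).size()
    return bool(counts.nunique() == 1)

# Toy data, made up for illustration (not taken from the Ads Table).
toy = pd.DataFrame({
    "ad": ["ad_group_a", "ad_group_a", "ad_group_b", "ad_group_b"],
    "total_revenue": [10.0, 20.0, 5.0, 7.0],
})

balanced = rows_per_group_are_equal(toy)

# With equal row counts, the sum and the mean order the groups identically.
by_sum = toy.groupby("ad")["total_revenue"].sum().sort_values(ascending=False)
by_mean = toy.groupby("ad")["total_revenue"].mean().sort_values(ascending=False)
same_order = list(by_sum.index) == list(by_mean.index)
```

On the real data, `rows_per_group_are_equal(df)` would confirm or refute the claim for every ad group, not only the five displayed.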

### Performance based on profit

In [5]:
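# Daily ad spend: number of clicks times the average cost per click
df['cost'] = df['clicked'] * df['avg_cost_per_click']
# Daily profit: revenue minus ad spend
df['profit'] = df['total_revenue'] - df['cost']
# Rank ad groups by summed profit (not displayed: a cell renders only its last expression)
df.groupby('ad').sum(numeric_only=True).sort_values('profit', ascending=False).head()
df.head()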

Unnamed: 0,date,shown,clicked,converted,avg_cost_per_click,total_revenue,ad,cost,profit
678,11/13/15,158855,13805,1349,1.77,39623.71,ad_group_13,24434.85,15188.86
664,10/30/15,157960,12359,1273,1.6,35841.62,ad_group_13,19774.4,16067.22
635,10/1/15,162075,14470,1389,1.81,35232.44,ad_group_13,26190.7,9041.74
657,10/23/15,161809,15014,1224,1.89,34571.09,ad_group_13,28376.46,6194.63
687,11/22/15,165662,14084,1343,1.73,31849.14,ad_group_13,24365.32,7483.82


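Per the grouped sum, which the cell above computes but does not display (its last expression is `df.head()`), `ad_group_2`, `ad_group_31` and `ad_group_16` rank top 3 based on profit.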
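
To show the profit ranking itself rather than the raw rows, the same calculation can be wrapped in a small function. This is a sketch under stated assumptions: `top_groups_by_profit` is a hypothetical helper, and `toy` is made-up data in which the highest-revenue group loses money. The conversion-rate table later in this notebook shows the same pattern for `ad_group_13`, which has a negative mean profit.

```python
import pandas as pd

def top_groups_by_profit(frame, n=3):
    """Sum daily profit per ad group and return the n most profitable groups."""
    out = frame.copy()  # leave the caller's frame untouched
    out["cost"] = out["clicked"] * out["avg_cost_per_click"]
    out["profit"] = out["total_revenue"] - out["cost"]
    return out.groupby("ad")["profit"].sum().nlargest(n)

# Toy data, made up for illustration (not taken from the Ads Table).
toy = pd.DataFrame({
    "ad": ["g1", "g1", "g2", "g2", "g3", "g3"],
    "clicked": [100, 100, 10, 10, 50, 50],
    "avg_cost_per_click": [4.0, 4.0, 1.0, 1.0, 1.0, 1.0],
    "total_revenue": [300.0, 300.0, 50.0, 50.0, 60.0, 60.0],
})

ranking = top_groups_by_profit(toy, n=2)
```

Here `g1` has the largest revenue but the lowest profit, so it drops out of the top 2. Calling `top_groups_by_profit(df)` on the real data should display the ranking that the text above refers to.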

### Performance based on conversion rate


In [6]:
df['conv_rate'] = df['converted'] / df['clicked']
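# numeric_only=True avoids a TypeError on the string `date` column in recent pandas versions
df.groupby('ad').mean(numeric_only=True).sort_values('conv_rate', ascending=False).head()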

ad,shown,clicked,converted,avg_cost_per_click,total_revenue,cost,profit,conv_rate
ad_group_2,51076.981132,1168.150943,117.90566,0.638113,1783.559811,756.079811,1027.48,0.102833
ad_group_13,155424.113208,13319.641509,1259.264151,1.674717,19904.943962,23343.302642,-3438.358679,0.095344
ad_group_12,28624.339623,1710.981132,156.509434,2.009623,2439.359245,3588.445472,-1149.086226,0.092007
ad_group_16,29595.075472,788.0,69.849057,0.542642,1056.845849,449.684528,607.161321,0.089641
ad_group_34,35371.622642,2979.169811,260.320755,1.733208,4019.53434,5208.80434,-1189.27,0.08822


`ad_group_2`, `ad_group_13` and `ad_group_12` rank top 3 based on conversion rate.
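
The ranking above averages the daily conversion rates, which weights a low-traffic day as heavily as a high-traffic one. An alternative is the pooled rate: total conversions divided by total clicks per group. The sketch below compares the two. `conversion_rates` is a hypothetical helper, and `toy` is made-up data chosen so that the two definitions disagree.

```python
import pandas as pd

def conversion_rates(frame):
    """Return the mean daily rate and the pooled rate for each ad group."""
    daily = (frame["converted"] / frame["clicked"]).groupby(frame["ad"]).mean()
    totals = frame.groupby("ad")[["converted", "clicked"]].sum()
    pooled = totals["converted"] / totals["clicked"]
    return pd.DataFrame({"mean_daily_rate": daily, "pooled_rate": pooled})

# Toy data, made up for illustration (not taken from the Ads Table).
toy = pd.DataFrame({
    "ad": ["g1", "g1", "g2", "g2"],
    "clicked": [10, 1000, 100, 100],
    "converted": [5, 50, 20, 20],
})

rates = conversion_rates(toy)
```

In this toy frame `g1` wins on the mean daily rate because of one small, lucky day, while `g2` wins on the pooled rate. Which definition is appropriate depends on whether each day or each click should carry equal weight.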
