In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [6]:
df = pd.read_csv('market data.csv')

In [7]:
df.shape

(1143, 11)

In [8]:
df.head()

,ad_id,xyz_campaign_id,fb_campaign_id,age,gender,interest,Impressions,Clicks,Spent,Total_Conversion,Approved_Conversion
0,708746,916,103916,30-34,M,15,7350,1,1.43,2,1
1,708749,916,103917,30-34,M,16,17861,2,1.82,2,0
2,708771,916,103920,30-34,M,20,693,0,0.0,1,0
3,708815,916,103928,30-34,M,28,4259,1,1.25,1,0
4,708818,916,103928,30-34,M,28,4133,1,1.29,1,1


In [10]:
# Getting number of unique values

print('No. of unique ads:', df['ad_id'].nunique())
print('No. of campaigns:', df['xyz_campaign_id'].nunique())
print('No. of facebook campaigns:', df['fb_campaign_id'].nunique())
print('No. of interest groups:', df['interest'].nunique())
print('No. of age groups:', df['age'].nunique())

No. of unique ads: 1143
No. of campaigns: 3
No. of facebook campaigns: 691
No. of interest groups: 40
No. of age groups: 4
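
Counting each column by hand invites copy-paste slips. A minimal sketch of getting every count in one call, using a made-up toy frame (`toy` and its values are hypothetical, not the ad data):

```python
import pandas as pd

# Toy frame standing in for the ad data; the values are invented for illustration.
toy = pd.DataFrame({
    'ad_id': [1, 2, 3, 4],
    'xyz_campaign_id': [916, 916, 936, 1178],
    'age': ['30-34', '30-34', '35-39', '35-39'],
    'gender': ['M', 'F', 'M', 'F'],
})

# nunique() on the whole frame returns one distinct-value count per column
unique_counts = toy.nunique()
print(unique_counts)
```

On the real data the same call would be `df.nunique()`, which makes it harder to mislabel a column.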


In [11]:
# 'ad_id' and 'fb_campaign_id' are identifiers with too many unique values to be useful as features, so both columns are dropped

df = df.drop(['ad_id', 'fb_campaign_id'], axis = 1)
df.columns

Index(['xyz_campaign_id', 'age', 'gender', 'interest', 'Impressions', 'Clicks',
       'Spent', 'Total_Conversion', 'Approved_Conversion'],
      dtype='object')

# Feature engineering

* ```Click-through rate (CTR)```: The percentage of impressions that resulted in a click. A high CTR is often seen as a sign of good creative being presented to a relevant audience. A low CTR suggests less-than-engaging adverts (design and/or messaging), presentation of adverts to an inappropriate audience, or both. What counts as a good CTR depends on the type of advert (website banner, Google Shopping ad, search network text ad, etc.) and varies across sectors, but 2% is a reasonable rough benchmark.

--------------
* ```Conversion Rate (CR)```: This is the percentage of clicks that result in a 'conversion'. What a conversion is will be determined by the objectives of the campaign. It could be a sale, someone completing a contact form on a landing page, downloading an e-book, watching a video, or simply spending more than a particular amount of time or viewing over a target number of pages on a website.

-------------
* ```Cost Per Click (CPC)```: Self-explanatory, this one: how much (on average) each click cost. While reducing the cost per click is often seen as desirable, the CPC needs to be considered along with other variables. For example, a campaign with an average CPC of £0.50 and a CR of 5% is likely achieving more with its budget than one with a CPC of £0.20 and a CR of 1% (assuming the conversion value is the same).

-----------------
* ```Cost Per Conversion```: Another simple metric, this figure is often more relevant than the CPC, as it combines the CPC and CR metrics, giving us an easy way to quickly get a feel for campaign effectiveness.
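
The CPC example above can be checked with a little arithmetic. This is a minimal sketch, assuming CR is expressed as the fraction of clicks that convert; `cost_per_conversion` is a hypothetical helper, not part of the notebook:

```python
def cost_per_conversion(cpc, conversion_rate):
    """Average spend per conversion, with conversion_rate as a fraction of clicks."""
    return cpc / conversion_rate

# The two campaigns from the CPC example (values in pounds)
campaign_a = cost_per_conversion(cpc=0.5, conversion_rate=0.05)
campaign_b = cost_per_conversion(cpc=0.2, conversion_rate=0.01)

print(round(campaign_a, 2), round(campaign_b, 2))
```

The campaign with the higher CPC pays about £10 per conversion, against about £20 for the cheaper clicks, which is why CPC alone can mislead.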

In [12]:
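# Calculating CTR, CR, CPC and Cost Per Conversion
# Rows where a denominator is zero produce NaN or inf in these columns

# CTR is stored as a percentage of impressions
df['ClickThroughRate'] = (df['Clicks'] / df['Impressions']) * 100
# Note: this is the share of total conversions that were approved (a fraction,
# not a percentage), rather than conversions per click
df['ConversionRate'] = df['Approved_Conversion'] / df['Total_Conversion']
df['CostPerClick'] = df['Spent'] / df['Clicks']
df['CostPerConversion'] = df['Spent'] / df['Approved_Conversion']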

* ```Conversion Value```: A conversion value is a numerical value that you assign to specific conversions to represent their impact on your business. The major benefit of assigning conversion values is that they help you track, optimize, and report on your return on ad spend (ROAS).

----------------
* ```ROAS```: ROAS stands for return on ad spend, a marketing metric that measures the amount of revenue your business earns for each dollar it spends on advertising. For most practical purposes, ROAS is very similar to another metric you're probably familiar with: return on investment, or ROI.

-----------------
* ```Cost Per Mille (CPM)```: This number is the cost of one thousand impressions. If your objective is ad exposure to increase brand awareness, this might be an important KPI for you to measure.
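
As a worked illustration of these definitions, here is a small sketch using the `Spent`, `Impressions` and `Approved_Conversion` values from the first row of `df.head()`. The 100-dollar value per approved conversion is an assumption (the same one the notebook makes), and the variable names are only illustrative:

```python
# Values taken from the first row shown in df.head()
spent = 1.43
impressions = 7350
approved_conversions = 1

# Assumption: each approved conversion is worth 100 dollars
value_per_conversion = 100

conversion_value = approved_conversions * value_per_conversion
roas = conversion_value / spent      # revenue earned per dollar of ad spend
cpm = spent / impressions * 1000     # cost of one thousand impressions

print(round(roas, 2), round(cpm, 2))
```

Under that assumption the ad returns roughly 70 dollars per dollar spent, at a cost of about 19 cents per thousand impressions.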

In [13]:
# Assume each approved conversion is worth 100 dollars

df['ConversionValue'] = df['Approved_Conversion'] * 100
df['ROAS'] = (df['ConversionValue'] / df['Spent']).round(2)
df['CostPerMille'] = ((df['Spent'] / df['Impressions']) * 1000).round(2)

In [14]:
# Removing all the records having NaN or Infinity values

df = df.replace([np.inf, -np.inf], np.nan).dropna(axis = 0)

In [15]:
df.shape

(513, 16)
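
The drop from 1143 to 513 rows is presumably driven by ratios whose denominator is zero (for example, ads with no clicks or no approved conversions). A minimal sketch of that mechanism on a made-up toy frame (`toy` and its values are hypothetical):

```python
import numpy as np
import pandas as pd

# Toy frame: one normal row, one 0/0 row, one x/0 row
toy = pd.DataFrame({
    'Clicks': [2, 0, 0],
    'Spent': [1.82, 0.0, 0.5],
})

# pandas does not raise on division by zero: 0/0 gives NaN, x/0 gives inf
toy['CostPerClick'] = toy['Spent'] / toy['Clicks']

# Same cleanup as in the notebook: turn inf into NaN, then drop rows with any NaN
cleaned = toy.replace([np.inf, -np.inf], np.nan).dropna(axis=0)

print(len(toy), len(cleaned))
```

Only the row with a non-zero denominator survives, so this cleanup also discards every ad that never received a click.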

In [16]:
df.columns

Index(['xyz_campaign_id', 'age', 'gender', 'interest', 'Impressions', 'Clicks',
       'Spent', 'Total_Conversion', 'Approved_Conversion', 'ClickThroughRate',
       'ConversionRate', 'CostPerClick', 'CostPerConversion',
       'ConversionValue', 'ROAS', 'CostPerMille'],
      dtype='object')

In [17]:
# Per-campaign means of the engineered metrics
(df[['xyz_campaign_id', 'ClickThroughRate', 'CostPerClick', 'ConversionRate',
     'ConversionValue', 'ROAS', 'CostPerMille']]
 .groupby(['xyz_campaign_id'], as_index = False)
 .agg('mean')
 .rename(columns = {'xyz_campaign_id': 'Campaign',
                    'ClickThroughRate': 'Average CTR',
                    'CostPerClick': 'Average Cost/Click',
                    'ConversionRate': 'Average Conversion Rate'})
 .style.background_gradient(cmap = 'Wistia'))

,Campaign,Average CTR,Average Cost/Click,Average Conversion Rate,ConversionValue,ROAS,CostPerMille
0,916,0.027162,1.339464,0.90625,100.0,42.5,0.37125
1,936,0.026293,1.372928,0.899851,105.357143,47.035089,0.359732
2,1178,0.016359,1.583425,0.51837,225.714286,5.214623,0.254
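
The table averages per-ad ratios, which weights every ad equally regardless of how many impressions it received. As a hedged aside, the sketch below contrasts that with a pooled CTR computed from campaign totals, on a made-up toy frame (`toy`, `mean_of_rows` and `pooled` are hypothetical names and values, not results from the ad data):

```python
import pandas as pd

# Toy data: campaign 916 has one small ad and one very large ad
toy = pd.DataFrame({
    'xyz_campaign_id': [916, 916, 936, 936],
    'Impressions': [1000, 100000, 5000, 5000],
    'Clicks': [10, 100, 50, 25],
})
toy['ClickThroughRate'] = toy['Clicks'] / toy['Impressions'] * 100

# Mean of the per-ad CTRs (what .agg('mean') on the ratio column computes)
mean_of_rows = toy.groupby('xyz_campaign_id')['ClickThroughRate'].mean()

# Pooled CTR: total clicks over total impressions per campaign
totals = toy.groupby('xyz_campaign_id')[['Clicks', 'Impressions']].sum()
pooled = totals['Clicks'] / totals['Impressions'] * 100

comparison = pd.DataFrame({'mean_of_row_ctr': mean_of_rows, 'pooled_ctr': pooled})
print(comparison)
```

The two figures agree when the ads in a campaign have equal impressions and diverge when a few large ads dominate, so which one to report depends on whether the question is about a typical ad or about the campaign as a whole.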
