###  Sales Campaign analysis

__An introduction to the Facebook advertising platform__<br/>
Along with Google's search and display networks, Facebook is one of the big players when it comes to online advertising. As Facebook users interact with the platform, adding demographic information, liking particular pages and commenting on specific posts, Facebook builds a profile of that user based on who they are and what they're interested in.<br/>
This makes Facebook very attractive to advertisers. Advertisers can create Facebook adverts, then create an 'Audience' for that advert or group of adverts. Audiences can be built from a range of attributes including gender, age, location and interests. This specific targeting means advertisers can tailor content appropriately for a specific audience, even if the product being marketed is the same.<br/>

__What do we need from our Facebook ads analysis?__<br/>
When it comes to analysing the Facebook adverts dataset, there are a lot of questions we can ask, and a lot of insight we can generate. However, from a business perspective we want to ask questions that will give us answers we can use to improve business performance.<br/>
Without knowing anything about the company's marketing strategy or campaign objectives, we do not know which key performance indicators (KPIs) are the most important. For example, a new company may be focussed on brand awareness and may want to maximise the number of impressions, being less concerned about how well these adverts perform in terms of generating clicks and revenue. Another company may simply want to maximise revenue while minimising the amount it spends on advertising.<br/>
As these two objectives are very different, it is important to work with the client to understand exactly what they are hoping to achieve from their marketing campaigns before beginning any analysis, so that our conclusions are relevant and, in particular, actionable. There's not much point in producing a report full of insight if there's nothing the client can do about it.



__Understanding the dataset__<br/>
The data used in this project is from an anonymous organisation’s social media ad campaign. The data contains 1143 observations of 11 variables. Below are the descriptions of the variables. Since you are working with numpy, refer to the `Feature Index` column for the index of each feature.

|Feature Index|Features|Description|
|----|----|----|
|0|ad_id| unique ID for each ad|
|1|xyz_campaign_id| an ID associated with each ad campaign of XYZ company|
|2|fb_campaign_id| an ID associated with how Facebook tracks each campaign|
|3|age| age of the person to whom the ad is shown|
|4|gender| gender of the person to whom the ad is shown|
|5|interest| a code specifying the category to which the person’s interest belongs (interests are as mentioned in the person’s Facebook public profile)|
|6|Impressions| the number of times the ad was shown|
|7|Clicks| number of clicks on that ad|
|8|Spent| Amount paid by company xyz to Facebook, to show that ad|
|9|Total conversion| Total number of people who enquired about the product after seeing the ad|
|10|Approved conversion| Total number of people who bought the product after seeing the ad|
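
Because every column is addressed by position, one optional convenience is to give each index in the table a name. The sketch below is only an illustration: the `COL` dictionary and the two toy rows are made up for this example and are not part of the original notebook.

```python
import numpy as np

# Hypothetical helper: map each feature name from the table to its column index.
COL = {
    "ad_id": 0, "xyz_campaign_id": 1, "fb_campaign_id": 2, "age": 3,
    "gender": 4, "interest": 5, "Impressions": 6, "Clicks": 7,
    "Spent": 8, "Total conversion": 9, "Approved conversion": 10,
}

# Two made-up rows, stored as strings in the same column order as the table.
toy = np.array([
    ["1", "900", "10", "30-34", "M", "15", "5000", "3", "4.50", "2", "1"],
    ["2", "900", "11", "35-39", "F", "16", "8000", "7", "9.25", "4", "0"],
])

# Select a column by name instead of by a bare number.
clicks = toy[:, COL["Clicks"]].astype(int)
print(clicks)
```

Selecting `toy[:, COL["Clicks"]]` reads more clearly than `toy[:, 7]`, at the cost of maintaining the mapping by hand.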

Below is a snapshot of the data you will be working with

In [0]:
import numpy as np
import pandas as pd
import warnings
warnings.filterwarnings('ignore')
import sys

In [0]:
# Print numpy arrays in full instead of truncating long ones with '...'
np.set_printoptions(threshold=sys.maxsize)

### Let's load the data

In [0]:
import csv
path = 'KAG_conversion_data.csv'

with open(path) as f:
    adm = csv.reader(f,delimiter=',')
    adm = list(adm)

# Remove the header row
adm = adm[1:]

# Convert the data into a numpy array and store it in sales_data
sales_data=np.array(adm)

In [0]:
sales_data.shape

(1143, 11)
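
Note that `csv.reader` returns every field as text, so an array built from its rows holds strings rather than numbers. This is why the cells that follow call `.astype(...)` before doing arithmetic. The sketch below shows the effect on a made-up three-column CSV held in memory; the column names and values are illustrative only.

```python
import csv
import io
import numpy as np

# Made-up CSV text standing in for the real file.
text = "ad_id,Clicks,Spent\n1,3,1.5\n2,0,0\n"

rows = list(csv.reader(io.StringIO(text)))
header, body = rows[0], rows[1:]   # split off the header row

toy = np.array(body)
print(toy.dtype.kind)              # 'U' means unicode strings

# Convert a column to numbers before computing with it.
spent = toy[:, 2].astype(float)
print(spent.sum())
```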

Let's delve into the data to find the answers to some questions

### How many unique ad campaigns (xyz_campaign_id) does this data contain? How many times does each campaign appear?

In [0]:
import numpy as np
arr_1 = np.arange(1,11).reshape(5,2)

In [4]:
arr_1

array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10]])

In [5]:
arr_1[:, 0] + arr_1[:, 1]

array([ 3,  7, 11, 15, 19])

In [0]:
unique_campaign, campaign_count = np.unique(sales_data[:,1], return_counts=True)

In [0]:
print("The unique campaign IDs are {}".format(unique_campaign))
print("The number of ads in each campaign is {}".format(campaign_count))

The unique campaign IDs are ['1178' '916' '936']
The number of ads in each campaign is [625  54 464]
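
The two arrays returned by `np.unique(..., return_counts=True)` line up position by position, so they can be zipped into a single lookup. The following is a minimal sketch on a made-up ID column; the name `per_campaign` is introduced here for illustration.

```python
import numpy as np

# Made-up campaign ID column, stored as strings like the real data.
ids = np.array(["916", "936", "1178", "936", "1178", "1178"])

unique_ids, counts = np.unique(ids, return_counts=True)

# Pair each ID with its count. String IDs sort lexicographically,
# which is why '1178' comes before '916'.
per_campaign = dict(zip(unique_ids.tolist(), counts.tolist()))
print(per_campaign)
```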


### What are the age groups that were targeted through these ad campaigns?

In [0]:
age_groups = np.unique(sales_data[:,3])
print("The age groups targeted are ", age_groups)

The age groups targeted are  ['30-34' '35-39' '40-44' '45-49']


So the people targeted are aged 30 to 49.

### What were the average, minimum and maximum amounts spent on the ads?

In [0]:
max_amt = sales_data[:,8].astype(float).max()
min_amt = sales_data[:,8].astype(float).min()
avg_amt = sales_data[:,8].astype(float).mean()

print('Minimum amt spent on ads was ', min_amt)
print('Maximum amt spent on ads was ', max_amt)
print('Average amt spent on ads was ', avg_amt)

Minimum amt spent on ads was  0.0
Maximum amt spent on ads was  639.9499981
Average amt spent on ads was  51.36065613141295


### What is the ID of the ad with the maximum number of clicks?

In [0]:
max_clicks = sales_data[:, 7].astype(int).max()
print('The maximum number of clicks were {}'.format(max_clicks))

The maximum number of clicks were 421


In [0]:
max_clicks_ad = sales_data[:,0][sales_data[:,7].astype(int) == max_clicks]
print('The advertisement with the maximum number of clicks was the one with id ', max_clicks_ad)

The advertisement with the maximum number of clicks was the one with id  ['1121814']
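
An alternative to the boolean mask is `np.argmax`, which returns the row position of the largest value directly. The sketch below uses a made-up two-column array (ID, clicks). One difference to keep in mind: `np.argmax` returns only the first matching row when several rows tie, whereas the mask returns all of them.

```python
import numpy as np

# Made-up data: column 0 is an ad ID, column 1 is the click count.
toy = np.array([["a1", "5"], ["a2", "42"], ["a3", "17"]])

clicks = toy[:, 1].astype(int)
row = np.argmax(clicks)      # position of the first maximum
top_ad = toy[row, 0]

print(top_ad, clicks[row])
```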


### How many people bought the product after seeing the ad with the most clicks? Is that the maximum number of purchases in this dataset?

In [0]:
max_purchases = sales_data[:,10].astype(int).max()
print ("Maximum Purchases that happened = {}".format(max_purchases))

In [0]:
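# Take the first matching row explicitly, so int() receives a scalar rather than a one-element array
max_sales = int(sales_data[:, 10][sales_data[:, 0] == max_clicks_ad][0])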
print('Number of people who bought the product having maximum ad clicks is ',max_sales)

In [0]:
if max_sales >= max_purchases:
    print("The maximum sales were on this product")
else:
    print('The maximum number of purchases was ', max_purchases)

### So the ad with the most clicks didn't fetch the maximum number of purchases. Let's find the details of the product having maximum number of purchases

In [0]:
max_purchase_record = sales_data[sales_data[:,10].astype(int) == max_purchases]

In [0]:
print('The record for this product is as shown below')
print (max_purchase_record)

Look at the impressions value for this record, which is the 7th value in the array (index 6): this ad had a very high number of __impressions__!

### Creating additional features

Let's add some features that represent standard advertising metrics.

###  Click Through Rate (CTR)
This is the percentage of impressions that resulted in clicks. A high CTR is often seen as a sign of good creative being presented to a relevant audience. A low click through rate suggests less-than-engaging adverts (design and / or messaging) and / or presentation of adverts to an inappropriate audience. What is seen as a good CTR will depend on the type of advert (website banner, Google Shopping ad, search network text ad etc.) and can vary across sectors, but 2% would be a reasonable benchmark.

### Create a new feature `Click Through Rate`  (CTR) and then concatenate it to the original numpy array 

$\text{CTR} = \frac{\text{Clicks}}{\text{Impressions}} \times 100$

In [0]:
CTR = np.array((sales_data[:,7].astype(float)/sales_data[:,6].astype(float))*100)
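
The division above assumes that every ad has at least one impression. If a dataset could contain rows with zero impressions, one hedged option is `np.divide` with its `where` and `out` arguments, which skips those rows and leaves a default value in place. The toy arrays and the choice of 0 as the default are assumptions made for this sketch.

```python
import numpy as np

# Made-up values; the second ad has no impressions at all.
clicks = np.array([1.0, 0.0, 4.0])
impressions = np.array([1000.0, 0.0, 200.0])

# Divide only where impressions are non-zero; other entries keep the 0.0 from `out`.
ratio = np.divide(
    clicks,
    impressions,
    out=np.zeros_like(clicks),
    where=impressions != 0,
)
ctr = ratio * 100
print(ctr)
```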

In [0]:
sales_data.shape

In [0]:
sales_data.ndim

In [0]:
CTR.shape

In [0]:
CTR.ndim

In [0]:
CTR = CTR.reshape(-1,1)

In [0]:
CTR.shape

In [0]:
CTR.ndim

In [0]:
sales_data = np.concatenate((sales_data, CTR), axis=1)

In [0]:
sales_data.shape
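
The reshape step is needed because `np.concatenate` along `axis=1` expects every input to be 2-D. As an alternative sketch, `np.column_stack` accepts a 1-D array and treats it as a column, so the explicit `reshape(-1, 1)` can be skipped. The toy arrays below are purely numeric and made up for illustration.

```python
import numpy as np

# Made-up table: column 0 is impressions, column 1 is clicks.
table = np.array([[1000.0, 10.0], [400.0, 2.0]])

# 1-D array with one value per row.
ctr = table[:, 1] / table[:, 0] * 100
print(ctr.shape, ctr.ndim)

# column_stack turns the 1-D array into a column before joining.
combined = np.column_stack((table, ctr))
print(combined.shape)
```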

### Create a new column that represents Cost Per Mille (CPM)
This number is the cost of one thousand impressions. If your objective is ad exposure to increase brand awareness, this might be an important KPI for you to measure.
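
Written in the same style as the CTR formula, and using the `Spent` and `Impressions` columns from the feature table, the definition is:

$\text{CPM} = \frac{\text{Spent}}{\text{Impressions}} \times 1000$

As a hypothetical worked example, an ad that cost 5.00 and was shown 2000 times has $\text{CPM} = \frac{5.00}{2000} \times 1000 = 2.50$.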

In [0]:
CPM = (sales_data[:,8].astype(float)/sales_data[:,6].astype(float))*1000

In [0]:
print (CPM.shape, CPM.ndim)

In [0]:
CPM = CPM.reshape(-1,1)

In [0]:
print (CPM.shape, CPM.ndim)

In [0]:
sales_data = np.concatenate((sales_data, CPM),axis=1)

In [0]:
sales_data.shape