In [4]:
library(testthat)
library(digest)
library(tidyverse)
library(repr)
library(gridExtra)

Our MDS marketing team designed four Facebook ads and decided to run an A/B test to answer the following questions:

**Questions:** 

- Which of the four ads will lead to the highest click-through rate (*i.e.*, move the visitor from Facebook to our MDS marketing website)?

- What ad features should be used in future ads?

**Methods:**

The four ads shown below (labelled A through D) were randomly served to Facebook visitors for four months. Click through rate was measured.

![](img/MDS-facebook-ABtest.png)

**Data:**

In [6]:
tibble(ad_variant = c("A", "B", "C", "D"),
      impressions = c(11921, 6747, 2778, 29701),
      clicks = c(152, 88, 22, 450),
      click_through_rate = c(0.0128, 0.013, 0.0079, 0.0152))

ad_variant,impressions,clicks,click_through_rate
<chr>,<dbl>,<dbl>,<dbl>
A,11921,152,0.0128
B,6747,88,0.013
C,2778,22,0.0079
D,29701,450,0.0152
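
The `click_through_rate` column above is typed in by hand. As a minimal sketch, it can instead be derived from the counts so that the rates always agree with the data. The name `ads` is an assumption introduced here for illustration and is not part of the original notebook.

```r
# Same counts as the table above; `ads` is a hypothetical name.
ads <- data.frame(
    ad_variant = c("A", "B", "C", "D"),
    impressions = c(11921, 6747, 2778, 29701),
    clicks = c(152, 88, 22, 450)
)

# Click-through rate = clicks / impressions, rounded to four decimals.
ads$click_through_rate <- round(ads$clicks / ads$impressions, 4)
ads
```

Rounded to four decimals, the derived rates should agree with the hand-entered column.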


| Click through rate | Ad variant | Video           | Call to action button | Text | 
|--------------------|------------|-----------------|-----------------------|------|
| 0.0128             | A          | Video of Male   | APPLY NOW             | "Be career ready in just 10 month..."|
| 0.0130             | B          | Video of Male   | APPLY NOW             | "Gain the skills for an in-demand career..." |
| 0.0079             | C          | Video of Female | APPLY NOW             | "Find out how the UBC Master of Data Science..." |
| 0.0152             | D          | Video of Female | LEARN MORE            | "Seize the opportunity to obtain a career..." |

> - We will fit a logistic regression to the data. The response variable is binary (click or no click), and the explanatory variable is the ad variant (A, B, C, or D).
> - Then we check the coefficients in the logistic regression output. Each coefficient compares one variant with the baseline, ad A; if its p-value is significant, we can reject the null hypothesis that the variant has the same click-through rate as the baseline.
> - If we find a significant p-value, we will follow up with pairwise proportion tests for each combination of ad variants, using a Bonferroni correction for multiple comparisons, to find which variants differ.
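
The plan above can be sketched directly on the aggregated counts, without expanding them to one row per impression. This is one possible approach under stated assumptions, not necessarily the one the analysis has to use: the names `ads`, `fit_agg`, `wald`, and `lrt` are hypothetical, and the likelihood-ratio test is added here as one way to test all four variants at once.

```r
ads <- data.frame(
    ad_variant = c("A", "B", "C", "D"),
    impressions = c(11921, 6747, 2778, 29701),
    clicks = c(152, 88, 22, 450)
)

# Binomial response given as (successes, failures); ad A is the baseline level.
fit_agg <- glm(cbind(clicks, impressions - clicks) ~ ad_variant,
               data = ads, family = binomial)

# Wald z-tests: each coefficient compares one variant with ad A.
wald <- coef(summary(fit_agg))
wald

# Likelihood-ratio test of the null hypothesis that all four variants
# share a single click-through rate.
lrt <- anova(fit_agg, test = "Chisq")
lrt
```

The `Pr(>|z|)` column of `wald` holds the per-coefficient p-values, and the `Pr(>Chi)` column of `lrt` holds the overall p-value. Fitting on grouped counts gives the same coefficients as fitting on the expanded 0/1 data.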

> - Ad variant D has the highest click-through rate of the four ad variants.
> - So future ads will use the same structure as ad D: a video of a female, a LEARN MORE button, and the ad text "Seize the opportunity to obtain a career..."

In [6]:
df <- tibble(ad_variant = c("A", "B", "C", "D"),
      impressions = c(11921, 6747, 2778, 29701),
      clicks = c(152, 88, 22, 450),
      click_through_rate = c(0.0128, 0.013, 0.0079, 0.0152))

# Expand the aggregated counts into one row per impression:
# x is the ad variant, y is 1 for a click and 0 otherwise.
temp1 <- c()
temp2 <- c()

for (i in 1:4){
    temp2 <- c(temp2, rep(1, df$clicks[i]), rep(0, df$impressions[i] - df$clicks[i]))
    temp1 <- c(temp1, rep(df$ad_variant[i], df$impressions[i]))
}

df1 <- tibble(x = temp1, y = temp2)

# Fit on the expanded data (df1, not df); the name `fit` avoids masking glm().
fit <- glm(y ~ x, data = df1, family = 'binomial')

fit


Call:  glm(formula = y ~ x, family = "binomial", data = df1)

Coefficients:
(Intercept)           xB           xC           xD  
   -4.34934      0.02296     -0.48115      0.17492  

Degrees of Freedom: 51146 Total (i.e. Null);  51143 Residual
Null Deviance:	    7501 
Residual Deviance: 7487 	AIC: 7495
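
The coefficients printed above are on the log-odds scale. As a quick check, here is a small sketch that converts them back to click probabilities with the inverse logit; they should reproduce the observed click-through rates. The names `intercept`, `offsets`, and `rates` are hypothetical, and the numbers are copied from the printed output.

```r
# Coefficients copied from the printed fit above (log-odds scale).
intercept <- -4.34934                      # ad A, the baseline
offsets <- c(A = 0, B = 0.02296, C = -0.48115, D = 0.17492)

# The inverse logit turns log-odds into click probabilities.
rates <- plogis(intercept + offsets)
round(rates, 4)

# Odds of a click for each variant relative to ad A.
round(exp(offsets), 3)
```

Note that the printed fit shows no p-values; those come from calling `summary()` on the fitted model.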

In [7]:
pairwise.prop.test(x = df$clicks, n = df$impressions)


	Pairwise comparisons using Pairwise comparison of proportions 

data:  df$clicks out of df$impressions 

  1     2     3    
2 0.918 -     -    
3 0.215 0.215 -    
4 0.215 0.430 0.018

P value adjustment method: holm 
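
The output above uses the Holm adjustment, which is the default for `pairwise.prop.test`, while the analysis plan names Bonferroni. The following sketch runs both for comparison. Naming the count vectors is an assumption made here so that the result is labelled A through D instead of 1 through 4; `holm` and `bonf` are hypothetical names.

```r
clicks <- c(A = 152, B = 88, C = 22, D = 450)
impressions <- c(A = 11921, B = 6747, C = 2778, D = 29701)

# Default adjustment (Holm), as in the output above.
holm <- pairwise.prop.test(x = clicks, n = impressions)

# Bonferroni adjustment, as named in the analysis plan.
bonf <- pairwise.prop.test(x = clicks, n = impressions,
                           p.adjust.method = "bonferroni")
bonf

# Holm-adjusted p-values are never larger than Bonferroni-adjusted ones.
all(holm$p.value <= bonf$p.value, na.rm = TRUE)
```

Bonferroni multiplies every raw p-value by the number of comparisons (six here), whereas Holm applies that full multiplier only to the smallest p-value and smaller multipliers to the rest, so the two methods agree on the smallest p-value.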

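> - I think ad D is the best of the four ads, which confirms what I found in 2.3. After adjustment, however, only the difference between ads D and C is significant at the 0.05 level (p = 0.018).
> - The p-values differ from those of the logistic regression approach. The reason is that `pairwise.prop.test` adjusts the p-values for multiple comparisons, using the Holm method by default, whereas the logistic regression p-values are unadjusted.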