# Analyze A/B Test Results
Facebook recently introduced a new bidding type, “**average bidding**”, as an alternative to its existing bidding type, “**maximum bidding**”. One of our clients, **bombabomba.com**, has decided to test this new feature and wants to run an **A/B test** to find out **whether average bidding brings more conversions than maximum bidding**.

In this A/B test, bombabomba.com randomly splits its audience into two equally sized groups: the test group and the control group. **A Facebook ad campaign with “maximum bidding” is served to the “control group”, and another campaign with “average bidding” is served to the “test group”.**

The A/B test has run for 1 month and bombabomba.com now expects you to analyze the results of this A/B test.

- Facebook Ad: An advertisement created by a business on Facebook that's served up to Facebook users.
- Impressions: The number of times an ad is displayed.
- Reach: The number of unique people who saw an ad.
- Website Clicks: The number of clicks on ad links directed to the advertiser’s website.
- Website Click Through Rate: Number of Website Clicks / Number of Impressions x 100
- Cost per Action: Spend / Number of Actions
- Action: Can be any conversion event, such as Search, View Content, Add to Cart and Purchase.
- Conversion Rate: Number of Actions / Number of Website Clicks x 100
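
The rate formulas above can be sketched as small Python helpers. The function names are hypothetical (they belong neither to this notebook nor to any library). The example feeds in the control-group mean values that `describe()` reports further down in this notebook; because both columns have the same number of rows, the ratio of means equals the ratio of sums used in `analyze_df`.

```python
def click_through_rate(clicks, impressions):
    """Website Click Through Rate in percent: clicks / impressions x 100."""
    return clicks / impressions * 100


def conversion_rate(actions, clicks):
    """Conversion Rate in percent: actions / clicks x 100."""
    return actions / clicks * 100


def cost_per_action(spend, actions):
    """Cost per Action: spend / number of actions."""
    return spend / actions


# Control-group means of Click, Impression and Purchase from the describe() output
control_ctr = click_through_rate(5100.657373, 101711.449068)
control_cr = conversion_rate(550.894059, 5100.657373)

print("Control CTR: %.4f %%" % control_ctr)
print("Control conversion rate: %.2f %%" % control_cr)

# Spend is not part of this dataset, so these two numbers are purely illustrative
print("Cost per action: %.2f" % cost_per_action(100.0, 4))
```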

The ultimate success metric for bombabomba.com is the number of purchases. Therefore, we should focus on the Purchase metric for statistical testing.


## How would we define the hypothesis of this A/B test?

- **H0** : There is no statistically significant difference in the mean number of purchases between the control group, which was served the “maximum bidding” campaign, and the test group, which was served the “average bidding” campaign.

- **H1** : There is a statistically significant difference in the mean number of purchases between the control group, which was served the “maximum bidding” campaign, and the test group, which was served the “average bidding” campaign.

In [3]:
# loading necessary libraries
import pandas as pd
from scipy.stats import shapiro
import scipy.stats as stats

In [4]:
# reading the data
Control_Group = pd.read_excel("ab_testing_data.xlsx", sheet_name='Control Group')  # maximum bidding
Test_Group = pd.read_excel("ab_testing_data.xlsx", sheet_name='Test Group')        # average bidding

In [5]:
# assigning a name to each dataframe (stored as a plain Python attribute)
Control_Group.name = "Control_Group"
Test_Group.name = "Test_Group"

In [6]:
# Analyze a dataframe: name, shape, head, null counts, click-through rate, summary statistics
def analyze_df(df):
    print("Dataframe Name : %s" % df.name,"\n") # dataframe name
    print("Shape of dataframe: {0}".format(df.shape), "\n") # shape of dataframe
    print("There are {0} observations and {1} features".format(len(df),len(df.columns)),"\n") # number of observations and features
    print(df.head(),"\n") # first 5 observations
    for col in df.columns:
        print(" Number of null value in the {0} column: {1}".format(col,df[col].isnull().sum())) # number of null values in each column
    print("\n","Website Click Through Rate: %.4f" % (df["Click"].sum()/df["Impression"].sum()*100),"%","\n") # website click-through rate in percent
    print(df.describe().T) # summary statistics, useful for spotting outliers

In [7]:
# analyze Control Group
analyze_df(Control_Group)

Dataframe Name : Control_Group 

Shape of dataframe: (40, 4) 

There are 40 observations and 4 features 

      Impression        Click    Purchase      Earning
0   82529.459271  6090.077317  665.211255  2311.277143
1   98050.451926  3382.861786  315.084895  1742.806855
2   82696.023549  4167.965750  458.083738  1797.827447
3  109914.400398  4910.882240  487.090773  1696.229178
4  108457.762630  5987.655811  441.034050  1543.720179 

 Number of null value in the Impression column: 0
 Number of null value in the Click column: 0
 Number of null value in the Purchase column: 0
 Number of null value in the Earning column: 0

 Website Click Through Rate: 5.0148 % 

            count           mean           std           min           25%  \
Impression   40.0  101711.449068  20302.157862  45475.942965  85726.690349   
Click        40.0    5100.657373   1329.985498   2189.753157   4124.304129   
Purchase     40.0     550.894059    134.108201    267.028943    470.095533   
Earning      40.0  

In [8]:
# analyze Test Group
analyze_df(Test_Group)

Dataframe Name : Test_Group 

Shape of dataframe: (40, 4) 

There are 40 observations and 4 features 

      Impression        Click    Purchase      Earning
0  120103.503796  3216.547958  702.160346  1939.611243
1  134775.943363  3635.082422  834.054286  2929.405820
2  107806.620788  3057.143560  422.934258  2526.244877
3  116445.275526  4650.473911  429.033535  2281.428574
4  145082.516838  5201.387724  749.860442  2781.697521 

 Number of null value in the Impression column: 0
 Number of null value in the Click column: 0
 Number of null value in the Purchase column: 0
 Number of null value in the Earning column: 0

 Website Click Through Rate: 3.2922 % 

            count           mean           std           min            25%  \
Impression   40.0  120512.411758  18807.448712  79033.834921  112691.970770   
Click        40.0    3967.549761    923.095073   1836.629861    3376.819024   
Purchase     40.0     582.106097    161.152513    311.629515     444.626828   
Earning      40.0 

In [9]:
# The ultimate success metric for bombabomba.com is Number of Purchases.
# Therefore, we should focus on Purchase metrics for statistical testing.
A = pd.DataFrame(Control_Group["Purchase"])
B = pd.DataFrame(Test_Group["Purchase"])

In [10]:
# Combine the test and control group
AB = pd.concat([A,B],axis=1)
AB.columns=["A","B"]
AB.head()

            A           B
0  665.211255  702.160346
1  315.084895  834.054286
2  458.083738  422.934258
3  487.090773  429.033535
4  441.034050  749.860442


In [11]:
print(" Mean of purchase of control group: %.3f"%AB.A.mean(),"\n",
      "Mean of purchase of test group: %.3f"%AB.B.mean())

 Mean of purchase of control group: 550.894 
 Mean of purchase of test group: 582.106


Looking at the mean number of purchases in the two groups, the test group (the average bidding campaign) appears to perform better. However, this observed difference alone does not tell us whether the result is statistically significant.
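
To put the gap between the two means in perspective, the absolute difference and the relative lift can be computed directly from the values printed above. This is only descriptive arithmetic (the variable names are illustrative) and says nothing yet about significance.

```python
control_mean = 550.894  # mean purchases, maximum bidding (control)
test_mean = 582.106     # mean purchases, average bidding (test)

# Absolute and relative difference of the test group versus the control group
absolute_diff = test_mean - control_mean
relative_lift = absolute_diff / control_mean * 100

print("Absolute difference: %.3f purchases" % absolute_diff)
print("Relative lift: %.2f %%" % relative_lift)
```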

# Can we conclude statistically significant results?
## Independent Two-Sample T-Test

The Independent Samples t Test compares the means of two independent groups in order to determine whether there is statistical evidence that the associated population means are significantly different.
## Requirements
- **Normal distribution**: Non-normal population distributions, especially those that are thick-tailed or heavily skewed, considerably reduce the power of the test.
- **Homogeneity of variances**: When this assumption is violated and the sample sizes for each group differ, the p-value is not trustworthy.

## Hypotheses
The **null hypothesis (H0)** and **alternative hypothesis (H1)** of the Independent Samples t Test can be expressed as follows:
- H0: µ1 = µ2 (the two population means are equal)
- H1: µ1 ≠ µ2 (the two population means are not equal)
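
For reference, the equal-variance form of the test statistic can be written as follows, where $\bar{x}_i$, $s_i$ and $n_i$ denote the sample mean, sample standard deviation and size of group $i$ (this notation is introduced here for illustration and is not used elsewhere in the notebook):

$$
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}
$$

Under H0, and provided the two requirements above hold, $t$ follows a Student's t distribution with $n_1 + n_2 - 2$ degrees of freedom.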

### The Shapiro-Wilk Test for Normality

- H0: There is no statistically significant difference between the sample distribution and the theoretical normal distribution.
- H1: There is a statistically significant difference between the sample distribution and the theoretical normal distribution.

The test rejects the hypothesis of normality when the p-value is less than 0.05. Rejecting H0 means that, at the 5% significance level, the data show evidence of departing from a normal distribution. Failing to reject H0 does not prove normality; it only means that no significant departure was detected.

* p-value < 0.05 (H0 rejected)
* p-value ≥ 0.05 (H0 not rejected)

In [12]:
# Shapiro-Wilk test for Group A
test_statistic, pvalue = shapiro(AB.A)
print('Test statistic = %.4f, p-Value = %.4f' % (test_statistic, pvalue))

Test statistic = 0.9773, p-Value = 0.5891


In [13]:
# p-value is greater than 0.05, so H0 is not rejected for Group A.
pvalue < 0.05

False

In [14]:
# Shapiro-Wilk test for Group B
test_statistic, pvalue = shapiro(AB.B)
print('Test statistic = %.4f, p-Value = %.4f' % (test_statistic, pvalue))

Test statistic = 0.9589, p-Value = 0.1541


In [15]:
# p-value is greater than 0.05, so H0 is not rejected for Group B.
pvalue < 0.05

False

There is **no statistically significant difference** between the sample distribution and the theoretical normal distribution in either group A or group B, so the normality assumption is not rejected.

### Levene’s Test for Homogeneity of Variances
Levene’s test is an equal-variance test. It can be used to check whether our data sets fulfill the homogeneity of variance assumption before we perform a t-test or an analysis of variance (ANOVA).

- H0: the compared groups have equal variance.
- H1: the compared groups do not have equal variance.

In [16]:
test_statistic, pvalue = stats.levene(AB.A, AB.B)
print('Test statistic = %.4f, p-Value = %.4f' % (test_statistic, pvalue))

Test statistic = 2.6393, p-Value = 0.1083


In [17]:
# p-value is greater than 0.05, so H0 is not rejected.
pvalue < 0.05

False

H0 is not rejected, so we have no evidence that the variances of the compared groups differ.

The assumptions of normality and homogeneity of variances were both tested. Since neither assumption was rejected, we can now test our main hypothesis.
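
The decision path taken so far can be summarized as a small helper. This is a hedged sketch: `choose_two_sample_test` is a hypothetical function, and the fallbacks it names (Welch's t-test when variances differ, the Mann-Whitney U test when normality is rejected) are common conventions that this analysis did not need to use.

```python
def choose_two_sample_test(p_normal_a, p_normal_b, p_levene, alpha=0.05):
    """Pick a two-sample test from the assumption-check p-values.

    Hypothetical helper: the fallback choices are conventions, not rules
    taken from this notebook.
    """
    # Normality rejected in at least one group: use a non-parametric test
    if p_normal_a < alpha or p_normal_b < alpha:
        return "mann_whitney_u"
    # Normality holds but equal variances rejected: use Welch's t-test
    if p_levene < alpha:
        return "welch_t"
    # Neither assumption rejected: classic independent two-sample t-test
    return "student_t"


# p-values reported above: Shapiro-Wilk for A and B, then Levene
print(choose_two_sample_test(0.5891, 0.1541, 0.1083))  # prints "student_t"
```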

- **H0** : There is no statistically significant difference in the mean number of purchases between the control group, which was served the “maximum bidding” campaign, and the test group, which was served the “average bidding” campaign.

- **H1** : There is a statistically significant difference in the mean number of purchases between the control group, which was served the “maximum bidding” campaign, and the test group, which was served the “average bidding” campaign.

In [18]:
tvalue, pvalue = stats.ttest_ind(AB["A"], AB["B"], equal_var=True)
print('tvalue = %.4f, pvalue = %.4f' % (tvalue, pvalue))

tvalue = -0.9416, pvalue = 0.3493


In [19]:
# p-value is greater than 0.05, so H0 is not rejected.
pvalue < 0.05

False

The p-value is greater than 0.05, so H0 is not rejected. There is therefore no statistically significant difference in the mean number of purchases between the control group that was served the “maximum bidding” campaign and the test group that was served the “average bidding” campaign.
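
As a cross-check, the same result can be reproduced from the summary statistics alone. The sketch below assumes the Purchase means and standard deviations shown in the `describe()` output above and uses `scipy.stats.ttest_ind_from_stats`. The confidence interval and Cohen's d are additions for illustration and were not part of the original analysis.

```python
import math
from scipy import stats

# Purchase summary statistics copied from the describe() output above
n_a, mean_a, std_a = 40, 550.894059, 134.108201  # control, maximum bidding
n_b, mean_b, std_b = 40, 582.106097, 161.152513  # test, average bidding

# Equal-variance t-test computed from summary statistics only
t_stat, p_value = stats.ttest_ind_from_stats(
    mean_a, std_a, n_a, mean_b, std_b, n_b, equal_var=True
)

# Pooled variance and standard error of the difference in means
df = n_a + n_b - 2
pooled_var = ((n_a - 1) * std_a**2 + (n_b - 1) * std_b**2) / df
std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

# 95% confidence interval for (test mean - control mean)
diff = mean_b - mean_a
t_crit = stats.t.ppf(0.975, df)
ci_low, ci_high = diff - t_crit * std_err, diff + t_crit * std_err

# Cohen's d as a standardized effect size
cohens_d = diff / math.sqrt(pooled_var)

print("tvalue = %.4f, pvalue = %.4f" % (t_stat, p_value))
print("95%% CI for the difference: (%.2f, %.2f)" % (ci_low, ci_high))
print("Cohen's d = %.3f" % cohens_d)
```

If the interval contains zero, that agrees with the decision not to reject H0 at the 5% level.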

## Which statistical test did we use, and why?
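We used the independent two-sample t-test because we want to determine whether there is a significant difference between the means of two independent groups that may be similar in certain features. The test was applicable because neither the normality assumption nor the homogeneity of variances assumption was rejected.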

## What would be our recommendation to client?
There is no statistically significant difference in the mean number of purchases between the control group that was served the “maximum bidding” campaign and the test group that was served the “average bidding” campaign. For this reason, we recommend continuing with the maximum bidding campaign that is currently in use.


# Conclusion
- The hypotheses were established and interpreted
- The data was analyzed and checked for outliers
- The assumptions required by the statistical test were identified
- The assumptions were tested (Shapiro-Wilk and Levene’s tests)
- The result was interpreted based on the p-value
- A recommendation was offered to the client