## Introduction

#### Experiment Approach
- **Null Hypothesis** Hₒ: p = pₒ          "There is no significant difference between the ad success rates of the two groups"
- **Alternative Hypothesis** Hₐ: p ≠ pₒ   "There is a significant difference between the ad success rates of the two groups"
    - Since we don’t know whether the new design will perform better than, worse than, or the same as our current design, we will perform a two-tailed test
    - **Confidence Level**: 95% (α=0.05)
    - p and pₒ stand for the conversion rates of the new and old designs, respectively
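
To make the decision rule concrete, here is a minimal, library-free sketch of a two-sided two-proportion z-test with a pooled standard error. The function name `two_proportion_ztest` and the counts in the usage lines are illustrative assumptions, not values from the dataset; the notebook itself relies on `proportions_ztest` from statsmodels further down.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for H0: p_a == p_b, using the pooled proportion."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Under H0 both groups share one rate, so the standard error uses the pooled estimate
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-tailed p-value: probability mass in both tails beyond |z|
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts, for illustration only
z, p_value = two_proportion_ztest(45, 100, 50, 100)
reject_null = p_value < 0.05  # alpha = 0.05
```

With a two-tailed test the null hypothesis is rejected only when `p_value` falls below α, regardless of the sign of `z`.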
    
#### Business Objectives
- The company expects the new ad strategy to increase ad success from 45% to 50% (+5pp / +11%).
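
The two ways of expressing the expected uplift (percentage points versus relative change) can be checked with a few lines of arithmetic. The variable names below are illustrative.

```python
baseline_rate = 0.45  # current ad success rate
target_rate = 0.50    # expected rate with the new ad strategy

# Absolute lift, in percentage points (pp)
absolute_lift_pp = (target_rate - baseline_rate) * 100
# Relative lift, as a percentage of the baseline rate
relative_lift_pct = (target_rate - baseline_rate) / baseline_rate * 100

print(f'Absolute lift: {absolute_lift_pp:.0f}pp')
print(f'Relative lift: {relative_lift_pct:.1f}%')
```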

#### Columns Description
- **auction_id**: the unique id of the online user who has been presented with the BIO. In standard terminology this is called an impression id. The user may see the BIO questionnaire but choose not to respond. In that case both the yes and no columns are zero.
- **experiment**: which group the user belongs to - control or exposed.
    - ***control***: users who have been shown a dummy ad
    - ***exposed***: users who have been shown a creative, an online interactive ad, with the SmartAd brand.


- **date**: the date in YYYY-MM-DD format
- **hour**: the hour of the day in HH format.
- **device_make**: the name of the type of device the user has, e.g. Samsung
- **platform_os**: the id of the OS the user has.
- **browser**: the name of the browser the user uses to see the BIO questionnaire.
- **yes**: 1 if the user chooses the “Yes” radio button for the BIO questionnaire.
- **no**: 1 if the user chooses the “No” radio button for the BIO questionnaire.

## Package & Data Imports

In [None]:
# Programming
import pandas as pd
import numpy as np

# Visualization
import matplotlib.pyplot as plt 
import seaborn as sns

# Statistics
import statsmodels.stats.api as sms
from statsmodels.stats.proportion import proportions_ztest, proportion_confint

In [None]:
# Data Import
df = pd.read_csv('/kaggle/input/ad-ab-testing/AdSmartABdata - AdSmartABdata.csv', parse_dates=['date'])
print(df.shape)
df.head()

## Data Preprocessing

In order to preprocess the data we will proceed with the following steps:
- **Converting 'auction_id' to df index** This will allow us to index for specific observations if necessary.
- **Check for null values** to validate that we work with observations that have actionable data.
- **Remove non-answer observations** (both 'yes' and 'no' columns are equal to 0). This may remove a significant percentage of the observations, but non-answers are not useful for our analysis, as we cannot infer whether the ad was successful or not. We will also unify the 'yes' and 'no' columns into a single column where 1 == ad_success and 0 == ad_failure.
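
The unification rule from the last step can be sketched as a small pure function. The name `encode_response` and the sample pairs are hypothetical and only illustrate the mapping; the notebook applies the same logic directly to the DataFrame columns.

```python
from typing import Optional


def encode_response(yes: int, no: int) -> Optional[int]:
    """Map a ('yes', 'no') pair to 1 (ad_success), 0 (ad_failure) or None (no usable answer)."""
    if yes == 1 and no == 0:
        return 1
    if yes == 0 and no == 1:
        return 0
    # Covers non-answers (0, 0) and any inconsistent pair such as (1, 1)
    return None


# Hypothetical (yes, no) pairs, for illustration only
rows = [(1, 0), (0, 0), (0, 1), (0, 0)]
encoded = [encode_response(yes, no) for yes, no in rows]
answered = [value for value in encoded if value is not None]
share_removed = 1 - len(answered) / len(rows)
```

Tracking `share_removed` makes the cost of dropping non-answers explicit before any test is run.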

In [None]:
# Making the auction_id feature our index
df = df.set_index('auction_id')

In [None]:
# Checking for nulls
total_nulls = df.isnull().sum()
print('Null values:')
print(total_nulls)
print('')

# Removing non-answers
ad_success =[]
for x, y in zip(df.yes, df.no):
    if (x == 1) and (y == 0):
        ad_success.append(1)
    elif (x == 0) and (y == 1):
        ad_success.append(0)
    else:
        ad_success.append('no_response')
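df['ad_success'] = ad_success
df = df.loc[~df.ad_success.isin(['no_response'])].copy()
# Casting to int: the temporary 'no_response' marker leaves the column with object dtype
df['ad_success'] = df.ad_success.astype(int)
df = df.drop(['yes', 'no'], axis = 1)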

# Check and codify categorical values
    # Checked the values within each column and coded them based on the most common values
    # device_make_list = df.device_make.unique()
    # browser_list = df.browser.unique()
device_list_codified = []
for x in df.device_make:
    if 'Samsung' in x:
        device_list_codified.append(1)
    elif 'iPhone' in x:
        device_list_codified.append(2)
    else:
        device_list_codified.append(0)
df.device_make = device_list_codified

browser_list_codified = []
for x in df.browser:
    if 'Chrome' in x:
        browser_list_codified.append(1)
    elif 'Safari' in x:
        browser_list_codified.append(2)
    else:
        browser_list_codified.append(0)
df.browser = browser_list_codified

del ad_success, device_list_codified, browser_list_codified

# Printing processed df
print('# Observations: {}'.format(df.shape[0]))
print('# Variables: {}'.format(df.shape[1]))
df.head()

## EDA
We will perform some checks to validate that both groups are representative of the total population, so that differences in ad success are unlikely to be caused by other factors.
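
As a numeric complement to the plots, one simple way to quantify how similar the two groups are is to compare the share of each category per group. The sketch below uses only the standard library; the helper names `category_shares` and `max_share_gap` and the sample labels are hypothetical.

```python
from collections import Counter


def category_shares(values):
    """Return the share of each category in a list of labels."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


def max_share_gap(control_values, exposed_values):
    """Largest absolute difference in category share between the two groups."""
    control = category_shares(control_values)
    exposed = category_shares(exposed_values)
    categories = set(control) | set(exposed)
    return max(abs(control.get(c, 0.0) - exposed.get(c, 0.0)) for c in categories)


# Hypothetical browser labels, for illustration only
control_browsers = ['Chrome', 'Chrome', 'Safari', 'Other']
exposed_browsers = ['Chrome', 'Safari', 'Safari', 'Other']
gap = max_share_gap(control_browsers, exposed_browsers)
```

A gap close to zero for a variable suggests that the split between groups is balanced on that variable.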

In [None]:
# Sample sizes
sns.set(rc={'figure.figsize':(10,5)})
sns.countplot(x='experiment', data=df)
plt.title('Count of Observations per Group')
plt.show()
plt.close()

In [None]:
# Distributions for categorical variables
sns.set(rc={'figure.figsize':(30,10)})
fig, ax = plt.subplots(1,5)
sns.kdeplot(x='device_make', hue='experiment', data=df, ax=ax[0])
sns.kdeplot(x='browser', hue='experiment', data=df, ax=ax[1])
sns.kdeplot(x='platform_os', hue='experiment', data=df, ax=ax[2])
sns.kdeplot(x='hour', hue='experiment', data=df, ax=ax[3])
sns.kdeplot(x='date', hue='experiment', data=df, ax=ax[4])
plt.xticks(rotation=90)
plt.show()
plt.close()

In [None]:
# Creating dfs for each group
df_control = df[df.experiment =='control']
df_exposed = df[df.experiment =='exposed']

# Computing mean (as success == 1 and failure == 0 the mean is effectively our success rate)
mean_success_control = df_control.ad_success.mean()
mean_success_exposed = df_exposed.ad_success.mean()

# Printing results
print('Ad Success Control group {}%'.format((mean_success_control*100).round(2)))
print('Ad Success Exposed group {}%'.format((mean_success_exposed*100).round(2)))

#### EDA Conclusions
- **The control and exposed groups properly represent the population** This can be seen in the fact that 'device_make', 'browser' and 'platform_os' present almost equal distributions for both groups. This is a good sign, as it shows that the population was partitioned between the control and exposed groups in a way that leaves both groups representative of the total population.
- **Hour and Date differences** We can see that the hour and date distributions do present significant differences, which can probably be explained by when the ad was implemented (we are missing info on the experiment implementation here).
- **Ad success is higher in the exposed group by 1.83pp (about 4% in relative terms)**

## Statistical Significance

In [None]:
# Computing the difference (improvement) we want to obtain. From 45% 'ad_success' to 50%
effect_size = sms.proportion_effectsize(0.45, 0.50)

# Computing the needed sample size (per group) to ensure that we capture significant differences
    # Effect_size = The difference you want to observe
    # Power = The probability that we will capture an existing difference. 0.8 is standard practice
    # Alpha = alpha value for your desired statistical significance
required_n = np.ceil(sms.NormalIndPower().solve_power(effect_size, power=0.8, alpha=0.05, ratio=1))

# Printing required sample
print(f'Number of observations needed by group: {int(required_n)}')
print(f'Number of total observations on dataset: {df.shape[0]}')
print('')

# Counting successes on each group
ad_succes_count = [df_control.ad_success.sum(), df_exposed.ad_success.sum()]

# Counting observations on each group
obs_count = [df_control.ad_success.count(), df_exposed.ad_success.count()]

# Computing the p-value with a two-sample z-test on the ad_success proportions
z_stat, pval = proportions_ztest(ad_succes_count, nobs=obs_count)

# Computing 95% confidence intervals
(l_ci_con, l_ci_exp), (u_ci_con, u_ci_exp) = proportion_confint(ad_succes_count, nobs=obs_count, alpha=0.05)

# Printing results
print('The p-value of ad_success is {}'.format(pval.round(4)))
print(f'The 95% CI for ad_success on the control group is [{l_ci_con.round(4)}, {u_ci_con.round(4)}]')
print(f'The 95% CI for ad_success on the exposed group is [{l_ci_exp.round(4)}, {u_ci_exp.round(4)}]')

## Conclusion
#### Statistical Conclusion
- The p-value obtained for 'ad_success' (0.5185) is well above the chosen significance level (α=0.05, i.e. a 95% confidence level), and thus we ***cannot reject the 'Null Hypothesis' (Hₒ: p = pₒ)***, which states that there is no significant difference between the two groups.
- Once we remove the observations with no answer (both 'yes' and 'no' columns == 0) from the dataset, we are left with only 1243 observations (-6834 obs / -84.61%). This substantial loss of data leaves us with too few observations to be confident that a difference of the expected size would be detected.
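
The sample-size concern can be illustrated with a closed-form approximation based on Cohen's h, the effect size used for two proportions. This is a sketch under stated assumptions: the helper names are hypothetical, the formula assumes equal group sizes and a two-sided test, and its result may differ slightly from what `NormalIndPower().solve_power` returns, since that routine solves for the sample size numerically.

```python
from math import asin, ceil, sqrt
from statistics import NormalDist


def cohens_h(p1, p2):
    """Cohen's effect size h for two proportions (arcsine transformation)."""
    return 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))


def required_n_per_group(p1, p2, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a two-sided test with equal group sizes."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    h = cohens_h(p1, p2)
    return ceil(2 * (z_alpha + z_power) ** 2 / h ** 2)


n_per_group = required_n_per_group(0.45, 0.50)
observations_left = 1243  # responses remaining after removing non-answers (figure quoted above)
enough_data = observations_left >= 2 * n_per_group
```

Comparing `observations_left` with `2 * n_per_group` shows whether the filtered dataset is large enough to detect the targeted uplift with the chosen power.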

#### Business Conclusion
- The fact that the observed increase in 'ad_success' is not significant means that we cannot rule out that the apparently better result was just due to chance. This indicates that **the data provide no evidence that the differences between the 'dummy ad' shown to the 'control' group and the 'creative ad' shown to the 'exposed' group translate into better ad performance**. These findings indicate that there is no solid business reason to push the implementation of the new ad design over the old one, as the data show no extra benefit.
- **The target of 50% has not been achieved with the new ad**. Ad success has not reached the goal we had set. The 50% target lies only at the very upper end of the 95% CI for the new ad ([0.4306, 0.507]), so it would be met only in the most optimistic scenario. Considering that the 95% CI for the existing ad ([0.4102, 0.4908]) is almost identical, we see no reason to implement the new ad strategy.
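
The interval reasoning above can be reproduced with a short sketch. `wald_confint` implements the normal-approximation interval, which corresponds to the default `method='normal'` of `proportion_confint`; the helper names and the counts passed to `wald_confint` are hypothetical, while the two intervals are the ones reported above.

```python
from math import sqrt
from statistics import NormalDist


def wald_confint(successes, n, alpha=0.05):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


def contains(interval, value):
    """True when value lies inside the closed interval (lower, upper)."""
    lower, upper = interval
    return lower <= value <= upper


# Hypothetical counts, for illustration only
example_ci = wald_confint(50, 100)

# Intervals reported above
ci_control = (0.4102, 0.4908)
ci_exposed = (0.4306, 0.5070)
target = 0.50

target_in_control = contains(ci_control, target)
target_in_exposed = contains(ci_exposed, target)
intervals_overlap = ci_control[0] <= ci_exposed[1] and ci_exposed[0] <= ci_control[1]
```

Here the target falls inside the exposed interval only near its upper bound, and the two intervals overlap for most of their range, which is consistent with the non-significant test result.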