# A/B Testing Analysis

Goal: validate the hypothesis formulated for the experiment.

Flow:
1. Read the dataset for both variant A and variant B (the treatment) collected during the experiment
2. Check normality for both variants
3. Check the homogeneity of variance (if the data is normally distributed)
4. Run the hypothesis test and validate the hypothesis
5. Calculate the uplift of the metric (if variant B is statistically better)
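
The flow above can be sketched as a single decision helper. This is a minimal sketch, not the notebook's own code: `choose_and_run_test` is a hypothetical name, the demo data is synthetic, and the Welch and Mann-Whitney U fallbacks are assumptions added for the cases the flow does not spell out (unequal variances, non-normal data).

```python
import numpy as np
from scipy import stats


def choose_and_run_test(sample_a, sample_b, alpha=0.05):
    """Hypothetical helper: pick a two-sample test based on the pre-checks."""
    # Step 2: Shapiro-Wilk normality check on each variant
    normal_a = stats.shapiro(sample_a).pvalue > alpha
    normal_b = stats.shapiro(sample_b).pvalue > alpha

    if normal_a and normal_b:
        # Step 3: Levene's test for equal variances
        equal_var = bool(stats.levene(sample_a, sample_b).pvalue > alpha)
        # Step 4: Student's t-test if variances look equal, Welch's t-test otherwise
        result = stats.ttest_ind(sample_a, sample_b, equal_var=equal_var)
        test_name = "student_t" if equal_var else "welch_t"
    else:
        # Assumption: non-parametric fallback, not part of the flow listed above
        result = stats.mannwhitneyu(sample_a, sample_b, alternative="two-sided")
        test_name = "mann_whitney_u"

    return test_name, result.pvalue, bool(result.pvalue < alpha)


# Synthetic demo data (illustrative only, not the experiment's data)
rng = np.random.default_rng(0)
demo_a = rng.normal(loc=400, scale=40, size=250)
demo_b = rng.normal(loc=480, scale=40, size=250)

test_name, p_value, reject_null = choose_and_run_test(demo_a, demo_b)
print(test_name, p_value, reject_null)
```

Keeping the pre-checks and the final test in one function makes the branching explicit, so the choice of test is driven by the data rather than fixed in advance.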

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats

## Data Preparation

In [None]:
dataset = pd.read_csv("/content/drive/MyDrive/Ari Folders/Data_Ari/Data_Science/Speaker/Apiary/Astra International/Astra International x Apiary/ab_app_subscription.csv")

# Convert cohort_date from string to datetime
dataset['cohort_date'] = pd.to_datetime(dataset['cohort_date'])

display(dataset.info())
dataset.head()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 505 entries, 0 to 504
Data columns (total 14 columns):
 #   Column                    Non-Null Count  Dtype         
---  ------                    --------------  -----         
 0   period                    505 non-null    object        
 1   cohort_date               505 non-null    datetime64[ns]
 2   variant                   505 non-null    object        
 3   user_id                   505 non-null    object        
 4   country                   505 non-null    object        
 5   platform                  505 non-null    object        
 6   traffic_source            505 non-null    object        
 7   device_model              505 non-null    object        
 8   age                       505 non-null    int64         
 9   sessions_7d               505 non-null    int64         
 10  time_to_first_action_min  505 non-null    float64       
 11  trial_start               505 non-null    int64         
 12  paid_subscriber       

None

Unnamed: 0,period,cohort_date,variant,user_id,country,platform,traffic_source,device_model,age,sessions_7d,time_to_first_action_min,trial_start,paid_subscriber,revenue_30d
0,during,2025-07-05,B,APP201334,ID,android,organic,low_end,54,4,6.1,0,0,444.616527
1,during,2025-07-04,A,APP201596,ID,android,organic,low_end,24,2,6.5,0,0,424.835708
2,during,2025-07-06,A,APP200545,TH,android,organic,mid_range,41,4,7.3,0,0,393.086785
3,during,2025-07-07,B,APP203155,MY,android,organic,mid_range,40,1,2.3,0,0,502.190971
4,during,2025-07-04,A,APP202354,ID,android,paid,mid_range,49,6,2.4,1,0,432.384427


In [None]:
# Keep only rows from the "during" period of the experiment
dataset_during = dataset[dataset['period'] == 'during'].copy()
dataset_during.shape

(505, 14)

In [None]:
dataset_during['variant'].value_counts()

variant,count
A,257
B,248


## Hypothesis Testing

### Check normality
- Null Hypothesis (Ho): The data is normally distributed.
- Alternative Hypothesis (Ha): The data is not normally distributed.

Interpretation: A p-value greater than the significance level (commonly \(0.05\)) means we fail to reject the null hypothesis, so the data is consistent with a normal distribution. A p-value at or below that level indicates a departure from normality.

In [22]:
# Select revenue per variant from the filtered "during" data
revenue_a = dataset_during[dataset_during['variant'] == 'A']['revenue_30d'].values
revenue_b = dataset_during[dataset_during['variant'] == 'B']['revenue_30d'].values

In [24]:
# Check normality using shapiro
stat_A, p_A = stats.shapiro(revenue_a)
stat_B, p_B = stats.shapiro(revenue_b)

print(f"Shapiro-Wilk test for variant A: Statistic={stat_A:.4f}, p-value={p_A:.4f}")
print(f"Shapiro-Wilk test for variant B: Statistic={stat_B:.4f}, p-value={p_B:.4f}")

# Interpretation
alpha = 0.05
print("Variant A")
if p_A > alpha:
    print("Data appears normal")
else:
    print("Data does not appear normal (reject Ho)")

print("")
print("Variant B")
if p_B > alpha:
    print("Data appears normal")
else:
    print("Data does not appear normal (reject Ho)")

Shapiro-Wilk test for variant A: Statistic=0.9919, p-value=0.1668
Shapiro-Wilk test for variant B: Statistic=0.9967, p-value=0.8805
Variant A
Data appears normal

Variant B
Data appears normal
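
Shapiro-Wilk gives a single p-value, so a Q-Q check can complement it. The sketch below uses `scipy.stats.probplot` on synthetic data (an assumption for illustration; substitute `revenue_a` or `revenue_b` in the notebook) and reports the correlation between the ordered sample and the theoretical normal quantiles. Values close to 1 are consistent with normality.

```python
import numpy as np
from scipy import stats

# Synthetic sample for illustration only (not the experiment's data)
rng = np.random.default_rng(42)
sample = rng.normal(loc=400, scale=40, size=250)

# probplot returns the Q-Q points and a least-squares fit through them
(theoretical_q, ordered_values), (slope, intercept, r) = stats.probplot(sample, dist="norm")

print(f"Q-Q correlation r = {r:.4f}")
print(f"Fitted slope (approx. std) = {slope:.2f}, intercept (approx. mean) = {intercept:.2f}")
```

Passing `plot=plt` to `stats.probplot` (with `matplotlib.pyplot` imported as `plt`) also draws the Q-Q plot, which makes any tail deviation easier to see than a single number.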


### Homogeneity of Variance (if the normality check shows the data is normally distributed)
- Null Hypothesis (Ho): The variances of the groups are equal.
- Alternative Hypothesis (Ha): At least one group's variance is different.

Interpretation: If the p-value is less than the significance level (commonly \(0.05\)), you reject the null hypothesis and conclude that the variances are not equal.

In [25]:
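# Homogeneity of variance test (Levene), comparing variant A against variant B
stat, p_value = stats.levene(revenue_a, revenue_b)

print("Levene's Test (Homogeneity of Variance)")
print(f"Statistic: {stat:.4f}, P-value: {p_value:.4f}")

# Interpretation
alpha = 0.05
if p_value > alpha:
    print("Variances are considered homogeneous (equal variance).")
else:
    print("Variances are not homogeneous (unequal variance).")

The output recorded for this cell (Statistic: 0.0000, P-value: 1.0000, variances considered homogeneous) was produced by `stats.levene(revenue_a, revenue_a)`, which compares variant A with itself and therefore always reports equal variances. Re-run the cell with the call above to get the actual A versus B result.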


### Independent t-test
Note: Use this test if the data is normally distributed and the variances of variant A and variant B are equal.
- Null Hypothesis (Ho): There is no difference in mean between the two variants.
- Alternative Hypothesis (Ha): There is a difference in mean between the two variants.

Interpretation: If the p-value is below the significance level (0.05), you can reject the null hypothesis and conclude that the two variants differ significantly. Otherwise, there is no evidence that the two variants perform differently.

In [26]:
t_stat, p_value = stats.ttest_ind(revenue_a, revenue_b, equal_var=True)
print("t-test p-value:", p_value)

# Interpretation
alpha = 0.05
if p_value < alpha:
    print("Reject null hypothesis. Two variants have significant difference")
else:
    print("Fail to reject null hypothesis. No evidence that the two variants perform differently")

t-test p-value: 6.435448411416861e-58
Reject null hypothesis. Two variants have significant difference
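
To make explicit what the equal-variance t-test computes, the sketch below rebuilds the statistic by hand: the difference in means divided by a standard error based on the pooled variance, with `n_a + n_b - 2` degrees of freedom. `pooled_t_statistic` is a hypothetical helper and the demo samples are synthetic; the result is compared against `stats.ttest_ind` as a sanity check.

```python
import numpy as np
from scipy import stats


def pooled_t_statistic(sample_a, sample_b):
    """Hypothetical helper: two-sided Student's t-test with pooled variance."""
    a = np.asarray(sample_a, dtype=float)
    b = np.asarray(sample_b, dtype=float)
    n_a, n_b = a.size, b.size
    dof = n_a + n_b - 2

    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / dof
    # Standard error of the difference in means
    std_err = np.sqrt(pooled_var * (1 / n_a + 1 / n_b))

    t_stat = (a.mean() - b.mean()) / std_err
    # Two-sided p-value from the t distribution's survival function
    p_value = 2 * stats.t.sf(abs(t_stat), dof)
    return t_stat, p_value


# Synthetic demo data (illustrative only), sized like the two variants
rng = np.random.default_rng(7)
demo_a = rng.normal(loc=400, scale=40, size=257)
demo_b = rng.normal(loc=405, scale=40, size=248)

t_manual, p_manual = pooled_t_statistic(demo_a, demo_b)
t_scipy, p_scipy = stats.ttest_ind(demo_a, demo_b, equal_var=True)
print(f"manual: t={t_manual:.4f}, p={p_manual:.4f}")
print(f"scipy:  t={t_scipy:.4f}, p={p_scipy:.4f}")
```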


## Metrics Calculation

In [28]:
average_variant_a = revenue_a.mean()
average_variant_b = revenue_b.mean()

print("average revenue A {}, and average revenue B {}".format(average_variant_a, average_variant_b))

average revenue A 400.28583278798976, and average revenue B 480.6651552511709


In [30]:
median_variant_a = np.median(revenue_a)
median_variant_b = np.median(revenue_b)

print("Median revenue A: {}, and median revenue B: {}".format(median_variant_a, median_variant_b))


Median revenue A: 403.01151049705135, and median revenue B: 479.93177646003915
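
Step 5 of the flow calls for the uplift of the metric. A minimal sketch follows, assuming uplift is defined as the relative change of variant B's mean over variant A's mean; `relative_uplift` is a hypothetical helper, and the two means are copied from the printed output above.

```python
def relative_uplift(baseline_mean, treatment_mean):
    """Hypothetical helper: relative change of the treatment over the baseline."""
    return (treatment_mean - baseline_mean) / baseline_mean


# Mean revenue_30d per variant, copied from the output above
mean_a = 400.28583278798976
mean_b = 480.6651552511709

absolute_uplift = mean_b - mean_a
uplift = relative_uplift(mean_a, mean_b)

print(f"Absolute uplift: {absolute_uplift:.2f}")
print(f"Relative uplift: {uplift:.2%}")
```

With these means, variant B's average 30-day revenue is roughly 20% higher than variant A's. In the notebook, `average_variant_a` and `average_variant_b` can be passed in directly instead of the copied values.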
