## Overview
This notebook analyzes Globox's A/B test data on two key metrics: conversion rate and average amount spent per user. The objective is to evaluate the effect of a new banner and decide whether it should be launched across the platform.

### Data Description
The dataset records the following for each user in the A/B test:

- **User Demographics:** Includes country and gender of the users.
- **Device Information:** Details the type of device used by each user.
- **Group Assignment:** Users are categorized into two groups - Group A (Control) and Group B (Treatment).
- **User Engagement Metrics:**
   - **Conversion Status:** Indicates whether users engaged in a conversion action.
   - **Amount Spent:** Reflects the total expenditure by each user, serving as a key metric for assessing user spending behavior and the economic impact of the banner.

### Analysis Outline
The analysis covers the following areas:

1. **Hypothesis Testing for Conversion Rate:**
   - Examining the effect of the banner on user conversion rates between Group A and Group B.
   - Determining the statistical significance of differences in conversion rates.

2. **Confidence Intervals for Conversion Rate Difference:**
   - Calculating the 95% confidence interval to estimate the true difference in conversion rates.

3. **Hypothesis Testing for Average Amount Spent per User:**
   - Analyzing the influence of the banner on users' average spending.
   - Assessing the statistical relevance of any differences in spending patterns between the groups.

4. **Confidence Intervals for Average Amount Spent Difference:**
   - Estimating the range of plausible values for the difference in average amount spent between the two groups through confidence intervals.

5. **Novelty Effect Analysis:**
   - Investigating potential novelty effects to understand the initial user reaction to the banner.

Together, these analyses inform a data-driven decision on whether to launch the banner.



# Question 1: Hypothesis Testing for Conversion Rate

## Objective
To test whether there is a statistically significant difference in the conversion rates 
between two groups (A and B) in the provided dataset.

## Hypotheses
- **Null Hypothesis ($H_0$):** There is no difference in the mean conversion rate between group A and group B.
  
  $\mu_A = \mu_B$
- **Alternative Hypothesis ($H_a$):** There is a difference in the mean conversion rate between group A and group B.
  
  $\mu_A \neq \mu_B$

Where:
- $\mu_A$ is the mean conversion rate of group A.
- $\mu_B$ is the mean conversion rate of group B.

## Significance Level
- $\alpha = 0.05$


In [5]:
# Load the necessary libraries
import pandas as pd
from scipy.stats import ttest_ind

# Load the data
data_path = '/Users/air/Desktop/MasterSchool/Project_final/Data_Sprint1_project.csv'  # Update the path accordingly
data = pd.read_csv(data_path)

# Extracting conversion rates for groups A and B
conv_rate_A = data[data["Test_Group"] == "A"]["Conversion_Rate"]
conv_rate_B = data[data["Test_Group"] == "B"]["Conversion_Rate"]

# Calculating means
mean_A = conv_rate_A.mean()
mean_B = conv_rate_B.mean()

# Performing a two-sample t-test
t_stat, p_value = ttest_ind(conv_rate_A, conv_rate_B, equal_var=False)

(mean_A, mean_B, t_stat, p_value)


(0.03923099042845993,
 0.04630081300813008,
 -3.8664058066425353,
 0.00011059442123932611)


## Conclusion

After calculating the p-value from the t-test, we make a decision by comparing the p-value 
with the significance level ($\alpha$).

### Decision Rule
- If p-value $< \alpha$: Reject the null hypothesis ($H_0$).
- If p-value $\geq \alpha$: Fail to reject the null hypothesis.


# Results and Conclusion

## Results

### Group A:
- **Mean conversion rate:** \(0.0392\) or \(3.92\%\)

### Group B:
- **Mean conversion rate:** \(0.0463\) or \(4.63\%\)

## Hypothesis Test Results
- **T-statistic:** \(-3.87\)
- **P-value:** \(0.00011\)

## Conclusion

Given our significance level \(\alpha = 0.05\), since the p-value (\(0.00011\)) is less than \(\alpha\), we reject the null hypothesis \(H_0\). This indicates that there is a statistically significant difference in the mean conversion rates between groups A and B.

Despite the statistical significance, it is essential to evaluate whether this difference is practically significant, considering the specific context and business objectives. Furthermore, additional analyses could be conducted to understand the impact of other variables on conversion rate and ensure that the observed differences are due to the groupings (A/B) and not some other factor.
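
Because conversion is a binary outcome, a pooled two-proportion z-test is a common cross-check on the t-test. The sketch below is an illustration rather than part of the original analysis: `two_proportion_ztest` is a hypothetical helper, and the conversion counts (955 and 1139) are assumptions back-computed from the reported mean conversion rates and the group sizes (24,343 and 24,600) listed in the power analysis section.

```python
import math

def two_proportion_ztest(x_a, n_a, x_b, n_b):
    """Pooled two-sided z-test for H0: p_A == p_B. Returns (z, p_value)."""
    p_a, p_b = x_a / n_a, x_b / n_b
    # Pooled proportion under the null hypothesis of equal rates
    p_pool = (x_a + x_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Assumed counts: conversions back-computed from the reported means
z_stat, z_p_value = two_proportion_ztest(955, 24343, 1139, 24600)
print(f"z = {z_stat:.3f}, p = {z_p_value:.5f}")
```

With samples this large, the z statistic should land close to the Welch t statistic, so both tests are expected to lead to the same decision.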


# Question 2: Confidence Interval for Difference in Conversion Rates

## Objective
To calculate the 95% confidence interval for the difference in conversion rates between the treatment and control groups.

## Methodology
The 95% confidence interval for the difference in conversion rates (mean values) between two groups can be estimated using the formula:

\[ CI = (\bar{x}_B - \bar{x}_A) \pm t \times SE \]

where:
- \( \bar{x}_B \) and \( \bar{x}_A \) are the sample means of groups B and A, respectively.
- \( t \) is the t-value for our confidence level (for a 95% confidence interval and assuming a normal distribution, \( t \) is approximately 1.96, given the large sample size).
- SE is the standard error of the difference between the two means, calculated as:

\[ SE = \sqrt{ \left( \frac{s_B^2}{n_B} \right) + \left( \frac{s_A^2}{n_A} \right) } \]

where:
- \( s_B \) and \( s_A \) are the standard deviations of groups B and A, respectively.
- \( n_B \) and \( n_A \) are the sample sizes of groups B and A, respectively.

Let's calculate the 95% confidence interval for the difference in conversion rates between the treatment (B) and control (A) groups.


In [6]:
import math
# Extracting conversion rates for groups A and B
conv_rate_A = data[data["Test_Group"] == "A"]["Conversion_Rate"]
conv_rate_B = data[data["Test_Group"] == "B"]["Conversion_Rate"]

# Calculating means and standard deviations
mean_A, mean_B = conv_rate_A.mean(), conv_rate_B.mean()
std_A, std_B = conv_rate_A.std(), conv_rate_B.std()

# Sample sizes
n_A, n_B = len(conv_rate_A), len(conv_rate_B)

# Standard error of the difference between means
SE = math.sqrt((std_B**2 / n_B) + (std_A**2 / n_A))

# t-value for 95% confidence interval (two-tailed)
t_value = 1.96

# Confidence interval
lower_bound = (mean_B - mean_A) - t_value * SE
upper_bound = (mean_B - mean_A) + t_value * SE

(lower_bound, upper_bound)

(0.0034859121085170103, 0.0106537330508233)

## Conclusion

The 95% confidence interval for the difference in conversion rates between the treatment group (B) and the control group (A) is approximately \((0.0035, 0.0107)\), that is, 0.35 to 1.07 percentage points.

This interval provides a range of plausible values for the true difference in conversion rates between the two groups, given our data. Because the interval does not contain zero, the difference is statistically significant at the 95% confidence level, which agrees with the hypothesis test in Question 1. The interval should still be weighed against practical significance and the specific business context.
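
As a cross-check, the interval can be reproduced from summary statistics alone. For 0/1 data the sample variance equals \(p(1-p)\,n/(n-1)\), so the group means and sizes are enough. This is a minimal sketch: `diff_ci_from_proportions` is a hypothetical helper, and the group sizes are taken from the current-sample-size table later in the notebook.

```python
import math

def diff_ci_from_proportions(p_a, n_a, p_b, n_b, z=1.96):
    """Confidence interval for p_B - p_A using unpooled sample variances of 0/1 data."""
    # Sample variance (ddof=1) of a 0/1 column with mean p and n rows
    var_a = p_a * (1 - p_a) * n_a / (n_a - 1)
    var_b = p_b * (1 - p_b) * n_b / (n_b - 1)
    se = math.sqrt(var_b / n_b + var_a / n_a)
    diff = p_b - p_a
    return diff - z * se, diff + z * se

ci_low, ci_high = diff_ci_from_proportions(
    0.03923099042845993, 24343,   # Group A mean and size
    0.04630081300813008, 24600,   # Group B mean and size
)
contains_zero = ci_low <= 0 <= ci_high
print(ci_low, ci_high, contains_zero)
```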


# Question 3: Hypothesis Testing for Average Amount Spent per User Between Two Groups

This analysis aims to conduct a hypothesis test to determine whether there is a difference in the average amount spent per user between two groups. We will use the t-distribution and assume a 5% significance level, considering unequal variance.

## Steps
1. **Determine the Null and Alternative Hypothesis**
2. **Determine the Type of Test**
3. **Calculate the Test Statistic**
4. **Calculate the p-value**
5. **Draw a Conclusion about the Hypothesis**

The analysis reuses the `data` DataFrame loaded in Question 1.
    

## Step 1: Determine the Null and Alternative Hypothesis

Before performing the hypothesis test, we need to establish the null and alternative hypotheses.

- **Null Hypothesis (\(H_0\))**: The average amount spent per user is the same for both groups (Group A and Group B).
- **Alternative Hypothesis (\(H_1\))**: The average amount spent per user is different for the two groups.

## Step 2: Determine the Type of Test

We will use a two-sample t-test for unequal variances, also known as Welch's t-test. This is appropriate because we are comparing the means of two independent groups and do not assume equal variances.

In [7]:
from scipy import stats

# Filter the data for the two groups
group_a = data[data['Test_Group'] == 'A']['Total_Amount_Spent']
group_b = data[data['Test_Group'] == 'B']['Total_Amount_Spent']

# Perform Welch's t-test
t_statistic, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)

t_statistic, p_value

(-0.07042501611591226, 0.9438556687127899)

## Step 3: Calculate the Test Statistic

The test statistic (t-value) obtained from Welch's t-test is \(-0.0704\). It expresses the observed difference in sample means (Group A minus Group B) in units of its estimated standard error, measured from the difference of zero stated by the null hypothesis.
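
To make the definition concrete, here is a minimal sketch of Welch's statistic and the Welch-Satterthwaite degrees of freedom computed by hand. The helper `welch_t` and the two small samples are invented for illustration and are not Globox data.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance t-test of mean(a) - mean(b)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean, using sample variances (ddof=1)
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df

# Toy samples, not real data
toy_a = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t_toy, df_toy = welch_t(toy_a, toy_b)
print(f"t = {t_toy:.4f}, df = {df_toy:.2f}")
```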

## Step 4: Calculate the p-value

The p-value associated with the t-statistic is \(0.9439\). This value represents the probability of observing a test statistic as extreme as, or more extreme than, the observed value under the null hypothesis.

## Step 5: Draw a Conclusion about the Hypothesis

Given a significance level of 5% (\(0.05\)), the p-value of \(0.9439\) is much larger than \(0.05\). This means we do not have sufficient evidence to reject the null hypothesis.

**Conclusion:** We fail to reject the null hypothesis. Based on the data provided, there is no statistically significant difference in the average amount spent per user between the two groups (Group A and Group B).
    

# Question 4: 95% Confidence Interval for the Difference in Average Amount Spent

This analysis calculates the 95% confidence interval for the difference in the average amount spent per user between the treatment and control groups. We'll use the t-distribution and assume unequal variances.

## Steps
1. **Determine the Type of Interval**
2. **Calculate the Sample Statistic**
3. **Calculate the Standard Error**
4. **Calculate the Critical Value**
5. **Construct the Interval**

Let's proceed with these steps using the data.

In [8]:
import numpy as np
from scipy import stats

# Assuming the data has already been loaded and filtered for Group A and Group B
# Group A (control) and Group B (treatment) data have been stored in variables group_a and group_b

# Step 2: Calculate the Sample Statistic (Difference in sample means)
mean_diff = np.mean(group_b) - np.mean(group_a)

# Step 3: Calculate the Standard Error for the difference in means
n_a = len(group_a)
n_b = len(group_b)
std_a = np.std(group_a, ddof=1)
std_b = np.std(group_b, ddof=1)

se_diff = np.sqrt(std_a**2/n_a + std_b**2/n_b)

# Step 4: Calculate the Critical Value (t-value for 95% confidence)
# Degrees of freedom for unequal variances (Welch-Satterthwaite equation)
df = (std_a**2/n_a + std_b**2/n_b)**2 / ((std_a**2/n_a)**2/(n_a-1) + (std_b**2/n_b)**2/(n_b-1))

critical_value = stats.t.ppf(0.975, df)  # 2-tailed test, so we use 0.975 to get the upper tail

# Step 5: Construct the Interval
margin_of_error = critical_value * se_diff
conf_interval = (mean_diff - margin_of_error, mean_diff + margin_of_error)

mean_diff, se_diff, critical_value, conf_interval
    

(0.016348503076092147,
 0.232140565636282,
 1.9600125038846983,
 (-0.4386499082298871, 0.4713469143820714))

## Interpretation

The 95% confidence interval for the difference in the average amount spent per user between the treatment (Group B) and control (Group A) is approximately \((-0.439, 0.471)\).

This confidence interval suggests that, with 95% confidence, the difference in the average amount spent per user between the treatment and control groups is estimated to be between \(-0.439\) and \(0.471\). Since the interval includes zero, it indicates that there's no statistically significant difference in the average amount spent per user between the two groups at the 95% confidence level.
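
The interval can be verified by hand from the three quantities printed above (values rounded to five decimals):

\[ CI = 0.01635 \pm 1.96001 \times 0.23214 = 0.01635 \pm 0.45500 \;\Rightarrow\; (-0.43865,\ 0.47135) \]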

# Novelty Effect Assessment

## Overview
This notebook presents an analysis of external validity for a study comparing two groups (A and B) based on conversion rates and the average amount spent per user. The goal is to assess how well the results of this study can be generalized to other settings, populations, or times.

### Data Description
The dataset contains information about users, including their country, gender, device type, test group (A or B), conversion status, and total amount spent.

### Steps in Analysis
1. Analyze the demographic composition of the study participants (country and gender distributions) to assess representativeness.
2. Analyze the distribution of device types to assess representativeness.
3. Analyze the distribution of total amount spent to assess representativeness.
4. Analyze the distribution of conversion rates to assess representativeness.



In [9]:
# Analyze the demographic composition of the study participants

# Country distribution
country_distribution = data['country'].value_counts(normalize=True) * 100

country_distribution

USA    30.583851
BRA    19.532091
MEX    11.879917
DEU     7.979296
TUR     7.726708
FRA     6.397516
GBR     6.105590
ESP     4.126294
CAN     3.250518
AUS     2.418219
Name: country, dtype: float64

In [11]:
# Gender distribution
gender_distribution = data['Gender'].value_counts(normalize=True) * 100

gender_distribution

M    48.206140
F    47.828360
O     3.965501
Name: Gender, dtype: float64
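
The same approach extends to device type (step 2), and it can also be broken down by test group to check that the two groups have a similar composition. The sketch below assumes that comparison is of interest; the toy frame stands in for `data` and contains invented values, with only the column names `Device_Type` and `Test_Group` taken from the dataset.

```python
import pandas as pd

# Toy stand-in for `data`; values are invented for illustration
toy = pd.DataFrame({
    "Device_Type": ["A", "I", "A", "I", "A", "I"],
    "Test_Group":  ["A", "A", "A", "B", "B", "B"],
})

# Share of each device type within each test group (each column sums to 1)
device_share = pd.crosstab(toy["Device_Type"], toy["Test_Group"], normalize="columns")
print(device_share)
```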

## Power Analysis
This section is dedicated to performing the power analysis. We calculate the required sample size to detect a specified Minimum Detectable Effect (MDE) for both the conversion rate and the average amount spent. We use a two-sample proportion test for the conversion rate and a two-sample t-test for the average amount spent.
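
For intuition, the required per-group sample size under a normal approximation has a closed form, \( n = 2\left((z_{1-\alpha/2} + z_{\text{power}}) / d\right)^2 \), where \( d \) is the standardized effect size. The sketch below uses only the standard library and is meant as a rough cross-check, not a replacement for `statsmodels`; `required_n_per_group` is a hypothetical helper, and the baseline is the control conversion rate reported in Question 1.

```python
import math
from statistics import NormalDist

def required_n_per_group(effect_size, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for a two-sided, two-sample test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return 2 * ((z_alpha + z_power) / effect_size) ** 2

baseline = 0.03923099042845993   # control conversion rate from Question 1
mde_abs = 0.10 * baseline        # 10% relative change, expressed in absolute terms
# Standardize by the standard deviation of a 0/1 outcome at the baseline rate
d = mde_abs / math.sqrt(baseline * (1 - baseline))
n_conv = required_n_per_group(d)
print(round(n_conv))
```

The result should land near the figure that `NormalIndPower` reports for the conversion rate, since both rely on the same approximation.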

In [15]:
import pandas as pd
#!pip install statsmodels
from statsmodels.stats.power import TTestIndPower, NormalIndPower
import numpy as np

In [16]:
# Load the dataset
file_path = '/Users/air/Desktop/MasterSchool/Project_final/Data_Sprint1_project.csv'  # Replace with actual file path
data = pd.read_csv(file_path)

# Display the first few rows of the dataframe
data.head()

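|   | user_id | country | Gender | Device_Type | Test_Group | Conversion_Rate | Total_Amount_Spent |
|---|---------|---------|--------|-------------|------------|-----------------|--------------------|
| 0 | 1000039 | GBR | F | A | B | 1 | 36.65 |
| 1 | 1000045 | USA | F | I | B | 1 | 51.58 |
| 2 | 1000071 | USA | F | I | B | 1 | 6.71 |
| 3 | 1000101 | MEX | F | A | B | 1 | 23.8 |
| 4 | 1000123 | DEU | NaN | I | B | 1 | 100.74 |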


## Group Calculations
We segregate the data into control and test groups to calculate the baseline metrics. The metrics of interest are the conversion rate and average amount spent for both control and test groups. These calculations provide a foundation for the power analysis.

In [20]:
# Segregating the data into control and test groups
control_group = data[data['Test_Group'] == 'A']
test_group = data[data['Test_Group'] == 'B']

# Calculating the baseline metrics for the control group
control_conversion_rate = control_group['Conversion_Rate'].mean()
control_avg_amount_spent = control_group['Total_Amount_Spent'].mean()

# Calculating the metrics for the test group
test_conversion_rate = test_group['Conversion_Rate'].mean()
test_avg_amount_spent = test_group['Total_Amount_Spent'].mean()

# Preparing a summary of the metrics
metrics_summary = pd.DataFrame({
    "Metric": ["Conversion Rate", "Average Amount Spent"],
    "Group A: Control": [control_conversion_rate, control_avg_amount_spent],
    "Group B: Test": [test_conversion_rate, test_avg_amount_spent]
})
metrics_summary.set_index("Metric", inplace=True)
metrics_summary

| Metric | Group A: Control | Group B: Test |
|--------|------------------|---------------|
| Conversion Rate | 0.039231 | 0.046301 |
| Average Amount Spent | 3.374518 | 3.390867 |


In [21]:
# Parameters for the power analysis
alpha = 0.05  # Significance level
power = 0.80  # Statistical power
mde_relative = 0.10  # Minimum Detectable Effect (10% relative change)

# Calculating the absolute MDE for conversion rate and average amount spent
mde_conversion_rate = control_conversion_rate * mde_relative
mde_avg_amount_spent = control_avg_amount_spent * mde_relative

# Power analysis for conversion rate (two-sample proportion test).
# With nobs1 left unset, solve_power returns the required sample size per group.
n_required_conversion_rate = NormalIndPower().solve_power(
    effect_size=(mde_conversion_rate / np.sqrt(control_conversion_rate * (1 - control_conversion_rate))),
    alpha=alpha,
    power=power,
    ratio=1  # Equal size for control and test groups
)

# Power analysis for average amount spent (two-sample t-test).
# The effect size is the absolute MDE divided by the control group's standard deviation.
n_required_avg_amount_spent = TTestIndPower().solve_power(
    effect_size=(mde_avg_amount_spent / control_group['Total_Amount_Spent'].std()),
    alpha=alpha,
    power=power,
    ratio=1,  # Equal size for control and test groups
    alternative='two-sided'
)

# Summarize the required sample size per group for each metric
required_sample_size = pd.DataFrame({
    "Metric": ["Conversion Rate", "Average Amount Spent"],
    "Required Sample Size": [n_required_conversion_rate, n_required_avg_amount_spent]
})
required_sample_size.set_index("Metric", inplace=True)
required_sample_size

| Metric | Required Sample Size |
|--------|----------------------|
| Conversion Rate | 38443.800962 |
| Average Amount Spent | 92733.4931 |



## Current Sample Sizes
Here, we calculate and compare the current sample sizes of the control and test groups with the required sample sizes obtained from the power analysis. This comparison helps us understand if the current sample sizes are sufficient for the A/B test.


In [22]:
# Calculating the current sample sizes for control and test groups
current_sample_size_control = control_group.shape[0]
current_sample_size_test = test_group.shape[0]

current_sample_sizes = pd.DataFrame({
    "Group": ["Control Group", "Test Group"],
    "Current Sample Size": [current_sample_size_control, current_sample_size_test]
})
current_sample_sizes.set_index("Group", inplace=True)
current_sample_sizes


| Group | Current Sample Size |
|-------|---------------------|
| Control Group | 24343 |
| Test Group | 24600 |


## Interpretations of Results
Based on our analysis, we observe the following:
- The required sample size to detect a 10% relative change in conversion rate is approximately 38,444 participants per group, which is higher than the current sample sizes in both control and test groups.
- For the average amount spent, the required sample size is about 92,734 participants per group, which is nearly four times our current sample sizes.

These findings suggest that the current sample sizes may not be sufficient to reliably detect the intended effects in both conversion rate and average amount spent with the desired statistical power and significance level.

## Recommendations
Given the results of our power analysis, we recommend the following:
1. Increase the sample size for both metrics to meet the identified thresholds. This will enhance the statistical power of the test and the reliability of its results.
2. Consider extending the duration of the test to accumulate a larger sample size.
3. If increasing the sample size is not feasible, reconsider the MDE or adjust the desired level of statistical power.
4. Perform a post-hoc power analysis after the completion of the test to assess the actual power achieved with the current sample sizes.
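
Recommendations 3 and 4 can be explored with a normal approximation for the conversion rate: the power to detect the pre-specified 10% relative change at the current group sizes, and the smallest relative change detectable at 80% power. This is a sketch under stated assumptions: the helper names are hypothetical, the baseline rate and group sizes are the values reported above, and the negligible probability in the opposite tail is ignored.

```python
import math
from statistics import NormalDist

norm = NormalDist()
baseline = 0.03923099042845993      # control conversion rate
n_control, n_test = 24343, 24600    # current group sizes
alpha, target_power = 0.05, 0.80

sd = math.sqrt(baseline * (1 - baseline))
# Factor that turns a standardized effect into a z-score for two unequal groups
scale = math.sqrt(1 / (1 / n_control + 1 / n_test))
z_alpha = norm.inv_cdf(1 - alpha / 2)

def achieved_power(relative_mde):
    """Approximate power to detect a relative change at the current sample sizes."""
    d = relative_mde * baseline / sd
    return norm.cdf(d * scale - z_alpha)

def detectable_relative_mde(power):
    """Smallest relative change detectable with the given power."""
    d_min = (z_alpha + norm.inv_cdf(power)) / scale
    return d_min * sd / baseline

power_at_10pct = achieved_power(0.10)
mde_at_80pct = detectable_relative_mde(target_power)
print(f"power at 10% MDE: {power_at_10pct:.2f}, MDE at 80% power: {mde_at_80pct:.1%}")
```

Using the pre-specified MDE rather than the observed effect keeps this calculation from simply restating the p-value.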