<a href="https://colab.research.google.com/github/alibabastocks/a-b-tests-t-test-2-variants/blob/main/Calculate_sample_size_need_2_proportions.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Sample Size Calculator

*   For binary outcomes (e.g. converted / did not convert), comparing two proportions



# Approach explanation:

*   Power: The probability of correctly rejecting a false null hypothesis. It equals one minus beta (β), where β is the probability of making a Type II error (failing to detect a real difference). Higher power means a greater chance of detecting a true effect if one is present in your data.

*   Alpha: The probability of rejecting a true null hypothesis (a Type I error), also called the significance level. It is the risk of a false positive result (concluding a difference exists when there truly isn't one). Common choices for alpha are 0.05 (5%) or 0.01 (1%).

*   Two-tailed test (default approach): It assumes the new variation (B) could be either better or worse than the control (A) on the metric you're testing. It essentially checks for any statistically significant difference between A and B.

*   One-tailed test (less common): This is used when you have a strong prior belief that the variation will be better (or worse) than the control. For instance, you might have seen positive results from similar tests in the past. Here, the test focuses on detecting a difference only in the direction you predicted (e.g., B having a higher conversion rate than A). For the same alpha and power, this approach requires a smaller sample size, but it cannot detect an effect in the opposite direction.
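
To make the effect of alpha, power, and the number of tails concrete, here is a minimal sketch using only the Python standard library. It assumes the usual normal approximation, in which the required sample size scales with the squared sum of the two z-quantiles. The helper name `z_multiplier` is made up for illustration and is not part of this notebook.

```python
from statistics import NormalDist


def z_multiplier(alpha: float, power: float, two_sided: bool = True) -> float:
    """Return (z_alpha + z_power)**2, the factor that the required sample
    size scales with under a normal approximation (illustrative helper)."""
    std_normal = NormalDist()
    # A two-sided test splits alpha across both tails.
    tail_alpha = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail_alpha)
    z_power = std_normal.inv_cdf(power)
    return (z_alpha + z_power) ** 2


two_sided = z_multiplier(alpha=0.05, power=0.8, two_sided=True)
one_sided = z_multiplier(alpha=0.05, power=0.8, two_sided=False)

print(f"two-sided factor: {two_sided:.2f}")
print(f"one-sided factor: {one_sided:.2f}")
print(f"one-sided needs about {one_sided / two_sided:.0%} of the two-sided sample")
```

Under these assumptions, the one-sided test needs roughly four fifths of the two-sided sample at alpha = 0.05 and power = 0.8. Raising the power or lowering alpha increases both factors.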


In [1]:
# Import libraries
import numpy as np
import statsmodels.api as sm
import statsmodels.stats.power

In [9]:
# Edit the following fields with your experiment data.
# Enter percentages as decimals, e.g. 50% becomes 0.5

cr = 0.063  # Baseline: the conversion rate in your control group

cr_new = 0.0756  # Conversion rate in the new variation at the minimum improvement you want to detect (here, a 20% relative lift over cr)

traffic_control = 0.5  # Share of traffic you plan to send to the control (A)

traffic_variation = 0.5  # Share of traffic you plan to send to the new variation (B)
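
If you think in terms of a relative lift rather than a target rate, `cr_new` can be derived from the baseline. The sketch below assumes a 20% relative lift, which reproduces the 0.0756 used in the cell above. The name `relative_lift` is introduced here for illustration only.

```python
# Illustrative inputs: cr matches the cell above, relative_lift is an assumption.
cr = 0.063            # baseline conversion rate in the control group
relative_lift = 0.20  # minimum relative improvement worth detecting (20%)

# Target conversion rate in the variation, rounded to avoid float noise
cr_new = round(cr * (1 + relative_lift), 6)
absolute_lift = round(cr_new - cr, 6)

print(cr_new)         # 0.0756
print(absolute_lift)  # 0.0126
```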



In [10]:
# Desired power (probability of detecting an effect)
power = 0.8

# Significance level (alpha)
alpha = 0.05

# Decide whether your test is two-sided or one-sided
alternative = "two-sided"  # Keep "two-sided" if you are happy with the default.
                           # For a one-sided test, use "larger" (new variation B converts better than control A)
                           # or "smaller" (new variation B converts worse than control A).



In [11]:
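# Traffic ratio, passed to solve_power as ratio = nobs2 / nobs1.
# solve_power returns nobs1, so with this ratio nobs1 is the variation (B) and nobs2 = nobs1 * ratio is the control (A).
traffic_ratio = traffic_control / traffic_variation
# Standardized effect size for two proportions (Cohen's h, the difference between the arcsine-transformed rates)
effect_size_std = sm.stats.proportion_effectsize(cr_new, cr)


# Calculate the required sample size (identical for both groups when traffic is split 50/50)
sample_size = statsmodels.stats.power.TTestIndPower().solve_power(effect_size=effect_size_std, power=power, alpha=alpha, nobs1=None, ratio=traffic_ratio, alternative=alternative)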

if sample_size is None:
  print("Sample size calculation failed.")
else:
  print("Sample size needed per variation is", round(sample_size))

Sample size needed per variation is 6366
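
As a sanity check on the figure above, the same calculation can be approximated by hand. The sketch below uses only the standard library and a normal approximation instead of the t-distribution, so it should land close to, but not exactly on, the value printed above. The helper names `cohens_h` and `approx_sample_sizes` are made up for illustration, and `ratio` is assumed to mean n2 / n1, mirroring the `ratio` argument of `solve_power`.

```python
import math
from statistics import NormalDist


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h: difference between the arcsine-transformed proportions."""
    return 2 * (math.asin(math.sqrt(p1)) - math.asin(math.sqrt(p2)))


def approx_sample_sizes(p_new, p_base, alpha=0.05, power=0.8, ratio=1.0):
    """Normal-approximation sample sizes for a two-sided test.

    ratio = n2 / n1. Returns (n1, n2) as unrounded floats.
    """
    h = cohens_h(p_new, p_base)
    std_normal = NormalDist()
    z_sum = std_normal.inv_cdf(1 - alpha / 2) + std_normal.inv_cdf(power)
    # Solve z_sum = h / sqrt(1/n1 + 1/n2) with n2 = ratio * n1
    n1 = (1 + 1 / ratio) * (z_sum / h) ** 2
    return n1, n1 * ratio


# Equal split, same inputs as the notebook cells above
n1, n2 = approx_sample_sizes(0.0756, 0.063)
print("Equal split:", round(n1), round(n2))

# Unequal split: the second group gets half as many users as the first
n1_u, n2_u = approx_sample_sizes(0.0756, 0.063, ratio=0.5)
print("Unequal split:", round(n1_u), round(n2_u))
```

With an unequal split the total sample needed is larger than with a 50/50 split, which is why equal allocation is the usual default.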
