In [2]:
import numpy as np
import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt
import seaborn as sns
import itertools
import statsmodels.stats.api as sms
from scipy.stats import (ttest_1samp, shapiro, levene, ttest_ind, mannwhitneyu,
                         pearsonr, spearmanr, kendalltau, f_oneway, kruskal)
from statsmodels.stats.proportion import proportions_ztest
from statsmodels.stats.multicomp import MultiComparison
import zipfile, requests, io
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import zt_ind_solve_power

# Daily Challenge: The Great Bake-Off


You’re the data analyst for a popular online bakery, “Sweet Bytes,” known for its delicious treats and innovative digital marketing campaigns. The bakery is about to launch a new checkout process, and the team believes it could significantly boost sales. However, before making the switch, you need to run an A/B test to ensure the new process truly outperforms the current one. To do this, you’ll need to calculate the right sample size to ensure your test is both efficient and reliable. Ready to power up your A/B testing skills and help Sweet Bytes make the right decision? Let’s dive in!

## Calculate the Required Sample Size

In [4]:
# Given parameters
current_conversion = 0.05  # 5% current conversion rate
new_conversion = 0.07     # 7% expected new conversion rate
alpha = 0.05               # Significance level
power = 0.8                # Desired power (1 - beta)
effect_size_given = 0.2    # Given effect size

# Calculate Cohen's h effect size for proportions
effect_size_calculated = proportion_effectsize(new_conversion, current_conversion)

# Calculate sample size per group using the GIVEN effect size
sample_size_per_group = zt_ind_solve_power(
    effect_size=effect_size_given,
    alpha=alpha,
    power=power,
    ratio=1.0,  # Equal sample sizes in both groups
    alternative='two-sided'
)

# Calculate sample size per group using the CALCULATED effect size
cohens_sample_size_per_group = zt_ind_solve_power(
    effect_size=effect_size_calculated,
    alpha=alpha,
    power=power,
    ratio=1.0,  # Equal sample sizes in both groups
    alternative='two-sided'
)

print("\n📊 RESULTS:")
print(f"  Cohen's h sample size per group: {int(np.ceil(cohens_sample_size_per_group))}")
print(f"  Sample size per group (Effect Size = 0.2): {int(np.ceil(sample_size_per_group))}")
print(f"  Total sample size needed (Effect Size = 0.2): {int(np.ceil(sample_size_per_group)) * 2}")


📊 RESULTS:
  Cohen's h sample size per group: 2199
  Sample size per group (Effect Size = 0.2): 393
  Total sample size needed (Effect Size = 0.2): 786
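
For reference, Cohen's h is the difference between the arcsine-transformed proportions, h = 2·arcsin(√p_new) − 2·arcsin(√p_old). The minimal sketch below reproduces that calculation with only the standard library. The helper name `cohens_h` is made up for illustration and is not part of statsmodels.

```python
import math

def cohens_h(p_new: float, p_old: float) -> float:
    """Cohen's h: difference of arcsine-transformed proportions (hypothetical helper)."""
    return 2 * math.asin(math.sqrt(p_new)) - 2 * math.asin(math.sqrt(p_old))

# Same scenario as above: 5% current conversion, 7% expected conversion
h = cohens_h(0.07, 0.05)
print(f"Cohen's h for 5% -> 7%: {h:.4f}")
```

A lift from 5% to 7% gives an h of roughly 0.08, well below the given effect size of 0.2, which is why the Cohen's h calculation asks for far more users per group.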


**Analyze the Impact of Effect Size:**

A larger expected lift in conversion rate means a larger effect size, and a larger effect size needs fewer observations: a big gap between the control and treatment conversion rates stands out from random noise quickly. Detecting a small difference with the same confidence takes many more observations.

The bakery’s head chef, always aiming for perfection, wonders what would happen if the effect size were different. Calculate the required sample size for effect sizes of 0.1, 0.2, 0.3, and 0.4.

In [None]:
analysis = NormalIndPower()

# Calculate the sample size per group for each effect size,
# using the same 5% significance level and 80% power as above
sizes = [0.1, 0.2, 0.3, 0.4]
for size in sizes:
    sample_size = analysis.solve_power(effect_size=size, alpha=0.05, power=0.8, ratio=1)
    print(f"Required Sample Size at Effect Size - {size} = {int(np.ceil(sample_size))} users per group")

Required Sample Size at Effect Size - 0.1 = 1570 users per group
Required Sample Size at Effect Size - 0.2 = 393 users per group
Required Sample Size at Effect Size - 0.3 = 175 users per group
Required Sample Size at Effect Size - 0.4 = 99 users per group


In [29]:
changes = [0.06, 0.07, 0.08, 0.09]

p1 = 0.05      # Current conversion rate
alpha = 0.05   # Significance level
power = 0.8    # Desired power (1 - beta)
analysis = NormalIndPower()

for change in changes:
    p2 = change  # Expected new conversion rate; drives the effect size

    # Calculate Cohen's h effect size
    effect_size_calculated = proportion_effectsize(p2, p1)

    sample_size = analysis.solve_power(effect_size=effect_size_calculated, power=power, alpha=alpha, ratio=1)
    # Round up: a fractional user still requires a whole extra observation
    print(f"Required sample size at P2 - {change} = {int(np.ceil(sample_size))} users per group")

Required sample size at P2 - 0.06 = 8143 users per group
Required sample size at P2 - 0.07 = 2199 users per group
Required sample size at P2 - 0.08 = 1047 users per group
Required sample size at P2 - 0.09 = 626 users per group


The same pattern shows up when the effect size is derived from conversion rates: the further the expected rate moves from the 5% baseline, the fewer users each group needs.

**Explain the Relationship**: Sample size and effect size have an inverse-square relationship. When the effect size doubles, the required sample size drops to about one quarter, following Sample Size ∝ 1/(Effect Size)².
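
The inverse-square rule falls out of the usual normal-approximation formula for a two-sided test with equal group sizes: n per group ≈ 2·(z₁₋α/₂ + z_power)² / h². The sketch below is a hand-rolled approximation under that assumption (it ignores the negligible far-tail term that a numerical solver accounts for), and `n_per_group` is an illustrative name rather than a library function.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.8) -> float:
    """Approximate users per group for a two-sided z-test (hypothetical helper)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return 2 * (z_alpha + z_power) ** 2 / effect_size ** 2

for h in (0.1, 0.2, 0.4):
    print(f"h = {h}: about {math.ceil(n_per_group(h))} users per group")

# Doubling the effect size divides the requirement by four
ratio = n_per_group(0.1) / n_per_group(0.2)
print(f"n(0.1) / n(0.2) = {ratio:.1f}")
```

Because the effect size only appears as h² in the denominator, the factor of four holds for any choice of alpha and power.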

Imagine you’re explaining this to the bakery’s team in a fun, easy-to-understand way. Why is it so important to balance effect size and sample size when planning an A/B test? Help them understand how this ensures they’re not wasting time or resources and how it helps them confidently make decisions that could increase their sweet sales.

THE GOLDEN BALANCE:
   
   - ✅ TOO SMALL sample → Might miss real improvements (lose money!)
   - ✅ TOO LARGE sample → Waste time testing when you could be selling!
   - ✅ JUST RIGHT → Confidently detect real improvements efficiently!

🎯 BOTTOM LINE FOR THE BAKERY TEAM:
   "Bigger differences are easier to spot with fewer customers.
    Tiny improvements need lots of data to confirm they're real.
    Plan your sample size BEFORE testing to avoid wasted effort!"
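
To put a number on the "too small" case, the approximate power of a two-sided test with n users per group is Φ(|h|·√(n/2) − z₁₋α/₂), where Φ is the standard normal CDF. The sketch below applies that approximation to the 5% vs. 7% scenario. `approx_power` and the smaller sample sizes tried are illustrative choices, not values from the bakery's data.

```python
import math
from statistics import NormalDist

def approx_power(p_new: float, p_old: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided z-test with n users per group (hypothetical helper)."""
    # Cohen's h for the two conversion rates
    h = 2 * math.asin(math.sqrt(p_new)) - 2 * math.asin(math.sqrt(p_old))
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # Normal approximation; the negligible opposite-tail term is ignored
    return NormalDist().cdf(abs(h) * math.sqrt(n / 2) - z_alpha)

for n in (500, 1000, 2199):
    print(f"n = {n} per group -> approximate power {approx_power(0.07, 0.05, n):.2f}")
```

With only 500 users per group, the test would catch a true lift from 5% to 7% well under half the time, while about 2199 users per group reaches the 80% target.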