# Problem
You work for a music streaming company that specializes in using machine learning to recommend new music. As a data scientist on the team, you helped develop a new recommendation algorithm to improve user experience.

This algorithm introduces a new stage in the processing pipeline, leveraging unsupervised learning to cluster users based on their listening history and the instrumentals present in the tracks.

Internal testing showed promising results in how these clusters relate to music recommendations, but now it's time for a real-world evaluation. You need to design an experiment to compare the effectiveness of this new algorithm with the existing recommendation system.
# Sample Size Calculator
This sample size calculator returns the minimum number of samples needed in each arm (control and treatment) of an A/B test.

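It expects the following inputs:
1. hypothesis type (`'one-sided'` or `'two-sided'`)
2. confidence level, as a fraction (e.g., 0.95)
3. power of the test, as a fraction (e.g., 0.8)
4. baseline conversion rate, as a fraction (e.g., 0.1)
5. expected improvement relative to the baseline conversion rate (e.g., 0.1 for a 10% relative lift)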
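
For reference, the five inputs combine in the usual normal-approximation formula for comparing two proportions, which is what the function in this notebook computes. The symbols are introduced here for illustration and do not appear in the code: $p_1$ is the baseline conversion rate, $\delta$ the relative improvement, $z_q$ the $q$-th quantile of the standard normal distribution, and $\alpha' = \alpha/2$ for a two-sided test or $\alpha' = \alpha$ for a one-sided test, with $\alpha = 1 - \text{confidence}$.

```latex
n = \left\lceil
      \frac{2\,\left(z_{1-\alpha'} + z_{\text{power}}\right)^{2}\,\bar{p}\,(1-\bar{p})}
           {(p_2 - p_1)^{2}}
    \right\rceil,
\qquad p_2 = p_1\,(1+\delta),
\qquad \bar{p} = \frac{p_1 + p_2}{2}
```

The denominator $(p_2 - p_1)^2 = (\delta\,p_1)^2$ is the squared absolute minimum detectable effect, so halving the expected improvement roughly quadruples the required sample size.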


In [1]:
import numpy as np
from scipy.stats import norm

In [2]:
def calculate_sample_size(hypothesis, confidence, power, baseline_conversion_rate, expected_improvement):
    """
    Calculates the minimum sample size per variation for an A/B test on a conversion rate,
    using the normal approximation to the binomial distribution.

    Args:
      hypothesis (str): 'one-sided' or 'two-sided'
      confidence (float): Confidence level (e.g., 0.95 for 95%)
      power (float): Power of the test (e.g., 0.8 for 80%)
      baseline_conversion_rate (float): Baseline conversion rate (e.g., 0.1 for 10%)
      expected_improvement (float): Expected relative improvement in conversion rate (e.g., 0.1 for 10% improvement)

    Returns:
      int: Minimum sample size per variation
    """

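    if hypothesis not in ('one-sided', 'two-sided'):
        raise ValueError("Hypothesis must be 'one-sided' or 'two-sided'")
    # Guard against inputs that would make the formula divide by zero
    if not 0 < baseline_conversion_rate < 1:
        raise ValueError("baseline_conversion_rate must be between 0 and 1")
    if expected_improvement == 0:
        raise ValueError("expected_improvement must be non-zero")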

    # Relative lift to detect, treated as a magnitude (the sign is ignored)
    effect_size = abs(expected_improvement)

    # Expected treatment rate and the standard deviation at the pooled rate
    p2 = baseline_conversion_rate * (1 + effect_size)
    pooled_prob = (baseline_conversion_rate + p2) / 2
    pooled_sd = np.sqrt(pooled_prob * (1 - pooled_prob))
    # Minimum detectable effect as an absolute difference in conversion rate
    mde = effect_size * baseline_conversion_rate

    # Split alpha across both tails for a two-sided test
    alpha = 1 - confidence
    alpha = alpha / (2 if hypothesis == 'two-sided' else 1)
    
    # Critical z-values for the significance level and the power;
    # z_combined holds the square of their sum
    z_alpha = norm.ppf(1 - alpha)
    z_beta = norm.ppf(power)
    z_combined = (z_alpha + z_beta)**2

    # Calculate the sample size per variation
    n = 2 * z_combined * pooled_sd**2 / mde**2
    # Round up to the nearest whole number
    n = int(np.ceil(n))
    return n

In [4]:
# Calculate the required sample size per variation for a two-sided test
# Use 95% confidence and 80% power
# Use a baseline conversion rate of 2%, with a 15% expected relative improvement
calculate_sample_size("two-sided", 0.95, 0.8, 0.02, 0.15)

36694
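
As a sanity check, the same arithmetic can be reproduced step by step without NumPy or SciPy. The sketch below is not part of the original notebook: `sample_size_stdlib` is a hypothetical helper name, and it swaps `scipy.stats.norm.ppf` for the standard library's `statistics.NormalDist().inv_cdf`, which should give the same quantiles to floating-point precision.

```python
from math import ceil
from statistics import NormalDist


def sample_size_stdlib(two_sided, confidence, power, baseline, relative_lift):
    """Hypothetical stdlib-only re-derivation of the notebook's formula."""
    lift = abs(relative_lift)
    p2 = baseline * (1 + lift)            # expected treatment conversion rate
    pooled = (baseline + p2) / 2          # average of the two rates
    variance = pooled * (1 - pooled)      # Bernoulli variance at the pooled rate
    mde = lift * baseline                 # absolute minimum detectable effect

    alpha = 1 - confidence
    tail = alpha / 2 if two_sided else alpha   # split alpha for a two-sided test
    std_normal = NormalDist()                  # mean 0, standard deviation 1
    z = std_normal.inv_cdf(1 - tail) + std_normal.inv_cdf(power)

    return ceil(2 * z**2 * variance / mde**2)


two_sided_n = sample_size_stdlib(True, 0.95, 0.8, 0.02, 0.15)
one_sided_n = sample_size_stdlib(False, 0.95, 0.8, 0.02, 0.15)
print(two_sided_n)  # 36694, matching the notebook output above
print(one_sided_n)  # smaller, because the one-sided critical z-value is lower
```

With a 2% baseline, a 15% relative lift is only a 0.3 percentage point absolute difference, which is why each variation needs tens of thousands of users.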