# Bayesian Machine Learning A/B Testing

## Future of A/B Testing

A/B testing + machine learning = much more sophisticated personalization
* e.g., moving from targeting customers based on 2 variables (location, device) to 50 variables
* Recent advances in ML make this easy and automatable in principled ways

Testing platforms will move away from rule-of-thumb decision thresholds (e.g., p < 0.05) and toward Bayesian paradigms, in which decisions are based on a posterior distribution estimated from the data.
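
As a concrete instance of the Bayesian approach, here is a short worked derivation. It assumes each visitor converts independently with an unknown rate $\theta$, and that $\theta$ has a $\mathrm{Beta}(\alpha, \beta)$ prior; these are the modelling assumptions behind the `BayesianABTest` class below. The symbols $s$ (successes) and $f$ (failures) are introduced here for illustration.

$$
p(\theta \mid s, f) \;\propto\; \underbrace{\theta^{s}(1-\theta)^{f}}_{\text{likelihood}} \cdot \underbrace{\theta^{\alpha-1}(1-\theta)^{\beta-1}}_{\text{prior}} \;=\; \theta^{\alpha+s-1}(1-\theta)^{\beta+f-1}
$$

$$
\theta \mid s, f \;\sim\; \mathrm{Beta}(\alpha + s,\; \beta + f), \qquad \mathbb{E}[\theta \mid s, f] = \frac{\alpha + s}{\alpha + \beta + s + f}
$$

For example, with a uniform prior ($\alpha = \beta = 1$) and $s = 120$, $f = 880$, the posterior is $\mathrm{Beta}(121, 881)$ with mean $121/1002 \approx 0.1208$.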

In [1]:
import numpy as np
from scipy.stats import beta

class BayesianABTest:
    """
    Beta-Binomial model for the conversion rate of a single variant.

    Starting from a Beta(alpha_prior, beta_prior) prior, `update` computes
    the conjugate Beta posterior from observed successes and failures. The
    remaining methods draw posterior samples, return the posterior mean,
    and return an equal-tailed credible interval.

    To compare two variants (A and B), keep one instance per variant:
    `update` replaces the stored posterior rather than accumulating, so a
    single instance cannot hold both variants' data.
    """
    def __init__(self, alpha_prior, beta_prior):
        """
        Initialize the Bayesian A/B test with prior parameters.

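        :param alpha_prior: Prior alpha parameter (pseudo-count of successes)
        :param beta_prior: Prior beta parameter (pseudo-count of failures)
        """
        self.alpha_prior = alpha_prior
        self.beta_prior = beta_prior
        # Before any data is observed, the posterior equals the prior, so the
        # other methods work even if update() has not been called yet.
        self.alpha_posterior = alpha_prior
        self.beta_posterior = beta_prior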

    def update(self, successes, failures):
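        """
        Set the posterior from the prior plus the observed counts.

        Each call starts again from the prior, so a second call replaces the
        previous posterior; pass the cumulative counts for the variant.

        :param successes: Number of successes (e.g., conversions) observed
        :param failures: Number of failures (e.g., non-conversions) observed
        """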
        self.alpha_posterior = self.alpha_prior + successes
        self.beta_posterior = self.beta_prior + failures

    def sample_posterior(self, size=10000):
        """
        Generate posterior samples using the updated parameters.

        :param size: Number of posterior samples to generate
        :return: Array of posterior samples
        """
        return np.random.beta(self.alpha_posterior, self.beta_posterior, size)

    def calculate_posterior_mean(self):
        """
        Calculate the posterior mean.

        :return: Posterior mean
        """
        return self.alpha_posterior / (self.alpha_posterior + self.beta_posterior)

    def calculate_posterior_interval(self, interval=0.95):
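        """
        Calculate an equal-tailed credible interval for the posterior.

        Note: `scipy.stats.beta.interval` leaves equal probability in each
        tail. This is not the highest posterior density (HPD) interval,
        although the two are close when the posterior is nearly symmetric.

        :param interval: Posterior probability mass inside the interval (default: 0.95)
        :return: Tuple (lower, upper)
        """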
        return beta.interval(interval, self.alpha_posterior, self.beta_posterior)


# Example usage
alpha_prior = 1  # Prior alpha parameter
beta_prior = 1  # Prior beta parameter

# One posterior per variant: update() replaces the stored posterior, so a
# single shared instance would keep only the last variant's data.
test_A = BayesianABTest(alpha_prior, beta_prior)
test_B = BayesianABTest(alpha_prior, beta_prior)

# Simulated data
successes_A = 100  # Number of successes for variant A
failures_A = 900   # Number of failures for variant A
successes_B = 120  # Number of successes for variant B
failures_B = 880   # Number of failures for variant B

# Update each variant's posterior with its own data
test_A.update(successes_A, failures_A)
test_B.update(successes_B, failures_B)

# Generate posterior samples for variant B
samples = test_B.sample_posterior()

# Posterior mean and 95% equal-tailed credible interval for variant B
mean = test_B.calculate_posterior_mean()
interval = test_B.calculate_posterior_interval()

print("Variant B Posterior Mean:", mean)
print("Variant B 95% Credible Interval:", interval)


Variant B Posterior Mean: 0.12075848303393213
Variant B 95% Credible Interval: (0.10132341964422845, 0.14162699974885076)
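
A posterior for a single variant does not by itself say which variant is better. One common way to compare them, sketched here under the same Beta prior assumptions as the class above, is to draw samples from each variant's posterior and estimate the probability that B's conversion rate exceeds A's. The function name `prob_b_beats_a`, its parameters, and the fixed seed are illustrative choices, not part of the original notebook.

```python
import numpy as np


def prob_b_beats_a(successes_a, failures_a, successes_b, failures_b,
                   alpha_prior=1.0, beta_prior=1.0, size=100_000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A).

    Assumes independent Beta(alpha_prior, beta_prior) priors on each
    variant's conversion rate (hypothetical helper for illustration).
    """
    rng = np.random.default_rng(seed)
    # Conjugate update: posterior is Beta(prior_alpha + successes, prior_beta + failures)
    samples_a = rng.beta(alpha_prior + successes_a, beta_prior + failures_a, size)
    samples_b = rng.beta(alpha_prior + successes_b, beta_prior + failures_b, size)
    # Fraction of paired draws in which B's rate is higher than A's
    return float(np.mean(samples_b > samples_a))


# Same simulated counts as in the example above
p_b_better = prob_b_beats_a(100, 900, 120, 880)
print(f"P(B > A) = {p_b_better:.3f}")
```

A decision rule could then be stated directly in terms of this probability (for instance, ship B only if it exceeds a threshold chosen in advance), rather than in terms of a p-value.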

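
The interval printed above comes from `scipy.stats.beta.interval`, which returns an equal-tailed interval. If a highest posterior density (HPD) interval is wanted instead, one simple sample-based approximation is to sort posterior draws and take the narrowest window that contains the requested mass. The sketch below assumes a unimodal posterior; `hpd_interval` is a hypothetical helper, not a SciPy or NumPy function.

```python
import numpy as np


def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the samples.

    Approximates the HPD interval for a unimodal posterior
    (hypothetical helper for illustration).
    """
    sorted_samples = np.sort(np.asarray(samples, dtype=float))
    n = sorted_samples.size
    window = int(np.ceil(mass * n))  # number of samples the interval must cover
    if window < 1 or window > n:
        raise ValueError("mass must be in (0, 1] and samples must be non-empty")
    # Width of every candidate window of `window` consecutive sorted samples
    widths = sorted_samples[window - 1:] - sorted_samples[:n - window + 1]
    start = int(np.argmin(widths))
    return float(sorted_samples[start]), float(sorted_samples[start + window - 1])


# Posterior for variant B in the example above: Beta(1 + 120, 1 + 880)
rng = np.random.default_rng(0)
samples_b = rng.beta(121, 881, size=100_000)

hpd_low, hpd_high = hpd_interval(samples_b, mass=0.95)
eq_low, eq_high = np.quantile(samples_b, [0.025, 0.975])

print("Approximate 95% HPD interval:", (hpd_low, hpd_high))
print("Equal-tailed 95% interval:   ", (float(eq_low), float(eq_high)))
```

For a posterior this close to symmetric the two intervals nearly coincide; the difference grows when the posterior is strongly skewed, for example with very few observations or rates near 0 or 1.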