## Hypotheses

Null Hypothesis (H0): There is no difference in conversion rates between the control group and the test group. Any observed difference is attributable to chance rather than to the change being tested.

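Alternative Hypothesis (H1): The conversion rates of the control group and the test group differ. Rejecting H0 in favor of H1 means the observed difference is too large to attribute to chance alone at the chosen significance level. The difference can be positive or negative; when only one direction matters (as with `one_tailed = True` in the parameters below), H1 becomes directional, for example that the test group converts at a higher rate than the control group.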
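Stated symbolically, with $p_1$ and $p_2$ denoting the true conversion rates of the test and control groups (the same labels the code below uses for the observed rates), the two-sided hypotheses read:

$$H_0: p_1 = p_2 \qquad \text{versus} \qquad H_1: p_1 \neq p_2$$

If only an improvement in the test group is of interest, the alternative would instead be directional. This form is an assumption about the intent of the test, since the text above states only the two-sided version:

$$H_0: p_1 = p_2 \qquad \text{versus} \qquad H_1: p_1 > p_2$$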

In [87]:
import math
from scipy.stats import norm

In [88]:
# Parameters

total_test = 89425
conversions_test = 87482

total_control = 9659
conversions_control = 9410

confidence_level = 0.99
one_tailed = True

## When to use a one-tailed or a two-tailed test

The terms one-tailed and two-tailed refer to which tail or tails of the probability distribution contain the critical region, the range of outcomes that counts as a significant result.

One-tailed test: You would use a one-tailed test when you want to determine if a parameter of your sample data is either greater than or less than a certain value, but not both. That is, when you have a specific direction in mind. For instance, you might want to test whether a new web page design leads to more clicks than the old design (i.e., "greater than"), or if a new medicine lowers blood pressure more than the standard medicine (i.e., "less than"). The critical region (where we would reject the null hypothesis) is only in one tail of the distribution.

Two-tailed test: On the other hand, a two-tailed test is used when you're interested in determining if a parameter is simply different from a certain value, without concern for which direction the difference falls. You would use a two-tailed test if you're testing whether a new drug has a different effect (it could be higher or lower) from the standard drug. The critical region is in both tails of the distribution.

When running a z-test, the choice between a one-tailed and a two-tailed test depends on the hypothesis you are testing, and it should be made before looking at the data. If you have a specific direction in mind, use a one-tailed test. If you are looking for a difference in either direction, use a two-tailed test.
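To make the practical difference concrete, the sketch below computes the critical z-score for both kinds of test at a 99% confidence level. The helper name `critical_z` is hypothetical, and the sketch uses the standard library's `statistics.NormalDist` in place of `scipy.stats.norm` so that it runs without extra dependencies.

```python
from statistics import NormalDist


def critical_z(confidence_level: float, one_tailed: bool) -> float:
    """Return the z cutoff for the given confidence level (hypothetical helper)."""
    alpha = 1 - confidence_level
    # A one-tailed test puts all of alpha in one tail;
    # a two-tailed test splits it evenly across both tails.
    tail_area = alpha if one_tailed else alpha / 2
    return NormalDist().inv_cdf(1 - tail_area)


one_tail_cut = critical_z(0.99, one_tailed=True)
two_tail_cut = critical_z(0.99, one_tailed=False)
print(f"one-tailed: {one_tail_cut:.4f}, two-tailed: {two_tail_cut:.4f}")
# one-tailed: 2.3263, two-tailed: 2.5758
```

The two-tailed cutoff is always the stricter of the two at the same confidence level. A z-score of about 2.57, like the one this notebook produces, clears the one-tailed cutoff but falls just short of the two-tailed one, which is why the tail choice has to be fixed in advance.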

In [89]:
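# Conversion rates, group sizes, and pooled rate

p1 = conversions_test / total_test        # conversion rate of the test group
p2 = conversions_control / total_control  # conversion rate of the control group
n1 = total_test
n2 = total_control
p = ((p1 * n1) + (p2 * n2)) / (n1 + n2)   # pooled conversion rate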

p1 is the conversion rate of group 1 (e.g. test group), 
<br> p2 is the conversion rate of group 2 (e.g. control group),
<br> n1 is the size of group 1,
<br> n2 is the size of group 2,
<br> p is the pooled probability of the two groups, calculated as ((p1 * n1) + (p2 * n2)) / (n1 + n2).
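The size-weighted average used for the pooled probability is the same thing as dividing all conversions by all visitors, because `p1 * n1` is just the number of conversions in group 1. The sketch below checks this with the notebook's own counts; the names `pooled_weighted` and `pooled_direct` are introduced here for illustration only.

```python
# Counts taken from the parameters cell of this notebook
total_test, conversions_test = 89425, 87482
total_control, conversions_control = 9659, 9410

p1 = conversions_test / total_test
p2 = conversions_control / total_control

# Pooled rate as a size-weighted average of the two group rates
pooled_weighted = (p1 * total_test + p2 * total_control) / (total_test + total_control)

# Pooled rate as total conversions over total visitors
pooled_direct = (conversions_test + conversions_control) / (total_test + total_control)

print(f"p1={p1:.4f}, p2={p2:.4f}, pooled={pooled_direct:.4f}")
# p1=0.9783, p2=0.9742, pooled=0.9779
```

Because the test group is roughly nine times larger than the control group, the pooled rate sits much closer to `p1` than to `p2`.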

## Function 1

This function runs a two-proportion z-test for an A/B test. It takes the conversion rates and sizes of the test and control groups, the pooled conversion rate, the desired confidence level, and whether the test is one-tailed or two-tailed. It returns the z-score, the critical z-score for that confidence level, and the p-value.

In [90]:
def calculate_z_score(p1, p2, n1, n2, p, confidence_level, one_tailed):
    # Standard error of the difference under H0, using the pooled rate p
    standard_error = math.sqrt(p * (1 - p) * ((1 / n1) + (1 / n2)))
    z_score = (p1 - p2) / standard_error

    # Significance level implied by the confidence level
    alpha = 1 - confidence_level

    if one_tailed:
        # Upper-tailed test (H1: p1 > p2): all of alpha sits in the right tail
        critical_z_score = norm.ppf(1 - alpha)
        p_value = norm.sf(z_score)
    else:
        # Two-tailed test (H1: p1 != p2): alpha is split across both tails
        critical_z_score = norm.ppf(1 - alpha / 2)
        p_value = 2 * norm.sf(abs(z_score))

    return z_score, critical_z_score, p_value

In [91]:
# usage
z_score, critical_z_score, p_value = calculate_z_score(p1, p2, n1, n2, p, confidence_level, one_tailed)
print(f"Z Score: {z_score:.4f}, Critical Z Score: {critical_z_score:.4f}, P Value: {p_value:.4f}")

Z Score: 2.5718, Critical Z Score: 2.3263, P Value: 0.0051


In [85]:
# For a two-tailed test, compare the magnitude of the z-score
observed = z_score if one_tailed else abs(z_score)
if observed > critical_z_score:
    print('Significant based off Critical Z')
else:
    print('Not Significant based off Critical Z')

Significant based off Critical Z


In [86]:
if p_value < (1 - confidence_level):
    print('Significant based off P Value')
else:
    print('Not Significant based off P Value')

Significant based off P Value


Note: If your z-score (its absolute value, for a two-tailed test) is greater than the critical z-score, you can reject the null hypothesis and conclude that there is a significant difference between your test and control groups. Otherwise you fail to reject the null hypothesis: the data do not show a significant difference, which is not the same as showing that no difference exists.

The function assumes that conversions in both the test and control groups follow a binomial distribution, and it relies on the normal approximation to the binomial, which is reasonable only when the sample sizes are large.
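One way to sanity-check the large-sample assumption is a common rule of thumb (an assumption here, not something stated in this notebook): each group should contain at least about 10 expected conversions and 10 expected non-conversions. The helper `normal_approx_ok` and its `min_count` threshold are hypothetical names for this sketch.

```python
def normal_approx_ok(n: int, p: float, min_count: float = 10.0) -> bool:
    """Rule-of-thumb check: enough expected successes and failures in a group."""
    return n * p >= min_count and n * (1 - p) >= min_count


# (group size, conversions) from the parameters cell of this notebook
groups = {"test": (89425, 87482), "control": (9659, 9410)}

for name, (n, conversions) in groups.items():
    p_hat = conversions / n
    print(f"{name}: non-conversions={n - conversions}, ok={normal_approx_ok(n, p_hat)}")
```

With conversion rates this close to 1, the non-conversion count is the binding constraint. Both groups here have well over 10 non-conversions, so the approximation looks reasonable under this rule.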

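Note: The confidence level determines the critical value (the threshold the z-score must exceed). Equivalently, the null hypothesis can be rejected when the p-value is less than the significance level (1 - confidence level). Both rules give the same verdict as long as the critical value and the p-value are computed for the same kind of test (both one-tailed or both two-tailed).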
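The two decision rules can be compared side by side. The sketch below is a minimal illustration, assuming an upper-tailed alternative for the one-tailed case; the function name `decisions` is hypothetical, and `statistics.NormalDist` from the standard library stands in for `scipy.stats.norm`.

```python
from statistics import NormalDist


def decisions(z_score: float, confidence_level: float, one_tailed: bool):
    """Return (reject by critical z, reject by p-value) for one z-score."""
    std_normal = NormalDist()
    alpha = 1 - confidence_level
    if one_tailed:
        # Upper-tailed test: only large positive z-scores count as evidence
        critical = std_normal.inv_cdf(1 - alpha)
        p_value = 1 - std_normal.cdf(z_score)
        by_critical = z_score > critical
    else:
        # Two-tailed test: large z-scores in either direction count
        critical = std_normal.inv_cdf(1 - alpha / 2)
        p_value = 2 * (1 - std_normal.cdf(abs(z_score)))
        by_critical = abs(z_score) > critical
    by_p_value = p_value < alpha
    return by_critical, by_p_value


z = 2.5718  # approximately the z-score reported in this notebook
print(decisions(z, 0.99, one_tailed=True))   # (True, True)
print(decisions(z, 0.99, one_tailed=False))  # (False, False)
```

Within each kind of test the two rules agree. A contradiction can only arise when a one-tailed critical value is compared against a two-tailed p-value, or the reverse.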