# Experimental Design Tutorial: Testing Conversion Rates

D. Akman

In this notebook, we perform a hypothesis test comparing the conversion rates of two mortgage products, new versus existing, using an experimental design rather than observational data. 

In our controlled experiment, participants are randomly assigned to either the **champion product**, *Standard Home Mortgage*, which offers a traditional, reliable fixed payment structure and established application process, or the **challenger** product, *FlexiHome Mortgage*, which features innovative elements like flexible payment options, tailored interest rates, and a digital dashboard. 

This random assignment minimises confounding and lets us isolate the effect of product features on conversion rates. We can therefore assess directly whether the new product significantly outperforms the existing one, rather than relying on the correlational insights typically derived from observational studies.

We will work through each step using made-up numbers and perform the analysis using Python and `statsmodels`.

## 1. Define the Hypotheses

We wish to test if the conversion rate of the new (challenger) mortgage product is higher than that of the existing (champion) product. 

### Null Hypothesis
$H_0: p_{challenger} = p_{champion}$

### Alternative Hypothesis
$H_1: p_{challenger} > p_{champion}$

This is a one-tailed test because we are testing for an increase in the conversion rate for the challenger product.

## 2. Set Up the Experiment and Data

Assume we have 400 participants randomly assigned equally to two groups. To further enhance the robustness of our experimental design and minimise confounding effects, we can incorporate stratified randomisation. 

For example, suppose that 65% of the bank's customers are below 40 and 35% are 40 or older. If we also stratify by gender (assuming equal proportions of males and females), we end up with four strata defined by age and gender. Rather than assuming an even split across strata, we reflect the natural distribution: approximately 65% of the participants fall into the "below 40" strata and 35% into the "40 or older" strata.

Within each age bracket, further stratification by gender ensures that each subgroup is well represented in both the new and the existing product groups. The "below 40" bracket has 260 participants (65% of 400), so with an equal gender split it contains 130 males and 130 females. Each of these two subgroups is then randomly divided into 65 participants for the new product and 65 for the existing product. Similarly, the "40 or older" bracket has 140 participants (35% of 400), giving 70 males and 70 females, with 35 from each subgroup allocated to each product group.

This stratified randomisation strategy ensures that the distribution of both age and gender mirrors the real-world customer base and is identical in the two product groups. Any observed difference in conversion rates is therefore more likely to be due to the product features than to underlying demographic differences.
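The allocation described above can be sketched in a few lines of Python. This is a minimal illustration, not a production randomisation procedure: the stratum sizes come from the example, while the function name `stratified_assignment`, the fixed seed, and the participant IDs are assumptions introduced here. Only the standard library is used.

```python
import random

# Stratum sizes taken from the example above (400 participants in total).
STRATA = {
    ("below 40", "male"): 130,
    ("below 40", "female"): 130,
    ("40 or older", "male"): 70,
    ("40 or older", "female"): 70,
}


def stratified_assignment(strata, seed=42):
    """Randomly split each stratum in half between the two products.

    Hypothetical helper; the seed is fixed only so the run is reproducible.
    """
    rng = random.Random(seed)
    assignment = []  # rows of (participant_id, age_band, gender, group)
    next_id = 0
    for (age_band, gender), size in strata.items():
        # Build exactly half "challenger" and half "champion" labels,
        # then shuffle them so the order of assignment is random.
        half = size // 2
        labels = ["challenger"] * half + ["champion"] * (size - half)
        rng.shuffle(labels)
        for group in labels:
            assignment.append((next_id, age_band, gender, group))
            next_id += 1
    return assignment


assignment = stratified_assignment(STRATA)

# Tally the group sizes within each stratum.
counts = {}
for _, age_band, gender, group in assignment:
    key = (age_band, gender, group)
    counts[key] = counts.get(key, 0) + 1

for key in sorted(counts):
    print(key, counts[key])
```

Shuffling a fixed list of labels within each stratum, rather than flipping a coin per participant, guarantees the exact 65/65 and 35/35 splits described in the text.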

- **Group A (New Product - Challenger):**
  - Number of participants: $n_A = 200$
  - Conversions: $x_A = 60$

- **Group B (Existing Product - Champion):**
  - Number of participants: $n_B = 200$
  - Conversions: $x_B = 40$

The sample conversion rates are:

$$\hat{p}_A = \frac{60}{200} = 0.30$$
$$\hat{p}_B = \frac{40}{200} = 0.20$$

## 3. Calculate the Z-Test Statistic Manually

### a. Calculate the Pooled Conversion Rate

Under the null hypothesis, we pool the data:

$$\hat{p} = \frac{x_A + x_B}{n_A + n_B} = \frac{60 + 40}{200 + 200} = \frac{100}{400} = 0.25$$

### b. Compute the Standard Error (SE)

The standard error for the difference in proportions is:

$$SE = \sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_A} + \frac{1}{n_B}\right)} = \sqrt{0.25 \times 0.75 \times \left(\frac{1}{200} + \frac{1}{200}\right)}$$

Simplifying further:

$$SE = \sqrt{0.1875 \times \frac{2}{200}} = \sqrt{0.1875 \times 0.01} = \sqrt{0.001875} \approx 0.0433$$

### c. Calculate the Z-Statistic

The z-statistic is calculated by:

$$z = \frac{\hat{p}_A - \hat{p}_B}{SE} = \frac{0.30 - 0.20}{0.0433} \approx \frac{0.10}{0.0433} \approx 2.31$$

For a one-tailed test at a significance level of $\alpha = 0.05$, the critical z-value is approximately **1.645**. 

Since $z \approx 2.31 > 1.645$, we would reject the null hypothesis.
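As a quick check on the arithmetic in steps a to c, the same quantities can be computed with the Python standard library alone. This is a sketch under the assumption that `statistics.NormalDist` is an acceptable stand-in for a normal table; the variable names `p_pooled`, `z_crit`, and `p_value` are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Experiment data from Section 2.
n_A, x_A = 200, 60  # challenger
n_B, x_B = 200, 40  # champion

# Sample conversion rates.
p_A = x_A / n_A
p_B = x_B / n_B

# a. Pooled conversion rate under the null hypothesis.
p_pooled = (x_A + x_B) / (n_A + n_B)

# b. Standard error of the difference in proportions.
se = sqrt(p_pooled * (1 - p_pooled) * (1 / n_A + 1 / n_B))

# c. z-statistic for the observed difference.
z = (p_A - p_B) / se

# One-tailed critical value and p-value from the standard normal distribution.
std_normal = NormalDist()
z_crit = std_normal.inv_cdf(0.95)  # upper 5% point
p_value = 1 - std_normal.cdf(z)    # P(Z > z)

print(f"Pooled rate: {p_pooled:.2f}")
print(f"SE: {se:.4f}")
print(f"z: {z:.2f} (critical value {z_crit:.3f})")
print(f"One-sided p-value: {p_value:.4f}")
```

Comparing `z` with `z_crit` and comparing `p_value` with $\alpha$ are equivalent decision rules for this test.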

## 4. Python Implementation using `statsmodels`

We can run the same two-proportion z-test with the `proportions_ztest` function from the `statsmodels` package.

```python
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

# Data for the experiment
n_A = 200  # sample size for Group A (New Product - Challenger)
x_A = 60   # conversions for Group A

n_B = 200  # sample size for Group B (Existing Product - Champion)
x_B = 40   # conversions for Group B

# Create arrays for the number of successes and the number of observations
counts = np.array([x_A, x_B])
nobs = np.array([n_A, n_B])

# Perform the z-test for proportions.
# alternative='larger' tests the hypothesis that the proportion in the
# first group is larger than in the second.
z_stat, p_value = proportions_ztest(count=counts, nobs=nobs, alternative='larger')

print(f"Z-statistic: {z_stat:.2f}")
print(f"P-value: {p_value:.4f}")

# Interpretation based on critical value for one-tailed test at alpha=0.05 (approx. 1.645)
if z_stat > 1.645:
    print("Reject the null hypothesis: The new (challenger) product has a higher conversion rate than the existing (champion) product.")
else:
    print("Fail to reject the null hypothesis: There is not enough evidence that the new product has a higher conversion rate.")
```

Output:

```
Z-statistic: 2.31
P-value: 0.0105
Reject the null hypothesis: The new (challenger) product has a higher conversion rate than the existing (champion) product.
```


## 5. Experimental Designs vs Observational Data

Experimental designs, like the one in this notebook, have key advantages over observational data when it comes to establishing causal relationships:

- **Control Over Confounding Variables:** In an experiment, participants are randomly assigned to treatment groups. This randomisation helps ensure that confounding factors (e.g., demographic characteristics, prior financial experiences) are evenly distributed across groups, reducing their impact on the results. In observational studies, these confounders might be unevenly distributed, which can bias the conclusions.

- **Causal Inference:** Because experimental designs actively manipulate the treatment (in this case, exposure to different mortgage products) under controlled conditions, they provide stronger evidence for cause-and-effect relationships. Observational data typically only reveal associations, making it difficult to determine whether a change in the mortgage product truly causes a difference in conversion rates.

- **Internal Validity:** Experimental designs are set up to isolate the effect of the independent variable (product features) on the dependent variable (conversion rate). This high internal validity means that any observed differences are more confidently attributed to the treatment rather than other external factors, which is often not possible with observational studies.

In summary, while observational data can be useful for identifying patterns and correlations, experimental designs offer a more robust framework for determining causality by controlling for biases and confounding variables.
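The contrast can be made concrete with a small simulation. The sketch below is illustrative only and every number in it is an assumption: younger customers are taken to be both more likely to pick the challenger on their own and more likely to convert, while the product itself has no effect on conversion. The function name `simulate` and the probabilities are hypothetical and are not part of the experiment above.

```python
import random


def simulate(n, randomised, seed):
    """Return the challenger-minus-champion difference in conversion rate.

    All probabilities are hypothetical. The product has NO true effect:
    conversion depends only on the customer's age group.
    """
    rng = random.Random(seed)
    totals = {"challenger": 0, "champion": 0}
    conversions = {"challenger": 0, "champion": 0}
    for _ in range(n):
        young = rng.random() < 0.65  # 65% of customers are below 40
        if randomised:
            p_challenger = 0.5  # coin flip, independent of age
        else:
            # Self-selection: younger customers prefer the challenger.
            p_challenger = 0.7 if young else 0.3
        product = "challenger" if rng.random() < p_challenger else "champion"
        # Age drives conversion; the product does not.
        p_convert = 0.30 if young else 0.15
        totals[product] += 1
        conversions[product] += rng.random() < p_convert
    rate_challenger = conversions["challenger"] / totals["challenger"]
    rate_champion = conversions["champion"] / totals["champion"]
    return rate_challenger - rate_champion


obs_diff = simulate(100_000, randomised=False, seed=1)
rct_diff = simulate(100_000, randomised=True, seed=1)

print(f"Observational difference: {obs_diff:+.3f}")
print(f"Randomised difference:    {rct_diff:+.3f}")
```

Under these assumptions the observational comparison shows a clearly positive difference even though the product does nothing, because age influences both product choice and conversion. With random assignment the difference stays close to zero.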

## 6. Conclusion and Discussion on Confounding Variables

Both the manual calculation and the Python implementation using `statsmodels` give a z-statistic of approximately **2.31**, with a one-sided p-value of about 0.0105, which is below $\alpha = 0.05$.

**Conclusion:**

- We reject the null hypothesis at the 5% significance level and conclude that the new (challenger) product has a significantly higher conversion rate than the existing (champion) product.

**Discussion on Confounding Variables:**

Random assignment balances confounding variables across groups on average, but it does not remove every threat to the comparison. A few factors still deserve attention:

- **Participant Characteristics:** Variations in demographics, financial literacy, or prior experience with mortgage products could affect conversion rates. In a sample of 400, such characteristics can still end up unevenly split between the groups by chance.
- **Presentation Effects:** Even small differences in how product information is presented (such as layout or language) might influence decisions. Any such difference between the two arms becomes part of what is being compared.
- **External Influences:** Economic conditions or seasonal factors might also impact a participant's decision to convert, so both groups should be run over the same period.

It is important to measure or control these factors where possible. In real-world applications, further stratification or regression adjustment may be needed to isolate the true effect of the product differences.
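As a minimal sketch of what a stratified analysis could look like, the code below computes a stratum-weighted difference in conversion rates. The stratum sizes follow Section 2, but the per-stratum conversion counts are hypothetical: they are invented here so that they add up to the 60 and 40 conversions used in this notebook, and the notebook itself does not report conversions by stratum.

```python
# Each stratum maps to (n_A, x_A, n_B, x_B).
# Sizes follow Section 2; the per-stratum conversions are HYPOTHETICAL
# and chosen only so that they sum to x_A = 60 and x_B = 40.
strata = {
    ("below 40", "male"): (65, 22, 65, 15),
    ("below 40", "female"): (65, 20, 65, 13),
    ("40 or older", "male"): (35, 9, 35, 6),
    ("40 or older", "female"): (35, 9, 35, 6),
}

total_n = sum(n_a + n_b for n_a, _, n_b, _ in strata.values())

# Weight each stratum's difference by its share of all participants.
weighted_diff = 0.0
for name, (n_a, x_a, n_b, x_b) in strata.items():
    stratum_diff = x_a / n_a - x_b / n_b
    weight = (n_a + n_b) / total_n
    weighted_diff += weight * stratum_diff
    print(f"{name}: difference = {stratum_diff:.3f}, weight = {weight:.3f}")

# Crude (unstratified) difference, for comparison.
total_x_a = sum(x_a for _, x_a, _, _ in strata.values())
total_x_b = sum(x_b for _, _, _, x_b in strata.values())
total_n_a = sum(n_a for n_a, _, _, _ in strata.values())
total_n_b = sum(n_b for _, _, n_b, _ in strata.values())
crude_diff = total_x_a / total_n_a - total_x_b / total_n_b

print(f"Stratum-weighted difference: {weighted_diff:.3f}")
print(f"Crude difference:            {crude_diff:.3f}")
```

Because each stratum is split equally between the two products, the weighted and crude differences coincide. They would diverge if the groups were unbalanced within strata, which is when this kind of adjustment matters.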

## 7. Teaching Points

- **Random Assignment:** Ensures an unbiased distribution of participants between the groups, minimising confounding.
- **Pooled Proportion:** Combines the data from both groups, since the null hypothesis assumes that the two conversion rates are equal.
- **Standard Error Calculation:** Quantifies the variability expected in the difference between the two sample proportions.
- **Z-Test for Proportions:** A large-sample method for comparing conversion rates between two independent groups. It relies on the normal approximation, which is reasonable here with 200 participants and at least 40 conversions in each group.
- **Confounding Variables:** Always consider external factors that might affect results. Even with randomisation, chance imbalances in participant characteristics, differences in presentation, and external influences can distort the comparison.
- **Interpretation:** Rejecting the null hypothesis implies that the new product's conversion rate is significantly higher at the chosen significance level ($\alpha = 0.05$).