# Project Overview
Objective:
Determine which version of the landing page (A or B) yields the higher conversion rate, i.e., the larger share of users who make a purchase.
Hypotheses:
Null Hypothesis (H0): The conversion rates of landing pages A and B are equal.
Alternative Hypothesis (H1): The conversion rates of landing pages A and B differ (two-sided).

In [12]:
import pandas as pd
import numpy as np

# Set random seed for reproducibility
np.random.seed(42)

# Generate synthetic data
n = 10000  # total number of users
data = {
    'user_id': range(1, n + 1),
    'group': np.random.choice(['A', 'B'], size=n, p=[0.5, 0.5]),
    'converted': np.random.choice([0, 1], size=n, p=[0.9, 0.1])  # placeholder at 10%; overwritten per group below
}

df = pd.DataFrame(data)

# Overwrite the placeholder with group-specific conversion rates (12% for A, 8% for B)
is_a = df['group'] == 'A'
df.loc[is_a, 'converted'] = np.random.choice([0, 1], size=is_a.sum(), p=[0.88, 0.12])
df.loc[~is_a, 'converted'] = np.random.choice([0, 1], size=(~is_a).sum(), p=[0.92, 0.08])

# Display the first few rows of the dataframe
print(df.head())

   user_id group  converted
0        1     A          0
1        2     B          1
2        3     B          0
3        4     B          0
4        5     A          0


In [13]:
# Calculate conversion rates for each group
conversion_rates = df.groupby('group')['converted'].mean()
print(conversion_rates)

# Perform a statistical test to compare the conversion rates
from scipy.stats import chi2_contingency

# Create a contingency table
contingency_table = pd.crosstab(df['group'], df['converted'])
print(contingency_table)

# Perform the chi-square test of independence
# (for a 2x2 table, SciPy applies Yates' continuity correction by default)
chi2, p, _, _ = chi2_contingency(contingency_table)
print(f"Chi-square test statistic: {chi2}")
print(f"P-value: {p}")


group
A    0.126084
B    0.069456
Name: converted, dtype: float64
converted     0    1
group               
A          4436  640
B          4582  342
Chi-square test statistic: 89.86767682318305
P-value: 2.5463331385406013e-21
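
As a cross-check on the chi-square result, the same comparison can be run as a two-proportion z-test straight from the counts in the contingency table. The sketch below uses only the standard library; the function name `two_proportion_z_test` is illustrative and not part of the original notebook. Because it applies no continuity correction, its squared z statistic should come out slightly larger than the chi-square value reported by `chi2_contingency`, which applies Yates' correction to 2x2 tables by default.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for equality of two proportions (pooled standard error)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Under H0 both groups share one rate, so pool the conversions
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_a, p_b, z, p_value

# Counts taken from the contingency table above (converted, group total)
p_a, p_b, z, p_value = two_proportion_z_test(640, 4436 + 640, 342, 4582 + 342)
print(f"Rate A: {p_a:.4f}, Rate B: {p_b:.4f}")
print(f"z = {z:.3f}, z^2 = {z**2:.2f}, p-value = {p_value:.3e}")
```

The sign of `z` also gives the direction of the effect: a positive value means group A converted at the higher rate.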


In [14]:
alpha = 0.05  # significance level

if p < alpha:
    print("Reject the null hypothesis. There is a significant difference in conversion rates between the two groups.")
else:
    print("Fail to reject the null hypothesis. There is no significant difference in conversion rates between the two groups.")


Reject the null hypothesis. There is a significant difference in conversion rates between the two groups.


### Conclusion

Based on the A/B test results, the p-value is approximately 2.55e-21, far below the significance level of 0.05. We therefore reject the null hypothesis: the conversion rates of the two landing pages differ significantly. In this synthetic dataset, version A converted at about 12.6%, compared with about 6.9% for version B.
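
The p-value indicates that the rates differ but not by how much. A confidence interval for the difference makes the size and direction explicit. The following is a minimal sketch using a Wald (normal-approximation) interval, which is one common choice and not something the original analysis specifies; the helper name `diff_confidence_interval` is hypothetical, and the counts come from the contingency table printed earlier.

```python
import math
from statistics import NormalDist

def diff_confidence_interval(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Wald confidence interval for p_a - p_b (unpooled standard error)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    # Critical value of the standard normal for the requested confidence level
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p_a - p_b
    return diff - z_crit * se, diff + z_crit * se

# Group A: 640 conversions out of 5076 users; group B: 342 out of 4924
lower, upper = diff_confidence_interval(640, 5076, 342, 4924)
print(f"95% CI for (rate A - rate B): [{lower:.4f}, {upper:.4f}]")
```

An interval that lies entirely above zero indicates that version A converts better than version B at the chosen confidence level.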

### Recommendations

- Version A has the higher conversion rate in this test (about 12.6% versus 6.9% for version B), so keep version A as the landing page rather than rolling out the changes from version B.
- Had the test shown no significant difference, further testing with different variables or larger sample sizes would be needed to find meaningful insights.
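
To put a number on "larger sample sizes", a standard power calculation for comparing two proportions can be used. The sketch below assumes a two-sided test with equal group sizes; the function name `sample_size_per_group`, the example rates (10% baseline, 12% target), the significance level, and the power are all illustrative assumptions, not values from the original analysis.

```python
import math
from statistics import NormalDist

def sample_size_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate users needed per group to detect p1 vs p2 (two-sided z-test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

# Hypothetical scenario: detect a lift from 10% to 12%
n_required = sample_size_per_group(0.10, 0.12)
print(f"Users needed per group: {n_required}")
```

Smaller expected differences drive the required sample size up quickly, since the difference enters the formula squared in the denominator.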
