#Top

[![Open in GitHub](https://img.shields.io/badge/Open%20Folder%20in-GitHub-181717?logo=github&logoColor=white)](https://github.com/lindsayalexandra14/ds_portfolio/tree/main/1_projects/statistical_analysis/a_b_testing/fishers_exact_test)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://github.com/lindsayalexandra14/ds_portfolio/blob/main/1_projects/statistical_analysis/a_b_testing/fishers_exact_test/fishers_exact_test_r_google_colab.ipynb)

![Alt text](https://github.com/lindsayalexandra14/ds_portfolio/raw/main/2_images/templates/notebook/headers/lilac/AB%20Testing%20Small%20Sample%20Size.png)

**test**
*   This hypothetical experiment tests two Landing Pages (control vs. treatment)
*   The goal is to test whether the treatment performed better than the control, because the team is interested in moving forward with the treatment
*   The planned sample size is 30,000 users per group, but the test gets cut short (due to the constraints below), leaving a sample of 305 users. Because one cell of the 2x2 table has fewer than 10 users, I use Fisher's Exact Test, which is suited to small sample sizes

**constraints**
*   The landing page needs to be switched over sooner than planned because timelines changed, so these results are all the information I will get about how the treatment performed

**result**
*  The test showed that the treatment performed better, with statistical significance at alpha = 0.05. The practical significance of the planned effect is low (Cohen's h = 0.02, computed from the design rates), but the business impact is high. The test fell short of the desired statistical power (75% vs. 80%)

**recommendation**
*  Given the constraints, I am comfortable enough that the treatment outperforms the control by some margin, with statistical significance, rather than the reverse, so I recommend moving forward with implementing the treatment





![Alt text](https://github.com/lindsayalexandra14/ds_portfolio/raw/main/2_images/templates/notebook/headers/lilac/Fishers%20R.png)

#Setup

![Alt text](https://github.com/lindsayalexandra14/ds_portfolio/raw/main/2_images/templates/notebook/headers/lilac/Setup.png)

In [None]:
install.packages("exact2x2")
install.packages("statmod")

In [None]:
library(exact2x2)
library(statmod)
library(glue)

#Test Design

![Alt text](https://github.com/lindsayalexandra14/ds_portfolio/raw/main/2_images/templates/notebook/headers/lilac/Test%20Design.png)

In [40]:
# Parameters
alpha <- 0.05             # Significance level
power <- 0.80             # Statistical power (Probability of detecting an effect when it exists)
control=0.14              # Baseline rate
effect <- 0.05            # Desired relative effect (e.g., 5% lift over baseline)
mde <- control * effect   # Minimum Detectable Effect (MDE). Difference in absolute terms
treatment= control + mde  # Treatment rate (includes effect)
cat(sprintf("Test Design:\n"))
cat("\n")
cat(paste('Control:', control, "\n"))
cat(paste('Treatment:', treatment, "\n"))

# Reference proportion for hypothesis
p_1=treatment
p_2=control
p1_label = "Treatment"
p2_label = "Control"

# Hypothesis
alternative = "greater" # direction of the alternative, stated relative to p1

hypothesis <- switch(alternative,
  greater = sprintf("%s (%.4f) is greater than %s (%.4f)", p1_label, p_1, p2_label, p_2),
  less = sprintf("%s (%.4f) is less than %s (%.4f)", p1_label, p_1, p2_label, p_2),
  two.sided = sprintf("%s (%.4f) is different from %s (%.4f)", p1_label, p_1, p2_label, p_2)
)
cat("Hypothesis:",hypothesis)

# Effect (Cohen's h: standardized effect size for proportions)
proportion_effectsize <- function(treatment, control) {
  2 * asin(sqrt(treatment)) - 2 * asin(sqrt(control))
}
effect_size <- proportion_effectsize(treatment, control)

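# Label the magnitude of Cohen's h using the function's own argument
# (the original body referenced `h`, which is not defined in this cell)
interpret_effect_size <- function(effect_size) {
  if (abs(effect_size) < 0.2) return("negligible")
  if (abs(effect_size) < 0.5) return("small")
  if (abs(effect_size) < 0.8) return("medium")
  return("large")
}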

cat(sprintf("\nMinimum Detectable Effect (MDE): %.3f %s\n", mde,interpret_effect_size(effect_size)))
cat(sprintf("Effect Size (Cohen's h): %.3f\n", effect_size))

# Power Function
simulate_fisher_power <- function(p1, p2, n1, n2, alpha, reps = 1000, alternative, seed=100) {
  set.seed(seed)
  rejects <- replicate(reps, {
    x1 <- rbinom(1, n1, p1)
    x2 <- rbinom(1, n2, p2)

    # Creates contingency table, x1 is reference for hypothesis
    tbl <- matrix(c(x1, n1 - x1, x2, n2 - x2), nrow = 2, byrow = TRUE)

    fisher.test(tbl, alternative = alternative)$p.value < alpha
  })

  mean(rejects)
}

# Sample Size
  # Planned sample size per group for the one-sided test
  # (tests whether one group performs better or worse than the other)

n_1 <- 30000   # input sample size for group 1
n_2 <- 30000   # input sample size for group 2

# Estimate Power at Sample Size
estimated_power <- simulate_fisher_power(p1 = p_1, p2 = p_2, n1 = n_1, n2 = n_2, alpha = alpha, alternative = alternative)
cat(sprintf("Sample size: %s\n",n_1))
cat(sprintf("Power: %.3f\n", estimated_power))

Test Design:

Control: 0.14 
Treatment: 0.147 
Hypothesis: Treatment (0.1470) is greater than Control (0.1400)
Minimum Detectable Effect (MDE): 0.007 negligible
Effect Size (Cohen's h): 0.020
Sample size: 30000
Power: 0.800
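
The simulated power above can be cross-checked with a closed-form normal approximation based on Cohen's h. This is a minimal sketch using base R only. It approximates the power of a one-sided two-proportion test rather than Fisher's exact test itself, so small differences from the simulation are expected. The variable names `n_per_group`, `approx_power`, `target_power`, and `approx_n` are illustrative and not part of the notebook.

```r
# Analytic cross-check of the simulated power (normal approximation, Cohen's h)
alpha       <- 0.05     # one-sided significance level, as in the design
control     <- 0.14     # baseline rate from the design
treatment   <- 0.147    # baseline plus the 5% relative lift
n_per_group <- 30000    # planned sample size per group

# Cohen's h: difference of arcsine-transformed proportions
h <- 2 * asin(sqrt(treatment)) - 2 * asin(sqrt(control))

# Approximate power of a one-sided two-sample test with equal group sizes
approx_power <- pnorm(h * sqrt(n_per_group / 2) - qnorm(1 - alpha))

# Per-group sample size needed for a target power under the same approximation
target_power <- 0.80
approx_n <- ceiling(2 * ((qnorm(1 - alpha) + qnorm(target_power)) / h)^2)

cat(sprintf("Approximate power at n = %d per group: %.3f\n", n_per_group, approx_power))
cat(sprintf("Approximate n per group for %.0f%% power: %d\n", target_power * 100, approx_n))
```

If the approximation and the simulation disagree by more than a few percentage points, increasing `reps` in `simulate_fisher_power` is a reasonable first check, since 1,000 replications leave noticeable Monte Carlo error.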


#Results

![Alt text](https://github.com/lindsayalexandra14/ds_portfolio/raw/main/2_images/templates/notebook/headers/lilac/Results.png)

In [45]:
cat("Results:")
cat("\n")
cat("\n")

# Data
control_conversions=7
treatment_conversions=18
control_no_conversions=150
treatment_no_conversions=130

table <- matrix(c(control_conversions, control_no_conversions, treatment_conversions, treatment_no_conversions), nrow = 2, byrow = TRUE)
colnames(table) <- c("Converted", "Not_Converted")
rownames(table) <- c("Control", "Treatment")

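# Flip table if p1_label is not in the first row; otherwise use it as-is
# (without the default, table_to_use would be undefined when no flip is needed)
table_to_use <- table
if (rownames(table)[1] != p1_label) {
  table_to_use <- table[c(2, 1), ]
}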

print(table_to_use)

# Conversion Rates
n1 <- sum(table_to_use[1, ])           # Reference group (row 1)
n2 <- sum(table_to_use[2, ])           # (row 2)

p1 <- table_to_use[1, "Converted"] / n1   # Reference group (row 1) conversion rate
p2 <- table_to_use[2, "Converted"] / n2   # (row 2) conversion rate

groups <- c(p1 = p1_label, p2 = p2_label)

cat("\n")
print(glue("p1: ","{groups['p1']} Conversion Rate: {round(p1 * 100, 2)}%"))
print(glue("p2: ","{groups['p2']} Conversion Rate: {round(p2 * 100, 2)}%"))

# Results Hypothesis
result_hypothesis <- switch(alternative,
  greater = sprintf("%s (%.4f) is greater than %s (%.4f)", p1_label, p1, p2_label, p2),
  less = sprintf("%s (%.4f) is less than %s (%.4f)", p1_label, p1, p2_label, p2),
  two.sided = sprintf("%s (%.4f) is different from %s (%.4f)", p1_label, p1, p2_label, p2)
)

cat("\n")
cat("Result Hypothesis:",result_hypothesis)

# Absolute Difference
abs_diff <- abs(p1 - p2)

cat("\n")
cat("\n")
cat(sprintf("Absolute difference: %.3f (%.1f%%)\n", abs_diff, abs_diff * 100))

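# Effect (Cohen's h)
# NOTE: `control` and `treatment` below are the design rates from the Test Design
# cell (0.14 and 0.147), not the observed rates p2 and p1, so h reflects the
# planned effect rather than the observed one.
# This redefinition also takes its arguments in (control, treatment) order.
proportion_effectsize <- function(control, treatment) {
  2 * asin(sqrt(treatment)) - 2 * asin(sqrt(control))
}

h <- proportion_effectsize(control, treatment)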

interpret_h <- function(h) {
  if (abs(h) < 0.2) return("negligible")
  if (abs(h) < 0.5) return("small")
  if (abs(h) < 0.8) return("medium")
  return("large")
}

cat(sprintf("Cohen's h: %.3f\n", h))
cat(sprintf("Effect size interpretation: %s\n", interpret_h(h)))

# Fisher's Test
result <-  exact2x2(table_to_use, alternative = alternative, conf.level = 1 - alpha, tsmethod="central")
cat("\n")
print(result)

# P-value
p_value <- result$p.value

# Lower CI
lower_ci <- result$conf.int[1]

# Power
set.seed(100)
result_power <- power.fisher.test(n1 = n1, n2 = n2, p1 = p1, p2 = p2,
                           alpha = alpha,
                           alternative = alternative,
                           nsim = 10000)

print(paste("Result Power:",round(result_power*100,1),"%"))

Results:

          Converted Not_Converted
Treatment        18           130
Control           7           150

p1: Treatment Conversion Rate: 12.16%
p2: Control Conversion Rate: 4.46%

Result Hypothesis: Treatment (0.1216) is greater than Control (0.0446)

Absolute difference: 0.077 (7.7%)
Cohen's h: 0.020
Effect size interpretation: negligible


	One-sided Fisher's Exact Test

data:  table_to_use
p-value = 0.01186
alternative hypothesis: true odds ratio is greater than 1
95 percent confidence interval:
 1.294771      Inf
sample estimates:
odds ratio 
  2.956901 

[1] "Result Power: 75.3 %"
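
To show where the one-sided p-value comes from, the sketch below recomputes it directly from the hypergeometric distribution and compares it with base R's `fisher.test`. It assumes the same 2x2 table as above with Treatment as the reference row. The names `tbl`, `p_manual`, and `p_fisher` are illustrative.

```r
# Observed 2x2 table from the results (Treatment is the reference row)
tbl <- matrix(c(18, 130,
                 7, 150),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("Treatment", "Control"),
                              c("Converted", "Not_Converted")))

x <- tbl["Treatment", "Converted"]   # observed treatment conversions
m <- sum(tbl["Treatment", ])         # treatment group size
n <- sum(tbl["Control", ])           # control group size
k <- sum(tbl[, "Converted"])         # total conversions across both groups

# One-sided ("greater") p-value: with all margins fixed, the probability of
# seeing x or more of the k conversions fall in the treatment group
p_manual <- phyper(x - 1, m, n, k, lower.tail = FALSE)

# Cross-check against base R's implementation
p_fisher <- fisher.test(tbl, alternative = "greater")$p.value

cat(sprintf("Manual p-value:      %.5f\n", p_manual))
cat(sprintf("fisher.test p-value: %.5f\n", p_fisher))
```

Both values can be compared with the p-value printed by `exact2x2` above.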


#Results Summary

![Alt text](https://github.com/lindsayalexandra14/ds_portfolio/raw/main/2_images/templates/notebook/headers/lilac/Results%20Summary.png)

***performance:***   
With 95% confidence, the treatment group has higher odds of conversion than the control group. The odds of conversion in the treatment group are at least 29.5% higher than in the control group based on the 95% CI lower bound odds ratio of 1.295. This supports the hypothesis that treatment is better than control.
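
The arithmetic behind the "at least 29.5% higher odds" statement can be sketched as follows. The cell counts and the lower bound are taken from the output above, and the variable names are illustrative. Note that the printed estimate (2.957) is a conditional maximum likelihood estimate, so it differs slightly from the simple cross-product odds ratio computed here.

```r
# Cell counts from the observed table
treatment_conv <- 18
treatment_no_conv <- 130
control_conv <- 7
control_no_conv <- 150

# Sample (cross-product) odds ratio: odds of converting in treatment vs. control
odds_treatment <- treatment_conv / treatment_no_conv
odds_control <- control_conv / control_no_conv
sample_or <- odds_treatment / odds_control

# One-sided 95% lower bound on the odds ratio, copied from the test output,
# expressed as a percent increase in odds over the control
lower_ci <- 1.294771
pct_higher_odds <- (lower_ci - 1) * 100

cat(sprintf("Sample odds ratio: %.3f\n", sample_or))
cat(sprintf("Odds are at least %.1f%% higher in treatment\n", pct_higher_odds))
```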

***significance:***  
Because the confidence interval does not include 1 and the p-value (0.012) is less than alpha (0.05), this result is statistically significant at alpha = 0.05. The practical significance of the planned effect is low (Cohen's h = 0.02, computed from the design rates), but the business impact is high based on domain knowledge.
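
The Cohen's h of 0.02 in the output is computed from the design rates (0.14 and 0.147), because `proportion_effectsize(control, treatment)` is called with the Test Design variables. As a complementary check, this sketch computes h from the observed conversion rates instead, using the same thresholds as the notebook. The helper names `cohens_h` and `label_h` are illustrative.

```r
# Observed conversion rates from the results table
p_treatment <- 18 / (18 + 130)
p_control <- 7 / (7 + 150)

# Cohen's h: difference of arcsine-transformed proportions
cohens_h <- function(p_a, p_b) {
  2 * asin(sqrt(p_a)) - 2 * asin(sqrt(p_b))
}

# Same magnitude thresholds as interpret_h in the notebook
label_h <- function(h) {
  if (abs(h) < 0.2) return("negligible")
  if (abs(h) < 0.5) return("small")
  if (abs(h) < 0.8) return("medium")
  "large"
}

h_observed <- cohens_h(p_treatment, p_control)

cat(sprintf("Observed Cohen's h: %.3f (%s)\n", h_observed, label_h(h_observed)))
```

With only 305 users, the observed rates are noisy, so the observed h should be read as a rough indication rather than a precise estimate.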

***power:***  
The test was somewhat underpowered: estimated power at the observed conversion rates was about 75%, versus the 80% desired. Lower power means a higher chance of failing to detect a true difference because of the limited sample size. As a result, while the effect is statistically significant, its size is uncertain, so I cannot give a confident estimate of incremental revenue from the test without further data.

#Recommendation

![Alt text](https://github.com/lindsayalexandra14/ds_portfolio/raw/main/2_images/templates/notebook/headers/lilac/Recommendation.png)

*  Given the constraint of needing to choose a landing page prematurely, I am comfortable enough that the treatment outperforms the control by some margin, with statistical significance, rather than the reverse, so I recommend moving forward with implementing the treatment