# **A/B and Hypothesis Testing**

## Overview

This notebook covers two key methodologies for enhancing insurance offerings: **A/B Testing** and **Hypothesis Testing**.

## A/B Testing

A/B Testing compares different insurance offerings or marketing strategies to evaluate their impact on:

- **Customer Engagement**
- **Policy Uptake**

### Example

- Testing two marketing campaigns to identify which leads to more sign-ups.
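
One common way to evaluate the campaign example above is a two-proportion z-test on sign-up rates. The sketch below is illustrative only: the function name `two_proportion_ztest` and the sign-up counts are hypothetical and do not come from the dataset used in this notebook.

```python
import math


def two_proportion_ztest(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis of equal sign-up rates
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: sign-ups out of customers contacted per campaign
z_stat, p_value = two_proportion_ztest(120, 1000, 150, 1000)
print(f"Z-statistic = {z_stat:.3f}, p-value = {p_value:.4f}")
```

A negative z-statistic here means campaign A had the lower sign-up rate; whether the difference counts as significant depends on the significance level chosen before the test.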

## Hypothesis Testing

Hypothesis Testing checks assumptions about the factors that influence claims and premiums. It involves two steps:

- **Formulating Hypotheses**: Stating a null and an alternative hypothesis about a factor (e.g., that risk does not differ between regions).
- **Testing Hypotheses**: Analyzing the data to decide whether to reject or fail to reject the null hypothesis.

### Example

- Investigating if policyholder age impacts claim frequency.
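
The age example could be approached with a chi-squared test of independence between age band and claim occurrence. The following is a minimal sketch under stated assumptions: the age bands and counts are made up for illustration, and the notebook's dataset is not known to contain these columns.

```python
import pandas as pd
from scipy import stats

# Hypothetical contingency table: policyholders per age band,
# split by whether they made at least one claim
contingency = pd.DataFrame(
    {"NoClaim": [400, 450, 470], "Claim": [100, 50, 30]},
    index=["18-30", "31-50", "51+"],
)

# Null hypothesis: claim occurrence is independent of age band
chi2, p_value, dof, expected = stats.chi2_contingency(contingency)

alpha = 0.05  # assumed significance level
print(f"chi2 = {chi2:.2f}, dof = {dof}, p-value = {p_value:.3g}")
print("Reject the null hypothesis." if p_value < alpha
      else "Fail to reject the null hypothesis.")
```

With row-level data, a table like this could be built with `pd.crosstab` on two categorical columns before passing it to `chi2_contingency`.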


In [2]:
# Import libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import scipy.stats as stats
import os
import sys

# Add the 'scripts' directory to the Python path for module imports
sys.path.append(os.path.abspath(os.path.join('..', 'scripts')))

In [6]:
# Read the dataset
df = pd.read_csv('../data/processeddata/cleaned_data.csv', low_memory=False, index_col=False)

# Print the first few rows of the dataframe to confirm successful loading
print("✅Dataset loaded successfully. Here are the first few rows:")
print(df.head())

✅Dataset loaded successfully. Here are the first few rows:
   UnderwrittenCoverID  PolicyID     TransactionMonth  IsVATRegistered  \
0               145249     12827  2015-03-01 00:00:00             True   
1               145249     12827  2015-05-01 00:00:00             True   
2               145249     12827  2015-07-01 00:00:00             True   
3               145255     12827  2015-05-01 00:00:00             True   
4               145255     12827  2015-07-01 00:00:00             True   

  Citizenship          LegalType Title Language                 Bank  \
0              Close Corporation    Mr  English  First National Bank   
1              Close Corporation    Mr  English  First National Bank   
2              Close Corporation    Mr  English  First National Bank   
3              Close Corporation    Mr  English  First National Bank   
4              Close Corporation    Mr  English  First National Bank   

       AccountType  ...                    ExcessSelected Cover

In [7]:
# Import the testing class from the 'scripts' directory
from hypothesis_testing import ABHypothesisTesting

# Create an instance of the ABHypothesisTesting class with the dataset
ab_test = ABHypothesisTesting(df)

# Run all tests and store the results
results = ab_test.run_all_tests()

# Print results in a human-readable format
for test_name, result in results.items():
    print(f'--- {test_name} ---')
    print(result)
    print()  # Print a newline for better readability

--- Risk Differences Across Provinces ---
Chi-squared test on Province and TotalPremium: chi2 = 2491500.912683971, p-value = 0.0
Reject the null hypothesis.

--- Risk Differences Between Postal Codes ---
Chi-squared test on PostalCode and TotalPremium: chi2 = 224052676.14292973, p-value = 0.0
Reject the null hypothesis.

--- Margin Differences Between Postal Codes ---
Z-test on TotalPremium: Z-statistic = -0.4370784074657527, p-value = 0.6620544861020186
Fail to reject the null hypothesis.

--- Risk Differences Between Women and Men ---
T-test on TotalPremium: T-statistic = -5.118420932688848, p-value = 3.0925282750010697e-07
Reject the null hypothesis.
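
For reference, a two-group comparison like the t-test reported above can be sketched with `scipy.stats.ttest_ind`. This is not the implementation inside `ABHypothesisTesting`, which is not shown here. The data is synthetic, the `Gender` column name is an assumption, and Welch's variant (`equal_var=False`) is a choice made for this sketch.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic data for illustration only; not drawn from the notebook's dataset
rng = np.random.default_rng(seed=0)
toy = pd.DataFrame({
    "Gender": ["Female"] * 200 + ["Male"] * 200,  # assumed column name
    "TotalPremium": np.concatenate([
        rng.normal(100, 20, 200),
        rng.normal(110, 20, 200),
    ]),
})

female = toy.loc[toy["Gender"] == "Female", "TotalPremium"]
male = toy.loc[toy["Gender"] == "Male", "TotalPremium"]

# Welch's t-test: does not assume equal variances in the two groups
t_stat, p_value = stats.ttest_ind(female, male, equal_var=False)

alpha = 0.05  # assumed significance level
decision = "Reject" if p_value < alpha else "Fail to reject"
print(f"T-statistic = {t_stat:.3f}, p-value = {p_value:.3g}")
print(f"{decision} the null hypothesis.")
```

The sign of the t-statistic follows the argument order: a negative value means the first group's mean is lower than the second's.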

