# Risk Analysis and A/B Hypothesis Testing

This notebook tests hypotheses about three risk metrics, **claim frequency**, **claim severity**, and **margin**, across segments defined by **province**, **postal code**, and **gender**.

Each test determines whether there's statistical evidence to support new segmentation strategies.

In [30]:
# Load Libraries
import pandas as pd
import numpy as np

import sys
sys.path.append("../../")

from src.preprocessing import clean_data, save_cleaned_data
from src.config import RAW_DATA_PATH
from src.data_loader import load_raw_data

# Load the raw dataset (cleaning happens below in clean_data)
raw_df = load_raw_data(RAW_DATA_PATH)

# Impute a random 20% of the zero-claim rows with the median of the non-zero claims
zero_indices = raw_df[raw_df['TotalClaims'] == 0].index
sampled_indices = np.random.choice(zero_indices, size=int(len(zero_indices) * 0.2), replace=False)
non_zero_median = raw_df.loc[raw_df['TotalClaims'] > 0, 'TotalClaims'].median()
raw_df.loc[sampled_indices, 'TotalClaims'] = non_zero_median

df = clean_data(raw_df)

df.head()

TotalClaims
0.000000         454460
6140.350877      113820
750.649123           54
1300.000000          41
43859.649123         28
                  ...  
96458.947368          1
346.921053            1
31600.798246          1
102580.701754         1
72445.035088          1
Name: count, Length: 971, dtype: int64
None
True




after string clean:
TotalClaims
0.000000         454460
6140.350877      113820
750.649123           54
1300.000000          41
43859.649123         28
                  ...  
96458.947368          1
346.921053            1
31600.798246          1
102580.701754         1
72445.035088          1
Name: count, Length: 971, dtype: int64
None


| | UnderwrittenCoverID | PolicyID | TransactionMonth | IsVATRegistered | Citizenship | LegalType | Title | Language | Bank | AccountType | ... | TermFrequency | CalculatedPremiumPerTerm | ExcessSelected | CoverCategory | CoverType | Product | StatutoryClass | StatutoryRiskType | TotalPremium | TotalClaims |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 145249.0 | 12827 | 2015-03-01 00:00:00 | True | | close corporation | mr | english | first national bank | current account | ... | monthly | 25.0 | mobility - windscreen | windscreen | windscreen | mobility metered taxis: monthly | commercial | ifrs constant | 21.929825 | 0.0 |
| 1 | 145249.0 | 12827 | 2015-05-01 00:00:00 | True | | close corporation | mr | english | first national bank | current account | ... | monthly | 25.0 | mobility - windscreen | windscreen | windscreen | mobility metered taxis: monthly | commercial | ifrs constant | 21.929825 | 6140.350877 |
| 2 | 145249.0 | 12827 | 2015-07-01 00:00:00 | True | | close corporation | mr | english | first national bank | current account | ... | monthly | 25.0 | mobility - windscreen | windscreen | windscreen | mobility metered taxis: monthly | commercial | ifrs constant | 0.0 | 0.0 |
| 3 | 145255.0 | 12827 | 2015-05-01 00:00:00 | True | | close corporation | mr | english | first national bank | current account | ... | monthly | 584.6468 | mobility - metered taxis - r2000 | own damage | own damage | mobility metered taxis: monthly | commercial | ifrs constant | 512.84807 | 0.0 |
| 4 | 145255.0 | 12827 | 2015-07-01 00:00:00 | True | | close corporation | mr | english | first national bank | current account | ... | monthly | 584.6468 | mobility - metered taxis - r2000 | own damage | own damage | mobility metered taxis: monthly | commercial | ifrs constant | 0.0 | 6140.350877 |


In [2]:
df.shape

(569760, 47)

In [3]:
save_cleaned_data(df)

Cleaned data saved to ../../data/cleaned/cleaned_data.csv


In [43]:
df['claim_indicator'] = df['TotalClaims'] > 0

In [5]:
# Count the rows flagged as having a claim
int(df["claim_indicator"].sum())

115298

## Step 1: Select Metrics
We'll define and calculate the required risk metrics: Claim Frequency, Claim Severity, and Margin.

In [6]:
from src.task_3.segmentation_utils import calculate_claim_frequency, calculate_claim_severity, calculate_margin

claim_freq = calculate_claim_frequency(df)
claim_sev = calculate_claim_severity(df)
total_margin = calculate_margin(df)

print(f"Claim Frequency: {claim_freq:.2%}")
print(f"Claim Severity: {claim_sev:.2f}")
print(f"Total Margin: {total_margin:,.2f}")

Claim Frequency: 20.24%
Claim Severity: 6379.77
Total Margin: -700,568,356.26
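
The helpers in `src.task_3.segmentation_utils` are not shown in this notebook. The sketch below assumes the usual definitions: frequency as the share of rows with `TotalClaims > 0`, severity as the mean claim among those rows, and margin as total premium minus total claims. The function names and the toy data are hypothetical. The frequency definition is at least consistent with the counts above (115,298 claim rows out of 569,760 is about 20.24%).

```python
import pandas as pd


def claim_frequency(df: pd.DataFrame) -> float:
    """Share of rows with a positive claim amount (assumed definition)."""
    return float((df["TotalClaims"] > 0).mean())


def claim_severity(df: pd.DataFrame) -> float:
    """Mean claim amount among rows that have a claim (assumed definition)."""
    claims = df.loc[df["TotalClaims"] > 0, "TotalClaims"]
    return float(claims.mean())


def total_margin(df: pd.DataFrame) -> float:
    """Total premium minus total claims (assumed definition)."""
    return float((df["TotalPremium"] - df["TotalClaims"]).sum())


# Toy data for illustration only, not taken from the real dataset
toy = pd.DataFrame({
    "TotalPremium": [100.0, 100.0, 100.0, 100.0],
    "TotalClaims": [0.0, 250.0, 0.0, 50.0],
})

print(f"Claim Frequency: {claim_frequency(toy):.2%}")
print(f"Claim Severity: {claim_severity(toy):.2f}")
print(f"Total Margin: {total_margin(toy):,.2f}")
```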


## 🧪 Test 1: Risk Differences Across Provinces

In [7]:
from src.task_3.hypothesis_tests import province_risk_test
from src.task_3.business_analysis import interpret_test_result

province_result = province_risk_test(df, risk_metric='claim_frequency')
print(province_result['results_by_province'])

# Interpretation
print(interpret_test_result("Province Risk Differences", province_result['p_value'],
                            "Risk level varies across provinces and may require regional pricing."))

{'gauteng': {'claim_frequency': np.float64(0.20271311613279247)}, 'kwazulu-natal': {'claim_frequency': np.float64(0.20080122758776944)}, 'mpumalanga': {'claim_frequency': np.float64(0.20455972867730185)}, 'eastern cape': {'claim_frequency': np.float64(0.2023008258150747)}, 'western cape': {'claim_frequency': np.float64(0.20270878104852816)}, 'limpopo': {'claim_frequency': np.float64(0.20131476178078625)}, 'north west': {'claim_frequency': np.float64(0.20310578690498826)}, 'free state': {'claim_frequency': np.float64(0.1966742252456538)}, 'northern cape': {'claim_frequency': np.float64(0.19260817307692307)}}
We fail to reject the null hypothesis for Province Risk Differences (p = 0.611). No significant difference observed.
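
The implementation of `province_risk_test` is not shown here. One common way to compare claim frequency across several groups is a chi-squared test of independence on the group-by-claim contingency table. The sketch below assumes that approach; the function name, the `Province` column name, and the random toy data are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency


def chi2_claim_frequency_test(df, group_col, indicator_col="claim_indicator"):
    """Chi-squared test of independence between a grouping column and a claim flag."""
    # Rows: groups; columns: claim flag False/True
    table = pd.crosstab(df[group_col], df[indicator_col])
    chi2, p_value, dof, _expected = chi2_contingency(table)
    rates = df.groupby(group_col)[indicator_col].mean().to_dict()
    return {"chi2": float(chi2), "p_value": float(p_value), "dof": int(dof), "rates": rates}


# Toy data: three provinces sharing the same underlying claim rate
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "Province": rng.choice(["gauteng", "limpopo", "free state"], size=3000),
    "claim_indicator": rng.random(3000) < 0.2,
})

result = chi2_claim_frequency_test(toy, "Province")
print(result["rates"])
print(f"p = {result['p_value']:.3f}")
```

A large p-value, as in the province result above, means the observed differences in claim frequency are compatible with random variation between groups.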


## 🧪 Test 2: Risk Differences Between Zip Codes

In [12]:
from src.task_3.hypothesis_tests import zip_risk_test

zip_result = zip_risk_test(df, risk_metric='claim_frequency')
print(zip_result['results_by_zip'])

print(interpret_test_result("Zip Code Risk Differences", zip_result['p_value'],
                            "Risk level may depend on localized factors."))

{np.int64(1459): {'claim_frequency': np.float64(0.1994535519125683)}, np.int64(1513): {'claim_frequency': np.float64(0.16049382716049382)}, np.int64(1619): {'claim_frequency': np.float64(0.22540381791483113)}, np.int64(1625): {'claim_frequency': np.float64(0.13131313131313133)}, np.int64(1629): {'claim_frequency': np.float64(0.19753086419753085)}, np.int64(1852): {'claim_frequency': np.float64(0.18689655172413794)}, np.int64(1982): {'claim_frequency': np.float64(0.2074074074074074)}, np.int64(2007): {'claim_frequency': np.float64(0.20202020202020202)}, np.int64(2066): {'claim_frequency': np.float64(0.24545454545454545)}, np.int64(4093): {'claim_frequency': np.float64(0.2085661080074488)}, np.int64(2000): {'claim_frequency': np.float64(0.2043407219487025)}, np.int64(1577): {'claim_frequency': np.float64(0.16049382716049382)}, np.int64(1610): {'claim_frequency': np.float64(0.1994459833795014)}, np.int64(2410): {'claim_frequency': np.float64(0.21037811745776347)}, np.int64(6200): {'claim_
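
A nested dictionary with one entry per zip code is hard to scan when printed directly. As a sketch, assuming `results_by_zip` has the shape shown above, it can be turned into a sorted table. The five entries below are copied from the output above; the `PostalCode` index label is a hypothetical name.

```python
import pandas as pd

# A few entries copied from the printed output, for illustration
results_by_zip = {
    1459: {"claim_frequency": 0.1994535519125683},
    1513: {"claim_frequency": 0.16049382716049382},
    1619: {"claim_frequency": 0.22540381791483113},
    1625: {"claim_frequency": 0.13131313131313133},
    2066: {"claim_frequency": 0.24545454545454545},
}

# One row per zip code, highest claim frequency first
zip_table = (
    pd.DataFrame.from_dict(results_by_zip, orient="index")
    .rename_axis("PostalCode")
    .sort_values("claim_frequency", ascending=False)
)
print(zip_table.round(4))
```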

## 💸 Test 3: Margin Differences Between Zip Codes

In [9]:
from src.task_3.hypothesis_tests import zip_margin_test

margin_result = zip_margin_test(df)
print(margin_result['results_by_zip'])

print(interpret_test_result("Zip Code Margin Differences", margin_result['p_value'],
                            "Some zip codes may be underperforming in profitability."))

{np.int64(1459): {'average_margin': np.float64(-1150.0361010874842)}, np.int64(1513): {'average_margin': np.float64(-901.6574318262802)}, np.int64(1619): {'average_margin': np.float64(-1383.5089579303879)}, np.int64(1625): {'average_margin': np.float64(-700.1271449640722)}, np.int64(1629): {'average_margin': np.float64(-1128.065814102161)}, np.int64(1852): {'average_margin': np.float64(-1157.1211473043056)}, np.int64(1982): {'average_margin': np.float64(-1235.68562345679)}, np.int64(2007): {'average_margin': np.float64(-1194.9914889986453)}, np.int64(2066): {'average_margin': np.float64(-1780.5949071770333)}, np.int64(4093): {'average_margin': np.float64(-1223.5373170450905)}, np.int64(2000): {'average_margin': np.float64(-1234.4195776446497)}, np.int64(1577): {'average_margin': np.float64(-858.0804791548763)}, np.int64(1610): {'average_margin': np.float64(-1644.4234890573628)}, np.int64(2410): {'average_margin': np.float64(-1271.3993317457973)}, np.int64(6200): {'average_margin': np.f
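
How `zip_margin_test` computes its p-value is not shown in this notebook. A minimal sketch, assuming a per-row margin of `TotalPremium - TotalClaims` and a one-way ANOVA across zip codes, could look like the following. The function name, the `PostalCode` column name, the `min_rows` filter, and the toy data are all hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.stats import f_oneway


def anova_margin_test(df, group_col, min_rows=2):
    """One-way ANOVA on per-row margin across the levels of group_col."""
    margin = df["TotalPremium"] - df["TotalClaims"]
    grouped = margin.groupby(df[group_col])
    # Skip groups too small to contribute a variance estimate
    samples = [g.to_numpy() for _, g in grouped if len(g) >= min_rows]
    f_stat, p_value = f_oneway(*samples)
    return {
        "f_stat": float(f_stat),
        "p_value": float(p_value),
        "average_margin": grouped.mean().to_dict(),
    }


# Toy data: three zip codes with the same premium and claim distributions
rng = np.random.default_rng(2)
toy = pd.DataFrame({
    "PostalCode": rng.choice([1459, 1513, 1619], size=600),
    "TotalPremium": rng.uniform(50.0, 150.0, size=600),
    "TotalClaims": np.where(rng.random(600) < 0.2, 6140.35, 0.0),
})

result = anova_margin_test(toy, "PostalCode")
print(result["average_margin"])
print(f"p = {result['p_value']:.3f}")
```

Margins driven by rare, large claims are heavily skewed, so a rank-based alternative such as `scipy.stats.kruskal` may be worth comparing against the ANOVA result.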

## 🧑‍🤝‍🧑 Test 4: Gender-Based Risk Differences

In [39]:
print(df['Gender'].isnull().sum())
# Recode 'not specified' as 'Female' and normalise 'male' to 'Male'
df['Gender'] = df['Gender'].replace({'not specified': 'Female', 'male': 'Male'})
df['Gender'].value_counts()

0


Gender
Female    563581
Male        6179
Name: count, dtype: int64

In [44]:
from src.task_3.hypothesis_tests import gender_risk_test

gender_result = gender_risk_test(df, risk_metric='claim_severity')
print(gender_result['results_by_gender'])

print(interpret_test_result("Gender Risk Differences", gender_result['p_value'],
                            "Indicates potential bias or meaningful risk gap by gender."))

{'Male': np.float64(6279.275842256856), 'Female': np.float64(6380.889344354306)}
We fail to reject the null hypothesis for Gender Risk Differences (p = 0.127). No significant difference observed.
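
The internals of `gender_risk_test` are not shown. For a two-group comparison of claim severity, one plausible choice is Welch's t-test on the claim amounts of rows that have a claim. The sketch below assumes that choice; the function name and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.stats import ttest_ind


def welch_severity_test(df, group_col, group_a, group_b):
    """Welch's t-test on claim amounts (rows with TotalClaims > 0) between two groups."""
    claims = df[df["TotalClaims"] > 0]
    a = claims.loc[claims[group_col] == group_a, "TotalClaims"]
    b = claims.loc[claims[group_col] == group_b, "TotalClaims"]
    # equal_var=False selects Welch's variant, which does not assume equal variances
    t_stat, p_value = ttest_ind(a, b, equal_var=False)
    return {
        "results_by_gender": {group_a: float(a.mean()), group_b: float(b.mean())},
        "t_stat": float(t_stat),
        "p_value": float(p_value),
    }


# Toy data: both groups draw claims from the same distribution
rng = np.random.default_rng(1)
toy = pd.DataFrame({
    "Gender": rng.choice(["Male", "Female"], size=500),
    "TotalClaims": np.where(rng.random(500) < 0.3, rng.gamma(2.0, 3000.0, size=500), 0.0),
})

result = welch_severity_test(toy, "Gender", "Male", "Female")
print(result["results_by_gender"])
print(f"p = {result['p_value']:.3f}")
```

Welch's variant is a reasonable default when group sizes are very unequal, as with the 563,581 versus 6,179 split shown above.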


## ✅ Summary and Recommendations

Based on the statistical tests above, the business can decide how to update segmentation and pricing strategies.
- **Significant differences** suggest the feature influences risk and should be considered in pricing.
- **No significant differences** indicate the feature may not be needed for segmentation.

None of the tests above found a statistically significant difference, so we fail to reject the null hypothesis in each case.
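
The decision rule behind these statements can be sketched as a comparison of each p-value against a significance level. The function below is a hypothetical stand-in for `interpret_test_result`, whose source is not shown. The 0.05 threshold and the wording of the "reject" branch are assumptions; the "fail to reject" wording mirrors the output printed above.

```python
def interpret(test_name: str, p_value: float, alpha: float = 0.05) -> str:
    """Turn a p-value into a plain-language decision at significance level alpha."""
    if p_value < alpha:
        # Wording of this branch is assumed; it does not appear in the notebook output
        return (f"We reject the null hypothesis for {test_name} "
                f"(p = {p_value:.3f}). Significant difference observed.")
    return (f"We fail to reject the null hypothesis for {test_name} "
            f"(p = {p_value:.3f}). No significant difference observed.")


# p-values taken from the province and gender results above
print(interpret("Province Risk Differences", 0.611))
print(interpret("Gender Risk Differences", 0.127))
```

Failing to reject the null hypothesis is not the same as accepting it: the tests did not find evidence of a difference, which is weaker than showing that no difference exists.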