# Task 3: Hypothesis Testing

This notebook uses A/B-style hypothesis tests to assess whether risk differs significantly across customer segments such as provinces, postal codes, and genders.

**Goals**:
- Validate or reject predefined hypotheses using statistical tests.
- Focus on metrics: Claim Frequency, Loss Ratio, Margin, Claim Severity.
- Use t-tests or ANOVA, Mann–Whitney U, and chi-squared tests as appropriate.

### Load Data

In [None]:
import pandas as pd
import numpy as np

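# Load the raw pipe-delimited data (swap in the cleaned EDA output if one was saved)
df = pd.read_csv("../data/raw/MachineLearningRating_v3.txt", sep='|', parse_dates=['TransactionMonth'])

# Derived metrics
# LossRatio: claims per unit of premium; NaN when premium is not positive
df['LossRatio'] = np.where(df['TotalPremium'] > 0, df['TotalClaims'] / df['TotalPremium'], np.nan)
# Margin: premium left after paying claims
df['Margin'] = df['TotalPremium'] - df['TotalClaims']
# HasClaim: 1 if the row has a positive claim amount, else 0
df['HasClaim'] = (df['TotalClaims'] > 0).astype(int)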

# View sample
df[['Province', 'Gender', 'LossRatio', 'Margin', 'HasClaim']].head()
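
The goals list four metrics, but the cell above derives only three row-level columns, and Claim Severity is not computed at all. The sketch below shows one way all four could be summarised per segment. The helper name `segment_kpis` and the toy rows are hypothetical, and the definitions are assumptions: severity is taken as the mean claim amount among rows with a claim, and the segment loss ratio as total claims over total premium, which is not the same as the mean of row-level ratios used later in this notebook.

```python
import pandas as pd


def segment_kpis(data: pd.DataFrame, by: str) -> pd.DataFrame:
    """Summarise the four risk metrics for each level of `by` (hypothetical helper)."""
    has_claim = data["TotalClaims"] > 0
    grouped = data.assign(
        HasClaim=has_claim.astype(int),
        # Claim amount only where a claim occurred, NaN elsewhere
        ClaimAmount=data["TotalClaims"].where(has_claim),
    ).groupby(by)

    total_premium = grouped["TotalPremium"].sum()
    total_claims = grouped["TotalClaims"].sum()
    return pd.DataFrame({
        "Rows": grouped.size(),
        "ClaimFrequency": grouped["HasClaim"].mean(),     # share of rows with a claim
        "ClaimSeverity": grouped["ClaimAmount"].mean(),   # mean() skips NaN rows
        "Margin": total_premium - total_claims,           # segment total, not per row
        "LossRatio": total_claims / total_premium,        # aggregate ratio
    })


# Synthetic rows for illustration only; not taken from the real data
toy = pd.DataFrame({
    "Province": ["A", "A", "A", "B", "B"],
    "TotalPremium": [100.0, 100.0, 200.0, 50.0, 150.0],
    "TotalClaims": [0.0, 80.0, 0.0, 0.0, 300.0],
})
kpis = segment_kpis(toy, "Province")
print(kpis)
```

On the real data the same call would be `segment_kpis(df, 'Province')` or `segment_kpis(df, 'Gender')`. A segment whose total premium is zero would give an undefined or infinite loss ratio and would need separate handling.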

## Step 1: Hypothesis H1 – Risk Differences Across Provinces

- **Null Hypothesis (H₀ for H1)**: Mean Loss Ratio is the same across all provinces.
- **Metric**: LossRatio (continuous).
- **Test**: One-way ANOVA when more than two provinces are compared; a two-sample t-test when there are exactly two.

In [None]:
# Group counts
province_counts = df['Province'].value_counts()
print(province_counts)

# Keep provinces with at least 30 records
valid_provinces = province_counts[province_counts >= 30].index.tolist()
df_province = df[df['Province'].isin(valid_provinces)]

print("Filtered provinces:", valid_provinces)

### Visualize Group Differences

In [None]:
import matplotlib.pyplot as plt

# Mean LossRatio per province
loss_by_province = df_province.groupby('Province')['LossRatio'].mean().sort_values(ascending=False)

# Bar plot
plt.figure(figsize=(8, 4))
loss_by_province.plot(kind='bar', color='coral', edgecolor='black')
plt.title('Mean Loss Ratio by Province')
plt.ylabel('Mean Loss Ratio')
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

### Run One-Way ANOVA Test

In [None]:
from scipy.stats import f_oneway

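# Collect non-missing LossRatio values per province.
# The size threshold is applied to non-missing values, since NaN rows do not enter the test.
groups = [
    values.dropna()
    for _, values in df_province.groupby('Province')['LossRatio']
    if values.notna().sum() >= 30
]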

# Run ANOVA (1-way)
stat, p_value = f_oneway(*groups)
print(f"ANOVA result: F-stat = {stat:.3f}, p-value = {p_value:.4f}")

if p_value < 0.05:
    print("✅ Reject null hypothesis: significant difference in LossRatio across provinces.")
else:
    print("❌ Fail to reject null hypothesis: no significant difference found.")
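
One-way ANOVA assumes roughly normal residuals and similar variances across groups, and loss ratios are often heavily right-skewed. The sketch below shows one way those assumptions could be checked, with a Kruskal–Wallis test as a rank-based alternative. It is an illustration under stated assumptions, not part of the original analysis: the helper `check_anova_assumptions`, the subsampling cap `max_n`, and the synthetic exponential samples are all hypothetical.

```python
import numpy as np
from scipy.stats import kruskal, levene, shapiro


def check_anova_assumptions(groups, max_n=5000, seed=0):
    """Return per-group Shapiro-Wilk p-values and the Levene p-value (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    shapiro_p = []
    for g in groups:
        g = np.asarray(g, dtype=float)
        # Shapiro-Wilk p-values may be inaccurate for very large samples, so subsample
        if len(g) > max_n:
            g = rng.choice(g, size=max_n, replace=False)
        _, p = shapiro(g)
        shapiro_p.append(p)
    # Median-centred Levene test is more robust for skewed data
    _, levene_p = levene(*groups, center="median")
    return shapiro_p, levene_p


# Synthetic right-skewed samples for three hypothetical groups (not real data)
rng = np.random.default_rng(42)
demo_groups = [rng.exponential(scale=s, size=200) for s in (1.0, 1.0, 3.0)]

shapiro_p, levene_p = check_anova_assumptions(demo_groups)
h_stat, kw_p = kruskal(*demo_groups)

print("Shapiro-Wilk p-values:", [f"{p:.4g}" for p in shapiro_p])
print(f"Levene p = {levene_p:.4g}")
print(f"Kruskal-Wallis: H = {h_stat:.3f}, p = {kw_p:.4g}")
```

Applied to this notebook, `groups` from the ANOVA cell could be passed in place of `demo_groups`. If the normality or equal-variance checks fail clearly, the Kruskal–Wallis result is arguably the safer one to report alongside the F-test.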

## Step 2: Hypothesis H4 – Risk Differences by Gender

### H4a – Claim Frequency
- **Null Hypothesis**: Claim frequency is equal for males and females.
- **Metric**: HasClaim (binary)
- **Test**: Chi-squared test

### H4b – Claim Severity
- **Null Hypothesis**: Claim severity is equal for males and females.
- **Metric**: TotalClaims (for HasClaim == 1)
- **Test**: Mann–Whitney U test (non-parametric)

### Claim Frequency by Gender (Chi-squared)

In [None]:
from scipy.stats import chi2_contingency

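# Build contingency table (rows: every Gender category present, columns: HasClaim 0/1).
# Note: any category other than Male/Female in Gender is included in the test.
contingency = pd.crosstab(df['Gender'], df['HasClaim'])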
print("Contingency Table:\n", contingency)

# Chi-squared test
stat, p, dof, expected = chi2_contingency(contingency)

print(f"\nChi-squared test:\nChi² = {stat:.3f}, p = {p:.4f}")
if p < 0.05:
    print("✅ Reject H₀: Claim frequency differs by gender.")
else:
    print("❌ Fail to reject H₀: No significant difference in claim frequency between genders.")
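
The null hypothesis for H4a is stated for males and females only, while `pd.crosstab` keeps every level of `Gender`. If the column has other levels, the test above compares all of them at once. The sketch below shows one way to restrict the test to two named levels and to report the claim rates next to the p-value. The helper `claim_frequency_test`, the `'Not specified'` level, and the toy counts are assumptions for illustration only.

```python
import pandas as pd
from scipy.stats import chi2_contingency


def claim_frequency_test(data, group_col="Gender", levels=("Male", "Female")):
    """Chi-squared test of HasClaim across exactly the requested levels (hypothetical helper)."""
    subset = data[data[group_col].isin(levels)]
    table = pd.crosstab(subset[group_col], subset["HasClaim"])
    # For a 2x2 table SciPy applies Yates' continuity correction by default
    chi2, p, dof, _ = chi2_contingency(table)
    rates = subset.groupby(group_col)["HasClaim"].mean()
    return table, rates, chi2, p, dof


# Synthetic counts for illustration only; not taken from the real data
toy = pd.DataFrame({
    "Gender": ["Male"] * 100 + ["Female"] * 100 + ["Not specified"] * 50,
    "HasClaim": [1] * 10 + [0] * 90 + [1] * 20 + [0] * 80 + [1] * 5 + [0] * 45,
})
table, rates, chi2, p, dof = claim_frequency_test(toy)

print(table)
print(rates)
print(f"Chi² = {chi2:.3f}, dof = {dof}, p = {p:.4f}")
```

Reporting the per-group claim rates matters because, with a large sample, a small p-value can correspond to a difference too small to affect pricing.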

### Claim Severity by Gender (Mann–Whitney)

In [None]:
from scipy.stats import mannwhitneyu

# Filter to only those who had claims
claimants = df[df['HasClaim'] == 1]

# Check sample sizes
print(claimants['Gender'].value_counts())

# Group values
male_claims = claimants.loc[claimants['Gender'] == 'Male', 'TotalClaims'].dropna()
female_claims = claimants.loc[claimants['Gender'] == 'Female', 'TotalClaims'].dropna()

# Mann–Whitney test
stat, p_value = mannwhitneyu(male_claims, female_claims, alternative='two-sided')

print(f"\nMann–Whitney U test on Claim Severity:\nU = {stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("✅ Reject H₀: Claim severity differs between genders.")
else:
    print("❌ Fail to reject H₀: No significant difference in claim severity between genders.")
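
The U statistic and p-value alone do not say how large the difference is. A common companion is the common-language effect size, the probability that a randomly chosen value from the first group exceeds one from the second, and the rank-biserial correlation derived from it. The sketch below is a minimal illustration: the helper `mwu_with_effect_size` and the synthetic lognormal samples are hypothetical, and it assumes SciPy 1.7 or later, where the returned statistic is documented as U for the first sample.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def mwu_with_effect_size(x, y):
    """Two-sided Mann-Whitney U plus two effect sizes (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    u_stat, p = mannwhitneyu(x, y, alternative="two-sided")
    # P(X > Y) + 0.5 * P(X == Y); 0.5 means neither group tends to be larger
    cles = u_stat / (len(x) * len(y))
    # Rescaled to [-1, 1]; 0 means no tendency either way
    rank_biserial = 2 * cles - 1
    return u_stat, p, cles, rank_biserial


# Synthetic skewed "claim amounts" drawn from the same distribution (not real data)
rng = np.random.default_rng(0)
group_a = rng.lognormal(mean=9.0, sigma=1.0, size=300)
group_b = rng.lognormal(mean=9.0, sigma=1.0, size=200)

u_stat, p, cles, rank_biserial = mwu_with_effect_size(group_a, group_b)
print(f"U = {u_stat:.1f}, p = {p:.4f}")
print(f"P(A > B) = {cles:.3f}, rank-biserial r = {rank_biserial:.3f}")
print(f"Medians: A = {np.median(group_a):.0f}, B = {np.median(group_b):.0f}")
```

In this notebook the call would be `mwu_with_effect_size(male_claims, female_claims)`. Printing the group medians alongside the test gives a figure in currency units that is easier to interpret than U.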


### H4a – Claim Frequency (Chi-squared)

- **Metric**: HasClaim
- **Test**: Chi-squared test
- **p-value**: 0.0266
- ✅ **Conclusion**: Reject H₀ at α = 0.05 → Male and female claim frequencies differ.

---

### H4b – Claim Severity (Mann–Whitney U)

- **Metric**: TotalClaims (HasClaim == 1)
- **Test**: Mann–Whitney U
- **p-value**: 0.2235
- ❌ **Conclusion**: Fail to reject H₀ → No strong evidence of difference in claim severity between genders.

---

**Business Implication**: Claim frequency appears to differ by gender (p = 0.0266), while claim severity shows no significant difference (p = 0.2235). Pricing strategies could account for the frequency difference, provided doing so complies with regulatory fairness requirements.
