# Module 3: Uninsurability Risk Classification

This module trains ML classifiers that predict which US counties face insurance affordability crises, using a composite risk target derived from claim severity, disaster exposure, FEMA Housing Assistance, and aggregate claims.

1. **Gradient Boosting Classifier** — primary tree-based ensemble model
2. **Random Forest Classifier** — bagging-based comparison model
3. **SHAP Analysis** — feature importance and interaction effects

**Data:** County-year panel (2004–2024), 25,760 observations across 3,240 US counties.  
**Target:** Composite uninsurability risk (~18% positive rate) based on high claim severity, cumulative disaster exposure, FEMA Housing Assistance damage, and high total claims paid.  
**Features:** 32 predictive features (disaster exposure, demographics, macro indicators, lagged claims). Current-year claims data are excluded because the target is derived from them, so including them would leak the label into the features.
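
As a rough illustration of what "lagged claims" means here, the sketch below builds a one-year lag per county on a made-up toy panel. The column name `claim_count_lag1` and all values are hypothetical; the actual feature pipeline is not shown in this notebook and may differ.

```python
import pandas as pd

# Hypothetical toy panel; column names mirror the notebook, values are made up.
toy = pd.DataFrame({
    "county_fips": ["22071", "22071", "22071", "12086", "12086", "12086"],
    "year":        [2019, 2020, 2021, 2019, 2020, 2021],
    "claim_count": [10, 40, 25, 5, 0, 60],
})

# Sort so that shift(1) really means "previous row within the same county".
toy = toy.sort_values(["county_fips", "year"]).reset_index(drop=True)

# Lag within each county: the first year of every county has no history -> NaN.
# The model then sees last year's claims, never the current year's.
toy["claim_count_lag1"] = toy.groupby("county_fips")["claim_count"].shift(1)

print(toy)
```

Note that `shift(1)` lags by row, not by calendar year, so a county with missing years would need reindexing to a full year range first.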

In [None]:
import sys
sys.path.insert(0, "..")

import pickle
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import seaborn as sns
import plotly.express as px
import shap

from src.utils.config import MODELS_DIR, REPORTS_FIGURES, DATA_PROCESSED

REPORTS_FIGURES.mkdir(parents=True, exist_ok=True)

sns.set_theme(style="whitegrid")
plt.rcParams["figure.figsize"] = (14, 6)
plt.rcParams["figure.dpi"] = 100

print("Setup complete.")

## 1. Load Model Results

In [None]:
# Load all saved results
comparison = pd.read_csv(MODELS_DIR / "classifier_comparison_metrics.csv")
feature_importance = pd.read_csv(MODELS_DIR / "classifier_feature_importance.csv")
shap_importance = pd.read_csv(MODELS_DIR / "classifier_shap_importance.csv")
roc_data = pd.read_csv(MODELS_DIR / "classifier_roc_curves.csv")
predictions = pd.read_csv(MODELS_DIR / "classifier_predictions.csv", dtype={"county_fips": str})
cv_results = pd.read_csv(MODELS_DIR / "classifier_cv_results.csv")
risk_scores = pd.read_csv(MODELS_DIR / "classifier_risk_scores.csv", dtype={"county_fips": str})
thresholds = pd.read_csv(MODELS_DIR / "classifier_target_thresholds.csv")
feature_names = pd.read_csv(MODELS_DIR / "classifier_feature_names.csv")["feature"].tolist()

# Load SHAP explanation
with open(MODELS_DIR / "classifier_shap_values.pkl", "rb") as f:
    shap_explanation = pickle.load(f)

# Load panel for context
panel = pd.read_csv(DATA_PROCESSED / "county_year_panel_glm_ready.csv", dtype={"county_fips": str})

print(f"Panel dataset: {panel.shape}")
print(f"Features used: {len(feature_names)}")
print(f"Counties scored: {len(risk_scores):,}")
print(f"\n=== Model Comparison ===")
display(comparison.round(4))

## 2. Target Variable: Composite Uninsurability Risk

The target is constructed from four risk signals (thresholds computed from training data only to prevent leakage):

1. **High claim severity** — avg_claim_severity >= 75th percentile of positive claims
2. **High cumulative disasters** — cum_disasters_3yr >= 75th percentile
3. **FEMA Housing Assistance damage** — ha_total_damage > $0
4. **High total claims paid** — total_claims_paid >= 75th percentile

A county-year is labeled **high risk** if it meets **>= 2 of 4** signals.
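
The rule above can be sketched end to end on synthetic data. This is a minimal illustration, not the pipeline's code: the toy values are random, and it assumes "75th percentile of positive claims" means the quantile over training rows with `avg_claim_severity > 0`.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200

# Synthetic county-year rows (made-up values, notebook column names).
toy = pd.DataFrame({
    "year": rng.integers(2018, 2025, size=n),
    "avg_claim_severity": rng.gamma(2.0, 10_000, size=n) * rng.integers(0, 2, size=n),
    "cum_disasters_3yr": rng.integers(0, 8, size=n),
    "ha_total_damage": rng.gamma(1.0, 50_000, size=n) * (rng.random(n) < 0.2),
    "total_claims_paid": rng.gamma(2.0, 100_000, size=n),
})

# Thresholds come from the training years only, so the test years
# cannot influence how the label is defined.
train = toy[toy["year"] <= 2021]
sev_thr = train.loc[train["avg_claim_severity"] > 0, "avg_claim_severity"].quantile(0.75)
dis_thr = train["cum_disasters_3yr"].quantile(0.75)
cp_thr = train["total_claims_paid"].quantile(0.75)

# The four binary signals, applied to every row (train and test alike).
signals = pd.DataFrame({
    "sig_severity": toy["avg_claim_severity"] >= sev_thr,
    "sig_disasters": toy["cum_disasters_3yr"] >= dis_thr,
    "sig_ha": toy["ha_total_damage"] > 0,
    "sig_claims": toy["total_claims_paid"] >= cp_thr,
}).astype(int)

# High risk = at least 2 of the 4 signals fire.
toy["risk_signals"] = signals.sum(axis=1)
toy["uninsurability_risk"] = (toy["risk_signals"] >= 2).astype(int)

print(toy["risk_signals"].value_counts().sort_index())
print(f"Positive rate: {toy['uninsurability_risk'].mean():.1%}")
```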

In [None]:
# Display thresholds
print("Target Construction Thresholds (from training data <= 2021):")
print(f"  Claim severity:     >= ${thresholds['severity_threshold'].values[0]:,.0f}")
print(f"  Cum disasters 3yr:  >= {thresholds['disaster_threshold'].values[0]:.1f}")
print(f"  Total claims paid:  >= ${thresholds['claims_paid_threshold'].values[0]:,.0f}")
print(f"  HA damage:          > $0")
print(f"  Min signals needed: {int(thresholds['min_signals'].values[0])}")

# Recreate target for visualization
sev_thr = thresholds['severity_threshold'].values[0]
dis_thr = thresholds['disaster_threshold'].values[0]
cp_thr = thresholds['claims_paid_threshold'].values[0]

panel['sig_severity'] = (panel['avg_claim_severity'] >= sev_thr).astype(int)
panel['sig_disasters'] = (panel['cum_disasters_3yr'] >= dis_thr).astype(int)
panel['sig_ha'] = (panel['ha_total_damage'] > 0).astype(int)
panel['sig_claims'] = (panel['total_claims_paid'] >= cp_thr).astype(int)
panel['risk_signals'] = panel['sig_severity'] + panel['sig_disasters'] + panel['sig_ha'] + panel['sig_claims']
panel['uninsurability_risk'] = (panel['risk_signals'] >= 2).astype(int)

# Signal rates
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Signal positive rates
signals = {
    'High Claim\nSeverity': panel['sig_severity'].mean(),
    'High Cumulative\nDisasters': panel['sig_disasters'].mean(),
    'FEMA HA\nDamage': panel['sig_ha'].mean(),
    'High Total\nClaims Paid': panel['sig_claims'].mean(),
}
colors = ['#e74c3c', '#e67e22', '#f1c40f', '#2ecc71']
axes[0].bar(signals.keys(), [v * 100 for v in signals.values()], color=colors, edgecolor='white')
axes[0].set_ylabel('Positive Rate (%)')
axes[0].set_title('Risk Signal Positive Rates')
for i, v in enumerate(signals.values()):
    axes[0].text(i, v * 100 + 0.5, f'{v:.1%}', ha='center', fontweight='bold')

# Train vs test target distribution
train_rate = panel[panel['year'] <= 2021]['uninsurability_risk'].mean()
test_rate = panel[panel['year'] > 2021]['uninsurability_risk'].mean()
bars = axes[1].bar(['Train (2004-2021)', 'Test (2022-2024)'], 
                    [train_rate * 100, test_rate * 100],
                    color=['#3498db', '#e74c3c'], edgecolor='white')
axes[1].set_ylabel('High Risk Rate (%)')
axes[1].set_title('Uninsurability Risk Rate: Train vs Test')
for bar, rate in zip(bars, [train_rate, test_rate]):
    axes[1].text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.3,
                 f'{rate:.1%}', ha='center', fontweight='bold')

plt.tight_layout()
plt.savefig(REPORTS_FIGURES / "uninsurability_target_distribution.png", bbox_inches="tight", dpi=150)
plt.show()

print(f"\nOverall positive rate: {panel['uninsurability_risk'].mean():.1%}")
print(f"Train positive rate:   {train_rate:.1%}")
print(f"Test positive rate:    {test_rate:.1%}")
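# Only print the interpretation when the data actually support it
if test_rate > train_rate:
    print("\nThe higher test rate is consistent with the accelerating climate risk found in Module 1.")
else:
    print("\nThe test rate is not higher than the train rate in this run.")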

## 3. Model Comparison

In [None]:
# Formatted comparison table
comp_display = comparison.copy()
comp_display['cv_auc'] = comp_display.apply(
    lambda r: f"{r['cv_auc_mean']:.4f} \u00b1 {r['cv_auc_std']:.4f}", axis=1
)
comp_display['cv_ap'] = comp_display.apply(
    lambda r: f"{r['cv_ap_mean']:.4f} \u00b1 {r['cv_ap_std']:.4f}", axis=1
)

print("="*70)
print("MODEL COMPARISON: Gradient Boosting vs Random Forest")
print("="*70)
display(
    comp_display[['model', 'auc_roc', 'avg_precision', 'f1_score', 
                   'brier_score', 'cv_auc', 'cv_ap']]
    .set_index('model')
    .rename(columns={
        'auc_roc': 'AUC-ROC (Test)',
        'avg_precision': 'Avg Precision (Test)',
        'f1_score': 'F1 Score (Test)',
        'brier_score': 'Brier Score (Test)',
        'cv_auc': '5-Fold CV AUC-ROC',
        'cv_ap': '5-Fold CV Avg Precision',
    })
    .T
)

best_model = comparison.loc[comparison['auc_roc'].idxmax(), 'model']
print(f"\nBest model: {best_model}")

## 4. ROC and Precision-Recall Curves

In [None]:
from sklearn.metrics import precision_recall_curve, average_precision_score

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

gb_auc = comparison[comparison['model'] == 'Gradient Boosting']['auc_roc'].values[0]
rf_auc = comparison[comparison['model'] == 'Random Forest']['auc_roc'].values[0]

# ROC Curve
axes[0].plot(roc_data['gb_fpr'].dropna(), roc_data['gb_tpr'].dropna(),
             color='#e74c3c', linewidth=2, label=f'Gradient Boosting (AUC={gb_auc:.3f})')
axes[0].plot(roc_data['rf_fpr'].dropna(), roc_data['rf_tpr'].dropna(),
             color='#3498db', linewidth=2, label=f'Random Forest (AUC={rf_auc:.3f})')
axes[0].plot([0, 1], [0, 1], 'k--', linewidth=1, alpha=0.5, label='Random (AUC=0.500)')
axes[0].fill_between(roc_data['gb_fpr'].dropna(), roc_data['gb_tpr'].dropna(), alpha=0.05, color='#e74c3c')
axes[0].set_xlabel('False Positive Rate')
axes[0].set_ylabel('True Positive Rate')
axes[0].set_title('ROC Curve: Uninsurability Risk Classification')
axes[0].legend(loc='lower right')

# Precision-Recall Curve
gb_prec, gb_rec, _ = precision_recall_curve(predictions['y_test'], predictions['y_prob_gb'])
rf_prec, rf_rec, _ = precision_recall_curve(predictions['y_test'], predictions['y_prob_rf'])
gb_ap = average_precision_score(predictions['y_test'], predictions['y_prob_gb'])
rf_ap = average_precision_score(predictions['y_test'], predictions['y_prob_rf'])

axes[1].plot(gb_rec, gb_prec, color='#e74c3c', linewidth=2, label=f'Gradient Boosting (AP={gb_ap:.3f})')
axes[1].plot(rf_rec, rf_prec, color='#3498db', linewidth=2, label=f'Random Forest (AP={rf_ap:.3f})')
baseline = predictions['y_test'].mean()
axes[1].axhline(y=baseline, color='k', linestyle='--', linewidth=1, alpha=0.5, label=f'Baseline={baseline:.3f}')
axes[1].set_xlabel('Recall')
axes[1].set_ylabel('Precision')
axes[1].set_title('Precision-Recall Curve')
axes[1].legend(loc='upper right')

plt.tight_layout()
plt.savefig(REPORTS_FIGURES / "classifier_roc_pr_curves.png", bbox_inches="tight", dpi=150)
plt.show()

## 5. Confusion Matrices

In [None]:
from sklearn.metrics import confusion_matrix

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

for ax, prob_col, name, color in [
    (axes[0], 'y_prob_gb', 'Gradient Boosting', 'Reds'),
    (axes[1], 'y_prob_rf', 'Random Forest', 'Blues'),
]:
    threshold = comparison[comparison['model'] == name]['optimal_threshold'].values[0]
    y_pred = (predictions[prob_col] >= threshold).astype(int)
    cm = confusion_matrix(predictions['y_test'], y_pred)
    
    sns.heatmap(cm, annot=True, fmt='d', cmap=color, ax=ax,
                xticklabels=['Low Risk', 'High Risk'],
                yticklabels=['Low Risk', 'High Risk'])
    ax.set_xlabel('Predicted')
    ax.set_ylabel('Actual')
    ax.set_title(f'{name}\n(threshold={threshold:.3f})')

plt.suptitle('Confusion Matrices (Optimal Threshold via Youden\'s J)', fontsize=13, y=1.02)
plt.tight_layout()
plt.savefig(REPORTS_FIGURES / "classifier_confusion_matrices.png", bbox_inches="tight", dpi=150)
plt.show()
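
The notebook only loads a precomputed `optimal_threshold`; the plot title attributes it to Youden's J. As a hedged sketch of how such a cutoff could be derived, the NumPy-only helper below (a hypothetical `youden_threshold`, not taken from the pipeline) scans every distinct score and keeps the one that maximizes J = TPR - FPR.

```python
import numpy as np

def youden_threshold(y_true, y_prob):
    """Return (cutoff, J) for the cutoff that maximizes J = TPR - FPR."""
    y_true = np.asarray(y_true).astype(bool)
    y_prob = np.asarray(y_prob, dtype=float)
    candidates = np.unique(y_prob)      # every distinct score is a possible cutoff
    best_t, best_j = candidates[0], -np.inf
    for t in candidates:
        pred = y_prob >= t              # same ">=" rule the notebook applies
        tpr = pred[y_true].mean()       # share of actual positives flagged
        fpr = pred[~y_true].mean()      # share of actual negatives flagged
        j = tpr - fpr
        if j > best_j:                  # strict ">" keeps the lowest cutoff on ties
            best_t, best_j = t, j
    return best_t, best_j

# Tiny made-up example: three positives, five negatives.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.70, 0.90]
threshold, j = youden_threshold(y_true, y_prob)
print(f"threshold={threshold:.2f}, J={j:.2f}")  # threshold=0.40, J=0.80
```

The helper assumes both classes are present. Youden's J weights false positives and false negatives equally, which may not match the real cost of missing an at-risk county.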

## 6. Feature Importance (sklearn Impurity-Based)

In [None]:
fig, axes = plt.subplots(1, 2, figsize=(16, 7))

top_n = 15

# Gradient Boosting
gb_top = feature_importance.nlargest(top_n, 'gb_importance')
axes[0].barh(range(top_n), gb_top['gb_importance'].values[::-1], color='#e74c3c', alpha=0.8)
axes[0].set_yticks(range(top_n))
axes[0].set_yticklabels(gb_top['feature'].values[::-1])
axes[0].set_xlabel('Feature Importance')
axes[0].set_title('Gradient Boosting — Top 15 Features')

# Random Forest
rf_top = feature_importance.nlargest(top_n, 'rf_importance')
axes[1].barh(range(top_n), rf_top['rf_importance'].values[::-1], color='#3498db', alpha=0.8)
axes[1].set_yticks(range(top_n))
axes[1].set_yticklabels(rf_top['feature'].values[::-1])
axes[1].set_xlabel('Feature Importance')
axes[1].set_title('Random Forest — Top 15 Features')

plt.suptitle('sklearn Feature Importance (Impurity-Based)', fontsize=13, y=1.02)
plt.tight_layout()
plt.savefig(REPORTS_FIGURES / "classifier_feature_importance.png", bbox_inches="tight", dpi=150)
plt.show()

## 7. SHAP Analysis: Global Feature Importance

SHAP (SHapley Additive exPlanations) attributes each individual prediction to the input features using Shapley values from cooperative game theory. Unlike sklearn's impurity-based importance, which gives a single non-negative score per feature, SHAP values show both the **magnitude** and **direction** of each feature's contribution to individual predictions.
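
To make the idea concrete, the sketch below computes exact Shapley values by brute force for a hypothetical three-feature scoring function. It is for intuition only: the `shap` library uses much faster algorithms, and `toy_score`, the feature names, and the baseline are all made up for illustration.

```python
from itertools import permutations
from math import factorial

import numpy as np

def exact_shapley(f, x, baseline):
    """Exact Shapley values for one row x; 'absent' features take baseline values."""
    n = len(x)
    phi = np.zeros(n)
    for order in permutations(range(n)):
        z = np.array(baseline, dtype=float)
        prev = f(z)
        for i in order:
            z[i] = x[i]            # switch feature i from baseline to its real value
            cur = f(z)
            phi[i] += cur - prev   # marginal contribution of i in this ordering
            prev = cur
    return phi / factorial(n)      # average over all orderings

def toy_score(z):
    """Hypothetical risk score with a disasters x population interaction."""
    disasters, population, income = z
    return 0.10 * disasters + 0.05 * disasters * population - 0.02 * income

x = np.array([4.0, 2.0, 5.0])          # the row being explained
baseline = np.array([1.0, 1.0, 3.0])   # reference row ("average" county)

phi = exact_shapley(toy_score, x, baseline)
for name, value in zip(["disasters", "population", "income"], phi):
    print(f"{name:>10}: {value:+.3f}")
# disasters: +0.525, population: +0.125, income: -0.040

# Additivity: contributions sum to the gap between this prediction and the baseline.
print(phi.sum(), toy_score(x) - toy_score(baseline))
```

The sign gives the direction (income lowers this score, disasters raise it), and the interaction term is split between the two features involved.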

In [None]:
# SHAP bar plot (mean |SHAP value| per feature)
fig, ax = plt.subplots(figsize=(10, 8))

top_shap = shap_importance.head(20)
colors = plt.cm.YlOrRd(np.linspace(0.3, 0.9, len(top_shap)))

ax.barh(range(len(top_shap)), top_shap['mean_abs_shap'].values[::-1], 
        color=colors[::-1], edgecolor='white')
ax.set_yticks(range(len(top_shap)))
ax.set_yticklabels(top_shap['feature'].values[::-1])
ax.set_xlabel('Mean |SHAP Value|')
ax.set_title('SHAP Feature Importance: Top 20 Risk Drivers\n(Gradient Boosting Classifier)', fontsize=13)

plt.tight_layout()
plt.savefig(REPORTS_FIGURES / "shap_bar_plot.png", bbox_inches="tight", dpi=150)
plt.show()

print("Top 5 Risk Drivers (SHAP):")
for _, row in top_shap.head(5).iterrows():
    print(f"  {row['feature']}: mean |SHAP| = {row['mean_abs_shap']:.4f}")

## 8. SHAP Beeswarm Plot

Each dot is one county-year observation. The x-axis shows the SHAP value (how much that feature pushed the prediction toward high risk or low risk). The color shows the feature's actual value (red = high, blue = low).

In [None]:
fig = plt.figure(figsize=(12, 10))
shap.plots.beeswarm(shap_explanation, max_display=20, show=False)
plt.title('SHAP Beeswarm Plot: Feature Impact on Uninsurability Risk\n(Gradient Boosting)', fontsize=13)
plt.tight_layout()
plt.savefig(REPORTS_FIGURES / "shap_beeswarm_plot.png", bbox_inches="tight", dpi=150)
plt.show()

## 9. SHAP Dependence Plots

How the top 4 features influence risk predictions across their value ranges. SHAP automatically selects the best interaction feature (colored).

In [None]:
top_4_features = shap_importance.head(4)['feature'].tolist()

fig, axes = plt.subplots(2, 2, figsize=(14, 10))

for feature, ax in zip(top_4_features, axes.flatten()):
    # Passing the full Explanation as `color` lets SHAP pick, and color by,
    # the feature with the strongest estimated interaction.
    shap.plots.scatter(shap_explanation[:, feature], color=shap_explanation, ax=ax, show=False)
    ax.set_title(f'{feature}', fontsize=11)

plt.suptitle('SHAP Dependence Plots: Top 4 Risk Drivers', fontsize=13, y=1.02)
plt.tight_layout()
plt.savefig(REPORTS_FIGURES / "shap_dependence_plots.png", bbox_inches="tight", dpi=150)
plt.show()

## 10. Top At-Risk Counties

In [None]:
# Merge risk scores with panel context
county_context = (
    panel.groupby(['county_fips', 'state'])
    .agg(
        mean_severity=('avg_claim_severity', 'mean'),
        total_disasters=('total_disasters', 'sum'),
        total_claims=('claim_count', 'sum'),
        population=('total_population', 'mean'),
    )
    .reset_index()
)

top_risk = risk_scores.head(25).merge(county_context, on=['county_fips', 'state'], how='left')

print("Top 25 Highest-Risk Counties (by ML Risk Score)")
print("=" * 90)
display(
    top_risk[['risk_rank', 'county_fips', 'state', 'mean_risk', 'max_risk',
              'total_disasters', 'mean_severity', 'total_claims', 'population']]
    .rename(columns={
        'risk_rank': 'Rank',
        'county_fips': 'FIPS',
        'state': 'State',
        'mean_risk': 'Mean Risk',
        'max_risk': 'Max Risk',
        'total_disasters': 'Disasters',
        'mean_severity': 'Avg Severity ($)',
        'total_claims': 'Total Claims',
        'population': 'Population',
    })
    .round(2)
    .reset_index(drop=True)
)

# State summary
print("\nState Distribution of Top 25:")
print(top_risk['state'].value_counts().to_string())

## 11. Geographic Visualization: County Risk Map

In [None]:
fig = px.choropleth(
    risk_scores,
    geojson="https://raw.githubusercontent.com/plotly/datasets/master/geojson-counties-fips.json",
    locations="county_fips",
    color="mean_risk",
    color_continuous_scale="YlOrRd",
    scope="usa",
    title="Uninsurability Risk Score by County (ML Model Prediction)",
    labels={"mean_risk": "Risk Score"},
    range_color=[0, risk_scores['mean_risk'].quantile(0.95)],
)
fig.update_layout(margin={"r": 0, "t": 40, "l": 0, "b": 0}, width=1000, height=600)
fig.show()

## 12. Risk Score Distribution by FEMA Region

In [None]:
# Map counties to FEMA regions
from src.utils.config import FEMA_REGIONS

state_to_region = {}
for region, states in FEMA_REGIONS.items():
    for s in states:
        state_to_region[s] = region

risk_with_region = risk_scores.copy()
risk_with_region['state_fips'] = risk_with_region['county_fips'].str[:2]
risk_with_region['fema_region'] = risk_with_region['state_fips'].map(state_to_region)

region_labels = {
    1: "R1: New England",
    2: "R2: NJ,NY",
    3: "R3: Mid-Atlantic",
    4: "R4: Southeast",
    5: "R5: Great Lakes",
    6: "R6: South-Central",
    7: "R7: Plains",
    8: "R8: Mountain",
    9: "R9: Pacific",
    10: "R10: Northwest",
}
risk_with_region['region_label'] = risk_with_region['fema_region'].map(region_labels)
risk_with_region = risk_with_region.dropna(subset=['region_label'])

# Box plot
fig, ax = plt.subplots(figsize=(14, 6))

order = (risk_with_region.groupby('region_label')['mean_risk']
         .median().sort_values(ascending=False).index)

sns.boxplot(data=risk_with_region, x='region_label', y='mean_risk', order=order,
            palette='YlOrRd_r', ax=ax)
ax.set_xlabel('FEMA Region')
ax.set_ylabel('Mean Risk Score')
ax.set_title('Uninsurability Risk Score Distribution by FEMA Region', fontsize=13)
plt.xticks(rotation=30, ha='right')

plt.tight_layout()
plt.savefig(REPORTS_FIGURES / "risk_by_fema_region.png", bbox_inches="tight", dpi=150)
plt.show()

# Regional summary
region_summary = (risk_with_region.groupby('region_label')['mean_risk']
                  .agg(['mean', 'median', 'count'])
                  .sort_values('mean', ascending=False)
                  .round(4))
print("\nRegional Risk Summary:")
display(region_summary)

## 13. Cross-Validation Results

In [None]:
print("5-Fold Stratified Cross-Validation Results")
print("=" * 60)

for model_name in ['Gradient Boosting', 'Random Forest']:
    model_cv = cv_results[cv_results['model'] == model_name]
    print(f"\n{model_name}:")
    display(model_cv[['fold', 'auc_roc', 'avg_precision', 'f1_score']].round(4))
    print(f"  Mean AUC-ROC:       {model_cv['auc_roc'].mean():.4f} +/- {model_cv['auc_roc'].std():.4f}")
    print(f"  Mean Avg Precision: {model_cv['avg_precision'].mean():.4f} +/- {model_cv['avg_precision'].std():.4f}")
    print(f"  Mean F1 Score:      {model_cv['f1_score'].mean():.4f} +/- {model_cv['f1_score'].std():.4f}")

## 14. Predicted Probability Distribution

In [None]:
fig, ax = plt.subplots(figsize=(10, 5))

low_risk = predictions[predictions['y_test'] == 0]['y_prob_gb']
high_risk = predictions[predictions['y_test'] == 1]['y_prob_gb']

ax.hist(low_risk, bins=50, alpha=0.6, color='#3498db', label=f'Low Risk (n={len(low_risk)})')
ax.hist(high_risk, bins=50, alpha=0.6, color='#e74c3c', label=f'High Risk (n={len(high_risk)})')

threshold = comparison[comparison['model'] == 'Gradient Boosting']['optimal_threshold'].values[0]
ax.axvline(x=threshold, color='black', linestyle='--', linewidth=2,
           label=f'Optimal threshold = {threshold:.3f}')

ax.set_xlabel('Predicted Risk Probability')
ax.set_ylabel('Count')
ax.set_title('Distribution of Predicted Risk Probabilities (Gradient Boosting)')
ax.legend()

plt.tight_layout()
plt.savefig(REPORTS_FIGURES / "classifier_prob_distribution.png", bbox_inches="tight", dpi=150)
plt.show()

## 15. Risk Trends Over Time

In [None]:
# Annual risk rate
annual_risk = (
    panel.groupby('year')
    .agg(
        risk_rate=('uninsurability_risk', 'mean'),
        n_high_risk=('uninsurability_risk', 'sum'),
        n_counties=('county_fips', 'nunique'),
    )
    .reset_index()
)

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Risk rate trend
axes[0].plot(annual_risk['year'], annual_risk['risk_rate'] * 100, 'o-', 
             color='#e74c3c', linewidth=2)
axes[0].axvline(x=2021.5, color='gray', linestyle='--', alpha=0.5, label='Train/Test split')
axes[0].set_xlabel('Year')
axes[0].set_ylabel('High Risk Rate (%)')
axes[0].set_title('Uninsurability Risk Rate Over Time')
axes[0].legend()

# Number of high-risk counties
axes[1].bar(annual_risk['year'], annual_risk['n_high_risk'], color='#e74c3c', alpha=0.7)
ax2 = axes[1].twinx()
ax2.plot(annual_risk['year'], annual_risk['n_counties'], 'bo-', linewidth=2, label='Total counties')
axes[1].set_xlabel('Year')
axes[1].set_ylabel('High-Risk County-Years', color='#e74c3c')
ax2.set_ylabel('Total Counties', color='blue')
axes[1].set_title('High-Risk Counties Over Time')
ax2.legend(loc='upper left')

plt.tight_layout()
plt.savefig(REPORTS_FIGURES / "risk_trends_over_time.png", bbox_inches="tight", dpi=150)
plt.show()

## 16. Key Findings Summary

### Model Performance
- **Gradient Boosting** achieved the best results: AUC-ROC = 0.83 (test), 0.87 (5-fold CV)
- **Random Forest** performed comparably: AUC-ROC = 0.82 (test), 0.86 (5-fold CV)
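- Both models clearly outperform random classification (AUC = 0.50)
- Low fold-to-fold CV variance indicates stable estimates; test AUC on the held-out 2022-2024 period is about 0.04 below the CV AUC for both models, so some performance is lost on future years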

### Top Risk Drivers (SHAP Analysis)
1. **Cumulative disaster exposure** (3-year rolling) — the strongest predictor of uninsurability risk
2. **Total population** — larger counties face higher aggregate risk exposure
3. **Lagged claim count** — previous year's claims are highly predictive of future risk
4. **Average incident duration** — longer-lasting events drive more severe outcomes
5. **Severe storm count** — frequent storms compound risk beyond individual events

### Geographic Risk Patterns
- **Louisiana** dominates the top risk rankings (7+ of top 10 counties)
- **Florida** and **Texas** are also heavily represented
- **FEMA Region 6** (TX, LA, AR, OK, NM) has the highest median risk — consistent with Module 2's finding of 154% higher claim severity
- Risk is accelerating: the test period (2022-2024) shows higher positive rates than training data

### Implications
- The model identifies counties where the convergence of disaster frequency, population exposure, and historical claims creates genuine uninsurability pressure
- Demographic and economic factors (income, home values, macro indicators) play a secondary but meaningful role
- The geographic concentration in Gulf Coast states matches real-world insurer behavior — several major insurers have already left Florida and Louisiana markets
- Early warning: counties with high cumulative disasters and rising claim counts should be monitored for insurer withdrawal risk