# Introduction to Necessity and Sufficiency Analysis

Welcome! This notebook introduces the key concepts behind evaluating the robustness of explainable AI (XAI) methods using necessity and sufficiency scores.

## What This Framework Does

This framework helps answer: **"Are XAI methods (LIME, SHAP) truly identifying important features?"**

We use two causal concepts:
1. **Necessity**: Is the feature essential for the prediction?
2. **Sufficiency**: Can the feature drive predictions on its own?

## Core Concepts

### Necessity

**Question**: "If I change this feature, will the model's prediction change?"

**High Necessity** = Changing the feature consistently flips the prediction

**Example**: 
- Model predicts "Malignant tumor" (Class 1)
- Change "tumor size" from large to small
- Prediction changes to "Benign" (Class 0)
- → Tumor size is NECESSARY for the malignant prediction
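
The question above can be turned into a rough score. The sketch below is a simplified illustration, not the framework's actual estimator: `toy_model`, `necessity_score`, and the resampling perturbation are all hypothetical. It estimates necessity as the fraction of perturbations that flip a prediction of the target class when one feature is replaced with a value drawn from the data.

```python
import random

def toy_model(x):
    """Hypothetical classifier: predicts 1 (malignant) when tumor size > 0.5."""
    size, texture = x
    return 1 if size > 0.5 else 0

def necessity_score(model, data, feature, target=1, n_draws=20, seed=0):
    """Fraction of perturbations that flip a `target` prediction.

    For each instance predicted as `target`, replace `feature` with values
    sampled from the data and count how often the prediction changes.
    """
    rng = random.Random(seed)
    flips = trials = 0
    for x in data:
        if model(x) != target:
            continue  # only instances that currently get the target prediction
        for _ in range(n_draws):
            x_new = list(x)
            x_new[feature] = rng.choice(data)[feature]
            flips += model(x_new) != target
            trials += 1
    return flips / trials if trials else 0.0

# Synthetic data: (tumor size, texture), both uniform in [0, 1)
rng = random.Random(42)
data = [(rng.random(), rng.random()) for _ in range(200)]

size_necessity = necessity_score(toy_model, data, feature=0)
texture_necessity = necessity_score(toy_model, data, feature=1)
print(f"tumor size: {size_necessity:.2f}, texture: {texture_necessity:.2f}")
```

Because the toy model ignores texture, perturbing it never flips a prediction, while perturbing tumor size flips roughly the share of draws that land below the decision threshold.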

### Sufficiency

**Question**: "Can this feature alone produce the target prediction?"

**High Sufficiency** = Setting the feature to a specific value reliably produces the outcome

**Example**:
- Model predicts "Benign tumor" (Class 0)
- Set "tumor size" to very large
- Prediction changes to "Malignant" (Class 1)
- → Large tumor size is SUFFICIENT for malignant prediction
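
A matching sketch for sufficiency, under the same caveats: `toy_model` and `sufficiency_score` are hypothetical names, and the intervention (setting one feature to a fixed value) is a simplification rather than the framework's estimator. The score is the fraction of non-target instances that switch to the target class once the feature is set.

```python
import random

def toy_model(x):
    """Hypothetical classifier: predicts 1 (malignant) when tumor size > 0.5."""
    size, texture = x
    return 1 if size > 0.5 else 0

def sufficiency_score(model, data, feature, value, target=1):
    """Fraction of non-target instances that become `target`
    when `feature` is set to `value` and everything else is left unchanged."""
    changed = trials = 0
    for x in data:
        if model(x) == target:
            continue  # only instances that do not yet get the target prediction
        x_new = list(x)
        x_new[feature] = value
        changed += model(x_new) == target
        trials += 1
    return changed / trials if trials else 0.0

# Synthetic data: (tumor size, texture), both uniform in [0, 1)
rng = random.Random(42)
data = [(rng.random(), rng.random()) for _ in range(200)]

size_sufficiency = sufficiency_score(toy_model, data, feature=0, value=0.9)
texture_sufficiency = sufficiency_score(toy_model, data, feature=1, value=0.9)
print(f"tumor size: {size_sufficiency:.2f}, texture: {texture_sufficiency:.2f}")
```

Setting tumor size to 0.9 always crosses the toy model's threshold, so it is fully sufficient here; setting texture has no effect.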

## Visual Intuition

![Concept Diagram](../images/concept_diagram.png)

### Interpreting Scores

| Necessity | Sufficiency | Interpretation |
|-----------|-------------|----------------|
| High | High | **Crucial feature** - Essential AND can drive decisions |
| High | Low | **Necessary but not sufficient** - Important but needs other features |
| Low | High | **Sufficient but not necessary** - Can drive decisions but alternatives exist |
| Low | Low | **Not important** - Neither essential nor driving decisions |
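
The table can be read as a simple lookup. The helper below is an illustrative sketch: the function name `interpret_scores` and the 0.5 cut-off between "low" and "high" are assumptions, and a sensible threshold will depend on the dataset and model.

```python
def interpret_scores(necessity, sufficiency, threshold=0.5):
    """Map a (necessity, sufficiency) pair to a row of the table above.

    `threshold` is an assumed cut-off between "low" and "high" scores.
    """
    high_n = necessity >= threshold
    high_s = sufficiency >= threshold
    if high_n and high_s:
        return "Crucial feature"
    if high_n:
        return "Necessary but not sufficient"
    if high_s:
        return "Sufficient but not necessary"
    return "Not important"

# Hypothetical (necessity, sufficiency) pairs, one per table row
examples = {"A": (0.85, 0.72), "B": (0.80, 0.15), "C": (0.10, 0.90), "D": (0.05, 0.03)}
labels = {name: interpret_scores(n, s) for name, (n, s) in examples.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```

A hard threshold is a convenience for reading results; scores near the cut-off are better treated as borderline than forced into one quadrant.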

## Why This Matters for XAI

### The Problem

LIME and SHAP rank features as "important", but:
- Do they agree with each other?
- Are these features truly causal?
- Can we trust the rankings?

### Our Solution

**Hypothesis**: If a feature is truly important, it should be:
1. **Necessary** (changing it affects predictions)
2. **Sufficient** (it can drive predictions)

**Robustness Test**: Do top-ranked features have high necessity and sufficiency?
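
One simple way to phrase this test in code is to compare the average score of the top-k ranked features against the rest. The sketch below uses made-up necessity scores and a made-up explainer ranking; `top_k_gap` is a hypothetical helper, not part of the framework's API.

```python
from statistics import mean

# Hypothetical global necessity scores per feature
necessity = {"X1": 0.05, "X2": 0.85, "X3": 0.45, "X4": 0.10}
# Hypothetical explainer ranking, most important first
ranking = ["X2", "X3", "X1", "X4"]

def top_k_gap(scores, ranking, k):
    """Mean score of the top-k ranked features minus the mean of the rest.

    A clearly positive gap suggests the explainer's top features also
    score high on the chosen measure (necessity or sufficiency).
    """
    if not 0 < k < len(ranking):
        raise ValueError("k must leave at least one feature on each side")
    top = [scores[f] for f in ranking[:k]]
    rest = [scores[f] for f in ranking[k:]]
    return mean(top) - mean(rest)

gap = top_k_gap(necessity, ranking, k=2)
print(f"top-2 necessity gap: {gap:.3f}")
```

The same helper can be applied to sufficiency scores by passing a different `scores` dictionary.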

![Robustness Concept](../images/robustness_concept.png)

## Quick Example

Let's see this in action with a simple example:

```python
import numpy as np
import matplotlib.pyplot as plt

# Simulate a simple scenario
# Feature X2 is important, X1 is not

# Imagine we calculated these scores:
features = ['X1 (irrelevant)', 'X2 (important)', 'X3 (moderate)']
necessity = [0.05, 0.85, 0.45]
sufficiency = [0.03, 0.72, 0.38]

# Plot
x = np.arange(len(features))
width = 0.35

fig, ax = plt.subplots(figsize=(10, 6))
bars1 = ax.bar(x - width/2, necessity, width, label='Necessity', color='#FF6B6B', alpha=0.8)
bars2 = ax.bar(x + width/2, sufficiency, width, label='Sufficiency', color='#4ECDC4', alpha=0.8)

ax.set_ylabel('Score', fontsize=12)
ax.set_title('Example: Feature Importance via Necessity & Sufficiency', fontsize=14, fontweight='bold')
ax.set_xticks(x)
ax.set_xticklabels(features)
ax.legend(fontsize=11)
ax.set_ylim([0, 1])
ax.grid(axis='y', alpha=0.3)

# Add value labels
for bars in [bars1, bars2]:
    for bar in bars:
        height = bar.get_height()
        ax.text(bar.get_x() + bar.get_width()/2., height + 0.02,
               f'{height:.2f}', ha='center', va='bottom', fontsize=10)

plt.tight_layout()
plt.show()

print("Interpretation:")
print("- X1: Low scores → Not important (as expected)")
print("- X2: High scores → Very important and causal")
print("- X3: Moderate scores → Somewhat important")
```

## The Framework Workflow

```
1. Train Model
   ↓
2. Calculate Global Necessity & Sufficiency
   - For each feature
   - Across many instances
   - Using forward counterfactuals
   ↓
3. Get LIME/SHAP Rankings
   - For test instances
   - Extract top-k features
   ↓
4. Evaluate Robustness
   - Do top ranks have high necessity?
   - Do top ranks have high sufficiency?
   - Is the pattern monotonic?
   ↓
5. Visualize & Interpret
   - Global scores
   - Robustness curves
   - Comparison between methods
```
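
For step 4, the "is the pattern monotonic?" check can be sketched as the share of adjacent rank positions where the score does not increase. The curves below are hypothetical values for two unnamed explainers, and `monotonic_fraction` is an illustrative helper rather than the framework's own metric.

```python
def monotonic_fraction(scores_by_rank):
    """Share of adjacent rank pairs where the score does not increase.

    `scores_by_rank[0]` is the score of the top-ranked feature. A value of
    1.0 means scores fall (or stay flat) all the way down the ranking.
    """
    pairs = list(zip(scores_by_rank, scores_by_rank[1:]))
    if not pairs:
        raise ValueError("need at least two ranked features")
    return sum(a >= b for a, b in pairs) / len(pairs)

# Hypothetical necessity scores at ranks 1..5 for two explainers
method_a = [0.85, 0.60, 0.45, 0.20, 0.05]
method_b = [0.60, 0.85, 0.45, 0.05, 0.20]

frac_a = monotonic_fraction(method_a)
frac_b = monotonic_fraction(method_b)
print(f"method A: {frac_a:.2f}, method B: {frac_b:.2f}")
```

A rank correlation between rank position and score would be a natural alternative if ties and larger feature sets need more careful handling.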

## Key Advantages

### 1. No Causal Model Needed
- Traditional causal methods require an explicit causal model, typically built from domain knowledge
- Our approach works directly with any trained model

### 2. Works on Complex Data
- High-dimensional features
- Sparse datasets
- Imbalanced classes

### 3. Model-Agnostic
- Logistic Regression
- Random Forests
- Neural Networks
- Any classifier!

### 4. Principled Evaluation
- Grounded in causal theory
- Validated on synthetic data
- Published methodology

## Real-World Applications

### Healthcare
- Validate feature importance in disease prediction
- Ensure critical symptoms are identified
- Compare diagnostic models

### Finance
- Credit scoring model validation
- Risk factor identification
- Regulatory compliance

### Geophysics (Original Application)
- Hydrocarbon prospect evaluation
- Seismic indicator assessment
- Drilling decision support

## Next Steps

Now that you understand the concepts, explore:

1. **02_toy_example.ipynb**
   - See validation with logical operators
   - Verify scores match theory

2. **03_full_analysis.ipynb**
   - Complete walkthrough on real data
   - Step-by-step analysis

3. **Run Quick Demo**
   ```bash
   cd ../src
   python demo.py
   ```

4. **Full Analysis**
   ```bash
   python main.py --dataset breast_cancer --model logistic
   ```

## References

### Papers Implemented

1. **Chowdhury et al. (2023)**
   - "Explaining Explainers: Necessity and Sufficiency in Tabular Data"
   - NeurIPS 2023 Workshop on Table Representation Learning

2. **Chowdhury et al. (2025)**
   - "A unified framework for evaluating the robustness of machine-learning interpretability"
   - Geophysics, Vol. 90, No. 3, pp. IM103-IM118

### Theoretical Foundations

- **Pearl (2009)**: Causality: Models, Reasoning, and Inference
- **Halpern (2016)**: Actual Causality
- **Swartz (1997)**: The Concepts of Necessary and Sufficient Conditions

### XAI Methods

- **LIME**: Ribeiro et al. (2016) - "'Why Should I Trust You?': Explaining the Predictions of Any Classifier"
- **SHAP**: Lundberg & Lee (2017) - "A Unified Approach to Interpreting Model Predictions"

## Questions?

For more details:
- See `../docs/methodology.md` for complete methodology
- Check `../README.md` for usage instructions
- Read the papers for theoretical background
- Open an issue on GitHub for support