# Impairment Scoring Agent

## Where We Are in the Workflow

**Previous Step (Workstream 3):** The detection agent analyzed multiple data sources (application, RX history, labs, MIB) and produced a structured JSON payload listing:
- Each medical impairment found (e.g., diabetes, hypertension)
- Scoring factors for each impairment (e.g., A1C level, blood pressure readings, medication compliance)
- Evidence supporting each finding

**This Step (Workstream 4):** The scoring agent takes that payload and calculates a **risk score** by:
1. Looking up rating tables from underwriting guidelines
2. Applying the scoring factors to determine debits (risk increases) and credits (risk decreases)
3. Calculating a numerical score for each impairment
4. Aggregating all impairment scores into a final total risk score

**Why This Matters:** The final risk score determines:
- Whether the application is approved, declined, or postponed
- What premium rate the applicant qualifies for (standard, preferred, rated)
- How much additional premium (if any) is charged for the identified risks

## What This Notebook Does

This notebook shows step-by-step how the Strands agent:
- Ingests the JSON payload from the detection agent
- Searches the knowledge base for rating tables specific to each impairment
- Applies underwriting rules to calculate debits and credits
- Uses the calculator tool to sum scores accurately
- Returns a final aggregated risk assessment with detailed explanations

> **Why a notebook?**  
> • Allows for easy modification of the input payload to test different scenarios  
> • Lets you tweak the prompt or tools and iterate without redeploying Lambda functions  
> • Serves as executable documentation showing exactly how risk scores are calculated  
> • Makes the underwriting logic transparent and auditable


## Configuration and Sample Payload

**What:** Set up the scoring agent and define a sample input payload from the detection agent.

**Why:**

**Configuration Settings:**
- **Knowledge Base**: Toggle between local markdown files and Bedrock KB (same as detection agent)
- **Model**: Claude 3.7 Sonnet for interpreting rating tables and applying rules (arithmetic is delegated to the `calculator` tool)
- **Embedding Model**: For semantic search when using local knowledge base

**Sample Payload Structure:**
The `impairments_payload` shows what the detection agent (Workstream 3) outputs:
- `impairment_id`: Canonical name matching knowledge base documents (e.g., "hypertension")
- `scoring_factors`: Key-value pairs needed for rating tables (blood pressure, age, medications, duration, etc.)
- `evidence`: Supporting data points with their sources (for audit trail)
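
A structural check can catch malformed payloads before they reach the agent. The following is a minimal sketch only: the function name `validate_impairments_payload` and the rule that all three keys are mandatory are assumptions for illustration, not part of the detection agent's documented contract.

```python
# Assumption: all three keys are required on every impairment object.
REQUIRED_KEYS = {"impairment_id", "scoring_factors", "evidence"}

def validate_impairments_payload(payload):
    """Return a list of problems; an empty list means the payload looks well-formed."""
    if not isinstance(payload, list):
        return ["payload must be a list of impairment objects"]
    problems = []
    for i, item in enumerate(payload):
        if not isinstance(item, dict):
            problems.append(f"item {i}: expected an object, got {type(item).__name__}")
            continue
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            problems.append(f"item {i}: missing keys {sorted(missing)}")
        if not isinstance(item.get("scoring_factors", {}), dict):
            problems.append(f"item {i}: scoring_factors must be key-value pairs")
        if not isinstance(item.get("evidence", []), list):
            problems.append(f"item {i}: evidence must be a list")
    return problems

sample = [{
    "impairment_id": "hypertension",
    "scoring_factors": {"blood_pressure": "128/92 mmHg", "age": 41},
    "evidence": ["Application: Blood pressure reading 128/92 mmHg"],
}]
print(validate_impairments_payload(sample))  # []
print(validate_impairments_payload([{"impairment_id": "hypertension"}]))
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a payload at once.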

**Why This Payload Format:**
- **Modular**: Each impairment is independent, allowing parallel processing in production
- **Complete**: Contains all information needed to score without re-querying data sources
- **Traceable**: Evidence list provides audit trail for compliance and quality review
- **Flexible**: Easy to modify this payload to test different clinical scenarios

Try changing values in `scoring_factors` (e.g., increase blood pressure to "160/100 mmHg") to see how scores change!
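
One way to build such variants without editing the original payload is to copy it first. This is a sketch under assumptions: `with_factor` is a hypothetical helper, and `base_payload` is a trimmed stand-in for `impairments_payload`.

```python
import copy

# Trimmed stand-in for the notebook's impairments_payload
base_payload = [{
    "impairment_id": "hypertension",
    "scoring_factors": {"blood_pressure": "128/92 mmHg", "age": 41},
    "evidence": [],
}]

def with_factor(payload, impairment_id, factor, value):
    """Return a deep copy of the payload with one scoring factor changed."""
    variant = copy.deepcopy(payload)
    for item in variant:
        if item["impairment_id"] == impairment_id:
            item["scoring_factors"][factor] = value
    return variant

high_bp = with_factor(base_payload, "hypertension", "blood_pressure", "160/100 mmHg")
print(high_bp[0]["scoring_factors"]["blood_pressure"])       # 160/100 mmHg
print(base_payload[0]["scoring_factors"]["blood_pressure"])  # 128/92 mmHg (original untouched)
```

Keeping the baseline intact lets you score both versions and compare the results side by side.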


In [15]:
import os, json, boto3, re
import numpy as np
from collections import defaultdict
from strands import Agent, tool

# ---- Set these before running locally ----
# Knowledge base configuration - uncomment the line below to use Bedrock Knowledge Base instead of local files
#kb_id = 'YSWIGPQHRJ'
kb_id = None  # Reset to None to ensure clean state

model_id = 'us.anthropic.claude-3-7-sonnet-20250219-v1:0'
embedding_model_id = 'amazon.titan-embed-text-v2:0'

# Local knowledge base path
local_kb_path = "../underwriting_manual"

# Initialize the payload for easy modification
impairments_payload = [
  {
    "impairment_id": "hypertension",
    "scoring_factors": {
      "blood_pressure": "128/92 mmHg",
      "age": 41,
      "medication": "Lisinopril 10mg",
      "duration": "At least since 2022-04-18",
      "compliance": "Good - regular refills",
      "target_organ_damage": "None evident",
      "comorbidities": "None evident",
      "family_history": "Father had heart attack at age 58"
    },
    "evidence": [
      "Rx: Lisinopril 10mg for hypertension, filled 2024-01-10 (90 tablets)",
      "Rx: Lisinopril 10mg for hypertension, filled 2023-10-12 (90 tablets)",
      "MIB: Code 311C 'CARDIOVASCULAR - HYPERTENSION TREATED' from 2022-04-18",
      "Application: Self-reported Lisinopril 10mg for blood pressure",
      "Application: Blood pressure reading 128/92 mmHg"
    ]
  }
]


## Local Knowledge Base Setup

**What:** Load underwriting guideline markdown files and create semantic search capability (identical to detection agent setup).

**Why:**
The scoring agent needs access to the same underwriting manual as the detection agent, but for a different purpose:

**Detection Agent Uses KB To:**
- Identify what scoring factors are required for each impairment
- Understand what evidence to look for in data feeds

**Scoring Agent Uses KB To:**
- Retrieve rating tables with specific debits and credits
- Apply rules for modifying factors (e.g., age adjustments, duration credits)
- Understand thresholds (e.g., what blood pressure range = what debit)

**Same Knowledge Base, Different Questions:**
- Detection agent asks: "What do I need to know about diabetes to score it?"
- Scoring agent asks: "Given an A1C of 7.2%, what's the debit according to the Type 2 Diabetes rating table?"

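This dual-mode setup (local vs. Bedrock) lets the notebook run for demos without a provisioned Bedrock Knowledge Base, while still supporting production AWS infrastructure. Local mode is not fully offline: it still calls Bedrock for the embeddings and for the model.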
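
The semantic search in local mode comes down to ranking documents by cosine similarity against the query embedding. The sketch below illustrates that ranking with made-up three-dimensional vectors; the vectors and the `cosine` helper are illustrative assumptions, and real embeddings have many more dimensions.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy "embeddings" invented for this example only
toy_docs = {
    "hypertension.md": [0.9, 0.1, 0.0],
    "type2_diabetes.md": [0.1, 0.9, 0.1],
    "lab_values.md": [0.3, 0.3, 0.9],
}
query = [1.0, 0.2, 0.0]  # stands in for the embedding of the search term

# Rank document names from most to least similar to the query
ranked = sorted(toy_docs, key=lambda name: cosine(query, toy_docs[name]), reverse=True)
print(ranked[0])  # hypertension.md
```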


In [16]:
# Local Knowledge Base Setup
local_kb_store = None
bedrock_runtime = boto3.client('bedrock-runtime')

def cosine_similarity(a, b):
    """Calculate cosine similarity between two vectors"""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def create_embedding(text):
    """Create embedding using Amazon Titan model"""
    response = bedrock_runtime.invoke_model(
        modelId=embedding_model_id,
        body=json.dumps({"inputText": text})
    )
    embedding = json.loads(response['body'].read())['embedding']
    return np.array(embedding)

def load_local_knowledge_base():
    """Load markdown files from local underwriting manual and create embeddings"""
    global local_kb_store
    
    if 'kb_id' in globals() and kb_id is not None:
        print("Bedrock KB configured, skipping local KB loading...")
        return
    
    print(f"Loading local knowledge base from {local_kb_path}...")
    
    kb_documents = []
    
    # Find all markdown files in the underwriting manual directory
    if not os.path.exists(local_kb_path):
        print(f"Warning: Local KB path {local_kb_path} does not exist")
        return
    
    for filename in os.listdir(local_kb_path):
        if filename.lower().endswith('.md'):
            file_path = os.path.join(local_kb_path, filename)
            try:
                with open(file_path, 'r', encoding='utf-8') as f:
                    content = f.read()
                
                print(f"✓ Loading {filename} ({len(content)} chars)")
                
                # Create embedding for the document
                embedding = create_embedding(content)
                
                kb_documents.append({
                    'filename': filename,
                    'content': content,
                    'embedding': embedding
                })
                
            except Exception as e:
                print(f"✗ Error loading {filename}: {e}")
    
    local_kb_store = kb_documents
    print(f"Local knowledge base loaded with {len(kb_documents)} documents")

# Load the local knowledge base if kb_id is not defined or None
if 'kb_id' not in globals() or kb_id is None:
    load_local_knowledge_base()
else:
    print("Using Bedrock Knowledge Base")


Loading local knowledge base from ../underwriting_manual...
✓ Loading hypertension.md (7391 chars)
✓ Loading type1_diabetes.md (9446 chars)
✓ Loading type2_diabetes.md (9187 chars)
✓ Loading lab_values.md (18969 chars)
Local knowledge base loaded with 4 documents


## Agent Tools: Knowledge Base Search + Calculator

**What:** Define two tools that give the agent capabilities it doesn't have natively.

**Why Each Tool:**

### 1. `kb_search` Tool
**Purpose:** Retrieve underwriting rating tables for specific impairments.

**Why It's Needed:** 
The LLM doesn't have the company's proprietary underwriting guidelines in its training data. The agent must:
- Look up the exact rating table for each impairment (e.g., hypertension, diabetes)
- Get current, authoritative rules (not outdated or generic information)
- Access company-specific debits/credits that may differ from industry standards

**Example:** For "hypertension," this returns the markdown containing blood pressure ranges, age-based adjustments, medication factors, and comorbidity modifiers.

### 2. `calculator` Tool
**Purpose:** Perform exact arithmetic on lists of debits and credits.

**Why It's Needed:**
While LLMs are good at reasoning, they can make arithmetic errors, especially with negative numbers (credits). This tool ensures:
- **Accuracy**: The sum is computed by Python rather than by the model, so it is free of LLM arithmetic slips
- **Auditability**: Each calculation is logged and traceable
- **Consistency**: Same inputs always produce same outputs

**Example:** Given `[25, -10, 50, -25]` (two debits, two credits), returns `40`.

**Why Not Just Let the LLM Calculate?**
Risk scoring affects real financial outcomes. A calculation error could mean:
- Incorrectly denying coverage to a qualified applicant
- Underpricing a high-risk policy and losing money
- Regulatory compliance issues

Offloading arithmetic to a tool removes summation errors. The agent must still select the correct debits and credits from the rating tables.
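
The example sum is easy to verify directly in Python. The sketch below also shows a hypothetical `exact_sum` variant based on `decimal.Decimal`, on the assumption that some rating tables might use fractional debits; with whole-number debits and credits the built-in `sum` is already exact.

```python
from decimal import Decimal

def exact_sum(values):
    """Sum debits and credits exactly by converting each value through its string form."""
    return sum((Decimal(str(v)) for v in values), Decimal("0"))

print(sum([25, -10, 50, -25]))                  # 40
print(sum([0.1, 0.2]) == 0.3)                   # False: binary floats cannot represent 0.1 exactly
print(exact_sum([0.1, 0.2]) == Decimal("0.3"))  # True
```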


In [17]:
kb_rt = boto3.client('bedrock-agent-runtime')

@tool
def kb_search(canonical_term: str):
    """Return markdown for the top KB hit from either local or Bedrock knowledge base."""
    
    if ('kb_id' not in globals() or kb_id is None) and local_kb_store:
        # Use local knowledge base
        print(f"Searching local KB for: {canonical_term}")
        
        # Create embedding for the search query
        query_embedding = create_embedding(canonical_term)
        
        # Find the most similar document
        best_match = None
        best_similarity = -1
        
        for doc in local_kb_store:
            similarity = cosine_similarity(query_embedding, doc['embedding'])
            if similarity > best_similarity:
                best_similarity = similarity
                best_match = doc
        
        if best_match:
            print(f"Best match: {best_match['filename']} (similarity: {best_similarity:.3f})")
            return best_match['content']
        else:
            return "No matching documents found in local knowledge base."
    
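    elif globals().get('kb_id') is None:
        # Neither a Bedrock KB ID nor a loaded local KB is available
        return "No knowledge base available: set kb_id or load the local knowledge base."

    else:
        # Use Bedrock Knowledge Base
        print(f"Searching Bedrock KB for: {canonical_term}")
        resp = kb_rt.retrieve(
            knowledgeBaseId=kb_id,
            retrievalQuery={'text': canonical_term},
            retrievalConfiguration={'vectorSearchConfiguration': {'numberOfResults': 1}}
        )
        results = resp.get('retrievalResults', [])
        if not results:
            return "No matching documents found in Bedrock knowledge base."
        # The retrieved passage is in the 'text' field of 'content'
        return results[0]['content']['text']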

@tool
def calculator(values: list[float]):
    """Calculates the sum of a list of numbers. Use this for adding up credits (negative numbers) and debits (positive numbers)."""
    print(f"Calculator adding: {values}")
    return sum(values)


## Agent System Prompt: The Scoring Workflow

**What:** Instructions that define how the agent processes impairments and calculates risk scores.

**Why This Multi-Step Process:**

The prompt orchestrates a rigorous workflow that mirrors how professional underwriters work:

### Step 1: Lookup (Per Impairment)
**Action:** Call `kb_search` with the impairment ID  
**Why:** Get the authoritative rating tables and rules for this specific condition  
**Example:** Searching "hypertension" returns blood pressure ranges, age adjustments, and medication factors

### Step 2: Analyze (Apply Rules to Factors)
**Action:** Match scoring factors to rating table rows/columns  
**Why:** Determine exact debits and credits based on the applicant's specific situation  
**Example:** Blood pressure of 128/92 at age 41 → find the intersection in the age-stratified table
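
In the notebook the LLM does this matching by reading the markdown table. As an illustration of the same logic in plain code, here is a sketch under stated assumptions: apart from the "141-150/91-95" label, the rows in `BP_ROWS` are hypothetical, and the rule that the more severe of the systolic and diastolic matches decides the row is an assumption.

```python
import re

def parse_bp(reading):
    """Split a reading such as '128/92 mmHg' into (systolic, diastolic) integers."""
    match = re.match(r"\s*(\d+)\s*/\s*(\d+)", reading)
    if not match:
        raise ValueError(f"Unrecognized blood pressure format: {reading!r}")
    return int(match.group(1)), int(match.group(2))

# HYPOTHETICAL rows, ordered from least to most severe:
# (label, systolic range, diastolic range). Real rows come from the underwriting manual.
BP_ROWS = [
    ("<=140/<=90", (0, 140), (0, 90)),
    ("141-150/91-95", (141, 150), (91, 95)),
    ("151-160/96-100", (151, 160), (96, 100)),
]

def find_bp_row(reading):
    """Return the most severe row whose systolic or diastolic range contains the reading."""
    systolic, diastolic = parse_bp(reading)
    matched = None
    for label, (s_lo, s_hi), (d_lo, d_hi) in BP_ROWS:
        if s_lo <= systolic <= s_hi or d_lo <= diastolic <= d_hi:
            matched = label  # later rows are more severe, so keep overwriting
    return matched

print(parse_bp("128/92 mmHg"))     # (128, 92)
print(find_bp_row("128/92 mmHg"))  # 141-150/91-95
```

Here the diastolic reading of 92 selects the "141-150/91-95" row even though the systolic reading of 128 is below 141.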

**Key Challenge:** Rating tables often have ranges (e.g., "+25 to +50"). The prompt instructs the agent to **use the lower value** for consistency (this could be configured as upper, midpoint, or based on other factors in production).
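
If range handling were moved out of the prompt and into code, it might look like the sketch below. `resolve_range` and its `policy` parameter are hypothetical. The sketch assumes cells are written in the "X to Y" form shown above and treats "lower" as the numerically smaller value.

```python
import re

def resolve_range(cell, policy="lower"):
    """Turn a rating-table cell such as '+25 to +50' into a single number."""
    numbers = [float(n) for n in re.findall(r"[+-]?\d+(?:\.\d+)?", cell)]
    if not numbers:
        raise ValueError(f"No numeric value found in {cell!r}")
    low, high = min(numbers), max(numbers)
    if policy == "lower":
        return low
    if policy == "upper":
        return high
    if policy == "midpoint":
        return (low + high) / 2
    raise ValueError(f"Unknown policy: {policy!r}")

print(resolve_range("+25 to +50"))              # 25.0
print(resolve_range("+25 to +50", "midpoint"))  # 37.5
print(resolve_range("0 to +25"))                # 0.0
```

Making the policy an explicit parameter would allow the same table to be scored conservatively or aggressively without rewriting the prompt.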

### Step 3: Calculate Subtotal (Per Impairment)
**Action:** Pass all debits/credits to the `calculator` tool  
**Why:** Get exact arithmetic for this impairment's score  
**Example:** `[0, 25]` (BP debit + family history debit) → subtotal of 25

### Step 4: Explain (Document Reasoning)
**Action:** Generate detailed `reason` string  
**Why:** 
- **Regulatory Compliance**: Insurance decisions must be explainable and auditable
- **Quality Review**: Underwriting managers can verify the agent's logic
- **Customer Communication**: Agents can explain ratings to applicants
- **Model Improvement**: Data scientists can identify where the agent needs refinement

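**Example Reason:** "Blood pressure of 128/92 mmHg at age 41 falls into the '141-150/91-95' category for the 'Age 40-60' column because the diastolic reading of 92 lies in the 91-95 range. That cell lists a debit of 0 to +25, and the lower value (0 points) is used. Additional debit of +25 points for family history of early cardiovascular disease."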

### Step 5: Aggregate (Final Score)
**Action:** Sum all impairment subtotals with `calculator`  
**Why:** Get the total risk score across all conditions  
**Example:** Hypertension (25) + Diabetes (75) + Hyperlipidemia (25) = 125 total

**Critical Output Format:**
The prompt strictly requires JSON output because:
- Downstream systems need structured data (not prose)
- Parseable format enables automation
- Standardized schema allows integration with pricing engines, workflow systems, and reporting tools
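
A downstream consumer could check the structure before trusting the numbers. The following is a minimal sketch, assuming the schema shown in the prompt; `check_scoring_output` is a hypothetical helper, and the sample values reuse the aggregation example above.

```python
import json

def check_scoring_output(raw_json):
    """Parse the agent's JSON and verify that total_score equals the sum of sub_totals."""
    result = json.loads(raw_json)
    scores = result["impairment_scores"]
    for entry in scores:
        for key in ("impairment_id", "sub_total", "reason"):
            if key not in entry:
                raise ValueError(f"impairment score is missing {key!r}")
    expected_total = sum(entry["sub_total"] for entry in scores)
    if result["total_score"] != expected_total:
        raise ValueError(
            f"total_score {result['total_score']} != sum of sub_totals {expected_total}"
        )
    return result

example = json.dumps({
    "total_score": 125,
    "impairment_scores": [
        {"impairment_id": "hypertension", "sub_total": 25, "reason": "..."},
        {"impairment_id": "type2_diabetes", "sub_total": 75, "reason": "..."},
        {"impairment_id": "hyperlipidemia", "sub_total": 25, "reason": "..."},
    ],
})
print(check_scoring_output(example)["total_score"])  # 125
```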


In [18]:
PROMPT = """You are a senior life insurance underwriter specializing in risk assessment scoring. Your job is to calculate a risk score for an application based on a list of identified impairments and their scoring factors.

You will be given a JSON array of impairments. For each impairment in the input list, you must perform the following steps in sequence:

1. **Lookup**: Call the `kb_search` tool using the impairment's `impairment_id` as the `canonical_term`. This returns the authoritative underwriting manual section.

2. **Analyze**: Carefully read the returned markdown. Use the `scoring_factors` provided for the impairment to find the correct debits and credits in the rating tables. For example, a `blood_pressure` of "128/92 mmHg" and `age` of 41 falls into the "141-150/91-95" row for the "Age 40-60" column in the hypertension manual, which indicates a debit between +25 and +50. Use the lower value if a range is given.

3. **Calculate Subtotal**: Create a list of all numerical debits (positive numbers) and credits (negative numbers) you identified. Pass this list to the `calculator` tool to get a `sub_total` for the impairment.

4. **Explain**: After calculating the subtotal, you must generate a detailed `reason` string explaining exactly how you arrived at that score, citing the specific scoring factors, table values, and modifying factors used.

Repeat this entire process for every impairment in the input list.

Once you have a `sub_total` for all impairments, create a final list containing all the individual sub-totals. Call the `calculator` tool one last time with this list to get the final `total_score`.

Finally, structure your entire response as a single JSON object. Do not include any other text or explanation outside of the final JSON block.

Your output must be in this exact format:
```json
{
  "total_score": 50,
  "impairment_scores": [
    {
      "impairment_id": "hypertension",
      "sub_total": 50,
      "reason": "Debit of +25 for BP 128/92 at age 41. Debit of +25 for newly diagnosed. No credits applied."
    }
  ]
}
```
"""


## Initialize the Scoring Agent

**What:** Create an Agent instance connecting the system prompt, tools, and model.

**Why:**
This is where the three components come together to create the scoring "brain":

1. **System Prompt** → Defines the workflow (lookup → analyze → calculate → explain → aggregate)
2. **Tools** → Gives capabilities the LLM lacks (knowledge base access, precise arithmetic)
3. **Model** → Provides reasoning ability to interpret tables and apply rules

**What Happens When the Agent Runs:**
- Receives impairment payload
- For each impairment:
  - Decides to call `kb_search` tool
  - Reads the returned rating tables
  - Reasons about which debits/credits apply
  - Calls `calculator` tool for subtotal
  - Generates explanation
- Calls `calculator` again for final total
- Formats structured JSON output

This agent is essentially a **rule-based reasoning engine** where the rules come from the knowledge base and the reasoning is performed by the LLM.


In [19]:
# Create the scoring agent with the correct tools
scoring_agent = Agent(
    system_prompt=PROMPT,
    tools=[kb_search, calculator],
    model=model_id,
)


## Run Scoring Function

**What:** A utility function that sends the payload to the agent and parses the response.

**Why:**
This wrapper provides a clean interface to:
- **Format Input**: Convert the payload (a Python list of impairment dictionaries) into a message string
- **Invoke Agent**: Call the scoring agent and capture the response
- **Parse Output**: Extract JSON from the response (handles markdown code blocks if present)
- **Error Handling**: Let `json.loads` raise on malformed output so the calling cell can catch and report it

**The Workflow:**
1. Stringify the impairments payload as JSON
2. Send to agent: "Here is the JSON payload of impairments to score..."
3. Agent performs multi-step reasoning (as defined in system prompt)
4. Extract the final JSON output from the response
5. Return structured data ready for downstream systems

**Why Parse JSON from Response?**
The agent may wrap JSON in markdown code blocks (```json ... ```) for better formatting. This function handles both:
- Raw JSON strings
- JSON wrapped in markdown code fences

This makes the scoring agent tolerant of both output styles. If neither yields valid JSON, `json.loads` raises an exception.
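
A greedy brace match can fail when the response contains extra text with braces around the JSON. As a hedged alternative, the sketch below uses the standard library's `json.JSONDecoder.raw_decode`, which parses one JSON value and ignores whatever follows it. `extract_json` is a hypothetical helper and not the function used in this notebook.

```python
import json
import re

FENCE = "`" * 3  # built programmatically so this example nests safely inside markdown

def extract_json(text):
    """Return the first JSON object found in text, whether fenced or not."""
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    candidate = fenced.group(1) if fenced else text
    decoder = json.JSONDecoder()
    # Try each '{' as a possible start of the object
    for match in re.finditer(r"\{", candidate):
        try:
            obj, _ = decoder.raw_decode(candidate, match.start())
        except json.JSONDecodeError:
            continue
        return obj
    raise ValueError("No JSON object found in agent response")

raw = 'Here is the result: {"total_score": 25, "impairment_scores": []} Done.'
wrapped = "Result:\n" + FENCE + 'json\n{"total_score": 25}\n' + FENCE
print(extract_json(raw))      # {'total_score': 25, 'impairment_scores': []}
print(extract_json(wrapped))  # {'total_score': 25}
```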


In [20]:
def run_scoring(payload):
    """Utility to run the scoring agent in-notebook."""
    
    # Create a string message with the JSON payload
    message = f"Here is the JSON payload of impairments to score:\n\n{json.dumps(payload, indent=2)}"
    
    print("Sending payload to the scoring agent...")
    
    # Call the agent with the message
    res = scoring_agent(message)
    print("Agent response:")
    print(res)
    
    # Extract JSON from between ```json ... ``` tags if present
    res_str = str(res)
    json_match = re.search(r"```json\s*(.*?)\s*```", res_str, re.DOTALL)
    if json_match:
        res_str = json_match.group(1)
    else:
        # If no markdown code block, try to find JSON object directly
        json_match = re.search(r"\{.*\}", res_str, re.DOTALL)
        if json_match:
            res_str = json_match.group(0)
    
    return json.loads(res_str)


## Execute Scoring on Sample Payload

**What:** Run the complete risk scoring workflow on the sample hypertension case.

**Why:**
This is where the agent applies underwriting rules to calculate a real risk score:

**Input:** Impairment payload with:
- Hypertension diagnosis
- Blood pressure: 128/92 mmHg
- Age: 41
- Medication: Lisinopril 10mg (controlled)
- Family history: Father had heart attack at age 58

**Process:** The agent will:
1. Search knowledge base for hypertension rating tables
2. Locate the appropriate cell in the table based on BP and age
3. Apply modifying factors (medication type, family history, compliance)
4. Calculate debits and credits
5. Generate detailed explanation of the score

**Expected Output:**
```json
{
  "total_score": 25,
  "impairment_scores": [
    {
      "impairment_id": "hypertension",
      "sub_total": 25,
      "reason": "Detailed explanation of how the score was calculated..."
    }
  ]
}
```

**What This Score Means:**
- **0 points**: Standard risk (no additional premium)
- **1-50 points**: Mild impairment (small rate increase)
- **51-150 points**: Moderate impairment (table rating, e.g., +25% premium per 25 points)
- **151+ points**: Severe impairment (may be declined or heavily rated)

In this case, a score of 25 suggests a **mild impairment** with controlled hypertension and family history considerations. The applicant would likely qualify for coverage with a modest premium increase.
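
The bands above translate into a simple lookup. This is a sketch only: `classify_score` is a hypothetical helper, the thresholds are copied from the list above, and treating a net-negative score (more credits than debits) as standard is an assumption.

```python
def classify_score(total_score):
    """Map a total risk score to the band described above."""
    if total_score <= 0:
        return "standard"  # assumption: net credits are treated as standard risk
    if total_score <= 50:
        return "mild"
    if total_score <= 150:
        return "moderate"
    return "severe"

for score in (0, 25, 125, 175):
    print(score, classify_score(score))
# 0 standard
# 25 mild
# 125 moderate
# 175 severe
```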

**Try Experimenting:**
- Change blood pressure to "160/100 mmHg" to see higher scores
- Remove family history to see the +25 family history debit drop out of the score
- Add "duration: 5+ years of excellent control" to get duration credits


In [21]:
# Run the scoring agent using the input payload
print("=== Running Impairment Scoring Agent ===\n")

try:
    results = run_scoring(impairments_payload)
    
    print("\n=== Final Scoring Results ===")
    print(json.dumps(results, indent=2))
    
except Exception as e:
    print(f"\n\nError running scoring: {e}")
    print("\nMake sure:")
    print("1. Your AWS credentials are configured")
    print("2. The kb_id and model_id are set correctly")
    print("3. You have access to the specified Bedrock model")


=== Running Impairment Scoring Agent ===

Sending payload to the scoring agent...
I'll analyze the hypertension impairment and calculate the risk score. Let me retrieve the underwriting manual information first.
Tool #1: kb_search
Searching local KB for: hypertension
Best match: hypertension.md (similarity: 0.526)
Now that I have the underwriting manual information, let me analyze the hypertension impairment using the scoring factors provided.

Looking at the factors:
- Blood pressure: 128/92 mmHg
- Age: 41
- Medication: Lisinopril 10mg
- Duration: At least since 2022-04-18 (approximately 2 years)
- Compliance: Good - regular refills
- Target organ damage: None evident
- Comorbidities: None evident
- Family history: Father had heart attack at age 58

According to the rating guidelines for Uncomplicated Hypertension:
1. The BP reading of 128/92 mmHg falls into the "141-150/91-95" range for diastolic (92 is in 91-95 range). For Age 40-60, this carries a debit of "0 to +25". I'll use the

## Summary: The Complete Agentic Underwriting Pipeline

### What Just Happened (Workstream 3 + 4 Combined)

**End-to-End Workflow:**

1. **Data Ingestion (Workstream 3)** 
   - Detection agent received: Application JSON, RX history, Labs, MIB records
   - Identified impairments across all sources
   - Extracted scoring factors and compiled evidence

2. **Risk Calculation (Workstream 4)**
   - Scoring agent received: Structured impairment payload
   - Looked up rating tables for each condition
   - Applied underwriting rules and calculated scores
   - Generated detailed audit trail

3. **Final Output**
   - Total risk score with per-impairment breakdown
   - Detailed reasoning for each score
   - Structured JSON ready for downstream systems


### Testing Different Scenarios

To see how the system handles various cases:
1. Run the detection agent on `../mock_data/diabetes_cardiovascular/` (complex multi-impairment case)
2. Copy that output to the `impairments_payload` in this notebook
3. Run the scoring agent to see total risk across multiple conditions

Because each agent specializes in one task (detect vs. score), the system is easier to test, debug, and improve over time.
