<img src="../static/imo_health.png" alt="IMO Health Logo" width="300"/>

---

## Setup and Configuration

Import libraries and load the extracted entities from Step 2.

In [None]:
import sys
import os

# Add parent directory to path
sys.path.append(os.path.dirname(os.path.abspath('')))

import json
import requests
from typing import Dict, List, Any
from datetime import datetime

# Import configuration
import config

# Authenticator for the IMO API, defined inline (mirrors the one used in Step 2)
import time

class IMOAuthenticator:
    """Handle IMO API authentication."""
    
    def __init__(self):
        self.auth_url = getattr(config, 'imo_auth_url', "https://auth.imohealth.com/oauth/token")
        self.client_id = config.imo_normalize_enrichment_api_client_id
        self.client_secret = config.imo_normalize_enrichment_api_client_secret
        self.access_token = None
        self.token_expiry = None
    
    def get_access_token(self):
        """Get or refresh OAuth access token."""
        if self.access_token and self.token_expiry and time.time() < self.token_expiry:
            return self.access_token
        
        headers = {'Content-Type': 'application/json'}
        payload = {
            'grant_type': 'client_credentials',
            'client_id': self.client_id,
            'client_secret': self.client_secret,
            'audience': 'https://api.imohealth.com'
        }
        
        try:
            response = requests.post(self.auth_url, headers=headers, json=payload, timeout=30)
            
            if response.status_code == 200:
                result = response.json()
                self.access_token = result.get('access_token')
                expires_in = result.get('expires_in', 3600)
                self.token_expiry = time.time() + expires_in - 60
                return self.access_token
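            else:
                # Report the failure instead of returning None silently
                print(f"Error getting access token: HTTP {response.status_code}")
                return None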
        except Exception as e:
            print(f"Error getting access token: {str(e)}")
            return None

authenticator = IMOAuthenticator()
print("✓ Libraries imported and authenticator initialized")
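
The token caching in `get_access_token` can be checked without touching the network. The sketch below is a minimal stand-in, not the notebook's class: `CachedToken`, `fake_fetch`, and the injectable `clock` are hypothetical names introduced only to show how the 60-second safety margin behaves.

```python
import time
from typing import Callable, Optional


class CachedToken:
    """Cache a token until shortly before it expires (hypothetical helper)."""

    def __init__(self, fetch: Callable[[], dict],
                 clock: Callable[[], float] = time.time,
                 safety_margin: int = 60):
        self._fetch = fetch            # returns {'access_token': ..., 'expires_in': ...}
        self._clock = clock            # injectable so tests can control time
        self._margin = safety_margin
        self._token: Optional[str] = None
        self._expiry = 0.0

    def get(self) -> Optional[str]:
        # Reuse the cached token while it is still inside its validity window.
        if self._token and self._clock() < self._expiry:
            return self._token
        result = self._fetch()
        self._token = result.get("access_token")
        self._expiry = self._clock() + result.get("expires_in", 3600) - self._margin
        return self._token


# Usage with a fake fetcher and a controllable clock (no network involved).
calls = []
now = [1000.0]


def fake_fetch() -> dict:
    calls.append(1)
    return {"access_token": f"token-{len(calls)}", "expires_in": 120}


cache = CachedToken(fake_fetch, clock=lambda: now[0])
first = cache.get()    # fetched; treated as valid until 1000 + 120 - 60 = 1060
second = cache.get()   # served from the cache
now[0] = 1061.0
third = cache.get()    # past the safety margin, so a new token is fetched
print(first, second, third)
```

Subtracting the margin from the expiry means a token is refreshed slightly early rather than being rejected mid-request.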

## Load Extracted Entities from Step 2

Load the entities with context that were extracted in the previous step.

In [None]:
# Load extracted entities from Step 2
entities_file = 'extracted_entities_output.json'

try:
    with open(entities_file, 'r') as f:
        entities_data = json.load(f)
    
    extracted_entities = entities_data['entities']
    metadata = entities_data['extraction_metadata']
    
    print("✓ Extracted entities loaded successfully")
    print(f"\nEntity Counts:")
    print(f"  Problems: {metadata['problems_count']}")
    print(f"  Procedures: {metadata['procedures_count']}")
    print(f"  Medications: {metadata['medications_count']}")
    print(f"  Labs: {metadata['labs_count']}")
    print(f"  Total: {metadata['total_entities']}")
    
except FileNotFoundError:
    print(f"✗ Error: {entities_file} not found")
    print("  Please run Step 2 notebook first to extract entities")
    # Stop here: later cells depend on extracted_entities being defined
    raise
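
Before normalizing, it can help to confirm that the Step 2 file has the shape this notebook expects. The following is a sketch under assumptions: the category names and the `*_count` metadata keys are inferred from the cells in this notebook, and `check_entities_payload` is a hypothetical helper, not part of any IMO library.

```python
from typing import Any, Dict, List

# Category names inferred from the normalization cell in this notebook.
EXPECTED_CATEGORIES = ("problems", "procedures", "medications", "labs")


def check_entities_payload(data: Dict[str, Any]) -> List[str]:
    """Return a list of issues found in the Step 2 output (empty if none)."""
    issues = []
    for key in ("entities", "extraction_metadata"):
        if key not in data:
            issues.append(f"missing top-level key: {key}")
    entities = data.get("entities", {})
    metadata = data.get("extraction_metadata", {})
    for category in EXPECTED_CATEGORIES:
        items = entities.get(category, [])
        declared = metadata.get(f"{category}_count")
        # Only compare when the metadata actually declares a count.
        if declared is not None and declared != len(items):
            issues.append(f"{category}: metadata says {declared}, found {len(items)}")
    return issues


# Hypothetical sample with consistent counts.
sample = {
    "entities": {
        "problems": [{"text": "chest pain"}],
        "procedures": [],
        "medications": [],
        "labs": [],
    },
    "extraction_metadata": {
        "problems_count": 1,
        "procedures_count": 0,
        "medications_count": 0,
        "labs_count": 0,
        "total_entities": 1,
    },
}
print(check_entities_payload(sample))
```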

## Normalize Entities with IMO Precision Normalize API

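Call the IMO Precision Normalize API to normalize and enrich entities. Problems are sent to the enrichment endpoint together with their source context; procedures, medications, and labs go to the standard normalize endpoint.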

In [None]:
import uuid

def normalize_single_entity(entity, category, access_token):
    """
    Normalize a single entity using IMO API.
    Uses enrichment endpoint for problems (with context), regular endpoint for others.
    
    Args:
        entity (dict): Entity to normalize
        category (str): Entity category (problems, procedures, medications, labs)
        access_token (str): OAuth access token
        
    Returns:
        dict: Normalized entity
    """
    # Map category to domain
    domain_map = {
        'problems': 'problem',
        'procedures': 'procedure',
        'medications': 'medication',
        'labs': 'lab'
    }
    domain = domain_map.get(category, 'problem')
    
    # Use enrichment endpoint for problems, regular for others
    if category == 'problems':
        url = getattr(config, 'imo_precision_normalize_enrichment_url', "https://api.imohealth.com/precision/normalize/enrichment")
        
        headers = {
            'Content-Type': 'application/json',
            'Authorization': f'Bearer {access_token}'
        }
        
        # Build enrichment payload with context
        payload = {
            "organization_id": "IMO",
            "client_request_id": str(uuid.uuid4()),
            "preferences": {
                "threshold": 0.0,
                "match_field_pref": "input_term",
                "debug": True
            },
            "requests": [{
                "record_id": entity.get('entity_id', str(uuid.uuid4())),
                "domain": domain,
                "input_term": entity.get('text', ''),
                "context": {
                    "source_text": entity.get('context', '')
                }
            }]
        }
        
        print(f"  Normalizing problem with enrichment: {entity.get('text', '')}")
        
    else:
        # Use regular normalize endpoint for other domains
        url = getattr(config, 'imo_precision_normalize_url', "https://api.imohealth.com/precision/normalize")
        
        headers = {
            'Content-Type': 'application/json',
            'Authorization': f'Bearer {access_token}'
        }
        
        payload = {
            "organization_id": "IMO",
            "client_request_id": str(uuid.uuid4()),
            "preferences": {
                "threshold": 0.0,
                "match_field_pref": "input_term",
                "debug": True
            },
            "requests": [{
                "record_id": entity.get('entity_id', str(uuid.uuid4())),
                "domain": domain,
                "input_term": entity.get('text', ''),
                "input_code": entity.get('code', ''),
                "input_code_system": entity.get('code_system', '')
            }]
        }
        
        print(f"  Normalizing {category} entity: {entity.get('text', '')}")
    
    try:
        response = requests.post(
            url,
            headers=headers,
            json=payload,
            timeout=30
        )
        
        if response.status_code == 200:
            result = response.json()
            
            # Parse IMO normalization response
            if 'requests' in result and len(result['requests']) > 0:
                request_data = result['requests'][0]
                
                # Check if response exists and has items
                if 'response' in request_data and 'items' in request_data['response']:
                    items = request_data['response']['items']
                    
                    if len(items) > 0:
                        # Get top match (first item)
                        top_match = items[0]
                        
                        entity['imo_code'] = top_match.get('code', '')
                        entity['imo_lexical_code'] = top_match.get('lexical_code', '')
                        entity['imo_description'] = top_match.get('title', '')
                        entity['imo_lexical_title'] = top_match.get('lexical_title', '')
                        entity['normalized'] = True
                        entity['normalization_confidence'] = top_match.get('score', 0.0)
                        
                        # Check if refinable (from flags)
                        metadata = top_match.get('metadata', {})
                        flags = metadata.get('flags', {})
                        entity['is_refinable'] = flags.get('is_icd10cm_refinable', False)
                        entity['needs_refinement'] = entity['is_refinable']
                        
                        # Store mappings based on category
                        mappings = metadata.get('mappings', {})
                        
                        # ICD-10-CM for problems
                        icd10cm_codes = mappings.get('icd10cm', {}).get('codes', [])
                        if icd10cm_codes:
                            entity['icd10cm_code'] = icd10cm_codes[0].get('code', '')
                            entity['icd10cm_title'] = icd10cm_codes[0].get('title', '')
                        
                        # CPT for procedures
                        cpt_codes = mappings.get('cpt', {}).get('codes', [])
                        if cpt_codes:
                            entity['cpt_code'] = cpt_codes[0].get('code', '')
                            entity['cpt_title'] = cpt_codes[0].get('title', '')
                        
                        # RxNorm for medications
                        rxnorm_codes = mappings.get('rxnorm', {}).get('codes', [])
                        if rxnorm_codes:
                            entity['rxnorm_code'] = rxnorm_codes[0].get('code', '')
                            entity['rxnorm_title'] = rxnorm_codes[0].get('title', '')
                        
                        # LOINC for labs
                        loinc_codes = mappings.get('loinc', {}).get('codes', [])
                        if loinc_codes:
                            entity['loinc_code'] = loinc_codes[0].get('code', '')
                            entity['loinc_title'] = loinc_codes[0].get('title', '')
                        
                        # Store alternate choices
                        alternate_choices = []
                        for alt_item in items[1:]:
                            alt_choice = {
                                'imo_code': alt_item.get('code', ''),
                                'lexical_code': alt_item.get('lexical_code', ''),
                                'title': alt_item.get('title', ''),
                                'lexical_title': alt_item.get('lexical_title', ''),
                                'score': alt_item.get('score', 0.0),
                                'is_refinable': alt_item.get('metadata', {}).get('flags', {}).get('is_icd10cm_refinable', False)
                            }
                            
                            # Add category-specific codes if available
                            alt_mappings = alt_item.get('metadata', {}).get('mappings', {})
                            
                            # ICD-10-CM
                            alt_icd10cm = alt_mappings.get('icd10cm', {}).get('codes', [])
                            if alt_icd10cm:
                                alt_choice['icd10cm_code'] = alt_icd10cm[0].get('code', '')
                                alt_choice['icd10cm_title'] = alt_icd10cm[0].get('title', '')
                            
                            alternate_choices.append(alt_choice)
                        
                        entity['alternate_choices'] = alternate_choices
                        
                        # Log result
                        refinable_tag = " [REFINABLE]" if entity['is_refinable'] else ""
                        print(f"    ✓ Normalized to IMO: {entity['imo_code']} - {entity['imo_description']}{refinable_tag}")
                        if alternate_choices:
                            print(f"      Alternate choices: {len(alternate_choices)}")
                    else:
                        entity['normalized'] = False
                        entity['needs_refinement'] = False
                        entity['is_refinable'] = False
                        entity['alternate_choices'] = []
                else:
                    entity['normalized'] = False
                    entity['needs_refinement'] = False
                    entity['is_refinable'] = False
                    entity['alternate_choices'] = []
            else:
                entity['normalized'] = False
                entity['needs_refinement'] = False
                entity['is_refinable'] = False
                entity['alternate_choices'] = []
                
            return entity
            
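        else:
            print(f"    ✗ API Error: {response.status_code}")
            # Set the same fields as the other failure paths so downstream code sees a consistent shape
            entity['normalized'] = False
            entity['needs_refinement'] = False
            entity['is_refinable'] = False
            entity['alternate_choices'] = []
            return entity

    except Exception as e:
        print(f"    ✗ Error: {str(e)}")
        entity['normalized'] = False
        entity['needs_refinement'] = False
        entity['is_refinable'] = False
        entity['alternate_choices'] = []
        return entity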

def normalize_entities(entities_dict):
    """
    Normalize entities using IMO Precision Normalize API.
    
    Args:
        entities_dict (dict): Extracted entities
        
    Returns:
        dict: Normalized entities with IMO codes
    """
    # Get OAuth access token
    access_token = authenticator.get_access_token()
    if not access_token:
        raise RuntimeError("Failed to obtain IMO API access token")
    
    normalized = {
        'problems': [],
        'procedures': [],
        'medications': [],
        'labs': []
    }
    
    print(f"\nNormalizing entities...")
    print("=" * 80)
    
    # Normalize each category
    for category, entity_list in entities_dict.items():
        if not entity_list:
            continue
            
        print(f"\n{category.upper()} ({len(entity_list)} entities):")
        print("-" * 80)
        
        for entity in entity_list:
            try:
                normalized_entity = normalize_single_entity(entity, category, access_token)
            except Exception as e:
                print(f"  ✗ Error normalizing entity {entity.get('text')}: {str(e)}")
                # Keep the original entity, marked as not normalized
                entity['normalized'] = False
                entity['needs_refinement'] = False
                normalized_entity = entity
            # setdefault avoids a KeyError for categories outside the four expected keys
            normalized.setdefault(category, []).append(normalized_entity)
    
    return normalized

# Normalize entities
normalized_entities = normalize_entities(extracted_entities)

print("\n" + "=" * 80)
print("NORMALIZATION COMPLETE")
print("=" * 80)
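
The nested checks in `normalize_single_entity` can be flattened into two small lookups. This is a sketch, assuming the response shape implied by the parsing code above (`requests[0].response.items[*].metadata.mappings`); `top_items` and `first_mapping` are hypothetical helpers, and the IMO code in the mock is a placeholder rather than a real identifier.

```python
from typing import Any, Dict, List, Optional


def top_items(result: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the ranked match list from a normalize response, or [] if absent."""
    request_list = result.get("requests") or []
    if not request_list:
        return []
    return request_list[0].get("response", {}).get("items", []) or []


def first_mapping(item: Dict[str, Any], system: str) -> Optional[Dict[str, str]]:
    """Return the first code for a code system such as 'icd10cm', if present."""
    mappings = item.get("metadata", {}).get("mappings", {})
    codes = mappings.get(system, {}).get("codes", [])
    return codes[0] if codes else None


# Mock response; the IMO code is a placeholder, not a real identifier.
mock = {
    "requests": [{
        "response": {
            "items": [{
                "code": "PLACEHOLDER-1",
                "title": "Chest pain",
                "score": 0.9,
                "metadata": {
                    "flags": {"is_icd10cm_refinable": True},
                    "mappings": {
                        "icd10cm": {"codes": [
                            {"code": "R07.9", "title": "Chest pain, unspecified"}
                        ]}
                    },
                },
            }]
        }
    }]
}

items = top_items(mock)
icd = first_mapping(items[0], "icd10cm")
loinc = first_mapping(items[0], "loinc")
print(icd, loinc)
```

Returning an empty list or `None` for missing pieces lets the caller handle "no match" in one place instead of repeating the same failure branch three times.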

## Analyze Normalization Results

Analyze the normalized entities and enrichment data.

In [None]:
def analyze_normalization(normalized_dict):
    """
    Analyze normalization results.
    
    Args:
        normalized_dict (dict): Normalized entities
    """
    total = sum(len(v) for v in normalized_dict.values())
    successfully_normalized = sum(1 for entities in normalized_dict.values() 
                                  for e in entities if e.get('normalized', False))
    failed = total - successfully_normalized
    
    # Count entities with refinement flags
    needs_refinement = sum(1 for entities in normalized_dict.values()
                          for e in entities if e.get('needs_refinement', False))
    
    print("Normalization Analysis:")
    print("=" * 80)
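    print(f"Total entities: {total}")
    if total == 0:
        # Avoid dividing by zero in the percentage calculations below
        print("No entities to analyze")
        return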
    print(f"Successfully normalized: {successfully_normalized} ({successfully_normalized/total*100:.1f}%)")
    print(f"Failed normalization: {failed}")
    print(f"Needs refinement: {needs_refinement} ({needs_refinement/total*100:.1f}%)")
    
    # Count by category
    print(f"\nBy Category:")
    for category, entities in normalized_dict.items():
        normalized_count = sum(1 for e in entities if e.get('normalized', False))
        refinable_count = sum(1 for e in entities if e.get('is_refinable', False))
        print(f"  {category.title()}: {len(entities)} total, {normalized_count} normalized, {refinable_count} refinable")

analyze_normalization(normalized_entities)
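
For automated checks it can be more convenient to return the rates than to print them. A minimal sketch, assuming the same `normalized` and `needs_refinement` flags used above; `normalization_rates` is a hypothetical helper and the sample entities are made up.

```python
from typing import Any, Dict, List


def normalization_rates(normalized_dict: Dict[str, List[Dict[str, Any]]]) -> Dict[str, float]:
    """Compute summary rates; percentages are 0.0 when there are no entities."""
    total = sum(len(v) for v in normalized_dict.values())
    ok = sum(1 for v in normalized_dict.values() for e in v if e.get("normalized", False))
    refinable = sum(1 for v in normalized_dict.values()
                    for e in v if e.get("needs_refinement", False))

    def pct(count: int) -> float:
        return round(count / total * 100, 1) if total else 0.0

    return {"total": total, "normalized_pct": pct(ok), "refinement_pct": pct(refinable)}


# Hypothetical sample: three entities, two normalized, one flagged for refinement.
sample_normalized = {
    "problems": [
        {"text": "chest pain", "normalized": True, "needs_refinement": True},
        {"text": "unclear term", "normalized": False},
    ],
    "procedures": [],
    "medications": [],
    "labs": [{"text": "troponin", "normalized": True}],
}
rates = normalization_rates(sample_normalized)
print(rates)
```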

## Display Normalized Entities with Enrichment

Show normalized entities with their standard codes and enrichment data.

In [None]:
def display_normalized_entities(normalized_dict, max_display=5):
    """
    Display normalized entities with enrichment.
    
    Args:
        normalized_dict (dict): Normalized entities
        max_display (int): Maximum entities to display per category
    """
    for category, entity_list in normalized_dict.items():
        if not entity_list:
            continue
        
        print("\n" + "=" * 80)
        print(f"{category.upper()} ({len(entity_list)} entities)")
        print("=" * 80)
        
        for i, entity in enumerate(entity_list[:max_display], 1):
            print(f"\n{i}. {entity.get('text', 'N/A')}")
            
            if entity.get('normalized', False):
                print(f"   ✓ Normalized successfully")
                print(f"   IMO Code: {entity.get('imo_lexical_code', 'N/A')}")
                print(f"   IMO Description: {entity.get('imo_lexical_title', 'N/A')}")
                print(f"   Confidence: {entity.get('normalization_confidence', 0):.2f}")
                
                # Display category-specific codes
                if category == 'problems' and entity.get('icd10cm_code'):
                    print(f"\n   ICD-10-CM:")
                    print(f"     Code: {entity.get('icd10cm_code', 'N/A')}")
                    print(f"     Title: {entity.get('icd10cm_title', 'N/A')}")
                
                elif category == 'procedures' and entity.get('cpt_code'):
                    print(f"\n   CPT:")
                    print(f"     Code: {entity.get('cpt_code', 'N/A')}")
                    print(f"     Title: {entity.get('cpt_title', 'N/A')}")
                
                elif category == 'medications' and entity.get('rxnorm_code'):
                    print(f"\n   RxNorm:")
                    print(f"     Code: {entity.get('rxnorm_code', 'N/A')}")
                    print(f"     Title: {entity.get('rxnorm_title', 'N/A')}")
                
                elif category == 'labs' and entity.get('loinc_code'):
                    print(f"\n   LOINC:")
                    print(f"     Code: {entity.get('loinc_code', 'N/A')}")
                    print(f"     Title: {entity.get('loinc_title', 'N/A')}")
                
                # Display refinement flag
                if entity.get('is_refinable', False):
                    print(f"\n   ⚠️  IS REFINABLE - Needs diagnostic specificity workflow")
                
                # Display alternate choices if available
                alternate_choices = entity.get('alternate_choices', [])
                if alternate_choices:
                    print(f"\n   Alternate Choices ({len(alternate_choices)}):")
                    for j, alt in enumerate(alternate_choices[:3], 1):
                        refinable_marker = " [REFINABLE]" if alt.get('is_refinable', False) else ""
                        print(f"     {j}. {alt.get('lexical_title', 'N/A')} (Score: {alt.get('score', 0):.2f}){refinable_marker}")
                    if len(alternate_choices) > 3:
                        print(f"     ... and {len(alternate_choices) - 3} more")
            else:
                print(f"   ✗ Normalization failed")
            
            print("-" * 80)
        
        if len(entity_list) > max_display:
            print(f"\n... and {len(entity_list) - max_display} more entities")

# Display normalized entities
display_normalized_entities(normalized_entities, max_display=5)

## Identify Entities Requiring Refinement

Extract entities flagged for diagnostic specificity refinement.

In [None]:
def extract_refinement_candidates(normalized_dict):
    """
    Extract entities that need refinement.
    
    Args:
        normalized_dict (dict): Normalized entities
        
    Returns:
        list: Entities needing refinement
    """
    refinement_candidates = []
    
    for category, entity_list in normalized_dict.items():
        for entity in entity_list:
            if entity.get('is_refinable', False) or entity.get('needs_refinement', False):
                refinement_candidates.append({
                    'category': category,
                    'entity': entity
                })
    
    return refinement_candidates

# Extract refinement candidates
refinement_candidates = extract_refinement_candidates(normalized_entities)

print(f"Entities Requiring Refinement: {len(refinement_candidates)}")
print("=" * 80)

for i, item in enumerate(refinement_candidates, 1):
    entity = item['entity']
    category = item['category']
    print(f"\n{i}. {entity.get('text', 'N/A')}")
    print(f"   Category: {category}")
    print(f"   IMO Code: {entity.get('imo_lexical_code', 'N/A')}")
    print(f"   IMO Title: {entity.get('imo_lexical_title', 'N/A')}")
    if entity.get('icd10cm_code'):
        print(f"   ICD-10-CM: {entity.get('icd10cm_code', 'N/A')} - {entity.get('icd10cm_title', 'N/A')}")
    print("-" * 80)

if not refinement_candidates:
    print("\n✓ All entities are sufficiently specific - no refinement needed")
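
The selection rule above can be tried on a small, made-up dataset. The sketch below applies the same `is_refinable` / `needs_refinement` test and counts candidates per category; the sample entities are hypothetical.

```python
from collections import Counter

# Hypothetical normalized output; only the flags matter for this sketch.
sample = {
    "problems": [
        {"text": "chest pain", "normalized": True,
         "is_refinable": True, "needs_refinement": True},
        {"text": "hypertension", "normalized": True,
         "is_refinable": False, "needs_refinement": False},
    ],
    "medications": [
        {"text": "aspirin", "normalized": True,
         "is_refinable": False, "needs_refinement": False},
    ],
    "labs": [
        {"text": "troponin", "normalized": False, "needs_refinement": False},
    ],
}

# Same rule as extract_refinement_candidates: either flag selects the entity.
candidates = [
    {"category": category, "entity": entity}
    for category, entities in sample.items()
    for entity in entities
    if entity.get("is_refinable", False) or entity.get("needs_refinement", False)
]

# Count candidates per category to see where refinement work concentrates.
per_category = Counter(item["category"] for item in candidates)
print(len(candidates), dict(per_category))
```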

## Save Normalized Entities

Save the normalized and enriched entities for use in Step 4 (Diagnostic Specificity).

In [None]:
# Create output structure
normalization_output = {
    'normalized_entities': normalized_entities,
    'refinement_candidates': refinement_candidates,
    'normalization_metadata': {
        'total_entities': sum(len(v) for v in normalized_entities.values()),
        'successfully_normalized': sum(1 for entities in normalized_entities.values() 
                                       for e in entities if e.get('normalized', False)),
        'needs_refinement': len(refinement_candidates),
        'normalized_at': datetime.now().isoformat(),
        'enrichment_used': True,
        'code_systems': ['IMO', 'ICD10CM', 'CPT', 'RXNORM', 'LOINC']
    }
}

# Save to file
output_file = 'normalized_entities_output.json'
with open(output_file, 'w') as f:
    json.dump(normalization_output, f, indent=2)

print(f"✓ Normalized entities saved to: {output_file}")
print(f"\nOutput includes:")
print(f"  - {normalization_output['normalization_metadata']['total_entities']} normalized entities")
print(f"  - {normalization_output['normalization_metadata']['successfully_normalized']} successfully normalized")
print(f"  - {normalization_output['normalization_metadata']['needs_refinement']} entities flagged for refinement")
print(f"  - Standard codes: IMO, ICD-10-CM, CPT, RxNorm, LOINC")
print(f"  - Enrichment data: alternate choices, refinement flags")
print(f"  - Context used for problem normalization")
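
A quick round trip confirms that the saved structure can be read back as Step 4 would need. This is a sketch with a made-up, trimmed-down output written to a temporary directory, so it does not touch the real `normalized_entities_output.json`.

```python
import json
import os
import tempfile

# Hypothetical, trimmed-down version of the output structure built above.
entity = {"text": "chest pain", "normalized": True, "is_refinable": True}
output = {
    "normalized_entities": {"problems": [entity]},
    "refinement_candidates": [{"category": "problems", "entity": entity}],
    "normalization_metadata": {"total_entities": 1, "needs_refinement": 1},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "normalized_entities_output.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(output, f, indent=2)
    with open(path, "r", encoding="utf-8") as f:
        reloaded = json.load(f)

# The candidate count in the metadata should match the candidate list.
declared = reloaded["normalization_metadata"]["needs_refinement"]
print(declared == len(reloaded["refinement_candidates"]))
```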

## Summary

### What We Accomplished

1. ✓ Loaded extracted entities from Step 2
2. ✓ Built a normalization request for each entity
3. ✓ Called the IMO Precision Normalize API, passing source context for problems
4. ✓ Normalized entities to standard terminologies
5. ✓ Enriched entities with alternate choices and refinement flags
6. ✓ Identified entities needing refinement
7. ✓ Saved normalized entities for diagnostic specificity workflow

### Key Normalization Features

- **Multi-code system mapping**: ICD-10-CM, CPT, RxNorm, LOINC
- **Context-aware normalization**: Uses 200-char context for disambiguation of problems
- **Enrichment data**: Alternate choices with scores, refinement flags
- **Refinement flags**: Identifies entities needing additional specificity
- **Preferred terms**: Standardized medical terminology
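
The context-aware routing listed above can be sketched as a single request builder. This is an illustration under assumptions: the URLs are the fallback defaults from this notebook, the payload fields mirror `normalize_single_entity`, and `build_request` is a hypothetical helper rather than part of an IMO SDK.

```python
import uuid
from typing import Any, Dict, Tuple

# Fallback endpoints used in this notebook when config does not override them.
ENRICHMENT_URL = "https://api.imohealth.com/precision/normalize/enrichment"
NORMALIZE_URL = "https://api.imohealth.com/precision/normalize"
DOMAINS = {"problems": "problem", "procedures": "procedure",
           "medications": "medication", "labs": "lab"}


def build_request(entity: Dict[str, Any], category: str) -> Tuple[str, Dict[str, Any]]:
    """Return (url, payload) for one entity, adding context only for problems."""
    item: Dict[str, Any] = {
        "record_id": entity.get("entity_id", str(uuid.uuid4())),
        "domain": DOMAINS.get(category, "problem"),
        "input_term": entity.get("text", ""),
    }
    if category == "problems":
        url = ENRICHMENT_URL
        item["context"] = {"source_text": entity.get("context", "")}
    else:
        url = NORMALIZE_URL
        item["input_code"] = entity.get("code", "")
        item["input_code_system"] = entity.get("code_system", "")
    payload = {
        "organization_id": "IMO",
        "client_request_id": str(uuid.uuid4()),
        "preferences": {"threshold": 0.0, "match_field_pref": "input_term", "debug": True},
        "requests": [item],
    }
    return url, payload


problem_url, problem_payload = build_request(
    {"entity_id": "e1", "text": "chest pain", "context": "c/o chest pain x2 days"},
    "problems",
)
lab_url, lab_payload = build_request({"text": "troponin"}, "labs")
print(problem_url, lab_url)
```

Building the payload separately from the HTTP call keeps the two request shapes easy to inspect and test without network access.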

### Example: Normalization in Action

**Input**: "chest pain"

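**Normalized Output** (illustrative; the code in this notebook stores only the IMO, ICD-10-CM, CPT, RxNorm, and LOINC fields):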
- Preferred Term: "Chest pain"
- ICD-10-CM: R07.9 - Chest pain, unspecified
- SNOMED CT: 29857009 - Chest pain
- Refinement Flag: TRUE
- Refinement Reason: "Needs laterality and specificity"

### Next Steps

The normalized entities will be used in **Step 4: Diagnostic Specificity Workflow**, where we'll:
- Process entities flagged for refinement
- Use IMO Diagnostic Workflow API
- Add specificity (laterality, severity, timing)
- Generate more precise ICD-10 codes
- Improve billing accuracy and clinical documentation

### Key Takeaways

- **Normalization** bridges free-text and structured data
- **Context** significantly improves normalization accuracy
- **Enrichment** adds valuable clinical intelligence
- **Refinement flags** ensure diagnostic specificity
- **Standard codes** enable interoperability across healthcare systems