<a href="https://colab.research.google.com/github/buwituze/pre-consultation-agent/blob/main/language_understanding_model_b.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🧠 Model B: Language Understanding (Text → Structured Meaning)

**Purpose:** Extract structured, clinically relevant information from patient speech transcripts to support triage and downstream reasoning.

| | |
|---|---|
| **Input** | Cleaned text transcript (output from Model A) |
| **Output** | Fixed-schema JSON + `additional_observations` field |
| **Model** | Google Gemini AI (via `google-generativeai`) |
| **Mode** | Constrained information extraction: no diagnosis, no advice |

### Pipeline Position
```
Model A Output                     Model B Output
(Transcript Text)
       │
       ▼
┌─────────────────────┐       ┌───────────────────────────────┐
│  Transcript +       │──────►│  chief_complaint              │
│  Fixed Schema       │       │  duration                     │
│  (prompt)           │       │  severity                     │
│                     │       │  body_part                    │
│  Gemini Model       │       │  associated_symptoms []       │
│                     │       │  red_flags_present true/false │
└─────────────────────┘       │  additional_observations      │
                              └───────────────────────────────┘
                                         │
                                         ▼
                                    → Model C (Dialogue)
                                    → Doctor Review
```

> ⚠️ **Non-Negotiable Constraints:** This model produces **observational output only**. No diagnoses, no medical advice, no treatment or medication recommendations.

---
## 📦 Section 1: Install Dependencies

In [None]:
!pip install -q -U google-generativeai
!pip install -q jsonschema

print("✅ Dependencies installed.")

---
## 🔑 Section 2: API Key Configuration

Store your Gemini API key securely using **Colab Secrets** (the 🔑 icon in the left sidebar) under the name `GEMINI_API_KEY`. This avoids hardcoding credentials.

In [None]:
import google.generativeai as genai
from google.colab import userdata

# ── Load API key from Colab Secrets ──────────────────────────────────────────
# Add your key via: left sidebar → 🔑 Secrets → Add "GEMINI_API_KEY"
try:
    GEMINI_API_KEY = userdata.get("GEMINI_API_KEY")
    genai.configure(api_key=GEMINI_API_KEY)
    print("✅ Gemini API key loaded from Colab Secrets.")
except Exception:
    # Fallback: paste key directly (not recommended for shared notebooks)
    GEMINI_API_KEY = "YOUR_API_KEY_HERE"  # ← replace only if Secrets unavailable
    genai.configure(api_key=GEMINI_API_KEY)
    print("⚠️  API key set manually. Use Colab Secrets for security.")
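
Outside Colab, `google.colab.userdata` is not available, so the key has to come from somewhere else. The following is a minimal sketch of a lookup that tries a secrets mapping first and then the process environment. The helper name `resolve_api_key` and the use of an environment variable called `GEMINI_API_KEY` are assumptions made for illustration; they are not part of the notebook.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    secrets: Optional[Mapping[str, str]] = None,
    env: Optional[Mapping[str, str]] = None,
    name: str = "GEMINI_API_KEY",
) -> str:
    """Return the first non-empty key found in `secrets`, then in `env`.

    Hypothetical helper: `secrets` stands in for a Colab Secrets lookup,
    `env` defaults to the process environment.
    """
    env = os.environ if env is None else env
    for source in (secrets or {}, env):
        value = (source.get(name) or "").strip()
        if value:
            return value
    raise RuntimeError(f"{name} not found; add it to Colab Secrets or export it.")
```

Failing loudly when no key is found may be preferable to configuring the client with a placeholder string, since a placeholder only surfaces as an authentication error on the first API call.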

---
## 🤖 Section 3: Model Initialisation

In [None]:
import json
import re
from dataclasses import dataclass, field, asdict
from typing import List, Optional

# ── Model configuration ───────────────────────────────────────────────────────
# gemini-1.5-flash: fast, cost-efficient, supports multilingual text well
# gemini-1.5-pro  : higher reasoning quality; swap in for production
GEMINI_MODEL_NAME = "gemini-1.5-flash"

generation_config = genai.types.GenerationConfig(
    temperature       = 0.0,    # Greedy decoding: keeps schema output as consistent as possible
    top_p             = 1.0,
    max_output_tokens = 1024,   # Schema output is compact; 1024 is sufficient
)

safety_settings = [
    # Relax Gemini's default blocks on medical content so clinical terms pass through
    {"category": "HARM_CATEGORY_DANGEROUS_CONTENT",  "threshold": "BLOCK_ONLY_HIGH"},
    {"category": "HARM_CATEGORY_HARASSMENT",         "threshold": "BLOCK_ONLY_HIGH"},
    {"category": "HARM_CATEGORY_HATE_SPEECH",        "threshold": "BLOCK_ONLY_HIGH"},
    {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",  "threshold": "BLOCK_ONLY_HIGH"},
]

gemini_model = genai.GenerativeModel(
    model_name        = GEMINI_MODEL_NAME,
    generation_config = generation_config,
    safety_settings   = safety_settings,
)

print(f"✅ Gemini model initialised: {GEMINI_MODEL_NAME}")
print(f"   Temperature : {generation_config.temperature} (deterministic)")

---
## 📐 Section 4: Output Schema Definition

The schema is the contract between the model and downstream systems. It is fixed and passed directly into the prompt: Gemini is instructed to populate it, not design it.

In [None]:
# ── Dataclass for strongly-typed output ───────────────────────────────────────
@dataclass
class ClinicalExtraction:
    """
    Structured output from Model B.
    All fields are observational: no diagnosis, no advice.
    """
    chief_complaint         : str         = ""       # Main reason for visit
    duration                : str         = ""       # How long the symptom has been present
    severity                : str         = ""       # Patient's own description (mild/moderate/severe)
    body_part               : str         = ""       # Anatomical area mentioned
    associated_symptoms     : List[str]   = field(default_factory=list)  # Secondary symptoms
    red_flags_present       : Optional[bool] = None  # True if any red flag language detected
    additional_observations : str         = ""       # Unstructured but clinically relevant context


# ── Empty schema template (injected into prompt) ─────────────────────────────
EMPTY_SCHEMA = {
    "chief_complaint"         : "",
    "duration"                : "",
    "severity"                : "",
    "body_part"               : "",
    "associated_symptoms"     : [],
    "red_flags_present"       : None,
    "additional_observations" : ""
}

# ── Red flag vocabulary ───────────────────────────────────────────────────────
# Used for schema validation: Gemini's red_flags_present is cross-checked against it
RED_FLAG_TERMS = [
    # English
    "can't breathe", "cannot breathe", "chest pain", "chest tightness",
    "unconscious", "fainted", "fainting", "collapse", "collapsed",
    "severe bleeding", "heavy bleeding", "coughing blood", "blood in urine",
    "stroke", "paralysis", "can't move", "cannot move",
    "seizure", "convulsion", "fits",
    "sudden vision loss", "sudden blindness",
    "difficulty swallowing", "can't swallow",
    # Kinyarwanda transliterations (common terms)
    "guhumeka",        # breathing
    "amaraso",         # blood
    "guhinduka",       # collapse / change suddenly
    "ingufu",          # convulsion / force
    "kunanirwa",       # unable to
    "imitsi",          # paralysis / nerves
]

print("✅ Output schema and red flag vocabulary defined.")
print(f"   Schema fields : {list(EMPTY_SCHEMA.keys())}")
print(f"   Red flag terms: {len(RED_FLAG_TERMS)} terms loaded")
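
The cross-check in `validate_and_coerce` matches this vocabulary by plain substring search, so a short term such as "fits" also matches inside an unrelated word like "benefits". The following is a minimal sketch of a stricter whole-word match. The function name `find_red_flag_terms` and the shortened term list are illustrative assumptions, not the notebook's own code.

```python
import re
from typing import Iterable, List


def find_red_flag_terms(text: str, terms: Iterable[str]) -> List[str]:
    """Return the terms that occur in `text` as whole words or phrases."""
    lowered = text.lower()
    hits = []
    for term in terms:
        # (?<!\w) and (?!\w) stop a term from matching inside a longer word
        pattern = r"(?<!\w)" + re.escape(term.lower()) + r"(?!\w)"
        if re.search(pattern, lowered):
            hits.append(term)
    return hits


# Shortened, illustrative vocabulary
sample_terms = ["fits", "chest pain", "can't breathe"]
print(find_red_flag_terms("The new job benefits me", sample_terms))
print(find_red_flag_terms("I had fits and chest pain", sample_terms))
```

Whole-word matching reduces false positives but does not help with generic single words (for example a term that simply means "blood" or "breathing"), which would still need phrase-level entries or review by a native speaker.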

---
## 📝 Section 5: Prompt Engineering

The system prompt and user prompt are separated. The system prompt establishes role, constraints, and output rules once. The user prompt injects the transcript at runtime.

Design principles applied:
- **Strict JSON-only output:** no prose, no markdown fences
- **Leave blank, never invent:** missing fields stay empty
- **Hard constraint block:** no diagnosis, no advice, no recommendations
- **Multilingual awareness:** Kinyarwanda, English, and mixed text are all valid

In [None]:
SYSTEM_PROMPT = """\
You are a clinical information extraction assistant operating in a hospital triage system.
Your ONLY job is to extract factual, observable information from patient speech transcripts
and populate a fixed JSON schema.

=== HARD CONSTRAINTS (NEVER VIOLATE) ===
- Do NOT generate any diagnosis, suspected condition, or differential.
- Do NOT provide medical advice, treatment suggestions, or medication recommendations.
- Do NOT infer or guess information not present in the transcript.
- Do NOT add new fields to the schema.
- Do NOT wrap your output in markdown code blocks or backticks.
- Your output must be a single, valid JSON object and nothing else.

=== OUTPUT RULES ===
1. Populate fields ONLY with information explicitly stated or directly implied in the transcript.
2. Leave a field as an empty string "" or empty list [] if information is absent or unclear.
3. "red_flags_present": set to true if the patient mentions any of these: difficulty breathing,
   chest pain, loss of consciousness, heavy bleeding, seizure, sudden paralysis, or inability
   to perform a basic function. Set to false if none apply. Set to null if genuinely unclear.
4. "additional_observations": capture any clinically relevant context that doesn't fit other
   fields (e.g. emotional state, environmental context, patient's own worry, language used).
   Keep it concise and factual. Do not include observations about the recording quality.
5. "severity": use the patient's own words or phrasing. Do not reclassify.
6. The transcript may be in English, Kinyarwanda, or a mix of both. Extract from all languages.

=== SCHEMA TO POPULATE ===
{schema}
"""


def build_user_prompt(transcript: str) -> str:
    """Construct the user-turn prompt for a given transcript."""
    return f"""\
=== PATIENT TRANSCRIPT ===
{transcript.strip()}

=== TASK ===
Extract information from the transcript above and return ONLY the populated JSON schema.
Do not include any text before or after the JSON object.
"""


print("✅ Prompt templates defined.")
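
The `{schema}` placeholder in `SYSTEM_PROMPT` is filled at call time with the JSON-serialised empty schema. The following sketch shows that substitution in isolation; the short `template` string and the three-field `schema` are stand-ins chosen for brevity, not the notebook's actual prompt.

```python
import json

# Shortened stand-in for SYSTEM_PROMPT; only the {schema} placeholder matters here.
template = "Populate this schema and return JSON only.\n{schema}\n"

# Three-field stand-in for EMPTY_SCHEMA
schema = {"chief_complaint": "", "associated_symptoms": [], "red_flags_present": None}

# json.dumps turns Python None into JSON null, which is what the model should see
rendered = template.format(schema=json.dumps(schema, indent=2))
print(rendered)
```

Because the template goes through `str.format`, any literal `{` or `}` added to the prompt text later (for example an inline JSON sample) would need to be doubled as `{{` and `}}`, otherwise the call raises an error.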

---
## ⚙️ Section 6: Output Parsing & Validation

In [None]:
def parse_gemini_response(raw_text: str) -> dict:
    """
    Parse the raw Gemini response into a Python dict.
    Handles markdown fences and leading/trailing whitespace or prose.

    Raises ValueError if no JSON object is found. A malformed or truncated
    object raises json.JSONDecodeError (a ValueError subclass).
    """
    # Strip markdown fences if Gemini adds them despite instruction
    cleaned = re.sub(r"```(?:json)?\s*", "", raw_text).strip()
    cleaned = re.sub(r"```\s*$", "", cleaned).strip()

    # Greedy match: spans from the first "{" to the last "}" in the response
    match = re.search(r"\{.*\}", cleaned, re.DOTALL)
    if not match:
        raise ValueError(f"No JSON object found in model response.\nRaw: {raw_text[:300]}")

    return json.loads(match.group(0))


def validate_and_coerce(raw_dict: dict) -> ClinicalExtraction:
    """
    Validate model output against the schema.
    Coerces types, fills missing keys, and cross-validates red_flags_present
    against the known red flag vocabulary.

    Returns a ClinicalExtraction dataclass instance.
    """
    # ── Ensure all keys are present ───────────────────────────────────────────
    validated = {k: raw_dict.get(k, v) for k, v in EMPTY_SCHEMA.items()}

    # ── Type coercion ─────────────────────────────────────────────────────────
    # String fields
    for str_field in ["chief_complaint", "duration", "severity",
                      "body_part", "additional_observations"]:
        if not isinstance(validated[str_field], str):
            validated[str_field] = str(validated[str_field]) if validated[str_field] else ""

    # List field: wrap scalars, then copy so EMPTY_SCHEMA's default list is never
    # shared, and stringify items so the join below cannot raise TypeError
    symptoms = validated["associated_symptoms"]
    if not isinstance(symptoms, list):
        symptoms = [symptoms] if symptoms else []
    validated["associated_symptoms"] = [str(s) for s in symptoms if s]

    # Boolean / null field. A tuple membership test would let 1 and 0 through
    # (1 == True in Python), so check the type explicitly.
    rfp = validated["red_flags_present"]
    if isinstance(rfp, str) and rfp.strip().lower() in ("true", "false"):
        validated["red_flags_present"] = rfp.strip().lower() == "true"
    elif not isinstance(rfp, bool):
        validated["red_flags_present"] = None

    # ── Cross-validate red_flags_present with vocabulary ──────────────────────
    all_text = " ".join([
        validated["chief_complaint"],
        validated["additional_observations"],
        " ".join(validated["associated_symptoms"]),
    ]).lower()

    keyword_hit = any(term.lower() in all_text for term in RED_FLAG_TERMS)

    # If vocabulary detects red flag but model missed it, override to True
    if keyword_hit and validated["red_flags_present"] is not True:
        validated["red_flags_present"] = True
        print("  ⚠️  Red flag keyword detected; overriding model output to True.")

    return ClinicalExtraction(**validated)


print("✅ Parsing and validation functions defined.")
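
The regex in `parse_gemini_response` is greedy, so it captures everything from the first `{` to the last `}`. If the model appends a remark containing braces after the JSON, `json.loads` fails on the combined span. One possible alternative, sketched below, uses the standard library's `json.JSONDecoder.raw_decode`, which parses a single JSON value starting at a given index and ignores whatever follows. The helper name `first_json_object` is hypothetical and this is a different technique from the one the notebook uses.

```python
import json


def first_json_object(text: str) -> dict:
    """Return the first decodable JSON object embedded in `text`."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode returns (value, end_index) and tolerates trailing text
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        # Not valid JSON from this brace; try the next one
        start = text.find("{", start + 1)
    raise ValueError("No JSON object found in text.")
```

This variant needs no fence stripping, because anything before the first decodable `{` and anything after the closing `}` is skipped.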

---
## 🚀 Section 7: Core Extraction Pipeline

In [None]:
def extract_clinical_information(
    transcript        : str,
    source_language   : str = "unknown",
    source_confidence : float = 1.0,
    verbose           : bool = True,
) -> dict:
    """
    Full Model B pipeline: transcript → structured clinical extraction.

    Args:
        transcript        : Raw text from Model A (any language).
        source_language   : Language reported by Model A ('kinyarwanda'/'english'/'unknown').
        source_confidence : Confidence score from Model A (0.0–1.0).
        verbose           : Print progress and results.

    Returns:
        dict with keys:
            'extraction'          : ClinicalExtraction dataclass
            'extraction_dict'     : dict version of extraction
            'source_language'     : str
            'source_confidence'   : float
            'raw_model_response'  : str (for debugging)
            'extraction_success'  : bool
            'error'               : str or None
    """
    if verbose:
        print("\n" + "═" * 62)
        print(" 🧠  LANGUAGE UNDERSTANDING PIPELINE: MODEL B")
        print("═" * 62)
        print(f"  Source language   : {source_language}")
        print(f"  Source confidence : {source_confidence:.3f}")
        stripped = transcript.strip()
        print(f"  Transcript preview: {stripped[:120]}{'...' if len(stripped) > 120 else ''}")

    # ── Guard: refuse to process near-empty transcripts ───────────────────────
    if len(transcript.strip()) < 10:
        if verbose:
            print("  ❌ Transcript too short to extract information.")
        empty = ClinicalExtraction()
        return {
            "extraction"         : empty,
            "extraction_dict"    : asdict(empty),
            "source_language"    : source_language,
            "source_confidence"  : source_confidence,
            "raw_model_response" : "",
            "extraction_success" : False,
            "error"              : "Transcript too short.",
        }

    # ── Build prompt ──────────────────────────────────────────────────────────
    system_with_schema = SYSTEM_PROMPT.format(
        schema=json.dumps(EMPTY_SCHEMA, indent=2)
    )
    user_prompt = build_user_prompt(transcript)

    # ── Call Gemini ───────────────────────────────────────────────────────────
    try:
        if verbose:
            print("\n  🔄 Calling Gemini API...")

        chat = gemini_model.start_chat(history=[
            {"role": "user",  "parts": [system_with_schema]},
            {"role": "model", "parts": ["Understood. I will extract information from transcripts and return only valid JSON matching the provided schema, without diagnosis or medical advice."]},
        ])
        response     = chat.send_message(user_prompt)
        raw_response = response.text

        if verbose:
            print("  ✅ Gemini response received.")

    except Exception as api_err:
        err_msg = str(api_err)
        if verbose:
            print(f"  ❌ Gemini API error: {err_msg}")
        empty = ClinicalExtraction()
        return {
            "extraction"         : empty,
            "extraction_dict"    : asdict(empty),
            "source_language"    : source_language,
            "source_confidence"  : source_confidence,
            "raw_model_response" : "",
            "extraction_success" : False,
            "error"              : err_msg,
        }

    # ── Parse and validate ────────────────────────────────────────────────────
    try:
        raw_dict   = parse_gemini_response(raw_response)
        extraction = validate_and_coerce(raw_dict)
        success    = True
        error      = None
    except Exception as parse_err:
        if verbose:
            print(f"  ⚠️  Parse error: {parse_err}")
        extraction = ClinicalExtraction()
        success    = False
        error      = str(parse_err)

    # ── Display results ───────────────────────────────────────────────────────
    if verbose and success:
        print("\n" + "─" * 62)
        print(" 📋  EXTRACTION RESULT")
        print("─" * 62)
        d = asdict(extraction)
        for key, value in d.items():
            display_val = value if value not in ("", [], None) else "⬜ (not found)"
            print(f"  {key:<28}: {display_val}")
        if extraction.red_flags_present:
            print("\n  🚨 RED FLAG DETECTED: escalate for immediate review.")

    return {
        "extraction"         : extraction,
        "extraction_dict"    : asdict(extraction),
        "source_language"    : source_language,
        "source_confidence"  : source_confidence,
        "raw_model_response" : raw_response,
        "extraction_success" : success,
        "error"              : error,
    }


print("✅ Core extraction pipeline defined.")

---
## 🔗 Section 8: Model A → Model B Integration

This is the standard runtime flow when Models A and B are chained together.

In [None]:
def run_pipeline_from_model_a(model_a_output: dict, verbose: bool = True) -> dict:
    """
    Accept the full output dict from Model A (transcribe_audio_file)
    and feed it directly into Model B.

    Args:
        model_a_output : Dict returned by `transcribe_audio_file()` from Model A.
        verbose        : Print pipeline progress.

    Returns:
        Model B output dict (same structure as extract_clinical_information).
    """
    transcript  = model_a_output.get("full_text", "")
    language    = model_a_output.get("dominant_language", "unknown")
    confidence  = model_a_output.get("mean_confidence", 1.0)

    if verbose:
        print("🔗 Receiving Model A output...")
        print(f"   Language   : {language}")
        print(f"   Confidence : {confidence:.3f}")
        print(f"   Transcript : {transcript[:100]}..." if len(transcript) > 100 else f"   Transcript : {transcript}")

    return extract_clinical_information(
        transcript        = transcript,
        source_language   = language,
        source_confidence = confidence,
        verbose           = verbose,
    )


# ─────────────────────────────────────────────────────────────────────────────
# USAGE (when Model A notebook is also open / imported):
#
#   # Model A
#   model_a_result = transcribe_audio_file("patient_audio.wav")
#
#   # Model B: chain directly
#   model_b_result = run_pipeline_from_model_a(model_a_result)
# ─────────────────────────────────────────────────────────────────────────────

print("✅ Model A → Model B integration function defined.")
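
`run_pipeline_from_model_a` reads exactly three keys from the Model A result: `full_text`, `dominant_language`, and `mean_confidence`. The sketch below isolates that contract so it can be checked without audio or an API call. The helper name `unpack_model_a_output` and the stub dict are assumptions for illustration; a real Model A result may carry additional keys.

```python
from typing import Any, Dict, Tuple


def unpack_model_a_output(model_a_output: Dict[str, Any]) -> Tuple[str, str, float]:
    """Read the three keys Model B relies on, using the notebook's defaults."""
    transcript = model_a_output.get("full_text", "")
    language = model_a_output.get("dominant_language", "unknown")
    confidence = model_a_output.get("mean_confidence", 1.0)
    # Guard against an explicit None, which .get() passes through unchanged
    confidence = 1.0 if confidence is None else float(confidence)
    return transcript, language, confidence


# Made-up stand-in for a Model A result
stub = {"full_text": "My leg hurts since Monday.", "dominant_language": "english"}
print(unpack_model_a_output(stub))
```

Note that a missing confidence defaults to 1.0, which reads as "fully confident"; callers that treat low confidence as a review trigger may want a more conservative default.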

---
## 🧪 Section 9: Test Cases

Run these to validate the pipeline before connecting to real audio. Covers English, Kinyarwanda, mixed language, a red flag case, and a sparse/minimal transcript.

In [None]:
TEST_CASES = [
    {
        "id"         : "TC-01 | English: standard presentation",
        "language"   : "english",
        "confidence" : 0.91,
        "transcript" : """
            I've had a headache for the past three days. It's mostly on the right side of my head.
            The pain is moderate, maybe a six out of ten. I also feel a bit nauseous, and
            light seems to make it worse. I took paracetamol yesterday but it didn't help much.
        """,
    },
    {
        "id"         : "TC-02 | Kinyarwanda: abdominal symptoms",
        "language"   : "kinyarwanda",
        "confidence" : 0.84,
        "transcript" : """
            Ndi n'ububabare mu nda kuva ejo. Ububabare ni uburemere kandi burarushaho.
            Sinashye neza ijoro rya ejo. Nk'aho mba nshaka kuruka ariko sinabikora.
        """,
        # Translation hint: Stomach pain since yesterday. Heavy pain, getting worse.
        # Couldn't sleep last night. Feels like vomiting but hasn't.
    },
    {
        "id"         : "TC-03 | Mixed language: code-switching",
        "language"   : "kinyarwanda",
        "confidence" : 0.77,
        "transcript" : """
            Chest pain yatangiye ejobundi. Ndi kwibaza niba ni heart problem.
            Ntushobora guhumeka neza iyo ugerageza gukora akazi.
            The pain shoots to my left arm sometimes.
        """,
        # Translation hint: Chest pain started two days ago. Wondering if it's a heart problem.
        # Can't breathe well when trying to work. Pain goes to left arm.
    },
    {
        "id"         : "TC-04 | English: sparse transcript",
        "language"   : "english",
        "confidence" : 0.60,
        "transcript" : "My leg hurts.",
    },
    {
        "id"         : "TC-05 | English: red flag case",
        "language"   : "english",
        "confidence" : 0.95,
        "transcript" : """
            I suddenly can't see properly out of my left eye. It started about an hour ago.
            I also feel very confused and my right arm feels weak. I've never had this before.
        """,
    },
]

print(f"✅ {len(TEST_CASES)} test cases loaded. Run the next cell to execute them.")

In [None]:
# ── Run all test cases ────────────────────────────────────────────────────────
test_results = {}

for tc in TEST_CASES:
    print(f"\n{'▓' * 62}")
    print(f"  TEST CASE: {tc['id']}")
    print(f"{'▓' * 62}")

    result = extract_clinical_information(
        transcript        = tc["transcript"],
        source_language   = tc["language"],
        source_confidence = tc["confidence"],
        verbose           = True,
    )
    test_results[tc["id"]] = result

print(f"\n\n✅ All {len(TEST_CASES)} test cases completed.")

---
## 🗂️ Section 10: Batch Processing & Export

In [None]:
import datetime

def batch_extract(
    transcripts : list,
    verbose     : bool = False,
) -> list:
    """
    Batch process a list of transcript dicts.

    Each item in `transcripts` should have:
        {
          'transcript'       : str,
          'source_language'  : str,   (optional)
          'source_confidence': float  (optional)
        }

    Returns:
        List of Model B output dicts.
    """
    results = []
    total   = len(transcripts)
    for i, item in enumerate(transcripts):
        print(f"  [{i+1}/{total}] Processing transcript...")
        res = extract_clinical_information(
            transcript        = item.get("transcript", ""),
            source_language   = item.get("source_language", "unknown"),
            source_confidence = item.get("source_confidence", 1.0),
            verbose           = verbose,
        )
        res["input_transcript"] = item.get("transcript", "")
        results.append(res)

        # Show one-line summary per item
        ext = res["extraction"]
        flag = "🚨" if ext.red_flags_present else ("✅" if res["extraction_success"] else "❌")
        print(f"     {flag} complaint='{ext.chief_complaint or 'N/A'}' "
              f"| severity='{ext.severity or 'N/A'}' "
              f"| confidence={res['source_confidence']:.2f}")

    return results


def export_batch_results(results: list, filepath: str = "model_b_extractions.json"):
    """Save batch results to a JSON file and download it."""
    export_data = {
        "timestamp"  : datetime.datetime.now().isoformat(),
        "model"      : GEMINI_MODEL_NAME,
        "total"      : len(results),
        "red_flags"  : sum(1 for r in results if r["extraction"].red_flags_present),
        "extractions": [
            {
                "index"             : i + 1,
                "source_language"   : r["source_language"],
                "source_confidence" : r["source_confidence"],
                "extraction_success": r["extraction_success"],
                "error"             : r.get("error"),
                "extraction"        : r["extraction_dict"],
            }
            for i, r in enumerate(results)
        ]
    }

    with open(filepath, "w", encoding="utf-8") as f:
        json.dump(export_data, f, ensure_ascii=False, indent=2)

    print(f"\n✅ Saved {len(results)} extraction(s) to: {filepath}")
    print(f"   Red flags detected: {export_data['red_flags']}")

    from google.colab import files
    files.download(filepath)


print("✅ Batch processing and export functions defined.")
print()
print("USAGE EXAMPLE:")
print("  transcripts = [")
print("    {'transcript': 'I have a headache...', 'source_language': 'english'},")
print("    {'transcript': \"Ndi n'ububabare...\", 'source_language': 'kinyarwanda'},")
print("  ]")
print("  results = batch_extract(transcripts)")
print("  export_batch_results(results)")
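
Once a batch has been exported, the JSON file can be reloaded and filtered without touching the API. The sketch below writes a tiny made-up export in the shape produced by `export_batch_results` (only a subset of its keys is shown), reads it back, and lists the entries whose `red_flags_present` is `true`. The sample data and the temporary path are illustrative assumptions.

```python
import json
import os
import tempfile

# Tiny made-up export in the shape written by export_batch_results()
export_data = {
    "total": 2,
    "extractions": [
        {"index": 1, "extraction_success": True,
         "extraction": {"chief_complaint": "headache", "red_flags_present": False}},
        {"index": 2, "extraction_success": True,
         "extraction": {"chief_complaint": "chest pain", "red_flags_present": True}},
    ],
}

# Write to a temporary directory so the sketch leaves no file behind in the cwd
path = os.path.join(tempfile.mkdtemp(), "model_b_extractions.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(export_data, f, ensure_ascii=False, indent=2)

with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

# `is True` skips both false and null (None) entries
flagged = [e["index"] for e in loaded["extractions"]
           if e["extraction"]["red_flags_present"] is True]
print(flagged)
```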

---
## 📊 Section 11: Prompt Calibration & Validation Workflow

Use this section to calibrate prompts against your manually annotated local dataset. Feed in annotated examples and compare model output against gold labels.

In [None]:
def evaluate_against_annotations(
    annotated_examples: list,
    fields_to_evaluate: tuple = ("chief_complaint", "severity", "body_part"),
) -> dict:
    """
    Compare model extractions against manually annotated gold labels.

    Args:
        annotated_examples : List of dicts, each with:
            {
              'transcript'   : str,
              'gold_labels'  : { 'chief_complaint': ..., 'severity': ..., ... }
            }
        fields_to_evaluate : Which fields to score.

    Returns:
        Dict with per-field match rates and full comparison table.
    """
    field_matches  = {f: 0 for f in fields_to_evaluate}
    field_totals   = {f: 0 for f in fields_to_evaluate}
    comparisons    = []

    for i, ex in enumerate(annotated_examples):
        print(f"\n  [{i+1}/{len(annotated_examples)}] Evaluating...")
        result   = extract_clinical_information(ex["transcript"], verbose=False)
        pred     = result["extraction_dict"]
        gold     = ex["gold_labels"]
        row      = {"index": i+1, "transcript_preview": ex["transcript"][:80]}

        for fld in fields_to_evaluate:
            gold_val = str(gold.get(fld, "")).strip().lower()
            pred_val = str(pred.get(fld, "")).strip().lower()
            # Soft match: a non-empty gold label must appear inside the prediction.
            # An empty gold label only matches an empty prediction ("" is a
            # substring of every string, so it needs its own branch).
            match    = (gold_val in pred_val) if gold_val else (pred_val == "")
            field_matches[fld] += int(match)
            field_totals[fld]  += 1
            row[f"{fld}_gold"]  = gold_val
            row[f"{fld}_pred"]  = pred_val
            row[f"{fld}_match"] = "✅" if match else "❌"
            print(f"     {fld}: gold='{gold_val}' | pred='{pred_val}' | {'✅' if match else '❌'}")

        comparisons.append(row)

    # Summary
    print("\n" + "─" * 50)
    print(" CALIBRATION SUMMARY")
    print("─" * 50)
    summary = {}
    for fld in fields_to_evaluate:
        rate = field_matches[fld] / max(field_totals[fld], 1)
        summary[fld] = round(rate, 3)
        bar = "█" * int(rate * 20) + "░" * (20 - int(rate * 20))
        print(f"  {fld:<28}: {bar}  {rate*100:.1f}%")

    return {"per_field_accuracy": summary, "comparisons": comparisons}


# ── Example annotated set: replace with your real annotations ────────────────
SAMPLE_ANNOTATED = [
    {
        "transcript": "I've had a headache for two days, mostly on the right side. It's quite severe.",
        "gold_labels": {"chief_complaint": "headache", "duration": "two days", "severity": "severe", "body_part": "head"},
    },
    {
        "transcript": "My stomach has been hurting since this morning. Moderate pain.",
        "gold_labels": {"chief_complaint": "stomach pain", "duration": "since this morning", "severity": "moderate", "body_part": "stomach"},
    },
]

# Uncomment to run calibration:
# calibration_results = evaluate_against_annotations(SAMPLE_ANNOTATED)

print("✅ Calibration framework defined.")
print("   Uncomment the last line to run with SAMPLE_ANNOTATED or your own dataset.")
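
The containment rule in `evaluate_against_annotations` is sensitive to word order: a gold label of "stomach pain" does not match a prediction of "pain in the stomach". The following is a minimal sketch of a word-set comparison that tolerates reordering. The function name `soft_match` is hypothetical, and this is an alternative scoring rule, not the one the notebook implements.

```python
def soft_match(gold: str, pred: str) -> bool:
    """True when every word of the gold label appears in the prediction."""
    gold_tokens = set(gold.lower().split())
    pred_tokens = set(pred.lower().split())
    if not gold_tokens:
        # An empty gold label should only match an empty prediction
        return not pred_tokens
    # Subset test: order and extra words in the prediction are ignored
    return gold_tokens <= pred_tokens


print(soft_match("stomach pain", "Pain in the stomach"))
print(soft_match("head", "headache"))
```

The trade-off is that whole-word comparison is stricter about partial words ("head" no longer matches "headache") and does not strip punctuation, so labels may need light normalisation before scoring.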

---
## 📝 Notes & Design Decisions

| Topic | Detail |
|---|---|
| **Temperature = 0.0** | Selects the most likely token at each step, which makes schema outputs as repeatable as the API allows; occasional run-to-run variation is still possible. |
| **Red flag override** | A keyword vocabulary layer cross-checks Gemini's `red_flags_present` against the extracted fields (not the raw transcript) and can only raise the flag to `True`. Matching is by substring, so short or generic terms can produce false positives. |
| **No diagnosis constraint** | Enforced by the hard-constraint block in the system prompt. `validate_and_coerce` checks types and red flags only; it does not screen the output for diagnostic language. |
| **`additional_observations`** | Intentionally flexible: preserves patient affect, contextual clues, and language-switch patterns that don't map to schema fields. |
| **Kinyarwanda support** | Gemini 1.5 has multilingual support. Kinyarwanda extraction quality should be validated against your annotated set and the prompt refined if needed. |
| **Low-confidence transcripts** | If `source_confidence < 0.5`, consider flagging the extraction result as provisional for doctor review. |
| **Model swap** | Replace `gemini-1.5-flash` with `gemini-1.5-pro` in `GEMINI_MODEL_NAME` for higher extraction quality in production. |
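
The low-confidence note in the table can be turned into a small post-processing step. The sketch below adds a `provisional` flag to a Model B result when the extraction failed or `source_confidence` falls below a threshold. The function name `mark_provisional`, the `provisional` key, and treating failed extractions as provisional are assumptions for illustration; only the 0.5 threshold comes from the table.

```python
from typing import Any, Dict


def mark_provisional(result: Dict[str, Any], min_confidence: float = 0.5) -> Dict[str, Any]:
    """Return a copy of a Model B result with a `provisional` flag added."""
    confidence = result.get("source_confidence", 1.0)
    failed = not result.get("extraction_success", False)
    flagged = dict(result)  # shallow copy; the input dict is left untouched
    flagged["provisional"] = failed or confidence < min_confidence
    return flagged


# Made-up result dicts carrying only the two keys the helper reads
print(mark_provisional({"source_confidence": 0.4, "extraction_success": True})["provisional"])
print(mark_provisional({"source_confidence": 0.9, "extraction_success": True})["provisional"])
```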