# Discharge Navigator — Kaggle Entrypoint

**One-click reproducibility.** Run All → repo cloned → model loaded → live extraction → evidence displayed.

| Component | Detail |
|-----------|--------|
| Model | MedGemma 4B IT |
| Inference | HuggingFace Transformers (Kaggle T4) or Ollama (local CPU) |
| Dataset | MTSamples (CC0, Kaggle) |
| Pipeline | Extract → Verify → Validate → Clinician Review |

**Auto-detects environment:**
- Kaggle T4 (or any CUDA GPU) with `HF_TOKEN` set → live inference via HuggingFace Transformers
- Local machine with Ollama running → live inference via Ollama (CPU, offline)
- Neither available → pre-computed evidence pack
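
The fallback order above can be sketched as a small pure function. This is a minimal illustration, not the notebook's actual code: `choose_backend` and its parameters are hypothetical names, and the return values mirror the `BACKEND` labels used in Step 1 (`'hf'`, `'ollama'`, or `None` for evidence mode).

```python
from typing import Optional, Sequence


def choose_backend(
    has_cuda: bool,
    hf_token: Optional[str],
    ollama_models: Sequence[str],
) -> Optional[str]:
    """Return 'hf', 'ollama', or None (evidence mode), in that priority order.

    Hypothetical helper: the name and signature are assumptions made for
    illustration only.
    """
    # A GPU alone is not enough; Step 1 only loads the model when a
    # HuggingFace token is present.
    if has_cuda and hf_token:
        return 'hf'
    # Otherwise fall back to a locally running Ollama that lists any model.
    if ollama_models:
        return 'ollama'
    # Nothing usable: show the pre-computed evidence pack instead.
    return None
```

Keeping the decision separate from the probing (CUDA check, secret lookup, Ollama query) makes the priority order easy to test without a GPU.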

## Step 0 — Clone Repo + Install Dependencies

In [None]:
import subprocess, sys, os
from pathlib import Path

# --- Clone the repo if not already present ---
REPO_URL = 'https://github.com/LegenDairy93/discharge-navigator.git'
REPO_DIR = Path('discharge-navigator')

if not REPO_DIR.exists():
    print('Cloning repository...')
    subprocess.check_call(['git', 'clone', '--depth', '1', REPO_URL, str(REPO_DIR)])
    print('Clone complete.')
else:
    print(f'Repo already exists at {REPO_DIR}')

# Repo root IS the project root
PROJECT_DIR = REPO_DIR

# Verify project structure
assert (PROJECT_DIR / 'src' / 'demo_app.py').exists(), 'Missing src/demo_app.py'
assert (PROJECT_DIR / 'eval' / 'results' / 'metrics_summary.json').exists(), 'Missing eval results'
print(f'Project root: {PROJECT_DIR.resolve()}')

# --- Install dependencies ---
deps = ['requests', 'pydantic', 'pandas', 'gradio', 'matplotlib']
for pkg in deps:
    try:
        __import__(pkg)
    except ImportError:
        subprocess.check_call([sys.executable, '-m', 'pip', 'install', '-q', pkg])
        print(f'Installed {pkg}')
    else:
        print(f'{pkg} OK')

# Add the project root to sys.path so that `from src...` imports resolve
project_path = str(PROJECT_DIR.resolve())
if project_path not in sys.path:
    sys.path.insert(0, project_path)

print('\nAll dependencies ready.')
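
The install loop above imports each package to test for it. A lighter-weight check, sketched below, uses `importlib.util.find_spec` to see whether a package is installed without executing it. It assumes the import name equals the pip name, which holds for the five packages listed. `missing_packages` is a hypothetical helper name.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level package names that cannot be found on sys.path."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example: 'json' ships with Python; the second name is deliberately bogus.
print(missing_packages(['json', 'surely_not_installed_pkg_xyz']))
```

The resulting list could then be passed to a single `pip install` call instead of one call per package.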

## Step 1 — Detect Environment + Load Model

In [None]:
import json, time

BACKEND = None  # 'hf', 'ollama', or None (evidence mode)
hf_model = None
hf_tokenizer = None
ollama_model = None

is_kaggle = os.path.exists('/kaggle')

# --- Try HuggingFace (Kaggle T4 or any CUDA GPU) ---
try:
    import torch
    if torch.cuda.is_available():
        print(f'GPU detected: {torch.cuda.get_device_name(0)}')
        print(f'VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB')

        # Authenticate with HuggingFace
        try:
            if is_kaggle:
                from kaggle_secrets import UserSecretsClient
                hf_token = UserSecretsClient().get_secret('HF_TOKEN')
            else:
                hf_token = os.environ.get('HF_TOKEN')

            if hf_token:
                from huggingface_hub import login
                login(token=hf_token, add_to_git_credential=False)
                print('HuggingFace authenticated.')

                from src.hf_backend import load_medgemma
                hf_model, hf_tokenizer = load_medgemma()
                BACKEND = 'hf'
                print('\nBackend: HuggingFace Transformers (GPU)')
            else:
                print('No HF_TOKEN found. Set it as a Kaggle Secret or env var.')
        except Exception as e:
            print(f'HuggingFace loading failed: {e}')
    else:
        print('No CUDA GPU detected.')
except ImportError:
    print('PyTorch not installed (expected on local without ML stack).')

# --- Fallback: Try Ollama (local CPU) ---
if BACKEND is None:
    try:
        from src.navigator import check_ollama, select_model
        models = check_ollama()
        if models:
            ollama_model = select_model(models)
            BACKEND = 'ollama'
            print(f'\nBackend: Ollama (CPU) \u2014 model: {ollama_model}')
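    except Exception as e:
        # Ollama is optional; report why it was skipped instead of failing silently
        print(f'Ollama not available: {e}')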

# --- Final fallback: Evidence mode ---
if BACKEND is None:
    print('\nBackend: EVIDENCE MODE (pre-computed results)')
    print('For live inference, either:')
    print('  - Enable GPU + add HF_TOKEN secret (Kaggle)')
    print('  - Start Ollama with MedGemma model (local)')
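
The internals of `check_ollama` and `select_model` live in the repository and are not shown in this notebook. As a rough sketch of what such helpers might do, the code below assumes the model list comes from Ollama's `GET /api/tags` endpoint, whose JSON response has a `models` array with a `name` per entry, and prefers any model whose name contains `medgemma`. Both function names and the preference rule are hypothetical.

```python
from typing import Any, Dict, List, Optional


def model_names_from_tags(payload: Dict[str, Any]) -> List[str]:
    """Extract model names from a parsed /api/tags response (assumed shape)."""
    return [m['name'] for m in payload.get('models', []) if 'name' in m]


def pick_model(names: List[str], preferred: str = 'medgemma') -> Optional[str]:
    """Prefer a name containing `preferred`; else the first model; else None."""
    for name in names:
        if preferred in name.lower():
            return name
    return names[0] if names else None


# Hand-written payload in the assumed shape; the model tags are made up.
sample = {'models': [{'name': 'llama3:8b'}, {'name': 'medgemma-4b-it:q4_k_m'}]}
print(pick_model(model_names_from_tags(sample)))
```

Working on the parsed payload rather than issuing the HTTP request keeps the selection rule testable on machines where Ollama is not running.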

## Step 2 — Smoke Test (1 Note)

In [None]:
from src.grounding import grounding_report

if BACKEND in ('hf', 'ollama'):
    golden_path = PROJECT_DIR / 'data' / 'golden_note.txt'
    test_note = golden_path.read_text(encoding='utf-8')
    print(f'Smoke test note: {len(test_note)} chars')
    print('Running extraction...\n')

    t0 = time.time()

    if BACKEND == 'hf':
        from src.navigator import generate_packet_hf
        packet, raw = generate_packet_hf(
            test_note, model=hf_model, tokenizer=hf_tokenizer, return_raw=True
        )
    else:
        from src.navigator import generate_packet
        packet, raw = generate_packet(
            test_note, model=ollama_model, return_raw=True
        )

    elapsed = time.time() - t0

    if packet is None:
        print(f'SMOKE TEST FAILED \u2014 could not parse output in {elapsed:.1f}s')
        print(f'Raw output preview:\n{raw[:500]}')
    else:
        d = packet.model_dump()
        report = grounding_report(packet, test_note)
        print(f'SMOKE TEST PASSED in {elapsed:.1f}s')
        print(f'  Diagnoses:   {len(d["diagnoses"]):>2}  (grounded: {report["diagnoses_grounded_ratio"]:.0%})')
        print(f'  Medications: {len(d["medications"]):>2}  (grounded: {report["meds_grounded_ratio"]:.0%})')
        print(f'  Follow-ups:  {len(d["followups"]):>2}')
        print(f'  Red flags:   {len(d["red_flags"]):>2}')
        print(f'  Missing info:{len(d["missing_info"]):>2}')
        print(f'  Overall grounded: {report["overall_grounded_ratio"]:.0%}')
else:
    print('Skipping live smoke test (no model available).\n')
    print('Pre-computed sample (note_002):')
    sample_path = PROJECT_DIR / 'eval' / 'results' / 'samples' / 'note_002.json'
    if sample_path.exists():
        sample = json.loads(sample_path.read_text(encoding='utf-8'))
        print(f'  Diagnoses:   {len(sample.get("diagnoses", []))}')
        print(f'  Medications: {len(sample.get("medications", []))}')
        print(f'  Follow-ups:  {len(sample.get("followups", []))}')
    print('\nFull eval: 46/50 parsed, 34s median, 94% diagnosis grounding.')
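
The grounded percentages reported above come from `grounding_report`, whose implementation is in the repository. To illustrate the idea only, the sketch below assumes the simplest possible rule: an extracted item counts as grounded when its text appears verbatim (case-insensitively) in the source note. `grounded_ratio` is a hypothetical function, and the real matching logic may differ.

```python
from typing import Sequence


def grounded_ratio(items: Sequence[str], note: str) -> float:
    """Fraction of items found verbatim in the note, ignoring case.

    Returns 1.0 for an empty list: with nothing extracted, nothing is ungrounded.
    """
    if not items:
        return 1.0
    haystack = note.lower()
    hits = sum(1 for item in items if item.strip().lower() in haystack)
    return hits / len(items)


# Made-up note text for illustration.
note = 'Patient admitted with pneumonia. Started on amoxicillin 500 mg.'
print(f"{grounded_ratio(['Pneumonia', 'Diabetes'], note):.0%}")  # prints 50%
```

Exact substring matching is strict: a paraphrased diagnosis would count as ungrounded under this rule, which is one reason a production check may use looser matching.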

## Step 3 — Evidence Pack Summary

**Note on inference backends:** The live smoke test above runs MedGemma 4B at bfloat16 precision on GPU via HuggingFace Transformers. The evidence pack below was generated offline with the same model quantized to Q4_K_M (2.5 GB) and served by Ollama on CPU, which is the edge deployment target. Both paths share the same prompts, schema validation, and grounding logic, so minor metric differences reflect quantization rather than pipeline changes.
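
As a back-of-the-envelope check on the sizes involved, weight storage is roughly parameters × bits per weight ÷ 8. The sketch below assumes a nominal 4 billion parameters (taken from the "4B" in the model name) and back-solves the effective bits per weight from the 2.5 GB figure quoted above. Neither number is measured by this notebook.

```python
def weights_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes), ignoring runtime overhead."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 4e9  # assumption: nominal size implied by the "4B" in the model name

# bfloat16 stores 16 bits per weight.
print(f'bf16:   {weights_gb(N_PARAMS, 16):.1f} GB')

# Effective bits per weight implied by a 2.5 GB file (back-solved, not measured).
implied_bits = 2.5e9 * 8 / N_PARAMS
print(f'Q4_K_M: about {implied_bits:.1f} bits per weight')
```
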

In [None]:
results_dir = PROJECT_DIR / 'eval' / 'results'

with open(results_dir / 'metrics_summary.json', encoding='utf-8') as f:
    summary = json.load(f)

print('=' * 60)
print('  DISCHARGE NAVIGATOR \u2014 EVALUATION SUMMARY')
print('=' * 60)
print(f'  Model:      {summary["model"]}')
print(f'  Quant:      {summary["quantization"]}')
print(f'  Inference:  {summary["inference"]}')
print(f'  Dataset:    {summary["dataset"]}')
print(f'  Notes:      {summary["total_notes"]}')
print('-' * 60)
print(f'  Parse rate:           {summary["json_valid_rate"]:.0%}  ({summary["json_valid_count"]}/{summary["total_notes"]})')
print(f'  Median latency:       {summary["median_latency_s"]:.0f}s')
print(f'  P95 latency:          {summary["p95_latency_s"]:.0f}s')
print(f'  Dx grounded (mean):   {summary["diagnoses_grounded_mean"]:.0%}')
print(f'  Meds grounded (mean): {summary["medications_grounded_mean"]:.0%}')
print(f'  Overall grounded:     {summary["overall_grounded_mean"]:.0%}')
print('-' * 60)
for k, v in summary['success_criteria'].items():
    status = 'PASS' if v else 'FAIL'
    print(f'  {k}: {status}')
print('=' * 60)
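
The median and P95 latencies printed above are read from `metrics_summary.json`; how the repository computes them is not shown here. One common convention is the nearest-rank percentile, sketched below. `nearest_rank_percentile` is a hypothetical helper and the latency values are made up for illustration.

```python
import math
import statistics
from typing import Sequence


def nearest_rank_percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% at or below it."""
    if not values:
        raise ValueError('values must not be empty')
    if not 0 < pct <= 100:
        raise ValueError('pct must be in (0, 100]')
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


# Made-up latencies in seconds, not taken from the evaluation run.
latencies = [28.0, 31.5, 33.0, 34.0, 36.5, 40.0, 45.0, 52.0, 61.0, 90.0]
print(f'median: {statistics.median(latencies):.1f}s')
print(f'p95:    {nearest_rank_percentile(latencies, 95):.1f}s')
```

With small samples the nearest-rank P95 is simply one of the slowest observations, so a single outlier can dominate it; interpolating methods give smoother but less literal values.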

In [None]:
from IPython.display import Image, display

hist_path = results_dir / 'latency_histogram.png'
if hist_path.exists():
    display(Image(filename=str(hist_path), width=800))
else:
    print('Histogram not found.')

## Step 4 — Launch Trust Surface Demo

Interactive demo with a traceability panel, a reliability board, and failure analysis.
The app is launched with `share=True`, so Gradio generates a public share link. On Kaggle, use that link to open the demo.

In [None]:
from src.demo_app import build_app

app = build_app()

app.launch(
    share=True,
    server_name='0.0.0.0',
    server_port=7860,
)
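
The launch call above always requests a public share link. If that is only wanted on Kaggle, the arguments could be chosen from the environment, as in the sketch below. `launch_kwargs` is a hypothetical helper; `share`, `server_name`, and `server_port` are the same Gradio `launch()` parameters used above, and binding to `127.0.0.1` locally is an assumed preference rather than the notebook's behavior.

```python
import os
from typing import Any, Dict


def launch_kwargs(is_kaggle: bool, port: int = 7860) -> Dict[str, Any]:
    """Choose Gradio launch() arguments for the current environment."""
    if is_kaggle:
        # Same settings as the notebook's launch call, including the share link.
        return {'share': True, 'server_name': '0.0.0.0', 'server_port': port}
    # Locally, keep the app on the loopback interface with no public link.
    return {'share': False, 'server_name': '127.0.0.1', 'server_port': port}


# Same detection rule as Step 1: the /kaggle directory exists on Kaggle.
kwargs = launch_kwargs(os.path.exists('/kaggle'))
print(kwargs)
# app.launch(**kwargs)  # `app` is the object returned by build_app() above
```
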