Claims AI Agent

A medical claims coding agent built on Google ADK and Gemini 2.5 Flash. It reads a patient's live FHIR R4 clinical record, assigns CPT procedure codes, detects billing errors, and produces an auditable claims summary with a CMS citation for every coding decision.

Built for the Prompt Opinion Agent Assemble competition.

Architecture

The agent integrates with the Prompt Opinion platform via the A2A protocol and the SHARP FHIR extension. Patient context (FHIR URL, token, patient ID) is injected into the A2A message metadata at runtime.
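
As a rough illustration of what pulling that context out of the message metadata might look like, here is a minimal sketch. The key names (`fhirUrl`, `fhirToken`, `patientId`) and the `FhirContext` type are assumptions made for this example; they are not the actual SHARP extension schema or the code in shared/fhir_hook.py.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class FhirContext:
    """Hypothetical container for the three injected values."""
    fhir_url: str
    token: str
    patient_id: str


def extract_fhir_context(metadata: Mapping[str, Any]) -> Optional[FhirContext]:
    """Return the FHIR context if all three fields are present, else None."""
    # Assumed key names; the real extension may namespace or nest them.
    url = metadata.get("fhirUrl")
    token = metadata.get("fhirToken")
    patient_id = metadata.get("patientId")
    if not (url and token and patient_id):
        return None
    # Normalize the base URL so callers can append "/Patient/<id>" safely.
    return FhirContext(
        fhir_url=str(url).rstrip("/"),
        token=str(token),
        patient_id=str(patient_id),
    )
```

Returning `None` on partial metadata lets the caller decide whether to reject the request or fall back to defaults, rather than failing deep inside a FHIR call.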

Companion repo: po-community-mcp, a TypeScript MCP server that serves CMS billing rules to the agent.

Stack

  • Python 3.12
  • Google ADK
  • Gemini 2.5 Flash (thinking_budget=0)
  • FHIR R4 (Prompt Opinion SHARP extension)
  • A2A Protocol
  • Chroma (payer-specific policy RAG)
  • uvicorn

Setup

git clone https://github.com/evantofu/po-adk-python
cd po-adk-python
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Copy .env.example to .env and fill in:

GOOGLE_API_KEY=
CLAIMS_MCP_URL=http://localhost:5000/mcp
EVAL_FHIR_URL=
EVAL_FHIR_TOKEN=
MIDDLEWARE_API_KEY=
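
A quick pre-flight check can catch an incomplete .env before the server starts. The sketch below is one possible approach, not part of the repo; treating all five variables as required is an assumption (the `EVAL_*` values may only matter for the evaluation scripts).

```python
import os
from typing import Mapping

# Variable names taken from .env.example above; "required" is an assumption.
REQUIRED_VARS = (
    "GOOGLE_API_KEY",
    "CLAIMS_MCP_URL",
    "EVAL_FHIR_URL",
    "EVAL_FHIR_TOKEN",
    "MIDDLEWARE_API_KEY",
)


def missing_settings(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of variables that are unset or blank, in order."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        raise SystemExit("Missing settings: " + ", ".join(missing))
    print("All settings present.")
```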

Running

Start the MCP server first (see companion repo), then:

PYTHONUNBUFFERED=1 uvicorn healthcare_agent.app:a2a_app \
  --host 0.0.0.0 --port 8001 2>&1 | tee /tmp/uvicorn.log

Expose via ngrok:

ngrok http 8001

Token Refresh

The FHIR token is obtained by triggering a consult in the Prompt Opinion platform. Once the token appears in the uvicorn log, run:

./refresh_token.sh
export EVAL_FHIR_TOKEN="$(grep '^EVAL_FHIR_TOKEN=' .env | cut -d= -f2-)"
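
For scripts that need the token from Python rather than the shell, a minimal sketch of reading one key from .env-style text follows. The helper name `read_env_value` is hypothetical. Because tokens can contain `=` characters (base64 padding, for instance), it splits on the first `=` only.

```python
from pathlib import Path
from typing import Optional


def read_env_value(text: str, key: str) -> Optional[str]:
    """Return the value for `key` from .env-style text, or None if absent."""
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        # partition() splits on the first "=" only, so "=" inside the
        # value (e.g. base64 padding) is preserved.
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


if __name__ == "__main__":
    token = read_env_value(Path(".env").read_text(), "EVAL_FHIR_TOKEN")
    print("token found" if token else "token missing")
```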

Evaluation

Run all golden cases:

cd eval
python run_evals.py

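Chaos test (25 iterations × 6 cases = 150 runs). Run it from the repository root; caffeinate is a macOS utility that keeps the machine awake and can be dropped on other systems: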

caffeinate python eval/chaos_run.py --iterations 25 --sleep 10
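
To make the idea concrete, here is a minimal sketch of a chaos loop that repeats every case, retries transient errors, and tallies outcomes. It is an illustration under stated assumptions, not the logic of eval/chaos_run.py: `TransientError`, `run_case`, `max_retries`, and the use of `sleep_s` as a retry delay are all hypothetical.

```python
import time
from collections import Counter
from typing import Callable, Iterable


class TransientError(Exception):
    """Hypothetical marker for recoverable failures (timeouts, rate limits)."""


def chaos_run(
    case_ids: Iterable[str],
    run_case: Callable[[str], bool],
    iterations: int,
    sleep_s: float = 0.0,
    max_retries: int = 2,
) -> Counter:
    """Run every case `iterations` times and count outcomes."""
    tally: Counter = Counter()
    cases = list(case_ids)
    for _ in range(iterations):
        for case_id in cases:
            for attempt in range(max_retries + 1):
                try:
                    passed = run_case(case_id)
                except TransientError:
                    if attempt == max_retries:
                        tally["unrecovered"] += 1
                        break
                    time.sleep(sleep_s)  # back off, then retry the same case
                    continue
                if attempt > 0:
                    tally["transient_recovered"] += 1
                # A completed run that returns False is a logic failure.
                tally["passed" if passed else "logic_failures"] += 1
                break
    return tally
```

Separating transient errors from wrong answers keeps infrastructure noise out of the pass rate: a retried timeout still counts as a pass if the final answer is correct, while a wrong code assignment is never retried away.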

Results

| Metric | Value |
| --- | --- |
| F1 Score (all cases) | 1.000 |
| Pass rate (chaos testing) | 150/150 (100%) |
| Independent runs | 2 |
| Transient failures recovered | 5 |
| Logic failures | 0 |
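
The README does not spell out how F1 is computed. One common approach, assumed here, is set-based F1 over the billing codes assigned for a case. The sketch below illustrates that approach and is not the implementation in eval/scorer.py; the codes in the usage example are placeholders.

```python
def f1_score(expected: set[str], predicted: set[str]) -> float:
    """Set-based F1 between expected and predicted code sets."""
    if not expected and not predicted:
        return 1.0  # nothing to code and nothing coded: perfect agreement
    true_positives = len(expected & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(expected)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # One correct code, one wrong code, one missed code.
    print(f1_score({"G0439", "99497"}, {"G0439", "G0438"}))
```

Under this definition a case scores 1.000 only when the predicted set matches the expected set exactly, so both an extra code and a missed code lower the score.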

Cases

| Case | Encounter Type | Payer |
| --- | --- | --- |
| tamera_preventive_v1 | Preventive | Blue Cross Blue Shield |
| marcus_chronic_care_v1 | Chronic care | Aetna |
| jaylen_pediatric_v1 | Pediatric well-child | UnitedHealthcare |
| alex_asthma_v1 | Asthma | Commercial |
| dorothy_awv_v1 | Annual Wellness Visit | Medicare |
| eleanor_wrong_codes_v1 | AWV + wrong-code correction | Medicare |

Project Structure

po-adk-python/
├── healthcare_agent/
│   ├── agent.py              # Main agent definition
│   ├── app.py                # A2A app entry point
│   └── tools/
│       └── claims.py         # FHIR tool implementations
├── shared/
│   ├── app_factory.py        # Custom Runner with InMemorySessionService
│   ├── fhir_hook.py          # FHIR context extraction from A2A metadata
│   ├── middleware.py         # Request/response middleware
│   └── tools/
│       └── fhir.py           # Shared FHIR utilities
└── eval/
    ├── run_evals.py          # Golden case runner
    ├── chaos_run.py          # Chaos testing framework
    ├── runner.py             # A2A client and response parser
    ├── scorer.py             # F1 scoring logic
    ├── golden_cases.py       # Case loader and data models
    └── cases/                # Golden case definitions (JSON)
