domcushnan/avt-reference

avt-reference

An open-source ambient-voice-technology (AVT) reference implementation that is conformant-by-construction — it measures itself against the AVT Metrics Taxonomy (the model of experts) and reports which UK health-tech regulatory gates it would face.

NOT FOR CLINICAL DEPLOYMENT. A clinical ambient-scribe that writes to the record is a medical device. Deploying it triggers MHRA registration, UKCA, classification, DCB0129 + DCB0160, DPIA and post-market surveillance — see COMPLIANCE.md. This repo runs on synthetic data only and exists to demonstrate the measurement × regulation method, not to be used in care.

The idea

Most AI scribes ship the pipeline and stop. This ships the pipeline with its own assurance harness and a generated compliance map — the union nobody usually packages together:

  1. Pipeline — capture → transcribe (OpenAI) → summarise into a structured note → FHIR-shaped write-back. (Architecture echoes i-dot-ai/minute.)
  2. Assurance harness — the taxonomy's metrics as an executable test suite scoring the pipeline's own output (WER, hallucination, omission, negation, write-back fidelity). The computable metrics run; the rest are honestly marked "declared — not automated here".
  3. Compliance manifest (COMPLIANCE.md), generated from the regulatory twin: device class → gates → which metric evidences each.

The taxonomy is the spec; the harness is the acceptance test; the twin says whether it would be legal. Build to pass the harness.
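
As an illustration of what a "computable metric" in the harness can look like, here is a minimal sketch of word error rate (WER), computed as word-level edit distance divided by reference length. The function name and the lowercase-and-split normalisation are assumptions made for this example; the repository's own implementation may tokenise and normalise differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / words in reference."""
    # Assumption: simple lowercase + whitespace tokenisation.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Row-by-row Levenshtein distance over words.
    # prev[j] holds the distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of five reference words.
wer = word_error_rate("the patient denies chest pain",
                      "the patient has chest pain")
print(f"WER: {wer:.2f}")
```

The example also shows why WER alone is not enough for a clinical scribe: a single substitution ("denies" to "has") scores a low error rate while inverting the clinical meaning, which is the kind of failure the negation metric is meant to catch.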

Quickstart

uv sync                         # or: pip install -e .
cp .env.example .env            # add your OPENAI_API_KEY

avt compliance                  # generate COMPLIANCE.md (no API key needed)
avt synth                       # make synthetic consultation audio (OpenAI TTS)
avt demo                        # ASR → note → FHIR → assurance scorecard

avt demo runs the full loop on synthetic/encounter_01/ (a fictional GP consultation) and prints a scorecard: each computable metric's score, and how many of the taxonomy's metrics this reference automates, by tier.
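
The "how many metrics are automated, by tier" part of the scorecard can be pictured with a short sketch. Everything here is hypothetical: the `MetricResult` type, the tier labels, the placeholder metric names and the scores are invented for illustration and are not the repository's data model or real results.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class MetricResult:
    name: str
    tier: str                      # hypothetical tier label
    score: Optional[float] = None  # None means "declared, not automated here"


def coverage_by_tier(results: list[MetricResult]) -> dict[str, tuple[int, int]]:
    """Return {tier: (automated_count, total_count)}."""
    counts: dict[str, list[int]] = defaultdict(lambda: [0, 0])
    for result in results:
        counts[result.tier][1] += 1
        if result.score is not None:
            counts[result.tier][0] += 1
    return {tier: (automated, total) for tier, (automated, total) in counts.items()}


# Illustrative values only, not output from the real pipeline.
results = [
    MetricResult("WER", "tier-a", 0.08),
    MetricResult("negation", "tier-a", 0.95),
    MetricResult("placeholder-declared-metric", "tier-a"),
    MetricResult("another-declared-metric", "tier-b"),
]

for tier, (automated, total) in sorted(coverage_by_tier(results).items()):
    print(f"{tier}: {automated}/{total} automated")
```

Keeping unautomated metrics in the list with no score, rather than dropping them, is what lets a scorecard report coverage honestly alongside the scores.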

Layout

avt_reference/pipeline/   transcribe (OpenAI) · summarise · FHIR
avt_reference/harness/    computable metrics + scorer (joins to the taxonomy)
avt_reference/compliance  COMPLIANCE.md generator
twin_export/              snapshot of the taxonomy corpus + gate map (from the twin)
synthetic/                fictional encounter fixtures (no PII, ever)

Clinical concepts

The reference pipeline summarises into a structured note and FHIR-shaped write-back, but it does not link concepts to a clinical terminology. A companion live demo (in the regulatory-twin app, gated) adds a concept-grounding layer on top of the same pipeline: it pulls the note into discrete clinical concepts (condition | medication | symptom | procedure | allergy | finding), tags negation, attaches candidate SNOMED CT / dm+d terms, and runs a grounding pass that checks each concept back against the source transcript — surfacing a Concept grounding rate metric (joins the taxonomy's Write-back Fidelity + Clinical Keyword clusters).

⚠️ Those SNOMED / dm+d terms are LLM-suggested candidates, not terminology-server lookups — illustrative only. The NHS production path is a SNOMED-linked NLP pipeline: MedCAT / CogStack (or Amazon Comprehend Medical / Azure Text Analytics for Health). An LLM guessing a code is not the same as resolving one.
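
To make the Concept grounding rate idea concrete, the sketch below scores the fraction of extracted concepts whose text can be found in the source transcript. It is a deliberately naive stand-in: the `Concept` type, the function names and the case-insensitive substring check are assumptions for illustration, and the gated demo's actual grounding pass is not documented here.

```python
from dataclasses import dataclass


@dataclass
class Concept:
    text: str
    kind: str  # condition | medication | symptom | procedure | allergy | finding
    negated: bool = False


def is_grounded(concept: Concept, transcript: str) -> bool:
    # Assumption: a concept counts as grounded if its text appears verbatim
    # (case-insensitively) in the transcript. A real pass would need to
    # handle paraphrase, abbreviations and synonyms.
    return concept.text.lower() in transcript.lower()


def concept_grounding_rate(concepts: list[Concept], transcript: str) -> float:
    """Fraction of concepts that can be traced back to the transcript."""
    if not concepts:
        raise ValueError("no concepts to score")
    grounded = sum(is_grounded(c, transcript) for c in concepts)
    return grounded / len(concepts)


# Fictional transcript and concepts, in the spirit of the synthetic fixtures.
transcript = "I've had a dry cough for a week. No chest pain. I'm taking amoxicillin."
concepts = [
    Concept("dry cough", "symptom"),
    Concept("chest pain", "symptom", negated=True),
    Concept("amoxicillin", "medication"),
    Concept("asthma", "condition"),  # not mentioned in the transcript
]

print(f"Concept grounding rate: {concept_grounding_rate(concepts, transcript):.2f}")
```

Here "asthma" is the ungrounded concept: it appears in the note-side list but nowhere in the transcript, so it lowers the rate and would be flagged for review.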

Honest limits

  • Synthetic only. Real clinical audio is governance-gated; a real-data, deployable version is a separate, regulated programme.
  • The LLM-judge metrics (hallucination/omission/negation) are themselves taxonomy-flagged as unvalidated — treat the numbers as indicative, not ground truth.
  • The compliance map is descriptive, generated from a model — not legal advice or a conformity claim.

Credits & licence

Code MIT (see LICENSE). Builds on the AVT Metrics Taxonomy (Dan Schofield, CC BY 4.0) and the i-dot-ai/minute architecture — see NOTICE.
