# AI-Powered Clinical Documentation Assistant

## Problem overview

Healthcare professionals face a significant burden from medical documentation. This project uses generative AI to reduce that burden by automatically extracting structured information from physician-patient audio conversations and using it to pre-fill administrative forms, generating data points that can be stored electronically in EMRs and EHRs.

We focus in particular on how the workflow presented below can be used to fill in a medical history form from an audio recording.

This tool outputs data in a FHIR-compatible format, a standardized and interoperable representation intended to ease integration with existing healthcare systems. The structured output also makes the data reusable in clinical workflows, analytics, and future healthcare applications.

## Solution architecture

This tool implements a RAG-based approach to discover the form ([Questionnaire](https://www.hl7.org/fhir/R4/questionnaire.html)) that matches a user's prompt.

It then generates a [QuestionnaireResponse](https://www.hl7.org/fhir/R4/questionnaireresponse.html), which represents one submission of that form.

In place of a FHIR-compatible server, we use a JSON file as a placeholder questionnaire repository.
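
To make the two resource types concrete, the sketch below pairs a minimal Questionnaire with a matching QuestionnaireResponse. The form content, the id, and the helper `empty_response_for` are hypothetical illustrations, not entries from the sample repository; only the field names (`resourceType`, `item`, `linkId`, `answer`, `valueString`) come from the FHIR R4 specification.

```python
# Hypothetical, hand-written example of one entry in the JSON questionnaire repository.
questionnaire = {
    "resourceType": "Questionnaire",
    "id": "medical-history-demo",  # assumed id, for illustration only
    "title": "Medical history",
    "status": "active",
    "description": "Collects past conditions and current medications.",
    "item": [
        {"linkId": "1", "text": "Past medical conditions", "type": "string"},
        {"linkId": "2", "text": "Current medications", "type": "string"},
    ],
}


def empty_response_for(quest: dict) -> dict:
    """Build a QuestionnaireResponse skeleton with one unanswered item per question."""
    return {
        "resourceType": "QuestionnaireResponse",
        # A relative reference is used here for brevity; a canonical URL is more common.
        "questionnaire": f"Questionnaire/{quest['id']}",
        "status": "in-progress",
        "item": [
            {"linkId": q["linkId"], "text": q["text"], "answer": []}
            for q in quest["item"]
        ],
    }


response = empty_response_for(questionnaire)
# Filling in an answer, which is what the model is expected to do from the audio:
response["item"][0]["answer"].append({"valueString": "Hypertension"})
```

Each answer is tied back to its question through `linkId`, which is what lets a downstream system match a response to the form it was filled from.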

[TODO - insert workflow image here preferably landscape]


## Setup

Install packages

In [1]:
!pip uninstall -qqy jupyterlab kfp  # Remove unused conflicting packages
!pip install -q "google-genai==1.7.0" "chromadb==0.6.3" "langchain==0.3.23" "langgraph==0.3.29" "json-repair==0.41.1" "google-api-core==2.24.2" "langchain-google-genai==2.1.2"


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m23.2.1[0m[39;49m -> [0m[32;49m25.0.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


**Set up your API key**

To run the following cell, your API key must be stored in a [Kaggle secret](https://www.kaggle.com/discussions/product-feedback/114053) named `GOOGLE_API_KEY`.

If you don't already have an API key, you can grab one from [AI Studio](https://aistudio.google.com/app/apikey). You can find [detailed instructions in the docs](https://ai.google.dev/gemini-api/docs/api-key).

To make the key available through Kaggle secrets, choose `Secrets` from the `Add-ons` menu and follow the instructions to add your key or enable it for this notebook.

In [2]:
import os

try:
    from kaggle_secrets import UserSecretsClient
    GOOGLE_API_KEY = UserSecretsClient().get_secret("GOOGLE_API_KEY")
except ImportError:
    # Outside Kaggle: read the key from an environment variable; never hard-code it.
    GOOGLE_API_KEY = os.environ["GOOGLE_API_KEY"]

In [3]:
from google import genai
from google.genai import types
from google.api_core import retry

from IPython.display import HTML, Markdown, display

client = genai.Client(api_key=GOOGLE_API_KEY)
model_id = "gemini-2.0-flash"

is_retriable = lambda e: (isinstance(e, genai.errors.APIError) and e.code in {429, 503})

genai.models.Models.generate_content = retry.Retry(
    predicate=is_retriable)(genai.models.Models.generate_content)

## Pre-requisites

**Creating the embedding database with ChromaDB**

We create a [custom function](https://docs.trychroma.com/guides/embeddings#custom-embedding-functions) to generate embeddings with the Gemini API.

We will use it to store our questionnaire descriptions as documents in the vector database.

In [4]:
# from chromadb import Documents, EmbeddingFunction, Embeddings

# class GeminiEmbeddingFunction(EmbeddingFunction):
#     # Specify whether to generate embeddings for documents, or queries
#     document_mode = True

#     @retry.Retry(predicate=is_retriable)
#     def __call__(self, input: Documents) -> Embeddings:
#         if self.document_mode:
#             embedding_task = "retrieval_document"
#         else:
#             embedding_task = "retrieval_query"

#         response = client.models.embed_content(
#             model="models/text-embedding-004",
#             contents=input,
#             config=types.EmbedContentConfig(
#                 task_type=embedding_task,
#             ),
#         )
#         return [e.values for e in response.embeddings]

Define a few utilities to load the questionnaire data from JSON. Ideally, the data source would be a live FHIR server.

In [5]:
# import json

# _quest_docs = None
# def read_questionnaires_from_fs():
#     global _quest_docs
#     if _quest_docs is None:
#         with open("/kaggle/input/quest-sample-db/quest.db.json", "r") as file:
#             _quest_docs = json.loads(file.read())
#     return _quest_docs

# def get_quest_docs_meta():
#     quest_docs = read_questionnaires_from_fs()
#     doc_with_metad = []
#     doc_ids = []
#     for doc in quest_docs:
#         doc_id = doc.get("id")
#         doc_meta = {
#             k: v
#             for k, v in {
#                 "id": doc_id,
#                 "title": doc.get("title"),
#                 "name": doc.get("name"),
#             }.items()
#             if v is not None
#         }
#         doc_desc = (
#             doc.get("description") if doc.get("description") else "No description"
#         )
#         doc_with_metad.append((doc_desc, doc_meta))
#         doc_ids.append(doc_id)
#     return doc_with_metad, doc_ids

We now load our questionnaire data and populate the vector database. We store the descriptions as vector embeddings and tag each one with metadata about its form/questionnaire.

In [6]:
# import chromadb

# DB_NAME = "fhir-quest-semantic"

# embed_fn = GeminiEmbeddingFunction()
# chroma_client = chromadb.Client()
# db = chroma_client.get_or_create_collection(name=DB_NAME, embedding_function=embed_fn)

# def populate_vector_db():
#     embed_fn.document_mode = True
#     (desc_with_metad, doc_ids) = get_quest_docs_meta()
#     descriptions, meta = zip(*desc_with_metad)
#     print(meta)

#     db.add(documents=list(descriptions), ids=doc_ids, metadatas=list(meta))

# populate_vector_db()

Confirm that the data was inserted by checking the number of documents in the collection:

db.count()

## Retrieval: Finding relevant questionnaires

We use the user's prompt to find a relevant questionnaire to fill in. We do so by:

1. Querying our vector store for the single questionnaire that is semantically closest to the user's need.
2. Using the Gemini model to validate that the questionnaire actually relates to the user's prompt.
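
The two steps above can be sketched without any external service. This is a toy illustration, not the notebook's implementation: the bag-of-words `embed` function stands in for the Gemini embedding model, `DESCRIPTIONS` is a hypothetical two-entry repository, and `is_relevant` is a caller-supplied stand-in for the LLM validation call.

```python
import math
from collections import Counter
from typing import Callable, Optional


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical questionnaire descriptions keyed by id.
DESCRIPTIONS = {
    "med-history": "medical history form covering past conditions and medications",
    "diet-survey": "nutrition survey about daily diet and meals",
}


def retrieve_top(query: str) -> str:
    """Step 1: return the id of the single closest description."""
    q = embed(query)
    return max(DESCRIPTIONS, key=lambda doc_id: cosine(q, embed(DESCRIPTIONS[doc_id])))


def discover(query: str, is_relevant: Callable[[str, str], bool]) -> Optional[str]:
    """Step 2: keep the candidate only if the validator accepts it."""
    doc_id = retrieve_top(query)
    return doc_id if is_relevant(query, DESCRIPTIONS[doc_id]) else None
```

The validation step matters because a nearest-neighbour query always returns something, even when nothing in the repository fits the request.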

In [7]:
# def generate_form_validation_prompt(user_prompt, quest_desc, quest_metadata):
#     return f"""
# # Instruction
# You are an evaluator. Your task is to evaluate the relevance of a form description and metadata to a user instruction.
# We will provide you with the user instruction, and the form description and metadata.
# Read the user instruction carefully to understand the user's need, and then evaluate if the provided form description and metadata are relevant to fulfilling that need based on the Criteria provided in the Evaluation section below.
# You will assign the form description a rating following the Rating Rubric

# # Evaluation
# ## Metric Definition
# You will be assessing form relevance, which measures whether the provided form description and metadata are suitable for fulfilling the user's instruction.  Relevance implies that a user could likely find the form useful and pertinent to their stated need.

# ## Criteria
# Relevance to User Instruction: The form description and metadata align with the user's instruction and suggest the form could potentially address the user's need.
# Usefulness for User Instruction: The form, as described, appears practically useful for a user attempting to follow the given instruction.
# Clarity of Description: The form description and metadata are clear and understandable enough to assess relevance. (If description is unclear, down-rate even if potentially relevant).

# ## Rating Rubric
# (YES). The form is very likely to be relevant and useful for the user instruction. The description is clear and strongly suggests a good match.
# (NO). The form is not relevant to the user instruction. The description clearly indicates the form is unrelated to the user's need.

# # User Inputs and Model Rating
# ## User Instruction

# ### Prompt
# {user_prompt}

# ## Form Description and Metadata

# ### Form Instruction Description
# {quest_desc}

# ### Form Metadata (JSON)
# {quest_metadata}
# """

Use structured (enum) output mode to constrain the model's response to `Yes` or `No`.

Define our agent's internal state

In [8]:
# import enum


# class RelevantRating(enum.Enum):
#     YES = "Yes"
#     NO = "No"

# def discover_questionnaire(query):
#     embed_fn.document_mode = False
#     result = db.query(query_texts=[query], n_results=1)
#     queried_doc_ids = result.get("ids")
#     try:
#         interest_doc_id = queried_doc_ids[0][0]
#     except IndexError:
#         return None
#     queried_doc_desc = result.get("documents")[0][0]
#     queried_doc_meta = result.get("metadatas")[0][0]
#     prompt = generate_form_validation_prompt(query, queried_doc_desc, queried_doc_meta)
#     print("Prompt", prompt)

#     structured_output_config = types.GenerateContentConfig(
#         response_mime_type="text/x.enum",
#         response_schema=RelevantRating,
#     )
#     response = client.models.generate_content(
#         model=model_id, contents=[prompt], config=structured_output_config
#     )
#     parsed_resp = response.parsed

#     if parsed_resp is RelevantRating.YES:
#         return interest_doc_id
#     else:
#         return


In [9]:
%pip install -qU "langchain==0.3.23" "langgraph==0.3.29" "json-repair==0.41.1" "langchain-google-genai==2.1.2"


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m23.2.1[0m[39;49m -> [0m[32;49m25.0.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


In [10]:
from typing_extensions import TypedDict, Any, Dict, List

# Define the state of our graph
class AgentState(TypedDict):
    audio_file_path: str
    uploaded_audio_file: Any
    # past_soap_notes: Dict[str, Any]
    transcription: str
    soap_note: str
    fhir_resources: Any # TODO - verify typings are appropriate
    clinical_intent: str
    resource_plan: str

In [11]:
# def fetch_questionnaire(state: AgentState):
#     query = state.get("instructions")
#     quest_id = discover_questionnaire(query)

#     full_quest_docs = read_questionnaires_from_fs()
#     of_interest_quest = None
#     for quest in full_quest_docs:
#         if quest["id"] == quest_id:
#             of_interest_quest = quest
#             break
#     if of_interest_quest is None:
#         return {"quest_found": False}
#     else:
#         return {"quest_found": True, "quest": of_interest_quest}


In [12]:
_upload_file_cache = {}  # maps local file path -> uploaded Gemini file object

def upload_to_gemini(state: AgentState):
    """
    Uploads the local audio file to Gemini if that path has not been uploaded yet.
    Returns a dictionary with the uploaded file object, or an empty dict on failure.
    """
    local_file_path = state.get("audio_file_path")

    try:
        # Cache per path so that a different recording is never served a stale upload
        if local_file_path not in _upload_file_cache:
            _upload_file_cache[local_file_path] = client.files.upload(file=local_file_path)
        return {"uploaded_audio_file": _upload_file_cache[local_file_path]}

    except Exception as e:
        print(f"Error uploading to Gemini: {e}")
        # Downstream nodes must handle a missing "uploaded_audio_file" key
        return {}

In [13]:
def diarize_audio(state: AgentState):
    prompt = """
        Diarize and transcribe this health-related interview, maintaining chronological order with timestamps if possible. Add labels for speaker (like 'Doctor:', 'Patient:', or 'Speaker 1:', 'Speaker 2:') at the beginning of each turn.
        Accurately capture medical terms, mark unclear words as “[INAUDIBLE],” avoid adding extra commentary or guesses, and keep overlapping speech on separate lines. 
        Return only the final transcript.
        """
    uploaded_audio_file = state.get("uploaded_audio_file")

    response = client.models.generate_content(
        model=model_id, contents=[prompt, uploaded_audio_file]
    )

    transcription = response.text.strip()
    return {"transcription": transcription}

We then call the model with the questionnaire that describes the form and the audio file, and expect a questionnaire response in return. The questionnaire response represents the form submission that a physician would otherwise have had to make manually after listening to or taking part in the recorded conversation.

In [14]:
# from json_repair import repair_json

# def get_questionnaire_response(state: AgentState):
#     audio_file = state.get("uploaded_audio_file")
#     questionnaire = state.get("quest")
#     patient_emr = None # state.get("patient_emr")
#     # patient_emr, audio_file, questionnaire, questionnaireRes_example
#     prompt = f"""
#         You are an audio processing expert with extensive experience in converting audio files into structured data formats, specifically JSON. Your specialty lies in accurately extracting meaningful information from audio recordings and populating questionnaire-style data structures based on that information.
        
#         Your task is to analyze the provided audio file and patient Electronic Medical Record (EMR) and fill out the questionnaire with the relevant responses. The output format should follow the structure of the provided questionnaireResponse example.

#         Here is the patient EMR:
#         {patient_emr}
        
#         Here is the questionnaire:
#         {questionnaire}
        
#         Here is an example questionnaire response:
#         {questionnaireRes_example}
        
#         Please analyze the audio and generate the appropriate questionnaire response.

#         Use this JSON schema:

#         QuestionnaireResponse = <generated questionnaireResponse>
#         return: QuestionnaireResponse
#     """

#     response = client.models.generate_content(
#         model=model_id,
#         contents=[prompt, audio_file],
#         config=types.GenerateContentConfig(
#             temperature=0,
#             response_mime_type='application/json',
#         )
#     )
#     qr = repair_json(response.text, return_objects=True)
#     return {"quest_resp": qr}
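
Since the response is model-generated, a light structural check before storing it may be useful. The sketch below is an assumption about what such a check could look like, not part of the notebook: `check_response_link_ids` is a hypothetical helper, and it only handles flat forms (nested `item` groups are ignored).

```python
from typing import Dict, List


def check_response_link_ids(questionnaire: Dict, response: Dict) -> List[str]:
    """Return a list of problems; an empty list means the linkIds line up."""
    expected = {item["linkId"] for item in questionnaire.get("item", [])}
    seen = {item.get("linkId") for item in response.get("item", [])}
    # linkIds the model produced that the form does not define
    problems = [f"unknown linkId: {lid}" for lid in sorted(seen - expected, key=str)]
    # questions on the form that the model left out entirely
    problems += [f"unanswered linkId: {lid}" for lid in sorted(expected - seen)]
    return problems


demo_questionnaire = {"item": [{"linkId": "1"}, {"linkId": "2"}]}
demo_response = {"item": [{"linkId": "1"}, {"linkId": "9"}]}
problems = check_response_link_ids(demo_questionnaire, demo_response)
```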


In [15]:
soap_note_generation_sys_prompt = """You are an expert medical scribe tasked with generating a concise and accurate SOAP (Subjective, Objective, Assessment, Plan) note from a health care provider - patient conversation transcription.

**Input:** You will be provided with a diarized transcription of a patient-provider encounter.  The transcription will clearly identify each speaker as either "Patient" or "Provider" at the beginning of each dialogue turn.

**Task:**  Analyze the transcription and extract relevant information to populate each section of a SOAP note.

**Output:**  Generate a SOAP note in the following structured format:

S - Subjective:

    Chief Complaint (CC): [Concise statement of the patient's primary reason for visit]

    History of Present Illness (HPI): [Detailed narrative of the patient's current problem, using OLDCARTS or similar mnemonic if applicable. Include onset, location, duration, character, aggravating/alleviating factors, radiation, timing, severity.]

    Past Medical History (PMH): [Summarize relevant past medical conditions mentioned by the patient or provider.]

    Medications: [List current medications mentioned by the patient.]

    Allergies: [List known allergies mentioned by the patient.]

    Social History (SH): [Extract pertinent social history details like smoking, alcohol use, occupation, living situation if discussed and relevant to the encounter.]

    Family History (FH): [Summarize relevant family history if discussed.]

    Review of Systems (ROS): [Briefly list any systems reviewed and any symptoms reported by the patient related to those systems. Focus on relevant systems based on the chief complaint.]

O - Objective:

    Vital Signs: [List any vital signs mentioned in the transcription (BP, HR, RR, Temp, SpO2, Pain Scale) and their values if provided. If not explicitly stated in the transcription, state "Not documented in transcription."]

    Physical Exam Findings: [Summarize any physical exam findings described by the provider. Focus on findings related to the chief complaint and ROS. If no findings are explicitly mentioned in the transcription, infer them from provider statements where possible (e.g., 'lungs sound clear' implies auscultation); otherwise state "Not documented in transcription."]

    Lab Results: [List any lab results mentioned by the provider or patient, including test name and result. If no lab results are mentioned, state "Not documented in transcription."]

    Imaging Results: [List any imaging results mentioned, including type and findings. If none mentioned, state "Not documented in transcription."]

    Other Diagnostic Tests: [List any other diagnostic test results mentioned (e.g., EKG, PFTs). If none mentioned, state "Not documented in transcription."]

A - Assessment:

    Differential Diagnoses: [List any differential diagnoses discussed by the provider. Include potential diagnoses considered.]

    Working Diagnosis (or Most Likely Diagnosis): [Identify the most likely diagnosis or working diagnosis stated or strongly implied by the provider. If no clear diagnosis is stated, summarize the provider's assessment of the patient's condition.]

    Problem List: [List any active or chronic problems identified or confirmed by the provider during the encounter. Focus on problems relevant to this visit.]

P - Plan:

    Diagnostic Plan: [List any further diagnostic tests, labs, or imaging ordered or planned by the provider.]

    Therapeutic Plan: [Summarize the treatment plan, including medications prescribed, procedures planned, therapies recommended, lifestyle modifications advised, and referrals made.]

    Patient Education: [Summarize any patient education provided by the provider, including instructions, self-care advice, and information about medications or conditions.]

    Follow-up Plan: [Describe the follow-up plan, including when the patient should return, specific instructions for follow-up, and any "return precautions" mentioned (e.g., "return if symptoms worsen").]

    Consults/Referrals: [List any consultations or referrals to specialists or other providers planned by the provider.]   
"""

In [16]:
from google.genai import types

def generate_soap_note(state: AgentState):
    transcription = state.get("transcription")

    response = client.models.generate_content(
        model=model_id,
        contents=[transcription],
        config=types.GenerateContentConfig(
            temperature=0.1,
            system_instruction=soap_note_generation_sys_prompt
        )
    )

    return {"soap_note": response.text}
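
The SOAP note comes back as free text. If later steps need the four sections separately, they could be split on the section headings requested in the system prompt. This is a sketch under the assumption that the model reproduces the `S - Subjective:` style headings exactly, each on its own line; `split_soap_sections` is a hypothetical helper.

```python
import re
from typing import Dict

# Matches headings such as "S - Subjective:" on a line of their own.
SECTION_HEADER = re.compile(
    r"^[ \t]*[SOAP] - (Subjective|Objective|Assessment|Plan):[ \t]*$",
    re.MULTILINE,
)


def split_soap_sections(note: str) -> Dict[str, str]:
    """Map each section name to the text between its heading and the next one."""
    matches = list(SECTION_HEADER.finditer(note))
    sections = {}
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(note)
        sections[match.group(1)] = note[match.end():end].strip()
    return sections


demo_note = (
    "S - Subjective:\n    CC: cough\n"
    "O - Objective:\n    Vital Signs: Not documented in transcription.\n"
    "A - Assessment:\n    Working Diagnosis: URI\n"
    "P - Plan:\n    Follow-up Plan: return if symptoms worsen\n"
)
demo_sections = split_soap_sections(demo_note)
```

If the model deviates from the heading format, the function simply returns fewer sections, which is a cheap signal that the note needs review.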


In [17]:
# Clinical intent Extraction & Structuring (from SOAP Note)

clinical_intent_sys_prompt = """
You are a highly skilled medical information extraction specialist. Your primary task is to analyze clinical notes, specifically SOAP notes, and identify the underlying clinical intents within each section (Subjective, Objective, Assessment, Plan). You will structure these intents into a JSON format for downstream processing. You are meticulous and focused on capturing the core meaning and purpose of the clinical information, not just surface-level keywords.
"""


def get_clinical_intent(state: AgentState):
    soap_note = state.get("soap_note")
    prompt = """
    Analyze the following SOAP note and extract the clinical intents from each section (Subjective, Objective, Assessment, Plan).

    For each intent, identify:
    - `intent_type`: A concise label describing the clinical intent (e.g., "patient_reported_symptom", "medication_order", "diagnosis", "vital_sign_observation", "referral_request"). Be specific and use a controlled vocabulary of intent types if possible (e.g., choose from: patient_reported_symptom, symptom_characteristic, associated_symptom, negative_symptom, vital_sign_observation, physical_exam_finding, lab_result, diagnosis, problem, differential_diagnosis, medication_order, procedure_order, referral_request, patient_education, treatment_recommendation).
    - `intent_details`:  Capture the key details related to the intent. This should be a dictionary containing relevant information. The specific details will vary based on the `intent_type`. For example:
        - For `patient_reported_symptom`: include `symptom_name`, `location` (if mentioned), `duration` (if mentioned), `severity` (if mentioned), `characteristics` (e.g., "sharp", "dull").
        - For `medication_order`: include `medication_name`, `dosage`, `route`, `frequency`, `reason` (if mentioned).
        - For `diagnosis`: include `diagnosis_name`, `certainty` (e.g., "suspected", "confirmed").
        - For `referral_request`: include `specialty`, `reason`.
        - For `lab_order`: include `lab_test_name`.
        - For `procedure_order`: include `procedure_name`.

    Structure your output as a JSON object with the following structure:

    ```json
    {
    "subjective_intents": [ /* Array of intent objects from Subjective section */ ],
    "objective_intents": [ /* Array of intent objects from Objective section */ ],
    "assessment_intents": [ /* Array of intent objects from Assessment section */ ],
    "plan_intents": [ /* Array of intent objects from Plan section */ ]
    }
    ```
    """

    response = client.models.generate_content(
        model=model_id,
        contents=[prompt, soap_note],
        config=types.GenerateContentConfig(
            temperature=0.1, system_instruction=clinical_intent_sys_prompt
        ),
    )

    return {"clinical_intent": response.text}
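
`get_clinical_intent` stores the raw model text, which may arrive wrapped in a Markdown code fence because the prompt shows one. A minimal sketch of turning that text into a checked dictionary follows; `parse_intents` is a hypothetical helper, and it assumes the model returns the four top-level keys named in the prompt.

```python
import json
from typing import Any, Dict

FENCE = "`" * 3  # Markdown code fence marker
EXPECTED_KEYS = (
    "subjective_intents",
    "objective_intents",
    "assessment_intents",
    "plan_intents",
)


def parse_intents(raw: str) -> Dict[str, Any]:
    """Strip an optional code fence, parse the JSON, and check the top-level keys."""
    text = raw.strip()
    if text.startswith(FENCE) and text.endswith(FENCE):
        text = text[len(FENCE):-len(FENCE)]
        if text.startswith("json"):  # drop the language tag after the opening fence
            text = text[len("json"):]
    data = json.loads(text)
    missing = [key for key in EXPECTED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing intent sections: {missing}")
    return data
```

Failing loudly here keeps a malformed response from propagating into the resource planning step.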

In [18]:
# Single-pass FHIR generation (directly from the SOAP note)

import json_repair

# Named separately so it does not overwrite `clinical_intent_sys_prompt` defined above
single_flow_fhir_sys_prompt = """
You are the ultimate expert in clinical data transformation. Your unparalleled skill lies in converting unstructured SOAP notes into structured, semantically accurate FHIR R4 resources in a single, efficient step. You possess deep knowledge of medical terminology, clinical workflows, and the FHIR R4 specification. Your goal is to take a SOAP note and directly generate a set of valid FHIR R4 JSON resources that comprehensively represent the clinical encounter.
"""


def single_flow_fhir_gen(state: AgentState):
    soap_note = state.get("soap_note")
    prompt = """
        Convert the following SOAP note directly into a set of valid FHIR R4 JSON resources.

        Let's use the following Chain of Thought to ensure accurate and comprehensive FHIR resource generation:

        **Thought Process (Chain of Thought):**

        1.  **Comprehensive SOAP Note Analysis:** I will thoroughly analyze the entire SOAP note, section by section (Subjective, Objective, Assessment, Plan), to understand all relevant clinical information, from patient reports to physician's plans.

        2.  **Clinical Intent Recognition and Extraction:** As I analyze each part of the SOAP note, I will implicitly recognize the underlying clinical intents. I will mentally identify intents like 'patient_reported_symptom', 'medication_order', 'diagnosis', 'referral_request', etc., and extract the necessary details for each.

        3.  **Direct FHIR Resource Planning and Generation:**  For each recognized clinical intent, I will directly determine the most appropriate FHIR R4 resource type *and immediately generate* the FHIR JSON content for that resource.  [**Simulated "Action": I will access my comprehensive FHIR R4 knowledge and best practices to select the right resource and populate it correctly.**]  **For resources representing future actions (e.g., MedicationRequest, ServiceRequest, CarePlan, etc.), I will check if the resource has a status field that reflects operational progress. If a status field exists, I will set it to the *earliest possible status* that indicates a pending or preliminary state, such as 'draft', 'planned', 'active', 'requested', or 'on-hold', depending on the resource type and its valid status values. This ensures that these resources are not marked as complete and are ready for potential human-in-the-loop approval or further processing.** For resources that represent past or present observations or conditions, I will set their status to 'final' or an appropriate terminal state.

        4.  **Resource Interlinking:**  Where contextually appropriate, I will assign a globally unique identifier (UUID) to each resource and attempt to establish basic interlinks between resources. For example, linking Observations to the Encounter resources (use the placeholder reference "unknownRef" where the logical id is not present or unknown). I will, however, assume that the Patient and Practitioner resources will be created separately and linked later.

        5.  **Structured JSON Output of FHIR Resources:**  Finally, I will output a JSON list of all the complete generated FHIR R4 resources.

        **Output Format:**

        Generate a JSON array whose elements are the generated FHIR R4 JSON resource objects.

        Example JSON output structure:
        [<Generated FHIR resources>]
    """

    response = client.models.generate_content(
        model=model_id,
        contents=[prompt, soap_note],
        config=types.GenerateContentConfig(
            temperature=0,
            system_instruction=single_flow_fhir_sys_prompt,
            response_mime_type="application/json",
        ),
    )

    # json_repair tolerates minor syntax errors in the model's JSON output
    fhir_resources = json_repair.loads(response.text)
    return {"clinical_intent": fhir_resources}
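
The prompt asks the model to keep future-action resources in a pending state so that a human can approve them. That rule can also be checked in code after generation. The sketch below is illustrative only: `flag_completed_requests` is a hypothetical helper, it assumes the parsed output is a list of resource dictionaries, and the set of request-like types is a small assumed subset.

```python
from typing import Dict, List

# Resource types that represent requested or planned actions (assumed subset).
REQUEST_TYPES = {"MedicationRequest", "ServiceRequest", "CarePlan"}


def flag_completed_requests(resources: List[Dict]) -> List[str]:
    """Return ids of request-like resources that were wrongly marked 'completed'."""
    return [
        resource.get("id", "<no id>")
        for resource in resources
        if resource.get("resourceType") in REQUEST_TYPES
        and resource.get("status") == "completed"
    ]


demo_resources = [
    {"resourceType": "MedicationRequest", "id": "m1", "status": "completed"},
    {"resourceType": "ServiceRequest", "id": "s1", "status": "draft"},
    {"resourceType": "Observation", "id": "o1", "status": "final"},
]
flagged = flag_completed_requests(demo_resources)
```

Anything flagged this way can be routed to a reviewer instead of being written to the record.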

In [19]:
# FHIR resource type planning (reasoning and selection)

fhir_res_planning_sys_prompt = """
You are a highly experienced FHIR (Fast Healthcare Interoperability Resources) architect and clinical data modeler. Your expertise lies in understanding clinical intents and mapping them to appropriate FHIR resource types.  You are deeply familiar with the FHIR R4 specification and best practices for representing clinical information in FHIR. Your goal is to take a structured representation of clinical intents and determine the most suitable FHIR resource type for each intent to ensure accurate and semantically correct FHIR representation. You will provide a list of FHIR resource types associated with each intent.
"""


def get_fhir_resource_plan(state: AgentState):
    clinical_intent = state.get("clinical_intent")
    prompt = f"""
Based on the following structured representation of clinical intents extracted from a SOAP note, determine the most appropriate FHIR resource type for each intent.

For each clinical intent provided in the input JSON, you need to:
- `fhir_resource_type`: Identify the single most appropriate FHIR R4 resource type to represent this clinical intent. Choose from standard FHIR resource types like: Encounter, Patient, Practitioner, Observation, Condition, MedicationRequest, Procedure, ServiceRequest, DiagnosticReport, etc. Just provide the resource type name (e.g., "Observation", "ServiceRequest").
- `intent_source`:  Indicate the source of the intent from the input JSON (e.g., "subjective_intents[0]", "plan_intents[2]", "overall_encounter").  Use "overall_encounter" for intents that apply to the entire encounter (like creating an Encounter resource itself). Use "patient_context" for intents related to patient demographics. Use "practitioner_context" for intents related to the practitioner.
- `intent_details`:  Carry over the `intent_details` dictionary from the input JSON for context.

Consider these general mappings as guidelines:
- `patient_reported_symptom`, `symptom_characteristic`, `associated_symptom`, `negative_symptom`, `vital_sign_observation`, `physical_exam_finding`, `lab_result` intents usually map to `Observation`.
- `diagnosis`, `problem`, `differential_diagnosis` intents usually map to `Condition`.
- `medication_order` intents usually map to `MedicationRequest`.
- `procedure_order` intents usually map to `ServiceRequest` (or sometimes `Procedure` depending on context).
- `referral_request` intents usually map to `ServiceRequest` with category 'referral'.
- Overall encounter information should map to `Encounter`.
- Patient demographics and identifiers should map to `Patient`.
- Practitioner information should map to `Practitioner`.

Input JSON (Structured Clinical Intents):
```json
{clinical_intent}
```
"""
    response = client.models.generate_content(
        model=model_id,
        contents=[prompt],
        config=types.GenerateContentConfig(
            temperature=0.1, system_instruction=fhir_res_planning_sys_prompt
        ),
    )

    return {"resource_plan": response.text}
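
The guideline mappings in the planning prompt are regular enough to express as a lookup table, which could serve as a deterministic baseline or a sanity check on the model's plan. This is a sketch rather than the notebook's method: `INTENT_TO_RESOURCE` copies the guidelines from the prompt, `plan_resources` is a hypothetical helper, and the input is assumed to be the already-parsed intents dictionary.

```python
from typing import Any, Dict, List

# Mirrors the "general mappings" listed in the planning prompt.
INTENT_TO_RESOURCE = {
    "patient_reported_symptom": "Observation",
    "symptom_characteristic": "Observation",
    "associated_symptom": "Observation",
    "negative_symptom": "Observation",
    "vital_sign_observation": "Observation",
    "physical_exam_finding": "Observation",
    "lab_result": "Observation",
    "diagnosis": "Condition",
    "problem": "Condition",
    "differential_diagnosis": "Condition",
    "medication_order": "MedicationRequest",
    "procedure_order": "ServiceRequest",
    "referral_request": "ServiceRequest",
}


def plan_resources(intents: Dict[str, List[Dict[str, Any]]]) -> List[Dict[str, Any]]:
    """Build one plan entry per intent; unmapped intent types get None."""
    plan = []
    for section, items in intents.items():
        for index, intent in enumerate(items):
            plan.append(
                {
                    "fhir_resource_type": INTENT_TO_RESOURCE.get(intent["intent_type"]),
                    "intent_source": f"{section}[{index}]",
                    "intent_details": intent.get("intent_details", {}),
                }
            )
    return plan
```

Entries whose `fhir_resource_type` is `None` are the cases where the model's judgment, or a human's, is actually needed.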

In [20]:
# generating the fhir resources
gen_fhir_res_sys_prompt = """
You are a meticulous FHIR JSON generator. You are an expert at creating valid and well-formed FHIR R4 JSON resources. Your task is to take a specification of FHIR resource types and associated clinical intent details and generate the full JSON content for each resource. You pay close attention to FHIR R4 structure, data types, and best practices. You aim for complete and semantically accurate FHIR resources based on the provided intent information.  Where appropriate, you will try to include relevant coding (e.g., SNOMED CT, LOINC) if you have access to or can infer suitable codes, but prioritize generating valid FHIR structure even if coding is not fully resolved.
    """

def generate_fhir_resources(state: AgentState):
    resource_plan = state.get("resource_plan")
    user_prompt = f"""
    Generate the FHIR R4 JSON content for each FHIR resource type specified in the input JSON array.  For each resource type, use the associated `intent_details` to populate the relevant fields in the FHIR resource.

    Input JSON Array (FHIR Resource Types and Intents):
    ```json
    {resource_plan}
    ```

    For each object in the input array, generate a corresponding FHIR R4 JSON resource. Ensure:

    The resourceType field is correctly set.

    Use appropriate FHIR data types for each element.

    For resources like Observation and ServiceRequest, try to include relevant coding where possible (e.g., for Observation.code, ServiceRequest.code, ServiceRequest.category). If you can infer relevant SNOMED CT or LOINC codes based on the intent_details, include them in coding elements. If not, at least include text elements with descriptive names.

    For resources that need references (e.g., Observation.subject, ServiceRequest.subject, Encounter.subject, Encounter.participant.individual), use placeholder references like:{{"reference": "Patient/example"}}, {{"reference": "Practitioner/example"}}. Assume Patient and Practitioner resources will be created separately and linked later.

    For Encounter resources, assume a status of "planned", class of "ambulatory", and use the first listed chief complaint from the Subjective intents as the reasonCode.text.

    Output a JSON object where the keys are descriptive resource names (e.g., "Observation_ChestPain", "ServiceRequest_CBC", "Encounter") and the values are the generated FHIR JSON resource objects.

    """

    response = client.models.generate_content(
        model=model_id,
        contents=[user_prompt],
        config=types.GenerateContentConfig(
            temperature=0.1,
            system_instruction=gen_fhir_res_sys_prompt
        )
    )

    return {"fhir_resources": response.text}
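
`generate_fhir_resources` returns the model's text as-is. Once that text has been parsed into a dictionary of named resources (the output shape the prompt asks for), the resources could be grouped into a single FHIR Bundle for storage or transfer. A minimal sketch, assuming parsed input; `to_collection_bundle` is a hypothetical helper and the `urn:uuid:` identifiers are generated locally.

```python
import uuid
from typing import Any, Dict


def to_collection_bundle(resources: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Wrap named FHIR resources in a Bundle of type 'collection'."""
    entries = []
    for name, resource in resources.items():
        if "resourceType" not in resource:
            raise ValueError(f"{name} has no resourceType")
        # Each entry gets a locally generated UUID as its fullUrl
        entries.append({"fullUrl": f"urn:uuid:{uuid.uuid4()}", "resource": resource})
    return {"resourceType": "Bundle", "type": "collection", "entry": entries}


demo_bundle = to_collection_bundle(
    {"Observation_ChestPain": {"resourceType": "Observation", "status": "final"}}
)
```

The `resourceType` check is deliberately minimal; full validation against the FHIR R4 schema would need a dedicated validator.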

In [21]:
# def evaluate_outputs(state: AgentState):
#     # patient_emr, audio_file, questionnaire_response, soap_note, questionnaire_example, soap_example
#     patient_emr = None # state.get("patient_emr")
#     audio_file = state.get("uploaded_audio_file")
#     questionnaire_response = state.get("quest_resp")
#     soap_note = state.get("soap_note")
#     questionnaire_example = state.get("quest")
#     soap_note_sample = state.get("soap_note_sample")

#     prompt = f"""
#     You are a medical evaluator tasked with reviewing clinical documentation.

#     You will evaluate:
#     1. The quality and completeness of a **Questionnaire Response**
#     2. The accuracy and structure of a **SOAP Note**

#     @Lakshay - TODO - what does quality, accuracy and structure look like???
#     Use the following scale:
#     - 5 = Very Good
#     - 4 = Good
#     - 3 = Acceptable
#     - 2 = Poor
#     - 1 = Very Poor
#     - 0 = Unusable

#     Base your evaluation on:
#     - Clinical relevance and coherence
#     - Completeness compared to examples
#     - Consistency with the patient EMR

#     --- Patient EMR (for reference) ---
#     {patient_emr}

#     --- Audio Transcript of patient and Doctor Conversation (for reference) ---
#     {audio_file}

#     --- Example Questionnaire Response ---
#     {questionnaire_example}

#     --- Given Questionnaire Response ---
#     {questionnaire_response}

#     --- Example SOAP Note ---
#     {soap_note_sample}

#     --- Given SOAP Note ---
#     {soap_note}

#     Now provide a rating for each of the following:
#     1. Questionnaire Response (0–5):
#     2. SOAP Note (0–5):

#     Include a one-sentence rationale for each score.
#     """

#     response = client.models.generate_content(
#         model=model_id,
#         contents=[prompt],
#         config=types.GenerateContentConfig(
#             temperature=0,
#         )
#     )

#     return response.text


In [22]:
import json

def truncate_text(text: str, n: int = 50) -> str:
    """Truncates text to the first n characters, adding ellipsis if truncated."""
    if len(text) > n:
        return text[:n] + "..."
    return text

def write_response(state: AgentState):
    """
    Writes a response to the user based on the agent's state, reporting the outcome
    of the workflow.

    This function serves as the reporting node in the LangGraph workflow. It examines
    the AgentState to determine whether the workflow succeeded or failed and prints
    a summary of the result.

    Args:
        state (AgentState): The current state of the agent, containing information
                            about the workflow execution, including questionnaire
                            response and SOAP note.
    """

    # Fall back to an empty string so truncate_text does not fail when no transcription exists
    print(truncate_text(state.get("transcription") or ""))
    print(state.get("soap_note"))
    print(state.get("clinical_intent"))
    print(state.get("resource_plan"))
    print(state.get("fhir_resources"))
    quest_resp = state.get("quest_resp")
    soap_note = state.get("soap_note")
    quest_found = state.get("quest_found")

    if quest_resp is None or soap_note is None:
        # Collect the individual problems, then join them so the spacing stays consistent
        problems = ["Workflow encountered an issue."]
        if not quest_found:
            problems.append("It appears there was a problem finding relevant questionnaire information.")
        if quest_resp is None:
            problems.append("Questionnaire Response was not generated.")
        if soap_note is None:
            problems.append("SOAP Note was not generated.")
        print(" ".join(problems))
    else:
        truncated_quest_resp = truncate_text(json.dumps(quest_resp, indent=2))
        truncated_soap_note = truncate_text(soap_note)

        success_message = "Workflow completed successfully!\n\n"
        success_message += "**Questionnaire Response:**\n"
        success_message += truncated_quest_resp + "\n\n"
        success_message += "**SOAP Note:**\n"
        success_message += truncated_soap_note
        print(success_message)

Bring everything together as a graph to execute our workflow.

In [23]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langgraph.graph import StateGraph, END, START
from typing import Literal

model = ChatGoogleGenerativeAI(model=model_id, google_api_key=GOOGLE_API_KEY)

# Define the graph
wk_graph = StateGraph(AgentState)

# Pass-through node: receives the agent state and returns it unchanged
def aggregate_state(state: AgentState):
    return state

# node magic strings
audio_file_upload_key = "upload_to_gemini"
gen_soap_note_key = "generate_soap_note"
write_resp_key = "write_resp"
state_aggregator_key = "state_aggregator"
diarize_audio_key = "diarize_audio"
clinical_intent_key = "get_clinical_intent"
fhir_res_planning_key = "get_fhir_resource_plan"
gen_fhir_resources_key = "generate_fhir_resources"

single_flow_fhir_gen_key = "single_flow_fhir_gen"


# Nodes
wk_graph.add_node(audio_file_upload_key, upload_to_gemini)
wk_graph.add_node(gen_soap_note_key, generate_soap_note)
wk_graph.add_node(write_resp_key, write_response)
wk_graph.add_node(state_aggregator_key, aggregate_state)
wk_graph.add_node(diarize_audio_key, diarize_audio)
wk_graph.add_node(clinical_intent_key, get_clinical_intent)
wk_graph.add_node(fhir_res_planning_key, get_fhir_resource_plan)
wk_graph.add_node(gen_fhir_resources_key, generate_fhir_resources)

wk_graph.add_node(single_flow_fhir_gen_key, single_flow_fhir_gen)

def check_file_upload(state: AgentState):
    if state.get("uploaded_audio_file"):
        return diarize_audio_key
    else:
        return write_resp_key


# Edges
wk_graph.add_edge(START, audio_file_upload_key)
wk_graph.add_conditional_edges(audio_file_upload_key, check_file_upload)
wk_graph.add_edge(diarize_audio_key, gen_soap_note_key)
wk_graph.add_edge(gen_soap_note_key, single_flow_fhir_gen_key)
# wk_graph.add_edge(gen_soap_note_key, clinical_intent_key)
# wk_graph.add_edge(clinical_intent_key, fhir_res_planning_key)
# wk_graph.add_edge(fhir_res_planning_key, gen_fhir_resources_key)
# wk_graph.add_edge(gen_fhir_resources_key, write_resp_key)
wk_graph.add_edge(single_flow_fhir_gen_key, write_resp_key)
wk_graph.add_edge(write_resp_key, END)

graph = wk_graph.compile()
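
The conditional edge after the upload node picks the next node from the string its router function returns. As a minimal sketch, the routing decision can be exercised on plain dictionaries without building the graph. `DemoState` and `route_after_upload` are hypothetical stand-ins for `AgentState` and `check_file_upload`, and the file handle value is a made-up placeholder.

```python
from typing import Optional, TypedDict


class DemoState(TypedDict, total=False):
    # Hypothetical stand-in for the notebook's AgentState; only the key
    # that the router reads is modelled here.
    uploaded_audio_file: Optional[str]


DIARIZE_AUDIO_KEY = "diarize_audio"
WRITE_RESP_KEY = "write_resp"


def route_after_upload(state: DemoState) -> str:
    """Return the name of the next node, mirroring check_file_upload."""
    if state.get("uploaded_audio_file"):
        return DIARIZE_AUDIO_KEY
    # A missing or empty upload handle skips straight to the reporting node.
    return WRITE_RESP_KEY


print(route_after_upload({"uploaded_audio_file": "files/placeholder"}))  # prints: diarize_audio
print(route_after_upload({}))  # prints: write_resp
```

Keeping the router a pure function of the state makes both branches easy to check before running the full workflow against a real audio file.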

In [24]:
# from IPython.display import Image, display

# Image(graph.get_graph().draw_mermaid_png())

In [None]:
# Kaggle dataset path
audio_file_path = '/kaggle/input/quest-sample-db/RES0005.mp3'
# Local path; this assignment overrides the Kaggle path above
audio_file_path = "/home/peter/Desktop/gdaCapstone/Data/Audio Recordings/CAR0002.mp3"

inputs = {
    "audio_file_path": audio_file_path,
}
graph.invoke(inputs)
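
The cell above assigns `audio_file_path` twice, so only the second (local) path is used. One possible way to run the same notebook in both environments, sketched here as an assumption and not something the original code does, is to pick the first candidate path that exists on disk. `first_existing_path` is a hypothetical helper name.

```python
import os
from typing import Iterable, Optional


def first_existing_path(candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate that is an existing file, or None if none are."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


# The two paths used in the cell above, in order of preference.
candidates = [
    "/kaggle/input/quest-sample-db/RES0005.mp3",
    "/home/peter/Desktop/gdaCapstone/Data/Audio Recordings/CAR0002.mp3",
]

audio_file_path = first_existing_path(candidates)
if audio_file_path is None:
    print("No audio file found; skipping the workflow run.")
else:
    inputs = {"audio_file_path": audio_file_path}
    print(f"Would invoke the graph with: {inputs}")
```

In the notebook, the final `print` would be replaced by `graph.invoke(inputs)`.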