Named after Damu, a Mesopotamian deity reputed to know the cures to many incurable diseases, this is an accelerator for combining natural language queries and FHIR queries to assist medical research.

microsoft/Damu

Project

This is an accelerator for combining natural language queries and FHIR queries into a patient data copilot.

Features

  • Natural language chat UI with AI-generated answers
  • Blob-triggered function to process, vector-embed, and index clinical notes in JSON format
  • RAG pattern via hybrid semantic search over the vector-embedded clinical notes
  • FHIR API plugin to retrieve patient data (untested)
  • Semantic function that uses the LLM to generate FHIR queries from the user query, for use with the FHIR API plugin (untested)
  • Reference citations view for retrieved supporting data
  • Search results in table format with export support (WIP)
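The FHIR query features above are marked untested in this repo, but the shape of an LLM-generated query follows the standard FHIR REST search syntax (resource type plus `param=value` pairs). The sketch below is purely illustrative — the `build_fhir_search` helper and `example.org` base URL are assumptions, not code from this repository:

```python
from urllib.parse import urlencode

def build_fhir_search(base_url: str, resource: str, params: dict) -> str:
    """Compose a FHIR REST search URL, e.g. <base>/Observation?patient=123&...
    (hypothetical helper; the repo's plugin would send the LLM-generated
    query to the configured FHIR API instead)."""
    return f"{base_url.rstrip('/')}/{resource}?{urlencode(params)}"

# The kind of query an LLM might produce for
# "recent blood-pressure readings for patient 123":
url = build_fhir_search(
    "https://example.org/fhir",
    "Observation",
    {"patient": "123", "code": "http://loinc.org|85354-9", "_sort": "-date", "_count": "5"},
)
```

`_sort` and `_count` are standard FHIR search result parameters; the LOINC-coded `code` filter and patient reference are typical of the structured queries the semantic function would be prompted to emit.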

Application Architecture and Flow

Application Architecture Flow Diagram

  • User Interface: The application’s chat interface is a React web application. This interface accepts user queries, routes requests to the application backend, and displays generated responses.
  • Backend: The application backend is an ASP.NET Core Minimal API. The backend, deployed to an Azure App Service, hosts the React web application and the Semantic Kernel orchestration of the different services. Services and SDKs used in the RAG chat application include:
    • Semantic Kernel – orchestrates the RAG pattern completion between the services while managing chat history and other capabilities, and is ready for easy extension with additional plugin functions (more data sources, logic, actions, etc.).
    • Azure AI Search – searches indexed documents using vector search capabilities.
    • Azure OpenAI Service – provides the Large Language Models to generate responses.
  • Document Preparation: an IndexOrchestration Azure Function is included for chunking, embedding and indexing clinical note JSON blobs. The Azure Function is triggered on new and overwritten blobs in a notes container of the deployed storage account. Services and SDKs used in this process include:
    • Document Intelligence – analyzes the documents via the pre-built layout model as part of the chunking process, and provides HTML support inside the JSON properties.
    • Azure OpenAI Service – provides the embedding model used to generate vector embeddings for the indexed document chunks.
    • Azure AI Search – indexes embedded document chunks from the data stored in an Azure Storage Account. This makes the documents searchable using vector search capabilities.
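The preparation-and-retrieval flow described above can be sketched end to end. This is a minimal, self-contained illustration of the pattern (chunk → embed → index → retrieve → prompt), not the repo's implementation: the stub `embed` function stands in for the Azure OpenAI embedding deployment, and the in-memory `NoteIndex` stands in for the Azure AI Search index.

```python
import math
import re

CHUNK_SIZE = 200  # characters; real chunkers work on tokens and overlap

def chunk(text: str, size: int = CHUNK_SIZE) -> list[str]:
    """Split a note into roughly fixed-size chunks on word boundaries."""
    chunks, current = [], ""
    for w in text.split():
        if current and len(current) + len(w) + 1 > size:
            chunks.append(current)
            current = w
        else:
            current = f"{current} {w}".strip()
    if current:
        chunks.append(current)
    return chunks

def embed(text: str, dims: int = 8) -> list[float]:
    """Stub embedding: normalized hashed bag-of-words. A real deployment
    calls the Azure OpenAI embedding model instead."""
    vec = [0.0] * dims
    for token in re.findall(r"\w+", text.lower()):
        vec[hash(token) % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class NoteIndex:
    """In-memory stand-in for the Azure AI Search index."""
    def __init__(self):
        self.records = []

    def upsert(self, note_id: str, text: str):
        for i, c in enumerate(chunk(text)):
            self.records.append({"id": f"{note_id}-{i}", "text": c, "vector": embed(c)})

    def search(self, query: str, top: int = 3):
        qv = embed(query)
        scored = [(sum(a * b for a, b in zip(qv, r["vector"])), r) for r in self.records]
        return [r for _, r in sorted(scored, key=lambda s: s[0], reverse=True)[:top]]

# Index one note, retrieve supporting chunks, and assemble a grounded prompt.
index = NoteIndex()
index.upsert("note-1", "Patient reports chest pain on exertion. History of hypertension.")
hits = index.search("chest pain")
context = "\n".join(h["text"] for h in hits)
prompt = f"Answer using only these notes:\n{context}\n\nQuestion: chest pain history?"
```

In the deployed application, the Semantic Kernel orchestration performs the retrieve-and-prompt half against Azure AI Search, while the blob-triggered function performs the chunk-embed-index half.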

Getting Started

This sample application, as deployed, includes the following Azure components:

Deployed Infrastructure Architecture Diagram

Account Requirements

To deploy and run this example, you'll need an Azure subscription with quota for the services listed under Required cloud resources below.

Warning

By default, this sample creates an Azure AI Search resource that has a monthly cost, as well as a Document Intelligence (previously Form Recognizer) resource that has a cost per document page. To avoid these costs, you can switch each to its free tier by changing the parameters file under the infra folder (though there are some limits to consider).

Cost estimation

Pricing varies per region and usage, so it isn't possible to predict exact costs for your usage. However, you can try the Azure pricing calculator for the resources below:

Deployment

This project supports azd for easy deployment of the complete application, as defined in the main.bicep resources.

See Deployment Instructions here.

Process notes into the Search Service with Blob trigger

  1. Initialize the index:
    1. In Azure: navigate to the deployed Azure Function, the name should start with func-function-
    2. Under the Functions tab on the Overview page, click on the UpsertIndex function.
    3. Click Test/Run, leave all defaults and click Run.
  2. Add documents to be indexed, trigger the function:
    1. In Azure: navigate to the deployed storage account.
    2. Browse into the Data storage/Containers blade and into the notes container.
    3. Click Upload and add note JSON files to be processed.
  3. Confirm successful indexing:
    1. In Azure: navigate to the deployed AI Search service
    2. Browse into the Indexes blade (Search Management)
    3. A new index should exist, prefixed with the environment name provided during deployment.
    4. Open and search in the index to confirm content from the files uploaded are properly searchable.

NOTE
It may take several minutes to see processed documents in the index

Running locally for Dev and Debug

Because many cloud resources are required to run the client app and minimal API even locally, deploy to Azure first to provision all the necessary services. You can then configure your local user secrets to point to those cloud resources before building and running locally for debugging and development.

Required cloud resources:

  • Azure AI Search
  • Azure OpenAI Service
    • chat model
    • embedding model
  • Azure Document Intelligence
  • Storage Account for blob trigger

Running the ChatApp.Server and ChatApp.Client locally

  1. Configure user secrets for the ChatApp.Server project, based on deployed resources.
    1. In Azure: navigate to the deployed Web App, and into the Configuration blade (under Settings).
    2. Copy the below required settings from the deployed Environment Variables into your local user secrets:
    ```json
    {
        "OpenAIOptions:Endpoint": "YOUR_OPENAI_ENDPOINT",
        "OpenAIOptions:EmbeddingDeployment": "embedding",
        "OpenAIOptions:ChatDeployment": "chat",
        "FrontendSettings:ui:title": "Damu",
        "FrontendSettings:ui:show_share_button": "True",
        "FrontendSettings:ui:chat_description": "This chatbot is configured to answer your questions",
        "FrontendSettings:sanitize_answer": "false",
        "FrontendSettings:history_enabled": "false",
        "FrontendSettings:feedback_enabled": "false",
        "FrontendSettings:auth_enabled": "false",
        "ENABLE_CHAT_HISTORY": "false",
        "AzureAdOptions:TenantId": "YOUR_TENANT_ID",
        "AISearchOptions:SemanticConfigurationName": "YOUR_SEMANTIC_CONFIGURATION_NAME",
        "AISearchOptions:IndexName": "YOUR_INDEX_NAME",
        "AISearchOptions:Endpoint": "YOUR_SEARCH_SERVICE_ENDPOINT"
    }
    ```

    NOTE
    See appsettings.json in the ChatApp.Server project for more settings that can be configured in user secrets if using optional features such as CosmosDB for history.

  2. Build and run the ChatApp.Server and ChatApp.Client projects
  3. Open a browser and navigate to https://localhost:5173 as instructed in the terminal to interact with the chat client.

Running the IndexOrchestration function locally

  1. Configure user secrets for the IndexOrchestration project, based on deployed resources.
    1. In Azure: navigate to the deployed Azure Function, and into the Configuration blade (under Settings).
    2. Copy the below required settings from the deployed Environment Variables into your local secrets:
    ```json
    {
        "AzureOpenAiEmbeddingDeployment": "embedding",
        "AzureOpenAiEmbeddingModel": "text-embedding-ada-002",
        "AzureOpenAiEndpoint": "YOUR_OPENAI_ENDPOINT",
        "AzureWebJobsStorage": "UseDevelopmentStorage=true",
        "DocIntelEndPoint": "YOUR_DOC_INTELLIGENCE_ENDPOINT",
        "FUNCTIONS_WORKER_RUNTIME": "dotnet-isolated",
        "IncomingBlobConnStr": "YOUR_INCOMING_BLOB_CONNECTION_STRING",
        "ModelDimensions": 1536,
        "ProjectPrefix": "YOUR_PROJECT_PREFIX",
        "SearchEndpoint": "YOUR_SEARCH_SERVICE_ENDPOINT"
    }
    ```

    NOTE
    See local_settings_example.json in the IndexOrchestration project for more settings that can optionally be configured.

  2. Build and run the IndexOrchestration project
  3. Upload or overwrite a note JSON file in the notes container of the storage account to trigger the function.
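For a quick smoke test of the blob trigger, a small JSON blob is enough. The field names below are purely illustrative — the actual schema the function expects is defined by your own note files (the architecture notes above only say the notes are JSON and may contain HTML inside properties):

```json
{
    "note_id": "note-0001",
    "patient_id": "patient-123",
    "note_text": "<p>Patient reports chest pain on exertion.</p>"
}
```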

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.
