From 242ab65ffbf4a46dd4c9036793601ebf838f2064 Mon Sep 17 00:00:00 2001
From: Timna Brown <24630902+brown9804@users.noreply.github.com>
Date: Wed, 29 Oct 2025 12:33:42 -0600
Subject: [PATCH] Update README with detailed project information
---
README.md | 39 ++++++++++++++++++++++++---------------
1 file changed, 24 insertions(+), 15 deletions(-)
diff --git a/README.md b/README.md
index ed0523b..a7248ef 100644
--- a/README.md
+++ b/README.md
@@ -31,21 +31,30 @@ Last updated: 2025-09-17
- [Solution Accelerator for AI Document Processor (ADP)](https://github.com/azure/ai-document-processor) - AI Factory
-| **Category** | **Details** |
-|------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| **Purpose** | Automate document processing using Azure services and LLMs. Extracts data from files (PDF, Word, MP3), processes via Azure OpenAI, and outputs structured insights (JSON/CSV) to blob storage. |
-| **Infrastructure Provisioning** | Uses Bicep templates to deploy all required Azure resources. Automates setup of networking, identity, and access controls using RBAC and managed identities to ensure secure and scalable deployment. |
-| **Main Azure Resources** | - **Azure Function App**: Hosts the orchestrator and activity functions using Durable Functions to manage the document processing workflow.
- **Azure Storage Account**: Stores input documents (bronze container) and output results (gold container). Also holds prompt configurations if UI is not deployed.
- **Azure Static Web App**: Provides a user-friendly interface for uploading files, editing prompts, and triggering workflows.
- **Azure OpenAI**: Processes extracted text using LLMs to generate structured summaries or insights.
- **Azure Cognitive Services**: Specifically uses Document Intelligence for OCR and text extraction from uploaded files.
- **Cosmos DB**: Stores prompt configurations when the frontend UI is enabled, allowing dynamic updates from the web interface.
- **Key Vault**: Securely stores secrets, keys, and credentials used by the Function App and other services.
- **Application Insights**: Enables monitoring, logging, and diagnostics for the Function App and other components.
- **App Service Plan**: Provides the compute resources for running the Function App. |
-| **Pipeline Components** | - `function_app.py`: Main orchestrator using Durable Functions chaining pattern.
- `activities/runDocIntel.py`: Extracts text from documents using Azure Document Intelligence.
- `activities/callAoai.py`: Sends extracted text and prompt to Azure OpenAI and receives structured JSON.
- `activities/writeToBlob.py`: Writes the final output to the gold container in blob storage. |
-| **Data Flow** | 1. Upload document to bronze container
2. OCR via Document Intelligence
3. Send extracted text + prompt to Azure OpenAI
4. Receive structured JSON
5. Write output to gold container |
-| **Frontend UI (Optional)** | - Allows business users to upload files and edit prompts
- Prompts are stored in Cosmos DB
- Users can trigger workflows and view job status directly from the interface |
-| **Prompt Configuration** | - Without UI: Prompts are stored in `prompts.yaml` file in blob storage
- With UI: Prompts are stored and managed in Cosmos DB via the web interface |
-| **Deployment Steps** | 1. Fork and clone the GitHub repo
2. Run `az login`, `azd auth login`, `azd up`
3. Provide User Principal ID for RBAC setup
4. Choose whether to deploy frontend UI |
-| **Execution (Without UI)** | - Update `prompts.yaml` with desired instructions
- Send POST request to `http_start` endpoint with blob metadata
- Monitor pipeline execution via Log Stream |
-| **Execution (With UI)** | - Upload files via web interface
- Edit system and user prompts
- Click "Start Workflow" to trigger pipeline
- View success/failure messages and job status |
-| **Monitoring & Troubleshooting** | - Use Log Stream for real-time logs
- Use Log Analytics Workspace to query exceptions and performance metrics
- Use SSH console in Development Tools to inspect deployment logs and file system |
-| **Pre-Requisites** | - Azure CLI
- Azure Developer CLI (azd)
- Node.js 18.x.x
- npm 9.x.x
- Python 3.11 |
-| **License** | MIT License – Free to use, modify, and distribute with attribution. No warranty provided. |
+ | **Category** | **Details** |
+ |------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | **Purpose** | Automate document processing using Azure services and LLMs. Extracts data from files (PDF, Word, MP3), processes via Azure OpenAI, and outputs structured insights (JSON/CSV) to blob storage. |
+ | **Infrastructure Provisioning** | Uses Bicep templates to deploy all required Azure resources. Automates setup of networking, identity, and access controls using RBAC and managed identities to ensure secure and scalable deployment. |
+ | **Main Azure Resources** | - **Azure Function App**: Hosts the orchestrator and activity functions using Durable Functions to manage the document processing workflow.
- **Azure Storage Account**: Stores input documents (bronze container) and output results (gold container). Also holds prompt configurations if UI is not deployed.
- **Azure Static Web App**: Provides a user-friendly interface for uploading files, editing prompts, and triggering workflows.
- **Azure OpenAI**: Processes extracted text using LLMs to generate structured summaries or insights.
- **Azure Cognitive Services**: Specifically uses Document Intelligence for OCR and text extraction from uploaded files.
- **Cosmos DB**: Stores prompt configurations when the frontend UI is enabled, allowing dynamic updates from the web interface.
- **Key Vault**: Securely stores secrets, keys, and credentials used by the Function App and other services.
- **Application Insights**: Enables monitoring, logging, and diagnostics for the Function App and other components.
- **App Service Plan**: Provides the compute resources for running the Function App. |
+ | **Pipeline Components** | - `function_app.py`: Main orchestrator using Durable Functions chaining pattern.
- `activities/runDocIntel.py`: Extracts text from documents using Azure Document Intelligence.
- `activities/callAoai.py`: Sends extracted text and prompt to Azure OpenAI and receives structured JSON.
- `activities/writeToBlob.py`: Writes the final output to the gold container in blob storage. |
+ | **Data Flow** | 1. Upload document to bronze container
2. OCR via Document Intelligence
3. Send extracted text + prompt to Azure OpenAI
4. Receive structured JSON
5. Write output to gold container |
+ | **Frontend UI (Optional)** | - Allows business users to upload files and edit prompts
- Prompts are stored in Cosmos DB
- Users can trigger workflows and view job status directly from the interface |
+ | **Prompt Configuration** | - Without UI: Prompts are stored in `prompts.yaml` file in blob storage
- With UI: Prompts are stored and managed in Cosmos DB via the web interface |
+ | **Deployment Steps** | 1. Fork and clone the GitHub repo
2. Run `az login`, `azd auth login`, `azd up`
3. Provide User Principal ID for RBAC setup
4. Choose whether to deploy frontend UI |
+ | **Execution (Without UI)** | - Update `prompts.yaml` with desired instructions
- Send POST request to `http_start` endpoint with blob metadata
- Monitor pipeline execution via Log Stream |
+ | **Execution (With UI)** | - Upload files via web interface
- Edit system and user prompts
- Click "Start Workflow" to trigger pipeline
- View success/failure messages and job status |
+ | **Monitoring & Troubleshooting** | - Use Log Stream for real-time logs
- Use Log Analytics Workspace to query exceptions and performance metrics
- Use SSH console in Development Tools to inspect deployment logs and file system |
+ | **Pre-Requisites** | - Azure CLI
- Azure Developer CLI (azd)
- Node.js 18.x.x
- npm 9.x.x
- Python 3.11 |
+ | **License** | MIT License – Free to use, modify, and distribute with attribution. No warranty provided. |
+
+
+
+ > Data flow:
+
+
+
+ > ZTA:
+
- [Use Azure AI services with SynapseML in Microsoft Fabric](https://learn.microsoft.com/en-us/fabric/data-science/how-to-use-ai-services-with-synapseml)
- [Plan and manage costs for Azure AI Foundry](https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/costs-plan-manage)