page_type | languages | products | name | description | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
sample |
|
|
Document Processing with Azure AI Samples |
This collection of samples demonstrates how to use various Azure AI capabilities to build a solution to extract structured data, classify, redact, and analyze documents. |
This repository contains a collection of code samples that demonstrate how to use various Azure AI capabilities to process documents.
The samples are intended to help engineering teams establish techniques with Azure AI Foundry, Azure OpenAI, Azure AI Document Intelligence, and Azure AI Language services to build solutions to extract structured data, classify, and analyze documents.
The techniques demonstrated take advantage of various capabilities from each service to:
- Reduce complexity of custom model training by taking advantage of the capabilities of Generative AI models to analyze and classify documents.
- Improve reliability in document processing by utilizing combining AI service capbilities to extract structured data from any document type, with high accuracy and confidence.
- Simplify document processing workflows by providing reusable code and patterns that can be easily modified and evaluated for most use cases.
Sample | Link | Description | Example Use Cases |
---|---|---|---|
Vision-based Classification with Azure OpenAI GPT-4o | Python | .NET | Use Azure OpenAI GPT-4o models to classify documents using their built-in vision capabilities. | Processing multiple documents types or documents with varying purposes, such as contracts, legal documents, and emails. |
Semantic Similarity Classification with Vector Embeddings | Python | .NET | Use Azure OpenAI embedding models to convert document text and classify them based on similarity to pre-defined classification lists. | Processing multiple documents types or documents with varying purposes, such as contracts, legal documents, and emails. |
Sample | Link | Description | Example Use Cases |
---|---|---|---|
LLM-enabled Redaction with Azure AI Document Intelligence, Azure OpenAI GPT-4o, and Post-Processing | Python | .NET | Use Azure AI Document Intelligence prebuilt-layout and Azure OpenAI GPT models to redact sensitive information from documents using natural language instruction to determine redaction areas. |
Require specific redaction rules, such as redacting based on context or relationships. Also works for redacting PII, including names, addresses, and phone numbers. |
Document Redaction with Azure AI Language PII Native Document Analysis | Python | .NET | Use Azure AI Language Native Document Analysis to redact personally identifiable information (PII) from documents. | Redacting sensitive information from documents, such as names, addresses, and phone numbers. |
Note
All data extraction samples provide both an accuracy and confidence score for the extracted data. The accuracy score is calculated based on the similarity between the extracted data and the ground truth data. The confidence score can be calculated based on OCR analysis confidence and logprobs
in Azure OpenAI responses.
Sample | Link | Description | Example Use Cases |
---|---|---|---|
Text-based Extraction with Azure AI Document Intelligence and Azure OpenAI GPT-4o | Python | .NET | Use Azure AI Document Intelligence prebuilt-layout and Azure OpenAI GPT models to extract structured data from documents using text. |
Predominantly text-based documents such as invoices, receipts, and forms. |
Text-based Extraction with Azure AI Document Intelligence and Microsoft Phi | Python | .NET | Use Azure AI Document Intelligence prebuilt-layout and Microsoft's Phi models to extract structured data from documents using text. |
Predominantly text-based documents such as invoices, receipts, and forms. |
Vision-based Extraction with Azure OpenAI GPT-4o GPT-4o | Python | .NET | Use Azure OpenAI GPT-4o models to extract structured data from documents using vision capabilities. | Complex documents with a mix of text and images, including diagrams, signatures, selection marks, etc. such as reports and contracts. |
Multi-Modal (Text and Vision) Extraction with Azure AI Document Intelligence and Azure OpenAI GPT-4o | Python | .NET | Improve the accuracy and confidence in extracting structured data from documents by combining text and images with LLMs. | Any structured or unstructured document type. |
The sample repository comes with a Dev Container that contains all the necessary tools and dependencies to run the sample. Please review the container and it's dependencies to understand all of the necessary components required to run these in a real-world environment, including the use of Poppler.
Important
An Azure subscription is required to run these samples. If you don't have an Azure subscription, create an account.
To use the Dev Container in GitHub Codespaces, follow these steps:
- Click on the
Code
button in the repository and selectCodespaces
. - Click on the + button to create a new Codespace using the provided
.devcontainer\devcontainer.json
configuration. - Once the Codespace is created, continue to the Azure environment setup section.
To use the Dev Container, you need to have the following tools installed on your local machine:
- Install Visual Studio Code
- Install Docker Desktop
- Install Remote - Containers extension for Visual Studio Code
To setup a local development environment, follow these steps:
Important
Ensure that Docker Desktop is running on your local machine.
- Clone the repository to your local machine.
- Open the repository in Visual Studio Code.
- Press
F1
to open the command palette and typeDev Containers: Reopen in Container
.
Once the Dev Container is up and running, continue to the Azure environment setup section.
Once the Dev Container is up and running, you can setup the necessary Azure services and run the samples in the repository by running the following command in a pwsh
terminal:
Note
For the most optimal sample experience, it is recommended to run the samples in East US
which will provide support for all the services used in the samples. Find out more about region availability for Azure AI Document Intelligence, and GPT-4o
, Phi-4
, and text-embedding-3-large
models.
az login
./Setup-Environment.ps1 -DeploymentName <UniqueDeploymentName> -Location <AzureRegion>
Note
If a specific Azure tenant is required, use the --tenant <TenantId>
parameter in the az login
command.
az login --tenant <TenantId>
Tip
If you want to preview the changes without deployment, you can add the -WhatIf
parameter to the Setup-Environment.ps1
script.
./Setup-Environment.ps1 -DeploymentName <UniqueDeploymentName> -Location <AzureRegion> -WhatIf
The script will deploy the following resources to your Azure subscription:
- Azure AI Foundry Hub & Project, a development platform for building AI solutions that integrates with Azure AI Services in a secure manner using Microsoft Entra ID for authentication.
- Note: Phi-4 MoE will be deployed as a PAYG serverless endpoint in the Azure AI Foundry Project with its primary key stored in the associated Azure Key Vault.
- Azure AI Services, a managed service for all Azure AI Services, including Azure OpenAI, Azure AI Document Intelligence, and Azure AI Language services.
- Note: GPT-4o and GPT-4o-mini will be deployed as Global Standard models with 10K TPM quota allocation.
text-embedding-3-large
will be deployed as a Standard model with 115K TPM quota allocation. These can be adjusted based on your quota availability in the main.bicep file.
- Note: GPT-4o and GPT-4o-mini will be deployed as Global Standard models with 10K TPM quota allocation.
- Azure Storage Account, required by Azure AI Foundry.
- Azure Monitor, used to store logs and traces for monitoring and troubleshooting purposes.
- Azure Container Registry, used to store container images for the Azure AI Foundry environment.
Note
All resources are secured by default with Microsoft Entra ID using Azure RBAC. Your user client ID will be added with the necessary least-privilege roles to access the resources created. A user-assigned managed identity will also be deployed for the Azure AI Foundry environment.
After the script completes, you can run any of the samples in the repository by following their instructions.
You can contribute to the repository by opening an issue or submitting a pull request. For more information, see the Contributing guide.
This project is licensed under the MIT License.