diff --git a/docs/ai-ml/architecture/analyze-video-computer-vision-machine-learning-content.md b/docs/ai-ml/architecture/analyze-video-computer-vision-machine-learning-content.md index c04e8ae794..70575d521c 100644 --- a/docs/ai-ml/architecture/analyze-video-computer-vision-machine-learning-content.md +++ b/docs/ai-ml/architecture/analyze-video-computer-vision-machine-learning-content.md @@ -6,41 +6,41 @@ This article describes an architecture that you can use to replace the manual an :::image type="content" source="_images/analyze-video-content.png" alt-text="Diagram that shows an architecture for analyzing video content." lightbox="_images/analyze-video-content.png"::: -*Download a [PowerPoint file](https://arch-center.azureedge.net/analyze-video-content.pptx) of this architecture.* +*Download a [PowerPoint file](https://arch-center.azureedge.net/analyze-video-content.pptx) of this architecture.* -### Workflow +### Workflow -1. A collection of video footage, in MP4 format, is uploaded to Azure Blob Storage. Ideally, the videos go into a "raw" container. -2. A preconfigured pipeline in Azure Machine Learning recognizes that video files are uploaded to the container and initiates an inference cluster to start separating the video footage into frames. -3. FFmpeg, an open-source tool, breaks down the video and extracts frames. You can configure how many frames per second are extracted, the quality of the extraction, and the format of the image file. The format can be JPG or PNG. -4. The inference cluster sends the images to Azure Data Lake Storage. -5. A preconfigured logic app that monitors Data Lake Storage detects that new images are being uploaded. It starts a workflow. -6. The logic app calls a pretrained custom vision model to identify objects, features, or qualities in the images. Alternatively or additionally, it calls a computer vision (optical character recognition) model to identify textual information in the images. +1. 
A collection of video footage, in MP4 format, is uploaded to Azure Blob Storage. Ideally, the videos go into a "raw" container.
+2. A preconfigured pipeline in Azure Machine Learning recognizes that video files are uploaded to the container and initiates an inference cluster to start separating the video footage into frames.
+3. FFmpeg, an open-source tool, breaks down the video and extracts frames. You can configure how many frames per second are extracted, the quality of the extraction, and the format of the image file. The format can be JPG or PNG.
+4. The inference cluster sends the images to Azure Data Lake Storage.
+5. A preconfigured logic app that monitors Data Lake Storage detects that new images are being uploaded. It starts a workflow.
+6. The logic app calls a pretrained custom vision model to identify objects, features, or qualities in the images. Alternatively or additionally, it calls a computer vision (optical character recognition) model to identify textual information in the images.
7. Results are received in JSON format. The logic app parses the results and creates key-value pairs. You can store the results in Azure dedicated SQL pools that are provisioned by Azure Synapse Analytics.
-7. Power BI provides data visualization. 
+8. Power BI provides data visualization.

### Components

-- [Azure Blob Storage](https://azure.microsoft.com/products/storage/blobs) provides object storage for cloud-native workloads and machine learning stores. In this architecture, it stores the uploaded video files. 
+- [Azure Blob Storage](https://azure.microsoft.com/products/storage/blobs) provides object storage for cloud-native workloads and machine learning stores. In this architecture, it stores the uploaded video files.
- [Azure Machine Learning](https://azure.microsoft.com/products/machine-learning) is an enterprise-grade machine learning service for the end-to-end machine learning lifecycle. 
- [Azure Data Lake Storage](https://azure.microsoft.com/products/storage/data-lake-storage) provides massively scalable, enhanced-security, cost-effective cloud storage for high-performance analytics workloads. -- [Computer Vision](https://azure.microsoft.com/resources/cloud-computing-dictionary/what-is-computer-vision/) is part of [Azure Cognitive Services](https://azure.microsoft.com/products/cognitive-services). It's used to retrieve information about each image. +- [Computer Vision](https://azure.microsoft.com/resources/cloud-computing-dictionary/what-is-computer-vision/) is part of [Azure AI services](https://azure.microsoft.com/products/cognitive-services). It's used to retrieve information about each image. - [Custom Vision](https://azure.microsoft.com/products/cognitive-services/custom-vision-service) enables you to customize and embed state-of-the-art computer vision image analysis for your specific domains. -- [Azure Logic Apps](https://azure.microsoft.com/products/logic-apps) automates workflows by connecting apps and data across environments. It provides a way to access and process data in real time. -- [Azure Synapse Analytics](https://azure.microsoft.com/products/synapse-analytics) is a limitless analytics service that brings together data integration, enterprise data warehousing, and big data analytics. +- [Azure Logic Apps](https://azure.microsoft.com/products/logic-apps) automates workflows by connecting apps and data across environments. It provides a way to access and process data in real time. +- [Azure Synapse Analytics](https://azure.microsoft.com/products/synapse-analytics) is a limitless analytics service that brings together data integration, enterprise data warehousing, and big data analytics. - [Dedicated SQL pool](/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-overview-what-is) (formerly SQL DW) is a collection of analytics resources that are provisioned when you use Azure Synapse SQL. 
- [Power BI](https://powerbi.microsoft.com) is a collection of software services, apps, and connectors that work together to provide visualizations of your data. ### Alternatives - [Azure Video Indexer](https://azure.microsoft.com/products/ai-video-indexer) is a video analytics service that uses AI to extract actionable insights from stored videos. You can use it without any expertise in machine learning. -- [Azure Data Factory](https://azure.microsoft.com/products/data-factory) is a fully managed serverless data integration service that helps you construct ETL and ELT processes. -- [Azure Functions](https://azure.microsoft.com/products/functions) is a serverless platform as a service (PaaS) that runs single-task code without requiring new infrastructure. +- [Azure Data Factory](https://azure.microsoft.com/products/data-factory) is a fully managed serverless data integration service that helps you construct extract, transform, and load (ETL) and extract, load, and transform (ELT) processes. +- [Azure Functions](https://azure.microsoft.com/products/functions) is a serverless platform as a service (PaaS) that runs single-task code without requiring new infrastructure. - [Azure Cosmos DB](https://azure.microsoft.com/products/cosmos-db) is a fully managed NoSQL database for modern app development. ## Scenario details -Many industries record video footage to detect the presence or absence of a particular object or entity or to classify objects or entities. Video monitoring and analyses are traditionally performed manually. These processes are often monotonous and prone to errors, particularly for tasks that are difficult for the human eye. You can automate these processes by using AI and machine learning. +Many industries record video footage to detect the presence or absence of a particular object or entity or to classify objects or entities. Video monitoring and analyses are traditionally performed manually. 
These processes are often monotonous and prone to errors, particularly for tasks that are difficult for the human eye. You can automate these processes by using AI and machine learning. A video recording can be separated into individual frames so that various technologies can analyze the images. One such technology is *computer vision*: the capability of a computer to identify objects and entities on an image. @@ -53,7 +53,7 @@ This scenario is relevant for any business that analyzes videos. Here are some s - **Agriculture.** Monitor and analyze crops and soil conditions over time. By using drones or UAVs, farmers can record video footage for analysis. - **Environmental sciences.** Analyze aquatic species to understand where they're located and how they evolve. By attaching underwater cameras to boats, environmental researchers can navigate the shoreline to record video footage. They can analyze the video footage to understand species migrations and how species populations change over time. - + - **Traffic control.** Classify vehicles into categories (SUV, car, truck, motorcycle), and use the information to plan traffic control. Video footage can be provided by CCTV in public locations. Most CCTV cameras record date and time, which can be easily retrieved via optical character recognition (OCR). - **Quality assurance.** Monitor and analyze quality control in a manufacturing facility. By installing cameras on the production line, you can train a computer vision model to detect anomalies. @@ -66,13 +66,13 @@ These considerations implement the pillars of the Azure Well-Architected Framewo Reliability ensures your application can meet the commitments you make to your customers. For more information, see [Overview of the reliability pillar](/azure/architecture/framework/resiliency/overview). -A reliable workload is one that's both resilient and available. *Resiliency* is the ability of the system to recover from failures and continue to function. 
The goal of resiliency is to return the application to a fully functioning state after a failure occurs. *Availability* is a measure of whether your users can access your workload when they need to. 
+A reliable workload is one that's both resilient and available. *Resiliency* is the ability of the system to recover from failures and continue to function. The goal of resiliency is to return the application to a fully functioning state after a failure occurs. *Availability* is a measure of whether your users can access your workload when they need to.

For the availability guarantees of the Azure services in this solution, see these resources:

-- [SLA for Storage Accounts](https://azure.microsoft.com/support/legal/sla/storage/v1_5)
+- [Service-level agreement (SLA) for Storage Accounts](https://azure.microsoft.com/support/legal/sla/storage/v1_5)
- [SLA for Azure Machine Learning](https://azure.microsoft.com/support/legal/sla/machine-learning-service/v1_0)
-- [SLA for Azure Cognitive Services](https://azure.microsoft.com/support/legal/sla/cognitive-services/v1_1)
+- [SLA for Azure AI services](https://azure.microsoft.com/support/legal/sla/cognitive-services/v1_1)
- [SLA for Logic Apps](https://azure.microsoft.com/support/legal/sla/logic-apps/v1_0)
- [SLA for Azure Synapse Analytics](https://azure.microsoft.com/support/legal/sla/synapse-analytics/v1_1)
- [SLA for Power BI](https://azure.microsoft.com/support/legal/sla/power-bi-embedded/v1_1)

@@ -93,20 +93,20 @@ Consider the following resources:

Cost optimization is about reducing unnecessary expenses and improving operational efficiencies. For more information, see [Overview of the cost optimization pillar](/azure/architecture/framework/cost/overview). 
-Here are some guidelines for optimizing costs: +Here are some guidelines for optimizing costs: -- Use the pay-as-you-go strategy for your architecture, and [scale out](/azure/architecture/framework/cost/optimize-autoscale) as needed rather than investing in large-scale resources at the start. -- Consider opportunity costs in your architecture, and the balance between first-mover advantage versus fast follow. Use the [pricing calculator](https://azure.microsoft.com/pricing/calculator) to estimate the initial cost and operational costs. +- Use the pay-as-you-go strategy for your architecture, and [scale out](/azure/architecture/framework/cost/optimize-autoscale) as needed rather than investing in large-scale resources at the start. +- Consider opportunity costs in your architecture, and the balance between first-mover advantage versus fast follow. Use the [pricing calculator](https://azure.microsoft.com/pricing/calculator) to estimate the initial cost and operational costs. - Establish [policies](/azure/architecture/framework/cost/principles), [budgets, and controls](/azure/architecture/framework/cost/monitor-alert) that set cost limits for your solution. ### Operational excellence Operational excellence covers the operations processes that deploy an application and keep it running in production. For more information, see [Overview of the operational excellence pillar](/azure/architecture/framework/devops/overview). -Deployments need to be reliable and predictable. Here are some guidelines: +Deployments need to be reliable and predictable. Here are some guidelines: - Automate deployments to reduce the chance of human error. -- Implement a fast, routine deployment process to avoid slowing down the release of new features and bug fixes. +- Implement a fast, routine deployment process to avoid slowing down the release of new features and bug fixes. - Quickly roll back or roll forward if an update causes problems. 
### Performance efficiency

@@ -117,14 +117,14 @@ Appropriate use of scaling and the implementation of PaaS offerings that have bu

## Contributors

-*This article is maintained by Microsoft. It was originally written by the following contributors.* 
+*This article is maintained by Microsoft. It was originally written by the following contributors.*

Principal author:

- [Oscar Shimabukuro Kiyan](https://www.linkedin.com/in/oscarshk) | Senior Cloud Solutions Architect – Data & AI

Other contributors:

-- [Mick Alberts](https://www.linkedin.com/in/mick-alberts-a24a1414) | Technical Writer 
+- [Mick Alberts](https://www.linkedin.com/in/mick-alberts-a24a1414) | Technical Writer
- [Brandon Cowen](https://www.linkedin.com/in/brandon-cowen-1658211b) | Senior Cloud Solutions Architect – Data & AI
- [Arash Mosharraf](https://www.linkedin.com/in/arashaga) | Senior Cloud Solutions Architect – Data & AI
- [Priyanshi Singh](https://www.linkedin.com/in/priyanshi-singh5) | Senior Cloud Solutions Architect – Data & AI

@@ -136,10 +136,10 @@ Other contributors:

- [Introduction to Azure Storage](/azure/storage/common/storage-introduction)
- [What is Azure Machine Learning?](/azure/machine-learning/overview-what-is-azure-machine-learning)
-- [What is Azure Cognitive Services?](/azure/cognitive-services/what-are-cognitive-services)
+- [What are Azure AI services?](/azure/cognitive-services/what-are-cognitive-services)
- [What is Azure Logic Apps?](/azure/logic-apps/logic-apps-overview)
- [What is Azure Synapse Analytics?](/azure/synapse-analytics/overview-what-is)
-- [What is Power BI embedded analytics?](/power-bi/developer/embedded/embedded-analytics-power-bi)
+- [What is Power BI Embedded analytics?](/power-bi/developer/embedded/embedded-analytics-power-bi)
- [Business Process Accelerator](https://github.com/Azure/business-process-automation)

## Related resources

diff --git a/docs/ai-ml/architecture/automate-document-classification-durable-functions-content.md 
b/docs/ai-ml/architecture/automate-document-classification-durable-functions-content.md index d222dcb748..fa0c6c2ffd 100644 --- a/docs/ai-ml/architecture/automate-document-classification-durable-functions-content.md +++ b/docs/ai-ml/architecture/automate-document-classification-durable-functions-content.md @@ -17,7 +17,7 @@ This article describes an architecture for processing document files that contai 1. The Classify activity function calls the document classifier service that's hosted in an Azure Kubernetes Service (AKS) cluster. This service uses regular expression pattern matching to identify the starting page of each known document and to calculate how many document types are contained in the document file. The types and page ranges of the documents are calculated and returned to the orchestration. > [!NOTE] - > Azure doesn’t offer a service that can classify multiple document types in a single file. This solution uses a non-Azure service that's hosted in AKS. + > Azure doesn't offer a service that can classify multiple document types in a single file. This solution uses a non-Azure service that's hosted in AKS. 1. The Metadata Store activity function saves the document type and page range information in an Azure Cosmos DB store. 1. The Indexing activity function creates a new search document in the Cognitive Search service for each identified document type and uses the [Azure AI Search libraries for .NET](/dotnet/api/overview/azure/search?view=azure-dotnet) to include in the search document the full OCR results and document information. A correlation ID is also added to the search document so that the search results can be matched with the corresponding document metadata from Azure Cosmos DB. 
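The regular-expression classification step lends itself to a small illustration. The sketch below is a hypothetical stand-in for the classifier service hosted in AKS — the patterns, document-type names, and return shape are invented for illustration, not taken from the actual service. It scans per-page OCR text for a pattern that marks the first page of each known document type, then derives the page range of each document:

```python
import re

# Hypothetical patterns: each known document type is assumed to announce
# itself on its first page. The real classifier service's rules aren't
# published in this article, so these are illustrative only.
START_PATTERNS = {
    "invoice": re.compile(r"\bINVOICE\b", re.IGNORECASE),
    "w2": re.compile(r"\bW-2\b"),
}

def classify_pages(pages):
    """Map a list of per-page OCR text to (doc_type, first_page, last_page) ranges."""
    starts = []
    for page_number, text in enumerate(pages):
        for doc_type, pattern in START_PATTERNS.items():
            if pattern.search(text):
                starts.append((page_number, doc_type))
                break  # first matching pattern wins for a given page
    ranges = []
    for i, (start, doc_type) in enumerate(starts):
        # A document runs until the page before the next document starts.
        end = starts[i + 1][0] - 1 if i + 1 < len(starts) else len(pages) - 1
        ranges.append((doc_type, start, end))
    return ranges
```

A production classifier would also need a policy for pages that precede the first match and for ambiguous pages; this sketch simply leaves the former unassigned and lets the first matching pattern win.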
@@ -26,18 +26,18 @@ This article describes an architecture for processing document files that contai

### Components

- [Durable Functions](/azure/azure-functions/durable/durable-functions-overview?tabs=csharp) is an extension of [Azure Functions](https://azure.microsoft.com/products/functions) that makes it possible for you to write stateful functions in a serverless compute environment. In this application, it's used for managing document ingestion and workflow orchestration. It lets you define stateful workflows by writing orchestrator functions that adhere to the Azure Functions programming model. Behind the scenes, the extension manages state, checkpoints, and restarts, leaving you free to focus on the business logic.
-- [Azure Cosmos DB](https://azure.microsoft.com/products/cosmos-db) is a globally distributed, multi-model database that makes it possible for your solutions to scale throughput and storage capacity across any number of geographic regions. Comprehensive service level agreements (SLAs) guarantee throughput, latency, availability, and consistency. 
+- [Azure Cosmos DB](https://azure.microsoft.com/products/cosmos-db) is a globally distributed, multi-model database that makes it possible for your solutions to scale throughput and storage capacity across any number of geographic regions. Comprehensive service-level agreements (SLAs) guarantee throughput, latency, availability, and consistency.
- [Azure Storage](https://azure.microsoft.com/product-categories/storage) is a set of massively scalable and secure cloud services for data, apps, and workloads. It includes [Blob Storage](https://azure.microsoft.com/products/storage/blobs), [Azure Files](https://azure.microsoft.com/products/storage/files), [Azure Table Storage](https://azure.microsoft.com/products/storage/tables), and [Azure Queue Storage](https://azure.microsoft.com/products/storage/queues). 
- [Azure App Service](https://azure.microsoft.com/products/app-service) provides a framework for building, deploying, and scaling web apps. The Web Apps feature is an HTTP-based service for hosting web applications, REST APIs, and mobile back ends. With Web Apps, you can develop in .NET, .NET Core, Java, Ruby, Node.js, PHP, or Python. Applications easily run and scale in Windows and Linux-based environments. -- [Azure Cognitive Services](https://azure.microsoft.com/products/cognitive-services) provides intelligent algorithms to see, hear, speak, understand, and interpret your user needs by using natural methods of communication. +- [Azure AI services](https://azure.microsoft.com/products/cognitive-services) provides intelligent algorithms to see, hear, speak, understand, and interpret your user needs by using natural methods of communication. - [Azure AI Search](https://azure.microsoft.com/products/search) provides a rich search experience over private, heterogeneous content in web, mobile, and enterprise applications. - [AKS](https://azure.microsoft.com/products/kubernetes-service) is a highly available, secure, and fully managed Kubernetes service. AKS makes it easy to deploy and manage containerized applications. ### Alternatives -- The [Form Recognizer read (OCR) model](/azure/applied-ai-services/form-recognizer/concept-read?view=form-recog-3.0.0) is an alternative to Computer Vision Read. +- The [Azure AI Document Intelligence read (OCR) model](/azure/applied-ai-services/form-recognizer/concept-read?view=form-recog-3.0.0) is an alternative to Computer Vision Read. - This solution stores metadata in Azure Cosmos DB to facilitate global distribution. [Azure SQL Database](https://azure.microsoft.com/products/azure-sql/database) is another option for persistent storage of document metadata and information. 
-- You can use other messaging platforms, including [Azure Service Bus](https://azure.microsoft.com/products/service-bus), to trigger Durable Functions instances. +- You can use other messaging platforms, including [Azure Service Bus](https://azure.microsoft.com/products/service-bus), to trigger Durable Functions instances. - For a solution accelerator that helps in clustering and segregating data into templates, see [Azure/form-recognizer-accelerator (github.com)](https://github.com/Azure/form-recognizer-accelerator). ### Scenario details @@ -73,7 +73,7 @@ A reliable workload is one that's both resilient and available. Resiliency is th For reliability information about solution components, see the following resources: - [SLA for Azure AI Search](https://azure.microsoft.com/support/legal/sla/search/v1_0) -- [SLA for Azure Applied AI Services](https://azure.microsoft.com/support/legal/sla/azure-applied-ai-services/v1_0) +- [SLA for Azure AI services](https://azure.microsoft.com/support/legal/sla/azure-applied-ai-services/v1_0) - [SLA for Azure Functions](https://azure.microsoft.com/support/legal/sla/functions/v1_2) - [SLA for App Service](https://azure.microsoft.com/support/legal/sla/app-service/v1_5) - [SLA for Storage Accounts](https://azure.microsoft.com/support/legal/sla/storage/v1_5) @@ -84,7 +84,7 @@ For reliability information about solution components, see the following resourc Cost optimization is about reducing unnecessary expenses and improving operational efficiencies. For more information, see [Overview of the cost optimization pillar](/azure/architecture/framework/cost/overview). -The most significant costs for this architecture will potentially come from the storage of image files in the storage account, Cognitive Services image processing, and index capacity requirements in the Azure AI Search service. 
+The most significant costs for this architecture will potentially come from the storage of image files in the storage account, Azure AI services image processing, and index capacity requirements in the Azure AI Search service. Costs can be optimized by [right sizing](/azure/architecture/framework/services/storage/storage-accounts/cost-optimization) the storage account by using reserved capacity and lifecycle policies, proper [Azure AI Search planning](/azure/search/search-sku-manage-costs) for regional deployments and operational scale up scheduling, and using [commitment tier pricing](/azure/cognitive-services/commitment-tier) that's available for the Computer Vision – OCR service to manage [predictable costs](/azure/cognitive-services/plan-manage-costs). @@ -98,7 +98,7 @@ Here are some guidelines for optimizing costs: Performance efficiency is the ability of your workload to scale in an efficient manner to meet the demands that users place on it. For more information, see [Performance efficiency pillar overview](/azure/architecture/framework/scalability/overview). -Periods when this solution processes high volumes can expose performance bottlenecks. Make sure that you understand and plan for the [scaling options for Azure Functions](/azure/azure-functions/functions-scale#scale), [Cognitive Services autoscaling](/azure/cognitive-services/autoscale?tabs=portal), and [Azure Cosmos DB partitioning](/azure/cosmos-db/partitioning-overview) to ensure proper performance efficiency for your solution. +Periods when this solution processes high volumes can expose performance bottlenecks. Make sure that you understand and plan for the [scaling options for Azure Functions](/azure/azure-functions/functions-scale#scale), [Azure AI services autoscaling](/azure/cognitive-services/autoscale?tabs=portal), and [Azure Cosmos DB partitioning](/azure/cosmos-db/partitioning-overview) to ensure proper performance efficiency for your solution. 
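One concrete knob for the Durable Functions side of that planning is the concurrency throttles in `host.json`. The numbers below are illustrative placeholders, not recommendations, but the settings themselves are the standard Durable Task options for capping per-instance fan-out so that bursts don't overwhelm downstream Azure AI services quotas:

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "maxConcurrentActivityFunctions": 10,
      "maxConcurrentOrchestratorFunctions": 5
    }
  }
}
```

Lowering `maxConcurrentActivityFunctions` trades throughput for predictable pressure on the OCR and classification endpoints; raising it shifts the bottleneck to those services' rate limits.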
## Contributors @@ -117,8 +117,8 @@ Introductory articles: - [Introduction to Azure Storage](/azure/storage/common/storage-introduction) - [What are Durable Functions?](/azure/azure-functions/durable/durable-functions-overview?tabs=csharp) -- [What are Azure Cognitive Services?](/azure/cognitive-services/what-are-cognitive-services) -- [What’s Azure AI Search?](/azure/search/search-what-is-azure-search) +- [What are Azure AI services?](/azure/cognitive-services/what-are-cognitive-services) +- [What's Azure AI Search?](/azure/search/search-what-is-azure-search) - [App Service overview](/azure/app-service/overview) - [Introduction to Azure Cosmos DB](/azure/cosmos-db/introduction) - [Azure Kubernetes Service](/azure/aks/intro-kubernetes) @@ -128,11 +128,11 @@ Product documentation: - [Azure documentation (all products)](/azure?product=all) - [Durable Functions documentation](/azure/azure-functions/durable) -- [Azure Cognitive Services documentation](/azure/cognitive-services) +- [Azure AI services documentation](/azure/cognitive-services) - [Azure AI Search documentation](/azure/search) ## Related resources - [Custom document processing models on Azure](../../example-scenario/document-processing/build-deploy-custom-models.yml) -- [Automate document processing by using Azure Form Recognizer](../../example-scenario/ai/automate-document-processing-azure-form-recognizer.yml) +- [Automate document processing by using AI Document Intelligence](../../example-scenario/ai/automate-document-processing-azure-form-recognizer.yml) - [Image classification on Azure](/azure/architecture/example-scenario/ai/intelligent-apps-image-processing) diff --git a/docs/ai-ml/architecture/automate-document-processing-azure-form-recognizer-content.md b/docs/ai-ml/architecture/automate-document-processing-azure-form-recognizer-content.md index 8c8768c635..fef5ae73f5 100644 --- a/docs/ai-ml/architecture/automate-document-processing-azure-form-recognizer-content.md +++ 
b/docs/ai-ml/architecture/automate-document-processing-azure-form-recognizer-content.md
@@ -1,4 +1,4 @@
-This article outlines a scalable and secure solution for building an automated document processing pipeline. The solution uses Azure Form Recognizer for the structured extraction of data. Natural language processing (NLP) models and custom models enrich the data.
+This article outlines a scalable and secure solution for building an automated document processing pipeline. The solution uses Azure AI Document Intelligence for the structured extraction of data. Natural language processing (NLP) models and custom models enrich the data.

## Architecture

@@ -14,16 +14,16 @@ The following sections describe the various stages of the data extraction proces

1. Documents are ingested through a browser at the front end of a web application. The documents contain images or are in PDF format. Azure App Service hosts a back-end application. The solution routes the documents to that application through Azure Application Gateway. This load balancer runs with Azure Web Application Firewall, which helps to protect the application from common attacks and vulnerabilities.

-1. The back-end application posts a request to a Form Recognizer REST API endpoint that uses one of these models:
+1. 
The back-end application posts a request to an Azure AI Document Intelligence REST API endpoint that uses one of these models: - * [Layout][Form Recognizer layout model] - * [Invoice][Form Recognizer invoice model] - * [Receipt][Form Recognizer receipt model] - * [ID document][Form Recognizer ID document model] - * [Business card][Form Recognizer business card model] - * [General document][Form Recognizer general document model (preview)], which is in preview + - [Layout][Form Recognizer layout model] + - [Invoice][Form Recognizer invoice model] + - [Receipt][Form Recognizer receipt model] + - [ID document][Form Recognizer ID document model] + - [Business card][Form Recognizer business card model] + - [General document][Form Recognizer general document model (preview)], which is in preview - The response from Form Recognizer contains raw OCR data and structured extractions. Form Recognizer also assigns [confidence values][Characteristics and limitations of Form Recognizer - Customer evaluation] to the extracted data. + The response from Azure AI Document Intelligence contains raw optical character recognition (OCR) data and structured extractions. Azure AI Document Intelligence also assigns [confidence values][Characteristics and limitations of Form Recognizer - Customer evaluation] to the extracted data. 1. The App Service back-end application uses the confidence values to check the extraction quality. If the quality is below a specified threshold, the app flags the data for manual verification. When the extraction quality meets requirements, the data enters [Azure Cosmos DB][Welcome to Azure Cosmos DB] for downstream application consumption. The app can also return the results to the front-end browser. @@ -31,9 +31,9 @@ The following sections describe the various stages of the data extraction proces 1. When a document enters Blob Storage, an Azure function is triggered. The function: - * Posts a request to the relevant Form Recognizer pre-built endpoint. 
- * Receives the response. - * Evaluates the extraction quality. + - Posts a request to the relevant Azure AI Document Intelligence pre-built endpoint. + - Receives the response. + - Evaluates the extraction quality. 1. The extracted data enters Azure Cosmos DB. @@ -43,82 +43,82 @@ The pipeline that's used for data enrichment depends on the use case. 1. Data enrichment can include the following NLP capabilities: - * Named entity recognition (NER) - * The extraction of personal information, key phrases, health information, and other domain-dependent entities + - Named entity recognition (NER) + - The extraction of personal information, key phrases, health information, and other domain-dependent entities To enrich the data, the web app: - * Retrieves the extracted data from Azure Cosmos DB. - * Posts requests to these features of the Azure Cognitive Service for Language API: + - Retrieves the extracted data from Azure Cosmos DB. + - Posts requests to these features of the AI Language API: - * [NER][What is Named Entity Recognition (NER) in Azure Cognitive Service for Language?] - * [Personal information][What is Personal Information detection in Azure Cognitive Service for Language?] - * [Key phrase extraction][What is key phrase extraction in Azure Cognitive Service for Language?] - * [Text analytics for health][What is Text Analytics for health in Azure Cognitive Service for Language?] - * [Custom NER][What is Custom Named Entity Recognition (NER) (preview)?], which is in preview - * [Sentiment analysis][Sentiment analysis] - * [Opinion mining][Opinion mining] + - [NER][What is Named Entity Recognition (NER) in Azure Cognitive Service for Language?] + - [Personal information][What is Personal Information detection in Azure Cognitive Service for Language?] + - [Key phrase extraction][What is key phrase extraction in Azure Cognitive Service for Language?] + - [Text Analytics for health][What is Text Analytics for health in Azure Cognitive Service for Language?] 
+ - [Custom NER][What is Custom Named Entity Recognition (NER) (preview)?], which is in preview
+ - [Sentiment analysis][Sentiment analysis]
+ - [Opinion mining][Opinion mining]

- * Receives responses from the Azure Cognitive Service for Language API.
+ - Receives responses from the AI Language API.

1. Custom models perform fraud detection, risk analysis, and other types of analysis on the data:

- * Azure Machine Learning services train and deploy the custom models.
- * The extracted data is retrieved from Azure Cosmos DB.
- * The models derive insights from the data.
+ - Azure Machine Learning services train and deploy the custom models.
+ - The extracted data is retrieved from Azure Cosmos DB.
+ - The models derive insights from the data.

These possibilities exist for inferencing:

- * Real-time processes. The models can be deployed to [managed online endpoints](/azure/machine-learning/concept-endpoints#managed-online-endpoints) or Kubernetes online endpoints, where managed Kubernetes cluster can be anywhere including [Azure Kubernetes Service (AKS)][What is Kubernetes?].
- * Batch inferencing can be done at [batch endpoints](/azure/machine-learning/concept-endpoints#what-are-batch-endpoints) or in Azure Virtual Machines.
+ - Real-time processes. The models can be deployed to [managed online endpoints](/azure/machine-learning/concept-endpoints#managed-online-endpoints) or Kubernetes online endpoints, where the managed Kubernetes cluster can be anywhere, including [Azure Kubernetes Service (AKS)][What is Kubernetes?].
+ - Batch inferencing can be done at [batch endpoints](/azure/machine-learning/concept-endpoints#what-are-batch-endpoints) or in Azure Virtual Machines.

1. The enriched data enters Azure Cosmos DB.

#### Analytics and visualizations

-1. Applications use the raw OCR, structured data from Form Recognizer endpoints, and the enriched data from NLP:
+1. 
Applications use the raw OCR, structured data from Azure AI Document Intelligence endpoints, and the enriched data from NLP: - * Power BI displays the data and presents reports on it. - * The data functions as a source for Azure Cognitive Search. - * Other applications consume the data. + - Power BI displays the data and presents reports on it. + - The data functions as a source for Azure Cognitive Search. + - Other applications consume the data. ### Components -* [App Service][App Service] is a platform as a service (PaaS) offering on Azure. You can use App Service to host web applications that you can scale in or scale out manually or automatically. The service supports various languages and frameworks, such as ASP.NET, ASP.NET Core, Java, Ruby, Node.js, PHP, and Python. +- [App Service][App Service] is a platform as a service (PaaS) offering on Azure. You can use App Service to host web applications that you can scale in or scale out manually or automatically. The service supports various languages and frameworks, such as ASP.NET, ASP.NET Core, Java, Ruby, Node.js, PHP, and Python. -* [Application Gateway][Application Gateway service page] is a layer-7 (application layer) load balancer that manages traffic to web applications. You can run Application Gateway with [Azure Web Application Firewall][Azure Web Application Firewall service page] to help protect web applications from common exploits and vulnerabilities. +- [Application Gateway][Application Gateway service page] is a layer-7 (application layer) load balancer that manages traffic to web applications. You can run Application Gateway with [Azure Web Application Firewall][Azure Web Application Firewall service page] to help protect web applications from common exploits and vulnerabilities. -* [Azure Functions][Azure Functions service page] is a serverless compute platform that you can use to build applications. 
With Functions, you can use triggers and bindings to react to changes in Azure services like Blob Storage and Azure Cosmos DB. Functions can run scheduled tasks, process data in real time, and process messaging queues. +- [Azure Functions][Azure Functions service page] is a serverless compute platform that you can use to build applications. With Functions, you can use triggers and bindings to react to changes in Azure services like Blob Storage and Azure Cosmos DB. Functions can run scheduled tasks, process data in real time, and process messaging queues. -* [Form Recognizer][Azure Form Recognizer service page] is part of Azure Applied AI Services. Form Recognizer offers a collection of pre-built endpoints for extracting data from invoices, documents, receipts, ID cards, and business cards. This service maps each piece of extracted data to a field as a key-value pair. Form Recognizer also extracts table content and structure. The output format is JSON. +- [Azure AI Document Intelligence][Azure Form Recognizer service page] is part of Azure AI services. Azure AI Document Intelligence offers a collection of pre-built endpoints for extracting data from invoices, documents, receipts, ID cards, and business cards. This service maps each piece of extracted data to a field as a key-value pair. Azure AI Document Intelligence also extracts table content and structure. The output format is JSON. -* [Azure Storage][Azure Storage service page] is a cloud storage solution that includes object, blob, file, disk, queue, and table storage. +- [Azure Storage][Azure Storage service page] is a cloud storage solution that includes object, blob, file, disk, queue, and table storage. -* [Blob Storage][Azure Blob Storage] is a service that's part of Azure Storage. Blob Storage offers optimized cloud object storage for large amounts of unstructured data. +- [Blob Storage][Azure Blob Storage] is a service that's part of Azure Storage. 
Blob Storage offers optimized cloud object storage for large amounts of unstructured data. -* [Azure Data Lake Storage][Azure Data Lake Storage] is a scalable, secure data lake for high-performance analytics workloads. The data typically comes from multiple heterogeneous sources and can be structured, semi-structured, or unstructured. Azure Data Lake Storage Gen2 combines Azure Data Lake Storage Gen1 capabilities with Blob Storage. As a next-generation solution, Data Lake Storage Gen2 provides file system semantics, file-level security, and scale. But it also offers the tiered storage, high availability, and disaster recovery capabilities of Blob Storage. +- [Azure Data Lake Storage][Azure Data Lake Storage] is a scalable, secure data lake for high-performance analytics workloads. The data typically comes from multiple heterogeneous sources and can be structured, semi-structured, or unstructured. Azure Data Lake Storage Gen2 combines Azure Data Lake Storage Gen1 capabilities with Blob Storage. As a next-generation solution, Data Lake Storage Gen2 provides file system semantics, file-level security, and scale. But it also offers the tiered storage, high availability, and disaster recovery capabilities of Blob Storage. -* [Azure Cosmos DB][Azure Cosmos DB] is a fully managed, highly responsive, scalable NoSQL database. Azure Cosmos DB offers enterprise-grade security and supports APIs for many databases, languages, and platforms. Examples include SQL, MongoDB, Gremlin, Table, and Apache Cassandra. Serverless, automatic scaling options in Azure Cosmos DB efficiently manage capacity demands of applications. +- [Azure Cosmos DB][Azure Cosmos DB] is a fully managed, highly responsive, scalable NoSQL database. Azure Cosmos DB offers enterprise-grade security and supports APIs for many databases, languages, and platforms. Examples include SQL, MongoDB, Gremlin, Table, and Apache Cassandra. 
Serverless, automatic scaling options in Azure Cosmos DB efficiently manage capacity demands of applications. -* [Azure Cognitive Service for Language][Azure Cognitive Service service page] offers many NLP services that you can use to understand and analyze text. Some of these services are customizable, such as custom NER, custom text classification, conversational language understanding, and question answering. +- [AI Language][Azure Cognitive Service service page] offers many NLP services that you can use to understand and analyze text. Some of these services are customizable, such as custom NER, custom text classification, conversational language understanding, and question answering. -* [Machine Learning][Azure Machine Learning service page] is an open platform for managing the development and deployment of machine-learning models at scale. Machine Learning caters to skill levels of different users, such as data scientists or business analysts. The platform supports commonly used open frameworks and offers automated featurization and algorithm selection. You can deploy models to various targets. Examples include [AKS][Deploy Azure Machine Learning to AKS], [Azure Container Instances][Deploy Azure Machine Learning to ACI] as a web service for real-time inferencing at scale, and [Azure Virtual Machine for batch scoring][Tutorial: Build an Azure Machine Learning pipeline for batch scoring]. Managed endpoints in Machine Learning abstract the required infrastructure for [real-time][Deploy and score a machine learning model by using an online endpoint (preview)] or [batch][Use batch endpoints (preview) for batch scoring] model inferencing. +- [Machine Learning][Azure Machine Learning service page] is an open platform for managing the development and deployment of machine-learning models at scale. Machine Learning caters to skill levels of different users, such as data scientists or business analysts. 
The platform supports commonly used open frameworks and offers automated featurization and algorithm selection. You can deploy models to various targets. Examples include [AKS][Deploy Azure Machine Learning to AKS], [Azure Container Instances][Deploy Azure Machine Learning to ACI] as a web service for real-time inferencing at scale, and [Azure Virtual Machine for batch scoring][Tutorial: Build an Azure Machine Learning pipeline for batch scoring]. Managed endpoints in Machine Learning abstract the required infrastructure for [real-time][Deploy and score a machine learning model by using an online endpoint (preview)] or [batch][Use batch endpoints (preview) for batch scoring] model inferencing. -* [AKS][Azure Kubernetes Service (AKS)] is a fully managed Kubernetes service that makes it easy to deploy and manage containerized applications. AKS offers serverless Kubernetes technology, an integrated continuous integration and continuous delivery (CI/CD) experience, and enterprise-grade security and governance. +- [AKS][Azure Kubernetes Service (AKS)] is a fully managed Kubernetes service that makes it easy to deploy and manage containerized applications. AKS offers serverless Kubernetes technology, an integrated continuous integration and continuous delivery (CI/CD) experience, and enterprise-grade security and governance. -* [Power BI][Power BI] is a collection of software services and apps that display analytics information. +- [Power BI][Power BI] is a collection of software services and apps that display analytics information. -* [Azure Cognitive Search][Azure Cognitive Search] is a cloud search service that supplies infrastructure, APIs, and tools for searching. You can use Azure Cognitive Search to build search experiences over private, heterogeneous content in web, mobile, and enterprise applications. +- [Azure Cognitive Search][Azure Cognitive Search] is a cloud search service that supplies infrastructure, APIs, and tools for searching. 
You can use Azure Cognitive Search to build search experiences over private, heterogeneous content in web, mobile, and enterprise applications. ### Alternatives -* You can use [Azure Virtual Machines][Choose the right VM for your workload and reduce costs] instead of App Service to host your application. +- You can use [Azure Virtual Machines][Choose the right VM for your workload and reduce costs] instead of App Service to host your application. -* You can use any relational database for persistent storage of the extracted data, including: +- You can use any relational database for persistent storage of the extracted data, including: - * [Azure SQL Database][Azure SQL Database]. - * [Azure Database for PostgreSQL][Azure Database for PostgreSQL]. - * [Azure Database for MySQL][Azure Database for MySQL]. + - [Azure SQL Database][Azure SQL Database]. + - [Azure Database for PostgreSQL][Azure Database for PostgreSQL]. + - [Azure Database for MySQL][Azure Database for MySQL]. ## Scenario details @@ -130,13 +130,13 @@ Optical character recognition (OCR) can extract content from images and PDF file This solution is ideal for the finance industry. It can also apply to the automotive, travel, and hospitality industries. 
The following tasks can benefit from this solution: -* Approving expense reports -* Processing invoices, receipts, and bills for insurance claims and financial audits -* Processing claims that include invoices, discharge summaries, and other documents -* Automating statement of work (SoW) approvals -* Automating ID extraction for verification purposes, as with passports or driver licenses -* Automating the process of entering business card data into visitor management systems -* Identifying purchase patterns and duplicate financial documents for fraud detection +- Approving expense reports +- Processing invoices, receipts, and bills for insurance claims and financial audits +- Processing claims that include invoices, discharge summaries, and other documents +- Automating statement of work (SoW) approvals +- Automating ID extraction for verification purposes, as with passports or driver licenses +- Automating the process of entering business card data into visitor management systems +- Identifying purchase patterns and duplicate financial documents for fraud detection ## Considerations @@ -148,77 +148,77 @@ Keep these points in mind when you use this solution. The availability of the architecture depends on the Azure services that make up the solution: -* Form Recognizer is part of Applied AI Services. For this service's availability guarantee, see [SLA for Azure Applied AI Services][SLA for Azure Applied AI Services]. +- Azure AI Document Intelligence is part of Azure AI services. For this service's availability guarantee, see [Service-level agreement (SLA) for Azure AI services][SLA for Azure Applied AI Services]. -* Azure Cognitive Service for Language is part of Azure Cognitive Services. For the availability guarantee for these services, see [SLA for Azure Cognitive Services][SLA for Azure Cognitive Services]. +- AI Language is part of Azure AI services. 
For the availability guarantee for these services, see [SLA for Azure AI services][SLA for Azure Cognitive Services]. -* Azure Cosmos DB provides high availability by maintaining four replicas of data within each region and by replicating data across regions. The exact availability guarantee depends on whether you replicate within a single region or across multiple regions. For more information, see [Achieve high availability with Azure Cosmos DB][Achieve high availability with Azure Cosmos DB]. +- Azure Cosmos DB provides high availability by maintaining four replicas of data within each region and by replicating data across regions. The exact availability guarantee depends on whether you replicate within a single region or across multiple regions. For more information, see [Achieve high availability with Azure Cosmos DB][Achieve high availability with Azure Cosmos DB]. -* Blob Storage offers redundancy options that help ensure high availability. You can use either of these approaches to replicate data three times in a primary region: +- Blob Storage offers redundancy options that help ensure high availability. You can use either of these approaches to replicate data three times in a primary region: - * At a single physical location for locally redundant storage (LRS). - * Across three availability zones that use differing availability parameters. For more information, see [Durability and availability parameters][Durability and availability parameters]. This option works best for applications that require high availability. + - At a single physical location for locally redundant storage (LRS). + - Across three availability zones that use differing availability parameters. For more information, see [Durability and availability parameters][Durability and availability parameters]. This option works best for applications that require high availability. 
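Because these services sit in series on the request path, the workload's composite availability is bounded by the product of the individual SLAs. The following sketch illustrates the arithmetic only; the percentages are hypothetical placeholders, not the current guarantees from each service's SLA page.

```python
# Hypothetical SLA figures for illustration; check each service's current
# SLA page for the real numbers.
slas = {
    "App Service": 0.9995,
    "Azure Functions": 0.9995,
    "Application Gateway": 0.9995,
    "Azure Cosmos DB (single region)": 0.9999,
    "Blob Storage (LRS)": 0.999,
}

composite = 1.0
for availability in slas.values():
    composite *= availability  # services in series: unavailability compounds

print(f"Composite availability: {composite:.4%}")
```

Even when every component offers "three or four nines," the chain as a whole offers less, which is why multi-region replication and zone redundancy matter for high-availability targets.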
-* For the availability guarantees of other Azure services in the solution, see these resources: +- For the availability guarantees of other Azure services in the solution, see these resources: - * [SLA for App Service][SLA for App Service] - * [SLA for Azure Functions][SLA for Azure Functions] - * [SLA for Application Gateway][SLA for Application Gateway] - * [SLA for Azure Kubernetes Service (AKS)][SLA for Azure Kubernetes Service (AKS)] + - [SLA for App Service][SLA for App Service] + - [SLA for Azure Functions][SLA for Azure Functions] + - [SLA for Application Gateway][SLA for Application Gateway] + - [SLA for Azure Kubernetes Service (AKS)][SLA for Azure Kubernetes Service (AKS)] ### Scalability -* App Service can automatically scale out and in as the application load varies. For more information, see [Create an autoscale setting for Azure resources based on performance data or a schedule][Create an Autoscale Setting for Azure resources based on performance data or a schedule]. +- App Service can automatically scale out and in as the application load varies. For more information, see [Create an autoscale setting for Azure resources based on performance data or a schedule][Create an Autoscale Setting for Azure resources based on performance data or a schedule]. -* Azure Functions can scale automatically or manually. The hosting plan that you choose determines the scaling behavior of your function apps. For more information, see [Azure Functions hosting options][Azure Functions hosting options]. +- Azure Functions can scale automatically or manually. The hosting plan that you choose determines the scaling behavior of your function apps. For more information, see [Azure Functions hosting options][Azure Functions hosting options]. -* By default, Form Recognizer supports 15 concurrent requests per second. You can increase this value by [creating an Azure support ticket][Create an Azure support request] with a quota increase request. 
+- By default, Azure AI Document Intelligence supports 15 concurrent requests per second. You can increase this value by [creating an Azure Support ticket][Create an Azure support request] with a quota increase request.

-* For custom models that you host as web services on AKS, [azureml-fe][Deploy a model to an Azure Kubernetes Service cluster - Autoscaling] automatically scales as needed. This front-end component routes incoming inference requests to deployed services.
+- For custom models that you host as web services on AKS, [azureml-fe][Deploy a model to an Azure Kubernetes Service cluster - Autoscaling] automatically scales as needed. This front-end component routes incoming inference requests to deployed services.

-* For batch inferencing, Machine Learning creates a compute cluster on demand that scales automatically. For more information, see [Tutorial: Build an Azure Machine Learning pipeline for batch scoring][Tutorial: Build an Azure Machine Learning pipeline for batch scoring]. Machine Learning uses the [ParellelRunStep][ParallelRunStep Class] class to run the inferencing jobs in parallel.
+- For batch inferencing, Machine Learning creates a compute cluster on demand that scales automatically. For more information, see [Tutorial: Build an Azure Machine Learning pipeline for batch scoring][Tutorial: Build an Azure Machine Learning pipeline for batch scoring]. Machine Learning uses the [ParallelRunStep][ParallelRunStep Class] class to run the inferencing jobs in parallel.

-* For Azure Cognitive Service for Language, data and rate limits apply. For more information, see these resources:
+- For AI Language, data and rate limits apply. 
For more information, see these resources: - * [How to use named entity recognition (NER)][How to use named entity recognition (NER) - Data limits] - * [How to detect and redact personal information][How to detect and redact Personal Information - Data limits] - * [How to use sentiment analysis and opinion mining][How to: Use Sentiment analysis and Opinion Mining - Data limits] - * [How to use Text Analytics for health][How to use Text Analytics for health - Data limits] + - [How to use named entity recognition (NER)][How to use named entity recognition (NER) - Data limits] + - [How to detect and redact personal information][How to detect and redact Personal Information - Data limits] + - [How to use sentiment analysis and opinion mining][How to: Use Sentiment analysis and Opinion Mining - Data limits] + - [How to use Text Analytics for health][How to use Text Analytics for health - Data limits] ### Security Security provides assurances against deliberate attacks and the abuse of your valuable data and systems. For more information, see [Overview of the security pillar](/azure/architecture/framework/security/overview). -* Azure Web Application Firewall helps protect your application from common vulnerabilities. This Application Gateway option uses Open Web Application Security Project (OWASP) rules to prevent attacks like cross-site scripting, session hijacks, and other exploits. +- Azure Web Application Firewall helps protect your application from common vulnerabilities. This Application Gateway option uses Open Web Application Security Project (OWASP) rules to prevent attacks like cross-site scripting, session hijacks, and other exploits. -* To improve App Service security, consider these options: +- To improve App Service security, consider these options: - * App Service can access resources in Azure Virtual Network through virtual network integration. 
- * You can use App Service in an app service environment (ASE), which you deploy to a dedicated virtual network. This approach helps to isolate the connectivity between App Service and other resources in the virtual network. + - App Service can access resources in Azure Virtual Network through virtual network integration. + - You can use App Service in an App Service Environment, which you deploy to a dedicated virtual network. This approach helps to isolate the connectivity between App Service and other resources in the virtual network. For more information, see [Security in Azure App Service][Security in Azure App Service - Resources inside an Azure Virtual Network]. -* Blob Storage and Azure Cosmos DB encrypt data at rest. You can secure these services by using service endpoints or private endpoints. +- Blob Storage and Azure Cosmos DB encrypt data at rest. You can secure these services by using service endpoints or private endpoints. -* Azure Functions supports virtual network integration. By using this functionality, function apps can access resources inside a virtual network. For more information, see [Azure Functions networking options][Azure Functions networking options]. +- Azure Functions supports virtual network integration. By using this functionality, function apps can access resources inside a virtual network. For more information, see [Azure Functions networking options][Azure Functions networking options]. -* You can configure Form Recognizer and Azure Cognitive Service for Language for access from specific virtual networks or from private endpoints. These services encrypt data at rest. You can use subscription keys, tokens, or Microsoft Entra ID to authenticate requests to these services. For more information, see [Authenticate requests to Azure Cognitive Services][Authenticate requests to Azure Cognitive Services]. 
+- You can configure Azure AI Document Intelligence and AI Language for access from specific virtual networks or from private endpoints. These services encrypt data at rest. You can use subscription keys, tokens, or Microsoft Entra ID to authenticate requests to these services. For more information, see [Authenticate requests to Azure AI services][Authenticate requests to Azure Cognitive Services]. -* Machine Learning offers many levels of security: +- Machine Learning offers many levels of security: - * [Workspace authentication][Set up authentication for Azure Machine Learning resources and workflows] provides identity and access management. - * You can use [authorization][Manage access to an Azure Machine Learning workspace] to manage access to the workspace. - * By [securing workspace resources][Secure an Azure Machine Learning workspace with virtual networks], you can improve network security. - * You can [use Transport Layer Security (TLS) to secure web services][Use TLS to secure a web service through Azure Machine Learning] that you deploy through Machine Learning. - * To protect data, you can [change the access keys for Azure Storage accounts][Regenerate storage account access keys] that Machine Learning uses. + - [Workspace authentication][Set up authentication for Azure Machine Learning resources and workflows] provides identity and access management. + - You can use [authorization][Manage access to an Azure Machine Learning workspace] to manage access to the workspace. + - By [securing workspace resources][Secure an Azure Machine Learning workspace with virtual networks], you can improve network security. + - You can [use Transport Layer Security (TLS) to secure web services][Use TLS to secure a web service through Azure Machine Learning] that you deploy through Machine Learning. + - To protect data, you can [change the access keys for Azure Storage accounts][Regenerate storage account access keys] that Machine Learning uses. 
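As a concrete illustration of key-based authentication to these services, the sketch below builds (but doesn't send) a REST request that carries the subscription key in the `Ocp-Apim-Subscription-Key` header. The endpoint, key, and request body are hypothetical placeholders; in production, prefer Microsoft Entra ID tokens and keep keys in a secret store such as Azure Key Vault rather than in code.

```python
import json
import urllib.request

# Placeholder endpoint and key for illustration only.
endpoint = (
    "https://<your-resource>.cognitiveservices.azure.com"
    "/language/:analyze-text?api-version=2023-04-01"
)
subscription_key = "<your-key>"

body = {
    "kind": "EntityRecognition",
    "analysisInput": {
        "documents": [{"id": "1", "language": "en", "text": "Contoso filed an invoice."}]
    },
}

request = urllib.request.Request(
    endpoint,
    data=json.dumps(body).encode("utf-8"),
    headers={
        "Ocp-Apim-Subscription-Key": subscription_key,  # key-based auth header
        "Content-Type": "application/json",
    },
    method="POST",
)

# urllib.request.urlopen(request) would send the call; omitted here.
print(request.get_method())
```

The same header works for Azure AI Document Intelligence endpoints; only the URL path and request body differ.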
### Resiliency -* The solution's resiliency depends on the failure modes of individual services like App Service, Functions, Azure Cosmos DB, Storage, and Application Gateway. For more information, see [Resiliency checklist for specific Azure services][Resiliency checklist for specific Azure services]. +- The solution's resiliency depends on the failure modes of individual services like App Service, Functions, Azure Cosmos DB, Storage, and Application Gateway. For more information, see [Resiliency checklist for specific Azure services][Resiliency checklist for specific Azure services]. -* You can make Form Recognizer resilient. Possibilities include designing it to fail over to another region and splitting the workload into two or more regions. For more information, see [Back up and recover your Form Recognizer models][Back up and recover your Form Recognizer models]. +- You can make Azure AI Document Intelligence resilient. Possibilities include designing it to fail over to another region and splitting the workload into two or more regions. For more information, see [Back up and recover your Azure AI Document Intelligence models][Back up and recover your Form Recognizer models]. -* Machine Learning services depend on many Azure services. To provide resiliency, you need to configure each service to be resilient. For more information, see [Failover for business continuity and disaster recovery][Failover for business continuity and disaster recovery]. +- Machine Learning services depend on many Azure services. To provide resiliency, you need to configure each service to be resilient. For more information, see [Failover for business continuity and disaster recovery][Failover for business continuity and disaster recovery]. 
### Cost optimization @@ -228,21 +228,21 @@ The cost of implementing this solution depends on which components you use and w Many factors can affect the price of each component: -* The number of documents that you process -* The number of concurrent requests that your application receives -* The size of the data that you store after processing -* Your deployment region +- The number of documents that you process +- The number of concurrent requests that your application receives +- The size of the data that you store after processing +- Your deployment region These resources provide information on component pricing options: -* [Azure Form Recognizer pricing][Azure Form Recognizer pricing] -* [App Service pricing][App Service pricing] -* [Azure Functions pricing][Azure Functions pricing] -* [Application Gateway pricing][Application Gateway pricing] -* [Azure Blob Storage pricing][Azure Blob Storage pricing] -* [Azure Cosmos DB pricing][Azure Cosmos DB pricing] -* [Language Service pricing][Language Service pricing] -* [Azure Machine Learning pricing][Azure Machine Learning pricing] +- [AI Document Intelligence pricing][Azure Form Recognizer pricing] +- [App Service pricing][App Service pricing] +- [Azure Functions pricing][Azure Functions pricing] +- [Application Gateway pricing][Application Gateway pricing] +- [Azure Blob Storage pricing][Azure Blob Storage pricing] +- [Azure Cosmos DB pricing][Azure Cosmos DB pricing] +- [Language Service pricing][Language Service pricing] +- [Azure Machine Learning pricing][Azure Machine Learning pricing] After deciding on a pricing tier for each component, use the [Azure Pricing calculator][Azure Pricing calculator] to estimate the solution cost. 
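To turn those factors into a first-order estimate before opening the calculator, simple arithmetic is enough. Every price and volume below is a hypothetical placeholder, not a published rate:

```python
# All figures are illustrative placeholders; use the Azure Pricing
# calculator for real prices in your region and tier.
docs_per_month = 50_000
pages_per_doc = 3
price_per_1000_pages = 1.50   # document analysis, placeholder rate
storage_gb = 20
price_per_gb_month = 0.02     # blob storage, placeholder rate
app_service_month = 70.00     # fixed plan cost, placeholder rate

pages = docs_per_month * pages_per_doc
estimate = (
    pages / 1000 * price_per_1000_pages
    + storage_gb * price_per_gb_month
    + app_service_month
)
print(f"Rough monthly estimate: ${estimate:,.2f}")
```

A back-of-the-envelope model like this makes it easy to see which factor dominates the bill (here, per-page document analysis) before you tune tiers in the calculator.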
@@ -252,27 +252,27 @@ After deciding on a pricing tier for each component, use the [Azure Pricing calc Principal author: -* [Jyotsna Ravi](https://in.linkedin.com/in/jyotsna-ravi-50182624) | Senior Customer Engineer +- [Jyotsna Ravi](https://in.linkedin.com/in/jyotsna-ravi-50182624) | Senior Customer Engineer ## Next steps -* [What is Azure Form Recognizer?][What is Azure Form Recognizer?] -* [Use Form Recognizer SDKs or REST API][Use Form Recognizer SDKs or REST API] -* [What is Azure Cognitive Service for Language?][What is Azure Cognitive Service for Language?] -* [What is Azure Machine Learning?][What is Azure Machine Learning?] -* [Introduction to Azure Functions][Introduction to Azure Functions] -* [How to configure Azure Functions with a virtual network][How to configure Azure Functions with a virtual network] -* [What is Azure Application Gateway?][What is Azure Application Gateway?] -* [What is Azure Web Application Firewall on Azure Application Gateway?][What is Azure Web Application Firewall on Azure Application Gateway?] -* [Tutorial: How to access on-premises SQL Server from Data Factory Managed VNet using Private Endpoint][Tutorial: How to access on-premises SQL Server from Data Factory Managed VNet using Private Endpoint] -* [Azure Storage documentation][Azure Storage documentation] +- [What is AI Document Intelligence?][What is Azure Form Recognizer?] +- [Use Azure AI Document Intelligence SDKs or REST API][Use Form Recognizer SDKs or REST API] +- [What is AI Language?][What is Azure Cognitive Service for Language?] +- [What is Azure Machine Learning?][What is Azure Machine Learning?] +- [Introduction to Azure Functions][Introduction to Azure Functions] +- [How to configure Azure Functions with a virtual network][How to configure Azure Functions with a virtual network] +- [What is Azure Application Gateway?][What is Azure Application Gateway?] 
+- [What is Azure Web Application Firewall on Azure Application Gateway?][What is Azure Web Application Firewall on Azure Application Gateway?] +- [Tutorial: How to access on-premises SQL Server from Data Factory Managed virtual network using Private Endpoint][Tutorial: How to access on-premises SQL Server from Data Factory Managed VNet using Private Endpoint] +- [Azure Storage documentation][Azure Storage documentation] ## Related resources -* [Extract text from objects using Power Automate and AI Builder][Extract text from objects using Power Automate and AI Builder] -* [Knowledge mining in business process management][Knowledge mining in business process management] -* [Knowledge mining in contract management][Knowledge mining in contract management] -* [Knowledge mining for content research][Knowledge mining for content research] +- [Extract text from objects using Power Automate and AI Builder][Extract text from objects using Power Automate and AI Builder] +- [Knowledge mining in business process management][Knowledge mining in business process management] +- [Knowledge mining in contract management][Knowledge mining in contract management] +- [Knowledge mining for content research][Knowledge mining for content research] [Achieve high availability with Azure Cosmos DB]: /azure/cosmos-db/high-availability#slas-for-availability [App Service]: https://azure.microsoft.com/services/app-service diff --git a/docs/ai-ml/architecture/automate-pdf-forms-processing-content.md b/docs/ai-ml/architecture/automate-pdf-forms-processing-content.md index f19d317640..1a2eab91d8 100644 --- a/docs/ai-ml/architecture/automate-pdf-forms-processing-content.md +++ b/docs/ai-ml/architecture/automate-pdf-forms-processing-content.md @@ -15,7 +15,7 @@ This article describes an Azure architecture that you can use to replace costly 1. The logic app sends the location of the PDF file to a function app for processing. The function app is built by using the capabilities of Azure Functions. 1. 
The function app receives the location of the file and takes these actions:
    1. It splits the file into single pages if the file has multiple pages. Each page contains one independent form. Split files are saved to a second container in Data Lake Storage.
-    1. It uses HTTPS POST, an Azure REST API, to send the location of the single-page PDF file to Azure Form Recognizer for processing. When Form Recognizer completes its processing, it sends a response back to the function app, which places the information into a data structure.
+    1. It uses an HTTPS POST request to the Azure REST API to send the location of the single-page PDF file to Azure AI Document Intelligence for processing. When Azure AI Document Intelligence completes its processing, it sends a response back to the function app, which places the information into a data structure.
    1. It creates a JSON data file that contains the response data and stores the file to a third container in Data Lake Storage.
1. The forms processing logic app receives the processed response data.
1. The forms processing logic app sends the processed data to Azure Cosmos DB, which saves the data in a database and in collections.
@@ -24,7 +24,7 @@

### Components

-- [Azure Applied AI Services](https://azure.microsoft.com/products/applied-ai-services) is a category of Azure AI products that use Azure Cognitive Services, task-specific AI, and business logic to provide turnkey AI services for common business processes. One of these products is [Form Recognizer](https://azure.microsoft.com/products/form-recognizer), which uses machine learning models to extract key-value pairs, text, and tables from documents.
+- [Azure AI services](https://azure.microsoft.com/products/applied-ai-services) is a category of Azure AI products that use task-specific AI and business logic to provide turnkey AI services for common business processes. 
One of these products is [Azure AI Document Intelligence](https://azure.microsoft.com/products/form-recognizer), which uses machine learning models to extract key-value pairs, text, and tables from documents. - [Azure Logic Apps](https://azure.microsoft.com/products/logic-apps) is a serverless cloud service for creating and running automated workflows that integrate apps, data, services, and systems. - [Azure Functions](https://azure.microsoft.com/products/functions) is a serverless solution that makes it possible for you to write less code, maintain less infrastructure, and save on costs. - [Azure Data Lake Storage](https://azure.microsoft.com/products/storage/data-lake-storage) is the foundation for building enterprise data lakes on Azure. @@ -40,7 +40,7 @@ This article describes an Azure architecture that you can use to replace costly Forms processing is often a critical business function. Many companies still rely on manual processes that are costly, time consuming, and prone to error. Replacing manual processes reduces cost and risk and makes a company more agile. -This article describes an architecture that you can use to replace manual PDF forms processing or costly legacy systems that automate PDF forms processing. Form Recognizer processes the PDF forms, Logic Apps provides the workflow, and Functions provides data processing capabilities. +This article describes an architecture that you can use to replace manual PDF forms processing or costly legacy systems that automate PDF forms processing. Azure AI Document Intelligence processes the PDF forms, Logic Apps provides the workflow, and Functions provides data processing capabilities. For deployment information, see [Deploy this scenario](#deploy-this-scenario) in this article. 
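The workflow above leaves the parsing step abstract: the function app receives the Azure AI Document Intelligence JSON response and flattens it into key-value pairs before storing the results. A minimal sketch of that step, assuming a deliberately simplified response shape (the field names and the `content`/`confidence` keys here are illustrative, not the exact service contract):

```python
import json

def fields_to_pairs(response_json: str, min_confidence: float = 0.5) -> dict:
    """Flatten a simplified Document Intelligence-style analyze response
    into plain key-value pairs, dropping low-confidence fields."""
    result = json.loads(response_json)
    pairs = {}
    for doc in result.get("analyzeResult", {}).get("documents", []):
        for name, field in doc.get("fields", {}).items():
            if field.get("confidence", 0.0) >= min_confidence:
                pairs[name] = field.get("content")
    return pairs

# Hypothetical response for one single-page form.
sample = json.dumps({
    "analyzeResult": {"documents": [{"fields": {
        "InvoiceId": {"content": "INV-100", "confidence": 0.98},
        "Total": {"content": "42.50", "confidence": 0.31},
    }}]}
})
print(fields_to_pairs(sample))  # {'InvoiceId': 'INV-100'}
```

In a real function app, the resulting dictionary would be serialized back to JSON and written to the third Data Lake Storage container; the confidence threshold is a design choice for filtering noisy extractions.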
@@ -60,19 +60,19 @@ The solution that's described in this article can process many types of forms, i ## Considerations -These considerations implement the pillars of the Azure Well-Architected Framework, a set of guiding tenets that you can use to improve the quality of a workload. For more information, see [Microsoft Azure Well-Architected Framework](/azure/architecture/framework). +These considerations implement the pillars of the Azure Well-Architected Framework, a set of guiding tenets that you can use to improve the quality of a workload. For more information, see [Microsoft Azure Well-Architected Framework](/azure/architecture/framework). ### Reliability -Reliability ensures that your application can meet the commitments that you make to your customers. For more information, see [Overview of the reliability pillar](/azure/architecture/framework/resiliency/overview). +Reliability ensures that your application can meet the commitments that you make to your customers. For more information, see [Overview of the reliability pillar](/azure/architecture/framework/resiliency/overview). -A reliable workload is one that's both resilient and available. *Resiliency* is the ability of the system to recover from failures and continue to function. The goal of resiliency is to return the application to a fully functioning state after a failure occurs. *Availability* is a measure of whether your users can access your workload when they need to. +A reliable workload is one that's both resilient and available. *Resiliency* is the ability of the system to recover from failures and continue to function. The goal of resiliency is to return the application to a fully functioning state after a failure occurs. *Availability* is a measure of whether your users can access your workload when they need to. This architecture is intended as a starter architecture that you can quickly deploy and prototype to provide a business solution. 
If your prototype is a success, you can then extend and enhance the architecture, if necessary, to meet additional requirements. This architecture utilizes scalable and resilient Azure infrastructure and technologies. For example, Azure Cosmos DB has built-in redundancy and global coverage that you can configure to meet your needs. -For the availability guarantees of the Azure services that this solution uses, see [Service Level Agreements (SLA) for Online Services](https://www.microsoft.com/licensing/docs/view/Service-Level-Agreements-SLA-for-Online-Services). +For the availability guarantees of the Azure services that this solution uses, see [Service-level agreements (SLAs) for Online Services](https://www.microsoft.com/licensing/docs/view/Service-Level-Agreements-SLA-for-Online-Services). ### Security @@ -91,7 +91,7 @@ Cost optimization is about looking at ways to reduce unnecessary expenses and to Here are some guidelines for optimizing costs: -- Use the pay-as-you-go strategy for your architecture, and [scale out](/azure/architecture/framework/cost/optimize-autoscale) as needed rather than investing in large-scale resources at the start. +- Use the pay-as-you-go strategy for your architecture, and [scale out](/azure/architecture/framework/cost/optimize-autoscale) as needed rather than investing in large-scale resources at the start. - The implementation of the architecture that's described in [Deploy this scenario](#deploy-this-scenario) deploys a starting solution that's suitable for proof of concept. The deployment scripts create a working architecture with minimal resource requirements. For example, the deployment scripts create the smallest serverless Linux host to run the function app. 
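To make the pay-as-you-go guidance above concrete, here is a back-of-envelope model of a consumption-style bill for the function app. The per-GB-second rate, per-execution rate, and free grants below are illustrative assumptions only, and the formula simplifies real metering; check the current Azure pricing pages before relying on any numbers:

```python
def consumption_cost(executions: int, seconds_each: float, memory_gb: float,
                     gb_s_rate: float = 0.000016,          # assumed $/GB-second
                     exec_rate_per_million: float = 0.20,  # assumed $/1M executions
                     free_gb_s: float = 400_000.0,         # assumed monthly free grant
                     free_execs: int = 1_000_000) -> float:
    """Estimate a monthly pay-per-use bill: resource consumption is billed
    in GB-seconds plus a per-execution charge, each after a free grant."""
    gb_seconds = executions * seconds_each * memory_gb
    billable_gb_s = max(0.0, gb_seconds - free_gb_s)
    billable_execs = max(0, executions - free_execs)
    return billable_gb_s * gb_s_rate + billable_execs / 1_000_000 * exec_rate_per_million

# A small proof-of-concept workload stays inside the free grants:
print(consumption_cost(100_000, 1.0, 0.5))              # 0.0
# A heavier production workload starts accruing charges:
print(round(consumption_cost(2_000_000, 2.0, 1.0), 2))  # 57.8
```

Even a rough model like this shows why starting small and scaling out beats provisioning large-scale resources up front: the proof-of-concept tier costs nothing under the assumed grants.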
### Performance efficiency @@ -111,7 +111,7 @@ The accelerator receives the PDF forms, extracts the data fields, and saves the You can use the accelerator as is, without code modification, to process and visualize any single-page PDF forms such as safety forms, invoices, incident records, and many others. To use it, you only need to collect sample PDF forms, train a new model to learn the layout of the forms, and plug the model into the solution. You also need to redesign the Power BI report for your datasets so that it provides the insights that you want. -The implementation uses [Form Recognizer Studio](https://formrecognizer.appliedai.azure.com/studio) to create custom models. The accelerator uses the field names that are saved in the machine learning model as a reference to process other forms. Only five sample forms are needed to create a custom-built machine learning model. You can merge as many as 100 custom-built models to create a composite machine learning model that can process a variety of forms. +The implementation uses [Azure AI Document Intelligence Studio](https://formrecognizer.appliedai.azure.com/studio) to create custom models. The accelerator uses the field names that are saved in the machine learning model as a reference to process other forms. Only five sample forms are needed to create a custom-built machine learning model. You can merge as many as 100 custom-built models to create a composite machine learning model that can process a variety of forms. ### Deployment repository @@ -123,7 +123,7 @@ To deploy, you need an Azure subscription. 
For information about free subscripti To learn about the services that are used in the accelerator, see the overview and reference articles that are listed in: -- [Azure Form Recognizer documentation](/azure/applied-ai-services/form-recognizer/?view=form-recog-3.0.0) +- [Azure AI Document Intelligence documentation](/azure/applied-ai-services/form-recognizer/?view=form-recog-3.0.0) - [Azure Logic Apps documentation](/azure/logic-apps) - [Azure Functions documentation](/azure/azure-functions) - [Introduction to Azure Data Lake Storage Gen2](/azure/storage/blobs/data-lake-storage-introduction) @@ -169,7 +169,7 @@ Other contributors: ## Next steps -- [Video: Azure PDF Form Processing Automation SA](https://www.youtube.com/watch?v=2zvoO1jc8CE). +- [Video: Azure PDF Form Processing Automation](https://www.youtube.com/watch?v=2zvoO1jc8CE) - [Azure PDF Form Processing Automation Solution Accelerator](https://github.com/microsoft/Azure-PDF-Form-Processing-Automation-Solution-Accelerator) - [Azure invoice Process Automation Solution Accelerator](https://github.com/microsoft/Azure-Invoice-Process-Automation-Solution-Accelerator) - [Business Process Automation Accelerator](https://github.com/Azure/business-process-automation) @@ -180,4 +180,4 @@ Other contributors: - [Custom document processing models on Azure](../../example-scenario/document-processing/build-deploy-custom-models.yml) - [Index file content and metadata by using Azure Cognitive Search](../../example-scenario/data/search-blob-metadata.yml) - [Automate document identification, classification, and search by using Durable Functions](../../example-scenario/ai/automate-document-classification-durable-functions.yml) -- [Automate document processing by using Azure Form Recognizer](../../example-scenario/ai/automate-document-processing-azure-form-recognizer.yml) +- [Automate document processing by using Azure AI Document Intelligence](../../example-scenario/ai/automate-document-processing-azure-form-recognizer.yml) diff 
--git a/docs/ai-ml/architecture/batch-scoring-R-models-content.md b/docs/ai-ml/architecture/batch-scoring-R-models-content.md index b0ab59f94f..38db32822d 100644 --- a/docs/ai-ml/architecture/batch-scoring-R-models-content.md +++ b/docs/ai-ml/architecture/batch-scoring-R-models-content.md @@ -33,7 +33,7 @@ A supermarket chain needs to forecast sales of products over the upcoming quarte Processing involves the following steps: -1. An Azure Logic App triggers the forecast generation process once per week. +1. An Azure logic app triggers the forecast generation process once per week. 1. The logic app starts an Azure Container Instance running the scheduler Docker container, which triggers the scoring jobs on the Batch cluster. @@ -85,7 +85,7 @@ Monitor and terminate Batch jobs from the **Jobs** pane of the Batch account in The doAzureParallel package automatically collects logs of all stdout/stderr for every job submitted on Azure Batch. These logs can be found in the storage account created at setup. To view them, use a storage navigation tool such as [Azure Storage Explorer][storage-explorer] or Azure portal. -To quickly debug Batch jobs during development, view the logs in your local R session. For more information, see using the [Configure and submit training runs][getJobFiles]. +To quickly debug Batch jobs during development, view the logs in your local R session. For more information, see [Configure and submit training runs][getJobFiles]. ### Cost optimization @@ -93,10 +93,10 @@ Cost optimization is about looking at ways to reduce unnecessary expenses and im The compute resources used in this reference architecture are the most costly components. For this scenario, a cluster of fixed size is created whenever the job is triggered and then shut down after the job has completed. Cost is incurred only while the cluster nodes are starting, running, or shutting down. 
This approach is suitable for a scenario where the compute resources required to generate the forecasts remain relatively constant from job to job. -In scenarios where the amount of compute required to complete the job isn't known in advance, it may be more suitable to use automatic scaling. With this approach, the size of the cluster is scaled up or down depending on the size of the job. Azure Batch supports a range of autoscale formulae, which you can set when defining the cluster using the +In scenarios where the amount of compute required to complete the job isn't known in advance, it might be more suitable to use automatic scaling. With this approach, the size of the cluster is scaled up or down depending on the size of the job. Azure Batch supports a range of autoscale formulas, which you can set when defining the cluster using the [doAzureParallel][doAzureParallel] API. -For some scenarios, the time between jobs may be too short to shut down and start up the cluster. In these cases, keep the cluster running between jobs if appropriate. +For some scenarios, the time between jobs might be too short to shut down and start up the cluster. In these cases, keep the cluster running between jobs if appropriate. Azure Batch and doAzureParallel support the use of low-priority VMs. These VMs come with a significant discount but risk being appropriated by other higher priority workloads. Therefore, the use of low-priority VMs isn't recommended for critical production workloads. However, they're useful for experimental or development workloads. @@ -106,12 +106,12 @@ To deploy this reference architecture, follow the steps described in the [GitHub ## Contributors -*This article is maintained by Microsoft. It was originally written by the following contributors.* +*This article is maintained by Microsoft. 
It was originally written by the following contributors.* Principal author: - - [Angus Taylor](https://www.linkedin.com/in/angus-taylor-99ab4a74) | Senior Data Scientist - +- [Angus Taylor](https://www.linkedin.com/in/angus-taylor-99ab4a74) | Senior Data Scientist + *To see non-public LinkedIn profiles, sign in to LinkedIn.* ## Next steps diff --git a/docs/ai-ml/architecture/batch-scoring-databricks-content.md b/docs/ai-ml/architecture/batch-scoring-databricks-content.md index 3543ee4d0f..7575b5b734 100644 --- a/docs/ai-ml/architecture/batch-scoring-databricks-content.md +++ b/docs/ai-ml/architecture/batch-scoring-databricks-content.md @@ -73,7 +73,7 @@ These considerations implement the pillars of the Azure Well-Architected Framewo ### Performance -An Azure Databricks cluster enables autoscaling by default so that during runtime, Databricks dynamically reallocates workers to account for the characteristics of your job. Certain parts of your pipeline may be more computationally demanding than others. Databricks adds extra workers during these phases of your job (and removes them when they're no longer needed). Autoscaling makes it easier to achieve high [cluster utilization][cluster], because you don't need to provision the cluster to match a workload. +An Azure Databricks cluster enables autoscaling by default so that during runtime, Databricks dynamically reallocates workers to account for the characteristics of your job. Certain parts of your pipeline might be more computationally demanding than others. Databricks adds extra workers during these phases of your job (and removes them when they're no longer needed). Autoscaling makes it easier to achieve high [cluster utilization][cluster], because you don't need to provision the cluster to match a workload. Develop more complex scheduled pipelines by using [Azure Data Factory][adf] with Azure Databricks. @@ -93,7 +93,7 @@ For this scenario, the standard pricing tier is sufficient. 
However, if your spe The solution notebooks can run on any Spark-based platform with minimal edits to remove the Databricks-specific packages. See the following similar solutions for various Azure platforms: -- [SQL Server R services][sql-r] +- [SQL Server R Services][sql-r] - [PySpark on an Azure Data Science Virtual Machine][py-dvsm] ## Deploy this scenario @@ -102,12 +102,12 @@ To deploy this reference architecture, follow the steps described in the [GitHub ## Contributors -*This article is maintained by Microsoft. It was originally written by the following contributors.* +*This article is maintained by Microsoft. It was originally written by the following contributors.* Principal author: - - [John Ehrlinger](https://www.linkedin.com/in/ehrlinger) | Senior Applied Scientist - +- [John Ehrlinger](https://www.linkedin.com/in/ehrlinger) | Senior Applied Scientist + *To see non-public LinkedIn profiles, sign in to LinkedIn.* ## Next steps diff --git a/docs/ai-ml/architecture/batch-scoring-deep-learning-content.md b/docs/ai-ml/architecture/batch-scoring-deep-learning-content.md index d61f241d72..7dfecf5eb6 100644 --- a/docs/ai-ml/architecture/batch-scoring-deep-learning-content.md +++ b/docs/ai-ml/architecture/batch-scoring-deep-learning-content.md @@ -20,7 +20,7 @@ This architecture consists of the following components. #### Trigger -**[Azure Logic Apps][logic-apps]** triggers the workflow. When the Logic App detects that a blob has been added to the container, it triggers the Azure Machine Learning pipeline. Logic Apps is a good fit for this reference architecture because it's an easy way to detect changes to blob storage, with an easy process for changing the trigger. +**[Azure Logic Apps][logic-apps]** triggers the workflow. When the logic app detects that a blob has been added to the container, it triggers the Azure Machine Learning pipeline. 
Logic Apps is a good fit for this reference architecture because it's an easy way to detect changes to blob storage, with an easy process for changing the trigger. #### Preprocess and postprocess the data @@ -40,7 +40,7 @@ This reference architecture uses video footage of an orangutan in a tree. ## Solution details -This reference architecture is designed for workloads that are triggered by the presence of new media in Azure storage. +This reference architecture is designed for workloads that are triggered by the presence of new media in Azure Storage. Processing involves the following steps: @@ -83,7 +83,7 @@ Security provides assurances against deliberate attacks and the abuse of your va #### Restrict access to Azure Blob Storage -In this reference architecture, Azure Blob Storage is the main storage component that needs to be protected. The baseline deployment shown in the GitHub repo uses storage account keys to access the blob storage. For further control and protection, consider using a [shared access signature (SAS)][storage-sas-overview] instead. This grants limited access to objects in storage, without needing to hard code the account keys or save them in plaintext. This approach is especially useful because account keys are visible in plaintext inside of Logic App's designer interface. Using an SAS also helps to ensure that the storage account has proper governance, and that access is granted only to the people intended to have it. +In this reference architecture, Azure Blob Storage is the main storage component that needs to be protected. The baseline deployment shown in the GitHub repo uses storage account keys to access the blob storage. For further control and protection, consider using a [Shared Access Signature (SAS)][storage-sas-overview] instead. This grants limited access to objects in storage, without needing to hard code the account keys or save them in plaintext. 
This approach is especially useful because account keys are visible in plaintext inside the logic app designer interface. Using an SAS also helps to ensure that the storage account has proper governance, and that access is granted only to the people intended to have it. For scenarios with more sensitive data, make sure that all of your storage keys are protected, because these keys grant full access to all input and output data from the workload. @@ -99,7 +99,7 @@ When deploying your Machine Learning compute cluster, you can configure your clu In scenarios where there are multiple users, make sure that sensitive data is protected against malicious activity. If other users are given access to this deployment to customize the input data, note the following precautions and considerations: -- Use [Azure role-based access control (Azure RBAC)][rbac] to limit users' access to only the resources they need. +- Use [Azure role-based access control (Azure RBAC)][rbac] to limit users' access to only the resources they need. - Provision two separate storage accounts. Store input and output data in the first account. External users can be given access to this account. Store executable scripts and output log files in the other account. External users should not have access to this account. This separation ensures that external users can't modify any executable files (to inject malicious code), and don't have access to log files, which could hold sensitive information. - Malicious users can perform a [DDoS attack][ddos] on the job queue or inject malformed poison messages in the job queue, causing the system to lock up or causing dequeuing errors. @@ -113,9 +113,9 @@ The Azure Machine Learning Compute cluster size can automatically scale up and d For work that doesn't require immediate processing, configure autoscale so the default state (minimum) is a cluster of zero nodes. 
With this configuration, the cluster starts with zero nodes and only scales up when it detects jobs in the queue. If the batch scoring process happens only a few times a day or less, this setting results in significant cost savings. -Autoscaling may not be appropriate for batch jobs that happen too close to each other. The time that it takes for a cluster to spin up and spin down also incur a cost, so if a batch workload begins only a few minutes after the previous job ends, it might be more cost effective to keep the cluster running between jobs. +Autoscaling might not be appropriate for batch jobs that happen too close to each other. The time that it takes for a cluster to spin up and spin down also incurs a cost, so if a batch workload begins only a few minutes after the previous job ends, it might be more cost effective to keep the cluster running between jobs. -Azure Machine Learning Compute also supports low-priority virtual machines, which allows you to run your computation on discounted virtual machines, with the caveat that they may be preempted at any time. Low-priority virtual machines are ideal for non-critical batch scoring workloads. +Azure Machine Learning Compute also supports low-priority virtual machines, which allow you to run your computation on discounted virtual machines, with the caveat that they might be preempted at any time. Low-priority virtual machines are ideal for non-critical batch scoring workloads. ### Monitor batch jobs @@ -137,11 +137,11 @@ You can also deploy a batch scoring architecture for deep learning models by usi ## Contributors -*This article is maintained by Microsoft. It was originally written by the following contributors.* +*This article is maintained by Microsoft. 
It was originally written by the following contributors.* Principal author: - - [Jian Tang](https://www.linkedin.com/in/jian-tang-9739a814/) | Program Manager II +- [Jian Tang](https://www.linkedin.com/in/jian-tang-9739a814/) | Program Manager II *To see non-public LinkedIn profiles, sign in to LinkedIn.*