---
title: "How to guide: create and compose custom models with Document Intelligence (formerly Form Recognizer)"
titleSuffix: Azure AI services
description: Learn how to create, use, and manage Document Intelligence custom and composed models
author: laujan
manager: nitinme
ms.service: azure-ai-document-intelligence
ms.custom: ignite-2023
ms.topic: how-to
ms.date: 05/23/2024
ms.author: lajanuar
---

Compose custom models

::: moniker range="doc-intel-4.0.0" [!INCLUDE applies to v4.0] ::: moniker-end

::: moniker range="doc-intel-3.1.0" [!INCLUDE applies to v3.1] ::: moniker-end

::: moniker range="doc-intel-3.0.0" [!INCLUDE applies to v3.0] ::: moniker-end

::: moniker range="doc-intel-2.1.0" [!INCLUDE applies to v2.1] ::: moniker-end

::: moniker range=">=doc-intel-3.0.0"

A composed model is created by taking a collection of custom models and assigning them to a single model ID. You can assign up to 200 trained custom models to a single composed model ID. When a document is submitted to a composed model, the service performs a classification step to decide which custom model accurately represents the form presented for analysis. Composed models are useful when you've trained several models and want to group them to analyze similar form types. For example, your composed model might include custom models trained to analyze your supply, equipment, and furniture purchase orders. Instead of manually trying to select the appropriate model, you can use a composed model to determine the appropriate custom model for each analysis and extraction.

To learn more, see Composed custom models.

In this article, you learn how to create and use composed custom models to analyze your forms and documents.

Prerequisites

To get started, you need the following resources:

  • An Azure subscription. You can create a free Azure subscription.

  • A Document Intelligence instance. Once you have your Azure subscription, create a Document Intelligence resource in the Azure portal to get your key and endpoint. If you have an existing Document Intelligence resource, navigate directly to your resource page. You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

    1. After the resource deploys, select Go to resource.

    2. Copy the Keys and Endpoint values from the Azure portal and paste them in a convenient location, such as Microsoft Notepad. You need the key and endpoint values to connect your application to the Document Intelligence API.

    :::image type="content" source="../media/containers/keys-and-endpoint.png" alt-text="Still photo showing how to access resource key and endpoint URL.":::

    [!TIP] For more information, see create a Document Intelligence resource.

  • An Azure storage account. If you don't know how to create an Azure storage account, follow the Azure Storage quickstart for Azure portal.

Create your custom models

First, you need a set of custom models to compose. You can use the Document Intelligence Studio, REST API, or client-library SDKs. The steps are as follows:

Assemble your training dataset

Building a custom model begins with establishing your training dataset. You need a minimum of five completed forms of the same type for your sample dataset. They can be of different file types (jpg, png, pdf, tiff) and contain both text and handwriting. Your forms must follow the input requirements for Document Intelligence.

Tip

Follow these tips to optimize your data set for training:

  • If possible, use text-based PDF documents instead of image-based documents. Scanned PDFs are handled as images.
  • For filled-in forms, use examples that have all of their fields filled in.
  • Use forms with different values in each field.
  • If your form images are of lower quality, use a larger data set (10-15 images, for example).

See Build a training data set for tips on how to collect your training documents.
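Before uploading, it can help to confirm that a local folder meets the minimums described in this section. The following is a minimal sketch, not an official tool: the function name `check_training_folder` is hypothetical, and the accepted extensions are limited to the file types named in this article.

```python
from pathlib import Path

# File types named in this article; treating this as the full accepted set is an assumption.
ALLOWED_EXTENSIONS = {".jpg", ".png", ".pdf", ".tiff"}
MINIMUM_FORMS = 5  # minimum number of completed forms of the same type


def check_training_folder(folder):
    """Return (usable_files, problems) for a local folder of sample forms."""
    usable = sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in ALLOWED_EXTENSIONS
    )
    problems = []
    if len(usable) < MINIMUM_FORMS:
        problems.append(
            f"Found {len(usable)} usable forms; at least {MINIMUM_FORMS} are needed."
        )
    return usable, problems
```

The check only counts files and extensions. It can't tell whether the forms share the same structure or meet the other input requirements.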

Upload your training dataset

When you've gathered a set of training documents, you need to upload your training data to an Azure blob storage container.

If you want to use manually labeled data, you have to upload the .labels.json and .ocr.json files that correspond to your training documents.
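If you label data manually, a quick local check can show which training documents are missing their companion files before you upload. This sketch assumes the companion files sit next to each document and are named by appending `.labels.json` and `.ocr.json` to the full document file name (for example, `form1.pdf.labels.json`). The function name and that naming rule are assumptions for illustration.

```python
from pathlib import Path

COMPANION_SUFFIXES = (".labels.json", ".ocr.json")


def find_missing_label_files(folder, document_suffixes=(".pdf", ".jpg", ".png", ".tiff")):
    """Map each training document name to the companion files it lacks."""
    folder = Path(folder)
    missing = {}
    for doc in sorted(folder.iterdir()):
        # Skip the companion .json files themselves and any unrelated files.
        if not doc.is_file() or doc.suffix.lower() not in document_suffixes:
            continue
        # Assumed naming: form1.pdf -> form1.pdf.labels.json and form1.pdf.ocr.json
        absent = [
            suffix
            for suffix in COMPANION_SUFFIXES
            if not (folder / (doc.name + suffix)).exists()
        ]
        if absent:
            missing[doc.name] = absent
    return missing
```

An empty result means every document found has both companion files. A non-empty result is a list to review rather than necessarily an error.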

Train your custom model

When you train your model with labeled data, the model uses supervised learning to extract values of interest, using the labeled forms you provide. Labeled data results in better-performing models and can produce models that work with complex forms or forms containing values without keys.

Document Intelligence uses the prebuilt-layout model API to learn the expected sizes and positions of typeface and handwritten text elements and extract tables. Then it uses user-specified labels to learn the key/value associations and tables in the documents. We recommend that you use five manually labeled forms of the same type (same structure) to get started with training a new model. Then, add more labeled data, as needed, to improve the model accuracy. Document Intelligence enables training a model to extract key-value pairs and tables using supervised learning capabilities.

To create custom models, start with configuring your project:

  1. From the Studio homepage, select Create new from the Custom model card.

  2. Use the ➕ Create a project command to start the new project configuration wizard.

  3. Enter project details, select the Azure subscription and resource, and the Azure Blob storage container that contains your data.

  4. Review and submit your settings to create the project.

:::image type="content" source="../media/studio/create-project.gif" alt-text="Animation showing create a custom project in Document Intelligence Studio.":::

While creating your custom models, you may need to extract data collections from your documents. The collections may appear in one of two formats, using tables as the visual pattern:

  • Dynamic or variable count of values (rows) for a given set of fields (columns)

  • Specific collection of values for a given set of fields (columns and/or rows)

See Document Intelligence Studio: labeling as tables

Training with labels leads to better performance in some scenarios. To train with labels, you need to have special label information files (<filename>.pdf.labels.json) in your blob storage container alongside the training documents.

Label files contain key-value associations that a user has entered manually. They're needed for labeled data training, but not every source file needs to have a corresponding label file. Source files without labels are treated as ordinary training documents. We recommend five or more labeled files for reliable training. You can use a UI tool like Document Intelligence Studio to generate these files.

Once you have your label files, you can include them by calling the training method with the useLabelFile parameter set to true.

:::image type="content" source="../media/studio/rest-use-labels.png" alt-text="Screenshot showing the useLabelFile optional parameter.":::
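As a rough illustration, a training request body that sets useLabelFile might be assembled like this. Only the `useLabelFile` name comes from this article. The `source` field name, the helper name, and the placeholder container URL are assumptions, so confirm them against the REST API reference for your API version.

```python
import json


def build_train_request_body(container_sas_url, use_label_file=True):
    """Serialize a training request body that points at a blob container."""
    body = {
        # Assumed field name for the container URL that holds the training documents.
        "source": container_sas_url,
        # Parameter named in this article: train with the label files in the container.
        "useLabelFile": use_label_file,
    }
    return json.dumps(body)


# Placeholder URL; substitute your own container URL and SAS token.
example_body = build_train_request_body(
    "https://<storage-account>.blob.core.windows.net/<container>?<sas-token>"
)
```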

Training with labels leads to better performance in some scenarios. To train with labels, you need to have special label information files (<filename>.pdf.labels.json) in your blob storage container alongside the training documents. Once you have them, you can call the training method with the useTrainingLabels parameter set to true.

| Language | Method |
|---|---|
| C# | StartBuildModel |
| Java | beginBuildModel |
| JavaScript | beginBuildModel |
| Python | begin_build_document_model |

Create a composed model

Note

The create compose model operation is only available for custom models trained with labels. Attempting to compose unlabeled models will produce an error.

With the create compose model operation, you can assign up to 100 trained custom models to a single model ID. When you analyze documents with a composed model, Document Intelligence first classifies the form you submitted, then chooses the best matching assigned model, and returns results for that model. This operation is useful when incoming forms may belong to one of several templates.

Once the training process has successfully completed, you can begin to build your composed model. Here are the steps for creating and using composed models:

Gather your model IDs

When you train models using the Document Intelligence Studio, the model ID is located in the models menu under a project:

:::image type="content" source="../media/studio/composed-model.png" alt-text="Screenshot of model configuration window in Document Intelligence Studio.":::

Compose your custom models

  1. Select a custom models project.

  2. In the project, select the Models menu item.

  3. From the resulting list of models, select the models you wish to compose.

  4. Choose the Compose button from the upper-left corner.

  5. In the pop-up window, name your newly composed model and select Compose.

  6. When the operation completes, your newly composed model appears in the list.

  7. Once the model is ready, use the Test command to validate it with your test documents and observe the results.

Analyze documents

The custom model Analyze operation requires you to provide the modelID in the call to Document Intelligence. You should provide the composed model ID for the modelID parameter in your applications.

:::image type="content" source="../media/studio/composed-model-id.png" alt-text="Screenshot of a composed model ID in Document Intelligence Studio.":::

Manage your composed models

You can manage your custom models throughout their life cycles:

  • Test and validate new documents.
  • Download your model to use in your applications.
  • Delete your model when its life cycle is complete.

:::image type="content" source="../media/studio/compose-manage.png" alt-text="Screenshot of a composed model in the Document Intelligence Studio":::

Compose your custom models

The compose model API accepts a list of model IDs to be composed.

:::image type="content" source="../media/compose-model-request-body.png" alt-text="Screenshot of compose model request.":::
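The compose request can be sketched with the Python standard library as follows. This builds the request without sending it. The URL path, the `api-version` value, the `modelId`, `componentModels`, and `description` field names, and the helper name are assumptions based on the v3.0 REST API shape, so verify them against the API reference before use.

```python
import json
import urllib.request

API_VERSION = "2022-08-31"  # assumed API version; adjust to the one you target


def build_compose_request(endpoint, key, composed_model_id, component_model_ids,
                          description=None):
    """Build (but don't send) a compose request for a list of model IDs."""
    if not component_model_ids:
        raise ValueError("Provide at least one component model ID.")
    body = {
        "modelId": composed_model_id,
        # The list of model IDs to be composed, one entry per trained custom model.
        "componentModels": [{"modelId": model_id} for model_id in component_model_ids],
    }
    if description:
        body["description"] = description
    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels:compose"
        f"?api-version={API_VERSION}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit it. Keeping construction separate makes the body easy to inspect first.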

Analyze documents

To make an Analyze document request, use a unique model name in the request parameters.

:::image type="content" source="../media/custom-model-analyze-request.png" alt-text="Screenshot of a custom model request URL.":::
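The following sketch shows how the model name might be placed in the request URL, and how a client might wait for the analysis to finish. It assumes that analysis is a long-running operation whose status you poll until it is terminal. The URL path, the `api-version` value, the status strings, and both helper names are assumptions, not details taken from this article.

```python
import time
import urllib.parse


def build_analyze_url(endpoint, model_id, api_version="2022-08-31"):
    """Return the analyze URL for a custom or composed model ID (assumed path)."""
    quoted_id = urllib.parse.quote(model_id, safe="")
    return (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
        f"{quoted_id}:analyze?api-version={api_version}"
    )


def wait_for_result(fetch_status, interval_seconds=1.0, max_attempts=60,
                    sleep=time.sleep):
    """Call fetch_status() until it reports a terminal status or attempts run out.

    fetch_status is any callable that returns the parsed JSON status payload,
    for example a function that issues a GET request to the operation URL.
    """
    for _ in range(max_attempts):
        payload = fetch_status()
        # Assumed terminal status values.
        if payload.get("status") in ("succeeded", "failed"):
            return payload
        sleep(interval_seconds)
    raise TimeoutError("Analysis did not finish within the allowed attempts.")
```

Injecting `fetch_status` and `sleep` keeps the polling logic independent of any HTTP library, which also makes it simple to exercise with canned responses.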

Manage your composed models

You can manage your custom models throughout the development cycle, including copying, listing, and deleting them.

Create a composed model

You can use the programming language of your choice to create a composed model:

| Programming language | Code sample |
|---|---|
| C# | Model compose |
| Java | Model compose |
| JavaScript | Compose model |
| Python | Create composed model |

Analyze documents

Once you've built your composed model, you can use it to analyze forms and documents. Use your composed model ID and let the service decide which of your aggregated custom models fits best according to the document provided.

| Programming language | Code sample |
|---|---|
| C# | Analyze a document with a custom/composed model using model ID |
| Java | Analyze a document with a custom/composed model using model ID |
| JavaScript | Analyze a document with a custom/composed model using model ID |
| Python | Analyze a document with a custom/composed model using model ID |

Manage your composed models

You can manage a custom model at each stage in its life cycle. You can copy a custom model between resources, view a list of all custom models under your subscription, retrieve information about a specific custom model, and delete custom models from your account.

| Programming language | Code sample |
|---|---|
| C# | Copy a custom model between Document Intelligence resources |
| Java | Copy a custom model between Document Intelligence resources |
| JavaScript | Copy a custom model between Document Intelligence resources |
| Python | Copy a custom model between Document Intelligence resources |

Great! You've learned the steps to create custom and composed models and use them in your Document Intelligence projects and applications.

Next steps

Try one of our Document Intelligence quickstarts:

[!div class="nextstepaction"] Document Intelligence Studio

[!div class="nextstepaction"] REST API

[!div class="nextstepaction"] C#

[!div class="nextstepaction"] Java

[!div class="nextstepaction"] JavaScript

[!div class="nextstepaction"] Python

:::moniker-end

::: moniker range="doc-intel-2.1.0"

Document Intelligence uses advanced machine-learning technology to detect and extract information from document images and return the extracted data in a structured JSON output. With Document Intelligence, you can train standalone custom models or combine custom models to create composed models.

  • Custom models. Document Intelligence custom models enable you to analyze and extract data from forms and documents specific to your business. Custom models are trained for your distinct data and use cases.

  • Composed models. A composed model is created by taking a collection of custom models and assigning them to a single model that encompasses your form types. When a document is submitted to a composed model, the service performs a classification step to decide which custom model accurately represents the form presented for analysis.

In this article, you learn how to create Document Intelligence custom and composed models using our Document Intelligence Sample Labeling tool, REST APIs, or client-library SDKs.

Sample Labeling tool

Try extracting data from custom forms using our Sample Labeling tool. You need the following resources:

  • An Azure subscription—you can create one for free

  • A Document Intelligence instance in the Azure portal. You can use the free pricing tier (F0) to try the service. After your resource deploys, select Go to resource to get your key and endpoint.

:::image type="content" source="../media/containers/keys-and-endpoint.png" alt-text="Screenshot of keys and endpoint location in the Azure portal.":::

[!div class="nextstepaction"] Try it

In the Document Intelligence UI:

  1. Select Use Custom to train a model with labels and get key value pairs.

    :::image type="content" source="../media/label-tool/fott-use-custom.png" alt-text="Screenshot of the FOTT tool select custom model option.":::

  2. In the next window, select New project:

    :::image type="content" source="../media/label-tool/fott-new-project.png" alt-text="Screenshot of the FOTT tool select new project option.":::

Create your models

The steps for building, training, and using custom and composed models are as follows:

Assemble your training dataset

Building a custom model begins with establishing your training dataset. You need a minimum of five completed forms of the same type for your sample dataset. They can be of different file types (jpg, png, pdf, tiff) and contain both text and handwriting. Your forms must follow the input requirements for Document Intelligence.

Upload your training dataset

You need to upload your training data to an Azure blob storage container. If you don't know how to create an Azure storage account with a container, see Azure Storage quickstart for Azure portal.

Train your custom model

You train your model with labeled datasets. Labeled datasets rely on the prebuilt-layout API and add human input, such as your specific labels and field locations. Start with at least five completed forms of the same type for your labeled training data.

When you train with labeled data, the model uses supervised learning to extract values of interest, using the labeled forms you provide. Labeled data results in better-performing models and can produce models that work with complex forms or forms containing values without keys.

Document Intelligence uses the Layout API to learn the expected sizes and positions of typeface and handwritten text elements and extract tables. Then it uses user-specified labels to learn the key/value associations and tables in the documents. We recommend that you use five manually labeled forms of the same type (same structure) to get started when training a new model. Add more labeled data as needed to improve the model accuracy. Document Intelligence enables training a model to extract key value pairs and tables using supervised learning capabilities.

Get started with Train with labels

[!VIDEO https://learn.microsoft.com/Shows/Docs-Azure/Azure-Form-Recognizer/player]

Create a composed model

Note

Model Compose is only available for custom models trained with labels. Attempting to compose unlabeled models will produce an error.

With the Model Compose operation, you can assign up to 200 trained custom models to a single model ID. When you call Analyze with the composed model ID, Document Intelligence first classifies the form you submitted, chooses the best matching assigned model, and then returns results for that model. This operation is useful when incoming forms may belong to one of several templates.

Using the Document Intelligence Sample Labeling tool, the REST API, or the client-library SDKs, follow these steps to set up a composed model:

  1. Gather your custom model IDs
  2. Compose your custom models

Gather your custom model IDs

Once the training process has successfully completed, your custom model is assigned a model ID. You can retrieve a model ID as follows:

When you train models using the Document Intelligence Sample Labeling tool, the model ID is located in the Train Result window:

:::image type="content" source="../media/fott-training-results.png" alt-text="Screenshot of training results window.":::

The REST API returns a 201 (Created) response with a Location header. The value of the last parameter in this header is the model ID for the newly trained model:

:::image type="content" source="../media/model-id.png" alt-text="Screenshot of the returned location header containing the model ID.":::
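Reading the model ID from the Location header can be sketched as follows, assuming the ID is the final path segment of the header's URL. The function name and the sample URL in the usage note are hypothetical.

```python
from urllib.parse import urlparse


def model_id_from_location(location_header):
    """Return the last path segment of a Location header URL as the model ID."""
    # urlparse drops any query string, so only the path is examined.
    path = urlparse(location_header).path.rstrip("/")
    model_id = path.rsplit("/", 1)[-1]
    if not model_id:
        raise ValueError(f"No model ID found in Location header: {location_header!r}")
    return model_id
```

For example, a header value ending in `/custom/models/11111111-2222-3333-4444-555555555555` would yield `11111111-2222-3333-4444-555555555555`.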

The client-library SDKs return a model object that can be queried to return the trained model ID:


Compose your custom models

After you've gathered your custom models corresponding to a single form type, you can compose them into a single model.

The Sample Labeling tool enables you to quickly get started training models and composing them to a single model ID.

After you have completed training, compose your models as follows:

  1. On the left rail menu, select the Model Compose icon (merging arrow).

  2. In the main window, select the models you wish to assign to a single model ID. Models with the arrows icon are already composed models.

  3. Choose the Compose button from the upper-left corner.

  4. In the pop-up window, name your newly composed model and select Compose.

When the operation completes, your newly composed model appears in the list.

:::image type="content" source="../media/custom-model-compose.png" alt-text="Screenshot of the model compose window." lightbox="../media/custom-model-compose-expanded.png":::

Using the REST API, you can make a Compose Custom Model request to create a single composed model from existing models. The request body requires a string array of your modelIds to compose and you can optionally define the modelName.
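A request body of that shape might be assembled as in this minimal sketch. The `modelIds` and `modelName` names come from the description above. The helper name and the duplicate-removal step are illustrative additions, not service requirements.

```python
import json


def build_compose_body(model_ids, model_name=None):
    """Serialize a Compose Custom Model request body."""
    # Drop repeated IDs while keeping their original order (a convenience, not a rule).
    unique_ids = list(dict.fromkeys(model_ids))
    if not unique_ids:
        raise ValueError("Provide at least one model ID to compose.")
    body = {"modelIds": unique_ids}
    if model_name:
        # Optional display name for the composed model.
        body["modelName"] = model_name
    return json.dumps(body)
```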

Use the programming language code of your choice to create a composed model that is called with a single model ID. The following links are code samples that demonstrate how to create a composed model from existing custom models:


Analyze documents with your custom or composed model

The custom form Analyze operation requires you to provide the modelID in the call to Document Intelligence. You can provide a single custom model ID or a composed model ID for the modelID parameter.

  1. On the tool's left-pane menu, select the Analyze icon (light bulb).

  2. Choose a local file or image URL to analyze.

  3. Select the Run Analysis button.

  4. The tool applies tags in bounding boxes and reports the confidence percentage for each tag.

:::image type="content" source="../media/analyze.png" alt-text="Screenshot of Document Intelligence tool analyze-a-custom-form window.":::

Using the REST API, you can make an Analyze Document request to analyze a document and extract key-value pairs and table data.

Use the programming language of your choice to analyze a form or document with a custom or composed model. You need your Document Intelligence endpoint, key, and model ID.


Test your newly trained models by analyzing forms that weren't part of the training dataset. Depending on the reported accuracy, you may want to continue training to improve the results.
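When reviewing test results, one option is to list the fields whose confidence falls below a threshold you choose. This sketch assumes the analyze result is a dictionary containing `documentResults`, each with a `fields` mapping whose entries carry a `confidence` value. That shape, the function name, and the 0.8 threshold are assumptions for illustration.

```python
def low_confidence_fields(analyze_result, threshold=0.8):
    """Return (field_name, confidence) pairs below the threshold, lowest first."""
    flagged = []
    for document in analyze_result.get("documentResults", []):
        for name, field in document.get("fields", {}).items():
            if field is None:
                # Skip fields that are present in the mapping but have no extracted value.
                continue
            confidence = field.get("confidence")
            if confidence is not None and confidence < threshold:
                flagged.append((name, confidence))
    return sorted(flagged, key=lambda item: item[1])
```

Fields that show up repeatedly in this list across test documents are reasonable candidates for additional labeled training data.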

Manage your custom models

You can manage your custom models throughout their lifecycle by viewing a list of all custom models under your subscription, retrieving information about a specific custom model, and deleting custom models from your account.

Great! You've learned the steps to create custom and composed models and use them in your Document Intelligence projects and applications.

Next steps

Learn more about the Document Intelligence client library by exploring our API reference documentation.

[!div class="nextstepaction"] Document Intelligence API reference

::: moniker-end