computer-vision-how-to-install-containers.md

title: Azure AI Vision 3.2 GA Read OCR container
titleSuffix: Azure AI services
description: Use the Read 3.2 OCR containers from Azure AI Vision to extract text from images and documents, on-premises.
author: PatrickFarley
manager: nitinme
ms.service: azure-ai-vision
ms.topic: how-to
ms.date: 06/26/2024
ms.author: pafarley
keywords: on-premises, OCR, Docker, container

Install Azure AI Vision 3.2 GA Read OCR container

Containers let you run the Azure AI Vision APIs in your own environment and can help you meet specific security and data governance requirements. In this article, you'll learn how to download, install, and run the Azure AI Vision Read (OCR) container.

The Read container allows you to extract printed and handwritten text from images and documents in JPEG, PNG, BMP, PDF, and TIFF file formats. For more information on the Read service, see the Read API how-to guide.

What's new

The 3.2-model-2022-04-30 GA version of the Read container is available with support for 164 languages and other enhancements. If you're an existing customer, follow the download instructions to get started.

The Read 3.2 OCR container is the latest GA model and provides:

  • New models for enhanced accuracy.
  • Support for multiple languages within the same document.
  • Support for a total of 164 languages. See the full list of OCR-supported languages.
  • A single operation for both documents and images.
  • Support for larger documents and images.
  • Confidence scores.
  • Support for documents with both print and handwritten text.
  • Ability to extract text from only selected page(s) in a document.
  • Option to change the text line output order from the default to a more natural reading order (Latin languages only).
  • Classification of each text line as handwritten or not (Latin languages only).

If you're using the Read 2.0 container today, see the migration guide to learn about changes in the new versions.

Prerequisites

You must meet the following prerequisites before using the containers:

Required Purpose
Docker Engine You need the Docker Engine installed on a host computer. Docker provides packages that configure the Docker environment on macOS, Windows, and Linux. For a primer on Docker and container basics, see the Docker overview.

Docker must be configured to allow the containers to connect with and send billing data to Azure.

On Windows, Docker must also be configured to support Linux containers.

Familiarity with Docker You should have a basic understanding of Docker concepts, like registries, repositories, containers, and container images, as well as knowledge of basic docker commands.
Computer Vision resource In order to use the container, you must have:

A Computer Vision resource and the associated API key and endpoint URI. Both values are available on the Overview and Keys pages for the resource and are required to start the container.

{API_KEY}: One of the two available resource keys on the Keys page

{ENDPOINT_URI}: The endpoint as provided on the Overview page

If you don't have an Azure subscription, create a free account before you begin.

[!INCLUDE Gathering required container parameters]

Host computer requirements

[!INCLUDE Host Computer requirements]

Advanced Vector Extension support

The host computer is the computer that runs the Docker container. The host must support Advanced Vector Extensions 2 (AVX2). You can check for AVX2 support on Linux hosts with the following command:

grep -q avx2 /proc/cpuinfo && echo AVX2 supported || echo No AVX2 support detected

Warning

The host computer is required to support AVX2. The container will not function correctly without AVX2 support.
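If host checks are part of a provisioning script, the same test can be written in Python. The following is a minimal sketch and not part of the container tooling: the helper name `has_avx2` is hypothetical, and it assumes a Linux host whose `/proc/cpuinfo` lists CPU features on `flags` lines.

```python
def has_avx2(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line of /proc/cpuinfo content lists avx2."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Match the exact token so that similarly named flags are not counted.
        if key.strip() == "flags" and "avx2" in value.split():
            return True
    return False
```

On a Linux host, pass the function the contents of `/proc/cpuinfo`, for example `has_avx2(open("/proc/cpuinfo").read())`.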

Container requirements and recommendations

[!INCLUDE Container requirements and recommendations]

Get the container image

The Azure AI Vision Read OCR container image can be found on the mcr.microsoft.com container registry syndicate. It resides within the azure-cognitive-services/vision repository and is named read. The fully qualified container image name is mcr.microsoft.com/azure-cognitive-services/vision/read.

To use the latest version of the container, you can use the latest tag. You can also find a full list of tags on the MCR.

The following container images for Read are available.

Container | Container Registry / Repository / Image Name | Tags
Read 3.2 GA | mcr.microsoft.com/azure-cognitive-services/vision/read:3.2-model-2022-04-30 | latest, 3.2, 3.2-model-2022-04-30

Use the docker pull command to download a container image.

docker pull mcr.microsoft.com/azure-cognitive-services/vision/read:3.2-model-2022-04-30

[!INCLUDE Tip for using docker list]

How to use the container

Once the container is on the host computer, use the following process to work with the container.

  1. Run the container, with the required billing settings. More examples of the docker run command are available.
  2. Query the container's prediction endpoint.

Run the container

Use the docker run command to run the container. Refer to gather required parameters for details on how to get the {ENDPOINT_URI} and {API_KEY} values.

Examples of the docker run command are available.

docker run --rm -it -p 5000:5000 --memory 16g --cpus 8 \
mcr.microsoft.com/azure-cognitive-services/vision/read:3.2-model-2022-04-30 \
Eula=accept \
Billing={ENDPOINT_URI} \
ApiKey={API_KEY}

The above command:

  • Runs the latest GA Read OCR container from the container image.
  • Allocates 8 CPU cores and 16 gigabytes (GB) of memory.
  • Exposes TCP port 5000 and allocates a pseudo-TTY for the container.
  • Automatically removes the container after it exits. The container image is still available on the host computer.

You can alternatively run the container using environment variables:

docker run --rm -it -p 5000:5000 --memory 16g --cpus 8 \
--env Eula=accept \
--env Billing={ENDPOINT_URI} \
--env ApiKey={API_KEY} \
mcr.microsoft.com/azure-cognitive-services/vision/read:3.2-model-2022-04-30

More examples of the docker run command are available.

Important

The Eula, Billing, and ApiKey options must be specified to run the container; otherwise, the container won't start. For more information, see Billing.
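When the container is started from a script, it can help to verify the three required options before invoking Docker. The following Python sketch is illustrative only: the helper `build_docker_run_args` is hypothetical, and the resource limits and image tag simply mirror the command shown earlier in this section.

```python
import shlex

IMAGE = "mcr.microsoft.com/azure-cognitive-services/vision/read:3.2-model-2022-04-30"
REQUIRED_OPTIONS = ("Eula", "Billing", "ApiKey")


def build_docker_run_args(settings, image=IMAGE, memory="16g", cpus=8, port=5000):
    """Build the docker run argument list, failing early if a required option is missing."""
    missing = [name for name in REQUIRED_OPTIONS if not settings.get(name)]
    if missing:
        raise ValueError("Missing required container options: " + ", ".join(missing))
    args = [
        "docker", "run", "--rm", "-it",
        "-p", f"{port}:5000",
        "--memory", memory,
        "--cpus", str(cpus),
        image,
    ]
    # Container options follow the image name as Name=value pairs.
    args += [f"{name}={value}" for name, value in settings.items()]
    return args


def format_command(args):
    """Render the argument list as a single shell-quoted command line."""
    return shlex.join(args)
```

For example, `format_command(build_docker_run_args({"Eula": "accept", "Billing": endpoint_uri, "ApiKey": api_key}))` produces a command line equivalent to the first example above, where `endpoint_uri` and `api_key` stand for your own {ENDPOINT_URI} and {API_KEY} values.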

If you're using Azure Storage to store images for processing, you can create a connection string to use when calling the container.

To find your connection string:

  1. Navigate to Storage accounts on the Azure portal, and find your account.
  2. Select Access keys in the left navigation list.
  3. Your connection string is located below Connection string.

[!INCLUDE Running multiple containers on the same host]

[!INCLUDE Container API documentation]

Query the container's prediction endpoint

The container provides REST-based query prediction endpoint APIs.

Use the host, http://localhost:5000, for container APIs. You can view the Swagger path at: http://localhost:5000/swagger/.

Asynchronous read

You can use the POST /vision/v3.2/read/analyze and GET /vision/v3.2/read/operations/{operationId} operations in concert to asynchronously read an image, similar to how the Azure AI Vision service uses those corresponding REST operations. The asynchronous POST method will return an operationId that is used as the identifier to the HTTP GET request.

From the Swagger UI, select Analyze to expand it in the browser. Then select Try it out > Choose file. In this example, we'll use the following image:

[Image: tabs vs spaces]

When the asynchronous POST has run successfully, it returns an HTTP 202 status code. As part of the response, there is an operation-location header that holds the result endpoint for the request.

 content-length: 0
 date: Fri, 04 Sep 2020 16:23:01 GMT
 operation-location: http://localhost:5000/vision/v3.2/read/operations/a527d445-8a74-4482-8cb3-c98a65ec7ef9
 server: Kestrel
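The same submit-and-poll flow can be scripted outside the Swagger UI. The following Python sketch uses only the standard library and is illustrative, not an official client. It assumes the container accepts a binary image upload with the `application/octet-stream` content type, and it treats any status other than `succeeded` or `failed` as "still processing". The function names, the polling interval, and the attempt limit are hypothetical choices.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:5000"  # container host used in this article


def submit_read(image_bytes, base_url=BASE_URL):
    """POST an image and return the operation-location URL from the 202 response."""
    request = urllib.request.Request(
        f"{base_url}/vision/v3.2/read/analyze",
        data=image_bytes,
        # Assumption: binary upload; adjust if you send a different payload.
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        if response.status != 202:
            raise RuntimeError(f"Expected HTTP 202, got {response.status}")
        return response.headers["operation-location"]


def get_json(url):
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def wait_for_result(operation_url, fetch_json=get_json, interval=1.0,
                    max_attempts=60, sleep=time.sleep):
    """Poll the operation URL until the read succeeds, fails, or times out."""
    for _ in range(max_attempts):
        result = fetch_json(operation_url)
        status = str(result.get("status", "")).lower()
        if status == "succeeded":
            return result
        if status == "failed":
            raise RuntimeError("Read operation failed")
        sleep(interval)  # any other status is treated as still in progress
    raise TimeoutError(f"No result after {max_attempts} attempts")
```

A typical call sequence is `wait_for_result(submit_read(image_bytes))`, where `image_bytes` is the content of an image file read in binary mode. The `fetch_json` and `sleep` parameters exist so the polling loop can be exercised without a running container.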

The operation-location is the fully qualified URL and is accessed via an HTTP GET. Here is the JSON response from executing the operation-location URL from the preceding image:

{
  "status": "succeeded",
  "createdDateTime": "2021-02-04T06:32:08.2752706+00:00",
  "lastUpdatedDateTime": "2021-02-04T06:32:08.7706172+00:00",
  "analyzeResult": {
    "version": "3.2.0",
    "readResults": [
      {
        "page": 1,
        "angle": 2.1243,
        "width": 502,
        "height": 252,
        "unit": "pixel",
        "lines": [
          {
            "boundingBox": [
              58,
              42,
              314,
              59,
              311,
              123,
              56,
              121
            ],
            "text": "Tabs vs",
            "appearance": {
              "style": {
                "name": "handwriting",
                "confidence": 0.96
              }
            },
            "words": [
              {
                "boundingBox": [
                  68,
                  44,
                  225,
                  59,
                  224,
                  122,
                  66,
                  123
                ],
                "text": "Tabs",
                "confidence": 0.933
              },
              {
                "boundingBox": [
                  241,
                  61,
                  314,
                  72,
                  314,
                  123,
                  239,
                  122
                ],
                "text": "vs",
                "confidence": 0.977
              }
            ]
          },
          {
            "boundingBox": [
              286,
              171,
              415,
              165,
              417,
              197,
              287,
              201
            ],
            "text": "paces",
            "appearance": {
              "style": {
                "name": "handwriting",
                "confidence": 0.746
              }
            },
            "words": [
              {
                "boundingBox": [
                  286,
                  179,
                  404,
                  166,
                  405,
                  198,
                  290,
                  201
                ],
                "text": "paces",
                "confidence": 0.938
              }
            ]
          }
        ]
      }
    ]
  }
}
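To work with a response of this shape in code, you usually walk `analyzeResult.readResults`, then each page's `lines` and `words`. The following Python sketch shows one way to flatten the structure into one row per text line. The helper name `summarize_lines` and the choice to report the lowest word confidence per line are illustrative assumptions; only the field names come from the response above.

```python
def summarize_lines(result):
    """Flatten a Read result into one dict per recognized text line."""
    rows = []
    pages = result.get("analyzeResult", {}).get("readResults", [])
    for page in pages:
        for line in page.get("lines", []):
            # 'appearance' may be absent, so fall back to empty dicts.
            style = line.get("appearance", {}).get("style", {})
            confidences = [
                word["confidence"]
                for word in line.get("words", [])
                if "confidence" in word
            ]
            rows.append({
                "page": page.get("page"),
                "text": line.get("text", ""),
                "style": style.get("name"),
                "style_confidence": style.get("confidence"),
                "min_word_confidence": min(confidences) if confidences else None,
            })
    return rows
```

Applied to the response above, the function returns two rows, one for "Tabs vs" and one for "paces", both with the style name handwriting.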

Important

If you deploy multiple Read OCR containers behind a load balancer, for example, under Docker Compose or Kubernetes, you must have an external cache. Because the processing container and the GET request container might not be the same, an external cache stores the results and shares them across containers. For details about cache settings, see Configure Azure AI Vision Docker containers.

Synchronous read

You can use the following operation to synchronously read an image.

POST /vision/v3.2/read/syncAnalyze

The API returns a JSON response only after the image has been read in its entirety. The only exception is when an error occurs, in which case the following JSON is returned:

{
    "status": "Failed"
}
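A client should check for this failure payload before reading any results. The following Python sketch is a minimal illustration using the standard library. It assumes a binary image upload with the `application/octet-stream` content type, and the function names are hypothetical. If the container signals an error with a non-2xx HTTP status, `urllib` raises `urllib.error.HTTPError` before the status check is reached.

```python
import json
import urllib.request

SYNC_URL = "http://localhost:5000/vision/v3.2/read/syncAnalyze"


def check_sync_result(result):
    """Return the result unchanged, or raise if the container reported failure."""
    if str(result.get("status", "")).lower() == "failed":
        raise RuntimeError("Synchronous read failed")
    return result


def sync_read(image_bytes, url=SYNC_URL):
    """POST an image and block until the container returns the full JSON result."""
    request = urllib.request.Request(
        url,
        data=image_bytes,
        # Assumption: binary upload; adjust if you send a different payload.
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return check_sync_result(json.load(response))
```

Keeping the status check in its own function means the same guard can be reused for results retrieved through the asynchronous operations.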

The JSON response object has the same object graph as the asynchronous version. If you're a JavaScript user and want type safety, consider using TypeScript to cast the JSON response.

For an example use-case, see the TypeScript sandbox here and select Run to visualize its ease-of-use.

Run the container disconnected from the internet

[!INCLUDE configure-disconnected-container]

Stop the container

[!INCLUDE How to stop the container]

Troubleshooting

If you run the container with an output mount and logging enabled, the container generates log files that are helpful to troubleshoot issues that happen while starting or running the container.

[!INCLUDE Azure AI services FAQ note]

[!INCLUDE Diagnostic container]

Billing

The Azure AI containers send billing information to Azure, using the corresponding resource on your Azure account.

[!INCLUDE Container's Billing Settings]

For more information about these options, see Configure containers.

Summary

In this article, you learned the concepts and workflow for downloading, installing, and running Azure AI Vision containers. In summary:

  • Azure AI Vision provides a Linux container for Docker, encapsulating Read.
  • The read container image requires an application to run it.
  • Container images run in Docker.
  • You can use either the REST API or SDK to call operations in Read OCR containers by specifying the host URI of the container.
  • You must specify billing information when instantiating a container.

Important

Azure AI containers are not licensed to run without being connected to Azure for metering. Customers need to enable the containers to communicate billing information with the metering service at all times. Azure AI containers do not send customer data (for example, the image or text that is being analyzed) to Microsoft.

Next steps