# Get started with Azure AI Services for Private Environments

### Introduction
Azure AI services provide **Docker containers** that let you run the same APIs on your own host, keeping data close to where it is processed. To deploy and use an Azure AI services container, the following three activities must occur:

1. The container image for the specific Azure AI services API you want to use is downloaded and deployed to a container host, such as a local Docker server, an **Azure Container Instance (ACI)**, or **Azure Kubernetes Service (AKS)**.
2. Client applications **submit data to the endpoint** provided by the containerized service, and retrieve results just as they would from an Azure AI services cloud resource in Azure.
3. Periodically, **usage metrics** for the containerized service are sent to an Azure AI services resource in Azure in order to calculate billing for the service.

![image.png](./assets/ai-services-container.png)

This architecture offers the following features and benefits:
* **Immutable infrastructure**: Enables DevOps teams to rely on a consistent, reliable set of known system functions.
* **Control over data**: Choose where Azure AI services process your data.
* **Control over model updates**: Gives you flexibility in versioning and updating the models deployed in your solutions.
* **Portable architecture**: Lets you build a portable application architecture that can be deployed on Azure, on-premises, and at the edge.
* **High throughput and low latency**: Lets Azure AI services run physically close to your application logic and data.
* **Scalability**: Containers can be scaled out with container orchestration software such as Kubernetes.

### Build Your Own Cognitive Service

Here we use **Document Intelligence** as an example to demonstrate how to install and run a container.

#### **Prerequisites**
* An active **Azure Account**
* **Azure Key Vault** for storing API keys securely (optional)

#### Step 1. Create your own **Document Intelligence** resource and get its `API Key` and `Endpoint`.
<img src="./assets/azure-ai-service-portal.png" width="900"/>

In [1]:
FORM_RECOGNIZER_ENDPOINT_URI = 'https://ces-document-intelligence.cognitiveservices.azure.com/'
FORM_RECOGNIZER_KEY = '<your-api-key>'  # paste the key from the Azure portal; never commit a real key
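
The prerequisites list Azure Key Vault as an optional way to secure API keys. A lighter-weight option for a notebook is to read the values from environment variables so that no key is stored in the file. The following is a minimal sketch of that idea; the helper `load_setting` and the environment variable names are assumptions that simply mirror the notebook's Python variable names.

```python
import os


def load_setting(name, env=None):
    """Return a required setting from the environment, failing loudly if it is missing."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Hypothetical usage (variable names are assumptions, not part of the SDK):
# FORM_RECOGNIZER_ENDPOINT_URI = load_setting("FORM_RECOGNIZER_ENDPOINT_URI")
# FORM_RECOGNIZER_KEY = load_setting("FORM_RECOGNIZER_KEY")
```

Raising an error when the variable is missing or blank surfaces a configuration problem before any request is sent.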

#### Step 2. Install the Azure AI Python SDK for **Document Intelligence**.

> You can browse and choose SDKs for other Azure AI services [here](https://learn.microsoft.com/en-us/python/api/overview/azure/cognitive-services?view=azure-python-preview).

In [2]:
!pip install azure-ai-documentintelligence==1.0.0b4
!pip install tabulate

Defaulting to user installation because normal site-packages is not writeable
Defaulting to user installation because normal site-packages is not writeable


#### Step 3. Create an Analysis Client with the `prebuilt-layout` Model

* `model_id`: Required, type `str`. Specifies the **custom model ID** or **prebuilt model ID** to use for analysis. Supported **prebuilt model IDs** are listed here: https://aka.ms/azsdk/formrecognizer/models

In [3]:
from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeResult

# Authenticate with the resource key and point the client at the resource endpoint.
credential = AzureKeyCredential(FORM_RECOGNIZER_KEY)
client = DocumentIntelligenceClient(endpoint=FORM_RECOGNIZER_ENDPOINT_URI, credential=credential)

# Send the PDF as raw bytes and block until the long-running analysis completes.
with open('./assets/attention-is-all-you-need.pdf', "rb") as f:
    poller = client.begin_analyze_document(model_id="prebuilt-layout", analyze_request=f, content_type="application/octet-stream")
    result: AnalyzeResult = poller.result()

#### Step 4. Run a Proof of Concept in the Cloud
> For more on using the **Azure AI Python SDK**, see the [official API reference](https://learn.microsoft.com/zh-tw/python/api/azure-ai-documentintelligence/azure.ai.documentintelligence.documentintelligenceclient?view=azure-python-preview).

<img src="./assets/document-layout-example.png" width="480"/>

##### a. Handwritten Content Detection

In [4]:
if result.styles and any([style.is_handwritten for style in result.styles]):
    print("Document contains handwritten content")
else:
    print("Document does not contain handwritten content")

Document contains handwritten content
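
The check above only reports whether any handwriting exists. Each style also carries `spans` (an `offset` and a `length` into the document's full text), so the handwritten text itself can be recovered. The sketch below shows the idea on stand-in objects rather than a live `result`. The helper name `handwritten_snippets` and the demo data are illustrative, and it assumes the span offsets line up with Python string indices, which holds for plain text but may not hold for text containing combined characters.

```python
from types import SimpleNamespace


def handwritten_snippets(content, styles):
    """Collect the substrings of `content` covered by styles flagged as handwritten."""
    snippets = []
    for style in styles or []:
        # is_handwritten may be None when the service reports no value.
        if not getattr(style, "is_handwritten", False):
            continue
        for span in style.spans or []:
            snippets.append(content[span.offset : span.offset + span.length])
    return snippets


# Stand-in objects that mimic the shape of `result.styles` (illustrative data only).
demo_content = "Typed title. Signed: J. Doe"
demo_styles = [
    SimpleNamespace(is_handwritten=True, spans=[SimpleNamespace(offset=21, length=6)]),
    SimpleNamespace(is_handwritten=None, spans=[SimpleNamespace(offset=0, length=12)]),
]
print(handwritten_snippets(demo_content, demo_styles))
```

With a real analysis, the call would presumably be `handwritten_snippets(result.content, result.styles)`.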


##### b. Extracting the Content of Each Page

In [5]:
for page in result.pages:
    print(f"Page #{page.page_number} has Width: {page.width} Height: {page.height} and measured with unit: {page.unit}")
    if page.lines:
        for line_idx, line in enumerate(page.lines):
            print(f" - {line_idx}: {line.content}")  # line.polygon gives the position of this content on the page

Page #1 has Width: 8.5 Height: 11 and measured with unit: LengthUnit.INCH
 - 0: arXiv:1706.03762v7 [cs.CL] 2 Aug 2023
 - 1: Provided proper attribution is provided, Google hereby grants permission to
 - 2: reproduce the tables and figures in this paper solely for use in journalistic or
 - 3: scholarly works.
 - 4: Attention Is All You Need
 - 5: Ashish Vaswani*
 - 6: Google Brain
 - 7: Noam Shazeer*
 - 8: Google Brain
 - 9: Niki Parmar*
 - 10: Google Research
 - 11: Jakob Uszkoreit*
 - 12: Google Research
 - 13: avaswani@google.com
 - 14: noam@google.com
 - 15: nikip@google.com
 - 16: usz@google.com
 - 17: Llion Jones*
 - 18: Google Research
 - 19: Aidan N. Gomez* +
 - 20: University of Toronto
 - 21: Łukasz Kaiser*
 - 22: Google Brain
 - 23: llion@google.com
 - 24: aidan@cs.toronto.edu
 - 25: lukaszkaiser@google.com
 - 26: Illia Polosukhin* *
 - 27: illia.polosukhin@gmail.com
 - 28: Abstract
 - 29: The dominant sequence transduction models are based on complex recurrent or
 - 30: conv
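
The comment in the loop above mentions `line.polygon`. In the SDK this is a flat list of alternating x and y coordinates, expressed in the page's unit. A common next step is to reduce it to an axis-aligned bounding box, for example to crop or highlight a line. The helper below is a minimal sketch; `polygon_to_bbox` is a hypothetical name and the coordinates are illustrative.

```python
def polygon_to_bbox(polygon):
    """Convert a flat [x1, y1, x2, y2, ...] polygon into (left, top, right, bottom)."""
    if not polygon or len(polygon) % 2:
        raise ValueError("polygon must hold an even, non-zero number of coordinates")
    xs, ys = polygon[0::2], polygon[1::2]
    return min(xs), min(ys), max(xs), max(ys)


# Illustrative coordinates in the page's unit (inches for the PDF above).
print(polygon_to_bbox([1.0, 1.2, 3.5, 1.2, 3.5, 1.5, 1.0, 1.5]))
```

Taking the minimum and maximum of each axis also works for slightly rotated text, at the cost of a box that is larger than the polygon itself.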

##### c. Extracting Tables from the Whole PDF

In [6]:
from tabulate import tabulate

if result.tables:
    for table_idx, table in enumerate(result.tables):
        print(f"Table # {table_idx} has {table.row_count} rows and {table.column_count} columns")
        # Place each cell's text at its (row, column) position; unfilled positions stay None.
        output = [[None for _ in range(table.column_count)] for _ in range(table.row_count)]
        for cell in table.cells:
            output[cell.row_index][cell.column_index] = cell.content
        print(tabulate(output, headers=[str(i) for i in range(table.column_count)]))

Table # 0 has 2 rows and 4 columns
0                             1                           2                             3
----------------------------  --------------------------  ----------------------------  --------------------------------
Ashish Vaswani* Google Brain  Noam Shazeer* Google Brain  Niki Parmar* Google Research  Jakob Uszkoreit* Google Research
avaswani@google.com           noam@google.com             nikip@google.com              usz@google.com
Table # 1 has 2 rows and 3 columns
0                             1                                        2
----------------------------  ---------------------------------------  ---------------------------
Llion Jones* Google Research  Aidan N. Gomez* + University of Toronto  Łukasz Kaiser* Google Brain
llion@google.com              aidan@cs.toronto.edu                     lukaszkaiser@google.com
Table # 2 has 5 rows and 4 columns
0                            1                     2                      3
------------------
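
The loop above writes each cell only at its own `(row_index, column_index)`, so a cell that spans several rows or columns leaves `None` gaps in the grid. The sketch below repeats a merged cell's text across the area it covers and serializes the grid as CSV. It runs on a stand-in table object; the helper names and demo data are illustrative, and it assumes cells expose optional `row_span` and `column_span` attributes as the SDK's table cells do.

```python
import csv
import io
from types import SimpleNamespace


def table_to_grid(table):
    """Build a row-major grid, repeating a merged cell's text across the area it spans."""
    grid = [["" for _ in range(table.column_count)] for _ in range(table.row_count)]
    for cell in table.cells:
        row_span = getattr(cell, "row_span", None) or 1
        col_span = getattr(cell, "column_span", None) or 1
        for r in range(cell.row_index, min(cell.row_index + row_span, table.row_count)):
            for c in range(cell.column_index, min(cell.column_index + col_span, table.column_count)):
                grid[r][c] = cell.content
    return grid


def grid_to_csv(grid):
    """Serialize the grid as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(grid)
    return buffer.getvalue()


# Stand-in table mimicking one entry of `result.tables` (illustrative data only).
demo = SimpleNamespace(
    row_count=2,
    column_count=2,
    cells=[
        SimpleNamespace(row_index=0, column_index=0, column_span=2, content="Header"),
        SimpleNamespace(row_index=1, column_index=0, content="a"),
        SimpleNamespace(row_index=1, column_index=1, content="b"),
    ],
)
print(grid_to_csv(table_to_grid(demo)))
```

Repeating the text is one possible policy for merged cells; leaving the extra positions empty is equally valid when the CSV feeds a spreadsheet.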

### Deploy Your Cognitive Service in a Local Container

According to the [official tutorial](https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/containers/install-run?view=doc-intel-4.0.0&tabs=read), only the following Document Intelligence models can be run as Docker containers:

* **Read, Layout, ID Document, Receipt,** and **Invoice** models are supported by Document Intelligence v3.1 containers.
* **Read, Layout, General Document, Business Card,** and **Custom** models are supported by Document Intelligence v3.0 containers.

#### **Prerequisites**
* **Docker Engine** installed
* Enough **CPU cores** and **memory** for the chosen service

    <img src="./assets/system-requirements-for-documents-intelligence.png" width="600"/>

#### Step 1. Fill in the Configuration for Your Model and Generate the `docker-compose.yml` File<sub>[[More YML Details]](https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/containers/install-run?view=doc-intel-4.0.0&tabs=read)</sub>

* `ApiKey`: Must be set to a key for the provisioned resource specified in `Billing`.
* `Billing`: Must be set to the endpoint URI of a provisioned Azure resource.
* `Eula`: Indicates that you accept the license for the container. Must be set to `accept`.

In [7]:
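import yaml  # provided by the PyYAML package (pip install pyyaml)

# Layout model image for Document Intelligence v3.1 containers.
IMAGE_URL = 'mcr.microsoft.com/azure-cognitive-services/form-recognizer/layout-3.1'
CONTAINER_NAME = 'azure-form-recognizer-layout'
docker_compose_config = {
    'version': '3.9',
    'services':{
        'azure-form-recognizer-layout':{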
            'container_name': CONTAINER_NAME,
            'image': IMAGE_URL,
            'environment':[
                'EULA=accept', 
                f'billing={FORM_RECOGNIZER_ENDPOINT_URI}', 
                f'apiKey={FORM_RECOGNIZER_KEY}'],
            'ports':['5000:5000'],
            'networks':['ocrvnet']
        }
    },
    'networks':{
        'ocrvnet':{
            'driver': 'bridge'
        }
    }
}
with open('docker-compose.yml', 'w') as f:
    yaml.dump(docker_compose_config, f, default_flow_style=False)
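
The cell above writes the API key literally into `docker-compose.yml`. Docker Compose can instead substitute `${VAR}` references from the shell environment or from a `.env` file next to the compose file, which keeps the secret out of the generated YAML. The following is a minimal sketch of that variant. The function name `build_compose_config` and the variable names `FORM_RECOGNIZER_ENDPOINT_URI` and `FORM_RECOGNIZER_KEY` are assumptions, and the network section is omitted for brevity.

```python
def build_compose_config(image, container_name, host_port=5000):
    """Return a Compose config whose secrets are `${...}` references, not literal values."""
    return {
        "services": {
            container_name: {
                "container_name": container_name,
                "image": image,
                "environment": [
                    "EULA=accept",
                    # Resolved by Docker Compose at startup, not by Python.
                    "billing=${FORM_RECOGNIZER_ENDPOINT_URI}",
                    "apiKey=${FORM_RECOGNIZER_KEY}",
                ],
                # The container listens on 5000; only the host side is configurable here.
                "ports": [f"{host_port}:5000"],
            }
        }
    }


config = build_compose_config(
    "mcr.microsoft.com/azure-cognitive-services/form-recognizer/layout-3.1",
    "azure-form-recognizer-layout",
)
print(config["services"]["azure-form-recognizer-layout"]["environment"])
```

The returned dictionary can be passed to `yaml.dump` exactly as in the cell above.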

#### Step 2. Start the service with Docker Compose.

Run `docker-compose up` in your terminal or command line, from the directory that contains `docker-compose.yml`.

#### Step 3. Validate that the service is running

a. Open a new browser tab and go to the endpoint URL `http://localhost:5000`.<br>
b. Select **Service API Description** to view the Swagger page, then select any of the POST APIs and select **Try it out**.<br><br>
    <img src="./assets/navigator-for-container.png" width="640"/>
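
The **Try it out** step can also be done from code. The sketch below posts a document to the local container and polls the operation URL returned in the `Operation-Location` header. It rests on assumptions that should be checked against the container's Swagger page: that the v3.1 container serves the `/formrecognizer/documentModels/{model}:analyze` route with `api-version=2023-07-31`, and that the final status is reported as `succeeded` or `failed`. The function names are illustrative.

```python
import json
import time
import urllib.request

API_VERSION = "2023-07-31"  # assumption: the API version served by v3.1 containers


def build_analyze_url(base_url, model_id="prebuilt-layout", api_version=API_VERSION):
    """Compose the analyze URL for a model hosted by the container."""
    path = f"/formrecognizer/documentModels/{model_id}:analyze"
    return f"{base_url.rstrip('/')}{path}?api-version={api_version}"


def analyze_file(path, base_url="http://localhost:5000", poll_seconds=2.0, max_polls=60):
    """Submit a document, then poll the returned operation URL until it finishes."""
    with open(path, "rb") as f:
        request = urllib.request.Request(
            build_analyze_url(base_url),
            data=f.read(),
            headers={"Content-Type": "application/octet-stream"},
            method="POST",
        )
    with urllib.request.urlopen(request) as response:
        operation_url = response.headers["Operation-Location"]
    for _ in range(max_polls):
        with urllib.request.urlopen(operation_url) as response:
            body = json.load(response)
        if body.get("status") in ("succeeded", "failed"):
            return body
        time.sleep(poll_seconds)
    raise TimeoutError("analysis did not finish in time")


# Hypothetical usage, once the container is up:
# result = analyze_file("./assets/attention-is-all-you-need.pdf")
```

Only the standard library is used here so that the sketch does not depend on a particular SDK version matching the container's API version.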

In [9]:
import requests

r = requests.get('http://localhost:5000/status')
r.text

'{"service":"formrecognizerlayout","apiStatus":"Valid","apiStatusMessage":"Api Key is valid, no action needed."}'

In [11]:
r = requests.get('http://localhost:5000/ready')
r.text

'{"service":"formrecognizerlayout","ready":"ready","message":"Api Key is valid, no action needed."}'
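
A container can take a while to start, so a single request to `/ready` may fail even though the deployment is fine. The sketch below polls the same `/ready` endpoint shown above until it reports `"ready": "ready"`. The function names, retry count, and delay are illustrative choices, and the `fetch` parameter exists only so the loop can be exercised without a running container.

```python
import json
import time
import urllib.request


def fetch_json(url):
    """GET a URL and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.load(response)


def wait_until_ready(base_url="http://localhost:5000", attempts=30, delay=2.0, fetch=fetch_json):
    """Poll `/ready` until the container reports ready; return True on success."""
    for _ in range(attempts):
        try:
            if fetch(f"{base_url}/ready").get("ready") == "ready":
                return True
        except OSError:
            pass  # connection refused or HTTP error: the container is not up yet
        time.sleep(delay)
    return False


# Hypothetical usage after `docker-compose up`:
# print(wait_until_ready())
```

Catching `OSError` covers both refused connections and the HTTP errors raised by `urllib`, since those exception types derive from it.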

**Note:** To use this container in a disconnected environment, you must submit a **request form** to Microsoft and **purchase a commitment plan**.

### Appendix - An Overview of Containers Supporting Disconnected Environments<sub> (updated 2024/10)</sub>

Source: https://learn.microsoft.com/en-us/azure/ai-services/cognitive-services-container-support?view=doc-intel-3.0.0

#### Language containers
|  Service   | Description  | Status  |
|  --------  | -----------  | ------------- |
| Key Phrase Extraction | Extracts key phrases to identify the main points. | Generally Available |
| Text Language Detection | For up to 120 languages, detects which language the input text is written in and reports a single language code for every document submitted in the request. | Generally Available |
| Sentiment Analysis | Analyzes raw text for clues about positive or negative sentiment. This version of sentiment analysis returns sentiment labels. | Generally Available |
| Named Entity Recognition | Extracts named entities from text. | Generally Available |
| Summarization | Summarizes text from various sources. | Generally Available |
| Translator | Translates text in several languages and dialects. | Generally Available |

#### Speech containers
|  Service   | Description  | Status  |
|  --------  | -----------  | ------------- |
| Speech to text  | Transcribes continuous real-time speech into text.| Generally Available|
| Custom Speech to text  | Transcribes continuous real-time speech into text using a custom model. | Generally Available |
| Neural Text to speech  | Converts text to natural-sounding speech using deep neural network technology, allowing for more natural synthesized speech. | Generally Available|
| Speech language identification | Determines the language of spoken audio. | Preview |

#### Vision containers
|  Service   | Description  | Status  |
|  --------  | -----------  | ------------- |
| Read OCR  | Extracts printed and handwritten text from images and documents. Supports the `JPEG`, `PNG`, `BMP`, `PDF`, and `TIFF` file formats. | Generally Available |
| Spatial analysis | Analyzes real-time streaming video to understand spatial relationships between people, their movement, and interactions with objects in physical environments. | Preview |