# Fundamentals of Azure AI Document Intelligence

Document intelligence describes AI capabilities that support processing text and making sense of information in text. As an extension of optical character recognition (OCR), document intelligence takes the next step a person might take after reading a form or document: it automates the process of extracting, understanding, and saving the data in text.

Consider an organization that needs to process large numbers of receipts for expense claims, project costs, and other accounting purposes. If someone has to enter the information into a database by hand, the process is relatively slow and potentially error-prone.

Using document intelligence, the company can take a scanned image of a receipt, digitize the text with OCR, and pair the field items with their field names in a database. Document intelligence can identify specific data such as the merchant's name, merchant's address, total value, and tax value.

Azure AI Document Intelligence supports features that can analyze documents and forms with prebuilt and custom models. In this module, you explore how Azure AI services provide access to document intelligence capabilities.

## Explore capabilities of document intelligence

Document intelligence relies on machine learning models that are trained to recognize data in text. The ability to extract text, layout, and key-value pairs is known as document analysis. Document analysis also provides the location of text on a page, identified by bounding box coordinates.

![contoso-receipt-small.png](attachment:contoso-receipt-small.png)

For example, the address on the receipt is saved as a key-value pair: the key is address and the value is 123 Main Street. Document analysis could record the location of the field value as the bounding box coordinates [4.1, 2.2], [4.3, 2.2], [4.3, 2.4], [4.1, 2.4]. Machine learning models can interpret the data in a document or form because they are trained to recognize patterns in bounding box coordinate locations and text.
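As an illustration of the key-value pair and bounding box described above, here is a minimal Python sketch. The `ExtractedField` class and its `extent` method are hypothetical names invented for this example; they are not part of any Azure SDK, and the coordinate units are left unspecified because the text does not state them.

```python
from dataclasses import dataclass

# A point is an (x, y) pair; a bounding box is a list of four corner points.
Point = tuple[float, float]


@dataclass
class ExtractedField:
    """Hypothetical container for one key-value pair found on a page."""
    key: str
    value: str
    bounding_box: list[Point]

    def extent(self) -> tuple[float, float]:
        """Return the (width, height) of the box in the page's own units."""
        xs = [x for x, _ in self.bounding_box]
        ys = [y for _, y in self.bounding_box]
        return (max(xs) - min(xs), max(ys) - min(ys))


# The address field from the receipt example in the text.
address = ExtractedField(
    key="address",
    value="123 Main Street",
    bounding_box=[(4.1, 2.2), (4.3, 2.2), (4.3, 2.4), (4.1, 2.4)],
)

width, height = address.extent()
print(f"{address.key}: {address.value!r} spans {width:.1f} x {height:.1f} units")
```

Keeping the text and its location together is what lets a model learn that, on a given form type, a value in a particular region of the page tends to belong to a particular field.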

A challenge in automating document analysis is that forms and documents come in many different formats. For example, while tax forms and driver's licenses both include an individual's name, the bounding box coordinates for the name differ. Separate machine learning models need to be trained to provide high-quality results for different forms and documents. As a result, you can sometimes use prebuilt machine learning models that have been trained on commonly used document formats. Other times, you might need to customize a machine learning model to recognize a unique document format.

Automating the process of reading text and recording data can accelerate operations, create better customer experiences, and improve decision making. Next, you explore how to use Azure AI services to implement document intelligence.

## Get started with receipt analysis on Azure

Azure AI Document Intelligence consists of features grouped by model type:
- Prebuilt models - pretrained models that have been built to process common document types such as invoices, business cards, ID documents, and more. These models are designed to recognize and extract specific fields that are important for each document type.
- Custom models - can be trained to identify specific fields that are not included in the existing pretrained models.
- Document analysis - general document analysis that returns structured data representations, including regions of interest and their inter-relationships.

### Prebuilt models
The prebuilt models apply advanced machine learning to accurately identify and extract text, key-value pairs, tables, and structures from forms and documents. These capabilities include extracting:

- customer and vendor details from invoices
- sales and transaction details from receipts
- identification and verification details from identity documents
- health insurance details
- business contact details
- agreement and party details from contracts
- taxable compensation, mortgage interest, student loan details and more

For example, consider the prebuilt receipt model. It processes receipts by:
- Matching field names to values
- Identifying tables of data
- Identifying specific fields, such as dates, telephone numbers, addresses, totals, and others

The receipt model has been trained to recognize data on several different receipt types, such as thermal receipts (printed on heat-sensitive paper), hotel receipts, gas receipts, credit card receipts, and parking receipts. Fields recognized include:

- Name, address, and telephone number of the merchant
- Date and time of the purchase
- Name, quantity, and price of each item purchased
- Total, subtotals, and tax values

Each field and data pair has a confidence level that indicates how likely the extracted value is to be accurate. You can use this level to automatically identify receipts that a person needs to verify.
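The following Python sketch shows one way the confidence level could drive a review step. The threshold of 0.80, the `fields_needing_review` helper, the shape of the `sample_receipt` dictionary, and all of the sample values are assumptions made for illustration; a real service response has a richer structure.

```python
# Assumed cut-off: fields below this confidence are sent to a person.
REVIEW_THRESHOLD = 0.80


def fields_needing_review(
    fields: dict[str, dict], threshold: float = REVIEW_THRESHOLD
) -> list[str]:
    """Return the names of fields whose confidence is below the threshold."""
    return sorted(
        name for name, field in fields.items() if field["confidence"] < threshold
    )


# Made-up extraction result for a single receipt (illustrative values only).
sample_receipt = {
    "MerchantName": {"value": "Contoso", "confidence": 0.98},
    "TransactionDate": {"value": "2024-03-01", "confidence": 0.95},
    "Total": {"value": 14.50, "confidence": 0.62},
    "TotalTax": {"value": 1.17, "confidence": 0.79},
}

flagged = fields_needing_review(sample_receipt)
if flagged:
    print("Needs human review:", ", ".join(flagged))
else:
    print("All fields above threshold; no review needed.")
```

The right threshold depends on how costly an error is for the workflow, so it is worth tuning against a sample of receipts that have been checked by hand.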

The model has been trained to recognize several different languages, depending on the receipt type. For best results when using the prebuilt receipt model, images should be:
- JPEG, PNG, BMP, PDF, or TIFF format
- File size less than 500 MB for paid (S0) tier and 4 MB for free (F0) tier
- Between 50 x 50 pixels and 10000 x 10000 pixels
- For PDF documents, no larger than 17 inches x 17 inches
- One receipt per document
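The input requirements listed above can be checked before a file is submitted. The sketch below is one possible pre-flight check, not part of the service: the `check_receipt_image` function is a hypothetical helper, 1 MB is assumed to mean 1024 x 1024 bytes, and the PDF page-size limit (in inches) is not checked.

```python
from pathlib import PurePath

SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".pdf", ".tif", ".tiff"}
# Size limits per pricing tier; 1 MB is assumed to be 1024 * 1024 bytes.
MAX_BYTES = {"S0": 500 * 1024 * 1024, "F0": 4 * 1024 * 1024}
MIN_PIXELS, MAX_PIXELS = 50, 10_000


def check_receipt_image(
    filename: str, size_bytes: int, width_px: int, height_px: int, tier: str = "F0"
) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = PurePath(filename).suffix.lower()

    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes >= MAX_BYTES[tier]:
        problems.append(f"file too large for tier {tier}")
    # Pixel limits apply to images; PDF page size is not checked in this sketch.
    if extension != ".pdf":
        for side in (width_px, height_px):
            if not MIN_PIXELS <= side <= MAX_PIXELS:
                problems.append(f"dimension out of range: {side} px")
                break
    return problems


print(check_receipt_image("receipt.png", 2 * 1024 * 1024, 1200, 1600))
print(check_receipt_image("receipt.gif", 5 * 1024 * 1024, 40, 1600))
```

Rejecting unsuitable files locally avoids a round trip to the service for input that falls outside the documented limits.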

You can get started in [Document Intelligence Studio](https://formrecognizer.appliedai.azure.com/studio), a user interface for testing document analysis and prebuilt models, and for creating custom models.

### Azure AI Document Intelligence resource
To use Azure AI Document Intelligence, create either a Document Intelligence or an Azure AI services resource in your Azure subscription. If you have not used Document Intelligence before, select the free tier when you create the resource. The free tier has some restrictions; for example, only the first two pages of a PDF or TIFF document are processed.

After the resource has been created, you can create client applications that use its key and endpoint to submit forms for analysis, or use the resource in Document Intelligence Studio.
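To show how a client application might combine the key and endpoint, the sketch below builds (but does not send) an HTTP request for the prebuilt receipt model using only the Python standard library. The URL route, the `2023-07-31` API version, and the `Ocp-Apim-Subscription-Key` header reflect the v3.1 REST API as best understood here and should be checked against the current reference; the endpoint, key, and document bytes are placeholders, and `build_analyze_request` is a hypothetical helper.

```python
import urllib.request

# Assumed REST API version; verify against the current service documentation.
API_VERSION = "2023-07-31"


def build_analyze_request(
    endpoint: str, key: str, document: bytes, model_id: str = "prebuilt-receipt"
) -> urllib.request.Request:
    """Build a POST request that submits a document to a model for analysis."""
    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
        f"{model_id}:analyze?api-version={API_VERSION}"
    )
    return urllib.request.Request(
        url,
        data=document,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,  # the resource key
            "Content-Type": "application/octet-stream",  # raw file bytes
        },
    )


# Placeholder values; replace with the endpoint and key of your own resource.
request = build_analyze_request(
    endpoint="https://example-resource.cognitiveservices.azure.com/",
    key="hypothetical-key",
    document=b"placeholder receipt bytes",
)
print(request.get_method(), request.full_url)
```

Analysis is asynchronous: after the request is sent (for example with `urllib.request.urlopen`), the service is expected to accept it and return an `Operation-Location` header that the client polls for the result. In practice, the official client libraries wrap these steps.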