# 📄 Document Intelligence

Document Intelligence uses artificial intelligence (AI) and machine learning (ML) to analyze documents and extract structured information from them. It covers techniques and tools that automate document processing, such as text recognition, layout analysis, and classification, so that the information in documents is easier to manage and reuse.

## Key Features

- 🧠 **AI-Powered Analysis**: Leveraging AI to interpret and understand the content of documents.
- 🔍 **Data Extraction**: Automatically extracting relevant data from documents for further use.
- 📑 **Document Classification**: Categorizing documents based on their content.
- 📝 **Text Recognition**: Using Optical Character Recognition (OCR) to convert scanned documents into editable text.
- 📊 **Data Integration**: Integrating extracted data into other systems and workflows.

## Benefits

- ⏱️ **Time Savings**: Reduces the time spent on manual document processing.
- 📈 **Increased Accuracy**: Minimizes human errors in data extraction and interpretation.
- 💼 **Enhanced Productivity**: Allows employees to focus on higher-value tasks by automating routine document handling.
- 🔒 **Improved Compliance**: Helps process documents consistently and in line with regulatory requirements.

## Applications

- 🏦 **Financial Services**: Automating the processing of invoices, receipts, and financial statements.
- 🏥 **Healthcare**: Extracting patient information from medical records and forms.
- 🏢 **Legal**: Analyzing contracts and legal documents for key terms and clauses.
- 📚 **Education**: Managing and organizing academic records and research papers.

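Document Intelligence changes how organizations handle their documents by replacing manual review with automated extraction. The cells below use the Azure AI Document Intelligence Python SDK and its prebuilt layout model to analyze a sample PDF.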

In [None]:
%pip install azure-ai-documentintelligence==1.0.0
%pip install python-dotenv==1.0.1

In [1]:
# Load ENDPOINT and KEY from a local .env file into the environment

import os
from dotenv import load_dotenv

load_dotenv(override=True)

endpoint = os.getenv("ENDPOINT")
key = os.getenv("KEY")
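
If either variable is missing from the `.env` file, `os.getenv` returns `None`, and the problem only surfaces later when the client is created. A minimal sketch of an early check follows, assuming a hypothetical helper named `require_env` (it is not part of python-dotenv or the Azure SDK):

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error.

    `require_env` is a hypothetical helper introduced for illustration only.
    """
    value = os.getenv(name)
    if not value:
        # Covers both an unset variable and an empty string
        raise RuntimeError(f"Environment variable {name!r} is not set")
    return value
```

With this helper, the assignments above could be written as `endpoint = require_env("ENDPOINT")` and `key = require_env("KEY")`.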

Create helper functions

In [2]:
def get_words(page, line):
    """Return the words on `page` whose spans fall inside any of `line`'s spans."""
    return [word for word in page.words if _in_span(word, line.spans)]


def _in_span(word, spans):
    """Return True if the word's span lies entirely within one of `spans`."""
    word_start = word.span.offset
    word_end = word_start + word.span.length
    for span in spans:
        if word_start >= span.offset and word_end <= span.offset + span.length:
            return True
    return False
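
The containment rule can be checked without calling the service. The sketch below repeats the same comparison on stand-in objects built with `SimpleNamespace`. The offsets and lengths are assumptions chosen to mimic the start of the sample document, and `in_span` and `make_span` are illustrative names, not SDK functions.

```python
from types import SimpleNamespace


def in_span(word, spans):
    """Same rule as `_in_span`: the word must lie fully inside one span."""
    return any(
        word.span.offset >= span.offset
        and word.span.offset + word.span.length <= span.offset + span.length
        for span in spans
    )


def make_span(offset, length):
    # Stand-in for the SDK's span model, which exposes `offset` and `length`
    return SimpleNamespace(offset=offset, length=length)


# Assumed offsets: a line covering characters 0..12 ("UNITED STATES")
line_spans = [make_span(0, 13)]
united = SimpleNamespace(content="UNITED", span=make_span(0, 6))
states = SimpleNamespace(content="STATES", span=make_span(7, 6))
securities = SimpleNamespace(content="SECURITIES", span=make_span(14, 10))

matches = [w.content for w in (united, states, securities) if in_span(w, line_spans)]
print(matches)  # ['UNITED', 'STATES']
```

A word that starts inside the line but ends beyond it is rejected, because both the start and the end of the word are compared against the span.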

Create the document intelligence client

In [3]:
from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeDocumentRequest, AnalyzeResult

# Authenticate with the endpoint and key loaded from the environment
document_intelligence = DocumentIntelligenceClient(endpoint=endpoint, credential=AzureKeyCredential(key))

# Sample PDF hosted on GitHub; ?raw=true requests the file itself rather than the HTML page
url = "https://github.com/hugogirard/GenAIDemo/blob/main/documentintelligence/sample-layout.pdf?raw=true"

# Start the analysis with the prebuilt layout model; the call returns a poller for the long-running operation
poller = document_intelligence.begin_analyze_document("prebuilt-layout", AnalyzeDocumentRequest(url_source=url))

# Block until the analysis completes, then read the parsed result
result: AnalyzeResult = poller.result()


Analyze the results

In [4]:
print(result)

{'apiVersion': '2024-11-30', 'modelId': 'prebuilt-layout', 'stringIndexType': 'textElements', 'content': 'UNITED STATES SECURITIES AND EXCHANGE COMMISSION Washington, D.C. 20549\nFORM 10-Q\n☐ ☒ :selected: QUARTERLY REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934 For the Quarterly Period Ended March 31, 2020 OR :unselected: TRANSITION REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934 For the Transition Period From to\nCommission File Number 001-37845\nMICROSOFT CORPORATION\nWASHINGTON (STATE OF INCORPORATION) ONE MICROSOFT WAY, REDMOND, WASHINGTON 98052-6399 (425) 882-8080 www.microsoft.com/investor\n91-1144442 (I.R.S. ID)\nSecurities registered pursuant to Section 12(b) of the Act:\nTitle of each class\nTrading Symbol\nName of exchange on which registered\nCommon stock, $0.00000625 par value per share\nMSFT\nNASDAQ\n2.125% Notes due 2021\nMSFT\nNASDAQ\n3.125% Notes due 2028\nMSFT\nNASDAQ\n2.625% Notes due 2033\nMSFT\nNASDAQ\nSec

In [None]:
# Print the document one paragraph at a time

for paragraph in result.paragraphs or []:
    print(paragraph.content)
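
Paragraphs returned by the layout model can also carry a `role` (for example a title or a page header), which may be absent for ordinary text. As a rough sketch, paragraphs could be grouped by that role. The `group_by_role` helper and the `"body"` label for paragraphs without a role are assumptions made for illustration, and the sample roles below are invented rather than taken from the real analysis result.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_role(paragraphs):
    """Group paragraph text by role; paragraphs without a role go under 'body'."""
    groups = defaultdict(list)
    for paragraph in paragraphs or []:
        # "body" is a label chosen here, not a value defined by the service
        role = getattr(paragraph, "role", None) or "body"
        groups[role].append(paragraph.content)
    return dict(groups)


# Stand-in paragraphs; the roles are made up for this example
sample = [
    SimpleNamespace(role="title", content="FORM 10-Q"),
    SimpleNamespace(role=None, content="Commission File Number 001-37845"),
    SimpleNamespace(role=None, content="MICROSOFT CORPORATION"),
]
grouped = group_by_role(sample)
print(grouped["body"])
```

In the notebook, the same function could be called as `group_by_role(result.paragraphs)`.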

In [6]:
if result.styles and any(style.is_handwritten for style in result.styles):
    print("Document contains handwritten content")
else:
    print("Document does not contain handwritten content")

for page in result.pages:
    print(f"----Analyzing layout from page #{page.page_number}----")
    print(
        f"Page has width: {page.width} and height: {page.height}, measured with unit: {page.unit}"
    )

    if page.lines:
        for line_idx, line in enumerate(page.lines):
            words = get_words(page, line)
            print(
                f"...Line # {line_idx} has word count {len(words)} and text '{line.content}' "
                f"within bounding polygon '{line.polygon}'"
            )

            for word in words:
                print(
                    f"......Word '{word.content}' has a confidence of {word.confidence}"
                )

    if page.selection_marks:
        for selection_mark in page.selection_marks:
            print(
                f"Selection mark is '{selection_mark.state}' within bounding polygon "
                f"'{selection_mark.polygon}' and has a confidence of {selection_mark.confidence}"
            )

if result.tables:
    for table_idx, table in enumerate(result.tables):
        print(
            f"Table # {table_idx} has {table.row_count} rows and "
            f"{table.column_count} columns"
        )
        if table.bounding_regions:
            for region in table.bounding_regions:
                print(
                    f"Table # {table_idx} location on page: {region.page_number} is {region.polygon}"
                )
        for cell in table.cells:
            print(
                f"...Cell[{cell.row_index}][{cell.column_index}] has text '{cell.content}'"
            )
            if cell.bounding_regions:
                for region in cell.bounding_regions:
                    print(
                        f"...content on page {region.page_number} is within bounding polygon '{region.polygon}'"
                    )

print("----------------------------------------")
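
The table loop above prints one cell at a time. When the table is needed as rows and columns, the flat cell list can be arranged into a grid using each cell's `row_index` and `column_index`. The sketch below uses a stand-in table whose shape is an assumption. `table_to_grid` is a hypothetical helper, and it ignores cells that span several rows or columns.

```python
from types import SimpleNamespace


def table_to_grid(table):
    """Arrange a table's flat cell list into a row-major list of lists."""
    grid = [["" for _ in range(table.column_count)] for _ in range(table.row_count)]
    for cell in table.cells:
        # Spanning cells are written only at their top-left position
        grid[cell.row_index][cell.column_index] = cell.content
    return grid


# Stand-in table with an assumed 2 x 2 shape
cells = [
    SimpleNamespace(row_index=0, column_index=0, content="Title of each class"),
    SimpleNamespace(row_index=0, column_index=1, content="Trading Symbol"),
    SimpleNamespace(row_index=1, column_index=0, content="2.125% Notes due 2021"),
    SimpleNamespace(row_index=1, column_index=1, content="MSFT"),
]
table = SimpleNamespace(row_count=2, column_count=2, cells=cells)

for row in table_to_grid(table):
    print(" | ".join(row))
```

Positions with no cell stay as empty strings, so every row has the same length.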




Document does not contain handwritten content
----Analyzing layout from page #1----
Page has width: 8.5 and height: 11.0, measured with unit: LengthUnit.INCH
...Line # 0 has word count 2 and text 'UNITED STATES' within bounding polygon '[3.4695, 0.6555, 5.0216, 0.6576, 5.0214, 0.847, 3.4693, 0.845]'
......Word 'UNITED' has a confidence of 0.997
......Word 'STATES' has a confidence of 0.998
...Line # 1 has word count 4 and text 'SECURITIES AND EXCHANGE COMMISSION' within bounding polygon '[2.1754, 0.8727, 6.3155, 0.8723, 6.3155, 1.0737, 2.1754, 1.0742]'
......Word 'SECURITIES' has a confidence of 0.993
......Word 'AND' has a confidence of 0.998
......Word 'EXCHANGE' has a confidence of 0.992
......Word 'COMMISSION' has a confidence of 0.992
...Line # 2 has word count 3 and text 'Washington, D.C. 20549' within bounding polygon '[3.4515, 1.0922, 5.0382, 1.0888, 5.0386, 1.2532, 3.4518, 1.2565]'
......Word 'Washington,' has a confidence of 0.995
......Word 'D.C.' has a confidence of 0.993
.