**Important: Do not use in production. For demonstration purposes only. Please review the legal notices before continuing.**

# Azure Form Recognizer

Azure Form Recognizer uses machine learning to extract data from forms. This notebook extracts data from a receipt, an invoice, a business card, and an ID document, similar to the ones used in retail stores.

### Receipts

In [2]:
"""
This code sample shows Prebuilt Receipt operations with the Azure Form Recognizer client library. 
The async versions of the samples require Python 3.6 or later.

To learn more, please visit the documentation - Quickstart: Form Recognizer Python client library SDKs v3.0
https://docs.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/quickstarts/try-v3-python-sdk
"""

from azure.core.credentials import AzureKeyCredential
from azure.ai.formrecognizer import DocumentAnalysisClient
import GlobalVariables as gv

endpoint = gv.FORM_RECOGNIZER_ENDPOINT
key = gv.FORM_RECOGNIZER_KEY

url = "https://raw.githubusercontent.com/Azure/azure-sdk-for-python/main/sdk/formrecognizer/azure-ai-formrecognizer/tests/sample_forms/receipt/contoso-receipt.png"

document_analysis_client = DocumentAnalysisClient(
    endpoint=endpoint, credential=AzureKeyCredential(key)
)

poller = document_analysis_client.begin_analyze_document_from_url("prebuilt-receipt", url)
receipts = poller.result()

for idx, receipt in enumerate(receipts.documents):
    print("--------Recognizing receipt #{}--------".format(idx + 1))
    receipt_type = receipt.fields.get("ReceiptType")
    if receipt_type:
        print(
            "Receipt Type: {} has confidence: {}".format(
                receipt_type.value, receipt_type.confidence
            )
        )
    merchant_name = receipt.fields.get("MerchantName")
    if merchant_name:
        print(
            "Merchant Name: {} has confidence: {}".format(
                merchant_name.value, merchant_name.confidence
            )
        )
    transaction_date = receipt.fields.get("TransactionDate")
    if transaction_date:
        print(
            "Transaction Date: {} has confidence: {}".format(
                transaction_date.value, transaction_date.confidence
            )
        )
    items = receipt.fields.get("Items")
    if items:
        print("Receipt items:")
        # Use a separate index so the outer receipt index is not overwritten.
        for item_idx, item in enumerate(items.value):
            print("...Item #{}".format(item_idx + 1))
            item_name = item.value.get("Name")
            if item_name:
                print(
                    "......Item Name: {} has confidence: {}".format(
                        item_name.value, item_name.confidence
                    )
                )
            item_quantity = item.value.get("Quantity")
            if item_quantity:
                print(
                    "......Item Quantity: {} has confidence: {}".format(
                        item_quantity.value, item_quantity.confidence
                    )
                )
            item_price = item.value.get("Price")
            if item_price:
                print(
                    "......Individual Item Price: {} has confidence: {}".format(
                        item_price.value, item_price.confidence
                    )
                )
            item_total_price = item.value.get("TotalPrice")
            if item_total_price:
                print(
                    "......Total Item Price: {} has confidence: {}".format(
                        item_total_price.value, item_total_price.confidence
                    )
                )
    subtotal = receipt.fields.get("Subtotal")
    if subtotal:
        print(
            "Subtotal: {} has confidence: {}".format(
                subtotal.value, subtotal.confidence
            )
        )
    tax = receipt.fields.get("Tax")
    if tax:
        print("Tax: {} has confidence: {}".format(tax.value, tax.confidence))
    tip = receipt.fields.get("Tip")
    if tip:
        print("Tip: {} has confidence: {}".format(tip.value, tip.confidence))
    total = receipt.fields.get("Total")
    if total:
        print("Total: {} has confidence: {}".format(total.value, total.confidence))
    print("--------------------------------------")


--------Recognizing receipt #1--------
Merchant Name: Contoso has confidence: 0.948
Transaction Date: 2019-06-10 has confidence: 0.99
Receipt items:
...Item #1
......Item Quantity: 1.0 has confidence: 0.96
......Total Item Price: 999.0 has confidence: 0.985
...Item #2
......Item Quantity: 1.0 has confidence: 0.959
......Total Item Price: 99.99 has confidence: 0.984
Subtotal: 1098.99 has confidence: 0.987
Total: 1203.39 has confidence: 0.984
--------------------------------------
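The receipt cell above repeats the same look-up, check, and print steps for every field. As a minimal sketch, that pattern could be folded into one helper. `describe_field` is a hypothetical name, not part of the SDK, and the `SimpleNamespace` objects are stand-ins for the SDK's field objects, assumed only to expose `.value` and `.confidence` as in the cell above.

```python
from types import SimpleNamespace


def describe_field(fields, name, label=None):
    """Return a 'label: value has confidence: c' line, or None if the field is absent."""
    field = fields.get(name)
    if not field:
        return None
    return "{}: {} has confidence: {}".format(
        label or name, field.value, field.confidence
    )


# Stand-in data shaped like receipt.fields (values copied from the output above).
sample_fields = {
    "MerchantName": SimpleNamespace(value="Contoso", confidence=0.948),
    "Total": SimpleNamespace(value=1203.39, confidence=0.984),
}

# "Tip" is missing from the stand-in data, so nothing is printed for it.
for name, label in [("MerchantName", "Merchant Name"), ("Tip", "Tip"), ("Total", "Total")]:
    line = describe_field(sample_fields, name, label)
    if line:
        print(line)
```

With a real result, `sample_fields` would be replaced by `receipt.fields`. Nested fields such as `Items` would still need their own loop.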


### Invoices

In [3]:


formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/invoice_sample.jpg"

document_analysis_client = DocumentAnalysisClient(
    endpoint=endpoint, credential=AzureKeyCredential(key)
)
    
poller = document_analysis_client.begin_analyze_document_from_url("prebuilt-invoice", formUrl)
invoices = poller.result()

for idx, invoice in enumerate(invoices.documents):
    print("--------Recognizing invoice #{}--------".format(idx + 1))
    vendor_name = invoice.fields.get("VendorName")
    if vendor_name:
        print(
            "Vendor Name: {} has confidence: {}".format(
                vendor_name.value, vendor_name.confidence
            )
        )
    vendor_address = invoice.fields.get("VendorAddress")
    if vendor_address:
        print(
            "Vendor Address: {} has confidence: {}".format(
                vendor_address.value, vendor_address.confidence
            )
        )
    vendor_address_recipient = invoice.fields.get("VendorAddressRecipient")
    if vendor_address_recipient:
        print(
            "Vendor Address Recipient: {} has confidence: {}".format(
                vendor_address_recipient.value, vendor_address_recipient.confidence
            )
        )
    customer_name = invoice.fields.get("CustomerName")
    if customer_name:
        print(
            "Customer Name: {} has confidence: {}".format(
                customer_name.value, customer_name.confidence
            )
        )
    customer_id = invoice.fields.get("CustomerId")
    if customer_id:
        print(
            "Customer Id: {} has confidence: {}".format(
                customer_id.value, customer_id.confidence
            )
        )
    customer_address = invoice.fields.get("CustomerAddress")
    if customer_address:
        print(
            "Customer Address: {} has confidence: {}".format(
                customer_address.value, customer_address.confidence
            )
        )
    customer_address_recipient = invoice.fields.get("CustomerAddressRecipient")
    if customer_address_recipient:
        print(
            "Customer Address Recipient: {} has confidence: {}".format(
                customer_address_recipient.value,
                customer_address_recipient.confidence,
            )
        )
    invoice_id = invoice.fields.get("InvoiceId")
    if invoice_id:
        print(
            "Invoice Id: {} has confidence: {}".format(
                invoice_id.value, invoice_id.confidence
            )
        )
    invoice_date = invoice.fields.get("InvoiceDate")
    if invoice_date:
        print(
            "Invoice Date: {} has confidence: {}".format(
                invoice_date.value, invoice_date.confidence
            )
        )
    invoice_total = invoice.fields.get("InvoiceTotal")
    if invoice_total:
        print(
            "Invoice Total: {} has confidence: {}".format(
                invoice_total.value, invoice_total.confidence
            )
        )
    due_date = invoice.fields.get("DueDate")
    if due_date:
        print(
            "Due Date: {} has confidence: {}".format(
                due_date.value, due_date.confidence
            )
        )
    purchase_order = invoice.fields.get("PurchaseOrder")
    if purchase_order:
        print(
            "Purchase Order: {} has confidence: {}".format(
                purchase_order.value, purchase_order.confidence
            )
        )
    billing_address = invoice.fields.get("BillingAddress")
    if billing_address:
        print(
            "Billing Address: {} has confidence: {}".format(
                billing_address.value, billing_address.confidence
            )
        )
    billing_address_recipient = invoice.fields.get("BillingAddressRecipient")
    if billing_address_recipient:
        print(
            "Billing Address Recipient: {} has confidence: {}".format(
                billing_address_recipient.value,
                billing_address_recipient.confidence,
            )
        )
    shipping_address = invoice.fields.get("ShippingAddress")
    if shipping_address:
        print(
            "Shipping Address: {} has confidence: {}".format(
                shipping_address.value, shipping_address.confidence
            )
        )
    shipping_address_recipient = invoice.fields.get("ShippingAddressRecipient")
    if shipping_address_recipient:
        print(
            "Shipping Address Recipient: {} has confidence: {}".format(
                shipping_address_recipient.value,
                shipping_address_recipient.confidence,
            )
        )
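    print("Invoice items:")
    items = invoice.fields.get("Items")
    # Guard against invoices where no line items were detected.
    for item_idx, item in enumerate(items.value if items else []):
        print("...Item #{}".format(item_idx + 1))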
        item_description = item.value.get("Description")
        if item_description:
            print(
                "......Description: {} has confidence: {}".format(
                    item_description.value, item_description.confidence
                )
            )
        item_quantity = item.value.get("Quantity")
        if item_quantity:
            print(
                "......Quantity: {} has confidence: {}".format(
                    item_quantity.value, item_quantity.confidence
                )
            )
        unit = item.value.get("Unit")
        if unit:
            print(
                "......Unit: {} has confidence: {}".format(
                    unit.value, unit.confidence
                )
            )
        unit_price = item.value.get("UnitPrice")
        if unit_price:
            print(
                "......Unit Price: {} has confidence: {}".format(
                    unit_price.value, unit_price.confidence
                )
            )
        product_code = item.value.get("ProductCode")
        if product_code:
            print(
                "......Product Code: {} has confidence: {}".format(
                    product_code.value, product_code.confidence
                )
            )
        item_date = item.value.get("Date")
        if item_date:
            print(
                "......Date: {} has confidence: {}".format(
                    item_date.value, item_date.confidence
                )
            )
        tax = item.value.get("Tax")
        if tax:
            print(
                "......Tax: {} has confidence: {}".format(tax.value, tax.confidence)
            )
        amount = item.value.get("Amount")
        if amount:
            print(
                "......Amount: {} has confidence: {}".format(
                    amount.value, amount.confidence
                )
            )
    subtotal = invoice.fields.get("SubTotal")
    if subtotal:
        print(
            "Subtotal: {} has confidence: {}".format(
                subtotal.value, subtotal.confidence
            )
        )
    total_tax = invoice.fields.get("TotalTax")
    if total_tax:
        print(
            "Total Tax: {} has confidence: {}".format(
                total_tax.value, total_tax.confidence
            )
        )
    previous_unpaid_balance = invoice.fields.get("PreviousUnpaidBalance")
    if previous_unpaid_balance:
        print(
            "Previous Unpaid Balance: {} has confidence: {}".format(
                previous_unpaid_balance.value, previous_unpaid_balance.confidence
            )
        )
    amount_due = invoice.fields.get("AmountDue")
    if amount_due:
        print(
            "Amount Due: {} has confidence: {}".format(
                amount_due.value, amount_due.confidence
            )
        )
    service_start_date = invoice.fields.get("ServiceStartDate")
    if service_start_date:
        print(
            "Service Start Date: {} has confidence: {}".format(
                service_start_date.value, service_start_date.confidence
            )
        )
    service_end_date = invoice.fields.get("ServiceEndDate")
    if service_end_date:
        print(
            "Service End Date: {} has confidence: {}".format(
                service_end_date.value, service_end_date.confidence
            )
        )
    service_address = invoice.fields.get("ServiceAddress")
    if service_address:
        print(
            "Service Address: {} has confidence: {}".format(
                service_address.value, service_address.confidence
            )
        )
    service_address_recipient = invoice.fields.get("ServiceAddressRecipient")
    if service_address_recipient:
        print(
            "Service Address Recipient: {} has confidence: {}".format(
                service_address_recipient.value,
                service_address_recipient.confidence,
            )
        )
    remittance_address = invoice.fields.get("RemittanceAddress")
    if remittance_address:
        print(
            "Remittance Address: {} has confidence: {}".format(
                remittance_address.value, remittance_address.confidence
            )
        )
    remittance_address_recipient = invoice.fields.get("RemittanceAddressRecipient")
    if remittance_address_recipient:
        print(
            "Remittance Address Recipient: {} has confidence: {}".format(
                remittance_address_recipient.value,
                remittance_address_recipient.confidence,
            )
        )
    print("----------------------------------------")


--------Recognizing invoice #1--------
Vendor Name: CONTOSO LTD. has confidence: 0.913
Vendor Address: 123 456th St New York, NY, 10001 has confidence: 0.895
Vendor Address Recipient: Contoso Headquarters has confidence: 0.907
Customer Name: MICROSOFT CORPORATION has confidence: 0.855
Customer Id: CID-12345 has confidence: 0.967
Customer Address: 123 Other St, Redmond WA, 98052 has confidence: 0.898
Customer Address Recipient: Microsoft Corp has confidence: 0.907
Invoice Id: INV-100 has confidence: 0.97
Invoice Date: 2019-11-15 has confidence: 0.97
Invoice Total: CurrencyValue(amount=110.0, symbol=$) has confidence: 0.97
Due Date: 2019-12-15 has confidence: 0.973
Purchase Order: PO-3333 has confidence: 0.956
Billing Address: 123 Bill St, Redmond WA, 98052 has confidence: 0.906
Billing Address Recipient: Microsoft Finance has confidence: 0.913
Shipping Address: 123 Ship St, Redmond WA, 98052 has confidence: 0.906
Shipping Address Recipient: Microsoft Delivery has confidence: 0.908
Invoi
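The invoice total in the output above prints as a raw `CurrencyValue(amount=110.0, symbol=$)`. A small sketch of formatting such a value for display follows. `format_currency` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in that assumes only the `amount` and `symbol` attributes visible in that output.

```python
from types import SimpleNamespace


def format_currency(currency):
    """Format an object with .amount and .symbol as, for example, '$110.00'."""
    symbol = currency.symbol or ""  # the symbol may be missing on some documents
    return "{}{:,.2f}".format(symbol, currency.amount)


# Stand-in for the CurrencyValue printed in the output above.
invoice_total = SimpleNamespace(amount=110.0, symbol="$")
print("Invoice Total: {}".format(format_currency(invoice_total)))
```

In the invoice cell, the same helper could be applied to `invoice_total.value` before printing, under the assumption that the field was recognized as a currency.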

![Form Recognizer](https://stretaildemodev.blob.core.windows.net/notebookimages/BC.jpg)

### Business Card

In [4]:


formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/business-card-english.jpg"

document_analysis_client = DocumentAnalysisClient(
    endpoint=endpoint, credential=AzureKeyCredential(key)
)

poller = document_analysis_client.begin_analyze_document_from_url("prebuilt-businessCard", formUrl)
business_cards = poller.result()

for idx, business_card in enumerate(business_cards.documents):
    print("--------Analyzing business card #{}--------".format(idx + 1))
    contact_names = business_card.fields.get("ContactNames")
    if contact_names:
        for contact_name in contact_names.value:
            print(
                "Contact First Name: {} has confidence: {}".format(
                    contact_name.value["FirstName"].value,
                    contact_name.value[
                        "FirstName"
                    ].confidence,
                )
            )
            print(
                "Contact Last Name: {} has confidence: {}".format(
                    contact_name.value["LastName"].value,
                    contact_name.value[
                        "LastName"
                    ].confidence,
                )
            )
    company_names = business_card.fields.get("CompanyNames")
    if company_names:
        for company_name in company_names.value:
            print(
                "Company Name: {} has confidence: {}".format(
                    company_name.value, company_name.confidence
                )
            )
    departments = business_card.fields.get("Departments")
    if departments:
        for department in departments.value:
            print(
                "Department: {} has confidence: {}".format(
                    department.value, department.confidence
                )
            )
    job_titles = business_card.fields.get("JobTitles")
    if job_titles:
        for job_title in job_titles.value:
            print(
                "Job Title: {} has confidence: {}".format(
                    job_title.value, job_title.confidence
                )
            )
    emails = business_card.fields.get("Emails")
    if emails:
        for email in emails.value:
            print(
                "Email: {} has confidence: {}".format(email.value, email.confidence)
            )
    websites = business_card.fields.get("Websites")
    if websites:
        for website in websites.value:
            print(
                "Website: {} has confidence: {}".format(
                    website.value, website.confidence
                )
            )
    addresses = business_card.fields.get("Addresses")
    if addresses:
        for address in addresses.value:
            print(
                "Address: {} has confidence: {}".format(
                    address.value, address.confidence
                )
            )
    mobile_phones = business_card.fields.get("MobilePhones")
    if mobile_phones:
        for phone in mobile_phones.value:
            print(
                "Mobile phone number: {} has confidence: {}".format(
                    phone.content, phone.confidence
                )
            )
    faxes = business_card.fields.get("Faxes")
    if faxes:
        for fax in faxes.value:
            print(
                "Fax number: {} has confidence: {}".format(
                    fax.content, fax.confidence
                )
            )
    work_phones = business_card.fields.get("WorkPhones")
    if work_phones:
        for work_phone in work_phones.value:
            print(
                "Work phone number: {} has confidence: {}".format(
                    work_phone.content, work_phone.confidence
                )
            )
    other_phones = business_card.fields.get("OtherPhones")
    if other_phones:
        for other_phone in other_phones.value:
            print(
                "Other phone number: {} has confidence: {}".format(
                    other_phone.content, other_phone.confidence
                )
            )
    print("----------------------------------------")


--------Analyzing business card #1--------
Contact First Name: Avery has confidence: 0.979
Contact Last Name: Smith has confidence: 0.984
Company Name: Contoso has confidence: 0.95
Department: Cloud & Al Department has confidence: 0.858
Job Title: Senior Researcher has confidence: 0.979
Email: avery.smith@contoso.com has confidence: 0.968
Website: https://www.contoso.com/ has confidence: 0.977
Address: 2 Kingdom Street Paddington, London, W2 6BD has confidence: 0.958
Mobile phone number: +44 (0) 7911 123456 has confidence: 0.984
Fax number: +44 (0) 20 6789 2345 has confidence: 0.986
Work phone number: +44 (0) 20 9876 5432 has confidence: 0.971
----------------------------------------
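Each extracted value comes with a confidence score, and the `Department` line above scores noticeably lower than its neighbours. One possible way to use the scores is to flag low-confidence fields for manual review. This is a sketch only: `MIN_CONFIDENCE` and `low_confidence` are illustrative names, the 0.9 cut-off is an arbitrary assumption, and the flat dictionary of stand-in objects simplifies the list-valued fields the business card model returns.

```python
from types import SimpleNamespace

# Assumed cut-off for illustration; tune it for your own documents.
MIN_CONFIDENCE = 0.9


def low_confidence(fields, threshold=MIN_CONFIDENCE):
    """Return (name, value, confidence) for fields scoring below the threshold."""
    return [
        (name, field.value, field.confidence)
        for name, field in fields.items()
        if field.confidence is not None and field.confidence < threshold
    ]


# Stand-in data with values copied from the business card output above.
card_fields = {
    "CompanyName": SimpleNamespace(value="Contoso", confidence=0.95),
    "Department": SimpleNamespace(value="Cloud & Al Department", confidence=0.858),
    "JobTitle": SimpleNamespace(value="Senior Researcher", confidence=0.979),
}

for name, value, confidence in low_confidence(card_fields):
    print("Review {}: {!r} (confidence {})".format(name, value, confidence))
```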


### Identity Documents

In [5]:


formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/DriverLicense.png"

document_analysis_client = DocumentAnalysisClient(
    endpoint=endpoint, credential=AzureKeyCredential(key)
)
    
poller = document_analysis_client.begin_analyze_document_from_url("prebuilt-idDocument", formUrl)
id_documents = poller.result()

for idx, id_document in enumerate(id_documents.documents):
    print("--------Recognizing ID document #{}--------".format(idx + 1))
    first_name = id_document.fields.get("FirstName")
    if first_name:
        print(
            "First Name: {} has confidence: {}".format(
                first_name.value, first_name.confidence
            )
        )
    last_name = id_document.fields.get("LastName")
    if last_name:
        print(
            "Last Name: {} has confidence: {}".format(
                last_name.value, last_name.confidence
            )
        )
    document_number = id_document.fields.get("DocumentNumber")
    if document_number:
        print(
            "Document Number: {} has confidence: {}".format(
                document_number.value, document_number.confidence
            )
        )
    dob = id_document.fields.get("DateOfBirth")
    if dob:
        print(
            "Date of Birth: {} has confidence: {}".format(dob.value, dob.confidence)
        )
    doe = id_document.fields.get("DateOfExpiration")
    if doe:
        print(
            "Date of Expiration: {} has confidence: {}".format(
                doe.value, doe.confidence
            )
        )
    sex = id_document.fields.get("Sex")
    if sex:
        print("Sex: {} has confidence: {}".format(sex.value, sex.confidence))
    address = id_document.fields.get("Address")
    if address:
        print(
            "Address: {} has confidence: {}".format(
                address.value, address.confidence
            )
        )
    country_region = id_document.fields.get("CountryRegion")
    if country_region:
        print(
            "Country/Region: {} has confidence: {}".format(
                country_region.value, country_region.confidence
            )
        )
    region = id_document.fields.get("Region")
    if region:
        print(
            "Region: {} has confidence: {}".format(region.value, region.confidence)
        )


--------Recognizing ID document #1--------
First Name: LIAM R. has confidence: 0.882
Last Name: TALBOT has confidence: 0.892
Document Number: WDLABCD456DG has confidence: 0.832
Date of Birth: 1958-01-06 has confidence: 0.851
Date of Expiration: 2020-08-12 has confidence: 0.85
Sex: M has confidence: 0.85
Address: 123 STREET ADDRESS YOUR CITY WA 99999-1234 has confidence: 0.851
Country/Region: USA has confidence: 0.87
Region: Washington has confidence: 0.867
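A retail workflow would typically check whether the extracted expiration date has passed. A minimal sketch of that check follows. `is_expired` is a hypothetical helper, and it assumes the date field's value is a `datetime.date`, which matches the ISO-style dates printed above.

```python
from datetime import date


def is_expired(expiration, today=None):
    """Return True when the expiration date is strictly before today."""
    today = today or date.today()
    return expiration < today


# Date copied from the output above; a fixed "today" keeps the example repeatable.
expiration = date(2020, 8, 12)
print("Expired:", is_expired(expiration, today=date(2021, 1, 1)))
```

In the cell above, the argument would be `doe.value` once the `DateOfExpiration` field has been found.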
