# Intelligent Document Processing (IDP) with Amazon Nova Models

In this notebook, we demonstrate how to use Amazon Nova models with the Amazon Bedrock Converse API for Intelligent Document Processing (IDP). Our goal is to extract both free-text summaries and structured data from PDF invoices. We cover three use cases:

- **Summarizing Content:** Extracting a brief summary and key insights from a single invoice.
- **Structured Data Extraction:** Using a custom tool configuration to force the model to output structured data from invoices that follow the same layout.
- **Generating Business Insights:** Aggregating data from multiple invoices to answer specific business questions.

This notebook assumes you are running the code with proper AWS credentials (preferably using an IAM role) and that you have enabled Amazon Bedrock models (in us-west-2) in your account. For more details on setting up temporary AWS credentials and enabling models, see the links in the notes below.

**Notes:**
- Make sure you run this code with your AWS credentials. This notebook assumes the credentials are loaded from an IAM role; however, you may use your access key if you are not using IAM roles. For details on how to set up temporary AWS credentials, see [this link](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_use-resources.html).
- Before using Amazon Bedrock models in this notebook, you need to enable them in your account in us-west-2. For details on the steps required to enable a model, see [this link](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access-modify.html).
- This notebook uses Amazon Nova Pro by default; usage is [charged on demand, per request](https://aws.amazon.com/bedrock/pricing/).
- Amazon Nova Pro will be used with the [Cross-Region Inference mode](https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference.html).
- While this notebook uses us-west-2 (Oregon), Amazon Nova is available in a variety of AWS Regions. Check our [documentation page](https://docs.aws.amazon.com/bedrock/latest/userguide/models-regions.html) for more details about the available regions.

## Installing requirements

In [1]:
%pip install pip setuptools packaging -U --quiet

Note: you may need to restart the kernel to use updated packages.


In [2]:
%pip install boto3 IPython --quiet

Note: you may need to restart the kernel to use updated packages.


Let's import our modules:

In [3]:
import boto3
import logging
from IPython.display import IFrame
from botocore.exceptions import ClientError
from pprint import pprint
import json

In [4]:
logging.basicConfig(level=logging.INFO,
                        format="%(levelname)s: %(message)s")

In [5]:
# Update this variable if you have enabled Amazon Nova in a different region
region_name = "us-west-2"

## Summarizing Content: Extracting a brief summary and key insights from a single invoice
In this use case we will use Amazon Nova models to process and extract data from PDF invoices using the Bedrock Converse API.

Here is our helper function to load a PDF and send requests to Amazon Bedrock using **Amazon Nova Pro**.

In [6]:
def send_pdf_to_bedrock(prompt, model_id="us.amazon.nova-pro-v1:0", pdf_file_path=None, tool_config=None):
    """
    Sends a message to the AWS Bedrock Converse API for processing, optionally including a PDF document.
    This function constructs a message that includes a text prompt and, if provided, a PDF document (as raw bytes)
    read from the specified file system path. It then sends the message to the AWS Bedrock Converse API using the provided
    model ID. Optionally, if a tool configuration is provided via the `tool_config` parameter, it is passed to the API call.
    The API response, which includes the model's output message and associated metadata, is returned.
    Parameters:
        prompt (str): A text prompt that provides context or instructions for processing.
        model_id (str): The identifier of the AWS Bedrock model to use for inference. This model must support document
            inputs. If no model_id is provided, Amazon Nova Pro will be used.
        pdf_file_path (str, optional): The file system path to the PDF document. If provided, the PDF will be read and
            included in the message. Defaults to None.
        tool_config (dict, optional): A dictionary specifying the tool configuration to be passed to the Converse API.
            Defaults to None.
    Returns:
        dict or None: The JSON response from the Bedrock Converse API if the call is successful, containing the model's
        output and metadata; returns None if a ClientError occurs (the error is printed, not re-raised).
    Raises:
        OSError: If the PDF file at `pdf_file_path` cannot be opened or read.
    """
    # Build the content list starting with the prompt
    content_blocks = [{"text": prompt}]
    # If a PDF file path is provided, read the file and add a document block
    if pdf_file_path is not None:
        with open(pdf_file_path, "rb") as f:
            pdf_bytes = f.read()
        content_blocks.append({
            "document": {
                "name": "my_invoice_template",  # Use a neutral name
                "format": "pdf",  # Supported formats: pdf | csv | doc | docx | xls | xlsx | html | txt | md
                "source": {
                    "bytes": pdf_bytes
                }
            }
        })
    # Initialize the Bedrock runtime client
    # Note: 'region_name' should be defined or replaced with your desired AWS region
    bedrock_client = boto3.client("bedrock-runtime", region_name=region_name)
    # Construct the message with the content blocks
    message = {
        "role": "user",
        "content": content_blocks
    }
    # Optional inference configuration parameters
    inference_config = {
        "maxTokens": 3000,
        "temperature": 0
    }
    try:
        # Build the parameters for the converse call
        converse_params = {
            "modelId": model_id,
            "messages": [message],
            "inferenceConfig": inference_config
        }
        # Include toolConfig if provided
        if tool_config is not None:
            converse_params["toolConfig"] = tool_config
        # Call the Converse API with the constructed parameters
        response = bedrock_client.converse(**converse_params)
        return response
    except ClientError as e:
        print("An error occurred:", e)
        return None

Here is what our test PDF looks like:

In [7]:
pdf_path = "invoices/test_invoice_0_1.pdf"
IFrame(pdf_path, width=600, height=900)

Let's extract a summary from one of our invoices:

In [8]:
prompt_text = "Please summarize the content of this PDF and share your insights in a list format."

result = send_pdf_to_bedrock(prompt=prompt_text, pdf_file_path=pdf_path)
if result:
    # Extract and print the text response from the model
    output_message = result.get("output", {}).get("message", {})
    for content in output_message.get("content", []):
        if "text" in content:
            print(content["text"])

INFO: Found credentials from IAM Role: BaseNotebookInstanceEc2InstanceRole


- The invoice is from the Sunset Hotel.
- Invoice number is 5331.
- The guest's name is Thomas Johnson DDS.
- The guest's address is 37618 Lee Keys, West Jenniferbury, MO 80214.
- The date of issue is January 14, 2022.
- There is 1 adult and 0 children staying.
- The guest room costs $180.
- Laundry service costs $15.
- Dinner costs $70.
- Breakfast costs $30.
- Minibar costs $90.
- Taxi costs $90.
- Subtotal is $475.
- Discounts amount to $22.
- Taxes amount to $40.2.
- The total amount due is $493.2.
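
The response-parsing loop above recurs in later cells. As a minimal sketch, it could be factored into small helpers, assuming the Converse response shape used in this notebook (a list of content blocks under `output.message.content`, plus a top-level `usage` dictionary with token counts). The names `extract_text` and `token_usage` are hypothetical and not part of boto3; the sample response is hand-written with placeholder token counts.

```python
def extract_text(response):
    """Join all text blocks from a Converse API response into one string."""
    if not response:
        return ""
    content = response.get("output", {}).get("message", {}).get("content", [])
    return "\n".join(block["text"] for block in content if "text" in block)


def token_usage(response):
    """Return (input_tokens, output_tokens), or (0, 0) if usage is missing."""
    usage = (response or {}).get("usage", {})
    return usage.get("inputTokens", 0), usage.get("outputTokens", 0)


# Hand-written stand-in for a real response (token counts are placeholders)
sample_response = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [{"text": "- The invoice is from the Sunset Hotel."}],
        }
    },
    "usage": {"inputTokens": 10, "outputTokens": 5, "totalTokens": 15},
}

print(extract_text(sample_response))
print(token_usage(sample_response))
```

Because `send_pdf_to_bedrock` returns `None` on a `ClientError`, both helpers accept `None` and fall back to empty values instead of raising.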


## Structured Data Extraction: Using a custom tool configuration to force the model to output structured data from invoices that follow the same layout
Amazon Nova Pro was able to extract all the information present in our sample invoice. However, let’s improve our workflow by extracting data from our invoices using a custom blueprint and producing structured output. For this approach, we will use a custom tool, which is ideal if you need to force your model to produce structured output. In this sample code, we assume that all invoices follow the same layout; however, you can modify the prompts to dynamically extract data as well. First, we will ask Amazon Nova to generate our tool schema:

In [9]:
prompt_text_tool_builder="""
You are an expert at generating toolConfig for the Amazon Bedrock Converse API. Your response must be a valid compact single-line JSON data structure and must contain
the required toolConfig for a converse API tool to extract structured data from documents like my_invoice_template.
Here is a sample toolConfig, use my_invoice_template document to create the required toolConfig:
toolConfig={
        "tools": [
            {
                "toolSpec": {
                    "name": "weather",
                    "description": "Use this tool if the request is to get the previous, current, or future weather for a country",
                    "inputSchema": {
                        "json": {
                            "type": "object",
                            "properties": {
                                "country": {
                                    "type": "string"
                                },
                                "dates": {
                                    "type": "object",
                                    "properties": {
                                        "date_start": {
                                            "type": "string",
                                            "format": "date"
                                        },
                                        "date_end": {
                                            "type": "string",
                                            "format": "date"
                                        }
                                    },
                                    "required": [
                                        "date_start",
                                        "date_end"
                                    ]
                                }
                            },
                            "required": [
                                "country",
                                "dates"
                            ]
                        }
                    }
                }
            }
        ],
        "toolChoice": {
            "auto": {}
        }
    }
"""

In [10]:
result = send_pdf_to_bedrock(prompt=prompt_text_tool_builder, pdf_file_path=pdf_path)
if result:
    # Extract and print the text response from the model
    output_message = result.get("output", {}).get("message", {})
    for content in output_message.get("content", []):
        if "text" in content:
            tools = content["text"]
            print(tools)

{"tools":[{"toolSpec":{"name":"extract_invoice_data","description":"Use this tool to extract structured data from a hotel invoice document.","inputSchema":{"json":{"type":"object","properties":{"invoice_number":{"type":"string"},"guest_name":{"type":"string"},"date_of_issue":{"type":"string","format":"date"},"guest_address":{"type":"string"},"no_of_adults":{"type":"integer"},"no_of_children":{"type":"integer"},"others":{"type":"integer"},"items":{"type":"array","items":{"type":"object","properties":{"no":{"type":"integer"},"date":{"type":"string","format":"date"},"description":{"type":"string"},"amount":{"type":"number"}},"required":["no","date","description","amount"]}},"subtotal":{"type":"number"},"discounts":{"type":"number"},"taxes":{"type":"number"},"total":{"type":"number"}},"required":["invoice_number","guest_name","date_of_issue","guest_address","no_of_adults","no_of_children","others","items","subtotal","discounts","taxes","total"]}}}}],"toolChoice":{"auto":{}}}
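
Since the tool schema is model-generated text, it may be worth parsing and checking it before passing it to the Converse API. The following is a minimal sketch, not a full schema validator: `load_tool_config` is a hypothetical helper, the Markdown-fence handling assumes the model might wrap its JSON in a code fence, and the `force=True` branch assumes the chosen model accepts the `{"tool": {"name": ...}}` form of `toolChoice` (support varies by model). The `generated` string is an abridged version of the output above.

```python
import json


def load_tool_config(text, force=False):
    """Parse model-generated toolConfig text and run basic structural checks."""
    cleaned = text.strip()
    fence = "`" * 3
    # Assumption: the model may wrap its JSON in a Markdown code fence
    if cleaned.startswith(fence):
        cleaned = cleaned.strip("`").strip()
        if cleaned.startswith("json"):
            cleaned = cleaned[len("json"):]
    config = json.loads(cleaned)

    tools = config.get("tools")
    if not isinstance(tools, list) or not tools:
        raise ValueError("toolConfig must contain a non-empty 'tools' list")
    for tool in tools:
        spec = tool.get("toolSpec", {})
        if "name" not in spec or "json" not in spec.get("inputSchema", {}):
            raise ValueError("each toolSpec needs a name and inputSchema.json")

    if force:
        # Ask the model to always call the first tool instead of deciding itself
        config["toolChoice"] = {"tool": {"name": tools[0]["toolSpec"]["name"]}}
    return config


# Abridged version of the generated schema, for illustration
generated = (
    '{"tools":[{"toolSpec":{"name":"extract_invoice_data",'
    '"description":"Use this tool to extract structured data from a hotel invoice document.",'
    '"inputSchema":{"json":{"type":"object",'
    '"properties":{"invoice_number":{"type":"string"}},'
    '"required":["invoice_number"]}}}}],"toolChoice":{"auto":{}}}'
)

print(load_tool_config(generated)["toolChoice"])
print(load_tool_config(generated, force=True)["toolChoice"])
```

With `toolChoice` left at `auto`, the model decides whether to call the tool, so a response may contain plain text instead of a `toolUse` block.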


Now we call `send_pdf_to_bedrock` again, this time passing our new tool.

In [11]:
prompt_text = "Please summarize the content of this PDF and share your insights."
document_list = []

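# Parse the generated JSON with json.loads instead of eval, which would execute arbitrary model output
result = send_pdf_to_bedrock(prompt=prompt_text, pdf_file_path=pdf_path, tool_config=json.loads(tools))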
if result:
    # Extract and print the text response from the model
    output_message = result.get("output", {}).get("message", {})
    for content in output_message.get("content", []):
        if "toolUse" in content:
            pprint(content["toolUse"]["input"])
            # We will store the extracted information in our list
            document_list.append(content["toolUse"]["input"])

{'date_of_issue': '2022-01-14',
 'discounts': 22.0,
 'guest_address': '37618 Lee Keys West Jenniferbury, MO 80214',
 'guest_name': 'Thomas Johnson DDS',
 'invoice_number': '5331',
 'items': [{'amount': 180.0,
            'date': '2022-01-14',
            'description': 'Guest room',
            'no': 1},
           {'amount': 15.0,
            'date': '2022-01-14',
            'description': 'Laundry',
            'no': 2},
           {'amount': 70.0,
            'date': '2022-01-14',
            'description': 'Dinner',
            'no': 3},
           {'amount': 30.0,
            'date': '2022-01-14',
            'description': 'Breakfast',
            'no': 4},
           {'amount': 90.0,
            'date': '2022-01-14',
            'description': 'Minibar',
            'no': 5},
           {'amount': 90.0,
            'date': '2022-01-14',
            'description': 'Taxi',
            'no': 6}],
 'no_of_adults': 1,
 'no_of_children': 0,
 'others': 0,
 'subtotal': 475.0,
 'taxes':
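
Extracted figures can be sanity-checked with plain arithmetic before they are used downstream. The sketch below assumes that the subtotal is the sum of the line items and that the total equals subtotal minus discounts plus taxes, which matches the first invoice (475 - 22 + 40.2 = 493.2). `check_invoice_totals` is a hypothetical helper, and the sample record keeps only the amounts reported in the summary earlier.

```python
def check_invoice_totals(invoice, tolerance=0.01):
    """Return a list of inconsistencies; an empty list means the totals agree."""
    problems = []

    # Assumption: subtotal is the plain sum of all line-item amounts
    items_sum = sum(item["amount"] for item in invoice["items"])
    if abs(items_sum - invoice["subtotal"]) > tolerance:
        problems.append(f"items sum {items_sum} != subtotal {invoice['subtotal']}")

    # Assumption: total = subtotal - discounts + taxes
    expected_total = invoice["subtotal"] - invoice["discounts"] + invoice["taxes"]
    if abs(expected_total - invoice["total"]) > tolerance:
        problems.append(f"expected total {expected_total} != total {invoice['total']}")

    return problems


# Amounts taken from the first invoice's summary; other fields omitted
invoice = {
    "items": [{"amount": a} for a in (180.0, 15.0, 70.0, 30.0, 90.0, 90.0)],
    "subtotal": 475.0,
    "discounts": 22.0,
    "taxes": 40.2,
    "total": 493.2,
}

print(check_invoice_totals(invoice))
```

A tolerance is used rather than exact equality because amounts such as 40.2 are not exactly representable as binary floats.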

We will now extract the information from a second document using the same tool created by Nova, and add the document to our list.

In [12]:
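# Here is a different invoice
pdf_path = "invoices/test_invoice_0_2.pdf"

# Parse the generated JSON with json.loads instead of eval, which would execute arbitrary model output
result = send_pdf_to_bedrock(prompt=prompt_text, pdf_file_path=pdf_path, tool_config=json.loads(tools))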
if result:
    # Extract and print the text response from the model
    output_message = result.get("output", {}).get("message", {})
    for content in output_message.get("content", []):
        if "toolUse" in content:
            pprint(content["toolUse"]["input"])
            # We will store the extracted information in our list
            document_list.append(content["toolUse"]["input"])

{'date_of_issue': '2021-04-21',
 'discounts': 24,
 'guest_address': '73065 Amanda Forges Crossbyside, AL 81369',
 'guest_name': 'Michaela Johns',
 'invoice_number': '9582',
 'items': [{'amount': 180,
            'date': '2021-04-19',
            'description': 'Guest room',
            'no': 1},
           {'amount': 30,
            'date': '2021-04-19',
            'description': 'Breakfast',
            'no': 2},
           {'amount': 180,
            'date': '2021-04-20',
            'description': 'Guest room',
            'no': 3},
           {'amount': 15,
            'date': '2021-04-20',
            'description': 'Laundry',
            'no': 4},
           {'amount': 70,
            'date': '2021-04-20',
            'description': 'Dinner',
            'no': 5},
           {'amount': 30,
            'date': '2021-04-20',
            'description': 'Breakfast',
            'no': 6},
           {'amount': 180,
            'date': '2021-04-21',
            'description': 'Guest r

In [13]:
IFrame(pdf_path, width=600, height=900)

We have two documents in our list now:

In [14]:
print(f'Document count: {len(document_list)}')

Document count: 2
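
With more than two invoices, the per-file cells above could be folded into a loop. This is a minimal sketch under the assumption that every invoice shares the same layout and tool; `extract_invoices` is a hypothetical helper that takes the sending function as a parameter, so it can be exercised without calling Amazon Bedrock.

```python
def extract_invoices(pdf_paths, send_fn, prompt, tool_config):
    """Run tool-based extraction over several PDFs and collect the results.

    Returns (documents, failures): the extracted tool inputs, and the paths
    for which the call failed or the model did not use the tool.
    """
    documents, failures = [], []
    for path in pdf_paths:
        response = send_fn(prompt=prompt, pdf_file_path=str(path), tool_config=tool_config)
        content = (response or {}).get("output", {}).get("message", {}).get("content", [])
        inputs = [block["toolUse"]["input"] for block in content if "toolUse" in block]
        if inputs:
            documents.extend(inputs)
        else:
            # send_fn returned None, or the model answered without a toolUse block
            failures.append(str(path))
    return documents, failures


# Possible usage in this notebook (requires AWS access, so left commented out):
# import glob
# document_list, failed = extract_invoices(
#     sorted(glob.glob("invoices/*.pdf")), send_pdf_to_bedrock, prompt_text, json.loads(tools)
# )
```

Tracking failures separately keeps a single bad file or a text-only answer from silently shrinking the document list.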


## Generating Business Insights: Aggregating data from multiple invoices to answer specific business questions
We have used Amazon Nova to create a custom blueprint, in the form of a tool for the Converse API, and extracted data from two documents using that blueprint. We will now query our data and generate insights with the following questions:
1. What guest spent the most in the hotel?
2. What guest spent the least nights in the hotel?

In [15]:
user_questions="""
1. What guest spent the most in the hotel?
2. What guest spent the least nights in the hotel?
"""

prompt_extracting_insights=f"""
You are a business analyst specialized in extracting information from invoices. You will be provided with a list of invoices.
Answer the user questions using only information present in your list of invoices. Your response must be in raw text.
User question: {user_questions}
List of invoices: {document_list}
"""

In [17]:
result = send_pdf_to_bedrock(prompt=prompt_extracting_insights)
if result:
    # Extract and print the text response from the model
    output_message = result.get("output", {}).get("message", {})
    for content in output_message.get("content", []):
        if "text" in content:
            print(content["text"])

1. The guest who spent the most in the hotel is Michaela Johns with a total of 844.9.
2. The guest who spent the least nights in the hotel is Thomas Johnson DDS with 1 night.
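
Because the extracted data is already structured, the model's answers can be cross-checked with a few lines of plain Python. The sketch below rests on two assumptions that are not stated in the notebook: each invoice belongs to a different guest, and each line item described as `Guest room` counts as one night. `top_spender` and `fewest_nights` are hypothetical helpers.

```python
def top_spender(invoices):
    """Return (guest_name, total) for the invoice with the largest total."""
    # Assumption: one invoice per guest, so totals are not summed per guest
    best = max(invoices, key=lambda inv: inv["total"])
    return best["guest_name"], best["total"]


def fewest_nights(invoices, room_description="Guest room"):
    """Return (guest_name, nights) for the invoice with the fewest room items."""
    def nights(inv):
        # Assumption: each room line item corresponds to one night
        return sum(1 for item in inv["items"] if item["description"] == room_description)

    best = min(invoices, key=nights)
    return best["guest_name"], nights(best)


# Possible usage with the list built earlier in this notebook:
# print(top_spender(document_list))
# print(fewest_nights(document_list))
```

If the two results disagree with the model's answer, the extracted records are the place to look first.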
