# Long Context Instruction Following and Understanding with Amazon Nova Premier

Large Language Models (LLMs) with extended context lengths, particularly up to 1 million tokens, represent a significant advancement in natural language processing, enabling the handling of vast amounts of information in a single input.

A context window of this size simplifies use cases that would otherwise require splitting the input across several requests.

This notebook demonstrates how to use the context window of up to 1M tokens in Amazon Nova Premier by analysing several years of Amazon financial reports in a single request.

## When is long context useful?

* **Document Processing and Summarization:** LLMs can analyze and summarize extensive documents, such as legal contracts, regulatory texts, or academic papers. For instance, law firms can use these models to reduce manual review time and minimize the risk of overlooking critical information, while banks can process lengthy regulatory documents for compliance.

* **Enhanced Conversation Memory:** In customer service and chat applications, a 1 million token context enables LLMs to retain extensive conversation histories, improving coherence and relevance. This is particularly beneficial for handling long chat threads or email exchanges, enhancing user satisfaction by ensuring the model remembers previous interactions. For example, chatbots can maintain context over hours of dialogue, a significant improvement over models with shorter memory spans.

* **Complex Task Handling:** Tasks requiring long-term planning or integration of multiple data sources benefit from extended contexts. This includes academic research, where LLMs can summarize large volumes of literature and generate hypotheses, and financial analysis, where they can process complex datasets for decision-making. Additionally, LLMs can manage large codebases, supporting developers in debugging or generating code for projects with over 30,000 lines.

## Implementation Example

This notebook walks through a simple financial document analyser:

* Download financial documents covering multiple years.
* Stack the PDF documents as part of the prompt.
* Request an analysis of these reports over time to get insights on trends.
* Show the results.
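
The steps above can be sketched without any network calls. The helper names `build_document_block` and `build_user_message` are hypothetical and introduced only for illustration, and the placeholder bytes stand in for real PDFs; the dictionary layout mirrors the message format used in the cells of this notebook.

```python
from typing import Any, Dict, List


def build_document_block(name: str, pdf_bytes: bytes) -> Dict[str, Any]:
    """Wrap raw PDF bytes in a document content block (hypothetical helper)."""
    return {
        "document": {
            "name": name,
            "format": "pdf",
            "source": {"bytes": pdf_bytes},
        }
    }


def build_user_message(instructions: str, documents: Dict[str, bytes]) -> Dict[str, Any]:
    """Combine the instruction text and all documents into one user message."""
    content: List[Dict[str, Any]] = [{"text": instructions}]
    for name, pdf_bytes in documents.items():
        content.append(build_document_block(name, pdf_bytes))
    return {"role": "user", "content": content}


# Placeholder bytes stand in for downloaded PDFs (assumption: real reports are fetched separately).
example_message = build_user_message(
    "Compare the two reports.",
    {"Report-2023": b"%PDF-placeholder-2023", "Report-2024": b"%PDF-placeholder-2024"},
)
print(len(example_message["content"]))
```

Keeping the assembly in small pure functions makes it easy to inspect the request payload before sending a large, costly request.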

In [None]:
%pip install "boto3>=1.28.57" "awscli>=1.29.57" "botocore>=1.31.57" "requests>=2.32.3" --quiet

## Use case: Multi-year financial report analyser

In [None]:
import boto3
import requests
from botocore.config import Config

PREMIER_MODEL_ID = "us.amazon.nova-premier-v1:0"
PRO_MODEL_ID = "us.amazon.nova-pro-v1:0"
LITE_MODEL_ID = "us.amazon.nova-lite-v1:0"
MICRO_MODEL_ID = "us.amazon.nova-micro-v1:0"

# Create a Bedrock Runtime client in the AWS Region of your choice.
client = boto3.client(
    "bedrock-runtime",
    region_name="us-east-1",
    config=Config(
        connect_timeout=300,  # 5 minutes
        read_timeout=300,     # 5 minutes
        retries={'max_attempts': 1}
    )
)

In [None]:
messages = [
    {
        "role": "user",
        "content": []
    }
]

In [None]:
instructions = """
## Task:
Analyze Amazon's financial reports across multiple years to identify significant performance trends, segment growth patterns, and strategic shifts.

## Context information:
- You have access to Amazon's annual financial reports (10-K) for multiple fiscal years in PDF format
- These reports contain comprehensive financial data including income statements, balance sheets, cash flow statements, and management discussions
- The analysis should focus on year-over-year comparisons to identify meaningful trends
- Amazon operates multiple business segments including North America retail, International retail, Amazon Web Services (AWS), advertising, and subscription services

## Model Instructions:
- FIRST extract key financial metrics from each year's reports
- THEN organize data chronologically to identify meaningful trends
- DO compare segment performance across all years covered by the reports
- DO identify significant strategic shifts or investments mentioned in management discussions
- DO NOT make speculative predictions beyond what is supported by the data
- ALWAYS note any changes in accounting practices or reporting methodologies that might affect year-over-year comparisons

## Response style and format requirements:
- Respond in markdown
- Structure the analysis with clear headings and subheadings
- Present key financial metrics in tabular format showing all covered years side-by-side
- Include percentage changes year-over-year for all major metrics
- Create a section dedicated to visualizing the most significant trends (with descriptions of what would be shown in charts)
- Limit the executive summary to 250 words maximum
- Format segment analysis as separate sections with consistent metrics across all segments
- MUST include a "Key Insights" bullet-pointed list at the end of each major section
"""


In [None]:
query = {
    "text": instructions
}

messages[0]['content'].append(query)

In [None]:
# Reports from: https://ir.aboutamazon.com/annual-reports-proxies-and-shareholder-letters/default.aspx
reports = [
    "https://s2.q4cdn.com/299287126/files/doc_financials/2025/ar/Amazon-2024-Annual-Report.pdf",
    "https://s2.q4cdn.com/299287126/files/doc_financials/2024/ar/Amazon-com-Inc-2023-Annual-Report.pdf",
    # "https://s2.q4cdn.com/299287126/files/doc_financials/2023/ar/Amazon-2022-Annual-Report.pdf",
    # "https://s2.q4cdn.com/299287126/files/doc_financials/2022/ar/Amazon-2021-Annual-Report.pdf",
]

for report_url in reports:
    # Use the file name without its extension as the document name.
    file_name = report_url.split("/")[-1].split(".")[0]
    print(file_name)

    # Download the PDF and fail fast on HTTP errors.
    response = requests.get(report_url, timeout=60)
    response.raise_for_status()

    # The document block takes the raw PDF bytes directly.
    pdf_bytes = response.content
    document_block = {
        "document": {
            "name": file_name,
            "format": "pdf",
            "source": {"bytes": pdf_bytes},
        }
    }
    messages[0]["content"].append(document_block)
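
The Converse API restricts which characters may appear in a document name (at the time of writing: alphanumeric characters, single whitespace characters, hyphens, parentheses, and square brackets; check the current documentation). The file names used above already comply, but other sources may not. The following is a minimal sketch of a cleanup step; `sanitize_document_name` is a hypothetical helper, not part of boto3.

```python
import re


def sanitize_document_name(raw_name: str) -> str:
    """Keep letters, digits, single spaces, hyphens, parentheses, and
    square brackets; replace everything else (hypothetical helper)."""
    # Replace any disallowed character with a hyphen.
    cleaned = re.sub(r"[^A-Za-z0-9\s\-\(\)\[\]]", "-", raw_name)
    # Collapse runs of whitespace into a single space and trim the ends.
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    # Fall back to a generic name if nothing is left (assumed default).
    return cleaned or "document"


print(sanitize_document_name("Amazon_2024  Annual.Report"))
```

Applying this to `file_name` before building the document block avoids a validation error for names that contain underscores or dots.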

In [None]:
# Define your system prompt(s).
system_prompt = [
    {
        "text": "You are an expert analyst that can critically analyse financial reports."
    }
]

# Configure the inference parameters.
inf_params = {"maxTokens": 10000, "topP": 0.1, "temperature": 0.1}

## Converse API

You can use the Amazon Bedrock [Converse API](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference.html) to create conversational applications that send and receive messages to and from an Amazon Bedrock model. To use the Converse API, you use the Converse or ConverseStream (for streaming responses) operations to send messages to a model.
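
A Converse response is a nested dictionary. The sketch below shows one way to pull out the generated text, stop reason, and token usage. `summarize_converse_response` is a hypothetical helper, and `fake_response` is a hand-written stand-in with made-up values so that the example runs without calling the service.

```python
from typing import Any, Dict


def summarize_converse_response(response: Dict[str, Any]) -> Dict[str, Any]:
    """Extract text, stop reason, and token counts from a Converse-style response."""
    content_blocks = response["output"]["message"]["content"]
    # Join all text blocks; non-text blocks are skipped.
    text = "".join(block["text"] for block in content_blocks if "text" in block)
    usage = response.get("usage", {})
    return {
        "text": text,
        "stop_reason": response.get("stopReason"),
        "input_tokens": usage.get("inputTokens"),
        "output_tokens": usage.get("outputTokens"),
    }


# Stand-in response; the numbers are illustrative, not real measurements.
fake_response = {
    "output": {"message": {"role": "assistant", "content": [{"text": "Net sales increased."}]}},
    "stopReason": "end_turn",
    "usage": {"inputTokens": 1200, "outputTokens": 4, "totalTokens": 1204},
}
summary = summarize_converse_response(fake_response)
print(summary["text"])
```

Checking `stop_reason` is useful with long outputs: a value of `max_tokens` indicates the answer was cut off by the `maxTokens` setting.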

In [None]:
from IPython.display import JSON, Markdown

model_id = PREMIER_MODEL_ID

model_response = client.converse(
    modelId=model_id, system=system_prompt, messages=messages, inferenceConfig=inf_params,
)

content_text = model_response["output"]["message"]["content"][0]["text"]

JSON(model_response)

Render the response text as Markdown in JupyterLab.

In [None]:
from IPython.display import JSON, Markdown

Markdown(content_text)

## Converse Stream API

In [None]:
model_response = client.converse_stream(
    modelId=model_id, messages=messages, system=system_prompt, inferenceConfig=inf_params
)

stream = model_response.get('stream')
if stream:
    for event in stream:

        if 'messageStart' in event:
            print(f"\nRole: {event['messageStart']['role']}")

        if 'contentBlockDelta' in event:
            print(event['contentBlockDelta']['delta']['text'], end="")

        if 'messageStop' in event:
            print(f"\nStop reason: {event['messageStop']['stopReason']}")

        if 'metadata' in event:
            metadata = event['metadata']
            if 'usage' in metadata:
                print("\nToken usage")
                print(f"Input tokens: {metadata['usage']['inputTokens']}")
                print(
                    f"Output tokens: {metadata['usage']['outputTokens']}")
                print(f"Total tokens: {metadata['usage']['totalTokens']}")
            if 'metrics' in event['metadata']:
                print(
                    f"Latency: {metadata['metrics']['latencyMs']} milliseconds")
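
The loop above prints each chunk as it arrives. To keep the full answer for later rendering, the text deltas can also be collected into a single string. This is a minimal sketch: `collect_stream_text` is a hypothetical helper, and `fake_events` is a hand-written list that imitates the event shapes handled above so that the example runs offline.

```python
from typing import Any, Dict, Iterable


def collect_stream_text(events: Iterable[Dict[str, Any]]) -> str:
    """Concatenate the text deltas from a sequence of stream events."""
    parts = []
    for event in events:
        # Events without a text delta (start, stop, metadata) are skipped.
        delta = event.get("contentBlockDelta", {}).get("delta", {})
        if "text" in delta:
            parts.append(delta["text"])
    return "".join(parts)


# Stand-in events with illustrative content.
fake_events = [
    {"messageStart": {"role": "assistant"}},
    {"contentBlockDelta": {"delta": {"text": "Revenue "}, "contentBlockIndex": 0}},
    {"contentBlockDelta": {"delta": {"text": "grew."}, "contentBlockIndex": 0}},
    {"messageStop": {"stopReason": "end_turn"}},
]
streamed_text = collect_stream_text(fake_events)
print(streamed_text)
```

With a live call, the same function could be given `model_response["stream"]`, after which `Markdown(streamed_text)` would render the result.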


## Asking questions about the document

Let's look at Amazon Nova's ability to answer a question using a specific piece of information located somewhere within the documents.

In [None]:
query = {
    "text": "How many employees worked at Amazon at the end of 2023?"
}

messages[0]['content'].append(query)
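
The cell above appends the question to the same user message that already holds the analysis instructions, so the model receives both. If only the question should be answered, one option is to build a fresh message that reuses the document blocks. This is a sketch under that assumption; `build_question_message` is a hypothetical helper, and the example content uses placeholder bytes.

```python
from typing import Any, Dict, List


def build_question_message(existing_content: List[Dict[str, Any]], question: str) -> Dict[str, Any]:
    """Reuse only the document blocks and pair them with a new question."""
    documents = [block for block in existing_content if "document" in block]
    return {"role": "user", "content": documents + [{"text": question}]}


# Stand-in content: one instruction block and one document block.
example_content = [
    {"text": "Analyze the reports."},
    {"document": {"name": "Report-2023", "format": "pdf", "source": {"bytes": b"%PDF-placeholder"}}},
]
question_message = build_question_message(
    example_content, "How many employees worked at Amazon at the end of 2023?"
)
print([next(iter(block)) for block in question_message["content"]])
```

Passing `messages=[question_message]` to `client.converse` would then send only the documents and the question.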


In [None]:
model_response = client.converse(
    modelId=model_id, system=system_prompt, messages=messages, inferenceConfig=inf_params,
)

content_text = model_response["output"]["message"]["content"][0]["text"]

print("\n[Full Response]")
JSON(model_response)

In [None]:
Markdown(content_text)

## Conclusion

We showed how to download and analyse multiple years of financial reports in PDF format using a single prompt and Amazon Nova Premier's context window of up to 1M tokens. This opens up new use cases and simplifies many existing ones.
