# Getting Started with Amazon Nova Models

Amazon Nova foundation models deliver frontier intelligence and industry-leading price-performance. Through Amazon Bedrock, organizations can build and scale generative AI applications with Nova that are safe, reliable, and cost-effective.

This workshop will focus primarily on **Amazon Nova Understanding Models:**

**Amazon Nova Micro:** Lightning-fast, cost-effective text-only model

**Amazon Nova Lite:** Fastest, most affordable multimodal FM in the industry for its intelligence tier

**Amazon Nova Pro:** The fastest, most cost-effective, state-of-the-art multimodal model in the industry

**Amazon Nova Premier:** Most capable multimodal model for complex tasks and the best teacher for distilling custom models for cost-effective applications.



<div class="alert alert-info">
 <b>You will see pip dependency errors; you can safely ignore them.</b>
    
    IGNORE ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
</div>

In [1]:
%pip install --upgrade -r ../requirements.txt -q

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
autogluon-multimodal 1.4.0 requires nvidia-ml-py3<8.0,>=7.352.0, which is not installed.
sagemaker-studio 1.1.1 requires pydynamodb>=0.7.4, which is not installed.
aiobotocore 2.22.0 requires botocore<1.37.4,>=1.37.2, but you have botocore 1.42.2 which is incompatible.
amazon-sagemaker-jupyter-ai-q-developer 1.2.8 requires numpy<=2.0.1, but you have numpy 2.3.5 which is incompatible.
amazon-sagemaker-sql-magic 0.1.4 requires numpy<2, but you have numpy 2.3.5 which is incompatible.
autogluon-multimodal 1.4.0 requires transformers[sentencepiece]<4.50,>=4.38.0, but you have transformers 4.57.1 which is incompatible.
autogluon-timeseries 1.4.0 requires transformers[sentencepiece]<4.50,>=4.38.0, but you have transformers 4.57.1 which is incompatible.
catboost 1.2.7 requires numpy<2.0,>=1.16.0, but you have numpy 2

In [3]:
# Restart the kernel so the upgraded packages are picked up
from IPython.display import HTML

HTML("<script>Jupyter.notebook.kernel.restart()</script>")

## Using the boto3 SDK in Python

Interaction with the Bedrock API is done via the AWS SDK for Python: [boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html).

### Using the Default Credential Chain

If you are running this notebook from [Amazon SageMaker Studio](https://aws.amazon.com/sagemaker/studio/) and your SageMaker Studio [execution role](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html) has permissions to access Bedrock, you can just run the cells below as-is. This is also the case if you are running these notebooks from a computer whose default AWS credentials have access to Bedrock.

In [4]:
import sagemaker
from IPython.display import display, Markdown, Video
import base64
from datetime import datetime
import json
import boto3
import os

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/sagemaker-user/.config/sagemaker/config.yaml


For the hosted workshop, we access Nova models from the us-west-2 Region through cross-Region inference (CRIS). For more information, see the [inference profiles documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html).

In [5]:
# Get the region from the SageMaker session
region_name = sagemaker.Session().boto_region_name
print(f"Current AWS Region: {region_name}")

Current AWS Region: us-west-2


In [6]:
MICRO_MODEL_ID = "us.amazon.nova-micro-v1:0"
LITE_MODEL_ID = "us.amazon.nova-lite-v1:0"
PRO_MODEL_ID = "us.amazon.nova-pro-v1:0"
PREMIER_MODEL_ID = "us.amazon.nova-premier-v1:0"

%store MICRO_MODEL_ID
%store LITE_MODEL_ID
%store PRO_MODEL_ID
%store PREMIER_MODEL_ID
%store region_name

Stored 'MICRO_MODEL_ID' (str)
Stored 'LITE_MODEL_ID' (str)
Stored 'PRO_MODEL_ID' (str)
Stored 'PREMIER_MODEL_ID' (str)
Stored 'region_name' (str)
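
The `us.` prefix on the model IDs above selects a cross-Region inference profile rather than the base model. As a rough sketch, the helper below derives such an ID from a base model ID and an AWS Region. The function name and the Region-to-prefix mapping are assumptions made for illustration; check the inference profiles documentation for the profiles actually available to your account.

```python
# Hypothetical mapping from the first part of a Region name to a geography prefix.
GEO_PREFIXES = {"us": "us", "eu": "eu", "ap": "apac"}


def to_inference_profile_id(base_model_id: str, region: str) -> str:
    """Prefix a base model ID with the geography inferred from the Region."""
    geo = GEO_PREFIXES.get(region.split("-")[0])
    if geo is None:
        raise ValueError(f"No known geography prefix for Region {region!r}")
    return f"{geo}.{base_model_id}"


print(to_inference_profile_id("amazon.nova-micro-v1:0", "us-west-2"))
# prints: us.amazon.nova-micro-v1:0
```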


### Validate the Connection

We can check that the client works by trying out the `list_foundation_models()` method, which will tell us all the models available for us to use:

In [7]:
client = boto3.client("bedrock")
[
    model["modelId"]
    for model in client.list_foundation_models()["modelSummaries"]
    if model["modelId"].startswith("amazon.nova")
]

['amazon.nova-pro-v1:0',
 'amazon.nova-2-lite-v1:0',
 'amazon.nova-2-sonic-v1:0',
 'amazon.nova-premier-v1:0:8k',
 'amazon.nova-premier-v1:0:20k',
 'amazon.nova-premier-v1:0:1000k',
 'amazon.nova-premier-v1:0:mm',
 'amazon.nova-premier-v1:0',
 'amazon.nova-lite-v1:0',
 'amazon.nova-micro-v1:0']
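
The listing above mixes base model IDs with variant IDs that carry an extra suffix (such as `:8k` or `:mm`). A minimal sketch for keeping only the base IDs follows, assuming (from the output shown) that a base ID contains exactly one colon. The helper name is hypothetical.

```python
def base_model_ids(model_ids):
    """Keep IDs with a single colon, dropping suffixed variants such as ':8k'."""
    return [model_id for model_id in model_ids if model_id.count(":") == 1]


listed = [
    "amazon.nova-pro-v1:0",
    "amazon.nova-premier-v1:0:8k",
    "amazon.nova-premier-v1:0:mm",
    "amazon.nova-premier-v1:0",
    "amazon.nova-micro-v1:0",
]
print(base_model_ids(listed))
# prints: ['amazon.nova-pro-v1:0', 'amazon.nova-premier-v1:0', 'amazon.nova-micro-v1:0']
```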

### InvokeModel Request and Response Format

The `invoke_model()` method of the Amazon Bedrock runtime client (InvokeModel API) will be the primary method we use for most of our Text Generation and Processing tasks.

Although the method is shared, the format of input and output varies depending on the foundation model used. The sample JSON schema below shows the structure for Amazon Nova models:

```json
{
  "system": [
    {
      "text": string
    }
  ],
  "messages": [
    {
      "role": "user", // first turn should always be the user turn
      "content": [
        {
          "text": string
        },
        {
          "image": {
            "format": "jpeg"| "png" | "gif" | "webp",
            "source": {
              "bytes": "base64EncodedImageDataHere..." // base64-encoded binary
            }
          }
        },
        {
          "video": {
            "format": "mkv" | "mov" | "mp4" | "webm" | "three_gp" | "flv" | "mpeg" | "mpg" | "wmv",
            "source": {
              // source is either an S3 location or base64 bytes, depending on the size of the input file
              "s3Location": {
                "uri": string, // example: s3://my-bucket/object-key
                "bucketOwner": string // (Optional) example: 123456789012
              },
              "bytes": "base64EncodedVideoDataHere..." // base64-encoded binary
            }
          }
        }
      ]
    },
    {
      "role": "assistant",
      "content": [
        {
          "text": string // prefilling assistant turn
        }
      ]
    }
  ],
  "inferenceConfig": { // all Optional
    "max_new_tokens": int, // greater than 0, less than or equal to 5k (default: dynamic*)
    "temperature": float, // greater than 0 and less than 1.0 (default: 0.7)
    "top_p": float, // greater than 0, less than or equal to 1.0 (default: 0.9)
    "top_k": int, // 0 or greater (default: 50)
    "stopSequences": [string]
  },
  "toolConfig": { // all Optional
        "tools": [
                {
                    "toolSpec": {
                        "name": string, // meaningful tool name (Max char: 64)
                        "description": string, // meaningful description of the tool
                        "inputSchema": {
                            "json": { // The JSON schema for the tool
                                "type": "object",
                                "properties": {
                                    <args>: { // arguments 
                                        "type": string, // argument data type
                                        "description": string // meaningful description
                                    }
                                },
                                "required": [
                                    string // args
                                ]
                            }
                        }
                    }
                }
            ],
        "toolChoice": "auto" // Three supported parameter options: tool, any, and auto
        }
}
```
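
To make the schema concrete, here is a minimal sketch that assembles a text-only request body and enforces the ranges given in the schema comments. The function name `build_nova_request` is hypothetical, and reading "5k" as 5000 is an assumption.

```python
import json


def build_nova_request(user_text, system_text=None, max_new_tokens=1024,
                       temperature=0.7, top_p=0.9, top_k=50):
    """Build a text-only request body that follows the schema above."""
    # Range checks mirror the comments in the schema ("5k" is read as 5000 here).
    if not 0 < max_new_tokens <= 5000:
        raise ValueError("max_new_tokens must be greater than 0 and at most 5000")
    if not 0 < temperature < 1.0:
        raise ValueError("temperature must be greater than 0 and less than 1.0")
    if not 0 < top_p <= 1.0:
        raise ValueError("top_p must be greater than 0 and at most 1.0")
    if top_k < 0:
        raise ValueError("top_k must be 0 or greater")

    body = {
        # The first turn is always the user turn.
        "messages": [{"role": "user", "content": [{"text": user_text}]}],
        "inferenceConfig": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
            "top_p": top_p,
            "top_k": top_k,
        },
    }
    # `system` is optional, so it is added only when a prompt is supplied.
    if system_text:
        body["system"] = [{"text": system_text}]
    return body


print(json.dumps(build_nova_request("Hello", system_text="Be brief."), indent=2))
```

The resulting dictionary can be serialized with `json.dumps` and passed as the `body` argument of `invoke_model()`.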

#### Required and Optional Parameters

* `system` – (Optional) The system prompt for the request.
  * A system prompt provides context and instructions to Amazon Nova, such as specifying a particular goal or role.

* `messages` – (Required) The input messages.
  * `role` – The role of the conversation turn. Valid values are `user` and `assistant`.
  * `content` – (Required) The content of the conversation turn.
    * `type` – (Required) The type of the content, given as the key of each content block. Valid values are `image`, `text`, and `video`.
      * If text is chosen (text content)
        * `text` – The content of the conversation turn.
      * If image is chosen (image content)
        * `source` – (Required) The base64-encoded image bytes for the image.
        * `format` – (Required) The type of the image (jpeg, png, webp, gif).
      * If video is chosen (video content)
        * `source` – (Required) The base64-encoded video bytes or an S3 URI with bucket owner.
        * `format` – (Required) The type of the video (mkv, mov, mp4, webm, etc.).

* `inferenceConfig` – (Optional) Inference configuration parameters.
  * `max_new_tokens` – Maximum number of tokens to generate (max 5K).
  * `temperature` – Amount of randomness in the response.
  * `top_p` – Nucleus sampling probability threshold.
  * `top_k` – Limits sampling to the top K options for each token.
  * `stopSequences` – Array of strings that stop generation when encountered.

* `toolConfig` – (Optional) JSON object containing the tool specification and tool choice.
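
As an illustration of the `toolConfig` structure, the sketch below builds one tool entry in the shape shown in the schema. The helper `make_tool_spec` and the `get_coverage_status` tool are hypothetical and exist only for this example; `toolChoice` is left out.

```python
def make_tool_spec(name, description, properties, required):
    """Return one entry for toolConfig["tools"], following the schema above."""
    if len(name) > 64:
        raise ValueError("Tool name exceeds the 64-character limit")
    return {
        "toolSpec": {
            "name": name,
            "description": description,
            "inputSchema": {
                "json": {
                    "type": "object",
                    "properties": properties,
                    "required": required,
                }
            },
        }
    }


# Hypothetical tool, for illustration only.
coverage_tool = make_tool_spec(
    name="get_coverage_status",
    description="Look up network coverage for a postal code.",
    properties={
        "postal_code": {"type": "string", "description": "Postal code to check"}
    },
    required=["postal_code"],
)
tool_config = {"tools": [coverage_tool]}
print(tool_config["tools"][0]["toolSpec"]["name"])
# prints: get_coverage_status
```

Note that the `call_nova` helper defined later in this notebook wraps each item in `toolSpec` itself, so it expects the inner dictionary (`coverage_tool["toolSpec"]`) in its `tools` argument.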

## Text Understanding with Nova Models

Note: The examples below work with Nova Micro, Nova Lite, and Nova Pro. We use Nova Micro for illustrative purposes, but you can substitute any text-capable model in the Nova family.

### Synchronous API Call

In [8]:
def call_nova(
    model,
    messages,
    system_message="",
    streaming=False,
    max_tokens=1024,
    temp=0.7,
    top_p=0.99,
    top_k=20,
    tools=None,
    verbose=False,
):
    """Call Amazon Nova models with various parameters.
    
    Args:
        model (str): The model ID to use
        messages (list): List of message objects with role and content
        system_message (str, optional): System prompt. Defaults to "".
        streaming (bool, optional): Whether to use streaming API. Defaults to False.
        max_tokens (int, optional): Maximum tokens to generate. Defaults to 1024.
        temp (float, optional): Temperature parameter. Defaults to 0.7.
        top_p (float, optional): Top-p parameter. Defaults to 0.99.
        top_k (int, optional): Top-k parameter. Defaults to 20.
        tools (list, optional): List of tool specifications. Defaults to None.
        verbose (bool, optional): Whether to print request body. Defaults to False.
        
    Returns:
        tuple or stream: Model response and content text if not streaming, else stream
    """
    client = boto3.client("bedrock-runtime")
    
    # Prepare system prompt
    system_list = [{"text": system_message}]
    
    # Prepare inference parameters
    inf_params = {
        "max_new_tokens": max_tokens,
        "top_p": top_p,
        "top_k": top_k,
        "temperature": temp,
    }
    
    # Build request body
    request_body = {
        "messages": messages,
        "system": system_list,
        "inferenceConfig": inf_params,
    }
    
    # Add tool configuration if provided
    if tools is not None:
        tool_config = []
        for tool in tools:
            tool_config.append({"toolSpec": tool})
        request_body["toolConfig"] = {"tools": tool_config}
    
    if verbose:
        print("Request Body", request_body)
    
    if not streaming:
        # Use synchronous API
        response = client.invoke_model(modelId=model, body=json.dumps(request_body))
        model_response = json.loads(response["body"].read())
        return model_response, model_response["output"]["message"]["content"][0]["text"]
    else:
        # Use streaming API
        response = client.invoke_model_with_response_stream(
            modelId=model, body=json.dumps(request_body)
        )
        return response["body"]


def get_base64_encoded_value(media_path):
    """Convert media file to base64 encoded string.
    
    Args:
        media_path (str): Path to the media file
        
    Returns:
        str: Base64 encoded string
    """
    with open(media_path, "rb") as media_file:
        binary_data = media_file.read()
        base_64_encoded_data = base64.b64encode(binary_data)
        base64_string = base_64_encoded_data.decode("utf-8")
        return base64_string


def print_output(content_text):
    """Display model output as Markdown.
    
    Args:
        content_text (str): Text to display
    """
    display(Markdown(content_text))

### Use Amazon Nova with a system prompt

System prompts provide context and instructions to guide the model's behavior. Here we demonstrate how to use a system prompt to instruct the model to act as a creative writing assistant for marketing.

In [9]:
system_message = "Act as a telecom industry marketer for AnyCompany Telecom. When the user provides you with a topic, write a LinkedIn Launch Post about that topic."
messages = [
    {
        "role": "user",
        "content": [
            {"text": "Meet our AI-powered customer support assistant"}
        ],
    }
]
model_response, content_text = call_nova(
    MICRO_MODEL_ID, messages, system_message=system_message
)

print("\n[Response Content Text]")
print_output(content_text)


[Response Content Text]


🌟 Exciting News from AnyCompany Telecom! 🌟

We are thrilled to introduce our cutting-edge innovation that's set to revolutionize the way we connect with our customers: Our AI-Powered Customer Support Assistant!

🔍 **Why This Matters:**
In today's fast-paced world, exceptional customer service is more crucial than ever. Our AI-Powered Customer Support Assistant is here to transform the customer experience by providing:

- **24/7 Availability:** Never miss a beat, even outside of business hours.
- **Instant Responses:** Get answers to your queries in real-time, no waiting required.
- **Personalized Interactions:** Tailored solutions that cater to your unique needs.
- **Efficiency:** Streamlined processes that save you time and us valuable resources.

🤖 **How It Works:**
Our AI-Powered Assistant leverages advanced machine learning algorithms to understand and respond to your needs with remarkable accuracy. Whether you’re troubleshooting an issue, seeking billing information, or just have a question, our AI is here to assist you seamlessly.

💬 **Customer Feedback:**
"Since integrating the AI-Powered Customer Support Assistant, our response times have decreased by 40%, and customer satisfaction scores have soared!" - Sarah L., Customer Relations Manager

🔗 **Try It Today:**
Experience the future of customer support. Visit our website or reach out directly through our app to start interacting with our AI-Powered Assistant.

Thank you for being a part of the AnyCompany Telecom family. We’re committed to delivering unparalleled service and innovation to meet your needs.

#AI #CustomerSupport #Innovation #TechAdvancements #AnyCompanyTelecom #CustomerService #AIAssistant #TechTrends #CustomerExperience

---

Feel free to connect with us for more details or to share your thoughts on this new service! 🚀

---

Let’s connect the future of telecom with the best customer support experience! 🌐✨

---

#TelecomInnovation #AIinTelecom #CustomerFirst #TechRevolution #NextGenSupport

---

We look forward to your feedback and are excited to see how this innovation will enhance your experience with us!

Best regards,

[Your Name]  
Marketing Team, AnyCompany Telecom

**Note:** Review Nova's output for the task above. Do you see anything that you may not want in a public message for a business? You can refine the prompt to address this.
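
In the output above, items worth catching include the customer quote with its 40% figure and the unfilled `[Your Name]` placeholder. As a rough aid to that review, the sketch below flags two such patterns with regular expressions. The function name and the heuristics are assumptions for illustration; they are not a substitute for reading the text.

```python
import re


def find_publication_risks(text):
    """Return human-readable warnings for content that needs a human check."""
    risks = []
    # Bracketed placeholders such as "[Your Name]" left for someone to fill in.
    for placeholder in re.findall(r"\[[^\]\n]+\]", text):
        risks.append(f"Unfilled placeholder: {placeholder}")
    # Percentages may be invented statistics that need a source.
    for figure in re.findall(r"\b\d+(?:\.\d+)?%", text):
        risks.append(f"Unverified figure: {figure}")
    return risks


sample = "Response times have decreased by 40%. Best regards, [Your Name]"
for risk in find_publication_risks(sample):
    print(risk)
# prints:
# Unfilled placeholder: [Your Name]
# Unverified figure: 40%
```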


### Multi-lingual content generation

For this example, imagine you are a customer communications associate who reports incidents to users worldwide. You use Amazon Nova to help generate messages in multiple languages.

In [None]:
system_message = "You are a customer communications specialist who balances technical detail with clarity."
incident_context = """
Event: System Maintenance Impact
Time: 15-minute delay in customer portal
Cause: Database optimization
Status: Completed
Benefit: 50% faster loading times
"""
messages = [
    {
        "role": "user",
        "content": [
            {"text": f"Generate brief customer emails in English, Spanish, French, and Japanese about this maintenance impact - be clear but friendly:\n\n{incident_context}"}
        ],
    }
]
model_response, content_text = call_nova(
    MICRO_MODEL_ID, messages, system_message=system_message
)

print("\n[Response Content Text]")
print_output(content_text)
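
The examples so far use only `content_text` and ignore the full `model_response`. The sketch below shows one way to join every text block and pick up metadata from the response. The `output.message.content` path comes from `call_nova`; the `stopReason` and `usage` keys are assumed, and the example response is hand-written rather than real model output.

```python
def summarize_response(model_response):
    """Join all text blocks and collect stop reason and token usage if present."""
    content = (
        model_response.get("output", {}).get("message", {}).get("content", [])
    )
    # Skip any block that has no "text" key.
    text = "".join(block["text"] for block in content if "text" in block)
    return {
        "text": text,
        "stop_reason": model_response.get("stopReason"),  # assumed key
        "usage": model_response.get("usage", {}),  # assumed key
    }


# Hand-written example response, for illustration only.
example_response = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [{"text": "Hola"}, {"text": "!"}],
        }
    },
    "stopReason": "end_turn",
    "usage": {"inputTokens": 12, "outputTokens": 2, "totalTokens": 14},
}
print(summarize_response(example_response)["text"])
# prints: Hola!
```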


## Multimodal Understanding with Nova Models

Let's show how Amazon Nova can analyze multimodal content such as images and video. The following examples demonstrate how to pass media to Nova models for multimodal understanding. Note that multimodal capabilities are available only with Nova Lite, Nova Pro, and Nova Premier (not with Nova Micro).

### Image Understanding

Let's see how Amazon Nova models perform on image understanding.

Amazon Nova models allow you to include multiple images in a single payload, up to a total payload size of 25 MB. The model can:
- Analyze images and answer questions about them
- Classify images
- Summarize image content based on provided instructions
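
Since a payload may hold several images, here is a minimal sketch that builds a `content` list from multiple images plus a question. The helper names are hypothetical, and the extension-to-format mapping simply covers the formats listed in the request schema.

```python
import base64
from pathlib import Path

# Formats from the request schema; ".jpg" is mapped to "jpeg".
EXTENSION_TO_FORMAT = {
    ".jpg": "jpeg",
    ".jpeg": "jpeg",
    ".png": "png",
    ".gif": "gif",
    ".webp": "webp",
}


def image_format(path):
    """Infer the schema's image format value from a file extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in EXTENSION_TO_FORMAT:
        raise ValueError(f"Unsupported image extension: {suffix!r}")
    return EXTENSION_TO_FORMAT[suffix]


def image_block(data, fmt):
    """Wrap raw image bytes in an image content block."""
    encoded = base64.b64encode(data).decode("utf-8")
    return {"image": {"format": fmt, "source": {"bytes": encoded}}}


def build_multi_image_content(images, question):
    """images is a list of (raw_bytes, format) pairs; the question goes last."""
    return [image_block(data, fmt) for data, fmt in images] + [{"text": question}]


content = build_multi_image_content(
    [(b"abc", image_format("map_a.PNG")), (b"abc", image_format("map_b.jpg"))],
    "Compare these two maps.",
)
print([next(iter(block)) for block in content])
# prints: ['image', 'image', 'text']
```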

In this example, we'll pass an image of a coverage map and give Nova a task.

![Coverage map](../1-nova-fundamentals/images/global_coverage_map.png)

In [10]:
system_message = "You are a coverage expert explaining service availability with high accuracy"
messages = [
    {
        "role": "user",
        "content": [
            {
                "image": {
                    "format": "png",
                    "source": {
                        "bytes": get_base64_encoded_value(
                            "../1-nova-fundamentals/images/global_coverage_map.png"
                        )
                    },
                }
            },
            {
                "text": "What does this coverage map tell us about service quality in different continents? List the continents and explanation in bullet points"
            },
        ],
    }
]
model_response, content_text = call_nova(
    LITE_MODEL_ID, messages, system_message=system_message, max_tokens=300
)

print("\n[Response Content Text]")
print_output(content_text)


[Response Content Text]


Here is a summary of the coverage map's insights on service quality across different continents:

*   **Europe**: The map indicates strong service coverage across most of Europe, suggesting high-quality service in this region.
*   **Asia**: The map shows significant service coverage in Asia, particularly in developed countries. However, there are some areas with lower coverage, indicating potential service quality variations within the continent.
*   **North America**: The map displays extensive service coverage in North America, particularly in the United States and Canada. This suggests a high level of service quality in this region.
*   **South America**: The map shows moderate service coverage in South America, with some areas having lower coverage. This indicates potential service quality differences within the continent.
*   **Africa**: The map indicates limited service coverage in Africa, suggesting lower service quality in this region. However, there are some areas with moderate coverage, indicating potential variations in service quality within the continent.
*   **Australia**: The map shows moderate service coverage in Australia, indicating a relatively high level of service quality in this region.

Overall, the coverage map suggests that Europe and North America have the highest service quality, followed by Asia and South America. Africa and Australia have the lowest service quality, with some variations within these continents.

### Video Understanding

Amazon Nova models allow you to include a single video in the payload, provided either as base64-encoded bytes or through an Amazon S3 URI. With the base64 method, the overall payload size must remain within 25 MB. Alternatively, you can specify an Amazon S3 URI, which lets you use the model on longer videos without being constrained by the payload size limit. Amazon Nova models can analyze the video to answer questions, classify it, and summarize its content based on the provided instructions.
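
The choice between the two methods can be sketched as a size check. Base64 encoding produces 4 output bytes for every 3 input bytes, so a file grows by about a third once encoded. The helper names below are hypothetical, treating 25 MB as 25 * 1024 * 1024 bytes is an assumption, and the check ignores the rest of the payload (prompt text, other fields), so it is optimistic.

```python
MAX_PAYLOAD_BYTES = 25 * 1024 * 1024  # assumed reading of the 25 MB limit


def base64_length(raw_size):
    """Length in bytes of the padded base64 encoding of raw_size bytes."""
    return 4 * ((raw_size + 2) // 3)


def choose_video_source(raw_size, s3_uri=None):
    """Return which `source` key to use: 'bytes' or 's3Location'."""
    if base64_length(raw_size) <= MAX_PAYLOAD_BYTES:
        return "bytes"
    if s3_uri is None:
        raise ValueError("Video is too large for base64; an S3 URI is required")
    return "s3Location"


print(choose_video_source(1_000_000))
# prints: bytes
print(choose_video_source(30 * 1024 * 1024, s3_uri="s3://my-bucket/object-key"))
# prints: s3Location
```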

Let's analyze a video using Amazon Nova. Here, we will use the base64 format.

In [11]:
# Display the video
video_path = "video/cell-phone.mp4"

# Verify file exists
if os.path.exists(video_path):
    # Display video with controls and specified dimensions
    display(Video(video_path, 
                 embed=True, 
                 width=800,  # Adjust width as needed
                 height=450, 
                 html_attributes="controls autoplay loop"))
else:
    print(f"Error: Video file not found at {video_path}")

In [12]:
system_message = "You are an expert product merchandiser."
messages = [
    {
        "role": "user",
        "content": [
            {
                "video": {
                    "format": "mp4",
                    "source": {
                        "bytes": get_base64_encoded_value(
                            video_path
                        )
                    },
                }
            },
            {
                "text": "Brainstorm 3 ideas on how to place this creative asset for marketing"
            },
        ],
    }
]
model_response, content_text = call_nova(
    LITE_MODEL_ID, messages, system_message=system_message, max_tokens=300
)

print("\n[Response Content Text]")
print_output(content_text)


[Response Content Text]


1. **Social Media Campaigns**: Utilize platforms like Instagram, Facebook, and Twitter to share captivating images of the smartphone against a cosmic backdrop. Encourage users to share their own pictures or creative content using the hashtag associated with the campaign to increase engagement and visibility.

2. **Interactive Website Banner**: Design an interactive banner for the company's website that showcases the smartphone in a futuristic setting. Users can swipe or click to reveal different features and specifications of the device, providing an engaging and informative experience.

3. **Augmented Reality Experience**: Develop an augmented reality (AR) experience where users can point their smartphones at specific images or markers to see the device floating in a cosmic environment. This immersive experience can be shared on social media, driving interest and excitement about the product.

# Conclusion

In this notebook, we explored the foundational capabilities of Amazon Nova models.

## Key Learnings

1. **Understanding Amazon Nova Model Family**
   - **Text Understanding**: Amazon Nova Micro, Lite, Pro, and Premier offer varying levels of capabilities with different price-performance ratios
   - **Multimodal Processing**: Nova Lite, Pro, and Premier can process text, images, and videos

2. **Core Capabilities**
   - **Text Processing**: Demonstrated synchronous API calls through a helper that also supports streaming
   - **System Prompts**: Used system prompts to guide model behavior
   - **Multimodal Understanding**: Analyzed image and video data