# Amazon Nova 2 Omni - Setup and Configuration

This notebook helps you set up your environment for working with the Amazon Nova 2 Omni model.

## What is Amazon Nova 2 Omni?

Amazon Nova 2 Omni is a multimodal foundation model that can understand and generate content across text, images, and audio. Key capabilities include:

- **Speech Understanding**: Transcribe, summarize, analyze, and answer questions about audio content
- **Image Generation**: Create high-quality images from text descriptions
- **Multimodal Reasoning**: Process and understand multiple input modalities simultaneously

**Supported Audio Formats:** mp3, opus, wav, aac, flac, mp4, ogg, mkv
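
The format list above can be turned into a quick pre-flight check before sending a file to the model. The following is a minimal sketch; `audio_format_of` is a hypothetical helper name, not part of any SDK, and it looks only at the file extension, not the file contents.

```python
from pathlib import Path

# Formats listed above as supported audio inputs.
SUPPORTED_AUDIO_FORMATS = {"mp3", "opus", "wav", "aac", "flac", "mp4", "ogg", "mkv"}


def audio_format_of(path: str) -> str:
    """Return the lowercase extension of `path` if it is a supported audio format.

    Hypothetical helper for illustration: it checks the file name only.
    """
    ext = Path(path).suffix.lower().lstrip(".")
    if ext not in SUPPORTED_AUDIO_FORMATS:
        raise ValueError(f"Unsupported audio format: {ext or '(none)'}")
    return ext


print(audio_format_of("meeting_recording.MP3"))  # prints: mp3
```

Failing early on an unsupported extension is cheaper than waiting for the service to reject the request.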

---

## Prerequisites Check

Let's verify that your environment is properly configured.

In [None]:
import sys
print(f"Python version: {sys.version}")

# Check Python version
if sys.version_info >= (3, 12):
    print("✅ Python 3.12+ is installed")
else:
    print("❌ Python 3.12+ is required. Please upgrade your Python version.")

In [None]:
# Install required boto3/botocore versions
!pip install boto3==1.42.4 botocore==1.42.4 --force-reinstall --no-cache-dir -q

In [None]:
# Verify installed versions
import boto3
import botocore

print(f"boto3 version: {boto3.__version__}")
print(f"botocore version: {botocore.__version__}")

if boto3.__version__ == '1.42.4' and botocore.__version__ == '1.42.4':
    print("✅ Correct boto3/botocore versions installed")
else:
    print("⚠️ Version mismatch detected")
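
When the check above reports a mismatch, it can help to know whether the installed release is older or newer than the pin. The sketch below compares dotted version strings numerically. `parse_version` and `compare_to_pin` are hypothetical helper names, and the parser assumes purely numeric components such as `1.42.4`.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted release string such as '1.42.4' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def compare_to_pin(installed: str, pinned: str = "1.42.4") -> str:
    """Describe how `installed` relates to `pinned`: 'older', 'match', or 'newer'."""
    installed_parts, pinned_parts = parse_version(installed), parse_version(pinned)
    if installed_parts == pinned_parts:
        return "match"
    return "older" if installed_parts < pinned_parts else "newer"


print(compare_to_pin("1.40.0"))  # prints: older
```

Comparing tuples of integers avoids the pitfall of plain string comparison, where `"1.42.10"` would sort before `"1.42.4"`.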

## AWS Configuration

Let's verify your AWS credentials and region configuration.

In [None]:
import boto3
from botocore.exceptions import NoCredentialsError, ClientError

try:
    # Check AWS credentials
    session = boto3.Session()
    credentials = session.get_credentials()
    
    if credentials:
        print("✅ AWS credentials are configured")
        print(f"Region: {session.region_name or 'Not set (the Bedrock cell below passes its region explicitly)'}")
    else:
        print("❌ AWS credentials not found. Please configure your AWS CLI or set environment variables.")
        
except Exception as e:
    print(f"❌ Error checking AWS configuration: {e}")
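
Finding credentials does not prove they are valid. One way to confirm this, sketched below, is to ask AWS STS who the caller is. `describe_caller` is a hypothetical helper name; it takes the client as a parameter so it can be tried with any object exposing `get_caller_identity()`.

```python
def describe_caller(sts_client):
    """Return (account_id, arn) for the credentials behind `sts_client`.

    Hypothetical helper: `sts_client` is expected to behave like
    boto3.client("sts"), whose get_caller_identity() response includes
    the "Account" and "Arn" keys.
    """
    identity = sts_client.get_caller_identity()
    return identity["Account"], identity["Arn"]


# In the notebook, pass a real client (this makes a network call):
#   import boto3
#   account, arn = describe_caller(boto3.client("sts"))
#   print(f"Account: {account}, ARN: {arn}")
```

If the credentials are expired or malformed, the call raises a `ClientError`, which surfaces the problem before any Bedrock request is made.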

## Amazon Bedrock Setup

Let's test the connection to Amazon Bedrock and verify model access.

In [None]:
from botocore.config import Config

MODEL_ID = "us.amazon.nova-2-omni-v1:0"
REGION_ID = "us-west-2"

def test_bedrock_connection():
    """Test connection to Amazon Bedrock"""
    try:
        config = Config(
            read_timeout=2 * 60,
        )
        bedrock = boto3.client(
            service_name="bedrock-runtime",
            region_name=REGION_ID,
            config=config,
        )
        
        # Test with a simple text-only request
        response = bedrock.converse(
            modelId=MODEL_ID,
            messages=[
                {
                    "role": "user",
                    "content": [{"text": "Hello, can you respond with just 'Hello back!'?"}],
                }
            ],
            inferenceConfig={"maxTokens": 50},
        )
        
        print("✅ Successfully connected to Amazon Bedrock")
        print("✅ Nova 2 Omni model is accessible")
        print(f"Test response: {response['output']['message']['content'][0]['text']}")
        return True
        
    except ClientError as e:
        error_code = e.response['Error']['Code']
        if error_code == 'AccessDeniedException':
            print("❌ Access denied. Please check that your IAM permissions include 'bedrock:InvokeModel'")
        elif error_code == 'ValidationException':
            print("❌ Model not found. Please verify the model ID is correct.")
        else:
            print(f"❌ Bedrock error: {e}")
        return False
        
    except Exception as e:
        print(f"❌ Connection error: {e}")
        return False

# Test the connection
connection_success = test_bedrock_connection()
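
The test above reads the reply from `content[0]['text']`, which works when the first content block is text. As a more defensive sketch, the helper below joins every text block and skips any other block type. `extract_text` is a hypothetical helper name, and the sample response is hand-written for illustration, assuming a reply may carry a non-text block (here a reasoning block) ahead of the text.

```python
def extract_text(response: dict) -> str:
    """Join the text of every text block in a Converse response message."""
    blocks = response["output"]["message"]["content"]
    return "".join(block["text"] for block in blocks if "text" in block)


# Hand-written sample shaped like a Converse response (illustrative only).
sample_response = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [
                {"reasoningContent": {"reasoningText": {"text": "(model reasoning)"}}},
                {"text": "Hello back!"},
            ],
        }
    }
}

print(extract_text(sample_response))  # prints: Hello back!
```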

## Next Steps

If all checks passed successfully, you're ready to explore Nova 2 Omni capabilities!

### Available Notebooks:

1. **01_speech_understanding_examples.ipynb** - Audio processing:
   - Transcribe audio with speaker diarization
   - Summarize and analyze audio content
   - Call analytics with structured output

2. **02_image_generation_examples.ipynb** - Image generation:
   - Text-to-image with aspect ratio control
   - Image editing and style transfer
   - Text in images and creative control

3. **03_multimodal_understanding_examples.ipynb** - Multimodal analysis:
   - Image and video understanding
   - Video summarization and classification
   - Audio content analysis

4. **04_langchain_multimodal_reasoning.ipynb** - LangChain integration:
   - Tool use with structured outputs
   - Reasoning effort configuration
   - MMMU-style evaluation patterns

5. **05_langgraph_multimodal_reasoning.ipynb** - LangGraph workflows:
   - Stateful reasoning workflows
   - Multi-step reasoning chains
   - Conditional routing with tools

6. **06_strands_multimodal_reasoning.ipynb** - Multi-agent systems:
   - Specialized agents for different modalities
   - Agent orchestration and coordination
   - Collaborative reasoning patterns

7. **07_document_understanding_examples.ipynb** - Document processing:
   - OCR and text extraction
   - Key information extraction with JSON
   - Object detection and counting

### Tips for Success:

- Start with the speech understanding examples if you're interested in audio processing
- The model supports various audio formats: mp3, opus, wav, aac, flac, mp4, ogg, mkv
- For best results with transcription, use temperature=0.0
- For creative tasks, experiment with different temperature values (0.1-0.9)
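
The temperature tips above can be captured as presets for the `inferenceConfig` argument of `converse`. This is a minimal sketch: the task names, the `build_inference_config` helper, and the default `max_tokens` value are assumptions for illustration, and 0.7 is just one starting point inside the suggested 0.1-0.9 range.

```python
# Hypothetical task names; the temperatures follow the tips above.
# 0.7 is an arbitrary starting point inside the suggested 0.1-0.9 range.
TEMPERATURE_PRESETS = {"transcription": 0.0, "creative": 0.7}


def build_inference_config(task: str, max_tokens: int = 1024) -> dict:
    """Build an inferenceConfig dict for `task` using the presets above."""
    if task not in TEMPERATURE_PRESETS:
        raise ValueError(f"Unknown task: {task}")
    return {"maxTokens": max_tokens, "temperature": TEMPERATURE_PRESETS[task]}


print(build_inference_config("transcription"))
```

The returned dictionary can be passed as `inferenceConfig=...` in the same way the connection test passes `{"maxTokens": 50}`.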

Happy exploring! 🚀