# From Insights to Intelligence: Multimodal RAG with Amazon Bedrock

This notebook demonstrates how to build a Multimodal Retrieval-Augmented Generation (RAG) application using Amazon Bedrock Data Automation (BDA) and Bedrock Knowledge Bases (KB). The application can analyze and generate insights from multiple data modalities, including documents, images, audio, and video.

## Setup and Configuration

Let's start by setting up the necessary dependencies and AWS clients.

In [1]:
%pip install "boto3>=1.37.4" s3fs tqdm retrying packaging --upgrade -qq

import boto3
import json
import uuid
import time
import os
import random
import sagemaker
import logging
import mimetypes
from botocore.exceptions import ClientError
import warnings
warnings.filterwarnings('ignore')

# Import utils and access the business context function
from utils.utils import BDARAGUtils

# Create utility instance to use its methods
rag_utils = BDARAGUtils()

# Display comprehensive business context for RAG
rag_utils.show_business_context("rag_complete")

# Configure logging
logging.basicConfig(format='[%(asctime)s] p%(process)s {%(filename)s:%(lineno)d} %(levelname)s - %(message)s', level=logging.INFO)
logger = logging.getLogger(__name__)

sts_client = boto3.client('sts')
account_id = sts_client.get_caller_identity()["Account"]
region_name = boto3.session.Session().region_name

s3_client = boto3.client('s3')
bedrock_agent_client = boto3.client('bedrock-agent')
bedrock_agent_runtime_client = boto3.client('bedrock-agent-runtime')

print("Setup complete!")
print(f"Using AWS region: {region_name}")

Note: you may need to restart the kernel to use updated packages.
sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/sagemaker-user/.config/sagemaker/config.yaml


Setup complete!
Using AWS region: us-west-2


## 1. Load Knowledge Base Configuration

In this step, we'll load the knowledge base configuration created by the setup notebook (00-setup).

In [2]:
# Import our BDARAGUtils class
from utils.utils import BDARAGUtils

# Load knowledge base configuration from setup notebook
try:
    with open('kb_config.json', 'r') as f:
        kb_config = json.load(f)
    
    print("Knowledge Base configuration loaded successfully!")
    print(f"Knowledge Base ID: {kb_config['knowledge_base_id']}")
    print(f"Knowledge Base Name: {kb_config['knowledge_base_name']}")
    print(f"Bucket: {kb_config['bucket_name']}")
    print(f"S3 Prefix: {kb_config['s3_prefix']}")
    
    # Extract configuration variables
    kb_id = kb_config['knowledge_base_id']
    knowledge_base_name = kb_config['knowledge_base_name']
    bucket_name_kb = kb_config['bucket_name']
    s3_prefix = kb_config['s3_prefix']
    kb_suffix = kb_config['suffix']
    kb_ready = True
    
except FileNotFoundError:
    print("Knowledge Base configuration not found!")
    print("Please run the setup notebook (00-setup/enhanced_multimodal_rag.ipynb) first.")
    print("\nNote: An example configuration structure is available in 'kb_config_example.json'")
    kb_ready = False
except Exception as e:
    print(f"Error loading Knowledge Base configuration: {e}")
    kb_ready = False

✅ Knowledge Base configuration loaded successfully!
Knowledge Base ID: AT5MCPMWSZ
Knowledge Base Name: multimodal-rag-kb-1103523
Bucket: bda-workshop-us-west-2-033741858282
S3 Prefix: bda-workshop/
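
The cell above reads several keys from `kb_config.json`, and a missing key only surfaces through the generic `except` branch. A minimal sketch of an up-front check follows. The key names are the ones this notebook reads (`knowledge_base_object` is used in the next step). The helper name `missing_kb_config_keys` and the example values are hypothetical.

```python
# Keys this notebook reads from kb_config.json (taken from the cells in this notebook).
REQUIRED_KB_CONFIG_KEYS = (
    "knowledge_base_id",
    "knowledge_base_name",
    "bucket_name",
    "s3_prefix",
    "suffix",
    "knowledge_base_object",
)


def missing_kb_config_keys(config):
    """Return the required keys that are absent from the loaded config, in order."""
    return [key for key in REQUIRED_KB_CONFIG_KEYS if key not in config]


# Example with a deliberately incomplete, made-up configuration.
example_config = {"knowledge_base_id": "EXAMPLEID", "bucket_name": "example-bucket"}
print(missing_kb_config_keys(example_config))
```

Calling the helper right after `json.load(f)` and reporting the returned list would point at the exact key to fix in the setup notebook.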


## 2. Initialize Knowledge Base Connection

Now we'll connect to the existing Knowledge Base created by the setup notebook.

In [10]:
# Only proceed if we have a valid knowledge base configuration
if not kb_ready:
    print("\nCannot proceed without Knowledge Base configuration from setup notebook.")
else:
    # Display business context for Knowledge Base connection
    rag_utils.show_business_context("knowledge_base")
    
    # Initialize connection to existing Knowledge Base
    print(f"🔗 Connecting to existing Knowledge Base: {knowledge_base_name}")
    
    try:
        # Create BDARAGUtils instance with the saved configuration
        kb_obj_config = kb_config['knowledge_base_object']
        knowledge_base = BDARAGUtils(
            kb_name=kb_obj_config['kb_name'],
            kb_description=kb_obj_config['kb_description'],
            data_sources=kb_obj_config['data_sources'],
            multi_modal=kb_obj_config['multi_modal'],
            parser=kb_obj_config['parser'],
            chunking_strategy=kb_obj_config['chunking_strategy'],
            embedding_model=kb_obj_config['embedding_model'],
            suffix=kb_suffix
        )
        
        # Set the knowledge base ID from the saved configuration
        knowledge_base.knowledge_base_id = kb_id
        
        # Get the actual knowledge base details from AWS
        try:
            kb_response = bedrock_agent_client.get_knowledge_base(knowledgeBaseId=kb_id)
            knowledge_base.knowledge_base = kb_response['knowledgeBase']
            print(f"✓ Retrieved Knowledge Base details from AWS")
            
            # Get the data sources for this knowledge base
            ds_response = bedrock_agent_client.list_data_sources(knowledgeBaseId=kb_id)
            knowledge_base.data_source = ds_response.get('dataSourceSummaries', [])
            print(f"✓ Found {len(knowledge_base.data_source)} data sources")
            
        except Exception as e:
            print(f"Warning: Could not retrieve Knowledge Base details: {e}")
            # Set minimal required attributes
            knowledge_base.knowledge_base = {
                'knowledgeBaseId': kb_id,
                'name': knowledge_base_name
            }
            knowledge_base.data_source = []
        
        print(f"\nSuccessfully connected to Knowledge Base!")
        print(f"Knowledge Base ID: {kb_id}")
        print(f"Ready for data ingestion and querying.")
        
    except Exception as e:
        print(f"\nError connecting to Knowledge Base: {e}")
        kb_ready = False

🔗 Connecting to existing Knowledge Base: multimodal-rag-kb-1103523
✓ Retrieved Knowledge Base details from AWS
✓ Found 1 data sources

✅ Successfully connected to Knowledge Base!
Knowledge Base ID: AT5MCPMWSZ
Ready for data ingestion and querying.


## 3. Start Data Ingestion

Now that we've connected to our Knowledge Base, we need to ingest the multimodal data. This process transforms our files into vector embeddings that can be efficiently searched.

In [4]:
# Only proceed if we have a ready knowledge base
if not kb_ready:
    print("\nCannot ingest data without a properly configured Knowledge Base.")
else:
    print("Starting data ingestion...")
    print("This process may take several minutes depending on the amount and size of data.")

    # Display business context for data ingestion process
    rag_utils.show_business_context("data_ingestion")

    try:
        # Check if we have data sources before starting ingestion
        if hasattr(knowledge_base, 'data_source') and knowledge_base.data_source:
            print(f"Found {len(knowledge_base.data_source)} data sources to ingest")
            # Start the ingestion job
            knowledge_base.start_ingestion_job()
            print("\nData ingestion completed successfully!")
            print("Knowledge Base is now ready for querying.")
        else:
            print("\nNo data sources found in the Knowledge Base.")
            print("The Knowledge Base may already have data ingested, or you may need to add data sources.")
            print("You can still proceed with querying if data was previously ingested.")
    except Exception as e:
        print(f"\nError during data ingestion: {e}")
        print("You may still be able to query the Knowledge Base if data was previously ingested.")

🚀 Starting data ingestion...
This process may take several minutes depending on the amount and size of data.


Found 1 data sources to ingest
Starting data ingestion jobs...
Waiting for Knowledge Base to be fully available...
Starting ingestion job for data source 1/1...
✓ Ingestion job started for data source 1 with ID: NGAKAEX7K1
Monitoring ingestion job status...
Job status: IN_PROGRESS
[... "Job status: IN_PROGRESS" repeated; remaining output truncated ...]

## 4. Query the Knowledge Base

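Now that our data is ingested, we can query the Knowledge Base using natural language. We'll use Amazon Bedrock's RetrieveAndGenerate API, which retrieves the most relevant chunks from the Knowledge Base and passes them to a foundation model to generate a grounded answer.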

In [5]:
# Only proceed if we have a ready knowledge base
if not kb_ready:
    print("\nCannot query without a properly configured Knowledge Base.")
else:
    # Display business context for semantic search and querying
    rag_utils.show_business_context("semantic_search")

    def query_kb(query, model_id="amazon.nova-pro-v1:0", num_results=5):
        """
        Query the knowledge base through the BDARAGUtils helper and return its raw response.

        Args:
            query: The natural-language question to send to the knowledge base
            model_id: The foundation model to use for generating the response
            num_results: Number of results to retrieve from the knowledge base

        Returns:
            The raw response from the helper, or None if the query failed
        """
        print(f"🔍 Query: {query}")
        print(f"Processing...")
        
        try:
            # Use the real AWS API to query the knowledge base
            response = knowledge_base.query_knowledge_base(
                query=query,
                model_id=model_id,
                num_results=num_results
            )
                
            # Return the raw response
            return response
        
        except Exception as e:
            print(f"\nError querying Knowledge Base: {e}")
            return None
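
`query_kb` delegates to `knowledge_base.query_knowledge_base`, whose implementation is not shown here. For orientation, the sketch below builds the request that the `retrieve_and_generate` operation of the `bedrock-agent-runtime` client expects for a Knowledge Base query. The helper name `build_rag_request` and the example ARN are assumptions. Whether a foundation-model ARN or an inference profile ARN is required depends on the model and region.

```python
def build_rag_request(query, knowledge_base_id, model_arn, num_results=5):
    """Build keyword arguments for bedrock-agent-runtime retrieve_and_generate."""
    return {
        "input": {"text": query},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
                # Number of chunks to retrieve from the vector store.
                "retrievalConfiguration": {
                    "vectorSearchConfiguration": {"numberOfResults": num_results}
                },
            },
        },
    }


# Placeholder identifiers for illustration only.
request = build_rag_request(
    query="What key topics were discussed in the AWS podcast?",
    knowledge_base_id="KB_ID",
    model_arn="arn:aws:bedrock:us-west-2::foundation-model/amazon.nova-pro-v1:0",
    num_results=5,
)
print(request["retrieveAndGenerateConfiguration"]["type"])
```

With real identifiers, the request could be sent as `bedrock_agent_runtime_client.retrieve_and_generate(**request)`.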

### Query 1: Audio Content

Let's start by querying information from the audio content.

In [6]:
# Only run if we have a Knowledge Base set up
if kb_ready:
    # Query about the audio content
    audio_query = "What key topics were discussed in the AWS podcast?"
    
    audio_response = query_kb(audio_query)

🔍 Query: What key topics were discussed in the AWS podcast?
Processing...
Query processed in 2.95 seconds

Response:
The key topics discussed in the AWS podcast include:

1. **Introduction and Welcome**: The hosts, Nolan Chen and Malini Chatterjee, welcome their guest, Ben Schneider, to the AWS Rethink podcast.

2. **Recap of AWS Reinvent 2024**: The discussion includes a recap of the AWS Reinvent 2024 event, which took place earlier that month in Las Vegas.

3. **Guest Introduction**: Ben Schneider, the head of AI and modern data strategy business development at AWS, shares his experience and role at AWS.

These topics form the core of the discussion in the podcast.


### Query 2: Visual Content

Next, let's ask about the image content.

In [7]:
# Only run if we have a Knowledge Base set up
if kb_ready:
    # Query about visual content
    visual_query = "What were the products shown at the Airport?"
    
    visual_response = query_kb(visual_query)

🔍 Query: What were the products shown at the Airport?
Processing...
Query processed in 3.57 seconds

Response:
According to the obtained information, the products showcased at the airport were suitcases.

There is enough evidence in the results to conclude that suitcases were the products shown at the airport. Two suitcases were spotted in front of a large window with an airplane view, placed on the floor. The presence of suitcases is further corroborated by the text appearing in the results, where it reads, "Find more suitcases on Amazon.com". 

While there might be other products available at the airport, the specific product identified and emphasized in the results is the suitcase. Therefore, it is reasonable to infer that the primary product showcased at the airport was the suitcase.


### Query 3: Document Content

Let's explore information from document content.

In [8]:
# Only run if we have a Knowledge Base set up
if kb_ready:
    # Query about document content
    document_query = "What are the key callouts from the treasury statement?"
    
    document_response = query_kb(document_query)

🔍 Query: What are the key callouts from the treasury statement?
Processing...
Query processed in 2.67 seconds

Response:
The key callouts from the treasury statement include the fact that monthly statements of receipts and outlays of the U.S. Government (MTS) are published by the Bureau of the Fiscal Service, Department of the Treasury and the MTS is part of a triad of Treasury financial reports, which include the Daily Treasury Statement, the Monthly Statement of the Public Debt, and the Combined Statement of Receipts, Outlays, and Balances.


### Query 4: Video Content

Now let's ask a question about the video content.

In [9]:
# Only run if we have a Knowledge Base set up
if kb_ready:
    # Query requiring cross-modal integration
    cross_modal_query = "What happened in El Matador beach?"
    
    cross_modal_response = query_kb(
        query=cross_modal_query,
        num_results=8  # Increase results to capture information from multiple modalities
    )

🔍 Query: What happened in El Matador beach?
Processing...
Query processed in 2.14 seconds

Response:
Based on the retrieved results, there were mysterious disappearances along the stretch of road above El Matador Beach. A witness saw a man atop a rock formation, who then disappeared and was replaced by a woman in a white dress. 

The disappearances were investigated by Detective Sullivan, who dismissed theories that they were mob-related incidents or suicides. The only common thread among the victims was their divorced status.
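
Each call above stores the raw response in a variable such as `audio_response`, but the cells never inspect it. The sketch below pulls out the answer text and the source URIs. It assumes the helper returns the RetrieveAndGenerate response unchanged, with `output.text` and `citations[].retrievedReferences[].location.s3Location.uri`. The function name and the sample response are made up for illustration.

```python
def summarize_rag_response(response):
    """Return (answer_text, unique_source_uris) from a RetrieveAndGenerate-style response."""
    if not response:
        # query_kb returns None when the call fails.
        return "", []
    answer = response.get("output", {}).get("text", "")
    sources = []
    for citation in response.get("citations", []):
        for reference in citation.get("retrievedReferences", []):
            uri = reference.get("location", {}).get("s3Location", {}).get("uri")
            if uri and uri not in sources:
                sources.append(uri)
    return answer, sources


# Made-up response with the assumed shape; not actual output from this notebook.
sample_response = {
    "output": {"text": "Example answer."},
    "citations": [
        {"retrievedReferences": [
            {"location": {"type": "S3", "s3Location": {"uri": "s3://example-bucket/a.mp4"}}},
            {"location": {"type": "S3", "s3Location": {"uri": "s3://example-bucket/a.mp4"}}},
        ]},
        {"retrievedReferences": [
            {"location": {"type": "S3", "s3Location": {"uri": "s3://example-bucket/b.pdf"}}},
        ]},
    ],
}
answer, sources = summarize_rag_response(sample_response)
print(answer)
print(sources)
```

Listing the source URIs next to each answer makes it easier to confirm which modality (audio, image, document, or video) a response actually drew on.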


## Summary

In this notebook, we demonstrated how to build a Multimodal RAG application using Amazon Bedrock Data Automation and Bedrock Knowledge Bases. We covered the key steps:

1. **Knowledge Base Configuration**: We loaded the Knowledge Base configuration created by the setup notebook
2. **Knowledge Base Connection**: We connected to the existing Knowledge Base using the saved configuration
3. **Data Ingestion**: We ingested multimodal data into the Knowledge Base
4. **Querying**: We queried the Knowledge Base across different modalities using natural language

This workflow separates setup from usage: the setup notebook creates the resources, while this notebook handles data ingestion and querying. Once ingestion completes, the Knowledge Base can answer questions across the modalities in your data.