## Create VCF Genomic Analysis Agent

In this notebook we create the VCF Genomic Analysis Agent that will analyze genomic variants and provide clinical insights using the Strands framework. The agent will be deployed using Bedrock AgentCore.

#### Upgrade boto3 to the latest version with support for Bedrock AgentCore

In [None]:
%pip install --upgrade boto3

#### Ensure the latest version of boto3 is shown below

Ensure the boto3 version printed below is 1.39 or higher.

In [None]:
%pip show boto3
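
As an alternative to reading the `%pip show` output by eye, the version check can be done in code. The following is a minimal sketch: the helper name `meets_minimum` is hypothetical, and the only assumption taken from this notebook is the 1.39 minimum stated above.

```python
from importlib.metadata import PackageNotFoundError, version


def meets_minimum(installed: str, minimum: tuple = (1, 39)) -> bool:
    """Return True if a 'major.minor[.patch]' version string is at least `minimum`."""
    parts = installed.split(".")
    major_minor = (int(parts[0]), int(parts[1]))
    return major_minor >= minimum


try:
    installed_version = version("boto3")
    print(f"boto3 {installed_version} meets minimum: {meets_minimum(installed_version)}")
except PackageNotFoundError:
    # boto3 is missing from the active kernel environment
    print("boto3 is not installed in this kernel")
```

If the check reports `False` right after upgrading, restarting the kernel usually picks up the new version.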

#### Install Strands Agents and AgentCore dependencies

In [None]:
%pip install strands-agents strands-agents-tools bedrock-agentcore bedrock-agentcore-starter-toolkit --quiet

In [None]:
%run create_agent_role.py

#### Import required libraries

In [10]:
from utils.magic_helper import register_cell_magic

#### Agent Creation

In this section we create the genomic analysis agent: a specialist VCF agent wrapped as a tool, and an orchestrator agent that calls it.

#### Agents as Tools with Strands Agents


"Agents as Tools" is an architectural pattern in AI systems where specialized AI agents are wrapped as callable functions (tools) that can be used by other agents. This creates a hierarchical structure where:

1. A primary "orchestrator" agent handles user interaction and determines which specialized agent to call
2. Specialized "tool agents" perform domain-specific tasks when called by the orchestrator

This approach mimics human team dynamics, where a manager coordinates specialists, each bringing unique expertise to solve complex problems. Rather than a single agent trying to handle everything, tasks are delegated to the most appropriate specialized agent.
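
The pattern can be illustrated without any agent framework. The sketch below is framework-free and purely illustrative: the specialist functions, the `TOOLS` registry, and the keyword routing rule are all hypothetical stand-ins, not the Strands API. In the real notebook the language model, not a keyword check, decides which tool to call.

```python
from typing import Callable, Dict


def variant_specialist(query: str) -> str:
    """Hypothetical specialist: stands in for a domain-specific agent."""
    return f"[variant specialist] handled: {query}"


def cohort_specialist(query: str) -> str:
    """Hypothetical specialist for cohort-level questions."""
    return f"[cohort specialist] handled: {query}"


# The orchestrator sees each specialist only as a named callable (a "tool").
TOOLS: Dict[str, Callable[[str], str]] = {
    "variant": variant_specialist,
    "cohort": cohort_specialist,
}


def orchestrate(query: str) -> str:
    """Pick a specialist with a naive keyword rule and delegate the query.

    A real orchestrator lets the language model choose the tool; the keyword
    check only keeps this sketch runnable without a model.
    """
    tool_name = "cohort" if "cohort" in query.lower() else "variant"
    return TOOLS[tool_name](query)


print(orchestrate("How many patients are in the cohort?"))
print(orchestrate("Is this BRCA1 variant pathogenic?"))
```

The point of the design is that the orchestrator depends only on a tool's name and signature, so a specialist can be replaced or extended without changing the orchestrator.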

In [None]:
%%write_and_run vcf_agent_supervisor.py

import boto3
import json
import uuid
import requests
from typing import Dict, Any
from strands import Agent, tool
from strands.models import BedrockModel

from vcf_interpreters import *

# Specialist tool: wraps the VCF interpreter agent so the orchestrator can call it

@tool
def vcf_genomic_analyst(query: str) -> str:
    """
    Create vcf interpreter agent with lake formation tables using Strands framework

    Args:
        query: An information request from the vcf database

    Returns:
        A summary of the understanding of the user's query and the response.
    """
    try:
        vcf_sup_agent = Agent(
            model=model,
            tools=vcf_agent_tools,
            system_prompt=vcf_agent_instruction
        )
        vcf_agent_response = vcf_sup_agent(query)
        print("VCF Genomic Analysis Agent Response:")
        print(vcf_agent_response)
        # Convert the agent result to text so the return value matches the declared `str` type
        return str(vcf_agent_response)
    except Exception as e:
        print(f"Error creating agent: {e}")
        raise

# Define orchestrator agent configuration below

agent_name = "vcf-genomic-analyst"
agent_description = "VCF genomic analysis agent for clinical insights discovery"
agent_instruction = """You are a genomic research assistant AI specialized in VCF (Variant Call Format) data analysis and genomic variant interpretation from AWS HealthOmics.
Your primary task is to interpret user queries about genomic variants, analyze VCF data, and provide comprehensive genomic insights based on the data.

Use the vcf_genomic_analyst tool for all genomic variant analysis tasks including:
- Patient variant counting and profiling
- Pathogenic and benign variant identification
- Pharmacogenomic variant analysis
- Gene-specific variant analysis
- Clinical significance interpretation
- Chromosome-based variant distribution
- Variant frequency analysis
- Cross-patient variant comparison
- Patient data availability queries
- Total number of unique patients

When providing your response:
a. Start with a brief summary of your understanding of the genomic query.
b. Explain the analysis approach you're taking.
c. Present the results from the VCF analyst with appropriate genomic context.
d. Provide clinical interpretation and significance of findings.
e. Suggest follow-up analyses or clinical considerations when relevant.
f. Include appropriate disclaimers about clinical decision-making.

Make sure to explain genomic concepts, variant classifications, and clinical significance in a clear, accessible manner.
Always emphasize the importance of genetic counseling for clinical interpretation of genomic variants.

"""

# Define the model
bedrock_model = BedrockModel(
    model_id="us.anthropic.claude-3-7-sonnet-20250219-v1:0", 
    region_name=region,
    temperature=0.1,
    streaming=False
)

# Instantiate the orchestrator agent
try:
    orchestrator = Agent(
        model=bedrock_model,
        system_prompt=agent_instruction,
        callback_handler=None,
        # associate genomic analysis tools
        tools=[vcf_genomic_analyst]
    )
    print(f"Successfully created orchestrator agent: {agent_name}")
except Exception as e:
    print(f"Error creating agent: {e}")
    raise

#### Test the genomic analysis agent with different questions

In [None]:
# ---------------------------- Sample Question Bank --------------------------------------------

# VCF Agent Questions
vcf_agent_query_1 = "How many patients are there in the present cohort?"
vcf_agent_query_2 = "How many pathogenic variants did you find in patient NA21135?"

# -----------------------------------------------------------------------------------------
test_query = vcf_agent_query_2 # Change value here to test different questions

print(f"Testing orchestrator agent with query: {test_query}")
print("=" * (39 + len(test_query)))

try:
    # Run the agent
    response = orchestrator(test_query)
    
except Exception as e:
    print(f"Error during agent execution: {e}")
    import traceback
    traceback.print_exc()

In [None]:
response = orchestrator("Analyze the patient cohort and provide a comprehensive clinical summary including: individual risk stratification, population-level insights, shared pathogenic variants, personalized medicine recommendations, and clinical prioritization for genetic counseling. Also explain how you assess risk and prioritization.")

### Agent Deployment

In this section we deploy the genomic analysis agent using Bedrock AgentCore.

#### Preparing your agent for deployment on AgentCore Runtime

In [None]:
%%writefile vcf_agent_marker.py

from strands import Agent, tool
import argparse
import json
from bedrock_agentcore.runtime import BedrockAgentCoreApp
from strands.models import BedrockModel
from bedrock_agentcore_starter_toolkit import Runtime
from boto3.session import Session

# Import everything from vcf_interpreters
from vcf_interpreters import *

boto_session = Session()

# Define the genomic analysis agent tool
@tool
def vcf_genomic_analyst(query: str) -> str:
    """
    Create vcf interpreter agent with lake formation tables using Strands framework

    Args:
        query: An information request from the vcf database

    Returns:
        A summary of the understanding of the user's query and the response.
    """
    try:
        vcf_sup_agent = Agent(
            model=model,
            tools=vcf_agent_tools,
            system_prompt=vcf_agent_instruction
        )
        vcf_agent_response = vcf_sup_agent(query)
        print("VCF Genomic Analysis Agent Response:")
        print(vcf_agent_response)
        return str(vcf_agent_response)
    except Exception as e:
        print(f"Error creating agent: {e}")
        return f"Error in VCF analysis: {str(e)}"

# Define orchestrator agent configuration
agent_name = "vcf-genomic-analyst"
agent_description = "VCF genomic analysis agent for clinical insights discovery"
agent_instruction = """You are a genomic research assistant AI specialized in VCF (Variant Call Format) data analysis and genomic variant interpretation from AWS HealthOmics.
Your primary task is to interpret user queries about genomic variants, analyze VCF data, and provide comprehensive genomic insights based on the data.

You have access to:
1. AWS HealthOmics Variant Store containing genomic variant data
2. Lake Formation tables with processed genomic information
3. VCF analysis tools and functions

When users ask questions about:
- Patient cohorts and variant statistics
- Specific genomic variants and their clinical significance
- Variant frequency and population data
- Gene-based variant analysis
- Clinical interpretation of variants

You should:
1. Use the vcf_genomic_analyst tool to query the genomic data
2. Provide clear, scientifically accurate interpretations
3. Include relevant clinical context when available
4. Explain technical genomic concepts in accessible terms when needed

Always ensure your responses are based on the actual data available in the system."""

# Create the orchestrator
try:
    orchestrator = Agent(
        model=model,
        system_prompt=agent_instruction,
        callback_handler=None,
        # associate genomic analysis tools
        tools=[vcf_genomic_analyst]
    )
    print("✅ Orchestrator created successfully")
except Exception as e:
    print(f"❌ Error creating orchestrator: {e}")
    orchestrator = None

app = BedrockAgentCoreApp()

@app.entrypoint
async def strands_agent_bedrock_streaming(payload):
    """
    Invoke the agent with streaming capabilities
    This function demonstrates how to implement streaming responses
    with AgentCore Runtime using async generators
    """
    user_input = payload.get("prompt")
    print("User input:", user_input)
    
    if orchestrator is None:
        error_response = {"error": "Orchestrator not initialized", "type": "initialization_error"}
        print(f"Initialization error: {error_response}")
        yield error_response
        return
    
    try:
        # Stream each chunk as it becomes available
        async for event in orchestrator.stream_async(user_input):
            if "data" in event:
                yield event["data"]
    except Exception as e:
        # Handle errors gracefully in streaming context
        error_response = {"error": str(e), "type": "stream_error"}
        print(f"Streaming error: {error_response}")
        yield error_response

if __name__ == "__main__":
    app.run()
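
The entrypoint above forwards only the events that carry a `"data"` key. The following sketch reproduces that filtering with a simulated event stream so it can be run without Bedrock access. `fake_stream` and its event shapes are hypothetical; the real events come from `orchestrator.stream_async`.

```python
import asyncio
from typing import Any, AsyncIterator, Dict, List


async def fake_stream(prompt: str) -> AsyncIterator[Dict[str, Any]]:
    """Hypothetical stand-in for an agent's event stream."""
    yield {"init": True}        # non-text event, should be skipped
    yield {"data": "Echo: "}
    yield {"data": prompt}
    yield {"result": "done"}    # non-text event, should be skipped


async def text_chunks(prompt: str) -> AsyncIterator[str]:
    """Forward only events that carry a "data" key, as the entrypoint does."""
    async for event in fake_stream(prompt):
        if "data" in event:
            yield event["data"]


async def collect(prompt: str) -> List[str]:
    """Gather all streamed text chunks into a list."""
    return [chunk async for chunk in text_chunks(prompt)]


chunks = asyncio.run(collect("hello"))
print(chunks)
```

`asyncio.run` works when this is run as a script. Inside a notebook cell, where an event loop is already running, use `chunks = await collect("hello")` instead.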

### Deploying the agent to AgentCore Runtime

Define the agent name and retrieve the ARN of the runtime role. Replace the placeholders with your own values.

In [None]:
agent_name="<YOUR_AGENT_NAME>"

iam = boto3.client('iam')
agentcore_iam_role = iam.get_role(RoleName='<YOUR_ROLE_NAME>')['Role']['Arn']
agentcore_iam_role

In [None]:
%pip install git+https://github.com/aws/bedrock-agentcore-starter-toolkit.git

#### Configure AgentCore Runtime deployment

During the configure step, a Dockerfile is generated from your application code.

In [None]:
import boto3
import os
from bedrock_agentcore_starter_toolkit import Runtime
from boto3.session import Session
from botocore.config import Config
boto_session = Session()
region = boto_session.region_name

client = boto3.client(service_name='bedrock-runtime', 
                      region_name='<YOUR_REGION>'
                      )

agentcore_runtime = Runtime()

response = agentcore_runtime.configure(
    entrypoint="vcf_agent_marker.py",
    execution_role=agentcore_iam_role,
    auto_create_ecr=True,
    requirements_file="runtime_requirements.txt",
    region=region,
    agent_name=agent_name,
    disable_otel=True
)

#### Launching agent to AgentCore Runtime

Now that the Dockerfile has been generated, launch the agent to AgentCore Runtime. This step creates the Amazon ECR repository and the AgentCore Runtime.

In [None]:
launch_result = agentcore_runtime.launch(
     auto_update_on_conflict=True
)
launch_result
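
If the runtime is not ready immediately after launch, it can help to poll its status before invoking it. The helper below is a generic sketch: `wait_for_status` is a hypothetical name, and the state names `READY`, `CREATE_FAILED`, and `UPDATE_FAILED` are assumptions used for illustration. Check the starter toolkit documentation for how to read the actual runtime status. The status source here is simulated so the example runs without AWS access.

```python
import time
from typing import Callable, Iterable


def wait_for_status(
    get_status: Callable[[], str],
    terminal_states: Iterable[str],
    poll_seconds: float = 10.0,
    max_polls: int = 60,
) -> str:
    """Call get_status() until it returns a terminal state or polls run out."""
    terminal = set(terminal_states)
    status = get_status()
    for _ in range(max_polls):
        if status in terminal:
            break
        time.sleep(poll_seconds)
        status = get_status()
    return status


# Simulated status source (assumption): replace with a real status lookup.
_fake_states = iter(["CREATING", "CREATING", "READY"])
final_status = wait_for_status(
    get_status=lambda: next(_fake_states),
    terminal_states={"READY", "CREATE_FAILED", "UPDATE_FAILED"},
    poll_seconds=0.0,
)
print(final_status)
```

Bounding the loop with `max_polls` keeps the notebook cell from hanging indefinitely if the deployment never reaches a terminal state.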

### Now the VCF Genomic Analysis Agent is ready to assist you!

#### Invoking AgentCore Runtime

Finally, we can invoke our AgentCore Runtime with a payload.

In [None]:
invoke_response = agentcore_runtime.invoke({"prompt": "How many patients are there in the present cohort?"})

In [None]:
invoke_response = agentcore_runtime.invoke({"prompt": "How many pathogenic variants did you find in patient NA21142?"})

In [None]:
invoke_response = agentcore_runtime.invoke({"prompt": "Analyze patient NA21141 and provide individual risk stratification"})