# Evaluate Biomarker Supervisor Agent
In this notebook, we evaluate the Biomarker Supervisor Agent using the [Bedrock Agent Evaluation Framework](https://github.com/aws-samples/amazon-bedrock-agent-evaluation-framework/tree/main).

## Pre-requisites

1. Set up a Langfuse account using either the cloud option (https://www.langfuse.com) or the self-hosted option for AWS (https://github.com/aws-samples/deploy-langfuse-on-ecs-with-fargate/tree/main/langfuse-v3)

2. Create an organization in Langfuse

3. Create a project within your Langfuse organization

4. Save your Langfuse project keys (Secret Key, Public Key, and Host) to use later in this notebook

5. If you are using the self-hosted option and want to see model costs, you must create a model definition in Langfuse for the LLM used by the Biomarker Supervisor Agent. Instructions can be found here: https://langfuse.com/docs/model-usage-and-cost#custom-model-definitions

For help with any of the steps above, see https://langfuse.com/docs/get-started

### Step 1: Clone the [Bedrock Agent Evaluation Framework](https://github.com/aws-samples/amazon-bedrock-agent-evaluation-framework/tree/main)

In [None]:
!git clone https://github.com/aws-samples/amazon-bedrock-agent-evaluation-framework.git

### Step 2: Input the relevant information specific to the Biomarker Supervisor Agent and Langfuse setup
Note: All of these variables must be filled in for the evaluation to work properly!

In [None]:
user_input = """

AGENT_ID="FILL"
AGENT_ALIAS_ID="FILL"

DATA_FILE_PATH="../hcls_trajectories.json"

LANGFUSE_PUBLIC_KEY="FILL"
LANGFUSE_SECRET_KEY="FILL"
LANGFUSE_HOST="FILL"

"""

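Because every variable must be filled in, it can help to check for leftover placeholders before moving on. The following is a minimal sketch, not part of the evaluation framework: `find_unfilled` is a hypothetical helper, and the `sample` string stands in for `user_input` with made-up values.

```python
def find_unfilled(env_text, placeholder="FILL"):
    """Return the names of variables whose value is still the placeholder."""
    unfilled = []
    for line in env_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Compare the value with surrounding quotes removed
        if value.strip().strip("\"'") == placeholder:
            unfilled.append(key.strip())
    return unfilled


# Illustrative input only; in the notebook you would pass user_input instead
sample = '''
AGENT_ID="ABC123"
AGENT_ALIAS_ID="FILL"
LANGFUSE_HOST="FILL"
'''

print(find_unfilled(sample))  # ['AGENT_ALIAS_ID', 'LANGFUSE_HOST']
```

In the notebook, calling `find_unfilled(user_input)` and stopping if the returned list is non-empty would catch a forgotten value before the evaluation starts.
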
### Step 3: Create the config.env file that the evaluation tool needs

In [None]:
import os
from string import Template

# Read the template file from the Bedrock Agent Evaluation Framework
template_file_path = os.path.join('amazon-bedrock-agent-evaluation-framework', 'config.env.tpl')
with open(template_file_path, 'r') as template_file:
    template_content = template_file.read()


# Convert template content and user input into dictionaries
def parse_env_content(content):
    env_dict = {}
    for line in content.split('\n'):
        line = line.strip()
        if line and not line.startswith('#'):
            if '=' in line:
                key, value = line.split('=', 1)
                env_dict[key.strip()] = value.strip()
    return env_dict

template_dict = parse_env_content(template_content)
user_dict = parse_env_content(user_input)

# Merge dictionaries, with user input taking precedence
final_dict = {**template_dict, **user_dict}

# Create the config.env content
config_content = ""
for key, value in final_dict.items():
    config_content += f"{key}={value}\n"

# Write to config.env file in the correct folder
config_file_path = os.path.join('amazon-bedrock-agent-evaluation-framework', 'config.env')
with open(config_file_path, 'w') as config_file:
    config_file.write(config_content)

print(f"config.env file has been created successfully at {config_file_path}!")
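
To make the merge step above concrete, here is a small sketch of how user input overrides template defaults. It repeats the `parse_env_content` logic from the cell above so that it runs on its own. The template text and the `EXAMPLE_SETTING` key are invented for illustration and are not taken from the real `config.env.tpl`.

```python
def parse_env_content(content):
    """Parse KEY=VALUE lines into a dict, skipping blanks and comments."""
    env_dict = {}
    for line in content.split('\n'):
        line = line.strip()
        if line and not line.startswith('#') and '=' in line:
            key, value = line.split('=', 1)
            env_dict[key.strip()] = value.strip()
    return env_dict


# Hypothetical template content (the real config.env.tpl may differ)
template_text = """
# defaults
AGENT_ID=
EXAMPLE_SETTING=default
"""

# Hypothetical user input
user_text = """
AGENT_ID="ABC123"
"""

# Later dicts win in a {**a, **b} merge, so user values override the template
merged = {**parse_env_content(template_text), **parse_env_content(user_text)}
print(merged)  # {'AGENT_ID': '"ABC123"', 'EXAMPLE_SETTING': 'default'}
```

Note that the parser keeps the surrounding double quotes as part of each value, so they are written to `config.env` unchanged. Keys that appear only in the template keep their template values.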

### Step 4: Run the evaluation tool to get results in Langfuse!

![Example Traces](img/example_traces.png)

![Example Trace](img/example_trace.png)

In [None]:
# Execute bash script to run evaluation
!chmod +x execute_eval.sh
!./execute_eval.sh

### Step 5: Navigate to your Langfuse host address, open the relevant Langfuse project, and view the traces populated there during the evaluation run