# Amazon Bedrock Recipe: Langfuse Integration with Bedrock Agents

## Overview
This recipe implements an OpenTelemetry-based tracing and monitoring system for Amazon Bedrock Agents through Langfuse integration. It creates hierarchical trace structures to track agent performance metrics including token usage, latency measurements, and execution durations across preprocessing, orchestration, and postprocessing phases. It processes both streaming and non-streaming responses, generating spans with operation attributes such as timing data, error states, and response content. The error handling and logging functions enable systematic debugging, performance monitoring, and audit trail maintenance. 

### Context
Langfuse integration enables tracing, monitoring, and analyzing the performance and behavior of your Bedrock Agents. This helps you understand agent interactions, debug issues, and optimize performance. It can be used with single agents, multi-agent collaboration (MAC), or inline agents. You can use either the Langfuse cloud platform or a self-hosted deployment running in a container.

#### Use Case
This recipe demonstrates the integration between Langfuse and Amazon Bedrock Agents, providing observability outside of AWS tooling.

#### Implementation
In this notebook we will show how to integrate Amazon Bedrock Agents and Langfuse using both the Langfuse cloud platform and self-hosted option in a container running in AWS. We will configure agent observability, send traces to Langfuse, and validate the results using a single agent.


## Prerequisites
You need an AWS account with appropriate IAM permissions for Amazon Bedrock Agents and model access, plus permission to deploy containers if you use the Langfuse self-hosted option.

### Python Dependencies

To run this notebook, you'll need to install some libraries in your environment:


In [None]:

%pip install -r requirements.txt

### AWS Credentials
Before using Amazon Bedrock, ensure that your AWS credentials are configured correctly. You can set them up using the AWS CLI or by setting environment variables. This notebook assumes that the credentials are already configured.


In [None]:
import boto3

# Create the client to invoke Agents in Amazon Bedrock:
br_agents_runtime = boto3.client("bedrock-agent-runtime")

### Amazon Bedrock Agent


We assume you've already created an [Amazon Bedrock Agent](https://docs.aws.amazon.com/bedrock/latest/userguide/agents.html). If you don't have one already you can follow the **[instructions here]()** to set up an example agent.

Configure your agent's **ID** and (optionally) alias ID in the cell below. You can find these by looking up your agent in the ["Agents" page on the AWS Console for Amazon Bedrock](https://console.aws.amazon.com/bedrock/home?#/agents) or by using the AWS CLI.

The Agent ID should be ten characters, uppercase, and alphanumeric. If you haven't created an Alias for your agent yet, you can use `TSTALIASID` to reference the latest saved development version.

In [None]:
agent_id = ""  # <- Configure your Bedrock Agent ID
agent_alias_id = "TSTALIASID"  # <- Optionally set a different Alias ID if you have one
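
A quick format check can catch a mistyped or unconfigured ID before the first API call. The sketch below assumes the ten-character, uppercase alphanumeric format described above. `looks_like_agent_id` is a hypothetical helper written for this notebook, not part of boto3, and the sample IDs are placeholders.

```python
import re

# Pattern assumed from the format described in this notebook: exactly ten
# uppercase letters or digits. Hypothetical helper, not part of boto3.
_AGENT_ID_PATTERN = re.compile(r"[A-Z0-9]{10}")


def looks_like_agent_id(value: str) -> bool:
    """Return True if `value` matches the expected Agent ID format."""
    return _AGENT_ID_PATTERN.fullmatch(value) is not None


print(looks_like_agent_id("ABCDE12345"))  # well-formed placeholder ID
print(looks_like_agent_id(""))            # unconfigured value
```

A passing check only confirms the shape of the ID. Whether the agent actually exists is still verified by the test invocation that follows.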

Before moving on, let's validate that `invoke_agent` is working correctly. The response content is not important; we are simply testing the API call.

In [None]:
print(f"Trying to invoke alias {agent_alias_id} of agent {agent_id}...")
agent_resp = br_agents_runtime.invoke_agent(
    agentAliasId=agent_alias_id,
    agentId=agent_id,
    inputText="Hello!",
    sessionId="dummy-session",
)
if "completion" in agent_resp:
    print("✅ Got response")
else:
    raise ValueError(f"No 'completion' in agent response:\n{agent_resp}")

### Langfuse API keys

There are multiple ways to use Langfuse, so we first need to configure where your Langfuse instance is hosted:

### Langfuse Cloud

If you're directly using [Langfuse Cloud](https://langfuse.com/pricing), your `langfuse_api_url` will be either
- `https://cloud.langfuse.com/`
- `https://us.cloud.langfuse.com/`
- ...or similar.

### Self-hosted

If you want to deploy the Open Source version of Langfuse in your own AWS Account, you can use the quick-start [CloudFormation](https://aws.amazon.com/cloudformation/resources/templates/) template provided below:

> ⚠️ **But first, note:**
> - This sample deployment is intended for initial experimentation, and doesn't fully implement scalability and security best-practices. It should not be used in mission-critical or production environments. For more details, see the [solution source code](https://github.com/aws-samples/amazon-bedrock-samples/tree/main/evaluation-observe/deploy-langfuse-on-ecs-fargate-with-typescript-cdk) and Langfuse's own [documentation on self-hosting](https://langfuse.com/self-hosting).
> - The solution uses resources outside of the AWS Free Tier, and the **estimated cost** to run is around $4-10 per full day (which may vary depending on your usage). When you delete the created stack(s), any data you stored in your Langfuse instance will also be deleted.
> - To deploy this stack, you'll need IAM permissions to manage AWS IAM roles and policies, AWS Lambda Functions, and AWS CodeBuild projects in your account. Find more information about the architecture and deployment [here](https://github.com/aws-samples/amazon-bedrock-samples/tree/main/evaluation-observe/deploy-langfuse-on-ecs-fargate-with-typescript-cdk).

[![Launch Stack](https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png)](https://console.aws.amazon.com/cloudformation/home?#/stacks/create/review?templateURL=https://aws-blogs-artifacts-public.s3.us-east-1.amazonaws.com/artifacts/ML-18524/langfuse-bootstrap.yml&stackName=LangfuseBootstrap "Launch Stack")

Once the stack deploys successfully, look up the `LangfuseUrl` output in the created `LangfuseDemo` stack.

You'll need to visit this URL to sign up a user account and create an Organization and Project in the Langfuse UI. Take note of the **project name** as you will need that later in the code.

For your `langfuse_api_url` below, use this same URL (for example, `https://123abcdefghijk.cloudfront.net/`).


### AWS Marketplace

For production-ready deployments of Langfuse on your own AWS Account, check out the [offerings from Langfuse on the AWS Marketplace](https://aws.amazon.com/marketplace/seller-profile?id=seller-nmyz7ju7oafxu).

---

However your Langfuse environment is deployed, set up the target URL below:

> ⚠️ **Note:** If you change this URL after using it, you'll need to **restart your notebook kernel**. Otherwise, you'll see a message like `opentelemetry.trace - WARNING - Overriding of current TracerProvider is not allowed` when you try to use the new one, and the client won't work correctly.

In [None]:
langfuse_api_url = "https://us.cloud.langfuse.com/"  # <- Replace as described above

Once your Langfuse environment is set up and you've signed in to the UI, you'll need to set up an **API key pair** for your particular Organization and Project (create a new project if you don't have one already).

For more information, see the [FAQ: Where are my Langfuse API keys](https://langfuse.com/faq/all/where-are-langfuse-api-keys) and Langfuse's [getting started documentation](https://langfuse.com/docs/get-started).

In [None]:
langfuse_public_key = "xxx"  # <- Configure your own key here
langfuse_secret_key = "xxx"  # <- Configure your own key here

### Setting up agent tracing

With all the prerequisites in place, we're ready to record traces from your Bedrock Agent into Langfuse.

First, let's load the libraries:

In [None]:
import time
import boto3
import uuid
import json
from core.timer_lib import timer
from core import instrument_agent_invocation, flush_telemetry

#### Now let's define a wrapper function
Here we create a wrapper function that invokes the Amazon Bedrock Agent through the Amazon Bedrock Agents runtime API, with instrumentation for Langfuse. The wrapper provides:

1. Instrumentation for monitoring
2. Configurable streaming support
3. Trace enabling for debugging
4. Flexible parameter handling through kwargs
5. Proper logging of configuration states


In [None]:
@instrument_agent_invocation
def invoke_bedrock_agent(
    inputText: str, agentId: str, agentAliasId: str, sessionId: str, **kwargs
):
    """Invoke a Bedrock Agent with instrumentation for Langfuse."""
    # Create Bedrock client
    bedrock_rt_client = boto3.client("bedrock-agent-runtime")
    use_streaming = kwargs.get("streaming", False)
    invoke_params = {
        "inputText": inputText,
        "agentId": agentId,
        "agentAliasId": agentAliasId,
        "sessionId": sessionId,
        "enableTrace": True,  # Required for instrumentation
    }

    # Add streaming configurations if needed
    if use_streaming:
        invoke_params["streamingConfigurations"] = {
            "applyGuardrailInterval": 10,
            "streamFinalResponse": True,
        }
    response = bedrock_rt_client.invoke_agent(**invoke_params)
    return response
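
The extra keyword arguments accepted by the wrapper (such as `userId` and `tags`) are not sent to the Bedrock API, so they are presumably consumed by the decorator. The implementation of `instrument_agent_invocation` in `core` is not shown in this notebook, so the following is only a minimal sketch of the general pattern under that assumption. `record_invocation`, `RECORDED_CALLS`, and `fake_invoke` are hypothetical names, and the sketch records timing and metadata in a plain list instead of exporting spans.

```python
import functools
import time

# Hypothetical stand-in for a tracing decorator. It is NOT the `core`
# implementation: it only appends timing and metadata to a list.
RECORDED_CALLS = []


def record_invocation(func):
    """Record duration, caller metadata, and any error for each call."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        started = time.perf_counter()
        error = None
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            error = repr(exc)
            raise  # re-raise so callers still see the failure
        finally:
            RECORDED_CALLS.append(
                {
                    "name": func.__name__,
                    "user_id": kwargs.get("userId"),
                    "tags": kwargs.get("tags", []),
                    "duration_s": time.perf_counter() - started,
                    "error": error,
                }
            )

    return wrapper


@record_invocation
def fake_invoke(inputText, **kwargs):
    # Stand-in for the real agent call; simply echoes the input.
    return {"completion": f"echo: {inputText}"}


result = fake_invoke(inputText="Hello!", userId="demo-user", tags=["example"])
print(result["completion"], RECORDED_CALLS[-1]["name"])
```

Keeping the metadata in keyword arguments lets the decorated function stay a thin pass-through to the API while the decorator handles everything related to observability.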

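### Now let's create a helper function to handle the responses

The helper iterates over the events in the agent's response stream and:

1. Converts botocore event objects to dictionaries where needed
2. Extracts the `bytes` payload from each `chunk` event
3. Decodes the bytes as UTF-8 text
4. Concatenates the pieces into the full response, printing any error raised while reading the stream

It's particularly useful for: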

1. Real-time processing of large responses
2. Interactive applications requiring immediate feedback
3. Debugging and monitoring streaming responses
4. Ensuring proper text encoding/decoding

In [None]:
def process_streaming_response(stream):
    """Process a streaming response from Bedrock Agent."""
    full_response = ""
    try:
        for event in stream:
            # Convert event to dictionary if it's a botocore Event object
            event_dict = (
                event.to_response_dict()
                if hasattr(event, "to_response_dict")
                else event
            )
            if "chunk" in event_dict:
                chunk_data = event_dict["chunk"]
                if "bytes" in chunk_data:
                    output_bytes = chunk_data["bytes"]
                    # Convert bytes to string if needed
                    if isinstance(output_bytes, bytes):
                        output_text = output_bytes.decode("utf-8")
                    else:
                        output_text = str(output_bytes)
                    full_response += output_text
    except Exception as e:
        print(f"\nError processing stream: {e}")
    return full_response
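
For interactive use, it can help to surface text as each chunk arrives instead of waiting for the full response. The generator below is a minimal sketch of that idea, not part of the recipe's `core` module. `iter_stream_text` is a hypothetical name, the sketch assumes events are plain dictionaries, and `fake_stream` is simulated data standing in for `response["completion"]`.

```python
from typing import Any, Dict, Iterable, Iterator


def iter_stream_text(stream: Iterable[Dict[str, Any]]) -> Iterator[str]:
    """Yield decoded text for each chunk event as it arrives."""
    for event in stream:
        data = event.get("chunk", {}).get("bytes")
        if data is None:
            continue  # events without a chunk payload carry no response text
        yield data.decode("utf-8") if isinstance(data, bytes) else str(data)


# Simulated events (assumption: shaped like the chunk events handled above).
fake_stream = [
    {"chunk": {"bytes": b"Hello, "}},
    {"trace": {"note": "non-chunk events are skipped"}},
    {"chunk": {"bytes": b"world!"}},
]

pieces = list(iter_stream_text(fake_stream))
print("".join(pieces))
```

In a notebook, looping over the generator and calling `print(piece, end="", flush=True)` would display the answer incrementally.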

### Langfuse Configuration

In [None]:
import os
import base64
start = time.time()
with open('config.json', 'r') as config_file:
    config = json.load(config_file)
    
# For Langfuse specifically, but you can add any other observability provider:
os.environ["OTEL_SERVICE_NAME"] = 'Langfuse'
os.environ["DEPLOYMENT_ENVIRONMENT"] = config["langfuse"]["environment"]
project_name = config["langfuse"]["project_name"]
environment = config["langfuse"]["environment"]
langfuse_public_key = config["langfuse"]["langfuse_public_key"]
langfuse_secret_key = config["langfuse"]["langfuse_secret_key"]
langfuse_api_url = config["langfuse"]["langfuse_api_url"]

# Create auth header
auth_token = base64.b64encode(
    f"{langfuse_public_key}:{langfuse_secret_key}".encode()
).decode()

# Set OpenTelemetry environment variables for Langfuse
# Strip any trailing slash from the base URL so the endpoint has no double slash
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = f"{langfuse_api_url.rstrip('/')}/api/public/otel/v1/traces"
os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"Authorization=Basic {auth_token}"
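
The same settings can be wrapped in a small helper, which makes the Basic auth encoding and the URL handling easy to check in isolation. This is a sketch only: `build_langfuse_otlp_env` is a hypothetical name, the endpoint path is copied from the cell above, and the keys are placeholders.

```python
import base64


def build_langfuse_otlp_env(api_url: str, public_key: str, secret_key: str) -> dict:
    """Return the two OTLP environment variables set in the cell above."""
    # HTTP Basic auth: base64 of "public_key:secret_key"
    token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    base = api_url.rstrip("/")  # tolerate a base URL that ends in "/"
    return {
        "OTEL_EXPORTER_OTLP_ENDPOINT": f"{base}/api/public/otel/v1/traces",
        "OTEL_EXPORTER_OTLP_HEADERS": f"Authorization=Basic {token}",
    }


# Placeholder credentials, for illustration only.
settings = build_langfuse_otlp_env(
    "https://us.cloud.langfuse.com/", "pk-demo", "sk-demo"
)
print(settings["OTEL_EXPORTER_OTLP_ENDPOINT"])
```

The returned dictionary can be applied with `os.environ.update(settings)`.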

### Update fields to pass to Agent and User
The next code block requires some editing before you run it. Here we set the parameters that Langfuse uses to track traces.



In [None]:

# Langfuse configuration
project_name = "xxx"  # Enter the name of the Langfuse project you created
environment = "default"  # Enter the environment name

# User information
user_id = "xxx"  # Used in the Langfuse UI to filter traces

# Foundation Model used by the agent (used to estimate costs)
agent_model_id = "xxx"  # e.g. "claude-3-5-sonnet-20241022-v2:0"
    

In [None]:
# Agent configuration
agentId = config["agent"]["agentId"]
agentAliasId = config["agent"]["agentAliasId"]
sessionId = f"session-{int(time.time())}"

# User information
userId = config["user"]["userId"]  
agent_model_id = config["user"]["agent_model_id"]

# Tags for filtering in Langfuse
tags = ["bedrock-agent", "example", "development"]

# Generate a custom trace ID
trace_id = str(uuid.uuid4())
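
The cells above read several keys from `config.json`, but the file's layout is not shown. The sketch below infers a layout from those lookups and adds a check for missing keys. All values are placeholders, and `EXAMPLE_CONFIG` and `missing_config_keys` are hypothetical names introduced for illustration.

```python
import json

# Key layout inferred from the config[...] lookups in this notebook.
# All values are placeholders to replace with your own.
EXAMPLE_CONFIG = {
    "langfuse": {
        "project_name": "xxx",
        "environment": "default",
        "langfuse_public_key": "xxx",
        "langfuse_secret_key": "xxx",
        "langfuse_api_url": "https://us.cloud.langfuse.com",
    },
    "agent": {"agentId": "xxx", "agentAliasId": "TSTALIASID"},
    "user": {"userId": "xxx", "agent_model_id": "xxx"},
}


def missing_config_keys(config: dict) -> list:
    """Return dotted paths of keys the notebook reads but `config` lacks."""
    missing = []
    for section, keys in EXAMPLE_CONFIG.items():
        for key in keys:
            if key not in config.get(section, {}):
                missing.append(f"{section}.{key}")
    return missing


print(json.dumps(EXAMPLE_CONFIG, indent=2))
print(missing_config_keys({"langfuse": {"project_name": "demo"}}))
```

Running the check right after loading the file reports every missing key at once, which is easier to act on than a `KeyError` on the first missing lookup.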


### Prompt

In [None]:
# Your prompt and streaming mode
question = "xxx" # your prompt to the agent
streaming = False


### Invoke Agent Function
Here we invoke the agent, passing all the required parameters along with the settings for the Langfuse observability integration.

In [None]:
# Single invocation that works for both streaming and non-streaming
response = invoke_bedrock_agent(
    inputText=question,
    agentId=agentId,
    agentAliasId=agentAliasId,
    sessionId=sessionId,
    show_traces=True,
    SAVE_TRACE_LOGS=True,
    userId=userId,
    tags=tags,
    trace_id=trace_id,
    project_name=project_name,
    environment=environment,
    langfuse_public_key=langfuse_public_key,
    langfuse_secret_key=langfuse_secret_key,
    langfuse_api_url=langfuse_api_url,
    streaming=streaming,
    model_id=agent_model_id,
)

### Response Handling
Here we handle the different response types that the agent or API can return and print the result.

In [None]:
# Handle the response appropriately based on streaming mode
if isinstance(response, dict) and "error" in response:
    print(f"\nError: {response['error']}")
elif streaming and isinstance(response, dict) and "completion" in response:
    print("\n🤖 Agent response (streaming):")
    if "extracted_completion" in response:
        print(response["extracted_completion"])
    else:
        # Print the assembled text; the helper only returns it
        print(process_streaming_response(response["completion"]))
else:
    # Non-streaming response
    print("\n🤖 Agent response:")
    if isinstance(response, dict) and "extracted_completion" in response:
        print(response["extracted_completion"])
    elif (
        isinstance(response, dict) 
        and "completion" in response
        and hasattr(response["completion"], "__iter__")
    ):
        print("Processing completion:")
        full_response = process_streaming_response(response["completion"])
        print(f"\nFull response: {full_response}")
    else:
        print("Raw response:")
        print(f"{response}")

#### Clean up
Flush any pending telemetry before exiting so that all traces are sent to Langfuse.

In [None]:
flush_telemetry()
timer.reset_all()
