<!-- NOTEBOOK_METADATA source: "⚠️ Jupyter Notebook" title: "Integrate Anthropic with Langfuse" sidebarTitle: "Anthropic" logo: "/images/integrations/anthropic_icon.svg" description: "Learn how to monitor and trace Anthropic models with Langfuse to improve and debug your AI applications." category: "Integrations" -->

# Observability for Anthropic Models with Langfuse

Anthropic provides advanced language models like Claude, known for their safety, helpfulness, and strong reasoning capabilities. By combining Anthropic's models with **Langfuse**, you can trace, monitor, and analyze your AI workloads in development and production.

This notebook demonstrates **two** different ways to use Anthropic models with Langfuse:
1. **Native Anthropic SDK with Langfuse Decorators:** Use Langfuse decorators to wrap Anthropic SDK calls for automatic tracing.
2. **OpenAI SDK Drop-in Replacement:** Use Anthropic's OpenAI-compatible endpoints via Langfuse's OpenAI SDK wrapper.

> **What is Anthropic?**  
Anthropic is an AI safety company that develops Claude, a family of large language models designed to be helpful, harmless, and honest. Claude models excel at complex reasoning, analysis, and creative tasks.

> **What is Langfuse?**  
[Langfuse](https://langfuse.com) is an open source platform for LLM observability and monitoring. It helps you trace and monitor your AI applications by capturing metadata, prompt details, token usage, latency, and more.


## 1. Install Dependencies

Before you begin, install the necessary packages in your Python environment:

- **anthropic**: The official Anthropic Python SDK for using Claude models.
- **openai**: Needed to call Anthropic's OpenAI-compatible endpoints.
- **langfuse**: Required for sending trace data to the Langfuse platform.


In [None]:
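# The examples below use the Langfuse Python SDK v2 import paths (langfuse.decorators), so pin below v3
%pip install anthropic openai "langfuse<3"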

## 2. Set Up Environment Variables

Configure your **Langfuse** credentials and **Anthropic** API key as environment variables. Replace the dummy keys below with the real ones from your respective accounts.

 - `LANGFUSE_PUBLIC_KEY` / `LANGFUSE_SECRET_KEY`: From your Langfuse Project Settings.
 - `LANGFUSE_HOST`: `https://cloud.langfuse.com` (EU region) or `https://us.cloud.langfuse.com` (US region).
 - `ANTHROPIC_API_KEY`: Your Anthropic API key from the [Anthropic Console](https://console.anthropic.com/).


In [None]:
import os

# Example environment variables (replace with your actual keys)
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..."  # your public key
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..."  # your secret key
os.environ["LANGFUSE_HOST"] = "https://cloud.langfuse.com"  # or https://us.cloud.langfuse.com

os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."  # Your Anthropic API key
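Before moving on, it can help to confirm that none of the variables are missing or still set to the placeholder values. The following is a minimal sketch. The helper name `find_unset_variables` and the rule that a value ending in `...` counts as a placeholder are assumptions made for this example, not part of either SDK.

```python
import os

REQUIRED_VARIABLES = [
    "LANGFUSE_PUBLIC_KEY",
    "LANGFUSE_SECRET_KEY",
    "LANGFUSE_HOST",
    "ANTHROPIC_API_KEY",
]


def find_unset_variables(names, environ=None):
    """Return the names that are missing, empty, or still hold a placeholder."""
    env = os.environ if environ is None else environ
    problems = []
    for name in names:
        value = env.get(name, "")
        # Assumption: the dummy keys above end in "..." and real keys do not
        if not value or value.endswith("..."):
            problems.append(name)
    return problems


unset = find_unset_variables(REQUIRED_VARIABLES)
if unset:
    print("Set these variables before continuing:", ", ".join(unset))
else:
    print("All required variables are set.")
```
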

## Approach 1: Using Native Anthropic SDK with Langfuse Decorators

Langfuse decorators provide a simple way to trace function calls and automatically capture input/output data. This approach allows you to use the native Anthropic SDK while getting full observability through Langfuse.

### Steps
1. Import the Anthropic client and Langfuse decorators.
2. Use the `@observe()` decorator on functions that call Anthropic.
3. Call `langfuse_context.update_current_observation()` inside the decorated function to pass model details and usage information.
4. Make API calls to Anthropic as normal.
5. View the trace in your Langfuse dashboard.

**Note:** For more examples on using Langfuse decorators, see the [Langfuse Python SDK documentation](https://langfuse.com/docs/sdk/python/decorators).

In [None]:
from anthropic import Anthropic
from langfuse.decorators import observe, langfuse_context

# Initialize the Anthropic client
anthropic = Anthropic(
    api_key=os.environ.get("ANTHROPIC_API_KEY")
)

In [None]:
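# as_type="generation" marks this observation as an LLM call so that model and usage fields are recorded
@observe(as_type="generation")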
def chat_with_claude(messages: list, model: str = "claude-3-5-sonnet-20241022", max_tokens: int = 1024):
    """Chat with Claude using the Anthropic SDK and trace with Langfuse."""
    
    # Make the API call to Anthropic
    response = anthropic.messages.create(
        model=model,
        max_tokens=max_tokens,
        messages=messages
    )
    
    # Update Langfuse observation with model details and usage
    langfuse_context.update_current_observation(
        model=model,
        input=messages,
        output=response.content[0].text,
        usage={
            "input": response.usage.input_tokens,
            "output": response.usage.output_tokens,
            "total": response.usage.input_tokens + response.usage.output_tokens
        },
        metadata={
            "model_id": response.model,
            "stop_reason": response.stop_reason
        }
    )
    
    return response

In [None]:
# Example usage with decorator
messages = [
    {"role": "user", "content": "What is Langfuse and how does it help with LLM observability?"}
]

response = chat_with_claude(messages)
print(response.content[0].text)
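
The examples here read `response.content[0].text`, which assumes the first content block is a text block. Responses can contain several blocks, and some (for example tool-use blocks) carry no text. A minimal sketch of a more defensive helper follows. The name `extract_text` is hypothetical, and it relies only on each block exposing a `type` attribute and, for text blocks, a `text` attribute.

```python
from types import SimpleNamespace


def extract_text(content_blocks) -> str:
    """Join the text of all text blocks, skipping any other block types."""
    parts = []
    for block in content_blocks:
        if getattr(block, "type", None) == "text":
            parts.append(block.text)
    return "".join(parts)


# Stand-in blocks for illustration; with a real response, pass response.content
demo_blocks = [
    SimpleNamespace(type="text", text="Hello, "),
    SimpleNamespace(type="tool_use", name="lookup", input={}),
    SimpleNamespace(type="text", text="world."),
]
print(extract_text(demo_blocks))
```
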

You can also use the decorator pattern for more complex workflows:

In [None]:
@observe()
def analyze_text(text: str):
    """Analyze text using Claude with multiple steps."""
    
    # Step 1: Summarize the text
    summary = chat_with_claude([
        {"role": "user", "content": f"Summarize this text in 2 sentences: {text}"}
    ], max_tokens=200)
    
    # Step 2: Extract key points
    key_points = chat_with_claude([
        {"role": "user", "content": f"List 3 key points from this text: {text}"}
    ], max_tokens=300)
    
    return {
        "summary": summary.content[0].text,
        "key_points": key_points.content[0].text
    }

# Example usage
sample_text = """Langfuse is an open-source LLM engineering platform that helps teams 
collaboratively debug, analyze, and iterate on their LLM applications. It provides 
detailed production traces, analytics, prompt management, and evaluation capabilities."""

result = analyze_text(sample_text)
print("Summary:", result["summary"])
print("\nKey Points:", result["key_points"])

Once the request completes, **log in to your Langfuse dashboard** and look for the new trace. You will see:
- The full conversation tree with nested observations
- Input/output for each Claude API call
- Token usage and costs
- Latency metrics
- Any metadata you've added


## Approach 2: Using OpenAI SDK Drop-in Replacement

Anthropic provides OpenAI-compatible endpoints that allow you to use the OpenAI SDK to interact with Claude models. This is particularly useful if you have existing code using the OpenAI SDK that you want to switch to Claude.

### Steps
1. Import the `OpenAI` client from `langfuse.openai`.
2. Create a client, setting `api_key` to your Anthropic API key and `base_url` to Anthropic's OpenAI-compatible endpoint.
3. Use the client's `chat.completions.create()` method with Claude model names.
4. View the trace in your Langfuse dashboard.

**Note:** Anthropic describes its OpenAI SDK compatibility layer as intended mainly for testing and comparing model capabilities, and not every OpenAI API feature is supported. Check Anthropic's documentation for the current limitations.

In [None]:
# Langfuse OpenAI client
from langfuse.openai import OpenAI

# Create an OpenAI client that points to Anthropic's OpenAI-compatible endpoint
client = OpenAI(
    api_key=os.environ.get("ANTHROPIC_API_KEY"),
    base_url="https://api.anthropic.com/v1/",  # Base URL of Anthropic's OpenAI SDK compatibility layer
)
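
Step 3 can be sketched as follows. The helper name `ask_claude` is hypothetical. It takes the `client` created above as a parameter and uses only the standard `chat.completions.create()` call of the OpenAI SDK, so treat it as an illustration rather than the one required pattern.

```python
def ask_claude(client, prompt: str, model: str = "claude-3-5-sonnet-20241022", max_tokens: int = 256) -> str:
    """Send one user message through an OpenAI-style client and return the reply text."""
    completion = client.chat.completions.create(
        model=model,
        max_tokens=max_tokens,
        messages=[{"role": "user", "content": prompt}],
    )
    # OpenAI-style responses carry the reply in choices[0].message.content
    return completion.choices[0].message.content


# Usage with the Langfuse-wrapped client created above (needs a valid API key):
# print(ask_claude(client, "What is Langfuse?"))
```

Passing the client in as an argument keeps the helper independent of how the client was configured, which also makes it easy to exercise with a stub.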

**Alternative Approach:** If you prefer not to use the OpenAI-compatible endpoint, you can call the native Anthropic SDK and record the generation manually with the Langfuse low-level SDK:

In [None]:
# Manual tracing of a native Anthropic SDK call with the Langfuse low-level SDK
from langfuse import Langfuse
from anthropic import Anthropic

# Initialize Langfuse client
langfuse = Langfuse()

# Create a trace manually
trace = langfuse.trace(name="anthropic-chat")

# Initialize Anthropic client
anthropic_client = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))

# Create a generation span
generation = trace.generation(
    name="claude-completion",
    model="claude-3-5-sonnet-20241022",
    model_parameters={"max_tokens": 256, "temperature": 0.7}
)

# Make the API call
messages = [
    {"role": "user", "content": "Explain the benefits of using Langfuse with Anthropic's Claude models."}
]

response = anthropic_client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=256,
    temperature=0.7,  # keep in sync with model_parameters logged on the generation
    messages=messages
)

# Update the generation with response data
generation.update(
    input=messages,
    output=response.content[0].text,
    usage={
        "input": response.usage.input_tokens,
        "output": response.usage.output_tokens,
        "total": response.usage.input_tokens + response.usage.output_tokens
    },
    metadata={
        "stop_reason": response.stop_reason
    }
)

# End the generation and send any buffered events before the cell finishes
generation.end()
langfuse.flush()

print(response.content[0].text)
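
The usage dictionary is built the same way in the decorator example and in the manual example. A small helper can keep that logic in one place. This is a sketch: `build_usage` is a hypothetical name, and it assumes only that the object passed in exposes `input_tokens` and `output_tokens`, as `response.usage` does in the examples above.

```python
from types import SimpleNamespace


def build_usage(usage) -> dict:
    """Map Anthropic-style token counts onto the input/output/total keys used above."""
    input_tokens = usage.input_tokens or 0
    output_tokens = usage.output_tokens or 0
    return {
        "input": input_tokens,
        "output": output_tokens,
        "total": input_tokens + output_tokens,
    }


# Stand-in for response.usage so the sketch runs without an API call
demo_usage = SimpleNamespace(input_tokens=12, output_tokens=30)
print(build_usage(demo_usage))
```
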

## Advanced Features

### Streaming Responses

Both approaches support streaming responses from Claude. The example below uses the native Anthropic SDK:

In [None]:
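@observe(as_type="generation")
def stream_claude_response(prompt: str):
    """Stream a response from Claude with Langfuse tracing."""
    
    stream = anthropic.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
        stream=True
    )
    
    full_response = ""
    input_tokens = output_tokens = 0
    for chunk in stream:
        if chunk.type == "message_start":
            # The prompt token count arrives with the first event
            input_tokens = chunk.message.usage.input_tokens
        elif chunk.type == "content_block_delta" and chunk.delta.type == "text_delta":
            # Only text deltas carry a .text attribute
            text = chunk.delta.text
            full_response += text
            print(text, end="", flush=True)
        elif chunk.type == "message_delta":
            # The output token count is reported cumulatively
            output_tokens = chunk.usage.output_tokens
    
    # Update observation after streaming completes
    langfuse_context.update_current_observation(
        output=full_response,
        model="claude-3-5-sonnet-20241022",
        usage={
            "input": input_tokens,
            "output": output_tokens,
            "total": input_tokens + output_tokens
        }
    )
    
    return full_response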

# Example usage
response = stream_claude_response("Write a haiku about observability.")
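
With the OpenAI-compatible client from Approach 2, streaming follows the OpenAI SDK's chunk format instead. A minimal sketch, assuming the `client` from Approach 2 is passed in. `stream_chat` is a hypothetical helper name.

```python
def stream_chat(client, prompt: str, model: str = "claude-3-5-sonnet-20241022", max_tokens: int = 1024) -> str:
    """Stream a reply through an OpenAI-style client and return the full text."""
    stream = client.chat.completions.create(
        model=model,
        max_tokens=max_tokens,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )

    full_response = ""
    for chunk in stream:
        # Some chunks carry no choices or no text, so skip those
        if not chunk.choices:
            continue
        text = chunk.choices[0].delta.content
        if text:
            full_response += text
            print(text, end="", flush=True)
    return full_response


# Usage with the Langfuse-wrapped client from Approach 2 (needs a valid API key):
# stream_chat(client, "Write a haiku about observability.")
```
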

### Adding Custom Metadata

You can enrich your traces with custom metadata for better filtering and analysis:

In [None]:
@observe(as_type="generation")
def claude_with_metadata(prompt: str, user_id: str, session_id: str):
    """Call Claude with custom metadata for tracing."""
    
    # Add trace-level metadata
    langfuse_context.update_current_trace(
        user_id=user_id,
        session_id=session_id,
        tags=["claude", "production"],
        metadata={
            "environment": "production",
            "version": "1.0.0"
        }
    )
    
    response = anthropic.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=256,
        messages=[{"role": "user", "content": prompt}]
    )
    
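    # Add observation-level details, token usage, and metadata
    langfuse_context.update_current_observation(
        model="claude-3-5-sonnet-20241022",
        input=prompt,
        output=response.content[0].text,
        usage={
            "input": response.usage.input_tokens,
            "output": response.usage.output_tokens,
            "total": response.usage.input_tokens + response.usage.output_tokens
        },
        metadata={
            "prompt_type": "question",
            "response_format": "text"
        }
    )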
    
    return response.content[0].text

# Example usage
result = claude_with_metadata(
    prompt="What are the key features of Langfuse?",
    user_id="user-123",
    session_id="session-456"
)

After running these examples, you'll see comprehensive traces in your Langfuse dashboard showing:
- Complete request/response data
- Token usage and estimated costs
- Latency metrics
- Custom metadata and tags
- Nested trace structure for complex workflows


<!-- MARKDOWN_COMPONENT name: "LearnMore" path: "@/components-mdx/integration-learn-more.mdx" -->

## Next Steps
- Explore [Anthropic's documentation](https://docs.anthropic.com/) to learn more about Claude's capabilities and best practices.
- Learn more about [Langfuse tracing features](https://langfuse.com/docs) to track your entire application flow.
- Try out Langfuse [Prompt Management](https://langfuse.com/docs/prompts/get-started) to version and manage your Claude prompts.
- Set up [LLM-as-a-Judge evaluations](https://langfuse.com/docs/scores/model-based-evals) to automatically evaluate Claude's outputs.
- Configure [Langfuse Experiments](https://langfuse.com/docs/experiments) to A/B test different Claude models and parameters.
