feat(litellm): [MLOB-2787] send client side workflow spans #13477
Conversation
Bootstrap import analysis

Comparison of import times between this PR and base.

Summary

The average import time from this PR is: 274 ± 3 ms.
The average import time from base is: 276 ± 3 ms.
The import time difference between this PR and base is: -2.1 ± 0.1 ms.

Import time breakdown

The following import paths have shrunk:
Benchmarks

Benchmark execution time: 2025-06-18 17:59:51
Comparing candidate commit 80c0760 in PR branch.
Found 1 performance improvement and 2 performance regressions! Performance is the same for 564 metrics, 5 unstable metrics.

scenario:iastaspects-replace_aspect
scenario:iastaspectssplit-splitlines_aspect
scenario:iastdjangostartup-appsec
We should update CODEOWNERS for ddtrace/contrib/internal/litellm/ and tests/contrib/litellm/ (doesn't have to be in this PR)
Approval is for the files owned by the apm-python/core/guild. I did not review the integration/LLMObs specific changes
Ah gotcha, I made a separate PR for this.
Kyle-Verhoog left a comment
Excellent PR description and test coverage (both manual and automated) @ncybul 👏 👏 👏
While reviewing I thought about a user having a proxy where server-side spans may only be generated for certain endpoints (see https://github.com/DataDog/llm-obs/pull/76 for example). To handle this case I was thinking we could add an argument to annotation_context in order to do this proxy logic on a span-by-span basis. Then we're covered for both generic auto-instrumentation as well as specific manual instrumentation cases. WDYT @ncybul?
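For illustration, something along these lines could work from the application side. This is a rough sketch only: `LLMObs.annotation_context` exists today, but the proxy-related argument is hypothetical and the kwarg name is made up here.

```
from ddtrace.llmobs import LLMObs
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000")

# Hypothetical per-span override: everything created inside this block would be
# treated as a request to an instrumented proxy, so the integration would emit
# a workflow span instead of an LLM span. The extra kwarg does not exist yet.
with LLMObs.annotation_context(name="proxied-call"):  # e.g. instrumented_proxy=True
    client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "What color is the sky?"}],
    )
```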
Also for the setting naming I am thinking we should refer to it as "INSTRUMENTED_PROXY_URLS" since the use-case is only for proxies that will generate an llm span downstream. Leaving it as just proxy urls is ambiguous IMO.
Co-authored-by: kyle <kyle@verhoog.ca>
Currently, all LLM interactions are sent to LLM Obs as LLM spans; however, this does not gracefully handle the case where an LLM request is directed to a proxy server that internally makes the actual LLM call. In these cases, a customer may end up with nested LLM spans (one span sent from the client and one span sent from the server). This PR updates all LLM Obs integrations to send client-side requests to a proxy as workflow spans to LLM Obs.

Originally, we assumed that a non-default base URL was a good heuristic for identifying requests directed to a proxy; however, this assumption does not hold, since customers can also use the base URL to point at alternative model provider endpoints (among other use cases). To more accurately detect when a request is directed to a proxy server, we put the onus on users to configure which URLs should be considered proxies. Users can configure this either by setting the **DD_LLMOBS_INSTRUMENTED_PROXY_URLS** environment variable (defined in `ddtrace/settings/_config.py`) or by enabling LLM Obs with the `instrumented_proxy_urls` field defined. We then check whether an LLM interaction is being sent to one of these proxy URLs. If so, we create a workflow span, since the underlying LLM span is expected to be captured in the proxy itself. Otherwise, we create an LLM span, which is the current and default behavior.
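For illustration, a minimal sketch of the two configuration paths described above (the `ml_app` name, the URL value, and the exact accepted format for the URLs are placeholders/assumptions, not taken from this PR):

```
import os

# Option 1: environment variable, set before ddtrace / LLM Obs is initialized.
# Format assumed here to be a comma-separated list of proxy base URLs.
os.environ["DD_LLMOBS_INSTRUMENTED_PROXY_URLS"] = "http://localhost:4000"

from ddtrace.llmobs import LLMObs

# Option 2: pass the new field directly when enabling LLM Obs.
# Whether this accepts a string or a list of strings is an assumption.
LLMObs.enable(
    ml_app="my-app",  # placeholder app name
    instrumented_proxy_urls="http://localhost:4000",
)
```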
Existing integrations were modified as follows:

- **Anthropic**: An LLM Obs workflow span will be sent for proxy requests.
- **Bedrock**: An LLM Obs workflow span will be sent for proxy requests.
- **CrewAI**: Crew AI [uses LiteLLM under the hood](https://github.com/crewAIInc/crewAI/blob/main/src/crewai/llm.py#L768) to make LLM calls; therefore, these cases should already be handled by the LiteLLM integration.
- **Gemini**: No changes, since this library does not allow users to specify a custom base URL.
- **Langchain**: An LLM Obs workflow span will be sent for proxy requests.
- **Langgraph**: Langgraph is model-agnostic, so there is nothing to change within this integration itself.
- **LiteLLM**: LLM Obs spans will be sent from the LiteLLM integration as long as there is no downstream Open AI span detected. The span kind will be a workflow if the span is a LiteLLM router operation or proxy request. Otherwise, the span kind is an LLM.
- **Open AI**: An LLM Obs workflow span will be sent for proxy requests.
- **Open AI Agents**: The Open AI agents SDK also [uses LiteLLM to allow users to call non-Open AI models](https://openai.github.io/openai-agents-python/models/litellm/); therefore, these cases should already be handled by the LiteLLM integration.
- **Vertex AI**: No changes, since this library does not allow users to specify a custom base URL.

Every time a span is created by one of the LLM Obs integrations, the `self._get_base_url` method is called to retrieve the base URL for that interaction if it exists. Then, `self._is_proxy_url(base_url)` is called to determine whether to set an item in the context indicating that the current span represents a proxy request. This is later used in the integration code to determine the appropriate span kind. With this design, any new integration simply needs to implement the `_get_base_url` method and then use the `PROXY_REQUEST` context item to tag its LLM Obs spans accordingly.

# Manual Testing

For each integration, I tested three cases:

1. No base URL is set (this should result in an LLM span)
2. The base URL is set to a proxy URL configured with `DD_LLMOBS_INSTRUMENTED_PROXY_URLS` (this should result in a top-level workflow span and perhaps other child spans, which may include an LLM span depending on how the proxy server is instrumented)
3. The base URL is set but not to a proxy URL (this should result in an LLM span)

## Anthropic

Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWENSzdO5DDAAAABhBWmRXRU5TekFBRDBJZ1hMelVFc0FBQUEAAAAkZjE5NzU2MTAtZjU5Ny00NWU1LWI1M2UtMmE3OWQ3OWVmNjNlAAAADQ%22%7D%5D&spanId=4488892102153659416&start=1749494753131&end=1749495653131&paused=false)).

```
from anthropic import Anthropic

client = Anthropic()
message = client.messages.create(
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "What color is the sky?",
        }
    ],
    model="claude-3-5-sonnet-20240620",
)
```

Request with base URL specified ([when DD_LLMOBS_INSTRUMENTED_PROXY_URLS is set](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWEi0I5prmNQAAABhBWmRXRWkwSUFBQldmbnNkU0FOZ0FBQUEAAAAkZjE5NzU2MTItNTM0OS00NjVkLThlM2QtZDEwYTcxZmMzNzBmAAAABg%22%7D%5D&spanId=15578566980693053138&start=1749494846544&end=1749495746544&paused=false) and [when it is not](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWEaqpbXi0tAAAABhBWmRXRWFxcEFBQnJBcVlPR0JoN0FBQUEAAAAkZjE5NzU2MTEtYWNlZi00MDgyLWJjN2QtZjVkM2MzMmIxNTQ4AAAABA%22%7D%5D&spanId=11578799453780556987&start=1749494826126&end=1749495726126&paused=false)).

```
from anthropic import Anthropic

client = Anthropic(
    base_url="http://localhost:4000",
)
message = client.messages.create(
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "What color is the sky?",
        }
    ],
    model="claude-3.5",
)
```

## Bedrock

I chose not to instrument the server in the case where the base URL is specified but is not set as the proxy URL, to avoid sending spans from the server.

Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWGvk-sXTBZQAAABhBWmRXR3ZrLUFBQ25ncEVYTmlhQ0FBQUEAAAAkZjE5NzU2MWEtZjkzZS00OTlhLTg0NTktNzdmN2EyZWM2MzhjAAAAAA%22%7D%5D&spanId=15802925627459513713&start=1749495407424&end=1749496307424&paused=false)).

```
import boto3
import json

session = boto3.Session(profile_name='601427279990_account-admin', region_name="us-east-1")
brt = session.client(
    service_name='bedrock-runtime',
)

modelId = 'amazon.titan-text-lite-v1'
accept = 'application/json'
contentType = 'application/json'
input_text = "Explain black holes to 8th graders."
body = {
    "inputText": input_text,
}
body = json.dumps(body)

response = brt.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType)
response_body = json.loads(response.get('body').read())
```

Request with base URL specified ([when DD_LLMOBS_INSTRUMENTED_PROXY_URLS is set](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWGhtvbh20tAAAABhBWmRXR2h0dkFBQmFrTVVqb09Vc0FBQUEAAAAkZjE5NzU2MWEtMjRkMS00ZDY3LTk2ODctMjdjYzk0ZGQ4ZTUwAAAABg%22%7D%5D&spanId=939675140636485153&start=1749495362196&end=1749496262196&paused=false) and [when it is not](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWGBvtsT3BZQAAABhBWmRXR0J2dEFBQURzcm1jV05ZYkFBQUEAAAAkZjE5NzU2MTgtMjdiZS00MzMzLWIwZWEtNmQ3YmIzN2Y2M2JmAAAAAQ%22%7D%5D&spanId=2663441806507313625&start=1749495240767&end=1749496140767&paused=false))

Server Code (I created a proxy server of my own to test this out!)

```
from fastapi import FastAPI, Request
import uvicorn
import boto3
import json

app = FastAPI()

@app.post("/model/{model_id}/invoke")
async def invoke_model(model_id: str, request: Request):
    body = await request.json()
    session = boto3.Session(profile_name='601427279990_account-admin', region_name="us-east-1")
    brt = session.client(
        service_name='bedrock-runtime',
    )
    body = json.dumps(body)
    response = brt.invoke_model(body=body, modelId=request.path_params.get("model_id"), accept=request.headers.get("accept"), contentType=request.headers.get("content-type"))
    response_body = json.loads(response.get('body').read())
    return response_body

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=4000)
```

Client code

```
import boto3
import json

session = boto3.Session(profile_name='601427279990_account-admin', region_name="us-east-1")
brt = session.client(
    service_name='bedrock-runtime',
    endpoint_url="http://0.0.0.0:4000",
)

modelId = 'amazon.titan-text-lite-v1'
accept = 'application/json'
contentType = 'application/json'
input_text = "Explain black holes to 8th graders."
body = {
    "inputText": input_text,
}
body = json.dumps(body)

response = brt.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType)
response_body = json.loads(response.get('body').read())
```

## Crew AI

To test out these changes with Crew AI, I used the following simple Crew AI flow:

```
from crewai import Agent, Task, Crew, LLM

llm = LLM(
    model="gpt-3.5-turbo",
    base_url="http://0.0.0.0:4000",  # optionally set for testing
)

calculator = Agent(
    role='Mathematical Calculator',
    goal='Perform accurate mathematical calculations',
    backstory='You are an expert mathematician who can solve complex calculations with precision.',
    llm=llm,
    verbose=True
)

calculation_task = Task(
    description='Calculate the sum of all numbers from 1 to 100',
    agent=calculator,
    expected_output='The sum of all numbers from 1 to 100'
)

crew = Crew(
    agents=[calculator],
    tasks=[calculation_task]
)

result = crew.kickoff()
```

When the base URL is not set, I get this [trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWW0GEz1QXyQAAABhBWmRXVzBHRUFBQ0EzWnhKTzZVaUFBQUEAAAAkZjE5NzU2NWItNDNkOS00MWRhLWI4NDEtNzRkZWRhNDY2YjcxAAAABw%22%7D%5D&spanId=491126038724058424&start=1749496923362&end=1749500523362&paused=false) with LLM spans. When the base URL is set to the same URL as in `DD_LLMOBS_INSTRUMENTED_PROXY_URLS`, I get this [trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWXx01ozqQPwAAABhBWmRXWHgwMUFBQVBFcU8yell3eEFBQUEAAAAkZjE5NzU2NWYtMzhhYS00NzlmLWFkMzItNThjOTBmYTllMGZiAAADjw%22%7D%5D&spanId=8869414125491016509&start=1749499885536&end=1749500785536&paused=false) with workflow spans from the client and underlying LLM spans nested within. And when the base URL is set but not to a proxy URL, I get this [trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWXRmv8fYWxgAAABhBWmRXWFJtdkFBRDM5RTNkYXk4WEFBQUEAAAAkZjE5NzU2NWQtMzliNC00NjVhLWIzYjItMWRiNWU2MWQ0MjQ5AAAEug%22%7D%5D&spanId=7586973026220693213&start=1749499746513&end=1749500646513&paused=false), again with just the LLM span as expected.

## Langchain

For the request using a proxy URL, I instrumented both the client and the server, except for Open AI. This was to make things simpler, as the only integrations emitting spans would be Langchain and LiteLLM (since I am using a LiteLLM proxy server). I also chose not to instrument the server in the case where the base URL is specified but is not set as the proxy URL, to avoid sending spans from the server.

Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWLxYtSWdrcwAAABhBWmRXTHhZdEFBRHFfMnhRT1lVU0FBQUEAAAAkZjE5NzU2MmYtMWM4Mi00ZTJiLWE1OWMtYTk1MjcwMDcwZDVjAAAAAg%22%7D%5D&spanId=9094709675449713583&start=1749496813623&end=1749497713623&paused=false)).

```
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

chat = ChatOpenAI(
    model="gpt-3.5-turbo",
    temperature=0.1,
)
messages = [HumanMessage(content="how are you?")]
response = chat(messages)
print(response)
```

Request with base URL specified ([when DD_LLMOBS_INSTRUMENTED_PROXY_URLS is set](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWM0jM9LLMLQAAABhBWmRXTTBqTUFBQnRLbVpjdEJBcEFBQUEAAAAkZjE5NzU2MzMtNGE4NC00ODNkLWIwZjktMDBlN2MwY2E5Nzg5AAAABQ%22%7D%5D&spanId=16659272787581089973&start=1749497004926&end=1749497904926&paused=false) and [when it is not](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWNE1cSZNrcwAAABhBWmRXTkUxY0FBQkhWU3VFdmgxYUFBQUEAAAAkZjE5NzU2MzQtNGQ1ZC00NzQ5LTkwYWItMWE4MThmZmJkN2VjAAAAAg%22%7D%5D&spanId=4643233463687100041&start=1749497076231&end=1749497976231&paused=false))

```
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

chat = ChatOpenAI(
    base_url="http://0.0.0.0:4000",
    model="gpt-3.5-turbo",
    temperature=0.1,
)
messages = [HumanMessage(content="how are you?")]
response = chat(messages)
print(response)
```

## Langgraph

For these tests, I used the following application code:

```
from langgraph.graph import StateGraph, START, END
from typing import TypedDict
from langchain_openai import ChatOpenAI

class GraphState(TypedDict):
    question: str
    conclusion: str

class Mathematician():
    def __init__(self):
        self.llm = ChatOpenAI(model="gpt-3.5-turbo")

    def __call__(self, state: GraphState):
        prompt = f"You are a mathematician that should only answer questions with a number. You are given a question: {state['question']}. Please answer the question."
        return {"conclusion": self.llm.invoke(prompt)}

graph_builder = StateGraph(GraphState)
graph_builder.add_node("mathematician", Mathematician())
graph_builder.add_edge(START, "mathematician")
graph_builder.add_edge("mathematician", END)
graph = graph_builder.compile()

conclusion = graph.invoke({
    "question": "sum the numbers 1 to 100",
})['conclusion']
print(conclusion)
```

I then made changes to the LLM model used to showcase the traces that result in the following cases (the change is sketched after the list):

1. Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZda-xlM9SEw6wAAABhBWmRhLXhsTUFBQmdnOEVKYVplc0FBQUEAAAAkZjE5NzVhZmItMTk0ZC00YjYwLTgyYTQtNTY2YTNhOGMwZjllAAAABg%22%7D%5D&spanId=8325589879530822573&start=1749577652721&end=1749578552721&paused=false))
2. Request with base URL specified ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdbBQy75H9rcwAAABhBWmRiQlF5N0FBQlBKcmNfNExUUUFBQUEAAAAkZjE5NzViMDUtMjY2OC00Nzg1LTgzMjctMmIyZjQxZGJjMjVhAAAABg%22%7D%5D&spanId=732132008673596287&start=1749577883240&end=1749578783240&paused=false))
3. Request with base URL specified and `DD_LLMOBS_INSTRUMENTED_PROXY_URLS` set ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdbBI5bzhpfFAAAABhBWmRiQkk1YkFBRGVkaUVHdWZJY0FBQUEAAAAkZjE5NzViMDQtOGU4Yy00ZTE1LThiMjYtYTg4NDhhYjkyZjBiAAAAAg%22%7D%5D&spanId=2296149025644956944&start=1749577854444&end=1749578754444&paused=false))
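For cases 2 and 3, the assumed change is simply pointing the model at the proxy, mirroring the Langchain example above (the URL value is reused from the other examples in this description):

```
from langchain_openai import ChatOpenAI

# Assumed change for cases 2 and 3: in Mathematician.__init__ above, construct
# the model with an explicit base URL so requests go through the proxy. Case 3
# additionally has this URL listed in DD_LLMOBS_INSTRUMENTED_PROXY_URLS, which
# is what turns the client-side span into a workflow span.
llm = ChatOpenAI(model="gpt-3.5-turbo", base_url="http://0.0.0.0:4000")
```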
## LiteLLM

For these tests, I started a LiteLLM server and sent requests to it by specifying the base URL as `"http://localhost:4000"`. To make the examples more relevant, I disabled the Open AI integration, which means all spans were coming from the LiteLLM integration (this should not change the number of spans or the span kinds present in each trace). I also chose not to instrument the server in the case where the base URL is specified but is not set as the proxy URL, to avoid sending spans from the server.

Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdV3XswcyZDDAAAABhBWmRWM1hzd0FBQ2NzOWtqSlpaTkFBQUEAAAAkZjE5NzU1ZGQtN2IzMC00NjBiLWE1NGUtYTA2NDM3Y2ZjMDNjAAAAAA%22%7D%5D&spanId=1159980771162140816&start=1749491382194&end=1749492282194&paused=false)).

```
import os
import litellm
from litellm import completion

litellm.api_key = os.environ["OPENAI_API_KEY"]
messages = [{"content": "What color is the sky?", "role": "user"}]
response = completion(model="gpt-3.5-turbo", messages=messages)
print(response)
```

Request with base URL specified ([when DD_LLMOBS_INSTRUMENTED_PROXY_URLS is set](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdV6IDmv3BfGAAAABhBWmRWNklEbUFBQTh3N2ZhTzhjaUFBQUEAAAAkZjE5NzU1ZTgtODEzNy00MGQ4LWJkOGYtYzFiZWIxNDI1ZTcxAAAABA%22%7D%5D&spanId=832522128674090551&start=1749492100252&end=1749493000252&paused=false) and [when it is not](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdV6ljxyUkI0QAAABhBWmRWNmxqeEFBRDdkY1dNX2MwVkFBQUEAAAAkZjE5NzU1ZWEtN2JjNy00MGYwLThjY2ItZjhlODAxMWM4YmYyAAAAAw%22%7D%5D&spanId=3962954564061385568&start=1749492226691&end=1749493126691&paused=false)).

```
import os
import litellm
from litellm import completion

litellm.api_key = os.environ["OPENAI_API_KEY"]
messages = [{"content": "What color is the sky?", "role": "user"}]
response = completion(model="gpt-3.5-turbo", messages=messages, api_base="http://localhost:4000")
print(response)
```

## Open AI

I chose not to instrument the server in the case where the base URL is specified but is not set as the proxy URL, to avoid sending spans from the server.

Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWIxjd6BrmNQAAABhBWmRXSXhqZEFBQkFhSlRiV05DeEFBQUEAAAAkZjE5NzU2MjMtNDU2Mi00MGJhLTlmYjQtNDZjMzczNjUxYmU5AAAABw%22%7D%5D&spanId=5010571712001212560&start=1749495949169&end=1749496849169&paused=false)).

```
import os
from openai import OpenAI

oai_client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
)
completion = oai_client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "testing openai"},
    ],
)
```

Request with base URL specified ([when DD_LLMOBS_INSTRUMENTED_PROXY_URLS is set](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWJGwwdmBDDAAAABhBWmRXSkd3d0FBQ01GV0VRc29QUUFBQUEAAAAkZjE5NzU2MjQtN2NiNS00OTBmLWI3NmEtMTZlZmQ3NDYxNzE5AAAABw%22%7D%5D&spanId=9189504309494995843&start=1749496040239&end=1749496940239&paused=false) and [when it is not](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWJbwtwkVfGAAAABhBWmRXSmJ3dEFBQXdsTHhtNU1iMEFBQUEAAAAkZjE5NzU2MjUtZDEzZC00ZDVlLTgwMDItMzk0ODA0NjNlZGY0AAAABA%22%7D%5D&spanId=16803264818418539708&start=1749496121106&end=1749497021106&paused=false)).

```
import os
from openai import OpenAI

oai_client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
    base_url="http://0.0.0.0:4000",
)
completion = oai_client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "testing openai"},
    ],
)
```

## Open AI Agents

Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWfOtO9p0WxgAAABhBWmRXZk90T0FBQ2RUMHJnWk9MY0FBQUEAAAAkZjE5NzU2N2QtMWZlYi00MWMwLThkYmItZjc0ZGViOWQ2MTU5AAAAFw%22%7D%5D&spanId=12382352369827199770&start=1749501871180&end=1749502771180&paused=false)).

```
from agents import Agent, Runner
import asyncio

math_tutor_agent = Agent(
    name="Math Tutor",
    handoff_description="Specialist agent for math questions",
    instructions="You provide help with math problems. Explain your reasoning at each step and include examples",
    model="gpt-3.5-turbo",
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="You determine which agent to use based on the user's homework question",
    handoffs=[math_tutor_agent],
    model="gpt-3.5-turbo",
)

async def main():
    result = await Runner.run(triage_agent, "what is the sum of the numbers between 1 and 100?", max_turns=3)
    print(result.final_output)

if __name__ == "__main__":
    asyncio.run(main())
```

Request with base URL specified ([when DD_LLMOBS_INSTRUMENTED_PROXY_URLS is set](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWguw4pn2-8QAAABhBWmRXZ3V3NEFBQXhMYVlDQ0czVUFBQUEAAAAkZjE5NzU2ODItZWYxZC00NzExLWFkYmUtNGE4NmNkZDA3NGM3AAAAEQ%22%7D%5D&spanId=15430591163304707830&start=1749502304313&end=1749503204313&paused=false) and [when it is not](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWhbr7gAG0tAAAABhBWmRXaGJyN0FBQ283U0ZGN3B3ckFBQUEAAAAkZjE5NzU2ODYtMDNhNi00ZjA4LTkwODgtOWY3ODcxMGNiODI4AAAAEQ%22%7D%5D&spanId=5364068118937893712&start=1749502418618&end=1749503318618&paused=false)).

```
# only change was updating the model used in each agent
from agents.extensions.models.litellm_model import LitellmModel
import os

model = LitellmModel(
    model="gpt-3.5-turbo",
    api_key=os.getenv("OPENAI_API_KEY"),
    base_url="http://localhost:4000",
)
```

## Checklist

- [x] PR author has checked that all the criteria below are met
  - The PR description includes an overview of the change
  - The PR description articulates the motivation for the change
  - The change includes tests OR the PR description describes a testing strategy
  - The PR description notes risks associated with the change, if any
  - Newly-added code is easy to change
  - The change follows the [library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
  - The change includes or references documentation updates if necessary
  - Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [x] Reviewer has checked that all the criteria below are met
  - Title is accurate
  - All changes are related to the pull request's stated goal
  - Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes
  - Testing strategy adequately addresses listed risks
  - Newly-added code is easy to change
  - Release note makes sense to a user of the library
  - If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  - Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

---------

Co-authored-by: kyle <kyle@verhoog.ca>
Currently, all LLM interactions are sent to LLM Obs as LLM spans; however this does not gracefully handle the case where an LLM request is directed to a proxy server which internally makes the actual LLM call. Currently, for these cases, a customer may end up with nested LLM spans (one span sent from the client and one span sent from the server). This PR updates all LLM Obs integrations to conform to sending client-side requests to a proxy as workflow spans to LLM Obs. Originally, we assumed that a non-default base URL was a good heuristic for identifying requests that were directed to a proxy; however, this assumption does not hold as customers can specify alternative model provider endpoints using the base URL (among potentially other use cases) which does not work with our previous assumption. In order to more accurately detect when a request is being directed to a proxy server, we are putting the onus on users to configure what URLs should be considered proxies. Users can configure this either by setting the **DD_LLMOBS_INSTRUMENTED_PROXY_URLS** environment variable (defined in `ddtrace/settings/_config.py`) or by enabling LLM Obs with the `instrumented_proxy_urls` field defined. We then check whether an LLM interaction is being sent to one of these proxy URLs. If so, we create a workflow span as it is expected that the underlying LLM span is captured in the proxy itself. Otherwise, we create an LLM span which is the current and default behavior. Existing integrations were modified as follows: **Anthropic**: An LLM Obs workflow span will be sent for proxy requests. **Bedrock**: An LLM Obs workflow span will be sent for proxy requests. **CrewAI**: Crew AI [uses LiteLLM under the hood](https://github.com/crewAIInc/crewAI/blob/main/src/crewai/llm.py#L768) to make LLM calls; therefore, these cases should already be handled by the LiteLLM integration. **Gemini**: no changes since this library does not allow users to specify a custom base URL **Langchain**: An LLM Obs workflow span will be sent for proxy requests. **Langgraph**: Langgraph is model agnostic, so there is nothing to change within this integration itself. **LiteLLM**: LLM Obs spans will be sent from the LiteLLM integration as long as there is no downstream Open AI span detected. The span kind will be a workflow if the span is a LiteLLM router operation or proxy request. Otherwise, the span kind is an LLM. **Open AI**: An LLM Obs workflow span will be sent for proxy requests. **Open AI Agents**: The Open AI agents SDK also [uses LiteLLM to allow users to call non-Open AI models](https://openai.github.io/openai-agents-python/models/litellm/); therefore, these cases should already be handled by the LiteLLM integration. **Vertex AI**: no changes since this library does not allow users to specify a custom base URL Every time a span is created by one of the LLM Obs integrations, the `self._get_base_url` method is called to retrieve the base URL for that interaction if it exists. Then, `self._is_proxy_url(base_url)` is called to determine whether to set an item in the context that indicates that the current span represents a proxy request. This will later be used in the integration code to determine the appropriate span kind. With this design, any new integrations simply need to implement the `_get_base_url` method and then use the `PROXY_REQUEST` context item to tag their LLM Obs spans accordingly. # Manual Testing For each integration, I tested three cases: 1. No base URL is set (this should result in an LLM span) 2. 
The base URL is set to a proxy URL configured with `DD_LLMOBS_INSTRUMENTED_PROXY_URLS` (this should result in a top-level workflow span and perhaps other child spans which may include an LLM span depending on how the proxy server is instrumented) 3. The base URL Is set but not to a proxy URL (this should result in an LLM span) ## Anthropic Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWENSzdO5DDAAAABhBWmRXRU5TekFBRDBJZ1hMelVFc0FBQUEAAAAkZjE5NzU2MTAtZjU5Ny00NWU1LWI1M2UtMmE3OWQ3OWVmNjNlAAAADQ%22%7D%5D&spanId=4488892102153659416&start=1749494753131&end=1749495653131&paused=false)). ``` from anthropic import Anthropic client = Anthropic() message = client.messages.create( max_tokens=1024, messages=[ { "role": "user", "content": "What color is the sky?", } ], model="claude-3-5-sonnet-20240620", ) ``` Request with base URL specified ([when DD_LLMOBS_INSTRUMENTED_PROXY_URLS is set](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWEi0I5prmNQAAABhBWmRXRWkwSUFBQldmbnNkU0FOZ0FBQUEAAAAkZjE5NzU2MTItNTM0OS00NjVkLThlM2QtZDEwYTcxZmMzNzBmAAAABg%22%7D%5D&spanId=15578566980693053138&start=1749494846544&end=1749495746544&paused=false) and [when it is not](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWEaqpbXi0tAAAABhBWmRXRWFxcEFBQnJBcVlPR0JoN0FBQUEAAAAkZjE5NzU2MTEtYWNlZi00MDgyLWJjN2QtZjVkM2MzMmIxNTQ4AAAABA%22%7D%5D&spanId=11578799453780556987&start=1749494826126&end=1749495726126&paused=false)). ``` from anthropic import Anthropic client = Anthropic( base_url="http://localhost:4000", ) message = client.messages.create( max_tokens=1024, messages=[ { "role": "user", "content": "What color is the sky?", } ], model="claude-3.5", ) ``` ## Bedrock I chose to not instrument the server in the case where the base URL is specified but is not set as the proxy URL to avoid sending spans from the server. Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWGvk-sXTBZQAAABhBWmRXR3ZrLUFBQ25ncEVYTmlhQ0FBQUEAAAAkZjE5NzU2MWEtZjkzZS00OTlhLTg0NTktNzdmN2EyZWM2MzhjAAAAAA%22%7D%5D&spanId=15802925627459513713&start=1749495407424&end=1749496307424&paused=false)). ``` import boto3 import json session = boto3.Session(profile_name='601427279990_account-admin', region_name="us-east-1") brt = session.client( service_name='bedrock-runtime', ) modelId = 'amazon.titan-text-lite-v1' accept = 'application/json' contentType = 'application/json' input_text = "Explain black holes to 8th graders." 
body = { "inputText": input_text, } body = json.dumps(body) response = brt.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType) response_body = json.loads(response.get('body').read()) ``` Request with base URL specified ([when DD_LLMOBS_INSTRUMENTED_PROXY_URLS is set](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWGhtvbh20tAAAABhBWmRXR2h0dkFBQmFrTVVqb09Vc0FBQUEAAAAkZjE5NzU2MWEtMjRkMS00ZDY3LTk2ODctMjdjYzk0ZGQ4ZTUwAAAABg%22%7D%5D&spanId=939675140636485153&start=1749495362196&end=1749496262196&paused=false) and [when it is not](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWGBvtsT3BZQAAABhBWmRXR0J2dEFBQURzcm1jV05ZYkFBQUEAAAAkZjE5NzU2MTgtMjdiZS00MzMzLWIwZWEtNmQ3YmIzN2Y2M2JmAAAAAQ%22%7D%5D&spanId=2663441806507313625&start=1749495240767&end=1749496140767&paused=false)) Server Code (I created a proxy server of my own to test this out!) ``` from fastapi import FastAPI, Request import uvicorn import boto3 import json app = FastAPI() @app.post("/model/{model_id}/invoke") async def invoke_model(model_id: str, request: Request): body = await request.json() session = boto3.Session(profile_name='601427279990_account-admin', region_name="us-east-1") brt = session.client( service_name='bedrock-runtime', ) body = json.dumps(body) response = brt.invoke_model(body=body, modelId=request.path_params.get("model_id"), accept=request.headers.get("accept"), contentType=request.headers.get("content-type")) response_body = json.loads(response.get('body').read()) return response_body if __name__ == "__main__": uvicorn.run(app, host="0.0.0.0", port=4000) ``` Client code ``` import boto3 import json session = boto3.Session(profile_name='601427279990_account-admin', region_name="us-east-1") brt = session.client( service_name='bedrock-runtime', endpoint_url="http://0.0.0.0:4000", ) modelId = 'amazon.titan-text-lite-v1' accept = 'application/json' contentType = 'application/json' input_text = "Explain black holes to 8th graders." 
body = { "inputText": input_text, } body = json.dumps(body) response = brt.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType) response_body = json.loads(response.get('body').read()) ``` ## Crew AI To test out these changes with Crew AI, I used the following simple Crew AI flow: ``` from crewai import Agent, Task, Crew, LLM llm = LLM( model="gpt-3.5-turbo", base_url="http://0.0.0.0:4000", # optionally set for testing ) calculator = Agent( role='Mathematical Calculator', goal='Perform accurate mathematical calculations', backstory='You are an expert mathematician who can solve complex calculations with precision.', llm=llm, verbose=True ) calculation_task = Task( description='Calculate the sum of all numbers from 1 to 100', agent=calculator, expected_output='The sum of all numbers from 1 to 100' ) crew = Crew( agents=[calculator], tasks=[calculation_task] ) result = crew.kickoff() ``` When the base URL is not set, I get this [trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWW0GEz1QXyQAAABhBWmRXVzBHRUFBQ0EzWnhKTzZVaUFBQUEAAAAkZjE5NzU2NWItNDNkOS00MWRhLWI4NDEtNzRkZWRhNDY2YjcxAAAABw%22%7D%5D&spanId=491126038724058424&start=1749496923362&end=1749500523362&paused=false) with LLM spans. When the base URL is set to the same URL as in `DD_LLMOBS_INSTRUMENTED_PROXY_URLS`, I get this [trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWXx01ozqQPwAAABhBWmRXWHgwMUFBQVBFcU8yell3eEFBQUEAAAAkZjE5NzU2NWYtMzhhYS00NzlmLWFkMzItNThjOTBmYTllMGZiAAADjw%22%7D%5D&spanId=8869414125491016509&start=1749499885536&end=1749500785536&paused=false) with workflow spans from the client and underlying LLM spans nested within. And when the base URL is set but not to a proxy URL, I get this [trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWXRmv8fYWxgAAABhBWmRXWFJtdkFBRDM5RTNkYXk4WEFBQUEAAAAkZjE5NzU2NWQtMzliNC00NjVhLWIzYjItMWRiNWU2MWQ0MjQ5AAAEug%22%7D%5D&spanId=7586973026220693213&start=1749499746513&end=1749500646513&paused=false), again with just the LLM span as expected. ## Langchain For the request using a proxy URL, I instrumented both the client and the server, except for Open AI. This was to make things simpler as the only integrations emitting spans would be Langchain and LiteLLM (since I am using a LiteLLM proxy server). I also chose to not instrument the server in the case where the base URL is specified but is not set as the proxy URL to avoid sending spans from the server. Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWLxYtSWdrcwAAABhBWmRXTHhZdEFBRHFfMnhRT1lVU0FBQUEAAAAkZjE5NzU2MmYtMWM4Mi00ZTJiLWE1OWMtYTk1MjcwMDcwZDVjAAAAAg%22%7D%5D&spanId=9094709675449713583&start=1749496813623&end=1749497713623&paused=false)). 
``` from langchain.chat_models import ChatOpenAI from langchain.schema import HumanMessage chat = ChatOpenAI( model = "gpt-3.5-turbo", temperature=0.1, ) messages = [HumanMessage(content="how are you?")] response = chat(messages) print(response) ``` Request with base URL specified ([when DD_LLMOBS_INSTRUMENTED_PROXY_URLS is set](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWM0jM9LLMLQAAABhBWmRXTTBqTUFBQnRLbVpjdEJBcEFBQUEAAAAkZjE5NzU2MzMtNGE4NC00ODNkLWIwZjktMDBlN2MwY2E5Nzg5AAAABQ%22%7D%5D&spanId=16659272787581089973&start=1749497004926&end=1749497904926&paused=false) and [when it is not](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWNE1cSZNrcwAAABhBWmRXTkUxY0FBQkhWU3VFdmgxYUFBQUEAAAAkZjE5NzU2MzQtNGQ1ZC00NzQ5LTkwYWItMWE4MThmZmJkN2VjAAAAAg%22%7D%5D&spanId=4643233463687100041&start=1749497076231&end=1749497976231&paused=false)) ``` from langchain.chat_models import ChatOpenAI from langchain.schema import HumanMessage chat = ChatOpenAI( base_url="http://0.0.0.0:4000", model = "gpt-3.5-turbo", temperature=0.1, ) messages = [HumanMessage(content="how are you?")] response = chat(messages) print(response) ``` ## Langgraph For these tests, I used the following application code: ``` from langgraph.graph import StateGraph, START, END from typing import TypedDict from langchain_openai import ChatOpenAI class GraphState(TypedDict): question: str conclusion: str class Mathematician(): def __init__(self): self.llm = ChatOpenAI(model="gpt-3.5-turbo") def __call__(self, state: GraphState): prompt = f"You are a mathematician that should only answer questions with a number. You are given a question: {state['question']}. Please answer the question." return {"conclusion": self.llm.invoke(prompt)} graph_builder = StateGraph(GraphState) graph_builder.add_node("mathematician", Mathematician()) graph_builder.add_edge(START, "mathematician") graph_builder.add_edge("mathematician", END) graph = graph_builder.compile() conclusion = graph.invoke({ "question": "sum the numbers 1 to 100", })['conclusion'] print(conclusion) ``` I then made changes to the LLM model used to showcase the traces that result in the following cases: 1. Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZda-xlM9SEw6wAAABhBWmRhLXhsTUFBQmdnOEVKYVplc0FBQUEAAAAkZjE5NzVhZmItMTk0ZC00YjYwLTgyYTQtNTY2YTNhOGMwZjllAAAABg%22%7D%5D&spanId=8325589879530822573&start=1749577652721&end=1749578552721&paused=false)) 2. Request with base URL specified ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdbBQy75H9rcwAAABhBWmRiQlF5N0FBQlBKcmNfNExUUUFBQUEAAAAkZjE5NzViMDUtMjY2OC00Nzg1LTgzMjctMmIyZjQxZGJjMjVhAAAABg%22%7D%5D&spanId=732132008673596287&start=1749577883240&end=1749578783240&paused=false)) 3. 
Request with base URL specified and `DD_LLMOBS_INSTRUMENTED_PROXY_URLS` set ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdbBI5bzhpfFAAAABhBWmRiQkk1YkFBRGVkaUVHdWZJY0FBQUEAAAAkZjE5NzViMDQtOGU4Yy00ZTE1LThiMjYtYTg4NDhhYjkyZjBiAAAAAg%22%7D%5D&spanId=2296149025644956944&start=1749577854444&end=1749578754444&paused=false)) ## LiteLLM For these tests, I started a LiteLLM server and sent requests to it by specifying the base URL as `"http://localhost:4000"`. To make the examples more relevant, I disabled the Open AI integration which means all spans were coming from the LiteLLM integration (this should not change the number of spans or the span kinds present in each trace). I also chose to not instrument the server in the case where the base URL is specified but is not set as the proxy URL to avoid sending spans from the server. Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdV3XswcyZDDAAAABhBWmRWM1hzd0FBQ2NzOWtqSlpaTkFBQUEAAAAkZjE5NzU1ZGQtN2IzMC00NjBiLWE1NGUtYTA2NDM3Y2ZjMDNjAAAAAA%22%7D%5D&spanId=1159980771162140816&start=1749491382194&end=1749492282194&paused=false)). ``` import os import litellm from litellm import completion litellm.api_key = os.environ["OPENAI_API_KEY"] messages = [{ "content": "What color is the sky?","role": "user"}] response = completion(model="gpt-3.5-turbo", messages=messages) print(response) ``` Request with base URL specified ([when DD_LLMOBS_INSTRUMENTED_PROXY_URLS is set](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdV6IDmv3BfGAAAABhBWmRWNklEbUFBQTh3N2ZhTzhjaUFBQUEAAAAkZjE5NzU1ZTgtODEzNy00MGQ4LWJkOGYtYzFiZWIxNDI1ZTcxAAAABA%22%7D%5D&spanId=832522128674090551&start=1749492100252&end=1749493000252&paused=false) and [when it is not](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdV6ljxyUkI0QAAABhBWmRWNmxqeEFBRDdkY1dNX2MwVkFBQUEAAAAkZjE5NzU1ZWEtN2JjNy00MGYwLThjY2ItZjhlODAxMWM4YmYyAAAAAw%22%7D%5D&spanId=3962954564061385568&start=1749492226691&end=1749493126691&paused=false)). ``` import os import litellm from litellm import completion litellm.api_key = os.environ["OPENAI_API_KEY"] messages = [{ "content": "What color is the sky?","role": "user"}] response = completion(model="gpt-3.5-turbo", messages=messages, api_base="http://localhost:4000") print(response) ``` ## Open AI I chose to not instrument the server in the case where the base URL is specified but is not set as the proxy URL to avoid sending spans from the server. 
Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWIxjd6BrmNQAAABhBWmRXSXhqZEFBQkFhSlRiV05DeEFBQUEAAAAkZjE5NzU2MjMtNDU2Mi00MGJhLTlmYjQtNDZjMzczNjUxYmU5AAAABw%22%7D%5D&spanId=5010571712001212560&start=1749495949169&end=1749496849169&paused=false)). ``` import os from openai import OpenAI oai_client = OpenAI( api_key=os.environ.get("OPENAI_API_KEY"), ) completion = oai_client.chat.completions.create( model="gpt-3.5-turbo", messages=[ {"role": "user", "content": "testing openai"}, ], ) ``` Request with base URL specified ([when DD_LLMOBS_INSTRUMENTED_PROXY_URLS is set](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWJGwwdmBDDAAAABhBWmRXSkd3d0FBQ01GV0VRc29QUUFBQUEAAAAkZjE5NzU2MjQtN2NiNS00OTBmLWI3NmEtMTZlZmQ3NDYxNzE5AAAABw%22%7D%5D&spanId=9189504309494995843&start=1749496040239&end=1749496940239&paused=false) and [when it is not](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWJbwtwkVfGAAAABhBWmRXSmJ3dEFBQXdsTHhtNU1iMEFBQUEAAAAkZjE5NzU2MjUtZDEzZC00ZDVlLTgwMDItMzk0ODA0NjNlZGY0AAAABA%22%7D%5D&spanId=16803264818418539708&start=1749496121106&end=1749497021106&paused=false)). ``` import os from openai import OpenAI oai_client = OpenAI( api_key=os.environ.get("OPENAI_API_KEY"), base_url="http://0.0.0.0:4000", ) completion = oai_client.chat.completions.create( model="gpt-3.5-turbo", messages=[ {"role": "user", "content": "testing openai"}, ], ) ``` ## Open AI Agents Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWfOtO9p0WxgAAABhBWmRXZk90T0FBQ2RUMHJnWk9MY0FBQUEAAAAkZjE5NzU2N2QtMWZlYi00MWMwLThkYmItZjc0ZGViOWQ2MTU5AAAAFw%22%7D%5D&spanId=12382352369827199770&start=1749501871180&end=1749502771180&paused=false)). ``` from agents import Agent, Runner import asyncio math_tutor_agent = Agent( name="Math Tutor", handoff_description="Specialist agent for math questions", instructions="You provide help with math problems. 
Explain your reasoning at each step and include examples", model="gpt-3.5-turbo", ) triage_agent = Agent( name="Triage Agent", instructions="You determine which agent to use based on the user's homework question", handoffs=[math_tutor_agent], model="gpt-3.5-turbo", ) async def main(): result = await Runner.run(triage_agent, "what is the sum of the numbers between 1 and 100?", max_turns=3) print(result.final_output) if __name__ == "__main__": asyncio.run(main()) ``` Request with base URL specified ([when DD_LLMOBS_INSTRUMENTED_PROXY_URLS is set](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWguw4pn2-8QAAABhBWmRXZ3V3NEFBQXhMYVlDQ0czVUFBQUEAAAAkZjE5NzU2ODItZWYxZC00NzExLWFkYmUtNGE4NmNkZDA3NGM3AAAAEQ%22%7D%5D&spanId=15430591163304707830&start=1749502304313&end=1749503204313&paused=false) and [when it is not](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWhbr7gAG0tAAAABhBWmRXaGJyN0FBQ283U0ZGN3B3ckFBQUEAAAAkZjE5NzU2ODYtMDNhNi00ZjA4LTkwODgtOWY3ODcxMGNiODI4AAAAEQ%22%7D%5D&spanId=5364068118937893712&start=1749502418618&end=1749503318618&paused=false)). ``` # only change was updating the model used in each agent from agents.extensions.models.litellm_model import LitellmModel import os model = LitellmModel( model="gpt-3.5-turbo", api_key=os.getenv("OPENAI_API_KEY"), base_url="http://localhost:4000", ) ``` ## Checklist - [x] PR author has checked that all the criteria below are met - The PR description includes an overview of the change - The PR description articulates the motivation for the change - The change includes tests OR the PR description describes a testing strategy - The PR description notes risks associated with the change, if any - Newly-added code is easy to change - The change follows the [library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html) - The change includes or references documentation updates if necessary - Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)) ## Reviewer Checklist - [x] Reviewer has checked that all the criteria below are met - Title is accurate - All changes are related to the pull request's stated goal - Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes - Testing strategy adequately addresses listed risks - Newly-added code is easy to change - Release note makes sense to a user of the library - If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment - Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting) --------- Co-authored-by: kyle <kyle@verhoog.ca>
Currently, all LLM interactions are sent to LLM Obs as LLM spans; however this does not gracefully handle the case where an LLM request is directed to a proxy server which internally makes the actual LLM call. Currently, for these cases, a customer may end up with nested LLM spans (one span sent from the client and one span sent from the server). This PR updates all LLM Obs integrations to conform to sending client-side requests to a proxy as workflow spans to LLM Obs. Originally, we assumed that a non-default base URL was a good heuristic for identifying requests that were directed to a proxy; however, this assumption does not hold as customers can specify alternative model provider endpoints using the base URL (among potentially other use cases) which does not work with our previous assumption. In order to more accurately detect when a request is being directed to a proxy server, we are putting the onus on users to configure what URLs should be considered proxies. Users can configure this either by setting the **DD_LLMOBS_INSTRUMENTED_PROXY_URLS** environment variable (defined in `ddtrace/settings/_config.py`) or by enabling LLM Obs with the `instrumented_proxy_urls` field defined. We then check whether an LLM interaction is being sent to one of these proxy URLs. If so, we create a workflow span as it is expected that the underlying LLM span is captured in the proxy itself. Otherwise, we create an LLM span which is the current and default behavior. Existing integrations were modified as follows: **Anthropic**: An LLM Obs workflow span will be sent for proxy requests. **Bedrock**: An LLM Obs workflow span will be sent for proxy requests. **CrewAI**: Crew AI [uses LiteLLM under the hood](https://github.com/crewAIInc/crewAI/blob/main/src/crewai/llm.py#L768) to make LLM calls; therefore, these cases should already be handled by the LiteLLM integration. **Gemini**: no changes since this library does not allow users to specify a custom base URL **Langchain**: An LLM Obs workflow span will be sent for proxy requests. **Langgraph**: Langgraph is model agnostic, so there is nothing to change within this integration itself. **LiteLLM**: LLM Obs spans will be sent from the LiteLLM integration as long as there is no downstream Open AI span detected. The span kind will be a workflow if the span is a LiteLLM router operation or proxy request. Otherwise, the span kind is an LLM. **Open AI**: An LLM Obs workflow span will be sent for proxy requests. **Open AI Agents**: The Open AI agents SDK also [uses LiteLLM to allow users to call non-Open AI models](https://openai.github.io/openai-agents-python/models/litellm/); therefore, these cases should already be handled by the LiteLLM integration. **Vertex AI**: no changes since this library does not allow users to specify a custom base URL Every time a span is created by one of the LLM Obs integrations, the `self._get_base_url` method is called to retrieve the base URL for that interaction if it exists. Then, `self._is_proxy_url(base_url)` is called to determine whether to set an item in the context that indicates that the current span represents a proxy request. This will later be used in the integration code to determine the appropriate span kind. With this design, any new integrations simply need to implement the `_get_base_url` method and then use the `PROXY_REQUEST` context item to tag their LLM Obs spans accordingly. # Manual Testing For each integration, I tested three cases: 1. No base URL is set (this should result in an LLM span) 2. 
# Manual Testing

For each integration, I tested three cases:

1. No base URL is set (this should result in an LLM span)
2. The base URL is set to a proxy URL configured with `DD_LLMOBS_INSTRUMENTED_PROXY_URLS` (this should result in a top-level workflow span and possibly other child spans, which may include an LLM span depending on how the proxy server is instrumented)
3. The base URL is set but not to a proxy URL (this should result in an LLM span)

## Anthropic

Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWENSzdO5DDAAAABhBWmRXRU5TekFBRDBJZ1hMelVFc0FBQUEAAAAkZjE5NzU2MTAtZjU5Ny00NWU1LWI1M2UtMmE3OWQ3OWVmNjNlAAAADQ%22%7D%5D&spanId=4488892102153659416&start=1749494753131&end=1749495653131&paused=false)).

```
from anthropic import Anthropic

client = Anthropic()
message = client.messages.create(
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "What color is the sky?",
        }
    ],
    model="claude-3-5-sonnet-20240620",
)
```

Request with base URL specified ([when DD_LLMOBS_INSTRUMENTED_PROXY_URLS is set](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWEi0I5prmNQAAABhBWmRXRWkwSUFBQldmbnNkU0FOZ0FBQUEAAAAkZjE5NzU2MTItNTM0OS00NjVkLThlM2QtZDEwYTcxZmMzNzBmAAAABg%22%7D%5D&spanId=15578566980693053138&start=1749494846544&end=1749495746544&paused=false) and [when it is not](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWEaqpbXi0tAAAABhBWmRXRWFxcEFBQnJBcVlPR0JoN0FBQUEAAAAkZjE5NzU2MTEtYWNlZi00MDgyLWJjN2QtZjVkM2MzMmIxNTQ4AAAABA%22%7D%5D&spanId=11578799453780556987&start=1749494826126&end=1749495726126&paused=false)).

```
from anthropic import Anthropic

client = Anthropic(
    base_url="http://localhost:4000",
)
message = client.messages.create(
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "What color is the sky?",
        }
    ],
    model="claude-3.5",
)
```

## Bedrock

I chose to not instrument the server in the case where the base URL is specified but is not set as the proxy URL, to avoid sending spans from the server.

Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWGvk-sXTBZQAAABhBWmRXR3ZrLUFBQ25ncEVYTmlhQ0FBQUEAAAAkZjE5NzU2MWEtZjkzZS00OTlhLTg0NTktNzdmN2EyZWM2MzhjAAAAAA%22%7D%5D&spanId=15802925627459513713&start=1749495407424&end=1749496307424&paused=false)).

```
import boto3
import json

session = boto3.Session(profile_name='601427279990_account-admin', region_name="us-east-1")
brt = session.client(
    service_name='bedrock-runtime',
)

modelId = 'amazon.titan-text-lite-v1'
accept = 'application/json'
contentType = 'application/json'

input_text = "Explain black holes to 8th graders."
body = {
    "inputText": input_text,
}
body = json.dumps(body)

response = brt.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType)
response_body = json.loads(response.get('body').read())
```

Request with base URL specified ([when DD_LLMOBS_INSTRUMENTED_PROXY_URLS is set](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWGhtvbh20tAAAABhBWmRXR2h0dkFBQmFrTVVqb09Vc0FBQUEAAAAkZjE5NzU2MWEtMjRkMS00ZDY3LTk2ODctMjdjYzk0ZGQ4ZTUwAAAABg%22%7D%5D&spanId=939675140636485153&start=1749495362196&end=1749496262196&paused=false) and [when it is not](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWGBvtsT3BZQAAABhBWmRXR0J2dEFBQURzcm1jV05ZYkFBQUEAAAAkZjE5NzU2MTgtMjdiZS00MzMzLWIwZWEtNmQ3YmIzN2Y2M2JmAAAAAQ%22%7D%5D&spanId=2663441806507313625&start=1749495240767&end=1749496140767&paused=false))

Server Code (I created a proxy server of my own to test this out!)

```
from fastapi import FastAPI, Request
import uvicorn
import boto3
import json

app = FastAPI()


@app.post("/model/{model_id}/invoke")
async def invoke_model(model_id: str, request: Request):
    body = await request.json()
    session = boto3.Session(profile_name='601427279990_account-admin', region_name="us-east-1")
    brt = session.client(
        service_name='bedrock-runtime',
    )
    body = json.dumps(body)
    response = brt.invoke_model(
        body=body,
        modelId=request.path_params.get("model_id"),
        accept=request.headers.get("accept"),
        contentType=request.headers.get("content-type"),
    )
    response_body = json.loads(response.get('body').read())
    return response_body


if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=4000)
```

Client code

```
import boto3
import json

session = boto3.Session(profile_name='601427279990_account-admin', region_name="us-east-1")
brt = session.client(
    service_name='bedrock-runtime',
    endpoint_url="http://0.0.0.0:4000",
)

modelId = 'amazon.titan-text-lite-v1'
accept = 'application/json'
contentType = 'application/json'

input_text = "Explain black holes to 8th graders."
body = {
    "inputText": input_text,
}
body = json.dumps(body)

response = brt.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType)
response_body = json.loads(response.get('body').read())
```

## Crew AI

To test out these changes with Crew AI, I used the following simple Crew AI flow:

```
from crewai import Agent, Task, Crew, LLM

llm = LLM(
    model="gpt-3.5-turbo",
    base_url="http://0.0.0.0:4000",  # optionally set for testing
)

calculator = Agent(
    role='Mathematical Calculator',
    goal='Perform accurate mathematical calculations',
    backstory='You are an expert mathematician who can solve complex calculations with precision.',
    llm=llm,
    verbose=True
)

calculation_task = Task(
    description='Calculate the sum of all numbers from 1 to 100',
    agent=calculator,
    expected_output='The sum of all numbers from 1 to 100'
)

crew = Crew(
    agents=[calculator],
    tasks=[calculation_task]
)

result = crew.kickoff()
```

When the base URL is not set, I get this [trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWW0GEz1QXyQAAABhBWmRXVzBHRUFBQ0EzWnhKTzZVaUFBQUEAAAAkZjE5NzU2NWItNDNkOS00MWRhLWI4NDEtNzRkZWRhNDY2YjcxAAAABw%22%7D%5D&spanId=491126038724058424&start=1749496923362&end=1749500523362&paused=false) with LLM spans. When the base URL is set to the same URL as in `DD_LLMOBS_INSTRUMENTED_PROXY_URLS`, I get this [trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWXx01ozqQPwAAABhBWmRXWHgwMUFBQVBFcU8yell3eEFBQUEAAAAkZjE5NzU2NWYtMzhhYS00NzlmLWFkMzItNThjOTBmYTllMGZiAAADjw%22%7D%5D&spanId=8869414125491016509&start=1749499885536&end=1749500785536&paused=false) with workflow spans from the client and underlying LLM spans nested within. And when the base URL is set but not to a proxy URL, I get this [trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWXRmv8fYWxgAAABhBWmRXWFJtdkFBRDM5RTNkYXk4WEFBQUEAAAAkZjE5NzU2NWQtMzliNC00NjVhLWIzYjItMWRiNWU2MWQ0MjQ5AAAEug%22%7D%5D&spanId=7586973026220693213&start=1749499746513&end=1749500646513&paused=false), again with just the LLM span as expected.

## Langchain

For the request using a proxy URL, I instrumented both the client and the server, except for Open AI. This was to make things simpler, as the only integrations emitting spans would be Langchain and LiteLLM (since I am using a LiteLLM proxy server). I also chose to not instrument the server in the case where the base URL is specified but is not set as the proxy URL, to avoid sending spans from the server.

Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWLxYtSWdrcwAAABhBWmRXTHhZdEFBRHFfMnhRT1lVU0FBQUEAAAAkZjE5NzU2MmYtMWM4Mi00ZTJiLWE1OWMtYTk1MjcwMDcwZDVjAAAAAg%22%7D%5D&spanId=9094709675449713583&start=1749496813623&end=1749497713623&paused=false)).
```
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

chat = ChatOpenAI(
    model = "gpt-3.5-turbo",
    temperature=0.1,
)
messages = [HumanMessage(content="how are you?")]
response = chat(messages)
print(response)
```

Request with base URL specified ([when DD_LLMOBS_INSTRUMENTED_PROXY_URLS is set](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWM0jM9LLMLQAAABhBWmRXTTBqTUFBQnRLbVpjdEJBcEFBQUEAAAAkZjE5NzU2MzMtNGE4NC00ODNkLWIwZjktMDBlN2MwY2E5Nzg5AAAABQ%22%7D%5D&spanId=16659272787581089973&start=1749497004926&end=1749497904926&paused=false) and [when it is not](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWNE1cSZNrcwAAABhBWmRXTkUxY0FBQkhWU3VFdmgxYUFBQUEAAAAkZjE5NzU2MzQtNGQ1ZC00NzQ5LTkwYWItMWE4MThmZmJkN2VjAAAAAg%22%7D%5D&spanId=4643233463687100041&start=1749497076231&end=1749497976231&paused=false))

```
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

chat = ChatOpenAI(
    base_url="http://0.0.0.0:4000",
    model = "gpt-3.5-turbo",
    temperature=0.1,
)
messages = [HumanMessage(content="how are you?")]
response = chat(messages)
print(response)
```

## Langgraph

For these tests, I used the following application code:

```
from langgraph.graph import StateGraph, START, END
from typing import TypedDict
from langchain_openai import ChatOpenAI


class GraphState(TypedDict):
    question: str
    conclusion: str


class Mathematician():
    def __init__(self):
        self.llm = ChatOpenAI(model="gpt-3.5-turbo")

    def __call__(self, state: GraphState):
        prompt = f"You are a mathematician that should only answer questions with a number. You are given a question: {state['question']}. Please answer the question."
        return {"conclusion": self.llm.invoke(prompt)}


graph_builder = StateGraph(GraphState)
graph_builder.add_node("mathematician", Mathematician())
graph_builder.add_edge(START, "mathematician")
graph_builder.add_edge("mathematician", END)
graph = graph_builder.compile()

conclusion = graph.invoke({
    "question": "sum the numbers 1 to 100",
})['conclusion']
print(conclusion)
```

I then made changes to the LLM model used to showcase the traces that result in the following cases:

1. Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZda-xlM9SEw6wAAABhBWmRhLXhsTUFBQmdnOEVKYVplc0FBQUEAAAAkZjE5NzVhZmItMTk0ZC00YjYwLTgyYTQtNTY2YTNhOGMwZjllAAAABg%22%7D%5D&spanId=8325589879530822573&start=1749577652721&end=1749578552721&paused=false))
2. Request with base URL specified ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdbBQy75H9rcwAAABhBWmRiQlF5N0FBQlBKcmNfNExUUUFBQUEAAAAkZjE5NzViMDUtMjY2OC00Nzg1LTgzMjctMmIyZjQxZGJjMjVhAAAABg%22%7D%5D&spanId=732132008673596287&start=1749577883240&end=1749578783240&paused=false))
3. Request with base URL specified and `DD_LLMOBS_INSTRUMENTED_PROXY_URLS` set ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdbBI5bzhpfFAAAABhBWmRiQkk1YkFBRGVkaUVHdWZJY0FBQUEAAAAkZjE5NzViMDQtOGU4Yy00ZTE1LThiMjYtYTg4NDhhYjkyZjBiAAAAAg%22%7D%5D&spanId=2296149025644956944&start=1749577854444&end=1749578754444&paused=false))

## LiteLLM

For these tests, I started a LiteLLM server and sent requests to it by specifying the base URL as `"http://localhost:4000"`. To make the examples more relevant, I disabled the Open AI integration, which means all spans were coming from the LiteLLM integration (this should not change the number of spans or the span kinds present in each trace). I also chose to not instrument the server in the case where the base URL is specified but is not set as the proxy URL, to avoid sending spans from the server.

Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdV3XswcyZDDAAAABhBWmRWM1hzd0FBQ2NzOWtqSlpaTkFBQUEAAAAkZjE5NzU1ZGQtN2IzMC00NjBiLWE1NGUtYTA2NDM3Y2ZjMDNjAAAAAA%22%7D%5D&spanId=1159980771162140816&start=1749491382194&end=1749492282194&paused=false)).

```
import os
import litellm
from litellm import completion

litellm.api_key = os.environ["OPENAI_API_KEY"]
messages = [{"content": "What color is the sky?", "role": "user"}]
response = completion(model="gpt-3.5-turbo", messages=messages)
print(response)
```

Request with base URL specified ([when DD_LLMOBS_INSTRUMENTED_PROXY_URLS is set](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdV6IDmv3BfGAAAABhBWmRWNklEbUFBQTh3N2ZhTzhjaUFBQUEAAAAkZjE5NzU1ZTgtODEzNy00MGQ4LWJkOGYtYzFiZWIxNDI1ZTcxAAAABA%22%7D%5D&spanId=832522128674090551&start=1749492100252&end=1749493000252&paused=false) and [when it is not](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdV6ljxyUkI0QAAABhBWmRWNmxqeEFBRDdkY1dNX2MwVkFBQUEAAAAkZjE5NzU1ZWEtN2JjNy00MGYwLThjY2ItZjhlODAxMWM4YmYyAAAAAw%22%7D%5D&spanId=3962954564061385568&start=1749492226691&end=1749493126691&paused=false)).

```
import os
import litellm
from litellm import completion

litellm.api_key = os.environ["OPENAI_API_KEY"]
messages = [{"content": "What color is the sky?", "role": "user"}]
response = completion(model="gpt-3.5-turbo", messages=messages, api_base="http://localhost:4000")
print(response)
```
## Open AI

I chose to not instrument the server in the case where the base URL is specified but is not set as the proxy URL, to avoid sending spans from the server.

Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWIxjd6BrmNQAAABhBWmRXSXhqZEFBQkFhSlRiV05DeEFBQUEAAAAkZjE5NzU2MjMtNDU2Mi00MGJhLTlmYjQtNDZjMzczNjUxYmU5AAAABw%22%7D%5D&spanId=5010571712001212560&start=1749495949169&end=1749496849169&paused=false)).

```
import os
from openai import OpenAI

oai_client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
)

completion = oai_client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "testing openai"},
    ],
)
```

Request with base URL specified ([when DD_LLMOBS_INSTRUMENTED_PROXY_URLS is set](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWJGwwdmBDDAAAABhBWmRXSkd3d0FBQ01GV0VRc29QUUFBQUEAAAAkZjE5NzU2MjQtN2NiNS00OTBmLWI3NmEtMTZlZmQ3NDYxNzE5AAAABw%22%7D%5D&spanId=9189504309494995843&start=1749496040239&end=1749496940239&paused=false) and [when it is not](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWJbwtwkVfGAAAABhBWmRXSmJ3dEFBQXdsTHhtNU1iMEFBQUEAAAAkZjE5NzU2MjUtZDEzZC00ZDVlLTgwMDItMzk0ODA0NjNlZGY0AAAABA%22%7D%5D&spanId=16803264818418539708&start=1749496121106&end=1749497021106&paused=false)).

```
import os
from openai import OpenAI

oai_client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
    base_url="http://0.0.0.0:4000",
)

completion = oai_client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "testing openai"},
    ],
)
```

## Open AI Agents

Request with default base URL ([trace](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWfOtO9p0WxgAAABhBWmRXZk90T0FBQ2RUMHJnWk9MY0FBQUEAAAAkZjE5NzU2N2QtMWZlYi00MWMwLThkYmItZjc0ZGViOWQ2MTU5AAAAFw%22%7D%5D&spanId=12382352369827199770&start=1749501871180&end=1749502771180&paused=false)).

```
from agents import Agent, Runner
import asyncio

math_tutor_agent = Agent(
    name="Math Tutor",
    handoff_description="Specialist agent for math questions",
    instructions="You provide help with math problems. Explain your reasoning at each step and include examples",
    model="gpt-3.5-turbo",
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="You determine which agent to use based on the user's homework question",
    handoffs=[math_tutor_agent],
    model="gpt-3.5-turbo",
)


async def main():
    result = await Runner.run(triage_agent, "what is the sum of the numbers between 1 and 100?", max_turns=3)
    print(result.final_output)


if __name__ == "__main__":
    asyncio.run(main())
```

Request with base URL specified ([when DD_LLMOBS_INSTRUMENTED_PROXY_URLS is set](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=false&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWguw4pn2-8QAAABhBWmRXZ3V3NEFBQXhMYVlDQ0czVUFBQUEAAAAkZjE5NzU2ODItZWYxZC00NzExLWFkYmUtNGE4NmNkZDA3NGM3AAAAEQ%22%7D%5D&spanId=15430591163304707830&start=1749502304313&end=1749503204313&paused=false) and [when it is not](https://dd.datad0g.com/llm/traces?query=%40ml_app%3Anicole-test%20%40event_type%3Aspan%20%40parent_id%3Aundefined&agg_m=count&agg_m_source=base&agg_t=count&fromUser=true&llmPanels=%5B%7B%22t%22%3A%22sampleDetailPanel%22%2C%22rEID%22%3A%22AwAAAZdWhbr7gAG0tAAAABhBWmRXaGJyN0FBQ283U0ZGN3B3ckFBQUEAAAAkZjE5NzU2ODYtMDNhNi00ZjA4LTkwODgtOWY3ODcxMGNiODI4AAAAEQ%22%7D%5D&spanId=5364068118937893712&start=1749502418618&end=1749503318618&paused=false)).

```
# only change was updating the model used in each agent
from agents.extensions.models.litellm_model import LitellmModel
import os

model = LitellmModel(
    model="gpt-3.5-turbo",
    api_key=os.getenv("OPENAI_API_KEY"),
    base_url="http://localhost:4000",
)
```

## Checklist

- [x] PR author has checked that all the criteria below are met
  - The PR description includes an overview of the change
  - The PR description articulates the motivation for the change
  - The change includes tests OR the PR description describes a testing strategy
  - The PR description notes risks associated with the change, if any
  - Newly-added code is easy to change
  - The change follows the [library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
  - The change includes or references documentation updates if necessary
  - Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [x] Reviewer has checked that all the criteria below are met
  - Title is accurate
  - All changes are related to the pull request's stated goal
  - Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes
  - Testing strategy adequately addresses listed risks
  - Newly-added code is easy to change
  - Release note makes sense to a user of the library
  - If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  - Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

---------

Co-authored-by: kyle <kyle@verhoog.ca>