Description
When training a LangGraph agent with GRPO, I observe the following warnings whenever the agent uses LangChain's with_structured_output method:
(TaskRunner pid=2743926) Warning: Reward is None for rollout ro-f38e23575c3a, will be auto-set to 0.0.
(TaskRunner pid=2743926) Warning: Length of triplets is 0 for rollout ro-f38e23575c3a.
This behavior does not appear when I use the base LLM without with_structured_output.
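For reference, here is a minimal sketch of the pattern that triggers the warnings (the schema and prompt are placeholders, not my actual agent code):

from pydantic import BaseModel
from langchain_openai import ChatOpenAI

class ReportSchema(BaseModel):  # illustrative schema, not my real one
    field_1: str
    field_2: str
    field_3: str

llm = ChatOpenAI(model="Qwen/Qwen3-8B", temperature=1.0)

# Plain llm.invoke(...) is traced as "openai.chat.completion" and trains fine;
# wrapping the model like this is what produces the warnings above.
structured_llm = llm.with_structured_output(ReportSchema)
result = structured_llm.invoke("Fill in the three fields.")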
When inspecting the spans, I observed that with with_structured_output there is no span named "openai.chat.completion" (whereas there is one without it), and therefore no response_token_ids are saved. Instead I see spans named chat_model.llm, like this:
Span(
rollout_id="ro-f38e23575c3a",
attempt_id="at-23e12663",
sequence_id=2,
trace_id="b513a90f765ffab2e70515045376b7ab",
span_id="57767100b5bb63dd",
parent_id="6f47a29b226edb52",
name="chat_model.llm",
status=TraceStatus(status_code="UNSET", description=None),
attributes={
"gen_ai.request.model": "Qwen/Qwen3-8B",
"langchain.llm.model": "Qwen/Qwen3-8B",
"gen_ai.prompt": "[]",
"langchain.llm.name": ["langchain", "chat_models", "openai", "ChatOpenAI"],
"langchain.chat_message.roles": "[]",
"langchain.chat_model.type": "chat",
"gen_ai.request.temperature": 1.0,
"agentops.span.kind": "llm",
"agentops.operation.name": "chat_model",
"gen_ai.completion": '["{\\n\\n\\n \\"field_1\\": \\"value_1\\",\\n \\"field_2\\": \\"value2\\",\\n \\"field_3\\": \\"value_3\\\n\\n \\n}"]',
"gen_ai.usage.completion_tokens": 276,
"gen_ai.usage.prompt_tokens": 389,
"gen_ai.usage.total_tokens": 665,
},
events=[],
...
)
Here is the kind of span I observe when I don't use with_structured_output:
Span(
rollout_id="ro-f2da02b42afd",
attempt_id="at-35588969",
sequence_id=2,
trace_id="e82fe2df35e304516c82855f718f0813",
span_id="385252948874f324",
parent_id="5bf0191efebccdbb",
name="openai.chat.completion",
status=TraceStatus(status_code="OK", description=None),
attributes={
"gen_ai.request.type": "chat",
"gen_ai.system": "OpenAI",
"gen_ai.request.model": "Qwen/Qwen3-8B",
"gen_ai.request.temperature": 1.0,
"gen_ai.request.streaming": False,
"gen_ai.request.headers": "{'X-Stainless-Raw-Response': 'true'}",
"gen_ai.prompt.0.role": "user",
"gen_ai.prompt.0.content": 'LLM prompt',
"gen_ai.response.id": "chatcmpl-edf09afd36744c66babdc9c1c244639c",
"gen_ai.response.model": "hosted_vllm/Qwen/Qwen3-8B",
"gen_ai.usage.total_tokens": 2317,
"gen_ai.usage.prompt_tokens": 392,
"gen_ai.usage.completion_tokens": 1925,
"gen_ai.completion.0.finish_reason": "stop",
"gen_ai.completion.0.role": "assistant",
"gen_ai.completion.0.content": "<think>\nOkay, output of the LLM",
"prompt_token_ids": [
151644,
872,
271,
198,
],
"response_token_ids": [
151667,
198,
32313,
11,
358,
],
},
events=[],
...
)
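For completeness, this is roughly the check I ran over the collected spans (spans here stands for whatever list of Span objects the tracer returns; the attribute access mirrors the dumps above and is illustrative, not a real API call):

def find_token_ids(spans):
    # Only "openai.chat.completion" spans carry the token ids that
    # the trainer needs to build triplets.
    for span in spans:
        if span.name == "openai.chat.completion":
            return span.attributes.get("response_token_ids")
    return None

# With with_structured_output there is no such span at all, so this
# returns None and the rollout ends up with zero triplets.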
Note that using the langchain_callback_handler does not change anything.
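In case the setup matters, this is roughly how I attach it; the handler construction is omitted, and only the config={"callbacks": [...]} part is standard LangChain API:

# handler stands in for the langchain_callback_handler instance.
result = structured_llm.invoke(
    "Fill in the three fields.",
    config={"callbacks": [handler]},  # standard way to pass LangChain callbacks
)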
Is this an issue with agent-lightning, or rather with AgentOps? Do you have a workaround?