fix(langfuse): enrich agentflow trace metadata, tags, and cost tracking #1021

Merged
maxtechera merged 2 commits into staging from fix/langfuse-agentflow-tracing on Mar 16, 2026

@maxtechera
Collaborator

Summary

  • Enriches AnalyticHandler.getInstance options in buildAgentflow.ts with chatflow/user/billing context so agentflow Langfuse traces get the same rich metadata as chatflows
  • Replaces hardcoded metadata: { tags: ['openai-assistant'] } in onChainStart with metadata built from this.options
  • Adds chatflow_id, chat_id, chatmessage_id tags to all Langfuse traces (agentflows via onChainStart, chatflows/legacy via additionalCallbacks)

Problem

Agentflow traces exist in Langfuse but are bare: they carry no chatflowid, userId, organizationId, or billing metadata, and their tags are empty. This makes them effectively unfindable and unfilterable compared with chatflow traces, which have rich metadata.

Root Cause

  1. AnalyticHandler.getInstance in buildAgentflow.ts received minimal options (only chatId, analytic config) — missing chatflowid, user, messageId, sessionId, billing
  2. AnalyticHandler.onChainStart in handler.ts created traces with hardcoded metadata: { tags: ['openai-assistant'] } ignoring enriched options

Changes

buildAgentflow.ts (15 lines)

  • Compute billingStripeCustomerId before analytics init block (copies existing pattern from follow-up prompts section)
  • Add 8 fields to AnalyticHandler.getInstance options: chatflowid, chatflowId, chatflowName, user, sessionId, messageId, billingStripeCustomerId, trackingMetadata
  • Add stack trace logging to analytics error catch block

handler.ts (48 lines net)

  • Replace metadata: { tags: ['openai-assistant'] } in onChainStart (2 trace creation paths) with rich metadata object
  • Add chatflow_id:, chat_id:, chatmessage_id: tags to onChainStart traces
  • Add same 3 tags to additionalCallbacks handlerConfig and parentLangfuseTrace.update

After Deploy

Agentflow traces will have:

name:     "Kumello Agentic Search Prod"  (was "Agentflow")
metadata: { chatflowid, chatflowName, chatId, userId, organizationId, messageId, sessionId, stripeCustomerId }
tags:     [ "Name:...", "chatflow_id:...", "chat_id:...", "chatmessage_id:..." ]

Chatflow/legacy traces also get the 3 new ID tags via the additionalCallbacks change.
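
As a rough illustration of the metadata shape listed above, the sketch below builds the trace name and metadata from an options object instead of a hardcoded literal. It is not the code from handler.ts: the AnalyticOptions interface, the buildTraceFields helper, the nesting of user.id and user.organizationId, and the fallback to the old "Agentflow" name are all assumptions made for the example.

```typescript
// Hypothetical option shape. Field names follow the PR description, but the
// nesting of user.id / user.organizationId is an assumption.
interface AnalyticOptions {
    chatId?: string
    chatflowid?: string
    chatflowName?: string
    messageId?: string
    sessionId?: string
    billingStripeCustomerId?: string
    user?: { id?: string; organizationId?: string }
}

interface TraceMetadata {
    chatflowid?: string
    chatflowName?: string
    chatId?: string
    userId?: string
    organizationId?: string
    messageId?: string
    sessionId?: string
    stripeCustomerId?: string
}

// Hypothetical helper: derives the trace name and metadata from the options.
function buildTraceFields(options: AnalyticOptions): { name: string; metadata: TraceMetadata } {
    return {
        // Assumption: fall back to the old generic name when no chatflow name is supplied.
        name: options.chatflowName || 'Agentflow',
        metadata: {
            chatflowid: options.chatflowid,
            chatflowName: options.chatflowName,
            chatId: options.chatId,
            userId: options.user?.id,
            organizationId: options.user?.organizationId,
            messageId: options.messageId,
            sessionId: options.sessionId,
            stripeCustomerId: options.billingStripeCustomerId
        }
    }
}
```

Keeping the mapping in one function would let both trace creation paths in onChainStart share it, so the two cannot drift apart.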

…/user/billing context

- Compute billingStripeCustomerId before analytics init block
- Pass chatflowid, chatflowName, user, sessionId, messageId, billingStripeCustomerId,
  trackingMetadata into AnalyticHandler.getInstance options
- Improve analytics catch block to log full error stack trace
…a, add ID tags

- Replace hardcoded metadata: { tags: ['openai-assistant'] } in onChainStart
  with rich metadata from this.options (chatflowid, chatflowName, chatId,
  userId, organizationId, messageId, sessionId, stripeCustomerId)
- Add chatflow_id, chat_id, chatmessage_id tags to onChainStart traces
- Add chatflow_id, chat_id, chatmessage_id tags to additionalCallbacks
  handlerConfig and parentLangfuseTrace.update
- Applies to both new-trace and fallback trace paths in onChainStart
@vercel

vercel Bot commented Mar 16, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project         Deployment   Actions   Updated (UTC)
answerai-docs   Building     Preview   Mar 16, 2026 5:05pm
the-answerai    Building     Preview   Mar 16, 2026 5:05pm

@claude

claude Bot commented Mar 16, 2026

PR Review: fix(langfuse): enrich agentflow trace metadata, tags, and cost tracking

Summary: This PR fixes a real gap — agentflow traces in Langfuse were getting minimal metadata compared to chatflow traces. The approach is correct and follows existing patterns. Two issues need attention before merging: a redundant database query on every agentflow execution, and a tag inconsistency between the two code paths.


Critical Issues

1. Redundant database query on every agentflow execution

Location: packages/server/src/utils/buildAgentflow.ts (new block at line ~1984)

The new code adds an unconditional User repository lookup:

const billedUserId = user?.id || chatflow.userId
const billedUser = await appDataSource.getRepository(User).findOne({ where: { id: billedUserId } })
const billingStripeCustomerId = OVERRIDE_CUSTOMER_ID ? DEFAULT_CUSTOMER_ID : billedUser?.stripeCustomerId

This runs on every agentflow execution, even when analytics are disabled (isAnalyticsEnabled check happens in the try block below it). The identical pattern already exists later in the function for followUpPrompts (around line 2342), but that one is correctly guarded inside if (chatflow.followUpPrompts). This new block has no guard at all.

Impact: One extra DB round-trip per agentflow call, regardless of whether analytics or OVERRIDE_CUSTOMER_ID are in play.

Suggestion: Move the three billing lines inside the if (isAnalyticsEnabled(chatflow.analytic)) block, since that is the only place billingStripeCustomerId is consumed in this PR:

if (isAnalyticsEnabled(chatflow.analytic)) {
    const billedUserId = user?.id || chatflow.userId
    const billedUser = await appDataSource.getRepository(User).findOne({ where: { id: billedUserId } })
    const billingStripeCustomerId = OVERRIDE_CUSTOMER_ID ? DEFAULT_CUSTOMER_ID : billedUser?.stripeCustomerId
    // ... rest of analytics init
}
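
One way to make the guard easy to verify is to isolate the decision in a small helper, sketched below. This is only an illustration: resolveBillingCustomerId and its parameter names are hypothetical, findStripeCustomerId stands in for the User repository lookup, and skipping the lookup when the override is active goes slightly beyond the suggestion above.

```typescript
// Hypothetical helper. `findStripeCustomerId` stands in for the
// appDataSource.getRepository(User).findOne(...) call in buildAgentflow.ts.
async function resolveBillingCustomerId(params: {
    analyticsEnabled: boolean
    overrideCustomer: boolean
    defaultCustomerId?: string
    billedUserId?: string
    findStripeCustomerId: (userId: string) => Promise<string | undefined>
}): Promise<string | undefined> {
    // No analytics means nothing consumes the value, so skip the DB round-trip.
    if (!params.analyticsEnabled) return undefined
    // Assumption: when the override is active, the lookup result would be discarded anyway.
    if (params.overrideCustomer) return params.defaultCustomerId
    // Without a user id there is nothing to look up.
    if (!params.billedUserId) return undefined
    return params.findStripeCustomerId(params.billedUserId)
}
```

With this shape, a unit test can pass a counting stub as findStripeCustomerId and assert that it is never called while analytics are disabled.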

Major Concerns

2. Tag inconsistency: chatflow_id and chat_id tags may contain the literal string "undefined" in additionalCallbacks

Location: packages/components/src/handler.ts lines ~718-723 and ~739-743

The new tags added to additionalCallbacks use options.chatflowid (lowercase id) without a null guard:

tags: [
    `Name:${chatflow.name}`,
    `chatflow_id:${options.chatflowid}`,   // no guard — prints "chatflow_id:undefined" if missing
    `chat_id:${options.chatId}`,            // same issue
    ...(options.messageId ? [`chatmessage_id:${options.messageId}`] : [])
],

In onChainStart (lines ~1221 and ~1246), the same field is correctly guarded:

...(this.options.chatflowid ? [`chatflow_id:${this.options.chatflowid}`] : []),
...(this.options.chatId ? [`chat_id:${this.options.chatId}`] : []),

For callers that do not supply chatflowid or chatId in options (e.g. legacy chatflow paths), Langfuse will receive literal "chatflow_id:undefined" and "chat_id:undefined" tags, polluting the tag index.

Suggestion: Apply the same conditional spread pattern used in onChainStart:

tags: [
    `Name:${chatflow.name}`,
    ...(options.chatflowid ? [`chatflow_id:${options.chatflowid}`] : []),
    ...(options.chatId ? [`chat_id:${options.chatId}`] : []),
    ...(options.messageId ? [`chatmessage_id:${options.messageId}`] : [])
],

This applies to both the handlerConfig block and the parentLangfuseTrace.update block in additionalCallbacks.
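
Since the same tag list is needed in several places, the conditional spread pattern could live in one shared helper. The sketch below is a minimal illustration; buildTraceTags and the TagSource interface are hypothetical names, not part of handler.ts.

```typescript
// Hypothetical option subset carrying the three ID fields used for tags.
interface TagSource {
    chatflowid?: string
    chatId?: string
    messageId?: string
}

// Builds the Langfuse tag list, emitting an ID tag only when its value is present.
function buildTraceTags(chatflowName: string, options: TagSource): string[] {
    return [
        `Name:${chatflowName}`,
        // Each spread contributes one tag or nothing, so no "xxx:undefined" tag can appear.
        ...(options.chatflowid ? [`chatflow_id:${options.chatflowid}`] : []),
        ...(options.chatId ? [`chat_id:${options.chatId}`] : []),
        ...(options.messageId ? [`chatmessage_id:${options.messageId}`] : [])
    ]
}
```

Calling the helper from handlerConfig, parentLangfuseTrace.update, and both onChainStart paths would keep the four tag lists identical by construction.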

3. Duplicate chatflowId / chatflowid fields passed to getInstance

Location: packages/server/src/utils/buildAgentflow.ts lines 2006-2007

chatflowid: chatflow.id,
chatflowId: chatflow.id,  // duplicate — same value, different casing

This appears intentional: the PR description lists both chatflowid and chatflowId among the fields added to the getInstance options. It is still worth documenting why both are needed (i.e., which consumers read which casing), since a comment here would prevent future confusion. The onChainStart fallback in handler.ts uses this.options.chatflowid || this.options.chatflowId to handle both, which confirms that the dual-key approach is load-bearing.
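
A small accessor is one possible way to document the dual-key contract in code rather than in a comment. The sketch below mirrors the chatflowid || chatflowId fallback described for onChainStart; resolveChatflowId and ChatflowIdOptions are hypothetical names.

```typescript
// Hypothetical option subset: some callers set `chatflowid`, others `chatflowId`.
interface ChatflowIdOptions {
    chatflowid?: string
    chatflowId?: string
}

// Returns the chatflow id regardless of which casing the caller supplied,
// preferring the lowercase key, as in the fallback described above.
function resolveChatflowId(options: ChatflowIdOptions): string | undefined {
    return options.chatflowid || options.chatflowId
}
```

Consumers that go through such an accessor no longer need to know which casing a given caller used.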


Minor Issues and Suggestions

4. console.warn vs logger in onChainStart

Location: packages/components/src/handler.ts line ~1204

The console.warn call here is not new in this PR, but the new trace-creation code sits directly below it. It bypasses the logger utility that all other logging in buildAgentflow.ts uses. This is a pre-existing inconsistency, noted only because of its proximity to the new code.

5. filter(Boolean) after all-string conditional spreads

Location: handler.ts lines ~1224 and ~1249

].filter(Boolean),

After using conditional spreads (...(x ? ['tag'] : [])), the array can never contain falsy values, so filter(Boolean) is a no-op. It is harmless but adds minor cognitive overhead. Consider removing it for clarity, or add a comment explaining it is defensive.

6. logger.error with an Error object as second argument

Location: packages/server/src/utils/buildAgentflow.ts line 2025

logger.error(`[server]: Analytics stack trace:`, error)

The standard logger.error signature in this codebase takes a string message. Passing a raw Error object as a second argument may or may not serialize correctly depending on the logger configuration. Consider using getErrorMessage(error) or error instanceof Error ? error.stack : String(error) to guarantee the stack trace is captured as a string.
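
The second option could be wrapped in a small helper so the catch block always logs a plain string. This is a minimal sketch; formatErrorForLog is a hypothetical name and is not an existing utility in this codebase.

```typescript
// Hypothetical helper: converts any thrown value into a loggable string.
function formatErrorForLog(error: unknown): string {
    if (error instanceof Error) {
        // Prefer the stack trace; fall back to "Name: message" if the stack is unavailable.
        return error.stack ?? `${error.name}: ${error.message}`
    }
    // Non-Error throwables (strings, numbers, undefined) are stringified as-is.
    return String(error)
}
```

The catch block could then interpolate the result into the message it already logs, for example `[server]: Analytics stack trace: ${formatErrorForLog(error)}`, which does not depend on how the logger serializes extra arguments.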


Positive Observations

  • The PR description is clear and detailed, making the intent and root cause easy to understand.
  • Both new trace creation paths in onChainStart are kept in sync — the same metadata object and tag array are applied to both the "parent fetch failed" fallback and the "no parent trace" independent trace paths. This avoids a class of subtle tracing bugs.
  • The approach of enriching AnalyticHandler.getInstance options at the call site (rather than reaching into buildAgentflow internals from handler.ts) is the right architectural choice.
  • The existing billedUserId = user?.id || chatflow.userId billing pattern is correctly followed.
  • Targeting staging branch correctly per project conventions.

Checklist

  • No new routes added — enforceAbility N/A
  • No new database queries without organizationId scoping — the new .findOne({ where: { id: billedUserId } }) is a user lookup by PK, not a cross-tenant query
  • Redundant unconditional DB query should be guarded (issue 1)
  • chatflow_id / chat_id tags missing null guards in additionalCallbacks (issue 2)
  • No tests added for the new metadata enrichment paths

Status: Request Changes — Two issues should be addressed before merge: the unconditional DB query (correctness/performance) and the missing null guards on tags in additionalCallbacks (data quality). Both fixes are small and mechanical.

@maxtechera maxtechera merged commit 8a83b5d into staging Mar 16, 2026
8 checks passed
@maxtechera maxtechera deleted the fix/langfuse-agentflow-tracing branch March 16, 2026 17:32