fix: ui improvements for logs page and ai insights #1658
🦋 Changeset detected
Latest commit: c1b4997
The changes in this PR will be included in the next version bump. This PR includes changesets to release 2 packages.
d3a17a0 to 769faaa
🚀 Preview Environment (PR #1658)
Preview URL: https://pr-1658.dev.getgram.ai
Gram Preview Bot
    }

    // Exclude chat completion logs (urn:uuid:...) which are not tool calls
    sb = sb.Having("position(gram_urn, 'urn:uuid:') != 1")
🔴 Non-deterministic any(gram_urn) causes urn:uuid: HAVING filter to randomly exclude valid tool call traces
The unconditional HAVING filter position(gram_urn, 'urn:uuid:') != 1 operates on any(gram_urn) which is a non-deterministic aggregate. When tool calls and chat completion events share the same trace_id (as the updated seed data now does — see .mise-tasks/seed.mts where traceId is reused for both tool call and chat completion events), the any() function may arbitrarily pick either the tool URN (e.g., tools:http:gram:foo) or the urn:uuid:... URN for the chat completion.
Root cause and impact
The trace_summaries materialized view (server/clickhouse/schema.sql:125-140) already stores any(gram_urn) per trace:
    any(gram_urn) AS gram_urn
The ListTraces query then further aggregates with any(gram_urn) as gram_urn at queries.sql.go:240. The new HAVING filter at line 262 checks whether this non-deterministic value starts with urn:uuid:.
When a trace contains both a tool call log (gram_urn = 'tools:http:gram:some_tool') and a chat completion log (gram_urn = 'urn:uuid:...'), the any() function may pick the urn:uuid: value, causing the HAVING filter to exclude the entire trace — even though it contains valid tool call data.
This results in random data loss on the logs page: legitimate tool call traces silently disappear depending on which URN any() happens to pick. The issue is exacerbated by the seed data change that explicitly assigns the same traceId to both tool call and chat completion events.
Fix: Instead of filtering on the non-deterministic any(gram_urn), use an aggregate-safe filter such as HAVING countIf(position(gram_urn, 'urn:uuid:') = 1) = 0 or filter at the source MV level.
Prompt for agents
In server/internal/telemetry/repo/queries.sql.go at line 262, replace the HAVING filter `position(gram_urn, 'urn:uuid:') != 1` which operates on the non-deterministic `any(gram_urn)` alias. The `any()` function may pick a `urn:uuid:` URN even when the trace also contains valid tool call URNs, causing those traces to be randomly excluded.
Option A: Use a conditional aggregate that checks ALL URNs in the trace group rather than a single non-deterministic pick. For example, if the trace_summaries table is the source, you could add a new column to the MV that tracks whether the trace has tool call URNs (e.g., `countIf(startsWith(gram_urn, 'tools:'))`).
Option B: Change the SELECT to prefer tool URNs over uuid URNs by using `anyIf(gram_urn, NOT startsWith(gram_urn, 'urn:uuid:'))` instead of `any(gram_urn)` at line 240, and keep the HAVING filter. However, this requires updating the trace_summaries MV schema as well.
Option C (simplest short-term fix): Change line 262 to use `HAVING NOT startsWith(any(gram_urn), 'urn:uuid:') OR countIf(...)` or alternatively, filter for traces that DO have tool URNs rather than filtering out traces that happen to have a uuid picked by any().
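To make the failure mode concrete, here is a minimal TypeScript sketch of the grouping logic. It is a model, not the actual ClickHouse query: `any()` is represented as a caller-supplied picker that returns some element of the group, and the type and function names (`LogRow`, `keepByArbitraryPick`, `keepIfHasToolUrn`) are hypothetical.

```typescript
// Illustrative model only: ClickHouse's any() is represented as
// "some element of the group, chosen by a caller-supplied picker".
type LogRow = { traceId: string; gramUrn: string };

const isChatCompletionUrn = (urn: string): boolean =>
  urn.startsWith("urn:uuid:");

// Group URNs by trace id, preserving insertion order inside each group.
function groupByTrace(rows: LogRow[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const row of rows) {
    const urns = groups.get(row.traceId) ?? [];
    urns.push(row.gramUrn);
    groups.set(row.traceId, urns);
  }
  return groups;
}

// Mirrors the flagged filter: the decision depends on one arbitrarily picked URN.
function keepByArbitraryPick(
  urns: string[],
  pick: (urns: string[]) => string,
): boolean {
  return !isChatCompletionUrn(pick(urns));
}

// Order-independent alternative: keep the trace if at least one URN
// is not a chat completion URN.
function keepIfHasToolUrn(urns: string[]): boolean {
  return urns.some((urn) => !isChatCompletionUrn(urn));
}

// A trace that contains both a tool call and a chat completion.
const mixedRows: LogRow[] = [
  { traceId: "t1", gramUrn: "tools:http:gram:some_tool" },
  { traceId: "t1", gramUrn: "urn:uuid:00000000-0000-0000-0000-000000000000" },
];

const mixedTrace = groupByTrace(mixedRows).get("t1") ?? [];
const pickFirst = (urns: string[]): string => urns[0] ?? "";
const pickLast = (urns: string[]): string => urns[urns.length - 1] ?? "";

// The same trace is kept or dropped depending only on which URN is picked.
const keptWhenFirstPicked = keepByArbitraryPick(mixedTrace, pickFirst);
const keptWhenLastPicked = keepByArbitraryPick(mixedTrace, pickLast);
// The group-level check gives the same answer regardless of order.
const keptByGroupCheck = keepIfHasToolUrn(mixedTrace);
```

The sketch keeps a trace when at least one URN is a tool URN, which corresponds to the alternative mentioned at the end of Option C. A check that requires zero urn:uuid: URNs in the group would also be deterministic, but it would drop every mixed trace.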
bcde857 to 305ada8
This comment has been minimized.
- Make search bar same height (42px) as other filters
- Update server filter border to match time range picker
- Sort tool bar lists by displayed value (highest first)
- For failure rate, sort by failureCount to match displayed %

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
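The sorting rule in this commit could look roughly like the following sketch. The `ToolStat` shape, the `SortMetric` union, and the `sortForDisplay` helper are assumptions made for illustration; only the `failureCount` field name comes from the commit message.

```typescript
// Hypothetical data shape; the real component props are not shown in this PR.
type ToolStat = { name: string; callCount: number; failureCount: number };
type SortMetric = "calls" | "failures";

// Sort by the value that is actually displayed, highest first.
function sortForDisplay(stats: ToolStat[], metric: SortMetric): ToolStat[] {
  const valueOf = (s: ToolStat): number =>
    metric === "failures" ? s.failureCount : s.callCount;
  // Copy before sorting so the caller's array is left untouched.
  return [...stats].sort((a, b) => valueOf(b) - valueOf(a));
}

const toolStats: ToolStat[] = [
  { name: "a", callCount: 10, failureCount: 1 },
  { name: "b", callCount: 5, failureCount: 4 },
  { name: "c", callCount: 7, failureCount: 0 },
];

const byCalls = sortForDisplay(toolStats, "calls").map((s) => s.name);
const byFailures = sortForDisplay(toolStats, "failures").map((s) => s.name);
```

Sorting on the same number that drives the bar keeps the list order and the displayed values in agreement.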
- Prevent timestamp wrapping with whitespace-nowrap
- Remove comma from timestamp format
- Hide child span timestamps when same as parent
- Align tree lines with parent chevron
- Fix layout shift in loading/error/empty states
- Exclude urn:uuid: entries (chat completions) from tool calls list

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Change InsightsSidebar content wrapper to overflow-hidden
- Let Page.Body handle scrolling internally
- Add pb-24 bottom padding to Page.Body for scroll visibility

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove Beta label from AI Insights sidebar header
- Remove Beta label from Insights page title
- Add backdrop overlay to close sidebar when clicking outside

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…inistic filtering

The `any(gram_urn)` aggregate in trace_summaries could non-deterministically pick a urn:uuid: value when a trace contains both tool call and chat completion logs, causing the HAVING filter to randomly exclude valid traces.

Fix by filtering urn:uuid: logs at the MV insert level so they never enter the table. The query-level HAVING filter is kept as a safety net for historical data.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…s conflict

The trace_summaries_mv materialized view WHERE clause was referencing `gram_urn`, which ClickHouse interpreted as the SELECT alias `any(gram_urn)` rather than the source column. This caused ClickHouse error 184.

Fixed by qualifying the column as `telemetry_logs.gram_urn` in the WHERE clause to explicitly reference the source table column.

Also regenerated openapi3.json to fix dirty file CI check failures.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ization

The trace_summaries_mv in schema.sql (used by testcontainers) also needs the qualified column reference `telemetry_logs.gram_urn` to avoid ClickHouse confusing it with the SELECT alias `any(gram_urn) AS gram_urn`.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
… chat data

When a user selects an MCP server filter on the Tools tab and switches to the Chats tab, the filter was persisting and causing all chat metrics to show 0 (since chat URNs don't match tool URN prefixes). Now the MCP filter is only applied when activeTab === "tools".

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
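A minimal sketch of the tab-scoped filter described in this commit, assuming the selected server is held as an optional string. Apart from `activeTab` and its `"tools"` value, the names here (`InsightsTab`, `effectiveServerFilter`, the `"chats"` tab value, the example server name) are hypothetical.

```typescript
// Hypothetical tab union; only "tools" is confirmed by the commit message.
type InsightsTab = "tools" | "chats";

// Return the MCP server filter to send with the query, or undefined
// when the active tab should not be filtered by server.
function effectiveServerFilter(
  activeTab: InsightsTab,
  selectedServer: string | undefined,
): string | undefined {
  return activeTab === "tools" ? selectedServer : undefined;
}

const onToolsTab = effectiveServerFilter("tools", "my-mcp-server");
const onChatsTab = effectiveServerFilter("chats", "my-mcp-server");
```

Keeping the selection in state and dropping it only at query time means the user's choice is still there when they return to the Tools tab.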
The ToolBarList component now shows each tool's share of total failures rather than the failure rate per call. Updated the title from "Tools by Failure Rate" to "Failure Distribution by Tool" to accurately describe the displayed metric.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
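The two metrics named in this commit differ as follows. This is a sketch under assumed data shapes: `ToolFailures`, `failureShare`, and `failureRate` are hypothetical names, not the component's actual code.

```typescript
// Hypothetical per-tool counters.
type ToolFailures = { name: string; callCount: number; failureCount: number };

// Share of total failures: this tool's failures divided by all failures.
function failureShare(tools: ToolFailures[]): Map<string, number> {
  const total = tools.reduce((sum, t) => sum + t.failureCount, 0);
  const shares = new Map<string, number>();
  for (const t of tools) {
    shares.set(t.name, total === 0 ? 0 : t.failureCount / total);
  }
  return shares;
}

// Failure rate per call: this tool's failures divided by its own calls.
function failureRate(tool: ToolFailures): number {
  return tool.callCount === 0 ? 0 : tool.failureCount / tool.callCount;
}

const failureData: ToolFailures[] = [
  { name: "busy_tool", callCount: 100, failureCount: 10 },
  { name: "rare_tool", callCount: 10, failureCount: 10 },
];

const shares = failureShare(failureData);
const busyRate = failureRate(failureData[0] ?? { name: "", callCount: 0, failureCount: 0 });
const rareRate = failureRate(failureData[1] ?? { name: "", callCount: 0, failureCount: 0 });
```

In this example both tools account for half of all failures, yet their per-call failure rates are 10% and 100%, which is why the old title was misleading for the displayed value.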
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
67d53ba to c1b4997
Summary
urn:uuid:) from tool calls list
age-1371
Test plan
urn:uuid: entries no longer appear in logs

🤖 Generated with Claude Code