Skip to content

feat(bqaa): ADK 2.0 minimum producer cut (#293 v5)#6

Draft
caohy1988 wants to merge 5 commits into
mainfrom
feat/bqaa-adk2-minimum-producer-on-fork
Draft

feat(bqaa): ADK 2.0 minimum producer cut (#293 v5)#6
caohy1988 wants to merge 5 commits into
mainfrom
feat/bqaa-adk2-minimum-producer-on-fork

Conversation

@caohy1988
Copy link
Copy Markdown
Owner

Draft — for iteration in this fork before raising upstream.

Originally raised at google#6015 against google/adk-python; moved here to iterate independently first.

This PR implements the customer-driven minimum producer subset of the ADK 2.0 observability work tracked in GoogleCloudPlatform/BigQuery-Agent-Analytics-SDK#293 (parent: google#190). The customer needs these specific fields visible in BigQuery before their ADK 2.0 production cutover; the full v15 contract lands incrementally.

Producer-only — no consumer-side SDK / typed-view work is needed for the customer to query the new data; they read base-table JSON directly during the mid-June window.

What lands

Group Item Sub-issue
A1/A2 every row carries attributes.adk.{schema_version, app_name} (this PR)
A3 on_event_callback rows carry attributes.adk.source_event_id — never fabricated on callback-only rows google#194
C1 attributes.adk.node = {path, run_id, parent_path} — default-empty path preserved verbatim, no synthesis google#196
C2 attributes.adk.branch (absent stays JSON null) google#197
C3 attributes.adk.scope = null | {id, kind} per google#198 / google#293 v5 derivation google#198
C4 emit AGENT_TRANSFER (from_agent = event.author, to_agent = actions.transfer_to_agent) google#200
C5 emit EVENT_COMPACTION — fractional float-epoch seconds preserved google#201
C6 emit AGENT_STATE_CHECKPOINT (both {agent_state, end_of_agent} shapes), inline payload only google#202
C7 emit TOOL_PAUSED with pause_kind (HITL-aware via _HITL_PAUSE_KIND_MAP) + function_call_id; user-message TOOL_COMPLETED with pause_kind='tool' for non-HITL; HITL function_responses stay on HITL_*_COMPLETED, never emit TOOL_COMPLETED google#199 (pair-key subset)
C8 flat-with-prefix attributes.adk.{route, render_ui_widgets, rewind_before_invocation_id} mirror google#203
D1 delete deprecated on_state_change_callback stub (this PR)

Explicitly deferred (post-mid-June, per google#293 v5)

Deferred Reason Tracked at
WORKFLOW_NODE_STARTING/COMPLETED event types Design-blocked (OTel-span vs event-observation). Workflow boundaries are partially observable via attributes.adk.node today. #207
Pause registry pause_orphan semantics / read-after-write visibility Blocks the orphan-flag contract, not the row-pair join keys. Customer can compute long-running tool durations from direct TOOL_PAUSED ↔ TOOL_COMPLETED SQL joins. google#206
OTel attributes.adk.otel_span_id Best-effort; consumer can join via A3 source_event_id ↔ ADK's span-side associated_event_ids. google#205
Oversized-state GCS offload for AGENT_STATE_CHECKPOINT Inline covers all but very large state. google#190
Full B0 (google#194) plumbing for non-on_event_callback paths Added EventData.source_event: Optional[Event] = None as a minimal step; full per-callback coverage matrix lands with google#194. google#194

Compatibility

  • AGENT_RESPONSE retains the legacy flat extras (source_event_id, source_event_author, source_event_branch) for backward compat alongside the new canonical attributes.adk.* envelope. Existing consumers continue to work.
  • EventData signature gains one optional field (source_event). No breaking change.
  • Mock-based test fixtures (the HITL suite) continue to pass — the envelope helper is defensive against missing attributes on Mock objects.

Tests

252 plugin tests pass (233 existing on fork's main + 19 new). Note: this fork's main doesn't yet have upstream's TestDropStats class (introduced in google/adk-python after this fork's last sync); that class is not in this PR.

  • Envelope shape on Event-originating and non-Event-originating rows
  • node.parent_path derivation with both empty and nested paths
  • _derive_scope for None, bare node, path/node, FC IDs, empty string
  • C4 / C5 / C6 emit paths
  • C5 fractional float-epoch precision round-trip
  • C6 both-shape coverage + id-stabilization regression guard (Event.model_post_init auto-assigns id even when the constructor omits it — _create_agent_state_event is covered)
  • C7 TOOL_PAUSED pause_kind derivation for non-HITL and HITL
  • C7 HITL non-routing: HITL function_responseHITL_*_COMPLETED only, never TOOL_COMPLETED
  • C7 user-message TOOL_COMPLETED with pause_kind='tool'
  • C8 flat-with-prefix route / rewind_before_invocation_id
  • D1: on_state_change_callback removed from the public surface
$ python3 -m pytest tests/unittests/plugins/test_bigquery_agent_analytics_plugin.py -q
252 passed, 9 warnings in 21.96s

References

Test plan

  • pytest tests/unittests/plugins/test_bigquery_agent_analytics_plugin.py — 252/252 pass
  • pyink --config pyproject.toml src/ tests/ + isort src/ tests/ — clean
  • Manual smoke against a real BigQuery dataset with an ADK 2.0 invocation that includes transfer + compaction + checkpoint + long-running pause
  • Customer pre-cutover dry run

🤖 Generated with Claude Code

caohy1988 added 2 commits June 8, 2026 00:59
Implements the customer-driven mid-June producer-only subset of the
ADK 2.0 observability work tracked in google#293 (parent google#190). The customer
needs these specific fields visible in BigQuery before their ADK 2.0
production cutover; full v15 contract (google#190) lands incrementally.

This change is producer-only. No consumer-side SDK / typed-view work
is included — the customer reads base-table JSON directly during the
mid-June window.

What lands
----------

A1/A2 — every ADK-enriched row carries the attributes.adk envelope:
  attributes.adk.schema_version (_ADK_ENVELOPE_SCHEMA_VERSION = "1")
  attributes.adk.app_name       (from InvocationContext.session.app_name)

A3 — rows from on_event_callback additionally carry:
  attributes.adk.source_event_id  (Event.id, reliable join key)
                                  Note: never fabricated — callback rows
                                  without an originating Event leave it
                                  JSON-null.

C1 — attributes.adk.node = {path, run_id, parent_path}. parent_path is
     derived from path; default-empty path (NodeInfo.path = "") is
     preserved verbatim with parent_path = null, no synthesis.

C2 — attributes.adk.branch (absent stays JSON null).

C3 — attributes.adk.scope = null | {id, kind} per google#198 / google#293 v5
     derivation order: (1) None → null, (2) name@run_id / path/name@run_id
     → node_run, (3) any other non-empty string → function_call
     (model-provided FC IDs match here), (4) empty/non-string → unknown
     with warning.

C4 — emit AGENT_TRANSFER from event.actions.transfer_to_agent. Payload
     pinned: from_agent = event.author, to_agent = the target. Verified
     against EventActions.transfer_to_agent which stores the target only.

C5 — emit EVENT_COMPACTION from event.actions.compaction. Float
     start_timestamp / end_timestamp preserved with fractional precision
     (consumer view conversion deferred).

C6 — emit AGENT_STATE_CHECKPOINT when actions.agent_state is not None
     OR actions.end_of_agent is True. Allows {agent_state: null,
     end_of_agent: true} payloads. Inline payload only; GCS offload
     for oversized state deferred.

C7 — emit TOOL_PAUSED for each event.long_running_tool_ids id, with
     attributes.pause_kind (derived from function_call NAME via
     _HITL_PAUSE_KIND_MAP — hitl_* for adk_request_*, tool otherwise)
     and attributes.function_call_id. HITL routing is unchanged:
     HITL function_responses stay on HITL_*_COMPLETED, NEVER emit
     TOOL_COMPLETED. Non-HITL function_responses arriving via
     on_user_message_callback emit TOOL_COMPLETED with pause_kind='tool'
     so the customer can pair (TOOL_PAUSED ↔ TOOL_COMPLETED) on
     (app_name, user_id, session_id, function_call_id) directly in SQL.
     Pause registry / pause_orphan semantics deferred to google#206.

C8 — attributes.adk.{route, render_ui_widgets, rewind_before_invocation_id}
     mirror EventActions, flat-with-prefix per google#203 (matches the rest
     of the attributes.adk.* envelope convention).

D1 — delete the deprecated on_state_change_callback stub (never called
     by ADK 2.0; verified no callers).

Compatibility
-------------

* AGENT_RESPONSE retains the legacy flat extras (source_event_id,
  source_event_author, source_event_branch) for backward compat. The
  canonical keys are now under attributes.adk.*.
* The HITL test fixtures use Mock events without long_running_tool_ids
  or .id; the envelope helper is defensive against missing attrs.
* No EventData / _log_event signature change. Added one optional field
  EventData.source_event: Optional[Event] = None — a minimal B0 (google#194)
  step. Callbacks that have access to the source Event pass it through;
  others leave it None (and the envelope correctly leaves A3/C1/C2/C3
  null on those rows).

Tests
-----

257 plugin tests pass (238 existing + 19 new):
* envelope shape on event-originating and non-event-originating rows
* node parent_path derivation with both empty and nested paths
* _derive_scope for None, bare node, path/node, FC IDs, empty string
* C4/C5/C6 emit paths
* C5 fractional float-epoch precision round-trip
* C6 both-shape coverage + id-stabilization regression guard
  (Event.model_post_init auto-assigns id even when constructor omits it)
* C7 TOOL_PAUSED pause_kind derivation for non-HITL and HITL
* C7 HITL non-routing: HITL function_response → HITL_*_COMPLETED only,
  NEVER TOOL_COMPLETED
* C7 user-message TOOL_COMPLETED with pause_kind='tool'
* C8 flat-with-prefix route / rewind_before_invocation_id
* D1: on_state_change_callback removed from the public surface

Refs: google#293 (v5), google#190, google#194, google#196, google#197, google#198, google#199, google#200, google#201, google#202,
Caught in review of #6: the C7 pair keys
(pause_kind, function_call_id) were being passed via
EventData.extra_attributes, which _enrich_attributes() copies at the
top of attrs *before* attrs["adk"] = _build_adk_envelope(...). That
landed them at attributes.pause_kind / attributes.function_call_id,
not attributes.adk.pause_kind / attributes.adk.function_call_id.

The customer SQL pinned in google#293 v5 acceptance #3 is:

  JSON_VALUE(attributes, '$.adk.function_call_id') = JSON_VALUE(...)

so the pair join would have returned null on every row. This commit
makes the contract match the SQL.

Changes:
* EventData gains adk_extras: dict[str, Any], a sibling of
  extra_attributes that lives INSIDE attributes.adk.
* _enrich_attributes merges adk_extras into the envelope after
  _build_adk_envelope (envelope wins on conflict — producer-derived
  identity fields like source_event_id are the source of truth).
* The two emit sites (TOOL_PAUSED in on_event_callback,
  TOOL_COMPLETED in on_user_message_callback) pass the pair keys via
  adk_extras= instead of extra_attributes=.
* The three C7 tests are updated to assert
  json.loads(row["attributes"])["adk"]["pause_kind"] etc., locking
  in the right shape this time.

Full plugin suite: 252 passed.
@caohy1988
Copy link
Copy Markdown
Owner Author

Good catch — verified and fixed.

The reviewer is right that EventData.extra_attributes lands at the top of attributes, not inside attributes.adk. _enrich_attributes copies extra_attributes first, then assigns attrs[\"adk\"] separately, so my C7 pair keys ended up at attributes.pause_kind / attributes.function_call_id — the SQL in google#293 v5 acceptance #3 (JSON_VALUE(attributes, '$.adk.function_call_id')) would return null on every row.

Fix (commit 236b790)

  • EventData.adk_extras: dict[str, Any] — a sibling of extra_attributes that lives INSIDE attributes.adk. Merged after _build_adk_envelope (envelope wins on key conflict; producer-derived identity fields like source_event_id are the source of truth).
  • Both emit sites (TOOL_PAUSED in on_event_callback, TOOL_COMPLETED in on_user_message_callback) now pass the pair keys via adk_extras= instead of extra_attributes=.
  • The three C7 tests now assert json.loads(row[\"attributes\"])[\"adk\"][\"pause_kind\"] / [\"function_call_id\"], locking in the right shape this time so a future regression can't silently break the SQL.
$ python3 -m pytest tests/unittests/plugins/test_bigquery_agent_analytics_plugin.py -q
252 passed, 9 warnings in 24.34s

@caohy1988
Copy link
Copy Markdown
Owner Author

End-to-end validation against real BigQuery + real Gemini 3.5 Flash

Ran the PR's plugin code against a fresh BigQuery dataset, calling gemini-3.5-flash on the Vertex AI global endpoint with a real LongRunningFunctionTool. Synthesized additional Events for the action triples Gemini won't trigger on a single turn (compaction, route, widgets, rewind, checkpoint shapes, HITL completions).

Result: 16/16 validations PASS (10 from the main run + 6 from a supplemental run that closed envelope-field gaps the first run didn't query).

Coverage table

Customer ask (from #190 customer comment) e2e check(s) Result
event.long_running_tool_ids C7 TOOL_PAUSED with attributes.adk.pause_kind='tool' + function_call_id='adk-58381d…' (real Gemini call); C7 non-HITL TOOL_COMPLETED with matching pair keys (fcid='fc-long-running-1'); C7 HITL → HITL_*_COMPLETED only, never TOOL_COMPLETED PASS (3)
event.node_info C1 attributes.adk.node.path = 'wf/A@1/B@2'parent_path = 'wf/A@1' PASS
actions.compaction C5 fractional float-epoch round-trip: 1717000000.125 / 1717000003.875 survive serialization PASS
actions.transfer_to_agent C4 AGENT_TRANSFER with {from_agent: 'root_supervisor', to_agent: 'specialist_subagent'} + envelope PASS
actions.end_of_agent / agent_state C6 both shapes — {state: {...}, end:false} and {state: null, end:true} PASS
actions.route / UI / rewind C8 attributes.adk.route='branch_alpha', rewind_before_invocation_id='prior-inv-abc', render_ui_widgets[0].{id='wgt-1', provider='mcp'} (flat-with-prefix per google#203) PASS (all 3)
Workflow boundaries Partial via attributes.adk.node.path (validated above). Dedicated WORKFLOW_NODE_STARTING/COMPLETED deferred to #207 (design-blocked) Partial-by-design
Envelope foundation A1 schema_version on every row (0/22 missing); A2 app_name on every row (0/22 missing); A3 source_event_id on 7/22 Event-originating rows; C2 branch passthrough across 3 distinct branches; C3 scope derivation for node_run / function_call / null PASS (8)

How the validation was done

# Phase 1 — real Gemini 3.5 Flash, global endpoint:
agent = LlmAgent(name="approval_agent", model="gemini-3.5-flash",
                 tools=[LongRunningFunctionTool(func=submit_for_human_approval)])
# Gemini called the tool → emitted long_running_tool_ids:
#   {'adk-58381d5a-ea48-4d75-972c-34f3f3c172f2'}

# Phase 2 — synthetic Events for paths Gemini won't trigger:
await plugin.on_event_callback(invocation_context=ic, event=ev_transfer)
await plugin.on_event_callback(invocation_context=ic, event=ev_compact)
await plugin.on_event_callback(invocation_context=ic, event=ev_state)
await plugin.on_event_callback(invocation_context=ic, event=ev_end)
await plugin.on_event_callback(invocation_context=ic, event=ev_actions)  # route/widgets/rewind
await plugin.on_user_message_callback(invocation_context=ic, user_message=hitl_msg)
await plugin.on_user_message_callback(invocation_context=ic, user_message=nonhitl_msg)

# Phase 3 — plugin.shutdown(), wait for Storage Write API visibility, query BQ.

Concrete proof of the C7 regression fix

The previously-fixed bug (pair keys at the wrong JSON path) was specifically validated by querying the exact SQL the customer will use:

SELECT JSON_VALUE(attributes, '$.adk.pause_kind')       AS pk,
       JSON_VALUE(attributes, '$.adk.function_call_id') AS fcid
  FROM `…agent_events`
  WHERE session_id = '' AND event_type = 'TOOL_PAUSED';
-- → [{'pk': 'tool', 'fcid': 'adk-58381d5a-ea48-4d75-972c-34f3f3c172f2'}]

Both keys resolve to non-null values — i.e. the customer's TOOL_PAUSED ↔ TOOL_COMPLETED pair-join SQL on JSON_VALUE(attributes, '$.adk.function_call_id') works end-to-end on real BigQuery rows.

Explicitly not validated (deferred per google#293 v5)

Deferred Reason Tracked at
WORKFLOW_NODE_STARTING/COMPLETED event types Design-blocked #207
Pause registry pause_orphan semantics Design-blocked google#206
Oversized-state GCS offload for AGENT_STATE_CHECKPOINT Inline-only validated google#190
OTel attributes.adk.otel_span_id Best-effort, customer joins via source_event_id google#205
Consumer SDK typed views Customer reads base-table JSON during mid-June window google#190 / google#211

Test infrastructure

  • Project: test-project-0728-467323, bqaa_e2e_validate + bqaa_e2e_gaps datasets (both torn down after the runs).
  • Model: gemini-3.5-flash on Vertex AI global endpoint (https://aiplatform.googleapis.com/v1/projects/…/locations/global/publishers/google/models/gemini-3.5-flash). The regional endpoint returns 404 for this model as of today.
  • Storage Write API: 45-second settling delay before each validation query (real eventual-visibility behavior — the customer's production queries should similarly tolerate the streaming-insert window, which is the same pause-registry visibility issue tracked in Multimodal response generation support google/adk-python#206).

@caohy1988
Copy link
Copy Markdown
Owner Author

Backward-compatibility analysis

Classification: backward-compatible additive telemetry, not a breaking change.

What is NOT changing

Surface Verification
Top-level BigQuery schema columns The row = {…} dict in _log_event is untouched. attributes was already a STRING column — we changed the JSON content inside it, not the column shape.
Existing event types Nothing removed. Only additions: AGENT_TRANSFER, EVENT_COMPACTION, AGENT_STATE_CHECKPOINT, TOOL_PAUSED. Reuses existing TOOL_COMPLETED.
AGENT_RESPONSE flat extras Deliberately kept (source_event_id / source_event_author / source_event_branch) alongside the new envelope, so existing consumer SQL keeps working.
EventData signature Only gains optional fields with safe defaults (source_event: Optional[Event] = None, adk_extras: dict = field(default_factory=dict)). Existing call sites unchanged.
ADK agent execution behavior All callbacks return None / pass-through; no event-flow mutation.
Deleted on_state_change_callback stub The plugin's own docstring previously stated "never invoked by the framework (not in BasePlugin, PluginManager, or Runner)". No ADK callers exist; removing it is documentation cleanup.

What IS observable (intentionally)

  • New rows for existing ADK runsAGENT_TRANSFER, EVENT_COMPACTION, AGENT_STATE_CHECKPOINT, TOOL_PAUSED, and long-running user-message TOOL_COMPLETED rows will land for ADK 2.0 invocations that exercise those EventActions. This is the point of the PR.
  • Every row's attributes JSON now includes attributes.adk.schema_version and attributes.adk.app_name.
  • Event-originating rows additionally include attributes.adk.source_event_id, attributes.adk.node, attributes.adk.branch, attributes.adk.scope.
  • AGENT_RESPONSE rows are double-stamped during the transition window (legacy flat extras + new envelope). When downstream has migrated to attributes.adk.*, the legacy keys can be removed in a follow-up — that would be a breaking change and should be telegraphed before it lands.

Operational nuances for the customer / analytics team

  1. Row volume — agents using long-running tools heavily will see additional TOOL_PAUSED + resume-side TOOL_COMPLETED rows. For agents that don't use long-running tools: zero change.
  2. Per-row storage cost ticks up by the size of the adk envelope (~150-300 bytes typical, more if route / render_ui_widgets / node.path are set). On 1M rows/day this is single-digit MB/day — non-issue at customer scale but worth noting for very high-volume internal deployments.
  3. attributes.adk.app_name exposure — surfaces whatever the producer set on InvocationContext.session.app_name. Same exposure surface that already existed elsewhere, but now on every row's attributes column. If any deployment has stuck sensitive identifiers in their app name, this is the moment to fix that.
  4. C7 SQL contract is JSON_VALUE(attributes, '$.adk.pause_kind') and $.adk.function_call_id — pair keys live under attributes.adk, not at top level. Anyone who wrote SQL against an earlier buggy intermediate shape (top-level) during PR iteration would find their SQL silently returning null. The final shipped contract matches automatic function calling fails on lambda generated tools google/adk-python#293 v5; no production SQL should ever have pointed at the wrong path.

Caveats — who might need to update something

  • Brittle event-type allowlists / strict row-count tests — downstream consumers (or alarm rules) that hardcode the older set of BQAA event types may need to opt in or update filters. This is the consumer-side audit tracked in upstream issue support for context caching google/adk-python#211.
  • Dashboards that aggregate by event_type — new types will show up; legend/filter UIs may need a refresh.
  • Strict JSON-schema validators on attributes — if any downstream system rejects unknown top-level keys, the new adk envelope will trip it. Almost no real-world consumer does this, but flagging.

Summary

Backward-compatible additive telemetry. The only group with action items pre-merge is anyone running brittle assertions on exact event-type sets or attribute-JSON shape — and that group is tracked separately under google#211.

@caohy1988
Copy link
Copy Markdown
Owner Author

Downstream SQL recipes for the new ADK 2.0 events

Every query below targets the BQAA agent_events table. Replace the placeholders:

DECLARE _PROJECT  STRING DEFAULT 'your-project';
DECLARE _DATASET  STRING DEFAULT 'your_dataset';
DECLARE _TABLE    STRING DEFAULT 'agent_events';
DECLARE _SINCE    TIMESTAMP DEFAULT TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY);

Or inline a fully-qualified table reference like `proj.ds.agent_events` and a hard-coded WHERE timestamp >= ... clause.


1. Envelope health checks (A1 / A2 / A3)

1.1 — Confirm attributes.adk.schema_version + app_name stamp every row

SELECT
  COUNT(*)                                                                AS total,
  COUNTIF(JSON_VALUE(attributes, '$.adk.schema_version') IS NULL)         AS missing_schema_version,
  COUNTIF(JSON_VALUE(attributes, '$.adk.app_name')       IS NULL)         AS missing_app_name,
  COUNT(DISTINCT JSON_VALUE(attributes, '$.adk.schema_version'))          AS distinct_schema_versions
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY);

missing_schema_version and missing_app_name should both be 0 on rows produced by an ADK 2.0-aware plugin. If distinct_schema_versions > 1, you have a mixed-producer fleet — useful for tracking a roll-out.

1.2 — Event-originating vs callback-only row split (A3 source_event_id presence)

SELECT
  event_type,
  COUNTIF(JSON_VALUE(attributes, '$.adk.source_event_id') IS NOT NULL) AS event_originating,
  COUNTIF(JSON_VALUE(attributes, '$.adk.source_event_id') IS NULL)     AS callback_only,
  COUNT(*) AS total
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY event_type
ORDER BY total DESC;

Helps surface which event types are produced from non-on_event_callback paths (those will show callback_only > 0).


2. Workflow DAG via attributes.adk.node (C1)

2.1 — Node hierarchy join via parent_path

SELECT
  invocation_id,
  JSON_VALUE(attributes, '$.adk.node.path')        AS node_path,
  JSON_VALUE(attributes, '$.adk.node.parent_path') AS parent_path,
  JSON_VALUE(attributes, '$.adk.node.run_id')      AS run_id,
  COUNT(*)                                         AS events_at_node
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND JSON_VALUE(attributes, '$.adk.node.path') IS NOT NULL
  AND JSON_VALUE(attributes, '$.adk.node.path') != ''
GROUP BY 1, 2, 3, 4
ORDER BY invocation_id, node_path;

parent_path IS the workflow DAG join key — every node row joins back to its parent via JSON_VALUE(attributes, '$.adk.node.path') = JSON_VALUE(parent.attributes, '$.adk.node.parent_path') within the same invocation_id.

2.2 — Workflow node fan-out per invocation

SELECT
  invocation_id,
  COUNT(DISTINCT JSON_VALUE(attributes, '$.adk.node.path')) AS distinct_node_paths,
  COUNT(*)                                                  AS total_events
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND JSON_VALUE(attributes, '$.adk.node.path') IS NOT NULL
  AND JSON_VALUE(attributes, '$.adk.node.path') != ''
GROUP BY invocation_id
HAVING distinct_node_paths > 1
ORDER BY distinct_node_paths DESC
LIMIT 50;

3. Multi-agent transfer chains (C4)

3.1 — Linear transfer chain per invocation

SELECT
  JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
  user_id,
  session_id,
  invocation_id,
  ARRAY_AGG(
    STRUCT(
      JSON_VALUE(content, '$.from_agent') AS from_agent,
      JSON_VALUE(content, '$.to_agent')   AS to_agent,
      timestamp
    )
    ORDER BY timestamp
  ) AS transfer_chain
FROM `proj.ds.agent_events`
WHERE event_type = 'AGENT_TRANSFER'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY 1, 2, 3, 4
ORDER BY ARRAY_LENGTH(transfer_chain) DESC
LIMIT 50;

3.2 — Top transfer pairs (which agents hand off to which)

SELECT
  JSON_VALUE(content, '$.from_agent') AS from_agent,
  JSON_VALUE(content, '$.to_agent')   AS to_agent,
  COUNT(*) AS transfers
FROM `proj.ds.agent_events`
WHERE event_type = 'AGENT_TRANSFER'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY 1, 2
ORDER BY transfers DESC
LIMIT 25;

4. Event compaction windows (C5)

4.1 — Active compactions with fractional-precision timestamps

SELECT
  invocation_id,
  -- The producer preserves float-epoch fractional precision.
  -- Use TIMESTAMP_MICROS(*1e6) to widen safely without losing precision.
  TIMESTAMP_MICROS(CAST(CAST(JSON_VALUE(content, '$.start_timestamp') AS FLOAT64) * 1000000 AS INT64)) AS window_start,
  TIMESTAMP_MICROS(CAST(CAST(JSON_VALUE(content, '$.end_timestamp')   AS FLOAT64) * 1000000 AS INT64)) AS window_end,
  CAST(JSON_VALUE(content, '$.end_timestamp') AS FLOAT64) - CAST(JSON_VALUE(content, '$.start_timestamp') AS FLOAT64) AS window_seconds
FROM `proj.ds.agent_events`
WHERE event_type = 'EVENT_COMPACTION'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
ORDER BY window_seconds DESC
LIMIT 50;

Use TIMESTAMP_MICROS(... * 1000000) instead of TIMESTAMP_SECONDS — the latter truncates fractional windows.


5. Agent-state checkpoints (C6)

5.1 — Checkpoint stream (both shapes side-by-side)

SELECT
  JSON_VALUE(attributes, '$.adk.app_name')          AS app_name,
  session_id,
  invocation_id,
  agent,
  timestamp,
  JSON_QUERY(content, '$.agent_state')              AS agent_state,
  CAST(JSON_VALUE(content, '$.end_of_agent') AS BOOL) AS end_of_agent,
  CASE
    WHEN JSON_VALUE(content, '$.agent_state') IS NULL
         AND CAST(JSON_VALUE(content, '$.end_of_agent') AS BOOL) = TRUE
      THEN 'end_only'
    WHEN JSON_VALUE(content, '$.agent_state') IS NOT NULL
         AND CAST(JSON_VALUE(content, '$.end_of_agent') AS BOOL) = FALSE
      THEN 'state_only'
    ELSE 'both'
  END AS shape
FROM `proj.ds.agent_events`
WHERE event_type = 'AGENT_STATE_CHECKPOINT'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
ORDER BY timestamp DESC
LIMIT 100;

5.2 — Checkpoint frequency per agent

SELECT
  agent,
  COUNTIF(JSON_VALUE(content, '$.agent_state') IS NOT NULL) AS state_checkpoints,
  COUNTIF(CAST(JSON_VALUE(content, '$.end_of_agent') AS BOOL) = TRUE) AS end_of_agent_signals,
  COUNT(*) AS total
FROM `proj.ds.agent_events`
WHERE event_type = 'AGENT_STATE_CHECKPOINT'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY agent
ORDER BY total DESC;

6. Long-running tool durations (C7)

6.1 — TOOL_PAUSED ↔ TOOL_COMPLETED pair-join with duration

This is the customer's headline query — the one that powers "how long are my long-running tools actually pausing for?".

WITH paused AS (
  SELECT
    JSON_VALUE(attributes, '$.adk.app_name')        AS app_name,
    user_id, session_id,
    JSON_VALUE(attributes, '$.adk.function_call_id') AS function_call_id,
    JSON_VALUE(content,    '$.tool')                AS tool,
    timestamp                                       AS pause_ts
  FROM `proj.ds.agent_events`
  WHERE event_type = 'TOOL_PAUSED'
    AND JSON_VALUE(attributes, '$.adk.pause_kind') = 'tool'
    AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
),
completed AS (
  SELECT
    JSON_VALUE(attributes, '$.adk.app_name')        AS app_name,
    user_id, session_id,
    JSON_VALUE(attributes, '$.adk.function_call_id') AS function_call_id,
    JSON_VALUE(content,    '$.tool')                AS tool,
    timestamp                                       AS complete_ts
  FROM `proj.ds.agent_events`
  WHERE event_type = 'TOOL_COMPLETED'
    AND JSON_VALUE(attributes, '$.adk.pause_kind') = 'tool'
    AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
)
SELECT
  p.app_name, p.user_id, p.session_id, p.function_call_id, p.tool,
  p.pause_ts, c.complete_ts,
  TIMESTAMP_DIFF(c.complete_ts, p.pause_ts, SECOND) AS pause_seconds
FROM paused p
JOIN completed c
  USING (app_name, user_id, session_id, function_call_id)
ORDER BY pause_seconds DESC
LIMIT 100;

Two-event-type CTEs avoid the dedupe gotcha called out in google#190 v15 (a single-partition ROW_NUMBER across both event types can drop the matching completion under a duplicate paused row).

6.2 — Orphan paused rows (no resume observed)

WITH paused AS (
  SELECT
    JSON_VALUE(attributes, '$.adk.app_name')        AS app_name,
    user_id, session_id,
    JSON_VALUE(attributes, '$.adk.function_call_id') AS function_call_id,
    timestamp                                       AS pause_ts
  FROM `proj.ds.agent_events`
  WHERE event_type = 'TOOL_PAUSED'
    AND JSON_VALUE(attributes, '$.adk.pause_kind') = 'tool'
    AND timestamp BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
                      AND TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
),
completed_ids AS (
  SELECT DISTINCT
    JSON_VALUE(attributes, '$.adk.app_name')        AS app_name,
    user_id, session_id,
    JSON_VALUE(attributes, '$.adk.function_call_id') AS function_call_id
  FROM `proj.ds.agent_events`
  WHERE event_type = 'TOOL_COMPLETED'
    AND JSON_VALUE(attributes, '$.adk.pause_kind') = 'tool'
)
SELECT *
FROM paused
WHERE NOT EXISTS (
  SELECT 1 FROM completed_ids c
  WHERE c.app_name        = paused.app_name
    AND c.user_id         = paused.user_id
    AND c.session_id      = paused.session_id
    AND c.function_call_id = paused.function_call_id
)
ORDER BY pause_ts DESC
LIMIT 50;

The BETWEEN ... AND TIMESTAMP_SUB(... INTERVAL 1 HOUR) window is the manual "settling delay" for the eventual-visibility of the Storage Write API — adjust to your tolerance. Per google#293 v5 + google#297, the proper pause_orphan flag lives on the producer side and is tracked by google#206.


7. HITL stream — separate from long-running tools (C7 routing)

7.1 — HITL request/completion pairs (and confirm they do NOT show up in TOOL_COMPLETED)

SELECT
  user_id, session_id, invocation_id,
  COUNTIF(event_type = 'HITL_CONFIRMATION_REQUEST')           AS conf_request,
  COUNTIF(event_type = 'HITL_CONFIRMATION_REQUEST_COMPLETED') AS conf_completed,
  COUNTIF(event_type = 'HITL_CREDENTIAL_REQUEST')             AS cred_request,
  COUNTIF(event_type = 'HITL_CREDENTIAL_REQUEST_COMPLETED')   AS cred_completed,
  COUNTIF(event_type = 'HITL_INPUT_REQUEST')                  AS inp_request,
  COUNTIF(event_type = 'HITL_INPUT_REQUEST_COMPLETED')        AS inp_completed
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND event_type LIKE 'HITL_%'
GROUP BY 1, 2, 3
ORDER BY (conf_request + cred_request + inp_request) DESC
LIMIT 50;

7.2 — Sanity check: HITL completions never land in TOOL_COMPLETED

-- Expected result: 0 rows. If non-zero, the HITL non-routing contract is broken.
SELECT
  JSON_VALUE(content, '$.tool') AS tool, COUNT(*) AS n
FROM `proj.ds.agent_events`
WHERE event_type = 'TOOL_COMPLETED'
  AND JSON_VALUE(content, '$.tool') IN ('adk_request_confirmation', 'adk_request_credential', 'adk_request_input')
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY 1;

8. Action attributes — route / widgets / rewind (C8)

8.1 — Routing-decision histogram

SELECT
  JSON_VALUE(attributes, '$.adk.route') AS route_value,
  COUNT(*) AS occurrences,
  COUNT(DISTINCT invocation_id) AS invocations
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND JSON_VALUE(attributes, '$.adk.route') IS NOT NULL
GROUP BY 1
ORDER BY occurrences DESC;

8.2 — Rewind requests

SELECT
  invocation_id                                                          AS rewinding_invocation_id,
  JSON_VALUE(attributes, '$.adk.rewind_before_invocation_id')            AS rewinding_to,
  agent,
  timestamp
FROM `proj.ds.agent_events`
WHERE JSON_VALUE(attributes, '$.adk.rewind_before_invocation_id') IS NOT NULL
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
ORDER BY timestamp DESC
LIMIT 50;

8.3 — UI-widget render frequency

SELECT
  invocation_id, agent,
  ARRAY_LENGTH(JSON_QUERY_ARRAY(attributes, '$.adk.render_ui_widgets')) AS widget_count,
  JSON_VALUE(attributes, '$.adk.render_ui_widgets[0].provider')         AS first_widget_provider
FROM `proj.ds.agent_events`
WHERE JSON_QUERY(attributes, '$.adk.render_ui_widgets') IS NOT NULL
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
ORDER BY widget_count DESC
LIMIT 50;

9. Scope breakdown (C3) — node-loop vs function-call fan-out

SELECT
  JSON_VALUE(attributes, '$.adk.scope.kind')          AS scope_kind,
  COUNT(*)                                            AS rows,
  COUNT(DISTINCT JSON_VALUE(attributes, '$.adk.scope.id')) AS distinct_scope_ids,
  COUNT(DISTINCT JSON_VALUE(attributes, '$.adk.source_event_id')) AS distinct_source_events
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND JSON_VALUE(attributes, '$.adk.source_event_id') IS NOT NULL
GROUP BY scope_kind
ORDER BY rows DESC;

scope_kind values: node_run (loop iteration), function_call (model-/ADK-issued tool id), unknown (anomaly with warning log in the producer). NULL = unscoped (isolation_scope is None).


10. Branch fan-out (C2)

SELECT
  invocation_id,
  COUNT(DISTINCT JSON_VALUE(attributes, '$.adk.branch')) AS distinct_branches,
  ARRAY_AGG(DISTINCT JSON_VALUE(attributes, '$.adk.branch') IGNORE NULLS LIMIT 10) AS branches
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND JSON_VALUE(attributes, '$.adk.source_event_id') IS NOT NULL
GROUP BY invocation_id
HAVING distinct_branches > 1
ORDER BY distinct_branches DESC
LIMIT 50;

ADK 1.x impact

Short answer: zero impact on users staying on ADK 1.x; additive observability for users upgrading from 1.x to 2.x.

If you stay on ADK 1.x

No impact at all. This PR targets main (ADK 2.x), not v1. The v1 branch's bigquery_agent_analytics_plugin.py is not modified by this change. ADK 1.x users keep their existing 16 event types and existing attribute shape.

If you upgrade from ADK 1.x to ADK 2.x (where this PR lands)

The upgrade is to ADK 2.x, which gets you this PR's plugin behavior automatically. Verified against the v1 branch source:

Field used by this PR Present on v1 EventActions? Effect after upgrade
actions.transfer_to_agent Yes (it's the original 1.x transfer mechanism) Existing 1.x-shaped agents that transfer now emit a new AGENT_TRANSFER row per transfer
actions.compaction Yes (v1 EventActions.compaction: Optional[EventCompaction]) 1.x agents that triggered compaction now emit EVENT_COMPACTION rows
actions.agent_state / actions.end_of_agent Yes (both fields exist on v1 EventActions) 1.x agents that set either now emit AGENT_STATE_CHECKPOINT rows
event.long_running_tool_ids Yes (v1 event.py: long_running_tool_ids: Optional[set[str]] = None) 1.x agents with long-running tools now emit TOOL_PAUSED rows, and user-message-side resumes emit non-HITL TOOL_COMPLETED rows with the pair keys
actions.rewind_before_invocation_id Yes 1.x agents that requested a rewind now stamp attributes.adk.rewind_before_invocation_id
actions.render_ui_widgets Yes Stamped under attributes.adk.render_ui_widgets when present
actions.route Noroute is ADK-2.0-only on EventActions 1.x agents never produce a non-null attributes.adk.route; null is the correct/expected value

Key observation: even though most of these EventActions fields existed in 1.x, the 1.x BQAA plugin never emitted the new event types (AGENT_TRANSFER, EVENT_COMPACTION, AGENT_STATE_CHECKPOINT, TOOL_PAUSED) — they were silently ignored. The data was already in the agent execution; it just wasn't logged.

So for a 1.x → 2.x upgrade:

  • All existing event types continue to be emitted with the same content shape. Old SQL keeps working.
  • The new event types and the attributes.adk.* envelope are additive. New SQL gets new observability into actions that were already happening.
  • No agent-behavior change. The plugin is observation-only; it doesn't modify the agent's event flow.
  • Brittle assertions need a refresh. If your downstream has a hardcoded allowlist of event types, or a strict row-count test, or strict JSON-schema validation on the attributes column, you'll need to update those to permit the new types and the adk envelope. This is the same caveat already captured in the BC analysis above.

Quick "am I about to break?" check for a 1.x → 2.x upgrade

Run before the upgrade against the old 1.x data:

SELECT event_type, COUNT(*) AS n
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
GROUP BY event_type
ORDER BY n DESC;

Run after the upgrade. If AGENT_TRANSFER, EVENT_COMPACTION, AGENT_STATE_CHECKPOINT, or TOOL_PAUSED show up where they didn't before — that's expected (your agent was always taking those actions; you can now see them).

@caohy1988
Copy link
Copy Markdown
Owner Author

Fresh review of the SQL-library / ADK 1.x impact comment. The structure is useful and the 1.x compatibility framing checks out against origin/v1, but I found two SQL correctness issues and one robustness gap worth fixing before people copy these recipes.

Findings

  1. Checkpoint queries misclassify object-valued agent_state because they use JSON_VALUE.

    In section 5, both the shape discriminator and frequency query use:

    JSON_VALUE(content, '$.agent_state')

    But agent_state is a JSON object for normal checkpoint rows. BigQuery JSON_VALUE returns SQL NULL for non-scalar values (objects/arrays), per the official JSON function contract: https://cloud.google.com/bigquery/docs/reference/standard-sql/json_functions#json_value

    Consequence:

    • state-only rows like {"agent_state": {"step": 3}, "end_of_agent": false} are not detected as state_only;
    • state_checkpoints in 5.2 undercounts object-valued checkpoints, often to zero.

    Fix: use JSON_QUERY(content, '$.agent_state') for presence checks and output. Example:

    CASE
      WHEN JSON_QUERY(content, '$.agent_state') IS NULL
           AND SAFE_CAST(JSON_VALUE(content, '$.end_of_agent') AS BOOL) = TRUE
        THEN 'end_only'
      WHEN JSON_QUERY(content, '$.agent_state') IS NOT NULL
           AND COALESCE(SAFE_CAST(JSON_VALUE(content, '$.end_of_agent') AS BOOL), FALSE) = FALSE
        THEN 'state_only'
      ELSE 'both'
    END AS shape

    And in 5.2:

    COUNTIF(JSON_QUERY(content, '$.agent_state') IS NOT NULL) AS state_checkpoints
  2. Invocation/DAG queries should carry the full telemetry identity, not only invocation_id.

    Several recipes group or order at invocation grain using only invocation_id:

    • 2.1 / 2.2 workflow DAG;
    • 4.1 compaction list;
    • 7.1 HITL stream grouping has user_id/session_id/invocation_id but omits app_name;
    • 10 branch fan-out.

    In a shared agent_events table, invocation_id is not the whole telemetry identity. The post-automatic function calling fails on lambda generated tools google/adk-python#293 contract deliberately stamps attributes.adk.app_name on every row, while user_id and session_id are top-level columns. Use (app_name, user_id, session_id, invocation_id) anywhere the query describes an invocation, DAG, branch fan-out, HITL pair stream, or cross-event grouping.

    Example for section 10:

    SELECT
      JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
      user_id,
      session_id,
      invocation_id,
      COUNT(DISTINCT JSON_VALUE(attributes, '$.adk.branch')) AS distinct_branches,
      ARRAY_AGG(DISTINCT JSON_VALUE(attributes, '$.adk.branch') IGNORE NULLS LIMIT 10) AS branches
    FROM `proj.ds.agent_events`
    WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
      AND JSON_VALUE(attributes, '$.adk.source_event_id') IS NOT NULL
    GROUP BY 1, 2, 3, 4
    HAVING distinct_branches > 1
    ORDER BY distinct_branches DESC
    LIMIT 50;

    The long-running duration join in section 6 is intentionally different: omitting invocation_id there is correct because resume can happen in a later invocation. Keep its full key as (app_name, user_id, session_id, function_call_id).

  3. Section 6 says the two-CTE shape dodges the dedupe gotcha, but it does not dedupe duplicates.

    Splitting TOOL_PAUSED and TOOL_COMPLETED into separate CTEs avoids the old "one ROW_NUMBER() partition across both event types drops the completion" bug. It does not protect the ad-hoc query from duplicate paused rows or duplicate completed rows; the join can still multiply durations.

    If this library is meant to be copy/paste operational SQL, add per-stream dedupe in each CTE:

    paused AS (
      SELECT * EXCEPT(rn)
      FROM (
        SELECT ..., ROW_NUMBER() OVER (
          PARTITION BY app_name, user_id, session_id, function_call_id, timestamp
          ORDER BY timestamp
        ) AS rn
        FROM ...
      )
      WHERE rn = 1
    )

    Same pattern for completed. If the goal is a minimal readable recipe, weaken the prose to "two event-type CTEs avoid the single-partition dedupe bug; production views should still dedupe per stream."

Verified Good

  • C7 pair keys are now actually under attributes.adk in the PR branch:
    • TOOL_COMPLETED: src/google/adk/plugins/bigquery_agent_analytics_plugin.py:3152-3157
    • TOOL_PAUSED: src/google/adk/plugins/bigquery_agent_analytics_plugin.py:3319-3325
  • The 1.x impact table is directionally correct against origin/v1:
    • transfer_to_agent, compaction, end_of_agent, agent_state, rewind_before_invocation_id, render_ui_widgets exist on v1 EventActions.
    • route is absent from v1 EventActions.
    • long_running_tool_ids exists on v1 Event.

After the three SQL edits above, the library is in good shape as the customer-facing query starter set.

@caohy1988
Copy link
Copy Markdown
Owner Author

SQL library v2 — three corrections

All three findings verified before posting:

# Verified Fix
1 Live bq query against BQ: JSON_VALUE(j, "$.obj") returns NULL; JSON_QUERY(j, "$.obj") returns the object Section 5 rewritten with JSON_QUERY
2 post-google#293 contract stamps attributes.adk.app_name on every row; in a shared agent_events table, invocation_id alone is not unique Sections 2, 4, 7, 10 re-grouped on (app_name, user_id, session_id, invocation_id)
3 Two-CTE shape avoids the cross-event-type partition bug but does not dedupe duplicates within either stream Section 6 gains per-stream dedupe instead of weakening the prose

The long-running duration join in Section 6 intentionally omits invocation_id from the join key — resume can happen in a later invocation — and the key stays (app_name, user_id, session_id, function_call_id). That part was correct in v1.

Below: the three corrected sections inline. Everything else in the v1 library stands.


Section 5 (corrected) — Agent-state checkpoints

5.1 — Checkpoint stream with correct shape discriminator

SELECT
  JSON_VALUE(attributes, '$.adk.app_name')                 AS app_name,
  user_id,
  session_id,
  invocation_id,
  agent,
  timestamp,
  JSON_QUERY(content, '$.agent_state')                    AS agent_state,
  SAFE_CAST(JSON_VALUE(content, '$.end_of_agent') AS BOOL) AS end_of_agent,
  CASE
    WHEN JSON_QUERY(content, '$.agent_state') IS NULL
         AND SAFE_CAST(JSON_VALUE(content, '$.end_of_agent') AS BOOL) = TRUE
      THEN 'end_only'
    WHEN JSON_QUERY(content, '$.agent_state') IS NOT NULL
         AND COALESCE(SAFE_CAST(JSON_VALUE(content, '$.end_of_agent') AS BOOL), FALSE) = FALSE
      THEN 'state_only'
    ELSE 'both'
  END AS shape
FROM `proj.ds.agent_events`
WHERE event_type = 'AGENT_STATE_CHECKPOINT'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
ORDER BY timestamp DESC
LIMIT 100;

5.2 — Checkpoint frequency per agent (uses JSON_QUERY for presence)

SELECT
  JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
  agent,
  COUNTIF(JSON_QUERY(content, '$.agent_state') IS NOT NULL)               AS state_checkpoints,
  COUNTIF(SAFE_CAST(JSON_VALUE(content, '$.end_of_agent') AS BOOL) = TRUE) AS end_of_agent_signals,
  COUNT(*)                                                                AS total
FROM `proj.ds.agent_events`
WHERE event_type = 'AGENT_STATE_CHECKPOINT'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY 1, 2
ORDER BY total DESC;

Sections 2 / 4 / 7 / 10 (corrected) — full telemetry identity in grouping

2.1 — Workflow DAG join

SELECT
  JSON_VALUE(attributes, '$.adk.app_name')         AS app_name,
  user_id,
  session_id,
  invocation_id,
  JSON_VALUE(attributes, '$.adk.node.path')        AS node_path,
  JSON_VALUE(attributes, '$.adk.node.parent_path') AS parent_path,
  JSON_VALUE(attributes, '$.adk.node.run_id')      AS run_id,
  COUNT(*)                                         AS events_at_node
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND JSON_VALUE(attributes, '$.adk.node.path') IS NOT NULL
  AND JSON_VALUE(attributes, '$.adk.node.path') != ''
GROUP BY 1, 2, 3, 4, 5, 6, 7
ORDER BY app_name, user_id, session_id, invocation_id, node_path;

2.2 — Workflow node fan-out per invocation

SELECT
  JSON_VALUE(attributes, '$.adk.app_name')                  AS app_name,
  user_id,
  session_id,
  invocation_id,
  COUNT(DISTINCT JSON_VALUE(attributes, '$.adk.node.path')) AS distinct_node_paths,
  COUNT(*)                                                  AS total_events
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND JSON_VALUE(attributes, '$.adk.node.path') IS NOT NULL
  AND JSON_VALUE(attributes, '$.adk.node.path') != ''
GROUP BY 1, 2, 3, 4
HAVING distinct_node_paths > 1
ORDER BY distinct_node_paths DESC
LIMIT 50;

4.1 — Compaction list with full identity

SELECT
  JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
  user_id,
  session_id,
  invocation_id,
  TIMESTAMP_MICROS(CAST(CAST(JSON_VALUE(content, '$.start_timestamp') AS FLOAT64) * 1000000 AS INT64)) AS window_start,
  TIMESTAMP_MICROS(CAST(CAST(JSON_VALUE(content, '$.end_timestamp')   AS FLOAT64) * 1000000 AS INT64)) AS window_end,
  CAST(JSON_VALUE(content, '$.end_timestamp') AS FLOAT64)
    - CAST(JSON_VALUE(content, '$.start_timestamp') AS FLOAT64) AS window_seconds
FROM `proj.ds.agent_events`
WHERE event_type = 'EVENT_COMPACTION'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
ORDER BY window_seconds DESC
LIMIT 50;

7.1 — HITL pair stream with app_name in the grouping

SELECT
  JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
  user_id,
  session_id,
  invocation_id,
  COUNTIF(event_type = 'HITL_CONFIRMATION_REQUEST')           AS conf_request,
  COUNTIF(event_type = 'HITL_CONFIRMATION_REQUEST_COMPLETED') AS conf_completed,
  COUNTIF(event_type = 'HITL_CREDENTIAL_REQUEST')             AS cred_request,
  COUNTIF(event_type = 'HITL_CREDENTIAL_REQUEST_COMPLETED')   AS cred_completed,
  COUNTIF(event_type = 'HITL_INPUT_REQUEST')                  AS inp_request,
  COUNTIF(event_type = 'HITL_INPUT_REQUEST_COMPLETED')        AS inp_completed
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND event_type LIKE 'HITL_%'
GROUP BY 1, 2, 3, 4
ORDER BY (conf_request + cred_request + inp_request) DESC
LIMIT 50;

10 — Branch fan-out with full identity

SELECT
  JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
  user_id,
  session_id,
  invocation_id,
  COUNT(DISTINCT JSON_VALUE(attributes, '$.adk.branch'))         AS distinct_branches,
  ARRAY_AGG(DISTINCT JSON_VALUE(attributes, '$.adk.branch') IGNORE NULLS LIMIT 10) AS branches
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND JSON_VALUE(attributes, '$.adk.source_event_id') IS NOT NULL
GROUP BY 1, 2, 3, 4
HAVING distinct_branches > 1
ORDER BY distinct_branches DESC
LIMIT 50;

Section 6.1 (corrected) — per-stream dedupe added

The two-CTE shape avoids the single-partition cross-event-type bug. The added ROW_NUMBER() per-stream dedupes legitimate duplicate rows (which can arise from network retries on the Storage Write API path or from upstream replay). Without it, the join can multiply durations.

WITH paused_dedup AS (
  SELECT * EXCEPT(rn)
  FROM (
    SELECT
      JSON_VALUE(attributes, '$.adk.app_name')        AS app_name,
      user_id, session_id,
      JSON_VALUE(attributes, '$.adk.function_call_id') AS function_call_id,
      JSON_VALUE(content,    '$.tool')                AS tool,
      timestamp                                       AS pause_ts,
      ROW_NUMBER() OVER (
        PARTITION BY
          JSON_VALUE(attributes, '$.adk.app_name'),
          user_id, session_id,
          JSON_VALUE(attributes, '$.adk.function_call_id')
        ORDER BY timestamp
      ) AS rn
    FROM `proj.ds.agent_events`
    WHERE event_type = 'TOOL_PAUSED'
      AND JSON_VALUE(attributes, '$.adk.pause_kind') = 'tool'
      AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
  )
  WHERE rn = 1
),
completed_dedup AS (
  SELECT * EXCEPT(rn)
  FROM (
    SELECT
      JSON_VALUE(attributes, '$.adk.app_name')        AS app_name,
      user_id, session_id,
      JSON_VALUE(attributes, '$.adk.function_call_id') AS function_call_id,
      JSON_VALUE(content,    '$.tool')                AS tool,
      timestamp                                       AS complete_ts,
      ROW_NUMBER() OVER (
        PARTITION BY
          JSON_VALUE(attributes, '$.adk.app_name'),
          user_id, session_id,
          JSON_VALUE(attributes, '$.adk.function_call_id')
        ORDER BY timestamp
      ) AS rn
    FROM `proj.ds.agent_events`
    WHERE event_type = 'TOOL_COMPLETED'
      AND JSON_VALUE(attributes, '$.adk.pause_kind') = 'tool'
      AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
  )
  WHERE rn = 1
)
SELECT
  p.app_name, p.user_id, p.session_id, p.function_call_id, p.tool,
  p.pause_ts, c.complete_ts,
  TIMESTAMP_DIFF(c.complete_ts, p.pause_ts, SECOND) AS pause_seconds
FROM paused_dedup p
JOIN completed_dedup c
  USING (app_name, user_id, session_id, function_call_id)
ORDER BY pause_seconds DESC
LIMIT 100;

Why this layout vs a single combined partition: putting event_type into a single ROW_NUMBER partition spanning both rows is the original bug (a duplicate paused row can drop the matching completion's rn=1). Splitting into two CTEs and deduping within each is what's actually correct. Sections 6.2 / the rest stand as-is.


v1 sections that need no change

1 (envelope health) / 3 (transfer chains) / 8 (action attributes) / 9 (scope breakdown) — already correct as posted. The 1.x impact analysis stands.

@caohy1988
Copy link
Copy Markdown
Owner Author

Fresh review of SQL library v2. The three corrections are materially right: Section 5 now uses the right JSON function for object state, Sections 2/4/7/10 carry full invocation identity, and Section 6.1 now dedupes per stream. I found two remaining follow-ups worth tightening before this becomes customer copy/paste material.

Findings

  1. Section 6.1 should be forward-compatible with Multimodal response generation support google/adk-python#206 by excluding future pause_orphan = true completions from the healthy pair branch.

    The current PR does not emit attributes.adk.pause_orphan, so the query works today. But Database UniqueViolation on events table during streaming with multi-part LLM responses (text + function call) google/adk-python#297/Multimodal response generation support google/adk-python#206 explicitly makes pause_orphan the next producer-side long-running-tool field. Once that lands, an orphan-tagged TOOL_COMPLETED could still join a visible TOOL_PAUSED row later and appear as a healthy duration unless the completed CTE excludes it.

    Add this predicate to completed_dedup now; it is null-safe for current automatic function calling fails on lambda generated tools google/adk-python#293 rows:

    AND COALESCE(SAFE_CAST(JSON_VALUE(attributes, '$.adk.pause_orphan') AS BOOL), FALSE) = FALSE

    Optional guard in the final join:

    AND c.complete_ts >= p.pause_ts

    That keeps this recipe aligned with the planned Multimodal response generation support google/adk-python#206 semantics without changing current behavior.

  2. The shared-table identity rule should also be explicit for Section 3.2 and the Section 8 rollups, or those should be labeled fleet-wide.

    v2 correctly fixed invocation-grain queries to include (app_name, user_id, session_id, invocation_id). But the same shared-table concern can still affect global histograms:

    • Section 3.2 groups top transfer pairs only by from_agent, to_agent.
    • Section 8.1 groups routes only by route_value.
    • Section 8.2 / 8.3 list action rows without app_name, user_id, session_id.

    If the intent is fleet-wide aggregation, say so in the heading. If the intent is operator debugging in a shared agent_events table, include at least app_name, and include user_id/session_id/invocation_id in list-style rows.

    Suggested Section 3.2 shape:

    SELECT
      JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
      JSON_VALUE(content, '$.from_agent') AS from_agent,
      JSON_VALUE(content, '$.to_agent')   AS to_agent,
      COUNT(*) AS transfers
    FROM `proj.ds.agent_events`
    WHERE event_type = 'AGENT_TRANSFER'
      AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
    GROUP BY 1, 2, 3
    ORDER BY transfers DESC
    LIMIT 25;

Verified Good

  • Section 5's JSON_QUERY(content, '$.agent_state') matches the producer payload: AGENT_STATE_CHECKPOINT writes object-or-null content.agent_state at src/google/adk/plugins/bigquery_agent_analytics_plugin.py:3242-3267.
  • C7 pair-key paths match the PR branch:
    • TOOL_COMPLETED writes attributes.adk.{pause_kind,function_call_id} at src/google/adk/plugins/bigquery_agent_analytics_plugin.py:3152-3157.
    • TOOL_PAUSED writes the same keys at src/google/adk/plugins/bigquery_agent_analytics_plugin.py:3319-3325.
  • The 1.x impact section still checks out against origin/v1: the listed EventActions fields exist there except route, and long_running_tool_ids exists on v1 Event.

No blocker to the SQL library direction. I would make the two edits above so the recipes do not drift the moment google#206 lands and so shared-table users do not accidentally merge unrelated apps in their rollups.

@caohy1988
Copy link
Copy Markdown
Owner Author

SQL library v3 — two final follow-ups

Both findings are real:

# Reason Fix
1 google#293 doesn't emit pause_orphan yet, but google#297/google#206 makes it the next producer field on long-running TOOL_COMPLETED. Once it lands, an orphan-tagged completion could still join a later-visible TOOL_PAUSED and look like a healthy duration unless explicitly excluded. The null-safe predicate is no-op against current rows. Section 6.1 completed_dedup CTE gains the pause_orphan = false filter (null-safe) and the join gains c.complete_ts >= p.pause_ts.
2 Sections 3.2 / 8.1 / 8.2 / 8.3 omitted app_name from groupings. For shared agent_events tables this silently merges unrelated apps. Include attributes.adk.app_name in group keys (3.2 / 8.1) and in row lists (8.2 / 8.3). Each section now operates per-app by default.

Section 6.1's join-time guard c.complete_ts >= p.pause_ts is small but blocks a class of bad pairs that can arise from clock skew or replay (pause_seconds going negative).


Section 6.1 (v3) — pause_orphan-safe + timestamp-ordered join

WITH paused_dedup AS (
  SELECT * EXCEPT(rn)
  FROM (
    SELECT
      JSON_VALUE(attributes, '$.adk.app_name')        AS app_name,
      user_id, session_id,
      JSON_VALUE(attributes, '$.adk.function_call_id') AS function_call_id,
      JSON_VALUE(content,    '$.tool')                AS tool,
      timestamp                                       AS pause_ts,
      ROW_NUMBER() OVER (
        PARTITION BY
          JSON_VALUE(attributes, '$.adk.app_name'),
          user_id, session_id,
          JSON_VALUE(attributes, '$.adk.function_call_id')
        ORDER BY timestamp
      ) AS rn
    FROM `proj.ds.agent_events`
    WHERE event_type = 'TOOL_PAUSED'
      AND JSON_VALUE(attributes, '$.adk.pause_kind') = 'tool'
      AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
  )
  WHERE rn = 1
),
completed_dedup AS (
  SELECT * EXCEPT(rn)
  FROM (
    SELECT
      JSON_VALUE(attributes, '$.adk.app_name')        AS app_name,
      user_id, session_id,
      JSON_VALUE(attributes, '$.adk.function_call_id') AS function_call_id,
      JSON_VALUE(content,    '$.tool')                AS tool,
      timestamp                                       AS complete_ts,
      ROW_NUMBER() OVER (
        PARTITION BY
          JSON_VALUE(attributes, '$.adk.app_name'),
          user_id, session_id,
          JSON_VALUE(attributes, '$.adk.function_call_id')
        ORDER BY timestamp
      ) AS rn
    FROM `proj.ds.agent_events`
    WHERE event_type = 'TOOL_COMPLETED'
      AND JSON_VALUE(attributes, '$.adk.pause_kind') = 'tool'
      AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
      -- Forward-compat with #206: exclude orphan-tagged completions
      -- from the healthy-pair branch. Null-safe vs current #293 rows.
      AND COALESCE(SAFE_CAST(JSON_VALUE(attributes, '$.adk.pause_orphan') AS BOOL), FALSE) = FALSE
  )
  WHERE rn = 1
)
SELECT
  p.app_name, p.user_id, p.session_id, p.function_call_id, p.tool,
  p.pause_ts, c.complete_ts,
  TIMESTAMP_DIFF(c.complete_ts, p.pause_ts, SECOND) AS pause_seconds
FROM paused_dedup p
JOIN completed_dedup c
  USING (app_name, user_id, session_id, function_call_id)
WHERE c.complete_ts >= p.pause_ts  -- guard against clock skew / replay
ORDER BY pause_seconds DESC
LIMIT 100;

When google#206 lands, an orphan branch can be added as a sibling query (or UNION ALL block) that selects pause_orphan = true completions with status = 'orphan_completion', null pause_ts, and the completion timestamp as complete_ts.


Section 3.2 (v3) — top transfer pairs per app

SELECT
  JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
  JSON_VALUE(content, '$.from_agent')      AS from_agent,
  JSON_VALUE(content, '$.to_agent')        AS to_agent,
  COUNT(*) AS transfers
FROM `proj.ds.agent_events`
WHERE event_type = 'AGENT_TRANSFER'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY 1, 2, 3
ORDER BY transfers DESC
LIMIT 25;

For a true fleet-wide rollup across all apps, drop the app_name column from SELECT and GROUP BY — and put -- fleet-wide rollup across apps in the heading.


Section 8.1 (v3) — routing histogram per app

SELECT
  JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
  JSON_VALUE(attributes, '$.adk.route')    AS route_value,
  COUNT(*)                                 AS occurrences,
  COUNT(DISTINCT invocation_id)            AS invocations
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND JSON_VALUE(attributes, '$.adk.route') IS NOT NULL
GROUP BY 1, 2
ORDER BY occurrences DESC;

Section 8.2 (v3) — rewind requests with identity

SELECT
  JSON_VALUE(attributes, '$.adk.app_name')                    AS app_name,
  user_id,
  session_id,
  invocation_id                                               AS rewinding_invocation_id,
  JSON_VALUE(attributes, '$.adk.rewind_before_invocation_id') AS rewinding_to,
  agent,
  timestamp
FROM `proj.ds.agent_events`
WHERE JSON_VALUE(attributes, '$.adk.rewind_before_invocation_id') IS NOT NULL
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
ORDER BY timestamp DESC
LIMIT 50;

Section 8.3 (v3) — widget render rows with identity

SELECT
  JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
  user_id,
  session_id,
  invocation_id,
  agent,
  ARRAY_LENGTH(JSON_QUERY_ARRAY(attributes, '$.adk.render_ui_widgets')) AS widget_count,
  JSON_VALUE(attributes, '$.adk.render_ui_widgets[0].provider')         AS first_widget_provider
FROM `proj.ds.agent_events`
WHERE JSON_QUERY(attributes, '$.adk.render_ui_widgets') IS NOT NULL
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
ORDER BY widget_count DESC
LIMIT 50;

Net: the library now stays correct when google#206 ships and respects per-app boundaries in shared agent_events deployments by default. v1 sections 1 / 9 still need no change.

@caohy1988
Copy link
Copy Markdown
Owner Author

Fresh review of SQL library v3. The two fixes are directionally correct and the emitted JSON paths match the PR branch. I found one real SQL robustness issue in Section 6.1 and one small consistency nit in Section 8.1.

Findings

  1. Section 6.1 dedupes completions before applying c.complete_ts >= p.pause_ts, which can drop a valid later completion.

    The new timestamp guard is good, but its position matters. Today completed_dedup keeps the earliest TOOL_COMPLETED per (app_name, user_id, session_id, function_call_id) before the join. If that earliest completion is a clock-skewed/replayed row with complete_ts < pause_ts, the final WHERE c.complete_ts >= p.pause_ts filters it out and the query never considers a later valid completion for the same key.

    Safer shape:

    • dedupe exact-ish stream duplicates first, keeping timestamp in the dedupe grain;
    • join paused to all non-orphan completion candidates with c.complete_ts >= p.pause_ts;
    • choose the first valid completion after the pause with QUALIFY.

    Sketch:

    WITH paused AS (
      SELECT DISTINCT
        JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
        user_id,
        session_id,
        JSON_VALUE(attributes, '$.adk.function_call_id') AS function_call_id,
        JSON_VALUE(content, '$.tool') AS tool,
        timestamp AS pause_ts
      FROM `proj.ds.agent_events`
      WHERE event_type = 'TOOL_PAUSED'
        AND JSON_VALUE(attributes, '$.adk.pause_kind') = 'tool'
        AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
    ),
    completed AS (
      SELECT DISTINCT
        JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
        user_id,
        session_id,
        JSON_VALUE(attributes, '$.adk.function_call_id') AS function_call_id,
        JSON_VALUE(content, '$.tool') AS tool,
        timestamp AS complete_ts
      FROM `proj.ds.agent_events`
      WHERE event_type = 'TOOL_COMPLETED'
        AND JSON_VALUE(attributes, '$.adk.pause_kind') = 'tool'
        AND COALESCE(SAFE_CAST(JSON_VALUE(attributes, '$.adk.pause_orphan') AS BOOL), FALSE) = FALSE
        AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
    )
    SELECT
      p.app_name, p.user_id, p.session_id, p.function_call_id, p.tool,
      p.pause_ts, c.complete_ts,
      TIMESTAMP_DIFF(c.complete_ts, p.pause_ts, SECOND) AS pause_seconds
    FROM paused p
    JOIN completed c
      USING (app_name, user_id, session_id, function_call_id)
    WHERE c.complete_ts >= p.pause_ts
    QUALIFY ROW_NUMBER() OVER (
      PARTITION BY p.app_name, p.user_id, p.session_id, p.function_call_id, p.pause_ts
      ORDER BY c.complete_ts
    ) = 1
    ORDER BY pause_seconds DESC
    LIMIT 100;

    This keeps the Multimodal response generation support google/adk-python#206 pause_orphan forward-compat predicate, prevents negative durations, and still avoids the original cross-event-type ROW_NUMBER() bug.

  2. Section 8.1's COUNT(DISTINCT invocation_id) is still a partial identity count.

    Since the section is now explicitly per-app, this is lower risk, but the post-automatic function calling fails on lambda generated tools google/adk-python#293 contract treats invocation identity as (app_name, user_id, session_id, invocation_id). For consistency with the rest of v3, count distinct user/session/invocation tuples:

    COUNT(DISTINCT TO_JSON_STRING(STRUCT(user_id, session_id, invocation_id))) AS invocations

    Or rename the metric to distinct_invocation_ids if the partial count is intentional.

Verified Good

No producer-code concern here. This is just making the customer SQL robust before it gets copied into docs or notebooks.

@caohy1988
Copy link
Copy Markdown
Owner Author

SQL library v4 — order-of-operations fix in Section 6.1 + identity-count fix in Section 8.1

# Bug Fix
1 Section 6.1: completed_dedup picks the earliest TOOL_COMPLETED per key via ROW_NUMBER() ORDER BY timestamp. If that earliest row is a clock-skewed / replayed completion with complete_ts < pause_ts, the final WHERE c.complete_ts >= p.pause_ts drops it and no later valid completion is ever considered. Bug masquerades as "missing completion" in the result set. Drop the early dedup; use DISTINCT for stream-duplicate collapse; join paused to all valid completion candidates with the timestamp guard; pick the first valid completion after the pause via QUALIFY ROW_NUMBER() partitioned by the paused key + pause_ts.
2 Section 8.1: COUNT(DISTINCT invocation_id) is a partial identity count under the post-google#293 contract — invocation identity is (app_name, user_id, session_id, invocation_id). Count distinct identity tuples via COUNT(DISTINCT TO_JSON_STRING(STRUCT(user_id, session_id, invocation_id))). (Already per-app from v3, so app_name is implicit in the GROUP BY.)

Section 6.1 (v4) — dedupe + join + QUALIFY

WITH paused AS (
  SELECT DISTINCT
    JSON_VALUE(attributes, '$.adk.app_name')         AS app_name,
    user_id,
    session_id,
    JSON_VALUE(attributes, '$.adk.function_call_id') AS function_call_id,
    JSON_VALUE(content,    '$.tool')                 AS tool,
    timestamp                                        AS pause_ts
  FROM `proj.ds.agent_events`
  WHERE event_type = 'TOOL_PAUSED'
    AND JSON_VALUE(attributes, '$.adk.pause_kind') = 'tool'
    AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
),
completed AS (
  SELECT DISTINCT
    JSON_VALUE(attributes, '$.adk.app_name')         AS app_name,
    user_id,
    session_id,
    JSON_VALUE(attributes, '$.adk.function_call_id') AS function_call_id,
    JSON_VALUE(content,    '$.tool')                 AS tool,
    timestamp                                        AS complete_ts
  FROM `proj.ds.agent_events`
  WHERE event_type = 'TOOL_COMPLETED'
    AND JSON_VALUE(attributes, '$.adk.pause_kind') = 'tool'
    -- Forward-compat with #206: orphan-tagged completions are excluded
    -- from the healthy-pair branch. Null-safe vs current #293 rows.
    AND COALESCE(SAFE_CAST(JSON_VALUE(attributes, '$.adk.pause_orphan') AS BOOL), FALSE) = FALSE
    AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
)
SELECT
  p.app_name, p.user_id, p.session_id, p.function_call_id, p.tool,
  p.pause_ts, c.complete_ts,
  TIMESTAMP_DIFF(c.complete_ts, p.pause_ts, SECOND) AS pause_seconds
FROM paused p
JOIN completed c
  USING (app_name, user_id, session_id, function_call_id)
-- The guard happens BEFORE choosing the winning completion, so a skewed
-- early completion can't shadow a later valid one.
WHERE c.complete_ts >= p.pause_ts
-- For each (pause), pick the earliest valid completion after it.
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY p.app_name, p.user_id, p.session_id, p.function_call_id, p.pause_ts
  ORDER BY c.complete_ts
) = 1
ORDER BY pause_seconds DESC
LIMIT 100;

This shape:

  • Preserves stream-dedup via DISTINCT (collapses exact-key replays at the same timestamp).
  • Applies the c.complete_ts >= p.pause_ts guard before picking a winner, so clock-skew / replay rows can't crowd out valid completions.
  • Selects the first valid completion after the pause via QUALIFY partitioned by the paused key plus pause_ts, which correctly handles a paused tool that completed, paused again, and completed again on the same function_call_id (each pair-resolves independently).
  • Keeps the Multimodal response generation support google/adk-python#206 forward-compat exclusion of pause_orphan = true.

Section 8.1 (v4) — full-identity invocation count

SELECT
  JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
  JSON_VALUE(attributes, '$.adk.route')    AS route_value,
  COUNT(*)                                 AS occurrences,
  COUNT(DISTINCT TO_JSON_STRING(STRUCT(user_id, session_id, invocation_id))) AS invocations
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND JSON_VALUE(attributes, '$.adk.route') IS NOT NULL
GROUP BY 1, 2
ORDER BY occurrences DESC;

Same pattern works anywhere else a downstream user wants a strict-identity distinct count. Two other v3 queries (Section 2.2 fan-out, Section 10 branch fan-out) already group by (app_name, user_id, session_id, invocation_id) directly so their distinct counts are already correct.


No producer-code concern; the PR branch's emitted JSON paths still match. Other v3 sections stand as-posted.

@caohy1988
Copy link
Copy Markdown
Owner Author

Fresh final review of SQL library v4.

Findings

No remaining blockers found.

The two v4 changes close the last issues I had on the customer SQL:

  • Section 6.1 now applies the timestamp guard before winner-picking. That fixes the replay/skew case where an early invalid completion could hide a later valid completion, and the QUALIFY partition by the paused key plus pause_ts is the right BigQuery shape for "first valid completion after this pause."
  • Section 8.1 now counts the full invocation identity tuple within each app, so it matches the post-automatic function calling fails on lambda generated tools google/adk-python#293 identity contract instead of counting bare invocation_id.

Verified Good

  • Producer paths still match the SQL:
    • TOOL_COMPLETED writes attributes.adk.pause_kind = 'tool' and attributes.adk.function_call_id through EventData.adk_extras at src/google/adk/plugins/bigquery_agent_analytics_plugin.py:3147-3157.
    • TOOL_PAUSED writes the same pair keys under attributes.adk at src/google/adk/plugins/bigquery_agent_analytics_plugin.py:3311-3325.
    • AGENT_STATE_CHECKPOINT writes object-or-null content.agent_state plus boolean content.end_of_agent at src/google/adk/plugins/bigquery_agent_analytics_plugin.py:3258-3267; the v2 Section 5 correction uses JSON_QUERY for object presence and SAFE_CAST for the bool in both the CASE and COUNT predicates.
  • Section 6.1 remains forward-compatible with Multimodal response generation support google/adk-python#206 by excluding future pause_orphan = true completions from the healthy-pair branch while staying null-safe for current automatic function calling fails on lambda generated tools google/adk-python#293 rows.
  • The long-running join intentionally omits invocation_id, which is correct because resume can happen in a later invocation. The join key stays (app_name, user_id, session_id, function_call_id).
  • The shared-table fixes are now consistent: invocation-grain queries group on (app_name, user_id, session_id, invocation_id), per-app rollups include app_name, and list-style rows carry enough identity for debugging.

Residual note: Section 6.1 is a healthy-pair recipe only. Once google#206 lands, add the sibling orphan branch (pause_orphan = true, status = 'orphan_completion', null pause_ts) rather than folding orphans into this query.

… JSON-null

The docstring claimed callback-only rows leave A3/C1/C2/C3/C8 keys
'JSON null'. The implementation actually returns early when source_event
is None (line 2837-2838) so those keys are absent from the envelope, not
written as null.

Behavior is correct (and what the google#293 v5 contract intends). Updating
the docstring to match — and noting that because the surrounding column
is BigQuery JSON, an omitted key resolves to SQL NULL via
JSON_VALUE(attributes, '$.adk.<field>'), so consumer SQL using
'IS NOT NULL' to gate Event-originating rows works without the producer
writing explicit JSON nulls.

Caught by the RFC google#97 review against the haiyuan-eng-google SDK repo;
no code change required, docstring-only fix.
@caohy1988
Copy link
Copy Markdown
Owner Author

Pushed a docstring-only follow-up (8d8eb05): `_build_adk_envelope` previously claimed callback-only rows leave A3/C1/C2/C3/C8 keys "JSON null". The actual behavior is correct — those keys are omitted (the helper returns early at line 2837 when source_event is None) — but the docstring was inaccurate.

Caught by the RFC google#97 review at the SDK repo. No code change required; the omitted-vs-null distinction matters because:

  • An omitted key under a BigQuery JSON column still resolves to SQL NULL via JSON_VALUE(attributes, '$.adk.<field>'), so consumer SQL gating on Event-originating rows via WHERE JSON_VALUE(attributes, '$.adk.source_event_id') IS NOT NULL works correctly.
  • Consumers using JSON_QUERY(...) for key-presence checks would observe "key absent" rather than "key present with null value" — same behavior for IS NULL checks but worth being precise about.

Updated docstring now reads:

... callback-only rows omit those keys from the envelope rather than synthesizing fake identity. Since the surrounding column is BigQuery JSON, an omitted key resolves to SQL NULL via JSON_VALUE(attributes, '$.adk.<field>'); consumers using JSON_VALUE(...) IS NOT NULL to gate on Event-originating rows therefore work correctly without the producer writing explicit JSON nulls.

caohy1988 added 2 commits June 8, 2026 02:26
… contract

Last stale 'writes those attributes as null' reference in the producer
code. Behavior is unchanged; the helper omits the keys (return early at
:2837 when source_event is None) and JSON_VALUE on the BigQuery JSON
column resolves an omitted key to SQL NULL, so consumer gating with
'IS NOT NULL' works without explicit JSON nulls.

Caught by the RFC google#97 final review pass; matches the corrected
_build_adk_envelope docstring in 8d8eb05.
Following review feedback that docstrings shouldn't reference GitHub
issue numbers or PR review-thread revisions. The technical substance
(contract names like 'A1/A2 envelope', 'C7 pair-key emit', 'flat-with-
prefix', 'HITL non-routing') stays where it aids navigation; only the
'#NNN' and 'v5' annotations come out.

20 sites swept across the plugin module and test file. Behavior and
test names unchanged; suite still 252/252.

The existing 'google#4645' reference in workflow plumbing is left alone -- it
was not introduced by this change.
@caohy1988
Copy link
Copy Markdown
Owner Author

caohy1988 commented Jun 8, 2026

Consolidated SQL recipe library (final — v5)

This comment is the single canonical version after the v1 → v5 review cycle. All earlier SQL comments are superseded by this one. Every section here uses the latest contract decisions:

  • JSON_TYPE(JSON_QUERY(...)) for JSON presence checks (Section 5 fix); JSON_QUERY for object / array retrieval; JSON_VALUE only for scalars.
  • Full telemetry identity (app_name, user_id, session_id, invocation_id) in all invocation-grain groupings (Sections 2 / 4 / 7 / 10 fix).
  • Per-app rollups by default — drop app_name for fleet-wide views; see notes inline (Sections 3 / 8 fix).
  • pause_orphan forward-compat for when Multimodal response generation support google/adk-python#206 ships (Section 6 fix).
  • Per-stream DISTINCT + QUALIFY ROW_NUMBER() after the timestamp guard, so clock-skew/replay rows can't shadow valid completions (Section 6 fix).
  • COUNT(DISTINCT TO_JSON_STRING(STRUCT(...))) for distinct identity tuples (Section 8.1 fix).

Replace these placeholders in the FROM clauses below:

-- proj.ds.agent_events  →  your fully-qualified table reference

The WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY) windows can be tightened/widened in place.


1. Envelope health checks (A1 / A2 / A3)

1.1 — Confirm attributes.adk.schema_version + app_name stamp every row

SELECT
  COUNT(*)                                                                AS total,
  COUNTIF(JSON_VALUE(attributes, '$.adk.schema_version') IS NULL)         AS missing_schema_version,
  COUNTIF(JSON_VALUE(attributes, '$.adk.app_name')       IS NULL)         AS missing_app_name,
  COUNT(DISTINCT JSON_VALUE(attributes, '$.adk.schema_version'))          AS distinct_schema_versions
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY);

missing_schema_version and missing_app_name should both be 0 on rows produced by an ADK-2.0-aware plugin. distinct_schema_versions > 1 flags a mixed-producer fleet — useful for tracking a roll-out.

1.2 — Event-originating vs callback-only row split (A3 source_event_id presence)

SELECT
  event_type,
  COUNTIF(JSON_VALUE(attributes, '$.adk.source_event_id') IS NOT NULL) AS event_originating,
  COUNTIF(JSON_VALUE(attributes, '$.adk.source_event_id') IS NULL)     AS callback_only,
  COUNT(*) AS total
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY event_type
ORDER BY total DESC;

Surfaces which event types are produced from non-on_event_callback paths (callback_only > 0).


2. Workflow DAG via attributes.adk.node (C1)

2.1 — Node hierarchy with parent_path join

SELECT
  JSON_VALUE(attributes, '$.adk.app_name')         AS app_name,
  user_id,
  session_id,
  invocation_id,
  JSON_VALUE(attributes, '$.adk.node.path')        AS node_path,
  JSON_VALUE(attributes, '$.adk.node.parent_path') AS parent_path,
  JSON_VALUE(attributes, '$.adk.node.run_id')      AS run_id,
  COUNT(*)                                         AS events_at_node
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND JSON_VALUE(attributes, '$.adk.node.path') IS NOT NULL
  AND JSON_VALUE(attributes, '$.adk.node.path') != ''
GROUP BY 1, 2, 3, 4, 5, 6, 7
ORDER BY app_name, user_id, session_id, invocation_id, node_path;

parent_path is the DAG join key — every node row joins to its parent via JSON_VALUE(attributes, '$.adk.node.path') = JSON_VALUE(parent.attributes, '$.adk.node.parent_path') within the same (app_name, user_id, session_id, invocation_id) identity.

2.2 — Workflow node fan-out per invocation

SELECT
  JSON_VALUE(attributes, '$.adk.app_name')                  AS app_name,
  user_id,
  session_id,
  invocation_id,
  COUNT(DISTINCT JSON_VALUE(attributes, '$.adk.node.path')) AS distinct_node_paths,
  COUNT(*)                                                  AS total_events
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND JSON_VALUE(attributes, '$.adk.node.path') IS NOT NULL
  AND JSON_VALUE(attributes, '$.adk.node.path') != ''
GROUP BY 1, 2, 3, 4
HAVING distinct_node_paths > 1
ORDER BY distinct_node_paths DESC
LIMIT 50;

3. Multi-agent transfer chains (C4)

3.1 — Linear transfer chain per invocation

SELECT
  JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
  user_id,
  session_id,
  invocation_id,
  ARRAY_AGG(
    STRUCT(
      JSON_VALUE(content, '$.from_agent') AS from_agent,
      JSON_VALUE(content, '$.to_agent')   AS to_agent,
      timestamp
    )
    ORDER BY timestamp
  ) AS transfer_chain
FROM `proj.ds.agent_events`
WHERE event_type = 'AGENT_TRANSFER'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY 1, 2, 3, 4
ORDER BY ARRAY_LENGTH(transfer_chain) DESC
LIMIT 50;

3.2 — Top transfer pairs per app

SELECT
  JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
  JSON_VALUE(content, '$.from_agent')      AS from_agent,
  JSON_VALUE(content, '$.to_agent')        AS to_agent,
  COUNT(*) AS transfers
FROM `proj.ds.agent_events`
WHERE event_type = 'AGENT_TRANSFER'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY 1, 2, 3
ORDER BY transfers DESC
LIMIT 25;

For a fleet-wide rollup across apps, drop the app_name column from SELECT and GROUP BY.


4. Event compaction windows (C5)

4.1 — Active compactions with fractional float-epoch timestamps

SELECT
  JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
  user_id,
  session_id,
  invocation_id,
  -- Producer preserves fractional precision on start_/end_timestamp.
  -- Use TIMESTAMP_MICROS(*1e6) to widen safely without truncating
  -- sub-second windows (TIMESTAMP_SECONDS would lose them).
  TIMESTAMP_MICROS(CAST(CAST(JSON_VALUE(content, '$.start_timestamp') AS FLOAT64) * 1000000 AS INT64)) AS window_start,
  TIMESTAMP_MICROS(CAST(CAST(JSON_VALUE(content, '$.end_timestamp')   AS FLOAT64) * 1000000 AS INT64)) AS window_end,
  CAST(JSON_VALUE(content, '$.end_timestamp') AS FLOAT64)
    - CAST(JSON_VALUE(content, '$.start_timestamp') AS FLOAT64) AS window_seconds
FROM `proj.ds.agent_events`
WHERE event_type = 'EVENT_COMPACTION'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
ORDER BY window_seconds DESC
LIMIT 50;

5. Agent-state checkpoints (C6)

content.agent_state is either a JSON object or an explicit JSON null (for {agent_state: null, end_of_agent: true} rows). Two BigQuery JSON gotchas:

  • JSON_VALUE(content, '$.agent_state') returns SQL NULL for object values — so it under-counts real checkpoints.
  • JSON_QUERY(content, '$.agent_state') on an explicit JSON null returns JSON null (not SQL NULL), so a bare JSON_QUERY(...) IS NULL misses the end-of-agent-only shape.

Use JSON_TYPE(JSON_QUERY(...)) for presence checks (NULL = key absent, 'null' = explicit JSON null, anything else = a real value), and JSON_QUERY(...) to select the payload.

5.1 — Checkpoint stream with correct shape discriminator

WITH checkpoints AS (
  SELECT
    JSON_VALUE(attributes, '$.adk.app_name')                 AS app_name,
    user_id,
    session_id,
    invocation_id,
    agent,
    timestamp,
    JSON_QUERY(content, '$.agent_state')                     AS agent_state,
    JSON_TYPE(JSON_QUERY(content, '$.agent_state'))          AS agent_state_type,
    SAFE_CAST(JSON_VALUE(content, '$.end_of_agent') AS BOOL) AS end_of_agent
  FROM `proj.ds.agent_events`
  WHERE event_type = 'AGENT_STATE_CHECKPOINT'
    AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
)
SELECT
  app_name, user_id, session_id, invocation_id, agent, timestamp,
  agent_state,
  end_of_agent,
  CASE
    -- "end_only": agent_state is absent (NULL) or explicit JSON null,
    -- and end_of_agent is TRUE.
    WHEN (agent_state_type IS NULL OR agent_state_type = 'null')
         AND end_of_agent = TRUE
      THEN 'end_only'
    -- "state_only": agent_state is a real value (non-null JSON), and
    -- end_of_agent is FALSE (treat absent as FALSE).
    WHEN agent_state_type IS NOT NULL
         AND agent_state_type != 'null'
         AND COALESCE(end_of_agent, FALSE) = FALSE
      THEN 'state_only'
    ELSE 'both'
  END AS shape
FROM checkpoints
ORDER BY timestamp DESC
LIMIT 100;

5.2 — Checkpoint frequency per agent

SELECT
  JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
  agent,
  COUNTIF(JSON_TYPE(JSON_QUERY(content, '$.agent_state')) IS NOT NULL
          AND JSON_TYPE(JSON_QUERY(content, '$.agent_state')) != 'null') AS state_checkpoints,
  COUNTIF(SAFE_CAST(JSON_VALUE(content, '$.end_of_agent') AS BOOL) = TRUE) AS end_of_agent_signals,
  COUNT(*)                                                                AS total
FROM `proj.ds.agent_events`
WHERE event_type = 'AGENT_STATE_CHECKPOINT'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY 1, 2
ORDER BY total DESC;

6. Long-running tool durations (C7)

The customer's headline query. The shape below intentionally:

  1. Dedupes paused / completed streams with DISTINCT (collapses exact-key replays).
  2. Joins paused to all valid completion candidates with c.complete_ts >= p.pause_ts before picking a winner — so a clock-skewed early completion can't shadow a later valid one.
  3. Picks the first valid completion after each pause via QUALIFY ROW_NUMBER(), which also correctly handles a tool that pauses → completes → pauses again → completes again on the same function_call_id (each pair resolves independently).
  4. Excludes pause_orphan = true completions from the healthy branch (null-safe vs current automatic function calling fails on lambda generated tools google/adk-python#293 rows; correct when Multimodal response generation support google/adk-python#206 ships).

6.1 — TOOL_PAUSED ↔ TOOL_COMPLETED pair-join with duration

WITH paused AS (
  SELECT DISTINCT
    JSON_VALUE(attributes, '$.adk.app_name')         AS app_name,
    user_id,
    session_id,
    JSON_VALUE(attributes, '$.adk.function_call_id') AS function_call_id,
    JSON_VALUE(content,    '$.tool')                 AS tool,
    timestamp                                        AS pause_ts
  FROM `proj.ds.agent_events`
  WHERE event_type = 'TOOL_PAUSED'
    AND JSON_VALUE(attributes, '$.adk.pause_kind') = 'tool'
    AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
),
completed AS (
  SELECT DISTINCT
    JSON_VALUE(attributes, '$.adk.app_name')         AS app_name,
    user_id,
    session_id,
    JSON_VALUE(attributes, '$.adk.function_call_id') AS function_call_id,
    JSON_VALUE(content,    '$.tool')                 AS tool,
    timestamp                                        AS complete_ts
  FROM `proj.ds.agent_events`
  WHERE event_type = 'TOOL_COMPLETED'
    AND JSON_VALUE(attributes, '$.adk.pause_kind') = 'tool'
    -- Forward-compat: orphan-tagged completions are excluded from the
    -- healthy-pair branch. Null-safe vs current rows.
    AND COALESCE(SAFE_CAST(JSON_VALUE(attributes, '$.adk.pause_orphan') AS BOOL), FALSE) = FALSE
    AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
)
SELECT
  p.app_name, p.user_id, p.session_id, p.function_call_id, p.tool,
  p.pause_ts, c.complete_ts,
  TIMESTAMP_DIFF(c.complete_ts, p.pause_ts, SECOND) AS pause_seconds
FROM paused p
JOIN completed c
  USING (app_name, user_id, session_id, function_call_id)
-- The guard happens BEFORE picking a winner, so skew/replay rows can't
-- crowd out valid completions.
WHERE c.complete_ts >= p.pause_ts
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY p.app_name, p.user_id, p.session_id, p.function_call_id, p.pause_ts
  ORDER BY c.complete_ts
) = 1
ORDER BY pause_seconds DESC
LIMIT 100;

Note the join key is intentionally (app_name, user_id, session_id, function_call_id)no invocation_id — because resume legitimately happens in a later invocation.

6.2 — Suspected orphan paused rows (manual settling window)

WITH paused AS (
  SELECT
    JSON_VALUE(attributes, '$.adk.app_name')         AS app_name,
    user_id,
    session_id,
    JSON_VALUE(attributes, '$.adk.function_call_id') AS function_call_id,
    timestamp                                        AS pause_ts
  FROM `proj.ds.agent_events`
  WHERE event_type = 'TOOL_PAUSED'
    AND JSON_VALUE(attributes, '$.adk.pause_kind') = 'tool'
    AND timestamp BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
                      AND TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
),
completed AS (
  -- Mirror 6.1's healthy-completion filters so the existence check is
  -- consistent: same pause_orphan exclusion, same time window.
  SELECT DISTINCT
    JSON_VALUE(attributes, '$.adk.app_name')         AS app_name,
    user_id,
    session_id,
    JSON_VALUE(attributes, '$.adk.function_call_id') AS function_call_id,
    timestamp                                        AS complete_ts
  FROM `proj.ds.agent_events`
  WHERE event_type = 'TOOL_COMPLETED'
    AND JSON_VALUE(attributes, '$.adk.pause_kind') = 'tool'
    AND COALESCE(SAFE_CAST(JSON_VALUE(attributes, '$.adk.pause_orphan') AS BOOL), FALSE) = FALSE
    AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
)
SELECT p.*
FROM paused p
WHERE NOT EXISTS (
  -- Timestamp-aware: a completion only suppresses paused rows whose
  -- pause_ts is at or before it. Stops a replay completion for the
  -- same key from masking a *later* pause that genuinely has no
  -- matching completion.
  SELECT 1 FROM completed c
  WHERE c.app_name        = p.app_name
    AND c.user_id         = p.user_id
    AND c.session_id      = p.session_id
    AND c.function_call_id = p.function_call_id
    AND c.complete_ts    >= p.pause_ts
)
ORDER BY pause_ts DESC
LIMIT 50;

The BETWEEN … TIMESTAMP_SUB(…, INTERVAL 1 HOUR) is the manual settling delay for Storage Write API visibility — adjust to tolerance. Replace with the producer-side pause_orphan = true flag once google#206 lands.


7. HITL stream — separate from long-running tools (C7 routing)

7.1 — HITL request/completion pairs (per invocation)

SELECT
  JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
  user_id,
  session_id,
  invocation_id,
  COUNTIF(event_type = 'HITL_CONFIRMATION_REQUEST')           AS conf_request,
  COUNTIF(event_type = 'HITL_CONFIRMATION_REQUEST_COMPLETED') AS conf_completed,
  COUNTIF(event_type = 'HITL_CREDENTIAL_REQUEST')             AS cred_request,
  COUNTIF(event_type = 'HITL_CREDENTIAL_REQUEST_COMPLETED')   AS cred_completed,
  COUNTIF(event_type = 'HITL_INPUT_REQUEST')                  AS inp_request,
  COUNTIF(event_type = 'HITL_INPUT_REQUEST_COMPLETED')        AS inp_completed
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND event_type LIKE 'HITL_%'
GROUP BY 1, 2, 3, 4
ORDER BY (conf_request + cred_request + inp_request) DESC
LIMIT 50;

7.2 — Routing contract: HITL completions never land in TOOL_COMPLETED

-- Expected: 0 rows. If non-zero, the HITL non-routing contract is broken.
SELECT
  JSON_VALUE(content, '$.tool') AS tool, COUNT(*) AS n
FROM `proj.ds.agent_events`
WHERE event_type = 'TOOL_COMPLETED'
  AND JSON_VALUE(content, '$.tool') IN ('adk_request_confirmation', 'adk_request_credential', 'adk_request_input')
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY 1;

8. Action attributes — route / widgets / rewind (C8)

8.1 — Routing-decision histogram per app

SELECT
  JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
  JSON_VALUE(attributes, '$.adk.route')    AS route_value,
  COUNT(*)                                 AS occurrences,
  COUNT(DISTINCT TO_JSON_STRING(STRUCT(user_id, session_id, invocation_id))) AS invocations
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND JSON_VALUE(attributes, '$.adk.route') IS NOT NULL
GROUP BY 1, 2
ORDER BY occurrences DESC;

COUNT(DISTINCT TO_JSON_STRING(STRUCT(...))) is BigQuery's idiom for counting distinct multi-column tuples.

8.2 — Rewind requests with identity

SELECT
  JSON_VALUE(attributes, '$.adk.app_name')                    AS app_name,
  user_id,
  session_id,
  invocation_id                                               AS rewinding_invocation_id,
  JSON_VALUE(attributes, '$.adk.rewind_before_invocation_id') AS rewinding_to,
  agent,
  timestamp
FROM `proj.ds.agent_events`
WHERE JSON_VALUE(attributes, '$.adk.rewind_before_invocation_id') IS NOT NULL
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
ORDER BY timestamp DESC
LIMIT 50;

8.3 — Widget render rows with identity

SELECT
  JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
  user_id,
  session_id,
  invocation_id,
  agent,
  ARRAY_LENGTH(JSON_QUERY_ARRAY(attributes, '$.adk.render_ui_widgets')) AS widget_count,
  JSON_VALUE(attributes, '$.adk.render_ui_widgets[0].provider')         AS first_widget_provider
FROM `proj.ds.agent_events`
WHERE JSON_QUERY(attributes, '$.adk.render_ui_widgets') IS NOT NULL
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
ORDER BY widget_count DESC
LIMIT 50;

9. Scope breakdown (C3) — node-loop vs function-call fan-out

SELECT
  JSON_VALUE(attributes, '$.adk.scope.kind')                AS scope_kind,
  COUNT(*)                                                  AS rows,
  COUNT(DISTINCT JSON_VALUE(attributes, '$.adk.scope.id'))  AS distinct_scope_ids,
  COUNT(DISTINCT JSON_VALUE(attributes, '$.adk.source_event_id')) AS distinct_source_events
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND JSON_VALUE(attributes, '$.adk.source_event_id') IS NOT NULL
GROUP BY scope_kind
ORDER BY rows DESC;

scope_kind values: node_run (loop iteration), function_call (model- or ADK-issued tool id), unknown (anomaly — producer logs a warning). NULL = unscoped (isolation_scope is None).


10. Branch fan-out per invocation (C2)

SELECT
  JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
  user_id,
  session_id,
  invocation_id,
  COUNT(DISTINCT JSON_VALUE(attributes, '$.adk.branch'))         AS distinct_branches,
  ARRAY_AGG(DISTINCT JSON_VALUE(attributes, '$.adk.branch') IGNORE NULLS LIMIT 10) AS branches
FROM `proj.ds.agent_events`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND JSON_VALUE(attributes, '$.adk.source_event_id') IS NOT NULL
GROUP BY 1, 2, 3, 4
HAVING distinct_branches > 1
ORDER BY distinct_branches DESC
LIMIT 50;

This is the working recipe set as of the v5 review pass — Section 5 now uses JSON_TYPE(JSON_QUERY(...)) for explicit-JSON-null presence checks, and Section 6.2 now requires complete_ts >= pause_ts to align with 6.1's healthy-pair semantics. Earlier SQL comments on this PR (v1, v2, v3, v4 corrections) are superseded.

@caohy1988
Copy link
Copy Markdown
Owner Author

Fresh review of the consolidated SQL recipe library. The consolidation is the right move and most v1→v4 fixes carried over cleanly. I found two remaining SQL issues worth fixing before this becomes the canonical copy/paste comment.

Findings

  1. Section 5 still misclassifies explicit JSON null agent_state values.

    The comment correctly moved from JSON_VALUE to JSON_QUERY for object-valued agent_state, but BigQuery has a sharp edge here: for a JSON column, JSON_QUERY(..., '$.agent_state') on an explicit JSON null returns JSON null, not SQL NULL.

    I verified this live:

    SELECT
      JSON_QUERY(JSON '{"agent_state": null}', '$.agent_state') IS NULL AS jq_null_is_sql_null,
      JSON_VALUE(JSON '{"agent_state": null}', '$.agent_state') IS NULL AS jv_null_is_sql_null,
      JSON_TYPE(JSON_QUERY(JSON '{"agent_state": null}', '$.agent_state')) AS type_null,
      JSON_TYPE(JSON_QUERY(JSON '{"agent_state": {"x": 1}}', '$.agent_state')) AS type_obj;

    Result:

    jq_null_is_sql_null = false
    jv_null_is_sql_null = true
    type_null = null
    type_obj = object
    

    Consequence: the current Section 5.1 CASE does not enter the end_only branch for the producer's valid {agent_state: null, end_of_agent: true} shape. It sees JSON_QUERY(...) IS NOT NULL and falls to both. Section 5.2 also counts JSON-null checkpoint rows as state_checkpoints.

    Safer shape: compute agent_state_type and use it for presence, while still selecting JSON_QUERY(...) for the object payload.

    WITH checkpoints AS (
      SELECT
        JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
        user_id,
        session_id,
        invocation_id,
        agent,
        timestamp,
        JSON_QUERY(content, '$.agent_state') AS agent_state,
        JSON_TYPE(JSON_QUERY(content, '$.agent_state')) AS agent_state_type,
        SAFE_CAST(JSON_VALUE(content, '$.end_of_agent') AS BOOL) AS end_of_agent
      FROM `proj.ds.agent_events`
      WHERE event_type = 'AGENT_STATE_CHECKPOINT'
        AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
    )
    SELECT
      app_name, user_id, session_id, invocation_id, agent, timestamp,
      agent_state,
      end_of_agent,
      CASE
        WHEN (agent_state_type IS NULL OR agent_state_type = 'null')
             AND end_of_agent = TRUE
          THEN 'end_only'
        WHEN agent_state_type IS NOT NULL
             AND agent_state_type != 'null'
             AND COALESCE(end_of_agent, FALSE) = FALSE
          THEN 'state_only'
        ELSE 'both'
      END AS shape
    FROM checkpoints
    ORDER BY timestamp DESC
    LIMIT 100;

    And Section 5.2 should count real state objects with:

    COUNTIF(JSON_TYPE(JSON_QUERY(content, '$.agent_state')) IS NOT NULL
            AND JSON_TYPE(JSON_QUERY(content, '$.agent_state')) != 'null') AS state_checkpoints
  2. Section 6.2's orphan-suspect query can be suppressed by an unrelated completion for the same key.

    completed_ids currently keeps only (app_name, user_id, session_id, function_call_id) and the final NOT EXISTS ignores completion time. That is looser than the healthy-pair query in 6.1. If the same function_call_id appears in a replay or repeated pause/complete cycle, any completion with that key suppresses every paused row with that key, even if the completion happened before the pause being inspected.

    Keep the timestamp and apply the same healthy-completion filters:

    completed AS (
      SELECT DISTINCT
        JSON_VALUE(attributes, '$.adk.app_name') AS app_name,
        user_id,
        session_id,
        JSON_VALUE(attributes, '$.adk.function_call_id') AS function_call_id,
        timestamp AS complete_ts
      FROM `proj.ds.agent_events`
      WHERE event_type = 'TOOL_COMPLETED'
        AND JSON_VALUE(attributes, '$.adk.pause_kind') = 'tool'
        AND COALESCE(SAFE_CAST(JSON_VALUE(attributes, '$.adk.pause_orphan') AS BOOL), FALSE) = FALSE
        AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
    )
    SELECT p.*
    FROM paused p
    WHERE NOT EXISTS (
      SELECT 1
      FROM completed c
      WHERE c.app_name = p.app_name
        AND c.user_id = p.user_id
        AND c.session_id = p.session_id
        AND c.function_call_id = p.function_call_id
        AND c.complete_ts >= p.pause_ts
    )

    This keeps 6.2 aligned with 6.1's "completion must be after this pause" rule and stays forward-compatible with Multimodal response generation support google/adk-python#206 orphan-tagged completions.

Verified Good

  • Section 6.1 is now the correct healthy-pair shape: DISTINCT streams, timestamp guard before QUALIFY, pause_orphan exclusion, and no invocation_id in the cross-invocation join key.
  • Sections 2 / 4 / 7 / 10 carry full invocation identity.
  • Sections 3 / 8 default to per-app rollups.
  • Section 8.1 uses the full identity tuple count.
  • The comment clearly supersedes earlier v1/v2/v3 SQL comments, which should prevent readers from copying stale snippets.

Net: fix Section 5's JSON-null presence checks and Section 6.2's timestamp-aware completion existence check. After that, this can be the canonical SQL library.

@caohy1988
Copy link
Copy Markdown
Owner Author

Fresh review of the edited canonical SQL comment.

Findings

No SQL correctness blockers found.

The two v5 fixes are now folded in correctly:

  • Section 5 uses JSON_TYPE(JSON_QUERY(...)) for presence checks and keeps JSON_QUERY(...) for object payload retrieval. I verified the discriminator live in BigQuery against all three checkpoint shapes:

    explicit JSON null + end_of_agent=true  -> end_only
    object state + end_of_agent=false       -> state_only
    object state + end_of_agent=true        -> both
    
  • Section 6.2 now mirrors 6.1's healthy-completion filters and requires complete_ts >= pause_ts, so an older replay/completion for the same (app_name, user_id, session_id, function_call_id) no longer masks a later paused row.

Small Polish

  1. The comment header still says final — v4 and "v1 → v4 review cycle."

    The closing line correctly says "v5 review pass" and supersedes v1-v4. Update the header/preamble to match:

    ## Consolidated SQL recipe library (final — v5)
    
    This comment is the single canonical version after the v1 → v5 review cycle.
  2. Section 2's prose says "parent_path is the DAG join key ... within the same invocation."

    Since the query itself correctly carries full identity, make the prose equally precise for shared tables:

    ...within the same (app_name, user_id, session_id, invocation_id) identity.

Verified Good

  • Section 6.1 remains the right healthy-pair shape: stream DISTINCT, timestamp guard before QUALIFY, pause_orphan exclusion, and no invocation_id in the cross-invocation join.
  • Sections 2 / 4 / 7 / 10 keep full invocation identity.
  • Sections 3 / 8 default to per-app rollups.
  • Section 8.1 uses COUNT(DISTINCT TO_JSON_STRING(STRUCT(...))) for identity tuples.
  • Section 9 correctly gates to Event-originating rows with source_event_id IS NOT NULL.

Net: the SQL library is canonical after the two wording updates above.

@caohy1988
Copy link
Copy Markdown
Owner Author

Final SQL recipe review.

No remaining findings.

Verified:

  • Header/preamble now says final v5 and v1→v5.
  • Section 2.1 parent-path prose now scopes the join to the full (app_name, user_id, session_id, invocation_id) identity.
  • Section 5 keeps the correct JSON_TYPE(JSON_QUERY(...)) discriminator for explicit JSON null vs object state.
  • Section 6.2 keeps timestamp-aware completion existence with complete_ts >= pause_ts and mirrors 6.1's healthy-completion filters.
  • The closing line clearly supersedes v1-v4 comments.

The canonical SQL library is ready to use as the copy/paste reference.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant