Summary
The /api/tracing/spans/query and /api/tracing/spans/ingest endpoints are not round-trippable. A customer hit this while trying to migrate traces between two Agenta instances: take the query output, resubmit it via ingest, and every span is silently dropped.
Which side mutates the data
The ingest side is the one that writes a non-canonical shape. It is not "query returning raw data that ingest rejects" — it is "ingest writes a shape that ingest itself rejects on a second pass."
At api/oss/src/core/tracing/utils/parsing.py:285-290, during ingestion we compute duration from start_time/end_time and overwrite the field:
    if raw_span.start_time and raw_span.end_time:
        duration_s = (raw_span.end_time - raw_span.start_time).total_seconds()
        duration_ms = round(duration_s * 1_000, 3)
    if duration_ms is not None:
        ag["metrics"]["duration"] = {"cumulative": duration_ms}  # scalar
This writes duration.cumulative as a scalar (1922.653).
The Pydantic model at sdk/agenta/sdk/models/tracing.py:44, however, declares:
    class AgMetricEntryAttributes(BaseModel):
        cumulative: Optional[Metrics] = None  # Metrics = Dict[str, NumericJson]
        incremental: Optional[Metrics] = None
cumulative is supposed to be a dict (that is how costs and tokens are stored: {"total": ..., "prompt": ..., "completion": ...}). So on the next ingest, validation rejects the scalar.
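The mismatch described above can be illustrated without Pydantic. The sketch below is a plain-Python stand-in for the shape check, not the SDK's code: `is_valid_metric_entry` is a hypothetical helper, and it assumes `Metrics` values are plain ints or floats.

```python
from typing import Any


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def is_valid_metric_entry(entry: Any) -> bool:
    """Hypothetical stand-in for the AgMetricEntryAttributes shape check.

    Each of "cumulative" / "incremental" must be absent, None, or a dict
    mapping string keys to numbers (assumed reading of Metrics).
    """
    if not isinstance(entry, dict):
        return False
    for key in ("cumulative", "incremental"):
        metrics = entry.get(key)
        if metrics is None:
            continue
        if not isinstance(metrics, dict):
            return False  # a bare scalar such as 1922.653 is rejected here
        if not all(isinstance(k, str) and _is_number(v) for k, v in metrics.items()):
            return False
    return True


scalar_shape = {"cumulative": 1922.653}           # what ingest writes today
dict_shape = {"cumulative": {"total": 1922.653}}  # what the model declares

print(is_valid_metric_entry(scalar_shape))  # False
print(is_valid_metric_entry(dict_shape))    # True
```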
Repro
Against eu.cloud.agenta.ai with a valid API key:
1. POST /api/tracing/spans/query with {"focus": "trace", "limit": 1}. The response contains "duration": {"cumulative": 1922.653} on every span with a non-zero duration.
2. POST /api/tracing/spans/ingest with {"traces": <response.traces>} (as-is, or after rewriting IDs to avoid dedup). The response is 202 Accepted with body {"count": 0, "links": []}. Nothing is persisted.
Narrowing test at /tmp/agenta-test/test_metrics.py confirms: submitting a span with metrics.duration.cumulative as a scalar returns count: 0; submitting the same span with metrics.duration.cumulative = {"total": 1922.653} returns count: 1.
Proposed fix
Make ingest write the canonical dict shape at parsing.py:290:
    ag["metrics"]["duration"] = {"cumulative": {"total": duration_ms}}
This aligns with the AgMetricEntryAttributes / Metrics = Dict[str, NumericJson] contract and is consistent with how costs and tokens are already stored ({"total": ..., "prompt": ..., "completion": ...}).
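As a rough illustration of the proposed fix, the duration computation could be isolated as below. `duration_metric` is a hypothetical helper name, not a function in parsing.py; the arithmetic mirrors the snippet quoted earlier, and only the returned shape changes.

```python
from datetime import datetime, timezone
from typing import Optional


def duration_metric(
    start_time: Optional[datetime],
    end_time: Optional[datetime],
) -> Optional[dict]:
    """Hypothetical helper: build the duration entry in the dict shape."""
    if not (start_time and end_time):
        return None  # nothing to record without both timestamps
    duration_s = (end_time - start_time).total_seconds()
    duration_ms = round(duration_s * 1_000, 3)
    # Dict shape, matching how costs and tokens are stored.
    return {"cumulative": {"total": duration_ms}}


start = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
end = datetime(2024, 1, 1, 12, 0, 1, 922653, tzinfo=timezone.utc)
print(duration_metric(start, end))
```

A span written this way should pass the same validation on a second ingest, which is the round-trip property the issue is about.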
Alternative considered: widen the model to accept Union[Metrics, float] for duration only. Rejected — it makes duration special-cased versus the other metric entries and keeps the inconsistency visible in the API.
Migration for existing data
Existing rows in production have duration.cumulative = <scalar> on disk. Options:
1. One-off backfill migration rewriting cumulative: scalar → {"total": scalar}.
2. Accept both shapes on read and normalize on the way out of the query endpoint, then do the backfill lazily.
Option 2 is safer for prod. The query response normalization would live next to _parse_span_into_response in parsing.py.
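A minimal sketch of the read-side normalization, assuming the span's metrics are available as a plain dict at that point. `normalize_duration` is a hypothetical name; the same wrapping logic could also drive the lazy backfill.

```python
import copy


def normalize_duration(metrics: dict) -> dict:
    """Hypothetical helper: return a copy with duration.cumulative as a dict.

    Legacy rows store a bare number; rows already in the dict shape are
    returned unchanged, so the function is safe to apply repeatedly.
    """
    normalized = copy.deepcopy(metrics)  # never mutate the caller's data
    duration = normalized.get("duration")
    if isinstance(duration, dict):
        cumulative = duration.get("cumulative")
        is_scalar = isinstance(cumulative, (int, float)) and not isinstance(
            cumulative, bool
        )
        if is_scalar:
            duration["cumulative"] = {"total": cumulative}
    return normalized


legacy = {"duration": {"cumulative": 1922.653}}
print(normalize_duration(legacy))  # {'duration': {'cumulative': {'total': 1922.653}}}
```

Because the function is idempotent, it does not matter whether a given row has already been backfilled when it passes through the query path.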
Related
See companion issue on silent validation failures in parse_spans_from_request — that is what hides this bug from clients today.