Tracing: query↔ingest schema asymmetry on ag.metrics.duration.cumulative #4172

## Summary

The `/api/tracing/spans/query` and `/api/tracing/spans/ingest` endpoints are not round-trippable. A customer hit this while trying to migrate traces between two Agenta instances: when the output of query is resubmitted via ingest, every span is silently dropped.

## Which side mutates the data

The **ingest** side is the one that writes a non-canonical shape. It is not "query returning raw data that ingest rejects" — it is "ingest writes a shape that ingest itself rejects on a second pass."

At `api/oss/src/core/tracing/utils/parsing.py:285-290`, during ingestion we compute duration from `start_time`/`end_time` and overwrite the field:

```python
if raw_span.start_time and raw_span.end_time:
    duration_s = (raw_span.end_time - raw_span.start_time).total_seconds()
    duration_ms = round(duration_s * 1_000, 3)
    if duration_ms is not None:
        ag["metrics"]["duration"] = {"cumulative": duration_ms}  # scalar
```

This writes `duration.cumulative` as a **scalar** (`1922.653`).

The Pydantic model at `sdk/agenta/sdk/models/tracing.py:44`, however, declares:

```python
class AgMetricEntryAttributes(BaseModel):
    cumulative: Optional[Metrics] = None  # Metrics = Dict[str, NumericJson]
    incremental: Optional[Metrics] = None
```

`cumulative` is supposed to be a dict (that is how `costs` and `tokens` are stored: `{"total": ..., "prompt": ..., "completion": ...}`). So on the next ingest, validation rejects the scalar.
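To make the mismatch concrete, here is a minimal stdlib-only sketch of the shape check that the `Metrics` annotation implies. It is not the SDK's validator: `conforms_to_metrics` is a hypothetical helper, and it assumes `NumericJson` covers plain ints and floats only.

```python
from numbers import Real
from typing import Any


def conforms_to_metrics(value: Any) -> bool:
    """Rough stand-in for Optional[Dict[str, number]].

    Hypothetical helper for illustration; the real check is done by
    Pydantic against AgMetricEntryAttributes.
    """
    if value is None:
        return True  # the field is Optional
    if not isinstance(value, dict):
        return False  # a bare scalar is not a mapping
    return all(
        isinstance(key, str)
        and isinstance(val, Real)
        and not isinstance(val, bool)  # bool is a Real subclass; exclude it
        for key, val in value.items()
    )


scalar_shape = 1922.653             # what ingest currently writes
dict_shape = {"total": 1922.653}    # what the model declares

print(conforms_to_metrics(scalar_shape))  # False
print(conforms_to_metrics(dict_shape))    # True
```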

## Repro

Against `eu.cloud.agenta.ai` with a valid API key:

1. `POST /api/tracing/spans/query` with `{"focus": "trace", "limit": 1}` — response contains `"duration": {"cumulative": 1922.653}` on every span with a non-zero duration.
2. `POST /api/tracing/spans/ingest` with `{"traces": <response.traces>}`, either as-is or after rewriting IDs to avoid dedup. The response is `202 Accepted` with body `{"count": 0, "links": []}`. Nothing is persisted.

A narrowing test at `/tmp/agenta-test/test_metrics.py` confirms this: submitting a span with `metrics.duration.cumulative` as a scalar returns `count: 0`; submitting the same span with `metrics.duration.cumulative = {"total": 1922.653}` returns `count: 1`.

## Proposed fix

Make ingest write the canonical dict shape at `parsing.py:290`:

```python
ag["metrics"]["duration"] = {"cumulative": {"total": duration_ms}}
```

This aligns with the `AgMetricEntryAttributes` / `Metrics = Dict[str, NumericJson]` contract and is consistent with how `costs` and `tokens` are already stored (`{"total": ..., "prompt": ..., "completion": ...}`).
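As a rough sketch of the same change in isolation, the duration computation could be factored into a small function that always returns the dict shape. `duration_metric` is a hypothetical name and not something that exists in `parsing.py`; the arithmetic mirrors the snippet quoted earlier.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


def duration_metric(
    start_time: Optional[datetime],
    end_time: Optional[datetime],
) -> Optional[Dict[str, Dict[str, float]]]:
    """Build the `duration` metric entry in the canonical dict shape.

    Hypothetical helper: returns None when either timestamp is missing,
    so the caller can leave ag["metrics"] untouched in that case.
    """
    if not (start_time and end_time):
        return None
    duration_ms = round((end_time - start_time).total_seconds() * 1_000, 3)
    return {"cumulative": {"total": duration_ms}}


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
end = start + timedelta(milliseconds=1500)
print(duration_metric(start, end))  # {'cumulative': {'total': 1500.0}}
```

The call site would then assign the result to `ag["metrics"]["duration"]` only when it is not `None`.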

Alternative considered: widen the model to accept `Union[Metrics, float]` for duration only. Rejected — it makes duration special-cased versus the other metric entries and keeps the inconsistency visible in the API.

## Migration for existing data

Existing rows in production have `duration.cumulative = <scalar>` on disk. Options:

1. One-off backfill migration rewriting `cumulative: scalar` → `{"total": scalar}`.
2. Accept both shapes on read and normalize on the way out of the query endpoint, then do the backfill lazily.

Option 2 is safer for prod. The query response normalization would live next to `_parse_span_into_response` in `parsing.py`.
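A minimal sketch of what that read-side normalization might look like, assuming it receives the `ag` attribute dict of a span. `normalize_duration` is a hypothetical name, and how it would be wired into the query path is left open here.

```python
import copy
from numbers import Real
from typing import Any, Dict


def normalize_duration(ag: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `ag` with metrics.duration.cumulative dict-shaped.

    Hypothetical helper: legacy rows store a bare number, which is wrapped
    as {"total": number}. Rows already in the dict shape pass through
    unchanged, so the function is safe to apply repeatedly.
    """
    out = copy.deepcopy(ag)  # do not mutate the caller's data
    metrics = out.get("metrics")
    if not isinstance(metrics, dict):
        return out
    duration = metrics.get("duration")
    if not isinstance(duration, dict):
        return out
    cumulative = duration.get("cumulative")
    if isinstance(cumulative, Real) and not isinstance(cumulative, bool):
        duration["cumulative"] = {"total": cumulative}
    return out


legacy = {"metrics": {"duration": {"cumulative": 1922.653}}}
print(normalize_duration(legacy))
# {'metrics': {'duration': {'cumulative': {'total': 1922.653}}}}
```

Until a server-side fix ships, the same function could also serve as a client-side workaround: apply it to each span from the query response before resubmitting to ingest.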

## Related

See companion issue on silent validation failures in `parse_spans_from_request` — that is what hides this bug from clients today.
