Batch time-series endpoint and embedded run scores #508
Conversation
**Caution:** Review failed. Pull request was closed or merged during review.

📝 Walkthrough

Adds an optional `includeScores` flag to the monitor runs list to embed per-run evaluator summaries, and introduces a batch time-series endpoint accepting multiple evaluators (comma-separated) that returns per-evaluator series in a single response. Frontend clients, spec, models, services, repos, tests, and wiring updated accordingly.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as Frontend Client
    participant Controller as Monitor Scores Controller
    participant Service as Monitor Scores Service
    participant Repo as Score Repository
    participant DB as Database
    Client->>Controller: GET /monitors/{m}/scores/timeseries?evaluators=eval1,eval2&startTime&endTime
    Controller->>Service: GetEvaluatorsTimeSeries(monitorID, [eval1,eval2], startTime, endTime)
    Service->>Repo: Probe sparsity for evaluatorNames
    Repo->>DB: Query traces/spans for evaluatorNames
    DB-->>Repo: probe results (sparse/dense)
    Repo-->>Service: probe outcome
    alt Sparse path (trace-level)
        Service->>Repo: GetEvaluatorsTraceAggregated(monitorID, evaluatorNames, startTime, endTime, limit)
        Repo->>DB: Fetch trace-level aggregates per evaluator
        DB-->>Repo: trace aggregates
    else Dense path (bucketed)
        Service->>Repo: GetEvaluatorsTimeSeriesAggregated(monitorID, evaluatorNames, startTime, endTime, granularity)
        Repo->>DB: Fetch time-bucket aggregates per evaluator
        DB-->>Repo: bucket aggregates
    end
    Repo-->>Service: Batch aggregation results
    Service-->>Controller: BatchTimeSeriesResponse (preserve input evaluator order)
    Controller-->>Client: JSON { monitorName, granularity, evaluators: [{evaluatorName, points: [...]}] }
```
Estimated Code Review Effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 3 passed | ❌ 2 failed (1 warning, 1 inconclusive)
Actionable comments posted: 3
**Caution:** Some comments are outside the diff and can't be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
agent-manager-service/docs/api_v1_openapi.yaml (1)
5576-5628: ⚠️ Potential issue | 🟠 Major

**Keep the legacy `evaluator` query param in the contract.**

The linked objective says this route should add `evaluators` while keeping `evaluator` for backward compatibility. Right now the spec removes `evaluator`, and `agent-manager-service/controllers/monitor_scores_controller.go:197-230` hard-fails when `evaluators` is absent. Existing clients that still send `evaluator=...` will now get a 400 instead of continuing to work.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@agent-manager-service/docs/api_v1_openapi.yaml` around lines 5576-5628: The OpenAPI contract removed the legacy single "evaluator" query param, causing older clients to break; update the GET /orgs/{orgName}/projects/{projName}/agents/{agentName}/monitors/{monitorName}/scores/timeseries operation (operationId getMonitorScoresTimeSeries) to include a non-required query parameter named "evaluator" (type: string, in: query) alongside the new "evaluators" param and note in its description that it is legacy and equivalent to a single-item evaluators list; this will align the spec with agent-manager-service/controllers/monitor_scores_controller.go (which currently accepts/depends on either) and preserve backward compatibility.

console/workspaces/pages/eval/src/subComponents/PerformanceByEvaluatorCard.tsx (1)
112-119: ⚠️ Potential issue | 🟡 Minor

**Format the x-axis from the returned granularity.**

The batch endpoint can now return `minute`/`hour`/`day`/`week`, but the chart still always formats buckets with `HH:mm`. For `day` and `week` series that renders misleading `00:00` labels and makes the trend harder to read.

Suggested fix:

```diff
-    const label = date.toLocaleString(undefined, {
-      month: "short",
-      day: "numeric",
-      hour: "2-digit",
-      minute: "2-digit",
-      hour12: false,
-    });
+    const label = formatBucketLabel(date, timeSeriesByEvaluator.granularity);
```

```typescript
const formatBucketLabel = (date: Date, granularity?: string) => {
  switch (granularity) {
    case "day":
    case "week":
      return date.toLocaleDateString(undefined, {
        month: "short",
        day: "numeric",
      });
    default:
      return date.toLocaleString(undefined, {
        month: "short",
        day: "numeric",
        hour: "2-digit",
        minute: "2-digit",
        hour12: false,
      });
  }
};
```

Also applies to: 128-136
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@console/workspaces/pages/eval/src/subComponents/PerformanceByEvaluatorCard.tsx` around lines 112-119: The x-axis is always formatted as "HH:mm"; introduce a formatter that uses the returned granularity and apply it when building series and rendering ticks: add a formatBucketLabel(date, granularity?) function (use the suggested switch for "day"/"week" vs default) and use timeSeriesByEvaluator.granularity (or the returned granularity field) when formatting each point in evaluators.forEach (where seriesMap[evaluatorName] is created) and when supplying the tick label formatter for the chart (the code block around lines 128-136 that renders x-axis ticks). Ensure you replace the hardcoded HH:mm formatting with calls to formatBucketLabel so day/week show "Mon 1"-style labels and minute/hour use date+time.

agent-manager-service/spec/api_default.go (1)
4769-4786: ⚠️ Potential issue | 🟠 Major

**Keep the legacy single-evaluator path in the generated client.**

Line 4848 now makes `evaluators` mandatory, and the old `Evaluator(...)` builder is gone. That breaks the compatibility goal from #471 for existing Go callers that still send the single `evaluator` query param. Please model both params in the OpenAPI spec, keep them optional, and regenerate so old callers keep compiling while new callers can batch.

Also applies to: 4848-4854
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@agent-manager-service/spec/api_default.go` around lines 4769 - 4786, The generated client removed the legacy single-evaluator builder and made evaluators mandatory; update the OpenAPI spec to define both query params "evaluator" (single) and "evaluators" (comma-separated) as optional, then regenerate the client so ApiGetMonitorScoresTimeSeriesRequest contains both builder methods (Evaluator(evaluator string) and Evaluators(evaluators string)) and neither field is required; ensure the generated getters/setters for Evaluator and Evaluators exist and accept pointer/optional values so existing callers using Evaluator(...) keep compiling while new callers can use Evaluators(...).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@agent-manager-service/controllers/monitor_scores_controller.go`:
- Around line 209-218: The handler currently rejects requests that use the
legacy "evaluator" query param; update the parsing logic so if
r.URL.Query().Get("evaluators") is empty it falls back to
r.URL.Query().Get("evaluator") and passes that value into parseEvaluatorsList
(or constructs a single-item slice) so evaluatorNames contains at least one
name; specifically modify the block that reads evaluatorsParam and builds
evaluatorNames (referencing parseEvaluatorsList and the evaluatorNames variable)
to accept either "evaluators" (CSV) or the single legacy "evaluator" parameter
before returning the 400.
In `@agent-manager-service/docs/api_v1_openapi.yaml`:
- Around line 5279-5292: Update the OpenAPI parameter docs for pagination to
match the server's enforcement: in the getAgentTraceScores operation (and other
scores/traces endpoints), change the limit query parameter to include "minimum:
1" and "maximum: 100" and set its default to 100 (instead of 20), and add
"minimum: 0" to the offset parameter while keeping default 0; this aligns the
YAML parameter definitions with the checks implemented in monitor_controller.go
(the limit>0, cap to 100, and non-negative offset logic).
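The limit/offset rules this prompt describes (default 100, cap at 100, non-negative offset) can be sketched as a normalizer; the function name is illustrative, not the helper actually used in monitor_controller.go:

```go
package main

import "fmt"

// clampPagination normalizes pagination inputs: an unset or invalid
// limit falls back to 100, limits above 100 are capped, and negative
// offsets are reset to 0 — matching the checks described above.
func clampPagination(limit, offset int) (int, int) {
	if limit <= 0 {
		limit = 100 // default when unset or invalid
	}
	if limit > 100 {
		limit = 100 // server-side hard cap
	}
	if offset < 0 {
		offset = 0
	}
	return limit, offset
}

func main() {
	l, o := clampPagination(500, -3)
	fmt.Println(l, o) // 100 0
}
```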
In `@agent-manager-service/utils/makeresults.go`:
- Around line 739-752: The code currently only sets response.Scores when
len(run.Scores) > 0, which drops non-nil empty slices; change the conditional to
check for a non-nil slice (if run.Scores != nil) so requested-but-empty score
lists are preserved, then create scores := make([]spec.EvaluatorScoreSummary,
len(run.Scores)) and populate it (loop will be skipped for len==0) and assign
response.Scores = scores; update references to run.Scores and response.Scores in
makeresults.go accordingly.
---
Outside diff comments:
In `@agent-manager-service/docs/api_v1_openapi.yaml`:
- Around line 5576-5628: The OpenAPI contract removed the legacy single
"evaluator" query param causing older clients to break; update the GET
/orgs/{orgName}/projects/{projName}/agents/{agentName}/monitors/{monitorName}/scores/timeseries
(operationId getMonitorScoresTimeSeries) to include a non-required query
parameter named "evaluator" (type: string, in: query) alongside the new
"evaluators" param and note in its description that it is legacy and equivalent
to a single-item evaluators list; this will align the spec with
agent-manager-service/controllers/monitor_scores_controller.go (which currently
accepts/depends on either) and preserve backward compatibility.
In `@agent-manager-service/spec/api_default.go`:
- Around line 4769-4786: The generated client removed the legacy
single-evaluator builder and made evaluators mandatory; update the OpenAPI spec
to define both query params "evaluator" (single) and "evaluators"
(comma-separated) as optional, then regenerate the client so
ApiGetMonitorScoresTimeSeriesRequest contains both builder methods
(Evaluator(evaluator string) and Evaluators(evaluators string)) and neither
field is required; ensure the generated getters/setters for Evaluator and
Evaluators exist and accept pointer/optional values so existing callers using
Evaluator(...) keep compiling while new callers can use Evaluators(...).
In
`@console/workspaces/pages/eval/src/subComponents/PerformanceByEvaluatorCard.tsx`:
- Around line 112-119: The x-axis is always formatted as "HH:mm"; introduce a
formatter that uses the returned granularity and apply it when building series
and rendering ticks: add a formatBucketLabel(date, granularity?) function (use
the suggested switch for "day"/"week" vs default) and use
timeSeriesByEvaluator.granularity (or the returned granularity field) when
formatting each point in evaluators.forEach (where seriesMap[evaluatorName] is
created) and when supplying the tick label formatter for the chart (the code
block around lines 128-136 that renders X-axis ticks). Ensure you replace the
hardcoded HH:mm formatting with calls to formatBucketLabel so day/week show "Mon
1" style and minute/hour use date+time.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5cb4551e-eecb-4df5-825b-8e222d040a5f
⛔ Files ignored due to path filters (1)
`console/common/config/rush/pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (20)
- agent-manager-service/controllers/monitor_controller.go
- agent-manager-service/controllers/monitor_scores_controller.go
- agent-manager-service/docs/api_v1_openapi.yaml
- agent-manager-service/models/monitor.go
- agent-manager-service/models/score.go
- agent-manager-service/repositories/evaluation_score_repository.go
- agent-manager-service/services/monitor_manager.go
- agent-manager-service/services/monitor_scores_service.go
- agent-manager-service/spec/api_default.go
- agent-manager-service/spec/model_batch_time_series_evaluator_series.go
- agent-manager-service/spec/model_batch_time_series_response.go
- agent-manager-service/spec/model_monitor_run_response.go
- agent-manager-service/tests/monitor_scores_test.go
- agent-manager-service/utils/makeresults.go
- agent-manager-service/wiring/wire_gen.go
- console/workspaces/libs/api-client/src/apis/monitors.ts
- console/workspaces/libs/api-client/src/hooks/monitors.ts
- console/workspaces/libs/types/src/api/monitors.ts
- console/workspaces/pages/eval/src/subComponents/PerformanceByEvaluatorCard.tsx
- console/workspaces/pages/eval/src/subComponents/RunSummaryCard.tsx
🧹 Nitpick comments (1)
agent-manager-service/tests/monitor_scores_test.go (1)
304-395: **Add one real multi-evaluator contract test.**

The updated suite covers basic validation, but it still doesn't lock down the new CSV parsing rules or a successful multi-evaluator response. A couple of cases for `evaluators=a, b ,a`, `?evaluator=latency`, and >50 unique names — plus one happy-path assertion on ordered evaluator series — would make regressions here much harder to miss.

Based on learnings: In wso2/agent-manager, update all code paths handling GET /monitors/{monitorName}/scores/timeseries to accept the new `evaluators` query parameter (plural, comma-separated) and remove any usage or parsing of the old singular `evaluator` parameter.
Also applies to: 403-445
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@agent-manager-service/tests/monitor_scores_test.go` around lines 304 - 395, Add a real multi-evaluator contract test and remove legacy singular parsing: update the test suite around TestGetScoresTimeSeries_Validation and TestGetScoresTimeSeries_ValidRanges (using newScoresHandler and the same base path) to include cases for evaluators=a,b,a (duplicates), evaluators with >50 unique names (expect 400), and a request using evaluator=latency (singular) to confirm it is rejected; also add one happy-path integration-style test that sends evaluators=latency,throughput (comma-separated) and asserts the response contains ordered evaluator time series for those names. Then change the server code path that handles GET /monitors/{monitorName}/scores/timeseries (the handler invoked by handler.ServeHTTP/newScoresHandler) to parse the plural "evaluators" query param as a comma-separated list, enforce max-unique limit (50), dedupe when appropriate, and remove any parsing/acceptance of the old "evaluator" singular parameter so only the plural CSV form is supported.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@agent-manager-service/tests/monitor_scores_test.go`:
- Around line 304-395: Add a real multi-evaluator contract test and remove
legacy singular parsing: update the test suite around
TestGetScoresTimeSeries_Validation and TestGetScoresTimeSeries_ValidRanges
(using newScoresHandler and the same base path) to include cases for
evaluators=a,b,a (duplicates), evaluators with >50 unique names (expect 400),
and a request using evaluator=latency (singular) to confirm it is rejected; also
add one happy-path integration-style test that sends
evaluators=latency,throughput (comma-separated) and asserts the response
contains ordered evaluator time series for those names. Then change the server
code path that handles GET /monitors/{monitorName}/scores/timeseries (the
handler invoked by handler.ServeHTTP/newScoresHandler) to parse the plural
"evaluators" query param as a comma-separated list, enforce max-unique limit
(50), dedupe when appropriate, and remove any parsing/acceptance of the old
"evaluator" singular parameter so only the plural CSV form is supported.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 47b35a72-d201-4f66-8b98-3b238237b676
📒 Files selected for processing (5)
- agent-manager-service/controllers/monitor_scores_controller.go
- agent-manager-service/models/score.go
- agent-manager-service/services/monitor_scores_service.go
- agent-manager-service/tests/monitor_scores_test.go
- console/workspaces/pages/eval/src/subComponents/RunSummaryCard.tsx
Summary

- Batch time-series endpoint (`evaluators` query param) that returns grouped data for all requested evaluators
- `includeScores=true` param to `ListMonitorRuns` to embed evaluator score summaries in each run, eliminating the N+1 request waterfall

Test plan

- Tests pass (`make dev-test`)
- Lint passes (`make lint`)

Summary by CodeRabbit
New Features
Documentation