feat(metrics): add vikingbot feedback observability by myysy · Pull Request #2037 · volcengine/OpenViking

myysy · 2026-05-14T05:52:02Z

Description

Add VikingBot feedback observability to OpenViking metrics by exporting scrape-time feedback and outcome aggregates from persisted bot sessions. This also documents the new metric families and provides Grafana/Prometheus examples plus validation guides.

Related Issue

Type of Change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Refactoring (no functional changes)
Performance improvement
Test update

Changes Made

Add FeedbackCollector and bot-side feedback aggregation logic to export openviking_feedback_* and openviking_feedback_channel_* gauges from persisted VikingBot session data.
Extend bot observability, outcome metadata, console surfaces, and metrics bootstrap/tests to support feedback coverage, thumbs up/down, outcome totals, and per-channel breakdowns.
Add metrics documentation, validation guides, and Grafana/Prometheus example configs and dashboard for VikingBot feedback observability.

Testing

I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I have tested this on the following platforms:
- Linux
- macOS
- Windows

Checklist

My code follows the project's coding style
I have performed a self-review of my code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
Any dependent changes have been merged and published

Screenshots (if applicable)

N/A

Additional Notes

This PR exports scrape-time snapshot gauges rather than online counters. The collector recomputes aggregates from bot/sessions/*.jsonl on each scrape.
valid="1" indicates a fresh successful snapshot, while valid="0" indicates fallback to the last successful snapshot after collector refresh failure.
Local untracked environment files were intentionally excluded from the commit: compare_json.py, log, ov_conf/, and scripts/.

github-actions · 2026-05-14T05:53:28Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🏅 Score: 85
🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review Missing License Header New Python file does not include the required license header. While R6 explicitly mentions openviking/ and openviking_cli/ directories, checking consistency with other project files is recommended. """Offline feedback observability aggregation over persisted sessions."""

github-actions · 2026-05-14T05:54:07Z

PR Code Suggestions ✨

No code suggestions found for the PR.

# Conflicts: # bot/vikingbot/agent/memory.py

MaojiaSheng · 2026-05-14T12:21:26Z

发群里请相关同学review吧

yeshion23333

I found two blocking issues and two follow-up suggestions in the new feedback observability path. The most serious problem is that the default server metrics bootstrap now has a hard import-time dependency on the vikingbot package and its transitive runtime requirements.

yeshion23333 · 2026-05-15T08:34:37Z

+from typing import Any, ClassVar
+
+from openviking.metrics.core.base import MetricCollector
+


[Bug] (blocking) This collector now imports vikingbot.config.loader at module import time, and openviking.metrics.bootstrap imports FeedbackCollector unconditionally. That means enabling the default server metrics path now requires the full vikingbot dependency chain to be importable, even in deployments that only use the server. I was able to reproduce this locally by importing create_default_collector_manager(), which failed in the bot dependency chain with ModuleNotFoundError. Before this PR, the default metrics bootstrap did not depend on bot-only runtime packages. Please decouple the collector from vikingbot.config.loader and inject the bot data path from server-side config (or at least defer the dependency so unsupported environments degrade gracefully instead of failing during import).

yeshion23333 · 2026-05-15T08:34:37Z

+        return None
+
+    feedback_events = metadata.get("feedback_events", [])
+    response_outcomes = metadata.get("response_outcomes", {})


[Design] (blocking) The denominator for the exported rates is currently len(_collect_response_ids(...)), i.e. every persisted assistant message with a response_id. The numerators, however, only come from the new feedback_events / response_outcomes contract in metadata. In mixed historical data, this will systematically understate coverage and resolution because old responses that were never part of the new observability contract are still counted in responses_total. For example, if an older session has 100 assistant responses but only the newest 10 are tracked in response_outcomes, a dashboard reader will see those rates as if all 100 responses were eligible. Please either change the denominator to tracked responses or export a separate tracked-response total and document the current semantics explicitly.

yeshion23333 · 2026-05-15T08:34:37Z

@@ -80,6 +81,7 @@ def create_default_collector_manager(*, app=None, service=None) -> CollectorMana
    manager = CollectorManager()
    manager.register(QueueCollector(data_source=QueuePipelineStateDataSource()))
    manager.register(TaskTrackerCollector(data_source=TaskStateDataSource()))


[Suggestion] (non-blocking) Registering FeedbackCollector() with no explicit constructor inputs makes the bootstrap path rely on hidden defaults inside the collector. That is what forces the current test to monkeypatch load_config, and it also makes deployment configuration harder to reason about. Consider passing the bot data path (or a dedicated feedback metrics config object) from the bootstrap layer so the dependency stays explicit and testable.

yeshion23333 · 2026-05-15T08:34:37Z

        finally:
            shutdown_metrics(app=app)

+    async def test_metrics_endpoint_exports_feedback_metrics(self, monkeypatch, tmp_path):


[Suggestion] (non-blocking) This endpoint test only writes the metadata record and does not include any persisted assistant message rows. That still proves the endpoint can expose feedback metrics, but it does not exercise the real denominator path used by responses_total, feedback_coverage, and the derived rates, because those currently come from scanning assistant messages in the JSONL body. Adding at least one assistant record here would make this test much better at catching regressions in the session-file contract and the rate semantics.

…from volcengine#2037

…from #2037 (#2082)

feat(metrics): add vikingbot feedback observability

2a13162

github-project-automation Bot added this to OpenViking project May 14, 2026

github-project-automation Bot moved this to Backlog in OpenViking project May 14, 2026

Merge branch 'main' into feat/vikingbot-feedback-observability

8d0d27d

# Conflicts: # bot/vikingbot/agent/memory.py

myysy added 2 commits May 15, 2026 15:45

fix(bot): align feedback observability contracts

e991148

Merge branch 'main' into feat/vikingbot-feedback-observability

7ab4876

yeshion23333 requested changes May 15, 2026

View reviewed changes

fix(metrics): decouple feedback collector bootstrap

e502450

yeshion23333 approved these changes May 15, 2026

View reviewed changes

yeshion23333 merged commit 9ebfd59 into main May 15, 2026
6 of 7 checks passed

yeshion23333 deleted the feat/vikingbot-feedback-observability branch May 15, 2026 09:30

github-project-automation Bot moved this from Backlog to Done in OpenViking project May 15, 2026

r266-tech added a commit to r266-tech/OpenViking that referenced this pull request May 16, 2026

docs(en/metrics): add per-channel one-turn resolution PromQL example …

65ce2b7

…from volcengine#2037

r266-tech mentioned this pull request May 16, 2026

docs(en/metrics): add per-channel one-turn resolution PromQL example from #2037 #2082

Merged

qin-ctx pushed a commit that referenced this pull request May 18, 2026

docs(en/metrics): add per-channel one-turn resolution PromQL example …

b1c3936

…from #2037 (#2082)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(metrics): add vikingbot feedback observability#2037

feat(metrics): add vikingbot feedback observability#2037
yeshion23333 merged 5 commits into
mainfrom
feat/vikingbot-feedback-observability

myysy commented May 14, 2026

Uh oh!

github-actions Bot commented May 14, 2026

Uh oh!

github-actions Bot commented May 14, 2026

Uh oh!

MaojiaSheng commented May 14, 2026

Uh oh!

yeshion23333 left a comment

Uh oh!

yeshion23333 May 15, 2026

Uh oh!

yeshion23333 May 15, 2026

Uh oh!

yeshion23333 May 15, 2026

Uh oh!

yeshion23333 May 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		from typing import Any, ClassVar

		from openviking.metrics.core.base import MetricCollector

Conversation

myysy commented May 14, 2026

Description

Related Issue

Type of Change

Changes Made

Testing

Checklist

Screenshots (if applicable)

Additional Notes

Uh oh!

github-actions Bot commented May 14, 2026

PR Reviewer Guide 🔍

Uh oh!

github-actions Bot commented May 14, 2026

PR Code Suggestions ✨

Uh oh!

MaojiaSheng commented May 14, 2026

Uh oh!

yeshion23333 left a comment

Choose a reason for hiding this comment

Uh oh!

yeshion23333 May 15, 2026

Choose a reason for hiding this comment

Uh oh!

yeshion23333 May 15, 2026

Choose a reason for hiding this comment

Uh oh!

yeshion23333 May 15, 2026

Choose a reason for hiding this comment

Uh oh!

yeshion23333 May 15, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants