feat(agent): surface per-CR failures as Degraded condition with cause on NodeStatus.Message #109

Merged
gma1k merged 2 commits into main from feat/surface-agent-failures on May 11, 2026

gma1k commented on May 11, 2026

  • Agent reconciler now publishes tombstone CRRules (rule.Err != nil) for per-CR failures (selector match, cgroup resolve, bundle load, and exporter build) instead of silently skipping them with continue.
  • StatusWriter writes the wrapped cause to NodeStatus[*].Message and forces Ready=false for tombstoned rules; the agent-global Ready callback only governs healthy rules.
  • Operator's PodTraceReconciler rolls any tombstoned NodeStatus row up into ConditionDegraded=True with Reason=AgentNodeStatus, naming the lexicographically first failing node so the message stays stable across reconciles.
  • User-visible benefit: ExporterConfigs with type: jaeger|zipkin|splunk|datadog (not yet implemented in the agent) now show "build exporter: ... not yet implemented in agent mode" in kubectl describe podtrace, instead of failing silently.

gma1k merged commit bd0322b into main on May 11, 2026
9 checks passed
gma1k deleted the feat/surface-agent-failures branch on May 11, 2026 at 11:09