feat(ddgr): surface Datadog Monitor state in CR status#3047

Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 1 commit into main from CONTP-1686/wassim.dhif/ddgr-monitor-status
Jun 1, 2026

Conversation

@wdhif wdhif commented May 27, 2026

What does this PR do?

Surfaces the live Datadog-side state of Monitor resources managed by DatadogGenericResource (DDGR), matching the UX DatadogMonitor already provides. Three new fields appear on the CR status — state, stateLastUpdateTime, stateLastTransitionTime — populated on a ~60s cadence from the Datadog API. A new StateSynced condition signals refresh success/failure (reason=GetError when the API call fails; last-known state is preserved).

The design is intentionally type-agnostic at every layer (schema, condition, dispatch, reconcile branch). Only MonitorHandler.refreshStatusFunc does Monitor-specific work; the other five handlers are one-line no-ops. When SLO support lands in DDGR, its handler implements the same refreshStatusFunc, populates the same three generic fields, and uses the same StateSynced condition — zero schema changes required.
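As a rough picture of the type-agnostic dispatch, the sketch below gives every handler the same refresh method and lets only the Monitor handler do real work. All identifiers (`stateRefresher`, `monitorHandler`, `noStateHandler`, `handlerFor`) and the `dashboard`/`notebook` type strings are assumptions made for illustration; the Datadog API call is replaced by an injected function.

```go
package main

import (
	"errors"
	"fmt"
)

// stateRefresher is a hypothetical interface shared by all handlers.
// supported=false means the resource type exposes no Datadog-side state.
type stateRefresher interface {
	refreshState(id string) (state string, supported bool, err error)
}

// monitorHandler is the only handler that does type-specific work.
type monitorHandler struct {
	// fetchOverallState stands in for the Datadog API call.
	fetchOverallState func(id string) (string, error)
}

func (h monitorHandler) refreshState(id string) (string, bool, error) {
	state, err := h.fetchOverallState(id)
	return state, true, err
}

// noStateHandler models the one-line no-op handlers.
type noStateHandler struct{}

func (noStateHandler) refreshState(string) (string, bool, error) {
	return "", false, nil
}

// handlerFor picks a handler from the resource type.
func handlerFor(resourceType string, fetch func(string) (string, error)) (stateRefresher, error) {
	switch resourceType {
	case "monitor":
		return monitorHandler{fetchOverallState: fetch}, nil
	case "dashboard", "notebook":
		return noStateHandler{}, nil
	default:
		return nil, errors.New("unsupported resource type: " + resourceType)
	}
}

func main() {
	fake := func(string) (string, error) { return "No Data", nil }
	for _, resourceType := range []string{"monitor", "dashboard"} {
		h, err := handlerFor(resourceType, fake)
		if err != nil {
			fmt.Println(err)
			continue
		}
		state, supported, err := h.refreshState("287954487")
		fmt.Printf("%s: state=%q supported=%v err=%v\n", resourceType, state, supported, err)
	}
}
```

Under this shape, adding a new stateful type means writing one more handler that returns `supported=true`; the caller and the status fields stay as they are.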

Motivation

This PR closes the gap between DDGR and DatadogMonitor for Monitor state visibility, and leaves the door open for SLO without committing to its schema.

Additional Notes

  • Deliberate feature gap vs DatadogMonitor: triggeredState (per-group breakdown — which hosts/services are firing inside a grouped monitor) is intentionally omitted from DDGR. It's Monitor-specific and can't be expressed type-agnostically without polluting the generic schema. Users who need per-group visibility should use the dedicated DatadogMonitor CRD. If a real need emerges later, a Monitor-only triggeredState field can be added at the top level without breaking anything.
  • State-conversion helper is duplicated, not extracted. setMonitorStateOnStatus in monitors.go mirrors convertStateToStatus in internal/controller/datadogmonitor/controller.go by design — a one-time minor DRY violation in exchange for a smaller blast radius (no shared-package refactor that would touch DatadogMonitor). Both copies carry a cross-reference comment so future divergence is easy to spot.

Minimum Agent Versions

No agent-side change. No minimum version bump.

  • Agent: no change
  • Cluster Agent: no change

Describe your test plan

1. Install Operator and DDGR

helm repo update datadog
helm install datadog-operator datadog/datadog-operator --devel \
  --set apiKeyExistingSecret=datadog-secret \
  --set appKeyExistingSecret=datadog-secret \
  --set datadogGenericResource.enabled=true \
  --set datadogCRDs.crds.datadogGenericResources=true \
  --set datadogMonitor.enabled=true \
  --set datadogCRDs.crds.datadogMonitors=true

2. Apply a Monitor DatadogGenericResource

# ddgr-monitor-state-test.yaml
apiVersion: datadoghq.com/v1alpha1
kind: DatadogGenericResource
metadata:
  name: ddgr-monitor-state-test
spec:
  type: monitor
  jsonSpec: |-
    {
      "name": "CONTP-1686 state test",
      "type": "metric alert",
      "query": "avg(last_5m):avg:system.cpu.user{*} by {host} > 100",
      "message": "state-refresh test — Notify nobody",
      "tags": ["test:contp-1686"],
      "options": {"thresholds": {"critical": 100, "warning": 80}}
    }

kubectl apply -f ddgr-monitor-state-test.yaml

3. Verify

kubectl get ddgr -o wide

Expected — new STATE and LAST STATE SYNC columns populated:

NAME                      ID          SYNC STATUS   STATE     LAST STATE SYNC        AGE
ddgr-monitor-state-test   287954487   OK            No Data   2026-05-27T09:33:50Z   76s

kubectl describe ddgr ddgr-monitor-state-test

Expected — StateSynced condition + new State* status fields:

Status:
  Conditions:
    Message:                   DatadogGenericResource Created
    Reason:                    CreatingGenericResource
    Status:                    True
    Type:                      Created
    Message:                   State refreshed from Datadog
    Reason:                    Synced
    Status:                    True
    Type:                      StateSynced
  State:                       No Data
  State Last Transition Time:  2026-05-27T09:33:50Z
  State Last Update Time:      2026-05-27T09:33:50Z
  Sync Status:                 OK

Checklist

  • PR has at least one valid label: bug, enhancement, refactoring, documentation, tooling, and/or dependencies — please add enhancement (primary) and optionally documentation
  • PR has a milestone or the qa/skip-qa label
  • All commits are signed (see: signing commits)

datadog-prod-us1-3 Bot commented May 27, 2026

Code Coverage  Pipelines

🛑 Gate Violations

🎯 1 Code Coverage issue detected

A Patch coverage percentage gate may be blocking this PR.

Patch coverage: 61.54% (threshold: 80.00%)

⚠️ Warnings

🚦 8 Pipeline jobs failed

DataDog/datadog-operator | e2e: [1.19]   View in Datadog   GitLab

🔄 Retry job. This looks flaky and may succeed on retry. Job failed: execution took longer than 1h0m0s seconds due to timeout.

DataDog/datadog-operator | e2e: [1.22]   View in Datadog   GitLab

🔄 Retry job. This looks flaky and may succeed on retry. Job execution exceeded the timeout limit of 1h0m0s.

DataDog/datadog-operator | e2e: [1.24]   View in Datadog   GitLab

🔄 Retry job. This looks flaky and may succeed on retry. Job failed due to timeout: execution took longer than 1h0m0s seconds.

View all 8 failed jobs.

ℹ️ Info

🎯 Code Coverage (details)
Patch Coverage: 61.54%
Overall Coverage: 43.33% (+0.02%)

🔗 Commit SHA: be5f03b

@wdhif wdhif added the enhancement New feature or request label May 27, 2026
@wdhif wdhif added this to the v1.29.0 milestone May 27, 2026
@wdhif wdhif force-pushed the CONTP-1686/wassim.dhif/ddgr-monitor-status branch from 1c4acd5 to 2c606b1 Compare May 27, 2026 08:46

codecov-commenter commented May 27, 2026

Codecov Report

❌ Patch coverage is 51.51515% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 43.00%. Comparing base (82a75d4) to head (be5f03b).

Files with missing lines                                Patch %   Lines
...rnal/controller/datadoggenericresource/monitors.go   0.00%     6 Missing ⚠️
...al/controller/datadoggenericresource/synthetics.go   0.00%     4 Missing ⚠️
...al/controller/datadoggenericresource/dashboards.go   0.00%     2 Missing ⚠️
...nal/controller/datadoggenericresource/downtimes.go   0.00%     2 Missing ⚠️
...nal/controller/datadoggenericresource/notebooks.go   0.00%     2 Missing ⚠️
Additional details and impacted files

@@           Coverage Diff           @@
##             main    #3047   +/-   ##
=======================================
  Coverage   42.99%   43.00%           
=======================================
  Files         339      339           
  Lines       29193    29226   +33     
=======================================
+ Hits        12551    12568   +17     
- Misses      15820    15836   +16     
  Partials      822      822           
Flag Coverage Δ
unittests 43.00% <51.51%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.

Files with missing lines                                Coverage Δ
...al/controller/datadoggenericresource/controller.go   78.80% <100.00%> (+2.68%) ⬆️
...kg/controller/utils/condition/condition_generic.go   0.00% <ø> (ø)
...al/controller/datadoggenericresource/dashboards.go   14.58% <0.00%> (-0.64%) ⬇️
...nal/controller/datadoggenericresource/downtimes.go   9.85% <0.00%> (-0.29%) ⬇️
...nal/controller/datadoggenericresource/notebooks.go   14.03% <0.00%> (-0.52%) ⬇️
...al/controller/datadoggenericresource/synthetics.go   27.17% <0.00%> (-1.24%) ⬇️
...rnal/controller/datadoggenericresource/monitors.go   13.11% <0.00%> (-1.44%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@wdhif wdhif force-pushed the CONTP-1686/wassim.dhif/ddgr-monitor-status branch from 2c606b1 to 8d0e93a Compare May 27, 2026 09:00
@wdhif wdhif marked this pull request as ready for review May 27, 2026 10:20
@wdhif wdhif requested review from a team and Copilot May 27, 2026 10:20
@wdhif wdhif requested review from a team as code owners May 27, 2026 10:20

Copilot AI left a comment

Pull request overview

This PR adds live Datadog-side “state” visibility to DatadogGenericResource (DDGR) status for resource types that can expose it (currently Monitors), similar to the existing UX for DatadogMonitor. It introduces three new status fields (state, stateLastUpdateTime, stateLastTransitionTime) and a StateSynced condition that reports refresh success/failure while preserving last-known state on refresh errors.

Changes:

  • Extend the DDGR controller/handler interface with refreshState() and implement Monitor state refresh (other handlers are no-ops).
  • Add new DDGR status fields + printer columns + generated CRD/OpenAPI/deepcopy updates.
  • Add unit tests for state application logic and reconcile-path state refresh behavior; update docs to describe the new status surface.

Reviewed changes

Copilot reviewed 16 out of 18 changed files in this pull request and generated 1 comment.

File Description
pkg/controller/utils/condition/condition_generic.go Adds StateSynced condition type constant used by DDGR refresh logic.
internal/controller/datadoggenericresource/resource_handler.go Extends ResourceHandler with refreshState() and defines ResourceState.
internal/controller/datadoggenericresource/controller.go Adds idle-tick branch to refresh Datadog-side state and merges it into status fields + condition updates.
internal/controller/datadoggenericresource/monitors.go Implements Monitor-specific refreshState() via Datadog API overall state.
internal/controller/datadoggenericresource/dashboards.go Adds no-op refreshState() implementation.
internal/controller/datadoggenericresource/notebooks.go Adds no-op refreshState() implementation.
internal/controller/datadoggenericresource/downtimes.go Adds no-op refreshState() implementation.
internal/controller/datadoggenericresource/synthetics.go Adds no-op refreshState() implementations for synthetics handlers.
internal/controller/datadoggenericresource/monitors_test.go New unit tests for applyResourceState() timestamp/transition semantics.
internal/controller/datadoggenericresource/mock_handler_test.go Extends mock handler to support refreshState() and tracks calls/results.
internal/controller/datadoggenericresource/controller_test.go Adds reconcile tests covering state refresh success and failure behavior.
api/datadoghq/v1alpha1/datadoggenericresource_types.go Adds new DDGR status fields + kubebuilder printer columns.
api/datadoghq/v1alpha1/zz_generated.deepcopy.go Adds deepcopy for the new status time pointers.
api/datadoghq/v1alpha1/zz_generated.openapi.go Updates OpenAPI schema for the new status fields.
config/crd/bases/v1/datadoghq.com_datadoggenericresources.yaml Adds printer columns + schema for the new status fields.
config/crd/bases/v1/datadoghq.com_datadoggenericresources_v1alpha1.json Adds schema for the new status fields in JSON CRD base.
docs/datadog_generic_resource.md Documents the new Datadog-side status fields and StateSynced condition.
docs/datadog_generic_resource_dev.md Updates contributor guide for the new handler method and post-dev steps.
Files not reviewed (2)
  • api/datadoghq/v1alpha1/zz_generated.deepcopy.go: Language not supported
  • api/datadoghq/v1alpha1/zz_generated.openapi.go: Language not supported
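The "idle-tick branch" noted for controller.go in the file summary can be pictured as the fall-through case of the reconcile decision: when there is nothing to create or update, the tick is spent refreshing Datadog-side state. The sketch below is a guess at that shape, not the controller's code; `nextStep`, its parameters, and the returned labels are all hypothetical.

```go
package main

import "fmt"

// nextStep names the reconcile branch to take for one tick.
// Assumption: a resource with no Datadog ID must be created, a changed
// spec hash triggers an update, and everything else is an idle tick.
func nextStep(resourceID, appliedSpecHash, desiredSpecHash string) string {
	switch {
	case resourceID == "":
		return "create"
	case appliedSpecHash != desiredSpecHash:
		return "update"
	default:
		// Idle tick: nothing to change, so refresh the state instead.
		return "refresh-state"
	}
}

func main() {
	fmt.Println(nextStep("", "", "abc"))             // prints "create"
	fmt.Println(nextStep("287954487", "abc", "def")) // prints "update"
	fmt.Println(nextStep("287954487", "abc", "abc")) // prints "refresh-state"
}
```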

Comment thread docs/datadog_generic_resource_dev.md Outdated

@OliviaShoup OliviaShoup left a comment

thanks for the PR! 🚀 left one suggestion to chop up a long sentence into two

Comment thread docs/datadog_generic_resource.md

@tbavelier tbavelier left a comment

👍 , except ResourceState struct IMHO, see comment

Comment thread internal/controller/datadoggenericresource/resource_handler.go Outdated
@wdhif wdhif force-pushed the CONTP-1686/wassim.dhif/ddgr-monitor-status branch from a36238e to 740f7aa Compare May 29, 2026 10:22
@wdhif wdhif requested a review from tbavelier May 29, 2026 10:23

@tbavelier tbavelier left a comment

LGTM, unblocking, but the dev documentation should be updated accordingly

Comment thread docs/datadog_generic_resource_dev.md
@wdhif wdhif force-pushed the CONTP-1686/wassim.dhif/ddgr-monitor-status branch from 7a40632 to bde5952 Compare June 1, 2026 09:52
Signed-off-by: Wassim DHIF <wassim.dhif@datadoghq.com>
@wdhif wdhif force-pushed the CONTP-1686/wassim.dhif/ddgr-monitor-status branch from bde5952 to be5f03b Compare June 1, 2026 09:59
@wdhif wdhif requested a review from Copilot June 1, 2026 09:59

Copilot AI left a comment

Pull request overview

Copilot reviewed 16 out of 18 changed files in this pull request and generated no new comments.

Files not reviewed (2)
  • api/datadoghq/v1alpha1/zz_generated.deepcopy.go: Language not supported
  • api/datadoghq/v1alpha1/zz_generated.openapi.go: Language not supported

@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot merged commit 8b39974 into main Jun 1, 2026
45 of 54 checks passed
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot deleted the CONTP-1686/wassim.dhif/ddgr-monitor-status branch June 1, 2026 17:57
6 participants