Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix labels for Service Monitors #2878

Merged
merged 24 commits into from
May 20, 2024
Merged
Show file tree
Hide file tree
Changes from 6 commits
Commits
Show all changes
24 commits
Select commit Hold shift + click to select a range
1625097
Create a separate Service Monitor when the Prometheus exporter is pre…
iblancasa Apr 22, 2024
78f1315
Merge branch 'main' of github.com:open-telemetry/opentelemetry-operat…
iblancasa Apr 25, 2024
9ccd87f
Improve changelog
iblancasa Apr 25, 2024
5f55a01
Fix prometheus-cr E2E test
iblancasa Apr 25, 2024
e3174b6
Remove unused target
iblancasa Apr 25, 2024
edfb463
Merge branch 'main' into bug/2877
iblancasa Apr 26, 2024
82fefa2
Merge branch 'main' of github.com:open-telemetry/opentelemetry-operat…
iblancasa Apr 29, 2024
28baa37
Add docstring
iblancasa Apr 29, 2024
7dcaadb
Merge branch 'main' of github.com:open-telemetry/opentelemetry-operat…
iblancasa May 3, 2024
2e143ee
Fix typo
iblancasa May 3, 2024
cc96402
Merge branch 'main' of github.com:open-telemetry/opentelemetry-operat…
iblancasa May 8, 2024
978dfa0
Merge branch 'main' into bug/2877
iblancasa May 8, 2024
f5b29d8
Change the label name
iblancasa May 9, 2024
fff9870
Merge branch 'bug/2877' of github.com:iblancasa/opentelemetry-operato…
iblancasa May 9, 2024
6432371
Merge branch 'main' into bug/2877
iblancasa May 13, 2024
2469162
Merge branch 'bug/2877' of github.com:iblancasa/opentelemetry-operato…
iblancasa May 13, 2024
0b2a439
Merge branch 'main' of github.com:open-telemetry/opentelemetry-operat…
iblancasa May 13, 2024
3712c43
Change changelog description
iblancasa May 13, 2024
743247b
Merge branch 'main' into bug/2877
iblancasa May 14, 2024
552cb3e
Merge branch 'main' of github.com:open-telemetry/opentelemetry-operat…
iblancasa May 17, 2024
720c6b1
Recover removed labels
iblancasa May 17, 2024
d1889d0
Merge branch 'main' of github.com:open-telemetry/opentelemetry-operat…
iblancasa May 17, 2024
cc84a10
Add missing labels
iblancasa May 17, 2024
e5b251c
Remove wrong labels
iblancasa May 19, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
18 changes: 18 additions & 0 deletions .chloggen/bug_2877.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: bug_fix

# The name of the component, or a single word describing the area of concern, (e.g. collector, target allocator, auto-instrumentation, opamp, github action)
component: collector

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: |
Create a Service Monitor for the collector Service when Prometheus exporter is used. A different Service Monitor is created for the monitoring service.
swiatekm-sumo marked this conversation as resolved.
Show resolved Hide resolved
This helps excluding the headless service (duplicating the metrics collectiong) and splits responsibilities between the two Service Monitors.
iblancasa marked this conversation as resolved.
Show resolved Hide resolved

# One or more tracking issues related to the change
issues: [2877]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext:
4 changes: 0 additions & 4 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -312,10 +312,6 @@ endif
install-metrics-server:
./hack/install-metrics-server.sh

.PHONY: install-prometheus-operator
install-prometheus-operator:
./hack/install-prometheus-operator.sh

# This only installs the CRDs Target Allocator supports
.PHONY: install-targetallocator-prometheus-crds
install-targetallocator-prometheus-crds:
Expand Down
39 changes: 21 additions & 18 deletions controllers/builder_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -259,12 +259,13 @@ service:
Name: "test-collector",
Namespace: "test",
Labels: map[string]string{
"app.kubernetes.io/component": "opentelemetry-collector",
"app.kubernetes.io/instance": "test.test",
"app.kubernetes.io/managed-by": "opentelemetry-operator",
"app.kubernetes.io/name": "test-collector",
"app.kubernetes.io/part-of": "opentelemetry",
"app.kubernetes.io/version": "latest",
"app.kubernetes.io/component": "opentelemetry-collector",
"app.kubernetes.io/instance": "test.test",
"app.kubernetes.io/managed-by": "opentelemetry-operator",
"app.kubernetes.io/name": "test-collector",
"app.kubernetes.io/part-of": "opentelemetry",
"app.kubernetes.io/version": "latest",
"operator.opentelemetry.io/collector-service": "Exists",
},
Annotations: nil,
},
Expand Down Expand Up @@ -506,12 +507,13 @@ service:
Name: "test-collector",
Namespace: "test",
Labels: map[string]string{
"app.kubernetes.io/component": "opentelemetry-collector",
"app.kubernetes.io/instance": "test.test",
"app.kubernetes.io/managed-by": "opentelemetry-operator",
"app.kubernetes.io/name": "test-collector",
"app.kubernetes.io/part-of": "opentelemetry",
"app.kubernetes.io/version": "latest",
"app.kubernetes.io/component": "opentelemetry-collector",
"app.kubernetes.io/instance": "test.test",
"app.kubernetes.io/managed-by": "opentelemetry-operator",
"app.kubernetes.io/name": "test-collector",
"app.kubernetes.io/part-of": "opentelemetry",
"app.kubernetes.io/version": "latest",
"operator.opentelemetry.io/collector-service": "Exists",
},
Annotations: nil,
},
Expand Down Expand Up @@ -774,12 +776,13 @@ service:
Name: "test-collector",
Namespace: "test",
Labels: map[string]string{
"app.kubernetes.io/component": "opentelemetry-collector",
"app.kubernetes.io/instance": "test.test",
"app.kubernetes.io/managed-by": "opentelemetry-operator",
"app.kubernetes.io/name": "test-collector",
"app.kubernetes.io/part-of": "opentelemetry",
"app.kubernetes.io/version": "latest",
"app.kubernetes.io/component": "opentelemetry-collector",
"app.kubernetes.io/instance": "test.test",
"app.kubernetes.io/managed-by": "opentelemetry-operator",
"app.kubernetes.io/name": "test-collector",
"app.kubernetes.io/part-of": "opentelemetry",
"app.kubernetes.io/version": "latest",
"operator.opentelemetry.io/collector-service": "Exists",
},
Annotations: nil,
},
Expand Down
2 changes: 1 addition & 1 deletion internal/manifests/collector/collector.go
Original file line number Diff line number Diff line change
Expand Up @@ -56,7 +56,7 @@ func Build(params manifests.Params) ([]client.Object, error) {
if params.OtelCol.Spec.Mode == v1beta1.ModeSidecar {
manifestFactories = append(manifestFactories, manifests.Factory(PodMonitor))
} else {
manifestFactories = append(manifestFactories, manifests.Factory(ServiceMonitor))
manifestFactories = append(manifestFactories, manifests.Factory(ServiceMonitor), manifests.Factory(ServiceMonitorMonitoring))
}
}

Expand Down
37 changes: 21 additions & 16 deletions internal/manifests/collector/podmonitor.go
Original file line number Diff line number Diff line change
Expand Up @@ -31,28 +31,14 @@ import (

// PodMonitor returns the pod monitor for the given instance.
func PodMonitor(params manifests.Params) (*monitoringv1.PodMonitor, error) {
if !params.OtelCol.Spec.Observability.Metrics.EnableMetrics {
params.Log.V(2).Info("Metrics disabled for this OTEL Collector",
"params.OtelCol.name", params.OtelCol.Name,
"params.OtelCol.namespace", params.OtelCol.Namespace,
)
return nil, nil
} else if params.Config.PrometheusCRAvailability() == prometheus.NotAvailable {
params.Log.V(1).Info("Cannot enable PodMonitor when prometheus CRDs are unavailable",
"params.OtelCol.name", params.OtelCol.Name,
"params.OtelCol.namespace", params.OtelCol.Namespace,
)
if !shouldCreatePodMonitor(params) {
return nil, nil
}
var pm monitoringv1.PodMonitor

if params.OtelCol.Spec.Mode != v1beta1.ModeSidecar {
return nil, nil
}
name := naming.PodMonitor(params.OtelCol.Name)
labels := manifestutils.Labels(params.OtelCol.ObjectMeta, name, params.OtelCol.Spec.Image, ComponentOpenTelemetryCollector, nil)
selectorLabels := manifestutils.SelectorLabels(params.OtelCol.ObjectMeta, ComponentOpenTelemetryCollector)
pm = monitoringv1.PodMonitor{
pm := monitoringv1.PodMonitor{
ObjectMeta: metav1.ObjectMeta{
Namespace: params.OtelCol.Namespace,
Name: name,
Expand Down Expand Up @@ -107,3 +93,22 @@ func metricsEndpointsFromConfig(logger logr.Logger, otelcol v1beta1.OpenTelemetr
}
return metricsEndpoints
}

func shouldCreatePodMonitor(params manifests.Params) bool {
l := params.Log.WithValues(
"params.OtelCol.name", params.OtelCol.Name,
"params.OtelCol.namespace", params.OtelCol.Namespace,
)

if !params.OtelCol.Spec.Observability.Metrics.EnableMetrics {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

personal opinion but this statement can be written in a more readable way

if params.OtelCol.Spec.Observability.Metrics.EnableMetrics &&  params.Config.PrometheusCRAvailability() == prometheus.NotAvailable && params.OtelCol.Spec.Mode == v1beta1.ModeSidecar{
  return true
}

l..V(2).Info("PodMonitor will not be created", "Metrics.enabledMetrics", "params.OtelCol.Spec.Observability.Metrics.EnableMetrics ", ....)
return false

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and I would move the logger creation close to returning false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why a double condition + no log message is easier to read than the current approach?

The current is easy to follow and understand why the method returns false.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A single log message with all parameters makes it clear to understand the overall state.

The function has 2 states (true and false), aligning the condition to it makes it easier to understand

l.V(2).Info("Metrics disabled for this OTEL Collector. PodMonitor will not ve created")
return false
} else if params.Config.PrometheusCRAvailability() == prometheus.NotAvailable {
l.V(2).Info("Cannot enable PodMonitor when prometheus CRDs are unavailable")
return false
} else if params.OtelCol.Spec.Mode != v1beta1.ModeSidecar {
l.V(2).Info("Not using sidecar mode. PodMonitor will not be created")
return false
}
return true
}
3 changes: 3 additions & 0 deletions internal/manifests/collector/service.go
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,7 @@ import (
// headless and monitoring labels are to differentiate the headless/monitoring services from the clusterIP service.
const (
headlessLabel = "operator.opentelemetry.io/collector-headless-service"
iblancasa marked this conversation as resolved.
Show resolved Hide resolved
headfulLabel = "operator.opentelemetry.io/collector-service"
monitoringLabel = "operator.opentelemetry.io/collector-monitoring-service"
iblancasa marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iblancasa i think we should keep these around for now, mark them as deprecated and remove them in two+ versions rather than deleting them outright which would constitute a breaking change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

valueExists = "Exists"
)
Expand All @@ -44,6 +45,7 @@ func HeadlessService(params manifests.Params) (*corev1.Service, error) {

h.Name = naming.HeadlessService(params.OtelCol.Name)
h.Labels[headlessLabel] = valueExists
delete(h.Labels, headfulLabel)

// copy to avoid modifying params.OtelCol.Annotations
annotations := map[string]string{
Expand Down Expand Up @@ -90,6 +92,7 @@ func MonitoringService(params manifests.Params) (*corev1.Service, error) {
func Service(params manifests.Params) (*corev1.Service, error) {
name := naming.Service(params.OtelCol.Name)
labels := manifestutils.Labels(params.OtelCol.ObjectMeta, name, params.OtelCol.Spec.Image, ComponentOpenTelemetryCollector, []string{})
labels[headfulLabel] = valueExists

out, err := params.OtelCol.Spec.Config.Yaml()
if err != nil {
Expand Down
1 change: 1 addition & 0 deletions internal/manifests/collector/service_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -286,6 +286,7 @@ func service(name string, ports []v1beta1.PortsSpec) v1.Service {
func serviceWithInternalTrafficPolicy(name string, ports []v1beta1.PortsSpec, internalTrafficPolicy v1.ServiceInternalTrafficPolicyType) v1.Service {
params := deploymentParams()
labels := manifestutils.Labels(params.OtelCol.ObjectMeta, name, params.OtelCol.Spec.Image, ComponentOpenTelemetryCollector, []string{})
labels[headfulLabel] = valueExists

svcPorts := []v1.ServicePort{}
for _, p := range ports {
Expand Down
68 changes: 46 additions & 22 deletions internal/manifests/collector/servicemonitor.go
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@
package collector

import (
"fmt"
"strings"

"github.com/go-logr/logr"
Expand All @@ -29,30 +30,38 @@ import (
"github.com/open-telemetry/opentelemetry-operator/internal/naming"
)

// ServiceMonitor returns the service monitor for the given instance.
// ServiceMonitor returns the service monitor for the collector.
func ServiceMonitor(params manifests.Params) (*monitoringv1.ServiceMonitor, error) {
if !params.OtelCol.Spec.Observability.Metrics.EnableMetrics {
params.Log.V(2).Info("Metrics disabled for this OTEL Collector",
"params.OtelCol.name", params.OtelCol.Name,
"params.OtelCol.namespace", params.OtelCol.Namespace,
)
return nil, nil
} else if params.Config.PrometheusCRAvailability() == prometheus.NotAvailable {
params.Log.V(1).Info("Cannot enable ServiceMonitor when prometheus CRDs are unavailable",
"params.OtelCol.name", params.OtelCol.Name,
"params.OtelCol.namespace", params.OtelCol.Namespace,
)
return nil, nil
name := naming.ServiceMonitor(params.OtelCol.Name)
endpoints := endpointsFromConfig(params.Log, params.OtelCol)
if len(endpoints) > 0 {
return createServiceMonitor(name, params, headfulLabel, endpoints)
}
var sm monitoringv1.ServiceMonitor
return nil, nil
}

// ServiceMonitor returns the service monitor for the monitoring service of the collector.
func ServiceMonitorMonitoring(params manifests.Params) (*monitoringv1.ServiceMonitor, error) {
name := naming.ServiceMonitor(fmt.Sprintf("%s-monitoring", params.OtelCol.Name))
endpoints := []monitoringv1.Endpoint{
{
Port: "monitoring",
},
}
return createServiceMonitor(name, params, monitoringLabel, endpoints)
}

if params.OtelCol.Spec.Mode == v1beta1.ModeSidecar {
func createServiceMonitor(name string, params manifests.Params, label string, endpoints []monitoringv1.Endpoint) (*monitoringv1.ServiceMonitor, error) {
iblancasa marked this conversation as resolved.
Show resolved Hide resolved
if !shouldCreateServiceMonitor(params) {
return nil, nil
}
name := naming.ServiceMonitor(params.OtelCol.Name)

var sm monitoringv1.ServiceMonitor

labels := manifestutils.Labels(params.OtelCol.ObjectMeta, name, params.OtelCol.Spec.Image, ComponentOpenTelemetryCollector, []string{})
selectorLabels := manifestutils.SelectorLabels(params.OtelCol.ObjectMeta, ComponentOpenTelemetryCollector)
selectorLabels[monitoringLabel] = valueExists
// This label is the one which differentiates the services
selectorLabels[label] = valueExists

sm = monitoringv1.ServiceMonitor{
ObjectMeta: metav1.ObjectMeta{
Expand All @@ -61,11 +70,7 @@ func ServiceMonitor(params manifests.Params) (*monitoringv1.ServiceMonitor, erro
Labels: labels,
},
Spec: monitoringv1.ServiceMonitorSpec{
Endpoints: append([]monitoringv1.Endpoint{
{
Port: "monitoring",
},
}, endpointsFromConfig(params.Log, params.OtelCol)...),
Endpoints: endpoints,
NamespaceSelector: monitoringv1.NamespaceSelector{
MatchNames: []string{params.OtelCol.Namespace},
},
Expand All @@ -78,6 +83,25 @@ func ServiceMonitor(params manifests.Params) (*monitoringv1.ServiceMonitor, erro
return &sm, nil
}

func shouldCreateServiceMonitor(params manifests.Params) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

personal opinion but this statement can be written in a more readable way

See the previous comment

l := params.Log.WithValues(
"params.OtelCol.name", params.OtelCol.Name,
"params.OtelCol.namespace", params.OtelCol.Namespace,
)

if !params.OtelCol.Spec.Observability.Metrics.EnableMetrics {
l.V(2).Info("Metrics disabled for this OTEL Collector. ServiceMonitor will not ve created")
return false
} else if params.Config.PrometheusCRAvailability() == prometheus.NotAvailable {
l.V(2).Info("Cannot enable ServiceMonitor when prometheus CRDs are unavailable")
return false
} else if params.OtelCol.Spec.Mode == v1beta1.ModeSidecar {
l.V(2).Info("Using sidecar mode. ServiceMonitor will not be created")
return false
}
return true
}

func endpointsFromConfig(logger logr.Logger, otelcol v1beta1.OpenTelemetryCollector) []monitoringv1.Endpoint {
// TODO: https://github.com/open-telemetry/opentelemetry-operator/issues/2603
cfgStr, err := otelcol.Spec.Config.Yaml()
Expand Down
31 changes: 20 additions & 11 deletions internal/manifests/collector/servicemonitor_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -34,18 +34,26 @@ func TestDesiredServiceMonitors(t *testing.T) {
params.OtelCol.Spec.Observability.Metrics.EnableMetrics = true
actual, err = ServiceMonitor(params)
assert.NoError(t, err)
assert.Nil(t, actual)

// Check the monitoring SM
actual, err = ServiceMonitorMonitoring(params)
assert.NoError(t, err)
assert.NotNil(t, actual)
assert.Equal(t, fmt.Sprintf("%s-collector", params.OtelCol.Name), actual.Name)
assert.Equal(t, fmt.Sprintf("%s-monitoring-collector", params.OtelCol.Name), actual.Name)
assert.Equal(t, params.OtelCol.Namespace, actual.Namespace)
assert.Equal(t, "monitoring", actual.Spec.Endpoints[0].Port)
expectedSelectorLabels := map[string]string{
expectedSelectorLabelsMonitor := map[string]string{
"app.kubernetes.io/component": "opentelemetry-collector",
"app.kubernetes.io/instance": "default.test",
"app.kubernetes.io/managed-by": "opentelemetry-operator",
"app.kubernetes.io/part-of": "opentelemetry",
"operator.opentelemetry.io/collector-monitoring-service": "Exists",
}
assert.Equal(t, expectedSelectorLabels, actual.Spec.Selector.MatchLabels)
assert.Equal(t, expectedSelectorLabelsMonitor, actual.Spec.Selector.MatchLabels)
assert.NotContains(t, "operator.opentelemetry.io/collector-headless-service", actual.Spec.Selector.MatchLabels)
assert.NotContains(t, "operator.opentelemetry.io/collector-service", actual.Spec.Selector.MatchLabels)

}

func TestDesiredServiceMonitorsWithPrometheus(t *testing.T) {
Expand All @@ -57,17 +65,18 @@ func TestDesiredServiceMonitorsWithPrometheus(t *testing.T) {
assert.NotNil(t, actual)
assert.Equal(t, fmt.Sprintf("%s-collector", params.OtelCol.Name), actual.Name)
assert.Equal(t, params.OtelCol.Namespace, actual.Namespace)
assert.Equal(t, "monitoring", actual.Spec.Endpoints[0].Port)
assert.Equal(t, "prometheus-dev", actual.Spec.Endpoints[1].Port)
assert.Equal(t, "prometheus-prod", actual.Spec.Endpoints[2].Port)
assert.Equal(t, "prometheus-dev", actual.Spec.Endpoints[0].Port)
assert.Equal(t, "prometheus-prod", actual.Spec.Endpoints[1].Port)
expectedSelectorLabels := map[string]string{
"app.kubernetes.io/component": "opentelemetry-collector",
"app.kubernetes.io/instance": "default.test",
"app.kubernetes.io/managed-by": "opentelemetry-operator",
"app.kubernetes.io/part-of": "opentelemetry",
"operator.opentelemetry.io/collector-monitoring-service": "Exists",
"app.kubernetes.io/component": "opentelemetry-collector",
"app.kubernetes.io/instance": "default.test",
"app.kubernetes.io/managed-by": "opentelemetry-operator",
"app.kubernetes.io/part-of": "opentelemetry",
"operator.opentelemetry.io/collector-service": "Exists",
}
assert.Equal(t, expectedSelectorLabels, actual.Spec.Selector.MatchLabels)
assert.NotContains(t, "operator.opentelemetry.io/collector-headless-service", actual.Spec.Selector.MatchLabels)
assert.NotContains(t, "operator.opentelemetry.io/collector-monitoring-service", actual.Spec.Selector.MatchLabels)
}

func TestDesiredServiceMonitorsPrometheusNotAvailable(t *testing.T) {
Expand Down
8 changes: 3 additions & 5 deletions tests/e2e-openshift/monitoring/01-assert.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -15,22 +15,21 @@ status:
availableReplicas: 1
readyReplicas: 1
replicas: 1

---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
labels:
app.kubernetes.io/managed-by: opentelemetry-operator
app.kubernetes.io/name: cluster-collector-collector
name: cluster-collector-collector
app.kubernetes.io/name: cluster-collector-monitoring-collector
name: cluster-collector-monitoring-collector
spec:
endpoints:
- port: monitoring
selector:
matchLabels:
app.kubernetes.io/managed-by: opentelemetry-operator

operator.opentelemetry.io/collector-monitoring-service: Exists
---
apiVersion: v1
kind: Service
Expand Down Expand Up @@ -87,7 +86,6 @@ spec:
app.kubernetes.io/managed-by: opentelemetry-operator
app.kubernetes.io/part-of: opentelemetry
type: ClusterIP

---
apiVersion: v1
kind: Service
Expand Down
14 changes: 14 additions & 0 deletions tests/e2e-openshift/monitoring/04-assert.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
labels:
app.kubernetes.io/managed-by: opentelemetry-operator
app.kubernetes.io/name: cluster-collector2-collector
name: cluster-collector2-collector
spec:
endpoints:
- port: prometheus
selector:
matchLabels:
app.kubernetes.io/managed-by: opentelemetry-operator
operator.opentelemetry.io/collector-service: Exists