Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix labels for Service Monitors #2878

Merged
merged 24 commits into from
May 20, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
24 commits
Select commit Hold shift + click to select a range
1625097
Create a separate Service Monitor when the Prometheus exporter is pre…
iblancasa Apr 22, 2024
78f1315
Merge branch 'main' of github.com:open-telemetry/opentelemetry-operat…
iblancasa Apr 25, 2024
9ccd87f
Improve changelog
iblancasa Apr 25, 2024
5f55a01
Fix prometheus-cr E2E test
iblancasa Apr 25, 2024
e3174b6
Remove unused target
iblancasa Apr 25, 2024
edfb463
Merge branch 'main' into bug/2877
iblancasa Apr 26, 2024
82fefa2
Merge branch 'main' of github.com:open-telemetry/opentelemetry-operat…
iblancasa Apr 29, 2024
28baa37
Add docstring
iblancasa Apr 29, 2024
7dcaadb
Merge branch 'main' of github.com:open-telemetry/opentelemetry-operat…
iblancasa May 3, 2024
2e143ee
Fix typo
iblancasa May 3, 2024
cc96402
Merge branch 'main' of github.com:open-telemetry/opentelemetry-operat…
iblancasa May 8, 2024
978dfa0
Merge branch 'main' into bug/2877
iblancasa May 8, 2024
f5b29d8
Change the label name
iblancasa May 9, 2024
fff9870
Merge branch 'bug/2877' of github.com:iblancasa/opentelemetry-operato…
iblancasa May 9, 2024
6432371
Merge branch 'main' into bug/2877
iblancasa May 13, 2024
2469162
Merge branch 'bug/2877' of github.com:iblancasa/opentelemetry-operato…
iblancasa May 13, 2024
0b2a439
Merge branch 'main' of github.com:open-telemetry/opentelemetry-operat…
iblancasa May 13, 2024
3712c43
Change changelog description
iblancasa May 13, 2024
743247b
Merge branch 'main' into bug/2877
iblancasa May 14, 2024
552cb3e
Merge branch 'main' of github.com:open-telemetry/opentelemetry-operat…
iblancasa May 17, 2024
720c6b1
Recover removed labels
iblancasa May 17, 2024
d1889d0
Merge branch 'main' of github.com:open-telemetry/opentelemetry-operat…
iblancasa May 17, 2024
cc84a10
Add missing labels
iblancasa May 17, 2024
e5b251c
Remove wrong labels
iblancasa May 19, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 20 additions & 0 deletions .chloggen/bug_2877.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: bug_fix

# The name of the component, or a single word describing the area of concern, (e.g. collector, target allocator, auto-instrumentation, opamp, github action)
component: collector

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Create a Service Monitor for the monitoring service and another one for the collector service when the Prometheus exporter is used.

# One or more tracking issues related to the change
issues: [2877]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext: |
Create a Service Monitor for the collector Service when Prometheus exporter is used. A different Service Monitor is created for the monitoring service.
This helps excluding the headless service (duplicating the metrics collection) and splits responsibilities between the two Service Monitors.
Now, the operator.opentelemetry.io/collector-service-type label is used to differentiate the services.
operator.opentelemetry.io/collector-monitoring-service and operator.opentelemetry.io/collector-headless-service are deprecated now.
47 changes: 29 additions & 18 deletions controllers/builder_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -259,12 +259,13 @@ service:
Name: "test-collector",
Namespace: "test",
Labels: map[string]string{
"app.kubernetes.io/component": "opentelemetry-collector",
"app.kubernetes.io/instance": "test.test",
"app.kubernetes.io/managed-by": "opentelemetry-operator",
"app.kubernetes.io/name": "test-collector",
"app.kubernetes.io/part-of": "opentelemetry",
"app.kubernetes.io/version": "latest",
"app.kubernetes.io/component": "opentelemetry-collector",
"app.kubernetes.io/instance": "test.test",
"app.kubernetes.io/managed-by": "opentelemetry-operator",
"app.kubernetes.io/name": "test-collector",
"app.kubernetes.io/part-of": "opentelemetry",
"app.kubernetes.io/version": "latest",
"operator.opentelemetry.io/collector-service-type": "base",
},
Annotations: nil,
},
Expand All @@ -291,6 +292,7 @@ service:
"app.kubernetes.io/part-of": "opentelemetry",
"app.kubernetes.io/version": "latest",
"operator.opentelemetry.io/collector-headless-service": "Exists",
"operator.opentelemetry.io/collector-service-type": "headless",
},
Annotations: map[string]string{
"service.beta.openshift.io/serving-cert-secret-name": "test-collector-headless-tls",
Expand Down Expand Up @@ -319,6 +321,7 @@ service:
"app.kubernetes.io/name": "test-collector-monitoring",
"app.kubernetes.io/part-of": "opentelemetry",
"app.kubernetes.io/version": "latest",
"operator.opentelemetry.io/collector-service-type": "monitoring",
"operator.opentelemetry.io/collector-monitoring-service": "Exists",
},
Annotations: nil,
Expand Down Expand Up @@ -506,12 +509,13 @@ service:
Name: "test-collector",
Namespace: "test",
Labels: map[string]string{
"app.kubernetes.io/component": "opentelemetry-collector",
"app.kubernetes.io/instance": "test.test",
"app.kubernetes.io/managed-by": "opentelemetry-operator",
"app.kubernetes.io/name": "test-collector",
"app.kubernetes.io/part-of": "opentelemetry",
"app.kubernetes.io/version": "latest",
"app.kubernetes.io/component": "opentelemetry-collector",
"app.kubernetes.io/instance": "test.test",
"app.kubernetes.io/managed-by": "opentelemetry-operator",
"app.kubernetes.io/name": "test-collector",
"app.kubernetes.io/part-of": "opentelemetry",
"app.kubernetes.io/version": "latest",
"operator.opentelemetry.io/collector-service-type": "base",
},
Annotations: nil,
},
Expand All @@ -537,6 +541,7 @@ service:
"app.kubernetes.io/name": "test-collector",
"app.kubernetes.io/part-of": "opentelemetry",
"app.kubernetes.io/version": "latest",
"operator.opentelemetry.io/collector-service-type": "headless",
"operator.opentelemetry.io/collector-headless-service": "Exists",
},
Annotations: map[string]string{
Expand Down Expand Up @@ -566,6 +571,7 @@ service:
"app.kubernetes.io/name": "test-collector-monitoring",
"app.kubernetes.io/part-of": "opentelemetry",
"app.kubernetes.io/version": "latest",
"operator.opentelemetry.io/collector-service-type": "monitoring",
"operator.opentelemetry.io/collector-monitoring-service": "Exists",
},
Annotations: nil,
Expand Down Expand Up @@ -774,12 +780,13 @@ service:
Name: "test-collector",
Namespace: "test",
Labels: map[string]string{
"app.kubernetes.io/component": "opentelemetry-collector",
"app.kubernetes.io/instance": "test.test",
"app.kubernetes.io/managed-by": "opentelemetry-operator",
"app.kubernetes.io/name": "test-collector",
"app.kubernetes.io/part-of": "opentelemetry",
"app.kubernetes.io/version": "latest",
"app.kubernetes.io/component": "opentelemetry-collector",
"app.kubernetes.io/instance": "test.test",
"app.kubernetes.io/managed-by": "opentelemetry-operator",
"app.kubernetes.io/name": "test-collector",
"app.kubernetes.io/part-of": "opentelemetry",
"app.kubernetes.io/version": "latest",
"operator.opentelemetry.io/collector-service-type": "base",
},
Annotations: nil,
},
Expand All @@ -805,6 +812,7 @@ service:
"app.kubernetes.io/name": "test-collector",
"app.kubernetes.io/part-of": "opentelemetry",
"app.kubernetes.io/version": "latest",
"operator.opentelemetry.io/collector-service-type": "headless",
"operator.opentelemetry.io/collector-headless-service": "Exists",
},
Annotations: map[string]string{
Expand Down Expand Up @@ -834,6 +842,7 @@ service:
"app.kubernetes.io/name": "test-collector-monitoring",
"app.kubernetes.io/part-of": "opentelemetry",
"app.kubernetes.io/version": "latest",
"operator.opentelemetry.io/collector-service-type": "monitoring",
"operator.opentelemetry.io/collector-monitoring-service": "Exists",
},
Annotations: nil,
Expand Down Expand Up @@ -1317,6 +1326,7 @@ service:
"app.kubernetes.io/name": "test-collector-monitoring",
"app.kubernetes.io/part-of": "opentelemetry",
"app.kubernetes.io/version": "latest",
"operator.opentelemetry.io/collector-service-type": "monitoring",
"operator.opentelemetry.io/collector-monitoring-service": "Exists",
},
Annotations: nil,
Expand Down Expand Up @@ -1711,6 +1721,7 @@ prometheus_cr:
"app.kubernetes.io/name": "test-collector-monitoring",
"app.kubernetes.io/part-of": "opentelemetry",
"app.kubernetes.io/version": "latest",
"operator.opentelemetry.io/collector-service-type": "monitoring",
"operator.opentelemetry.io/collector-monitoring-service": "Exists",
},
Annotations: nil,
Expand Down
2 changes: 1 addition & 1 deletion internal/manifests/collector/collector.go
Original file line number Diff line number Diff line change
Expand Up @@ -57,7 +57,7 @@ func Build(params manifests.Params) ([]client.Object, error) {
if params.OtelCol.Spec.Mode == v1beta1.ModeSidecar {
manifestFactories = append(manifestFactories, manifests.Factory(PodMonitor))
} else {
manifestFactories = append(manifestFactories, manifests.Factory(ServiceMonitor))
manifestFactories = append(manifestFactories, manifests.Factory(ServiceMonitor), manifests.Factory(ServiceMonitorMonitoring))
}
}

Expand Down
37 changes: 21 additions & 16 deletions internal/manifests/collector/podmonitor.go
Original file line number Diff line number Diff line change
Expand Up @@ -31,28 +31,14 @@ import (

// PodMonitor returns the pod monitor for the given instance.
func PodMonitor(params manifests.Params) (*monitoringv1.PodMonitor, error) {
if !params.OtelCol.Spec.Observability.Metrics.EnableMetrics {
params.Log.V(2).Info("Metrics disabled for this OTEL Collector",
"params.OtelCol.name", params.OtelCol.Name,
"params.OtelCol.namespace", params.OtelCol.Namespace,
)
return nil, nil
} else if params.Config.PrometheusCRAvailability() == prometheus.NotAvailable {
params.Log.V(1).Info("Cannot enable PodMonitor when prometheus CRDs are unavailable",
"params.OtelCol.name", params.OtelCol.Name,
"params.OtelCol.namespace", params.OtelCol.Namespace,
)
if !shouldCreatePodMonitor(params) {
return nil, nil
}
var pm monitoringv1.PodMonitor

if params.OtelCol.Spec.Mode != v1beta1.ModeSidecar {
return nil, nil
}
name := naming.PodMonitor(params.OtelCol.Name)
labels := manifestutils.Labels(params.OtelCol.ObjectMeta, name, params.OtelCol.Spec.Image, ComponentOpenTelemetryCollector, nil)
selectorLabels := manifestutils.SelectorLabels(params.OtelCol.ObjectMeta, ComponentOpenTelemetryCollector)
pm = monitoringv1.PodMonitor{
pm := monitoringv1.PodMonitor{
ObjectMeta: metav1.ObjectMeta{
Namespace: params.OtelCol.Namespace,
Name: name,
Expand Down Expand Up @@ -107,3 +93,22 @@ func metricsEndpointsFromConfig(logger logr.Logger, otelcol v1beta1.OpenTelemetr
}
return metricsEndpoints
}

func shouldCreatePodMonitor(params manifests.Params) bool {
l := params.Log.WithValues(
"params.OtelCol.name", params.OtelCol.Name,
"params.OtelCol.namespace", params.OtelCol.Namespace,
)

if !params.OtelCol.Spec.Observability.Metrics.EnableMetrics {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

personal opinion but this statement can be written in a more readable way

if params.OtelCol.Spec.Observability.Metrics.EnableMetrics &&  params.Config.PrometheusCRAvailability() == prometheus.NotAvailable && params.OtelCol.Spec.Mode == v1beta1.ModeSidecar{
  return true
}

l..V(2).Info("PodMonitor will not be created", "Metrics.enabledMetrics", "params.OtelCol.Spec.Observability.Metrics.EnableMetrics ", ....)
return false

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and I would move the logger creation close to returning false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why a double condition + no log message is easier to read than the current approach?

The current is easy to follow and understand why the method returns false.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A single log message with all parameters makes it clear to understand the overall state.

The function has 2 states (true and false), aligning the condition to it makes it easier to understand

l.V(2).Info("Metrics disabled for this OTEL Collector. PodMonitor will not ve created")
return false
} else if params.Config.PrometheusCRAvailability() == prometheus.NotAvailable {
l.V(2).Info("Cannot enable PodMonitor when prometheus CRDs are unavailable")
return false
} else if params.OtelCol.Spec.Mode != v1beta1.ModeSidecar {
l.V(2).Info("Not using sidecar mode. PodMonitor will not be created")
return false
}
return true
}
24 changes: 20 additions & 4 deletions internal/manifests/collector/service.go
Original file line number Diff line number Diff line change
Expand Up @@ -29,13 +29,26 @@ import (
"github.com/open-telemetry/opentelemetry-operator/internal/naming"
)

// headless and monitoring labels are to differentiate the headless/monitoring services from the clusterIP service.
// headless and monitoring labels are to differentiate the base/headless/monitoring services from the clusterIP service.
const (
headlessLabel = "operator.opentelemetry.io/collector-headless-service"
iblancasa marked this conversation as resolved.
Show resolved Hide resolved
monitoringLabel = "operator.opentelemetry.io/collector-monitoring-service"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iblancasa i think we should keep these around for now, mark them as deprecated and remove them in two+ versions rather than deleting them outright which would constitute a breaking change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

valueExists = "Exists"
headlessLabel = "operator.opentelemetry.io/collector-headless-service"
monitoringLabel = "operator.opentelemetry.io/collector-monitoring-service"
serviceTypeLabel = "operator.opentelemetry.io/collector-service-type"
valueExists = "Exists"
)

type ServiceType int

const (
BaseServiceType ServiceType = iota
HeadlessServiceType
MonitoringServiceType
)

func (s ServiceType) String() string {
return [...]string{"base", "headless", "monitoring"}[s]
}

func HeadlessService(params manifests.Params) (*corev1.Service, error) {
h, err := Service(params)
if h == nil || err != nil {
Expand All @@ -44,6 +57,7 @@ func HeadlessService(params manifests.Params) (*corev1.Service, error) {

h.Name = naming.HeadlessService(params.OtelCol.Name)
h.Labels[headlessLabel] = valueExists
h.Labels[serviceTypeLabel] = HeadlessServiceType.String()

// copy to avoid modifying params.OtelCol.Annotations
annotations := map[string]string{
Expand All @@ -63,6 +77,7 @@ func MonitoringService(params manifests.Params) (*corev1.Service, error) {
name := naming.MonitoringService(params.OtelCol.Name)
labels := manifestutils.Labels(params.OtelCol.ObjectMeta, name, params.OtelCol.Spec.Image, ComponentOpenTelemetryCollector, []string{})
labels[monitoringLabel] = valueExists
labels[serviceTypeLabel] = MonitoringServiceType.String()

metricsPort, err := params.OtelCol.Spec.Config.Service.MetricsPort()
if err != nil {
Expand Down Expand Up @@ -90,6 +105,7 @@ func MonitoringService(params manifests.Params) (*corev1.Service, error) {
func Service(params manifests.Params) (*corev1.Service, error) {
name := naming.Service(params.OtelCol.Name)
labels := manifestutils.Labels(params.OtelCol.ObjectMeta, name, params.OtelCol.Spec.Image, ComponentOpenTelemetryCollector, []string{})
labels[serviceTypeLabel] = BaseServiceType.String()

out, err := params.OtelCol.Spec.Config.Yaml()
if err != nil {
Expand Down
1 change: 1 addition & 0 deletions internal/manifests/collector/service_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -286,6 +286,7 @@ func service(name string, ports []v1beta1.PortsSpec) v1.Service {
func serviceWithInternalTrafficPolicy(name string, ports []v1beta1.PortsSpec, internalTrafficPolicy v1.ServiceInternalTrafficPolicyType) v1.Service {
params := deploymentParams()
labels := manifestutils.Labels(params.OtelCol.ObjectMeta, name, params.OtelCol.Spec.Image, ComponentOpenTelemetryCollector, []string{})
labels[serviceTypeLabel] = BaseServiceType.String()

svcPorts := []v1.ServicePort{}
for _, p := range ports {
Expand Down
70 changes: 48 additions & 22 deletions internal/manifests/collector/servicemonitor.go
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@
package collector

import (
"fmt"
"strings"

"github.com/go-logr/logr"
Expand All @@ -29,30 +30,40 @@ import (
"github.com/open-telemetry/opentelemetry-operator/internal/naming"
)

// ServiceMonitor returns the service monitor for the given instance.
// ServiceMonitor returns the service monitor for the collector.
func ServiceMonitor(params manifests.Params) (*monitoringv1.ServiceMonitor, error) {
if !params.OtelCol.Spec.Observability.Metrics.EnableMetrics {
params.Log.V(2).Info("Metrics disabled for this OTEL Collector",
"params.OtelCol.name", params.OtelCol.Name,
"params.OtelCol.namespace", params.OtelCol.Namespace,
)
return nil, nil
} else if params.Config.PrometheusCRAvailability() == prometheus.NotAvailable {
params.Log.V(1).Info("Cannot enable ServiceMonitor when prometheus CRDs are unavailable",
"params.OtelCol.name", params.OtelCol.Name,
"params.OtelCol.namespace", params.OtelCol.Namespace,
)
return nil, nil
name := naming.ServiceMonitor(params.OtelCol.Name)
endpoints := endpointsFromConfig(params.Log, params.OtelCol)
if len(endpoints) > 0 {
return createServiceMonitor(name, params, BaseServiceType, endpoints)
}
var sm monitoringv1.ServiceMonitor
return nil, nil
}

// ServiceMonitor returns the service monitor for the monitoring service of the collector.
func ServiceMonitorMonitoring(params manifests.Params) (*monitoringv1.ServiceMonitor, error) {
name := naming.ServiceMonitor(fmt.Sprintf("%s-monitoring", params.OtelCol.Name))
endpoints := []monitoringv1.Endpoint{
{
Port: "monitoring",
},
}
return createServiceMonitor(name, params, MonitoringServiceType, endpoints)
}

if params.OtelCol.Spec.Mode == v1beta1.ModeSidecar {
// createServiceMonitor creates a Service Monitor using the provided name, the params from the instance, a label to identify the service
// to target (like the monitoring or the collector services) and the endpoints to scrape.
func createServiceMonitor(name string, params manifests.Params, serviceType ServiceType, endpoints []monitoringv1.Endpoint) (*monitoringv1.ServiceMonitor, error) {
if !shouldCreateServiceMonitor(params) {
return nil, nil
}
name := naming.ServiceMonitor(params.OtelCol.Name)

var sm monitoringv1.ServiceMonitor

labels := manifestutils.Labels(params.OtelCol.ObjectMeta, name, params.OtelCol.Spec.Image, ComponentOpenTelemetryCollector, []string{})
selectorLabels := manifestutils.SelectorLabels(params.OtelCol.ObjectMeta, ComponentOpenTelemetryCollector)
selectorLabels[monitoringLabel] = valueExists
// This label is the one which differentiates the services
selectorLabels[serviceTypeLabel] = serviceType.String()

sm = monitoringv1.ServiceMonitor{
ObjectMeta: metav1.ObjectMeta{
Expand All @@ -61,11 +72,7 @@ func ServiceMonitor(params manifests.Params) (*monitoringv1.ServiceMonitor, erro
Labels: labels,
},
Spec: monitoringv1.ServiceMonitorSpec{
Endpoints: append([]monitoringv1.Endpoint{
{
Port: "monitoring",
},
}, endpointsFromConfig(params.Log, params.OtelCol)...),
Endpoints: endpoints,
NamespaceSelector: monitoringv1.NamespaceSelector{
MatchNames: []string{params.OtelCol.Namespace},
},
Expand All @@ -78,6 +85,25 @@ func ServiceMonitor(params manifests.Params) (*monitoringv1.ServiceMonitor, erro
return &sm, nil
}

func shouldCreateServiceMonitor(params manifests.Params) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

personal opinion but this statement can be written in a more readable way

See the previous comment

l := params.Log.WithValues(
"params.OtelCol.name", params.OtelCol.Name,
"params.OtelCol.namespace", params.OtelCol.Namespace,
)

if !params.OtelCol.Spec.Observability.Metrics.EnableMetrics {
l.V(2).Info("Metrics disabled for this OTEL Collector. ServiceMonitor will not ve created")
return false
} else if params.Config.PrometheusCRAvailability() == prometheus.NotAvailable {
l.V(2).Info("Cannot enable ServiceMonitor when prometheus CRDs are unavailable")
return false
} else if params.OtelCol.Spec.Mode == v1beta1.ModeSidecar {
l.V(2).Info("Using sidecar mode. ServiceMonitor will not be created")
return false
}
return true
}

func endpointsFromConfig(logger logr.Logger, otelcol v1beta1.OpenTelemetryCollector) []monitoringv1.Endpoint {
// TODO: https://github.com/open-telemetry/opentelemetry-operator/issues/2603
cfgStr, err := otelcol.Spec.Config.Yaml()
Expand Down
Loading
Loading