From 94325fec09873321c83360c1f0e2339864f75561 Mon Sep 17 00:00:00 2001 From: Aldo Lacuku Date: Tue, 2 Jul 2024 14:55:01 +0200 Subject: [PATCH] feat(falco): add support for Falco metrics Signed-off-by: Aldo Lacuku --- charts/falco/CHANGELOG.md | 4 + charts/falco/Chart.yaml | 2 +- charts/falco/README.md | 31 ++- charts/falco/templates/_helpers.tpl | 22 +- charts/falco/templates/configmap.yaml | 1 + charts/falco/templates/service.yaml | 19 ++ charts/falco/templates/serviceMonitor.yaml | 48 +++++ charts/falco/tests/unit/metricsConfig_test.go | 204 ++++++++++++++++++ .../tests/unit/serviceMonitorTemplate_test.go | 93 ++++++++ charts/falco/values.yaml | 135 ++++++++++++ 10 files changed, 556 insertions(+), 3 deletions(-) create mode 100644 charts/falco/templates/service.yaml create mode 100644 charts/falco/templates/serviceMonitor.yaml create mode 100644 charts/falco/tests/unit/metricsConfig_test.go create mode 100644 charts/falco/tests/unit/serviceMonitorTemplate_test.go diff --git a/charts/falco/CHANGELOG.md b/charts/falco/CHANGELOG.md index 79aea26e3..bd722eb25 100644 --- a/charts/falco/CHANGELOG.md +++ b/charts/falco/CHANGELOG.md @@ -3,6 +3,10 @@ This file documents all notable changes to Falco Helm Chart. The release numbering uses [semantic versioning](http://semver.org). 
+## v4.6.0 + +* feat(falco): add support for Falco metrics + ## v4.5.2 * bump falcosidekick dependency version to v0.8.0, for falcosidekick 2.29.0 diff --git a/charts/falco/Chart.yaml b/charts/falco/Chart.yaml index 1891b8f13..9e3d71a37 100644 --- a/charts/falco/Chart.yaml +++ b/charts/falco/Chart.yaml @@ -1,6 +1,6 @@ apiVersion: v2 name: falco -version: 4.5.3 +version: 4.6.0 appVersion: "0.38.1" description: Falco keywords: diff --git a/charts/falco/README.md b/charts/falco/README.md index 3c582135d..899f4ab07 100644 --- a/charts/falco/README.md +++ b/charts/falco/README.md @@ -581,7 +581,7 @@ If you use a Proxy in your cluster, the requests between `Falco` and `Falcosidek ## Configuration -The following table lists the main configurable parameters of the falco chart v4.5.3 and their default values. See [values.yaml](./values.yaml) for full list. +The following table lists the main configurable parameters of the falco chart v4.6.0 and their default values. See [values.yaml](./values.yaml) for full list. ## Values @@ -740,6 +740,23 @@ The following table lists the main configurable parameters of the falco chart v4 | image.repository | string | `"falcosecurity/falco-no-driver"` | The image repository to pull from | | image.tag | string | `""` | The image tag to pull. Overrides the image tag whose default is the chart appVersion. | | imagePullSecrets | list | `[]` | Secrets containing credentials when pulling from private/secure registries. | +| metrics | object | `{"convertMemoryToMB":true,"enabled":false,"includeEmptyValues":false,"interval":"1h","kernelEventCountersEnabled":true,"libbpfStatsEnabled":true,"outputRule":false,"resourceUtilizationEnabled":true,"rulesCountersEnabled":true,"service":{"create":true,"ports":{"metrics":{"port":8765,"protocol":"TCP","targetPort":8765}},"type":"ClusterIP"},"stateCountersEnabled":true}` | metrics configures Falco to enable and expose the metrics. 
| +| metrics.convertMemoryToMB | bool | `true` | convertMemoryToMB specifies whether the memory should be converted to MB. | +| metrics.enabled | bool | `false` | enabled specifies whether the metrics should be enabled. | +| metrics.includeEmptyValues | bool | `false` | includeEmptyValues specifies whether the empty values should be included in the metrics. | +| metrics.interval | string | `"1h"` | interval is the stats interval in Falco. It follows the time duration definitions used by Prometheus. https://prometheus.io/docs/prometheus/latest/querying/basics/#time-durations Time durations are specified as a number, followed immediately by one of the following units: ms - millisecond s - second m - minute h - hour d - day - assuming a day has always 24h w - week - assuming a week has always 7d y - year - assuming a year has always 365d Example of a valid time duration: 1h30m20s10ms A minimum interval of 100ms is enforced for metric collection. However, for production environments, we recommend selecting one of the following intervals for optimal monitoring: 15m 30m 1h 4h 6h | +| metrics.libbpfStatsEnabled | bool | `true` | libbpfStatsEnabled exposes statistics similar to `bpftool prog show`, providing information such as the number of invocations of each BPF program attached by Falco and the time spent in each program measured in nanoseconds. To enable this feature, the kernel must be >= 5.1, and the kernel configuration `/proc/sys/kernel/bpf_stats_enabled` must be set. This option, or an equivalent statistics feature, is not available for non `*bpf*` drivers. Additionally, please be aware that the current implementation of `libbpf` does not support granularity of statistics at the bpf tail call level. | +| metrics.outputRule | bool | `false` | outputRule emits metrics as the rule "Falco internal: metrics snapshot", which we recommend for seamless metrics and performance monitoring. This option is particularly useful when Falco logs are preserved in a data lake. 
Please note that to use this option, the Falco rules config `priority` must be set to `info` at a minimum. | +| metrics.resourceUtilizationEnabled | bool | `true` | resourceUtilizationEnabled emits CPU and memory usage metrics. CPU usage is reported as a percentage of one CPU and can be normalized to the total number of CPUs to determine overall usage. Memory metrics are provided in raw units (`kb` for `RSS`, `PSS` and `VSZ` or `bytes` for `container_memory_used`) and can be uniformly converted to megabytes (MB) using the `convert_memory_to_mb` functionality. In environments such as Kubernetes, when Falco is deployed as a DaemonSet, it is crucial to track Falco's container memory usage. To customize the path of the memory metric file, you can create an environment variable named `FALCO_CGROUP_MEM_PATH` and set it to the desired file path. By default, Falco uses the file `/sys/fs/cgroup/memory/memory.usage_in_bytes` to monitor container memory usage, which aligns with Kubernetes' `container_memory_working_set_bytes` metric. Finally, we emit the overall host CPU and memory usages, along with the total number of processes and open file descriptors (fds) on the host, obtained from the proc file system unrelated to Falco's monitoring. These metrics help assess Falco's usage in relation to the server's workload intensity. | +| metrics.rulesCountersEnabled | bool | `true` | rulesCountersEnabled specifies whether the counts for each rule should be emitted. | +| metrics.service | object | `{"create":true,"ports":{"metrics":{"port":8765,"protocol":"TCP","targetPort":8765}},"type":"ClusterIP"}` | service exposes the metrics service to be accessed from within the cluster. ref: https://kubernetes.io/docs/concepts/services-networking/service/ | +| metrics.service.create | bool | `true` | create specifies whether a service should be created. 
| +| metrics.service.ports | object | `{"metrics":{"port":8765,"protocol":"TCP","targetPort":8765}}` | ports denotes all the ports on which the Service will listen. | +| metrics.service.ports.metrics | object | `{"port":8765,"protocol":"TCP","targetPort":8765}` | metrics denotes a listening service named "metrics". | +| metrics.service.ports.metrics.port | int | `8765` | port is the port on which the Service will listen. | +| metrics.service.ports.metrics.protocol | string | `"TCP"` | protocol specifies the network protocol that the Service should use for the associated port. | +| metrics.service.ports.metrics.targetPort | int | `8765` | targetPort is the port on which the Pod is listening. | +| metrics.service.type | string | `"ClusterIP"` | type denotes the service type. Setting it to "ClusterIP" ensures that the metrics are accessible from within the cluster. | | mounts.enforceProcMount | bool | `false` | By default, `/proc` from the host is only mounted into the Falco pod when `driver.enabled` is set to `true`. This flag allows it to override this behaviour for edge cases where `/proc` is needed but syscall data source is not enabled at the same time (e.g. for specific plugins). | | mounts.volumeMounts | list | `[]` | A list of volumes you want to add to the Falco pods. | | mounts.volumes | list | `[]` | A list of volumes you want to add to the Falco pods. | @@ -757,6 +774,18 @@ The following table lists the main configurable parameters of the falco chart v4 | serviceAccount.annotations | object | `{}` | Annotations to add to the service account. | | serviceAccount.create | bool | `true` | Specifies whether a service account should be created. | | serviceAccount.name | string | `""` | The name of the service account to use. 
If not set and create is true, a name is generated using the fullname template | +| serviceMonitor | object | `{"create":false,"endpointPort":"metrics","interval":"15s","labels":{},"path":"/metrics","relabelings":[],"scheme":"http","scrapeTimeout":"10s","selector":{},"targetLabels":[],"tlsConfig":{}}` | serviceMonitor holds the configuration for the ServiceMonitor CRD. A ServiceMonitor is a custom resource definition (CRD) used to configure how Prometheus should discover and scrape metrics from the Falco service. | +| serviceMonitor.create | bool | `false` | create specifies whether a ServiceMonitor CRD should be created for a prometheus operator. https://github.com/coreos/prometheus-operator Enable it only if the ServiceMonitor CRD is installed in your cluster. | +| serviceMonitor.endpointPort | string | `"metrics"` | endpointPort is the port in the Falco service that exposes the metrics service. Change the value if you deploy a custom service for Falco's metrics. | +| serviceMonitor.interval | string | `"15s"` | interval specifies the time interval at which Prometheus should scrape metrics from the service. | +| serviceMonitor.labels | object | `{}` | labels set of labels to be applied to the ServiceMonitor resource. If your Prometheus deployment is configured to use serviceMonitorSelector, then add the right label here in order for the ServiceMonitor to be selected for target discovery. | +| serviceMonitor.path | string | `"/metrics"` | path at which the metrics are exposed by Falco. | +| serviceMonitor.relabelings | list | `[]` | relabelings configures the relabeling rules to apply the target’s metadata labels. | +| serviceMonitor.scheme | string | `"http"` | scheme specifies network protocol used by the metrics endpoint. In this case HTTP. | +| serviceMonitor.scrapeTimeout | string | `"10s"` | scrapeTimeout determines the maximum time Prometheus should wait for a target to respond to a scrape request. 
If the target does not respond within the specified timeout, Prometheus considers the scrape as failed for that target. | +| serviceMonitor.selector | object | `{}` | selector set of labels that should match the labels on the Service targeted by the current serviceMonitor. | +| serviceMonitor.targetLabels | list | `[]` | targetLabels defines the labels which are transferred from the associated Kubernetes service object onto the ingested metrics. | +| serviceMonitor.tlsConfig | object | `{}` | tlsConfig specifies TLS (Transport Layer Security) configuration for secure communication when scraping metrics from a service. It allows you to define the details of the TLS connection, such as CA certificate, client certificate, and client key. Currently, the k8s-metacollector does not support TLS configuration for the metrics endpoint. | | services | string | `nil` | Network services configuration (scenario requirement) Add here your services to be deployed together with Falco. | | tolerations | list | `[{"effect":"NoSchedule","key":"node-role.kubernetes.io/master"},{"effect":"NoSchedule","key":"node-role.kubernetes.io/control-plane"}]` | Tolerations to allow Falco to run on Kubernetes masters. | | tty | bool | `false` | Attach the Falco process to a tty inside the container. Needed to flush Falco logs as soon as they are emitted. Set it to "true" when you need the Falco logs to be immediately displayed. | diff --git a/charts/falco/templates/_helpers.tpl b/charts/falco/templates/_helpers.tpl index dbc43195c..f611a5397 100644 --- a/charts/falco/templates/_helpers.tpl +++ b/charts/falco/templates/_helpers.tpl @@ -411,4 +411,24 @@ false {{- else -}} true {{- end -}} -{{- end -}} \ No newline at end of file +{{- end -}} + +{{/* +Based on the user input, it populates the metrics configuration in the falco config map. 
+*/}} +{{- define "falco.metricsConfiguration" -}} +{{- if .Values.metrics.enabled -}} +{{- $_ := set .Values.falco.webserver "prometheus_metrics_enabled" true -}} +{{- $_ = set .Values.falco.webserver "enabled" true -}} +{{- $_ = set .Values.falco.metrics "enabled" .Values.metrics.enabled -}} +{{- $_ = set .Values.falco.metrics "interval" .Values.metrics.interval -}} +{{- $_ = set .Values.falco.metrics "output_rule" .Values.metrics.outputRule -}} +{{- $_ = set .Values.falco.metrics "rules_counters_enabled" .Values.metrics.rulesCountersEnabled -}} +{{- $_ = set .Values.falco.metrics "resource_utilization_enabled" .Values.metrics.resourceUtilizationEnabled -}} +{{- $_ = set .Values.falco.metrics "state_counters_enabled" .Values.metrics.stateCountersEnabled -}} +{{- $_ = set .Values.falco.metrics "kernel_event_counters_enabled" .Values.metrics.kernelEventCountersEnabled -}} +{{- $_ = set .Values.falco.metrics "libbpf_stats_enabled" .Values.metrics.libbpfStatsEnabled -}} +{{- $_ = set .Values.falco.metrics "convert_memory_to_mb" .Values.metrics.convertMemoryToMB -}} +{{- $_ = set .Values.falco.metrics "include_empty_values" .Values.metrics.includeEmptyValues -}} +{{- end -}} +{{- end -}} diff --git a/charts/falco/templates/configmap.yaml b/charts/falco/templates/configmap.yaml index 118c7f86b..f48fc88e7 100644 --- a/charts/falco/templates/configmap.yaml +++ b/charts/falco/templates/configmap.yaml @@ -10,4 +10,5 @@ data: {{- include "falco.falcosidekickConfig" . }} {{- include "k8smeta.configuration" . -}} {{- include "falco.engineConfiguration" . -}} + {{- include "falco.metricsConfiguration" . 
-}} {{- toYaml .Values.falco | nindent 4 }} diff --git a/charts/falco/templates/service.yaml b/charts/falco/templates/service.yaml new file mode 100644 index 000000000..d2093ec22 --- /dev/null +++ b/charts/falco/templates/service.yaml @@ -0,0 +1,19 @@ +{{- if and .Values.metrics.enabled .Values.metrics.service.create }} +apiVersion: v1 +kind: Service +metadata: + name: {{ include "falco.fullname" . }}-metrics + namespace: {{ include "falco.namespace" . }} + labels: + {{- include "falco.labels" . | nindent 4 }} + type: "falco-metrics" +spec: + type: {{ .Values.metrics.service.type }} + ports: + - port: {{ .Values.metrics.service.ports.metrics.port }} + targetPort: {{ .Values.metrics.service.ports.metrics.targetPort }} + protocol: {{ .Values.metrics.service.ports.metrics.protocol }} + name: "metrics" + selector: + {{- include "falco.selectorLabels" . | nindent 4 }} +{{- end }} diff --git a/charts/falco/templates/serviceMonitor.yaml b/charts/falco/templates/serviceMonitor.yaml new file mode 100644 index 000000000..0dea6dd6e --- /dev/null +++ b/charts/falco/templates/serviceMonitor.yaml @@ -0,0 +1,48 @@ +{{- if .Values.serviceMonitor.create }} +apiVersion: monitoring.coreos.com/v1 +kind: ServiceMonitor +metadata: + name: {{ include "falco.fullname" . }} + {{- if .Values.serviceMonitor.namespace }} + namespace: {{ tpl .Values.serviceMonitor.namespace . }} + {{- else }} + namespace: {{ include "falco.namespace" . }} + {{- end }} + labels: + {{- include "falco.labels" . | nindent 4 }} + {{- with .Values.serviceMonitor.labels }} + {{- toYaml . | nindent 4 }} + {{- end }} +spec: + endpoints: + - port: "{{ .Values.serviceMonitor.endpointPort }}" + {{- with .Values.serviceMonitor.interval }} + interval: {{ . }} + {{- end }} + {{- with .Values.serviceMonitor.scrapeTimeout }} + scrapeTimeout: {{ . 
}} + {{- end }} + honorLabels: true + path: {{ .Values.serviceMonitor.path }} + scheme: {{ .Values.serviceMonitor.scheme }} + {{- with .Values.serviceMonitor.tlsConfig }} + tlsConfig: + {{- toYaml . | nindent 8 }} + {{- end }} + {{- with .Values.serviceMonitor.relabelings }} + relabelings: + {{- toYaml . | nindent 8 }} + {{- end }} + jobLabel: "{{ .Release.Name }}" + selector: + matchLabels: + {{- include "falco.selectorLabels" . | nindent 6 }} + type: "falco-metrics" + namespaceSelector: + matchNames: + - {{ include "falco.namespace" . }} + {{- with .Values.serviceMonitor.targetLabels }} + targetLabels: + {{- toYaml . | nindent 4 }} + {{- end }} +{{- end }} diff --git a/charts/falco/tests/unit/metricsConfig_test.go b/charts/falco/tests/unit/metricsConfig_test.go new file mode 100644 index 000000000..2d0cc33da --- /dev/null +++ b/charts/falco/tests/unit/metricsConfig_test.go @@ -0,0 +1,204 @@ +// SPDX-License-Identifier: Apache-2.0 +// Copyright 2024 The Falco Authors +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +package unit + +import ( + "path/filepath" + "testing" + + "github.com/gruntwork-io/terratest/modules/helm" + "github.com/stretchr/testify/require" + "gopkg.in/yaml.v3" + corev1 "k8s.io/api/core/v1" +) + +type metricsConfig struct { + Enabled bool `yaml:"enabled"` + ConvertMemoryToMB bool `yaml:"convert_memory_to_mb"` + IncludeEmptyValues bool `yaml:"include_empty_values"` + KernelEventCountersEnabled bool `yaml:"kernel_event_counters_enabled"` + ResourceUtilizationEnabled bool `yaml:"resource_utilization_enabled"` + RulesCountersEnabled bool `yaml:"rules_counters_enabled"` + LibbpfStatsEnabled bool `yaml:"libbpf_stats_enabled"` + OutputRule bool `yaml:"output_rule"` + StateCountersEnabled bool `yaml:"state_counters_enabled"` + Interval string `yaml:"interval"` +} + +type webServerConfig struct { + Enabled bool `yaml:"enabled"` + K8sHealthzEndpoint string `yaml:"k8s_healthz_endpoint"` + ListenPort string `yaml:"listen_port"` + PrometheusMetricsEnabled bool `yaml:"prometheus_metrics_enabled"` + SSLCertificate string `yaml:"ssl_certificate"` + SSLEnabled bool `yaml:"ssl_enabled"` + Threadiness int `yaml:"threadiness"` +} + +func TestMetricsConfigInFalcoConfig(t *testing.T) { + t.Parallel() + + helmChartPath, err := filepath.Abs(chartPath) + require.NoError(t, err) + + testCases := []struct { + name string + values map[string]string + expected func(t *testing.T, metricsConfig, webServerConfig any) + }{ + { + "defaultValues", + nil, + func(t *testing.T, metricsConfig, webServerConfig any) { + require.Len(t, metricsConfig, 10, "should have ten items") + + metrics, err := getMetricsConfig(metricsConfig) + require.NoError(t, err) + require.NotNil(t, metrics) + require.True(t, metrics.ConvertMemoryToMB) + require.False(t, metrics.Enabled) + require.False(t, metrics.IncludeEmptyValues) + require.True(t, metrics.KernelEventCountersEnabled) + require.True(t, metrics.ResourceUtilizationEnabled) + require.True(t, metrics.RulesCountersEnabled) + require.Equal(t, "1h", 
metrics.Interval) + require.True(t, metrics.LibbpfStatsEnabled) + require.True(t, metrics.OutputRule) + require.True(t, metrics.StateCountersEnabled) + + webServer, err := getWebServerConfig(webServerConfig) + require.NoError(t, err) + require.NotNil(t, webServer) + require.True(t, webServer.Enabled) + require.False(t, webServer.PrometheusMetricsEnabled) + }, + }, + { + "metricsEnabled", + map[string]string{ + "metrics.enabled": "true", + }, + func(t *testing.T, metricsConfig, webServerConfig any) { + require.Len(t, metricsConfig, 10, "should have ten items") + + metrics, err := getMetricsConfig(metricsConfig) + require.NoError(t, err) + require.NotNil(t, metrics) + require.True(t, metrics.ConvertMemoryToMB) + require.True(t, metrics.Enabled) + require.False(t, metrics.IncludeEmptyValues) + require.True(t, metrics.KernelEventCountersEnabled) + require.True(t, metrics.ResourceUtilizationEnabled) + require.True(t, metrics.RulesCountersEnabled) + require.Equal(t, "1h", metrics.Interval) + require.True(t, metrics.LibbpfStatsEnabled) + require.False(t, metrics.OutputRule) + require.True(t, metrics.StateCountersEnabled) + + webServer, err := getWebServerConfig(webServerConfig) + require.NoError(t, err) + require.NotNil(t, webServer) + require.True(t, webServer.Enabled) + require.True(t, webServer.PrometheusMetricsEnabled) + }, + }, + { + "Flip/Change Values", + map[string]string{ + "metrics.enabled": "true", + "metrics.convertMemoryToMB": "false", + "metrics.includeEmptyValues": "true", + "metrics.kernelEventCountersEnabled": "false", + "metrics.resourceUtilizationEnabled": "false", + "metrics.rulesCountersEnabled": "false", + "metrics.libbpfStatsEnabled": "false", + "metrics.outputRule": "false", + "metrics.stateCountersEnabled": "false", + "metrics.interval": "1s", + }, + func(t *testing.T, metricsConfig, webServerConfig any) { + require.Len(t, metricsConfig, 10, "should have ten items") + + metrics, err := getMetricsConfig(metricsConfig) + require.NoError(t, err) + 
require.NotNil(t, metrics) + require.False(t, metrics.ConvertMemoryToMB) + require.True(t, metrics.Enabled) + require.True(t, metrics.IncludeEmptyValues) + require.False(t, metrics.KernelEventCountersEnabled) + require.False(t, metrics.ResourceUtilizationEnabled) + require.False(t, metrics.RulesCountersEnabled) + require.Equal(t, "1s", metrics.Interval) + require.False(t, metrics.LibbpfStatsEnabled) + require.False(t, metrics.OutputRule) + require.False(t, metrics.StateCountersEnabled) + + webServer, err := getWebServerConfig(webServerConfig) + require.NoError(t, err) + require.NotNil(t, webServer) + require.True(t, webServer.Enabled) + require.True(t, webServer.PrometheusMetricsEnabled) + }, + }, + } + + for _, testCase := range testCases { + testCase := testCase + + t.Run(testCase.name, func(t *testing.T) { + t.Parallel() + + options := &helm.Options{SetValues: testCase.values} + output := helm.RenderTemplate(t, options, helmChartPath, releaseName, []string{"templates/configmap.yaml"}) + + var cm corev1.ConfigMap + helm.UnmarshalK8SYaml(t, output, &cm) + var config map[string]interface{} + + helm.UnmarshalK8SYaml(t, cm.Data["falco.yaml"], &config) + metrics := config["metrics"] + webServer := config["webserver"] + testCase.expected(t, metrics, webServer) + }) + } +} + +func getMetricsConfig(config any) (*metricsConfig, error) { + var metrics metricsConfig + + metricsByte, err := yaml.Marshal(config) + if err != nil { + return nil, err + } + + if err := yaml.Unmarshal(metricsByte, &metrics); err != nil { + return nil, err + } + + return &metrics, nil +} + +func getWebServerConfig(config any) (*webServerConfig, error) { + var webServer webServerConfig + webServerByte, err := yaml.Marshal(config) + if err != nil { + return nil, err + } + if err := yaml.Unmarshal(webServerByte, &webServer); err != nil { + return nil, err + } + return &webServer, nil +} diff --git a/charts/falco/tests/unit/serviceMonitorTemplate_test.go 
b/charts/falco/tests/unit/serviceMonitorTemplate_test.go new file mode 100644 index 000000000..b2fcb3745 --- /dev/null +++ b/charts/falco/tests/unit/serviceMonitorTemplate_test.go @@ -0,0 +1,93 @@ +// SPDX-License-Identifier: Apache-2.0 +// Copyright 2024 The Falco Authors +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package unit + +import ( + "encoding/json" + "path/filepath" + "reflect" + "testing" + + "github.com/gruntwork-io/terratest/modules/helm" + monitoringv1 "github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring/v1" + "github.com/stretchr/testify/require" + "github.com/stretchr/testify/suite" +) + +type serviceMonitorTemplateTest struct { + suite.Suite + chartPath string + releaseName string + namespace string + templates []string +} + +func TestServiceMonitorTemplate(t *testing.T) { + t.Parallel() + + chartFullPath, err := filepath.Abs(chartPath) + require.NoError(t, err) + + suite.Run(t, &serviceMonitorTemplateTest{ + Suite: suite.Suite{}, + chartPath: chartFullPath, + releaseName: "falco-test", + namespace: "falco-namespace-test", + templates: []string{"templates/serviceMonitor.yaml"}, + }) +} + +func (s *serviceMonitorTemplateTest) TestCreationDefaultValues() { + // Render the servicemonitor and check that it has not been rendered. 
+ _, err := helm.RenderTemplateE(s.T(), &helm.Options{}, s.chartPath, s.releaseName, s.templates) + s.Error(err, "should error") + s.Equal("error while running command: exit status 1; Error: could not find template templates/serviceMonitor.yaml in chart", err.Error()) +} + +func (s *serviceMonitorTemplateTest) TestEndpoint() { + defaultEndpointsJSON := `[ + { + "port": "metrics", + "interval": "15s", + "scrapeTimeout": "10s", + "honorLabels": true, + "path": "/metrics", + "scheme": "http" + } +]` + var defaultEndpoints []monitoringv1.Endpoint + err := json.Unmarshal([]byte(defaultEndpointsJSON), &defaultEndpoints) + s.NoError(err) + + options := &helm.Options{SetValues: map[string]string{"serviceMonitor.create": "true"}} + output := helm.RenderTemplate(s.T(), options, s.chartPath, s.releaseName, s.templates) + + var svcMonitor monitoringv1.ServiceMonitor + helm.UnmarshalK8SYaml(s.T(), output, &svcMonitor) + + s.Len(svcMonitor.Spec.Endpoints, 1, "should have only one endpoint") + s.True(reflect.DeepEqual(svcMonitor.Spec.Endpoints[0], defaultEndpoints[0])) +} + +func (s *serviceMonitorTemplateTest) TestNamespaceSelector() { + options := &helm.Options{SetValues: map[string]string{"serviceMonitor.create": "true"}} + output := helm.RenderTemplate(s.T(), options, s.chartPath, s.releaseName, s.templates) + + var svcMonitor monitoringv1.ServiceMonitor + helm.UnmarshalK8SYaml(s.T(), output, &svcMonitor) + s.Len(svcMonitor.Spec.NamespaceSelector.MatchNames, 1) + s.Equal("default", svcMonitor.Spec.NamespaceSelector.MatchNames[0]) +} diff --git a/charts/falco/values.yaml b/charts/falco/values.yaml index 66a732a8a..15e7dd1d9 100644 --- a/charts/falco/values.yaml +++ b/charts/falco/values.yaml @@ -166,6 +166,99 @@ services: # nodePort: 30007 # protocol: TCP +# -- metrics configures Falco to enable and expose the metrics. +metrics: + # -- enabled specifies whether the metrics should be enabled. 
+ enabled: false + # -- interval is the stats interval in Falco. It follows the time duration definitions + # used by Prometheus. + # https://prometheus.io/docs/prometheus/latest/querying/basics/#time-durations + # Time durations are specified as a number, followed immediately by one of the + # following units: + # ms - millisecond + # s - second + # m - minute + # h - hour + # d - day - assuming a day has always 24h + # w - week - assuming a week has always 7d + # y - year - assuming a year has always 365d + # Example of a valid time duration: 1h30m20s10ms + # A minimum interval of 100ms is enforced for metric collection. However, for + # production environments, we recommend selecting one of the following intervals + # for optimal monitoring: + # 15m + # 30m + # 1h + # 4h + # 6h + interval: 1h + # -- outputRule emits metrics as the rule "Falco internal: metrics snapshot", + # which we recommend for seamless metrics and performance monitoring. + # This option is particularly useful when Falco logs are preserved in a data + # lake. Please note that to use this option, the Falco rules config `priority` + # must be set to `info` at a minimum. + outputRule: false + # -- rulesCountersEnabled specifies whether the counts for each rule should be emitted. + rulesCountersEnabled: true + # -- resourceUtilizationEnabled emits CPU and memory usage metrics. CPU usage + # is reported as a percentage of one CPU and can be normalized to the total + # number of CPUs to determine overall usage. Memory metrics are provided in raw + # units (`kb` for `RSS`, `PSS` and `VSZ` or `bytes` for `container_memory_used`) + # and can be uniformly converted to megabytes (MB) using the + # `convert_memory_to_mb` functionality. In environments such as Kubernetes, when + # Falco is deployed as a DaemonSet, it is crucial to track Falco's container memory usage. 
+ # To customize the path of the memory metric file, you can create an environment + # variable named `FALCO_CGROUP_MEM_PATH` and set it to the desired file path. By + # default, Falco uses the file `/sys/fs/cgroup/memory/memory.usage_in_bytes` to + # monitor container memory usage, which aligns with Kubernetes' + # `container_memory_working_set_bytes` metric. Finally, we emit the overall host + # CPU and memory usages, along with the total number of processes and open file + # descriptors (fds) on the host, obtained from the proc file system unrelated to + # Falco's monitoring. These metrics help assess Falco's usage in relation to the + # server's workload intensity. + resourceUtilizationEnabled: true + # stateCountersEnabled emits counters related to Falco's state engine, including + # added, removed threads or file descriptors (fds), and failed lookup, store, or + # retrieve actions in relation to Falco's underlying process cache table (threadtable). + # We also log the number of currently cached containers if applicable. + stateCountersEnabled: true + # kernelEventCountersEnabled emits kernel side event and drop counters, as + # an alternative to `syscall_event_drops`, but with some differences. These + # counters reflect monotonic values since Falco's start and are exported at a + # constant stats interval. + kernelEventCountersEnabled: true + # -- libbpfStatsEnabled exposes statistics similar to `bpftool prog show`, + # providing information such as the number of invocations of each BPF program + # attached by Falco and the time spent in each program measured in nanoseconds. + # To enable this feature, the kernel must be >= 5.1, and the kernel + # configuration `/proc/sys/kernel/bpf_stats_enabled` must be set. This option, + # or an equivalent statistics feature, is not available for non `*bpf*` drivers. + # Additionally, please be aware that the current implementation of `libbpf` does + # not support granularity of statistics at the bpf tail call level. 
+ libbpfStatsEnabled: true + # -- convertMemoryToMB specifies whether the memory should be converted to MB. + convertMemoryToMB: true + # -- includeEmptyValues specifies whether the empty values should be included in the metrics. + includeEmptyValues: false + # -- service exposes the metrics service to be accessed from within the cluster. + # ref: https://kubernetes.io/docs/concepts/services-networking/service/ + service: + # -- create specifies whether a service should be created. + create: true + # -- type denotes the service type. Setting it to "ClusterIP" ensures that the metrics are accessible + # from within the cluster. + type: ClusterIP + # -- ports denotes all the ports on which the Service will listen. + ports: + # -- metrics denotes a listening service named "metrics". + metrics: + # -- port is the port on which the Service will listen. + port: 8765 + # -- targetPort is the port on which the Pod is listening. + targetPort: 8765 + # -- protocol specifies the network protocol that the Service should use for the associated port. + protocol: "TCP" + # File access configuration (scenario requirement) mounts: # -- A list of volumes you want to add to the Falco pods. @@ -450,6 +543,48 @@ falcoctl: # -- See the fields of the artifact.install section. pluginsDir: /plugins +# -- serviceMonitor holds the configuration for the ServiceMonitor CRD. +# A ServiceMonitor is a custom resource definition (CRD) used to configure how Prometheus should +# discover and scrape metrics from the Falco service. +serviceMonitor: + # -- create specifies whether a ServiceMonitor CRD should be created for a prometheus operator. + # https://github.com/coreos/prometheus-operator + # Enable it only if the ServiceMonitor CRD is installed in your cluster. + create: false + # -- path at which the metrics are exposed by Falco. + path: /metrics + # -- labels set of labels to be applied to the ServiceMonitor resource. 
+ # If your Prometheus deployment is configured to use serviceMonitorSelector, then add the right + # label here in order for the ServiceMonitor to be selected for target discovery. + labels: {} + # -- selector set of labels that should match the labels on the Service targeted by the current serviceMonitor. + selector: {} + # -- interval specifies the time interval at which Prometheus should scrape metrics from the service. + interval: 15s + # -- scheme specifies network protocol used by the metrics endpoint. In this case HTTP. + scheme: http + # -- tlsConfig specifies TLS (Transport Layer Security) configuration for secure communication when + # scraping metrics from a service. It allows you to define the details of the TLS connection, such as + # CA certificate, client certificate, and client key. Currently, the k8s-metacollector does not support + # TLS configuration for the metrics endpoint. + tlsConfig: {} + # insecureSkipVerify: false + # caFile: /path/to/ca.crt + # certFile: /path/to/client.crt + # keyFile: /path/to/client.key + # -- scrapeTimeout determines the maximum time Prometheus should wait for a target to respond to a scrape request. + # If the target does not respond within the specified timeout, Prometheus considers the scrape as failed for + # that target. + scrapeTimeout: 10s + # -- relabelings configures the relabeling rules to apply the target’s metadata labels. + relabelings: [] + # -- targetLabels defines the labels which are transferred from the associated Kubernetes service object onto the ingested metrics. + targetLabels: [] + # -- endpointPort is the port in the Falco service that exposes the metrics service. Change the value if you deploy a custom service + # for Falco's metrics. + endpointPort: "metrics" + + ###################### # falco.yaml config # ######################