TESTING.md

When contributing code to Prometheus-Operator, you'll notice that every Pull Request runs against an extensive test suite. Among the many benefits that tests bring to the project's overall health and reliability, they can be the reviewer's and the contributor's best friend during development:

  • Test cases serve as documentation, providing insights into the expected behavior of the software.
  • Testing can prevent regressions by verifying that new changes don't break existing functionality.
  • Running tests locally accelerates the feedback loop, removing the dependency that contributors might have on CI when working on a Pull Request.

This document will walk you through the different test suites that we currently have and how to run different scenarios to help your development experience!

Test categories

Unit tests

Unit tests are used to test particular code snippets in isolation. They are your best ally when looking for a quick feedback loop on a particular function.

Imagine you're working on a PR that adds a new field to the ScrapeConfig CRD and you want to test whether your change is reflected in the generated configuration. Instead of creating a full Kubernetes cluster, installing all the CRDs, running Prometheus-Operator, deploying a Prometheus resource with ScrapeConfigSelectors and finally checking whether your change made it to the live object, you can simply write or extend a unit test for the configuration generation.

Here is an example test that checks whether the string generated from a ScrapeConfig is equal to an expected file.

// When adding new test cases the developer should specify a name, a ScrapeConfig Spec
// (scSpec) and an expected config in a golden file (.golden file in the testdata folder).
// (Optional) It's also possible to specify a function (patchProm) that modifies the
// default Prometheus CR used, if necessary for the test case.
func TestScrapeConfigSpecConfig(t *testing.T) {
	refreshInterval := monitoringv1.Duration("5m")
	for _, tc := range []struct {
		name      string
		patchProm func(*monitoringv1.Prometheus)
		scSpec    monitoringv1alpha1.ScrapeConfigSpec
		golden    string
	}{
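
For illustration only, a hypothetical entry in that table could look like the one below; the name, spec contents and golden file name are made up, but they show how each case maps to a file under testdata:

{
	// Hypothetical case: the values here are illustrative, not taken from the real test table.
	name: "static_config",
	scSpec: monitoringv1alpha1.ScrapeConfigSpec{
		// the ScrapeConfig spec under test goes here
	},
	golden: "ScrapeConfigSpecConfig_Static.golden",
},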

Unit tests can be run with:

make test-unit

They can also be run for particular packages:

go test ./pkg/prometheus/server

Or even particular functions:

go test -run ^TestPodLabelsAnnotations$ ./pkg/prometheus/server

Testing multi-line string comparisons - Golden files

Golden files are plain-text documents designed to facilitate the validation of lengthy strings. They come in handy when, for instance, you need to test a Prometheus configuration that's generated using Go structures. You can marshal this configuration into YAML and then compare it against a static reference to ensure a match. Golden files offer an elegant solution to this challenge, sparing you the need to hard-code the static configuration directly into your test code.
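
At its core the pattern needs very little code: produce the string under test, then hand it to the golden helper together with the name of the expected file under testdata. Below is a minimal, self-contained sketch; the generateConfig function is made up for illustration and the import path of the golden package is an assumption, but the golden.Assert call mirrors the one used in the real tests:

package example

import (
	"testing"

	"github.com/stretchr/testify/require"
	"gotest.tools/v3/golden" // assumption: any golden helper with Assert(t, actual, filename) works the same way
)

// generateConfig is a stand-in for the code under test; in the operator it would be
// the Prometheus configuration generator.
func generateConfig(scrapeInterval string) (string, error) {
	return "global:\n  scrape_interval: " + scrapeInterval + "\n", nil
}

func TestGenerateConfig(t *testing.T) {
	for _, tc := range []struct {
		name           string
		scrapeInterval string
		golden         string // expected output stored under testdata/
	}{
		{name: "default interval", scrapeInterval: "30s", golden: "default_interval.golden"},
	} {
		t.Run(tc.name, func(t *testing.T) {
			cfg, err := generateConfig(tc.scrapeInterval)
			require.NoError(t, err)
			// Fails with a diff when the generated string doesn't match the golden file.
			golden.Assert(t, cfg, tc.golden)
		})
	}
}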

In the example below, we're generating the Prometheus configuration (which can easily have 100+ lines for each individual test) and comparing it against a golden file:

func TestGlobalSettings(t *testing.T) {
	var (
		expectedBodySizeLimit monitoringv1.ByteSize = "1000MB"
		expectedSampleLimit uint64 = 10000
		expectedTargetLimit uint64 = 1000
		expectedLabelLimit uint64 = 50
		expectedLabelNameLengthLimit uint64 = 40
		expectedLabelValueLengthLimit uint64 = 30
		expectedkeepDroppedTargets uint64 = 50
	)
	for _, tc := range []struct {
		Scenario string
		EvaluationInterval monitoringv1.Duration
		ScrapeInterval monitoringv1.Duration
		ScrapeTimeout monitoringv1.Duration
		ExternalLabels map[string]string
		PrometheusExternalLabelName *string
		ReplicaExternalLabelName *string
		QueryLogFile string
		Version string
		BodySizeLimit *monitoringv1.ByteSize
		SampleLimit *uint64
		TargetLimit *uint64
		LabelLimit *uint64
		LabelNameLengthLimit *uint64
		LabelValueLengthLimit *uint64
		KeepDroppedTargets *uint64
		ExpectError bool
		Golden string
	}{
		{
			Scenario: "valid config",
			Version: "v2.15.2",
			ScrapeInterval: "15s",
			EvaluationInterval: "30s",
			Golden: "global_settings_valid_config_v2.15.2.golden",
		},
		{
			Scenario: "invalid scrape timeout specified when scrape interval specified",
			Version: "v2.30.0",
			ScrapeInterval: "30s",
			ScrapeTimeout: "60s",
			Golden: "invalid_scrape_timeout_specified_when_scrape_interval_specified.golden",
			ExpectError: true,
		},
		{
			Scenario: "valid scrape timeout along with valid scrape interval specified",
			Version: "v2.15.2",
			ScrapeInterval: "60s",
			ScrapeTimeout: "10s",
			EvaluationInterval: "30s",
			Golden: "valid_scrape_timeout_along_with_valid_scrape_interval_specified.golden",
		},
		{
			Scenario: "external label specified",
			Version: "v2.15.2",
			ScrapeInterval: "30s",
			EvaluationInterval: "30s",
			ExternalLabels: map[string]string{
				"key1": "value1",
				"key2": "value2",
			},
			Golden: "external_label_specified.golden",
		},
		{
			Scenario: "external label specified along with reserved labels",
			Version: "v2.45.0",
			ScrapeInterval: "30s",
			EvaluationInterval: "30s",
			ExternalLabels: map[string]string{
				"prometheus_replica": "1",
				"prometheus": "prometheus-k8s-1",
				"some-other-key": "some-value",
			},
			PrometheusExternalLabelName: ptr.To("prometheus"),
			ReplicaExternalLabelName: ptr.To("prometheus_replica"),
			Golden: "external_label_specified_along_with_reserved_labels.golden",
		},
		{
			Scenario: "query log file",
			Version: "v2.16.0",
			ScrapeInterval: "30s",
			EvaluationInterval: "30s",
			QueryLogFile: "test.log",
			Golden: "query_log_file.golden",
		},
		{
			Scenario: "valid global limits",
			Version: "v2.45.0",
			ScrapeInterval: "30s",
			EvaluationInterval: "30s",
			BodySizeLimit: &expectedBodySizeLimit,
			SampleLimit: &expectedSampleLimit,
			TargetLimit: &expectedTargetLimit,
			Golden: "valid_global_limits.golden",
		},
		{
			Scenario: "valid global config with label limits",
			Version: "v2.45.0",
			ScrapeInterval: "30s",
			EvaluationInterval: "30s",
			BodySizeLimit: &expectedBodySizeLimit,
			SampleLimit: &expectedSampleLimit,
			TargetLimit: &expectedTargetLimit,
			LabelLimit: &expectedLabelLimit,
			LabelNameLengthLimit: &expectedLabelNameLengthLimit,
			LabelValueLengthLimit: &expectedLabelValueLengthLimit,
			Golden: "valid_global_config_with_label_limits.golden",
		},
		{
			Scenario: "valid global config with keep dropped targets",
			Version: "v2.47.0",
			ScrapeInterval: "30s",
			EvaluationInterval: "30s",
			KeepDroppedTargets: &expectedkeepDroppedTargets,
			Golden: "valid_global_config_with_keep_dropped_targets.golden",
		},
	} {
		p := &monitoringv1.Prometheus{
			ObjectMeta: metav1.ObjectMeta{},
			Spec: monitoringv1.PrometheusSpec{
				CommonPrometheusFields: monitoringv1.CommonPrometheusFields{
					ScrapeInterval: tc.ScrapeInterval,
					ScrapeTimeout: tc.ScrapeTimeout,
					ExternalLabels: tc.ExternalLabels,
					PrometheusExternalLabelName: tc.PrometheusExternalLabelName,
					ReplicaExternalLabelName: tc.ReplicaExternalLabelName,
					Version: tc.Version,
					TracingConfig: nil,
					BodySizeLimit: tc.BodySizeLimit,
					SampleLimit: tc.SampleLimit,
					TargetLimit: tc.TargetLimit,
					LabelLimit: tc.LabelLimit,
					LabelNameLengthLimit: tc.LabelNameLengthLimit,
					LabelValueLengthLimit: tc.LabelValueLengthLimit,
					KeepDroppedTargets: tc.KeepDroppedTargets,
				},
				EvaluationInterval: tc.EvaluationInterval,
				QueryLogFile: tc.QueryLogFile,
			},
		}
		cg := mustNewConfigGenerator(t, p)
		t.Run(fmt.Sprintf("case %s", tc.Scenario), func(t *testing.T) {
			cfg, err := cg.GenerateServerConfiguration(
				context.Background(),
				p.Spec.EvaluationInterval,
				p.Spec.QueryLogFile,
				p.Spec.RuleSelector,
				p.Spec.Exemplars,
				p.Spec.TSDB,
				p.Spec.Alerting,
				p.Spec.RemoteRead,
				map[string]*monitoringv1.ServiceMonitor{},
				nil,
				nil,
				nil,
				&assets.Store{},
				nil,
				nil,
				nil,
				nil,
			)
			if tc.ExpectError {
				require.Error(t, err)
			} else {
				require.NoError(t, err)
			}
			golden.Assert(t, string(cfg), tc.Golden)
		})
	}
}

Without golden files, the test above would easily require around 1000 lines instead of ~150. Golden files help us maintain test suites with many multi-line string comparisons without sacrificing test readability.

Updating Golden Files

Some contributions, e.g. adding a new required field to an existing configuration, require updating several golden files at once. This can easily be done with the command below:

make test-unit-update-golden

End-to-end tests

Sometimes, running tests in isolation is not enough and we really want to test the behavior of Prometheus-Operator running in a working Kubernetes cluster. For those occasions, end-to-end tests are our choice.

To run e2e tests locally, first start a Kubernetes cluster. We recommend KinD because it is lightweight (it can run on small notebooks) and it is what the project's CI uses. Minikube is another option.

For manual testing, you can use the utility script scripts/run-external.sh; it will check all the requirements and run your local version of the Prometheus Operator on your KinD cluster:

./scripts/run-external.sh -c

Building images and loading them into your cluster

Using docker with Kind

Before running the automated end-to-end tests, you need to run the following commands to build the images and load them into your local cluster:

make image

for n in "prometheus-operator" "prometheus-config-reloader" "admission-webhook"; do kind load docker-image "quay.io/prometheus-operator/$n:$(git rev-parse --short HEAD)"; done;

Using podman with Kind

When running KinD on macOS with podman, it is recommended to create the podman machine with 4 CPUs and 8 GiB of memory. Fewer resources might cause the end-to-end tests to fail due to a lack of resources in the cluster.

podman machine init --cpus=4 --memory=8192 --rootful --now

Before running the automated end-to-end tests, you need to run the following commands to build the images and load them into your local cluster:

CONTAINER_CLI=podman make image

for n in "prometheus-operator" "prometheus-config-reloader" "admission-webhook"; do podman save --quiet -o tmp/$n.tar "quay.io/prometheus-operator/$n:$(git rev-parse --short HEAD)"; kind load image-archive tmp/$n.tar; done

Running the automated E2E Tests

To run the automated end-to-end tests, run the following command:

make test-e2e

make test-e2e runs the complete end-to-end test suite. These are the same tests we run in Pull Request pipelines, and they make sure that the feature requirements across all controllers are met.

When working on a contribution though, it's rare that you'll need to make a change that impacts all controllers at once. Running the complete test suite takes a long time, so you might want to run only the tests that are relevant to your change while developing it.

Skipping test suites

func skipPrometheusTests(t *testing.T) {
	if os.Getenv("EXCLUDE_PROMETHEUS_TESTS") != "" {
		t.Skip("Skipping Prometheus tests")
	}
}

As shown above, particular test suites can be skipped with environment variables. You can also look at our CI pipeline as an example. Although we always run all tests in CI, skipping irrelevant tests is great during development because it shortens the feedback loop.

The following Makefile targets can run specific end-to-end tests:

  • make test-e2e-alertmanager - Will run Alertmanager tests.
  • make test-e2e-thanos-ruler - Will run Thanos-Ruler tests.
  • make test-e2e-prometheus - Will run Prometheus tests with limited namespace permissions.
  • make test-e2e-prometheus-all-namespaces - Will run regular Prometheus tests.
  • make test-e2e-operator-upgrade - Will validate that a monitoring stack managed by the previous version of Prometheus-Operator will continue to work after an upgrade to the current version.
  • make test-e2e-prometheus-upgrade - Will validate that a series of Prometheus versions can be sequentially upgraded.

Running just a particular end-to-end test

A few test suites can easily take more than an hour, even on powerful notebooks. If you're debugging a particular test, it might be advantageous to comment out code just to speed up your test runs.

// TestDenylist tests the Prometheus Operator configured not to watch specific namespaces.
func TestDenylist(t *testing.T) {
	skipPrometheusTests(t)
	testFuncs := map[string]func(t *testing.T){
+		// "Prometheus":     testDenyPrometheus,
+		// "ServiceMonitor": testDenyServiceMonitor,
-		"Prometheus":     testDenyPrometheus,
-		"ServiceMonitor": testDenyServiceMonitor,
		"ThanosRuler":    testDenyThanosRuler,
	}

	for name, f := range testFuncs {
		t.Run(name, f)
	}
}

In the example above, we comment out two tests, in combination with environment variables to skip other test suites, to make sure we focus on what really matters to us at the moment. Just don't forget to remove the comments once you're done!