Pod termination fault #359

pablochacin · 2023-10-24T13:28:27Z

Description

Introduces the Terminate pod fault.

Documentation: grafana/k6-docs#1381

Checklist:

I have performed a self-review of my code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have added tests that prove my fix is effective or that my feature works.
I have run linter locally (make lint) and all checks pass.
I have run tests locally (make test) and all tests pass.
I have run relevant integration test locally (make integration-xxx for affected packages)
I have run relevant e2e test locally (make e2e-xxx for disruptors, or cluster related changes)
Any dependent changes have been merged and published in downstream modules

roobre

Looking good to me, left minor comments about error messages and style

roobre · 2023-10-25T12:07:40Z

e2e/disruptors/pod_e2e_test.go

@@ -55,22 +57,24 @@ func Test_PodDisruptor(t *testing.T) {
 		return
 	}

-	t.Run("Test fault injection", func(t *testing.T) {
+	t.Run("TProtocol fault injection", func(t *testing.T) {


roobre · 2023-10-25T12:35:52Z

pkg/api/api_test.go

+			d.terminatePods(fault)
+			`,
+			expectError: true,
+		},


I'd suggest adding a test for an empty fault object as well

Suggested change

-		},
+		},
+		{
+			description: "Terminate Pods (empty fault)",
+			script: `
+			d.terminatePods({})
+			`,
+			expectError: true,
+		},

roobre · 2023-10-25T12:36:09Z

pkg/api/api_test.go

+		{
+			description: "Terminate Pods (missing argument)",
+			script: `
+


Stray newline

Suggested change

roobre · 2023-10-25T13:02:21Z

pkg/disruptors/pod.go

+func (d *podDisruptor) Targets(_ context.Context) ([]string, error) {
+	return utils.PodNames(d.targets), nil


Seems like the two implementations of Targets no longer use a context or return an error. Should we update the interface to reflect this?

I'm pretty sure we will need the context to retrieve the list of targets instead of using a pre-stored list, and this process may return an error. I prefer to leave it as it is instead of changing the implementation back and forth.

roobre · 2023-10-25T13:12:14Z

pkg/disruptors/terminate.go

+type PodTerminationFault struct {
+	// Count indicates how many pods to terminate. Can be a number or a percentage or targets
+	Count intstr.IntOrString
+	// Timeout specifies the maximum time to wait for a pod to terminate


Does this mean that Terminate will return an error if the pod is still in Terminating state after this amount of time passes? If so, it might be good to include it in the comment above just for clarity.

I'm putting the comment on the TerminatePods method, as that is where the error is returned.

roobre · 2023-10-25T13:17:25Z

pkg/kubernetes/helpers/pods_test.go

+			expectError: false,
+		},
+		{
+			title:     "pod does not exists",


Small typo

Suggested change

-			title:     "pod does not exists",
+			title:     "pod does not exist",

roobre · 2023-10-25T13:18:58Z

pkg/testutils/e2e/deploy/deploy.go

@@ -14,7 +15,7 @@ import (
 	"k8s.io/apimachinery/pkg/util/intstr"
 )

-// RunPod creates a pod and waits it for be running
+// RunPod creates apreplicas pod and waits it for be running


Seems like a leftover from a previous iteration where RunPod received a replicas arg.

roobre · 2023-10-25T13:20:42Z

pkg/testutils/e2e/deploy/deploy.go

 // The ingress routes request that specify the service's name as host to this service.
 func ExposeApp(
 	k8s kubernetes.Kubernetes,
 	namespace string,
 	pod corev1.Pod,
+	replicas int,


Should we return an error if replicas == 0? It seems like it could be the source of one of those annoying-to-debug situations where nothing happens but no error is returned.

roobre · 2023-10-25T13:24:10Z

pkg/utils/kubernetes.go

+	names := []string{}
+	for _, pod := range pods {
+		names = append(names, pod.Name)
+	}


We can probably preallocate the names slice here. It's not like it's the hot path, but maybe just as a "best practices" approach:

Suggested change

-	names := []string{}
-	for _, pod := range pods {
-		names = append(names, pod.Name)
-	}
+	names := make([]string, 0, len(pods))
+	for _, pod := range pods {
+		names = append(names, pod.Name)
+	}

roobre · 2023-10-25T13:27:18Z

pkg/utils/kubernetes.go

+	}
+
+	if sampleSize > len(pods) {
+		return nil, fmt.Errorf("not enough elements to sample")


I'd suggest adding the numbers in play to the error message so it is easier to trace what's going on from the logs:

Suggested change

-		return nil, fmt.Errorf("not enough elements to sample")
+		return nil, fmt.Errorf("cannot sample %d pods out of a total of %d", sampleSize, len(pods))
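To show the suggested message in context, here is a minimal, self-contained sketch of a sampling helper that reports both numbers when the request cannot be satisfied. `samplePodNames` is a hypothetical function operating on plain names. The PR's helper works on pod objects and may select elements differently.

```go
package main

import (
	"fmt"
	"math/rand"
)

// samplePodNames returns sampleSize names picked at random without
// replacement. It is a hypothetical stand-in for the helper under review.
func samplePodNames(names []string, sampleSize int) ([]string, error) {
	if sampleSize < 0 {
		return nil, fmt.Errorf("sample size must not be negative, got %d", sampleSize)
	}
	if sampleSize > len(names) {
		return nil, fmt.Errorf("cannot sample %d pods out of a total of %d", sampleSize, len(names))
	}

	// Shuffle a copy so the caller's slice keeps its order.
	shuffled := make([]string, len(names))
	copy(shuffled, names)
	rand.Shuffle(len(shuffled), func(i, j int) {
		shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
	})

	return shuffled[:sampleSize], nil
}

func main() {
	pods := []string{"pod-a", "pod-b", "pod-c"}

	sample, err := samplePodNames(pods, 2)
	fmt.Println(len(sample), err)

	_, err = samplePodNames(pods, 5)
	fmt.Println(err)
}
```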

Signed-off-by: Pablo Chacin <pablochacin@gmail.com>

pablochacin force-pushed the pod-termination-fault branch 2 times, most recently from 83e47f7 to d369de2 Compare October 25, 2023 08:12

pablochacin marked this pull request as ready for review October 25, 2023 08:16

pablochacin requested a review from roobre October 25, 2023 08:16

roobre reviewed Oct 25, 2023


pablochacin added 8 commits October 25, 2023 17:53

Make PodController stateless

90b798e

Signed-off-by: Pablo Chacin <pablochacin@gmail.com>

Add helpers for deleting pods

76882eb

Signed-off-by: Pablo Chacin <pablochacin@gmail.com>

Implement TerminatePod fault

4336626

Signed-off-by: Pablo Chacin <pablochacin@gmail.com>

Implement e2e test for TerminatePod

93fce2e

Signed-off-by: Pablo Chacin <pablochacin@gmail.com>

Add helper for handing percentages

738e229

Signed-off-by: Pablo Chacin <pablochacin@gmail.com>

Use int or percentage in pod termination count

e42eb59

Signed-off-by: Pablo Chacin <pablochacin@gmail.com>

Rename Pod termination fault

e87dd78

Signed-off-by: Pablo Chacin <pablochacin@gmail.com>

Expose pod termination fault in JS API

e38a6db

Signed-off-by: Pablo Chacin <pablochacin@gmail.com>

pablochacin force-pushed the pod-termination-fault branch from 3a0d652 to e38a6db Compare October 26, 2023 08:31

pablochacin merged commit 65f90aa into main Oct 26, 2023
8 checks passed

pablochacin deleted the pod-termination-fault branch October 26, 2023 08:52
