Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pod termination fault #359

Merged
merged 8 commits into from
Oct 26, 2023
Merged

Pod termination fault #359

merged 8 commits into from
Oct 26, 2023

Conversation

pablochacin
Copy link
Collaborator

@pablochacin pablochacin commented Oct 24, 2023

Description

Introduces the Terminate pod fault.

Documentation: grafana/k6-docs#1381

Checklist:

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works.
  • I have run linter locally (make lint) and all checks pass.
  • I have run tests locally (make test) and all tests pass.
  • I have run relevant integration test locally (make integration-xxx for affected packages)
  • I have run relevant e2e test locally (make e2e-xxx for disruptors, or cluster related changes)
  • Any dependent changes have been merged and published in downstream modules

@pablochacin pablochacin force-pushed the pod-termination-fault branch 2 times, most recently from 83e47f7 to d369de2 Compare October 25, 2023 08:12
@pablochacin pablochacin marked this pull request as ready for review October 25, 2023 08:16
Copy link
Collaborator

@roobre roobre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good to me, left minor comments about error messages and style

@@ -55,22 +57,24 @@ func Test_PodDisruptor(t *testing.T) {
return
}

t.Run("Test fault injection", func(t *testing.T) {
t.Run("TProtocol fault injection", func(t *testing.T) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small typo

d.terminatePods(fault)
`,
expectError: true,
},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to add a test for an empty fault object as well

Suggested change
},
},
{
description: "Terminate Pods (empty fault)",
script: `
d.terminatePods({})
`,
expectError: true,
},

{
description: "Terminate Pods (missing argument)",
script: `

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stray newline

Suggested change

Comment on lines +149 to +150
func (d *podDisruptor) Targets(_ context.Context) ([]string, error) {
return utils.PodNames(d.targets), nil
Copy link
Collaborator

@roobre roobre Oct 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the two implementations of Targets no longer use a context neither return an error, should we update the interface to reflect this?

image

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure we will need the context to retrieve the list of targets instead of using a pre-stored list, and this process may return an error. I prefer to leave it as it is instead of changing the implementation back and forth.

type PodTerminationFault struct {
// Count indicates how many pods to terminate. Can be a number or a percentage or targets
Count intstr.IntOrString
// Timeout specifies the maximum time to wait for a pod to terminate
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that Terminate will return an error if the pod is still in Terminating state after this amount of time passes? If so, it might be good to include it in the comment above just for clarity.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm putting the comment in the method TerminatePods as it where the error is returned

expectError: false,
},
{
title: "pod does not exists",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small typo

Suggested change
title: "pod does not exists",
title: "pod does not exist",

@@ -14,7 +15,7 @@ import (
"k8s.io/apimachinery/pkg/util/intstr"
)

// RunPod creates a pod and waits it for be running
// RunPod creates apreplicas pod and waits it for be running
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a leftover from a previous iteration where RunPod received a replicas arg.

// The ingress routes request that specify the service's name as host to this service.
func ExposeApp(
k8s kubernetes.Kubernetes,
namespace string,
pod corev1.Pod,
replicas int,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we return an error if replicas == 0? Seems like it could be the source of one of annoying to debug situations where nothing happens but no error is returned.

Comment on lines 73 to 76
names := []string{}
for _, pod := range pods {
names = append(names, pod.Name)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably preallocate the names slice here. It's not like it's the hot path, but maybe just as a "best practices" approach:

Suggested change
names := []string{}
for _, pod := range pods {
names = append(names, pod.Name)
}
names := make([]string, 0, len(pods))
for _, pod := range pods {
names = append(names, pod.Name)
}

}

if sampleSize > len(pods) {
return nil, fmt.Errorf("not enough elements to sample")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to add the numbers in play to the error message so it is easier to trace what's going on from logs:

Suggested change
return nil, fmt.Errorf("not enough elements to sample")
return nil, fmt.Errorf("cannot sample %d pods out of a total of %d", sampleSize, len(pods))

Signed-off-by: Pablo Chacin <pablochacin@gmail.com>
Signed-off-by: Pablo Chacin <pablochacin@gmail.com>
Signed-off-by: Pablo Chacin <pablochacin@gmail.com>
Signed-off-by: Pablo Chacin <pablochacin@gmail.com>
Signed-off-by: Pablo Chacin <pablochacin@gmail.com>
Signed-off-by: Pablo Chacin <pablochacin@gmail.com>
Signed-off-by: Pablo Chacin <pablochacin@gmail.com>
Signed-off-by: Pablo Chacin <pablochacin@gmail.com>
@pablochacin pablochacin merged commit 65f90aa into main Oct 26, 2023
8 checks passed
@pablochacin pablochacin deleted the pod-termination-fault branch October 26, 2023 08:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants