CORE-2388 rptest: remove node from started list after kill #17913

Merged 1 commit on Apr 17, 2024

@nvartolomei (Contributor) commented Apr 17, 2024

We have a wrapper, RedpandaService.wait_until, which waits for a
condition only as long as the cluster is "healthy". Healthy is defined
as all "started nodes" being up.

When we call signal_redpanda to kill a particular node on purpose, we
also need to remove that node from the started nodes list. Otherwise
the above check will trip, thinking that the node has crashed or
something similar.

I tried to add this logic inside the signal_redpanda method in
#17889, but discovered that some tests already rely on the existing
behavior.

@bharathv noticed that stop_node can be used instead, so this PR does
that.
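
To illustrate the bookkeeping described above, here is a minimal toy sketch. It is not the rptest code: ToyNode, ToyService, signal_node and kill_then_wait are hypothetical names, and the bodies of stop_node and wait_until are assumptions that only model the behavior this description relies on (a health check over the started list, and a stop that also updates that list).

```python
class ToyNode:
    """Hypothetical stand-in for a cluster node (not a real ducktape node)."""

    def __init__(self, name):
        self.name = name
        self.alive = False


class ToyService:
    """Toy model of the started-nodes bookkeeping; not the real RedpandaService."""

    def __init__(self):
        self.started = []  # nodes the service expects to be up

    def start_node(self, node):
        node.alive = True
        if node not in self.started:
            self.started.append(node)

    def signal_node(self, node, sig=9):
        # Assumed model of signal_redpanda: the process dies, but the
        # started list is left untouched.
        node.alive = False

    def stop_node(self, node):
        # Assumed model of stop_node: the process stops AND the node is
        # no longer expected to be up.
        node.alive = False
        if node in self.started:
            self.started.remove(node)

    def healthy(self):
        return all(n.alive for n in self.started)

    def wait_until(self, condition, max_polls=10):
        # Poll the condition, failing fast if any started node is down.
        for _ in range(max_polls):
            if not self.healthy():
                raise RuntimeError("a started node is down")
            if condition():
                return True
        raise TimeoutError("condition not met")


def kill_then_wait(use_stop_node):
    """Kill node b on purpose, then wait on a condition about node a."""
    a, b = ToyNode("a"), ToyNode("b")
    svc = ToyService()
    svc.start_node(a)
    svc.start_node(b)
    if use_stop_node:
        svc.stop_node(b)
    else:
        svc.signal_node(b)
    try:
        return svc.wait_until(lambda: a.alive)
    except RuntimeError:
        return False
```

In this sketch, kill_then_wait(False) fails the health check because the killed node is still in the started list, while kill_then_wait(True) succeeds because the stop also removed it.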

Fixes #17886

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.3.x
  • v23.2.x

Release Notes

  • none

vbotbuildovich commented Apr 17, 2024

@bharathv merged commit f74837d into redpanda-data:dev on Apr 17, 2024
17 checks passed
@dotnwat changed the title from "rptest: remove node from started list after kill" to "CORE-2388 rptest: remove node from started list after kill" on Apr 19, 2024
4 participants