increased a timeout for cluster replicas #16976

Merged
merged 2 commits into from
Mar 9, 2024

@rpdevmp rpdevmp commented Mar 8, 2024

Buildkite HTT failure: 12163

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.3.x
  • v23.2.x

Release Notes

  • none

Bug Fixes

This change should help the HTT tests pass when they check the cluster replica status. Based on the logs, the current 10-minute timeout appears to be too short.

[INFO - 2024-03-07 06:11:53,073 - high_throughput_test - _stage_add_and_decommission - lineno:1259]: scaling out cluster rp-cnkk9q2vmreg2bvgr9jg from 3 to 4
[INFO - 2024-03-07 06:11:53,645 - high_throughput_test - _stage_add_and_decommission - lineno:1264]: waiting for cluster rp-cnkk9q2vmreg2bvgr9jg to have ready replicas 4
...
...
[INFO - 2024-03-07 06:21:58,549 - kubectl - _cmd - lineno:161]: Command failed (rc=1).
--------- stdout -----------
--------- stderr -----------
error: timed out waiting for the condition on clusters/rp-cnkk9q2vmreg2bvgr9jg
ERROR: Process exited with status 1
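
To make the timing in the log excerpt above easier to check, here is a minimal Python sketch that computes how long the wait ran before kubectl gave up. The two timestamps are copied from the log; the constant and function names are hypothetical and are not taken from the test code.

```python
from datetime import datetime

# Timestamps copied from the log excerpt above.
WAIT_STARTED = "2024-03-07 06:11:53,645"  # "waiting for cluster ... to have ready replicas 4"
WAIT_FAILED = "2024-03-07 06:21:58,549"   # "Command failed (rc=1)."

# Format of the timestamps in the log lines (comma before the milliseconds).
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    started = datetime.strptime(start, LOG_TIME_FORMAT)
    ended = datetime.strptime(end, LOG_TIME_FORMAT)
    return (ended - started).total_seconds()


elapsed = elapsed_seconds(WAIT_STARTED, WAIT_FAILED)
print(f"wait ran for {elapsed:.3f} s")
```

The result is just under 605 seconds, which lines up with the wait hitting a 600-second limit rather than failing for some other reason.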

Update:
Message from Simon,

It timed out waiting for the 4th pod to appear. I see the timeout is 600 seconds (10 minutes). I think adding a node takes approximately 10 minutes in total (7 minutes to reconcile, and 3 minutes to launch, reboot, then configure XFS RAID before the pod starts), so 10 minutes is too tight. Best to increase that to 20 minutes, IMO.

Based on this discussion, the timeout is increased to 20 minutes.
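
The reasoning behind the new value can be sketched as a small budget calculation. This is only an illustration of the estimate quoted above: the phase durations are the rough figures from that message, and every name in the snippet is hypothetical rather than taken from the test code.

```python
# Rough per-phase estimates from the discussion, in minutes (approximate figures).
RECONCILE_MINUTES = 7        # time for the scale-out to reconcile
NODE_BRINGUP_MINUTES = 3     # launch, reboot, configure XFS RAID before the pod starts

OLD_TIMEOUT_SECONDS = 600    # previous limit (10 minutes)
NEW_TIMEOUT_SECONDS = 20 * 60  # limit after this change (20 minutes)


def headroom_seconds(timeout_seconds: int, *phase_minutes: int) -> int:
    """Return the slack left after subtracting the estimated phases from the timeout."""
    estimated_seconds = sum(phase_minutes) * 60
    return timeout_seconds - estimated_seconds


old_headroom = headroom_seconds(OLD_TIMEOUT_SECONDS, RECONCILE_MINUTES, NODE_BRINGUP_MINUTES)
new_headroom = headroom_seconds(NEW_TIMEOUT_SECONDS, RECONCILE_MINUTES, NODE_BRINGUP_MINUTES)

print(f"old headroom: {old_headroom} s")
print(f"new headroom: {new_headroom} s")
```

With the old limit the estimate leaves no slack at all, so any small delay causes a timeout; the 20-minute limit leaves roughly ten minutes of margin on the same estimate.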

@rpdevmp rpdevmp merged commit bbbc05c into dev Mar 9, 2024
17 checks passed
@rpdevmp rpdevmp deleted the rpdevmp/bkite-12163 branch March 9, 2024 04:22