PESDLC-985 Raise crash errors on pod restarts #17178

Merged: 1 commit, Mar 26, 2024

@savex (Contributor) commented Mar 18, 2024

Detect pod restarts and treat them as crashes
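
As a rough illustration of what "detect pod restarts and treat them as crashes" can look like, here is a minimal sketch. It is not the code from this PR: the names `PodCrashError`, `restart_counts`, `restarted_pods`, and `raise_on_pod_restarts` are hypothetical, and it assumes pod data shaped like `kubectl get pods -o json` output, where each container status carries a `restartCount`.

```python
from typing import Dict, List


class PodCrashError(RuntimeError):
    """Hypothetical error raised when a pod restarted during a test."""


def restart_counts(pod_list: dict) -> Dict[str, int]:
    """Sum container restart counts per pod.

    Assumes input shaped like `kubectl get pods -o json` output.
    """
    counts: Dict[str, int] = {}
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        statuses = pod.get("status", {}).get("containerStatuses", [])
        counts[name] = sum(s.get("restartCount", 0) for s in statuses)
    return counts


def restarted_pods(before: Dict[str, int], after: Dict[str, int]) -> List[str]:
    """Return names of pods whose restart count grew between two snapshots."""
    return sorted(
        name for name, count in after.items() if count > before.get(name, 0)
    )


def raise_on_pod_restarts(before: Dict[str, int], after: Dict[str, int]) -> None:
    """Treat any restart between the snapshots as a crash."""
    restarted = restarted_pods(before, after)
    if restarted:
        raise PodCrashError(
            f"pods restarted during test: {', '.join(restarted)}"
        )
```

The idea is to take one snapshot before the test and one after, then compare them, so restarts that predate the test are not reported as crashes.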

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.3.x
  • v23.2.x

Release Notes

  • none

@savex savex marked this pull request as ready for review March 18, 2024 23:48
@savex (Contributor, Author) commented Mar 18, 2024

@travisdowns (Member) left a comment

See comment.

@savex savex marked this pull request as draft March 20, 2024 17:00
@savex savex force-pushed the PESDLC-985-raise-on-pod-crash branch from 0808934 to 370d189 Compare March 20, 2024 21:34
@savex (Contributor, Author) commented Mar 20, 2024

Tested on EC2:

```
status:     PASS
run time:   38 minutes 55.220 seconds
--------------------------------------------------------------------------------
test_id:    rptest.redpanda_cloud_tests.rolling_restart_test.RollingRestartTest.test_rolling_restart
status:     PASS
run time:   4 minutes 37.480 seconds
--------------------------------------------------------------------------------
test_id:    rptest.redpanda_cloud_tests.cloud_self_test.SelfRedpandaCloudTest.test_healthy
status:     PASS
run time:   43.962 seconds
--------------------------------------------------------------------------------
================================================================================
SESSION REPORT (ALL TESTS)
ducktape version: 0.8.18
session_id:       2024-03-20--001
run time:         44 minutes 20.556 seconds
tests run:        3
passed:           3
flaky:            0
failed:           0
ignored:          0
opassed:          0
ofailed:          0
================================================================================
ubuntu@ip-172-31-61-4:~/tests$
```

@savex savex requested a review from travisdowns March 20, 2024 22:11
@savex savex marked this pull request as ready for review March 20, 2024 22:11
@savex savex force-pushed the PESDLC-985-raise-on-pod-crash branch 2 times, most recently from 5eaa2c6 to 18d8d94 Compare March 20, 2024 22:26
@savex (Contributor, Author) commented Mar 20, 2024

/ci-repeat 1

@vbotbuildovich (Collaborator) commented

new failures in https://buildkite.com/redpanda/redpanda/builds/46527#018e5e37-dfd2-4728-a384-ed453420e390:

"rptest.tests.raft_availability_test.RaftAvailabilityTest.test_two_nodes_down"

    Function mimics the raise_on_crash call in RedpandaService so that it can
    be called from metadataadder
@savex savex force-pushed the PESDLC-985-raise-on-pod-crash branch from 18d8d94 to eef9c54 Compare March 21, 2024 21:51
@savex (Contributor, Author) commented Mar 21, 2024

Ran additional checks and updated the call that raises on crash to use a proper instance check
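
To make the "instance check" idea concrete, here is a small sketch of guarding a crash check with `isinstance`, so it only runs on services that actually support it. This is an assumption about the general pattern, not the code from this PR: `BaseService`, `CrashCheckingService`, and `raise_on_crash_for` are hypothetical names, and only `raise_on_crash` is borrowed from the discussion.

```python
class BaseService:
    """Hypothetical stand-in for a generic test service."""


class CrashCheckingService(BaseService):
    """Hypothetical stand-in for a service that exposes raise_on_crash()."""

    def __init__(self, crashed_nodes=None):
        self.crashed_nodes = list(crashed_nodes or [])

    def raise_on_crash(self):
        # Raise if any node was recorded as crashed.
        if self.crashed_nodes:
            raise RuntimeError(
                f"crash detected on: {', '.join(self.crashed_nodes)}"
            )


def raise_on_crash_for(services):
    """Call raise_on_crash() only on services that implement it.

    Returns the number of services that were checked.
    """
    checked = 0
    for svc in services:
        # The instance check skips services without crash detection
        # instead of failing with AttributeError.
        if isinstance(svc, CrashCheckingService):
            svc.raise_on_crash()
            checked += 1
    return checked
```

With this guard, a shared wrapper can run the check across a mixed list of services without needing to know each one's type in advance.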

@savex (Contributor, Author) commented Mar 22, 2024

Tested manually on EC2 deployment with tier-1-aws-v2-arm profile

@rpdevmp (Contributor) left a comment

LGTM

@savex savex merged commit d8b79cd into dev Mar 26, 2024
16 checks passed
@savex savex deleted the PESDLC-985-raise-on-pod-crash branch March 26, 2024 17:03
4 participants