[DocDB] Return before deadline from AreNodesSafeToTakeDown #21855

Closed
1 task done
SrivastavaAnubhav opened this issue Apr 5, 2024 · 0 comments
Labels
area/docdb (YugabyteDB core features), kind/enhancement (This is an enhancement of an existing feature), priority/medium (Medium priority issue)

Comments

SrivastavaAnubhav commented Apr 5, 2024

Jira Link: DB-10754

Description

AreNodesSafeToTakeDown currently uses the RPC's deadline as its own deadline for how long to wait for responses from tservers and masters. As a result, the client does not get a readable error when the timeout is hit, because by then the client has usually stopped waiting on the RPC as well. Instead, AreNodesSafeToTakeDown should return before the RPC deadline and provide a readable error message.
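
To illustrate the requested behavior, here is a rough C++ sketch of waiting on server responses against an internal deadline that is earlier than the RPC deadline. It is not the YugabyteDB implementation: `CheckResult`, `ServerCheck`, `WaitForResponses`, the polling hook, and the error wording are all hypothetical names made up for this example.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <thread>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical result type; the real code would use YugabyteDB's own status type.
struct CheckResult {
  bool ok;
  std::string message;
};

// Hypothetical handle for one outstanding request to a tserver or master.
struct ServerCheck {
  std::string name;
  std::function<bool()> has_responded;  // Returns true once a reply has arrived.
};

// Polls until every server has responded or `internal_deadline` passes.
// The caller is assumed to pick an internal deadline earlier than the RPC
// deadline, so the error below can still reach the client.
// Note: the last sleep may overshoot the deadline by up to `poll_interval`.
CheckResult WaitForResponses(const std::vector<ServerCheck>& checks,
                             Clock::time_point internal_deadline,
                             std::chrono::milliseconds poll_interval =
                                 std::chrono::milliseconds(5)) {
  while (true) {
    // Collect the names of servers that have not replied yet.
    std::string pending;
    for (const auto& check : checks) {
      if (!check.has_responded()) {
        if (!pending.empty()) pending += ", ";
        pending += check.name;
      }
    }
    if (pending.empty()) return {true, ""};
    if (Clock::now() >= internal_deadline) {
      return {false, "Timed out waiting for responses from: " + pending};
    }
    std::this_thread::sleep_for(poll_interval);
  }
}
```

The point of the sketch is only that the wait gives up on its own schedule and names the unresponsive servers, rather than letting the RPC itself time out with no explanation.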

Issue Type

kind/enhancement

Warning: Please confirm that this issue does not contain any sensitive information

  • I confirm this issue does not contain any sensitive information.
SrivastavaAnubhav added the area/docdb (YugabyteDB core features) and priority/medium (Medium priority issue) labels Apr 5, 2024
SrivastavaAnubhav self-assigned this Apr 5, 2024
yugabyte-ci added the kind/enhancement (This is an enhancement of an existing feature) label Apr 5, 2024
SrivastavaAnubhav pushed a commit that referenced this issue Apr 17, 2024
…wn to provide a readable error on timeout

Summary:
Original commit: 1fda770 / D33898
`AreNodesSafeToTakeDown` currently uses the RPC's deadline as its deadline for how long to wait for health check responses from tservers and masters. This means that the client does not get a readable error when we hit the timeout, since it usually also stops waiting on the RPC. Instead, `AreNodesSafeToTakeDown` should return early and provide a readable error message.

This diff adds a flag controlling how much earlier than the passed deadline `AreNodesSafeToTakeDown` attempts to return: `are_nodes_safe_to_take_down_timeout_buffer_ms` (2 seconds by default).
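
As a rough illustration of the buffer arithmetic, the following C++ sketch subtracts a buffer from the RPC deadline to get the earlier internal deadline. `InternalDeadline` and `kTimeoutBuffer` are hypothetical names standing in for the real code and for the `are_nodes_safe_to_take_down_timeout_buffer_ms` flag. Clamping to the current time when the buffer exceeds the remaining time is an assumption of this sketch, not something the summary states.

```cpp
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for are_nodes_safe_to_take_down_timeout_buffer_ms
// (2 seconds by default, per the summary above).
constexpr std::chrono::milliseconds kTimeoutBuffer{2000};

// Returns the deadline the server-side wait should use: the RPC deadline
// minus the buffer. Assumption: if that would already be in the past,
// fall back to `now` so the caller fails fast instead of waiting.
Clock::time_point InternalDeadline(
    Clock::time_point rpc_deadline, Clock::time_point now,
    std::chrono::milliseconds buffer = kTimeoutBuffer) {
  const Clock::time_point candidate = rpc_deadline - buffer;
  return candidate < now ? now : candidate;
}
```

With the default buffer, an RPC deadline 10 seconds away leaves roughly 8 seconds for collecting health check responses and 2 seconds for the error to travel back to the client.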

Fixes #21855.
Jira: DB-10754

Test Plan:
`./yb_build.sh --cxx-test tablet_health_manager-itest --gtest_filter AreNodesSafeToTakeDownItest.MasterUnresponsive`
`./yb_build.sh --cxx-test tablet_health_manager-itest --gtest_filter AreNodesSafeToTakeDownItest.TserverUnresponsive`

Reviewers: zdrudi

Reviewed By: zdrudi

Subscribers: bogdan, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D34032
SrivastavaAnubhav pushed a commit that referenced this issue Apr 17, 2024
… to provide a readable error on timeout

Differential Revision: https://phorge.dev.yugabyte.com/D34035