release-22.2: cli: compute --drain-wait based on cluster setting values #98577

rafiss · 2023-03-14T14:20:52Z

Backport 3/3 commits from #98390.

/cc @cockroachdb/release

Release justification: low risk improvement to an important area

This includes a few commits related to draining:

cli: compute --drain-wait based on cluster setting values

Release note (cli change): The --drain-wait argument for the drain command will be automatically increased if the command detects that it is smaller than the sum of server.shutdown.drain_wait, server.shutdown.connection_wait, twice server.shutdown.query_wait, and server.shutdown.lease_transfer_wait.

This recommendation was already documented, but now the advice will be applied automatically.

sql: fix check for closing connExecutor during draining

This fixes a minor bug in which the connection would not get closed at
the earliest possible time during server shutdown.

The connection is supposed to be closed as soon as we handle a Sync
message when the conn_executor is in the draining state and not in a
transaction. Since the transaction state was checked before state
transitions occurred, this would cause the connection to remain open for
an extra bit of time. This was particularly a problem because the Sync
message is also the command that auto-commits an implicit transaction.
So before this commit, it was actually impossible for the check to work
as it was supposed to.

Now we check the txn state after state transitions occur.
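The ordering problem can be shown with a toy model. This is only a sketch under stated assumptions: the `conn` type, the `txnState` values, and both handler functions are hypothetical and much simpler than the real connExecutor. The only behavior taken from the description above is that Sync auto-commits an implicit transaction and that the connection should close when draining and not in a transaction.

```go
package main

import "fmt"

// txnState is a simplified, hypothetical transaction state.
type txnState int

const (
	noTxn txnState = iota
	implicitTxn
	explicitTxn
)

// conn is a toy stand-in for a connection's executor.
type conn struct {
	draining bool
	state    txnState
}

// applySync models the state transition triggered by a Sync message:
// an open implicit transaction is auto-committed.
func (c *conn) applySync() {
	if c.state == implicitTxn {
		c.state = noTxn
	}
}

// handleSyncBefore samples the transaction state before the transition,
// which is the ordering described as buggy.
func (c *conn) handleSyncBefore() (closeConn bool) {
	idle := c.state == noTxn
	c.applySync()
	return c.draining && idle
}

// handleSyncAfter samples the transaction state after the transition,
// which is the ordering described as the fix.
func (c *conn) handleSyncAfter() (closeConn bool) {
	c.applySync()
	return c.draining && c.state == noTxn
}

func main() {
	before := &conn{draining: true, state: implicitTxn}
	fmt.Println(before.handleSyncBefore()) // false: connection stays open

	after := &conn{draining: true, state: implicitTxn}
	fmt.Println(after.handleSyncAfter()) // true: connection closes on this Sync

	explicit := &conn{draining: true, state: explicitTxn}
	fmt.Println(explicit.handleSyncAfter()) // false: still in a transaction
}
```

In the toy model, the stale check never sees the committed state on the Sync that performs the commit, which mirrors why the original check could not trigger for implicit transactions.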

roachtest: enhance drain test

The test now does much more:

- Check that --drain-wait is automatically increased if it is set lower than the cluster settings require.
- Check that the /health?ready=1 endpoint fails during the drain_wait period.
- Check for the proper error message during the connection_wait phase.
- Check for the proper error message when trying to begin a new query/transaction during the query_wait phase.
- Check that an open transaction is allowed to continue during the query_wait phase.
- Check for the proper error message when a query is canceled during shutdown.

For the cli change, note that if the --drain-wait argument is 0, then no timeout is used.


cockroach-teamcity · 2023-03-14T14:21:06Z



rafiss added 2 commits March 14, 2023 10:18

rafiss requested a review from knz March 14, 2023 14:20

rafiss requested a review from a team as a code owner March 14, 2023 14:20

rafiss requested a review from a team March 14, 2023 14:20

rafiss requested a review from a team as a code owner March 14, 2023 14:20

knz changed the title ~~cli: compute --drain-wait based on cluster setting values~~ release-22.2: cli: compute --drain-wait based on cluster setting values Mar 14, 2023

knz approved these changes Mar 14, 2023


rafiss force-pushed the backport22.2-98390 branch from 16c6175 to 4cb23c1 Compare March 14, 2023 19:29

rafiss merged commit 97fd65b into cockroachdb:release-22.2 Mar 14, 2023

rafiss deleted the backport22.2-98390 branch March 14, 2023 20:26

cockroach-teamcity mentioned this pull request Mar 15, 2023

PR #98577 - cli: compute --drain-wait based on cluster setting values cockroachdb/docs#16491

Open
