ExecuteSyncPoint and Fetch now throw Timeout if all sub-errors are Ti… #3379

Merged
dcapwell merged 1 commit into apache:cep-15-accord from dcapwell:CASSANDRA-19718
Jun 18, 2024

Conversation

@dcapwell
Contributor

No description provided.

Contributor Author

Spoke with @bdeggleston about this in Slack: this timeout is too large and uses an unrelated config (bootstrap and repair are not the same thing). This isn't 100% related to this patch, as it's from my topology fix patch, but I'm trying to pull out bug fixes to make reviews easier.

Contributor Author

Mostly refactored to make the tests easier (as they count calls to sleep) and to make it clear when we back off, which is only when the type of error is expected.
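A minimal sketch of the idea in this comment, not the code from the patch: the sleep is injected so a test can count calls instead of waiting, and backoff happens only for an expected error type. `BackoffRetry`, `ExpectedTimeout`, and the exponential delay are all hypothetical names and choices made for illustration.

```java
import java.util.function.LongConsumer;
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not the class touched by this patch. */
final class BackoffRetry
{
    /** Hypothetical stand-in for the "expected" error type. */
    static final class ExpectedTimeout extends RuntimeException
    {
        ExpectedTimeout(String message) { super(message); }
    }

    private final int maxAttempts;
    private final long baseSleepMillis;
    private final LongConsumer sleeper; // injected so tests can count sleeps

    BackoffRetry(int maxAttempts, long baseSleepMillis, LongConsumer sleeper)
    {
        this.maxAttempts = maxAttempts;
        this.baseSleepMillis = baseSleepMillis;
        this.sleeper = sleeper;
    }

    <T> T call(Supplier<T> task)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return task.get();
            }
            catch (ExpectedTimeout e)
            {
                // Only the expected error type reaches this branch; anything
                // else propagates immediately without sleeping.
                if (attempt >= maxAttempts)
                    throw e;
                sleeper.accept(baseSleepMillis << (attempt - 1)); // assumed exponential delay
            }
        }
    }
}
```

In a test, passing a counting lambda as the `sleeper` makes the number of backoffs directly observable, while production code could pass something that actually sleeps.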

Contributor Author

This is possible on the JVM, but it's super tacky to hit this case: you either do this outside of javac or trick javac. In the common case (most likely 100% of the time) we will return from the previous line.

Contributor Author

I punted on trying to fix the test... we actually don't stream anything....

Contributor Author

I punted on trying to fix the test... we actually don't stream anything....

@dcapwell dcapwell requested a review from aweisberg June 18, 2024 21:46
Contributor

@aweisberg aweisberg left a comment

+1

Contributor

I literally just encountered the Preempted == timeout issue in my testing today.

Contributor Author

If it's jvm-dtest, there is a function to retry reads/writes (assuming they are idempotent, else we fail)... both are timeouts from a user's point of view.

Contributor

Can we only hit Exhausted for reads, and specifically barriers?

Contributor Author

I saw other code paths that use it as well, but they are not user facing: CoordinateGloballyDurable (background task) and FetchCoordinator (bootstrap / streaming of ranges).

Contributor Author

I didn't want to touch those, as the coverage is mostly via BurnTest, so issues from subtle changes can take a long time to detect.
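For readers without the diff open, the rule in the PR title (report a Timeout when every sub-error is a Timeout) can be sketched roughly as below. This is an illustration under stated assumptions, not the patch itself: `Timeout` and `Exhausted` here are local stand-in classes, `FailureClassifier` is a hypothetical name, and treating an empty list as Exhausted is a choice made for the sketch.

```java
import java.util.List;

/** Hypothetical stand-in for the timeout failure named in this thread. */
class Timeout extends RuntimeException
{
    Timeout(String message) { super(message); }
}

/** Hypothetical stand-in for the "exhausted" failure named in this thread. */
class Exhausted extends RuntimeException
{
    Exhausted(String message) { super(message); }
}

final class FailureClassifier
{
    /**
     * Collapses sub-errors into one failure: Timeout when the list is
     * non-empty and every sub-error is a Timeout, otherwise Exhausted.
     * The sub-errors are attached as suppressed exceptions either way.
     */
    static RuntimeException classify(List<? extends Throwable> subErrors)
    {
        boolean allTimeouts = !subErrors.isEmpty()
                              && subErrors.stream().allMatch(t -> t instanceof Timeout);
        RuntimeException result = allTimeouts
                                  ? new Timeout("all sub-errors were timeouts")
                                  : new Exhausted("sub-errors were not all timeouts");
        subErrors.forEach(result::addSuppressed);
        return result;
    }
}
```

With a classification like this, a caller that already retries on Timeout would also retry when every underlying attempt timed out, instead of seeing a different failure type.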

…Timeout and doesn’t get retried

patch by David Capwell; reviewed by Ariel Weisberg for CASSANDRA-19718
@dcapwell dcapwell merged commit 0ef232f into apache:cep-15-accord Jun 18, 2024
@dcapwell dcapwell deleted the CASSANDRA-19718 branch June 18, 2024 23:29