Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backupccl: add timeout to ExportRequest sent by a backup #63383

Merged
merged 1 commit into from
May 6, 2021

Conversation

adityamaru
Copy link
Contributor

Previously, there was no timeout around the ExportRequest that
was sent as part of a backup. If the request returned certain
kinds of unambiguous errors, dist sender would keep retrying the request
thereby causing the backup to hang.

This change adds an upper bound on the time it should take the sender
to receive a response, thereby erroring out instead of mysteriously
hanging. The upper bound is configurable via a cluster settings and is
set to 5 mins by default, which is very conservative as these requests
should be very fast under normal circumstances.

Release note: None

@adityamaru adityamaru requested review from dt and pbardea April 9, 2021 15:29
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@adityamaru
Copy link
Contributor Author

Just wanted to get an idea if this is the approach we want to go with for adding timeouts. While exploring this space I realized there is some work that can be done to retry in the face of non-intent errors (but would we do anything more than what dist sender already does?), and to populate the job progress with the intents we hit when retrying for better visibility, but both those seemed orthogonal to this work.

Copy link
Contributor

@pbardea pbardea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wanted to follow up that this approach looks reasonable to me

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @dt)


pkg/ccl/backupccl/backup_processor.go, line 63 at r1 (raw file):

	timeoutPerAttmpt = settings.RegisterDurationSetting(
		"bulkio.backup.read_timeout",
		"amount of time after which a read attempt is considered timed out and is canceled",

Probably good to make it more explicit that this timeout will fail the backup job.

@adityamaru adityamaru marked this pull request as ready for review May 5, 2021 11:15
@adityamaru adityamaru requested a review from a team May 5, 2021 11:15
Copy link
Contributor Author

@adityamaru adityamaru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @pbardea)


pkg/ccl/backupccl/backup_processor.go, line 63 at r1 (raw file):

Previously, pbardea (Paul Bardea) wrote…

Probably good to make it more explicit that this timeout will fail the backup job.

Done.

Previously, there was no timeout around the ExportRequest that
was sent as part of a backup. If the request returned certain
kinds of unambiguous errors, dist sender would keep retrying the request
thereby causing the backup to hang.

This change adds an upper bound on the time it should take the sender
to receive a response, thereby erroring out instead of mysteriously
hanging. The upper bound is configurable via a cluster settings and is
set to 5 mins by default, which is very conservative as these requests
should be very fast under normal circumstances.

Release note: None
@adityamaru
Copy link
Contributor Author

TFTR!

bors r=pbardea

@craig
Copy link
Contributor

craig bot commented May 6, 2021

Build succeeded:

@craig craig bot merged commit 72a144d into cockroachdb:master May 6, 2021
@adityamaru adityamaru deleted the timeout-on-requests branch May 6, 2021 10:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants