Skip to content

Retry attempt InstantDDL up to --default-retries#1667

Merged
meiji163 merged 2 commits intomasterfrom
meiji163/instantddl-retries
Apr 30, 2026
Merged

Retry attempt InstantDDL up to --default-retries#1667
meiji163 merged 2 commits intomasterfrom
meiji163/instantddl-retries

Conversation

@meiji163
Copy link
Copy Markdown
Contributor

@meiji163 meiji163 commented Apr 30, 2026

Following on #1651, increase the max instant DDL attempts from 5 to MaxRetries, which is configurable with --default-retries. In high-concurrency environments, we may need more retries due to metadata lock wait timeout.

In case this PR introduced Go code changes:

  • contributed code is using same conventions as original code
  • script/cibuild returns with no formatting errors, build errors or unit test errors.

Copilot AI review requested due to automatic review settings April 30, 2026 21:13
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the instant DDL retry behavior to use the configurable --default-retries value (via MigrationContext.MaxRetries()) instead of a hard-coded retry limit, to better tolerate lock wait timeouts in high-concurrency environments.

Changes:

  • Pass MigrationContext.MaxRetries() into the instant DDL lock-wait retry helper.
  • Update retryOnLockWaitTimeout to accept a configurable maxRetries and iterate using int64.
Show a summary per file
File Description
go/logic/applier.go Make instant DDL lock-wait retries configurable via MaxRetries() and adjust retry helper signature/loop accordingly.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comments suppressed due to low confidence (1)

go/logic/applier.go:323

  • maxRetries now comes from MigrationContext.MaxRetries() (default 60 and user-configurable to arbitrarily large values). With the current linear backoff (i * 5s) this can make an instant DDL attempt block the migration start for hours by default, and for large --default-retries values it can overflow time.Duration (overflow occurs once i > ~1.8e9), potentially producing negative/short sleeps and a tight retry loop. Consider capping maxRetries for this helper and/or capping the sleep interval (or switching to a bounded exponential backoff) and defensively handling very large values.
func retryOnLockWaitTimeout(operation func() error, maxRetries int64, logger base.Logger) error {
	var err error
	for i := int64(0); i < maxRetries; i++ {
		if i != 0 {
			logger.Infof("Retrying after lock wait timeout (attempt %d/%d)", i+1, maxRetries)
			RetrySleepFn(time.Duration(i) * 5 * time.Second)
		}
  • Files reviewed: 2/2 changed files
  • Comments generated: 1

Comment thread go/logic/applier.go
@meiji163 meiji163 merged commit 29c0c33 into master Apr 30, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants