Skip to content

restore: skip dropped/missing descriptors during restore revert#166116

Merged
trunk-io[bot] merged 1 commit intocockroachdb:masterfrom
kev-cao:restore/cleanup-dropped
Mar 20, 2026
Merged

restore: skip dropped/missing descriptors during restore revert#166116
trunk-io[bot] merged 1 commit intocockroachdb:masterfrom
kev-cao:restore/cleanup-dropped

Conversation

@kev-cao
Copy link
Contributor

@kev-cao kev-cao commented Mar 19, 2026

Previously, if a user dropped descriptors (tables, types, functions, schemas) that were created by an online restore while the download job was paused, canceling the download job would get stuck in the reverting state. This happened because both setDescriptorsOffline and dropDescriptors would fail when trying to look up descriptors that had already been dropped.

Fix this by skipping descriptors that are already dropped or not found.

Fixes: #165789

Release note: None

@kev-cao kev-cao requested a review from a team as a code owner March 19, 2026 00:16
@kev-cao kev-cao requested review from msbutler and removed request for a team March 19, 2026 00:16
@trunk-io
Copy link
Contributor

trunk-io bot commented Mar 19, 2026

😎 Merged successfully - details.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

// as dropped here.
typDesc := details.TypeDescs[i]
mutType, err := descsCol.MutableByID(txn.KV()).Type(ctx, typDesc.ID)
mutType, err := descsCol.MutableByID(txn.KV()).Desc(ctx, typDesc.ID)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I switched all the lookup calls to Desc so that we could do a simple comparison against catalog.ErrDescriptorNotFound. The type-specific lookup calls like Database() or Schema() always replace the error with a type-specific pgcode:

desc, err := g.Desc(ctx, id)
if err != nil {
if errors.Is(err, catalog.ErrDescriptorNotFound) {
return nil, sqlerrors.NewUndefinedDatabaseError(fmt.Sprintf("[%d]", id))
}
return nil, err
}

I don't think there's any difference here aside from the fact that we lose the checking that the ID is specifically of that object type, but I don't think that makes much of a difference. But if anyone thinks otherwise, I'm open to changing this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add a comment here, the first place you do this?

@kev-cao kev-cao force-pushed the restore/cleanup-dropped branch from f8997dc to 9954fb4 Compare March 19, 2026 00:22
Copy link
Collaborator

@msbutler msbutler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

// as dropped here.
typDesc := details.TypeDescs[i]
mutType, err := descsCol.MutableByID(txn.KV()).Type(ctx, typDesc.ID)
mutType, err := descsCol.MutableByID(txn.KV()).Desc(ctx, typDesc.ID)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add a comment here, the first place you do this?

Previously, if a user dropped descriptors (tables, types, functions,
schemas) that were created by an online restore while the download job
was paused, canceling the download job would get stuck in the reverting
state. This happened because both `setDescriptorsOffline` and
`dropDescriptors` would fail when trying to look up descriptors that
had already been dropped.

Fix this by skipping descriptors that are already dropped or not found.

Fixes: cockroachdb#165789

Release note: None
@kev-cao kev-cao force-pushed the restore/cleanup-dropped branch from 9954fb4 to a5b1fb2 Compare March 20, 2026 18:09
@kev-cao
Copy link
Contributor Author

kev-cao commented Mar 20, 2026

TFTR!

/trunk merge

@trunk-io trunk-io bot merged commit 7dfbdae into cockroachdb:master Mar 20, 2026
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

roachtest: backup-restore/online-restore-recovery failed

3 participants