Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: check recovery backups/snapshots readiness before cluster bootstrap #3663

Merged
merged 8 commits into from
Mar 11, 2024

Conversation

YanniHu1996
Copy link
Contributor

@YanniHu1996 YanniHu1996 commented Jan 4, 2024

Fix bootstrapping error loop by ensuring recovery backups/snapshots
readiness before Cluster construction starts.

This patch addresses an issue where Clusters bootstrapping from
recovery could enter an error loop if the backups or snapshots were
not prepared at the time of cluster creation. It modifies the
bootstrapping process to delay the construction of the first instance
until the necessary recovery backups or snapshots are fully ready,
thus preventing the error loop.

Closes #3654

Copy link
Contributor

github-actions bot commented Jan 4, 2024

❗ By default, the pull request is configured to backport to all release branches.

  • To stop backporting this pr, remove the label: backport-requested ◀️ or add the label 'do not backport'
  • To stop backporting this pr to a certain release branch, remove the specific branch label: release-x.y

@YanniHu1996 YanniHu1996 force-pushed the dev/cnp-4441 branch 4 times, most recently from 10eb247 to 824219a Compare January 10, 2024 09:32
@YanniHu1996 YanniHu1996 changed the title fix: Continue the reconciliation in case of an incomplete backup. fix: Continue the reconciliation even though backup or volumeSnapshots are not good Jan 10, 2024
Comment on lines +973 to 985
cluster.Spec.Bootstrap.Recovery != nil {
var err error
backup, err = r.getOriginBackup(ctx, cluster)
if err != nil {
return ctrl.Result{}, err
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is already there before your PR, but the flow is very confusing.
We look for a backup object in the bootstrap.recovery, and it looks like we depend on it.
But further down we may still proceed if we had boostrap.recovery.volumesnapshots.
I'm going to try to make this more explicit.

@jsilvela
Copy link
Collaborator

@jsilvela
Copy link
Collaborator

/ok-to-merge

@cnpg-bot cnpg-bot added the ok to merge 👌 This PR can be merged label Jan 25, 2024
@jsilvela jsilvela changed the title fix: Continue the reconciliation even though backup or volumeSnapshots are not good fix: avoid error loop when bootstrapping from a backup or snapshot that is not completed Jan 25, 2024
@jsilvela jsilvela changed the title fix: avoid error loop when bootstrapping from a backup or snapshot that is not completed fix: avoid error loop when bootstrapping from incomplete backup or snapshot Jan 25, 2024
@sxd sxd removed the release-1.20 label Feb 6, 2024
@mnencia mnencia self-assigned this Feb 7, 2024
YanniHu1996 and others added 8 commits March 11, 2024 15:00
Signed-off-by: YanniHu1996 <yantian.hu@enterprisedb.com>
…ots.

Signed-off-by: YanniHu1996 <yantian.hu@enterprisedb.com>
Signed-off-by: YanniHu1996 <yantian.hu@enterprisedb.com>
Signed-off-by: Tao Li <tao.li@enterprisedb.com>
Signed-off-by: Jaime Silvela <jaime.silvela@enterprisedb.com>
Signed-off-by: Jaime Silvela <jaime.silvela@enterprisedb.com>
Signed-off-by: Jaime Silvela <jaime.silvela@enterprisedb.com>
Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
@mnencia mnencia changed the title fix: avoid error loop when bootstrapping from incomplete backup or snapshot fix: check recovery backups/snapshots readiness before cluster bootstrap Mar 11, 2024
@mnencia mnencia merged commit d7c7212 into cloudnative-pg:main Mar 11, 2024
27 checks passed
@mnencia mnencia deleted the dev/cnp-4441 branch March 11, 2024 14:28
cnpg-bot pushed a commit that referenced this pull request Mar 11, 2024
…rap (#3663)

Fix bootstrapping error loop by ensuring recovery backups/snapshots
readiness before Cluster construction starts.

This patch addresses an issue where Clusters bootstrapping from
recovery could enter an error loop if the backups or snapshots were
not prepared at the time of cluster creation. It modifies the
bootstrapping process to delay the construction of the first instance
until the necessary recovery backups or snapshots are fully ready,
thus preventing the error loop.

Closes #3654

Signed-off-by: YanniHu1996 <yantian.hu@enterprisedb.com>
Signed-off-by: Tao Li <tao.li@enterprisedb.com>
Signed-off-by: Jaime Silvela <jaime.silvela@enterprisedb.com>
Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Co-authored-by: Tao Li <tao.li@enterprisedb.com>
Co-authored-by: Jaime Silvela <jaime.silvela@enterprisedb.com>
Co-authored-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
(cherry picked from commit d7c7212)
mnencia pushed a commit that referenced this pull request Mar 11, 2024
…rap (#3663)

Fix bootstrapping error loop by ensuring recovery backups/snapshots
readiness before Cluster construction starts.

This patch addresses an issue where Clusters bootstrapping from
recovery could enter an error loop if the backups or snapshots were
not prepared at the time of cluster creation. It modifies the
bootstrapping process to delay the construction of the first instance
until the necessary recovery backups or snapshots are fully ready,
thus preventing the error loop.

Closes #3654

Signed-off-by: YanniHu1996 <yantian.hu@enterprisedb.com>
Signed-off-by: Tao Li <tao.li@enterprisedb.com>
Signed-off-by: Jaime Silvela <jaime.silvela@enterprisedb.com>
Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Co-authored-by: Tao Li <tao.li@enterprisedb.com>
Co-authored-by: Jaime Silvela <jaime.silvela@enterprisedb.com>
Co-authored-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
(cherry picked from commit d7c7212)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-requested ◀️ This pull request should be backported to all supported releases ok to merge 👌 This PR can be merged release-1.21 release-1.22
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: Restoring from not-completed backup gets stuck in an endless reconciliation loop
6 participants