Kubernetes Volume Snapshots support #2081
Comments
The initial release of volume snapshots (version 1.21.0) only supported cold backups, which required fencing of the instance. This limitation has been lifted starting with version 1.21.1. Given the minimal impact of the change on the code, the maintainers decided to backport this feature immediately instead of waiting for version 1.22.0, and to make online backups the default behavior for volume snapshots too. If you plan to rely on cold backups instead, make sure you follow the instructions below.
The initial release of volume snapshots (version 1.21.0) only supported cold backups, which required fencing of the instance. This patch lifts that limitation through the introduction of the following options in the `.spec.backup.volumeSnapshot` stanza of the `Cluster` resource:

- `online`: accepting `true` (default) or `false` as a value
- `onlineConfiguration.immediateCheckpoint`: whether to request an immediate checkpoint before starting the backup procedure; technically, it corresponds to the `fast` argument passed to the `pg_backup_start`/`pg_start_backup()` function in PostgreSQL, accepting `true` (default) or `false`
- `onlineConfiguration.waitForArchive`: whether to wait for the archiver to process the last segment of the backup; technically, it corresponds to the `wait_for_archive` argument passed to the `pg_backup_stop`/`pg_stop_backup()` function in PostgreSQL, accepting `true` (default) or `false`

Given the minimal impact of the change on the code, the patch is also backported to the release-1.21 branch. Online backups are now the default behavior on volume snapshots too.

Closes #2081

Signed-off-by: Armando Ruocco <armando.ruocco@enterprisedb.com>
Signed-off-by: Leonardo Cecchi <leonardo.cecchi@enterprisedb.com>
Signed-off-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Signed-off-by: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com>
Co-authored-by: Leonardo Cecchi <leonardo.cecchi@enterprisedb.com>
Co-authored-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Co-authored-by: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com>
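Putting the options above together, a minimal sketch of the `.spec.backup.volumeSnapshot` stanza could look like this; the cluster name, storage size, and snapshot class name are hypothetical placeholders, not values from the patch:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-example        # hypothetical name
spec:
  instances: 3
  storage:
    size: 1Gi                  # hypothetical size
  backup:
    volumeSnapshot:
      className: csi-snapclass # hypothetical VolumeSnapshotClass name
      online: true             # default: hot (online) backup
      onlineConfiguration:
        immediateCheckpoint: true  # maps to the `fast` argument of pg_backup_start
        waitForArchive: true       # maps to the `wait_for_archive` argument of pg_backup_stop
```

Setting `online: false` instead falls back to the original cold-backup behavior described above.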
(cherry picked from commit 4e82248)
Currently, CloudNativePG supports object stores only for continuous backup, requiring that both WAL files and base backups reside in the same backup object store.
However, this technique is not adequate for very large database scenarios, especially on the recovery side, where it can lead to high RTO values following a (rare) full disaster.
This epic ticket proposes the introduction in CloudNativePG of:

- physical base backups taken via Kubernetes volume snapshots
- recovery of a cluster from such snapshots

Both capabilities need to be declarative.
Volume snapshotting was first introduced in Kubernetes 1.12 as alpha, promoted to beta in 1.17, and moved to GA in 1.20. It is now stable and widely available, and provides three custom resource definitions: `VolumeSnapshot`, `VolumeSnapshotContent`, and `VolumeSnapshotClass`.

Thanks to the transparent support for both incremental and differential copy that volume snapshots provide through the underlying storage classes, this feature has two main benefits:
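For reference, a minimal `VolumeSnapshot` manifest using the stable `snapshot.storage.k8s.io/v1` API looks like the following; the snapshot, class, and PVC names are hypothetical:

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: pgdata-snapshot                          # hypothetical snapshot name
spec:
  volumeSnapshotClassName: csi-snapclass         # hypothetical class, provided by the CSI driver
  source:
    persistentVolumeClaimName: cluster-example-1 # hypothetical PVC holding PGDATA
```

The CSI driver materializes the result as a `VolumeSnapshotContent` object bound to the `VolumeSnapshot`, much like a `PersistentVolume` is bound to a `PersistentVolumeClaim`.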
Hot backup has been the default in PostgreSQL for 20 years now, and is one of the core capabilities behind Postgres' success. It enables taking physical base backups from any instance in the Postgres cluster (either primary or standby) without shutting down the server, and it requires WAL archiving (currently only on object stores).
As part of this activity, we should also address cold (offline) physical base backups and "database snapshot" recovery scenarios. As opposed to hot backup, cold backup is a technique that enables the operator to work even in the worst-case scenario where a WAL archive is not available and "standalone consistent database snapshots" are acceptable: this technique essentially fences a replica, takes a snapshot, and restarts it, without disrupting the primary.
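Under the `online`/`onlineConfiguration` options introduced above, a cold backup would simply be requested by disabling online mode; a sketch, with hypothetical names and sizes:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-example        # hypothetical name
spec:
  instances: 3
  storage:
    size: 1Gi                  # hypothetical size
  backup:
    volumeSnapshot:
      className: csi-snapclass # hypothetical VolumeSnapshotClass name
      online: false            # cold backup: fence a replica, snapshot it, restart it
```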
Volume snapshots have the potential to become the main base backup method in the public cloud, as well as in some enterprise on-premises environments.
The general idea of this feature is to extend the existing API of the operator's `Backup`, `ScheduledBackup`, and `Cluster` CRDs to support another way of taking physical base backups, alongside the existing one based on object stores with Barman Cloud, as well as of recovering from them, in a declarative way.
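One possible shape for such an extension is a method selector on the `Backup` resource; the following is only a sketch of that idea, and the `method` field, resource names, and values are assumptions based on the discussion above, not a confirmed API:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Backup
metadata:
  name: backup-example      # hypothetical name
spec:
  cluster:
    name: cluster-example   # hypothetical target cluster
  method: volumeSnapshot    # assumed selector; the alternative being the object-store
                            # (Barman Cloud) method that exists today
```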