implement volume snapshot backups and restore #557

Merged
merged 26 commits into from
Feb 23, 2024
Changes from 3 commits
Commits
8101b0e
implement volume snapshot backups
nhudson Feb 19, 2024
9ffcd12
make volumeSnapshot disabled by default
nhudson Feb 19, 2024
c4cc9bb
fix test
nhudson Feb 19, 2024
2901554
add scheduled backups for both object store and snapshots if snapshot…
nhudson Feb 20, 2024
5eaccd5
Merge branch 'main' into nhudson/TEM-3089
nhudson Feb 20, 2024
44b3591
Merge branch 'main' into nhudson/TEM-3089
nhudson Feb 21, 2024
830830f
implement volume snapshot restores
nhudson Feb 21, 2024
9f19ff5
update crd manifest
nhudson Feb 21, 2024
88a7926
fix fmt
nhudson Feb 21, 2024
d30023d
Merge branch 'main' into nhudson/TEM-3089
nhudson Feb 21, 2024
2fecb6b
fix namespace lookup for volumesnapshot, update rbac
nhudson Feb 21, 2024
e953d6c
patch/create a volumesnapshotcontent
nhudson Feb 21, 2024
bb3e6dd
make sure when patching to supply the correct name
nhudson Feb 21, 2024
745a25f
make sure to use snapshotHandle instead of volumeHandle
nhudson Feb 22, 2024
8aef2c6
fix tests
nhudson Feb 22, 2024
4d4f925
clean up snapshot stuff
nhudson Feb 22, 2024
8570fcb
Merge branch 'main' into nhudson/TEM-3089
nhudson Feb 22, 2024
86a544d
fix snapshot test
nhudson Feb 22, 2024
9d8b5ef
adding check to make sure volumesnapshot is ready prior to creating t…
nhudson Feb 22, 2024
bf35335
fix issue with pointing to the incorrect snapshot
nhudson Feb 22, 2024
77d9479
add missing crate
nhudson Feb 22, 2024
053f3cf
make sure scheduledbackup job name is <=63 chars
nhudson Feb 22, 2024
1bb6af3
Merge branch 'main' into nhudson/TEM-3089
nhudson Feb 23, 2024
f1cfb43
Merge branch 'main' into nhudson/TEM-3089
nhudson Feb 23, 2024
9372806
add better error logging
nhudson Feb 23, 2024
9d28f45
better logging
nhudson Feb 23, 2024
4 changes: 2 additions & 2 deletions charts/tembo-operator/Chart.yaml
Expand Up @@ -3,10 +3,10 @@ name: tembo-operator
description: 'Helm chart to deploy the tembo-operator'
type: application
icon: https://cloud.tembo.io/images/TemboElephant.png
version: 0.3.0
version: 0.3.1
home: https://tembo.io
sources:
- https://github.com/tembo-io/tembo-stacks
- https://github.com/tembo-io/tembo
- https://github.com/cloudnative-pg/cloudnative-pg
keywords:
- postgresql
Expand Down
18 changes: 18 additions & 0 deletions charts/tembo-operator/templates/crd.yaml
Expand Up @@ -1344,6 +1344,8 @@ spec:
endpointURL: null
s3Credentials:
inheritFromIAMRole: true
volumeSnapshot:
enabled: false
description: |-
The backup configuration for the CoreDB instance to facilitate database backups and WAL archive uploads to an S3 compatible object store.

Expand Down Expand Up @@ -1432,6 +1434,22 @@ spec:
description: The backup schedule set with cron syntax
nullable: true
type: string
volumeSnapshot:
default:
enabled: false
description: Enable using Volume Snapshots for backups instead of Object Storage
nullable: true
properties:
enabled:
description: Enable the volume snapshots for backups
type: boolean
snapshotClass:
description: The reference to the snapshot class
nullable: true
type: string
required:
- enabled
type: object
type: object
connectionPooler:
default:
Expand Down
2 changes: 1 addition & 1 deletion tembo-operator/Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion tembo-operator/Cargo.toml
@@ -1,7 +1,7 @@
[package]
name = "controller"
description = "Tembo Operator for Postgres"
version = "0.35.4"
version = "0.36.0"
edition = "2021"
default-run = "controller"
license = "Apache-2.0"
Expand Down
30 changes: 29 additions & 1 deletion tembo-operator/src/apis/coredb_types.rs
Expand Up @@ -157,11 +157,29 @@ pub struct S3CredentialsSessionToken {
pub name: String,
}

/// VolumeSnapshot is the configuration type for volume snapshots,
/// which can be used for backups instead of object storage
#[derive(Serialize, Deserialize, Clone, Debug, Default, JsonSchema)]
pub struct VolumeSnapshot {
/// Enable the volume snapshots for backups
pub enabled: bool,

/// The reference to the snapshot class
#[serde(
default,
skip_serializing_if = "Option::is_none",
rename = "snapshotClass"
)]
pub snapshot_class: Option<String>,
}

/// CoreDB Backup configuration
/// The backup configuration for the CoreDB instance to facilitate database
/// backups and WAL archive uploads to an S3 compatible object store.
/// backup uploads to an S3 compatible object store or to Volume Snapshots.
/// WAL archive uploads use an S3 compatible object store.
///
/// **Example**: A typical S3 backup configuration using IAM Role for authentication
/// with Volume Snapshots enabled
///
/// See `ServiceAccountTemplate` for how to map the IAM role ARN to a Kubernetes service account.
///
Expand All @@ -178,6 +196,9 @@ pub struct S3CredentialsSessionToken {
/// s3Credentials:
/// inheritFromIAMRole: true
/// schedule: "0 0 * * *" #every day at midnight
/// volumeSnapshot:
/// enabled: true
/// snapshotClass: my-snapshot-class-name
/// ```
#[derive(Deserialize, Serialize, Clone, Debug, Default, JsonSchema)]
#[allow(non_snake_case)]
Expand Down Expand Up @@ -205,6 +226,13 @@ pub struct Backup {
/// The S3 credentials to use for backups (if not using IAM Role)
#[serde(default = "defaults::default_s3_credentials", rename = "s3Credentials")]
pub s3_credentials: Option<S3Credentials>,

/// Enable using Volume Snapshots for backups instead of Object Storage
#[serde(
default = "defaults::default_volume_snapshot",
rename = "volumeSnapshot"
)]
pub volume_snapshot: Option<VolumeSnapshot>,
}

/// Restore configuration provides a way to restore a database from a backup
Expand Down
6 changes: 3 additions & 3 deletions tembo-operator/src/cloudnativepg/clusters.rs
Expand Up @@ -1157,7 +1157,7 @@ pub enum ClusterBackupTarget {
}

/// VolumeSnapshot provides the configuration for the execution of volume snapshot backups.
#[derive(Serialize, Deserialize, Clone, Debug, Default, JsonSchema)]
#[derive(Serialize, Deserialize, Clone, Debug, Default, JsonSchema, PartialEq)]
pub struct ClusterBackupVolumeSnapshot {
/// Annotations key-value pairs that will be added to .metadata.annotations snapshot resources.
#[serde(default, skip_serializing_if = "Option::is_none")]
Expand Down Expand Up @@ -1202,7 +1202,7 @@ pub struct ClusterBackupVolumeSnapshot {
}

/// Configuration parameters to control the online/hot backup with volume snapshots
#[derive(Serialize, Deserialize, Clone, Debug, Default, JsonSchema)]
#[derive(Serialize, Deserialize, Clone, Debug, Default, JsonSchema, PartialEq)]
pub struct ClusterBackupVolumeSnapshotOnlineConfiguration {
/// Control whether the I/O workload for the backup initial checkpoint will be limited, according to the `checkpoint_completion_target` setting on the PostgreSQL server. If set to true, an immediate checkpoint will be used, meaning PostgreSQL will complete the checkpoint as soon as possible. `false` by default.
#[serde(
Expand All @@ -1221,7 +1221,7 @@ pub struct ClusterBackupVolumeSnapshotOnlineConfiguration {
}

/// VolumeSnapshot provides the configuration for the execution of volume snapshot backups.
#[derive(Serialize, Deserialize, Clone, Debug, JsonSchema)]
#[derive(Serialize, Deserialize, Clone, Debug, JsonSchema, PartialEq)]
pub enum ClusterBackupVolumeSnapshotSnapshotOwnerReference {
#[serde(rename = "none")]
None,
Expand Down
127 changes: 121 additions & 6 deletions tembo-operator/src/cloudnativepg/cnpg.rs
Expand Up @@ -15,9 +15,12 @@ use crate::{
ClusterBackupBarmanObjectStoreS3CredentialsSecretAccessKey,
ClusterBackupBarmanObjectStoreS3CredentialsSessionToken,
ClusterBackupBarmanObjectStoreWal, ClusterBackupBarmanObjectStoreWalCompression,
ClusterBackupBarmanObjectStoreWalEncryption, ClusterBootstrap, ClusterBootstrapInitdb,
ClusterBootstrapRecovery, ClusterBootstrapRecoveryRecoveryTarget, ClusterCertificates,
ClusterExternalClusters, ClusterExternalClustersBarmanObjectStore,
ClusterBackupBarmanObjectStoreWalEncryption, ClusterBackupVolumeSnapshot,
ClusterBackupVolumeSnapshotOnlineConfiguration,
ClusterBackupVolumeSnapshotSnapshotOwnerReference, ClusterBootstrap,
ClusterBootstrapInitdb, ClusterBootstrapRecovery,
ClusterBootstrapRecoveryRecoveryTarget, ClusterCertificates, ClusterExternalClusters,
ClusterExternalClustersBarmanObjectStore,
ClusterExternalClustersBarmanObjectStoreS3Credentials,
ClusterExternalClustersBarmanObjectStoreS3CredentialsAccessKeyId,
ClusterExternalClustersBarmanObjectStoreS3CredentialsRegion,
Expand All @@ -41,7 +44,7 @@ use crate::{
},
scheduledbackups::{
ScheduledBackup, ScheduledBackupBackupOwnerReference, ScheduledBackupCluster,
ScheduledBackupSpec,
ScheduledBackupMethod, ScheduledBackupSpec,
},
},
config::Config,
Expand All @@ -66,6 +69,8 @@ use std::{collections::BTreeMap, sync::Arc};
use tokio::time::Duration;
use tracing::{debug, error, info, instrument, warn};

const VOLUME_SNAPSHOT_CLASS_NAME: &str = "cnpg-snapshot-class";

pub struct PostgresConfig {
pub postgres_parameters: Option<BTreeMap<String, String>>,
pub shared_preload_libraries: Option<Vec<String>>,
Expand Down Expand Up @@ -103,7 +108,7 @@ fn create_cluster_backup_barman_wal(cdb: &CoreDB) -> Option<ClusterBackupBarmanO
Some(ClusterBackupBarmanObjectStoreWal {
compression: Some(ClusterBackupBarmanObjectStoreWalCompression::Snappy),
encryption,
max_parallel: Some(5),
max_parallel: Some(8),
})
} else {
None
Expand Down Expand Up @@ -152,6 +157,28 @@ fn create_cluster_certificates(cdb: &CoreDB) -> Option<ClusterCertificates> {
}
}

fn create_cluster_backup_volume_snapshot(cdb: &CoreDB) -> ClusterBackupVolumeSnapshot {
let class_name = cdb
.spec
.backup
.volume_snapshot
.as_ref()
.and_then(|vs| vs.snapshot_class.as_ref())
.cloned()
.unwrap_or_else(|| VOLUME_SNAPSHOT_CLASS_NAME.to_string());

ClusterBackupVolumeSnapshot {
class_name: Some(class_name),
online: Some(true),
online_configuration: Some(ClusterBackupVolumeSnapshotOnlineConfiguration {
wait_for_archive: Some(true),
immediate_checkpoint: Some(true),
}),
snapshot_owner_reference: Some(ClusterBackupVolumeSnapshotSnapshotOwnerReference::Cluster),
..ClusterBackupVolumeSnapshot::default()
}
}

fn create_cluster_backup(
cdb: &CoreDB,
endpoint_url: &str,
Expand All @@ -171,14 +198,23 @@ fn create_cluster_backup(
},
};

let volume_snapshot = cdb.spec.backup.volume_snapshot.as_ref().and_then(|vs| {
if vs.enabled {
Some(create_cluster_backup_volume_snapshot(cdb))
} else {
None
}
});

Some(ClusterBackup {
barman_object_store: Some(create_cluster_backup_barman_object_store(
cdb,
endpoint_url,
backup_path,
s3_credentials,
)),
retention_policy: Some(retention_days), // Adjust as needed
retention_policy: Some(retention_days),
volume_snapshot,
..ClusterBackup::default()
})
}
Expand Down Expand Up @@ -1331,6 +1367,13 @@ fn schedule_expression_from_cdb(cdb: &CoreDB) -> String {
fn cnpg_scheduled_backup(cdb: &CoreDB) -> ScheduledBackup {
let name = cdb.name_any();
let namespace = cdb.namespace().unwrap();
let method = cdb.spec.backup.volume_snapshot.as_ref().map(|vs| {
if vs.enabled {
ScheduledBackupMethod::VolumeSnapshot
} else {
ScheduledBackupMethod::BarmanObjectStore
}
});

ScheduledBackup {
metadata: ObjectMeta {
Expand All @@ -1344,6 +1387,7 @@ fn cnpg_scheduled_backup(cdb: &CoreDB) -> ScheduledBackup {
immediate: Some(true),
schedule: schedule_expression_from_cdb(cdb),
suspend: Some(false),
method,
..ScheduledBackupSpec::default()
},
status: None,
Expand Down Expand Up @@ -2292,6 +2336,8 @@ mod tests {
encryption: AES256
retentionPolicy: "45"
schedule: 55 7 * * *
volumeSnapshot:
enabled: false
image: quay.io/tembo/tembo-pg-cnpg:15.3.0-5-48d489e
port: 5432
postgresExporterEnabled: true
Expand Down Expand Up @@ -2323,6 +2369,11 @@ mod tests {
"45d".to_string()
);

assert_eq!(
scheduled_backup.spec.method,
Some(ScheduledBackupMethod::BarmanObjectStore)
);

// Assert to make sure that backup destination path is set
assert_eq!(
backup
Expand Down Expand Up @@ -2634,4 +2685,68 @@ mod tests {
let cdb_no_storage_class: CoreDB = from_str(cdb_no_storage_class_yaml).unwrap();
assert_eq!(cnpg_cluster_storage_class(&cdb_no_storage_class), None);
}

#[test]
fn test_cnpg_cluster_volume_snapshot() {
let cdb_yaml = r#"
apiVersion: coredb.io/v1alpha1
kind: CoreDB
metadata:
name: test
namespace: default
spec:
backup:
destinationPath: s3://tembo-backup/sample-standard-backup
encryption: ""
retentionPolicy: "30"
schedule: 17 9 * * *
endpointURL: http://minio:9000
volumeSnapshot:
enabled: true
snapshotClass: "csi-vsc"
image: quay.io/tembo/tembo-pg-cnpg:15.3.0-5-48d489e
port: 5432
replicas: 1
resources:
limits:
cpu: "1"
memory: 0.5Gi
serviceAccountTemplate:
metadata:
annotations:
eks.amazonaws.com/role-arn: arn:aws:iam::012345678901:role/aws-iam-role-iam
sharedirStorage: 1Gi
stop: false
storage: 1Gi
storageClass: "gp3-enc"
uid: 999
"#;

let cdb: CoreDB = serde_yaml::from_str(cdb_yaml).expect("Failed to parse YAML");
let snapshot = create_cluster_backup_volume_snapshot(&cdb);
let scheduled_backup = cnpg_scheduled_backup(&cdb);

// Set an expected ClusterBackupVolumeSnapshot object
let expected_snapshot = ClusterBackupVolumeSnapshot {
class_name: Some("csi-vsc".to_string()), // Expected to match the YAML input
online: Some(true),
online_configuration: Some(ClusterBackupVolumeSnapshotOnlineConfiguration {
wait_for_archive: Some(true),
immediate_checkpoint: Some(true),
}),
snapshot_owner_reference: Some(
ClusterBackupVolumeSnapshotSnapshotOwnerReference::Cluster,
),
..ClusterBackupVolumeSnapshot::default()
};

// Assert that the generated snapshot configuration (including class_name) matches the expected one
assert_eq!(snapshot, expected_snapshot);

// Assert to make sure that the ScheduledBackup method is set to VolumeSnapshot
assert_eq!(
scheduled_backup.spec.method,
Some(ScheduledBackupMethod::VolumeSnapshot)
);
}
}
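
Note on the scheduled backup method: the `.map(...)` in `cnpg_scheduled_backup` only yields a method when `volume_snapshot` is `Some`, so a spec whose `volume_snapshot` is `None` would leave `method` unset. The sketch below mirrors that selection with stand-in types (the struct and enum here are simplified copies for illustration, and `backup_method` is a hypothetical helper name, not a function in the operator).

```rust
// Simplified stand-ins for illustration only; the real types live in
// coredb_types.rs and scheduledbackups.rs.
#[derive(Clone, Debug, Default)]
pub struct VolumeSnapshot {
    pub enabled: bool,
    pub snapshot_class: Option<String>,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ScheduledBackupMethod {
    BarmanObjectStore,
    VolumeSnapshot,
}

/// Hypothetical helper mirroring the selection in `cnpg_scheduled_backup`:
/// `None` in gives `None` out; otherwise the `enabled` flag picks the method.
pub fn backup_method(vs: Option<&VolumeSnapshot>) -> Option<ScheduledBackupMethod> {
    vs.map(|vs| {
        if vs.enabled {
            ScheduledBackupMethod::VolumeSnapshot
        } else {
            ScheduledBackupMethod::BarmanObjectStore
        }
    })
}
```

With `default_volume_snapshot()` supplying `Some(VolumeSnapshot { enabled: false, .. })`, the default path resolves to `BarmanObjectStore`, which is what the existing test asserts.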
2 changes: 1 addition & 1 deletion tembo-operator/src/cloudnativepg/scheduledbackups.rs
Expand Up @@ -71,7 +71,7 @@ pub struct ScheduledBackupCluster {
}

/// Specification of the desired behavior of the ScheduledBackup. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#spec-and-status
#[derive(Serialize, Deserialize, Clone, Debug, JsonSchema)]
#[derive(Serialize, Deserialize, Clone, Debug, JsonSchema, PartialEq)]
pub enum ScheduledBackupMethod {
#[serde(rename = "barmanObjectStore")]
BarmanObjectStore,
Expand Down
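
Note on the added `PartialEq` derives: the new tests compare values with `assert_eq!`, which needs both operands to implement `PartialEq` and `Debug`. A minimal sketch of why the derive is required, using a stand-in enum (the enum below is a simplified copy and `is_snapshot` is a hypothetical helper, neither is taken from the operator):

```rust
// Simplified stand-in for the generated enum in scheduledbackups.rs.
// Removing `PartialEq` from this derive makes the `==` below fail to compile.
#[derive(Clone, Debug, PartialEq)]
pub enum ScheduledBackupMethod {
    BarmanObjectStore,
    VolumeSnapshot,
}

/// Hypothetical helper: `Option<T>` is comparable with `==` only when
/// `T: PartialEq`, which is the same bound `assert_eq!` relies on.
pub fn is_snapshot(method: &Option<ScheduledBackupMethod>) -> bool {
    *method == Some(ScheduledBackupMethod::VolumeSnapshot)
}
```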
10 changes: 9 additions & 1 deletion tembo-operator/src/defaults.rs
Expand Up @@ -6,7 +6,7 @@ use std::collections::BTreeMap;

use crate::{
apis::coredb_types::{
Backup, ConnectionPooler, PgBouncer, S3Credentials, ServiceAccountTemplate,
Backup, ConnectionPooler, PgBouncer, S3Credentials, ServiceAccountTemplate, VolumeSnapshot,
},
cloudnativepg::poolers::{PoolerPgbouncerPoolMode, PoolerTemplateSpecContainersResources},
extensions::types::{Extension, TrunkInstall},
Expand Down Expand Up @@ -139,6 +139,7 @@ pub fn default_backup() -> Backup {
retentionPolicy: default_retention_policy(),
schedule: default_backup_schedule(),
s3_credentials: default_s3_credentials(),
volume_snapshot: default_volume_snapshot(),
..Default::default()
}
}
Expand Down Expand Up @@ -210,3 +211,10 @@ pub fn default_s3_credentials() -> Option<S3Credentials> {
..Default::default()
})
}

pub fn default_volume_snapshot() -> Option<VolumeSnapshot> {
Some(VolumeSnapshot {
enabled: false,
snapshot_class: None,
})
}
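
Note on the default snapshot class: `default_volume_snapshot()` leaves `snapshot_class` as `None`, and `create_cluster_backup_volume_snapshot` then falls back to the `VOLUME_SNAPSHOT_CLASS_NAME` constant. The sketch below reproduces that fallback with a stand-in struct; `snapshot_class_name` is a hypothetical helper name used only for illustration.

```rust
// Same value as the constant added in cnpg.rs.
const VOLUME_SNAPSHOT_CLASS_NAME: &str = "cnpg-snapshot-class";

// Simplified stand-in for the struct in coredb_types.rs.
#[derive(Clone, Debug, Default)]
pub struct VolumeSnapshot {
    pub enabled: bool,
    pub snapshot_class: Option<String>,
}

// Mirrors the default added in defaults.rs.
pub fn default_volume_snapshot() -> Option<VolumeSnapshot> {
    Some(VolumeSnapshot {
        enabled: false,
        snapshot_class: None,
    })
}

/// Hypothetical helper: use the user-supplied class when present,
/// otherwise fall back to the operator-wide default class name.
pub fn snapshot_class_name(vs: Option<&VolumeSnapshot>) -> String {
    vs.and_then(|vs| vs.snapshot_class.clone())
        .unwrap_or_else(|| VOLUME_SNAPSHOT_CLASS_NAME.to_string())
}
```

A spec that sets `snapshotClass: "csi-vsc"` would resolve to `csi-vsc`, while the default configuration resolves to `cnpg-snapshot-class`.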