Implement new barman and/or barman-cloud subcommands to test the archive-destination contents #432

mnencia · 2021-10-12T12:07:18Z

We could add an option to barman-cloud-wal-archive to enable the following behavior:

when archiving the very first WAL file (000000010000000000000000), barman-cloud-wal-archive should fail if the target position contains any backup or WAL
when archiving the first file of a WAL segment (XXXXXXXXXXXXXXX0000000), barman-cloud-wal-archive should fail if the target contains any WAL in that segment

the rationale is to prevent mixing WALs from different instances.

gbartolini · 2021-10-12T12:27:26Z

I agree this is an important feature: it would make it easier to catch different servers sharing the same bucket. Maybe an option "--require-empty-target"?

ringerc · 2021-10-27T04:43:34Z

This is to prevent a variety of issues where a new cluster or a restored backup of an old cluster can write into an archive directory that was already used by a prior cluster.

The obvious hazard here is that WAL segments could be overwritten, or if overwrite is not allowed, WAL segment archive failures. For example, if using k8s and a CNP Cluster, writing to the backup store will run fine, so the user will not be immediately aware of the problem and the potential invalidity of their backups. I assume the same is true if a new cluster is configured with normal barman to use the same name as an existing cluster, but haven't verified.

The bigger issue is actually with failover and recovery, not backup restore. Especially in CNP where failover and rolling update etc is routine. If the CNP Cluster restarts a replica pod, does a failover, or restores a base backup, it will use barman-cloud to look for the most recent timeline history file it can locate in the WAL archive. The contents of the archive directory might have a higher timeline than the start-of-recovery timeline on the last checkpoint or the backup label file. PostgreSQL will try to load timeline history files with ascending numeric filenames from the archive until the first one that's missing, so it will scan-forward past the latest timeline used by the new instance and find the timeline history files from the old database. This causes recovery to fail with an error like

FATAL:  requested timeline 3 is not a child of this server's history

When this happens the logs will look something like this:

LOG:  restored log file "00000003.history" from archive
...
[[barman-cloud wal-restore or barman get-wal fails to restore 00000004.history here]]
...
LOG: entering standby mode
LOG: restored log file "00000003.history" from archive
LOG: restored log file "000000030000000400000056" from archive
...
LOG:  started streaming WAL from primary at 4/56000000 on timeline 1
FATAL:  requested timeline 3 is not a child of this server's history
DETAIL:  Latest checkpoint is at 4/56000060 on timeline 1, but in the history of the requested timeline, the server forked off from that timeline at 0/D0000028
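
The scan-forward behaviour described above can be sketched roughly as follows. This is a simplified model rather than PostgreSQL's actual code: the archive is represented as a plain set of file names, and `latest_timeline_in_archive` is a hypothetical helper name used only for illustration.

```python
def latest_timeline_in_archive(archive, start_timeline):
    """Probe NNNNNNNN.history files upward from start_timeline.

    Simplified model (an assumption, not PostgreSQL source): keep probing
    the next timeline's history file and stop at the first one missing.
    """
    latest = start_timeline
    probe = start_timeline + 1
    # Timeline history files are named with the timeline ID as 8 hex digits.
    while f"{probe:08X}.history" in archive:
        latest = probe
        probe += 1
    return latest


# Archive left behind by an old cluster that had reached timeline 3.
old_archive = {"00000002.history", "00000003.history"}

# A new cluster on timeline 1 that shares the archive lands on timeline 3.
print(latest_timeline_in_archive(old_archive, 1))

# With an empty archive it stays on its own timeline.
print(latest_timeline_in_archive(set(), 1))
```

In the log above, this corresponds to 00000003.history being restored and the probe for 00000004.history failing.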

It is important to understand that there are multiple ways we can land up writing timeline history files and/or WAL segments with an older LSN and/or timeline than what's already in an archive directory, and not all of them begin with segment 1:

initdb a new cluster Pg with the same name as the old one. For CNP/barman-wal, the cluster name is used as the default prefix for the archive blobs in the archive storage e.g {{cluster-name}}/wals/... so the new cluster will try to write WAL segs and timeline history files to the same path. The new Pg cluster has a different sysid, but that doesn't affect the archive destination.
Restore an existing Pg cluster from base backup and use the same name as the old one. Again, the same archive directory is used, but in this case the cluster has the same sysid, and it starts with a lsn > 1 and might have an initial timeline > 1 too, so the proposed check will not immediately detect the problem.
Start a new, separate Pg cluster from a base backup or initdb a new one, but use the same name or archive destination. E.g. when someone fires up a dev or test cluster and forgets to change the location to write the archives to.

All of these are operator error: the operator should ensure that the name used for backup archives is unique for each and every postgres cluster they bring up, either as a restore or by initdb.

But in practice people won't do that, especially with deployment automation, cloud management platforms, declarative configurations deeply layered inside other systems, etc. And right now there isn't a good way to detect such a misconfiguration and fail-fast.

Ideally the misconfigured DB wouldn't start up and start writing WAL at all, but that's not something we can do a lot about when dealing with standalone postgres instances where we don't necessarily know if this instance is a new startup of an existing instance or a restored copy from a backup. Really robust solutions probably need to happen in the management layers that are responsible for creating new clusters / restoring backups.

The biggest help barman can be to those layers is to expose an interface that lets deployment automation (CNP, ansible scripts, or whatever else) quickly and easily check if a given archive destination is empty or non-empty. That way they can fail-fast if it's non-empty with an informative error, and we don't have to inefficiently list archive directory contents for each barman wal archive command run.

ringerc · 2021-10-28T01:39:48Z

Proposal

Add a barman and barman-cloud subcommand to check an archive destination to ensure it is empty. This subcommand should return 0 (non-error) on empty, 1 (error) on non-empty, 2 (error) on failure to check / inaccessible destination, etc.

If feasible, offer the option to ignore existing WAL segs and timeline history files in the archive that are on the starting server's history but safely in the past, so the same check can be used for safer replica promotions too.

Rely on the higher level agent automation to run it when creating a cluster.

Tools would not run this check when promoting a replica during a failover or when starting up an existing cluster after a restart, crash, etc. Only when making a logically distinct new cluster with a new history: restoring a backup, making a staging/dev copy of a database, initdb'ing a new database, etc.

It would be useful to be able to specify additional options to ignore any WAL segs and timeline history files older than the server's current timeline and LSN. We could use this variant during promotion of a replica to make sure that the archive destination doesn't have any WAL from a different promotion, e.g. in split-brain scenarios where fencing and STONITH failed or in cases of operator error where the operator failed to tell us they were creating a new cluster. So if feasible I'd like the option --for-promotion --current-wal-segment XXXX --current-timeline %d or something like that, where it's only an error if there are resources in the archive directory that are in a future timeline or on the same timeline and a future lsn relative to the starting server's current position. But it's not vital to have that, it's a nice-to-have. Because when a new independent cluster is created there's no good reason for there to ever be any archives in the archive destination, and that's the primary use for this protection.

barman check-wal-archive-usable --for-new-cluster
barman check-wal-archive-usable --for-promotion --current-wal-segment XXXXX --current-timeline 1
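
As a rough sketch of how deployment automation might consume those exit codes (0 empty, 1 non-empty, 2 or anything else a failure to check): `archive_is_usable` is a hypothetical helper, and the command it would run is whatever syntax is eventually agreed on. The demonstration uses a fake runner so that it does not depend on barman being installed.

```python
import subprocess


def archive_is_usable(command, run=subprocess.run):
    """Map the proposed exit codes onto a decision for the caller.

    Returns True for status 0 (empty), False for status 1 (non-empty),
    and raises for any other status (the check itself failed).
    """
    result = run(command)
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    raise RuntimeError(f"archive check failed with status {result.returncode}")


def fake_runner(status):
    """Stand-in for subprocess.run that reports a fixed exit status."""
    return lambda command: subprocess.CompletedProcess(command, status)


# Hypothetical command line; the real syntax is still under discussion here.
command = ["barman", "check-wal-archive-usable", "--for-new-cluster"]

print(archive_is_usable(command, run=fake_runner(0)))
print(archive_is_usable(command, run=fake_runner(1)))
```

Treating "could not check" as an exception rather than as "non-empty" keeps the two failure modes distinct, so the automation can report an inaccessible destination differently from a reused one.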

Rationale

As established above, the proposal in this ticket won't address the issue as written.

Barman's archive command could look-ahead in the archive from whatever WAL seg or timeline history file that it's being asked to store, to make sure there are no existing WAL segs or timeline history files "in the future" in the same archive destination. But that would be a prohibitive performance cost, especially for cloud blob stores where enumeration is expensive.

What barman can do cheaply and efficiently is help the higher level automation (human admin, Ansible/Puppet code, k8s Operator, etc) answer the question "is this archive destination safe to use for a new cluster". Then let that automation ask the question when it knows it's making a new cluster. And for a new cluster, an archive destination is safe if it's empty.

It'd be nice to also be able to sanity check promotion safety as noted above. But again it'd need to be a command exposed to the user or their automation agent, not something done for each archive command invocation.

Background

There's nothing that barman alone can do to prevent this problem, because it needs to be told when a postgres instance is a new and independent cluster vs a continuation of an existing one.

At its roots this is a postgres limitation - and one that doesn't have a lot of simple solutions available. When PostgreSQL is started up it can't possibly know if it's the same postgres instance that just shut down, or if it has been copied somewhere else / restored from backup. It could be a SAN snapshot, cloned VM, or all sorts of things. If it's a continuation of an existing cluster it must write WAL archives to the same location to ensure backup continuity. If it's a new independent cluster, it must not write WAL archives to the same location as its ancestor instance to avoid intermingling WAL and timeline history files from cluster histories that are diverging - whether due to backup restore from an older point in time, creation of a separate dev/staging instance or whatever else.

PostgreSQL really wants the human operator or their automation agent to tell it whenever it gets restored from a base backup or copied to a new cluster instance. It cannot know it's a truly new and independent cluster based on the existence of a backup label file or based on being a promoted replica, because both of those are normal for HA/failover setups that maintain a single consistent linear history too.

Right now the user "tells" PostgreSQL when it is a new and independent PostgreSQL cluster instance by changing the archive_command's backup destination.

Barman has exactly the same limitations as postgres itself here. The archive_command has exactly the same knowledge limitations as the postgres instance that calls it. It has no way to know if it's a continuation of the same postgres instance, or a "forked off" cluster from a restored backup, a dev-env clone of the cluster, etc.

PostgreSQL's Timeline ID helps Pg detect misconfigurations and wrong archives when it is in recovery, but it doesn't do a lot to prevent multiple instances or branches in an instance's history from writing to a single archive destination. It doesn't solve this issue, it's only a partial defense for it.

Arguably PostgreSQL itself could expose better user interface for this and be more defensive about it - for example, it could have an option to re-generate the cluster sysid on end-of-recovery promotion, and offer a placeholder for passing the cluster sysid to the archive_command. But at some level it'd still need the user to say "you are a new independent instance or restored backup, generate a new sysid", so the potential for operator error would still exist.

What barman can do is expose helper interfaces to users and higher level automation tools like Ansible modules, k8s Operators, etc. Those things do know when they're making a "new" cluster vs bringing up new replicas / restarting an existing cluster. That's what I'm proposing we do here.

ringerc · 2021-10-28T01:43:09Z

@mnencia @gbartolini @amenonsen @mikewallace1979 Should I turn the feature spec of the above into a separate ticket, then we close this one?

mikewallace1979 · 2021-10-28T13:54:40Z

Thanks for the detailed write up @ringerc - that is the perfect amount of context.

I'm happy continuing on this issue and updating the title to reflect your proposal.

ringerc · 2021-10-29T09:39:13Z

@mnencia Cool.

The TL;DR is then:

Implement new barman and/or barman-cloud subcommands to test the archive-destination contents.

At minimum have a subcommand that returns 0 if the archive destination is empty, or 1 if it is non-empty.

Preferably also have the option to pass the server's current timeline and last WAL segment. If specified, ignore any timeline history files <= the current timeline, any WAL segments < the current timeline, and any WAL segments = the current timeline but < the current WAL segment.

Possible syntax might be:

barman check-wal-archive-usable --for-new-cluster
barman check-wal-archive-usable --for-promotion --current-wal-segment XXXXX --current-timeline 1
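
A minimal sketch of the "ignore" rule from the TL;DR, assuming standard 24-hex-digit WAL segment names and `NNNNNNNN.history` timeline history names. `can_ignore` is a hypothetical name, the current segment is assumed to be passed as the 16 hex digits that follow the timeline, and anything unrecognised is treated as blocking, which is a choice made for this sketch rather than something specified above.

```python
import re

# 8 hex digits of timeline followed by 16 hex digits of segment position.
WAL_RE = re.compile(r"^([0-9A-F]{8})([0-9A-F]{16})$")
HISTORY_RE = re.compile(r"^([0-9A-F]{8})\.history$")


def can_ignore(name, current_timeline, current_segment):
    """Return True if an archived file is safely in the server's past."""
    match = HISTORY_RE.match(name)
    if match:
        # History files at or below the current timeline are ignorable.
        return int(match.group(1), 16) <= current_timeline
    match = WAL_RE.match(name)
    if match:
        timeline = int(match.group(1), 16)
        if timeline != current_timeline:
            # Older timelines are ignorable, newer ones are not.
            return timeline < current_timeline
        # Same timeline: only strictly earlier segments are ignorable.
        return int(match.group(2), 16) < int(current_segment, 16)
    # Assumption for this sketch: unknown files block the destination.
    return False


current = (2, "0000000000000005")
for name in (
    "00000001.history",
    "000000020000000000000004",
    "000000020000000000000006",
    "00000003.history",
):
    print(name, can_ignore(name, *current))
```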

Adds commands to barman and barman cloud which check that a barman server or cloud location is safe to use as an archive destination for a new PostgreSQL server.

A location is considered safe if either:

1. There are no WAL files at all in the archive.
2. Any existing WAL files meet one of the following criteria:
   a. They belong to an older timeline than that specified by the current_timeline argument.
   b. They are on the same timeline as the current_timeline argument but the segment is less than that specified in the current_wal_segment argument.

A file is considered a WAL file if it passes the `is_any_xlog_file` check in `barman/xlog.py` so this applies to WAL files, history files, partial WAL files and backup labels.

The commands added are:

* barman check-wal-archive
* barman-cloud-check-wal-archive

The motivation for this patch is to provide a way that external orchestration tools can validate the WAL archive destination is safe for a newly provisioned PostgreSQL cluster, given such a cluster may use the exact same name as an old cluster. In such scenarios, any WAL files which have a higher timeline or segment than the WALs being written by the new cluster will cause any attempt to restore from a backup to fail.

Reasons why external orchestration tooling may re-use the same cluster name and archive destination include (but are not limited to):

* A new cluster is created via initdb with the same name as the old one. The sysid will be different but this does not affect the archive destination so any archived WALs relating to the older cluster will be present in the same location.
* A cluster is restored from a base backup and uses the same name as the old cluster. The cluster has the same sysid and starts with a segment ID > 1 and timeline > 1. The same archive destination used by the old cluster will be used for the restored cluster.
* A new cluster is started which happens to re-use the same name and archive destination.

All of these cases lead to the situation where WAL archiving and backup is functioning normally *but* any attempts to restore from those backups will fail. This is dangerous for anyone relying on the databases managed by external orchestration/automation.

The commands provided by this patch do not solve the problem alone because neither Barman nor PostgreSQL have the necessary context. The commands can, however, be added to external automation in order to catch archive safety issues at the provisioning stage.

Closes #432

mikewallace1979 · 2021-11-02T16:44:28Z

@ringerc

The proposed PR #443 adds barman-cloud-check-wal-archive <SERVER_URL> and barman check-wal-archive commands.

These accept --current-wal-segment and --current-timeline arguments. If neither are set the command will exit with status 1 if there are any WALs in the archive at all. If both are set then the command will check all the WALs and exit with status 1 unless the archive is empty or only consists of files which can be ignored (i.e. they are history files which are < current_timeline, they are WAL segments < the current timeline, or they are WAL segments on the current timeline but < the current WAL segment).

If either --current-wal-segment or --current-timeline are set (but not the other) then an error is generated and we exit with status 2 (as we would for other error conditions).
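
The exit-status behaviour described above might look roughly like this. It is a sketch of the described semantics, not the code from the PR: `exit_status` is a hypothetical function, the archive contents are assumed to be already parsed into `(timeline, segment)` pairs for WAL segments only, and history files are left out for brevity.

```python
import argparse


def exit_status(argv, archived_wals):
    """Return 0 (usable), 1 (not usable) or 2 (bad arguments).

    archived_wals is a list of (timeline, segment) pairs, where segment is
    the 16-hex-digit string following the timeline in the WAL file name.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--current-wal-segment")
    parser.add_argument("--current-timeline", type=int)
    args = parser.parse_args(argv)

    have_segment = args.current_wal_segment is not None
    have_timeline = args.current_timeline is not None
    if have_segment != have_timeline:
        # Only one of the two options was given.
        return 2
    if not have_timeline:
        # Neither option given: any WAL at all makes the archive unusable.
        return 1 if archived_wals else 0

    current = (args.current_timeline, int(args.current_wal_segment, 16))
    for timeline, segment in archived_wals:
        # Anything at or beyond the current position blocks the destination.
        if (timeline, int(segment, 16)) >= current:
            return 1
    return 0


past_only = [(1, "00000000000000FF"), (2, "0000000000000004")]
both = ["--current-timeline", "2", "--current-wal-segment", "0000000000000005"]

print(exit_status([], []))
print(exit_status([], past_only))
print(exit_status(both, past_only))
print(exit_status(["--current-timeline", "2"], past_only))
```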

I didn't add the --for-new-cluster and --for-promotion flags because they're not essential for the functionality (the presence of --current-wal-segment and --current-timeline is enough to change the behaviour of the check) and the names refer to concepts which barman doesn't know anything about - I'm not convinced it makes sense to tell barman-cloud why it needs to do a check, we should only be giving it the details of what should be included in the check.

If those flags would still be helpful to external automation scripts then let me know and we can still add them.

One other thing to clarify - the PR currently assumes --current-wal-segment is the logical ID and the segment ID combined, e.g. 0000000100000001 - is that what you're expecting?
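
To illustrate the format in question: assuming the usual 24-hex-digit WAL file name, the first 8 digits are the timeline and the remaining 16 are the logical ID and segment ID combined. `split_wal_name` is a hypothetical helper for illustration, not a function from barman.

```python
import re


def split_wal_name(name):
    """Split a WAL file name into (timeline, combined 16-digit segment)."""
    if not re.fullmatch(r"[0-9A-F]{24}", name):
        raise ValueError(f"not a WAL segment name: {name!r}")
    # First 8 hex digits: timeline. Last 16: logical ID + segment ID.
    return int(name[:8], 16), name[8:]


# The segment restored in the log excerpt earlier in this thread.
timeline, segment = split_wal_name("000000030000000400000056")
print(timeline, segment)
```

Under that reading, the values to pass for this file would be a timeline of 3 and a segment of 0000000400000056.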

ringerc · 2021-11-05T06:50:20Z

@mikewallace1979 Thanks. Just saw this (GH not notifying me for some reason). Will look.

Adds commands to barman and barman cloud which check that a barman server or cloud location is safe to use as an archive destination for a new PostgreSQL server.

A location is considered safe if either:

1. There are no WAL files at all in the archive.
2. All existing WAL files belong to an older timeline than that specified by the --timeline argument.

A file is considered a WAL file if it passes the `is_any_xlog_file` check in `barman/xlog.py` so this applies to WAL files, history files, partial WAL files and backup labels.

The commands added are:

* barman check-wal-archive
* barman-cloud-check-wal-archive

The motivation for this patch is to provide a way that external orchestration tools can validate the WAL archive destination is safe for a newly provisioned PostgreSQL cluster, given such a cluster may use the exact same name as an old cluster. In such scenarios, any WAL files on the same or higher timeline as the WALs being written by the new cluster will cause any attempt to restore from a backup to fail.

Reasons why external orchestration tooling may re-use the same cluster name and archive destination include (but are not limited to):

* A new cluster is created via initdb with the same name as the old one. The sysid will be different but this does not affect the archive destination so any archived WALs relating to the older cluster will be present in the same location.
* A cluster is restored from a base backup and uses the same name as the old cluster. The cluster has the same sysid and starts with a segment ID > 1 and timeline > 1. The same archive destination used by the old cluster will be used for the restored cluster.
* A new cluster is started which happens to re-use the same name and archive destination.

All of these cases lead to the situation where WAL archiving and backup is functioning normally *but* any attempts to restore from those backups will fail. This is dangerous for anyone relying on the databases managed by external orchestration/automation.

The commands provided by this patch do not solve the problem alone because neither Barman nor PostgreSQL have the necessary context. The commands can, however, be added to external automation in order to catch archive safety issues at the provisioning stage.

Closes #432

mikewallace1979 added this to the 2.16 milestone Oct 27, 2021

mikewallace1979 changed the title ~~Add an option to barman-cloud-wal-archive to avoid reusing buckets/storageaccounts~~ Implement new barman and/or barman-cloud subcommands to test the archive-destination contents Oct 29, 2021

mikewallace1979 self-assigned this Nov 1, 2021

mikewallace1979 mentioned this issue Nov 2, 2021

Provide commands to check WAL archive destination is usable #441

Closed

This was referenced Nov 2, 2021

Provide commands to check WAL archive destination is usable #442

Closed

Provide commands to check WAL archive destination is usable #443

Merged

mikewallace1979 closed this as completed in #443 Nov 16, 2021
