Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support standby replication from GS (GCS) #1446

Merged
merged 12 commits into from
Dec 3, 2021
Merged

Support standby replication from GS (GCS) #1446

merged 12 commits into from
Dec 3, 2021

Conversation

jamesmcdonald
Copy link
Contributor

This allows standby clusters to be created from a GCS bucket by specifying either gs_wal_path with a full manual path, or cluster and optionally uid and version for similar behaviour to cloning. I've updated the documentation with the details. I have tested it against a number of GCS WAL-G backups and it seems to work nicely. I haven't tested the cluster-based replication from S3 as I don't have any data in S3 to try it out with.

It uses the main GCPCredentials from the operator config rather than requiring a standby-specific setting which I think should generally work. If different credentials are needed for pushing WALs then they can be changed at the same time as the standby is promoted.

I created this before I realised there was an existing PR! Let me know if it's useful. This is going to be an essential feature for us so I'm happy to amend it as needed.

@FxKu
Copy link
Member

FxKu commented May 31, 2021

Thanks @jamesmcdonald. Why would you need the version. Yes, since Spilo 13, it's part of the WAL path, but then you would have it s3/gs_wal_path, no? I also do not see that it is appended somewhere. Only an env variable is set that Spilo doesn't know about. Aside from that it's looking fine I think.

@FxKu FxKu added this to the 1.7 milestone May 31, 2021
@jamesmcdonald
Copy link
Contributor Author

Thanks for the review! As I understand, PGVERSION gets added to the bucket path here: https://github.com/zalando/spilo/blob/master/postgres-appliance/scripts/configure_spilo.py#L763 after it's set from STANDBY_PGVERSION by https://github.com/zalando/spilo/blob/master/postgres-appliance/scripts/configure_spilo.py#L1027.

@FxKu
Copy link
Member

FxKu commented Jun 4, 2021

No, PGVERSION should already be available for every spilo pod depending on the version chosen in the Postgres cluster manifest. You don't have to define it again. We also don't do this on cloning.

@jamesmcdonald
Copy link
Contributor Author

Right, I see what you mean. That certainly makes sense given that you can only create a standby from the same version. Unfortunately the main PGVERSION isn't picked up by the current logic in Spilo, so if STANDBY_PGVERSION isn't set then the WALG_GS_PREFIX in /var/run/etc/wal-e.d/env-standby-* ends up just ending in /wal with no version.

I guess doing this properly would require additional logic in Spilo to either:

I've updated my branch to remove the version setting and demonstrate this behaviour; there's another branch https://github.com/jamesmcdonald/postgres-operator/tree/gs-standby-version with the previous commit.

Do you have a suggestion for the best way to proceed?

@FxKu FxKu modified the milestones: 1.7, 1.8 Aug 23, 2021
@jamesmcdonald
Copy link
Contributor Author

I've removed the ability to select a path via cluster/uid/version entirely because it doesn't work reliably. It's pretty straightforward to use the s3/gs_wal_path options.

// standby with S3, find out the bucket to setup standby
msg := "Standby from S3 bucket using custom parsed S3WalPath from the manifest %s "
c.logger.Infof(msg, description.S3WalPath)
if description.S3WalPath != "" {
Copy link
Member

@FxKu FxKu Nov 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if description.S3WalPath != "" {
if description.S3WalPath == "" && description.GSWalPath == "" {
return nil
}
if description.S3WalPath != "" {

We should still consider the case when either one of the WAL paths is empty and return nil to have the same behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, fixed!

@FxKu FxKu mentioned this pull request Nov 29, 2021
Comment on lines 654 to 660
"cluster": {
Type: "string",
},
"uid": {
Type: "string",
Format: "uuid",
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can go, I think. Maybe a left over from an earlier version of this PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cleaned these up too.

@FxKu
Copy link
Member

FxKu commented Nov 29, 2021

@jamesmcdonald can you have a look at my two comments? I think after that we can merge this PR :)

@FxKu
Copy link
Member

FxKu commented Nov 30, 2021

👍

1 similar comment
@Jan-M
Copy link
Member

Jan-M commented Dec 2, 2021

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants