Skip to content

Conversation

ermajn
Copy link
Contributor

@ermajn ermajn commented Jan 28, 2020

After this, there are two options of standby cluster: streaming_host, and s3_wal.
In case of streaming_host option, standby will connect to primary (specified in manifest) and do clone based on basebackup. For that a small refernce to basebackup script in spilo needs to be fixed (zalando/spilo#404).
Also, secret with credentials for standby user are handled manualy in this case. I'm not sure about implementation under #507, but this is minimalistic implementation that basically passes required ENVs to spilo, and it actualy works.

@senthilcodr
Copy link

Hi all,
This PR has been open for a while now. Can the reviewers please provide their feedback?

@FxKu FxKu added this to the 1.5 milestone Feb 25, 2020
@FxKu FxKu added the spilo Issue more related to Spilo label Feb 25, 2020
@senthilcodr
Copy link

@ermajn, does this PR support promoting of the standby cluster using the operator? (Similar to this: patroni/patroni#1096)

If not are you planning to add the same?

@FxKu
Copy link
Member

FxKu commented Apr 3, 2020

Thanks @ermajn for this PR and sorry it took so long to give first feedback. I see one problem with the new required standby_method field. This would break all clusters already running as a standby. We should choose to ignore one standby setup if all fields are present, similar to what we do in the clone section where we clone from S3 when uid and timestamp are present. Otherwise, a basebackup is done from the specified cluster.

I tend to ignore S3 when streaming host and port are set. If it's incomplete take s3_wal_path. In the CRD validation no field should be set to required then. I think this way, everybody who already uses the standby feature can continue as is.

Three more minor requests:

  1. Could you reflect your changes in the user docs, too?
  2. Could you add unit tests similar to the one for cloning env?
  3. Rebase with master :)

@FxKu FxKu modified the milestones: 1.5, 1.6 Apr 23, 2020
@senthilcodr
Copy link

@ermajn, is this PR still active? If not, I can raise a new one based on these changes.

@ermajn
Copy link
Contributor Author

ermajn commented Jun 24, 2020

@senthilcodr yes its active, let me finish all requested task...

@FxKu FxKu removed this from the 1.6 milestone Aug 26, 2020
@machine424
Copy link
Contributor

Hi,
We are really interested in this feature, could it be ready for 1.7?

@FxKu
Copy link
Member

FxKu commented Mar 29, 2021

Sure. Unfortunately, my main complaint from last summer is still not addressed: Backwards compatibility. The standby_method field is a breaking change. What to do with running standby clusters? I think we could do it similar to what we have with cloning. If e.g. host is present choose this method, else S3 like before.

@ermajn
Copy link
Contributor Author

ermajn commented Mar 29, 2021

Hello, yes requirement is still there. It would be great if we can include it.
@FxKu if your propose about backward compatibility is acceptible, I can put some effort in implementing it that way.

@ermajn
Copy link
Contributor Author

ermajn commented Nov 1, 2021

@FxKu this one is old but it seems to be required. I have added now backward compatibility, it falls back to "s3_wal" when standby_method is not specified. Is there anything more needed in order to merge this PR?

@FxKu FxKu added this to the 1.8 milestone Nov 1, 2021
@FxKu
Copy link
Member

FxKu commented Mar 2, 2022

@ermajn I resolved the conflicts. I'm still not a fan of the standby_method. Why not use a streaming cluster if host is present and use WAL when it's not. We do something similar with cloning. If there's a timestamp specified we clone from basebackup, if not we clone from WAL. This would make things easier for users and devs. The behavior has to be documented, of course. At the moment only the reference docs were adjusted.

At first, I wasn't sure about the standby_secret_name either, but I see that with that you don't need to copy values from the secret of the source cluster into the replication user's secret of the standby cluster, right?

c.logger.Infof(msg, description.StandbyHost)

result = append(result, v1.EnvVar{Name: "STANDBY_HOST", Value: description.StandbyHost})
result = append(result, v1.EnvVar{Name: "STANDBY_PORT", Value: description.StandbyPort})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think STANDBY_PORT can be omitted when it's not set. Patroni will default to 5432.

Comment on lines +1926 to +1953
if description.StandbyMethod == "s3_wal" || description.StandbyMethod == "gs_wal" {
if description.S3WalPath != "" {
// standby with S3, find out the bucket to setup standby
msg := "Standby from S3 bucket using custom parsed S3WalPath from the manifest %s "
c.logger.Infof(msg, description.S3WalPath)

result = append(result, v1.EnvVar{
Name: "STANDBY_WALE_S3_PREFIX",
Value: description.S3WalPath,
})
} else if description.GSWalPath != "" {
msg := "Standby from GS bucket using custom parsed GSWalPath from the manifest %s "
c.logger.Infof(msg, description.GSWalPath)

envs := []v1.EnvVar{
{
Name: "STANDBY_WALE_GS_PREFIX",
Value: description.GSWalPath,
},
{
Name: "STANDBY_GOOGLE_APPLICATION_CREDENTIALS",
Value: c.OpConfig.GCPCredentials,
},
}
result = append(result, envs...)
}
result = append(result, v1.EnvVar{Name: "STANDBY_METHOD", Value: "STANDBY_WITH_WALE"})
result = append(result, v1.EnvVar{Name: "STANDBY_WAL_BUCKET_SCOPE_PREFIX", Value: ""})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, my conflict resolving messed up indentation in this part

* **standby_secret_name**
Specify secret name that is used to look-up for a password that
is need to connect to primary host. Password should be under `password` key
of a secret. This is *required* be specified when using `streaming_host` standby method.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not required so define it, since you default to the old behavior in the code.

s3_wal_path: "s3://path/to/bucket/containing/wal/of/source/cluster/"
s3_wal_path: "s3://path/to/bucket/containing/wal/of/source/cluster/" # Make this a standby cluster and provide the s3 bucket path of source cluster for continuous streaming.
standby_method: "s3_wal" # s3_wal or streaming_host;
# standby_host: ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be nice to provide a reasonable example here? Same for the user docs.

@FxKu
Copy link
Member

FxKu commented Apr 4, 2022

I will close this PR now since we merged #1830. Still, thanks @ermajn for bringing it up. Sorry for the feedback being too sporadic.

@FxKu FxKu closed this Apr 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

spilo Issue more related to Spilo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants