Skip to content

Conversation

zimbatm
Copy link
Contributor

@zimbatm zimbatm commented Oct 17, 2019

Hi,

I am looking for early feedback on this PR. The goal right now is to talk about the design and if this is something Zalando would be interested in merging, plus an associated roadmap of things that would need to happen. If we could do that it would be much appreciated!

The main motivation is to be able to specify a custom TLS certificate from k8s secrets ( #633 ). Right now it only supports static secrets but down the road I would also like to support reloading the secrets whenever they change to allow proper key rotation and integration with cert-manager.

Cheers,
z

@Jan-M
Copy link
Member

Jan-M commented Oct 18, 2019

The topic of having proper certs for the Postgres pots comes up regularly. So far my idea was more a minimalist CA, where a secret or AWS KMS encrypted value contains the private key to sign local certs and DB host names.

I have to read into what you are trying to do here. And then it is questionable who provisions what, does the operator provision things more merely take care of mounting/referencing the right things in the statefulset.

Thanks for bringing this up!

@zimbatm
Copy link
Contributor Author

zimbatm commented Oct 18, 2019

Hey @Jan-M, thanks for taking a look :)

On our side we have the cert-manager operator deployed to handles the certificates. It introduces new types of resources such as the Issuer resource that represents a CA and handles ACME, Vault, self-signed, ... and then the Certificate resource that represents a TLS certificate. The Certificate then generates a k8s secret with the same shape as kubectl create secret tls.

The idea of this PR at the moment is to add a new tls.secretName on the postgresql resource so that each PG cluster can get their own certificate. This makes the extension compatible with both cert-manager and kubectl create secret tls.

If I understand properly what you mentioned, your idea would be to have the postgres-operator trigger the certificate generation. The upside would be that ceritifcates would be generate on a per-host basis instead of on a per-cluster basis. But the downside is that now the postgres-operator has to integrate more tightly to either cert-manager or another operator.

@Jan-M
Copy link
Member

Jan-M commented Oct 30, 2019

Nope, I do not think the operator should start dealing with TLS cert and issueing. So in favor of supporting necessary things to inject and or provide TLS certs to Postgres pods.

I just ment we had an idea once about a "CA" for local clusters.

Can we not have the secret name as a format string in the config and then some global and per cluster enable disable?

@zimbatm
Copy link
Contributor Author

zimbatm commented Nov 3, 2019

Okay this PR is working now. I still need to handle certificate rotation somehow, and add more tests and documentation.

Can we not have the secret name as a format string in the config and then some global and per cluster enable disable?

I'm not sure what the advantage would be. Since both the Certificate and Postgresql resources are created by the user, he is in the best position to define a name and make the connection explicit. Otherwise he would have to ask the administrator what secretName mapping has been configured on the operator.

@zimbatm
Copy link
Contributor Author

zimbatm commented Nov 3, 2019

For certificate rotation, this could be handled automatically in the spilo container: zalando/spilo#373

Copy link
Contributor

@abh abh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thank you! I'll patch it into our local image.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the fsgroup is required for the TLS feature to work, it should be spelled out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea. I already have a check in the code but I will add a note in the documentation as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done!

@zimbatm zimbatm changed the title WIP: Custom TLS secrets Custom TLS secrets Nov 14, 2019
@zimbatm
Copy link
Contributor Author

zimbatm commented Nov 14, 2019

Alright, this is ready for another review

@zimbatm
Copy link
Contributor Author

zimbatm commented Nov 26, 2019

now with tests!

@zimbatm
Copy link
Contributor Author

zimbatm commented Nov 26, 2019

once zalando/spilo#373 is merged I also want to add support for CA certificates

@FxKu
Copy link
Member

FxKu commented Dec 4, 2019

@zimbatm so we merged your PR in Spilo. Can you update this PR - so that we might include it in the next release.

@zimbatm
Copy link
Contributor Author

zimbatm commented Dec 4, 2019

thanks @FxKu!

It looks like coverall wants me to write even more tests. Do you have a recommendation at what place I can write tests for pkg/apis/acid.zalan.do/v1/zz_generated.deepcopy.go ?

@FxKu
Copy link
Member

FxKu commented Dec 9, 2019

At the moment, I see some overlap with #736 and also with the current additional secret mount implementation. If it would be possible to specify an array additional volumes, I guess both features could be covered with it. On the other hand, we also aim for low complexity of the cluster manifest. Just one flag enableTLS would be nice. Cert-file and private key file template could be part of the operator configuration.

@zimbatm
Copy link
Contributor Author

zimbatm commented Dec 10, 2019

okay I will take a look tomorrow. Today I also added configuration to pass a CA file (not yet pushed) that will make the implementation more complete and also allow for checking client certificates.

@zimbatm
Copy link
Contributor Author

zimbatm commented Dec 10, 2019

This PR is ready to me. It's tested and it's working for me (aside from the master build failures).

I think the other PR is nice but also opens the deployment to much more variance. Here the nice thing is that the TLS setup is delegated to the controller and has a specific implementation, whereas the other PR allows to build a custom TLS scheme on top with a initContainer, custom Volume and maybe a sidecar container to handle key rotation. So the out-of-the-box experience is probably worse as the user has to figure out how to tie everything together. I believe that using a secret volume is the right call to store the TLS certificates.

@abh
Copy link
Contributor

abh commented Dec 29, 2019

Any chance this can be merged?

@erthalion
Copy link
Contributor

erthalion commented Dec 29, 2019

Any chance this can be merged?

@abh Sorry for leaving it hanging for so long. I believe we can take a look at this soon (at the beginning of January), but in general I agree with the summary from @zimbatm . I intent to merge another volume related PR, so maybe we can include this one too.

@FxKu FxKu added this to the 1.4 milestone Jan 3, 2020
@zimbatm
Copy link
Contributor Author

zimbatm commented Jan 10, 2020

Is there anything I can do to get this merged?

@erthalion
Copy link
Contributor

Is there anything I can do to get this merged?

I'm afraid no, unfortunately it take so long on our side. But I have this PR in my plan.

@zimbatm
Copy link
Contributor Author

zimbatm commented Jan 10, 2020

Well that's life. As long as you're not blocked by me :)

@erthalion
Copy link
Contributor

Btw, @zimbatm could you merge master once into this PR? Or make it available to push to this branch for operator maintainers. We can start fixing travis tests here.

@zimbatm
Copy link
Contributor Author

zimbatm commented Jan 17, 2020

@erthalion: rebased! I am looking into how to give you write access to the branch.

@erthalion
Copy link
Contributor

I am looking into how to give you write access to the branch.

Apparently there should be some toggle already

@zimbatm
Copy link
Contributor Author

zimbatm commented Jan 17, 2020

It looks like this option is disabled when the PR is coming from an org instead of an account. Do you want me to open a new PR coming from my account?

@zimbatm
Copy link
Contributor Author

zimbatm commented Jan 17, 2020

just went ahead and opened #798

@erthalion
Copy link
Contributor

It looks like this option is disabled when the PR is coming from an org instead of an account.

Interesting. Nope, let's keep it like it is right now. If I would like to suggest something, I can fork a PR fork :)

@Jan-M
Copy link
Member

Jan-M commented Jan 27, 2020

Closing as superseded by newer PR

@Jan-M Jan-M closed this Jan 27, 2020
@FxKu FxKu removed this from the 1.4 milestone Feb 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants