[WIP] Encryption Config #516
@@ -61,10 +64,11 @@ func NewConfigObserver(
SchedulerLister: configInformer.Config().V1().Schedulers().Lister(),

ConfigmapLister: kubeInformersForNamespaces.ConfigMapLister(),
SecretLister_: kubeInformersForNamespaces.SecretLister(),
underscore?
Matches others in the same file.
return err
}

switch originalSpec.ManagementState {
We have a library-go helper to deal with this.
}

func NewEncryptionObserver(targetNamespace string, encryptionConfigPath []string) configobserver.ObserveConfigFunc {
return func(genericListers configobserver.Listers, recorder events.Recorder, existingConfig map[string]interface{}) (map[string]interface{}, []error) {
strange to return a func this big
yeah, I think this is because we try to be generic. Also what I personally don't like about the definition is that we lose type safety and it's really hard to read what the function does. We accept map[string]interface{} and return map[string]interface{} :(
"strange to return a func this big"
I did not want to create a new struct just to close over the inputs.
encryptionConfigSecret, err := listers.SecretLister().Secrets(targetNamespace).Get(encryptionConfSecret)
if errors.IsNotFound(err) {
recorder.Warningf("ObserveEncryptionConfigNotFound", "encryption config secret %s/%s not found", targetNamespace, encryptionConfSecret)
// TODO what is the best thing to do here?
What produces this secret?
resourcesynccontroller.ResourceLocation{Namespace: targetNamespace, Name: encryptionConfSecret},
resourcesynccontroller.ResourceLocation{Namespace: operatorclient.GlobalMachineSpecifiedConfigNamespace, Name: sourceName},
); err != nil {
panic(err) // coding error
don't panic, propagate the error up
+1
Fixed
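For readers following this thread, here is a minimal sketch of what "propagate the error up" could look like. The names syncRule, registerSync, and setupEncryptionSync are hypothetical stand-ins and not the resourcesynccontroller API; the actual fix in the PR is not shown in this conversation.

```go
package main

import (
	"errors"
	"fmt"
)

// syncRule is a hypothetical stand-in for a resource sync registration.
// It is not the resourcesynccontroller API.
type syncRule struct {
	destination, source string
}

// registerSync stands in for a registration call that can fail.
func registerSync(rules map[string]string, r syncRule) error {
	if r.destination == "" || r.source == "" {
		return errors.New("source and destination must both be set")
	}
	rules[r.destination] = r.source
	return nil
}

// setupEncryptionSync returns the error to its caller instead of panicking,
// wrapping it with context so the caller can report or retry.
func setupEncryptionSync(rules map[string]string, r syncRule) error {
	if err := registerSync(rules, r); err != nil {
		return fmt.Errorf("failed to register encryption config sync: %w", err)
	}
	return nil
}

func main() {
	rules := map[string]string{}
	// The source is deliberately left empty to trigger the error path.
	if err := setupEncryptionSync(rules, syncRule{destination: "target/encryption-config"}); err != nil {
		fmt.Println("startup error:", err)
	}
}
```

The caller decides whether the failure is fatal, which keeps a configuration mistake from crashing the whole operator process.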
return previouslyObservedConfig, append(errs, err)
}

if !equality.Semantic.DeepEqual(existingEncryptionConfig, []string{encryptionConfFilePath}) {
nit: we could compare the existingEncryptionConfig field before SetNestedStringSlice (line 58). What if existingEncryptionConfig hasn't changed? Should we then simply return previouslyObservedConfig? For example:
if equality.Semantic.DeepEqual(existingEncryptionConfig, []string{encryptionConfFilePath}) {
return previouslyObservedConfig, errs
}
recorder.Eventf("ObserveEncryptionConfigChanged", "encryption config file changed from %s to %s", existingEncryptionConfig, encryptionConfFilePath)
err := unstructured.SetNestedStringSlice(observedConfig, []string{encryptionConfFilePath}
...
errs = append(errs, err)
}
}

Is it okay to continue even if errs != nil? Does it mean that previouslyObservedConfig could be incomplete (some fields could be missing)?
// TODO what is the best thing to do here?
// for now we do not unset the config as we are checking a synced version of the secret that could be deleted
// return observedConfig, errs
return previouslyObservedConfig, errs // do not append the not found error
this can fail because errs could actually have some errors (line 35)
}

if !equality.Semantic.DeepEqual(existingEncryptionConfig, []string{encryptionConfFilePath}) {
recorder.Eventf("ObserveEncryptionConfigChanged", "encryption config file changed from %s to %s", existingEncryptionConfig, encryptionConfFilePath)
nit: could we make the msg a bit more accurate and assume that existingEncryptionConfig holds only one element? Then the msg would look like "changed from x to y" instead of "changed from [x] to y"
Name: fmt.Sprintf("encryption-%s-%s-%s-%d", c.componentName, gr.Group, gr.Resource, keyID),
Namespace: operatorclient.GlobalMachineSpecifiedConfigNamespace,
Labels: map[string]string{
encryptionSecretComponent: c.componentName,
shouldn't the encryptionSecretComponent label point to targetNamespace?

func newAES256Key() []byte {
b := make([]byte, 32) // AES-256 == 32 byte key
if _, err := rand.Read(b); err != nil {
I'd handle the error, perhaps there are actually cases when this function might fail.
It will never fail on Linux, and the OAuth server has matched this behavior since the beginning of OpenShift.
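For comparison, a minimal sketch of the error-returning variant the reviewer suggests. The changed signature is an assumption for illustration; the body of the error branch in the PR is not visible in this conversation.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newAES256Key returns 32 random bytes (the AES-256 key size) and reports a
// failure of the random source to the caller instead of handling it inline.
func newAES256Key() ([]byte, error) {
	b := make([]byte, 32) // AES-256 == 32 byte key
	if _, err := rand.Read(b); err != nil {
		return nil, fmt.Errorf("could not read random bytes: %w", err)
	}
	return b, nil
}

func main() {
	key, err := newAES256Key()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("key length:", len(key))
}
```

The trade-off is that every caller now has to handle an error that, per the reply above, is not expected to occur on Linux.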
c.componentSelector = labelSelectorOrDie(encryptionSecretComponent + "=" + targetNamespace)

operatorClient.Informer().AddEventHandler(c.eventHandler())
kubeInformersForNamespaces.InformersFor(operatorclient.GlobalMachineSpecifiedConfigNamespace).Core().V1().Secrets().Informer().AddEventHandler(c.eventHandler())
Will the sync func be triggered if there are no secrets in the GlobalMachineSpecifiedConfigNamespace namespace?
Operator client informer will always trigger this.
Although I truly enjoyed reading your code, it would be nice to add documentation and some unit tests at some point.
I'm not sure I fully understand how the code works, but hopefully the following will help the next reviewers.
NewEncryptionKeyController essentially generates encryption keys for validGRs based on the needsNewKey condition and places them in the GlobalMachineSpecifiedConfigNamespace ns.
NewEncryptionStateController takes all those keys and generates a secret (encryption-config-kube-apiserver) with an EncryptionConfiguration in GlobalMachineSpecifiedConfigNamespace.
SyncEncryptionConfig propagates the encryption-config-kube-apiserver secret to a secret in the targetNamespace ns.
NewEncryptionPodStateController waits for encryption-config-kube-apiserver in the targetNamespace ns and propagates state back to the secrets created by NewEncryptionKeyController by adding some labels.
NewEncryptionMigrationController waits for encryption-config-kube-apiserver in the targetNamespace ns, takes all the keys/secrets in GlobalMachineSpecifiedConfigNamespace, finds the writeKey, and performs encryption by modifying the desired resources.
NewEncryptionPruneController removes the last 10 secrets with the encryptionSecretMigratedTimestamp label attached.
}

func (c *EncryptionKeyController) sync() error {
if ready, err := shouldRunEncryptionController(c.operatorClient); err != nil || !ready {
Are this controller and the others not guarded by a TechPreviewNoUpgrade feature flag?
func (c *EncryptionKeyController) generateKeySecret(gr schema.GroupResource, keyID uint64) *corev1.Secret {
return &corev1.Secret{
ObjectMeta: metav1.ObjectMeta{
Name: fmt.Sprintf("encryption-%s-%s-%s-%d", c.componentName, gr.Group, gr.Resource, keyID),
From what I have seen, the keyID is monotonic, and I'm not sure I understand how it relates to
https://github.com/openshift/cluster-kube-apiserver-operator/pull/516/files#diff-9b3719f78d8f89fa6b78d0f3040a3862R159

preRunCachesSynced: []cache.InformerSynced{
operatorClient.Informer().HasSynced,
kubeInformersForNamespaces.InformersFor(operatorclient.GlobalMachineSpecifiedConfigNamespace).Core().V1().Secrets().Informer().HasSynced,
factor out kubeInformersForNamespaces.InformersFor(operatorclient.GlobalMachineSpecifiedConfigNamespace) into var and reuse below. Too easy to lose track of these, leading to ugly bugs.
return configError
}

func (c *EncryptionKeyController) handleEncryptionKey() error {
"handle" here means that new keys are created if needed, right? Call the function checkAndCreateKeys.

const encWorkKey = "key"

type EncryptionKeyController struct {
This needs a ten-line comment explaining what this controller does.

preRunCachesSynced []cache.InformerSynced

validGRs map[schema.GroupResource]bool
what does valid mean? encrypted CRs?

validGRs map[schema.GroupResource]bool

componentName string
what is a component? Be more precise. Below I see that you get secrets with the selector.
return keyID, true // eh?
}

return keyID, time.Now().After(migrationTimestamp.Add(30 * time.Minute)) // TODO how often?
make it a constant
rewriting all secrets every 30 min? Is that a good idea?
@smarterclayton sounds like this interval came from you. We have to migrate all resources, i.e. update/patch all of them. Are we sure something in this dimension makes sense? Was expecting something around days to a month, like for certs.

const stateWorkKey = "key"

type EncryptionStateController struct {
ten-liner godoc
return configError
}

func (c *EncryptionStateController) handleEncryptionStateConfig() error {
better name
pkg/operator/starter.go
Outdated
@@ -82,6 +84,61 @@ func RunOperator(ctx *controllercmd.ControllerContext) error {
ctx.EventRecorder,
)

validGRs := map[schema.GroupResource]bool{
encryptedGRs
@@ -190,6 +252,9 @@ var RevisionSecrets = []revision.RevisionResource{
// this is needed so that the cert syncer itself can request certs. It uses localhost
{Name: "kube-apiserver-cert-syncer-client-cert-key"},
{Name: "kubelet-client"},

// etcd encryption
{Name: "encryption-config", Optional: true},
Does optional mean that a Secret read error in the kubelet, some race, or some admin deleting the secret by accident will lead to an API server starting with an invalid encryption config? This will stop it from reading data, but even worse: it might make it store data unencrypted. CVE ahead.
Need to think on this some.
operatorInformer := operatorClient.Informer()
operatorInformer.AddEventHandler(eventHandler)

secretsInformer := kubeInformersForNamespaces.InformersFor(operatorclient.GlobalMachineSpecifiedConfigNamespace).Core().V1().Secrets().Informer()
I asked myself whether it is safe to assume that an informer will be created for that namespace. I think it is because we do secretClient.Secrets(operatorclient.GlobalMachineSpecifiedConfigNamespace) which uses CachedSecretGetter.
}

if len(revisions) != 1 {
return "", nil // api servers have not converged onto a single revision
we could at least log that we are waiting for pods to converge, otherwise, it could be hard to tell why the controller isn't running.
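A rough sketch of that logging suggestion, under simplifying assumptions: convergedRevision and the pod-name-to-revision map are hypothetical, the error return of the original is dropped, and the standard log package stands in for whatever logger the operator uses.

```go
package main

import (
	"fmt"
	"log"
	"sort"
)

// convergedRevision returns the single revision that all pods run, or ""
// while they disagree. The pod-to-revision map is a hypothetical input shape.
func convergedRevision(podRevisions map[string]string) string {
	seen := map[string]bool{}
	for _, rev := range podRevisions {
		seen[rev] = true
	}
	if len(seen) != 1 {
		revs := make([]string, 0, len(seen))
		for rev := range seen {
			revs = append(revs, rev)
		}
		sort.Strings(revs) // stable order keeps the log line readable
		// Say why nothing happens, so a stalled controller can be diagnosed.
		log.Printf("waiting for API server pods to converge on one revision, currently observed: %v", revs)
		return ""
	}
	for rev := range seen {
		return rev // exactly one entry at this point
	}
	return ""
}

func main() {
	fmt.Println(convergedRevision(map[string]string{"pod-a": "7", "pod-b": "7"})) // 7
	fmt.Println(convergedRevision(map[string]string{"pod-a": "7", "pod-b": "8"})) // logs and prints an empty line
}
```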
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: enj The full list of commands accepted by this bot can be found here.
/retest
Signed-off-by: Monis Khan <mkhan@redhat.com>
@enj: The following tests failed, say
@mfojtik @sttts @p0lyn0mial you probably care about this sooner rather than later. /close
@enj: Closed this PR. In response to this:
Signed-off-by: Monis Khan mkhan@redhat.com