PeeringAcceptor controller #1225
Conversation
I like what I'm seeing so far.
Force-pushed dfe0cc6 to ba63d22
@@ -142,14 +142,22 @@ jobs:
- run: mkdir -p ${{env.TEST_RESULTS}}
The changes in this file would be removed when merging in the feature branch, because the most recently released version of Consul does not have peering.
Will we have the right binaries by the time we merge? I'm thinking that maybe we should keep it on the feature branch as well until we have released Consul binaries.
Yup, agreed. We wouldn't want to cut a release with all this unless there's a corresponding Consul release, and we'll also want to address the TODOs and unit tests (for the peering dialer) on the `peering` feature branch before merging to main.
I don't feel I have much to add given my lack of context. I perused this mainly to gain a better understanding of what you are working on here. I think you've expressed it very clearly.
Great work, Nitya! As usual, love all the comments and the thought you put into this ❤️
I'm leaving my comments so far (I got through most of the reconcile logic, except for some private functions and tests) and will review more today. Some of the comments are very minor and I marked them as non-blockers; others are more questions or thoughts about edge cases, so feel free to punt on those too if need be!
@@ -0,0 +1,99 @@
{{- if .Values.connectInject.enabled }}
Just thinking out loud: I think it makes sense to make this part of the connect injector. At some point we wanted to make the CRD controller part of the injector deployment as well, so making this controller part of connect inject is consistent with that.
Yeah, that sounds good. There is technically a service discovery case that can be enabled without service mesh, which could mean Consul DNS for imported services could work without enabling mesh functionality. Do you think it's worth enabling these controllers by default, not gated on any values?
Yeah, I think we will in the future, but probably not for this feature.
control-plane/config/crd/patches/cainjection_in_peeringacceptors.yaml (outdated; resolved)
control-plane/config/crd/patches/webhook_in_peeringacceptors.yaml (outdated; resolved)
// Look up existing secret.
var existingSecret *corev1.Secret
if statusSecretSet {
Could there be an edge case where the status is not set but the secret exists?
Any time we create a secret, we then set the status. If the secret was successfully created but the status didn't properly get updated, the scenario you described could happen. However, that would return a reconcile error, since `updateStatus` would return an error. Upon the next reconcile, `statusSecretSet` would be false, but the peering would exist in Consul, so it'll go into the case in L140 and try to generate a token again, so I think it'd be fine.
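For readers following along, here's a minimal sketch of that lookup-and-fallback path, assuming a controller-runtime client; the function and field names here are illustrative, not the PR's actual code:

```go
// Sketch only: shows how an unset status (or a dangling status reference)
// can fall through to the "regenerate the token" path discussed above.
package controllers

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	k8serrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// lookupStatusSecret returns the secret referenced by the acceptor's
// status, or nil when the status is unset or the secret is gone. In the
// edge case above (secret created but the status update failed), name is
// empty, so this returns nil and the caller regenerates the peering token.
func lookupStatusSecret(ctx context.Context, c client.Client, namespace, name string) (*corev1.Secret, error) {
	if name == "" {
		// statusSecretSet == false: brand-new resource, or a previous
		// status update errored out after the secret was written.
		return nil, nil
	}
	var secret corev1.Secret
	err := c.Get(ctx, types.NamespacedName{Namespace: namespace, Name: name}, &secret)
	if k8serrors.IsNotFound(err) {
		// The status points at a secret that no longer exists.
		return nil, nil
	}
	if err != nil {
		return nil, err
	}
	return &secret, nil
}
```

Either way the orphaned secret gets reconciled back into a consistent state on the next pass, which is why the failure mode above is benign.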
// Compare the existing secret hash.
// Get the secret specified by the status, make sure it matches the status' secret.latestHash.
if existingSecret != nil {
	existingSecretHashBytes := sha256.Sum256(existingSecret.Data[peeringAcceptor.Status.Secret.Key])
Could we use the secret's `resourceVersion` instead of computing a hash?
I really like this idea!!
Yup, great idea! Update: still working on this.
Updated to use the resource version! This was a great idea!
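A minimal sketch of what the `resourceVersion`-based check could look like, assuming the status stores the secret's `resourceVersion` in place of the earlier hash (field names are assumptions, not the PR's actual API):

```go
package controllers

import (
	corev1 "k8s.io/api/core/v1"
)

// secretOutOfDate reports whether the live secret differs from what the
// acceptor's status recorded. Sketch only: it assumes the status keeps
// the secret's resourceVersion, replacing the sha256 comparison above.
func secretOutOfDate(statusResourceVersion string, existing *corev1.Secret) bool {
	if existing == nil {
		// The secret was deleted out of band; a new token is needed.
		return true
	}
	// Kubernetes bumps metadata.resourceVersion on every write, so any
	// out-of-band change to the secret shows up as a mismatch here
	// without hashing the token bytes.
	return existing.ResourceVersion != statusResourceVersion
}
```

The upside of this approach is that the controller no longer needs to read and hash the token contents; the apiserver's own change tracking does the work.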
Looks good! Leaving my remaining comments. I think this is good to go once we decide what to do about some questions I left.
}

// updateStatusError updates the peeringAcceptor's ReconcileError in the status.
func (r *PeeringAcceptorController) updateStatusError(ctx context.Context, peeringAcceptor *consulv1alpha1.PeeringAcceptor, reconcileErr error) error {
We always ignore the error that's returned by this function. I'm assuming we shouldn't, so maybe it could be a TODO?
I don't think there's anything we can do to handle the error from this function. If we fail to update the status with the error message, we don't want the status-update failure to be what's returned from the reconcile; we want the real error instead. I suppose we could have this function not return an error at all and document that we are ignoring any errors in this case. The only error case is the K8s API being down when we try to update the status, and in that situation the real error would still be returned as the reconcile error. Wdyt?
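A sketch of that "log and drop" variant; the import path, stub controller struct, and `Status.ReconcileError` field shape are assumptions for illustration, not the PR's actual types:

```go
package controllers

import (
	"context"

	"github.com/go-logr/logr"
	"sigs.k8s.io/controller-runtime/pkg/client"

	// Assumed import path for the CRD types in this repo.
	consulv1alpha1 "github.com/hashicorp/consul-k8s/control-plane/api/v1alpha1"
)

// Stand-in for the real controller struct, for the sketch to compile.
type PeeringAcceptorController struct {
	client.Client
	Log logr.Logger
}

// updateStatusError records reconcileErr on the acceptor's status. Per
// the suggestion above, it returns nothing: a failed status write
// (realistically only when the K8s API is unreachable) is logged, so the
// caller surfaces the real reconcile error instead of the secondary one.
func (r *PeeringAcceptorController) updateStatusError(ctx context.Context, acceptor *consulv1alpha1.PeeringAcceptor, reconcileErr error) {
	acceptor.Status.ReconcileError = reconcileErr.Error() // assumed field shape
	if err := r.Status().Update(ctx, acceptor); err != nil {
		r.Log.Error(err, "failed to update PeeringAcceptor status",
			"name", acceptor.Name, "namespace", acceptor.Namespace)
	}
}
```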
listKind: PeeringAcceptorList
plural: peeringacceptors
singular: peeringacceptor
scope: Namespaced
Should this be a cluster-scoped resource?
This was something that I had suggested. To me this felt similar to our proxy-defaults and mesh config entries: it's OK for these to be namespaced, because some users prefer to keep all of their config in a single configuration namespace, and this feels like it falls into the same category. Additionally, given that it's responsible for creating and managing a secret, it's easy to assume the secret should be created in the same namespace as the resource itself, instead of us having to determine which namespace the secret has to be written to.
Ahh, ok! I was thinking that you would only want one of these per cluster, which is why it might make sense from an RBAC perspective to make it cluster-wide. For the secret, I think it's fine to manage a secret in some namespace (e.g. a ClusterRoleBinding refers to a service account, and service accounts are namespaced).
Yeah, I also think it might be weird for PeeringAcceptors to be cluster-wide while their managed secrets are namespaced; it's nicer to always create the secret in whatever namespace the PeeringAcceptor is in. And yup, there can be multiple PeeringAcceptors in a single cluster.
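To make the outcome concrete, here's roughly how that decision shows up as kubebuilder markers on the CRD's Go type (spec/status fields elided; the marker set is illustrative, not copied from the PR):

```go
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// +kubebuilder:resource:scope=Namespaced

// PeeringAcceptor stays namespaced so the token secret it manages can
// always be written to the resource's own namespace, and so users can
// keep it alongside other config in a single configuration namespace.
type PeeringAcceptor struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   PeeringAcceptorSpec   `json:"spec,omitempty"`
	Status PeeringAcceptorStatus `json:"status,omitempty"`
}

// Fields elided for brevity.
type PeeringAcceptorSpec struct{}
type PeeringAcceptorStatus struct{}
```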
Force-pushed b47572b to a0faab9
Updates: Hey @ishustava, re-requesting review while still addressing:
@thisisnotashwin, I'm still addressing:
After updating control-plane/PROJECT L6 to reference the control plane folder, ran the command: `operator-sdk create api --group consul --version v1alpha1 --kind Peering --controller --namespaced=true --make=false --resource=true`
Force-pushed 12ec371 to 99fa4b6
Made a cleanup Asana task to refactor to have getters on the CRD fields. Currently thinking of not using the object reference and using a local type instead, due to https://github.com/kubernetes/api/blob/be84346886a48ff00fc676f9cb37f10e79dfd0da/core/v1/types.go#L5624
This looks good. I haven't tested the behavior on a cluster myself, but this is great!
Adds new PeeringAcceptor CRD and controller with unit tests. This controller determines whether to generate a new peering token in Consul by reconciling the PeeringAcceptor resource. It currently supports writing the peering token to a K8s secret. Generating the new CRD and API: After updating control-plane/PROJECT L6 to reference the control plane folder, ran the command: `operator-sdk create api --group consul --version v1alpha1 --kind Peering --controller --namespaced=true --make=false --resource=true`
Changes proposed in this PR:
Future PRs (tasks are in Cluster Peering Asana):
How I've tested this PR:
- `helm install nitya ../../consul-k8s/charts/consul -f consul-values.yml` (using a local consul-values.yml)
- `k apply -f peering-acceptor.yaml`
- `k delete -f peering-acceptor.yaml`
- Look at the logs and make sure the output is as expected.
How I expect reviewers to test this PR:
Code review
Checklist: