Getting cluster geo tag from nodes labels and annotations #823

Closed
jkremser wants to merge 3 commits from the cluster-tag-from-nodes branch

Conversation

@jkremser (Member) commented Jan 4, 2022:

More details in issue #720.

Unfortunately, the clusterGeoTag was also used in the helm chart for calculating the extdnsOwnerID, which is then passed to the external-dns deployment as an argument for the main binary/container. However, we don't know it upfront, so I had to add some logic around it: the external-dns deployment takes the param as an env variable from a ConfigMap called external-dns-env. If the cm is not there, the deployment won't get deployed (→ no race conditions); we create this ConfigMap after k8gb starts, with the correctly calculated value for EXTERNAL_DNS_TXT_OWNER_ID (same logic as before).

k8gb now has to be able to read the nodes in order to get the annotations or labels on them.

I've created a dummy RBAC just for the tests, to be able to read the nodes as well (we are not using the global one called "k8gb" because there would be a risk that the operator would start handling the Gslb resources). In the terratest we deploy another k8gb instance using helm with crippled RBAC and test that it fails when the geo tag is provided neither by env var nor by a label/annotation on the nodes; then we put the label on the nodes, kill the pod to enforce a restart, and check that this time it's ok, since it gets the geo tag from a node.
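For illustration, a minimal sketch of the inference described above, assuming a controller-runtime client and the fallback order env var → node labels. The helper name and error message are illustrative, not the PR's actual code, and it is simplified to the region label only (the PR also considers annotations and the zone label):

```go
import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// resolveGeoTag is a hypothetical helper: it prefers the explicitly configured
// value (the CLUSTER_GEO_TAG env var) and falls back to the well-known topology
// label on the nodes. Listing nodes is what requires the extra RBAC.
func resolveGeoTag(ctx context.Context, c client.Client, fromEnv string) (string, error) {
	if fromEnv != "" {
		return fromEnv, nil
	}
	nodes := &corev1.NodeList{}
	if err := c.List(ctx, nodes); err != nil {
		return "", err // e.g. forbidden when the RBAC lacks list on nodes
	}
	for _, node := range nodes.Items {
		if tag, ok := node.Labels["topology.kubernetes.io/region"]; ok {
			return tag, nil
		}
	}
	return "", fmt.Errorf("no geo tag in env and no topology label found on any node")
}
```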

@jkremser jkremser force-pushed the cluster-tag-from-nodes branch 2 times, most recently from 6315f5a to 7d1f6a6 Compare January 17, 2022 13:53
Command: "bash",
Args: []string{"-c", fmt.Sprintf("cat %s | envsubst | kubectl %s -n %s -f -", rbacPath, verb, options.Namespace)},
Env: map[string]string{
"namespace": options.Namespace,
Collaborator commented:

Do we control the k8gb namespace in tests?

@jkremser (Member, Author) commented Jan 17, 2022:

Yes, it gets created & deleted here. The namespaces will look like k8gb-test-geo-tag-1fuk1x, where the last chunk is random.

Collaborator commented:

We can throw away envsubst, use .Release.Namespace in the original RBAC, and omit having a separate RBAC resource here.

@jkremser jkremser marked this pull request as ready for review January 17, 2022 14:24
@kuritka (Collaborator) left a comment:

Hi, overall an excellent idea and good job!

I have only three little things, mentioned in the review.

@somaritane (Contributor) left a comment:

@jkremser nice work!
This feature might help a lot to simplify GitOps-centric k8gb deployment workflows, with one set of k8gb deployment settings valid for all clusters.
There are a few conceptual comments/questions we might want to address before merging.

main.go Outdated
```go
const region = "topology.kubernetes.io/region" // top-lvl (example: africa-east-1)
const zone = "topology.kubernetes.io/zone"     // lower-lvl (example: africa-east-1a)
// assuming all the nodes have the same topology.kubernetes.io/region and topology.kubernetes.io/zone
for _, node := range nodeList.Items {
```
Contributor commented:

We should reflect the inference resolution order in the documentation.

main.go Outdated
```go
// values of these annotations/labels are cloud provider specific
const region = "topology.kubernetes.io/region" // top-lvl (example: africa-east-1)
const zone = "topology.kubernetes.io/zone"     // lower-lvl (example: africa-east-1a)
// assuming all the nodes have the same topology.kubernetes.io/region and topology.kubernetes.io/zone
```
Contributor commented:

The topology.kubernetes.io/zone label might actually be unique for every node in the cluster, at least for managed cloud k8s clusters (EKS, AKS, etc.):

```
kubectl get nodes -o custom-columns=NAME:'{.metadata.name}',REGION:'{.metadata.labels.topology\.kubernetes\.io/region}',ZONE:'{.metadata.labels.topology\.kubernetes\.io/zone}'
NAME                                            REGION      ZONE
ip-XXX-XXX-XXX-XXX.eu-west-1.compute.internal   eu-west-1   eu-west-1a
ip-YYY-YYY-YYY-YYY.eu-west-1.compute.internal   eu-west-1   eu-west-1b
ip-ZZZ-ZZZ-ZZZ-ZZZ.eu-west-1.compute.internal   eu-west-1   eu-west-1c
```

Probably we shouldn't care about zones at all in this case?
According to https://kubernetes.io/docs/reference/labels-annotations-taints/#topologykubernetesiozone:

Kubernetes makes a few assumptions about the structure of zones and regions:

  1. regions and zones are hierarchical: zones are strict subsets of regions and no zone can be in 2 regions
  2. zone names are unique across regions; for example region "africa-east-1" might be comprised of zones "africa-east-1a" and "africa-east-1b"

So in this case, topology.kubernetes.io/region seems like the only viable source of truth.
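If region ends up being the only viable source of truth, the label key doesn't even have to be hard-coded: recent versions of k8s.io/api expose the well-known keys as constants. A small sketch (node here stands for a corev1.Node from the listing loop in the PR):

```go
import corev1 "k8s.io/api/core/v1"

// corev1.LabelTopologyRegion == "topology.kubernetes.io/region"
// corev1.LabelTopologyZone   == "topology.kubernetes.io/zone"
if tag, ok := node.Labels[corev1.LabelTopologyRegion]; ok {
	return tag, nil
}
```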

Signed-off-by: Jirka Kremser <jiri.kremser@gmail.com>
@jkremser jkremser force-pushed the cluster-tag-from-nodes branch 3 times, most recently from e9cbd54 to 3e538a7 Compare January 24, 2022 11:14
Signed-off-by: Jirka Kremser <jiri.kremser@gmail.com>
@jkremser jkremser force-pushed the cluster-tag-from-nodes branch 2 times, most recently from 2bb970b to e1ceede Compare January 24, 2022 15:55
```
@@ -139,6 +139,9 @@ type Config struct {
MetricsAddress string `env:"METRICS_ADDRESS, default=0.0.0.0:8080"`
// extDNSEnabled hidden. EdgeDNSType defines all enabled Enabled types
extDNSEnabled bool `env:"EXTDNS_ENABLED, default=false"`
// Route53HostedZoneID identifier of route53 hosted zone that's added (if not empty)
// for external-dns deployment as part of the txt-owner-id
Route53HostedZoneID string `env:"ROUTE53_HOSTED_ZONE_ID"`
```
Collaborator commented:

What about the NS1 or rfc2136 providers? IIUC the controller now expects an R53 hosted zone ID and will fail if a string matching the pattern isn't provided.

@jkremser (Member, Author) commented Jan 24, 2022:

It can be empty; the regexp check is done only if it's not an empty string. So for NS1 or rfc2136 the env var is simply not going to be set.

If it's empty, the DNS_ZONE value is used instead (same as before in _helpers.tpl).
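A sketch of that optional guard, assuming names along these lines; the exact regexp in the PR may differ, and the pattern here is just an assumption:

```go
import (
	"fmt"
	"regexp"
)

// assumed pattern for a Route53 hosted zone ID, e.g. Z0123456789ABCDEF
var r53ZoneID = regexp.MustCompile(`^Z[0-9A-Z]+$`)

func validateRoute53HostedZoneID(id string) error {
	if id == "" {
		return nil // NS1/rfc2136: env var not set, nothing to validate
	}
	if !r53ZoneID.MatchString(id) {
		return fmt.Errorf("ROUTE53_HOSTED_ZONE_ID %q does not look like a hosted zone ID", id)
	}
	return nil
}
```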

Collaborator commented:

Maybe we can use the same owner across providers? Not sure maintaining a separate case for R53 is worth it.

Comment on lines 37 to 40
```go
if err := createOrUpdateExternalDNSConfigMap(operatorConfig, mgr.GetClient()); err != nil {
	log.Err(err).Msgf("Can't create/update config map for external-dns")
	return err
}
```
@kuritka (Collaborator) commented Jan 26, 2022:

I cannot fully evaluate whether, from a maintenance point of view, it would not be better to always create the ConfigMap declaratively (somewhere in the yaml configurations) with a predefined value ("%s-%s-%s", txtOwnerPrefix, cfg.DNSZone, cfg.ClusterGeoTag), and to update the txtOwnerKey only if cfg.Route53HostedZoneID was set, rather than always creating the ConfigMap in the Go code (handling whether it was already created, etc.).

@team WDYT?

*anyway: createOrUpdateExternalDNSConfigMap is a nice code snippet, I'm storing it in my coding notes

@jkremser (Member, Author) commented Jan 26, 2022:

In the case when the geo tag is not provided upfront, we have it only after k8gb starts (it reads it from the node labels), so we can't have it in the yaml in the helm chart.

Why not create the cm before external-dns starts and update the key from k8gb? Because if the env var in the Deployment is referenced from a ConfigMap and that cm is not there yet, the deployment will wait for the cm to be created. So it solves the race/data condition where external-dns could do its job (reconcile the dnsendpoints) with the wrong ownerId. Also, updating the CM will not make the deployment that uses it restart, IIRC, so we would need to restart it anyway.
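For reference, a create-or-update of this kind can be kept small with controller-runtime's controllerutil.CreateOrUpdate. This is a sketch only: the cm name external-dns-env and the EXTERNAL_DNS_TXT_OWNER_ID key come from the PR description, the rest is illustrative:

```go
import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
)

func createOrUpdateExternalDNSConfigMap(ctx context.Context, c client.Client, ns, txtOwnerID string) error {
	cm := &corev1.ConfigMap{
		ObjectMeta: metav1.ObjectMeta{Name: "external-dns-env", Namespace: ns},
	}
	// CreateOrUpdate reads the object if it exists, applies the mutate
	// callback, and then creates or updates it. The external-dns Deployment
	// references this key via configMapKeyRef, so its pods won't come up
	// until the ConfigMap exists -- the gating described above.
	_, err := controllerutil.CreateOrUpdate(ctx, c, cm, func() error {
		if cm.Data == nil {
			cm.Data = map[string]string{}
		}
		cm.Data["EXTERNAL_DNS_TXT_OWNER_ID"] = txtOwnerID
		return nil
	})
	return err
}
```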

I am currently looking into an option to drop the geo tag from the txt-owner-id, as we discussed, and use k8gb-{{ .Values.k8gb.dnsZone }}-$uuid, where $uuid is generated during helm install (with the option to override it via helm values). The problem with this approach is that helm upgrade will regenerate the uuid (as I pointed out during the call), so it's quite cumbersome: helm install gives you some random id, and then we have to reuse it during helm upgrade... not very user (/dev-ops) friendly.

Collaborator commented:

@jkremser, ok, thanks for the explanation

@jkremser jkremser force-pushed the cluster-tag-from-nodes branch 2 times, most recently from 9cd2000 to dafd1e0 Compare January 26, 2022 11:44
…art as env var referenced from configmap

Signed-off-by: Jirka Kremser <jiri.kremser@gmail.com>
@ytsarev (Member) left a comment:

This is a breaking change; we need documentation for it.

@ytsarev (Member) commented Jan 5, 2024:

@jkremser do you plan to continue working on this PR?

@ytsarev ytsarev added this to the 1.1 milestone Jul 20, 2024
@ytsarev (Member) commented Jul 20, 2024:

Stale; we can return to it if there is demand from the community. Thanks for the implementation attempt!

@ytsarev ytsarev closed this Jul 20, 2024