Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NE-568] Add option to manage DNS to ingresscontrollers #1114

Merged
merged 6 commits into from Jun 29, 2022

Conversation

thejasn
Copy link
Contributor

@thejasn thejasn commented May 10, 2022

This enhancement provides the customer the ability to completely
disable DNS management on the Ingress Controller so that DNS can be
managed externally by them.

Signed-off-by: thejasn thn@redhat.com

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 10, 2022
@openshift-ci openshift-ci bot requested review from dgoodwin and jerpeter1 May 10, 2022 07:54
@thejasn
Copy link
Contributor Author

thejasn commented May 10, 2022

/assign @sherine-k @alebedev87

This enhancement provides the customer the ability to completely
disable DNS management on the Ingress Controller so that DNS can be
managed externally by them.
@thejasn thejasn force-pushed the feature/configurable-dns-lb branch from 649a339 to c14c290 Compare May 10, 2022 15:22
Copy link
Contributor

@sherine-k sherine-k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @thejasn,
as promised, a few typo fixes suggested.
along with 1 question...
Great job

@thejasn thejasn force-pushed the feature/configurable-dns-lb branch 2 times, most recently from 6c049ac to d5e5918 Compare May 11, 2022 09:58
@thejasn
Copy link
Contributor Author

thejasn commented May 11, 2022

/label tide/merge-method-squash

@openshift-ci openshift-ci bot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label May 11, 2022
@thejasn thejasn force-pushed the feature/configurable-dns-lb branch from d5e5918 to b9413d0 Compare May 11, 2022 10:04
@thejasn thejasn changed the title WIP: Add option to manage DNS to ingresscontrollers Add option to manage DNS to ingresscontrollers May 11, 2022
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 11, 2022
thejasn added 2 commits May 16, 2022 13:26
Updated downgrade strategy

Signed-off-by: thejasn <thn@redhat.com>
Signed-off-by: thejasn <thn@redhat.com>
@thejasn thejasn force-pushed the feature/configurable-dns-lb branch from bd4c459 to 42e2736 Compare May 16, 2022 07:56
@alebedev87
Copy link
Contributor

/retitle [NE-568] Add option to manage DNS to ingresscontrollers

@openshift-ci openshift-ci bot changed the title Add option to manage DNS to ingresscontrollers [NE-568] Add option to manage DNS to ingresscontrollers May 16, 2022
Copy link
Contributor

@brandisher brandisher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thejasn overall I understand the intent but I did have a few questions/comments and some feedback to share. For any of my API related comments, defer to the API Team for final say. I mostly wanted to share my recent API experience to help move the needle.

// +kubebuilder:validation:Optional
// +kubebuilder:validation:Enum=Managed;Unmanaged
// +optional
DNS LoadBalancerDNSManagementPolicy `json:"dns"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect this to be DNSManagementPolicy so it's more clear at a glance what it's affecting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting sharing the type variable between the fields? I wanted to avoid that, hence the different names. Sharing the type doesn't seem correct, since it would couple the two objects LoadBalancerStrategy and DNSRecord when any new types are added (eg: externalDNS). WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's a great question! I wasn't as clear as I intended to be; this is the change I'm suggesting for clarity.

Suggested change
DNS LoadBalancerDNSManagementPolicy `json:"dns"`
DNSManagementPolicy LoadBalancerDNSManagementPolicy `json:"dns"`

That way you're maintaining the distinction in type between LoadBalancerDNSManagementPolicy and DNSManagementPolicy but when a cluster admin is reading the API/config it's clear that they're changing the loadBalancerStrategy.dnsManagementPolicy which is, in my opinion, more clear than loadBalancerStrategy.dns. Also just a reminder that ultimately the API team should make the call on the shape of the API but I suspect they'd err on the side of making configuration options more obvious.

// +kubebuilder:validation:Optional
// +kubebuilder:validation:Enum=Managed;Unmanaged
// +optional
ManagementPolicy DNSManagementPolicy `json:"managementPolicy"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly here as above, I'd expect this to be DNSManagementPolicy. Although with this one it's sort of implicit what the management policy is informing so this is probably even more nit-like than above.


- Provide a day-0 solution for cluster installations involving external DNS
management by the customer.
- DNS migration from managed to unmanaged.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read this as "DNS management policy migration from managed to unmanaged". If that's the correct way to read this then I think it should be made more explicit as you do allow for migration from managed to unmanaged when the IC domain is the same as the cluster domain in the Modifying/Updating an ingresscontroller section.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, this was meant to state traffic management is not going to be considered during moving from Managed to Unmanaged. I've updated the point, to make it more clear.


### API Extensions

The ingresscontroller CRD is extended by adding an optional field `DNS`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The ingresscontroller CRD is extended by adding an optional field `DNS`
The ingresscontroller CRD is extended by adding an optional field `DNSManagementPolicy`

@thejasn thejasn force-pushed the feature/configurable-dns-lb branch from e176ff1 to c2ed4c2 Compare May 24, 2022 12:10
@thejasn thejasn force-pushed the feature/configurable-dns-lb branch from c2ed4c2 to b87ec6d Compare June 1, 2022 14:30
Co-authored-by: Chad Scribner <cscribne@redhat.com>
@thejasn thejasn force-pushed the feature/configurable-dns-lb branch from b87ec6d to f4535ec Compare June 1, 2022 15:12
will be updated,
- `DNSReady` condition will be set to true and reason updated to
UnmanagedDNS.
- Post successfully updating the ingresscontroller, the associated *DNSRecord*
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added deletion workflow.

by the operator, but it is to be noted that if the ingresscontroller is updated from
`Unmanaged` to `Managed` any changes done will be lost during re-sync.

If the ingresscontroller is deleted, and the associated *DNSRecord* still exists
Copy link
Contributor Author

@thejasn thejasn Jun 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarified the implementation details on how deletion of ingresscontrollers will be done.

Copy link
Contributor

@alebedev87 alebedev87 Jun 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that there is probably no easy way out. But I think it still a bit contradictory. In this comment we say that DNSRecord CR won't be created/updated/deleted, however the deletion will still happen: deletionTimestamp will be added. Also the same comment says that it's the full responsibility of the cluster admin to deal with DNSRecord CRs however the cluster admin won't do oc delete dnsrecord it has to be oc edit dnsrecord (as not removed finalization label will block the deletion) which is kinda not intuitive.

I think I'm rather for one of the following:

  • "fully orphaned" unmanaged DNSRecord CR: cluster admin has to do aws route53 delete resource record + oc delete dnsrecord. The orphaned DNSRecord CRs can be found with the command you suggested.
  • "cloud provider orphaned" unmanaged DNSRecord CR: cluster admin has to do aws route53 delete resource record only. DNSRecord will be cleared from the finalization label without doing the actual finalization.

@thejasn, @brandisher: WDYT guys?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on our discussion, decided to treat Unmanaged as the DNS record on the cloud provider will be unmanaged by the operator. (going ahead with latter)

Comment on lines 87 to 93
- `Managed`: It is the default state and behaves the same as the existing
implementation. The ingresscontroller manages create/update/delete actions
on the *DNSRecord* CR and the actual DNS records on the DNS provider.
- `Unmanaged`: In this state, the ingresscontroller will not create/update/delete
the *DNSRecord* associated with it, nor does it create/update/delete
the actual DNS record on the cloud provider. This responsibility entirely falls
on the cluster admin.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clearly stated what the 2 new states denote and their scope.

Signed-off-by: thejasn <thn@redhat.com>
@alebedev87
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 2, 2022
### Goals

- Provide the ability to opt out of DNS management by the `cluster-ingress-operator`.
- Recover from degraded state of the `cluster-ingress-operator` during upgrades
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Recover from degraded state of the `cluster-ingress-operator` during upgrades
- Enable migration

Comment on lines 81 to 82
Currently, there is partial support provided by the `installer` (in UPI installations)
to disable the `cluster-ingress-operator` from managing the DNS cluster-wide as documented
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't say "partial support" or "provided by the installer"; it's an existing, course-grained option in the cluster DNS config, which the cluster administrator can configure at install time (by customizing installer manifests) or after installation (using oc patch dnses.config.openshift.io/cluster --type=merge --patch='{"spec":{"privateZone":null,"publicZone":null}}). The goal, as your next paragraph says, is to add a more fine-grained option.

*DNSRecord* CR and the DNS record on the cloud provider.
- Test the following update paths on a custom ingresscontroller
- Updating `.loadbalancer.dnsManagementPolicy` from `Managed` -> `Unmanaged` -> `Managed` and
to ensure no conflicts during creation/recreation of DNS record on the cloud provider.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why "recreation"? Is the test going to delete the DNS record after setting dnsManagementPolicy to "Unmanaged"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is what I had in mind. Thinking of testing both the scenarios to verify there were no collisions during recreation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that makes sense; it just isn't clear from the test plan as written.

Copy link
Contributor Author

@thejasn thejasn Jun 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test plan might need to be revisited. I've mostly implemented this feature and turns out we can't have updates from Unmanaged to Managed since domain cannot be updated. Essentially, to update an ingresscontroller from Unmanaged to Managed becomes an DELETE and CREATE (replace) due to the limitation on domain.

Although it is possible to update from Managed to Unmanaged users would be forced to re-use the domain leading to again DELETE and CREATE of the ingresscontroller so that a new unmanaged domain can be given.

So I'm thinking it is better to make dnsManagementPolicy also not updatable similar to domain (figured this would be better than making domain updatable`), then the test plan would be straight forward. Just test with an Unmanaged ingresscontroller.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand. Why would it matter that domain cannot be updated? The operator should still be able to do an upsert when the management state is changed from Unmanaged to Managed, the same as it does an upsert if a managed DNSRecord's targets change. On some platforms, an upsert may actually be implemented as discrete delete and create steps within a transaction, but that's an implementation detail that is abstracted behind the dns.Provider interface.

Comment on lines 279 to 280
- Updating `Unmanaged` -> `Managed` and to ensure creation of new *DNSRecord*
and correct conditions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the idea here to create the IngressController with "Unmanaged", update it to "Managed", and verify that the operator creates the DNS record upon the update to the IngressController? Here, "DNSRecord" means the actual DNS record and not the CR, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, updated the text.

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 9, 2022
@thejasn thejasn force-pushed the feature/configurable-dns-lb branch from 9664dba to 4a584ad Compare June 9, 2022 12:13

- Provide a day-0 solution for cluster installations involving external DNS
management by the customer.
- Traffic management during DNS management policy migration from managed to unmanaged.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ingresscontroller will still handle any traffic that it receives, so I'm not sure exactly what this means. Does this refer to DNS traffic, or to ingress (HTTP/TLS) traffic?

Comment on lines +102 to +104
__Note__: This feature is only supported on new or non-default ingresscontrollers.
The default ingresscontroller can be modified/updated at the discretion of
the cluster admin.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we do something (such as trigger an alert or set Upgradeable=False) to convey that the cluster is no longer in a supported state?

Comment on lines 116 to 118
- The DNSRecord will also be updated as part of the same reconcile,
`.spec.dnsManagementPolicy` will be set to `Unmanaged` and the following conditions
will be updated,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What conditions?

*DNSRecord* CR and the DNS record on the cloud provider.
- Test the following update paths on a custom ingresscontroller
- Updating `.loadbalancer.dnsManagementPolicy` from `Managed` -> `Unmanaged` -> `Managed` and
to ensure no conflicts during creation/recreation of DNS record on the cloud provider.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that makes sense; it just isn't clear from the test plan as written.

Co-authored-by: Miciah Dashiel Butler Masters <mmasters@redhat.com>
Signed-off-by: thejasn <thn@redhat.com>
@thejasn thejasn force-pushed the feature/configurable-dns-lb branch from 4a1b4e7 to 0df2f16 Compare June 29, 2022 12:44
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 29, 2022

@thejasn: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@Miciah
Copy link
Contributor

Miciah commented Jun 29, 2022

Thanks!
/approve
/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 29, 2022
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 29, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Miciah

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 29, 2022
@openshift-ci openshift-ci bot merged commit 1cfd7cf into openshift:master Jun 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants