Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CFE-748: Improve CoreDNS Integration with EgressFirewall #1335

Conversation

arkadeepsen
Copy link
Contributor

This enhancement proposes to improve the integration between CoreDNS and EgressFirewall by adding new plugin to CoreDNS and a new CR for tracking the IPs and TTLs of DNS names used in EgressFirewall rules.

…and EgressFirewall by adding new plugin to CoreDNS and a new CR for tracking the IPs and TTLs of DNS names used in EgressFirewall rules
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Feb 6, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 6, 2023
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Feb 6, 2023
@openshift-ci-robot
Copy link

openshift-ci-robot commented Feb 6, 2023

@arkadeepsen: This pull request references CFE-748 which is a valid jira issue.

In response to this:

This enhancement proposes to improve the integration between CoreDNS and EgressFirewall by adding new plugin to CoreDNS and a new CR for tracking the IPs and TTLs of DNS names used in EgressFirewall rules.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@arkadeepsen arkadeepsen marked this pull request as ready for review February 9, 2023 18:37
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 9, 2023
enhancements/dns/coredns-egressfirewall-integration.md Outdated Show resolved Hide resolved
enhancements/dns/coredns-egressfirewall-integration.md Outdated Show resolved Hide resolved
enhancements/dns/coredns-egressfirewall-integration.md Outdated Show resolved Hide resolved
enhancements/dns/coredns-egressfirewall-integration.md Outdated Show resolved Hide resolved
enhancements/dns/coredns-egressfirewall-integration.md Outdated Show resolved Hide resolved
}
````

The following ``DNSName`` CRD will be added to the ``k8s.ovn.org`` api-group.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we are using the CoreDNS plugin with ovn-kubernetes, it's not logically bound to ovn-kubernetes. I think this should be something like dns.openshift.io

enhancements/dns/coredns-egressfirewall-integration.md Outdated Show resolved Hide resolved
enhancements/dns/coredns-egressfirewall-integration.md Outdated Show resolved Hide resolved
@arkadeepsen
Copy link
Contributor Author

cc @Miciah @JoelSpeed @TrilokGeer

@candita
Copy link
Contributor

candita commented Mar 15, 2023

/assign @Miciah
/assign

}
````

The following ``DNSName`` CRD will be added to the ``k8s.ovn.org`` api-group.
Copy link
Contributor

@candita candita Mar 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, since this will only apply to Egress Firewall DNSNames, the objects should have names that reflect that, like EgressFirewallDNSName or similar. No one should be able to confuse DNSNames with traditional DNS names.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the group name makes it clear that this is for OVN or for EgressFirewall, then the CRD name doesn't necessarily need to indicate it. So, for example, if the group name is "k8s.ovn.org" or "egressfirewall.operators.openshift.io" or whatever, then the qualified resource name ends up as "dnsnames.k8s.ovn.org" or "dnsnames.egressfirewall.operators.openshift.io", which makes it clear that the resource is related to the OVN-CoreDNS integration for EgressFirewall.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also wanted to discuss the naming. I don't think egress firewall or ovn is necessary in the name, because the functionality of resolving DNS names to ips is quite generic. As an example, even in ovn we can use this functionality for features different from egressfirewall, e.g. network policy.
So maybe we can figure out the name that would explain what this object is used, e.g. DNSNameResolver?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the EP with the following changes that were discussed:

  • The CoreDNS plugin will not wait for OVN-K masters to update the underlying ACL rules
  • OVN-K cluster manager will create and delete the new CR.
  • A new controller will watch the CR and perform the re-resolution of the DNS names added to the .status of the CRs.
  • The api definition of the new CRD is updated.

@Miciah @danwinship @npinaeva @JoelSpeed PTAL and let me know if any other changes are required.

Some of the points which are still kind of unresolved:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per today's discussions, the name of the CRD is decided to be DNSNameResolver and it'll belong to network.openshift.io apigroup. The name of the new plugin will be ocp_dnsnameresolver.

* Additionally, the exact matching of the next lookup time may never be successful. If the existing next lookup time of a DNS name lies
within a threshold value (say 1 second) of the new lookup time, specifically lies in the range defined by
`[new lookup time - threshold duration, new lookup time + threshold duration]`, then they are considered as same.
* The `Degraded` condition of the DNS name will be set to `false` along with the success reason and message. The `resolutionFailures` field for
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting a Degraded condition on a DNS name seems like overkill, and beyond the scope and nature of this EP.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was suggested by @JoelSpeed here: #1335 (comment).

A DNS name will be removed from the status of the CR if it fails for 5 times. The Degraded condition and the related reason and message will provide clarity why the resolution failed. This will also provide information when we would need to debug any future issue related to EgressFirewall.

The details of the `DNSNameResolver` CRD can be found in the [Proposal](#proposal) section.


### Implementation Details/Notes/Constraints [optional]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
### Implementation Details/Notes/Constraints [optional]
### Implementation Details/Notes/Constraints

Comment on lines +564 to +567
Cluster DNS Operator will deploy CoreDNS with the `ocp_dnsnameresolver` plugin enabled by adding it to the corefile. For the EgressFirewall rules to
apply consistently, even for DNS names that are resolved by custom upstream nameservers, it will be added to all server blocks in the corefile. As the
plugin will watch and update the `DNSNameResolver` CRs in the `network.openshift.io` api-group, proper RBAC permissions will be needed
to be added to the `ClusterRole` for CoreDNS.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not design a way for the EgressFirewall to configure this if it is activated? If a customer doesn't even use the EgressFirewall, why would we need to make any changes in the plugin chain?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a customer is not using EgressFirewall, the plugin will mostly be no-op apart from checking the DNS names in the maps. As the maps will be empty the plugin will immediately return.

If we want to design it in such a way so that the plugin chain is changed only when EgressFirewall is used, it will end up restarting all the CoreDNS pods to enable the new plugin. In such a scenario, a CoreDNS pod may get a request from a pod before being restarted and this will result in missing out adding the IP addresses to the status of the corresponding DNSNameResolver CR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, the feature will be behind a feature gate. If the feature gate is enabled, it means that the customer intends to use the feature and all the related components will be enabled including the new plugin.

Comment on lines +569 to +571
The new `DNSNameResolver` controller will be added to the Cluster DNS Operator. The controller will watch the `DNSNameResolver` CRs,
and will send DNS lookup requests for the `spec.name` field. It will also re-resolve the `status.resolvedNames[*].dnsName` fields based on the
corresponding next lookup time (TTL + last lookup time).
Copy link
Contributor

@candita candita Sep 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to enable/install this controller only when EgressFirewall is in use?

Will the status of the DNSNameResolver impact the status of the Cluster DNS Operator? Do we have to mark c-d-o as degraded if the DNSNameResolver controller is degraded?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to enable/install this controller only when EgressFirewall is in use?

The feature will be behind a feature gate. If the feature gate is enabled, it means that the customer intends to use the feature and all the related components will be enabled including the new controller.

Will the status of the DNSNameResolver impact the status of the Cluster DNS Operator? Do we have to mark c-d-o as degraded if the DNSNameResolver controller is degraded?

The status of Cluster DNS Operator won't be impacted by the DNSNameResolver controller. The controller will mainly do lookups for the DNS names. It will also remove IP addresses whos TTL have expired. These operations won't impact the CDO status.

Comment on lines +635 to +636
* Whenever there's a change in the IP addresses or the next lookup time (TTL + last lookup time) for a DNS name, the additional step of updating the
related `DNSNameResolver` CRs will be executed. This will add some delay to the DNS lookup process. However, this will only
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this feature is required because the EgressFirewall wasn't designed to keep up with the changes in DNS information, I think it is misleading to imply that this doesn't happen very often.

We frequently get people asking how to troubleshoot why a DNS lookup takes so long, so I do think we need to find out precisely how much this could add for a wildcard domain with a popular suffix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this feature is required because the EgressFirewall wasn't designed to keep up with the changes in DNS information, I think it is misleading to imply that this doesn't happen very often.

Can you please let me know which part seems to be misleading. I will change the wording to make it correct.

The last line is added to imply that the updates will only take place when there's a change in the DNS information and only then will we see a delay in the CoreDNS response.

We frequently get people asking how to troubleshoot why a DNS lookup takes so long, so I do think we need to find out precisely how much this could add for a wildcard domain with a popular suffix.

The additional step which may introduce some delay will be the update of the corresponding DNSNameResolver CRs. For a regular DNS name, a maximum of 2 CRs (a regular and a wildcard) will be updated. For a wildcard DNS name, only the corresponding wildcard CR will be updated. Without implementing and running some tests it will be difficult to predict the precise delay that would be added. As this feature is planned as a TechPreview feature, we can use the customer feedback to understand the impact visible to customers as well.


## Design Details

### Open Questions [optional]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to find out how this impacts DNS lookups, and generally how this works, when there are multiple Egress Firewalls, possibly with overlapping wildcard DNS names.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I add a note in this section regarding the impact to DNS name lookups?

and the `ocp_dnsnameresolver` plugin will also start updating the `.status` fields of the `DNSNameResolver` CRs. The scenarios arising
out of the order of the update of the various components are discussed in [Version Skew Strategy](#version-skew-strategy)

Downgrade expectations:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you update the regex for EgressFirewall DNS names, you may have some issues with downgrade. Need to cover both regexes.

#### Removing a deprecated feature


### Upgrade / Downgrade Strategy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding/removing the plugin from CoreDNS will be interesting - need tests to make sure it doesn't greatly impact the upgrade/downgrade times.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will cover the scenario in e2e tests.

}
````

### Workflow Description
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see the the steps from the point of view of the DNS lookup - which steps are taken when there is an Egress FW, when there isn't, when there is a wildcard, when there isn't, when a DNS name is in multiple Egress FW, and when there is an address not even in Egress FW, what's the flow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to explain each scenario here as well.

when there is an Egress FW

For each unique DNS name used in all the Egress Firewalls, a corresponding DNSNameResolver CR will be created by the OVN-K cluster manager. The CoreDNS plugin will watch for the DNSNameResolver CRs and store regular and wildcard DNS names in 2 separate maps.

When a DNS lookup for a regular DNS name is intercepted and the DNS name matches either a regular or wildcard DNS name or both, only then the status of the corresponding CR is updated. For wildcard DNS name lookup it will be only checked with the wildcard DNS name map and if there's a match only then the corresponding wildcard DNS name CR will be updated.

when there isn't

If there are no EgressFirewall objects then there won't be any DNSNameResolver CRs created by the OVN-K cluster manager. Thus the 2 maps will be empty. Whenever a DNS lookup is intercepted as the maps are empty the plugin immediately sends the response to the next plugin in the chain.

when there is a wildcard, when there isn't

Already mentioned above. Copying it here again for completion.

When a DNS lookup for a regular DNS name is intercepted and the DNS name matches either a regular or wildcard DNS name or both, only then the status of the corresponding CR is updated. For wildcard DNS name lookup it will be only checked with the wildcard DNS name map and if there's a match only then the corresponding wildcard DNS name CR will be updated.

when a DNS name is in multiple Egress FW

As already mentioned, only a single DNSNameResolver CR will be created for a unique DNS name used in the EgressFirewall objects, even if the same DNS name is used in multiple EgressFirewall objects. The DNS lookup process will be the same as explained above.

when there is an address not even in Egress FW

I actually didn't get the question. Do you mean when there are no DNS names mentioned in any of the EgressFirewall objects? In that case, the OVN-K cluster manager won't create any DNSNameResolver CR. The DNS lookup process will be similar to the scenario where there are no EgressFirewall objects mentioned above.

@candita
Copy link
Contributor

candita commented Sep 1, 2023

@arkadeepsen this is a big effort, so it has alot of pieces, and alot of comments. Please address my latest comments and I will lgtm as soon as I can.

…uest to 5 random CoreDNS pods only if DNS name has more than one associtated ip. Change grace period from 1s to 5s.
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Sep 6, 2023
@arkadeepsen
Copy link
Contributor Author

@arkadeepsen this is a big effort, so it has alot of pieces, and alot of comments. Please address my latest comments and I will lgtm as soon as I can.

@candita Thanks for your review comments. I have responded to your review comments and also made the appropriate changes in the EP. PTAL.

@knobunc
Copy link
Contributor

knobunc commented Sep 11, 2023

/approve
/lgtm

This enhancement looks good. Thank you! I do want to make sure that we continue to re-resolve the domains as the ttl gets short, but I assume since we already do that we will keep that behavior.

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 11, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 11, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: knobunc

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 11, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 11, 2023

@arkadeepsen: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot merged commit 6cebc13 into openshift:master Sep 11, 2023
2 checks passed
@arkadeepsen
Copy link
Contributor Author

This enhancement looks good. Thank you! I do want to make sure that we continue to re-resolve the domains as the ttl gets short, but I assume since we already do that we will keep that behavior.

Thanks for approving the EP.

The behavior of re-resolution will continue to be there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants