Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CFE-748: Improve CoreDNS Integration with EgressFirewall #1335
CFE-748: Improve CoreDNS Integration with EgressFirewall #1335
Changes from 6 commits
3626860
86cd179
78520f5
35292b2
08681c3
bb769bf
f3809fc
48bbc96
ad5499d
88cd4dd
fba7fa9
6854ffc
6afefe1
132057e
72b8056
84d572b
02dc287
b7dc6ae
494a4b6
a73edd7
a99f116
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see the the steps from the point of view of the DNS lookup - which steps are taken when there is an Egress FW, when there isn't, when there is a wildcard, when there isn't, when a DNS name is in multiple Egress FW, and when there is an address not even in Egress FW, what's the flow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DNS lookup details is added in the workflow of the plugin: https://github.com/openshift/enhancements/pull/1335/files#diff-b27f04c5e2673c47f922695835e746ecb2205a9f7800def32cc15c25af521758R333-R500
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try to explain each scenario here as well.
For each unique DNS name used in all the Egress Firewalls, a corresponding DNSNameResolver CR will be created by the OVN-K cluster manager. The CoreDNS plugin will watch for the DNSNameResolver CRs and store regular and wildcard DNS names in 2 separate maps.
When a DNS lookup for a regular DNS name is intercepted and the DNS name matches either a regular or wildcard DNS name or both, only then the status of the corresponding CR is updated. For wildcard DNS name lookup it will be only checked with the wildcard DNS name map and if there's a match only then the corresponding wildcard DNS name CR will be updated.
If there are no EgressFirewall objects then there won't be any DNSNameResolver CRs created by the OVN-K cluster manager. Thus the 2 maps will be empty. Whenever a DNS lookup is intercepted as the maps are empty the plugin immediately sends the response to the next plugin in the chain.
Already mentioned above. Copying it here again for completion.
When a DNS lookup for a regular DNS name is intercepted and the DNS name matches either a regular or wildcard DNS name or both, only then the status of the corresponding CR is updated. For wildcard DNS name lookup it will be only checked with the wildcard DNS name map and if there's a match only then the corresponding wildcard DNS name CR will be updated.
As already mentioned, only a single DNSNameResolver CR will be created for a unique DNS name used in the EgressFirewall objects, even if the same DNS name is used in multiple EgressFirewall objects. The DNS lookup process will be the same as explained above.
I actually didn't get the question. Do you mean when there are no DNS names mentioned in any of the EgressFirewall objects? In that case, the OVN-K cluster manager won't create any DNSNameResolver CR. The DNS lookup process will be similar to the scenario where there are no EgressFirewall objects mentioned above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really like the fact that it replaces the name with something else that is also a valid DNS name... yes, you have
isReguar
/isWildcard
to handle the case where you need rules for both*.example.com
andwildcard.example.com
, but... it's all just a mess. (The whole "deletes unless this is true and that is false in which case it doesn't delete etc etc etc" thing later.)FTR, since the plugin and ovn-k both need to monitor all DNSName objects anyway, the names don't really matter and you could just store the DNS name in a different field. Though OTOH, if the object names weren't the DNS name, it's not obvious what they should be...
One possibility would be that, since the
status
already has to contain an array to be able to deal with wildcards, you could just let thespec
contain an array of names too. Then instead of creating oneDNSName
per DNS name, you'd create one per EgressFirewall or whatever, containing all of the DNS names used by that firewall rule, and thestatus
would be updated any time any of the DNS names in thespec
changed.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Name field is set to the DNS name so that there's one to one mapping with the
DNSName
CR and a DNS name and the CR can be fetched by the CoreDNS plugin just by using the DNS name whenever the lookup happens. I am not sure how this can be achieved without directly using the DNS name. The problem comes with wildcard DNS name.If we have one
DNSName
for each EgressFirewall and the.status
ofDNSName
containing all the details of the IPs, would it not be better to just add the information about the IPs in the.status
of the EgressFirewall itself and not create a new CRD?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but the plugin needs to be keeping a cache of all existing DNSName objects anyway, so it could just keep a "name → DNSName object" map in addition to that
yes, maybe...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Though that would imply that the plugin was EgressFirewall-specific and couldn't be used for other things.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have already asked that wildcard dns not be a part of this e.p. It is not in the RFE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having a single
DNSName
per cluster will probably have some impact on the performance when looking up DNS names in the status of the object.Creating a
DNSName
for eachEgressFirewall
will introduce duplication of information as same DNS name can be used in multipleEgressFirewalls
. So, the IPs associated with the DNS name have to be updated in all the correspondingDNSName
objects.I am more inclined towards having a
DNSName
object for each DNS name (regular and wildcard). The name of theDNSName
can be a hash value and the spec will contain the corresponding DNS name. Thus theIsRegular
andIsWildcard
fields also will not be needed anymore. Going through the status of thisDNSName
object will have lesser performance impact.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, if there are 2 EgressFirewall DNS rules, one for
www.redhat.com
and another for*.openshift.com
, then there will be 2DNSName
CRs for each one of them. TheIsRegular
andIsWildcard
fields won't be required. Whenever, a lookup happens forwww.redhat.com
, CoreDNS will only fetch the CR corresponding to it and if required update the status. Similarly, if a lookup happens for any subdomain under*.openshift.com
, then will only fetch the CR corresponding to it and if required update the status.NOTE: The status of a regular DNS name will only contain one item in the
resolvedNames
field corresponding to the regular DNS name, whereas for a wildcard DNS name theresolvedNames
will contain one item for the wildcard DNS name and additional items any other subdomains under it, if any.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you're talking about performance impacts here, I'm going to assume that you'll be using an informer to watch the objects in some operator? Having many objects subscribed will result in far more events hitting the informer than a single shared object, I think in terms of informers a single object would be more efficient.
I think as well if you're watching a single object, any time it updates, you can calculate the diff and using an atomic map update keep a cache of the data, that could then be read from in another thread to action the changes. That said, I guess you can do that with multiple objects too 🤔
Is there some other performance impact you're thinking of? My suspicion is that your data access is the biggest performance hit since that will involve an API request, so, optimising that to a single request/watch sounds better to me no?
Is CoreDNS fetching the CR directly? Will it be using an informer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think only the create and delete events will be the additional ones. However, the number of update events will be the same as update will happen whenever there's a change in the IPs for a DNS name, whether we use a single object or one object each for each DNS name. The additional impact will be proportional to the number of unique DNS names used in the EgressFirewall rules.
The performance impact that I am thinking about is the time spent to traverse through the status of the CR. The
resolvedNames
field in the status is a slice and if it contains all the DNS names used in EgressFirewall rules, including those which match wildcard DNS names, then every time OVN-K will have to go through the entire slice to get the latest update in the IP addresses. It'll also have spend time to check whether IPs for each DNS name changed or not and update only for the ones that changed. If we have a CR for each DNS name, then OVN-K will immediately know, during an update event, the IPs for which DNS name changed and take appropriate actions. Same for the CoreDNS plugin when it has to update the status of the single object.Another issue is more from the perspective of parallel updates to the single object by CoreDNS and OVN-K. It was suggested here (#1335 (comment)) to have a feedback from OVN-K to CoreDNS. If we have a single object, parallel updates may cause some race condition problems with out of order updates. Additionally, CoreDNS pods run as DaemonSets and each pod runs independent of the other. Thus, more than one pod may receive DNS lookup requests for different DNS names simultaneously. When updating the status of the single object chances are there that one update may be done over the other one and some updates may get lost because of this.
I was planning to have both an informer for the create and delete events in the CoreDNS plugin. But when updating the object, it would be first fetched then the status would be checked with the current IPs and lookup time information. If it requires an update then update the status. OVN-K will be creating and deleting the CRs, so it will be using an informer for the update event to get the changes in the IP information of a DNS name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense to me. Keep in mind that there is potential fanout: An EgressFirewall can have rules with many DNS names or wildcards, and a wildcard can result in an EgressFirewallDNSName with many values under
status.resolvedNames
.Agreed. We will have many CoreDNS pods updating the EgressFirewallDNSName objects to update
status.resolvedNames
(by default, a cluster has 1 DNS pod per node), and we don't want these pods to be constantly in conflict with each other, which would cause frequent retries and unnecessarily overhead for both the CoreDNS plugin and kube-apiserver.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we are using the CoreDNS plugin with ovn-kubernetes, it's not logically bound to ovn-kubernetes. I think this should be something like
dns.openshift.io
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, since this will only apply to Egress Firewall DNSNames, the objects should have names that reflect that, like EgressFirewallDNSName or similar. No one should be able to confuse DNSNames with traditional DNS names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the group name makes it clear that this is for OVN or for EgressFirewall, then the CRD name doesn't necessarily need to indicate it. So, for example, if the group name is "k8s.ovn.org" or "egressfirewall.operators.openshift.io" or whatever, then the qualified resource name ends up as "dnsnames.k8s.ovn.org" or "dnsnames.egressfirewall.operators.openshift.io", which makes it clear that the resource is related to the OVN-CoreDNS integration for EgressFirewall.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also wanted to discuss the naming. I don't think egress firewall or ovn is necessary in the name, because the functionality of resolving DNS names to ips is quite generic. As an example, even in ovn we can use this functionality for features different from egressfirewall, e.g. network policy.
So maybe we can figure out the name that would explain what this object is used, e.g. DNSNameResolver?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated the EP with the following changes that were discussed:
.status
of the CRs.@Miciah @danwinship @npinaeva @JoelSpeed PTAL and let me know if any other changes are required.
Some of the points which are still kind of unresolved:
As of now I have used
EgressFirewallDNSName
.As of now I have used
dns.openshift.io
as per the suggestion here: CFE-748: Improve CoreDNS Integration with EgressFirewall #1335 (comment)As of now it is a namespace-scoped CR. There was a previous suggestion to keep it namespace-scoped itself: CFE-748: Improve CoreDNS Integration with EgressFirewall #1335 (comment)
As of now I have used
egressfirewall
. However, if we are changing the CRD name and apigroup then we probably have to change the name of the plugin as well.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per today's discussions, the name of the CRD is decided to be
DNSNameResolver
and it'll belong tonetwork.openshift.io
apigroup. The name of the new plugin will beocp_dnsnameresolver
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the TTL field use type
int32
(orint64
, as we use in the DNSRecord CR) instead ofstring
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Wil change the type to
int64
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please explain in a little more detail how the DNSName CR lookup is done—what kind of informer, index, or data structure will the plugin use?
How is the matching done, especially with wildcards?
How do you mitigate performance impact when there are no matching CRs?
Is the update done asynchronously? What happens if there is a conflict during an update?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am planning to use SharedIndexInformer. As for data structure, I am planning to use maps to store DNSName CR information. Do you have any suggestions?
I have mentioned about this in this #1335 (comment). I'll add the details in the EP.
I have mentioned about this as well in this #1335 (comment). I'll also add these details in the EP.
Once the update to the DNSName CR is completed, then the response to the DNS lookup will be sent. If the update to the DNSName CR fails then the DNS lookup should also fail, otherwise this might cause discrepancy in the information between the pods and OVN-K. We can add retries to the DNSName CR update process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to think of a suitable data structure that could accommodate both regular and wildcard names/lookups, but using two maps as you suggested in #1335 (comment) is probably better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we can add a note here about extra delay for requests matching wildcarded DNS names, that don't have a wildcarded DNS record?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to find out how this impacts DNS lookups, and generally how this works, when there are multiple Egress Firewalls, possibly with overlapping wildcard DNS names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I add a note in this section regarding the impact to DNS name lookups?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding/removing the plugin from CoreDNS will be interesting - need tests to make sure it doesn't greatly impact the upgrade/downgrade times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will cover the scenario in e2e tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you update the regex for EgressFirewall DNS names, you may have some issues with downgrade. Need to cover both regexes.