# OPNET-679: Block external access to coredns #1968

emy wants to merge 3 commits into openshift:master
## Conversation
@emy: This pull request references OPNET-679, which is a valid Jira issue. Warning: the referenced Jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "4.22.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has not been approved by any approver yet; it needs approval from an approver in each of the affected files. The full list of commands accepted by this bot can be found here.
@emy: The following test failed:

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
> * As a **cluster administrator**, I want to deploy proof-of-concept (POC)
>   clusters without sitewide DNS changes.
Not clear what this means, or how it relates to the rest of the enhancement.
It was more of a use case for why it would be useful to have a toggle to enable/disable external access. If it's just confusing, I can remove it.
In some environments it can take a long time to get sitewide DNS changes made. To enable timely deployments for those prospective customers, we have allowed them to deploy with one of the "internal" DNS servers as their "public" resolver so clients have those addresses available. We do not support that for production use, but we definitely have people taking advantage of it for test clusters.
Without this use case there is no need to maintain accessibility by making this configurable. That's why I think it's important to include it here.
I guess there's also the monitoring use case where external access might be required. We could use that instead if it would be more clear.
> * As a **security engineer**, I want to ensure that DNS queries to the
>   on-prem CoreDNS instances can only originate from within the cluster
>   nodes or internal networks so that I can prevent DNS-based reconnaissance
>   and amplification attacks targeting my infrastructure.
This is basically the same story as the first one, just with a different persona wanting the same thing...
I included it twice because both personas might want the same thing. If the redundancy should be left out, I'm fine with removing one of them.
> Use node-level firewall mechanisms (nftables) to block external
> access to the CoreDNS service port (typically UDP/TCP 53) on each bare
> metal node. This could be implemented via:
I believe nftables rules won't work for ovn-kubernetes in shared gateway mode, because in that case the external interface is on the OVS bridge and so the traffic passes directly into OVS without going through nftables rules first. (This was part of the theory for using eBPF in ingress-node-firewall.)
Thanks for the clarification. Should the CoreDNS ACL approach be considered the primary solution then?
I'm confused by this. I thought shared gateway mode had to do with how egress traffic is routed. We don't care about traffic coming from pods because they already have access to this DNS server (albeit indirectly) because it's configured as the "upstream" of the DNS operator pods.
Also, we're using host firewall rules to redirect API traffic coming into the host and AFAIK it works on shared gateway mode. If it didn't I'm pretty sure we'd be buried in API disruption bugs.
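To make the node-level firewall idea concrete, a ruleset of the kind being discussed might look like the following sketch. The table/chain names and the node CIDR are placeholders, and, per the shared-gateway concern raised above, rules like these may not see every traffic path.

```
# Illustrative only; names and CIDRs are placeholders.
table inet coredns_guard {
  chain input {
    type filter hook input priority filter; policy accept;

    # Node-local queries (the resolver's primary clients) stay allowed.
    iifname "lo" accept

    # Optionally allow other cluster nodes (placeholder CIDR).
    ip saddr 192.0.2.0/24 meta l4proto { tcp, udp } th dport 53 accept

    # Everything else aimed at the DNS port is dropped.
    meta l4proto { tcp, udp } th dport 53 drop
  }
}
```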
> - Allow DNS queries from other cluster nodes (node IPs within cluster
>   CIDR)
Why do we need to allow queries from other nodes? Doesn't every node have its own resolver? (Or is it just that each resolver is only canonical for certain resources so they need to query each other sometimes?)
(This question also applies to the Risks section; depending on how important these cross-node DNS queries are, the risk of the ACLs being wrong/out-of-date is larger or smaller.)
I assumed it would be beneficial for nodes to be able to query each other, in case dedicated resources like these are a requirement somewhere I'm not aware of. If this is something we don't do anywhere, I'm fine with not allowing cross-node queries.
I'm not sure what the use cases are for the on-prem CoreDNS instance... you'd need to answer that on your side.
If you can avoid needing to allow cross-node requests, then obviously this all becomes much simpler, because you can just restrict it to localhost only.
I'm not aware of any normal use cases where one node needs to query another's host coredns instance. They should all be identical.
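If cross-node queries really aren't needed, a localhost-only restriction could alternatively be expressed in the CoreDNS configuration itself via the upstream `acl` plugin, rather than in the host firewall. This is only a sketch: whether the on-prem CoreDNS build compiles in this plugin, and what the real server block looks like, would need checking.

```
. {
    acl {
        # Permit only node-local clients; everything else gets REFUSED.
        allow net 127.0.0.0/8 ::1/128
        block
    }
    # Forward target is illustrative.
    forward . /etc/resolv.conf
}
```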
> CoreDNS on bare metal nodes (e.g., for external monitoring systems or
> specific network requirements):
> **Option 1: Disable the feature gate** (if external access is needed
Feature gates are for enabling/disabling non-GA behavior only. Once the feature is GA, the feature gate will go away. So this isn't a long-term solution.
Makes sense. I'll remove this option.
> If a cluster administrator needs to allow external access to on-prem
> CoreDNS on bare metal nodes (e.g., for external monitoring systems or
> specific network requirements):
How common will this be? If it's rare/unimportant, you could just force the admin to roll their own access method (e.g., ssh to a node and run the DNS query there).
> - MachineConfig deployment may cause node reboots, impacting workload
>   availability during initial feature enablement
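For reference, the MachineConfig shape under discussion might look roughly like this sketch. The object name, role label, file path, and contents are all placeholders, not a proposed implementation.

```yaml
# Sketch only: illustrative MachineConfig that ships an nftables
# ruleset file to worker nodes.
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-block-coredns-external
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
        - path: /etc/nftables/coredns-block.nft
          mode: 0644
          contents:
            # URL-encoded ruleset contents would go here.
            source: "data:,"
```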
> ## Alternatives
Given that we control both the clients and servers here (right?), is there some way we could avoid making nodes query other nodes' DNS via their public IPs?
In the ovn-k case, we could have the node-to-node traffic go over the overlay to a private node IP rather than over the underlying node network to a public node IP. That doesn't easily generalize to third-party network plugins, though. Is there anything else we could do?
> Use node-level firewall mechanisms (nftables) to block external
> access to the CoreDNS service port (typically UDP/TCP 53) on each bare
> metal node. This could be implemented via:
> - Performance impact and resource overhead
> - Compatibility with RHCOS and different node operating systems
> 2. How to make this configurable if administrators need to allow external
My first inclination is a field in the Infrastructure object. That's where most of the other host networking configuration lives.
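If it did land in the Infrastructure object, the API might look roughly like this. The field name and enum values are purely hypothetical, not an approved or proposed API.

```yaml
# Hypothetical sketch; coreDNSExternalTraffic is an invented field name.
apiVersion: config.openshift.io/v1
kind: Infrastructure
metadata:
  name: cluster
spec:
  platformSpec:
    type: BareMetal
    baremetal:
      coreDNSExternalTraffic: Blocked  # invented enum: Allowed | Blocked
```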
> #### Hypershift / Hosted Control Planes
> **Not Applicable**: Hypershift deployments typically run on cloud
I am hearing that baremetal hypershift is becoming more of a thing. That could throw a wrench in any MCO-based implementation since MCO doesn't run in hosted clusters. All we would be able to do is deploy a pod via ignition that manages the firewall rules. We couldn't rely on MCO to handle rolling out updates if changes are made on day 2.
We might be able to get around that by reading API fields ourselves, but it is something we'll need to consider.
> NodeFirewallConfiguration) is deployed that configures firewall rules
> on bare metal nodes to block external access to on-prem CoreDNS.
> 3. The MCO (or Node Firewall Operator) deploys the firewall configuration
Implementation detail: MCO is not going to directly manage these firewall rules. What I would propose is adding the ability to manage firewall rules to the coredns-monitor container. We already have similar functionality in haproxy-monitor that we can use as an example.
This enhancement proposes to block external access to CoreDNS instances running on bare metal OpenShift cluster nodes.