
service.spec.externalIPs is unnecessarily limited (enhancement request) #124636

Open
jan-g opened this issue Apr 30, 2024 · 11 comments
Assignees
Labels
kind/feature Categorizes issue or PR as related to a new feature. kind/support Categorizes issue or PR as a support question. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/network Categorizes an issue or PR as relevant to SIG Network.

Comments

@jan-g

jan-g commented Apr 30, 2024

What happened?

I'm looking at various ways to expose a service directly from a set of hosts in a k8s cluster. [Yes, I know about the caveats here, but this is a service that doesn't play like a traditional http/rest endpoint.]

I can do this with a svc that has a port setting and a list of spec.externalIPs - however, I need to maintain that list of externalIPs manually (or write a small operator that does so).

Since these externalIPs tend to turn into individual iptables rules (or the moral equivalent, depending on your networking stack), it's surprising that I can't give a CIDR here. That would (a) cut down on the number of iptables rules that need evaluating, and (b) require no dynamic updates.

What did you expect to happen?

I'd like to be able to say

```
spec.externalIPs: ["169.254.0.0/16"]
```

and not have to maintain the set of IPs on the svc through some external reconciliation.
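Spelled out as a full manifest, the wished-for Service would look something like this (names are hypothetical, and today's validation rejects the CIDR entry):

```yaml
# Hypothetical: a CIDR in externalIPs is currently rejected by API validation.
apiVersion: v1
kind: Service
metadata:
  name: some-service
spec:
  selector:
    app: some-app
  ports:
    - name: some-port
      port: 9999
  externalIPs:
    - "169.254.0.0/16"   # wished-for; only individual IPs are accepted today
```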

How can we reproduce it (as minimally and precisely as possible)?

As above - this is a limitation currently in the validation of externalIPs rules.

Anything else we need to know?

This is related to a slew of questions people ask about exposing NodePorts on low ports. There are legitimate reasons to want to do this (again, largely around services that don't behave like a cloudy rest service, but those definitely exist), but there's always been a reasonable degree of pushback from supporting this - also for legitimate reasons.

Kubernetes version

all current

Cloud provider

n/a

OS version

n/a

Install tools

n/a

Container runtime (CRI) and version (if applicable)

n/a

Related plugins (CNI, CSI, ...) and versions (if applicable)

n/a

@jan-g jan-g added the kind/bug Categorizes issue or PR as related to a bug. label Apr 30, 2024
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 30, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@aojea
Member

aojea commented Apr 30, 2024

/kind support
/sig network

I don't fully understand the request. The NodePort port range is configurable; you can use low ports at your own risk of causing collisions. A Service NodePort, when available, is exposed on all Nodes.

```
spec.externalIPs: ["169.254.0.0/16"]
```

Should those addresses capture the traffic and forward it to the service?
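(For reference, the configurable NodePort range mentioned above is a kube-apiserver flag; the range value below is illustrative, not a recommendation:)

```shell
kube-apiserver --service-node-port-range=1-32767
```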

@k8s-ci-robot k8s-ci-robot added kind/support Categorizes issue or PR as a support question. sig/network Categorizes an issue or PR as relevant to SIG Network. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Apr 30, 2024
@jan-g
Author

jan-g commented May 1, 2024

The issue here is that NodePort ranges are not necessarily configurable everywhere (e.g., I believe EKS doesn't permit this, as the API server runs in their control plane and you don't get much/any control over that).

The intention is that this should behave like

```
spec.externalIPs: ["169.254.0.1", "169.254.0.2", "..."]
```

Each entry in externalIPs turns into one IPTables rule [or the moral equivalent depending on one's networking stack], eg,

```
-A KUBE-SERVICES -d 169.254.0.1/32 -p tcp -m comment --comment "default/some-service:some-port external IP" -m tcp --dport 9999 -j KUBE-EXT-123456789ABCDEF
-A KUBE-SERVICES -d 169.254.0.2/32 -p tcp -m comment --comment "default/some-service:some-port external IP" -m tcp --dport 9999 -j KUBE-EXT-123456789ABCDEF
# and so on
```

and there's no reason this shouldn't turn into a single rule:

```
-A KUBE-SERVICES -d 169.254.0.0/16 -p tcp -m comment --comment "default/some-service:some-port external IP" -m tcp --dport 9999 -j KUBE-EXT-123456789ABCDEF
```

The current situation is that if I want to manage this pseudo-NodePort, I need some active pod monitoring the nodes and updating the "externalIPs" entries on services - with the concomitant update cost of kube-proxy massaging rules.
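The reconciliation loop being described can be sketched as follows. This is a minimal sketch, not the author's actual code: the service name and namespace are hypothetical, and it assumes the official `kubernetes` Python client plus RBAC permission to list nodes and patch services.

```python
# Sketch of the external reconciler the author would rather not have to run:
# mirror every node's InternalIP into a Service's spec.externalIPs.
try:
    from kubernetes import client, config
except ImportError:          # the pure helpers below still work without the client
    client = config = None

def external_ips_patch(node_ips):
    """Build the merge-patch body replacing spec.externalIPs (pure helper)."""
    return {"spec": {"externalIPs": sorted(set(node_ips))}}

def internal_ips(nodes):
    """Extract each node's InternalIP from a V1NodeList-like object."""
    return [addr.address
            for node in nodes.items
            for addr in (node.status.addresses or [])
            if addr.type == "InternalIP"]

def reconcile(namespace="default", service="some-service"):
    """One reconciliation pass: read node IPs, patch them onto the Service."""
    config.load_incluster_config()   # or load_kube_config() outside a pod
    v1 = client.CoreV1Api()
    patch = external_ips_patch(internal_ips(v1.list_node()))
    v1.patch_namespaced_service(service, namespace, patch)
```

In practice this would run in a loop or a node-watch, which is exactly the operational cost the CIDR proposal avoids.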

@uablrek
Contributor

uablrek commented May 7, 2024

Can't be implemented for proxy-mode=ipvs afaik, since the addresses must be assigned to the dummy device kube-ipvs0.

@uablrek
Contributor

uablrek commented May 7, 2024

Is 169.254.0.0/16 the cidr for your K8s node addresses? So, ["169.254.0.1", "169.254.0.2", "..."] are real node addresses assigned to some interface on the nodes?

@uablrek
Contributor

uablrek commented May 7, 2024

If so, ref kubernetes-sigs/kubespray#10572

@uablrek
Contributor

uablrek commented May 7, 2024

/remove kind-bug
/kind feature

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label May 7, 2024
@uablrek
Contributor

uablrek commented May 7, 2024

/remove-kind bug
(sigh...)

@k8s-ci-robot k8s-ci-robot removed the kind/bug Categorizes issue or PR as related to a bug. label May 7, 2024
@thockin thockin self-assigned this May 9, 2024
@thockin
Member

thockin commented May 9, 2024

A few thoughts:

First, the externalIPs API is generally considered to have been a Bad Idea, and I will not be eager to expand on it. It would be (marginally) better if you acted like a load-balancer and wrote to status. But at that point you have a controller, so expanding to a list of IPs is not impossible. The problem with expanding the meaning of either of these fields is that you have agents (kube-proxy, cilium, antrea, ...) which parse them as IPs today, and which would fail if they find a CIDR. At least the LB status could be expanded via a new field, which is less likely to fail catastrophically.

There are other ways to get low-ports, including using hostNetwork or hostPort, but I am not sure those satisfy your needs.
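For reference, the hostPort route mentioned above looks roughly like this (names and image are hypothetical). Each pod's port is bound directly on its node's IP, so low ports are possible subject to the runtime's permissions:

```yaml
# hostPort binds the container port on each node the pod is scheduled to;
# a DaemonSet makes that every node.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: some-appliance
spec:
  selector:
    matchLabels:
      app: some-appliance
  template:
    metadata:
      labels:
        app: some-appliance
    spec:
      containers:
        - name: app
          image: example.com/some-appliance:latest
          ports:
            - containerPort: 9999
              hostPort: 9999   # exposed directly on the node's IP
```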

Can you say more about what is driving this?

@jan-g
Author

jan-g commented May 13, 2024

I'm looking for behaviour much like hostPort - or, like a low-numbered NodePort with the topology-aware capabilities coming with k8s 1.30. Effectively this is just using k8s as a matrix for an appliance - I'd like local traffic by default, but some fallback behaviour during periods of change.

@thockin
Member

thockin commented May 13, 2024

I'm looking for behaviour much like hostPort

Can you literally use hostPort ? How do your clients discover service IPs? Are the clients in the cluster or outside?

It sounds like you want an implementation of Services with type=LB, but aren't in a cloud? Have you looked at things like metallb?
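If MetalLB does fit, a minimal layer-2 configuration is roughly the following (pool name and address range are hypothetical; check the MetalLB docs for current API versions). Notably, its address pools accept CIDRs, unlike spec.externalIPs:

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: example-pool
  namespace: metallb-system
spec:
  addresses:
    - 169.254.0.0/16   # pools accept CIDRs, unlike spec.externalIPs
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: example-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - example-pool
```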
