
Enable generating random ports with host port mapping #49792

Closed
gyliu513 opened this issue Jul 28, 2017 · 35 comments
Labels
kind/feature: Categorizes issue or PR as related to a new feature.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
sig/network: Categorizes an issue or PR as relevant to SIG Network.

Comments

@gyliu513
Contributor

Is this a BUG REPORT or FEATURE REQUEST?:

Uncomment only one, leave it on its own line:

/kind bug
/kind feature

This is a problem I found while testing https://github.com/projectcalico/k8s-policy/issues/109 . The problem is that when I use the host port mapping feature in Calico, I always need to specify the pod's hostPort in the ports section, as follows.

root@k8s001:~/calico2.4# cat pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-host
spec:
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: nginx-host
    ports:
    - containerPort: 80
      hostPort: 80
  restartPolicy: Always
root@k8s001:~/calico2.4# kubectl create -f ./pod.yaml
pod "nginx-host" created
root@k8s001:~/calico2.4# kubectl get pods -owide
NAME                      READY     STATUS    RESTARTS   AGE       IP                NODE
nginx-host                1/1       Running   0          9s        192.168.124.143   k8s004

The problem is that if one node starts two such pods, only one of them can run; the other stays Pending forever because it cannot get the same host port again.

Enabling host port mapping to generate the host port randomly would mean the end user does not need to specify the host port at all.

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. kind/feature Categorizes issue or PR as related to a new feature. labels Jul 28, 2017
@k8s-github-robot k8s-github-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jul 28, 2017
@gyliu513
Contributor Author

/sig network

@k8s-ci-robot k8s-ci-robot added the sig/network Categorizes an issue or PR as relevant to SIG Network. label Jul 28, 2017
@k8s-github-robot k8s-github-robot removed the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jul 28, 2017
@cmluciano

@kubernetes/sig-network-feature-requests

@caseydavenport
Member

This is a bit of complexity I'd like to avoid in the Kubernetes API if possible, but I need to understand the use case a bit better to know whether it can be avoided.

@gyliu513 is there a reason your use case needs each pod to use a nodeIP:hostPort rather than podIP:port or nodeIP:nodePort?

@gyliu513
Contributor Author

@caseydavenport , yes, I want to use nodeIP:hostPort rather than podIP:port or nodeIP:nodePort.

  1. With podIP:port, I cannot enable external access across two different k8s clusters, as the podIP is only reachable inside its own cluster. Consider the following case: I have an nginx app deployed across two different k8s clusters and I want to load balance across those two clusters. Basically, I want an HAProxy standing in front of the two clusters, using nodeIP:hostPort to balance the workload across them.

  2. With nodeIP:nodePort, the first reason is performance: NodePort generates iptables rules for each pod exposed as a NodePort service, and filtering through those rules becomes a performance issue. Another reason is that NodePort wastes quite a lot of ports on each node, since every service reserves its port on every node. The third reason is that NodePort is itself a kind of load balancer and will route requests to pods on other nodes; accessing the pod directly via nodeIP:hostPort gives better performance.

@felipejfc

felipejfc commented Aug 2, 2017

+1

In the gaming industry this would be very valuable as well. Our use case:

We want to deploy "game room units" into Kubernetes, and they can listen on any protocol, TCP or UDP. For us the best way to expose the game servers to clients is hostNetwork, since we need performance (a lot of ticks are exchanged between server and client) and creating a service for each pod would generate too much overhead (for UDP-based game servers).
Today we hit this problem while developing the architecture, because k8s has no way to allocate free ports dynamically :/

@jpiper

jpiper commented Aug 6, 2017

+1

I have a similar use case where I want to start a Cassandra ring whose nodes are accessible from outside the Kubernetes cluster. I can do this by using a hostPort and then using the downward API to pass the hostIP in as the listen address.

However, it would be nice if I didn't have to pick ports like this manually, and I could just ask for a random available one and have the port number accessible in the pod via an API.
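
A minimal sketch of that pattern, assuming a hand-picked hostPort and an illustrative image; the node IP is injected through the downward API (the HOST_IP variable name is arbitrary):

apiVersion: v1
kind: Pod
metadata:
  name: cassandra-0                # illustrative name
spec:
  containers:
  - name: cassandra
    image: cassandra:3.11          # illustrative image/tag
    ports:
    - containerPort: 7000
      hostPort: 7000               # still has to be chosen manually today
    env:
    - name: HOST_IP                # the app can use this as its listen/broadcast address
      valueFrom:
        fieldRef:
          fieldPath: status.hostIP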

@caseydavenport
Member

/remove-kind bug

@k8s-ci-robot k8s-ci-robot removed the kind/bug Categorizes issue or PR as related to a bug. label Aug 10, 2017
@thockin
Member

thockin commented Aug 11, 2017

To allocate a random HostPort, it has to be done on the kubelet node, not by the API server. So we would have to pass it down to kubelet with some sentinel value indicating "allocate random", have kubelet allocate it, and then have kubelet write it back to the API server. Not impossible, but not implemented.

Kubelet is going to create an iptables rule anyway, just the same as NodePort.

@jpiper

jpiper commented Aug 11, 2017

I've never looked into the Kubernetes source code (maybe this is an excuse to do so). I suppose it depends on what the hostPort data structure looks like: if it's stored as an int we could use a sentinel value like -1, but that seems a bit hacky; something more explicit like hostPort: random or a new port type (randomHostPort?) seems neater.

We'd also need to make sure that the port information is easily accessible within the pod so that services could make use of it. I'm not sure how this would look if we asked for multiple ports.
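
Purely for illustration, a hypothetical manifest along the lines of the proposals above; neither the sentinel value nor a randomHostPort field exists in Kubernetes, this only sketches the shape of the idea:

apiVersion: v1
kind: Pod
metadata:
  name: random-hostport-demo       # hypothetical example, not a supported API
spec:
  containers:
  - name: app
    image: nginx
    ports:
    - containerPort: 80
      hostPort: -1                 # hypothetical sentinel: "kubelet, allocate a free host port"
      # ...or, alternatively, a new explicit field such as:
      # randomHostPort: true       # hypothetical, does not exist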

@euank
Contributor

euank commented Sep 27, 2017

@gyliu513

The problem is that if one node starts two such pods, only one of them can run; the other stays Pending forever because it cannot get the same host port again.

That's a documented feature; the scheduler avoids allocating the hostport twice. That's the scheduler, not the node iirc.

Why do you have to specify hostport? It's a sort-of legacy mechanism. A service / service + nodeport should generally be preferred I think.

@jpiper

However, it would be nice if I didn't have to pick ports like this manually, and I could just ask for a random available one and have the port number accessible in the pod via an API.

That's just a NodePort, right? When you expose a service, as a nodeport type, you should get a port you can use.

@felipejfc

If you're using hostNetwork (which it sounds like you are and should be), then hostPorts do nothing. I don't really understand what they solve in your use-case?


I don't understand the use-case for this given that hostports are largely unneeded in favor of services.

Mind explaining exactly why you need hostports here at all @gyliu513?

@gyliu513
Contributor Author

gyliu513 commented Sep 28, 2017

@euank The reason is that I want to run my own load balancer across different Kubernetes clusters via the Calico host port feature.

Suppose I have two clusters, each with 3 worker nodes, and I want to start 6 pods per cluster with a Calico host port. That will fail, because I can create at most 3 such pods per cluster. But if we could generate the hostPort dynamically, I would not need to hardcode it in my yaml template, and I could create more pods in one cluster.

[diagram: an external load balancer in front of two Kubernetes clusters, 3 worker nodes each, balancing across pods exposed via nodeIP:hostPort]

@euank
Contributor

euank commented Sep 28, 2017

@gyliu513 How is that use case not covered by using a NodePort, and not setting any hostports at all?
(nice diagram btw, thanks!)

@jpiper

jpiper commented Sep 28, 2017

@euank

My use case is specifically that I want to be able to access each of the pods in my deployment individually from outside of the cluster (i.e. each pod must be able to identify its own specific endpoint).

As far as I understand, a NodePort only allows you to load balance across all the pods in the deployment, but in the case of Cassandra (or Kafka, HDFS, or another similarly clustered app) a client needs to be able to address each of the pods separately.

It could be that I'm misunderstanding something (or not communicating my use case effectively).

EDIT: It looks like some comments here: #28660 touch on similar things like "PetSet controller creates a service per pet"

@gyliu513
Contributor Author

@euank I do not want to use NodePort because it will redirect my request to pods that may not be located on the node I targeted, which hurts performance; with the Calico host port, I can access the pod directly and get better performance.

@caseydavenport
Member

@gyliu513 As mentioned above, you could use a service-per-pod. Then, so long as you know which node the Pod is running on (which you need to know anyway for the hostPort approach) and you direct your traffic to that node, the traffic won't get redirected and will behave much like a hostPort.

Would that work for your case?
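
A rough sketch of the service-per-pod idea, assuming the pods are managed by a StatefulSet so each one carries the statefulset.kubernetes.io/pod-name label; names and ports are illustrative:

apiVersion: v1
kind: Service
metadata:
  name: nginx-host-0                                   # one Service per pod
spec:
  type: NodePort
  externalTrafficPolicy: Local                         # only nodes running the selected pod serve traffic
  selector:
    statefulset.kubernetes.io/pod-name: nginx-host-0   # selects exactly one pod
  ports:
  - port: 80
    targetPort: 80                                     # nodePort left unset so the cluster allocates one

Traffic sent to that pod's node on the allocated nodePort then stays on that node, which is close to what a hostPort gives you.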

@eisig
Contributor

eisig commented Oct 13, 2017

+1
We plan to migrate our services to k8s. We already use ZooKeeper for service discovery, so we cannot use k8s Services. We cannot migrate all of our services to k8s at once, and the services must remain accessible from outside the k8s cluster.
nodeIP:hostPort may be the only choice.

@gyliu513
Contributor Author

@caseydavenport sorry for the late reply. Service-per-pod is one solution, but I do not want to use services, as they will introduce quite a lot of iptables records on my hosts if there are thousands of services, and that makes performance bad, so I do hope that host port mapping can generate random ports.

@felipejfc

Had the same problem as you @gyliu513; our solution was to create https://github.com/topfreegames/maestro, which has logic to manually manage a pool of node ports.

@euank
Contributor

euank commented Oct 14, 2017

@gyliu513 I'm confused about how you expect this to work:

I do not want to use services, as it will introduce quite a lot of iptables records ...and make performance bad

hostPort mappings are typically also implemented as iptables rules. They have very similar performance characteristics to services.

If your use-case can't handle the overhead of mapping between a hostport and a different constant containerport, then your best bet would probably be to use host networking and have your containerized application listen on :0, which will allow the kernel to assign it a random port from the ip_local_port_range, and then the application can report back out-of-band what its port is.
If the application isn't itself listening on a random port, there will have to be something, be it iptables or ipvs or whatever, which does the translation between those ports.

Am I missing some detail here?
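
A minimal sketch of that approach, with a placeholder image; the application itself has to bind :0 and publish the port it was given (for example to an external registry), since nothing in the pod spec records it:

apiVersion: v1
kind: Pod
metadata:
  name: listen-on-random-port
spec:
  hostNetwork: true                  # the pod shares the node's network namespace
  containers:
  - name: app
    image: example.com/my-app        # placeholder: binds ":0" so the kernel assigns a free
                                     # port from ip_local_port_range, then reports it out-of-band
    env:
    - name: HOST_IP                  # useful for reporting "I am reachable at HOST_IP:<port>"
      valueFrom:
        fieldRef:
          fieldPath: status.hostIP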

@gyliu513
Contributor Author

@euank unlike a Service, I think the Calico host port only creates iptables rules on the node where the pod is running, so each node does not accumulate rules for pods running elsewhere, the way it does for Services. This can help improve performance.

Can you please show a detailed example of the solution you proposed? It would be great if you could show some yaml templates, thanks!

be to use host networking and have your containerized application listen on :0, which will allow the kernel to assign it a random port from the ip_local_port_range,

@drinktee
Contributor

I want to create RCs, RSs, and Jobs with hostPort and containerPort, but this requires knowing which ports can be used...
Is there any open-source project that manages the hostPorts of k8s nodes?

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 25, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 24, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@danbeaulieu

We are prototyping a move of a homegrown Docker-based orchestration system over to k8s, and this functionality is important. The existing orchestration relies on Docker's random port assignment to ensure that, even though each of the thousands of containers is essentially identical, they can be addressed individually via a host IP and unique port pair. We would similarly like to run each container as its own pod, managed via a Deployment, and be able to address each individual pod from outside of the cluster.

@uablrek
Contributor

uablrek commented Sep 11, 2018

Is NodePort with externalTrafficPolicy: Local a possible alternative? It seems to give the requested function, but it still uses iptables rules, so there should be a (minor?) performance impact.
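
A minimal sketch of that alternative, with illustrative names and labels; with externalTrafficPolicy: Local, traffic sent to a node's nodePort is only delivered to pods running on that same node, so there is no cross-node hop:

apiVersion: v1
kind: Service
metadata:
  name: nginx-host-nodeport
spec:
  type: NodePort
  externalTrafficPolicy: Local    # a node only serves the pods it hosts; no second hop
  selector:
    app: nginx-host               # illustrative label on the backing pods
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30080               # or omit to let the cluster pick one from the NodePort range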

@yourbuddyconner

Just going to chime in here on this dead issue with my use-case. I use K8s to launch blockchain nodes via stateful sets. It is a requirement that each unique node/pod has a unique hostIP:hostPort combination, but not that they be the same or even known prior to runtime.

There is the caveat however that the blockchain nodes have to be configured with the same port they're bound to on the host.

It was my hope that I would be able to do something like daemon run --port $POD_ASSIGNED_HOSTPORT to pass the port configuration to the node, but K8s doesn't support this sort of dynamic allocation...

@xinyu7git

+1
We plan to migrate our services to k8s. We already use ZooKeeper for service discovery, so we cannot use k8s Services. We cannot migrate all of our services to k8s at once, and the services must remain accessible from outside the k8s cluster.
nodeIP:hostPort may be the only choice.

I have the same problem. Has yours been solved? If so, how?

@markmandel
Contributor

If anyone wants the code, this is how we do random, non-conflicting hostPort management in Agones, as we had the same issue for dedicated multiplayer game servers.

It is tied to our CRD, but it could probably be turned into something more generic if someone were so inclined.

https://github.com/googleforgames/agones/blob/master/pkg/gameservers/portallocator.go

@CodingSinger

+1
We plan to migrate our services to k8s. We already use ZooKeeper for service discovery, so we cannot use k8s Services. We cannot migrate all of our services to k8s at once, and the services must remain accessible from outside the k8s cluster.
nodeIP:hostPort may be the only choice.

I have the same problem, how did you solve the problem of accessing internal services from outside the cluster?

@marcusjwhelan

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Why was this issue closed? It seems like this is still something that many people are asking for.

An easy example is a game server cluster: each server needs its own port. With Terraform you could then take the port outputs and use them.

@ThiagoT1

/reopen

@k8s-ci-robot
Contributor

@ThiagoT1: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@0blu

0blu commented Jul 23, 2020

With the help of @coderanger I've developed a system that might solve a niche problem.
You install a service that watches all pods; if a new pod (with specific labels) is created, a Service and Endpoint with a nodePort are automatically assigned to it.
To get the dynamic port number you query the pod for an annotation that stores the number.
https://github.com/0blu/dynamic-hostports-k8s
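
As a side note, a sketch of one way such an annotation could be read from inside the pod without calling the API server, using a downward API volume; the mount path is arbitrary, and the file is refreshed if annotations are added or changed after the pod starts:

apiVersion: v1
kind: Pod
metadata:
  name: annotated-port-demo
spec:
  containers:
  - name: app
    image: nginx                   # illustrative image
    volumeMounts:
    - name: podinfo
      mountPath: /etc/podinfo      # annotations end up in /etc/podinfo/annotations
  volumes:
  - name: podinfo
    downwardAPI:
      items:
      - path: "annotations"        # one file containing all of the pod's annotations
        fieldRef:
          fieldPath: metadata.annotations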

@Revolution1

One-service-per-pod is not acceptable.
A nodePort has to be unique across the cluster, but I may need to run 10,000+ pods; with each pod taking one port, that would quickly use up all of the cluster's ports.
