Ports could not be allocated #131
Hi all,

I'm trying to launch a simple pod using a pre-built docker image. The kubernetes-mesos scheduler is giving me an error related to port allocation. I'm wondering if that port range is something that's configurable. The port that's throwing the error is the container port, which I'd like to avoid changing. Thanks.

Comments
The available port range is determined by the port resources offered by the slaves.
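As a sketch of where that range comes from (illustrative flag values; `$master` is a placeholder for your mesos master address): a slave advertises its usable port range as part of the resources it offers to frameworks, and `31000-32000` is Mesos's default range.

```
# illustrative: a mesos slave advertising an explicit port range to frameworks
:; mesos-slave --master=$master:5050 \
     --resources="ports:[31000-32000]"
```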
@jdef Turns out I was mistaken about the cause of the error. I was trying to assign a port on the host machine which was outside of the allowable range. Changing that made everything move along smoothly enough. This has raised a similar issue, however. It seems that when I try to launch a replication controller, it tries to use the same host ports for each replicated container within each pod. That is to say, if I describe a replicationController with two replicas, one container each, with the template describing the port bindings, it creates one replica and then spins indefinitely trying to bind to the same host port for the other. Is this expected behavior?
Since host ports are mesos-managed resources, if you define them in a pod template then every replica stamped out from that template will try to claim the same host port from the offers the framework receives. Two replicas that declare the same host port can't land on the same slave, so each replica has to be scheduled onto a different slave.
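To illustrate (a hypothetical v1beta1 pod template, not the actual spec from this thread): every replica stamped out from this template requests host port 31000, so no two of them can be placed on the same slave:

```
{
  "desiredState": {
    "manifest": {
      "version": "v1beta1",
      "id": "web",
      "containers": [{
        "name": "web",
        "image": "dockerfile/nginx",
        "ports": [{ "containerPort": 80, "hostPort": 31000 }]
      }]
    }
  },
  "labels": { "name": "web" }
}
```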
So if I understand you correctly, I always need to have more minions/slaves than replicated pods? I've only tried a controller of size 2 with a cluster of size 3, and the error still persists. So that means it should try to place one pod on each slave, and allocate identical ports on those slaves. Is this where a service would come into play? Would I have the service describe the port mapping, so that only the host on which the service lives would have the external-facing port declared?
Hmm. A controller of size 2 with 3 slaves should work. Would you mind sending your scheduler logs?

A bit about services: the Kubernetes service model, by default, allocates an IP per service, with each service IP drawn from a configured portal network. I've commented on the service spec structure below, with respect to k8sm:

```
service {
  port          // port advertised to other pods via the SERVICENAME_PORT variable
  selector      // identify the pods that back this service
  portalIP      // (don't set this) managed by apiserver, IP pulled from the -portal_net range
  proxyPort     // (optional) ephemeral, target of iptables nat rules
  publicIPs     // (optional) ip address(es) of load balancer(s)
  containerPort // (optional) should match a selectedPod.container[x].port[y].{ hostPort | name },
                // so it's an int or string; unspecified => use first hostPort of first container
                // in matching pods
  ...
}
```
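To make that concrete, here's a minimal v1beta1 service sketch (hypothetical values; it assumes pods labeled name=jboss that expose hostPort 31008, like the controller spec later in this thread). Per the notes above, in k8sm the service's containerPort points at the selected pods' hostPort:

```
{
  "id": "jboss-http",
  "kind": "Service",
  "apiVersion": "v1beta1",
  "port": 8080,
  "containerPort": 31008,
  "selector": { "name": "jboss" }
}
```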
Sure. Here's a piece of my scheduler log file. This is only a snippet; right now my log file is about 25 MB. You can see it just continuously cycling between the three slaves, trying to allocate ports already in use. When I look at the slaves, I can see the actual pods running totally fine. It may be worth noting that I'm also running into #135 when I start single pods as well as replication controllers. So is it possible that the framework is seeing the pod as a failure, trying to fill in the number of required replications, having those fail, and just looping?

Edit: I went back into the log file and found where it starts to try to schedule the pods. I went down a bit so you can see more of the looping.
As an update, I pulled the latest framework code from git and saw some changes. Unfortunately, my error still persists. Before, when creating pods, they immediately entered the Failed state. After the update, I can see them enter the Pending state before ultimately still failing. I'm still not sure what's causing them to enter the Failed state, but I think that this is the issue behind everything here.
I'm seeing the same thing too; built everything clean and I'm still seeing the issues from #135. I'm going to dig and see what I can find.
Looks like the jboss container is flapping because of a missing directory. From the docker log for the jboss container:
I added a couple of volume mounts so that the proc would not die immediately, and the tasks both seem to run OK for me (they still complain about missing things, but they don't die instantly):

```
{
  "id": "jboss-controller",
  "kind": "ReplicationController",
  "apiVersion": "v1beta1",
  "desiredState": {
    "replicas": 2,
    "replicaSelector": {"name": "jboss"},
    "podTemplate": {
      "desiredState": {
        "manifest": {
          "version": "v1beta1",
          "id": "jboss-wildfly-controller",
          "volumes": [
            { "name": "wildfly-log", "source": { "emptyDir": {} } },
            { "name": "wildfly-servers", "source": { "emptyDir": {} } }
          ],
          "containers": [{
            "name": "jboss-wildfly",
            "image": "vnguyen/jboss-wildfly-admin",
            "ports": [
              {"containerPort": 8080, "hostPort": 31008},
              {"containerPort": 9990, "hostPort": 31009}
            ],
            "volumeMounts": [
              { "name": "wildfly-log", "path": "/opt/jboss/wildfly/domain/log" },
              { "name": "wildfly-servers", "path": "/opt/jboss/wildfly/domain/servers" }
            ]
          }]
        }
      },
      "labels": {"name": "jboss"}
    }
  },
  "labels": {"name": "jboss-wildfly"}
}
```

Something else to check: does your mesos master report any orphan tasks? If so, they may be falsely consuming resources (like host ports).
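One quick way to check (a sketch; `$master` stands for your mesos master's address, and this just dumps the orphan_tasks field of the master's state endpoint):

```
# list any orphan tasks the master is tracking; they hold resources like host ports
:; curl -s http://$master:5050/master/state.json | \
     python -c 'import json,sys; print(json.load(sys.stdin).get("orphan_tasks", []))'
```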
I've never gotten an error like that from running the container I'm using. In fact, the containers themselves are definitely being created without error. The entire pod is being created, just reported as Failed, so the controller tries to create it again, thus running into the port conflict. The pod is reported as Failed no matter what container I use; so far I've tried a few: JBoss, RabbitMQ, nginx, MongoDB, and others. My slave is reporting the resources as consumed, and the docker daemon is showing the containers as running, but kubectl is showing the pod as Failed. I tried your config file just to be sure, and it did the same thing.
Can you try querying etcd and the kubelet APIs? For example:

```
:; curl http://$servicehost:4001/v2/keys/registry/pods/default/
{"action":"get","node":{"key":"/registry/pods/default","dir":true,"nodes":[{"key":"/registry/pods/default/nginx-id-01","value":"{\"kind\":\"Pod\",\"id\":\"nginx-id-01\",\"uid\":\"da4a6804-a775-11e4-9155-04012f416701\",\"creationTimestamp\":\"2015-01-29T05:15:34Z\",\"resourceVersion\":15,\"apiVersion\":\"v1beta1\",\"namespace\":\"default\",\"labels\":{\"cluster\":\"gce\",\"name\":\"foo\"},\"desiredState\":{\"manifest\":{\"version\":\"v1beta2\",\"id\":\"\",\"volumes\":null,\"containers\":[{\"name\":\"nginx-01\",\"image\":\"dockerfile/nginx\",\"ports\":[{\"hostPort\":31000,\"containerPort\":80,\"protocol\":\"TCP\"}],\"livenessProbe\":{\"httpGet\":{\"path\":\"/index.html\",\"port\":\"8081\"},\"initialDelaySeconds\":30},\"imagePullPolicy\":\"\"}],\"restartPolicy\":{\"always\":{}}}},\"currentState\":{\"manifest\":{\"version\":\"\",\"id\":\"\",\"volumes\":null,\"containers\":null,\"restartPolicy\":{}},\"status\":\"Waiting\",\"host\":\"10.132.189.240\"}}","modifiedIndex":17,"createdIndex":15}],"modifiedIndex":15,"createdIndex":15}}
```

```
:; curl http://$slaveip:10250/podInfo?podID=nginx-id-01\&podNamespace=default
{"net":{"state":{"running":{"startedAt":"2015-01-29T05:15:40.220382019Z"}},"restartCount":0,"podIP":"172.17.36.49","image":"kubernetes/pause:latest"},"nginx-01":{"state":{"running":{"startedAt":"2015-01-29T05:15:40.463949269Z"}},"restartCount":0,"image":"dockerfile/nginx"}}
```

...and your minions:

```
:; curl http://$servicehost:8888/api/v1beta2/minions
{
  "kind": "MinionList",
  "creationTimestamp": null,
  "selfLink": "/api/v1beta2/minions",
  "resourceVersion": 24,
  "apiVersion": "v1beta2",
  "items": [
    {
      "id": "10.132.189.243",
      "uid": "b7af1eb2-a775-11e4-9155-04012f416701",
      "creationTimestamp": "2015-01-29T05:14:36Z",
      "selfLink": "/api/v1beta2/minions/10.132.189.243",
      "resourceVersion": 6,
      "resources": {}
    },
    {
      "id": "10.132.189.240",
      "uid": "b7b11029-a775-11e4-9155-04012f416701",
      "creationTimestamp": "2015-01-29T05:14:36Z",
      "selfLink": "/api/v1beta2/minions/10.132.189.240",
      "resourceVersion": 7,
      "resources": {}
    },
    {
      "id": "10.132.189.242",
      "uid": "b7b36c72-a775-11e4-9155-04012f416701",
      "creationTimestamp": "2015-01-29T05:14:36Z",
      "selfLink": "/api/v1beta2/minions/10.132.189.242",
      "resourceVersion": 8,
      "resources": {}
    }
  ]
}
```
{"action":"get","node":{"key":"/registry/pods/default","dir":true,"nodes":[{"key":"/registry/pods/default/jboss-pod","value":"{\"kind\":\"Pod\",\"id\":\"jboss-pod\",\"uid\":\"26bc871c-a7bb-11e4-a805-fa163e3c002e\",\"creationTimestamp\":\"2015-01-29T13:31:37Z\",\"resourceVersion\":199,\"apiVersion\":\"v1beta1\",\"namespace\":\"default\",\"labels\":{\"name\":\"jboss\"},\"desiredState\":{\"manifest\":{\"version\":\"v1beta2\",\"id\":\"\",\"volumes\":null,\"containers\":[{\"name\":\"wildfly\",\"image\":\"vnguyen/jboss-wildfly-admin\",\"ports\":[{\"hostPort\":31000,\"containerPort\":8080,\"protocol\":\"TCP\"},{\"hostPort\":31010,\"containerPort\":9990,\"protocol\":\"TCP\"}],\"imagePullPolicy\":\"\"}],\"restartPolicy\":{\"always\":{}},\"dnsPolicy\":\"ClusterFirst\"}},\"currentState\":{\"manifest\":{\"version\":\"\",\"id\":\"\",\"volumes\":null,\"containers\":null,\"restartPolicy\":{}},\"status\":\"Waiting\",\"host\":\"mesos-slave-2\"}}","modifiedIndex":200,"createdIndex":199}],"modifiedIndex":10,"createdIndex":10}}
{"net":{"state":{"running":{"startedAt":"2015-01-29T13:31:38Z"}},"restartCount":1,"podIP":"172.17.0.4","image":"kubernetes/pause:latest","containerID":"docker://ec1892cd54f4db2be49025556db0b467501210a5289ffe9439c0a8a0a8cbc597"},"wildfly":{"state":{"running":{"startedAt":"2015-01-29T13:31:38Z"}},"restartCount":1,"image":"vnguyen/jboss-wildfly-admin","containerID":"docker://036a01d5975cd74bf89f28f61b60b83c0ba79920b848500ec7866184c4c6186f"}}
{
"kind": "MinionList",
"creationTimestamp": null,
"selfLink": "/api/v1beta2/minions",
"resourceVersion": 207,
"apiVersion": "v1beta2",
"items": [
{
"id": "23.23.23.56",
"uid": "adc3b379-a720-11e4-a805-fa163e3c002e",
"creationTimestamp": "2015-01-28T19:05:52Z",
"selfLink": "/api/v1beta2/minions/23.23.23.56",
"resourceVersion": 60,
"hostIP": "23.23.23.56",
"resources": {},
"status": {}
},
{
"id": "23.23.23.50",
"uid": "a5372e50-a716-11e4-a805-fa163e3c002e",
"creationTimestamp": "2015-01-28T17:54:03Z",
"selfLink": "/api/v1beta2/minions/23.23.23.50",
"resourceVersion": 7,
"hostIP": "23.23.23.50",
"resources": {},
"status": {}
},
{
"id": "23.23.23.51",
"uid": "3d1f3cc4-a721-11e4-a805-fa163e3c002e",
"creationTimestamp": "2015-01-28T19:09:52Z",
"selfLink": "/api/v1beta2/minions/23.23.23.51",
"resourceVersion": 62,
"hostIP": "23.23.23.51",
"resources": {},
"status": {}
}
]
} |
I suspect that the problem may be that your slaves are registering under hostnames rather than IP addresses: your pod's currentState reports "host":"mesos-slave-2", while the minion list (and my own output above) uses IPs. Try converting your slaves to use IP addresses instead of hostnames. The real "bug" here is probably the way I've implemented the mesos cloud provider to use IP addresses and not hostnames (https://github.com/mesosphere/kubernetes-mesos/blob/master/pkg/cloud/mesos/client.go#L40).
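A sketch of one way to do that when launching each slave (illustrative values; --ip and --hostname are standard mesos-slave flags, and 23.23.23.50 is one of the slave IPs from the minion list above):

```
# illustrative: make the slave register and advertise itself by IP, not hostname
:; mesos-slave --master=$master:5050 \
     --ip=23.23.23.50 \
     --hostname=23.23.23.50
```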
You're welcome, and thanks for reporting the problem.
Ahh yes, that worked for me too. Thanks @jdef!