IP address + exported port instead of endpoint from marathon #1243
Comments
Traefik uses the IP addresses from the application tasks as defined by the task's Are you using a real Marathon cluster, or possibly some reduced, local version? |
The /task/trace endpoint is below:
{
"app": {
"id":"/trace",
"cmd": null,
"args": null,
"user": null,
"env": {},
"instances": 1,
"cpus": 1,
"mem": 512,
"disk": 0,
"gpus": 0,
"executor": "",
"constraints": [],
"uris": ["file:///etc/catsa-anonymous-puller.credentials.tar.gz"],
"fetch":[
{
"uri": "file:///etc/catsa-anonymous-puller.credentials.tar.gz",
"extract": true,
"executable": false,
"cache": false
}
],
"storeUrls": [],
"backoffSeconds": 1,
"backoffFactor": 1.15,
"maxLaunchDelaySeconds": 3600,
"container": {
"type": "DOCKER",
"volumes": [],
"docker": {
"image": "sysadm-reg/assemblage/trace:1.4-20170303.164750-23",
"network":"BRIDGE",
"portMappings": [
{
"containerPort": 8080,
"hostPort": 0,
"servicePort": 10000,
"protocol": "tcp",
"name": "tomcat",
"labels": {}
}
],
"privileged": false,
"parameters": [],
"forcePullImage": true
}
},
"healthChecks": [
{
"path": "/trace/api/health",
"protocol": "HTTP",
"portIndex": 0,
"gracePeriodSeconds": 300,
"intervalSeconds": 60,
"timeoutSeconds": 20,
"maxConsecutiveFailures": 3,
"ignoreHttp1xx": false
}
],
"readinessChecks": [],
"dependencies": [],
"upgradeStrategy": {
"minimumHealthCapacity": 0,
"maximumOverCapacity": 0
},
"labels": {
"traefik.frontend.rule": "PathPrefix:/trace",
"traefik.backend.loadbalancer.sticky": "true"
},
"acceptedResourceRoles": null,
"ipAddress": null,
"version": "2017-03-03T17:13:55.405Z",
"residency": null,
"secrets": {},
"taskKillGracePeriodSeconds": null,
"ports": [10000],
"portDefinitions": [
{
"port": 10000,
"protocol": "tcp",
"labels": {}
}
],
"requirePorts": false,
"versionInfo": {
"lastScalingAt": "2017-03-03T17:13:55.405Z",
"lastConfigChangeAt": "2017-03-03T17:13:55.405Z"
},
"tasksStaged": 0,
"tasksRunning": 1,
"tasksHealthy": 1,
"tasksUnhealthy": 0,
"deployments": [],
"tasks": [
{
"id": "trace.8790b0ed-0036-11e7-8a86-0242c8fc4f18",
"slaveId": "546d7d9b-b7de-4745-8eb8-3c2993b7b300-S0",
"host": "infra-q-i-mes01",
"state": "TASK_RUNNING",
"startedAt": "2017-03-03T17:26:27.815Z",
"stagedAt": "2017-03-03T17:26:26.074Z",
"ports": [31855],
"version": "2017-03-03T17:13:55.405Z",
"ipAddresses": [
{
"ipAddress": "10.16.0.3",
"protocol": "IPv4"
}
],
"appId": "/trace",
"healthCheckResults": [
{
"alive": true,
"consecutiveFailures": 0,
"firstSuccess": "2017-03-03T17:27:07.040Z",
"lastFailure": null,
"lastSuccess": "2017-03-07T12:32:50.072Z",
"lastFailureCause": null,
"taskId": "trace.8790b0ed-0036-11e7-8a86-0242c8fc4f18"
}
]
}
],
"lastTaskFailure": {
"appId": "/trace",
"host": "infra-q-i-mes01",
"message": "Task was killed since health check failed. Reason: AskTimeoutException: Ask timed out on [Actor[akka://marathon/user/IO-HTTP#-1431489078]] after [20000 ms]. Sender[null] sent message of type \"spray.http.HttpRequest\".",
"state": "TASK_KILLED",
"taskId": "trace.c93df68c-0034-11e7-8a86-0242c8fc4f18",
"timestamp": "2017-03-03T17:25:20.734Z",
"version": "2017-03-03T17:13:55.405Z",
"slaveId":"546d7d9b-b7de-4745-8eb8-3c2993b7b300-S0"
}
}
}
As of now, there is a single Marathon server with a single slave. Traefik and the 2 Marathon processes are all launched on the same machine ( |
What I find incoherent is that the address used should be either:
The second solution wouldn't work for us, as this IP would not be routable from outside. Once again, it might be a problem with our configuration of Marathon (as suggested by |
I tried to use the mesos provider, with the following configuration
(The last line is necessary, see #1248.) In that configuration it worked, but with backend URLs of the form http://:31721. I thought this was strange and tried launching Traefik on a different server. The domain still wasn't specified, and consequently it didn't work. Does that help? |
Apologies for the delay. I also have been wondering why Traefik uses the container IP address and the public port by default. Chances are this was introduced by a series of changes which aren't coherent anymore. Let's use this ticket to track investigations and possibly drive a change. You can get to a working state using the Mesos slave host names along with the exposed ports by making a slight modification to the default Marathon template file: Replace
by
|
The workaround you offered works perfectly (although it forces me to manually update the template regularly). I gather you want me to leave the issue open to track the longer-term fix, but I thank you profusely for your quick and effective help. Also, I am available for tests on this, as we have a small platform for qualifying Traefik before putting it in production. |
@lcottereau glad it worked for you. 🚀 And thanks for your offer -- I suppose I'll get back to that once/if we have a correction in place. |
OK. Thanks again @timoreimann . |
@lcottereau @timoreimann when using Docker, the IP address reported by the Marathon API might not be reachable due to some Docker NAT/proxy magic. I don't see a simple way to automatically choose between |
@diegooliveira As stated above, even if the ipAddress was reachable, the fact that the port used is the exposed port (on the host) would make the result incorrect. It seems to me there is something else at stake here. |
@diegooliveira thanks for chiming in, I appreciate it. @lcottereau AFAICS, the port does not have to be a Docker-exposed port: If you schedule applications other than Docker containers via Marathon, the task port (which Traefik gives you) could be accessible and not be hidden behind a bridging interface like Docker's. There's also the IP-per-task feature in Marathon, which may give you direct access to Docker containers? (Never worked with that mode, so not exactly sure.) I'm not exactly sure what the motivation for the initial implementation back then was; I'm going to dig a bit in git history to see if I can find something. Either way, I think making the host setting configurable through a label so that users can pick what they want to have without having to modify the default template makes sense. Diego, if you'd like to work on that, I'd be happily reviewing any PR you submit. |
@timoreimann I'll do that. @lcottereau there are some test cases for how the Marathon provider handles the task port. You might take a look here https://github.com/containous/traefik/blob/master/provider/marathon_test.go#L1000. None of these test cases points to the container port. I think this is the same kind of decision as choosing between the task's host name and its IP address. Is it OK to always use the container port, or should a label indicate which one to use? If you know the port in advance it's possible to use the |
@diegooliveira in my use case, the issue is in the DNS/IP used rather than in the port (which would be unroutable from my understanding) and I don't see a test related to this (except maybe |
Thinking a bit more about the label-based solution, I'm starting to wonder if users would really need to distinguish the host part on a per-application basis frequently. It might seem easier to just introduce a global configuration flag (e.g., |
@timoreimann the global configuration flag would suit my use case |
@diegooliveira Should we take the global config flag route? WDYT? |
Hi, I have had a similar problem since I tried to migrate from «camembert» to «morbier». My backend references the Docker container IP and not the Marathon endpoint. I have to postpone the migration due to this regression. Regards |
@Gabitchov @timoreimann I did some tests in environments with and without IP-per-task and found some guidelines for an implementation that is more backward compatible but also adjustable to specific use cases. In my tests it looks like using the IP address is only relevant when there is an IP-per-task application description. I'm planning to work on a patch that uses the hostname if there is no IP-per-task information in the application definition and the task IP if there is, while allowing one specific behavior to be forced through a global Marathon configuration setting. |
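The selection rule proposed in this comment can be sketched as follows. This is only an illustration of the described behavior, not the code from the eventual patch: the function name `pick_backend_address` and the `force` values `"hostname"` and `"task-ip"` are hypothetical, and the sketch assumes that an IP-per-task application is recognizable by a non-null `ipAddress` block in its Marathon app definition (the `/trace` app above has `"ipAddress": null`).

```python
from typing import Optional


def pick_backend_address(app: dict, task: dict, force: Optional[str] = None) -> str:
    """Return the address a proxy would use to reach a Marathon task.

    force is a hypothetical global override: None (auto-detect),
    "hostname" (always the agent host) or "task-ip" (always the task IP).
    """
    ips = task.get("ipAddresses") or []
    task_ip = ips[0]["ipAddress"] if ips else None

    if force == "hostname":
        return task["host"]
    if force == "task-ip":
        if task_ip is None:
            raise ValueError("task reports no IP address")
        return task_ip
    if force is not None:
        raise ValueError(f"unknown override: {force!r}")

    # Auto mode: assume IP-per-task apps carry a non-null "ipAddress" block
    # in the app definition; everything else is reached via the agent host.
    if app.get("ipAddress") is not None and task_ip is not None:
        return task_ip
    return task["host"]


# Task values copied from the /trace app details earlier in this issue.
trace_task = {
    "host": "infra-q-i-mes01",
    "ipAddresses": [{"ipAddress": "10.16.0.3", "protocol": "IPv4"}],
}
bridge_app = {"ipAddress": None}                # as in the /trace app
ip_per_task_app = {"ipAddress": {"groups": []}}  # placeholder IP-per-task block

print(pick_backend_address(bridge_app, trace_task))
print(pick_backend_address(ip_per_task_app, trace_task))
print(pick_backend_address(bridge_app, trace_task, force="task-ip"))
```

With the bridge-mode `/trace` app, auto mode yields the agent host name, which is the routable choice in the reporter's setup; the override exists for topologies that fit neither pattern.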
@diegooliveira I'm mostly positive on your approach. Somewhat of a concern I see is that there might be (non-Docker) applications and networking topologies which do not follow one of the two patterns we've been discussing so far. I'm not too deep into the CNI space, but I know that Mesos supports it and AFAIU it enables very different kinds of networking models, some of which may not be covered by our binary classification. For those cases, however, the manual override should hopefully do the trick. So 👍 on moving forward with your suggestion. |
@Gabitchov If you're fine with making a small modification to the vanilla Marathon template until better auto-detection lands, getting the Marathon provider to speak to hostnames instead of task IP addresses is pretty easy: In line 4, simply replace
This change does not require (re-)compiling Traefik: copying the existing template, making the adjustment, and referencing it via |
@timoreimann I have a patch ready to fix the unsound behavior; please review it: #1345. |
traefik#1243 (comment) + add ability to override "docker build" command in Makefile (help to coss corporate proxy): make DOCKER_BUILD="docker build --build-arg ..." + commit the big fat traefik binary so dockerhub is happy Signed-off-by: Gaetan Semet <gaetan@xeberon.net>
Fixed by #1345 |
What version of Traefik are you using (traefik version)?
What is your environment & configuration (arguments, toml...)?
Linux RHEL 7.2
my configuration file
The deployed app details in Marathon :
The deployed app configuration
Notice the 2 traefik labels
What did you do?
I tried to access my application through Traefik with the URL
http://infra-q-i-mes01:8008/trace/
What did you expect to see?
I expect to see the login webpage of my application trace.
What did you see instead?
I get an HTTP error :
502 Bad Gateway
Just to confirm, the IP address of the trace container, 10.16.0.3, is indeed not routable. So the problem seems to come from Traefik using the IP address provided by Marathon instead of the endpoint. Is that normal (in which case, do you know of a way to configure Marathon to provide the IP address of the Docker host), or is it a bug or configuration issue with Traefik? In any case, it seems incoherent to me, as the IP address is the address of the container and the port is the exported port (hence on the Docker host).
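To make the mismatch concrete, here is a small sketch that builds the three address and port pairings from the task data in the app details. The values are copied from this issue; the helper `backend_url` is a hypothetical name used only for illustration and is not part of Traefik.

```python
# Values copied from the single running task of the /trace app.
task = {
    "host": "infra-q-i-mes01",
    "ports": [31855],  # host port allocated by Marathon (hostPort was 0)
    "ipAddresses": [{"ipAddress": "10.16.0.3", "protocol": "IPv4"}],
}
container_port = 8080  # containerPort from the app's portMappings


def backend_url(address: str, port: int) -> str:
    """Format a backend URL from an address and a port (illustrative helper)."""
    return f"http://{address}:{port}"


container_ip = task["ipAddresses"][0]["ipAddress"]
host_port = task["ports"][0]

# Pairing described in this report: container IP with the host port.
observed = backend_url(container_ip, host_port)
# Coherent pairing 1: Docker host with the port exported on that host.
via_host = backend_url(task["host"], host_port)
# Coherent pairing 2: container IP with the container port
# (only usable where the container network is routable).
via_container = backend_url(container_ip, container_port)

print("observed:     ", observed)
print("via host:     ", via_host)
print("via container:", via_container)
```

Only the second pairing is reachable in the setup described here, since 10.16.0.3 is not routable from the machine running Traefik.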
The traefik log