Traefik not updating config #42
Comments
I am indeed noticing this behavior. I have noticed that some of my containers are set with really long health checks, and when those are in play I think it tends to exacerbate the problem. |
I have the same problem. When I upgrade a server and the IP address changes, it does not get reflected in the traefik config. Is there a way to manually regenerate the rules & traefik.toml? Currently I restart the traefik docker container and the config is correct again, but this is not suitable for production. |
@rawmind0 any recommendations on how to fix this in situ? I have tried restarting either rancher-traefik or alpine-traefik, or both, with curious results. One of which being banned from letsencrypt by rate limiting :( I'd like to know if there is a better method, perhaps a command I can run inside one of the containers to force it to reload its configuration without dropping all the certs. Another thought is that maybe we could have a version of this that keeps all its configs in a convoy-nfs mount. I know all of this might be moot as well once traefik begins to natively support rancher. |
Hi guys, sorry about the issues you have been suffering. Could you please provide some more details? BTW, inside the alpine-traefik container, you can restart traefik or confd without needing to restart the container.
|
At the beginning everything worked fine, but after some time rancher-traefik did not pick up the new IP after an upgrade of a container (and the resulting IP change). It still had the old IP address for the backend. I am not sure, but it could be related to an update to rancher version 1.5.3. Currently I am testing the new native Traefik Rancher backend and it looks promising. |
Hello, I have the same problem with Rancher 1.5.3 and Traefik rawmind/alpine-traefik:1.2.3-1. EDIT: Maybe it's because confd does not refresh metadata:
btw dns is working on other containers and metadata works. Due to rancher/rancher#5041 I tried to add search into rancher ui and after upgrade dns is now working but confd is always empty :( |
Hi @snahelou ... This is not the cause of the problem. confd is able to resolve the rancher URI and connect. This problem is with alpine curl, not confd. If you do curl http://rancher-metadata.rancher.internal it should work. Please publish the confd logs: /opt/tools/confd/log/confd.log inside the alpine-traefik containers. Do your services have health checks configured? |
Hello Yes sorry, dns was not the problem. I had the following error
I removed 2 stacks and the service became available again. It's strange because the stacks were green. |
It seems you didn't have health checks configured. Health checks are mandatory: only healthy backends are added to traefik. |
OK, strange, health checks were configured because I used a jenkins multibranch pipeline and the other branches work well. Thanks for your support. Regards |
Hi! New stacks and changes to stacks sometimes don't get reflected in every host config. About our configuration:
One note: the confd log of traefik1 shows the error "executing "rules.toml.tmpl" at <getv (printf "/stack...>: error calling getv: key does not exist", but traefik1 is the one that is configured correctly; traefik2 is the one that is not (not refreshed). I've also checked every traefik label on the servers and they are exactly the same as the one attached. Anyone else seeing the same? Thanks! (Screenshots: traefik 2 dashboard, where the test-portal1-14-06 service is not discovered; traefik 1 dashboard, where the test-portal1-14-06 service is discovered. Attachment: traefik-1-confd.txt) |
Some more information: I've checked the file /opt/traefik/etc/rules.toml on traefik-1 and traefik-2, and on both of them the "test-portal1-14-06" service configuration is present. I don't know why traefik does not reload, perhaps related to this? |
Check if all of your stacks are green, even if they have no traefik tags. Regards |
When a container crashes and restarts itself, Traefik correctly removes the container from the pool but doesn't re-add it once it has restarted. I have to manually scale the stack up and down to get Traefik to pick it up. Any ideas? Considering abandoning this image and going for the native Rancher support in Traefik 1.3 to see if that resolves it. |
@dbsanfte, no idea. I've tried evacuating a host and traefik updates correctly when new containers are created on other hosts. Some tests I've done, not sure if they are the ones that make it work now (just in case it helps someone):
With no more red stacks and using ubuntu 16.04, traefik seems to have been working OK for at least 24 hours
@jjscarafia, your case is so strange. According to your confd log files, the last update should set the rules.toml file to the same content. It's so strange that it works on just one server. Are the infrastructure services working well on both?
traefik-1-confd.txt
Is it working well with ubuntu and docker 1.12.6? |
Hi @rawmind0 and thanks for the comments!
@rawmind0 just in case you are available and want, I can give you access to the rancher, just send me an email to jjs@adhoc.com.ar |
Hi @jjscarafia ...
Best regards.... |
I've been playing for a while and I can see that:
|
Moving over to the native Traefik Rancher support resolved my issue with my crashed/auto-restarted Node.js containers not being picked up by this image. |
@dbsanfte good to know that and thanks for sharing. Are you also using acme support with native rancher support? |
No we're just defining a plain old SSL cert/key, no ACME. |
I just hit this one too. In my case, a host went down which caused some stacks to migrate to another host. There were some other stacks that were simply stopped because I didn't want them alive at the moment. Traefik did not start updating until I started those stacks as well, which I could then stop at my leisure. |
@lasley moving to traefik's native rancher support made it work OK for me. |
@jjscarafia I've built something similar using the native rancher templates: https://github.com/nhsuk/traefik-rancher Unfortunately I've come across a critical bug which stops us using Traefik for now: traefik/traefik#1927 |
@adamgraves-choices thanks for the feedback. It seems that was the issue I faced yesterday... |
Honestly I thought I was just screwing up somehow so I wasn't even going to say anything 😆 |
I am having a similar issue. I was able to get past the error in the log message by setting an environmental variable When confd completes its interval I do in fact see a new I believe it is skipping over the following block in the template because rancher-meta has not yet registered the container is healthy by the time confd finishes writing the new rules.toml.
It seems that confd is triggered to run when it detects a change in the number of stacks in "latest", but if the container is not "healthy" by the time it writes the new rules file, it will skip over that part of the template. My suspicion is that, since the number of stacks doesn't change by the next interval, the rules.toml doesn't get updated until the number of stacks changes in rancher, which could be a long time or even never. If my suspicion is correct, is there a better way of updating the rules.toml than counting the number of stacks in rancher? I do have health checks configured on all my stacks, so I am not sure how to move forward. Once again assuming that confd is only looking for a change in the number of stacks in the environment, I see 3 possible solutions.
|
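To make the suspicion above concrete, here is a minimal Python sketch. It is not confd's actual implementation, and nobody in this thread has confirmed that confd triggers on the stack count. The snapshot shape, the `health_state` values, the IP address and all three function names are hypothetical, chosen only for illustration. It contrasts a trigger keyed on the number of stacks with one keyed on a digest of the whole state, for a container that becomes healthy after the first render.

```python
import hashlib
import json


def count_trigger(snapshot):
    """Hypothetical trigger: its value changes only when the stack count changes."""
    return len(snapshot)


def digest_trigger(snapshot):
    """Hypothetical trigger: its value changes whenever any field changes."""
    payload = json.dumps(snapshot, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def render_backends(snapshot):
    """Mimic a template that only emits containers reported as healthy."""
    return sorted(
        container["ip"]
        for stack in snapshot
        for container in stack["containers"]
        if container["health_state"] == "healthy"
    )


# First interval: the stack exists, but its container is not healthy yet.
before = [
    {"name": "portal",
     "containers": [{"ip": "10.42.0.5", "health_state": "initializing"}]},
]
# Next interval: same stacks, but the container has turned healthy.
after = [
    {"name": "portal",
     "containers": [{"ip": "10.42.0.5", "health_state": "healthy"}]},
]

if __name__ == "__main__":
    print("count trigger fires: ", count_trigger(before) != count_trigger(after))
    print("digest trigger fires:", digest_trigger(before) != digest_trigger(after))
    print("backends before:", render_backends(before))
    print("backends after: ", render_backends(after))
```

If the trigger really behaved like `count_trigger`, the second interval would not re-render and the backend would stay missing, which matches the symptom described. A trigger that covers health state, like `digest_trigger`, would fire.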
@alexisaperez - Regarding confd - I think that it's a dumb implementation & simply rewrites the rules every X units of time. The reasoning behind this assertion is that when I make the comma change in #51, it's just a few seconds until the rule is updated in Traefik. I'm definitely no confd expert though, so it's possible it's noticing the change in the rules file itself and triggering the update. |
@lasley I thought that at first as well, but in my testing it seems that the rules.toml only gets updated when the number of stacks in the environment changes. I also am not an expert in confd it is just what I observed. I think one way that might solve the issue for my environment at least would be to change the key in the |
I'm also having the same problem with frontends/backends not getting updated although everything is green and healthy. The confd.log shows plenty of: 2017-10-12T12:55:53Z traefik-traefik-1 /opt/tools/confd/bin/confd[24]: ERROR template: traefik.crt.tmpl:1:20: executing "traefik.crt.tmpl" at <getv "/traefik/ssl_c...>: error calling getv: key does not exist |
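For anyone triaging the same log, here is a small Python sketch that tallies these "key does not exist" errors by template and by the (truncated) `getv` argument. The regular expression is based only on the two log excerpts quoted in this thread, so other confd versions may format the message differently. The function name `missing_key_errors` is hypothetical.

```python
import re
from collections import Counter

# Pattern derived from the log excerpts quoted in this thread (an assumption,
# not an official confd log format).
MISSING_KEY = re.compile(
    r'executing "(?P<template>[^"]+)" at <getv (?P<arg>.*?)>: '
    r'error calling getv: key does not exist'
)


def missing_key_errors(lines):
    """Count missing-key errors per (template, truncated getv argument)."""
    counts = Counter()
    for line in lines:
        match = MISSING_KEY.search(line)
        if match:
            counts[(match.group("template"), match.group("arg"))] += 1
    return counts


# The log line quoted in the comment above, repeated to simulate a noisy log.
sample = (
    '2017-10-12T12:55:53Z traefik-traefik-1 /opt/tools/confd/bin/confd[24]: '
    'ERROR template: traefik.crt.tmpl:1:20: executing "traefik.crt.tmpl" at '
    '<getv "/traefik/ssl_c...>: error calling getv: key does not exist'
)

if __name__ == "__main__":
    for (template, arg), n in missing_key_errors([sample, sample]).items():
        print(f"{n}x {template}: getv {arg}")
```

To run it against a real log, pass an open file object, for example the /opt/tools/confd/log/confd.log path mentioned earlier in the thread. The key stays truncated because confd itself shortens it in the message.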
Hi all, From alpine-traefik release 1.4.0-3, traefik's built-in rancher integration is supported, both metadata and api. Also, the community-catalog is already updated. Now 3 rancher integrations are available: metadata, api (traefik built-in) or external (rancher-traefik). Take into account that the labels are different with the traefik built-in integration: https://docs.traefik.io/configuration/backends/rancher/#labels-overriding-default-behaviour Also, I made a PR with a refactor of the rancher integration; it is merged and will be included in the next traefik release: traefik/traefik#2291 Best regards... |
Great news, great work! Thanks for the update! |
Hi all, rancher-traefik has been updated to use rancher-template instead of confd, to get immediate updates from metadata. The Traefik external integration uses it. Best regards... |
Hi,
We've got an intermittent issue where traefik isn't updating the frontend and backend configuration in our Rancher environment.
New stacks and changes to stacks sometimes don't get reflected in the config. Sometimes it resolves itself within approx. 10-60 minutes, but on some occasions we have to restart the Traefik stack. Sometimes that doesn't help, and we have ended up destroying the environment and rebuilding it from scratch to resolve the issue.
Last time it occurred I tested the rancher-metadata service to ensure that was working, and everything looked fine from there.
Anyone else encountering this?