Make sure each provider keeps frontends even if backends are missing/empty #1689
Comments
Case 1 seems to be somewhat of a corner case. It could be remedied by exposing an "unready" state until the API has been accessed successfully at least once. An external process could then leverage that state to mark bootstrapping as incomplete until the ready state turns successful. This should be considered only if we really deem it necessary, and if so, be covered by a separate issue.
Case 2 seems to be one that Consul is affected by, at least that's my understanding from #1077 (comment). @grobinson-blockchain and @bsphere, can you confirm? If so, we can continue working towards a solution. One idea that @emilevauge had was to use custom error pages as designed in #1634 and have a special backend return just the desired error code. That would keep Traefik free from special-casing for providers that cannot be changed according to our needs.
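For illustration, the custom-error-pages idea could look roughly like the following Traefik v1 frontend configuration, where an errors section routes a chosen status range to a dedicated backend. This is a sketch, not the agreed-upon solution; the names `my-frontend`, `my-backend`, `error`, and the query path are all illustrative:

```toml
# Hypothetical frontend using the error-pages feature from #1634.
# The "error" backend would serve a static page with the desired status code.
[frontends]
  [frontends.my-frontend]
  backend = "my-backend"
    [frontends.my-frontend.errors]
      [frontends.my-frontend.errors.unavailable]
      status = ["503"]
      backend = "error"
      query = "/503.html"
```

The point of the design is that Traefik itself stays generic: the provider-specific knowledge of "what to answer when the backend is gone" lives entirely in the error backend's configuration.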
Found to be working for Marathon as well.
There seems to be one more case of 404s. I am evaluating a 3-node cluster and doing outage testing. If Consul has a small hiccup (say, due to a network issue) and returns a 500 error, Traefik removes all configuration, which makes the services running behind it unavailable for a few seconds, returning a 404 until Consul recovers. This is on Traefik v1.3.7.
Any news, or is there a plan to do anything around Consul? I'm also seeing the same problem: if a backend goes away while the service is still registered in Consul, Traefik returns a 404.
@timoreimann Has there been any update regarding this issue and Docker Swarm? Currently, the temporary 404 errors break my WebDAV sync clients connected to a Nextcloud instance running in the Docker Swarm behind Traefik. This also applies to standalone containers with a health check. To me it looks like this code completely filters out unhealthy containers and therefore causes 404 errors to be returned for them instead of 502/503. Are there any plans or possibilities to change this behavior in v1.7? The same applies to this code in the master/v2 branches.
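The filtering behavior described above can be sketched as follows. This is a simplified model, not Traefik's actual code; the `Container` type and both function names are hypothetical:

```go
package main

import "fmt"

// Container is a simplified, hypothetical stand-in for a Docker provider's
// view of a container; it is not one of Traefik's actual types.
type Container struct {
	Name    string
	Healthy bool
}

// frontendNamesFilteringUnhealthy models the current behavior: unhealthy
// containers are dropped entirely, so no frontend is generated for them and
// requests fall through to a 404.
func frontendNamesFilteringUnhealthy(cs []Container) []string {
	var frontends []string
	for _, c := range cs {
		if c.Healthy {
			frontends = append(frontends, c.Name)
		}
	}
	return frontends
}

// frontendNamesKeepingUnhealthy models the requested behavior: the frontend
// is kept even when the container is unhealthy, so the router still matches
// and can answer with 502/503 instead of 404.
func frontendNamesKeepingUnhealthy(cs []Container) []string {
	frontends := make([]string, 0, len(cs))
	for _, c := range cs {
		frontends = append(frontends, c.Name)
	}
	return frontends
}

func main() {
	cs := []Container{{Name: "nextcloud", Healthy: false}}
	fmt.Println(len(frontendNamesFilteringUnhealthy(cs))) // 0: frontend gone -> 404
	fmt.Println(len(frontendNamesKeepingUnhealthy(cs)))   // 1: frontend kept -> 503
}
```

The difference matters to clients: a 404 looks like "this resource does not exist" (which WebDAV sync clients may act on destructively), while a 502/503 correctly signals a temporary server-side failure.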
Without it, any service that is unavailable will cause Traefik to return a 404 or use the next matching IngressRoute, instead of returning a 503 as would be expected. Related tickets: traefik#1689, traefik#5332
Is there any reference to follow up on why this is impossible with Consul? An upstream bug?
In order to achieve #1688, all providers must leave frontends around even if backends are missing or empty.
Kubernetes already does this correctly; Marathon presumably does not. We need to go through each provider and fix things where needed. The fix could be as simple as updating the template, or it could require changes to the provider implementation.
It's worth noting that when a provider API is temporarily unavailable, all implementations should back off and let Traefik reuse the previous configuration. Thus, there seem to be only two remaining cases where we'd still get a 404 when we wanted a 503, namely:
List of providers that already pass the requirement (please extend as progress is made):
Refs #1077.
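The desired routing outcome from #1688 can be summarized in a small sketch: an unmatched request yields 404, a matched frontend with an empty backend yields 503, and a matched frontend with servers is proxied. The map-based configuration and `statusFor` function are hypothetical simplifications, not Traefik's actual routing code:

```go
package main

import "fmt"

// statusFor models the desired behavior: frontends map hostnames to their
// backend server lists. An unknown frontend yields 404, a known frontend
// with no servers yields 503, and a known frontend with servers would be
// proxied (modeled here as 200).
func statusFor(frontends map[string][]string, host string) int {
	servers, ok := frontends[host]
	switch {
	case !ok:
		return 404 // no frontend matched the request
	case len(servers) == 0:
		return 503 // frontend kept, but backend is empty
	default:
		return 200 // request would be forwarded to a server
	}
}

func main() {
	cfg := map[string][]string{
		"app.example.com":   {"10.0.0.1:8080"},
		"empty.example.com": {}, // backend lost its last server
	}
	fmt.Println(statusFor(cfg, "app.example.com"))     // 200
	fmt.Println(statusFor(cfg, "empty.example.com"))   // 503
	fmt.Println(statusFor(cfg, "unknown.example.com")) // 404
}
```

Under this model, a provider that drops the frontend along with its servers collapses the 503 case into the 404 case, which is exactly the bug this issue asks each provider to avoid.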