Traefik does not register backend services when any service has a configuration error #3313

Open
littleninja opened this issue May 11, 2018 · 8 comments
Assignees
Labels
area/provider/servicefabric kind/bug/confirmed a confirmed bug (reproducible). priority/P2 need to be fixed in the future

Comments

@littleninja

littleninja commented May 11, 2018

Do you want to request a feature or report a bug?

BUG

What did you do?

  1. Deploy a service (BorkService) to a Service Fabric cluster with an invalid ServiceManifest.xml:
<StatelessServiceType ServiceTypeName="BorkServiceType" UseImplicitHost="true">
  <Extensions>
    <Extension Name="Traefik">
      <Labels xmlns="http://schemas.microsoft.com/2015/03/fabact-no-schema">
        <Label Key="traefik.frontend.rule.borkservice">PathPrefixStrip: /bork</Label>
        <Label Key="traefik.enable">true</Label>
        <Label Key="traefik.frontend.entryPoints">["https","http"]</Label>
      </Labels>
    </Extension>
  </Extensions>
</StatelessServiceType>
  2. Deploy another service (GoodService) to the cluster with a valid ServiceManifest.xml.
  3. Deploy Traefik to the cluster.

What did you expect to see?

Traefik should register GoodService but not BorkService.

What did you see instead?

Neither backend service registers. The Traefik dashboard is entirely blank except for the header. The errors are visible only in the Traefik logs on the server.

Output of traefik version: (What version of Traefik are you using?)

1.6.0

What is your environment & configuration (arguments, toml, provider, platform, ...)?

Provider: Service Fabric v6.1.480.9494
Platform: Windows Server 2016 Datacenter

################################################################
# Global configuration
################################################################

InsecureSkipVerify = true

# Enable debug mode
#
# Optional
# Default: false
#
debug = true

# Traefik logs file
# If not defined, logs to stdout
#
# Optional
#

[traefikLog]
  filePath = "traefik.log"

# Log level
#
# Optional
# Default: "ERROR"

logLevel = "ERROR"

# Entrypoints to be used by frontends that do not specify any entrypoint.
# Each frontend can specify its own entrypoints.
#
# Optional
# Default: ["http"]
#
defaultEntryPoints = ["https","http"]

# Entrypoints definition
#
# Optional
# Default:
[entryPoints]
  [entryPoints.http]
  address = ":30080"
  [entryPoints.https]
  address = ":30443"
    [entryPoints.https.proxyProtocol]
      insecure = true
    [entryPoints.https.tls]
      [entryPoints.https.tls.ClientCA]
      optional = true
      [[entryPoints.https.tls.certificates]]
      certFile = "localhost.cert"
      keyFile = "localhost.key"
[entryPoints.traefik]
address = ":38080"


################################################################
# API definition
################################################################

[api]
  # Name of the related entry point
  #
  # Optional
  # Default: "traefik"
  #
  entryPoint = "traefik"

  # Enabled Dashboard
  #
  # Optional
  # Default: true
  #
  dashboard = true

################################################################
# Service Fabric provider
################################################################

# Enable Service Fabric configuration backend
[servicefabric]

# Service Fabric Management Endpoint
clustermanagementurl = "https://localhost:19080"

# Service Fabric Management Endpoint API Version
apiversion = "3.0"

# Enable TLS connection.
#
# Optional
#
[serviceFabric.tls]
  cert = "traefikcert.crt"
  key = "traefikkey.key"
  insecureskipverify = true
  caoptional = true

If applicable, please paste the log output at DEBUG level (--logLevel=DEBUG switch)

time="2018-05-11T19:54:07Z" level=error msg="Provider connection error: Near line 133 (last key parsed 'frontends.frontend-fabric:/BorkService/BorkService.entryPoints'): expected a comma or array terminator ']', but got 'h' instead; retrying in 1.008273491s"
@ldez
Contributor

ldez commented May 11, 2018

Thanks for your interest in Traefik!

It's the expected behavior.

You can join the Traefik community Slack


Note, your configuration contains errors:

InsecureSkipVerify = true

# Enable debug mode
#
# Optional
# Default: false
#
debug = true

# Traefik logs file
# If not defined, logs to stdout
#
# Optional
#

[traefikLog]
  filePath = "traefik.log"

# Log level
#
# Optional
# Default: "ERROR"

logLevel = "ERROR"

# Entrypoints to be used by frontends that do not specify any entrypoint.
# Each frontend can specify its own entrypoints.
#
# Optional
# Default: ["http"]
#
defaultEntryPoints = ["https","http"]

# ...

With the comments removed:

InsecureSkipVerify = true
debug = true

[traefikLog]
  filePath = "traefik.log"
logLevel = "ERROR"
defaultEntryPoints = ["https","http"]

# ...

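TOML is not based on indentation. Keys that follow a table header such as [traefikLog] belong to that table until the next header, so in the configuration above logLevel and defaultEntryPoints are parsed as part of traefikLog rather than as global options.

You must write your configuration like this: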

InsecureSkipVerify = true
debug = true
logLevel = "ERROR"
defaultEntryPoints = ["https","http"]

[traefikLog]
  filePath = "traefik.log"

# ...

And the correct syntax for the entry points label is:

 <Label Key="traefik.frontend.entryPoints">https,http</Label>

@ldez ldez added kind/question a question and removed status/0-needs-triage labels May 11, 2018
@ldez ldez closed this as completed May 11, 2018
@littleninja
Author

littleninja commented May 11, 2018

Thanks and good find, @ldez. We fixed our TOML and redeployed, but we're still seeing this issue. We think the ServiceManifest.xml caused the issue, as removing the entrypoints label allowed the backend services to register.

Edit to clarify: We expect a misconfigured service not to register, but instead it seems to block all services from registering, and this seems like a bug. Could you take another look?

Here is our current toml:

################################################################
# Global configuration
################################################################

InsecureSkipVerify = true
logLevel = "ERROR"
defaultEntryPoints = ["https","http"]

#debug = true
#[traefikLog]
#  filePath = "traefik.log"

[entryPoints]
  [entryPoints.http]
  address = ":30080"
    [entryPoints.http.proxyProtocol]
      insecure = true
  [entryPoints.https]
  address = ":30443"
    [entryPoints.https.proxyProtocol]
      insecure = true
    [entryPoints.https.tls]
      [entryPoints.https.tls.ClientCA]
      optional = true
      [[entryPoints.https.tls.certificates]]
      certFile = "localhost.cert"
      keyFile = "localhost.key"
[entryPoints.traefik]
address = ":38080"

################################################################
# API definition
################################################################

[api]
  entryPoint = "traefik"
  dashboard = true

################################################################
# Service Fabric provider
################################################################

[servicefabric]
clustermanagementurl = "https://localhost:19080"
apiversion = "3.0"

[serviceFabric.tls]
  cert = "traefikcert.crt"
  key = "traefikkey.key"
  insecureskipverify = true
  caoptional = true

@littleninja
Author

@ldez I see your edit, thanks. The issue is this: a single misconfigured service breaks Traefik and puts all services at risk of an outage. I would expect Traefik to gracefully recover and register the other backends, but that's not happening. Can we agree that sounds like a bug, or is it expected behavior?

@ldez ldez changed the title Traefik does not register backend services when any service has a configuration error (Service Fabric) Traefik does not register backend services when any service has a configuration error Jun 1, 2018
@ldez ldez removed the kind/question a question label Jun 1, 2018
@ldez ldez reopened this Jun 1, 2018
@ldez ldez self-assigned this Jun 1, 2018
@mmatur mmatur added priority/P2 need to be fixed in the future kind/bug/confirmed a confirmed bug (reproducible). and removed status/0-needs-triage labels Jun 1, 2018
@jcleavel

This is kind of a big deal.
Does anyone have any idea when this might get looked at?

@johnib

johnib commented Sep 15, 2018

I hope this behavior will change in the near future.

Applications in Service Fabric can be unrelated to one another, developed by different teams running different products.

If this behavior remains intact, it puts me in a position where my availability depends on others configuring their apps right.
The more apps running on the cluster, the less available I am.

We would be happy to have this behavior modified.

@dzoech

dzoech commented Jun 17, 2019

Any updates on this issue?

@iambryancs

I experienced the same thing running multiple services via a stack on a Swarm.
Two services were misconfigured and had the same Traefik port.

@emil-nasso

emil-nasso commented Nov 18, 2020

We have run into this issue too.

We deployed a new service with a bad configuration. The name of the router/service was missing, i.e. traefik.http.services..loadbalancer.server.port=80. This is just plain broken and a mistake on our side, can't blame Traefik for that. 😄
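
A mistake like this can be caught before deployment with a small check on the label keys. The following is a minimal sketch, not part of Traefik: `has_empty_segment` is a hypothetical helper, and the service name `web` in the valid example is invented for illustration.

```python
def has_empty_segment(label_key: str) -> bool:
    """Return True if a dotted label key contains an empty segment,
    e.g. two consecutive dots where a name was left out."""
    return any(segment == "" for segment in label_key.split("."))


# Key from the comment above: the service name between the dots is missing.
broken = "traefik.http.services..loadbalancer.server.port"
# Same key with a hypothetical service name filled in.
valid = "traefik.http.services.web.loadbalancer.server.port"

print(broken, has_empty_segment(broken))
print(valid, has_empty_segment(valid))
```

Such a check could run in a CI step over the labels of a stack file, so a missing name is flagged before the provider ever sees it.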

The provider scans all services and when it finds this error, it stops, not providing configuration for any services.

In a perfect world, the provider would look at the labels for each service individually, and if a syntax error was detected on one of the services, that service would be skipped but the other services would still be considered valid.

If that was the case our cluster would still be up and running with almost all our services and only the misconfigured service would be down (we can't expect traefik to read minds, after all 😅 ).

Are you able to confirm whether any limitations force this behavior, or whether this should be considered a feature request or a bug, @ldez? This would be a great addition to an already rock solid product. 🙂

Can I provide any more information?

9 participants