
CPU usage increase on Marathon leader when Traefik is configured with Marathon provider #3812

@renaudhager

Hi guys,

I searched on Stack Overflow but did not find anything related to my problem, which is why I'm opening this issue.

Do you want to request a feature or report a bug?

Bug

What did you do?

We run a Mesos and Marathon (1.5.0) cluster with 3 masters and ~20 agent nodes.

On each agent we run Traefik in a container, configured with Marathon as the provider. Here is the configuration file:

logLevel = "INFO"

[entryPoints]
    [entryPoints.http]
    address = ":80"

[traefikLog]
  format = "json"

[accessLog]
  format = "json"

# API definition
[api]

  entryPoint = "traefik"
  dashboard = true
  debug = false

  [api.statistics]
    recentErrors = 10

# Metrics definition
[metrics]

  [metrics.datadog]

    address = "10.X.X.X:8125"

    pushInterval = "10s"

# enable /ping on the API port
[ping]
entryPoint = "traefik"

################################################################
# Mesos/Marathon Provider
################################################################

# Enable Marathon Provider.
[marathon]

endpoint = "http://mesos.lan:8080/"

watch = true

domain = "service.lan"

exposedByDefault = false

It works well, all Marathon tasks are correctly configured in Traefik and traffic is correctly served.

What did you expect to see?

Little or no impact on the Marathon leader's CPU usage.

What did you see instead?

Since we started running Traefik (15/08), we have observed a significant increase in CPU usage on the Marathon leader (see the graphs below):

[Image: graphs of CPU usage on the Marathon leader]

We stopped all Traefik containers on 21/08, and CPU usage and load decreased immediately.

I briefly investigated how Traefik fetches information from Marathon by capturing traffic with tcpdump.
I saw the following request being made repeatedly:

GET //v2/apps?embed=apps.tasks&embed=apps.deployments&embed=apps.readiness HTTP/1.1
Host: mesos.lan:8080
User-Agent: Go-http-client/1.1
Accept: application/json
Content-Type: application/json
Accept-Encoding: gzip

This request returns information about all running tasks and seems to be quite resource-intensive for the Marathon leader.
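
To get a rough idea of how expensive that call is, the request can be replayed by hand. Below is a minimal Go sketch (standard library only) that issues the same query once and reports the body size and elapsed time. The helper names `appsURL` and `measureApps` are mine and purely illustrative; this is not how Traefik itself performs the call.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"strings"
	"time"
)

// appsURL builds the /v2/apps query seen in the capture.
// The trailing slash of the endpoint is trimmed to avoid a double slash in the path.
func appsURL(endpoint string) string {
	q := url.Values{}
	q.Add("embed", "apps.tasks")
	q.Add("embed", "apps.deployments")
	q.Add("embed", "apps.readiness")
	return strings.TrimRight(endpoint, "/") + "/v2/apps?" + q.Encode()
}

// measureApps performs the request once and returns the number of body bytes
// read and the total elapsed time. Go's HTTP transport negotiates gzip on its
// own and decompresses transparently, so the size is the decompressed size.
func measureApps(client *http.Client, endpoint string) (int64, time.Duration, error) {
	req, err := http.NewRequest(http.MethodGet, appsURL(endpoint), nil)
	if err != nil {
		return 0, 0, err
	}
	req.Header.Set("Accept", "application/json")

	start := time.Now()
	resp, err := client.Do(req)
	if err != nil {
		return 0, 0, err
	}
	defer resp.Body.Close()

	n, err := io.Copy(io.Discard, resp.Body)
	return n, time.Since(start), err
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: measure <marathon endpoint>")
		return
	}
	client := &http.Client{Timeout: 30 * time.Second}
	n, elapsed, err := measureApps(client, os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%d bytes in %s\n", n, elapsed)
}
```

Running it with `http://mesos.lan:8080/` as the argument, while watching the leader's CPU, should give a hint of the per-request cost, which is then multiplied by the number of Traefik instances polling.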

First, I'm wondering whether my configuration is OK or whether I missed something.

Second, Marathon exposes an event bus endpoint (https://mesosphere.github.io/marathon/docs/event-bus.html).
I quickly checked Traefik's dependencies, and it seems to use the following library to interact with Marathon: https://github.com/gambol99/go-marathon, which appears to be able to handle that stream.

I would like to know whether Traefik can be configured to subscribe to that event bus instead of querying the /v2/apps endpoint.

Output of traefik version: (What version of Traefik are you using?)

docker run traefik:v1.6 version
Version:      v1.6.6
Codename:     tetedemoine
Go version:   go1.10.3
Built:        2018-08-20_01:10:06PM
OS/Arch:      linux/amd64
