[Profiling] Humongous CPU with event_stream #139

@migueleliasweb

Description

I recently developed, with the help of your client (which is quite awesome), a listener for Marathon events using the event_stream() method.

The idea is simple: listen for the "deployment_step_success" event and execute some internal business code when it happens.

During the development/test phase I used a local Marathon installation to feed events to my code, but when the code went live some problems started to appear.

The code seemed much slower than I expected and the CPU usage was ginormous.

After some quick profiling I discovered the problem lay in the sseclient library that is used behind the scenes, so I extracted the bit of code that actually listens for the events and ran another test.

from marathon import MarathonClient

marathon_client = MarathonClient(marathon_address)  # marathon_address = our cluster's API endpoint

for _event in marathon_client.event_stream():
    print("Got event : {}".format(_event.event_type))

Note: the Mesos/Marathon cluster in production runs around 60 apps with ~200 tasks in total.

This simple code hangs for about two minutes before it outputs the first "Got event ..." line!

Htop output: [screenshot from 2016-08-22 12-02-14]

Here's the cProfile output:

17857859 function calls (16810521 primitive calls) in 425.454 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    143/1    0.007    0.000  425.454  425.454 {built-in method exec}
        1    0.001    0.001  425.454  425.454 events.py:1(<module>)
       22    0.001    0.000  425.326   19.333 client.py:696(event_stream)
       38    4.582    0.121  425.004   11.184 sseclient.py:58(__next__)
   522457    0.650    0.000  268.910    0.001 sseclient.py:52(_event_complete)
   522458    0.644    0.000  268.261    0.001 re.py:167(search)
   522464  266.073    0.001  266.073    0.001 {method 'search' of '_sre.SRE_Pattern' objects}
1567259/522420    0.770    0.000  151.506    0.000 {built-in method next}
   522420    0.696    0.000  151.198    0.000 utils.py:368(stream_decode_response_unicode)
   522420    0.279    0.000  149.246    0.000 models.py:672(generate)
   522420    0.975    0.000  148.967    0.000 response.py:336(stream)
   522420    3.639    0.000  146.838    0.000 response.py:276(read)
   522420    1.078    0.000  134.903    0.000 client.py:518(read)
   522425    0.731    0.000  133.973    0.000 {method 'read' of '_io._RawIOBase' objects}
   522420    0.656    0.000  133.094    0.000 client.py:549(readinto)
       85    0.000    0.000  132.453    1.558 socket.py:360(readinto)
       85  132.452    1.558  132.452    1.558 {method 'recv_into' of '_socket.socket' objects}
   522420    0.281    0.000  132.438    0.000 {method 'readinto' of '_io.BufferedReader' objects}
   522420    0.976    0.000    3.345    0.000 response.py:180(_init_decoder)

Besides "recv_into", which is too low-level to do anything about, the regex is executed half a million times and accounts for most of the time spent in the whole run (266 of 425 seconds).
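My reading of the profile: sseclient appends each small chunk it reads to its buffer and then re-runs `re.search` over the *entire* buffer to check whether an event is complete, so the work grows quadratically with the distance between event boundaries. A minimal sketch of that cost model (the pattern and names here are illustrative, not sseclient's actual source):

```python
import re

# An end-of-event pattern similar in spirit to sseclient's check (illustrative).
end_of_event = re.compile(r"\r\n\r\n|\r\r|\n\n")

def chars_scanned(chunks):
    """Append chunks one at a time and re-search the WHOLE buffer after
    each append -- the strategy the profile suggests.  Returns how many
    characters the regex engine walks over in total."""
    buf = ""
    scanned = 0
    for chunk in chunks:
        buf += chunk
        scanned += len(buf)        # re.search may walk up to len(buf) chars
        end_of_event.search(buf)
    return scanned

# 10_000 one-byte reads with no event boundary => ~n^2/2 characters scanned
print(chars_scanned(["x"] * 10_000))  # 50005000
```

With half a million one-byte reads, as in the profile above, the rescanned-character count is on the order of 10^11, which matches the ~266 seconds spent inside `_sre.SRE_Pattern.search`.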

Any ideas on how to work around this?
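One workaround I've been sketching (not tested against a real cluster; this is my own minimal take on the SSE wire format, not marathon's or sseclient's code): consume the HTTP response line by line and assemble events incrementally, so no buffer is ever re-scanned and the cost stays linear.

```python
class LineSSEParser:
    """Minimal SSE event assembler: feed it decoded lines one at a time;
    it returns an (event_type, data) tuple when a blank line ends an event."""

    def __init__(self):
        self._event_type = None
        self._data = []

    def feed_line(self, line):
        if line == "":                      # blank line terminates an event
            if self._data:
                event = (self._event_type or "message", "\n".join(self._data))
                self._event_type, self._data = None, []
                return event
            self._event_type = None
            return None
        if line.startswith("event:"):
            self._event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            self._data.append(line[len("data:"):].strip())
        return None                         # id/retry/comment fields ignored here
```

Hooked up to Marathon's event endpoint it would look roughly like this (URL and headers are assumptions on my part): iterate `requests.get(url, stream=True, headers={"Accept": "text/event-stream"}).iter_lines(decode_unicode=True)` and call `feed_line` on each line, dispatching whenever it returns an event.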
