
Add profiling support for Gevent #53

Merged
merged 18 commits into from
Sep 17, 2020

Conversation

sumerc
Owner

@sumerc sumerc commented Apr 14, 2020

This PR aims to add support for profiling gevent greenlets, including multithreaded use.

@mguijarr

@sumerc so, the added tests are similar to those for asyncio: they cover both single-threaded and multithreaded cases.

@advance512

Is there any help that can be offered here?
This would be extremely useful for us.

@sumerc
Owner Author

sumerc commented Jun 17, 2020

Is there any help that can be offered here?

Help would be really appreciated! I could not find time to work on this, and it will take some time to test and tweak. We can implement something working in a few days, but manual testing, tweaking, and automating the tests will take time; we should also throw whatever we can at it. Basically, I can implement a PoC, and your experience with gevent would be really helpful for trying to break it. Then we can tweak it further.

Let me think on this for a few more days, and I will hopefully post something to play with. @mguijarr has already done great work helping with the tests and explaining the internals of gevent in the relevant issue: #47

@sumerc
Owner Author

sumerc commented Jun 17, 2020

Ok guys.

Below is the first problem we need to solve:

import gevent
import yappi


def burn_cpu(sec):
    t0 = yappi.get_clock_time()
    elapsed = 0
    while (elapsed < sec):
        for _ in range(1000):
            pass
        elapsed = yappi.get_clock_time() - t0


def foo():
    burn_cpu(0.1)
    gevent.sleep(1.0)


def bar():
    burn_cpu(0.1)
    gevent.sleep(1.0)


#yappi.set_clock_type("wall")
yappi.start()
g1 = gevent.spawn(foo)
g2 = gevent.spawn(bar)
gevent.wait([g1, g2])
yappi.get_func_stats(
    filter_callback=lambda x: yappi.func_matches(x, [foo, bar])
).print_all()

Now, if the clock type is CPU (the default), the output is as follows:

Clock type: CPU
Ordered by: totaltime, desc

name                                  ncall  tsub      ttot      tavg      
gevent_yappi.py:14 foo                1      0.000038  0.200483  0.200483
gevent_yappi.py:19 bar                1      0.000034  0.100193  0.100193

The CPU time of foo becomes 0.2 because gevent switches to another greenlet (the bar function) and bar's CPU time gets accumulated into foo.
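
This mis-attribution can be reproduced without gevent at all. Here is a minimal, gevent-free stand-in (burn_cpu and foo_measured are illustrative names, not part of yappi): a profiler that simply subtracts a function's entry timestamp from its exit timestamp charges that function for any CPU burned while it was switched out.

```python
import time

def burn_cpu(sec):
    # Busy-loop until 'sec' seconds of process CPU time have elapsed.
    t0 = time.process_time()
    while time.process_time() - t0 < sec:
        pass

def foo_measured():
    enter = time.process_time()   # profiler records the entry timestamp
    burn_cpu(0.05)                # foo's own work
    burn_cpu(0.05)                # stands in for bar running after a greenlet switch
    return time.process_time() - enter  # naive ttot: bar's work is charged to foo

print(round(foo_measured(), 2))  # ~0.1, double foo's real 0.05s of CPU
```

The fix therefore has to notice the switch and exclude the switched-out interval, rather than relying on plain enter/exit subtraction.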

I will be working on a patch to fix this, then we will move onto wall time.

@mguijarr, @advance512: how can I also make foo and bar run in different threads in this simple example? With monkey_patch, and with monkey_patch without threads? I would like to try every possible gevent running mode.

@advance512

Why do you want to run foo and bar on different threads? That isn't a common use-case for gevent.

What you can do is use gevent's native threadpool:
http://www.gevent.org/api/gevent.threadpool.html

Or, create two gevent Hub() objects, each on a different thread, and have each spawn either foo or bar.
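
The threadpool suggestion can be sketched as follows (a minimal, hedged example: task_a and task_b are placeholder CPU-bound functions, not the foo/bar from the profiling snippet above):

```python
from gevent.threadpool import ThreadPool

# Each spawn runs its callable on a native OS worker thread owned by the
# pool, rather than on the hub's own thread.

def task_a():
    return sum(range(100_000))

def task_b():
    return sum(range(50_000))

pool = ThreadPool(2)        # up to two native worker threads
g1 = pool.spawn(task_a)     # returns an async result object
g2 = pool.spawn(task_b)
print(g1.get(), g2.get())   # .get() blocks until the worker finishes
```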

@sumerc
Owner Author

sumerc commented Jun 18, 2020

Why do you want to run foo and bar on different threads? That isn't a common use-case for gevent.

I would like to write tests where the same function is called concurrently from different greenlets (on the same thread and on different threads), plus some other multithreading-related tests. That is why I asked.

What you can do is use gevent's native threadpool:
http://www.gevent.org/api/gevent.threadpool.html
...

Thanks for the information. Will look into those!

@sumerc
Owner Author

sumerc commented Jul 23, 2020

I have been trying to allocate time to this issue. The problem is that it is not easy and requires continuous, focused work, which I currently cannot give it, and that is really pissing me off :)

Here are the steps needed to support gevent in Yappi:

1/ Choose a minimum gevent version that we would like to support.
2/ Add a new API, set_ctx_backend, that supports both gevent and asyncio; the default will be asyncio.
3/ Implement a new set_ctx_id callback that returns the current greenlet id on every function call enter/exit.
4/ On the C side, correct the calculations whenever a greenlet switch is detected. Since multiple greenlets can run in a single thread, all of their CPU time values get accumulated on top of each other; I have also seen issues with wall time too. This is where we need to debug, by putting hooks into the call_enter/call_leave events, to understand what happens when a greenlet switch occurs (to the same thread or to a different thread) and which values need to be corrected, at minimum, to give correct results. This is the hardest part to get right.
5/ Write lots of tests (think about edge cases).
6/ Update docs.
7/ Release.

If anybody would like to work on this, I can help/assist with the implementation at any of the above steps or unclear points. Otherwise I will be setting this to Blocked and waiting a few more months, unfortunately. :(

@Suhail-MOHD
Contributor

@sumerc Our team at Nutanix would like to help!

Our service is built upon gevent and its monkey-patch feature. Lately we have been searching for a profiling tool, and only statistical profilers have worked well with gevent. We were excited to find GreenletProfiler, but after further investigation we realised it cannot report CPU usage reliably, for the same reason you mentioned above.

We would like to contribute to this issue and help fix it.

As a next step, I would like to provide a fix proposal with a POC alongside it. May I proceed and pick this up?

@sumerc
Owner Author

sumerc commented Jul 25, 2020

Hi,

I would be more than happy to accept a fix proposal and a POC!

There are already some tests written by @mguijarr for both single- and multithreaded cases. I hope they help you with edge cases.

Please proceed :)

@advance512

Exciting! Can't wait to see what @Suhail-MOHD cooks up.

…ame thread

This change adds support for greenlets in yappi by doing the following:

- ensure '_profile_thread' is called only from the target thread to be profiled.

  '_profile_thread' invokes the '_current_context_id' operation which retrieves
  the execution context of a thread.

  Using '_current_context_id' to retrieve the context information of another
  thread is not guaranteed to work, and is hard to get right when yappi is
  configured with 'set_context_id_callback', because the callback is not told
  which thread it has to fetch information for.

  It is also too hard to achieve the above when using greenlets, because
  there is no obvious way to retrieve the greenlet currently executing
  on another thread, i.e. there is no simple way to retrieve the greenlet
  currently executing on threadA from threadB.

  To get around this, this change guarantees that '_profile_thread' and
  '_ensure_thread_profiled' are called only from the target thread, by removing
  their invocations from the 'start' sequence when 'multithreading' is enabled
  and delaying them. This is done by attaching a separate profiler func callback
  (profile_event and profile_event_c) to the threads. These functions first
  run '_ensure_thread_profiled' in the target thread context and change the profiler
  func callback to '_yapp_callback'. This way the work done by the 'start' sequence
  is reduced and '_ensure_thread_profiled' is delayed to the first time a callback
  is invoked on the target thread.

- add 'set_ctx_backend' and allow specifying threading backend as 'greenlets'

- modify '_current_context_id' to identify the greenlet currently being executed

- account for interleaving greenlet executions on the same native thread

  'greenlet' is a co-operative multitasking framework that introduces
  user-controlled threads called 'greenlets'. Each native Python thread can run
  a number of greenlets. Greenlets on the same thread co-operatively switch
  between one another using 'greenlet.switch'. It is important to note that a
  greenlet can only execute on the thread it was spawned on; as a result,
  greenlets spawned on one thread can only switch between each other.

  Greenlets do not play well with yappi. Yappi uses thread-local stats to track
  the CPU usage of each frame: it records the usage at the entry and exit of
  each function and subtracts the values to get the total clock cycles spent
  in that frame. This does not work well with greenlets. Let's take two
  greenlets A and B and view the following sequence of events.

  - greenletA is currently running
  - greenletA enters function 'foo'
  - greenletA co-operatively switches to greenletB
  - greenletB consumes 1 second of CPU
  - greenletB switches back to greenletA
  - greenletA exits function 'foo'

  The CPU usage of function foo will be 1 second more than it should be.

  To account for this, we have to track the context switches and adjust the
  stats. The solution implemented in this change is a derivative of
  https://github.com/ajdavis/GreenletProfiler, with some modifications to
  account for multiple native threads.

  To track context switches:
    - declare a new thread local variable 'tl_prev_ctx'. This needs to be
      thread local since multiple threads can exist. Greenlet context switches
      should be tracked at a thread level rather than at a global level.
    - 'tl_prev_ctx' is initialised in '_ensure_thread_profiled', to the
      greenlet currently running on that thread.
    - use 'tl_prev_ctx' in '_yapp_callback' to determine if the context
      has changed since the last time it was set.

  To adjust stats, if a switch is noticed from 'prev'->'current', then:
    - pause 'prev'. Store the time it was paused under the '_ctx' struct
      associated with that greenlet
    - resume the current greenlet. use the 'paused_at' field to determine
      how long the greenlet was paused for. Add this value to all timestamps
      in the current callstack.
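
The pause/resume adjustment above can be modelled in a few lines of Python (purely illustrative: the Ctx/on_switch/paused_at names are hypothetical, and the real fix lives in yappi's C layer; integer tick counts stand in for clock readings):

```python
class Ctx:
    def __init__(self):
        self.paused_at = None   # tick at which this context last lost the CPU
        self.enter_times = []   # entry ticks of frames on this context's stack

def on_switch(prev, current, now):
    # Pause the context being switched away from.
    prev.paused_at = now
    # Credit the resumed context for the time it spent paused by shifting
    # every entry timestamp in its call stack forward.
    if current.paused_at is not None:
        paused_for = now - current.paused_at
        current.enter_times = [t + paused_for for t in current.enter_times]
        current.paused_at = None

# Replay the greenletA/greenletB sequence above:
a, b = Ctx(), Ctx()
a.enter_times = [0]    # greenletA enters 'foo' at tick 0
on_switch(a, b, 1)     # A -> B at tick 1
on_switch(b, a, 11)    # B -> A at tick 11, after B burned 10 ticks
print(12 - a.enter_times[0])  # foo exits at tick 12: charged 2 ticks, not 12
```

After the resume, foo's entry timestamp has been shifted from 0 to 10, so the enter/exit subtraction no longer charges foo for the 10 ticks greenletB was running.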
@achauvin-wish

We were excited to find GreenletProfiler, but after further investigation we realised it cannot report CPU usage reliably for the same reason you've mentioned above.

I was going to run some tests and saw your reply here, so I wanted to ask about the results you obtained. Does GreenletProfiler not perform accurately on Python 2.7?

@Suhail-MOHD
Contributor

@achauvin-wish I would like to amend my previous statement after spending some time understanding both the GreenletProfiler and yappi sources. GreenletProfiler did not return accurate results for the service my team works on. Our service uses multiple native threads: one thread running greenlets, and other threads running operations that would not work well in the greenlet thread. I can write a test that demonstrates how GreenletProfiler fails in such a system.

For Python 2.7, I believe GreenletProfiler may work in a system with one native thread running multiple greenlets, but I have not tested it.

@sumerc
Owner Author

sumerc commented Aug 13, 2020

For reference: the issue is ongoing with another PR: #57

Suhail-MOHD and others added 9 commits August 24, 2020 22:41
This change:
 - Introduces a new flag, profile_greenlets, under 'start'
 - Introduces a new API, get_greenlet_stats, to retrieve greenlet stats
 - get_thread_stats will fail when profiling greenlets, and profile_threads
   is ignored. Conversely, get_greenlet_stats will fail when profiling threads,
   and profile_greenlets is ignored.
Store the last seen tickcount for every context. Use this while
calculating 'ttot' for the context
@sumerc
Owner Author

sumerc commented Sep 17, 2020

#57 has been merged. Thanks to @Suhail-MOHD, Yappi has initial support for gevent profiling!

A new release will hopefully be coming in a few days!

@sumerc sumerc merged commit ba0a264 into master Sep 17, 2020