-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add profiling support for Gevent #53
Conversation
@sumerc so, the added tests are similar to those for asyncio : covering both single and multi threaded. |
Is there any help that can be offered here? |
Help would be really appreciated! I could not find time to work on this. But it will take some time to test and tweak. We can implement something working in few days, but manual testing/tweaking/automating tests will take time. We should also throw whatever we can to it. Basically, I can implement a PoC, and it would really be helpful your guys experience on gevent to help me break it? Then we could tweak it more. Let me think few more days on this and I will throw something to play with, hopefully. @mguijarr has already done great work on helping with the tests and explaining the internals of gevent in the relevant issue: #47 |
Ok guys. Below is the first problem we need to solve:
Now if the clock type is CPU(which is by default) the output is as following:
CPU time of foo becomes 0.2 as gevent switches to another greenlet (bar func) and accumulates that CPU time to foo. I will be working on a patch to fix this, then we will move onto wall time. @mguijarr , @advance512 How can I make foo and bar run in different threads as well in this simple example? |
Why do you want to run foo and bar on different threads? That isn't a common use-case for gevent. What you can do is use gevent's native threadpool: Or, create two gevent Hub() objects, each on a different thread, and have each spawn either foo or bar. |
I would like to make tests where same function is being called concurrently from different greenlets(same thread+different threads), or some other tests as well related to multiple threads. That is why I asked how to do it.
Thanks for the information. Will look into those! |
I have been trying to allocate time to this issue. The problem is: this issue is not easy and requires continuous/focused work which I currently don't have and it is really pissing me off :) Here are the steps that needs to support Gevent from Yappi: 1/ Choose a minimum gevent version that we would like to support. If anybody would like to work on this I can help/assist in the implementation with any of above steps/unclear points. Otherwise I will be setting this to |
@sumerc Our team at nutanix would like to help! Our service is built upon gevent and its monkey patch feature. Lately we have been searching for a profiling tool and have only found statistical profilers to work well with gevents. We were excited to find GreenletProfiler, but after further investigation we realised it cannot report CPU usage reliably for the same reason you've mentioned above. We would like to contribute to this issue and help fix it. As a next step, I would like to provide a fix proposal with a POC alongside it. May I proceed and pick this up? |
Hi, I would be more than happy to accept a fix proposal and a POC! There are already some tests written by mguijarr, for both single/multithreaded cases. I hope they might help you for edge cases. Please proceed :) |
Exciting! Can't wait to see what @Suhail-MOHD cooks up. |
…ame thread This change adds support for greenlets in yappi by doing the following: - ensure '_profile_thread' is called only from the target thread to be profiled. '_profile_thread' invokes the '_current_context_id' operation which retrieves the execution context of a thread. Using '_current_context_id' to retrieve the context information from another thread is not guaranteed to work and is hard to get right when yappi is configured with 'set_context_id_callback' because the callback is not passed the information about what thread it has to fetch information for. It is also to hard to achieve the above when using greenlets because there is no obvious way to retrieve the greenlet currently executing on another thread. i.e there is no simple way to retrieve the greenlet currently executing on threadA from threadB. To get around this, this change guarantees that '_profile_thread' and '_ensure_thread_profiled' is only called from the target thread by removing their invocations from the 'start' sequence when 'multithreading' is enabled and delaying it. This is done by attaching a separate profiler func callback (profile_event and profile_event_c) to the threads. These functions first run '_ensure_thread_profiled' in the target thread context and change the profiler func callback to '_yapp_callback'. This way the work done by the 'start' sequence is reduced and '_ensure_thread_profiled' is delayed to the first time a callback is invoked on the target thread. - add 'set_ctx_backend' and allow specifying threading backend as 'greenlets' - modify '_current_context_id' to identify the greenlet currently being executed - account for interleaving greenlet executions on the same native thread 'greenlet' is a co-operative multitasking framework that introduces user controlled threads called 'greenlets'. Each native python thread can run a number of greenlets. Greenlets on the same thread co-operatively switch between one another using 'greenlet.switch'. it is important to note that a greenlet can only execute on the thread it was spawned on. As a result, greenlets spawned on one thread can only swithc between each other. Greenlets does not play wel with yappi. Yappi uses thread local stats to track the CPU usage of each frame. It tracks the usage at the entry of each function and at the exit of each function and subtracts the values to get the total clock cycles spent in that frame. This does not work well with greenlets. Let's take an example of two greenlets A and B and view the following sequence of events. - greenletA is currently running - greenletA enters function 'foo' - greenletA co-operatively switches to greenletB - greenletB consumes 1 second of CPU - greenletB switches over back to greenletA - greenletA exits function 'foo' The CPU usage of function foo will be 1 second more than it should be. To account for this we have to track the context switches and adjust stats. The solution implemented in this change is a derivative of https://github.com/ajdavis/GreenletProfiler with some modifications to account for multiple native threads To track context switches: - declare a new thread local variable 'tl_prev_ctx'. This needs to be thread local since multiple threads can exist. Greenlet context switches should be tracked at a thread level rather than at a global level. - 'tl_prev_ctx' is initialised in '_ensure_thread_profiled'. It is initialised to the greenlet currently running on that thread - use 'tl_prev_ctx' in '_yapp_callback' to determine if the context changed since the last time it was set. To adjust stats, if a switch is noticed from 'prev'->'current', then: - pause 'prev'. Store the time it was paused under the '_ctx' struct associated with that greenlet - resume the current greenlet. use the 'paused_at' field to determine how long the greenlet was paused for. Add this value to all timestamps in the current callstack.
I was going to run some tests and saw your reply here so I wanted to ask about the results you obtained. Does the GreenletProfiler not perform accurately on python 2.7? |
@achauvin-wish I would like to amend my previous statement after having spent some time understanding both For python2.7, I believe |
For reference: the issue is ongoing with another PR: #57 |
- move greenlet specificd ata stored under _ctx to a separate field - replace '_subsystem_type_y' with '_ctx_type_t' - use yapphavestats instead of yapprunning as condition for changing backend type
This change: - Introduces new flag profile_greenlets under 'start' - Introduces new API get_greenlet_stats to retrieve greenlet stats - get_thread_stats will fail when profiling greenlets and profile_threads is ignore. Conversely, get_greenlet_stats will fail when profiling threads and profile_greenlets is ignored.
Store the last seen tickcount for every context. Use this while calculating 'ttot' for the context
Make yappi grok greenlets
#57 has been merged. Thanks to @Suhail-MOHD, Yappi has initial support for gevent profiling! A new release will hopefully coming in few days! |
This PR aims to add support profiling Gevent greenlets with respect to multithreading.