Add async profiling support for gevent #47
Comments
Thanks a lot! Motivating words :)
I have been thinking about this a lot. We have more than one So, basically I am OK, but I need time to learn how gevent works under the hood to discover the necessary parts to implement
Then we can hopefully work together to make these tests pass in Yappi itself? How does that sound?
That guy is @ajdavis. Awesome programmer :) Hi @ajdavis, we were discussing something like the above. I am pretty sure you don't have any time, but do you have any comments or ideas on it? Thanks :)
Thanks for the quick answer! I am going to try to answer your questions asap.
Hello! My interests have moved very far from this project and I can't say what would be required to upgrade greenlet_profiler to the latest Python, Yappi, and Gevent. I also recall having some concerns, personally, about greenlet_profiler's accuracy, but I don't remember the details anymore. Good luck. I think it's a worthy goal to have a good greenlet profiler. Perhaps someone else has built a greenlet profiler that works today, or that's a better starting point than greenlet_profiler?
Thank you for your precious input! Let's see if there is any other movement in that area, too. Nice to hear from you again :)
I did a little survey: there is no Python 3 greenlet profiler. Not yet 😉 @ajdavis I also checked the forks of your greenlet profiler, with no luck: they do not work (the tests do not pass). Some adaptations have been made, but they are missing your changes to yappi, so they are not helpful. I am going to try to answer @sumerc's questions from the previous message as soon as possible. Thanks a lot guys! Have a nice weekend in these difficult times.
I installed Python 2.7 and Python 3.7 in two separate environments to run some tests. In both environments I installed the latest gevent (1.4).

Python 2.7, original GreenletProfiler code

Here is a gevent adaptation of the original test suite of GreenletProfiler:

```python
from functools import partial
import unittest

import gevent
import GreenletProfiler


def find_func(ystats, name):
    # Return the single stat entry whose function name matches.
    items = [
        yfuncstat for yfuncstat in ystats
        if yfuncstat.name == name]
    assert len(items) == 1
    return items[0]


def assert_children(yfuncstats, names, msg):
    # Compare the set of callee names against the expected names.
    names = set(names)
    callees = set([
        ychildstat.name for ychildstat in yfuncstats.children
    ])
    final_msg = '%s: expected %s, got %s' % (
        msg, ', '.join(names), ', '.join(callees))
    assert names == callees, final_msg


def spin(n):
    # Burn CPU proportionally to n.
    for _ in range(n * 10000):
        pass


class GreenletTest(unittest.TestCase):
    spin_cost = None

    @classmethod
    def setUpClass(cls):
        # Measure the CPU cost of spin() as a baseline.
        GreenletProfiler.set_clock_type('cpu')
        GreenletProfiler.start()
        for _ in range(10):
            spin(1)
        GreenletProfiler.stop()
        f_stats = GreenletProfiler.get_func_stats()
        spin_stat = find_func(f_stats, 'spin')
        GreenletTest.spin_cost = spin_stat.ttot / 10.0
        GreenletProfiler.clear_stats()

    def tearDown(self):
        GreenletProfiler.stop()
        GreenletProfiler.clear_stats()

    def assertNear(self, x, y, margin=0.3):
        # Relative difference check, measured against x.
        if abs(x - y) / float(x) > margin:
            raise AssertionError(
                "%s is not within %d%% of %s" % (x, margin * 100, y))

    def test_three_levels(self):
        def a():
            gevent.sleep(0)  # a bit equivalent to greenlet.switch()
            b()
            spin(1)

        def b():
            spin(5)
            gevent.sleep(0)
            c()

        def c():
            spin(7)

        GreenletProfiler.set_clock_type('cpu')
        GreenletProfiler.start(builtins=True)
        g = gevent.spawn(a)
        gevent.sleep(0)  # give hand to the hub to run another greenlet
        spin(2)
        gevent.sleep(0)
        spin(3)
        gevent.sleep(0)
        assert g.ready()  # greenlet is done (ready is a method)
        GreenletProfiler.stop()
        ystats = GreenletProfiler.get_func_stats()

        # Check the stats for spin().
        spin_stat = find_func(ystats, 'spin')
        self.assertEqual(5, spin_stat.ncall)
        self.assertAlmostEqual(18 * self.spin_cost, spin_stat.ttot,
                               places=2, msg="spin()'s total time is wrong")
        assert_children(spin_stat, ['range'], 'spin() has wrong callees')

        # Check the stats for a().
        a_stat = find_func(ystats, 'a')
        self.assertEqual(1, a_stat.ncall, 'a() not called once')
        assert_children(
            a_stat,
            ['spin', 'b', "sleep"],
            'a() has wrong callees')
        self.assertAlmostEqual(13 * self.spin_cost, a_stat.ttot,
                               places=2, msg="a()'s total time is wrong")
        self.assertAlmostEqual(13 * self.spin_cost, a_stat.tavg,
                               places=2, msg="a()'s average time is wrong")
        self.assertAlmostEqual(a_stat.tsub, 0,
                               places=2, msg="a()'s subtotal is wrong")

        # Check the stats for b().
        b_stat = find_func(ystats, 'b')
        self.assertEqual(1, b_stat.ncall, 'b() not called once')
        assert_children(
            b_stat,
            ['spin', 'c', "sleep"],
            'b() has wrong callees')
        self.assertAlmostEqual(12 * self.spin_cost, b_stat.ttot,
                               places=2, msg="b()'s total time is wrong")
        self.assertAlmostEqual(12 * self.spin_cost, b_stat.tavg,
                               places=2, msg="b()'s average time is wrong")
        self.assertAlmostEqual(b_stat.tsub, 0,
                               places=2, msg="b()'s subtotal is wrong")

        # Check the stats for c().
        c_stat = find_func(ystats, 'c')
        self.assertEqual(1, c_stat.ncall, 'c() not called once')
        assert_children(c_stat, ['spin'], 'c() has wrong callees')
        self.assertAlmostEqual(7 * self.spin_cost, c_stat.ttot,
                               places=2, msg="c()'s total time is wrong")
        self.assertAlmostEqual(7 * self.spin_cost, c_stat.tavg,
                               places=2, msg="c()'s average time is wrong")
        self.assertAlmostEqual(c_stat.tsub, 0,
                               places=2, msg="c()'s subtotal is wrong")

    def test_recursion(self):
        def r(n):
            spin(1)
            gevent.sleep(0)
            if n > 1:
                r(n - 1)
            gevent.sleep(0)

        def s(n):
            spin(1)
            gevent.sleep(0)
            if n > 1:
                s(n - 1)
            gevent.sleep(0)

        GreenletProfiler.set_clock_type('cpu')
        GreenletProfiler.start(builtins=True)
        g0 = gevent.spawn(partial(r, 10))  # Run r 10 times.
        gevent.sleep(0)
        g1 = gevent.spawn(partial(s, 2))  # Run s 2 times.
        gevent.sleep(0)
        greenlets = [g0, g1]
        # Run all greenlets to completion.
        gevent.joinall(greenlets, raise_error=True)
        GreenletProfiler.stop()
        ystats = GreenletProfiler.get_func_stats()

        # Check the stats for spin().
        spin_stat = find_func(ystats, 'spin')
        self.assertEqual(12, spin_stat.ncall)
        # r() ran spin(1) 10 times, s() ran spin(1) 2 times.
        self.assertNear(12, spin_stat.ttot / self.spin_cost)
        assert_children(spin_stat, ['range'], 'spin() has wrong callees')

        # Check the stats for r().
        r_stat = find_func(ystats, 'r')
        self.assertEqual(10, r_stat.ncall)
        assert_children(
            r_stat,
            ['spin', 'r', "sleep"],
            'r() has wrong callees')
        self.assertNear(10, r_stat.ttot / self.spin_cost)
        self.assertNear(1, r_stat.tavg / self.spin_cost)
        self.assertAlmostEqual(0, r_stat.tsub, places=3)

        # Check the stats for s().
        s_stat = find_func(ystats, 's')
        self.assertEqual(2, s_stat.ncall)
        assert_children(
            s_stat,
            ['spin', 's', "sleep"],
            's() has wrong callees')
        self.assertNear(2, s_stat.ttot / self.spin_cost)
        self.assertNear(1, s_stat.tavg / self.spin_cost)
        self.assertAlmostEqual(0, s_stat.tsub, places=3)


if __name__ == '__main__':
    unittest.main()
```

Basically I just replaced explicit calls to Indeed, gevent is scheduling coroutines execution on I/O - so, when sleeping the code returns

Tests pass:
So, at first glance, I can confirm that the latest gevent works with the original GreenletProfiler.

Python 3.7, one of the forks of GreenletProfiler, latest yappi

Then I tested the same code as above with Python 3 and one of the forks of GreenletProfiler, adapted to Python 3, with the latest Yappi. Of course it does not work (this is the problem :wink:):

The patches of @ajdavis are not there... The good news is that the test is compatible with both Python 2 and Python 3 with no change.

Note about gevent, monkey patching and threads (maybe useful if you do not know gevent)

The good thing about gevent is that it proposes to patch the Python standard library, But, I do not think this behaviour makes a big difference for the profiler? It is possible to enable the monkey-patching only partially. Usually people patch nothing (see 1. below), everything (see 2. below), or everything but the threading module (see 3. below):

1. No monkey patching at all
2. Full gevent monkey-patching
3. Monkey-patching of everything except threads

I do not know if all those cases make a difference for yappi, but I thought it would be clearer to mention them. In fact, once the first code starts to work, more tests should probably be written for the different cases, to make sure it is all fine... Hope this helps!
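For readers who do not know gevent, the three cases listed above could be sketched roughly like this. It is only an illustration: the helper names `monkey_patch_kwargs` and `apply_monkey_patching` and the mode labels are made up for this example. The only gevent API assumed is `gevent.monkey.patch_all()` and its `thread` keyword.

```python
def monkey_patch_kwargs(mode):
    """Map a (hypothetical) mode label to keyword arguments for patch_all().

    Returns None when patch_all() should not be called at all.
    """
    if mode == "none":
        return None                # case 1: no monkey patching at all
    if mode == "full":
        return {}                  # case 2: patch_all() with its defaults
    if mode == "no_threads":
        return {"thread": False}   # case 3: everything except threading
    raise ValueError("unknown mode: %r" % (mode,))


def apply_monkey_patching(mode):
    """Apply the chosen mode; return True if anything was patched.

    Patching should happen as early as possible, before the modules
    to be patched are imported elsewhere.
    """
    kwargs = monkey_patch_kwargs(mode)
    if kwargs is None:
        return False
    # Imported lazily so that mode "none" does not require gevent.
    from gevent import monkey
    monkey.patch_all(**kwargs)
    return True
```

A test suite could then be run once per mode, for example in separate processes, since monkey patching cannot be cleanly undone within one interpreter.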
This is a test to run with yappi without GreenletProfiler; I took the code from

```python
from functools import partial
from contextlib import contextmanager
import unittest

import gevent
from yappi import (
    set_clock_type, get_func_stats, set_context_id_callback,
    set_context_name_callback, start, stop, clear_stats)


def start_profiling(builtins, threads):
    """Starts profiling all threads and all greenlets.

    This function can be called from any thread at any time.
    Resumes profiling if stop() was called previously.
    """
    # Use the current greenlet, rather than the thread, as the context.
    set_context_id_callback(lambda: gevent and id(gevent.getcurrent()) or 0)
    set_context_name_callback(
        lambda: gevent and gevent.getcurrent().__class__.__name__ or '')
    start(builtins, threads)


def stop_profiling():
    """Stops the currently running yappi instance.

    The same profiling session can be resumed later by calling start().
    """
    stop()
    set_context_id_callback(None)


@contextmanager
def profiling(builtins=False, threads=True):
    set_clock_type("cpu")
    start_profiling(builtins, threads)
    try:
        yield
    finally:
        # Stop the profiler even if the profiled block raises.
        stop_profiling()


def find_func(ystats, name):
    # Return the single stat entry whose function name matches.
    items = [
        yfuncstat for yfuncstat in ystats
        if yfuncstat.name == name]
    assert len(items) == 1
    return items[0]


def assert_children(yfuncstats, names, msg):
    # Compare the set of callee names against the expected names.
    names = set(names)
    callees = set([
        ychildstat.name for ychildstat in yfuncstats.children
    ])
    final_msg = '%s: expected %s, got %s' % (
        msg, ', '.join(names), ', '.join(callees))
    assert names == callees, final_msg


def spin(n):
    # Burn CPU proportionally to n.
    for _ in range(n * 10000):
        pass


class GreenletTest(unittest.TestCase):
    spin_cost = None

    @classmethod
    def setUpClass(cls):
        # Measure the CPU cost of spin() as a baseline.
        with profiling():
            for _ in range(10):
                spin(1)
        f_stats = get_func_stats()
        spin_stat = find_func(f_stats, 'spin')
        GreenletTest.spin_cost = spin_stat.ttot / 10.0
        clear_stats()

    def assertNear(self, x, y, margin=0.3):
        # Relative difference check, measured against x.
        if abs(x - y) / float(x) > margin:
            raise AssertionError(
                "%s is not within %d%% of %s" % (x, margin * 100, y))

    def test_three_levels(self):
        def a():
            gevent.sleep(0)  # a bit equivalent to greenlet.switch()
            b()
            spin(1)

        def b():
            spin(5)
            gevent.sleep(0)
            c()

        def c():
            spin(7)

        with profiling(builtins=True):
            g = gevent.spawn(a)
            gevent.sleep(0)  # give hand to the hub to run another greenlet
            spin(2)
            gevent.sleep(0)
            spin(3)
            gevent.sleep(0)
            assert g.ready()  # greenlet is done (ready is a method)
        ystats = get_func_stats()

        # Check the stats for spin().
        spin_stat = find_func(ystats, 'spin')
        self.assertEqual(5, spin_stat.ncall)
        self.assertAlmostEqual(18 * self.spin_cost, spin_stat.ttot,
                               places=2, msg="spin()'s total time is wrong")
        assert_children(spin_stat, ['range'], 'spin() has wrong callees')

        # Check the stats for a().
        a_stat = find_func(ystats, 'a')
        self.assertEqual(1, a_stat.ncall, 'a() not called once')
        assert_children(
            a_stat,
            ['spin', 'b', "sleep"],
            'a() has wrong callees')
        self.assertAlmostEqual(13 * self.spin_cost, a_stat.ttot,
                               places=2, msg="a()'s total time is wrong")
        self.assertAlmostEqual(13 * self.spin_cost, a_stat.tavg,
                               places=2, msg="a()'s average time is wrong")
        self.assertAlmostEqual(a_stat.tsub, 0,
                               places=2, msg="a()'s subtotal is wrong")

        # Check the stats for b().
        b_stat = find_func(ystats, 'b')
        self.assertEqual(1, b_stat.ncall, 'b() not called once')
        assert_children(
            b_stat,
            ['spin', 'c', "sleep"],
            'b() has wrong callees')
        self.assertAlmostEqual(12 * self.spin_cost, b_stat.ttot,
                               places=2, msg="b()'s total time is wrong")
        self.assertAlmostEqual(12 * self.spin_cost, b_stat.tavg,
                               places=2, msg="b()'s average time is wrong")
        self.assertAlmostEqual(b_stat.tsub, 0,
                               places=2, msg="b()'s subtotal is wrong")

        # Check the stats for c().
        c_stat = find_func(ystats, 'c')
        self.assertEqual(1, c_stat.ncall, 'c() not called once')
        assert_children(c_stat, ['spin'], 'c() has wrong callees')
        self.assertAlmostEqual(7 * self.spin_cost, c_stat.ttot,
                               places=2, msg="c()'s total time is wrong")
        self.assertAlmostEqual(7 * self.spin_cost, c_stat.tavg,
                               places=2, msg="c()'s average time is wrong")
        self.assertAlmostEqual(c_stat.tsub, 0,
                               places=2, msg="c()'s subtotal is wrong")

    def test_recursion(self):
        def r(n):
            spin(1)
            gevent.sleep(0)
            if n > 1:
                r(n - 1)
            gevent.sleep(0)

        def s(n):
            spin(1)
            gevent.sleep(0)
            if n > 1:
                s(n - 1)
            gevent.sleep(0)

        # builtins=True so that 'range' shows up as a callee of spin(),
        # as in the GreenletProfiler version of this test.
        with profiling(builtins=True):
            g0 = gevent.spawn(partial(r, 10))  # Run r 10 times.
            gevent.sleep(0)
            g1 = gevent.spawn(partial(s, 2))  # Run s 2 times.
            gevent.sleep(0)
            greenlets = [g0, g1]
            # Run all greenlets to completion.
            gevent.joinall(greenlets, raise_error=True)
        ystats = get_func_stats()

        # Check the stats for spin().
        spin_stat = find_func(ystats, 'spin')
        self.assertEqual(12, spin_stat.ncall)
        # r() ran spin(1) 10 times, s() ran spin(1) 2 times.
        self.assertNear(12, spin_stat.ttot / self.spin_cost)
        assert_children(spin_stat, ['range'], 'spin() has wrong callees')

        # Check the stats for r().
        r_stat = find_func(ystats, 'r')
        self.assertEqual(10, r_stat.ncall)
        assert_children(
            r_stat,
            ['spin', 'r', "sleep"],
            'r() has wrong callees')
        self.assertNear(10, r_stat.ttot / self.spin_cost)
        self.assertNear(1, r_stat.tavg / self.spin_cost)
        self.assertAlmostEqual(0, r_stat.tsub, places=3)

        # Check the stats for s().
        s_stat = find_func(ystats, 's')
        self.assertEqual(2, s_stat.ncall)
        assert_children(
            s_stat,
            ['spin', 's', "sleep"],
            's() has wrong callees')
        self.assertNear(2, s_stat.ttot / self.spin_cost)
        self.assertNear(1, s_stat.tavg / self.spin_cost)
        self.assertAlmostEqual(0, s_stat.tsub, places=3)


if __name__ == '__main__':
    unittest.main()
```
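The multipliers 18, 13, 12 and 7 asserted in `test_three_levels` can be checked by hand. Here is a small sketch of that arithmetic, assuming the profiler charges no CPU time to a function while its greenlet is switched out. The `CALL_TREE` layout and the helper names are made up for this illustration and are not part of yappi or GreenletProfiler.

```python
# Call tree run inside the spawned greenlet: for each function, the spin(n)
# units it executes directly and the functions it calls.
CALL_TREE = {
    "a": {"spin": 1, "calls": ["b"]},
    "b": {"spin": 5, "calls": ["c"]},
    "c": {"spin": 7, "calls": []},
}

# spin(2) and spin(3) are run by the test body itself, outside a().
MAIN_SPIN_UNITS = 2 + 3


def inclusive_units(name, tree=CALL_TREE):
    """Spin units spent in `name` including everything it calls (ttot)."""
    node = tree[name]
    return node["spin"] + sum(inclusive_units(c, tree) for c in node["calls"])


def total_spin_units(tree=CALL_TREE, extra=MAIN_SPIN_UNITS):
    """Spin units across all calls to spin(), whoever the caller is."""
    return extra + sum(node["spin"] for node in tree.values())
```

With these definitions, `inclusive_units("a")` is 13, `inclusive_units("b")` is 12, `inclusive_units("c")` is 7 and `total_spin_units()` is 18, which matches the factors multiplied by `spin_cost` in the assertions.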
Hi Matias, Thanks for the valuable information on First off, let's start from the notion of Now, coming to I know the above is a lot to grasp, but it is not that hard. In summary, I assume we have CPU-time profiling issues with And once I find a solution to this new API required, I think you can also help me on the C side, too, if you like. Thanks, please feel free to ask about any unclear points.
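To make the context-id idea from the `set_context_id_callback` call in the test above more concrete, here is a toy model. It is not how yappi works internally: `ToyContextProfiler`, `worker` and `run_round_robin` are made-up names, plain generators stand in for greenlets, and integer "ticks" stand in for CPU time. It only shows why a profiler needs to know which context is current when execution switches cooperatively.

```python
from collections import defaultdict


class ToyContextProfiler:
    """Charge elapsed ticks to whatever context the callback reports."""

    def __init__(self, context_id_callback):
        self._get_ctx = context_id_callback
        self.ticks = defaultdict(int)

    def charge(self, n):
        self.ticks[self._get_ctx()] += n


def worker(profiler, costs):
    """Do some 'work', yielding after each step like gevent.sleep(0)."""
    for cost in costs:
        profiler.charge(cost)
        yield  # cooperative switch point


def run_round_robin(tasks, set_current):
    """Resume each task in turn until all of them are exhausted."""
    pending = list(tasks.items())
    while pending:
        name, gen = pending.pop(0)
        set_current(name)  # the scheduler knows who is running
        try:
            next(gen)
        except StopIteration:
            continue
        pending.append((name, gen))


current = {"id": None}
profiler = ToyContextProfiler(lambda: current["id"])
tasks = {
    "g0": worker(profiler, [1, 5, 7]),
    "main": worker(profiler, [2, 3]),
}
run_round_robin(tasks, lambda name: current.__setitem__("id", name))
```

After the run, `profiler.ticks` holds 13 ticks for `"g0"` and 5 for `"main"`. If the callback returned a constant, as a thread id would for greenlets sharing one thread, all 18 ticks would land in a single context.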
When I find time, I will try to provide some small examples to clarify further.
Well, I think I already have things to do. I will try to adapt the test you have for asyncio to gevent when I get some time. Thanks for all this already!
Closing the issue as |
First of all, thanks a lot for this project, yappi is really the best Python profiler !
In my projects I use gevent extensively. I was full of hope when I found this blog article:
https://emptysqua.re/blog/greenletprofiler/
Someone made a greenlet profiler a few years ago on top of yappi...
But the project is Python 2, and it was coming with a modified, bundled version of yappi.
Now that yappi supports coroutines (with asyncio, unfortunately), could you please give me
some guidance how to redo what was done with greenletprofiler ?
I would be happy to contribute to yappi with gevent support, but I need some help - my C++
skills are rusty and I don't know the code.
I saw in this discussion #21 that it was something you thought about once... But I am afraid set_ctx_backend() was finally not implemented?
Thanks a lot