How to get a callgraph including asyncio.gather?
#54
Comments
Hey Andreas, I understand it is a reasonable and practical request to identify caller/callee relationships between coroutines. However, the issue is that, to make this work, one needs to hook into library functions to make sense of what is happening. Currently, I cannot see an easy way to work this out without going too deep into the library internals. Needless to say, I am open to suggestions if you have any.
I was hoping there might be some easy trick using Adding a callback to change the recorded callstack is probably a no-go (not obvious how this could work and performance reasons). Do you think there is a chance a monkey-patched |
I am not sure if I understand your suggested solution. You can use |
I think I got a prototype doing the tagging, but I am struggling with rewriting the statistics. I am already failing to filter the stats by tag:

```python
stats = yappi.get_func_stats()
orig_len = len(stats)
assert [fs for fs in stats if fs.tag == 1] == []  # gives [], fs.tag always seems to be zero (but they are not)
assert 0 < len(stats.get({'tag': 1})) < orig_len  # there seem to be functions with tag == 1 as expected
len(stats) == len(stats.get({'tag': 1}))  # .get seems to have mutated stats
```

How do I filter for functions by tag?
If you are using the latest master branch, please get it like: But the problem seems to be different in your case, are you sure you set correct tag callback? This should work in any case: |
I attach my code below. The code accessing the tags is at the bottom of the script.

```python
from asyncio import create_task, run, sleep
from contextvars import ContextVar
import inspect

import yappi

_marker = ContextVar('yappi_task_marker')
_task_counter = 0
_task_map = {}
CREATE_TASK_ID = (
    create_task.__code__.co_filename, create_task.__code__.co_firstlineno, create_task.__code__.co_name
)


def _task_tag_cbk():
    # Tag callback for yappi: the tag of the current context (0 if none was set)
    return _marker.get(0)


async def aio_worker():
    await sleep(1.)
    return _marker.get(0)


async def doit():
    # except for the lines marked with '# <- keep this' everything should go into the task factory (or monkey-patch)
    global _task_counter
    coro = aio_worker()  # <- keep this
    f = inspect.currentframe()
    caller_id = (f.f_code.co_filename, f.f_code.co_firstlineno, f.f_code.co_name)
    callee_id = (coro.cr_code.co_filename, coro.cr_code.co_firstlineno, coro.cr_code.co_name)
    fid = (*caller_id, *callee_id)
    # one tag per (caller, callee) pair
    tag = _task_map.get(fid, 0)
    if not tag:
        _task_counter += 1
        _task_map[fid] = tag = _task_counter
    token = _marker.set(tag)
    task = create_task(coro)  # <- keep this
    _marker.reset(token)
    return await task  # <- keep this


if __name__ == '__main__':
    yappi.set_tag_callback(_task_tag_cbk)
    yappi.set_clock_type('wall')
    with yappi.run(builtins=True):
        print('Task tag: ', run(doit(), debug=False))
    stats = yappi.get_func_stats()
    ## various attempts at retrieving functions with tag == 1
    assert len(stats) == 274
    assert [fs for fs in stats if fs.tag == 1] == []  # gives [], fs.tag always seems to be zero (but they are not)
    assert len(stats.get({'tag': 1})) == 61  # using .get and filter finds 61 functions which looks reasonable
    assert len([fs for fs in stats if fs.tag == 1]) == 61  # .get with filter seems to have populated the tags
    len(stats) == 61  # but also mutated stats in place
    ##
    stats = yappi.get_func_stats()  # this seems to get the original stats back
    assert len(stats) == 274
```
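The `_marker.set(tag)` / `create_task(coro)` / `_marker.reset(token)` sequence in the script above relies on a task copying the current context at the moment it is created. A minimal stdlib-only sketch of that behaviour follows; the names `marker`, `worker` and `main` are made up for illustration and are not part of yappi or of the script.

```python
import asyncio
from contextvars import ContextVar

# Illustrative names only; this does not use yappi at all.
marker = ContextVar("marker", default=0)


async def worker():
    await asyncio.sleep(0)
    # Runs in the context that was copied when the task was created.
    return marker.get()


async def main():
    token = marker.set(7)                 # visible to tasks created from now on
    task = asyncio.create_task(worker())  # the task copies the current context here
    marker.reset(token)                   # the caller's own context goes back to the default
    return marker.get(), await task


caller_value, task_value = asyncio.run(main())
print(caller_value, task_value)
```

The caller sees the default again after the reset, while the task keeps the value that was set when it was created, which is what lets a tag callback distinguish the task from its creator.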
The correct/fastest way to retrieve per-tag func stats is like:
Final: the reason Please use:
Thanks, this was helpful. Below is a rough sketch of the idea. There are still some issues to work out, but none of this looks impossible.

```python
from asyncio import create_task, run, sleep
from contextvars import ContextVar
import inspect

import yappi

_marker = ContextVar('yappi_task_marker')
_task_counter = 0
_task_map = {}
CREATE_TASK_ID = (
    create_task.__code__.co_filename, create_task.__code__.co_firstlineno, create_task.__code__.co_name
)


def _task_tag_cbk():
    # Tag callback for yappi: the tag of the current context (0 if none was set)
    return _marker.get(0)


async def aio_worker():
    await sleep(1.)
    return _marker.get(0)


async def doit():
    # except for the lines marked with '# <- keep this' everything should go into a task factory (or monkey-patch)
    global _task_counter
    coro = aio_worker()  # <- keep this
    f = inspect.currentframe()
    caller_id = (f.f_code.co_filename, f.f_code.co_firstlineno, f.f_code.co_name)
    callee_id = (coro.cr_code.co_filename, coro.cr_code.co_firstlineno, coro.cr_code.co_name)
    fid = (*caller_id, *callee_id)
    # one tag per (caller, callee) pair
    tag = _task_map.get(fid, 0)
    if not tag:
        _task_counter += 1
        _task_map[fid] = tag = _task_counter
    token = _marker.set(tag)
    task = create_task(coro)  # <- keep this
    _marker.reset(token)
    return await task  # <- keep this


def get_func_stats_with_tags(tags):
    """yappi 1.2.4 does not populate YFuncStat.tag unless queried"""
    result = yappi.YFuncStats()
    for tag in tags:
        stats = yappi.get_func_stats()
        for fs in stats.get({'tag': tag}):
            result.append(fs)
    # always include the untagged stats as well
    if 0 not in tags:
        stats = yappi.get_func_stats()
        for fs in stats.get({'tag': 0}):
            result.append(fs)
    return result


def to_child_func_stat(y_func_stat):
    return yappi.YChildFuncStat([
        y_func_stat.index,
        y_func_stat.ncall,
        y_func_stat.nactualcall,
        y_func_stat.ttot,
        y_func_stat.tsub,
        y_func_stat.tavg,
        y_func_stat.builtin,
        y_func_stat.full_name,
        y_func_stat.module,
        y_func_stat.lineno,
        y_func_stat.name
    ])


def fix_calltree(stats, task_map):
    # TODO: not working yet
    fixed_stats = yappi.YFuncStats()
    callee_map = {}
    caller_map = {}
    for (rmodule, rline, rname, emodule, eline, ename), tag in task_map.items():
        callee_map[emodule, eline, ename] = tag
        caller_map[rmodule, rline, rname] = tag

    # stats of the coroutines that were started as tagged tasks
    callees = {}
    for fs in stats:
        tag = callee_map.get((fs.module, fs.lineno, fs.name))
        if tag and fs.tag == tag:
            callees[tag] = to_child_func_stat(fs)

    # attach each tagged coroutine as a child of the matching create_task call
    create_tasks = {}
    for fs in stats:
        if (fs.module, fs.lineno, fs.name) == CREATE_TASK_ID:
            callee = callees.get(fs.tag)
            if callee:
                new_fs = yappi.YFuncStat(fs)
                new_fs.ttot += callee.ttot
                new_fs.tavg += callee.tavg
                new_fs.children.append(callee)
                fixed_stats.append(new_fs)
                create_tasks[fs.tag] = to_child_func_stat(new_fs)

    # attach the patched create_task entries as children of their callers
    for fs in stats:
        new_fs = yappi.YFuncStat(fs)
        tag = caller_map.get((fs.module, fs.lineno, fs.name))
        if tag:
            ct = create_tasks.get(tag)
            if ct:
                new_fs.tsub -= ct.ttot
                new_fs.children.append(ct)
        fixed_stats.append(new_fs)
    return fixed_stats


if __name__ == '__main__':
    yappi.set_tag_callback(_task_tag_cbk)
    yappi.set_clock_type('wall')
    with yappi.run(builtins=True):
        print('Task tag: ', run(doit(), debug=False))
    stats = get_func_stats_with_tags(set(_task_map.values()))
    fixed_stats = fix_calltree(stats, _task_map)
    # save to view with snakerun or similar
    fixed_stats.save('the_profile.pstat', type='pstat')
```
Sorry to bother you again, but I bumped into this:

```python
class YFuncStat(YStat):
    ...
    def __eq__(self, other):
        if other is None:
            return False
        return self.full_name == other.full_name
```

Shouldn't comparison take
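To make the concern concrete, here is a small sketch with a stand-in class. `FakeFuncStat` is hypothetical and only mirrors the quoted `__eq__`; it is not yappi's class, and the assumption is that the same function can appear once per tag.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class FakeFuncStat:
    """Hypothetical stand-in for a per-function stat entry (not yappi's class)."""
    full_name: str
    tag: int = 0

    def __eq__(self, other):
        # Mirrors the quoted comparison: only the full name is compared.
        if other is None:
            return False
        return self.full_name == other.full_name


untagged = FakeFuncStat("mod.py:1 worker", tag=0)
tagged = FakeFuncStat("mod.py:1 worker", tag=1)

same_by_name = untagged == tagged  # the tag is ignored by __eq__
found = tagged in [untagged]       # membership tests use __eq__ as well
print(same_by_name, found)
```

Under that assumption, any lookup or de-duplication based on equality would treat entries for the same function with different tags as one entry.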
Not at all :)
I forgot to mention that it is not possible to traverse on Anyway, the only correct way to traverse per-tag or context id stats is: If you really, really want to enumerate the stats yourself, you can always call |
Got a first prototype by now. Turns-out you have to set a context var on the caller side of The current prototype is trying to customize the (Without support for setting the profile function I'd likely go with (iii), the only downside of that being that you need to run the whole program on a custom event loop even if you are just profiling little pieces) |
First: Wow! Congrats on this! Now I would like to help you on this but I think this is not something I can include in Yappi for the time being. Let me share my reasoning before moving further:
However, if you really would like to pursue this, I would suggest implementing another library on top of Yappi, or simply forking it. I would try my best to help with your issues. Let's clarify this first, and then we can talk about the potential hooks that you request.
Closing this issue as there is no progress.
Coroutines being run via `asyncio.gather` do not show up in the callgraph for the calling function. `asyncio.gather` returns a `Future` gathering the results from the provided coroutines. Timings are correct (up to the caveats in #21), but the callgraph only shows the creation of the gathering future. The caller for the coroutines run via `gather` is the event loop.

Is there any way to provide hints to yappi to change the caller for the coroutines?

Example:

For me it would be ok if `gather` would not show up in the callgraph at all and `aio_worker` looked like a direct callee of `doit`.
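A minimal stdlib-only sketch of the situation described above, without yappi: it inspects the Python call stack inside `aio_worker` to see whether the spawning coroutine is on it. `awaited_from_doit`, `doit_direct` and `doit_gather` are illustrative names, and the default (non-eager) task behaviour is assumed.

```python
import asyncio
import inspect


def awaited_from_doit():
    """Return True if a function named doit* is on the current Python call stack."""
    return any(info.function.startswith("doit") for info in inspect.stack(0))


async def aio_worker():
    await asyncio.sleep(0)
    return awaited_from_doit()


async def doit_direct():
    # Awaited directly: aio_worker runs as part of doit_direct's own call chain.
    return await aio_worker()


async def doit_gather():
    # gather wraps the coroutine in a task, which the event loop drives.
    (seen,) = await asyncio.gather(aio_worker())
    return seen


direct = asyncio.run(doit_direct())
via_gather = asyncio.run(doit_gather())
print(direct, via_gather)
```

When awaited directly the spawning coroutine is on the stack; when run through `gather` it is not, because the event loop resumes the wrapped task. A profiler that derives callers from the stack therefore attributes the coroutine to the event loop.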