[Profiler] iterate frontend function events for profiler post processing #124596

@zejun-chen zejun-chen commented Apr 22, 2024

function_events in _parse_kineto_results holds all function events from the result. It contains two kinds of events. The first kind is frontend function events, whose correlation id is 0, for example aten::add and aten::mul; these sit at the top level of the profile results. The second kind is backend events, which are associated with frontend events and have a correlation id > 0, for example at::native::vectorized_elementwise_kernel, which should be the backend event of a frontend element-wise op. Backend events carry the device execution duration for the related frontend op.

In the following post-processing code, only the frontend function events should be iterated to find their correlated backend events in device_corr_map, rather than all function events, because device_corr_map is a dict keyed by the id of the frontend function event.

for fe in function_events:
    if (
        fe.device_type == DeviceType.CPU
        and not fe.is_async
        and fe.id in device_corr_map
    ):
        for f_evt in device_corr_map[fe.id]:
            if f_evt.device_type == DeviceType.CUDA:
                fe.append_kernel(
                    f_evt.name,
                    f_evt.device_index,
                    f_evt.time_range.end - f_evt.time_range.start,
                )
            elif f_evt.device_type == DeviceType.CPU:
                # make sure that 'thread' of a CPU Kineto (e.g. Device Runtime) event is associated
                # with the 'thread' of the corresponding linked PyTorch event to properly track
                # parents and children
                f_evt.thread = fe.thread

if corr_id > 0:
    if corr_id not in device_corr_map:
        device_corr_map[corr_id] = []
    device_corr_map[corr_id].append(fe)
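
To make the proposed change concrete, here is a minimal, self-contained sketch of the idea. It does not use the real profiler classes: `Event`, its fields, the local `DeviceType` enum, and `attach_kernels` are all hypothetical stand-ins invented for illustration, and the split on "correlation id is 0" simply follows the description above. The toy data in the usage note lets a backend event share an id with a frontend event purely to show why the loop is restricted; whether ids can collide this way in the real profiler is an assumption, not something established here.

```python
from dataclasses import dataclass, field
from enum import Enum


class DeviceType(Enum):
    """Simplified stand-in for the profiler's DeviceType (assumption)."""
    CPU = 0
    CUDA = 1


@dataclass
class Event:
    """Hypothetical, trimmed-down stand-in for a profiler function event."""
    id: int
    name: str
    device_type: DeviceType
    corr_id: int = 0      # 0 for frontend events, > 0 for backend events
    duration: int = 0     # device execution time, arbitrary units
    is_async: bool = False
    kernels: list = field(default_factory=list)


def attach_kernels(all_events):
    """Attach device kernels to frontend events only.

    Backend events are grouped by the id of the frontend event they
    correlate with; only frontend events are then looked up in that map.
    """
    frontend_events = []
    device_corr_map = {}
    for ev in all_events:
        if ev.corr_id > 0:
            # Backend event: file it under its frontend event's id.
            device_corr_map.setdefault(ev.corr_id, []).append(ev)
        else:
            # Frontend event: correlation id is 0.
            frontend_events.append(ev)

    for fe in frontend_events:
        if (
            fe.device_type == DeviceType.CPU
            and not fe.is_async
            and fe.id in device_corr_map
        ):
            for be in device_corr_map[fe.id]:
                if be.device_type == DeviceType.CUDA:
                    fe.kernels.append((be.name, be.duration))
    return frontend_events
```

With an input of one `aten::add` frontend event (id 1), one CUDA kernel event correlated to it, and one CPU runtime event that also carries id 1, only `aten::add` receives the kernel entry, because the runtime event is never used as a lookup key.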

whose correlation id is 0, then post processing is to iterate the
frontend function events

Signed-off-by: Chen, Zejun <zejun.chen@intel.com>

pytorch-bot bot commented Apr 22, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/124596

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit 53af78e with merge base b198423 (image):

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@zejun-chen zejun-chen changed the title [Profiler] divide all the function events to frontend event lists, whose correlation id is 0 [Profiler] iterate frontend function events for profiler post processing Apr 22, 2024
@zejun-chen
Contributor Author

Hi, @aaronenyeshi

Could you help review the PR?

Thank you. 👍

@colesbury colesbury added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Apr 22, 2024
@fwenguang
Contributor

It looks like we've encountered the same issue. If this patch is merged, then the PR #124389 can be closed.

@zejun-chen
Contributor Author

zejun-chen commented Apr 23, 2024

It looks like we've encountered the same issue. If this patch is merged, then the PR #124389 can be closed.

Woohoo! We are solving the same issue. Thank you for the help. Both PRs are OK with me.
@aaronenyeshi You can review and comment on both PRs.

Thank you

Member

@aaronenyeshi aaronenyeshi left a comment


Both implementations are good. Let's go ahead and merge this one, since its comments and variable names are easier to read.

@aaronenyeshi aaronenyeshi added release notes: profiler release notes category topic: improvements topic category labels Apr 23, 2024
@aaronenyeshi
Member

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Apr 23, 2024
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

carmocca pushed a commit to carmocca/pytorch that referenced this pull request Apr 29, 2024
…ing (pytorch#124596)

The `function_events` in `_parse_kineto_results` is used to contain all function events from the result. It contains 2 kinds of events. One is frontend function events whose correlation id is 0, for example, `aten::add`, `aten::mul`. They are on the top level of the profile results. The other is the backend events, which are associated with the frontend events and its correlation id is > 0, for example, `at::native::vectorized_elementwise_kernel`, it should be the backend event of a frontend element-wise op. They have the device execution duration for the related frontend op.

In the following post processing code, the **frontend function events** should be iterated to find its correlated backend events in `device_corr_map`, instead of iterating all function events, because `device_corr_map` is designed as a dict, whose key is the id of the frontend function event.
https://github.com/pytorch/pytorch/blob/3af12447f85dfede191a113c052e58fa7b21a8b3/torch/autograd/profiler.py#L543-L560

https://github.com/pytorch/pytorch/blob/3af12447f85dfede191a113c052e58fa7b21a8b3/torch/autograd/profiler.py#L537-L540
Pull Request resolved: pytorch#124596
Approved by: https://github.com/aaronenyeshi
andoorve pushed a commit to andoorve/pytorch that referenced this pull request May 1, 2024
petrex pushed a commit to petrex/pytorch that referenced this pull request May 3, 2024
Labels
ciflow/trunk Trigger trunk jobs on your pull request Merged open source release notes: profiler release notes category topic: improvements topic category triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

6 participants