Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[2/n] Critical Path analysis for GPU events and CPU->GPU and GPU->CPU dependencies #68

Closed
wants to merge 4 commits into from

Conversation

briancoutinho
Copy link
Contributor

What does this PR do?

This is a follow up to #67 that adds the nodes in the critical path analysis graph for GPU kernels
and in addition we also add the kernel launch delay and synchronization events.

Details

We update the graph with the following

  1. GPU kernels have two nods - begin, end and an edge connecting them.
  2. Kernel launch delay - if the CUDA runtime launched a kernel when there was no outstanding kernel on the same stream, we add an edge for the delay in the critical path with the weight = launch delay.
  3. Kernel kernel delay - the else case of 2, we add an implicit dependency on the last CUDA kernel executed on the stream.
  4. Synchronization - using CUDA sync events we track
    a. Context/Device sync= the last kernel on all streams are a dependency for the Context Sync call on CPU.
    b. Stream Sync= the last kernel on the specific stream being synced is a dependency for the Stream Sync runtime call on CPU.

Event sync is handled in the next PR.

Test/Examples

Running this on simple add test.
We can see the edges for kernel launch and kernel-kernel delay.
Screenshot 2023-10-05 at 11 04 36 AM

The example below shows Context Sync event (pink on the right), there are two edges coming in, one each from the last kernel on each CUDA stream.

Screenshot 2023-10-05 at 11 05 16 AM

The final critical path shifts from CPU to GPU in the end of the iteration - highlighted events are on critical path.

Screenshot 2023-10-05 at 11 06 27 AM

Before submitting

  • Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
    • N/A
  • Did you write any new necessary tests?
    • N/A
  • Did you make sure to update the docs?
    • N/A
  • [] Did you update the changelog?
    • N/A

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 5, 2023
Copy link
Contributor

@anupambhatnagar anupambhatnagar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Even though it is an experimental feature, it would be nice to have critical path analysis in the documentation on Read The Docs. It would make it easier for users to find the feature and learn about it.

def _add_gpu_cpu_sync_edge(self, gpu_node: CPNode, runtime_eid: int) -> None:
"""Add an edge between gpu_node and the runtime event on CPU"""
_, end_node = self.get_nodes_for_event(runtime_eid)
logger.info(f"Adding a sync edge between nodes {gpu_node} -> {end_node}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be debug level?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please change the print statements on lines 396 and 397 to use logger.

@codecov-commenter
Copy link

Codecov Report

Attention: 5 lines in your changes are missing coverage. Please review.

Comparison is base (8084627) 90.78% compared to head (06d5e31) 90.94%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #68      +/-   ##
==========================================
+ Coverage   90.78%   90.94%   +0.15%     
==========================================
  Files          30       30              
  Lines        2441     2506      +65     
==========================================
+ Hits         2216     2279      +63     
- Misses        225      227       +2     
Files Coverage Δ
hta/analyzers/trace_counters.py 96.59% <ø> (ø)
tests/test_trace_analysis.py 99.57% <100.00%> (+0.03%) ⬆️
hta/analyzers/critical_path_analysis.py 90.53% <92.06%> (+1.30%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@facebook-github-bot
Copy link
Contributor

@briancoutinho has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@briancoutinho has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@briancoutinho merged this pull request in 7780150.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants