Execution Trace Correlation Support #58

briancoutinho · 2023-07-17T17:31:46Z

🚀 Motivation and context

Chakra Execution Trace (ET) is an open and interoperable graph-based representation of AI/ML workloads focused on enabling and accelerating AI SW/HW co-design. Chakra execution traces represent key operations (such as compute, memory, and communication), data and control dependencies, timing, and resource constraints. Additionally, Chakra includes a complementary set of tools and capabilities to enable the collection, analysis, generation, and adoption of Chakra ETs by a broad range of simulators, emulators, and replay tools.

Correlating Execution Trace with PyTorch timeline traces will lead to an enriched trace data structure containing:

- Detailed operator input/output tensor information (from ET).
- Dependency edges between operators and modules (from ET).
- Timeline (start, duration) information of PyTorch framework as well as GPU kernels (from Kineto).

This unlocks work such as critical path analysis, estimating the efficiency gains from fixing detected anti-patterns, and more detailed operator input/output reporting.
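As a rough illustration of the enriched data structure described above, the sketch below models one correlated operator as a Python dataclass. The class name `CorrelatedOp` and all of its field names are hypothetical and chosen only for illustration; they are not taken from HTA, param, or the ET schema.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class CorrelatedOp:
    """Hypothetical merged record for one operator (all names are illustrative)."""

    name: str
    # From the Execution Trace (ET): tensor details and a dependency edge.
    inputs: List[Any] = field(default_factory=list)
    outputs: List[Any] = field(default_factory=list)
    parent_id: Optional[int] = None
    # From the Kineto timeline trace: start timestamp and duration (microseconds).
    start_us: Optional[int] = None
    duration_us: Optional[int] = None

    @property
    def end_us(self) -> Optional[int]:
        """End timestamp, or None when no timeline information was attached."""
        if self.start_us is None or self.duration_us is None:
            return None
        return self.start_us + self.duration_us


# Example record combining ET tensor shapes with Kineto timing (made-up values).
op = CorrelatedOp(
    name="aten::mm",
    inputs=[[64, 128], [128, 256]],
    outputs=[[64, 256]],
    parent_id=1,
    start_us=1000,
    duration_us=250,
)
```

A record like this would let an analysis such as critical path computation read both the dependency edge (`parent_id`) and the timing (`start_us`, `end_us`) from one place.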

Description

We can start by correlating the Execution Trace and the Kineto trace for a single rank.
There are two possible cases for correlation:

- The ET and Kineto trace overlap, i.e., they were collected together. This case can be handled easily using the record function ID ('rf_id') field.
- The ET and Kineto trace were collected at different times. Correlating them requires a tree correlation algorithm. A possible implementation already exists in param #PR79
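The first case (traces collected together) can be sketched as a simple join on the record function ID. This is a minimal sketch under stated assumptions, not the implementation from the linked PR: it assumes ET nodes are dicts carrying an `rf_id` key, and that Kineto events follow the Chrome trace layout (`ts`, `dur`) with the record function ID stored under `args["Record function id"]`. The function name `correlate_by_rf_id` is hypothetical.

```python
from typing import Any, Dict, List


def correlate_by_rf_id(
    et_nodes: List[Dict[str, Any]],
    kineto_events: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Join ET nodes with Kineto events that share a record function ID."""
    # Index Kineto events by record function ID (key name is an assumption).
    by_rf_id: Dict[int, Dict[str, Any]] = {}
    for event in kineto_events:
        rf_id = event.get("args", {}).get("Record function id")
        if rf_id is not None:
            by_rf_id[rf_id] = event

    merged = []
    for node in et_nodes:
        event = by_rf_id.get(node.get("rf_id"))
        if event is None:
            continue  # No timeline event was recorded for this ET node.
        merged.append(
            {
                "name": node["name"],
                "rf_id": node["rf_id"],
                "inputs": node.get("inputs", []),    # from ET
                "outputs": node.get("outputs", []),  # from ET
                "ts": event["ts"],                   # from Kineto
                "dur": event["dur"],                 # from Kineto
            }
        )
    return merged


# Made-up sample data: only rf_id 7 appears in both traces.
et_nodes = [
    {"name": "aten::add", "rf_id": 7, "inputs": [[4, 4], [4, 4]], "outputs": [[4, 4]]},
    {"name": "aten::relu", "rf_id": 8, "inputs": [[4, 4]], "outputs": [[4, 4]]},
]
kineto_events = [
    {"name": "aten::add", "ts": 100, "dur": 12, "args": {"Record function id": 7}},
    {"name": "cudaLaunchKernel", "ts": 105, "dur": 3, "args": {}},
]
correlated = correlate_by_rf_id(et_nodes, kineto_events)
```

ET nodes without a matching Kineto event are skipped here; a real implementation might instead keep them with empty timing fields.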

Setup

We propose adding param as a third-party dependency for this project. This will import the Execution Trace parsing data structures, etc.

Alternatives

Additional context

No response

Summary:

## What does this PR do?

Add ability to read and correlate execution trace explained in #58

## Before submitting

- [x] Was this discussed/approved via a GitHub issue? (no need for typos, doc improvements)
  - [ ] N/A
- [x] Did you write any new necessary tests?
  - [ ] N/A
- [ ] Did you make sure to update the docs?
  - [x] N/A Feature as a whole is not yet ready, so we can wait till some of the foundational blocks are done
- [ ] Did you update the [changelog](https://github.com/facebookresearch/HolisticTraceAnalysis/blob/main/CHANGELOG.md)?
  - [ ] N/A

Pull Request resolved: #57
Reviewed By: anupambhatnagar
Differential Revision: D47805905
Pulled By: briancoutinho
fbshipit-source-id: 291bc0ea891a7ab15c9627a6f867497c29fdf466

briancoutinho added the "feature request" and "good first issue" labels Jul 17, 2023

briancoutinho self-assigned this Jul 17, 2023

briancoutinho mentioned this issue Jul 21, 2023

Read and basic correlation for execution trace #57

Closed

8 tasks
