Import HTA to Chakra to extract synchronization dependency #1
base: refactor
be20a5d to e73309b
f305f72 to 8f71209
linker = TraceLinker(
    args.pytorch_et_file,
    args.kineto_file,
    args.log_level
)
linker.load_traces()
linker.enforce_inter_thread_order()
linker.enforce_sync_order(cpa)
I am not sure this is the best name for the method. Could you justify the name or rename the method?
I tried to make the name similar to 'enforce_inter_thread_order'.
self.raw_events = None
self.sync_deps = {}

annotation = "ProfilerStep"
Can we assume that the annotation is always 'ProfilerStep'?
I think we can. The HTA test examples always assume it, and HTA describes the parameter the same way:
annotation (str): a trace annotation to limit the analysis to,
for example "ProfilerStep" would match all annotations that
match this string (ProfilerStep#100, ProfilerStep#101 etc)
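To make the matching behavior in the quoted docstring concrete, here is a minimal sketch. It assumes a plain prefix match on the event name; `matching_annotations` is a hypothetical helper written for illustration, not part of HTA or Chakra, and HTA's actual matching logic may differ.

```python
def matching_annotations(event_names, annotation="ProfilerStep"):
    """Return the event names that start with the annotation string.

    Assumption: a prefix match, so "ProfilerStep" selects
    "ProfilerStep#100", "ProfilerStep#101", and so on.
    """
    return [name for name in event_names if name.startswith(annotation)]


# Hypothetical event names, in trace order.
names = ["ProfilerStep#100", "aten::add", "ProfilerStep#101", "cudaLaunchKernel"]
steps = matching_annotations(names)
```

If a trace used a different step annotation, the hard-coded "ProfilerStep" would select nothing under this assumption, which is the case the question above is probing.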
self.sync_deps = {}

annotation = "ProfilerStep"
instance_id = 0
What is the meaning of instance_id? Why is it set to zero?
According to HTA, instance_id selects which instance of the annotation to consider. Setting it to zero picks the first instance, which is also HTA's default:
instance_id: can be either of the following
(int) - specify which instance of the annotation to consider.
Defaults to the first instance.
(Tuple(int, int)) - considers a range of annotation instances start to end,
inclusive of both start and end instance.
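The two accepted forms of instance_id can be illustrated with a short sketch. It assumes instance_id is a positional index into the matched annotation instances in trace order; `select_instances` is a hypothetical helper, not an HTA function.

```python
from typing import List, Tuple, Union


def select_instances(
    instances: List[str],
    instance_id: Union[int, Tuple[int, int]] = 0,
) -> List[str]:
    """Pick annotation instances by position (assumed semantics).

    An int selects a single instance; a (start, end) tuple selects a
    range that includes both the start and the end instance.
    """
    if isinstance(instance_id, tuple):
        start, end = instance_id
        return instances[start:end + 1]  # +1 makes the end inclusive
    return [instances[instance_id]]


# Hypothetical matched annotation instances, in trace order.
profiler_steps = ["ProfilerStep#100", "ProfilerStep#101", "ProfilerStep#102"]
first_only = select_instances(profiler_steps)            # default: first instance
last_two = select_instances(profiler_steps, (1, 2))      # inclusive range
```

With instance_id = 0, as in the diff, only the first matched step is analyzed under this reading.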
The tool fails with the following command.
This failure occurs when HTA cannot find the files in the directory. Could you check whether the files are really there?
9b58ec7 to 1df8ee6
Summary
This PR processes synchronization dependencies between Chakra nodes.
To do that, we use CriticalPathAnalyzer from Holistic Trace Analysis (https://github.com/facebookresearch/HolisticTraceAnalysis/blob/main/hta/analyzers/critical_path_analysis.py).
Please note that,
Test Plan
Download and Install HTA
Run Chakra et_converter
Test input with Resnet-50 with 2 GTX1070 (rank 0)
eg.rank_0.pt.trace.json
kineto.rank_0_step_5.1708449344148840892.pt.trace.json
Test result with Resnet-50 with 2 GTX1070 (rank 0)
rank_0.json
Test Result with Megatron (No Sync dependency)
I observed that this update does not change the results for a trace that has no synchronization dependencies.
Original
sys[4] finished, 607252677 cycles
sys[5] finished, 607253196 cycles
sys[6] finished, 607253715 cycles
sys[7] finished, 607254234 cycles
sys[0] finished, 607254753 cycles
sys[1] finished, 607255272 cycles
sys[2] finished, 607255791 cycles
sys[3] finished, 607256310 cycles
New
sys[4] finished, 607252677 cycles
sys[5] finished, 607253196 cycles
sys[6] finished, 607253715 cycles
sys[7] finished, 607254234 cycles
sys[0] finished, 607254753 cycles
sys[1] finished, 607255272 cycles
sys[2] finished, 607255791 cycles
sys[3] finished, 607256310 cycles
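The Original and New outputs above can also be compared mechanically rather than by eye. The following is a minimal sketch that parses the `sys[N] finished, X cycles` lines shown above and checks that every system reports the same cycle count; `parse_cycles` is a hypothetical helper written for this comparison, not part of Chakra or the simulator.

```python
import re

# Matches lines such as "sys[4] finished, 607252677 cycles".
LINE_RE = re.compile(r"sys\[(\d+)\] finished, (\d+) cycles")


def parse_cycles(log_text):
    """Map each system id to its reported cycle count."""
    return {int(m.group(1)): int(m.group(2)) for m in LINE_RE.finditer(log_text)}


original_log = """\
sys[4] finished, 607252677 cycles
sys[5] finished, 607253196 cycles
sys[6] finished, 607253715 cycles
sys[7] finished, 607254234 cycles
sys[0] finished, 607254753 cycles
sys[1] finished, 607255272 cycles
sys[2] finished, 607255791 cycles
sys[3] finished, 607256310 cycles
"""

# The New run printed the same lines, so the same text is reused here.
new_log = original_log

original_cycles = parse_cycles(original_log)
new_cycles = parse_cycles(new_log)

# Systems whose cycle counts differ between the two runs (empty if identical).
differences = {
    sys_id: (original_cycles[sys_id], new_cycles.get(sys_id))
    for sys_id in original_cycles
    if original_cycles[sys_id] != new_cycles.get(sys_id)
}
```

Comparing by system id rather than by line order keeps the check valid even if the systems finish in a different order between runs.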