Dump graph env #665
Conversation
Changed to be gated by a config flag instead of using a whole different path per request.
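For illustration, a minimal sketch of that gating pattern (the flag and function names here are hypothetical, not the exact PR code): one shared code path, with the debug dump behind a flag rather than a separate path per request.

```python
class config:
    # flipped on only when debugging; illustrative stand-in for a real
    # torchinductor config flag
    dump_scheduler_graph = False

def schedule(nodes):
    if config.dump_scheduler_graph:
        # debug-only side effect; the normal path below is unchanged
        print(f"dumping graph with {len(nodes)} nodes")
    return nodes
```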
Mostly LGTM, although I defer to Jason if he has any strong opinions on how debug logging should be structured.
torchinductor/scheduler.py (Outdated)

```python
import numpy as np
import torch
from functorch._src.partitioners import draw_graph
```
Might want to move this import into the function.
moved
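The moved import looks roughly like this; deferring it into the function means the module only depends on functorch internals when a dump is actually requested:

```python
def create_fx_from_buffers(nodes, fname, print_graph=False):
    # imported lazily: only paid for (and only required) when a graph
    # dump is actually requested
    from functorch._src.partitioners import draw_graph
    ...
```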
torchinductor/scheduler.py (Outdated)

```python
assert False, node
self.name_to_node = {node.get_name(): node for node in self.nodes}

if bool(os.environ.get("INDUCTOR_DEBUG", False)):
```
Rename this to INDUCTOR_SCHEDULER_GRAPH=1 or something like that for now.
We'll need a better system eventually (cc: @jansel).
renamed
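One aside on the check itself: `bool(os.environ.get(...))` is truthy for any non-empty value, including `"0"`. A stricter gate, as a sketch:

```python
import os

# truthiness pitfall: bool(os.environ.get("INDUCTOR_SCHEDULER_GRAPH", False))
# is True for any non-empty string, even "0"; comparing to "1" avoids that
DUMP_SCHEDULER_GRAPH = os.environ.get("INDUCTOR_SCHEDULER_GRAPH") == "1"
```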
torchinductor/scheduler.py (Outdated)

```python
self.name_to_node = {node.get_name(): node for node in self.nodes}

if bool(os.environ.get("INDUCTOR_DEBUG", False)):
    global graph_dump_index
```
So... we only create a single scheduler per graph we pass to inductor. I'd propose instead that we make a global variable that anybody can access for the purposes of logging.
Changed; we also need this PR: pytorch/pytorch#82368
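A sketch of the module-level state being suggested (illustrative only; the merged version instead reads the current graph name via functorch's get_graph_being_compiled from that PR):

```python
import itertools

# one process-wide counter that any debug code can consult, so dump
# files get stable sequential names like compute_buffer_0.svg
_graph_dump_counter = itertools.count()

def next_dump_name(prefix="compute_buffer"):
    return f"{prefix}_{next(_graph_dump_counter)}"
```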
torchinductor/scheduler.py (Outdated)

```python
return func1


def create_fx_graph(nodes, fname, print_graph=False):
```
By the way, let's move this code to utils.py
Resolved offline. Will keep this here because moving it to utils.py might cause a circular dependency in the future.
Also, let's rename this to create_fx_from_buffers
renamed
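For context, a much-simplified sketch of what a function like create_fx_from_buffers might do (hypothetical; the real version also wires read/write dependencies between buffers and renders the result with functorch's draw_graph):

```python
import torch.fx as fx

def create_fx_from_buffers(nodes, fname, print_graph=False):
    # build a throwaway FX graph whose nodes mirror the scheduler's
    # compute buffers, purely for visualization
    graph = fx.Graph()
    for node in nodes:
        graph.placeholder(node.get_name())
    if print_graph:
        print(graph)
    return graph  # the PR then renders this to f"{fname}.svg"
```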
Other than one final note, LGTM.
Not sure if anybody else has any opinions :)
(might want to wait until other folks chime in)
torchinductor/scheduler.py (Outdated)

```python
from functorch._src.aot_autograd import get_graph_being_compiled

graph_name = get_graph_being_compiled()
create_fx_graph(self.nodes, graph_name, print_graph=True)
```
not sure you want print_graph=True here?
As @Chillee said below, this line is only called when we are in debugging mode, so I think printing out the graph is probably helpful?
```python
if print_graph:
    print(graph)
print("starting creating module")
```
stray prints?
This is within create_fx_graph, which is for debugging purposes, so I think it's fine.
This reverts commit 5e32ac2.
### Description Add utilities to get the graph being compiled for debugging purposes. Used in this PR: pytorch/torchdynamo#665. Pull Request resolved: #82368. Approved by: https://github.com/Chillee
Creates an FX graph of the buffers generated by lowering.
With this patch you can run, e.g.:

```
INDUCTOR_SCHEDULER_GRAPH=1 python benchmarks/torchbench.py --training --devices=cuda --inductor --skip-accuracy-check -n 1 --isolate -k hf_Bert
```

to dump the forward and backward graphs of the compute buffers of hf_Bert. The resulting svg files will be in torchbenchmark/.
The dumped files are named in the format compute_buffer_{num}.svg. Dumping is also quite slow, so for large models like hf_Bert you may want to reduce the number of layers to a small number first.
The graph dumped looks like this: https://www.svgviewer.dev/s/PpmacAjw