Dump graph env #665
Conversation
Changed to be gated by a config flag instead of using a whole different path per request.
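For illustration, a minimal sketch of that gating pattern (the flag and function names here are hypothetical, not the exact PR code): one shared code path, with the debug dump behind a flag rather than a separate path per request.

```python
class config:
    # flipped on only when debugging; illustrative stand-in for a real
    # torchinductor config flag
    dump_scheduler_graph = False

def schedule(nodes):
    if config.dump_scheduler_graph:
        # debug-only side effect; the normal path below is unchanged
        print(f"dumping graph with {len(nodes)} nodes")
    return nodes
```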
Mostly LGTM, although I defer to Jason if he has any strong opinions on how debug logging should be structured.
torchinductor/scheduler.py (Outdated)

```python
import numpy as np
import torch
from functorch._src.partitioners import draw_graph
```
Might want to move this import into the function.
moved
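The moved import looks roughly like this; deferring it into the function means the module only depends on functorch internals when a dump is actually requested:

```python
def create_fx_from_buffers(nodes, fname, print_graph=False):
    # imported lazily: only paid for (and only required) when a graph
    # dump is actually requested
    from functorch._src.partitioners import draw_graph
    ...
```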
torchinductor/scheduler.py (Outdated)

```python
assert False, node
self.name_to_node = {node.get_name(): node for node in self.nodes}

if bool(os.environ.get("INDUCTOR_DEBUG", False)):
```
Rename this to INDUCTOR_SCHEDULER_GRAPH=1 or something like that for now.
We'll need a better system eventually (cc: @jansel).
renamed
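One aside on the check itself: `bool(os.environ.get(...))` is truthy for any non-empty value, including `"0"`. A stricter gate, as a sketch:

```python
import os

# truthiness pitfall: bool(os.environ.get("INDUCTOR_SCHEDULER_GRAPH", False))
# is True for any non-empty string, even "0"; comparing to "1" avoids that
DUMP_SCHEDULER_GRAPH = os.environ.get("INDUCTOR_SCHEDULER_GRAPH") == "1"
```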
torchinductor/scheduler.py (Outdated)

```python
self.name_to_node = {node.get_name(): node for node in self.nodes}

if bool(os.environ.get("INDUCTOR_DEBUG", False)):
    global graph_dump_index
```
So... we only create a single scheduler per graph we pass to inductor. I'd propose instead that we make a global variable that anybody can access for the purposes of logging.
Changed; we also need this PR: pytorch/pytorch#82368
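A sketch of the module-level state being suggested (illustrative only; the merged version instead reads the current graph name via functorch's get_graph_being_compiled from that PR):

```python
import itertools

# one process-wide counter that any debug code can consult, so dump
# files get stable sequential names like compute_buffer_0.svg
_graph_dump_counter = itertools.count()

def next_dump_name(prefix="compute_buffer"):
    return f"{prefix}_{next(_graph_dump_counter)}"
```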
torchinductor/scheduler.py (Outdated)

```python
return func1


def create_fx_graph(nodes, fname, print_graph=False):
```
By the way, let's move this code to utils.py
Resolved offline. Will keep this here because moving it to utils.py might cause a circular dependency in the future.
Also, let's rename this to create_fx_from_buffers
renamed
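For context, a much-simplified sketch of what a function like create_fx_from_buffers might do (hypothetical; the real version also wires read/write dependencies between buffers and renders the result with functorch's draw_graph):

```python
import torch.fx as fx

def create_fx_from_buffers(nodes, fname, print_graph=False):
    # build a throwaway FX graph whose nodes mirror the scheduler's
    # compute buffers, purely for visualization
    graph = fx.Graph()
    for node in nodes:
        graph.placeholder(node.get_name())
    if print_graph:
        print(graph)
    return graph  # the PR then renders this to f"{fname}.svg"
```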
Other than one final note, LGTM.
Not sure if anybody else has any opinions :)
(might want to wait until other folks chime in)
torchinductor/scheduler.py (Outdated)

```python
from functorch._src.aot_autograd import get_graph_being_compiled

graph_name = get_graph_being_compiled()
create_fx_graph(self.nodes, graph_name, print_graph=True)
```
not sure you want print_graph=True here?
As @Chillee said below, this line is only called when we are in debugging mode, so I think printing out the graph is probably helpful?
```python
if print_graph:
    print(graph)
print("starting creating module")
```
stray prints?
This is within create_fx_graph, which is for debugging purposes, so I think it's fine.
This reverts commit 5e32ac2.
### Description Add utilities to get the graph being compiled for debugging purposes. Used in this PR: pytorch/torchdynamo#665. Pull Request resolved: #82368. Approved by: https://github.com/Chillee
Creates an FX graph of the buffers generated by lowering.
With this patch you can run, e.g.:

```
INDUCTOR_SCHEDULER_GRAPH=1 python benchmarks/torchbench.py --training --devices=cuda --inductor --skip-accuracy-check -n 1 --isolate -k hf_Bert
```

to dump the forward and backward graphs of the compute buffers of hf_Bert. The resulting svg files will be in torchbenchmark/.
The dumped files are named in the format compute_buffer_{num}.svg. Dumping is also quite slow, so for large models like hf_Bert you may want to reduce the number of layers to a small number first.
The graph dumped looks like this: https://www.svgviewer.dev/s/PpmacAjw