Skip to content

Conversation

vkuzo
Copy link
Contributor

@vkuzo vkuzo commented Feb 11, 2021

Stack from ghstack:

Summary:

We have patterns like (F.linear, F.relu) which need to match
to (toq.linear_relu). So, we need to match subgraphs.

This PR does the following:

  • defines a "subgraph" as (start_node, end_node). The current assumption
    is that subgraphs are simple, there is always a path from start_node to
    end_node, and we can ignore any non-input args/kwargs of these nodes
    for the purposes of matching and copying things. An example one node
    subgraph is (F.linear, F.linear). An example two node subgraph
    is (F.linear, F.relu).
  • changes the matching logic to iterate over subgraphs instead of nodes
  • changes the NS core APIs to use subgraph pairs instead of node pairs:
  1. for weights, we match on the start node
  2. for unshadowed activations, we observe the end nodes
  3. for shadowed activations, we copy the subgraph of a to graph c

TODO(before review) write up better, not ready for review yet

Test Plan:

python test/test_quantization.py TestFXNumericSuiteCoreAPIs

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: D26403092

Summary:

We have patterns like (F.linear, F.relu) which need to match
to (toq.linear_relu).  So, we need to match subgraphs.

This PR does the following:
* defines a "subgraph" as (start_node, end_node). The current assumption
is that subgraphs are simple, there is always a path from start_node to
end_node, and we can ignore any non-input args/kwargs of these nodes
for the purposes of matching and copying things. An example one node
subgraph is (F.linear, F.linear).  An example two node subgraph
is (F.linear, F.relu).
* changes the matching logic to iterate over subgraphs instead of nodes
* changes the NS core APIs to use subgraph pairs instead of node pairs:
1. for weights, we match on the start node
2. for unshadowed activations, we observe the end nodes
3. for shadowed activations, we copy the subgraph of a to graph c

TODO(before review) write up better, not ready for review yet

Test Plan:

TODO before land: better test plan

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
Summary:

We have patterns like (F.linear, F.relu) which need to match
to (toq.linear_relu).  So, we need to match subgraphs.

This PR does the following:
* defines a "subgraph" as (start_node, end_node). The current assumption
is that subgraphs are simple, there is always a path from start_node to
end_node, and we can ignore any non-input args/kwargs of these nodes
for the purposes of matching and copying things. An example one node
subgraph is (F.linear, F.linear).  An example two node subgraph
is (F.linear, F.relu).
* changes the matching logic to iterate over subgraphs instead of nodes
* changes the NS core APIs to use subgraph pairs instead of node pairs:
1. for weights, we match on the start node
2. for unshadowed activations, we observe the end nodes
3. for shadowed activations, we copy the subgraph of a to graph c

TODO(before review) write up better, not ready for review yet

Test Plan:

TODO before land: better test plan

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
Summary:

We have patterns like (F.linear, F.relu) which need to match
to (toq.linear_relu).  So, we need to match subgraphs.

This PR does the following:
* defines a "subgraph" as (start_node, end_node). The current assumption
is that subgraphs are simple, there is always a path from start_node to
end_node, and we can ignore any non-input args/kwargs of these nodes
for the purposes of matching and copying things. An example one node
subgraph is (F.linear, F.linear).  An example two node subgraph
is (F.linear, F.relu).
* changes the matching logic to iterate over subgraphs instead of nodes
* changes the NS core APIs to use subgraph pairs instead of node pairs:
1. for weights, we match on the start node
2. for unshadowed activations, we observe the end nodes
3. for shadowed activations, we copy the subgraph of a to graph c

TODO(before review) write up better, not ready for review yet

Test Plan:

TODO before land: better test plan

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
vkuzo added a commit that referenced this pull request Feb 11, 2021
Summary:

We have patterns like (F.linear, F.relu) which need to match
to (toq.linear_relu).  So, we need to match subgraphs.

This PR does the following:
* defines a "subgraph" as (start_node, end_node). The current assumption
is that subgraphs are simple, there is always a path from start_node to
end_node, and we can ignore any non-input args/kwargs of these nodes
for the purposes of matching and copying things. An example one node
subgraph is (F.linear, F.linear).  An example two node subgraph
is (F.linear, F.relu).
* changes the matching logic to iterate over subgraphs instead of nodes
* changes the NS core APIs to use subgraph pairs instead of node pairs:
1. for weights, we match on the start node
2. for unshadowed activations, we observe the end nodes
3. for shadowed activations, we copy the subgraph of a to graph c

TODO(before review) write up better, not ready for review yet

Test Plan:

TODO before land: better test plan

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 862b004
Pull Request resolved: #52130
@facebook-github-bot
Copy link
Contributor

facebook-github-bot commented Feb 11, 2021

💊 CI failures summary and remediations

As of commit e705ae6 (more details on the Dr. CI page):


  • 1/1 failures possibly* introduced in this PR
    • 1/1 non-CircleCI failure(s)

This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Summary:

We have patterns like (F.linear, F.relu) which need to match
to (toq.linear_relu).  So, we need to match subgraphs.

This PR does the following:
* defines a "subgraph" as (start_node, end_node). The current assumption
is that subgraphs are simple, there is always a path from start_node to
end_node, and we can ignore any non-input args/kwargs of these nodes
for the purposes of matching and copying things. An example one node
subgraph is (F.linear, F.linear).  An example two node subgraph
is (F.linear, F.relu).
* changes the matching logic to iterate over subgraphs instead of nodes
* changes the NS core APIs to use subgraph pairs instead of node pairs:
1. for weights, we match on the start node
2. for unshadowed activations, we observe the end nodes
3. for shadowed activations, we copy the subgraph of a to graph c

TODO(before review) write up better, not ready for review yet

Test Plan:

TODO before land: better test plan

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D26403092](https://our.internmc.facebook.com/intern/diff/D26403092)

[ghstack-poisoned]
vkuzo added a commit that referenced this pull request Feb 12, 2021
Summary:

We have patterns like (F.linear, F.relu) which need to match
to (toq.linear_relu).  So, we need to match subgraphs.

This PR does the following:
* defines a "subgraph" as (start_node, end_node). The current assumption
is that subgraphs are simple, there is always a path from start_node to
end_node, and we can ignore any non-input args/kwargs of these nodes
for the purposes of matching and copying things. An example one node
subgraph is (F.linear, F.linear).  An example two node subgraph
is (F.linear, F.relu).
* changes the matching logic to iterate over subgraphs instead of nodes
* changes the NS core APIs to use subgraph pairs instead of node pairs:
1. for weights, we match on the start node
2. for unshadowed activations, we observe the end nodes
3. for shadowed activations, we copy the subgraph of a to graph c

TODO(before review) write up better, not ready for review yet

Test Plan:

TODO before land: better test plan

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: fb8df70
Pull Request resolved: #52130
Summary:

We have patterns like (F.linear, F.relu) which need to match
to (toq.linear_relu).  So, we need to match subgraphs.

This PR does the following:
* defines a "subgraph" as (start_node, end_node). The current assumption
is that subgraphs are simple, there is always a path from start_node to
end_node, and we can ignore any non-input args/kwargs of these nodes
for the purposes of matching and copying things. An example one node
subgraph is (F.linear, F.linear).  An example two node subgraph
is (F.linear, F.relu).
* changes the matching logic to iterate over subgraphs instead of nodes
* changes the NS core APIs to use subgraph pairs instead of node pairs:
1. for weights, we match on the start node
2. for unshadowed activations, we observe the end nodes
3. for shadowed activations, we copy the subgraph of a to graph c

TODO(before review) write up better, not ready for review yet

Test Plan:

TODO before land: better test plan

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D26403092](https://our.internmc.facebook.com/intern/diff/D26403092)

[ghstack-poisoned]
vkuzo added a commit that referenced this pull request Feb 12, 2021
Summary:

We have patterns like (F.linear, F.relu) which need to match
to (toq.linear_relu).  So, we need to match subgraphs.

This PR does the following:
* defines a "subgraph" as (start_node, end_node). The current assumption
is that subgraphs are simple, there is always a path from start_node to
end_node, and we can ignore any non-input args/kwargs of these nodes
for the purposes of matching and copying things. An example one node
subgraph is (F.linear, F.linear).  An example two node subgraph
is (F.linear, F.relu).
* changes the matching logic to iterate over subgraphs instead of nodes
* changes the NS core APIs to use subgraph pairs instead of node pairs:
1. for weights, we match on the start node
2. for unshadowed activations, we observe the end nodes
3. for shadowed activations, we copy the subgraph of a to graph c

TODO(before review) write up better, not ready for review yet

Test Plan:

TODO before land: better test plan

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: a66f4ce
Pull Request resolved: #52130
(F.relu, F.linear),
])

# TODO(future PR): we should see if we can reuse quantization's fusion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is a good idea. In general, would the entire graph_matcher logic be shared with fx quantization passes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that would be a good long term state. I'd imagine most of the benefits to come from sharing the fusion patterns, the matching logic itself may not generalize.

# if we match linear-relu backwards, we will always skip the
# relu node and attempt to match the linear node. This can
# be made configurable later if needed.
for _reverse_fusion_ops in get_reversed_fusions():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we always pick the largest fusion here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally the ordering in the default would be followed, with the ability for user to customize. It doesn't handle that yet. We'd have to add that logic as soon as we have more than one fusion.

# for now, use node name.
# TODO(future PR): find a better solution
return node_b.name
return end_node_b.name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be start_node_b.name+end_node_b.name so that it is clear that it is a sub-graph and not a single node?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me. In general naming of nodes is something we should talk through in detail for all of these APIs, saving that for a future PR. Thankfully changing that later should be low eng cost, so it doesn't have to be perfect now.

and continuing backwards.
1. Returns matchable nodes, in order
2. Skips over non-matchable nodes
1. Returns matchable subgraphs, in order. A subgraph is defined by
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, dont follow this logic: Where is the sub-graph specified in the init? I dont see the end_node being defined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is in the __next__ function. We pop a potential end node off the stack. If the end node is matchable, we try to peek backwards to check for a potential fusion pattern. We return (end_node, end_node) if there is no fusion pattern, or (start_node, end_node) if there is a fusion pattern ending at end_node.

Summary:

We have patterns like (F.linear, F.relu) which need to match
to (toq.linear_relu).  So, we need to match subgraphs.

This PR does the following:
* defines a "subgraph" as (start_node, end_node). The current assumption
is that subgraphs are simple, there is always a path from start_node to
end_node, and we can ignore any non-input args/kwargs of these nodes
for the purposes of matching and copying things. An example one node
subgraph is (F.linear, F.linear).  An example two node subgraph
is (F.linear, F.relu).
* changes the matching logic to iterate over subgraphs instead of nodes
* changes the NS core APIs to use subgraph pairs instead of node pairs:
1. for weights, we match on the start node
2. for unshadowed activations, we observe the end nodes
3. for shadowed activations, we copy the subgraph of a to graph c

TODO(before review) write up better, not ready for review yet

Test Plan:

TODO before land: better test plan

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D26403092](https://our.internmc.facebook.com/intern/diff/D26403092)

[ghstack-poisoned]
vkuzo added a commit that referenced this pull request Feb 16, 2021
Summary:

We have patterns like (F.linear, F.relu) which need to match
to (toq.linear_relu).  So, we need to match subgraphs.

This PR does the following:
* defines a "subgraph" as (start_node, end_node). The current assumption
is that subgraphs are simple, there is always a path from start_node to
end_node, and we can ignore any non-input args/kwargs of these nodes
for the purposes of matching and copying things. An example one node
subgraph is (F.linear, F.linear).  An example two node subgraph
is (F.linear, F.relu).
* changes the matching logic to iterate over subgraphs instead of nodes
* changes the NS core APIs to use subgraph pairs instead of node pairs:
1. for weights, we match on the start node
2. for unshadowed activations, we observe the end nodes
3. for shadowed activations, we copy the subgraph of a to graph c

TODO(before review) write up better, not ready for review yet

Test Plan:

TODO before land: better test plan

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 79add57
Pull Request resolved: #52130
Summary:

We have patterns like (F.linear, F.relu) which need to match
to (toq.linear_relu).  So, we need to match subgraphs.

This PR does the following:
* defines a "subgraph" as (start_node, end_node). The current assumption
is that subgraphs are simple, there is always a path from start_node to
end_node, and we can ignore any non-input args/kwargs of these nodes
for the purposes of matching and copying things. An example one node
subgraph is (F.linear, F.linear).  An example two node subgraph
is (F.linear, F.relu).
* changes the matching logic to iterate over subgraphs instead of nodes
* changes the NS core APIs to use subgraph pairs instead of node pairs:
1. for weights, we match on the start node
2. for unshadowed activations, we observe the end nodes
3. for shadowed activations, we copy the subgraph of a to graph c

TODO(before review) write up better, not ready for review yet

Test Plan:

TODO before land: better test plan

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D26403092](https://our.internmc.facebook.com/intern/diff/D26403092)

[ghstack-poisoned]
])

class _NSGraphMatchableNodesIterator:
def get_reversed_fusions() -> Set[Tuple[Callable, Callable]]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this only going to support matching 2 nodes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the main restriction is that the subgraph needs one start point and one end point. Support for a length of 3 nodes can be added here with low eng cost. This type would become Union[Tuple[Callable, Callable], Tuple[Callable, Callable, Callable]].

note: currently this is not a list because lists are not hashable. In general, I'm flexible on the exact data structure, happy to revise it if there is a better solution.

continue
self.seen_nodes.add(cur_node)

# for subgraphs which are single nodes, start_node == end_node
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A n00b question, how do we distinguish the subgraph of single nodes and the subgraph with two identical nodes? For example if start_node and end_node are both linear, the subgraph could be single node or it could be a subgraph with two linear nodes. Or do we assume that subgraph nodes are different, which is true for the subgraphs we want to match such as (linear, relu) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, the node here is the actual node object (not node type or name, etc). So, two distinct F.linear nodes can be used as a start point and an endpoint. For example,

linear1 -> linear2 -> linear3

# a subgraph of just linear1
(linear1, linear1)

# a subgraph of 1..3
(linear1, linear3)

if node_b.op == 'call_module':
assert isinstance(node_b.target, str)
return node_b.target
if end_node_b.op == 'call_module':
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to this PR, just wonder why need special handling for call_module and can't get node name directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds reasonable to me, happy to update that in a separate PR. In general this part is pretty self-contained, so we can make updates to this easily.

# note: relatedness is checked on the start node, i.e.
# if a linear-relu pattern is checked, we would check for relatedness
# of the linear
if not _node_a_related_to_b(cur_start_node_a, cur_start_node_b,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since subgraph matching is introduced to graph matcher, would it be natural to extend the relatedness checking to support subgraph also, which might be more general?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that could make sense. Is there a use case in mind?

Since this is a private API, the eng cost of changing this in the future would be low.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we can think about it in later PR. I don't have specific use case in mind, just thinking if we can relate for example the subgraph (linear, relu) to linear_relu directly, then it would be self-explanatory and we may not need to note "relatedness is checked on the start node".

@vkuzo vkuzo changed the title [wip] ns for fx: add support for subgraph matching ns for fx: add support for subgraph matching Feb 17, 2021
nodes_to_instrument_b_to_a = {}
for match_name, (node_a, node_b) in matched_node_pairs.items():
nodes_to_instrument_b_to_a[node_b] = node_a
node_b_to_matched_subgraph_a = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is quite useful to insert shadow at multiple levels. Assuming in a later PR we will support shadowing a sub-graph instead of a single node.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds reasonable

# subgraph so far:
#
# dtype_cast_node --> node_a_copy
# dtype_cast_node --> subgraph_a_copy
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the cast be at the input of node_c?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct. The code does this, but the comment is wrong, thanks for catching. Will update before landing.

Summary:

We have patterns like (F.linear, F.relu) which need to match
to (toq.linear_relu).  So, we need to match subgraphs.

This PR does the following:
* defines a "subgraph" as (start_node, end_node). The current assumption
is that subgraphs are simple, there is always a path from start_node to
end_node, and we can ignore any non-input args/kwargs of these nodes
for the purposes of matching and copying things. An example one node
subgraph is (F.linear, F.linear).  An example two node subgraph
is (F.linear, F.relu).
* changes the matching logic to iterate over subgraphs instead of nodes
* changes the NS core APIs to use subgraph pairs instead of node pairs:
1. for weights, we match on the start node
2. for unshadowed activations, we observe the end nodes
3. for shadowed activations, we copy the subgraph of a to graph c

TODO(before review) write up better, not ready for review yet

Test Plan:

```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D26403092](https://our.internmc.facebook.com/intern/diff/D26403092)

[ghstack-poisoned]
Summary:

We have patterns like (F.linear, F.relu) which need to match
to (toq.linear_relu).  So, we need to match subgraphs.

This PR does the following:
* defines a "subgraph" as (start_node, end_node). The current assumption
is that subgraphs are simple, there is always a path from start_node to
end_node, and we can ignore any non-input args/kwargs of these nodes
for the purposes of matching and copying things. An example one node
subgraph is (F.linear, F.linear).  An example two node subgraph
is (F.linear, F.relu).
* changes the matching logic to iterate over subgraphs instead of nodes
* changes the NS core APIs to use subgraph pairs instead of node pairs:
1. for weights, we match on the start node
2. for unshadowed activations, we observe the end nodes
3. for shadowed activations, we copy the subgraph of a to graph c

TODO(before review) write up better, not ready for review yet

Test Plan:

```
python test/test_quantization.py TestFXNumericSuiteCoreAPIs
```

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D26403092](https://our.internmc.facebook.com/intern/diff/D26403092)

[ghstack-poisoned]
@codecov
Copy link

codecov bot commented Feb 18, 2021

Codecov Report

Merging #52130 (e705ae6) into gh/vkuzo/225/base (16f0cdf) will increase coverage by 0.00%.
The diff coverage is 99.14%.

@@                Coverage Diff                 @@
##           gh/vkuzo/225/base   #52130   +/-   ##
==================================================
  Coverage              80.33%   80.34%           
==================================================
  Files                   1967     1967           
  Lines                 215670   215722   +52     
==================================================
+ Hits                  173262   173313   +51     
- Misses                 42408    42409    +1     

@facebook-github-bot
Copy link
Contributor

This pull request has been merged in d903106.

@facebook-github-bot facebook-github-bot deleted the gh/vkuzo/225/head branch February 22, 2021 15:17
xsacha pushed a commit to xsacha/pytorch that referenced this pull request Mar 31, 2021
Summary:
Pull Request resolved: pytorch#52130

We have patterns like (F.linear, F.relu) which need to match
to (toq.linear_relu).  So, we need to match subgraphs.

This PR does the following:
* defines a "subgraph" as (start_node, end_node). The current assumption
is that subgraphs are simple, there is always a path from start_node to
end_node, and we can ignore any non-input args/kwargs of these nodes
for the purposes of matching and copying things. An example one node
subgraph is (F.linear, F.linear).  An example two node subgraph
is (F.linear, F.relu).
* changes the matching logic to iterate over subgraphs instead of nodes
* changes the NS core APIs to use subgraph pairs instead of node pairs:
1. for weights, we match on the start node
2. for unshadowed activations, we observe the end nodes
3. for shadowed activations, we copy the subgraph of a to graph c

TODO(before review) write up better, not ready for review yet

Test Plan:
TODO before land: better test plan

Imported from OSS

Reviewed By: raghuramank100

Differential Revision: D26403092

fbshipit-source-id: e49aaad4b02b8d60589435848bee422b8f41937a
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants