perf[next-dace]: Enhance MoveDataflowIntoIfBody transformation #2514
iomaganaris merged 12 commits into main
edopao
left a comment
LGTM, just one minor comment.
src/gt4py/next/program_processors/runners/dace/transformations/move_dataflow_into_if_body.py
0cfd5fb to 3ab44a0
philip-paul-mueller
left a comment
See my comment.
src/gt4py/next/program_processors/runners/dace/transformations/move_dataflow_into_if_body.py
philip-paul-mueller
left a comment
I have now looked at the changes and I have some suggestions.
Part of my comments are "adding todos for me" and suggestions about improving the comments, so it is not as much as it looks like.
My main concern is that some nodes are replicated (added to the nested state) multiple times.
However, I am not fully sure why the unit test passes; probably I just do not see the part that handles this.
src/gt4py/next/program_processors/runners/dace/transformations/move_dataflow_into_if_body.py
    # Gather all the already moved nodes to avoid that we move the same node multiple times
    already_moved_nodes: set[dace_nodes.Node] = set()
    # Finally relocate the dataflow
-    # Gather all the already moved nodes to avoid that we move the same node multiple times
-    already_moved_nodes: set[dace_nodes.Node] = set()
-    # Finally relocate the dataflow
+    # Relocate the dataflow. Because the node sets listed in `relocatable_dataflow` are not disjoint, we have to make sure that we relocate each node only once.
+    already_moved_nodes: set[dace_nodes.Node] = set()
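A minimal sketch of the "relocate each node only once" idea from the suggestion above. Plain strings stand in for DaCe nodes; the function `relocate_once` and the `move` callback are hypothetical and are not part of the transformation.

```python
from typing import Callable, Iterable


def relocate_once(
    relocatable_dataflow: Iterable[set[str]],
    move: Callable[[str], None],
) -> set[str]:
    """Call `move` exactly once per node, even if the node sets overlap."""
    already_moved_nodes: set[str] = set()
    for nodes_to_move in relocatable_dataflow:
        # Skip every node that an earlier set has already relocated.
        for node in sorted(nodes_to_move - already_moved_nodes):
            move(node)
        already_moved_nodes |= nodes_to_move
    return already_moved_nodes


# "a1" appears in both sets but must be moved only once.
moved_log: list[str] = []
moved = relocate_once([{"a1", "t1"}, {"a1", "t2"}], moved_log.append)
```

The set difference does the filtering, so the caller does not need to know which sets overlap.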
src/gt4py/next/program_processors/runners/dace/transformations/move_dataflow_into_if_body.py
        sdfg: dace.SDFG,
    ) -> None:
        if_block: dace_nodes.NestedSDFG = self.if_block
        if_block_spec = self._partition_if_block(if_block)
I kind of have the feeling that this function should compute the mapping you compute in conn_name_to_access_node_map since it has to look at all nodes anyway.
However, I think a TODO in _partition_if_block()'s docstring should be sufficient.
It needs the relocatable_dataflow though, so I am not sure if it's possible to do that there unless we do it for every node.
Not necessarily.
Having relocatable_dataflow would just allow you to skip some instances that are not needed, but I think it is still a net gain if we do it in one go and then reuse it, because we have to look at every node anyway.
But it is for sure something for later.
I think I will create a PR that addresses this and some nondeterminism.
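The "do it in one go and then reuse" idea can be sketched as a single pass that records, for every connector name, the access nodes carrying that name. The classes `FakeAccessNode` and `FakeState` and the function below are hypothetical stand-ins, not the DaCe API, and the `(state, node)` value layout is an assumption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FakeAccessNode:
    data: str  # name of the data container the node refers to


@dataclass
class FakeState:
    label: str
    nodes: list[FakeAccessNode] = field(default_factory=list)


def build_conn_name_to_access_node_map(states, connector_names):
    """Visit every node once and group access nodes by connector name."""
    conn_map: dict[str, list[tuple[FakeState, FakeAccessNode]]] = {}
    for state in states:
        for node in state.nodes:
            if node.data in connector_names:
                conn_map.setdefault(node.data, []).append((state, node))
    return conn_map


states = [
    FakeState("then_branch", [FakeAccessNode("a"), FakeAccessNode("out")]),
    FakeState("else_branch", [FakeAccessNode("b"), FakeAccessNode("out")]),
]
conn_map = build_conn_name_to_access_node_map(states, {"a", "b", "out"})
```

Building the map for every connector costs one traversal, and later lookups for the relocatable subset become dictionary accesses.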
src/gt4py/next/program_processors/runners/dace/transformations/move_dataflow_into_if_body.py
Are you sure that here no filtering is needed?
This was possible because nodes_to_move were disjoint, which is no longer the case. Thus I would expect that some filtering is needed when you create the nodes inside the branch, but not when you create the new_nodes mapping.
The same applies to where you create the new data descriptors.
I have added the old_to_new_nodes_map that takes care of copying the access nodes per branch only once.
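A minimal sketch of how an old-to-new map can guarantee that a node shared by several node sets is copied into a branch only once. `FakeNode` and `clone_into_branch` are hypothetical names for illustration; the actual transformation works on DaCe nodes and states.

```python
class FakeNode:
    """Stand-in for a dataflow node; hashed by identity like a normal object."""

    def __init__(self, name: str) -> None:
        self.name = name


def clone_into_branch(node_sets, branch_nodes):
    """Copy each node into `branch_nodes` once and return the old-to-new map."""
    old_to_new_nodes_map: dict[FakeNode, FakeNode] = {}
    for nodes in node_sets:
        for old_node in nodes:
            if old_node in old_to_new_nodes_map:
                continue  # already copied into this branch by an earlier set
            new_node = FakeNode(old_node.name)
            branch_nodes.append(new_node)
            old_to_new_nodes_map[old_node] = new_node
    return old_to_new_nodes_map


shared, t1, t2 = FakeNode("a1"), FakeNode("t1"), FakeNode("t2")
branch_nodes: list[FakeNode] = []
mapping = clone_into_branch([[shared, t1], [shared, t2]], branch_nodes)
```

One such map would be created per branch, so the same outer node still gets a separate copy in each branch.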
src/gt4py/next/program_processors/runners/dace/transformations/move_dataflow_into_if_body.py
    existing_connector = conn_name_to_access_node_map[oedge.dst_conn][1].data
    if not inner_sdfg.arrays[existing_connector].transient:
        inner_sdfg.arrays[existing_connector].transient = True
        if_block.remove_in_connector(existing_connector)
I do not like this.
It is not "uniform", meaning that you now remove the in connector in two different locations.
I would prefer it to happen in only one location. Have you tried to just remove it here and see how the other one plays out?
I think it would actually complicate things if we removed it elsewhere, because we would have to keep track of which connectors have to be removed from here, or look for connectors that don't have any edges assigned to them.
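For illustration, the second alternative mentioned above (looking for connectors without any edge) amounts to a set difference. This is a sketch on plain names; `find_unused_in_connectors` is hypothetical and does not query a real SDFG.

```python
def find_unused_in_connectors(in_connectors, incoming_edge_conns):
    """Return the in connectors that no incoming edge is attached to."""
    return set(in_connectors) - set(incoming_edge_conns)


# Connector "a" lost its edge after the dataflow was moved inside.
unused = find_unused_in_connectors({"a", "b", "__cond"}, ["b", "__cond"])
```

The trade-off discussed in the thread is that this needs an extra scan over the incoming edges after relocation, whereas removing the connector on the spot needs no bookkeeping.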
Some changes were requested by Philip, removing my approval.
Co-authored-by: Philip Müller <philip.mueller@cscs.ch>
src/gt4py/next/program_processors/runners/dace/transformations/move_dataflow_into_if_body.py
    ][1].data
    if not inner_sdfg.arrays[inner_access_node_of_connector_name].transient:
        inner_sdfg.arrays[inner_access_node_of_connector_name].transient = True
        if_block.remove_in_connector(inner_access_node_of_connector_name)
Have you tried to remove just this line?
I think it should still work, because this function is called for the connector anyway and at the end it is removed.
//Currently remove_in_connector() does not check if the connector exists or not and ALWAYS returns True.
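A small stand-in that mimics the behaviour described in the comment above: removal is a no-op for a missing connector and the call always returns True, so removing the same connector twice is harmless. `FakeNestedSDFG` is a hypothetical class written for this sketch, not DaCe's implementation.

```python
class FakeNestedSDFG:
    """Stand-in whose connector removal tolerates missing names."""

    def __init__(self, in_connectors):
        self.in_connectors = set(in_connectors)

    def remove_in_connector(self, connector_name: str) -> bool:
        # `discard` does nothing if the name is absent, unlike `remove`.
        self.in_connectors.discard(connector_name)
        return True


if_block = FakeNestedSDFG({"a", "b"})
first = if_block.remove_in_connector("a")
second = if_block.remove_in_connector("a")  # second call is a no-op
```

Under that assumption, dropping the early removal would only change when the connector disappears, not whether it does.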
Actually, I had a closer look and figured out that if we add another tasklet after the a2 AccessNode, the transformation doesn't move all the AccessNodes and Tasklets it could into the if. See the following SDFG.
The blocker is that _partition_if_block expects the states of the ConditionalBlocks to contain only AccessNodes that have the names of the in/out connectors of the ConditionalBlock NestedSDFG.
Then _check_for_data_and_symbol_conflicts fails because it finds a2 inside the NestedSDFG.
I think the first blocker can easily be handled by relaxing the requirements. The second one, however, requires more work, because we have to rename an AccessNode when we move it inside and there is an existing AccessNode with the same name. I think this situation can only arise if we try to move into the NestedSDFG of the ConditionalBlock an AccessNode that has the same name as one of the input connectors, and thus as an AccessNode inside the NestedSDFG as well.
Thankfully, we currently don't come across this in the graupel case, so we can avoid fixing it, but I don't know whether we should actually take care of it in this PR.
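The renaming that the second blocker would require can be sketched as picking a name that is not yet taken inside the nested SDFG. The helper `find_free_name` and its suffix scheme are hypothetical and only illustrate the idea.

```python
def find_free_name(name: str, taken: set[str]) -> str:
    """Return `name` if unused, else the first `name_<i>` not in `taken`."""
    if name not in taken:
        return name
    index = 0
    while f"{name}_{index}" in taken:
        index += 1
    return f"{name}_{index}"


# "a2" already exists inside the nested SDFG, so the moved node is renamed.
inner_names = {"a2", "a2_0", "__cond"}
new_name = find_free_name("a2", inner_names)
```

Every memlet and edge that refers to the moved node would also have to be updated to the new name, which is the part that makes this more work than relaxing the first requirement.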

I discussed with @philip-paul-mueller to get this merged and the added test with |

Move nodes into the state of a ConditionalBlock even if they are marked by multiple Connectors
Needs:
main