Skip to content
This repository was archived by the owner on Apr 23, 2021. It is now read-only.

Conversation

bondhugula
Copy link
Contributor

@bondhugula bondhugula commented Sep 6, 2019

  • add canonicalization pattern to compose maps into affine loads/stores and
    canonicalize its maps + operands; templatize the pattern and reuse it for
    affine.apply as well

  • rename getIndices -> getMapOperands() (getIndices is confusing since
    these are no longer the indices themselves but operands to the map
    whose results are the indices). This also makes the accessor uniform
    across affine.apply/load/store. Change arg names on the affine
    load/store builder to avoid confusion. Drop an unused confusing build
    method on AffineStoreOp.

  • update incomplete doc comment for canonicalizeMapAndOperands (this was
    missed from a previous update).

Addresses issue #121

Signed-off-by: Uday Bondhugula uday@polymagelabs.com

@River707
Copy link
Contributor

River707 commented Sep 6, 2019

@andydavis1 Can you review this?

@andydavis1
Copy link
Contributor

Thanks for this pull request Uday. Can you comment on what the use case is here? The goal with affine.load/store is to construct them with affine maps which are already composed. What is your use case for generating IR with uncomposed affine.load/store maps?

@bondhugula
Copy link
Contributor Author

One typical use case is when we unroll and jam or unroll loops, affine apply's are going to be prepended to the innermost loop body blocks (where not all uses of the affine apply's results are loads and stores); so composing those maps into the loads and stores is quite useful there. For eg., scalar replacement on such register tiled/unrolled bodies will benefit from such one-time composition.

Similarly, other mutations that may insert affine apply's in front of ops like when normalizing memref layouts will need composition - but in these cases, one could use a helper method to update the load/store and compose in place without adding an affine apply (which you mention). However, for loop transformations, the IV remappings would get prepended as affine applys in front of code blocks, and not all uses of those affine applys are going to be loads and stores. For eg. they could be passed into function calls or nested regions that are isolated via block arguments.

@andydavis1
Copy link
Contributor

OK, but you won't be able to call build on affine.load/store (their verifier method checks that operands are valid dims/symbols). In the dma generation pass, I had to create affine.apply ops, then compose maps, before calling create for affine load/store/dma_start/wait (after which I would delete the temporary affine applys). I think what we need would be a way to compose a map in the IR, without having the temp affine applys. We are shooting for transformation passes that do not generate uncomposed maps for affine ops.

@bondhugula
Copy link
Contributor Author

bondhugula commented Sep 7, 2019

OK, but you won't be able to call build on affine.load/store (their verifier method checks that operands are valid dims/symbols). In the dma generation pass, I had to create affine.apply ops, then

I see - by "valid dims/symbols", I assume you mean "top-level symbols and for induction vars"?

compose maps, before calling create for affine load/store/dma_start/wait (after which I would delete the temporary affine applys). I think what we need would be a way to compose a map in the IR, without having the temp affine applys.

Yes, we really need this - to be able to compose the result of a "map + operands" into the input of another map (without having to create an affine.apply). The place to implement this would be the AffineApplyNormalizer. I actually have a method in my local copy that does a diff of two "map + operands" inputs to create a "result map + result operands" without creating additional affine.apply's. I think all support around this is going to be complete if we have the following two in general:

  1. compose the result of a "map + operands" into another map's input operand
  2. compute an affine function of the results of several "map + operands" and represent the result as a "map + operands" (the 'difference' would just be a special case of this and has common uses cases).

We are shooting for transformation passes that do not generate uncomposed maps for affine ops.

Note however that there is no way to avoid creating the affine.apply in general -- as an example to transform the snippet below (consider an unroll and jam, unroll, or any kind of shifting/scaling of the IV on it) --- because you can only compose those transforming maps into affine load's and store's but not into the call's or other op's which don't have maps as attributes. Similarly, std load's and store's won't have maps, and you can't eliminate the affine.apply resulting from the remapping. Furthermore, if the affine load's and store's are in an inner nested region that is isolated from above, the maps can't be composed with such ops and you'll have to leave the affine apply at the top. I don't see a disadvantage with this - there is still no uncomposed map post transformation.

for %i = ..
  for %j = ..
    affine.load %A[%i, %j]  :...
    affine.load %B[%i, %j + 1] ...
    %v = call @foo(%i, %j) : (index, index) -> (f64)
     "vec.bar"(%i, %j) : (...) -> (...)
    affine.store %v, %C[%i, %j]  : ...

@nicolasvasilache
Copy link

We are shooting for transformation passes that do not generate uncomposed maps for affine ops.

@andydavis1 any rough ETA on when we hope to do this? We have a chance back in January but didn't take, there is a lot of code I'd like to simplify once we are ready to concretely give up on chains of ops with maps + operands (originally affine.apply but now also affine.load/store).

Note however that there is no way to avoid creating the affine.apply in general -- as an example to transform the snippet below (consider an unroll and jam, unroll, or any kind of shifting/scaling of the IV on it) --- because you can only compose those transforming maps into affine load's and store's but not into the call's or other op's which don't have maps as attributes.

Right, the issue is chains of ops that have maps and operands and that will need to be composed at some point. By using only operations/builders that don't increase the chain length (I think fullyComposeMapAndOperands IIRC), and not writing tests using chains, we will be better off. If an affine.apply is conceptually needed and it has nothing to compose/fold into then it needs to be materialized.

@andydavis1
Copy link
Contributor

ntv@ I changed over all the affine analysis and transformation passes to create affine ops with composed maps by construction in my mega commit I made back in august.

bondhugula@ I see the need for affine applys to still exist for things like function calls (your example above). However, in your example above, the affine loads/stores should be transformed to take a composed affine map (even if the function call cannot and still needs an affine apply).

@bondhugula
Copy link
Contributor Author

ntv@ I changed over all the affine analysis and transformation passes to create affine ops with composed maps by construction in my mega commit I made back in august.

bondhugula@ I see the need for affine applys to still exist for things like function calls (your example above). However, in your example above, the affine loads/stores should be transformed to take a composed affine map (even if the function call cannot and still needs an affine apply).

I see. But how about the use case where you inline a function and it brings in affine.load's/affine.store's from the callee and there are dominating affine.applys at the caller? At that point, one would have to compose any affine apply at the caller with the just inlined function. Do you foresee that the inliner does the composition in an integrated way rewriting the affine load/store and thus keeping it composed? But this would lead to a mixing of concerns on the flip side. Normally, one may want to just inline (keeping the inliner simple), and then have a pattern rewrite to compose those maps into whichever ops that were brought in for inlining. WDYT?

The other option would be that of composing maps across procedure boundaries (rewriting function argument lists and cloning functions) would require more elaborate implementation and we don't have any such support at the moment AFAIK. Even if we did, we may still want to be able to represent the affine.apply outside and be able to inline and then compose.

@andydavis1
Copy link
Contributor

bondhugula@ In both your loop unrolling and function inlining example, its OK for the non-affine.load/store operations to take affine.apply results, but any IR transformation should create affine.load/store ops with composed affine maps (I think this was the main point of creating these new ops). Otherwise, you'd have to call canonicalize after every affine transformation pass and affine transformation passes could not depend on affine.load/stores already having composed maps (making them more complicated, because they would have to re-check).

@bondhugula
Copy link
Contributor Author

bondhugula@ In both your loop unrolling and function inlining example, its OK for the non-affine.load/store operations to take affine.apply results, but any IR transformation should create affine.load/store ops with composed affine maps (I think this was the main point of creating these new ops). Otherwise, you'd have to call canonicalize after every affine transformation pass and affine transformation passes could not depend on affine.load/stores already having composed maps (making them more complicated, because they would have to re-check).

Yes, this makes sense to me. There are currently a handful of places (like loop unroll, unroll and jam) where such uncomposed affine.apply's are being generated. I think once we have a helper that can compose into ops that would allow it, those transforms could be updated. Looks like we need an enhanced replaceAllUsesWith that takes an AffineMap as an additional input, and does a compose with map + replace operand to applicable ops. WDYT?

@bondhugula bondhugula changed the title Add rewrite pattern to compose maps into affine load/stores Add rewrite pattern to canonicalize/compose maps into affine load/stores Sep 13, 2019
@bondhugula
Copy link
Contributor Author

@andydavis1 Coming back to this PR: the rewrite patterns that this PR is adding really do two things: in addition to the composition, they are also canonicalizing the map and operands of the load/store (this was also missing). In various cases after transformations/replacements, it's common to have unused or duplicate or constant operands -- as an example, replaceAllUsesWith could change %A[%i - %j] to %A[%i - %i] or %A[%i - %c0] - this happens often when replacing single iteration loops by their bodies. So we want -canonicalize to be able to trim operands or propagate constants into affine ops.

As for the composing into load/store part, once we have the 'composed by construction' invariant respected everywhere, the only use cases I can think are: (a) composing post function inlining, (b) users of MLIR generating such uncomposed stuff as the first pass for convenience and then calling the canonicalizer to compose before any IR verification is done.

@joker-eph
Copy link
Contributor

Seems like this branch is in conflicts?

@bondhugula
Copy link
Contributor Author

Seems like this branch is in conflicts?

Fixed, thanks.

- add canonicalization pattern to compose maps into affine loads/stores;
  templatize the pattern and reuse it for affine.apply as well

- rename getIndices -> getMapOperands() (getIndices is confusing since
  these are no longer the indices themselves but operands to the map
  whose results are the indices). This also makes the accessor uniform
  across affine.apply/load/store. Change arg names on the affine
  load/store builder to avoid confusion. Drop an unused confusing build
  method on AffineStoreOp.

- update incomplete doc comment for canonicalizeMapAndOperands (this was
  missed from a previous update).

Addresses issue tensorflow#121

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>
@bondhugula
Copy link
Contributor Author

Seems like this branch is in conflicts?

Fixed, thanks.

@joker-eph All checks now pass. Thanks.

tensorflow-copybara pushed a commit to tensorflow/tensorflow that referenced this pull request Sep 17, 2019
- add canonicalization pattern to compose maps into affine loads/stores;
  templatize the pattern and reuse it for affine.apply as well

- rename getIndices -> getMapOperands() (getIndices is confusing since
  these are no longer the indices themselves but operands to the map
  whose results are the indices). This also makes the accessor uniform
  across affine.apply/load/store. Change arg names on the affine
  load/store builder to avoid confusion. Drop an unused confusing build
  method on AffineStoreOp.

- update incomplete doc comment for canonicalizeMapAndOperands (this was
  missed from a previous update).

Addresses issue #121

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes #122

COPYBARA_INTEGRATE_REVIEW=tensorflow/mlir#122 from bondhugula:compose-load-store e71de1771e56a85c4282c10cb43f30cef0701c4f
PiperOrigin-RevId: 269619540
@bondhugula bondhugula deleted the compose-load-store branch September 28, 2019 05:18
swift-ci pushed a commit to swiftlang/llvm-project that referenced this pull request Dec 24, 2019
- add canonicalization pattern to compose maps into affine loads/stores;
  templatize the pattern and reuse it for affine.apply as well

- rename getIndices -> getMapOperands() (getIndices is confusing since
  these are no longer the indices themselves but operands to the map
  whose results are the indices). This also makes the accessor uniform
  across affine.apply/load/store. Change arg names on the affine
  load/store builder to avoid confusion. Drop an unused confusing build
  method on AffineStoreOp.

- update incomplete doc comment for canonicalizeMapAndOperands (this was
  missed from a previous update).

Addresses issue tensorflow/mlir#121

Signed-off-by: Uday Bondhugula <uday@polymagelabs.com>

Closes tensorflow/mlir#122

COPYBARA_INTEGRATE_REVIEW=tensorflow/mlir#122 from bondhugula:compose-load-store e71de1771e56a85c4282c10cb43f30cef0701c4f
PiperOrigin-RevId: 269619540
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants