Allocation order refactor (#2168)
Refactored the allocation order inference pass:
* Instead of per-operation propagation rules, we now use IdModel
mapping to map the allocation domain of a reference tensor to the
rfactor domain of a target tensor.
* Updated the inference API to accept explicit sources and destinations
for the propagation:
  ```
void inferenceAllocationOrder(
    Fusion* fusion,
    const std::vector<TensorView*>& srcs,
    const std::vector<TensorView*>& dsts);
  ```

* The propagation tries to keep the memory format of `dsts` close to
that of `srcs`, which simplifies scheduling and facilitates vectorization.
It works roughly as follows:
  * For each entry `dst`, among all its producers in `srcs`, we pick
  the one with the most loop iter domains in its allocation domain as the
  reference `ref`.
  * We try to map each iter domain in `dst`'s rfactor domain to `ref`'s
  allocation domain and place the mapped ones as the inner dimensions of
  `dst`'s new allocation domain, while placing unmapped iter domains as
  outer dimensions.
* The mapping logic currently includes a workaround (WAR), because the
reduction scheduler struggles with permuted outputs. See issue #2202. The
WAR simply keeps each reduction iter domain at the same position in the
new allocation domain as it has in the rfactor domain. It should be
removed once the reduction scheduler is fixed. Both code paths are kept
in this PR to make that future cleanup easier.
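
The two propagation steps above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code in this PR: `ToyId`, `ToyTensor`, `pickReference`, and `inferAllocation` are hypothetical names, iter domains are plain string labels, two iter domains count as "mapped" when their labels are equal (the real pass uses IdModel for this), "loop iter domain" is assumed to mean a non-broadcast, non-reduction iter domain, and the reduction WAR is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an iter domain: identified by a label only.
struct ToyId {
  std::string name;
  bool is_loop = true;  // assumption: false for broadcast/reduction ids
};

// Hypothetical stand-in for a TensorView; domains are listed outer to inner.
struct ToyTensor {
  std::vector<ToyId> rfactor;
  std::vector<ToyId> allocation;
};

// Number of loop iter domains in a domain.
std::size_t countLoopIds(const std::vector<ToyId>& ids) {
  return static_cast<std::size_t>(std::count_if(
      ids.begin(), ids.end(), [](const ToyId& id) { return id.is_loop; }));
}

// Step 1: among the producers, pick the one whose allocation domain has the
// most loop iter domains. Ties go to the earliest producer; returns nullptr
// when there are no producers.
const ToyTensor* pickReference(const std::vector<const ToyTensor*>& producers) {
  const ToyTensor* ref = nullptr;
  std::size_t best = 0;
  for (const ToyTensor* p : producers) {
    const std::size_t n = countLoopIds(p->allocation);
    if (ref == nullptr || n > best) {
      ref = p;
      best = n;
    }
  }
  return ref;
}

// Step 2: build dst's new allocation order (outer to inner).
// Mapped ids follow ref's allocation order and become the inner dimensions;
// unmapped ids keep their rfactor order and become the outer dimensions.
std::vector<std::string> inferAllocation(
    const std::vector<ToyId>& dst_rfactor,
    const std::vector<ToyId>& ref_allocation) {
  std::vector<std::string> inner;
  for (const ToyId& r : ref_allocation) {
    for (const ToyId& d : dst_rfactor) {
      if (d.name == r.name) {
        inner.push_back(d.name);
        break;
      }
    }
  }
  std::vector<std::string> result;
  for (const ToyId& d : dst_rfactor) {
    if (std::find(inner.begin(), inner.end(), d.name) == inner.end()) {
      result.push_back(d.name);  // unmapped: outer dimension
    }
  }
  result.insert(result.end(), inner.begin(), inner.end());
  return result;
}
```

For example, with a `dst` rfactor domain of `n, c, h, w, k` and a reference allocation domain of `n, h, w, c`, this sketch yields `k, n, h, w, c`: the unmapped `k` is pushed outermost and the mapped ids follow the reference's order.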

---------

Co-authored-by: Naoya Maruyama <naoyam@users.noreply.github.com>
Co-authored-by: Jingyue Wu <wujingyue@gmail.com>
3 people committed May 14, 2024
1 parent dfba77a commit 8c18701
Showing 5 changed files with 328 additions and 517 deletions.
