[NNC] New APIs to get loops corresponding to a Buf #53778

navahgar · 2021-03-11T00:47:35Z

This PR adds the following APIs to NNC.

// In For:
static For* getParentLoop(const Stmt* st);
static std::vector<For*> getEnclosingLoopNest(const Stmt* st);

// In LoopNest:
std::vector<const Stmt*> getAllWritesToBuf(const Buf*) const;
std::vector<For*> getAllInnermostLoopsWritingToBuf(const Buf*) const;
std::vector<std::vector<For*>> getAllLoopNestsWritingToBuf(const Buf*) const;

These APIs are required for some use cases that involve multiple transformations, like splitWithTail followed by reorder, as shown in #53092
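
To illustrate why a single Buf can map to several loop nests after a transformation such as splitWithTail, here is a minimal, self-contained sketch on a toy IR. Everything in it (`Node`, `Kind`, `collectWrites`, `loopNestsWritingTo`, `toySplitExample`) is a hypothetical stand-in and not the NNC implementation. It also returns one nest per write, which may differ from how the real `getAllLoopNestsWritingToBuf` groups its results.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Toy IR (hypothetical, not NNC): a node is a block, a loop, or a store.
enum class Kind { Block, Loop, Store };

struct Node {
  Kind kind = Kind::Block;
  std::string name;  // loop variable for loops, buffer name for stores
  Node* parent = nullptr;
  std::vector<std::unique_ptr<Node>> children;

  // Append a child and wire up its parent pointer.
  Node* add(Kind k, const std::string& n) {
    children.push_back(std::make_unique<Node>());
    Node* c = children.back().get();
    c->kind = k;
    c->name = n;
    c->parent = this;
    return c;
  }
};

// Depth-first collection of every store that writes to `buf`.
void collectWrites(Node* n, const std::string& buf, std::vector<Node*>& out) {
  if (n->kind == Kind::Store && n->name == buf) {
    out.push_back(n);
  }
  for (auto& c : n->children) {
    collectWrites(c.get(), buf, out);
  }
}

// For each write to `buf`, list its enclosing loops, outermost first.
std::vector<std::vector<Node*>> loopNestsWritingTo(
    Node* root, const std::string& buf) {
  std::vector<Node*> writes;
  collectWrites(root, buf, writes);
  std::vector<std::vector<Node*>> nests;
  for (Node* w : writes) {
    std::vector<Node*> nest;
    for (Node* p = w->parent; p != nullptr; p = p->parent) {
      if (p->kind == Kind::Loop) {
        nest.push_back(p);
      }
    }
    std::reverse(nest.begin(), nest.end());
    nests.push_back(nest);
  }
  return nests;
}

// Shape left behind by a split with a tail loop:
//   for i_outer { for i_inner { A = ... } }
//   for i_tail  { A = ... }
// Returns the depth of each nest that writes to A.
std::vector<std::size_t> toySplitExample() {
  Node root;
  Node* outer = root.add(Kind::Loop, "i_outer");
  Node* inner = outer->add(Kind::Loop, "i_inner");
  inner->add(Kind::Store, "A");
  Node* tail = root.add(Kind::Loop, "i_tail");
  tail->add(Kind::Store, "A");

  std::vector<std::size_t> depths;
  for (const auto& nest : loopNestsWritingTo(&root, "A")) {
    depths.push_back(nest.size());
  }
  return depths;
}
```

In this toy, the buffer `A` is written from two separate nests (one of depth 2, one of depth 1), which is the situation where a follow-up transformation such as reorder needs a way to pick the nest it wants.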

facebook-github-bot · 2021-03-11T00:47:44Z

💊 CI failures summary and remediations

As of commit 342ad4d (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚

This comment was automatically generated by Dr. CI (expand for details).

codecov · 2021-03-11T04:04:04Z

Codecov Report

Merging #53778 (342ad4d) into master (d57ae6c) will decrease coverage by 0.00%.
The diff coverage is 75.00%.

@@            Coverage Diff             @@
##           master   #53778      +/-   ##
==========================================
- Coverage   77.29%   77.29%   -0.01%     
==========================================
  Files        1888     1888              
  Lines      183504   183528      +24     
==========================================
+ Hits       141838   141852      +14     
- Misses      41666    41676      +10

bertmaher

I tried this out yesterday, and it's definitely better than not having it :-D.

That said, I'm very worried about the overall approach, because it's very difficult to keep track of the loop structure of a program when you're writing a schedule (and I suspect that would make autotuning messy as well, although maybe it could be overcome by throwing a ton of cycles at it).

As an example, I wanted to perform a conceptually simple optimization to special-case the 2-pixel padding on a 5x5 stride=1 convolution: peel the first and last 2 iterations. The code for this looks like this impenetrable thicket:

  For *head, *tail;
  auto loops = nest.getAllLoopNestsWritingToBuf(conv->buf());
  nest.sliceHead(loops[1][3], 1, &head, &tail);
  loops = nest.getAllLoopNestsWritingToBuf(conv->buf());
  nest.sliceHead(loops[3][3], 1, &head, &tail);
  loops = nest.getAllLoopNestsWritingToBuf(conv->buf());
  nest.sliceTail(loops[5][3], 1, &head, &tail);
  loops = nest.getAllLoopNestsWritingToBuf(conv->buf());
  nest.sliceTail(loops[5][3], 1, &head, &tail);

  loops = nest.getAllLoopNestsWritingToBuf(conv->buf());
  nest.sliceHead(loops[5][2], 1, &head, &tail);
  loops = nest.getAllLoopNestsWritingToBuf(conv->buf());
  nest.sliceHead(loops[15][2], 1, &head, &tail);
  loops = nest.getAllLoopNestsWritingToBuf(conv->buf());
  nest.sliceTail(loops[25][2], 1, &head, &tail);
  loops = nest.getAllLoopNestsWritingToBuf(conv->buf());
  nest.sliceTail(loops[25][2], 1, &head, &tail);

I might be able to do a cleaner job here by using head and tail more wisely to keep track of the "main loop" that I'm operating on (this schedule ends up being a loser anyways so I don't know if I'll put more effort into it). But I wish it didn't feel like an intellectually Herculean task to express "peel the first and last two iterations".

test/cpp/tensorexpr/test_loopnest.cpp

bertmaher · 2021-03-11T17:10:40Z

torch/csrc/jit/tensorexpr/stmt.h

+  // Returns the For stmt that is immediately enclosing the given stmt.
+  static For* getParentLoop(const Stmt* st) {
+    if (st == nullptr) {
+      return nullptr;
+    }
+    auto par = st->get_parent();
+    if (auto f = dynamic_cast<For*>(par)) {
+      return f;
+    }
+    return For::getParentLoop(par);
+  }
+
+  // Returns the list of For stmts corresponding to the loopnest that is
+  // enclosing the given stmt.
+  static std::vector<For*> getEnclosingLoopNest(const Stmt* st) {
+    std::vector<For*> loops;
+    auto f = For::getParentLoop(st);
+    while (f) {
+      loops.push_back(f);
+      f = For::getParentLoop(f);
+    }
+    std::reverse(loops.begin(), loops.end());
+    return loops;
+  }
+


Is there any compelling reason to make these functions static methods of For? I think they should be static free functions inside loopnest.cpp instead, to minimize clutter in the API.

I can move them to LoopNest, but why do you want them to be non-static? There is no use of any state here.

@ZolotukhinM and I have been leaning towards static methods in the hope that we can get rid of the state in LoopNest eventually. That would make the code more clear IMO.

Moved them to LoopNest but retained as static. Let me know if you have any reservations with that.

Sorry, static is overloaded. I meant a static free function (not part of a class), as opposed to a class member. But a class member is fine too if they should be exposed to the API.

Aah okay. Ideally, we should have them as standalone functions. That's probably my end goal as well, once we make all these transforms static in LoopNest. For now, since we have several other static functions (and no standalone functions), let's keep it that way for uniformity.
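
For readers without the NNC headers at hand, the parent walk from the diff above can be sketched as standalone free functions on a toy IR. `ToyStmt`, `ToyFor`, `toyParentLoop`, `toyEnclosingLoopNest`, and `toyLoopNestExample` are hypothetical names for illustration only; the real code operates on NNC's `Stmt` and `For` through `get_parent()`.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy stand-ins for NNC's Stmt and For (hypothetical, not the real classes).
struct ToyStmt {
  virtual ~ToyStmt() = default;
  ToyStmt* parent = nullptr;
};

struct ToyFor : ToyStmt {
  explicit ToyFor(ToyStmt* body_stmt) : body(body_stmt) {
    if (body != nullptr) {
      body->parent = this;
    }
  }
  ToyStmt* body;
};

// Walk parent pointers until a loop is found; nullptr if there is none.
ToyFor* toyParentLoop(const ToyStmt* st) {
  for (ToyStmt* p = (st != nullptr) ? st->parent : nullptr; p != nullptr;
       p = p->parent) {
    if (auto* f = dynamic_cast<ToyFor*>(p)) {
      return f;
    }
  }
  return nullptr;
}

// Collect all enclosing loops, outermost first.
std::vector<ToyFor*> toyEnclosingLoopNest(const ToyStmt* st) {
  std::vector<ToyFor*> loops;
  for (ToyFor* f = toyParentLoop(st); f != nullptr; f = toyParentLoop(f)) {
    loops.push_back(f);
  }
  std::reverse(loops.begin(), loops.end());
  return loops;
}

// Builds `for outer { for inner { store } }` and checks the ordering.
bool toyLoopNestExample() {
  ToyStmt store;
  ToyFor inner(&store);
  ToyFor outer(&inner);
  std::vector<ToyFor*> nest = toyEnclosingLoopNest(&store);
  return nest.size() == 2 && nest[0] == &outer && nest[1] == &inner &&
      toyParentLoop(&outer) == nullptr;
}
```

Neither function touches any object state, which is the point raised in the thread: they work equally well as static class members or as free functions.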

ZolotukhinM · 2021-03-11T18:17:16Z

As an example, I wanted to perform a conceptually simple optimization to special-case the 2-pixel padding on a 5x5 stride=1 convolution: peel the first and last 2 iterations. The code for this looks like this impenetrable thicket:

To be honest, that seems like a wrong usage, and not an issue with the API. Here is the code that'd peel first and last iterations, and it looks perfectly readable and reasonable to me:

  For *main, *peeled;

  // Locate the loop we want to peel
  auto loops = nest.getAllLoopNestsWritingToBuf(conv->buf());
  main = loops[1][3];
  // Peel the first iteration
  nest.sliceHead(main, 1, &peeled, &main);
  // Peel the last iteration
  nest.sliceTail(main, 1, &main, &peeled);

bertmaher · 2021-03-11T19:29:49Z

Okay, so I'm an idiot. I tried out your way, and it's pretty nice:

  auto loops = nest.getAllLoopNestsWritingToBuf(conv->buf());
  main = loops[1][3];
  nest.sliceHead(main, 1, &peeled, &main);
  nest.sliceHead(main, 1, &peeled, &main);
  nest.sliceTail(main, 1, &main, &peeled);
  nest.sliceTail(main, 1, &main, &peeled);
  main = For::getParentLoop(main);
  nest.sliceHead(main, 1, &peeled, &main);
  nest.sliceHead(main, 1, &peeled, &main);
  nest.sliceTail(main, 1, &main, &peeled);
  nest.sliceTail(main, 1, &main, &peeled);

bertmaher · 2021-03-11T19:31:12Z

I think the reason I wrote the first version is that I "want" to think in terms of the structure of the whole computation, so I keep doing "get all the loops". But that's not the best way to express this; you want to hang on to a handle to the main loop and navigate from there.

My perf results are still bad, but that's LLVM's fault :-p
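
The "hang on to a handle to the main loop" pattern can be sketched with a toy model of slicing. This is only an illustration under assumptions: `ToyLoop`, `ToyNest`, `peelBothEnds`, and `toyPeelExample` are hypothetical, the toy `sliceHead`/`sliceTail` only mirror the argument order used in the snippets above, and the slice size is assumed to be smaller than the loop's trip count.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

// Toy loop over the half-open range [start, stop). Not NNC's For class.
struct ToyLoop {
  int start;
  int stop;
};

// Toy container holding loops in program order (hypothetical, not LoopNest).
struct ToyNest {
  std::list<ToyLoop> loops;  // std::list keeps element addresses stable

  // Find the list position of a loop by address.
  std::list<ToyLoop>::iterator locate(ToyLoop* f) {
    auto it = loops.begin();
    while (it != loops.end() && &*it != f) {
      ++it;
    }
    assert(it != loops.end());
    return it;
  }

  // Split off the first n iterations: head = first n, tail = the rest.
  void sliceHead(ToyLoop* f, int n, ToyLoop** head, ToyLoop** tail) {
    auto head_it = loops.insert(locate(f), ToyLoop{f->start, f->start + n});
    f->start += n;
    *head = &*head_it;
    *tail = f;
  }

  // Split off the last n iterations: head = the rest, tail = last n.
  void sliceTail(ToyLoop* f, int n, ToyLoop** head, ToyLoop** tail) {
    auto tail_it =
        loops.insert(std::next(locate(f)), ToyLoop{f->stop - n, f->stop});
    f->stop -= n;
    *head = f;
    *tail = &*tail_it;
  }
};

// Peel n single iterations from each end, always re-slicing the main loop
// that the previous slice handed back.
ToyLoop* peelBothEnds(ToyNest& nest, ToyLoop* main_loop, int n) {
  ToyLoop* peeled = nullptr;
  for (int i = 0; i < n; ++i) {
    nest.sliceHead(main_loop, 1, &peeled, &main_loop);
  }
  for (int i = 0; i < n; ++i) {
    nest.sliceTail(main_loop, 1, &main_loop, &peeled);
  }
  return main_loop;
}

// Peels two iterations from each end of a 32-iteration loop and returns
// the remaining main loop by value; total_loops receives the loop count.
ToyLoop toyPeelExample(std::size_t* total_loops) {
  ToyNest nest;
  nest.loops.push_back(ToyLoop{0, 32});
  ToyLoop* main_loop = peelBothEnds(nest, &nest.loops.front(), 2);
  *total_loops = nest.loops.size();
  return *main_loop;
}
```

Because each slice returns the surviving main loop, there is no need to re-query all loop nests and recompute indices after every transformation.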

facebook-github-bot

@navahgar has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot · 2021-03-13T02:52:49Z

@navahgar merged this pull request in ef07a04.

Summary:
Fixes pytorch#53092

This PR adds the following APIs to NNC.
```
// In For:
static For* getParentLoop(const Stmt* st);
static std::vector<For*> getEnclosingLoopNest(const Stmt* st);

// In LoopNest:
std::vector<const Stmt*> getAllWritesToBuf(const Buf*) const;
std::vector<For*> getAllInnermostLoopsWritingToBuf(const Buf*) const;
std::vector<std::vector<For*>> getAllLoopNestsWritingToBuf(const Buf*) const;
```
These APIs are required for some usecases that involve multiple transformations like `splitWithTail` followed by `reorder` as shown in pytorch#53092

Pull Request resolved: pytorch#53778

Reviewed By: albanD

Differential Revision: D26987013

Pulled By: navahgar

fbshipit-source-id: 491459eddfff045132d2358631ad069bbcc520df

navahgar requested review from bertmaher and ZolotukhinM March 11, 2021 00:47

facebook-github-bot added the oncall: jit and cla signed labels Mar 11, 2021

navahgar changed the title ~~[NNC] New APIs to get loops corresponding to a Buf.~~ [NNC] New APIs to get loops corresponding to a Buf Mar 11, 2021

bertmaher approved these changes Mar 11, 2021

navahgar force-pushed the reorder1 branch from d66efbb to 158fce7 Compare March 11, 2021 20:05

facebook-github-bot reviewed Mar 11, 2021

navahgar mentioned this pull request Mar 12, 2021

[NNC] Adding API to distribute loops #53865

Closed

[NNC] New APIs to get loops corresponding to a Buf.

342ad4d

navahgar force-pushed the reorder1 branch from 158fce7 to 342ad4d Compare March 12, 2021 01:59

facebook-github-bot reviewed Mar 12, 2021

facebook-github-bot closed this in ef07a04 Mar 13, 2021

facebook-github-bot added the Merged label Mar 13, 2021
