Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MLIR][XLA] Buffer Assignment #37212

Merged

Conversation

dfki-ehna
Copy link
Contributor

In this PR, we have provided our first draft for generic XLA buffer assignment. This generic BufferAssignment class automatically analyzes the values and their aliases (also in other blocks) and returns the positions of allocations and deallocations. To find these positions, the algorithm uses the block Dominator and Post-Dominator analyses. In our proposed algorithm, we have considered aliasing, liveness analysis, nested regions, branches, conditional branches, critical edges, and independency to custom block terminators. This implementation doesn't support block loops. However, we have considered this in our design. For this purpose, we only need to have a loop analysis to allocate and deallocate some special cases outside of these loops.

This is the sample usage of BufferAssignment class:

BufferAssignment assignments(funcOp);
auto positions = assignments.computeAllocAndDeallocPositions(value);
allocBuilder.setInsertionPoint(positions.getAllocPosition());
<create alloc>
deallocBuilder.setInsertionPointAfter(positions.getDeallocPosition());
<create dealloc>

Or alternatively:

BufferAssignment assignments(funcOp);
auto positions = assignments.computeAllocAndDeallocPositions(value);
positions.insertAlloc<AllocOp>(...);
positions.insertDealloc<DeallocOp>(...);

Please note that this class should be used during the legalization (i.e. HLO-To-LHLO).

@tensorflow-bot tensorflow-bot bot added the size:L CL Change Size: Large label Mar 1, 2020
@gbaned gbaned self-assigned this Mar 2, 2020
@gbaned gbaned requested a review from jpienaar March 2, 2020 04:19
@gbaned gbaned added this to Assigned Reviewer in PR Queue via automation Mar 2, 2020
@sherhut sherhut requested review from pifon2a and sherhut March 2, 2020 14:15
Copy link
Contributor

@pifon2a pifon2a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First portion of nits. More coming.

@gbaned gbaned requested a review from pifon2a March 5, 2020 11:52
@gbaned gbaned added the awaiting review Pull request awaiting review label Mar 11, 2020
@dfki-ehna
Copy link
Contributor Author

Refactored BufferAssignment analysis into two distinct parts:

  • BufferAssignmentLegalizer helper class that can be used during legalization.
  • BufferAssignmentPass that moves alloc and dealloc nodes (if any) into the
    right positions. If there are not associated dealloc nodes for a given
    alloc node, this pass inserts a DeallocOp in the right place automatically.

PR Queue automation moved this from Assigned Reviewer to Reviewer Requested Changes Mar 18, 2020
@sherhut
Copy link

sherhut commented Mar 18, 2020

Could you provide an example how the use of this would look like for hlo->lhlo? It is not clear to me how this composes.

@dfki-ehna
Copy link
Contributor Author

dfki-ehna commented Mar 19, 2020

The intended use of this buffer assignment is that the dialect-expert creates an instance of BufferAssignmentLegalizer at the beginning of the pass (i.e. in the runOnFunction()) and passes this instance to all conversion-pattern classes using OwningRewritePatternList instance.
These conversion pattern classes could inherit from BufferAssignmentOpConversionPattern that has a protected BufferAssignmentLegalizer member. This will avoid creating BufferAssignmentLegalizer multiple times. In each conversion pattern class that the dialect-expert needs to allocate a buffer, they do the following:

BufferAssignmentOpConversionPattern<HloOpTy>::bufferAssignment->computeAllocPosition(result);
auto alloc = allocPosition.insertAlloc<AllocOp>(result, loc, memref_type); // we are going to remove result from the list of arguments in the next commit since it's unnecessary.

Later on, --buffer-assignment flag must be passed to tf-opt to move Alloc and Dealloc operations in their proper positions. If there is no Dealloc for a buffer, it will be automatically inserted.

sherhut
sherhut previously approved these changes Mar 30, 2020
Copy link

@sherhut sherhut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this landed and improve further in tree. Thanks!

PR Queue automation moved this from Reviewer Requested Changes to Approved by Reviewer Mar 30, 2020
@tensorflow-bot tensorflow-bot bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Mar 30, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Mar 30, 2020
@gbaned gbaned added kokoro:force-run Tests on submitted change and removed awaiting review Pull request awaiting review labels Mar 31, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Mar 31, 2020
PR Queue automation moved this from Reviewer Requested Changes to Approved by Reviewer Apr 1, 2020
@tensorflow-bot tensorflow-bot bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Apr 1, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Apr 1, 2020
tensorflow/compiler/mlir/BUILD Outdated Show resolved Hide resolved
PR Queue automation moved this from Approved by Reviewer to Reviewer Requested Changes Apr 2, 2020
@google-ml-butler google-ml-butler bot removed the ready to pull PR ready for merge process label Apr 2, 2020
Copy link

@sherhut sherhut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

PR Queue automation moved this from Reviewer Requested Changes to Approved by Reviewer Apr 2, 2020
@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Apr 2, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Apr 2, 2020
@gbaned gbaned removed the stat:awaiting response Status - Awaiting response from author label Apr 2, 2020
}
}
return result;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a test covering this function? (our test coverage shows it uncovered)

}
});
};
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm missing where is this pass covered by tests?

@tensorflow-copybara tensorflow-copybara merged commit bae789d into tensorflow:master Apr 3, 2020
@joker-eph
Copy link
Contributor

Lacking my comment being addressed, I issued a revert here: #38291

@byronyi
Copy link
Contributor

byronyi commented Apr 7, 2020

This unfortunately leaves this PR to a terminal state. @dfki-ehna would you mind to open a new one? Thanks!

@dfki-mako
Copy link
Contributor

@joker-eph Oh, I see 🔢 Maybe we can add a test dialect including a legalization step in order to simulate lowering from one dialect to another one (or even the same) with buffers?

@joker-eph
Copy link
Contributor

Why do we need a test dialect? I may be missing something but the description of the pass is:

 /// The actual buffer assignment pass that moves alloc and dealloc nodes into
 /// the right positions. 

I was expecting this to be able to run on something like LHLO?

@pifon2a
Copy link
Contributor

pifon2a commented Apr 7, 2020

@joker-eph The test dialect will be needed if we want to have this pass in MLIR core, since there is no LHLO there. I think @dfki-mako was assuming this.

@dfki-mako
Copy link
Contributor

@joker-eph I have already thought about this transfer and future features at the same time. It might be useful to have a small test dialect that contains additional features that we might want to think about and include in the general BA pass.

@dfki-mako
Copy link
Contributor

@joker-eph Please note that this is not a strict requirement at the moment. We can include several Linalg-based tests in the MLIR core version and apply BA to them.

@joker-eph
Copy link
Contributor

I am a bit lost (but it is quite late here, my brain may be slow...): I was looking at this PR in TensorFlow, in a disconnected way from Core, and I see code that does not seem to have any test here.
Regardless of plans about upstreaming, I expect that the development in MLIR-TensorFlow follows roughly the same set of good practices as upstream, starting with complete coverage of new code with lit tests. Unless I'm mistaken there is a non-tested pass in this PR?

@dfki-mako
Copy link
Contributor

Yes and no. We had to redesign several parts and split some features (regarding legalization and use) into other PRs. No, because the test pass basically computes all required information to move allocs and deallocs. Yes, because the actual movement part is not covered.

@dfki-ehna
Copy link
Contributor Author

@joker-eph you are right with respect to this PR. It has never been our intention to push an untested pass. We address the missing-tests issue and add a complete test case to test the whole BA pass including movement of allocs and deallocs.

@joker-eph
Copy link
Contributor

We address the missing-tests issue and add a complete test case to test the whole BA pass including movement of allocs and deallocs.

Is there already a PR up for review?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes ready to pull PR ready for merge process size:L CL Change Size: Large
Projects
PR Queue
  
Approved by Reviewer
Development

Successfully merging this pull request may close these issues.

None yet