Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace tm_tensor with torch-mlir proper #14917

Merged
merged 5 commits into from
Sep 6, 2023
Merged

Conversation

stellaraccident
Copy link
Collaborator

Per discussion on iree-discuss, this patch removes the original tm_tensor source snapshots and special casing from the codebase, instead pulling in the parts of torch-mlir that provide a full solution, starting from the torch dialect down.

This reuses the existing IREE_INPUT_TORCH CMake define but routes it to a compiler plugin at compiler/plugins/input/Torch where all build plumbing to integrate and local pipelines exist. I have kept the original tm_tensor input type for the moment since downstreams are using that, but we will want to eventually replace that with a dedicated torch input type which works directly from the torch dialect.

This patch should be NFC to all current users, however it allows programmatic and API access to the full Torch dialects and conversion pipelines, which is essential for broader scale integration.

We add third_party/torch-mlir as a submodule at torch-mlir's current HEAD. During LLVM integrates, if there are ever API breaks that affect this, we may need to push patches to torch-mlir to address. This can be done on a torch-mlir branch as needed (or committed at head). I have been tracking changes for a couple of months and have not had a need to do this yet, which indicates that this is relatively stable.

Committing with an XFAIL'd lit test, which for some reason was not running previously and now is: #14916

stellaraccident added a commit to llvm/torch-mlir that referenced this pull request Sep 6, 2023
…ned and optional.

While working on iree-org/iree#14917, I noticed that it is somewhat hard to take a dependency on torch-mlir such that one only builds deps for the target(s) of interest (in this case Linalg). I noticed that some ifdef'ey optionality was added for stablehlo, but this was not mirrored for the others. Further, it does the switching very deep in the dependency graph vs having top-level directories and defines gating entire features. In addition, I noticed that a lot of things in the Linalg path were broken down to a fine level of detail but were not actually shared/shareable outside of that target. I opted to clump these together into TorchToLinalg. It is easy enough to "promote" them to common with this new organization if the need arises.

General approach:

* Isolate each conversion target in one of TorchToLinalg, TorchToStablehlo, TorchToTosa.
* Gate each by top-level CMake flags and defines.
* Common conversions go in a Common/ directory (currently Arith and SCF).
* Pull target specific conversions out of TorchConversion/Transforms and put in their top-level directory.
* General maintenance on the build graph and registration stuff that had bitrotted and was blocking progress.

The main functional change for people taking a source dep is that `#include "torch-mlir/Conversion/Passes.h"` no longer is a one stop shop: For optional conversions, you have to include the dedicated `Passes.h` of each and take a library dep. See `InitAll.cpp` which does it right (and *is* a one stop shop still).
@stellaraccident stellaraccident merged commit 265d41f into main Sep 6, 2023
57 checks passed
@stellaraccident stellaraccident deleted the include_torch_mlir branch September 6, 2023 04:17
@jpienaar
Copy link
Member

jpienaar commented Sep 6, 2023

Does torch become a plugin proper now? Having this dep be optional is important for usage where not all packages are available (or desirable).

@stellaraccident
Copy link
Collaborator Author

Yes, it is entirely self contained as compiler/plugins/input/Torch

This should be an improvement for such cases over the previous state where it was optional but still threaded through the main codebase.

jinchen62 pushed a commit to jinchen62/iree that referenced this pull request Sep 18, 2023
Per discussion on iree-discuss, this patch removes the original
tm_tensor source snapshots and special casing from the codebase, instead
pulling in the parts of torch-mlir that provide a full solution,
starting from the torch dialect down.

This reuses the existing `IREE_INPUT_TORCH` CMake define but routes it
to a compiler plugin at `compiler/plugins/input/Torch` where all build
plumbing to integrate and local pipelines exist. I have kept the
original tm_tensor input type for the moment since downstreams are using
that, but we will want to eventually replace that with a dedicated
`torch` input type which works directly from the `torch` dialect.

This patch should be NFC to all current users, however it allows
programmatic and API access to the full Torch dialects and conversion
pipelines, which is essential for broader scale integration.

We add `third_party/torch-mlir` as a submodule at torch-mlir's current
HEAD. During LLVM integrates, if there are ever API breaks that affect
this, we may need to push patches to torch-mlir to address. This can be
done on a torch-mlir branch as needed (or committed at head). I have
been tracking changes for a couple of months and have not had a need to
do this yet, which indicates that this is relatively stable.

Committing with an XFAIL'd lit test, which for some reason was not
running previously and now is:
iree-org#14916
@ScottTodd
Copy link
Member

Ah... I just noticed that if someone runs git submodule update --init --recursive, that will now also clone torch-mlir's copy of llvm-project. Recursive updates are less common when working directly in IREE, but downstream projects may use it for convenience vs explicitly choosing which subfolders and submodules to use.

Not sure how we could improve the situation there... maybe write down the git submodule workflows we suggest and keep an eye on downstream projets (e.g. https://github.com/RechieKho/IREE.gd#build-from-source)?

@ScottTodd
Copy link
Member

Ah... I just noticed that if someone runs git submodule update --init --recursive, that will now also clone torch-mlir's copy of llvm-project. Recursive updates are less common when working directly in IREE, but downstream projects may use it for convenience vs explicitly choosing which subfolders and submodules to use.

Not sure how we could improve the situation there... maybe write down the git submodule workflows we suggest and keep an eye on downstream projets (e.g. https://github.com/RechieKho/IREE.gd#build-from-source)?

The instructions at https://github.com/iree-org/iree-template-runtime-cmake/ should also be updated to account for that too:

For a faster checkout the LLVM dependency can be dropped as this template is only compiling the runtime (only bother if optimizing build bots):

$ git \
    -c submodule."third_party/llvm-project".update=none \
    submodule update --init --recursive

Output even with that (https://github.com/iree-org/iree-template-runtime-cmake/actions/runs/6656984452/job/18090744983):

Submodule path 'third_party/iree/third_party/torch-mlir': checked out '6f81ad72938deb56c6d43bbc01388c1f8f1253c1'
Submodule 'externals/llvm-project' (https://github.com/llvm/llvm-project.git) registered for path 'third_party/iree/third_party/torch-mlir/externals/llvm-project'
Submodule 'externals/stablehlo' (https://github.com/openxla/stablehlo.git) registered for path 'third_party/iree/third_party/torch-mlir/externals/stablehlo'
Cloning into '/home/runner/work/iree-template-runtime-cmake/iree-template-runtime-cmake/third_party/iree/third_party/torch-mlir/externals/llvm-project'...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants