[MHLO] Init Torch to MHLO conversion. #1025
Conversation
Thank you. Drive-by comments. I know it is only a draft PR, but I wanted to get comments in early.
@@ -0,0 +1,33 @@
# in-tree build
Probably the scripts in scripts/ are meant to be local scripts and not checked in?
Sounds good. I left it here as an example of how to build the project. Will remove it in the end.
f"-DLLVM_EXTERNAL_TORCH_MLIR_SOURCE_DIR={src_dir}",
f"-DLLVM_EXTERNAL_TORCH_MLIR_DIALECTS_SOURCE_DIR={src_dir}/externals/llvm-external-projects/torch-mlir-dialects",
f"-DLLVM_EXTERNAL_MLIR_HLO_SOURCE_DIR={src_dir}/externals/mlir-hlo",
f"-DMLIR_PDLL_TABLEGEN_EXE=mlir-pdll",  # FIXME: set MLIR_PDLL_TABLEGEN_EXE since mlir-hlo doesn't
Should we submit a PR upstream in mhlo for setting mlir-pdll?
@@ -95,6 +110,9 @@ else()
set(MLIR_INCLUDE_DIR ${LLVM_MAIN_SRC_DIR}/../mlir/include)
set(MLIR_GENERATED_INCLUDE_DIR ${LLVM_BINARY_DIR}/tools/mlir/include)
set(MLIR_INCLUDE_DIRS "${MLIR_INCLUDE_DIR};${MLIR_GENERATED_INCLUDE_DIR}")
# since mhlo didn't set INTERFACE_DIRECTORIES for their target, we need to include mhlo directories globally
Probably good to file an issue upstream so they are aware of this, and maybe, if it is easy, add a PR upstream.
EXCLUDE_FROM_ALL)
include_directories(${CMAKE_CURRENT_SOURCE_DIR}/externals/mlir-hlo/include)
include_directories(${CMAKE_CURRENT_BINARY_DIR}/mlir-hlo/include)
include_directories(${CMAKE_CURRENT_BINARY_DIR})
Is the whole CMAKE_CURRENT_BINARY_DIR required, since we are adding it globally?
@@ -1,3 +1,6 @@
[submodule "external/llvm-project"]
path = externals/llvm-project
url = https://github.com/llvm/llvm-project.git
We probably should land the LLVM update, and even the mhlo submodule, first.
@@ -41,7 +41,13 @@ torch_mlir_add_llvm_external_project(
TORCH_MLIR_DIALECTS
${CMAKE_CURRENT_SOURCE_DIR}/externals/llvm-external-projects/torch-mlir-dialects)

torch_mlir_add_llvm_external_project(
Can we add a top-level TORCH_MLIR_MHLO CMake flag (the name can be anything; @silvasean, any suggestions?) that can enable/disable the MHLO backend? This can be the big hammer in case we have to disable it for any reason (broken on macOS, etc.).
@@ -81,7 +87,16 @@ if(CMAKE_SOURCE_DIR STREQUAL CMAKE_CURRENT_SOURCE_DIR)
set(TORCH-MLIR_BUILT_STANDALONE 1)
set(BACKEND_PACKAGE_STRING "LLVM ${LLVM_PACKAGE_VERSION}")
add_subdirectory(externals/llvm-external-projects/torch-mlir-dialects)
Wrap in the same TORCH_MLIR_MHLO flag (?)
One thing I would like is if we could add this to our e2e test suite, even if only a few tests pass. We can model it on the TOSA support. Add a case here:
Other related files:
@@ -81,7 +87,16 @@ if(CMAKE_SOURCE_DIR STREQUAL CMAKE_CURRENT_SOURCE_DIR)
set(TORCH-MLIR_BUILT_STANDALONE 1)
set(BACKEND_PACKAGE_STRING "LLVM ${LLVM_PACKAGE_VERSION}")
add_subdirectory(externals/llvm-external-projects/torch-mlir-dialects)

set(MHLO_BUILD_EMBEDDED ON)
What is MHLO_BUILD_EMBEDDED meant for? It seems unused.
fout.write(str(module))
print("MHLO module has been saved to {}".format(fname))
print("MHLO execution is not supported yet. Stopped.")
exit(0)
Nit: the code that follows will never be reached and should be removed.
MLIRContext *context = patterns.getContext();

target.addIllegalOp<AtenViewOp>();
patterns.add<ConvertAtenViewOp>(typeConverter, context);
I think operators like aten::view, aten::expand, aten::flatten, aten::[un]squeeze can be split into another source file, such as ViewLikeOps.cpp.
target.addIllegalOp<AtenBroadcastToOp>();
patterns.add<ConvertAtenBroadcastToOp>(typeConverter, context);

target.addIllegalOp<AtenSliceTensorOp>();
Ditto: those slice-like operators can be moved to SliceLikeOps.cpp.
return op->emitError("unimplemented: dim is not constant");
uint64_t batchDims = 0;

rewriter.replaceOpWithNewOp<mhlo::TorchIndexSelectOp>(
It's weird that MHLO has this operator. I think mhlo::DynamicGatherOp is sufficient to represent the semantics. Does anyone know the difference?
mhlo::TorchIndexSelect was added to simplify lowering tf.GatherV2 to MHLO in this commit 2.5 years ago. It is modelled after the client HLO API TorchIndexSelect.
I agree that it doesn't look particularly essential for the MHLO dialect. We've been recently thinking about moving it to CHLO, along with a few other ops which are modelled after client HLO APIs and don't correspond to dedicated HLO ops. (More specifically, these other ops are: mhlo.broadcast (keeping broadcast_in_dim in MHLO), mhlo.create_token, mhlo.cross_replica_sum, mhlo.dot (keeping dot_general in MHLO), mhlo.einsum, mhlo.unary_einsum.)
Thanks for your advice; we will change this part's implementation in later commits.
patterns.add<ConvertAtenOp<AtenOp>>(typeConverter, context);
INSERT_ATENOP_PATTERN(ValueTensorLiteralOp);
INSERT_ATENOP_PATTERN(AtenTanhOp);
INSERT_ATENOP_PATTERN(AtenIndexSelectOp);
Ditto: I think operations like aten::gather, aten::index_select, and so on can be moved to MemOps.cpp.
target.addIllegalOp<AtenTransposeIntOp>();
patterns.add<ConvertAtenTransposeIntOp>(typeConverter, context);

target.addIllegalOp<AtenPermuteOp>();
patterns.add<ConvertAtenPermuteOp>(typeConverter, context);
Ditto: can be moved to PermutationsOps.cpp.
int64_t lhsSize = 1;
for (auto &en : llvm::enumerate(lhsTy.getShape())) {
  lhsSize *= en.value();
}
auto constTy = RankedTensorType::get(lhsTy.getShape(), lhsElemTy);
DenseElementsAttr constAttr;
if (lhsElemTy.isa<mlir::FloatType>()) {
  std::vector<APFloat> constVec(
      lhsSize,
      APFloat::getZero(lhsElemTy.cast<mlir::FloatType>().getFloatSemantics(),
                       /*negative=*/false));
  constAttr = DenseElementsAttr::get(constTy, constVec);
} else if (lhsElemTy.isa<mlir::IntegerType>()) {
  std::vector<APInt> constVec(
      lhsSize, APInt::getZero(lhsElemTy.getIntOrFloatBitWidth()));
  constAttr = DenseElementsAttr::get(constTy, constVec);
}
Value rhs =
    rewriter.create<mhlo::ConstantOp>(op.getLoc(), constTy, constAttr);
Nit: Value rhs = chlo::getConstantLike(rewriter, loc, 0.0, input);
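By analogy, in plain Python (illustrative only, not MLIR code): a recursive splat helper plays the role that chlo::getConstantLike plays in the nit above, collapsing the manual element-count/zero-splat construction from the reviewed code into one call.

```python
def get_constant_like(value, like):
    """Build a constant with the same (nested-list) shape as `like`,
    splatted with `value`. Analogue of chlo::getConstantLike(rewriter,
    loc, value, input), which also matches the element type for free."""
    if isinstance(like, list):
        return [get_constant_like(value, e) for e in like]
    return value

# The reviewed code instead counts elements, picks the right zero for
# the element type, splats it into a vector, and builds a constant op.
input_tensor = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
zeros = get_constant_like(0.0, input_tensor)
print(zeros)  # [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
```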
size_t inPos = inShape.size() - 1 - i;
int64_t outDim = outShape[outPos];
int64_t inDim = inShape[inPos];
if (inDim == outDim) {
The implementation doesn't support implicit broadcast when inDim == outDim == -1, i.e. when both are unknown?
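A minimal Python sketch of the concern (illustrative only, not torch-mlir code; -1 stands for a dynamically unknown dimension): when both dims are -1 the `inDim == outDim` check passes, yet a purely static lowering cannot tell whether the input still needs expanding at runtime.

```python
def broadcast_dim(in_dim: int, out_dim: int) -> str:
    """Classify one (input, output) dim pair for right-aligned
    broadcasting; -1 denotes a dynamically unknown dimension."""
    if in_dim == 1:
        return "expand"            # static broadcast: replicate along this dim
    if in_dim == out_dim and in_dim != -1:
        return "keep"              # statically equal sizes: no-op
    if in_dim == -1 and out_dim == -1:
        # Both unknown: the input dim may still turn out to be 1 at
        # runtime and need expanding, so this case is ambiguous here.
        return "ambiguous"
    return "mismatch"

print(broadcast_dim(1, 4))    # expand
print(broadcast_dim(4, 4))    # keep
print(broadcast_dim(-1, -1))  # ambiguous
```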
lhsTensor = mhlo::promoteAndBroadcast(rewriter, lhsTensor, outType);
rhsTensor = mhlo::promoteAndBroadcast(rewriter, rhsTensor, outType);

rewriter.replaceOpWithNewOp<mhlo::MulOp>(op, outType, lhsTensor, rhsTensor);
Use chlo::BroadcastMulOp to support implicit/dynamic broadcast
lhsTensor = mhlo::promoteAndBroadcast(rewriter, lhsTensor, outType);
rhsTensor = mhlo::promoteAndBroadcast(rewriter, rhsTensor, outType);

rewriter.replaceOpWithNewOp<mhlo::DivOp>(op, outType, lhsTensor, rhsTensor);
ditto
#define INSERT_BINARY_ADDSUB_PATTERN(AtenOp, MhloOp) \
target.addIllegalOp<AtenOp>(); \
patterns.add<ConvertAtenAddSubOp<AtenOp, MhloOp>>(typeConverter, context);
INSERT_BINARY_ADDSUB_PATTERN(AtenAddTensorOp, mhlo::AddOp)
Use chlo::BroadcastAddOp / chlo::BroadcastSubOp to support implicit/dynamic broadcast.
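For reference, the implicit broadcast these review comments keep pointing at, sketched as a small shape-inference helper in plain Python (an illustration of the NumPy/PyTorch right-aligned rule, not MLIR code): the shape-equal mhlo element-wise ops require the pattern to materialize this broadcast explicitly, while the chlo broadcast variants carry the semantics themselves.

```python
def broadcast_shape(a: tuple, b: tuple) -> tuple:
    """Right-aligned broadcast of two static shapes (NumPy/PyTorch rule)."""
    out = []
    # Left-pad the shorter shape with 1s, then compare dim by dim.
    for x, y in zip((1,) * (len(b) - len(a)) + a,
                    (1,) * (len(a) - len(b)) + b):
        if x != 1 and y != 1 and x != y:
            raise ValueError(f"incompatible dims {x} vs {y}")
        out.append(max(x, y))
    return tuple(out)

# mhlo.add / mhlo.mul require identical operand shapes, so lowering
# must expand (4, 1) + (3,) to (4, 3) first; chlo.broadcast_add models
# this implicit broadcast directly.
print(broadcast_shape((4, 1), (3,)))  # (4, 3)
```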
    op->getContext(), mhlo::ComparisonDirection::NE);
}

rewriter.replaceOpWithNewOp<mhlo::CompareOp>(
nit: chlo::BroadcastCompareOp
SEQ_LEN = 128
data = {
    'input_ids': torch.randint(30522, (BATCH_SIZE, SEQ_LEN)),
    # 'labels': torch.randint(30522, (BATCH_SIZE, SEQ_LEN)),
Remove the commented-out code?
def ConvertTorchToMhlo : Pass<"convert-torch-to-mhlo", "func::FuncOp"> {
  let summary = "Convert Torch ops to MHLO ops";
  let description = [{
    Convert ATen ops to mhlo ops.
"Convert ATen ops to mhlo ops." Should this be "Convert Torch ops ..."? It seems this pass handles more than just ATen ops.
};

template<>
LogicalResult ConvertAtenOp<AtenReciprocalOp>::matchAndRewrite(AtenReciprocalOp op, OpAdaptor adaptor, ConversionPatternRewriter &rewriter) const {
This line is too long. Does the torch-mlir community follow a code style, like the Google C++ Style Guide?
cc @silvasean
};

template<>
LogicalResult ConvertAtenOp<AtenReciprocalOp>::matchAndRewrite(AtenReciprocalOp op, OpAdaptor adaptor, ConversionPatternRewriter &rewriter) const {
Add a comment to describe the lowering logic, e.g. Reciprocal(x) = Div(1, x)?
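The suggested identity, sketched in plain Python (illustration only; the actual pattern builds an MLIR constant 1 of the input's element type and an mhlo::DivOp):

```python
def reciprocal(x):
    # Reciprocal(x) = Div(1, x): materialize a constant one with x's
    # element type, then divide elementwise.
    one = 1.0
    return [one / v for v in x]

print(reciprocal([0.5, 2.0, 4.0]))  # [2.0, 0.5, 0.25]
```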
    AtenBatchNormOp op, OpAdaptor adaptor,
    ConversionPatternRewriter &rewriter) const {
  Value input = adaptor.input();
  // shape = [N C H W]
[N C H W] => [N, C, H, W]
Value bias = adaptor.bias();
Value runningMean = adaptor.running_mean();
Value runningVar = adaptor.running_var();
// momentum is ignored
Remove the momentum-related code?
DenseElementsAttr valueAttr =
    elements.mapValues(builtinTensorElemTy, [&](const APInt &v) {
      return APInt(bitWidth, v.getSExtValue());
What about unsigned integer types? getSExtValue would sign-extend them here.
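The concern, illustrated in plain Python (hypothetical bit widths; APInt::getSExtValue sign-extends, which is the behavior in question when the element type is unsigned):

```python
def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Reinterpret the low from_bits of value as signed, then widen
    (what getSExtValue followed by a wider APInt does)."""
    sign_bit = 1 << (from_bits - 1)
    signed = (value & (sign_bit - 1)) - (value & sign_bit)
    return signed & ((1 << to_bits) - 1)

def zero_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Widen by padding with zero bits (what unsigned types need)."""
    return value & ((1 << from_bits) - 1)

# ui8 value 0xFF: zero-extension to 16 bits keeps 255, while
# sign-extension turns it into 0xFFFF (-1 reinterpreted as i16).
print(hex(zero_extend(0xFF, 8, 16)))  # 0xff
print(hex(sign_extend(0xFF, 8, 16)))  # 0xffff
```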
// -----

// CHECK-LABEL: func.func @torch.aten.addtensor$alpha(
// CHECK-SAME: %[[VAL_0:.*]]: !torch.vtensor<[4,64],f32>,
Format the code?
@@ -0,0 +1 @@
./build/bin/torch-mlir-opt < test/Conversion/TorchToMhlo/custom.mlir -convert-torch-to-mhlo -split-input-file -verify-diagnostics
How about the other FileCheck unit tests?
@@ -0,0 +1 @@
./build/bin/torch-mlir-opt < test/Conversion/TorchToMhlo/custom.mlir -convert-torch-to-mhlo -split-input-file -verify-diagnostics
Cannot find custom.mlir in this pull request.
I just found a better reference on coding style for us: https://mlir.llvm.org/getting_started/DeveloperGuide/
@silvasean @powderluv @fortianyou @Yancey1989 Thanks to everyone who reviewed here. Following the offline meeting between ByteDance and Alibaba, we decided to break the proposal into several PRs:
We will address the issues mentioned above in the following PRs and track the progress in RFC #999.
See RFC: #999
TODO:
Co-authored-by: @byronyi @Vremold Xuanrun Zhang