
[MLIR] Fix verifier for TF::Conv2DOp and TF::Conv3DOp. #44022

Merged
5 commits merged into tensorflow:master on Dec 15, 2020
Conversation

pr4tgpt
Contributor

@pr4tgpt pr4tgpt commented Oct 14, 2020

This commit addresses a bug where, in tf.conv2D and tf.conv3D, incorrect output shapes were not verified. It adds changes to verify the output shape and emit an error for incorrect output shapes.

Signed-off-by: Prateek Gupta <prateek@polymagelabs.com>
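For context, the output size the verifier must reproduce follows TensorFlow's SAME/VALID convolution semantics. A minimal sketch, with an illustrative helper name that is not the actual TensorFlow code:

```c++
#include <cassert>
#include <cstdint>

// Expected output size along one spatial dimension of tf.Conv2D/tf.Conv3D:
//   SAME:  out = ceil(in / stride)
//   VALID: out = ceil((in - (filter - 1) * dilation) / stride)
int64_t ExpectedOutputDim(int64_t in, int64_t filter, int64_t stride,
                          int64_t dilation, bool same_padding) {
  assert(stride > 0 && dilation > 0 && filter > 0);
  if (same_padding) return (in + stride - 1) / stride;
  int64_t effective_filter = (filter - 1) * dilation + 1;
  assert(in >= effective_filter);
  return (in - effective_filter) / stride + 1;
}
```

A verifier can compare each static result dimension against this value and emit an error on mismatch.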

@google-ml-butler google-ml-butler bot added the size:M CL Change Size: Medium label Oct 14, 2020
@google-cla google-cla bot added the cla: yes label Oct 14, 2020
@pr4tgpt pr4tgpt changed the title [MLIR]Add changes in verifier for TF::Conv2DOp and TF::Conv3DOp. [MLIR]Fix verifier for TF::Conv2DOp and TF::Conv3DOp. Oct 14, 2020
@bondhugula
Contributor

@smit-hinsu for visibility - as the author of the surrounding code.

@pr4tgpt pr4tgpt changed the title [MLIR]Fix verifier for TF::Conv2DOp and TF::Conv3DOp. [MLIR] Fix verifier for TF::Conv2DOp and TF::Conv3DOp. Oct 14, 2020
@smit-hinsu
Contributor

Thanks for the improvements!

Could you make use of the new inferReturnTypes trait for shape inference?

See this commit for an example: edeb469

@smit-hinsu smit-hinsu self-requested a review October 14, 2020 18:50
@gbaned gbaned self-assigned this Oct 15, 2020
@pr4tgpt
Contributor Author

pr4tgpt commented Oct 15, 2020

Thanks for the improvements!

Could you make use of the new inferReturnTypes trait for shape inference?

See this commit for an example: edeb469

It isn't immediately clear from the other revision how inferReturnTypes should be used or incorporated. Could you please expand a bit more on what changes/additions should be made here? Thanks!

@smit-hinsu
Contributor

Once we declare the InferTypeOpInterface interface and implement the inferReturnTypes method for these ops, we don't need to call shape verification explicitly. You should see that shapes are verified automatically without an explicit call to inferReturnTypes. The benefit of this generic approach is that the interface can be shared between the builder and the verifier, and it allows other uses, since all the ops follow the same mechanism.

Check out the interfaces documentation at https://mlir.llvm.org/docs/Interfaces/.

Let me know if you need further explanation or specific details.
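To make the suggestion concrete, here is a minimal sketch, assuming a hypothetical op MyOp (declared elsewhere via ODS with DeclareOpInterfaceMethods<InferTypeOpInterface>) whose single result simply takes its first operand's type; the real Conv2D/Conv3D inference is more involved:

```c++
#include "mlir/IR/OperationSupport.h"
#include "mlir/Interfaces/InferTypeOpInterface.h"

using namespace mlir;

// Declaring InferTypeOpInterface in ODS generates this hook; the generic
// op verifier then checks the op's declared result types against the
// inferred ones, with no explicit shape-check call needed.
LogicalResult MyOp::inferReturnTypes(
    MLIRContext *context, Optional<Location> location, ValueRange operands,
    DictionaryAttr attributes, RegionRange regions,
    SmallVectorImpl<Type> &inferredReturnTypes) {
  // Toy rule: the single result has the same type as the first operand.
  inferredReturnTypes.push_back(operands.front().getType());
  return success();
}
```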

@pr4tgpt
Contributor Author

pr4tgpt commented Oct 16, 2020

Once we declare the InferTypeOpInterface interface and implement the inferReturnTypes method for these ops, we don't need to call shape verification explicitly. You should see that shapes are verified automatically without an explicit call to inferReturnTypes. The benefit of this generic approach is that the interface can be shared between the builder and the verifier, and it allows other uses, since all the ops follow the same mechanism.

Check out the interfaces documentation at https://mlir.llvm.org/docs/Interfaces/.

Let me know if you need further explanation or specific details.

@smit-hinsu, there is an issue in using the InferTypeOpInterface. Previously, when I was using the verifier to check the output type, all the other properties, such as non-negative strides and a correct number of input dimensions/strides, were checked first (as I kept my check at the end of the verifier).
But if I use the inferReturnTypes method, the output type is inferred before those other sanity checks run, resulting in a segmentation fault. The output type of tf.conv2D and tf.conv3D can only be inferred if the attributes are correct.
When I run my code using inferReturnTypes with correct attributes, it runs fine, but incorrect attributes have to be checked first.
If I emit errors in those instances, then there will be no use for the verifier. Could you suggest something here?

@smit-hinsu
Contributor

We need to emit an error in that case and fail the type inference. See the following op inference in the HLO dialect for an example:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/mlir/hlo/lib/Dialect/mhlo/IR/hlo_ops.cc#L1687
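The pattern, as a minimal sketch with hypothetical names (not the actual TensorFlow code): validate the attributes first, report through the optional location, and return failure() so the generic verifier surfaces a normal error instead of crashing.

```c++
#include "mlir/IR/Attributes.h"
#include "mlir/IR/Diagnostics.h"

using namespace mlir;

// Hypothetical helper called from an op's inferReturnTypes. Attributes
// are sanity-checked before being used to compute shapes; on a bad
// attribute we emit through the optional location and fail inference.
static LogicalResult checkStridesSketch(ArrayAttr strides,
                                        Optional<Location> location) {
  for (Attribute stride : strides) {
    if (stride.cast<IntegerAttr>().getInt() < 1)
      return emitOptionalError(location, "requires positive strides");
  }
  // Safe to compute and push the inferred type after this point.
  return success();
}
```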

@gbaned
Contributor

gbaned commented Oct 19, 2020

@pr4tgpt Can you please check @smit-hinsu's comments and keep us posted? Thanks!

@gbaned gbaned added the stat:awaiting response Status - Awaiting response from author label Oct 19, 2020
@pr4tgpt
Contributor Author

pr4tgpt commented Oct 19, 2020

@pr4tgpt Can you please check @smit-hinsu's comments and keep us posted? Thanks!

Yes, I am working on Smit's suggestion. Thanks.

@pr4tgpt
Contributor Author

pr4tgpt commented Oct 19, 2020

We need to emit an error in that case and fail the type inference. See the following op inference in the HLO dialect for an example:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/mlir/hlo/lib/Dialect/mhlo/IR/hlo_ops.cc#L1687

Hi @smit-hinsu, I have made the changes you requested. Please review them. Thanks.

@tensorflowbutler tensorflowbutler removed the stat:awaiting response Status - Awaiting response from author label Oct 21, 2020
Contributor

@smit-hinsu smit-hinsu left a comment


Sorry for the delay in the response. I somehow missed the update email. Please ping on the review if you don't get an update within 24 hours.

This looks quite good but I have a few suggestions.

Seven review threads on tensorflow/compiler/mlir/tensorflow/ir/tf_ops_a_m.cc (outdated; resolved).
@pr4tgpt
Contributor Author

pr4tgpt commented Oct 24, 2020

Thank you @smit-hinsu for the suggestions. I will update the code with the relevant changes.

@gbaned gbaned added the stat:awaiting response Status - Awaiting response from author label Oct 27, 2020
@gbaned
Contributor

gbaned commented Oct 29, 2020

@pr4tgpt Can you please check @smit-hinsu's comments and keep us posted? Thanks!

@pr4tgpt
Contributor Author

pr4tgpt commented Oct 30, 2020

@pr4tgpt Can you please check @smit-hinsu's comments and keep us posted? Thanks!

I have checked and resolved the comments. Thanks.

@pr4tgpt
Contributor Author

pr4tgpt commented Oct 30, 2020

Hi @smit-hinsu, I have made the changes you requested. Please review them and feel free to post suggestions/improvements. Thanks!

Contributor

@smit-hinsu smit-hinsu left a comment


Thanks for all the hard work here. We are almost there!

Let me know if any of the comments are not clear and you need further pointers/clarification.

@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Oct 30, 2020
@gbaned gbaned added ready to pull PR ready for merge process and removed ready to pull PR ready for merge process labels Nov 18, 2020
@smit-hinsu
Contributor

Could you run all the tests in these directories and fix the failing ones?

tensorflow/compiler/mlir
tensorflow/compiler/tests
tensorflow/lite/python
tensorflow/lite/testing/model_coverage

@gbaned gbaned removed the ready to pull PR ready for merge process label Nov 25, 2020
@gbaned
Contributor

gbaned commented Nov 25, 2020

@pr4tgpt Can you please check @smit-hinsu's comments and keep us posted? Thanks!

@gbaned gbaned added the stat:awaiting response Status - Awaiting response from author label Nov 27, 2020
@pr4tgpt
Contributor Author

pr4tgpt commented Dec 3, 2020

Could you run all the tests in these directories and fix the failing ones?

tensorflow/compiler/mlir
tensorflow/compiler/tests
tensorflow/lite/python
tensorflow/lite/testing/model_coverage

Yes, I am looking into this. Thanks!

@tensorflowbutler tensorflowbutler removed the stat:awaiting response Status - Awaiting response from author label Dec 5, 2020
@gbaned gbaned added the stat:awaiting response Status - Awaiting response from author label Dec 7, 2020
@pr4tgpt
Contributor Author

pr4tgpt commented Dec 8, 2020

Could you run all the tests in these directories and fix the failing ones?

tensorflow/compiler/mlir
tensorflow/compiler/tests
tensorflow/lite/python
tensorflow/lite/testing/model_coverage

Hi @smit-hinsu, I checked the tests you asked about.

The tests in tensorflow/compiler/mlir specific to tf_to_gpu and tf_to_kernel fail. These tests have no dependency on my patch.
The Python tests in tensorflow/compiler/tests are failing for reasons I can't determine (mostly ModuleNameError). I find no dependency on my patch here either.
The Python tests in tensorflow/lite/python are failing for reasons I can't determine (mostly ImportError). I find no dependency on my patch here either.
The tests in tensorflow/lite/testing/model_coverage pass successfully.

Could you take a look at the failing tests and tell me what the possible reason for the failures could be? Thanks!

@tensorflowbutler tensorflowbutler removed the stat:awaiting response Status - Awaiting response from author label Dec 10, 2020
@pr4tgpt
Contributor Author

pr4tgpt commented Dec 10, 2020

Could you run all the tests in these directories and fix the failing ones?
tensorflow/compiler/mlir
tensorflow/compiler/tests
tensorflow/lite/python
tensorflow/lite/testing/model_coverage

Hi @smit-hinsu, I checked the tests you asked about.

The tests in tensorflow/compiler/mlir specific to tf_to_gpu and tf_to_kernel fail. These tests have no dependency on my patch.
The Python tests in tensorflow/compiler/tests are failing for reasons I can't determine (mostly ModuleNameError). I find no dependency on my patch here either.
The Python tests in tensorflow/lite/python are failing for reasons I can't determine (mostly ImportError). I find no dependency on my patch here either.
The tests in tensorflow/lite/testing/model_coverage pass successfully.

Could you take a look at the failing tests and tell me what the possible reason for the failures could be? Thanks!

Hi @smit-hinsu, could you help with this? Thanks!

@smit-hinsu
Contributor

All these tests depend on your change. tensorflow/compiler/mlir/tensorflow/tests uses the tf-opt tool, which links everything, and tensorflow/lite/python links the TF dialect, so there are dependencies.

There are many tests failing. The following are some of them:

tensorflow/compiler/mlir/tensorflow/tests:tf-ops.mlir.test
tensorflow/compiler/mlir/lite/tests:prepare-tf.mlir.test
tensorflow/compiler/tests:conv3d_mlir_bridge_test_cpu
tensorflow/lite/python:lite_test
tensorflow/lite/testing/model_coverage:model_coverage_test_user_vgg_tflite_builtins_select_tf_ops

Please take a look. You can use 'tensorflow/compiler/mlir/tensorflow/tests:all' to run all the tests in the directory.

@bondhugula
Contributor

bondhugula commented Dec 10, 2020

These failures are all clearly related to this PR. An example from tensorflow/compiler/mlir/tensorflow/tests/tf_optimize.mlir.test:

2020-12-10 21:49:06.560035: F ./tensorflow/core/util/tensor_format.h:266] Check failed: format == FORMAT_OIHW_VECT_I (0 vs. 3)
TensorFlow crashed, please file a bug on https://github.com/tensorflow/tensorflow/issues with the trace below.
Stack dump:
0.      Program arguments: /home/uday/.cache/bazel/_bazel_uday/97da8b8d2f64c0a54677e92074ce1d42/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/compiler/mlir/tensorflow/tests/tf_optimize.mlir.test.runfiles/org_tensorflow/tensorflow/compiler/mlir/tf-opt /home/uday/.cache/bazel/_bazel_uday/97da8b8d2f64c0a54677e92074ce1d42/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/compiler/mlir/tensorflow/tests/tf_optimize.mlir.test.runfiles/org_tensorflow/tensorflow/compiler/mlir/tensorflow/tests/tf_optimize.mlir -tf-optimize 
1.      Program arguments: /home/uday/.cache/bazel/_bazel_uday/97da8b8d2f64c0a54677e92074ce1d42/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/compiler/mlir/tensorflow/tests/tf_optimize.mlir.test.runfiles/org_tensorflow/tensorflow/compiler/mlir/tf-opt /home/uday/.cache/bazel/_bazel_uday/97da8b8d2f64c0a54677e92074ce1d42/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/compiler/mlir/tensorflow/tests/tf_optimize.mlir.test.runfiles/org_tensorflow/tensorflow/compiler/mlir/tensorflow/tests/tf_optimize.mlir -tf-optimize 
 #0 0x00000000122c096f llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/uday/.cache/bazel/_bazel_uday/97da8b8d2f64c0a54677e92074ce1d42/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/compiler/mlir/tensorflow/tests/tf_optimize.mlir.test.runfiles/org_tensorflow/tensorflow/compiler/mlir/tf-opt+0x122c096f)
 #1 0x00000000122be5ed llvm::sys::RunSignalHandlers() (/home/uday/.cache/bazel/_bazel_uday/97da8b8d2f64c0a54677e92074ce1d42/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/compiler/mlir/tensorflow/tests/tf_optimize.mlir.test.runfiles/org_tensorflow/tensorflow/compiler/mlir/tf-opt+0x122be5ed)
 #2 0x00000000122bee6d SignalHandler(int) (/home/uday/.cache/bazel/_bazel_uday/97da8b8d2f64c0a54677e92074ce1d42/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/compiler/mlir/tensorflow/tests/tf_optimize.mlir.test.runfiles/org_tensorflow/tensorflow/compiler/mlir/tf-opt+0x122bee6d)
 #3 0x00007fe1f5214dd0 __restore_rt (/lib64/libpthread.so.0+0x12dd0)
 #4 0x00007fe1f4c5f70f raise (/lib64/libc.so.6+0x3770f)
 #5 0x00007fe1f4c49b25 abort (/lib64/libc.so.6+0x21b25)
 #6 0x00000000111d2b10 virtual thunk to tensorflow::internal::LogMessageFatal::~LogMessageFatal() (/home/uday/.cache/bazel/_bazel_uday/97da8b8d2f64c0a54677e92074ce1d42/execroot/org_tensorflow/bazel-o
 #7 0x000000000fe767ec mlir::LogicalResult mlir::TF::inferConvReturnTypes<mlir::TF::Conv2DOpAdaptor, (void*)0>(mlir::TF::Conv2DOpAdaptor, llvm::SmallVectorImpl<mlir::Type>&, llvm::Optional<mlir::Location>, llvm::ArrayRef<mlir::Attribute>) (.cold.3446) (/home/uday/.cache/bazel/_bazel_uday/97da8b8d2f64c0a54677e92074ce1d42/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/compiler/mlir/tensorflow/tests/tf_optimize.mlir.test.runfiles/org_tensorflow/tensorflow/compiler/mlir/tf-opt+0xfe767ec)
 #8 0x000000000fe77224 mlir::TF::Conv2DOp::inferReturnTypes(mlir::MLIRContext*, llvm::Optional<mlir::Location>, mlir::ValueRange, mlir::DictionaryAttr, mlir::RegionRange, llvm::SmallVectorImpl<mlir::Type>&) (/home/uday/.cache/bazel/_bazel_uday/97da8b8d2f64c0a54677e92074ce1d42/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/compiler/mlir/tensorflow/tests/tf_optimize.mlir.test.runfiles/org_tensorflow/tensorflow/compiler/mlir/tf-opt+0xfe77224)
 #9 0x0000000011eeb238 mlir::detail::verifyInferredResultTypes(mlir::Operation*) (/home/uday/.cache/bazel/_bazel_uday/97da8b8d2f64c0a54677e92074ce1d42/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/compiler/mlir/tensorflow/tests/tf_optimize.mlir.test.runfiles/org_tensorflow/tensorflow/compiler/mlir/tf-opt+0x11eeb238)
#10 0x000000000fc884d9 mlir::Op<mlir::TF::Conv2DOp, mlir::OpTrait::ZeroRegion, mlir::OpTrait::OneResult, mlir::OpTrait::ZeroSuccessor, mlir::OpTrait::NOperands<2u>::Impl, mlir::InferTypeOpInterface::Trait, mlir::MemoryEffectOpInterface::Trait, mlir::TF::LayoutSensitiveInterface::Trait, mlir::DerivedAttributeOpInterface::Trait>::verifyInvariants(mlir::Operation*) (/home/uday/.cache/bazel/_bazel_uday/97da8b8d2f64c0a54677e92074ce1d42/execroot/org_tensorflow/bazel-out/k8-opt/bin/tensorflow/compiler/mlir/tensorflow/tests/tf_optimize.mlir.test.runfiles/org_tensorflow/tensorflow/compiler/mlir/tf-opt+0xfc884d9)


This commit addresses a bug where, in tf.conv2D and tf.conv3D, incorrect output shapes were not verified. It adds changes to verify the output shape and emit an error for incorrect output shapes.

Signed-off-by: Prateek Gupta <prateek@polymagelabs.com>
In this commit, some dead code is removed. A duplicate function for verifying convolution attributes is also removed.

Signed-off-by: Prateek Gupta <prateek@polymagelabs.com>
Some of the test cases were using wrong output sizes for tf.conv2D. This commit corrects those test cases.

Signed-off-by: Prateek Gupta <prateek@polymagelabs.com>
This commit addresses a bug where incorrect filter dimensions were being read and used. The bug was corrected, and the relevant test cases were also checked.

Signed-off-by: Prateek Gupta <prateek@polymagelabs.com>
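To illustrate the filter-dimension fix mentioned in the last commit message: tf.Conv2D filters are laid out as [filter_height, filter_width, in_channels, out_channels] (HWIO) regardless of the input's data_format, so spatial sizes must be read from the filter's own layout. A minimal sketch with an illustrative helper, not the actual implementation:

```c++
#include <array>
#include <cstdint>

// HWIO filter layout for tf.Conv2D: indices 0 and 1 are spatial,
// 2 is input depth, 3 is output depth. Reading these through the
// input's data_format helpers is the kind of mix-up that can trip
// checks like the FORMAT_OIHW_VECT_I one in the trace above.
struct Conv2DFilterDims {
  int64_t height, width, in_channels, out_channels;
};

Conv2DFilterDims GetConv2DFilterDims(const std::array<int64_t, 4> &filter) {
  return {filter[0], filter[1], filter[2], filter[3]};
}
```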
@pr4tgpt
Contributor Author

pr4tgpt commented Dec 14, 2020

All these tests depend on your change. tensorflow/compiler/mlir/tensorflow/tests uses the tf-opt tool, which links everything, and tensorflow/lite/python links the TF dialect, so there are dependencies.

There are many tests failing. The following are some of them:

tensorflow/compiler/mlir/tensorflow/tests:tf-ops.mlir.test
tensorflow/compiler/mlir/lite/tests:prepare-tf.mlir.test
tensorflow/compiler/tests:conv3d_mlir_bridge_test_cpu
tensorflow/lite/python:lite_test
tensorflow/lite/testing/model_coverage:model_coverage_test_user_vgg_tflite_builtins_select_tf_ops

Please take a look. You can use 'tensorflow/compiler/mlir/tensorflow/tests:all' to run all the tests in the directory.

Hi @smit-hinsu, @bondhugula, I have fixed the code. The filter dimensions were being fetched incorrectly. The tests in tensorflow/compiler/mlir/tensorflow/tests are now passing.
I am looking at the other failing tests as well. Thanks! Please review.

@pr4tgpt
Contributor Author

pr4tgpt commented Dec 14, 2020

All these tests depend on your change. tensorflow/compiler/mlir/tensorflow/tests uses the tf-opt tool, which links everything, and tensorflow/lite/python links the TF dialect, so there are dependencies.
There are many tests failing. The following are some of them:
tensorflow/compiler/mlir/tensorflow/tests:tf-ops.mlir.test
tensorflow/compiler/mlir/lite/tests:prepare-tf.mlir.test
tensorflow/compiler/tests:conv3d_mlir_bridge_test_cpu
tensorflow/lite/python:lite_test
tensorflow/lite/testing/model_coverage:model_coverage_test_user_vgg_tflite_builtins_select_tf_ops
Please take a look. You can use 'tensorflow/compiler/mlir/tensorflow/tests:all' to run all the tests in the directory.

Hi @smit-hinsu, @bondhugula, I have fixed the code. The filter dimensions were being fetched incorrectly. The tests in tensorflow/compiler/mlir/tensorflow/tests are now passing.
I am looking at the other failing tests as well. Thanks! Please review.

Tests in mlir/lite/tests are also passing.

@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Dec 14, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Dec 14, 2020
@copybara-service copybara-service bot merged commit af9ad9d into tensorflow:master Dec 15, 2020
@smit-hinsu
Contributor

It finally got submitted! Thanks Prateek for the contribution and working through all the issues.

@pr4tgpt
Contributor Author

pr4tgpt commented Dec 15, 2020

It finally got submitted! Thanks Prateek for the contribution and working through all the issues.

Thank you Smit for your valuable help and suggestions.
