Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip failed RFFT tests for TF-2.4.x #2026

Merged
merged 1 commit into from
Aug 20, 2022

Conversation

q-ycong-p
Copy link
Contributor

4 unit RFFT-related backend tests fail for TF-2.4.x. Here are the dependencies and error messages when seeing failed tests:

  • Tensorflow: 2.4.0 or 2.4.1
  • ONNX: 1.12.0
  • onnxruntime: 1.12.1
FAILED tests/test_backend.py::BackendTests::test_rfft_ops - ValueError: make_sure failure: Current implementation of RFFT2D only allows ComplexAbs as consumer not {'Reshape'}
FAILED tests/test_backend.py::BackendTests::test_rfft_ops_fft_length - ValueError: make_sure failure: Current implementation of RFFT2D only allows ComplexAbs as consumer not {'Reshape'}
FAILED tests/test_backend.py::BackendTests::test_rfft_ops_fft_length_many - ValueError: make_sure failure: Current implementation of RFFT2D only allows ComplexAbs as consumer not {'Reshape'}
FAILED tests/test_backend.py::BackendTests::test_rfft_ops_fft_length_many_bigger - ValueError: make_sure failure: Current implementation of RFFT2D only allows ComplexAbs as consumer not {'Reshape'}

It looks like TF-2.4.x is not in the Azures Pipelines suite hence wasn't previously caught (pls correct me if wrong)? The failed tests would otherwise pass running on TF>=2.5.0.

I'd like to raise a PR to skip these tests for TF-2.4.x to unblock failure in our CI pipeline that depends on tf2onnx and TF-2.4. If the test failures are due to tf2onnx rather than TF-2.4.x bug based on your judgement, could you pls instead open an issue to track the fix? Thank you!

Signed-off-by: Yu Cong <congyc@amazon.com>
@q-ycong-p
Copy link
Contributor Author

Failed Pipelines test test_resnext.py::TestResNext::test_SEResNext appears unrelated to the PR change. See here. Did I miss anything?

@q-ycong-p
Copy link
Contributor Author

Would really appreciate some comment/advise from other contributors (@hwangdeyu , @fatcat-z ?) on below:

  • if the tests skipped in PR are indeed a TF-2.4.X debug, hence makes sense to be skipped in tf2onnx?
  • The failed Pipeline tests do not seem related to PR change to me? Can we re-run it?

@fatcat-z
Copy link
Collaborator

Rerunning the failed tests and will take a look at those test cases.
Actually, we've added tf2.4.1 to be verified in nightly Azure CI pipeline but it is not included into PR CI pipeline.

@fatcat-z
Copy link
Collaborator

For test_rfft_ops, the ValueError is expected because there is one test case to cover invalid operation scenario.
For the rest of others, I tried to run them in local box with tf2.4.1, but didn't find out the ValueError exceptions. Could you please help to confirm?

@q-ycong-p
Copy link
Contributor Author

Thanks @fatcat-z ,

For test_rfft_ops, I confirmed that the error comes from running test case for func1, on this line. The "expected" ValueError is only caught for func2 and func3 here, which seems insufficient.

I used a new and clean conda env to installed the following dependencies and pulled the lastest tf2onnx from main with no local changes, and confirmed that the 4 tests are indeed failing with above errors. Could you pls check if your local box repro this env I tested:

  • python==3.8.13
  • tensorflow==2.4.1
  • onnx==1.12.0
  • onnxruntime==1.12.1
  • numpy==1.19.3 (TF-2.4.1 requires numpy-1.19.x. This is installed after installing onnxruntime-1.12.1, to workaround the issue discussed here)
  • OS: Linux x86_64

I've attached the error log as a text file, for your reference:
tf2onnx_error_log.txt

If you can repro these error, given test_rfft_ops's existing code, perhaps it's better to surround failing tests' self._run_test_case calls within with self.assertRaises(ValueError):, instead of skipping TF-2.4.x? I can update the PR if so. If you still cannot repro, could you shed some light on what other dependencies I might have missed to fix version for? Many thanks!

@fatcat-z
Copy link
Collaborator

@q-ycong-p ,

Thanks for your details. Now I'm clear with this issue. It only happens to tflite+tf2.4.1, that's why I didn't meet such issue when I only ran tf tests.

Copy link
Collaborator

@fatcat-z fatcat-z left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contributions.

@fatcat-z fatcat-z merged commit 7e3fd01 into onnx:main Aug 20, 2022
@q-ycong-p q-ycong-p deleted the skip_failed_test_tf24 branch August 22, 2022 16:05
fatcat-z pushed a commit that referenced this pull request Sep 21, 2022
Signed-off-by: Yu Cong <congyc@amazon.com>

Signed-off-by: Yu Cong <congyc@amazon.com>
Co-authored-by: Yu Cong <congyc@amazon.com>
Signed-off-by: Jay Zhang <jiz@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants