
Fix the flaky issue in ParallelInterleaveDatasetOpTest #27805

Conversation

feihugis
Member

This PR fixes the flaky issue in ParallelInterleaveDatasetOpTest. We reset the iterator before restoring it so that all the resources/tasks held by the old iterator can be released.

cc:@jsimsa

Contributor

@jsimsa jsimsa left a comment


Should all roundtrip tests include this fix?

@feihugis
Member Author

Should all roundtrip tests include this fix?

Yes, the roundtrip test for ParallelMapDataset has a similar flaky issue. I will add this fix to all the other roundtrip tests soon.

@feihugis
Member Author

This fix has been applied to all roundtrip tests via edb84f2.

@@ -536,6 +536,8 @@ TEST_P(ParameterizedConcatenateDatasetOpTest, Roundtrip) {
TF_EXPECT_OK(iterator->Save(serialization_ctx.get(), &writer));
TF_EXPECT_OK(writer.Flush());
VariantTensorDataReader reader(&data);
TF_ASSERT_OK(concatenate_dataset->MakeIterator(iterator_ctx.get(),
Contributor

@jsimsa jsimsa Apr 13, 2019


Please create a test utility (in the test base) for restoring an iterator that makes sure the iterator is reset, and have its docstring explain why that is needed. Then apply this utility everywhere, instead of the current error-prone pattern that expects the test writer to know they need to call MakeIterator before Restore. Thanks.

@feihugis feihugis force-pushed the Debug_ParallelInterleaveDatasetOpTest branch from 7662c18 to 9a86868 Compare April 14, 2019 07:28
@feihugis
Member Author

Good suggestion! I changed it in this commit (9a86868). Could you please have a look at the change?

jsimsa
jsimsa previously approved these changes Apr 14, 2019
@rthadur rthadur self-assigned this Apr 15, 2019
@rthadur rthadur added this to Assigned Reviewer in PR Queue via automation Apr 15, 2019
@rthadur rthadur added comp:data tf.data related issues size:M CL Change Size: Medium kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Apr 15, 2019
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Apr 15, 2019
PR Queue automation moved this from Assigned Reviewer to Reviewer Requested Changes Apr 15, 2019
@feihugis
Member Author

Sorry, there was a typo. I just fixed it (bbed106). @rthadur Could you please re-trigger the test?

PR Queue automation moved this from Reviewer Requested Changes to Approved by Reviewer Apr 15, 2019
@rthadur rthadur added the kokoro:force-run Tests on submitted change label Apr 15, 2019
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Apr 15, 2019
@feihugis
Member Author

The test failure in Linux GPU is unrelated.

@tensorflow-copybara tensorflow-copybara merged commit bbed106 into tensorflow:master Apr 15, 2019
PR Queue automation moved this from Approved by Reviewer to Merged Apr 15, 2019
tensorflow-copybara pushed a commit that referenced this pull request Apr 15, 2019
Labels
cla: yes comp:data tf.data related issues ready to pull PR ready for merge process size:M CL Change Size: Medium
Projects
PR Queue
Merged
Development

Successfully merging this pull request may close these issues.

None yet

6 participants