-
Notifications
You must be signed in to change notification settings - Fork 13.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-29198][test] Fail after maximum RetryOnException #20805
[FLINK-29198][test] Fail after maximum RetryOnException #20805
Conversation
...link-test-utils-junit/src/test/java/org/apache/flink/testutils/junit/RetryOnFailureTest.java
Outdated
Show resolved
Hide resolved
Thanks for the contribution. |
Thanks Sergey, especially for the collaboration! This was a tricky thing to test; it gets a bit unclear about how far to go when testing test code :D But since this is such a widely used TestExtension, it's good to have extra coverage -- especially when the symptom is otherwise silently skipping tests that should have failed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some tiny comments: add Error and Throwable to the tests
...tils-junit/src/test/java/org/apache/flink/testutils/junit/RetryOnExceptionExtensionTest.java
Show resolved
Hide resolved
...tils-junit/src/test/java/org/apache/flink/testutils/junit/RetryOnExceptionExtensionTest.java
Show resolved
Hide resolved
...tils-junit/src/test/java/org/apache/flink/testutils/junit/RetryOnExceptionExtensionTest.java
Outdated
Show resolved
Hide resolved
...tils-junit/src/test/java/org/apache/flink/testutils/junit/RetryOnExceptionExtensionTest.java
Outdated
Show resolved
Hide resolved
...tils-junit/src/test/java/org/apache/flink/testutils/junit/RetryOnExceptionExtensionTest.java
Outdated
Show resolved
Hide resolved
...tils-junit/src/test/java/org/apache/flink/testutils/junit/RetryOnExceptionExtensionTest.java
Outdated
Show resolved
Hide resolved
@flinkbot run azure |
lgtm from my side //cc @XComp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for creating this PR, @RyanSkraba, and @snuyanzin for the initial review. I looked into it and added a few comments. Please find them below.
...tils-junit/src/test/java/org/apache/flink/testutils/junit/RetryOnExceptionExtensionTest.java
Outdated
Show resolved
Hide resolved
...tils-junit/src/test/java/org/apache/flink/testutils/junit/RetryOnExceptionExtensionTest.java
Show resolved
Hide resolved
...ava/org/apache/flink/testutils/junit/extensions/retry/strategy/RetryOnExceptionStrategy.java
Show resolved
Hide resolved
// Failed when reach the total retry times | ||
if (attemptIndex >= totalTimes) { | ||
LOG.error("Test Failed at the last retry.", throwable); | ||
throw throwable; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering whether we should move this code into AbstractRetryStrategy
instead of using the same code snippet in both implementing classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello! I thought about this too -- the way the code is expressed, there could be other implementations of AbstractRetryStrategy
that apply different strategies (for example, to continue retrying for max period of time regardless of the number of attempts).
My opinion isn't strong on this, but the logic isn't really complicated enough to need to be abstracted away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Valid point - but your example would probably get its own implementation of RetryStrategy
instead of relying on AbstractRetryStrategy
as the parent. A counter-argument would be to not try to optimize code for future feature to keep the code as small as possible. If someone decides to come up with a feature that doesn't fit the current implementation, it's time to refactor. Having code duplicates usually doesn't help but introduces potential problems, e.g. bugs only being fixed in one version but not the other.
Anyway, I still see your point...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I gave your suggestion a try, but creating and delegating to a new method in the abstract strategy takes more code and makes the final implementation non-obvious.
It might have been a bit premature to have introduced the strategy pattern with only two alternatives, but I'm not too worried about repeating this pretty simple if explicitly! Do you have a strong opinion about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I leave this up to you. No strong opinion here since, as you said, it's a small code change. So, I consider it an edge case. But in general, my comment from above still applies: Duplicate code creates the risk of fixing bugs only in one location but not the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw. nice catch with the accidental JUnit4->5 migration. Could you reorganize/squash the commits. Ideally, reverting the JUnit4->5 migration should be in a separate commit pointing to the relevant ticket.
c29037a
to
7fa5d53
Compare
Sorry for the delay! I've merge/squashed all of the commits (with @snuyanzin now as co-author, thanks!). The only changes were 1 javadoc clarification and the small requested simplification in the unit test. Can you take a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @RyanSkraba for addressing my comments. The change looks good now. Could you rebase the PR and we do a final PR run. It would also help to update the commit message and move the JavaDoc update into its own commit.
.../src/main/java/org/apache/flink/testutils/junit/extensions/retry/strategy/RetryStrategy.java
Show resolved
Hide resolved
7fa5d53
to
bde4719
Compare
Rebased again and edited the commit messages! |
...link-test-utils-junit/src/test/java/org/apache/flink/testutils/junit/RetryOnFailureTest.java
Show resolved
Hide resolved
bde4719
to
4235eb4
Compare
Rebased again and moved some modifications to their own separate commit |
Co-authored-by: Sergey Nuyanzin <sergey.nuyanzin@aiven.io>
It's actually meant to test the JUnit4 RetryRule feature. The corresponding JUnit5 Extension is tested in `RetryOnFailureExtensionTest`.
4235eb4
to
fa21097
Compare
Rebased and changed the commit messages to the suggested text. Thanks for your attention! In fact, for testing correctness, I would suggest that all three commits be cherry-picked back into branch-1.16 |
@flinkbot run azure |
Failure caused by FLINK-24119 |
@flinkbot run azure |
1 similar comment
@flinkbot run azure |
What is the purpose of the change
This pull request causes tests annotated with
@RetryOnException
to fail after the given number of retries when used with the JUnit5RetryExtension
. This is the old behaviour we saw with the JUnit4RetryRule
.Currently, the test is retried for the maximum number of executions, but the "expected" exception is never thrown even on the last try. This causes the test to appear skipped instead of failed.
Brief change log
RetryRule
. This code is appropriately duplicated forRetryExtension
.Verifying this change
This change is already covered by existing tests, such as
RetryOnExceptionExtensionTest
.I've manually tested that the following test now fails after the expected number of retries (instead of being skipped):
In JUnit5, it's very tricky to specify that a test is supposed to fail, but it's possible. I started down this route with a technique similar to ExpectToFail but it ends up needing to be very aware of the internals of our RetryExtension. I'm not sure it's worth adding test extension code to test extension code, but I can continue down that route!
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: noDocumentation