Use READ_ENTRY_SCHEDULING_DELAY to stabilize stickyReadsWithFailures #3628
Merged
hangc0276 merged 1 commit on Dec 7, 2022
ping @hangc0276 @dlg99 @zymap @shoothzj PTAL. Thanks.
ping @zymap @hangc0276 @shoothzj @dlg99 If you have time, could you help take a look? Thanks.
hangc0276 approved these changes Nov 25, 2022
yaalsn pushed a commit to yaalsn/bookkeeper that referenced this pull request Jan 30, 2023
…ache#3628)

### Motivation

I found the following flaky test: `org.apache.bookkeeper.bookie.BookieStickyReadsTest.stickyReadsWithFailures`:
https://github.com/apache/bookkeeper/actions/runs/3367374609/jobs/5584792353

```
Error: Tests run: 4, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 8.925 s <<< FAILURE! - in org.apache.bookkeeper.bookie.BookieStickyReadsTest
Error: org.apache.bookkeeper.bookie.BookieStickyReadsTest.stickyReadsWithFailures  Time elapsed: 1.752 s  <<< ERROR!
java.lang.IndexOutOfBoundsException: Index: -1, Size: 3
	at java.base/java.util.LinkedList.checkElementIndex(LinkedList.java:559)
	at java.base/java.util.LinkedList.get(LinkedList.java:480)
	at org.apache.bookkeeper.test.BookKeeperClusterTestCase.serverByIndex(BookKeeperClusterTestCase.java:369)
	at org.apache.bookkeeper.bookie.BookieStickyReadsTest.stickyReadsWithFailures(BookieStickyReadsTest.java:153)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.lang.Thread.run(Thread.java:829)
```

In the stickyReadsWithFailures test, the client successfully reads the entry, but the `READ_ENTRY_REQUEST` metric has not yet been incremented. After reading the `READ_ENTRY_REQUEST` update logic, I found that the metric is updated only after the netty channel has sent the response: the update happens in the `ChannelFutureListener` callback, and this asynchronous update causes the test above to fail.

```java
protected void sendResponse(StatusCode code, Object response, OpStatsLogger statsLogger) {
    ......
    if (channel.isActive()) {
        channel.writeAndFlush(response).addListener(new ChannelFutureListener() {
            @Override
            public void operationComplete(ChannelFuture future) throws Exception {
                long writeElapsedNanos = MathUtils.elapsedNanos(writeNanos);
                if (!future.isSuccess()) {
                    requestProcessor.getRequestStats().getChannelWriteStats()
                        .registerFailedEvent(writeElapsedNanos, TimeUnit.NANOSECONDS);
                } else {
                    requestProcessor.getRequestStats().getChannelWriteStats()
                        .registerSuccessfulEvent(writeElapsedNanos, TimeUnit.NANOSECONDS);
                }
                // The request metric is recorded here, inside the write listener.
                if (StatusCode.EOK == code) {
                    statsLogger.registerSuccessfulEvent(MathUtils.elapsedNanos(enqueueNanos), TimeUnit.NANOSECONDS);
                } else {
                    statsLogger.registerFailedEvent(MathUtils.elapsedNanos(enqueueNanos), TimeUnit.NANOSECONDS);
                }
            }
        });
    }
    ......
}
```

### Changes

The `READ_ENTRY_SCHEDULING_DELAY` metric is recorded before the read request is processed, which proves that the bookie has received the read request from the client. That is all `BookieStickyReadsTest` needs.

This makes the `BookieStickyReadsTest` test more stable.
Ghatage pushed a commit to sijie/bookkeeper that referenced this pull request Jul 12, 2024
Motivation
I found the following flaky test: `org.apache.bookkeeper.bookie.BookieStickyReadsTest.stickyReadsWithFailures`:
https://github.com/apache/bookkeeper/actions/runs/3367374609/jobs/5584792353

In the stickyReadsWithFailures test, the client successfully reads the entry, but the `READ_ENTRY_REQUEST` metric has not yet been incremented. After reading the `READ_ENTRY_REQUEST` update logic, I found that the metric is updated only after the netty channel has sent the response: the update happens in the `ChannelFutureListener` callback, and this asynchronous update causes the test above to fail.

Changes
The `READ_ENTRY_SCHEDULING_DELAY` metric is recorded before the read request is processed, which proves that the bookie has received the read request from the client. That is all `BookieStickyReadsTest` needs.

This makes the `BookieStickyReadsTest` test more stable.
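
To illustrate the timing issue described above, here is a minimal, self-contained sketch. It is not BookKeeper code: the class `MetricTimingSketch`, the two counters, and the latches are hypothetical stand-ins. One counter is bumped synchronously when the request is received (as `READ_ENTRY_SCHEDULING_DELAY` is described to be), the other is bumped in an asynchronous completion callback (as `READ_ENTRY_REQUEST` is in the `ChannelFutureListener`). The latches force the unlucky interleaving so the result is deterministic.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

public class MetricTimingSketch {

    /**
     * Simulates one read and returns three observations:
     * [0] early counter right after the client has the response,
     * [1] late counter at that same moment,
     * [2] late counter after the write listener has finished.
     */
    public static long[] observe() throws Exception {
        // Hypothetical stand-ins for the two metrics; not real BookKeeper objects.
        AtomicLong schedulingDelayCount = new AtomicLong();
        AtomicLong readRequestCount = new AtomicLong();
        CountDownLatch clientHasResponse = new CountDownLatch(1);
        CountDownLatch listenerMayRun = new CountDownLatch(1);

        // Recorded synchronously, before the request is processed.
        schedulingDelayCount.incrementAndGet();

        CompletableFuture<Void> writeListener = CompletableFuture.runAsync(() -> {
            // The response has reached the client at this point.
            clientHasResponse.countDown();
            try {
                // Assumption: models the listener running some time after the write.
                listenerMayRun.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            // Recorded only in the completion callback.
            readRequestCount.incrementAndGet();
        });

        // The "test" wakes up as soon as the read succeeds and checks metrics.
        clientHasResponse.await();
        long early = schedulingDelayCount.get();
        long lateBefore = readRequestCount.get();

        // Let the listener finish, then look again.
        listenerMayRun.countDown();
        writeListener.get();
        long lateAfter = readRequestCount.get();

        return new long[] {early, lateBefore, lateAfter};
    }

    public static void main(String[] args) throws Exception {
        long[] r = observe();
        System.out.println("early metric right after read: " + r[0]);
        System.out.println("late metric right after read:  " + r[1]);
        System.out.println("late metric after listener:    " + r[2]);
    }
}
```

In this sketch an assertion on the late counter made immediately after the read would see no increment, while the same assertion on the early counter would pass, which mirrors why the test was switched to the metric recorded before processing.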