-
Notifications
You must be signed in to change notification settings - Fork 895
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue 1403: ArrayIndexOutOfBoundsException is thrown on readLastAddConfirmedAndEntry #1404
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
*Motivation* mvn build fails on a centos/7 box. Exceptions are thrown as below: ``` [ERROR] Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:2.17:check (checkstyle) on project circe-checksum: Execution checkstyle of goal org.apache.maven.plugins:maven-checkstyle-plugin:2.17:check failed: An API incompatibility was encountered while executing org.apache.maven.plugins:maven-checkstyle-plugin:2.17:check: java.lang.NoSuchMethodError: org.slf4j.spi.LocationAwareLogger.log(Lorg/slf4j/Marker;Ljava/lang/String;ILjava/lang/String;Ljava/lang/Throwable;)V ``` *Solution* The problem was addressed in https://issues.apache.org/jira/browse/MCHECKSTYLE-335. We need to bump the checkstyle plugin version to `3.0.0`
*Motivation* apache#1352 update the BC tests to include newer versions, however it removes the flags that used for 4.6.0 to handle badly shutdown. This change makes the BC tests become very flaky. *Solution* - apply `badlyshutdown` flag to both from 4.6.0 to 4.6.1 and from 4.6.1 to 4.6.2 upgrade
*Motivation* Currently all bookkeeper's integration tests and bc tests are docker based and written using arquillian framework. Due to the network issues in docker and the way how bookies advertise themselves in the cluster, we can *ONLY* run integration tests in a linux environment. It is inconvinient to people who is using macOS as the development environment. *Solution* This PR is providing a Vagrantfile to lanuch a dev vm that contains all the tools required for running integration tests and bc tests. So people can actually debug and test integration tests locally using a linux vm.
*Motivation* TestSmoke#testTailingReads is testing the tailing behavior at bookkeeper. However the current testing logic is a bit problematic on checking whether the reader has caught up with the expected last add confirmed. so the read logic break earlier before reading the expectedLastConfirmedEntry. *Solution* Fix the checking logic on breaking the read loop.
eolivelli
approved these changes
May 12, 2018
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! +1
jiazhai
approved these changes
May 14, 2018
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, need merge with master
retest this please |
sijie
added a commit
that referenced
this pull request
May 15, 2018
…onfirmedAndEntry Descriptions of the changes in this PR: *Motivation* There are two bugs in `ReadLastAddConfirmedAndEntry`: 1) a regression was introduced by #657. the long poll read op is attempting to long-poll reading lac. since lac is stored across ensemble, so the retry logic assumes it will attempt over all the bookies in the ensemble. however #657 use a `write-quorum-size` write set for tracking those attempts. this leads to ArrayIndexOutOfBoundsException reported at #1403. The integrate tests added in this PR can easily reproduce this issue. 2) there was a bug on retry logic when a ledger whose ensemble size is larger than write quorum size. when this happens, it will claim lac is not advanced prior to attempt the bookie in the ensemble. so the client will never know lac is advanced if using long poll reads on such ledgers. The integrate tests added in this PR can also catch this issue. disclaim: twitter uses long poll reads but never uses `ensembleSize > writeQuorumSize`. so this is not a problem for dlog users. *Solution* - introduce a `getWriteSetForLongPoll` call that uses `ensembleSize` for building the write set. this would address problem 1) - fix the assignment of `numEmptyResponsesAllowed`, so the long poll reads can work with `ensembleSize > writeQuorumSize` - add integration tests for long polling reads at the same time, also add an integration test for normal tailing reads with Author: Sijie Guo <sijie@apache.org> Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Jia Zhai <None> This closes #1404 from sijie/longpoll_tests, closes #1403 (cherry picked from commit 9259173) Signed-off-by: Sijie Guo <sijie@apache.org>
reddycharan
pushed a commit
to reddycharan/bookkeeper
that referenced
this pull request
Oct 17, 2018
…stAddConfirmedAndEntry Descriptions of the changes in this PR: *Motivation* There are two bugs in `ReadLastAddConfirmedAndEntry`: 1) a regression was introduced by apache#657. the long poll read op is attempting to long-poll reading lac. since lac is stored across ensemble, so the retry logic assumes it will attempt over all the bookies in the ensemble. however apache#657 use a `write-quorum-size` write set for tracking those attempts. this leads to ArrayIndexOutOfBoundsException reported at apache#1403. The integrate tests added in this PR can easily reproduce this issue. 2) there was a bug on retry logic when a ledger whose ensemble size is larger than write quorum size. when this happens, it will claim lac is not advanced prior to attempt the bookie in the ensemble. so the client will never know lac is advanced if using long poll reads on such ledgers. The integrate tests added in this PR can also catch this issue. disclaim: twitter uses long poll reads but never uses `ensembleSize > writeQuorumSize`. so this is not a problem for dlog users. *Solution* - introduce a `getWriteSetForLongPoll` call that uses `ensembleSize` for building the write set. this would address problem 1) - fix the assignment of `numEmptyResponsesAllowed`, so the long poll reads can work with `ensembleSize > writeQuorumSize` - add integration tests for long polling reads at the same time, also add an integration test for normal tailing reads with Author: Sijie Guo <sijie@apache.org> Reviewers: Enrico Olivelli <eolivelli@gmail.com>, Jia Zhai <None> This closes apache#1404 from sijie/longpoll_tests, closes apache#1403 (cherry picked from commit 9259173) Signed-off-by: Sijie Guo <sijie@apache.org> Signed-off-by: JV Jujjuri <vjujjuri@salesforce.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Descriptions of the changes in this PR:
Motivation
There are two bugs in
ReadLastAddConfirmedAndEntry
:a regression was introduced by DistributionSchedule uses custom wrapped int[] rather than HashSet #657. the long poll read op is attempting to long-poll reading lac. since lac is stored across ensemble, so the retry logic assumes it will attempt over all the bookies in the ensemble. however DistributionSchedule uses custom wrapped int[] rather than HashSet #657 use a
write-quorum-size
write set for tracking those attempts. this leads to ArrayIndexOutOfBoundsException reported at ArrayIndexOutOfBoundsException is thrown on readLastAddConfirmedAndEntry #1403. The integrate tests added in this PR can easily reproduce this issue.there was a bug on retry logic when a ledger whose ensemble size is larger than write quorum size. when this happens, it will claim lac is not advanced prior to attempt the bookie in the ensemble. so the client will never know lac is advanced if using long poll reads on such ledgers. The integrate tests added in this PR can also catch this issue.
disclaim: twitter uses long poll reads but never uses
ensembleSize > writeQuorumSize
. so this is not a problem for dlog users.Solution
getWriteSetForLongPoll
call that usesensembleSize
for building the write set. this would address problem 1)numEmptyResponsesAllowed
, so the long poll reads can work withensembleSize > writeQuorumSize
at the same time, also add an integration test for normal tailing reads with