KAFKA-10029; Don't update completedReceives when channels are closed to avoid ConcurrentModificationException #8705
Conversation
How about making a collection copy of `completedReceives` before iterating over it?
@chia7712 I was tempted to do that initially, but that is not the pattern we use for everything else in Selector and it has always been this way (for several years), so adding tests to make sure we don't break it made more sense.
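For readers less familiar with the failure mode: mutating a `java.util` collection while iterating over it throws `ConcurrentModificationException`, which is what happens if closing a channel removes an entry from `completedReceives` mid-iteration. Below is a minimal, self-contained Scala sketch (toy names, not Kafka code) showing the problem and the "iterate over a copy" alternative suggested above:

```scala
import java.util
import scala.jdk.CollectionConverters._

object CmeDemo extends App {
  val completedReceives = new util.HashMap[String, String]()
  completedReceives.put("channel-1", "receive-1")
  completedReceives.put("channel-2", "receive-2")

  // Removing an entry while iterating over the live map fails fast.
  try {
    completedReceives.values.asScala.foreach { _ =>
      completedReceives.remove("channel-2") // simulates a close() mutating the map mid-iteration
    }
  } catch {
    case _: util.ConcurrentModificationException =>
      println("ConcurrentModificationException when mutating during iteration")
  }

  // The alternative suggested above: iterate over a copy, so the original map can be mutated freely.
  val receivesCopy = new util.ArrayList[String](completedReceives.values)
  receivesCopy.asScala.foreach { _ =>
    completedReceives.remove("channel-1") // safe: we iterate the copy, not the live map
  }
  println(s"remaining entries: ${completedReceives.size}")
}
```

The PR takes the other route: it avoids mutating `completedReceives` during iteration instead of copying it, which keeps the existing Selector pattern intact.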
```
@@ -1742,6 +1746,12 @@ class SocketServerTest {
        selector = Some(testableSelector)
        testableSelector
      }

      override private[network] def processException(errorMessage: String, throwable: Throwable, isUncaught: Boolean): Unit = {
```
`isUncaught` is used by testing only, so it is a bit awkward to add it to production code. Could you check the `errorMessage` instead of adding a new argument? For example: `if (errorMessage == "Processor got uncaught exception.") uncaughtExceptions += 1`
Funny you should say that, I was initially checking the error message and then felt that the test wouldn't fail if the error message was changed. But since then I had also updated the test which triggers the uncaught exception code path, so it is actually safe now to check the error message. Have updated the code.
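For illustration, a small Scala sketch of the approach suggested above: a test subclass counts uncaught exceptions by matching the error message instead of threading an `isUncaught` flag through production code. The classes here (`ErrorReporter`, `TestableErrorReporter`) are illustrative stand-ins, not the actual SocketServerTest code:

```scala
// Illustrative stand-in for the production class that reports processing errors.
class ErrorReporter {
  def processException(errorMessage: String, throwable: Throwable): Unit =
    println(s"$errorMessage: $throwable")
}

// Test subclass that counts uncaught exceptions by inspecting the error message,
// as suggested in the review, instead of adding an extra argument to production code.
class TestableErrorReporter extends ErrorReporter {
  var uncaughtExceptions = 0

  override def processException(errorMessage: String, throwable: Throwable): Unit = {
    if (errorMessage == "Processor got uncaught exception.")
      uncaughtExceptions += 1
    super.processException(errorMessage, throwable)
  }
}

object ProcessExceptionCheckDemo extends App {
  val reporter = new TestableErrorReporter
  reporter.processException("Processor got uncaught exception.", new IllegalStateException("boom"))
  assert(reporter.uncaughtExceptions == 1)
}
```

The trade-off discussed above is that the test silently stops counting if the production error message changes, which is why the author also updated the test that exercises the uncaught-exception code path.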
@rajinisivaram What's the implication of not removing the completed receive when the channel is closed?
The branch (…to avoid ConcurrentModificationException) was force-pushed from a551d91 to 09589be.
@ijuma By not updating
@rajinisivaram We also retain a reference to the receive buffer until the next `poll()` in that case.
This change is probably OK, but the way we call
One more thing, can we improve
@ijuma Based on our discussion, I have added `clearCompletedReceives()` and `clearCompletedSends()` so that references to the results can be removed as soon as they have been processed.
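Judging from the javadoc shown later in this thread, the new clear methods let the server drop references to processed results without waiting for the next `poll()`. A simplified Scala sketch of that usage pattern, using a toy stand-in rather than the real Kafka `Selector`:

```scala
import java.util
import scala.jdk.CollectionConverters._

// Toy stand-in for the Kafka Selector, illustrating the clear-after-processing pattern.
class ToySelector {
  private val completedReceives = new util.LinkedHashMap[String, String]()
  private val completedSends = new util.ArrayList[String]()

  def poll(): Unit = {
    // In the real class these are populated from network I/O; here we fake one result each.
    completedReceives.put("channel-1", "request-bytes")
    completedSends.add("response-bytes")
  }

  def completedReceivesView: Iterable[String] = completedReceives.values.asScala
  def completedSendsView: Iterable[String] = completedSends.asScala

  // Drop references to receive/send buffers as soon as they have been processed,
  // instead of holding them until the next poll().
  def clearCompletedReceives(): Unit = completedReceives.clear()
  def clearCompletedSends(): Unit = completedSends.clear()
}

object ClearAfterProcessingDemo extends App {
  val selector = new ToySelector
  selector.poll()
  selector.completedReceivesView.foreach(r => println(s"processing receive: $r"))
  selector.clearCompletedReceives() // release references early
  selector.completedSendsView.foreach(s => println(s"processing send: $s"))
  selector.clearCompletedSends()
  assert(selector.completedReceivesView.isEmpty && selector.completedSendsView.isEmpty)
}
```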
```
        channelOpt.foreach { channel => map.put(channel.id, receive) }
      }
      cachedCompletedSends.currentPollValues.foreach(super.completedSends.add)
      cachedDisconnected.currentPollValues.foreach { case (id, state) => super.disconnected.put(id, state) }
```
Can we add a comment explaining what we're trying to do here? It's not clear why we do, for example, `cachedCompletedSends.update(super.completedSends.asScala)` followed by `cachedCompletedSends.currentPollValues.foreach(super.completedSends.add)`.
I have moved the second line, which updates the current result, into `update()`. They were done separately because each type uses a slightly different format, but it is clearer if they are together. Added comments as well.
```
@@ -1807,6 +1817,7 @@ class SocketServerTest {
          currentPollValues ++= newValues
        } else
          deferredValues ++= newValues
        newValues.clear()
```
What is the goal of this?
We return `minPerPoll` results together, so the current values are cleared and then populated as necessary. The code is now in the same place, so hopefully that is clearer.
The reason I was asking is that we call `toBuffer` before calling this method in a couple of cases. That will basically create a copy, so this `clear` won't do anything useful?
ah, I refactored a bit to clear the original buffer in each case.
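To make the deferral logic concrete, here is a stripped-down Scala sketch in the spirit of the test helper being discussed: `update` either defers new results or, once at least `minPerPoll` results are available, reports everything (including previously deferred results) as a single poll's worth, clearing the caller's buffer so nothing is reported twice. Names such as `PollData` are illustrative, not the actual test code:

```scala
import scala.collection.mutable

// Simplified version of the cached-poll-results idea discussed in this thread.
class PollData[T](minPerPoll: Int) {
  val currentPollValues = mutable.Buffer[T]()      // results reported for the current poll
  private val deferredValues = mutable.Buffer[T]() // results held back until minPerPoll is reached

  // Move new results either into the current poll or into the deferred buffer,
  // clearing the caller's buffer so results are not reported twice.
  def update(newValues: mutable.Buffer[T]): Unit = {
    currentPollValues.clear()
    if (deferredValues.size + newValues.size >= minPerPoll) {
      currentPollValues ++= deferredValues
      deferredValues.clear()
      currentPollValues ++= newValues
    } else
      deferredValues ++= newValues
    newValues.clear()
  }
}

object PollDataDemo extends App {
  val polls = new PollData[String](minPerPoll = 2)

  val first = mutable.Buffer("receive-1")
  polls.update(first)                 // only one result so far: deferred
  assert(polls.currentPollValues.isEmpty && first.isEmpty)

  val second = mutable.Buffer("receive-2")
  polls.update(second)                // two results available: both returned together
  assert(polls.currentPollValues == mutable.Buffer("receive-1", "receive-2"))
}
```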
Thanks for the updates, looks good overall. I had some questions about the tests and one nit suggestion.
```
     * Clear the results from the prior poll
     * Clears completed receives. This is used by SocketServer to remove references to
     * receive buffers after processing completed receives, without waiting for the next
     * poll() after all results have been processed.
```
Nit: maybe "after all results have been processed" is a bit redundant? Same for the `clearCompletedSends` docs.
@ijuma Thanks for the review, have addressed the comments.
```
        cachedCompletedSends.update(super.completedSends.asScala)
        cachedDisconnected.update(super.disconnected.asScala.toBuffer)

        val map: util.Map[String, NetworkReceive] = JTestUtils.fieldValue(this, classOf[Selector], "completedReceives")
```
Should we call this `completedReceivesMap` or something?
@ijuma Thanks for the review, updated.
LGTM, thanks. Just one minor suggestion below, no need for re-review.
```
        val channelOpt = Option(super.channel(receive.source)).orElse(Option(super.closingChannel(receive.source)))
        channelOpt.foreach { channel => completedReceivesMap.put(channel.id, receive) }
      }

      // For each result type (completedReceives/completedSends/disconnected), defer the result to a subsequent poll()
      // if `minPerPoll` results are not yet available. When sufficient results are available, all available results
      // including previously deferred results are returned. This allows tests to process `minPerPoll` elements as the
      // results of a single poll iteration.
```
I think your refactoring has added the comments to other places. Maybe we can add "This allows tests to process `minPerPoll` elements as the results of a single poll iteration" to the `update` method and remove this?
…to avoid ConcurrentModificationException (#8705) Reviewers: Ismael Juma <ismael@juma.me.uk>, Chia-Ping Tsai <chia7712@gmail.com>