Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-6300 SelectorTest may fail with ConcurrentModificationException #4288

Closed
wants to merge 7 commits into from
Closed

Conversation

tedyu
Copy link
Contributor

@tedyu tedyu commented Dec 3, 2017

Synchronization is added w.r.t. sockets ArrayList to avoid ConcurrentModificationException

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@ijuma
Copy link
Contributor

ijuma commented Dec 3, 2017

cc @rajinisivaram

@rajinisivaram
Copy link
Contributor

@tedyu Thanks for the PR.

SslSelectorTest.testImmediatelyConnectedCleaned fails intermittently with ConcurrentModificationException in the tearDownbecause it creates client connections and goes straight into tearDown and closes the server. Since the connections may not have been processed by the server yet, it can still add to the sockets list while the server is iterating through the list to close all sockets. Synchronizing on sockets would avoid ConcurrentModificationException , but it can potentially leak sockets and threads. We need to ensure that once EchoServer#close` is called, no more sockets are added to the list. Any socket processed after that should be closed immediately and the thread to process the sockets should not be created.

@@ -73,7 +74,9 @@ public void run() {
try {
while (true) {
final Socket socket = serverSocket.accept();
sockets.add(socket);
synchronized (sockets) {
if (!closing) sockets.add(socket);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Ted, I quickly had a look at this change if you don't mind. My concern is the following:
Let's suppose that if right after serverSocket.accept() a closeConnection() gets scheduled and completes, then this block would indeed run without adding the current socket to a list but it would still start the thread with a leaked socket.
I think we shouldn't even accept any sockets or start threads if the server is being shut down.
Let me know if you disagree or I missed something :).

@tedyu
Copy link
Contributor Author

tedyu commented Dec 5, 2017

@viktorsomogyi
I agree.
I was about to make this change.

@tedyu
Copy link
Contributor Author

tedyu commented Dec 6, 2017

retest this please

} catch (IOException e) {
// ignore
} finally {
synchronized (sockets) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

closing flag should be checked inside the synchronized block. Otherwise, this block could be executed after closeConnections on a socket that was already accepted, but not added to the list (this is the cause of the transient test failure). The !closing check in while is not really needed since accept would just throw IOException and return, but you want to keep it, it should be synchronized or made volatile.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once closing flag becomes true, the while loop in EchoServer#run() should exit.
What about the last assertion in testServerDisconnect() ?

        assertEquals("hello", blockingRequest(node, "hello"));

It seems the above should be dropped - otherwise it would hang.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ tedyu left comments on fixing this.

for (Socket socket : sockets)
socket.close();
synchronized (sockets) {
closing = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

closing should be set in close() rather than closeConnections - i.e. set in when closing server socket, not when closing client connections.


// reconnect and do another request
blockingConnect(node);
assertEquals("hello", blockingRequest(node, "hello"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you revert this change please, after setting closing flag in EchoServer#close()?

@rajinisivaram
Copy link
Contributor

@tedyu Can you also rebase the PR since it has conflicts? Thanks.

@tedyu
Copy link
Contributor Author

tedyu commented Dec 6, 2017

New PR coming which addresses Rajini's comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants