Connection problems #70

Closed
wants to merge 23 commits into from

2 participants

Carlos Tasada vinoth chandar
Carlos Tasada

Solves different connection problems. Basically some connection handles were left open, causing "Too Many Open files" errors

Carlos Tasada

Fixes the following stack trace:

ERROR 08/03/2012 07:44:39 [ClientRequestExecutor]/SelectorManagerWorker

java.lang.NullPointerException
at voldemort.store.socket.clientrequest.ClientRequestExecutor.read(ClientRequestExecutor.java:186)
at voldemort.utils.SelectorManagerWorker.run(SelectorManagerWorker.java:98)
at voldemort.utils.SelectorManager.run(SelectorManager.java:194)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

Carlos Tasada

Properly catches the following exception:

java.nio.channels.ClosedChannelException
at java.nio.channels.spi.AbstractSelectableChannel.register(AbstractSelectableChannel.java:167)
at voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager.processEvents(ClientRequestExecutorFactory.java:341)
at voldemort.utils.SelectorManager.run(SelectorManager.java:172)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
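
For readers unfamiliar with this failure mode, here is a minimal sketch of how the exception above arises: registering a channel that has already been closed (for example by a timeout on another thread) is rejected by java.nio. The class and method names are hypothetical and are not taken from Voldemort; only the java.nio calls are real.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

// Hypothetical demo class, not part of the Voldemort code base.
class RegisterClosedChannelDemo {

    /** Returns true if the registration was rejected because the channel is closed. */
    static boolean registerRejected(Selector selector, SocketChannel channel) {
        try {
            channel.register(selector, SelectionKey.OP_WRITE);
            return false;
        } catch (ClosedChannelException e) {
            // The channel was closed before the selector thread could register it.
            return true;
        }
    }

    static boolean demo() {
        try (Selector selector = Selector.open()) {
            SocketChannel channel = SocketChannel.open();
            channel.configureBlocking(false);
            channel.close(); // simulate the channel being closed elsewhere first
            return registerRejected(selector, channel);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("registration rejected: " + demo());
    }
}
```

Catching ClosedChannelException around the register call, as the patch does, lets the selector loop treat this as a per-connection problem rather than an unexpected error.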

Carlos Tasada

When the thread is interrupted, the connection is not properly closed. Judging by the Java dumps I have, that seems to be the cause of a ClientRequestExecutor leak.

As a side effect, this produces a "Too Many Open Files" error given enough time.
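
A minimal sketch of the idea, assuming a hypothetical FakeConnection in place of the real ClientRequestExecutor: when the waiting thread is interrupted, the connection is closed before the failure is propagated, so no handle is left behind. Restoring the interrupt flag and the IllegalStateException are choices made for this sketch only; the actual patch rethrows an UnreachableStoreException.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch, not part of the Voldemort code base.
class InterruptCleanupSketch {

    /** Stand-in for a pooled connection that holds a file descriptor. */
    static final class FakeConnection {
        private final AtomicBoolean closed = new AtomicBoolean(false);

        void close() {
            closed.set(true);
        }

        boolean isClosed() {
            return closed.get();
        }
    }

    /** Waits for a response; on interruption, closes the connection and then propagates. */
    static void awaitResponse(FakeConnection connection, CountDownLatch responseReady) {
        try {
            responseReady.await();
        } catch (InterruptedException e) {
            connection.close();                 // release the handle first
            Thread.currentThread().interrupt(); // sketch-only choice: restore the flag
            throw new IllegalStateException("Interrupted while waiting for response", e);
        }
    }

    static boolean demo() {
        FakeConnection connection = new FakeConnection();
        CountDownLatch neverReleased = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                awaitResponse(connection, neverReleased);
            } catch (IllegalStateException expected) {
                // The waiter reports the interruption to its caller.
            }
        });
        worker.start();
        worker.interrupt();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return connection.isClosed();
    }

    public static void main(String[] args) {
        System.out.println("connection closed after interrupt: " + demo());
    }
}
```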

Carlos Tasada

Fixes a bug introduced in a previous commit:

ERROR 15/03/2012 13:24:19 [ClientRequestExecutorFactory$ClientRequestSelectorManager]/ClientRequestExecutorFactory$ClientRequestSelectorManager

java.nio.channels.ClosedSelectorException
at sun.nio.ch.SelectorImpl.keys(SelectorImpl.java:51)
at voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager.processEvents(ClientRequestExecutorFactory.java:370)
at voldemort.utils.SelectorManager.run(SelectorManager.java:172)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
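
As an illustration of the trace above, the following sketch shows that calling keys() on a selector that has already been closed throws ClosedSelectorException. The class name is hypothetical; the java.nio behaviour is standard.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.Selector;

// Hypothetical demo class, not part of the Voldemort code base.
class ClosedSelectorDemo {

    /** Returns true if keys() fails once the selector has been closed. */
    static boolean keysFailsAfterClose() {
        try {
            Selector selector = Selector.open();
            selector.close(); // closing too early, as in the bug described above
            try {
                selector.keys();
                return false;
            } catch (ClosedSelectorException e) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("keys() fails after close: " + keysFailsAfterClose());
    }
}
```

Any work that still needs the key set therefore has to run before the selector is closed, or be guarded by a check such as selector.isOpen().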

Carlos Tasada ctasada Solves bug with unclosed connections when
voldemort.store.UnreachableStoreException: Client response not read/parsed, cannot determine result
	at voldemort.store.socket.clientrequest.AbstractClientRequest.getResult(AbstractClientRequest.java:78)
	at voldemort.store.socket.SocketStore$NonblockingStoreCallbackClientRequest.complete(SocketStore.java:335)
	at voldemort.store.socket.clientrequest.ClientRequestExecutor.completeClientRequest(ClientRequestExecutor.java:273)
	at voldemort.store.socket.clientrequest.ClientRequestExecutor.close(ClientRequestExecutor.java:158)
	at voldemort.store.socket.clientrequest.ClientRequestExecutor.checkTimeout(ClientRequestExecutor.java:83)
	at voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager.processEvents(ClientRequestExecutorFactory.java:381)
	at voldemort.utils.SelectorManager.run(SelectorManager.java:172)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:662)
173233b
vinoth chandar vinothchandar commented on the diff
src/java/voldemort/store/socket/SocketStore.java
@@ -248,6 +248,7 @@ public void close() throws VoldemortException {
blockingClientRequest.await();
return blockingClientRequest.getResult();
} catch(InterruptedException e) {
+ clientRequestExecutor.close();
vinoth chandar Collaborator

I am fixing this here. https://github.com/vinothchandar/voldemort/compare/client-conn-cleanup . There was a subtle timeout race.

Carlos Tasada
ctasada added a note

Subtle difference, but looks better :)

vinoth chandar vinothchandar commented on the diff
...store/socket/clientrequest/ClientRequestExecutor.java
@@ -156,8 +156,11 @@ public void close() {
if(!isClosed.compareAndSet(false, true))
return;
- completeClientRequest();
- closeInternal();
+ try {
+ completeClientRequest();
+ } finally {
+ closeInternal();
vinoth chandar Collaborator

Just curious. Are you trying to handle the case where checkin() fails and we don't call closeInternal()? I am asking since checkin() by itself will destroy the resource if it's invalid. In general, can you add all the stack traces you got, to fix these?

Carlos Tasada
ctasada added a note

Hi vinoth,

If completeClientRequest() throws an exception, then closeInternal() is not executed. In this case the stack trace was:

voldemort.store.UnreachableStoreException: Client response not read/parsed, cannot determine result
at voldemort.store.socket.clientrequest.AbstractClientRequest.getResult(AbstractClientRequest.java:78)
at voldemort.store.socket.SocketStore$NonblockingStoreCallbackClientRequest.complete(SocketStore.java:335)
at voldemort.store.socket.clientrequest.ClientRequestExecutor.completeClientRequest(ClientRequestExecutor.java:273)
at voldemort.store.socket.clientrequest.ClientRequestExecutor.close(ClientRequestExecutor.java:158)
at voldemort.store.socket.clientrequest.ClientRequestExecutor.checkTimeout(ClientRequestExecutor.java:83)
at voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager.processEvents(ClientRequestExecutorFactory.java:381)
at voldemort.utils.SelectorManager.run(SelectorManager.java:172)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

I'll try to always include the stack trace. In fact, I have it in my own branch, but I forgot to include it in the patch. Sorry.

vinoth chandar Collaborator

Cool. I see the issue here. But, if you notice, this is the selector manager thread. The client code (blocking or non-blocking) will finally check in the ClientRequestExecutor, at which point the socket will be closed. But it seems good to have anyway. Will pull this in.
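
The pattern discussed in this thread can be sketched as follows. SketchExecutor is a hypothetical stand-in for ClientRequestExecutor: the method names mirror the diff, but the bodies are assumptions made only to show that closeInternal() still runs when completeClientRequest() throws, and that a second close() is a no-op.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch, not the actual ClientRequestExecutor implementation.
class IdempotentCloseSketch {

    static final class SketchExecutor {
        private final AtomicBoolean isClosed = new AtomicBoolean(false);
        private final boolean failOnComplete;
        int internalCloseCount = 0;

        SketchExecutor(boolean failOnComplete) {
            this.failOnComplete = failOnComplete;
        }

        void close() {
            // Only the first caller proceeds; later calls return immediately.
            if (!isClosed.compareAndSet(false, true))
                return;
            try {
                completeClientRequest();
            } finally {
                // Runs even if completing the request throws.
                closeInternal();
            }
        }

        private void completeClientRequest() {
            if (failOnComplete)
                throw new IllegalStateException("Client response not read/parsed");
        }

        private void closeInternal() {
            internalCloseCount++;
        }
    }

    /** Returns how many times the underlying resource was released. */
    static int demo() {
        SketchExecutor executor = new SketchExecutor(true);
        try {
            executor.close();
        } catch (IllegalStateException expected) {
            // The failure still propagates to the caller.
        }
        executor.close(); // second call does nothing
        return executor.internalCloseCount;
    }

    public static void main(String[] args) {
        System.out.println("closeInternal() calls: " + demo());
    }
}
```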

vinoth chandar vinothchandar commented on the diff
...ocket/clientrequest/ClientRequestExecutorFactory.java
@@ -388,6 +389,10 @@ protected void processEvents() {
} catch(Exception e) {
if(logger.isEnabledFor(Level.ERROR))
logger.error(e.getMessage(), e);
+ } finally {
+ if(closedChannel) {
+ super.close();
vinoth chandar Collaborator

If your concern is accessing the selector even after it's closed due to ClosedSelectorException, we should just call close, log the exception and return from the method. Much cleaner.

Carlos Tasada
ctasada added a note

That was exactly my first approach. But then I discovered I had a memory leak in my code. After some dumping I found that the leak was produced by the following sequence:

Since the close was executed, the selector was null and the ClientRequestExecutors were never properly closed. The memory dump was pointing clearly to a leak in the ClientRequestExecutors (I think around 60% of the dump was ClientRequestExecutor objects).

When I modified the code to this approach, the memory leak disappeared and I don't have connection leaks anymore, AFAIK.

If you can think of a cleaner solution I'll be happy to test it :)
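
For comparison, here is a rough sketch of the flag-and-finally approach in the diff. SketchManager, the string items and the log are all hypothetical; the assumption is only that some per-pass work after the registration loop still needs the manager to be open, so the close is remembered in a flag and performed in the finally block.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the actual ClientRequestSelectorManager.
class DeferredCloseSketch {

    static final class SketchManager {
        final List<String> log = new ArrayList<>();
        private boolean open = true;

        void close() {
            open = false;
            log.add("manager closed");
        }

        void processEvents(List<String> pending, String brokenItem) {
            boolean closeRequested = false;
            try {
                for (String item : pending) {
                    if (item.equals(brokenItem)) {
                        // Remember that a close is needed, but do not close yet.
                        closeRequested = true;
                        break;
                    }
                    log.add("registered " + item);
                }
                // Assumed remaining work for this pass; it needs the manager open.
                if (!open)
                    throw new IllegalStateException("manager already closed");
                log.add("timeouts checked");
            } finally {
                if (closeRequested)
                    close();
            }
        }
    }

    static List<String> demo() {
        SketchManager manager = new SketchManager();
        manager.processEvents(Arrays.asList("a", "bad", "c"), "bad");
        return manager.log;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Closing inside the loop instead would make the later step fail or be skipped, which is one plausible reading of the leak described above.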

vinoth chandar Collaborator

Let me think more and get back to you. I will probably pull your changes in my local branch and merge them in.

Carlos Tasada
ctasada added a note

Great :)

vinoth chandar Collaborator

Carlos, can you rebase your master on the latest and have a branch with these two changes alone? I will pull it in and commit.

Carlos Tasada
ctasada added a note

Sure! Just double-checking before doing it. You only need the ClientRequestExecutor.java and ClientRequestExecutorFactory.java changes, right?

vinoth chandar Collaborator

yep. thanks !

Carlos Tasada
ctasada added a note

Done in pull request 73.

I think this one can be closed :)

vinoth chandar
Collaborator

Will pull in the other one.

Commits on Feb 19, 2012
  1. Carlos Tasada

    Minor code optimizations

    ctasada authored
    Imported some code optimizations from mebigfatguy
  2. Carlos Tasada

    Build fixes

    ctasada authored
    Removes the classes folders from the distribution files
  3. Carlos Tasada

    voldemort-env.sh config file

    ctasada authored
    Imported akkumar changes to support a specific voldemort-env.sh config
    file
Commits on Feb 21, 2012
  1. Carlos Tasada

    Fixes in the setenv

    ctasada authored
  2. Carlos Tasada

    Adds windows batch scripts

    ctasada authored
    Fixes issue 268
Commits on Feb 22, 2012
  1. Carlos Tasada
Commits on Feb 27, 2012
  1. Carlos Tasada

    Admin Interface

    ctasada authored
    Shows the real service name, and not the service type
Commits on Feb 28, 2012
  1. Carlos Tasada

    Admin

    ctasada authored
    Fixed bug showing if a Node is available and when was checked last time
Commits on Mar 3, 2012
  1. Carlos Tasada
  2. Carlos Tasada
  3. Carlos Tasada

    Revert "GitHub for Mac: Throw-away commit."

    ctasada authored
    This reverts commit 8a4636c.
Commits on Mar 6, 2012
  1. Carlos Tasada
  2. Carlos Tasada

    Added Windows batch scripts

    ctasada authored
  3. Carlos Tasada
  4. Carlos Tasada
Commits on Mar 8, 2012
  1. Carlos Tasada
  2. Carlos Tasada
  3. Carlos Tasada
  4. Carlos Tasada
  5. Carlos Tasada
  6. Carlos Tasada
Commits on Mar 15, 2012
  1. Carlos Tasada

    Fixed bug in previous commit closing the selector too early when getting a ClosedChannelException

    ctasada authored
Commits on Mar 30, 2012
  1. Carlos Tasada

    Solves bug with unclosed connections when

    ctasada authored
    voldemort.store.UnreachableStoreException: Client response not read/parsed, cannot determine result
    	at voldemort.store.socket.clientrequest.AbstractClientRequest.getResult(AbstractClientRequest.java:78)
    	at voldemort.store.socket.SocketStore$NonblockingStoreCallbackClientRequest.complete(SocketStore.java:335)
    	at voldemort.store.socket.clientrequest.ClientRequestExecutor.completeClientRequest(ClientRequestExecutor.java:273)
    	at voldemort.store.socket.clientrequest.ClientRequestExecutor.close(ClientRequestExecutor.java:158)
    	at voldemort.store.socket.clientrequest.ClientRequestExecutor.checkTimeout(ClientRequestExecutor.java:83)
    	at voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager.processEvents(ClientRequestExecutorFactory.java:381)
    	at voldemort.utils.SelectorManager.run(SelectorManager.java:172)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    	at java.lang.Thread.run(Thread.java:662)
1  src/java/voldemort/store/socket/SocketStore.java
@@ -248,6 +248,7 @@ public void close() throws VoldemortException {
blockingClientRequest.await();
return blockingClientRequest.getResult();
} catch(InterruptedException e) {
+ clientRequestExecutor.close();
throw new UnreachableStoreException("Failure in " + operationName + " on "
+ destination + ": " + e.getMessage(), e);
} catch(IOException e) {
35 src/java/voldemort/store/socket/clientrequest/ClientRequestExecutor.java
@@ -156,8 +156,11 @@ public void close() {
if(!isClosed.compareAndSet(false, true))
return;
- completeClientRequest();
- closeInternal();
+ try {
+ completeClientRequest();
+ } finally {
+ closeInternal();
+ }
}
@Override
@@ -184,21 +187,25 @@ protected void read(SelectionKey selectionKey) throws IOException {
// the position to 0 in preparation for reading in the RequestHandler.
inputStream.getBuffer().flip();
- if(!clientRequest.isCompleteResponse(inputStream.getBuffer())) {
- // Ouch - we're missing some data for a full request, so handle that
- // and return.
- handleIncompleteRequest(position);
- return;
- }
+ if(clientRequest != null) {
+ if(!clientRequest.isCompleteResponse(inputStream.getBuffer())) {
+ // Ouch - we're missing some data for a full request, so handle
+ // that
+ // and return.
+ handleIncompleteRequest(position);
+ return;
+ }
- // At this point we have the full request (and it's not streaming), so
- // rewind the buffer for reading and execute the request.
- inputStream.getBuffer().rewind();
+ // At this point we have the full request (and it's not streaming),
+ // so
+ // rewind the buffer for reading and execute the request.
+ inputStream.getBuffer().rewind();
- if(logger.isTraceEnabled())
- logger.trace("Starting read for " + socketChannel.socket());
+ if(logger.isTraceEnabled())
+ logger.trace("Starting read for " + socketChannel.socket());
- clientRequest.parseResponse(new DataInputStream(inputStream));
+ clientRequest.parseResponse(new DataInputStream(inputStream));
+ }
// At this point we've completed a full stand-alone request. So clear
// our input buffer and prepare for outputting back to the client.
13 src/java/voldemort/store/socket/clientrequest/ClientRequestExecutorFactory.java
@@ -18,7 +18,7 @@
import java.net.ConnectException;
import java.net.InetSocketAddress;
-import java.nio.channels.ClosedSelectorException;
+import java.nio.channels.ClosedChannelException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
@@ -321,6 +321,7 @@ public Selector getSelector() {
@Override
protected void processEvents() {
+ boolean closedChannel = false;
try {
ClientRequestExecutor clientRequestExecutor = null;
@@ -342,11 +343,11 @@ protected void processEvents() {
SelectionKey.OP_WRITE,
clientRequestExecutor);
- } catch(ClosedSelectorException e) {
+ } catch(ClosedChannelException e) {
if(logger.isDebugEnabled())
- logger.debug("Selector is closed, exiting");
+ logger.debug("SocketChannel is closed, exiting");
- close();
+ closedChannel = true;
break;
} catch(Exception e) {
@@ -388,6 +389,10 @@ protected void processEvents() {
} catch(Exception e) {
if(logger.isEnabledFor(Level.ERROR))
logger.error(e.getMessage(), e);
+ } finally {
+ if(closedChannel) {
+ super.close();
+ }
}
}
}