Connection problems #70

Closed
wants to merge 23 commits into
from
Commits
+31 −18
Split
@@ -248,6 +248,7 @@ public void close() throws VoldemortException {
blockingClientRequest.await();
return blockingClientRequest.getResult();
} catch(InterruptedException e) {
+ clientRequestExecutor.close();
vinothchandar
vinothchandar Apr 3, 2012 Collaborator

I am fixing this here. https://github.com/vinothchandar/voldemort/compare/client-conn-cleanup . There was a subtle time out race.

ctasada
ctasada Apr 4, 2012 Contributor

Subtle difference, but looks better :)

throw new UnreachableStoreException("Failure in " + operationName + " on "
+ destination + ": " + e.getMessage(), e);
} catch(IOException e) {
@@ -156,8 +156,11 @@ public void close() {
if(!isClosed.compareAndSet(false, true))
return;
- completeClientRequest();
- closeInternal();
+ try {
+ completeClientRequest();
+ } finally {
+ closeInternal();
vinothchandar
vinothchandar Apr 4, 2012 Collaborator

Just curious. Are you trying to handle the case where checkin() fails and we dont closeInternal? I am asking since checkin by itself will destroy the resource if its invalid. In general can you add all the stack traces you got, to fix these?

ctasada
ctasada Apr 5, 2012 Contributor

Hi vinoth,

If the completeClienteRequest() throws some exception, then the closeIntenal() is not executed. In this case the stack trace that happened was:

voldemort.store.UnreachableStoreException: Client response not read/parsed, cannot determine result
at voldemort.store.socket.clientrequest.AbstractClientRequest.getResult(AbstractClientRequest.java:78)
at voldemort.store.socket.SocketStore$NonblockingStoreCallbackClientRequest.complete(SocketStore.java:335)
at voldemort.store.socket.clientrequest.ClientRequestExecutor.completeClientRequest(ClientRequestExecutor.java:273)
at voldemort.store.socket.clientrequest.ClientRequestExecutor.close(ClientRequestExecutor.java:158)
at voldemort.store.socket.clientrequest.ClientRequestExecutor.checkTimeout(ClientRequestExecutor.java:83)
at voldemort.store.socket.clientrequest.ClientRequestExecutorFactory$ClientRequestSelectorManager.processEvents(ClientRequestExecutorFactory.java:381)
at voldemort.utils.SelectorManager.run(SelectorManager.java:172)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

I'll try to include always the stack trace. In fact I have it in my own branch, but I forgot to include it in the patch. Sorry

vinothchandar
vinothchandar Apr 6, 2012 Collaborator

Cool. I see the issue here. But, if you notice, this is the selector manager thread. The client code (blocking or non blocking) will finally checkin the clientrequestexecutor, at which point the socket will be closed. . But seems good to have anyway. will pull this in.

+ }
}
@Override
@@ -184,21 +187,25 @@ protected void read(SelectionKey selectionKey) throws IOException {
// the position to 0 in preparation for reading in the RequestHandler.
inputStream.getBuffer().flip();
- if(!clientRequest.isCompleteResponse(inputStream.getBuffer())) {
- // Ouch - we're missing some data for a full request, so handle that
- // and return.
- handleIncompleteRequest(position);
- return;
- }
+ if(clientRequest != null) {
+ if(!clientRequest.isCompleteResponse(inputStream.getBuffer())) {
+ // Ouch - we're missing some data for a full request, so handle
+ // that
+ // and return.
+ handleIncompleteRequest(position);
+ return;
+ }
- // At this point we have the full request (and it's not streaming), so
- // rewind the buffer for reading and execute the request.
- inputStream.getBuffer().rewind();
+ // At this point we have the full request (and it's not streaming),
+ // so
+ // rewind the buffer for reading and execute the request.
+ inputStream.getBuffer().rewind();
- if(logger.isTraceEnabled())
- logger.trace("Starting read for " + socketChannel.socket());
+ if(logger.isTraceEnabled())
+ logger.trace("Starting read for " + socketChannel.socket());
- clientRequest.parseResponse(new DataInputStream(inputStream));
+ clientRequest.parseResponse(new DataInputStream(inputStream));
+ }
// At this point we've completed a full stand-alone request. So clear
// our input buffer and prepare for outputting back to the client.
@@ -18,7 +18,7 @@
import java.net.ConnectException;
import java.net.InetSocketAddress;
-import java.nio.channels.ClosedSelectorException;
+import java.nio.channels.ClosedChannelException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
@@ -321,6 +321,7 @@ public Selector getSelector() {
@Override
protected void processEvents() {
+ boolean closedChannel = false;
try {
ClientRequestExecutor clientRequestExecutor = null;
@@ -342,11 +343,11 @@ protected void processEvents() {
SelectionKey.OP_WRITE,
clientRequestExecutor);
- } catch(ClosedSelectorException e) {
+ } catch(ClosedChannelException e) {
if(logger.isDebugEnabled())
- logger.debug("Selector is closed, exiting");
+ logger.debug("SocketChannel is closed, exiting");
- close();
+ closedChannel = true;
break;
} catch(Exception e) {
@@ -388,6 +389,10 @@ protected void processEvents() {
} catch(Exception e) {
if(logger.isEnabledFor(Level.ERROR))
logger.error(e.getMessage(), e);
+ } finally {
+ if(closedChannel) {
+ super.close();
vinothchandar
vinothchandar Apr 4, 2012 Collaborator

If your concern is accessing the selector even after its closed due to ClosedSelectorException, we should just call close, log the exception and return from the method.. Much cleaner

ctasada
ctasada Apr 5, 2012 Contributor

That was exactly my first aproach. But then I discovered I had a memory leak in my code. After some dumping I found that the leak was produced by the next sequence:

Since the close executed, the selector was null and the ClientRequestExecutor where never properly closed. The memory dump was pointing clearly to a leak in the ClientRequestExecutors (I think I had around 60% of the dump with ClientRequestExecutor objects).

When I modified the code to this approach, the memory leak disappeared and I don't have connections leaks anymore, AFAIK.

If you can think of a cleaner solution I'll be happy to test it :)

vinothchandar
vinothchandar Apr 6, 2012 Collaborator

Let me think more and get back to you. I will probably pull your changes in my local branch and merge them in.

ctasada
ctasada Apr 6, 2012 Contributor

Great :)

vinothchandar
vinothchandar Apr 18, 2012 Collaborator

Carlos, can you rebase your master on the latest and have a branch with these two changes alone? I will pull it in and commit.

ctasada
ctasada Apr 18, 2012 Contributor

Sure! Just double-checking before doing it. You only need the ClientRequestExecutor.java and ClientRequestExecutorFactory.java changes, right?

vinothchandar
vinothchandar Apr 18, 2012 Collaborator

yep. thanks !

ctasada
ctasada Apr 19, 2012 Contributor

Done in pull request 73.

I thing this one can be closed :)

+ }
}
}
}