ClientProxy and connection have a circular reference, causing Python not to be able to garbage collect. #25

Closed
wants to merge 2 commits into
from

Projects

None yet

2 participants

@rogerhu
Contributor
rogerhu commented Jun 3, 2013

Found a circular reference that may have been causing memory leak issues. The offending code appears to be the _ClientProxy() class that takes in a connection object. Since the connection.client points to _ClientProxy() and _ClientProxy() instance is pointing back to connection, Python fails to call the del function inside the Connection class.

image

 def _create_connection(self):
     """Create a new connection with monkey-patched Thrift client."""
     connection = Connection(**self._connection_kwargs)
     connection.client = _ClientProxy(connection, connection.client)
     return connection

See the problem? ClientProxy points to connection, and connection.client points to ClientProxy.

I put inside the del(self) of connection to verify the connection will die:

+ bl/happybase-0.5/connection.py
 -167,6 +167,7 @@ class Connection(object)
       self.transport.close()

   def __del__(self):
       print "dying"
       try:
           self._initialized
       except AttributeError:
from happybase.pool import ConnectionPool

pool = ConnectionPool(3, host='myhost.com', port=9090)

with pool.connection() as conn:
    print conn

import objgraph
objgraph.show_backrefs([conn, conn.client])

With this fix, the print "dying" appears and memory is reclaimed after the ConnectionPool() instead goes out of scope (http://eli.thegreenplace.net/2009/06/12/safely-using-destructors-in-python/).

Is there a way to avoid this issue by avoiding needing to pass the connection object into the ClientProxy?

@rogerhu
Contributor
rogerhu commented Jun 4, 2013

Tried to implement the ClientProxy as a new-style class that decorates the function...let me know what you think.

@wbolster
Owner
wbolster commented Jun 4, 2013

Hmmm, I think you're right: the current approach introduces a circular reference. Thanks for bringing this to my attention. (And I didn't know about objgraph; thanks for the pointer.)

I have a different solution in mind though. The docs at http://docs.python.org/2.7/library/contextlib.html#contextlib.contextmanager contain this snippet:

Thus, you can use a try...except...finally statement to trap the error (if any), or ensure that some cleanup takes place. If an exception is trapped merely in order to log it or to perform some action (rather than to suppress it entirely), the generator must reraise that exception.

So I think I'm going to completely eliminate the proxy by handling any exceptions inside the pool.connection() context manager. Sounds good?

@rogerhu
Contributor
rogerhu commented Jun 4, 2013

That works too.. simpler. :)

Regenerate the Thrift bindings too? There is a Thrift 0.9.0 now.. :)

@wbolster wbolster added a commit that referenced this pull request Jun 4, 2013
@wbolster Rewrite connection pool exception handling
The connection pool now uses a much simpler try/except/finally block in
the context manager function to detect network/Thrift errors. The pool
will now only refreshes connections when Thrift or socket errors occur,
and will not react to unrelated application errors anymore.

With this approach, the _ClientProxy hack is not needed anymore, so it
has been completely eliminated.

See issue #25.
615e2eb
@wbolster
Owner
wbolster commented Jun 4, 2013

Could you please test my latest connection pool changes and check whether it solves the original problems you reported? Thanks in advance!

@rogerhu
Contributor
rogerhu commented Jun 5, 2013

Yep seems like this issue is fixed.

The test in there is an extra sanity check but don't think it's absolutely necessary now that it's been rewritten.

@rogerhu rogerhu closed this Jun 5, 2013
@wbolster
Owner
wbolster commented Jun 5, 2013

Wrt. your inline comments at commit commit 615e2eb:

If we're raising back to the caller, this means we're not continuing with the rest of the with statement? Wondering whether things like socket.timeout's should be explicitly caught and returned back to the pool without reraising...

No, once an exception is raised, the with block ends, but the "cleanup" routine (__exit__()) does run. This is how the context manager protocol works.

Also I'm wondering if the connection pool can be depleted and not added back into the pool if all connections go bad when hitting these exceptions...still checking..

No, that should not be possible. The except block refreshes the underlying socket if an exception occurred, so after this point the happybase.Connection is fresh again (because the underlying socket is fresh). The finally clause guarantees that the connection returns to the pool.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment