Connection Pool Not Releasing Connections #29
Comments
Closing of connections doesn't get called in commands, because there are valid reasons to leave them open to the database so multiple instructions can be sent back and forth, which is what pooled connections do. Can you show me another command that is closing or releasing the connection? If you want connections to be closed with each call to the server, you don't want to use a connection pool. Connection pools are designed to be intelligent about opening and closing connections. You want to use the normal connection provider, which opens and closes connections based on the session. See how the PooledConnectionProvider differs from the NormalConnectionProvider: https://github.com/managedfusion/fluentcassandra/blob/master/src/Connections/PooledConnectionProvider.cs
Thank you for responding. I didn't express clearly that the problem is not that the connection is not being closed, but rather that it is not being freed for reuse. Close is merely the name of the method in PooledConnectionProvider; Close() there does not actually close the connection, but rather moves it from _usedConnections to the _freeConnections queue. As I tried to show in the code above, for every call to family.Get(mykey).FirstOrDefault() two connections are added to _usedConnections and only one of them is moved back to _freeConnections. After 100 iterations, _usedConnections.Count = 100 and I get the message "No connection could be made, timed out trying to acquire a connection from the connection pool." Excuse me if I am doing something incorrectly. If you still think there is no problem, I will dig into the codebase deeper.
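To make the bookkeeping concrete, here is a toy pool with the same used/free structure (hypothetical names mirroring `_usedConnections` and `_freeConnections`; an illustration only, not FluentCassandra's actual code). One missed `Close()` per iteration permanently loses a slot and eventually exhausts the pool:

```csharp
using System;
using System.Collections.Generic;

// Toy connection pool, for illustration: Open() moves a connection to the
// used list, Close() returns it to the free queue without actually closing
// the socket. Any caller that never calls Close() leaks a slot.
class ToyPool
{
    private readonly Queue<int> _freeConnections = new Queue<int>();
    private readonly List<int> _usedConnections = new List<int>();

    public ToyPool(int size)
    {
        for (var i = 0; i < size; i++) _freeConnections.Enqueue(i);
    }

    public int UsedCount => _usedConnections.Count;

    public int Open()
    {
        if (_freeConnections.Count == 0)
            throw new TimeoutException("timed out trying to acquire a connection from the connection pool");
        var conn = _freeConnections.Dequeue();
        _usedConnections.Add(conn);
        return conn;
    }

    public void Close(int conn)
    {
        // "Close" here only releases the slot back to the free queue.
        _usedConnections.Remove(conn);
        _freeConnections.Enqueue(conn);
    }
}
```

With a pool of size N, opening two connections per iteration and closing only one exhausts the pool after N leaked opens, which matches the `_usedConnections.Count = 100` observation above.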
@kguner Either way you should dig in. Open source software is a collaborative process; it relies not only on people like you contributing issues, but also on digging in. FluentCassandra is like most open source projects, managed by one person. You seem to have an excellent grasp of the problem, so your guess would be as good as mine as to why Close() isn't being called. You can set a break point to see if the code is ever hit, but my guess is that it is, and the List.Remove() method isn't working correctly. Here is a little background to get you started:

1. When the IQueryable is enumerated, this method is called: https://github.com/managedfusion/fluentcassandra/blob/master/src/BaseCassandraColumnFamily.cs#L145
2. Then the command is executed: https://github.com/managedfusion/fluentcassandra/blob/master/src/BaseCassandraColumnFamily.cs#L112
3. Then the session is disposed of: https://github.com/managedfusion/fluentcassandra/blob/master/src/BaseCassandraColumnFamily.cs#L112
4. Then the connection is closed, as long as it isn't null: https://github.com/managedfusion/fluentcassandra/blob/master/src/CassandraSession.cs#L192

It is not that I don't want to track this down myself, I just don't have the time. Open source software such as FluentCassandra is driven by need and time, and without both, it doesn't get developed. It sounds like you have the need, so if you have the time to put in, it will be greatly appreciated.
Thank you for the breakdown. I will look into this when I get the chance.
I ran into the exact same problem. The problem is that in the CassandraSession class, the GetClient() function creates the _connection but returns _connection.Client, and hence a connection is leaked. Instead, the _connection should be created in ExecuteOperation. I've tested that this fixes the problem.
@edweip3 There are cases where GetClient is used and it doesn't go through ExecuteOperation. The connection should stick around as long as the session, regardless of whether one ExecuteOperation is called or 30 are. Also, as long as the CassandraSession is disposed of, the connection is Closed against the provider, and that should return it back to the pool. I am not quite sure how creating the connection in ExecuteOperation would solve this. Can you please share your code? I don't know if I buy that the connection being leaked through GetClient is causing the issue.
@nberardi I think I figured out what is wrong with the code. The problem is in the CassandraContext ExecuteOperation function. In the sample code given above by @kguner, CassandraContext is constructed from the ConnectionBuilder, not a CassandraSession, so in the CassandraContext.ExecuteOperation function we create a local session and then dispose it at the end of the function call. So by the end of this call from the sample code: dynamic c = family.Get(sequence).FirstOrDefault(); a local session was created and disposed. But on the next line: this causes a connection to be created, because with iterators the actual call to get the data (i.e. the Execute() function on an operation) is delayed until the data is accessed through the iterator. In most of the Execute() functions of an operation, Session.GetClient() is called, and hence the connection leak. Moving the instantiation of the _connection in CassandraSession from GetClient to ExecuteOperation helped because the instantiation of _connection is then no longer delayed for execution, so .NET knows how to free it properly when the time comes. Perhaps this isn't the best solution, but it is working for me.
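The deferral described here is easy to reproduce with nothing but an iterator method; a minimal self-contained sketch (toy types, not the actual FluentCassandra classes):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy stand-in for a session whose Execute() is an iterator (illustration only).
class LazySession : IDisposable
{
    public bool Disposed { get; private set; }
    public void Dispose() => Disposed = true;

    public IEnumerable<int> Execute()
    {
        // Nothing below runs until the caller actually enumerates the result.
        if (Disposed)
            throw new ObjectDisposedException(nameof(LazySession));
        yield return 42; // real code would use the pooled connection here
    }
}

static class DeferredExecutionDemo
{
    public static string Run()
    {
        IEnumerable<int> rows;
        using (var session = new LazySession())
            rows = session.Execute(); // builds the iterator; no "server call" yet

        try
        {
            // The first MoveNext() happens here, after Dispose() already ran.
            return rows.First().ToString();
        }
        catch (ObjectDisposedException)
        {
            return "executed after dispose";
        }
    }
}
```

The body of `Execute()` only runs when the result is enumerated, which in the real library happens after `ExecuteOperation` has already disposed its local session, so any connection acquired at that point never gets released by the session's disposal.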
That makes total sense, I knew those iterators would bite me in the butt eventually. I guess the easy way is to just get rid of the iterators. Luckily they are only used in a few places.
It seems the reason the pool handling gets tricky is because

Of course, now there seems to be a memory leak and I die with an out-of-memory error after about 26k operations, so maybe I'm just shifting the problem around. I assume it's because I am keeping the context around for each worker thread and doing lots of operations, so there must be something holding on to references in the session that would normally get released on a session dispose.

UPDATE: Yep, it's mutation trackers. Not knowing their purpose, I don't know when or if they should be flushed. But since they're part of the Context, not the session, I would say it's unrelated, and I'll investigate and start a separate issue. So, unless there is specific reasoning why a context should not hold on to a session during its lifetime, my recommendation would be tying the session to the context, which should solve the iterator issue, since those would be invalid if called outside of a context anyhow.
Sorry to keep beating on this, but using a

In order to flush the mutators, I was running 100 threads that would create a new context after every 1000 records. This had a sustained throughput of about 1700 records/second (supercolumn appends), i.e. ~1-2 context dispose/creates per second, and I hit pool exhaustion again after about 200k records. I backed the context re-creation off to only after every 5000 records, and that time it got to 630k records but hit pool exhaustion once again. Looking at the pool, early on I saw that all 200 connections were in the used part of the pool, but it ran for quite a while after that, so it isn't that every session creation used and lost a connection.

UPDATE: It's getting late. I should do more testing before posting. When you provide a session to a context, the context considers it an outside session and does not dispose it, which means that the only way the connection gets released is by the finalizer getting hit. Since I was putting enough pressure on the GC, the finalizer did hit frequently, which is why I didn't exhaust the pool quickly, but not enough pressure to keep the pool from getting exhausted eventually. Looks like a per-context session is a viable solution (and quite a bit more efficient due to less lock contention and session churn). Here's how I've changed it locally for testing, and so far with 200 threads, each recreating the context after 1000 creates, for over 1M records I have not encountered any more pool issues:

```csharp
/// <summary>
///
/// </summary>
/// <param name="session"></param>
public CassandraContext(CassandraSession session)
    : this(session, true) { }

/// <summary>
///
/// </summary>
/// <param name="connectionBuilder"></param>
public CassandraContext(IConnectionBuilder connectionBuilder)
    : this(new CassandraSession(connectionBuilder), false) { }

private CassandraContext(CassandraSession session, bool isOutsideSession)
{
    ThrowErrors = true;
    _trackers = new List<IFluentMutationTracker>();
    _session = session;
    _isOutsideSession = isOutsideSession;
    ConnectionBuilder = _session.ConnectionBuilder;
    Keyspace = new CassandraKeyspace(ConnectionBuilder.Keyspace, this);
}
```

And in ExecuteOperation, I've removed the concept of a local session.
The pool obviously needs to be redesigned. I just haven't had much time to figure out a good solution.
If you don't have objections to using a session per context, I'd gladly clean up my changes and send you a pull.
Can you send your pull request? We'd gladly use it.
We already support a session per context. You just create a session and pass it into the context.
Correct, but then you are responsible for creating the session and passing it to the context and disposing of it afterwards. My local change abolishes the concept of a local session per execution and instead creates a session from the provided ConnectionBuilder at context creation, disposing it at context disposal (unless it was manually provided, which leaves the disposal responsibility with the caller). I'll clean up my change and let you decide whether you like it. It's served me well for my stress testing over the weekend.
I looked at your source. The problem with what you have done is that it leaves the connection open for possibly a very long time if somebody, say, puts it in a static, or more realistically a global variable, like in a desktop app. What I am doing here is no different than how other frameworks like EF function. I think the problem is with the connection pool. I am not really interested in modifying the behavior of Context, because if the pool released connections correctly under pressure, the current Context wouldn't have any problem.
I guess I view the CassandraContext as more analogous to getting a database connection from a pool, where once you have a connection you keep it until you release it. Considering that Cassandra traffic is generally lots of small, short calls, creating a session for each call seems like an undue burden, both performance-wise and GC-pressure-wise.
The CassandraSession is more analogous to getting a database connection from a pool; at least that was the way it was designed, because that is where the connection is stored. Secondly, CassandraContext was designed as a secondary add-on to CassandraSession, which provides all the niceties that make FluentCassandra easy to use. If I had to equate this to the SQL Server interface, I would say:

CassandraSession = SqlConnection
CassandraContext = DbContext

In that you can do anything you want with SqlConnection, but if you want all the nice add-ons such as LINQ and everything else that comes with the Entity Framework, you have to wrap SqlConnection with DbContext. The same is true with CassandraSession and CassandraContext: you can use CassandraSession and all the raw RPC calls against Cassandra without ever spinning up a CassandraContext, but if you want all the nice add-ons such as LINQ and everything else, you need to wrap it with CassandraContext. That is how I designed the two.
Ok, but even DbContext doesn't create a new SqlConnection per DB action. It either gets a connection or creates a LazyInternalContext with a LazyInternalConnection, i.e. once a connection is initialized for a DbContext, it stays with that one connection. I can certainly change my code to not initialize the session at context creation time but on first execute, if you prefer. It just seems that creating sessions per execute call is more trouble than it's worth, given the performance aspects and the problems with iterators accessing disposed sessions since they leak outside the execute scope.
Ok, pull submitted: #73. Once I went for lazy initializing, the ctor changes went away and the only difference is the removal of the local session from Execute.
You should check out EnsureConnection, because there are 4 cases where the connection is opened and closed. Before heading down this route, which changes functionality for the many users that have been using this library for 2+ years, we should try to determine why you feel this change is needed. The connection pool manages the state of the connection with the server, and sessions are basically free to set up, so whether we create one or 100 shouldn't matter as long as the connection is being returned to the pool efficiently. For non connection pools, the user always has the option to pass in a session. But the safe route is to create a new one rather than relying on the session and connection still being connected.
Ok, I take it all back. I was not aware that DbContext could transparently re-connect and even re-acquire a transaction, so I've been projecting my assumptions onto CassandraContext. It's not a philosophy I agree with, but that's just something I will have to come to terms with. What precipitated this proposed change was the issues I ran into. My initial goal with my Cassandra R&D was ensuring that it would meet our performance requirements, and instead of testing Cassandra, I spent most of a day troubleshooting why the library would run out of connections as soon as I applied any load to it, and the connection per execution stood out as the culprit. However, even if you fix the pool and address the issue of iterators, under load the lock contention on the pool does make session creation a noticeable overhead, so I would still suggest that CassandraContext at least move to an EnsureConnection approach of only creating a new session when an old one is bad. I can tackle that if you'd like.
I have just updated the Thrift library in version 1.1.9 to version 0.9.0, and they have made some nice strides to actually dispose of used connections after they are used, so your churn may increase. Here are the changes:

One thing that you should note is that I have not yet incorporated the Thrift
@ilushka85 https://github.com/eplowe/fluentcassandra/tree/connreleaseissue -- So, the change itself was rather simple; from my initial testing all seems well with the world. I removed the concept of a local session in CassandraContext. If ExecuteOperation runs and _session is null, it creates one (I am thinking this might be better to put in the constructor anyway). I added a CloseClient() method to CassandraSession, and in the finally block of CassandraContext::ExecuteOperation I call that method. Depending on the pool type being used, that will result in the connection being physically closed or the connection being released back into the pool. This allows a context that lives in, say, a static variable or a long-running loop to not hold on to the connection for the lifetime of the context.
OK. I see your point of view. I'll make some changes to keep in line with it.
I would need actual unit tests showing this vs the current system. I tried

> Well, the session is still technically local to the context, it's just that
Understood completely. I'll see if I can get the unit tests written tonight.
If we force a ToList() in the Execute method of the classes that inherit from QueryableColumnFamilyOperation, that will force eager loading instead of the expected lazy loading. The lazy loading causes us to end up in the situation where GetEnumerator() is deferred and CassandraContext::ExecuteOperation() cannot dispose the session properly, because its _connection field was still null on the first pass and CassandraSession::GetClient() is called later, with no way to call CassandraSession::Dispose() again. I've gone through it for a couple of hours now and all my tests are showing expected results. From what I see we would need to change 6 classes, e.g.:

```csharp
public override IEnumerable<FluentColumnFamily> Execute()
{
    return GetFamilies(ColumnFamily).ToList();
}
```

IMHO this would be the best route, as functionality doesn't change and we have business as normal. If we don't go that route, we need to explicitly call a CloseClient/Dispose method in those methods, and that would mean the session has effectively been disposed twice, which feels bad to me.
https://github.com/eplowe/fluentcassandra/tree/connreleaseissue -- I've pushed a new set of changes up. In my tests connections were properly released to the pool, unit tests still ran, and the local session is still a local session. I've also added a little tweak to memoize the enumerator on the first run and then use a cached version, so if you were doing something like calling Any() to see if anything existed and then enumerating afterwards, it won't make another call out to Cassandra. Let me know what you guys think.
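The memoization tweak can be sketched generically (a simplified, hypothetical helper, not the code in the branch): cache items as they are pulled from the source, so a second enumeration replays the cache instead of re-running the query:

```csharp
using System.Collections;
using System.Collections.Generic;

// Simplified sketch (not thread-safe): caches items as they are pulled so
// enumerating twice (e.g. Any() followed by a foreach) only runs the
// underlying query once.
class MemoizedEnumerable<T> : IEnumerable<T>
{
    private readonly IEnumerator<T> _source;
    private readonly List<T> _cache = new List<T>();
    private bool _exhausted;

    public MemoizedEnumerable(IEnumerable<T> source)
    {
        _source = source.GetEnumerator();
    }

    public IEnumerator<T> GetEnumerator()
    {
        var i = 0;
        while (true)
        {
            // Replay anything already cached before pulling from the source.
            if (i < _cache.Count) { yield return _cache[i++]; continue; }
            if (_exhausted || !_source.MoveNext()) { _exhausted = true; yield break; }
            _cache.Add(_source.Current);
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

Wrapping an operation's result in something like this keeps lazy semantics for the first pass while guaranteeing that later passes never touch the connection again.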
@eplowe I listed an example fix in issue #82 that could be done by adding the following to all execute operations: `var temp = output.GetEnumerator();` How is this different? I think ideally we need a way to keep lazy loading and still properly return connections to the connection pool.
@ilushka85 A lot. The way you're doing it is simply trying to fix the

It also allows you to do some interesting things, e.g. caching the iterator

Now, we could implement a CloseClient() in CassandraSession, similar to
@eplowe I agree that your method has its benefits; I previously called mine just a hack to get things to work. But I still feel that both yours and mine have one key flaw, and that's the requirement of calling GetEnumerator when it's unintended. I think that to preserve the original and intended functionality, a call should not execute on the server unless an operation is done to cause it. Calling ToList() means we are forcing it.
I agree, but I also tend to believe that if you're requesting it, you want it.
Does anyone else want to chime in on this? Nick?
I have wrestled with this myself, and since these are RPC commands where the data is either returned or it isn't, the commands really shouldn't be lazy. If anything should be lazy, it should be the fluent objects such as Column, Record, or Family. I personally think all "yield return" should be removed from the RPC-based commands. And for the case of the upcoming binary CQL access, they should implement an async model. Let's leave the lazy loading to the fluent objects where necessary.
So based on what you said, that's what I am doing: converting all the RPC commands away from `yield return`.
@eplowe sounds good
@nberardi when do you plan to merge this into the main fork?
It was merged.
I didn't do a pull request for that branch. I can send one if you want.
I need one. It is a pain to do it without one. Plus, a pull request starts a discussion.
@eplowe any luck with sending the pull request? I am sending the current changes to NuGet on Dec 1.
We will need to see if pull request #90 fixes this issue. I would love to close this ticket.
Based on my testing everything seemed kosher. Then again, I just started digging my nails into the code base recently, so you might find a use case I didn't take into account. Btw, I am on the plane to Vegas for the AWS re:Invent conf. You going?
I wish, but no. :)
Closing because everything seems stable. Thanks for the work everybody.
We are still encountering issues where Cassandra is dropping / closing / timing out connections; after this happens many times, the connection pool will never return a connection despite the cluster being functional.
Some of the errors you listed will blacklist a server. The default recovery time is 30 seconds. If you get enough of those in that window, you could exhaust your pool. I would look into why Cassandra is dropping / closing / timing out connections at such a constant rate as to cause that to happen. You may need to tune your cluster/servers, scale out, etc.
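The blacklist mechanics being described can be sketched generically (hypothetical names, not the actual FluentCassandra implementation; the 30-second window matches the default mentioned above): a server that errors is removed from rotation and only becomes usable again after the recovery window, so a sustained burst of failures can leave nothing to hand out:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of a server blacklist with a recovery window.
class ServerBlacklist
{
    private readonly Dictionary<string, DateTime> _blacklisted = new Dictionary<string, DateTime>();
    private readonly TimeSpan _recovery;

    public ServerBlacklist(TimeSpan recovery)
    {
        _recovery = recovery;
    }

    // Record the moment a server produced a dropped/closed/timed-out connection.
    public void ReportFailure(string server, DateTime now) => _blacklisted[server] = now;

    // A blacklisted server becomes usable again only once the recovery
    // window has elapsed since its last reported failure.
    public IEnumerable<string> Usable(IEnumerable<string> servers, DateTime now) =>
        servers.Where(s => !_blacklisted.TryGetValue(s, out var t) || now - t >= _recovery);
}
```

If failures arrive faster than servers recover, `Usable` stays empty and every pool acquisition times out, which is the symptom being described. If the recovery check never runs at all, the list stays empty forever.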
@eplowe What happens is that a timeout occurs on one secondary index (we are trying to figure out why), and after this it corrects itself, but by the time it does, FluentCassandra will never grant a connection... it's like the servers never come off the blacklist. Basically, after we have the initial set of timeouts, the only errors we get are timeouts waiting for a connection from the pool.
I would step through it. Like I said, the default recovery timer is 30 seconds.
@eplowe I will try this, but it will be very hard to reproduce in our test environment. It appears the recovery method never runs, since we can leave the application running for 24 hours and the only errors that happen are that it can't get a connection from the pool. Restarting the app at any point then makes it work fine.
Are you running the most recent release of fluentcassandra?
Description: in PooledConnectionProvider.cs _usedConnections is being filled up (100) and not being released.
Usage (my code):
in MultiGetColumnFamilySlice.Execute() (comments added)