All subsequent Cassandra inserts fail after one insert fails due to an UnavailableException #127

Closed
tanayamitshah opened this Issue Mar 29, 2013 · 2 comments


This is the first time I am submitting an issue, so my explanation may seem convoluted; I apologize beforehand. Feel free to edit this description.

There is an error in the Cassandra driver that causes all inserts to fail after one insert fails with an UnavailableException (the exception thrown when not enough nodes are up to serve a write request at a consistency level other than ANY). Please refer to the source file: cassandra/src/main/java/com/yahoo/ycsb/db/CassandraClient10.java

The problem is that the Cassandra client stores all the mutations that it needs to perform on the columns of one key as one operation in the List<Mutation> mutations, Map<String, List<Mutation>> mutationMap and Map<ByteBuffer, Map<String, List<Mutation>>> record member variables (Lines 89-91).

After an insert is performed, these variables have to be cleared (Lines 479-481). However, if an exception is thrown in the client.batch_mutate(record, writeConsistencyLevel) call (Line 477), these variables don't get cleared, because control jumps directly to the corresponding catch block.

Since these variables are not cleared, all subsequent writes fail as well: although each new insert adds its own writes to the mutation variables, the next batch_mutate still tries to perform the old failed mutation again and again, and that mutation is not going to succeed because not enough nodes are up.
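The failure mode described above can be reproduced in miniature. The following is only a sketch under simplifying assumptions, not the YCSB code: FakeCluster and StickyBufferClient are hypothetical names, a plain list of keys stands in for the three Thrift containers, and IllegalStateException stands in for UnavailableException.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the Thrift client: a batch fails as a whole
// if any key in it cannot be served (modelling UnavailableException).
class FakeCluster {
    final Set<String> unavailableKeys = new HashSet<>();
    final List<String> stored = new ArrayList<>();

    void batchMutate(List<String> batch) {
        for (String key : batch) {
            if (unavailableKeys.contains(key)) {
                throw new IllegalStateException("unavailable: " + key);
            }
        }
        stored.addAll(batch);
    }
}

// Mirrors the reported pattern: the buffer is a member variable and is
// cleared only on the success path.
class StickyBufferClient {
    private final List<String> mutations = new ArrayList<>();
    private final FakeCluster cluster;

    StickyBufferClient(FakeCluster cluster) {
        this.cluster = cluster;
    }

    boolean insert(String key) {
        mutations.add(key);
        try {
            cluster.batchMutate(mutations);
            mutations.clear(); // never reached when batchMutate throws
            return true;
        } catch (IllegalStateException e) {
            return false; // the stale entry stays in the buffer
        }
    }

    int pending() {
        return mutations.size();
    }
}
```

In this model, once an insert for an unavailable key fails, an insert for a perfectly healthy key fails too, because the stale key is still in the buffer and gets resent with every later batch.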

I'm not having any problems running the CassandraClient10. I did have one issue (though I can't remember if it was exactly the same): my C* was running on a different port, so I patched the class file with my port number and updated the jar. It then worked fine for me in two different environments.

Collaborator
busbey commented Jan 31, 2016

I confirmed this is still present in at least the CassandraClient10. The saved containers should either be cleared in a finally block or at the start of the loop, or be removed entirely.
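The first two options could look roughly like the sketch below. It makes the same simplifying assumptions as the description of the bug and is not the actual patch: StubCluster and ClearingClient are hypothetical names, a list of keys stands in for the cached containers, and IllegalStateException stands in for UnavailableException.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the Thrift client: a batch fails as a whole
// if any key in it cannot be served.
class StubCluster {
    final Set<String> unavailableKeys = new HashSet<>();
    final List<String> stored = new ArrayList<>();

    void batchMutate(List<String> batch) {
        for (String key : batch) {
            if (unavailableKeys.contains(key)) {
                throw new IllegalStateException("unavailable: " + key);
            }
        }
        stored.addAll(batch);
    }
}

class ClearingClient {
    private final List<String> mutations = new ArrayList<>();
    private final StubCluster cluster;

    ClearingClient(StubCluster cluster) {
        this.cluster = cluster;
    }

    // Option 1: clear the cached container before using it.
    boolean insertClearFirst(String key) {
        mutations.clear();
        mutations.add(key);
        try {
            cluster.batchMutate(mutations);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    // Option 2: clear in a finally block, which runs on success and failure.
    boolean insertClearFinally(String key) {
        mutations.add(key);
        try {
            cluster.batchMutate(mutations);
            return true;
        } catch (IllegalStateException e) {
            return false;
        } finally {
            mutations.clear();
        }
    }
}
```

With either variant, a failed insert no longer poisons later ones: only the insert whose own key is unavailable reports an error.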

The issue isn't present in the CQL based clients we're moving folks to.

@busbey busbey added a commit to busbey/YCSB that referenced this issue Mar 26, 2016
@busbey busbey [cassandra] properly clear cached containers.
clear containers before we use them rather than after, so that in the case of
exceptions we can recover.

closes #127
63956fb
@busbey busbey closed this in #673 Mar 30, 2016
@omerzilb omerzilb added a commit to omerzilb/YCSB that referenced this issue Apr 5, 2016
@busbey @omerzilb busbey + omerzilb [cassandra] properly clear cached containers.
clear containers before we use them rather than after, so that in the case of
exceptions we can recover.

closes #127
a7b69ee