fail to update column family definition under certain situation #233

xurenjie opened this Issue Mar 11, 2013 · 4 comments



I am currently using Cassandra 1.1.6 and Astyanax 1.0.6, and I am confused about how updating a column family definition works.

I have a column family 'cfname', and some columns(A, B, C) like below:


I create this column family like this:

ColumnFamilyDefinition cfDef = cluster.makeColumnFamilyDefinition().setKeyspace("keyspaceName").setName(cfName).setComparatorType(ComparatorType.BYTESTYPE.getTypeName());
ColumnDefinition cd = cfDef.makeColumnDefinition();

I only make a column definition for column A, and then I insert the values 'a', 'b', 'c' into columns A, B, C of row key 'One'.

After this, I use IndexQuery to query on A and it successfully returns the result 'One'. Then I use the following code, intending to create an index on column B (setting targetColumnName to B and needToIndexThisColumn to true):

ColumnFamilyDefinition cfDef = keyspace.describeKeyspace().getColumnFamily("cfname");
List<ColumnDefinition> columnDefs = cfDef.getColumnDefinitionList();
for (ColumnDefinition def : columnDefs) {
    if (def.getName().equals(targetColumnName)) {
        ColumnDefinition newColumnDef = cfDef.makeColumnDefinition();
        if (needToIndexThisColumn) {
            // ...
        } else {
            // ...
        }
    }
}

Later, when I try to query on column B using IndexQuery, it returns no result.

What confuses me is this: if I remove the index on column A using the approach above (setting targetColumnName to A and needToIndexThisColumn to false), then immediately index column A again (setting targetColumnName to A and needToIndexThisColumn to true), and then query on A, I do get the result 'One'.

So it seems that if a column was not indexed when the column family was created, I cannot update the column family definition to add an index on it. I checked this by fetching the column family definition right after cluster.updateColumnFamily, and strangely the returned definition contains the right information, showing that an index exists on column B.

I also checked the log output (I am using CassandraDaemon). In the A case (removing and then re-creating the index on A), the log shows something like

'build index of indexNameA complete'

but in the B case there is no such message, which suggests the index was never built.

So I wonder what problem I have run into. Is there something I am missing? How can I successfully create an index on column B by updating the column family, or is there some other way to do this?

Thanks very much. I appreciate your reply.

I tested further. It seems that if I first create the index and then insert the value, the value is indexed correctly (although there is still no log message showing that the index build completed) and I can search for the result successfully. But if I first insert a new column into the column family and then create an index on that column, the query fails.
It seems that the newly created index is not applied to the existing data.
But with cassandra-cli I can do this successfully (first insert the value, then create the index). Is there something I am missing?


elandau commented Mar 11, 2013

First, please be aware that secondary indexes can be very problematic so you should really try to find a way to model your data so that you don't need them. With that said, have you tried to rebuild the index after creating the index for B? Here are instructions for rebuilding an index, http://dev.nuclearrooster.com/2013/01/20/using-nodetool-to-rebuild-secondary-indexes-in-cassandra/. Also, you are using a very old version of astyanax. You may want to update to the latest version.
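For readers who want to script the rebuild mentioned above, here is a minimal sketch that only assembles the nodetool command line. It assumes the `rebuild_index` argument order of keyspace, column family, then a comma-separated index list, as printed by nodetool's help in the Cassandra 1.x line; check `nodetool help` on your own version. The host, the keyspace and column family values, and the index name `cfname.idx_B` are hypothetical placeholders, not values from this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: builds the argument list, it does not run nodetool.
final class RebuildIndexCommand {

    // Assumed shape: nodetool -h <host> rebuild_index <keyspace> <cf> <idx1,idx2>
    static List<String> build(String host, String keyspace,
                              String columnFamily, List<String> indexNames) {
        List<String> cmd = new ArrayList<>();
        cmd.add("nodetool");
        cmd.add("-h");
        cmd.add(host);
        cmd.add("rebuild_index");
        cmd.add(keyspace);
        cmd.add(columnFamily);
        // Multiple indexes are passed as one comma-separated argument.
        cmd.add(String.join(",", indexNames));
        return cmd;
    }

    public static void main(String[] args) {
        // "cfname.idx_B" is a placeholder index name for illustration.
        List<String> cmd = build("localhost", "keyspaceName", "cfname",
                Arrays.asList("cfname.idx_B"));
        System.out.println(String.join(" ", cmd));
    }
}
```

The resulting list can be handed to `ProcessBuilder` if you want to launch it from Java rather than from a shell.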

Thanks for your reply.
Can you point me to some information or articles showing why secondary indexes are problematic?
I have also tried version 1.56.26, and the problem still seems to exist there.
I will try nodetool rebuild_index, but what confuses me is that with cassandra-cli I don't need rebuild_index: I can run a command like get mycf where 'name'='value' right after running update column family mycf with column_metadata=[{column_name:name, validation_class:BytesType, index_type:0, index_name:idx}];. My understanding is that updateColumnFamily does the same thing as adding column_metadata, so why does it not index the existing data? Or is my understanding wrong?

@ghost ghost assigned shyamalan Apr 1, 2013


shyamalan commented Apr 18, 2013

Hi, I verified that this functionality is working as expected. You need to pause briefly after the update call to make sure the indexes are rebuilt properly. Also make sure to use a globally unique name for the index, something like:
__Idx or something to that effect.
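One way to act on both suggestions is sketched below using only the JDK, so it makes no claims about the Astyanax API. `awaitTrue` polls a caller-supplied check (for example, "the IndexQuery on column B returns row 'One'") until it passes or a timeout expires, instead of sleeping for a fixed time. `indexName` shows one hypothetical naming scheme, keyspace_columnFamily_column_idx; it is an assumption for illustration, not the pattern given in the comment above.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

final class IndexReadiness {

    // Hypothetical scheme for a cluster-wide unique index name.
    static String indexName(String keyspace, String columnFamily, String column) {
        return keyspace + "_" + columnFamily + "_" + column + "_idx";
    }

    // Polls `check` until it returns true or the timeout elapses.
    // Returns true if the check passed, false on timeout or interruption.
    static boolean awaitTrue(BooleanSupplier check, long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (check.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for a real index query: succeeds on the third poll.
        int[] polls = {0};
        boolean ready = awaitTrue(() -> ++polls[0] >= 3, 5_000, 10);
        System.out.println(indexName("keyspaceName", "cfname", "B") + " ready=" + ready);
    }
}
```

In real use the lambda would run the index query after the update call and report whether the expected row came back; the timeout and interval values here are arbitrary.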

@shyamalan shyamalan closed this Apr 18, 2013
