Support for CQL collections data types #182

Closed
mohammedguller opened this Issue Jan 15, 2013 · 41 comments

6 participants

@mohammedguller

CQL v3 in Cassandra v1.2 allows collections such as Map, Set, and List as data types. Are you planning to support CQL collections in Astyanax?

Thanks,
Mohammed

@elandau
Netflix, Inc. member

I hope to get to that in the next month. I will need to branch for 1.2 support since the thrift API changed.

@mohammedguller

Great. It seems 1.2 (final) has some nice features, so it would be good to have an Astyanax version supporting Cassandra 1.2.

Thanks,
Mohammed

@mohammedguller

Hi Eran - I was wondering whether you got a chance to work on adding support for CQL collection data types?

Thanks,
Mohammed

@elandau
Netflix, Inc. member

Astyanax has been updated for cassandra 1.2. You should be able to use all the functionality of CQL3 now. I'm also trying to see if I can support this in thrift as well.

@mohammedguller

That is very cool! Thank you very much.

@vermest

How can we actually use collections with it?
I would expect something like this:
keyspace.prepareQuery(MY_CF).withCql("INSERT INTO my_cf (id, tags) VALUES (?, ?)")
    .asPreparedStatement().withStringValue(id).withSetValue(myTags).execute();

where tags is a Set.

Or is the only way writeByteBufferValue with a custom serializer?

@elandau
Netflix, Inc. member

Sorry, I still haven't written the serializer for this.

@vermest

Thanks, then it is the byteBufferSerializer for now..

@vermest

Btw, just in case someone else bumps into this... for me this did the trick
package com.seven.oc.pcf.db.astyanax;

import java.nio.ByteBuffer;
import java.util.Set;

import org.apache.cassandra.db.marshal.SetType;
import org.apache.cassandra.db.marshal.UTF8Type;

import com.netflix.astyanax.serializers.AbstractSerializer;

public class SetSerializer extends AbstractSerializer<Set<String>> {

    // Cassandra's own marshaller for set<text>
    SetType<String> myset = SetType.getInstance(UTF8Type.instance);

    @Override
    public Set<String> fromByteBuffer(ByteBuffer arg0) {
        return myset.compose(arg0);
    }

    @Override
    public ByteBuffer toByteBuffer(Set<String> arg0) {
        return myset.decompose(arg0);
    }

}

@elandau
Netflix, Inc. member

I was actually thinking of changing all the serializers to use the internal cassandra db.marshal classes. Would you be able to make this more generic to support different element types and submit a pull request? Also, can you add support for Map and List.
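
As a rough, JDK-only sketch of what "more generic" could look like: the set codec below takes the element codec as a constructor argument, so the same class handles any element type. Everything here is hypothetical (`ElementCodec`, `Utf8Codec`, `GenericSetCodec`), and the byte layout is an illustrative assumption, not Cassandra's wire format. In Astyanax itself the element handling would presumably be delegated to the `db.marshal` types, as in the snippet above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical per-element codec; stands in for a per-type marshaller.
interface ElementCodec<T> {
    ByteBuffer toBytes(T value);
    T fromBytes(ByteBuffer bytes);
}

// Hypothetical UTF-8 string codec.
final class Utf8Codec implements ElementCodec<String> {
    public ByteBuffer toBytes(String value) {
        return ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8));
    }
    public String fromBytes(ByteBuffer bytes) {
        return StandardCharsets.UTF_8.decode(bytes).toString();
    }
}

// Set codec parameterized by its element codec.
// Assumed layout: int element count, then (int length, payload) per element.
final class GenericSetCodec<T> {
    private final ElementCodec<T> elements;

    GenericSetCodec(ElementCodec<T> elements) {
        this.elements = elements;
    }

    ByteBuffer toBytes(Set<T> set) {
        List<ByteBuffer> parts = new ArrayList<>();
        int size = 4; // room for the element count
        for (T item : set) {
            ByteBuffer part = elements.toBytes(item);
            parts.add(part);
            size += 4 + part.remaining();
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        out.putInt(set.size());
        for (ByteBuffer part : parts) {
            out.putInt(part.remaining());
            out.put(part);
        }
        out.flip();
        return out;
    }

    Set<T> fromBytes(ByteBuffer in) {
        ByteBuffer buf = in.duplicate(); // leave the caller's position untouched
        int count = buf.getInt();
        Set<T> result = new LinkedHashSet<>();
        for (int i = 0; i < count; i++) {
            int length = buf.getInt();
            ByteBuffer slice = buf.slice();
            slice.limit(length);
            result.add(elements.fromBytes(slice));
            buf.position(buf.position() + length);
        }
        return result;
    }
}
```

The same pattern would extend to lists (keep insertion order) and maps (two element codecs, one for keys and one for values).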

@vermest

Yes, I can do this. Most probably beginning of next week.

@vermest

Done the pull request on Friday.

@mohammedguller

Which version includes this pull?

BTW, the version on Maven is still from a week ago (1.56.26).

@elandau
Netflix, Inc. member

Sorry, part of this included splitting the project into multiple sub projects. I realize now that I need to create a -bundle for this. For now please include the individual sub projects separately. I updated the wiki to reflect this.
https://github.com/Netflix/astyanax/wiki/Getting-Started

@mohammedguller

Thanks for updating the wiki.

Can you confirm whether 1.56.29 includes the pull request from vermest?

@elandau
Netflix, Inc. member

Yep, it's in.

@mohammedguller

I got 1.56.29 working with 1.2.2. One thing that I am unable to figure out is how to read a column of type Set, Map or List using Astyanax. It would be great if you could update the wiki with sample code for reading collection type columns. Thanks.

@mohammedguller

Vermest - how are you using the serializers that you wrote? Can you post sample code to show how one can use a set serializer to read a column of type set? Thanks.

@vermest

For example:
ColumnList<String> columnList = row.getColumns();
Set<String> tags = columnList.getValue("tags",
  new SetSerializer<String>(UTF8Type.instance),
  new HashSet<String>());

It is also in my plans to add getSet... getList, etc methods to the interface when I have a bit more time..

@mohammedguller

Thanks for the quick response.

I tried that yesterday before I reached out to you and was getting a compiler error. My code is in Scala, but I have been able to use Astyanax without any issues so far. Here is the error message thrown by the Scala compiler:

type mismatch; found : com.netflix.astyanax.serializers.SetSerializer[String] required: com.netflix.astyanax.Serializer[Object] Note: String <: Object, but Java-defined trait Serializer is invariant in type T. You may wish to investigate a wildcard type such as _ <: Object. (SLS 3.2.10)

@vermest

OK, I will check with Scala as well.

@mohammedguller

Appreciate your help.

I was able to get the code to compile by using casting, but Scala then throws an exception at run time.

It looks like either the Serializer interface needs to be modified to allow a covariant type or the SetSerializer needs an appropriate get method. Scala's strict type checking would not allow me to use the code otherwise.

I am not a Java expert, but my understanding is that you can make a type covariant using this notation:
Interface Serializer< ? extends T>

@mohammedguller

Another option would be to modify the getValue signature to address the covariance issue.
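
For context, Java has no declaration-site variance, so a wildcard cannot be written on the interface declaration itself; it goes on the parameter type where the interface is used, which is why a signature change is the natural place for it. The sketch below illustrates this with hypothetical stand-ins (`Codec`, `getValue`, `getValueCovariant`); these are not the Astyanax types or signatures.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a serializer; NOT the Astyanax Serializer interface.
interface Codec<T> {
    T decode(String raw);
}

final class VarianceDemo {

    // Invariant use: T must match the codec's type argument exactly.
    static <T> T getValue(String raw, Codec<T> codec, T defaultValue) {
        return raw == null ? defaultValue : codec.decode(raw);
    }

    // Use-site wildcard: any codec producing a subtype of T is accepted.
    static <T> T getValueCovariant(String raw, Codec<? extends T> codec, T defaultValue) {
        return raw == null ? defaultValue : codec.decode(raw);
    }

    public static void main(String[] args) {
        Codec<HashSet<String>> codec = raw -> new HashSet<>(Arrays.asList(raw.split(",")));
        Set<String> fallback = new HashSet<>();

        // getValue("a,b", codec, fallback);
        // The call above does not compile: the codec forces T = HashSet<String>,
        // but the default value is only known to be a Set<String>.

        // With the wildcard, T is inferred as Set<String> and the call compiles.
        Set<String> tags = getValueCovariant("a,b", codec, fallback);
        System.out.println(tags.size());
    }
}
```

Note that in the invariant version both the serializer argument and the default-value argument constrain `T`, so a mismatch in either one surfaces as a type error.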

@mohammedguller

Hi guys - any update? Do you think you will be able to update the Serializer interface or the getValue method so that getValue can be called from Scala for reading a Set, Map or List type?

Alternatively, generic methods such as getSet, getMap, and getList would also be great.

Thanks.

@vermest

Hi, I will look into this beginning of next week.

@rzvoncek

Hello. After struggling to implement my data model using Thrift and Hector, I now managed to get it done using Astyanax quite neatly thanks to this very feature. Therefore thank you so much!

The only issue I have left is that I cannot instantiate the MapSerializer for any pair of K/V types other than UTF8Type/UTF8Type. Is there some reason I am unaware of for this to happen?

@vermest

Hm.. MapSerializer should be pretty generic. Could you post some code, which fails for you?

@rzvoncek

In CQL, my map looks like this: planStep map<int,text>
Accessing the data would look like this:

Map<IntegerType,String> myMap = cols.getValue(
    "planStep", 
    new MapSerializer<IntegerType,String>(IntegerType.instance,UTF8Type.instance),
    new HashMap<IntegerType,String>()
 );

The error I am getting says: The constructor MapSerializer(IntegerType, UTF8Type) is undefined

@elandau
Netflix, Inc. member

Try

Map<Integer, String> instead of Map<IntegerType, String>

@rzvoncek

Map instead of Map? I am not sure what you mean.

@elandau
Netflix, Inc. member

Github hid some stuff in the post. I meant

Map<Integer, String> instead of Map<IntegerType, String>

@elandau
Netflix, Inc. member

Oh, also use Int32Type instead of IntegerType. IntegerType is actually a BigInteger.
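
The distinction matters because the two encodings differ: a CQL int is a fixed 4-byte value (Int32Type), while IntegerType handles the variable-length varint that maps to java.math.BigInteger. A small JDK-only illustration of the size difference follows; the class and method names are made up for this sketch, and no Cassandra or Astyanax classes are involved.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;

// JDK-only illustration; names are hypothetical.
final class IntEncodingDemo {

    // Fixed-width encoding: always 4 bytes, big-endian (the shape of a CQL int).
    static byte[] fixedWidthInt(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Variable-width two's-complement encoding, as produced by BigInteger
    // (the shape of a CQL varint).
    static byte[] variableWidthInt(BigInteger value) {
        return value.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(fixedWidthInt(1).length);                 // prints 4
        System.out.println(variableWidthInt(BigInteger.ONE).length); // prints 1
    }
}
```

Reading a 4-byte int column through a BigInteger-based type (or the reverse) therefore means the Java type and the column type no longer line up.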

@rzvoncek

Yes, this solved the issue. Thank you once more.

@mohammedguller

Hi guys - sorry to bug you, but I was curious whether you got a chance to look into how the getValue method can be made to work with Scala code for collection data types. To refresh your memory, here is my Scala code:

val permissions = columns.getValue("permissions", new SetSerializer[String](UTF8Type.instance), Set[String]()) 

Here is the compiler error that I get:

type mismatch; found : com.netflix.astyanax.serializers.SetSerializer[String] required: com.netflix.astyanax.Serializer[Object] Note: String <: Object, but Java-defined trait Serializer is invariant in type T. You may wish to investigate a wildcard type such as `_ <: Object`. (SLS 3.2.10)

The Scala compiler is giving this error message for the second parameter passed to getValue.

Please let me know if there is anything I can do to help resolve this issue. Thanks.

@mohammedguller

For some reason, github removed the square brackets around the type argument String passed to SetSerializer.
It should read:

val permissions = columns.getValue("permissions", new SetSerializer[ String ] (UTF8Type.instance), Set[ String ]())

@shyamalan was assigned Mar 13, 2013

@vermest

Strange, this seems to work for me with cassandra 1.2.2:

var GET_CHANGES = "SELECT * FROM policy where id=? and version='SAVED'";
var cf = ColumnFamily.newColumnFamily("policy", StringSerializer.get(),
    StringSerializer.get());
var result = context.getEntity().prepareQuery(cf).withCql(GET_CHANGES).asPreparedStatement().withUUIDValue(uuid).execute
if (result.getResult().getRows().size() == 1) {
  var res = result.getResult().getRows().getRowByIndex(0).getColumns().getValue("tags", new SetSerializer[String](UTF8Type.instance), new HashSet[String]())
  println(res)
}

tags is set<text>

@mohammedguller

That is indeed strange. I copied and pasted your code to make sure that I was not making some silly typo mistake. I still get the same compiler error for this line:

        val permissions = columns.getValue("permissions", new SetSerializer[String](UTF8Type.instance), new scala.collection.immutable.HashSet[String]())

What version of Scala are you using? I am using Scala 2.10

@vermest

I'm using 2.9.3-20130201-111002-58961c7455 with the eclipse plugin.
I guess the tiny detail that makes the difference is that I'm using java.util.HashSet instead of scala.collection.immutable.HashSet.
With the Scala one it of course gives a compilation error.

@mohammedguller

That was a great catch. Yes, it is not the second parameter, but the third parameter that was causing the compiler error. As soon as I replaced the scala Set with a Java HashSet, the compiler error went away.

Since the rest of my code expects an immutable Scala Set, I had to convert the result returned by getValue to a Scala set, but it is working now.

In case someone else runs into this issue, here is what I did:

  1. import scala.collection.JavaConversions._
  2. permissions.toSet

Thank you very much for your help!

@shyamalan

All CQL collections (set, map, list) are now supported in Astyanax.

@shyamalan closed this Apr 2, 2013

@YangZhong

I've been using the latest Astyanax, 1.56.44. The Column interface has only getValue, and the following document doesn't mention SetSerializer/ListSerializer/MapSerializer. Should I construct a serializer and call getValue?
https://github.com/Netflix/astyanax/wiki/Cql-and-cql3
