edanuff/CassandraCompositeType
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
CompositeType for Cassandra For a more detailed description of how this might be used, please read: http://www.slideshare.net/edanuff/indexing-in-cassandra http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html For an example implementation of using it for indexing, see: https://github.com/edanuff/CassandraIndexedCollections ------------------------------------------------------------------------------- Note: Be aware that most of the capabilities provided by original CompositeType comparer have been incorporated in the proposed patch described at https://issues.apache.org/jira/browse/CASSANDRA-2231 and are now incorporated into Cassandra 0.8.1. A version of new CompositeType and DynamicCompositeType that has been backported to 7.x is available at: https://github.com/riptano/hector-composite The index-building techniques described here will still be applicable though and should be used in place of SuperColumn-based approaches and, once support for composite-based indexes are incorporated into the major client libraries, will often be preferable to the built-in secondary indexes as well. ------------------------------------------------------------------------------- Although Cassandra provides SuperColumns which allow you to have columns containing columns, it's often desirable to use regular columns and to be able to combine two or more ids into a sortable column name. This is especially important for building inverted indexes for searches or being able to do things like creating a table to map user names sorted by last name then first name to user ids. This can accomplished by concatenating two ids together with a separator, such as "last.first.randomuuid" or "item.version", and assuming that each id is properly passed and compares correctly via bytewise comparison, this approach might work. But, if the ids are numerical or time-based UUIDs for example, the results are usually less than satisfactory. CompositeType is a comparer that allows you to combine the existing Cassandra types into a composite type that will then be compared correctly for each of the component types. To use this, you must specify the comparer in your cassandra.yaml file: column_families: - name: Stuff compare_with: compositecomparer.CompositeType To construct a composite name for a new column, use the following: Composite c = new Composite(); c.addUTF8("smith").addUTF8("bob").addLong(System.currentTimeMillis()); ByteBuffer column_name = c.serializeToByteBuffer(); A convenience method is provided as well, although it makes certain assumptions that you might want to verify are applicable. You use it like this: import static compositecomparer.Composite.serialize; byte[] cname = serialize("smith", "bob", new Long(System.currentTimeMillis())); If you wanted to find all users with the last name "smith" whose name started with "b", you could do the following: byte[] slice_start = serialize("smith", "b"); byte[] slice_end = serialize("smith", "b\uFFFF"); This has also been updated to work with ByteBuffers for Cassandra 7.0: import static compositecomparer.Composite.serializeToByteBuffer; ByteBuffer cname = serializeToByteBuffer("document", version); The composite type is encoded as a byte array consisting of a four byte prefix containing an identifier and a version number, followed by each component part. Each component part starts with 1 byte to specify the component type, and then for variable length types such as ASCII strings, 2 bytes are used for the length of the string.
About
CompositeType for composite column names in Cassandra
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published