Dealing with Predicate #23

JordanP opened this Issue Mar 14, 2010 · 9 comments


None yet
2 participants

JordanP commented Mar 14, 2010

At the moment Prandra lacks a correct implementation of Predicate. You can do less with Pandra than with the thrift API.

In Pandra, 3 methods are dealing with cassandra_SlicePredicate :
Core::getRangeKeys, Core::getCFSliceMulti and Core::getCFSlice
In particular, for the last two, you can't use SliceRange Predicate.

IMHO, these 3 methods should not create a cassandra_SlicePredicate on their own, but instead receive it as a parameter. But that will break compatibility with previous version of Pandra, as the number and type of arguments would change.

So I am going to propose, 3 small modifications to these 3 methods, and keep the compatibility.

JordanP commented Mar 14, 2010

Just a thing : in Core::getRangeKeys, one can define a SlicePredicate based on column names() AND SliceRange. It should'nt be so because Thrift API says that "If column_names is specified, slice_range is ignored.".

So in these method, parameters $columnNames and $rangeOptions, should be "fusionned" in one argument, called for instance $slicePredicateProperty (note the singular of Property, to emphasis that $columnNames and $rangeOptions are mutually exclusive)


mjpearson commented Mar 14, 2010

Thanks, that's all good advice I ran into that problem today trying to build an OQL (stubbed in Query.class.php) and it definitely looks to need splitting out. It's much better to get these ironed out early with a small user base rather than later.

btw. if you feel like forking please go ahead and I'd be happy to merge back. I'll resolve what you've mentioned in the next day or two.

JordanP commented Mar 14, 2010

I am just knew to GitHub (and git btw) so I am not very confident with forking. Plus, at the moment, as the user base is quite small it's better if we can talk all together on what to do and how :)

I have just seen you last commit on getCFSliceMulti, where you change the propotype to add SliceRange. I am not sure you're going in the right direction by adding new arguments to this method. Are you planning to add 2 other arguments to support reverse and count property of cassandra_SliceRange ? I mean 8 arguments is already too much.

Here is a patch for Core::getCFSliceMulti and Core::getCFSlice. IMHO these implementations are more "modular" than your, but one have to remember the structure of argument $predicateProperty (you will see in the code). The patch keeps backward compatibility (before your very last commit).

But, in the end, I still think that Core::getCFSliceMulti and Core::getCFSlice have to receive a SlicePredicate as a parameter. There should be a kind of "factory" in Pandra to build a Predicate based on either columns name or SliceRange.

Core::getRangeKeys has to be rewritten as get_range_slice is deprecated in 0.6.

I dont meant to be very "critic" (dont know how to say that in good English), you have done a great work with Pandra.

I have understood you are currently aiming to build an OQL, it could ease the use of Cassandra. But don't forget those who really need a small overhead with Pandra and search for high performance : try to build a OQL as smart as possible (easier to say than to do :)) with the minimum of query possible and do not force your user to use the OQL if they happy with the current methods in Pandra.

It was just my though, keep the good work !

URL for the patch :


mjpearson commented Mar 14, 2010

Hey JordanP, thankyou that's great. Funnily enough the commit I made was a few minutes before reading your earlier message ;) The arguments to those methods are fine to be replaced with a slicePredicate object, in terms of impact it will only need a little refactoring in the ColumnContainer children and perhaps anyone who is calling core directly - which I can't imagine they are because it's improperly implemented as you said. Thanks for taking the time to work through the code base, it's appreciated.

JordanP commented Mar 15, 2010

I finally end up with a class that wraps up cassandra_SlicePredicate class. I called it PandraSlicePredicate :). It is in charge of building a SlicePredicate object where either column_names attribute OR slice_range attribute is defined. This highlights the fact that one can defines two "types" of SlicePredicate based either on column names OR (exclusive or) a SliceRange object.

Here's the class :

Then a modified (forget about the old one) my previous patch, so that PandraCore::getCFSlice and PandraCore::getCFSliceMulti now receive a PandraSliceAttribute as a parameter. This patch should be backward compatible.

The patch :


mjpearson commented Mar 16, 2010

Hey Jordan, thanks a lot :) I'll merge in later today/tonight but have a slight modification - i'm not sure it's necessary to maintain backwards compatability as there's only been 1 tagged stable. Type hinting the predicate to cassandra_SlicePredicate should assure proper usage, and means not having to support two vectors for that argument.

Also in those methods I'm thinking it might be better to reduce the explicit column family/supercolumn arguments to a single cassandra_ColumnParent type. ColumnContainer can then just extend this base type and pass a self reference through to Core.

so, for those Core methods, something like :

static public function getCFSliceMulti($keySpace, array $keyIDs, cassandra_ColumnParent $columnParent, cassandra_SlicePredicate $predicate, $consistencyLevel = NULL) { ...

reduced to 5 arguments. Hopefully it doesn't mess too badly with what you're doing?

JordanP commented Mar 16, 2010

I have taken into account your concideration regarding cassandra_ColumnParent.

The patch :
And again, PandraSlicePredicate with few modifications :

I hope it will do :)


mjpearson commented Mar 17, 2010

Hey :) thanks that's been added and patches merged with a few cleanups. I moved 'Columns' and 'SlicePredicate' keywords into constants to standardise and help code autocompletion. Also changed the constructor a little, the default " protected $_slicePredicateType = 'SliceRange';
" was breaking the 'set' call as it had already been defined.

You should really fork so I can merge back on your user if you intend on doing more work :p I'm tracking bugs through Pivotal Tracker, are you on this? There's also a pandra-dev google group if you're interested in open discussion.

(thanks again, nice one)


JordanP commented Mar 17, 2010

You should really fork so I can merge back on your user if you intend on doing more work

Yep, I am planning to do more, either adding some unit tests or new methods added in cassandra 0.6. So I will probably fork.

I don't know Pivotal Tracker yet, I will take a look at it. I will subscribe to pandra-dev, it's easier for discussion.

This issue was closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment