This repository has been archived by the owner on Aug 4, 2020. It is now read-only.

IndexedColumnFamilyIterator::get_buffer() returns duplicated rows #145

Closed
vladbalmos opened this issue Jan 31, 2014 · 2 comments

@vladbalmos

I'm running the following query using phpcassa v1.0.a.5 on Cassandra 1.0.3:

$handle = new ColumnFamily($conn, 'transactions');
$handle->return_format = ColumnFamily::ARRAY_FORMAT;
$indexExpr = new IndexExpression('userID', 39);
$indexClause = new IndexClause(array($indexExpr), '', 5000);
$rows = $handle->get_indexed_slices($indexClause);

$rows should contain only ~250 results for that specific userID, but instead it contains 5000 records (the count value of the index clause).
It basically duplicates the 250 valid rows until it fills the 5000 limit.
I came to that conclusion by digging into IndexedColumnFamilyIterator and writing $current_buffer to a file during a request,
then checking the primary keys of all those records on a first pass and the serialized footprint of each row on a second pass.
As I said, only ~250 records are unique.
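
A minimal sketch of that kind of two-pass uniqueness check, for anyone who wants to repeat it. The helper count_unique_rows() is hypothetical and not part of phpcassa, and the sample data is a stand-in; it assumes each row is a (row key, columns) pair, as ARRAY_FORMAT is expected to produce.

```php
<?php
// Hypothetical helper, not part of phpcassa: reports how many distinct
// rows a result set really contains.
// Assumption: each row is a [row key, columns] pair.
function count_unique_rows(array $rows): array
{
    $keys = [];        // first pass criterion: distinct row keys
    $footprints = [];  // second pass criterion: distinct serialized rows

    foreach ($rows as $row) {
        $key = $row[0];
        $keys[serialize($key)] = true;
        $footprints[serialize($row)] = true;
    }

    return [
        'total'       => count($rows),
        'unique_keys' => count($keys),
        'unique_rows' => count($footprints),
    ];
}

// Stand-in data: two distinct rows, with the first one repeated.
$rows = [
    ['tx1', [['userID', 39], ['amount', 10]]],
    ['tx2', [['userID', 39], ['amount', 25]]],
    ['tx1', [['userID', 39], ['amount', 10]]],
];

$summary = count_unique_rows($rows);
printf(
    "%d rows, %d unique keys, %d unique rows\n",
    $summary['total'],
    $summary['unique_keys'],
    $summary['unique_rows']
);
// prints "3 rows, 2 unique keys, 2 unique rows"
```

If the result is an iterator rather than an array, it could first be flattened with iterator_to_array($rows, false), so that repeated keys are not collapsed.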

Is there something wrong with my query? Is it a phpcassa or a Cassandra bug?

Thank you very much!

@vladbalmos
Author

Running the same query in cassandra-cli returns the correct number of records:

get transactions where userID = 39

250 Rows Returned
Elapsed time: 2515 msec(s).

@thobbs
Owner

thobbs commented Feb 4, 2014

I can't reproduce this on the latest master, and I think what you're seeing is this: fbdc231.

Upgrading to 1.0.a.6 or 1.1.0 should resolve the problem.

@thobbs thobbs closed this as completed Feb 4, 2014