… especially on large chunks
Hi @SalomonBrys thx for the PR, I'll take a look & get back to you soonish!
Hi @SalomonBrys
I merged into Cheers
Hi @nhachicha I do agree with Closing the iterator when reaching the end of the collection, but what is the purpose of line 53 since this is exactly what close() does? I've thought a bit about I think that, 90% of the time, One could argue that a for-each with
    KeyIterator it = db.findKeysIterator("prefix");
    while (it.hasNext()) {
        String[] keys = it.next(BATCH_SIZE);
        for (String key : keys) {
            /*...*/
        }
    }
    it.close();
For now, I haven't found a usage of The loop-on-batch algorithm written above could even be integrated on the library with What do you think?
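To make the idea of folding the loop-on-batch pattern into the library concrete, here is a minimal, self-contained sketch. `FakeKeyIterator`, `BatchIterable` and `BatchDemo` are hypothetical names invented for this illustration; they are an in-memory stand-in and not SnappyDB's actual classes. The wrapper lets a for-each yield one batch per step and closes the underlying iterator once it is exhausted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical in-memory stand-in for a native key iterator (assumption, not the real class).
class FakeKeyIterator implements AutoCloseable {
    private final String[] keys;
    private int pos = 0;
    boolean closed = false;

    FakeKeyIterator(String... keys) { this.keys = keys; }

    boolean hasNext() { return !closed && pos < keys.length; }

    // Returns up to max keys and advances the cursor.
    String[] next(int max) {
        if (!hasNext()) throw new NoSuchElementException();
        int end = Math.min(pos + max, keys.length);
        String[] batch = Arrays.copyOfRange(keys, pos, end);
        pos = end;
        return batch;
    }

    @Override public void close() { closed = true; }
}

// Wraps the iterator so that a for-each yields batches and closes it at the end.
class BatchIterable implements Iterable<String[]> {
    private final FakeKeyIterator it;
    private final int batchSize;

    BatchIterable(FakeKeyIterator it, int batchSize) {
        if (batchSize <= 0) throw new IllegalArgumentException("batchSize must be > 0");
        this.it = it;
        this.batchSize = batchSize;
    }

    @Override public Iterator<String[]> iterator() {
        return new Iterator<String[]>() {
            @Override public boolean hasNext() {
                if (it.hasNext()) return true;
                it.close(); // exhausted: release the underlying resource
                return false;
            }
            @Override public String[] next() {
                if (!hasNext()) throw new NoSuchElementException();
                return it.next(batchSize);
            }
        };
    }
}

class BatchDemo {
    // Iterates by batch and records the size of each batch that was seen.
    static List<Integer> batchSizes(FakeKeyIterator it, int batchSize) {
        List<Integer> sizes = new ArrayList<>();
        for (String[] batch : new BatchIterable(it, batchSize)) {
            sizes.add(batch.length);
        }
        return sizes;
    }
}
```

In this sketch the iterator is only closed when the loop runs to the end; a caller that breaks out early would still have to call `close()` explicitly.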
Hey @SalomonBrys line 53 is useless indeed, good catch :) You raised a good point in your analysis
Average Time complexity is ~ Bottom line:
IMO the size of the key set drives the decision to use one approach over the other. Hence, we should give this choice to the user by highlighting the pros and cons of each approach via javadoc. WDYT?
I agree with everything you said, except that, even on a large key set, we should not allow iterating over all keys one by one. Such a case would effectively mean that M == N and therefore we would encourage the worst case scenario by allowing the very simple I think we should "force" the user to reduce M by having to handle keys by batch, iterating on key batches and not keys:
Ok, I like this pattern where only
OK, I'll re-fork the project and work on this feature in devel ;)
Hi,
this pull request adds three features that I need to use SnappyDB in my future app.
My future app will manage chat objects, potentially tens of thousands of them.
Count
Counting currently means loading the data and counting the number of entries. The problem I have with findKeys("prefix").length is that all keys are loaded into the JVM as managed Strings that I will not use. For small collections, this is not an issue, but for very big collections (which is what I plan to use SnappyDB for), counting in C++ and not having the JVM manage unused key strings is, I think, important. Hence, this request adds countKeys and countKeysBetween.
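As a rough illustration of the difference between materializing keys and only counting them, here is a small self-contained sketch. `KeyStoreSketch` is a hypothetical in-memory stand-in built on a `TreeSet`, not the native implementation from this pull request, and the inclusive bounds of `countKeysBetween` are an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for a sorted key store (assumption, not the real API).
class KeyStoreSketch {
    private final NavigableSet<String> keys = new TreeSet<>();

    void put(String key) { keys.add(key); }

    // Materializes every matching key into an array before the caller can count them.
    String[] findKeys(String prefix) {
        List<String> out = new ArrayList<>();
        for (String k : keys.tailSet(prefix, true)) {
            if (!k.startsWith(prefix)) break; // keys are sorted, so matches are contiguous
            out.add(k);
        }
        return out.toArray(new String[0]);
    }

    // Walks the same range but only increments a counter; no result array is built.
    int countKeys(String prefix) {
        int n = 0;
        for (String k : keys.tailSet(prefix, true)) {
            if (!k.startsWith(prefix)) break;
            n++;
        }
        return n;
    }

    // Counts keys in [start, end]; inclusive bounds are an assumption of this sketch.
    int countKeysBetween(String start, String end) {
        if (start.compareTo(end) > 0) return 0;
        return keys.subSet(start, true, end, true).size();
    }
}
```

Both approaches walk the same key range; the counting variants simply never hand a batch of strings back to the caller.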
Offset & limit
@tieubao asked for this in #21. It allows paging (although iterators are better suited to that) and gives direct access to a limited view of the key collection.
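A minimal sketch of what offset and limit mean for a prefix search, assuming a sorted in-memory key set. `PagingSketch` and its `findKeys` signature are hypothetical and written only for this illustration; they are not the methods added by the pull request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical helper illustrating offset/limit semantics (assumption, not the real API).
class PagingSketch {
    // Returns at most `limit` keys starting with `prefix`, skipping the first `offset` matches.
    static String[] findKeys(NavigableSet<String> keys, String prefix, int offset, int limit) {
        if (offset < 0 || limit < 0) {
            throw new IllegalArgumentException("offset and limit must be >= 0");
        }
        List<String> page = new ArrayList<>();
        int seen = 0;
        for (String k : keys.tailSet(prefix, true)) {
            // Stop once we leave the prefix range or the page is full.
            if (!k.startsWith(prefix) || page.size() >= limit) break;
            if (seen++ >= offset) page.add(k);
        }
        return page.toArray(new String[0]);
    }

    // Builds a small sorted key set for demonstration purposes.
    static NavigableSet<String> sampleKeys() {
        NavigableSet<String> keys = new TreeSet<>();
        for (int i = 1; i <= 5; i++) keys.add("k:" + i);
        keys.add("other");
        return keys;
    }
}
```

In this sketch every page request walks the range again from the first match, so the cost of a page grows with its offset. An iterator that remembers its position avoids that repeated walk.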
Iterators
This is the most important feature. It allows accessing data one by one or by batch while keeping a "pointer" to the position inside the collection, so that we can continue browsing later. It is very useful for a huge collection because:
I have documented and tested these three features.
I have also tried to respect your code style, both in C++ and in Java.
Let me know what you think ;)
Salomon