Skip to content
This repository has been archived by the owner on Nov 29, 2022. It is now read-only.

feat: add an optional predicate to table iteration #92

Merged
merged 1 commit into from
Jul 13, 2022

Conversation

huitseeker
Copy link
Contributor

As this is liable to overload memory, the present PR introduces an optional Predicate argument (a Fn(&(Key, Value)) -> bool) which allows filtering a subset of the table's items instead of the whole table.

As this is liable to overload memory, the present PR introduces an optional Predicate argument (a `Fn(&(Key, Value)) -> bool`) which allows filtering a subset of the table's items instead of the whole table.
@mystenmark
Copy link
Contributor

Seems fine to me, although I wonder why we don't just have Iter return a stream, so that the caller can turn it into an iterator and use the full panoply of iterator methods on the result. ("I don't have time for that right now" is a fine answer.)

@huitseeker
Copy link
Contributor Author

although I wonder why we don't just have Iter return a stream, so that the caller can turn it into an iterator and use the full panoply of iterator methods on the result.

"Use the panoply of iterator methods" is confusing me: are you calling for a return of an fully-in-memory object that happens to have iterator semantics (in which case I would note HashMap implements IntoIterator)? Or are you instead calling for an Iterator object that would happen to stream values in memory, one by one?

If the later, the answer of why this is difficult (aka "I don't have time") is because:

  1. Iterator is a trait object, so it's probably much easier to return this as a method on rocksdb::DBMap than trait::Map,
  2. we want this to be accessible from an async method (.iter), which by pattern requires the return going through a channel, which requires pretty annoying constraints on the pointer to the iterator (essentially pinned stack allocation),
  3. and then the Iterator itself does not yield asynchronously.

Past those trifling details I suspect your mind is in the right place, though and that we want to turn this API into one that returns a futures::Stream. Want to open an issue for that?

/cc @laura-makdah for why this API may end up deprecated soon-ish, and what it may be replaced by.

@mystenmark
Copy link
Contributor

Its not important to do this right now - what i should have suggested is using a tokio mpsc channel, which can be adapted by the caller into a stream, which works fine async.

Copy link
Contributor

@lanvidr lanvidr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good intermediate solution, thanks!

@huitseeker huitseeker merged commit 7e9b756 into MystenLabs:main Jul 13, 2022
@huitseeker huitseeker deleted the filter_iter branch January 21, 2023 13:54
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants