ordered query API #1195
Gankro added some commits Jul 9, 2015
Gankro added the T-libs label Jul 9, 2015
Gankro self-assigned this Jul 9, 2015
|
Looks good to me. I assume you went with Should there also be Along those lines, we could also provide something like
fn last_entry(&mut self) -> Option<OccupiedEntry<K, V>>;
fn pred_inc_entry<Q: ?Sized>(&mut self, key: &Q) -> Option<OccupiedEntry<K, V>>
    where K: Borrow<Q>, Q: Ord;
...
I've been experimenting with this in my BST library. It's a niche use-case, but allows code to inspect the key and value before deciding whether to remove it. |
bluss
reviewed
Jul 9, 2015
| where `pred(Unbounded)` is max, and `succ(Unbounded)` in min by assuming you're getting the | ||
| predecessor and successor of positive and negative infinity. This RFC does not propose this | ||
| API because it is crazy-pants and would make our users cry. |
bluss
Jul 9, 2015
I think this is a serious alternative.
Bound and .range() have existed for a while, are they not something we want to keep? Can I drag the alternative of using range syntax into this? (Bounded2 = Inclusive | Exclusive) so std::ops::Range<Bounded2> etc could be an alternative.
Gankro
Jul 9, 2015
Author
Contributor
This seems to just shuffle the combinatorics around and make the calling convention more awkward, as far as I can tell. No?
bluss
Jul 9, 2015
I think it's inconsistent if we want to keep using Bound as it is (or even changed) in some places (.range()), and then have these methods not use it.
Gankro
Jul 9, 2015
Author
Contributor
I regard Bound as a necessary evil for range because the combinatorics there seem truly catastrophic (18 iterator methods). That said I've never been super happy with the range design. Someone once suggested a builder pattern to me like:
// unbounded RHS
.range().from(x).into_iter()
// bounded RHS
.range().from(x).to(y).into_iter()
...etc
Might be worth considering that more seriously.
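For readers who want to see the builder idea concretely, here is a minimal sketch layered over today's `BTreeMap::range`. The `RangeBuilder` type and its `from`/`to`/`into_iter` methods are hypothetical names chosen to mirror the snippet above; nothing like this exists in the standard library, and the choice of an inclusive lower bound and an exclusive upper bound is an assumption.

```rust
use std::collections::btree_map::{BTreeMap, Range};
use std::ops::Bound;

/// Hypothetical builder (not a std type): collects bounds, then hands them to `range`.
pub struct RangeBuilder<'a, K, V> {
    map: &'a BTreeMap<K, V>,
    lower: Bound<K>,
    upper: Bound<K>,
}

impl<'a, K: Ord, V> RangeBuilder<'a, K, V> {
    /// Starts with both sides unbounded.
    pub fn new(map: &'a BTreeMap<K, V>) -> Self {
        RangeBuilder { map, lower: Bound::Unbounded, upper: Bound::Unbounded }
    }

    /// Sets an inclusive lower bound (assumption: `from` means ">= key").
    pub fn from(mut self, key: K) -> Self {
        self.lower = Bound::Included(key);
        self
    }

    /// Sets an exclusive upper bound (assumption: `to` means "< key").
    pub fn to(mut self, key: K) -> Self {
        self.upper = Bound::Excluded(key);
        self
    }

    /// Finishes the builder. Like `BTreeMap::range`, this panics if the
    /// lower bound is greater than the upper bound.
    pub fn into_iter(self) -> Range<'a, K, V> {
        self.map.range::<K, _>((self.lower, self.upper))
    }
}
```

With a map holding the keys 1, 3, 5 and 7, `RangeBuilder::new(&map).from(3).into_iter()` would yield 3, 5, 7, and adding `.to(7)` would stop before 7.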
|
@apasel422 Also if your type is actually ordered I actually intended to add I had also concluded that an Entry API was silly since VacantEntry is nonsensical, but I suppose |
|
@Gankro Presumably a dedicated removal method can be (slightly) more efficient than removing through the entry API, due to less bookkeeping. I hate to increase the combinatoric problem even more, but since the map types already have |
|
It's not clear to me that Only OccupiedEntry would have overhead. Constructing an OccupiedEntry is literally running |
|
It might be fine to just have the |
nrc
reviewed
Jul 9, 2015
| * succ_exc | ||
| * first | ||
| * last | ||
|
|
nrc
Jul 9, 2015
Member
Do these names have precedent in other libraries? They seem a bit too succinct to me (although a big plus one to the actual functionality, I've wanted this).
Gankro
Jul 9, 2015
Author
Contributor
Java: higher/lower/ceil/floor
C++: lower_bound/upper_bound (these names are terrible and I explicitly killed them in collections reform)
Everything else I looked at: chaos or doesn't seem to have this precise collection/functionality.
I briefly pondered before/after and next/prev before letting my theory background take over and demand predecessor/successor.
apasel422
Jul 10, 2015
Member
Another potential naming scheme could involve {lt, le, ge, gt}, optionally with a prefix or suffix if we're concerned about conflicting with PartialOrd's methods.
cristicbz
Jul 10, 2015
Some ideas:
before, after, before_eq, after_eq
find_{lt, le, ge, gt}
get_{lt, le, ge, gt}
cristicbz
Jul 10, 2015
Incidentally the lack of genericity over mutability is killing me. Don't know how I'd do it, but there's so much repetition in APIs these days because of it.
Gankro
Jul 19, 2015
Author
Contributor
I do really like that lt/leq/etc is an established naming convention that people can bring into understanding.
apasel422
Jul 19, 2015
Member
leq or le? The former might be easier to grok, but the latter is consistent with PartialOrd and has the minor benefit of having the same number of characters as {lt, gt}.
|
Honestly, I would prefer using the builder suggestion from above. |
|
@benaryorg These are orthogonal API discussions. One is for doing direct queries, one is for iterating ranges. While one can be implemented in terms of the other, this is not necessarily efficient or desirable. |
|
@Gankro So you are planning to build two APIs, one of which might (please) use a builder pattern while the other is cursor-like? Sorry if I do not quite get the idea behind the second API. |
|
This RFC is proposing an API just for answering queries of the form "who is the predecessor/successor/minimum/etc". All it does is return The range API that was being discussed above would produce an Cursors are Yet Another thing that are not currently being proposed here, and that the standard library does not currently have a notion of. Cursors and iterators -- particularly &mut ones -- must be implemented as separate types because they have different semantics. Iterators say you can always call |
|
Okay, I understand now. I'll leave function naming to you as I am the worst at that. |
|
@apasel422 Would you be fine with punting on remove/entry APIs until BTreeMap is rewritten to use parent pointers? I believe they can be added afterwards without an RFC based on "natural API holes" logic. |
|
@Gankro Absolutely. |
|
I've renamed the APIs per discussion. |
Gankro referenced this pull request Jul 20, 2015
Merged: RFC: Add item recovery collection APIs #1194
|
@Gankro you may want to update your original comment. I read |
|
Hm, how about combating the combinatoric explosion with type parameters?
.get_rel::<LE>(&Q) -> Option<(&K, &V)>;
If fn get<Ord=EQ>(&Q) -> Option<(&K, &V)>; |
|
@Kimundi I've talked about something like that with @Gankro in the past: Gankro/collect-rs#120 (comment). |
|
|
Gankro added the final-comment-period label Jul 29, 2015
apasel422
reviewed
Jul 30, 2015
| modulo, but this is a more general problem for the *ordered map* API. There are surely types for | ||
| which a straight-up query will be cheaper than iterator initialization. | ||
|
|
||
| It is also siginificantly more ergonomic/discoverable to have `pred_inc_mut(&K)` over |
apasel422
reviewed
Jul 30, 2015
| min|max: | ||
| ```rust | ||
| fn first(&self) -> Option<(&K, &V)>; |
apasel422
reviewed
Jul 30, 2015
| (min|max)_mut: | ||
| ```rust | ||
| fn first_mut(&mut self) -> Option<(&K, &mut V)>; |
|
(It would be good for the typos that @apasel422 has noticed to be fixed if/before this is merged.) I find the combinatorics here really really bad. It seems a little crazy to add so many methods for what I suspect are relatively niche use-cases. @Gankro, I know you're rabidly against any use of enums in APIs to collapse functionality (i.e. the |
|
Returning an
impl<K, V> Map<K, V> {
    fn get_lt<Q: ?Sized>(&self, key: &Q) -> Option<(&K, &V)>;
    fn lt_entry<Q: ?Sized>(&mut self, key: &Q) -> Option<OccupiedEntry<K, V>>;
    // ... `le`, `ge`, `gt`
}
or
enum Query<T> {
    Min,
    Lt(T),
    Le(T),
    Ge(T),
    Gt(T),
    Max,
}
impl<K, V> Map<K, V> {
    fn query<Q: ?Sized = K>(&self, query: Query<&Q>) -> Option<(&K, &V)>;
    fn query_entry<Q: ?Sized = K>(&mut self, query: Query<&Q>) -> Option<OccupiedEntry<K, V>>;
} |
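To make the enum variant of the proposal easier to evaluate, here is a rough, runnable approximation written against the existing `BTreeMap::range`. It is only a sketch: `query` is a free function rather than a method, it drops the `Borrow<Q>` generality and the entry-returning variant, and nothing here reflects an actual standard library API.

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{Excluded, Included, Unbounded};

/// The selector enum from the comment above, reproduced so the sketch compiles.
pub enum Query<T> {
    Min,
    Lt(T),
    Le(T),
    Ge(T),
    Gt(T),
    Max,
}

/// Hypothetical free function: answers a single ordered query by building
/// a one-sided range and taking its first or last element.
pub fn query<'a, K: Ord, V>(map: &'a BTreeMap<K, V>, q: Query<&K>) -> Option<(&'a K, &'a V)> {
    match q {
        Query::Min => map.iter().next(),
        Query::Max => map.iter().next_back(),
        // Largest key strictly below / at or below `k`.
        Query::Lt(k) => map.range::<K, _>((Unbounded, Excluded(k))).next_back(),
        Query::Le(k) => map.range::<K, _>((Unbounded, Included(k))).next_back(),
        // Smallest key at or above / strictly above `k`.
        Query::Ge(k) => map.range::<K, _>((Included(k), Unbounded)).next(),
        Query::Gt(k) => map.range::<K, _>((Excluded(k), Unbounded)).next(),
    }
}
```

The single entry point keeps the method count down, at the cost of a runtime `match` and an extra name (`Query`) that callers have to import.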
Gankro added some commits Aug 5, 2015
|
I disagree that it's niche -- it's one of the primary reasons to use an ordered map. |
diwic
commented
Aug 5, 2015
|
Is there also use for a "nearest" version? I.e., if the treemap looks for 1000 and can only find 900 and 1010, it will choose 1010 because it is nearest. That seems useful - although maybe that will require some additional trait bound (e.g. Bikeshed wise, I don't know why |
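A "nearest" query can be approximated today by combining a predecessor lookup with a successor lookup. The sketch below is hypothetical and sidesteps the trait-bound question raised above by fixing the key type to `i64`; the tie-breaking rule (prefer the smaller key) is also an assumption, since the comment does not specify one.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: returns the entry whose key is closest to `target`.
/// Ties are resolved in favour of the smaller key (an arbitrary choice).
pub fn nearest<V>(map: &BTreeMap<i64, V>, target: i64) -> Option<(&i64, &V)> {
    // Largest key <= target and smallest key >= target.
    let below = map.range(..=target).next_back();
    let above = map.range(target..).next();
    match (below, above) {
        (Some(b), Some(a)) => {
            // `abs_diff` returns a u64, so the distance cannot overflow.
            if target.abs_diff(*b.0) <= a.0.abs_diff(target) {
                Some(b)
            } else {
                Some(a)
            }
        }
        // At most one side exists: return whichever it is.
        (b, a) => b.or(a),
    }
}
```

With keys 900 and 1010, a lookup for 1000 picks 1010 (distance 10 against 100), matching the example in the comment. A generic version would need some notion of distance between keys, which `Ord` alone does not provide.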
arthurprs
commented
Aug 5, 2015
|
This, please! I've been missing this for a while. Otherwise there's very little reason to have an ordered map! It's a shame it requires so much code though, we need those parent pointers. |
|
@arthurprs parent pointers wouldn't solve the duplication, it's a pure descent algorithm. (I suppose it would reduce duplication with other APIs.) |
|
I think that the following is missing:
When seeing all the
I see two main use-cases for the query API:
So I basically expected to see two functions @Gankro you asked for feedback, I hope this is some constructive one :P |
|
The time and space complexity of these operations is an implementation Can you provide an example of how you would use the user-provided closure On Thursday, August 6, 2015, gnzlbg notifications@github.com wrote:
|
Basically I just want to know if I can call these in a loop without ending up in N^2 complexity or blowing up the stack. If I'm doing something latency-related I also need to know if they allocate any memory in the heap. If the implementation improves the complexity in the future, that is a non-breaking change, but the current complexity guarantees should be there.
I'm not sure either, but for the |
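The "call it in a loop" concern can be illustrated with a successor walk built on the existing `BTreeMap::range`. This is a sketch with a hypothetical function name and a fixed `u32` key type. Each step is a fresh lookup from the root, so assuming each lookup is logarithmic, visiting n entries costs on the order of n log n comparisons rather than n squared; the loop itself is iterative, so this code adds no recursion of its own.

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{Excluded, Unbounded};

/// Hypothetical helper: visits every key in ascending order by repeatedly
/// asking for the successor of the previous key.
pub fn keys_by_successor<V>(map: &BTreeMap<u32, V>) -> Vec<u32> {
    let mut visited = Vec::with_capacity(map.len());
    // Start from the minimum key, if there is one.
    let mut current = map.keys().next().copied();
    while let Some(key) = current {
        visited.push(key);
        // Successor query: smallest key strictly greater than `key`.
        current = map
            .range::<u32, _>((Excluded(key), Unbounded))
            .next()
            .map(|(next_key, _)| *next_key);
    }
    visited
}
```

A plain iterator would be the better tool for a full traversal; the point of the sketch is only that a loop of independent queries stays well behaved as long as each query is cheap.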
|
I don't think this RFC needs to make complexity guarantees. People can decide to use these methods based on the public documentation of their current complexity, not the contents of this RFC. But if you need the predecessor of a key for a certain algorithm, you have to find it somehow, so providing these APIs will be beneficial regardless of their complexity. Even if they are implemented completely naively at first, code that calls the methods instead of doing a manual iterator-based search will be made more efficient automatically when the implementation improves. I don't understand what |
cristicbz
commented
Aug 6, 2015
|
An alternative to the enum would also be a static dispatch version, similar to the way
struct Min;
struct Max;
struct Le<Q: ?Sized>(Q);
// ...
trait Query<K, V, Selector: ?Sized> {
    fn query(&self, query: &Selector) -> Option<(&K, &V)>;
    fn query_mut(&mut self, query: &Selector) -> Option<(&K, &mut V)>;
    // maybe query_entry as well
}
impl<K, V> Query<K, V, Min> for Map<K, V> {
    /* ... */
}
impl<K, V, Q> Query<K, V, Le<Q>> for Map<K, V>
    where K: Borrow<Q> {
    /* ... */
}
// ...
Cons: you'd have to import |
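A compilable approximation of the static dispatch idea, using `BTreeMap` in place of the abstract `Map`, might look like the following. It is a sketch under stated assumptions: only the immutable `query` method is shown, `Le` is restricted to sized `Q`, and the method bodies lean on the existing `iter` and `range` methods instead of a dedicated descent.

```rust
use std::borrow::Borrow;
use std::collections::BTreeMap;
use std::ops::Bound::{Included, Unbounded};

// Selector types: each one picks an impl at compile time.
pub struct Min;
pub struct Max;
pub struct Le<Q>(pub Q);

pub trait Query<K, V, Selector> {
    fn query(&self, selector: &Selector) -> Option<(&K, &V)>;
}

impl<K: Ord, V> Query<K, V, Min> for BTreeMap<K, V> {
    fn query(&self, _: &Min) -> Option<(&K, &V)> {
        self.iter().next()
    }
}

impl<K: Ord, V> Query<K, V, Max> for BTreeMap<K, V> {
    fn query(&self, _: &Max) -> Option<(&K, &V)> {
        self.iter().next_back()
    }
}

impl<K, V, Q> Query<K, V, Le<Q>> for BTreeMap<K, V>
where
    K: Ord + Borrow<Q>,
    Q: Ord,
{
    fn query(&self, selector: &Le<Q>) -> Option<(&K, &V)> {
        // Largest key less than or equal to the selector's key.
        self.range::<Q, _>((Unbounded, Included(&selector.0))).next_back()
    }
}
```

A call then reads `map.query(&Le(4))` or `map.query(&Min)`, with the selector type choosing the implementation and no runtime branching on the query kind.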
|
You actually wouldn't have to import the query trait, because we could add inherent methods to the map that simply call out to the appropriate impl. You would have to import the query structs themselves, though, and there would have to be a different trait for set queries, which won't expose mutable elements. I'm not opposed to the static dispatch approach, because it can be nice to represent the queries themselves as values (e.g. passing around |
cristicbz
commented
Aug 6, 2015
|
To avoid having a separate trait for
trait Query<Selector: ?Sized> {
    type Output;
    fn query(self, query: &Selector) -> Option<Self::Output>;
}
trait QueryMut<Selector: ?Sized> {
    type Output;
    fn query_mut(self, query: &Selector) -> Option<Self::Output>;
}
impl<'a, K, V> Query<Min> for &'a Map<K, V> {
    type Output = (&'a K, &'a V);
    fn query(self, query: &Min) -> Option<Self::Output> {
        /* ... */
    }
}
impl<'a, K, V> QueryMut<Min> for &'a mut Map<K, V> {
    type Output = (&'a K, &'a mut V);
    fn query_mut(self, query: &Min) -> Option<Self::Output> {
        /* ... */
    }
}
impl<'a, E> Query<Min> for &'a Set<E> {
    type Output = &'a E;
    fn query(self, query: &Min) -> Option<Self::Output> {
        /* ... */
    }
}
If you do provide inherent methods though, I don't know if there is much value in using the same trait (which would really only ever show up to bound the argument of the inherent methods). |
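To check that one trait really can serve both maps and sets, here is a small runnable variant that substitutes `BTreeMap` and `BTreeSet` for the abstract `Map` and `Set`. It is a sketch only: the `Min` selector is the sole one implemented, `QueryMut` is left out, and the bodies simply take the first element of the ordinary iterator.

```rust
use std::collections::{BTreeMap, BTreeSet};

pub struct Min;

/// Implemented for references, so the associated `Output` can name the
/// borrow's lifetime and differ between maps and sets.
pub trait Query<Selector> {
    type Output;
    fn query(self, selector: &Selector) -> Option<Self::Output>;
}

impl<'a, K, V> Query<Min> for &'a BTreeMap<K, V> {
    type Output = (&'a K, &'a V);
    fn query(self, _: &Min) -> Option<Self::Output> {
        self.iter().next()
    }
}

impl<'a, E> Query<Min> for &'a BTreeSet<E> {
    type Output = &'a E;
    fn query(self, _: &Min) -> Option<Self::Output> {
        self.iter().next()
    }
}
```

Both `Query::query(&map, &Min)` and `Query::query(&set, &Min)` resolve through the same trait, returning a key-value pair for the map and a bare element reference for the set.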
|
@cristicbz I've put a POC implementation of what you are suggesting here: https://github.com/apasel422/bst/tree/query. |
|
Here's a thought on an API variant to deal with combinatorics while still being friendly:
fn max<Q: ?Sized, R>(&self, range: R) -> Option<(&K, &V)>
    where K: Borrow<Q>, Q: Ord, R: AnyRange<&Q>;
fn min<Q: ?Sized, R>(&self, range: R) -> Option<(&K, &V)>
    where K: Borrow<Q>, Q: Ord, R: AnyRange<&Q>;
fn max_entry<Q: ?Sized, R>(&mut self, range: R) -> Option<OccupiedEntry<K, V>>
    where K: Borrow<Q>, Q: Ord, R: AnyRange<&Q>;
fn min_entry<Q: ?Sized, R>(&mut self, range: R) -> Option<OccupiedEntry<K, V>>
    where K: Borrow<Q>, Q: Ord, R: AnyRange<&Q>;
Given inclusive ranges, you can cover all of the cases you wanted to with your API, without requiring any extra imports or names to be used.
UPDATE: in case the above is unclear, here are some examples:
// get_le
map.max(...&k)
// get_lt
map.max(..&k)
// get_ge
map.min(&k..)
// get the smallest element:
map.min(..)
However, @Gankro points out on IRC that since exclusive ranges only exclude on the right, we can't express |
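The range-driven `min`/`max` idea can be tried out on current Rust, where `std::ops::RangeBounds` plays roughly the role of the hypothetical `AnyRange` and inclusive ranges are spelled `..=` rather than the `...` used in the comment. The free functions `min_in` and `max_in` below are illustrative names, not proposed API, and the sketch fixes the query type to `K` instead of a borrowed `Q`.

```rust
use std::collections::BTreeMap;
use std::ops::RangeBounds;

/// Hypothetical helper: smallest entry whose key lies in `range`.
/// Like `BTreeMap::range`, panics on an inverted range.
pub fn min_in<K, V, R>(map: &BTreeMap<K, V>, range: R) -> Option<(&K, &V)>
where
    K: Ord,
    R: RangeBounds<K>,
{
    map.range::<K, _>(range).next()
}

/// Hypothetical helper: largest entry whose key lies in `range`.
pub fn max_in<K, V, R>(map: &BTreeMap<K, V>, range: R) -> Option<(&K, &V)>
where
    K: Ord,
    R: RangeBounds<K>,
{
    map.range::<K, _>(range).next_back()
}
```

Range syntax covers `max_in(&map, ..=k)`, `max_in(&map, ..k)`, `min_in(&map, k..)` and `min_in(&map, ..)`. The strict successor has no range literal, which is the gap noted above; in this sketch it has to be written with an explicit pair such as `(Bound::Excluded(k), Bound::Unbounded)`.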
cristicbz
commented
Aug 7, 2015
|
@aturon The inability to express |
|
FWIW, it seems C++'s equivalent container |
|
The libs team has decided to close this RFC pending investigating alternative API solutions. In particular I think there's a promising opportunity with a range builder pattern. For now this functionality could be provided by an external crate -- at least semantically, not necessarily perf-wise -- on top of |
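As an illustration of how an external crate could offer these queries in the meantime, the sketch below adds the four renamed methods through an extension trait. It assumes the building block is the existing `BTreeMap::range` method (the sentence above is cut off, so that is a guess), and `OrderedQuery` is a made-up trait name. As the comment says, this matches the semantics only; it makes no claim about performance.

```rust
use std::borrow::Borrow;
use std::collections::BTreeMap;
use std::ops::Bound::{Excluded, Included, Unbounded};

/// Hypothetical extension trait providing the query names settled on in this thread.
pub trait OrderedQuery<K, V> {
    fn get_lt<Q>(&self, key: &Q) -> Option<(&K, &V)> where K: Borrow<Q>, Q: Ord + ?Sized;
    fn get_le<Q>(&self, key: &Q) -> Option<(&K, &V)> where K: Borrow<Q>, Q: Ord + ?Sized;
    fn get_ge<Q>(&self, key: &Q) -> Option<(&K, &V)> where K: Borrow<Q>, Q: Ord + ?Sized;
    fn get_gt<Q>(&self, key: &Q) -> Option<(&K, &V)> where K: Borrow<Q>, Q: Ord + ?Sized;
}

impl<K: Ord, V> OrderedQuery<K, V> for BTreeMap<K, V> {
    // Predecessor queries take the last element of a left-open range.
    fn get_lt<Q>(&self, key: &Q) -> Option<(&K, &V)> where K: Borrow<Q>, Q: Ord + ?Sized {
        self.range::<Q, _>((Unbounded, Excluded(key))).next_back()
    }
    fn get_le<Q>(&self, key: &Q) -> Option<(&K, &V)> where K: Borrow<Q>, Q: Ord + ?Sized {
        self.range::<Q, _>((Unbounded, Included(key))).next_back()
    }
    // Successor queries take the first element of a right-open range.
    fn get_ge<Q>(&self, key: &Q) -> Option<(&K, &V)> where K: Borrow<Q>, Q: Ord + ?Sized {
        self.range::<Q, _>((Included(key), Unbounded)).next()
    }
    fn get_gt<Q>(&self, key: &Q) -> Option<(&K, &V)> where K: Borrow<Q>, Q: Ord + ?Sized {
        self.range::<Q, _>((Excluded(key), Unbounded)).next()
    }
}
```

Because the methods are generic over `Borrow<Q>`, a map with `String` keys can be queried with a plain `&str`, for example `map.get_lt("c")`.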
Gankro commented Jul 9, 2015
rendered
Add the following to BTreeMap
and to BTreeSet: