Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RangeMap::overlapping_mut? #62

Open
jasonwhite opened this issue Feb 2, 2023 · 12 comments
Open

RangeMap::overlapping_mut? #62

jasonwhite opened this issue Feb 2, 2023 · 12 comments

Comments

@jasonwhite
Copy link
Contributor

jasonwhite commented Feb 2, 2023

Thanks for this crate! Also thanks for adding RangeMap::overlapping recently; this is super useful.

Do you see any reason why we cannot also add RangeMap::overlapping_mut to mutate the values? BTreeMap has range_mut, so it should be easy to implement. I can't see how this would break any of the internal invariants of splitting or coalescing. As of now, the only way to mutate a value is by inserting a new value. This is quite cumbersome when the value is a Vec<_> that we might want to append to, for example.

If this seems reasonable, I'm happy to implement it.

@jasonwhite
Copy link
Contributor Author

This was actually pretty easy to add! See #63

@jeffparsons
Copy link
Owner

Do you see any reason why we cannot also add RangeMap::overlapping_mut to mutate the values? BTreeMap has range_mut, so it should be easy to implement. I can't see how this would break any of the internal invariants of splitting or coalescing.

I do think this presents a problem, for the same reason that I don't expose the more basic iter_mut method: if you use the iterator to update a value to be the same as that stored for one or two immediately-adjacent range(a), then they should be merged into one range.

This is at present quite fiddly (impossible?) to do without it having very surprisingly bad performance when such a case arises.

When BTreeMap cursors land in the Rust standard library, this will all get a whole lot easier, so I was planning to revisit this sort of thing then.

@jeffparsons
Copy link
Owner

In case you're curious, there is a PR open to add BTreeMap cursors here: rust-lang/rust#105641

@jasonwhite
Copy link
Contributor Author

Oof, I hadn't thought too deeply about the case where adjacent ranges get merged since I don't need coalescing for my use case (because the values are unique). I'll close that PR for now. I think I can work around it by mapping ranges to a unique key and doing something like this:

let mut value_map: HashMap<u32, Vec<u32>> = HashMap::new();
let mut range_map: RangeMap<u32, u32> = RangeMap::new();
let mut key = 0;

range_map.insert(0..42, key);
key += 1;

if let Some(v) = range_map.get(4).map(|v| value_map.get_mut(v)) {
    v.push(42);
}

Then, I'll always have to check if insert would clobber an existing range so that the corresponding values can be removed from the hash map. Would be nice if insert could somehow return the ranges and their values that get completely clobbered.

Feel free to close this issue.

I'm eagerly .awaiting stabilization of BTreeMap cursors. ;-)

@jasonwhite
Copy link
Contributor Author

I ended up using RefCell to enable internal mutability of the value, for example:

use std::cell::RefCell;
let mut map: RangeMap<u32, RefCell<Vec<u32>> = RangeMap::new();

map.insert(0..42, RefCell::new(Vec::new()));

for (range, v) in map.overlapping() {
    v.borrow_mut().push(42);
}

RefCell conveniently implements Eq and Clone if the underlying type does, so this works quite well. I think this also shows that the internal coalescing can still be broken even with the current API.

@jeffparsons
Copy link
Owner

Neat trick. :)

I think this also shows that the internal coalescing can still be broken even with the current API.

Well... you can use RefCell or an incorrect Eq implementation to break all kinds of APIs, including in the standard library. But that doesn't mean the API itself is broken; it just means that if you don't play by the rules, then "some bets are off". BTreeMap, for example, has the following in its documentation:

It is a logic error for a key to be modified in such a way that the key’s ordering relative to any other key, as determined by the [Ord](https://doc.rust-lang.org/stable/std/cmp/trait.Ord.html) trait, changes while it is in the map. This is normally only possible through [Cell](https://doc.rust-lang.org/stable/std/cell/struct.Cell.html), [RefCell](https://doc.rust-lang.org/stable/std/cell/struct.RefCell.html), global state, I/O, or unsafe code. The behavior resulting from such a logic error is not specified, but will be encapsulated to the BTreeMap that observed the logic error and not result in undefined behavior. This could include panics, incorrect results, aborts, memory leaks, and non-termination.

To be fair, I've never actually mentioned anything like this in the RangeMap docs. So I should probably add a similar comment to the above, referencing the docs for BTreeMap.

@jeffparsons
Copy link
Owner

BTW, @jasonwhite, are you by any chance using a Nightly compiler? I'm planning to have a go at exposing some handy functions like overlapping_mut (behind an "unstable" feature gate) when rust-lang/rust#105641 lands on nightly, but that won't be of much use to people using stable Rust compilers.

@jeffparsons
Copy link
Owner

Also, (sorry to spam your inbox with this stream of consciousness), I'm surprised by how many people have mentioned that they don't care about coalescing. I'm beginning to consider that maybe I should support options for either disabling coalescing entirely, or suspending coalescing while you make a handful of operations followed by an implied O(n) pass over the whole collection to "re-coalesce" if you made any changes. Do you think that would be useful to you?

@jasonwhite
Copy link
Contributor Author

@jeffparsons Yes, I am fine with using unstable features. I expect the BTreeMap cursors stuff will be in that state for quite a while.

Regarding optional coalescing: I view it as in the same domain as defragging the blocks of a disk or garbage collection. That is, it shouldn't be required for correctness, only for performance under certain cases. In my scenario, I'm not worried about fragmentation of the ranges because they're frequently overwritten. This behavior paired with unique values means that any time spent doing comparisons of adjacent ranges isn't very helpful.

I'll try to summarize the options you mentioned:

  1. Be able to disable coalescing at compile time.
  2. Toggle coalescing at runtime, like toggling the garbage collector.
  3. Do a single pass on the collection in O(n) time.

I think (1) and (3) would give the best overall performance for the case where coalescing isn't needed. However, I can imagine paying the O(n) cost once in a while can hurt (like in a game). The implementation complexity of (2) seems high. Ideally, whatever we go with should be a zero-cost abstraction. One very simple way to implement the zero-cost abstraction is to have a third generic parameter that toggles it, like this:

pub struct RangeMap<K, V, const COALESCE: bool = true> {
    // ...
}

@ripytide
Copy link

ripytide commented Feb 4, 2023

Alternatively, an option 4. might be to have multiple insert functions depending on the required coalescing. (This is what RangeBoundsMap does)

@jeffparsons
Copy link
Owner

I think I like option 4. We'll just need to make sure we catch any existing implementation details that assume everything is always coalesced after inserts. I don't think there's anything...

@jeffparsons
Copy link
Owner

Regarding optional coalescing: I view it as in the same domain as defragging the blocks of a disk or garbage collection. That is, it shouldn't be required for correctness, only for performance under certain cases.

I think I'm convinced by this. I'd originally thought of splitting/shrinking/removing and coalescing as being all part of the same set of guarantees, but I can see that they don't need to be. So as long as the more obvious method still does coalescing, nothing assumes coalescing has been done for correctness, and we don't make any guarantees that we won't arbitrarily coalesce things if we feel like it, then I think we can start adding these operations that give a bit more flexibility.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants