Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for amortized resizes #137

Closed
wants to merge 1 commit into from
Closed

Conversation

jonhoo
Copy link

@jonhoo jonhoo commented Jul 20, 2020

Hi there!

As you may have seen, over the past few weeks, I've been digging deep down into some latency spikes I was observing when using IndexMap. Further investigation revealed that they were caused by resizes, which can get quite expensive as the map grows large. To mitigate the issue, I built an amortized version of hashbrown (griddle), and and amortized version of Vec (atone), and then plugged them into IndexMap. This worked really well, and cut my tail latency by several orders of magnitude.

I'm opening this mostly to serve as a place to discuss a path forward. It might be that that path forward is to publish a separate crate as I was told to do for hashbrown (rust-lang/hashbrown#166 (comment)). I'd be curious to hear your thoughts on including this as a proper feature of indexmap.

The changes in this PR are actually fairly few, but there were many changes needed just to be able to abstract away the use of Vec/slices to something that can be either a slice or an atone::Vc (this is the EntryVec type alias). The primary functional changes are:

  • Calls to make_contiguous prior to sorting, which are nightly only (tracking issue)
  • impl Debug for IntoIter does not print the items with feature = "amortize", since vec_deque::IntoIter does not provide by-ref access to the "remainder"
  • Changes to capacity are a little less predictable (see the tests), since the storage is backed by a VecDeque, not a Vec.

@jonhoo
Copy link
Author

jonhoo commented Jul 22, 2020

@bluss and @cuviper, I'd be curious to hear your take on whether this is something that might make it into indexmap (even if not in its current form), or whether you think I should just maintain a near-fork with this patch?

@cuviper
Copy link
Member

cuviper commented Jul 22, 2020

At a high level, I'm a little concerned about the global effect by doing this with a feature. There are trade-offs in this, and that would be okay if a root application crate made this choice (akin to a global allocator), but any crate in the dependency graph could activate this feature affecting IndexMap everywhere. From that angle, a forked variant seems better.

Cargo.toml Outdated Show resolved Hide resolved
@jonhoo
Copy link
Author

jonhoo commented Jul 22, 2020

I'm a little concerned about the global effect by doing this with a feature

Yeah, that's a strong point. I guess the flip side is that keeping a separate crate would require ongoing maintenance to keep the projects in sync. I wonder if there's a way for indexmap to provide a raw module the same way hashbrown does, so that the fork doesn't have to also copy over all the internals... That would certainly make keeping a fork much less work! Though also potentially more work on your side.

@cuviper
Copy link
Member

cuviper commented Jul 23, 2020

Changes here are usually pretty infrequent. The hashbrown change was a biggy, and I do have a few more things lined up, but after that I would expect it to slow down again. Plus, it's not like a fork needs to be aggressive about staying in sync, just good enough for your forked users' needs.

I'm not sure what a raw API would look like here...

@jonhoo
Copy link
Author

jonhoo commented Jul 23, 2020

That's true. I guess I'll pursue a fork instead then. Can you give me an idea of the changes you have in mind?

As for a raw API, yeah, it'd be tricky. I can imagine what it'd look like (basically a "smart" Bucket wrapper whose methods maintain the indexmap invariants), basically like your current map/core, but I don't know if it'd even be sufficient for what I need. Specifically because the change from [] to atone is so invasive.

@jonhoo jonhoo closed this Jul 23, 2020
@cuviper
Copy link
Member

cuviper commented Jul 23, 2020

Can you give me an idea of the changes you have in mind?

Direct indexing (#132), and also parameterizing the index type in the raw table (e.g. storing u32 or a newtype index).

@jonhoo
Copy link
Author

jonhoo commented Jul 27, 2020

Published as https://github.com/jonhoo/indexmap-amortized

@cuviper
Copy link
Member

cuviper commented Jul 27, 2020

Congrats! Not as clever as your previous names though... 😉

@jonhoo
Copy link
Author

jonhoo commented Jul 27, 2020

@cuviper hehe, I figured for this specifically it'd be good to keep the indexmap association.

One observation already: the amortize feature vs new crate decision keeps on occurring downstream. For example, I need amortization for evmap and hashbag too, and I really don't want to maintain forks for all of those. Even though it's awkward, I think I'm just going to go with a feature for this, and then hope that something like rust-lang/cargo#2980 eventually provides me with a way to say that "this feature causes a divergence of behavior, and should produce a separate compilation target".

jonhoo added a commit to jonhoo/left-right that referenced this pull request Dec 6, 2020
Keep in mind that since this is a feature, _all_ evmap instances get
amortized if _anything_ in the dependency graph enables the feature
(indexmap-rs/indexmap#137 (comment)).
This is unfortunate, but given that the alternative is forking evmap,
we'll go with the feature approach for now. Hopefully one day we'll have
a way to [avoid this](rust-lang/cargo#2980 (comment)).

Not landing this yet since it's nightly-only at the moment due to
certain methods [in
`atone`](https://github.com/jonhoo/atone/blob/45ae1e42deaaaaa9a919957ade49982a7ac4655b/src/lib.rs#L58).
Nightly-only will go away once
rust-lang/rust#70929 and
rust-lang/rust#74217 both stabilize. The first
one will land on next stable, but the second has some more time to go
first.
jonhoo added a commit to jonhoo/left-right that referenced this pull request Dec 17, 2020
Keep in mind that since this is a feature, _all_ evmap instances get
amortized if _anything_ in the dependency graph enables the feature
(indexmap-rs/indexmap#137 (comment)).
This is unfortunate, but given that the alternative is forking evmap,
we'll go with the feature approach for now. Hopefully one day we'll have
a way to [avoid this](rust-lang/cargo#2980 (comment)).

Not landing this yet since it's nightly-only at the moment due to
certain methods [in
`atone`](https://github.com/jonhoo/atone/blob/45ae1e42deaaaaa9a919957ade49982a7ac4655b/src/lib.rs#L58).
Nightly-only will go away once
rust-lang/rust#70929 and
rust-lang/rust#74217 both stabilize. The first
one will land on next stable, but the second has some more time to go
first.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants