Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upGitHub is where the world builds software
Millions of developers and companies build, ship, and maintain their software on GitHub — the largest and most advanced development platform in the world.
Make BTreeMap::new() not allocate #50266
Comments
|
I am interested in mentoring on this for the "do clever shit" approach, as I probably know enough to make this change, and would probably be the one to review it, but don't have the time/energy to do the work myself right now. I don't believe this is appropriate for a beginner to take on, but I'm willing to help teach someone with a bit of experience the advanced details necessary to accomplish this task (probably with a lot of linking to the relevant sections of the nomicon). That said I'm stretched pretty thin, so I can't walk you through every single detail. A certain amount of self-direction will be necessary. Be warned: BTreeMap is some of the most complex unsafe code in libstd, and this would be making it more complex. There are a lot of subtle details, such as why thin-vec has this abomination. If you want to learn a lot about unsafe rust, and the advanced details of rust's type system, this is a great project! (or if you already know that stuff... hey there, wanna make rust faster for me?) DIFFICULTY RATING |
|
Also I highly recommend doing profiling/benchmarking of incremental steps (e.g. checking if swapping the layout of LeafNode does anything) so you can make informed optimization decisions. |
|
@porglezomp has volunteered via twitter, though they won't be able to look at this for a few days |
|
Just sticking my name on the issue here. I haven't contributed to Rust in a few months, so I'll make sure I've got everything building properly tonight. |
|
@porglezomp beated me :). At some point if you don't have time or need some help, please ping me. |
|
@porglezomp i can participate this pr cycle and do some perf testing on some code base. just learn rust some months. just let me try. |
|
relevant profiling info: https://blog.mozilla.org/nnethercote/2018/04/30/how-to-speed-up-the-rust-compiler-in-2018/ |
|
I have not read the BTreeMap code yet, but does putting the root node on the stack instead of allocating helps speeding up? |
|
Unlikely. Due to the large'ish fanout it'll make the BTreeMap struct way too big. The child pointers alone make it cross the 100 bytes barrier. |
|
See also #50491, where I just sped up rustc by replacing more BTreeMaps with LazyBTreeMaps. |
Currently,
BTreeMap::new()allocates an empty root node. This is in spite of all of our other collections managing to avoid this. In #50240 it was demonstrated that there's massive wins to be had by avoiding allocating here. In the past we avoided implementing this because it simplified the code of BTreeMap a lot (which is already pretty gnarly code).I see two ways to do this
Some examples of clever shit:
The core Range iterator primitive can always always be initialized with two
NonZero::dangling()pointers andlen = 0. The existing branches will then do the correct thing. (this is how empty slices generally work)By changing the layout of LeafNode to have the metadata first, we can statically allocate a
LeafNode<u64, u64>(say) for all empty instances of BTreeMap to share. get/remove queries will then naturally work correctly without extra branches. See thin-vec for an example of doing this. An extra branch will probably need to be added toinsertto check if we're pointing at the static singleton; not a big deal.