Measure the impact of using BlockPtr #4
**Proposal**

I think we could start by wrapping the reference-counted block in a newtype:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// step 1: naive implementation
struct BlockRef(Rc<RefCell<Block>>);
```

This way we could expose only those of `Rc`'s functions that we're interested in.

**Custom allocator**

Eventually we could replace the native `Rc` implementation with our own. The only problem here is that Rust's `Allocator` API is still experimental:

```rust
// our Rc semi-compatible type
struct Rc<T, A: Allocator>(/* impl */);

// our super-optimized allocator type for data of the same size
struct Pool<T: Sized> { /* impl */ }

// the Allocator API is available in an experimental phase:
// https://doc.rust-lang.org/nightly/std/alloc/trait.Allocator.html
impl<T: Sized> Allocator for Pool<T> {
    // impl
}

// step 2: make our Rc look like the native std::rc::Rc and swap them in where necessary
type Rc<T: Sized> = Rc<T, Pool<T>>;
```

In that shape, this functionality could even be included as part of some more generic lib like lib0. There is even potential that we might not need a custom `Rc` at all, but could rather use the `Rc::from_raw` function over memory we allocated ourselves.

**BlockPtr and smart pointers**

Another idea for optimization is to use the concept of smart pointers to represent block data:

```rust
use lib::rc::Rc; // we can now use our "better" Rc implementation

struct BlockPtr(Rc<RefCell<Block>>);

pub enum Block {
    Item(Item),
    Skip(Skip),
    GC(GC),
}

pub struct Skip {
    pub id: ID,
    pub len: u32,
}

pub struct GC {
    pub id: ID,
    pub len: u32,
}
```

However, we could go further. Let's assume that we encode the block's kind into the pointer value itself. This way we can tell what kind of block a `BlockPtr` is pointing to without even jumping to the memory address where the `Block` itself resides.
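The "go further" idea above can be sketched with low-bit pointer tagging. This is an illustrative sketch only, not the project's implementation; `TaggedPtr` and `Kind` are hypothetical names. It assumes heap allocations of `Block`-sized values are at least 8-byte aligned, which leaves the 3 low bits of the address free for a tag:

```rust
// Sketch: encode the block kind in the unused low bits of an aligned pointer,
// so the kind can be read without dereferencing the pointer at all.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Kind {
    Item = 0,
    Skip = 1,
    Gc = 2,
}

struct TaggedPtr(usize);

impl TaggedPtr {
    fn new<T>(ptr: *const T, kind: Kind) -> Self {
        let addr = ptr as usize;
        // Heap allocations with alignment >= 8 have their 3 low bits zeroed.
        assert_eq!(addr & 0b111, 0, "pointer must be 8-byte aligned");
        TaggedPtr(addr | kind as usize)
    }

    // Read the kind straight from the pointer value - no memory access needed.
    fn kind(&self) -> Kind {
        match self.0 & 0b111 {
            0 => Kind::Item,
            1 => Kind::Skip,
            _ => Kind::Gc,
        }
    }

    // Strip the tag to recover the original address.
    fn addr<T>(&self) -> *const T {
        (self.0 & !0b111) as *const T
    }
}

fn main() {
    // A heap-allocated u64 stands in for a Block here.
    let raw = Box::into_raw(Box::new(42u64));
    let ptr = TaggedPtr::new(raw, Kind::Skip);
    assert_eq!(ptr.kind(), Kind::Skip);
    assert_eq!(unsafe { *ptr.addr::<u64>() }, 42);
    unsafe { drop(Box::from_raw(raw)) }; // free the stand-in allocation
}
```

One caveat of this scheme: a tag of `0` is indistinguishable from an untagged address, so the discriminant encoding has to be chosen carefully.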
I think you are right that we will eventually end up writing our own allocator. For now, I wanted to use …

I'm absolutely not concerned about the time it takes to iterate through the BlockStore or the time it takes to allocate more memory for a BlockList. The important part is that the initial loading of the document is fast and that we can optimize in the future.

Moving a range of memory to another location is very fast on modern hardware and optimized by the operating system. The important thing is that we optimize loading the document initially, because that's where users are actually blocked from editing. A user won't notice whether inserting a single character takes 1 μs or 1 ms. Note that a …

The code you mentioned does quite a lot of things. In the ideal case, we retrieve the BlockList and then access the …

That's exactly the case. But implementing a custom allocator might be out of scope for now. I don't think that the current implementation is optimal, but it is very close to what I did in Yjs and should provide good overall performance, even in WASM. I want a custom allocator, but first we should have something to compare against using benchmarks. In the future, …
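A baseline for such a comparison could look something like the following. This is an assumed setup, not the project's actual benchmark; the ~300 B payload mirrors the `Block` size discussed later in this thread:

```rust
// Minimal sketch: compare initialization cost of an inline block store
// (Vec<Block>) against a reference-counted one (Vec<Rc<RefCell<Block>>>).
use std::cell::RefCell;
use std::rc::Rc;
use std::time::Instant;

#[derive(Clone)]
struct Block {
    _payload: [u8; 300], // stand-in for the real Block, roughly its size
}

fn main() {
    const N: usize = 100_000;

    // One contiguous allocation, elements initialized in place.
    let t = Instant::now();
    let inline: Vec<Block> = vec![Block { _payload: [0; 300] }; N];
    println!("Vec<Block>:              {:?}", t.elapsed());

    // One allocation for the Vec of pointers, plus N separate heap allocations.
    let t = Instant::now();
    let boxed: Vec<Rc<RefCell<Block>>> = (0..N)
        .map(|_| Rc::new(RefCell::new(Block { _payload: [0; 300] })))
        .collect();
    println!("Vec<Rc<RefCell<Block>>>: {:?}", t.elapsed());

    assert_eq!(inline.len(), boxed.len());
}
```

The per-element heap allocation in the second variant is exactly the cost a custom pool allocator would try to amortize.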
I can't agree here: the allocator gives us a direct memory address, so block access is isolated and very fast in that scenario. BlockPtr not only requires us to do multiple jumps, but also leaks that complexity into the library code. Since every …

Even if we're not going back to …
Closing this one - results can be seen in #88.
I still have some doubts about the concept behind `BlockPtr`. While I get the idea behind the initial concept, I'm wondering whether all of the operations were taken into account.

As @dmonad mentioned, the original test was about populating a BlockStore of known size with blocks at initialization time. The inline approach is faster here: we just call `Vec::<Block>::with_capacity(known_size)` and we have the entire chunk ready right away, with the blocks inside already initialized. Compared to that, `Vec::<Rc<RefCell<Block>>>::with_capacity(known_size)` seems subpar for two reasons: … `Vec<Block>` here.

However, this approach also comes with several other issues I'm not sure we talked about:
- When new blocks are added to the vector and its capacity is reached, before appending the next element the vector has to allocate a new internal array (also on the heap) of `sizeof(Block) * vec.capacity() * 2` (Vec's growth factor is around 2.0), then copy all of the existing blocks over to the new space. `Block`'s size at the moment is around ~300 B. For a block store with 1024 elements, this means copying ~300 kB of memory before adding a new element. If we're talking about the block store as `Vec<Rc<RefCell<Block>>>`, 1024 elements take only 8 kB, making resizing significantly cheaper.
- Another issue is access time through `BlockPtr`: since it's used as `Item`'s left and right block pointers, every access to the left/right block involves going through the block store. Compared to access via `Rc`, this is really expensive. I don't yet have enough intuition to say how often this will happen.
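The reallocation cost in the first point is easy to observe directly: each time capacity is exhausted, `Vec` allocates a roughly doubled buffer and moves every existing element into it. A small sketch, using a 300-byte array as a stand-in for `Block`:

```rust
// Observe Vec's growth behavior: capacity roughly doubles on exhaustion,
// and every reallocation moves all elements accumulated so far.
fn main() {
    let mut v: Vec<[u8; 300]> = Vec::new(); // ~300 B per element, like Block
    let mut last_cap = v.capacity();
    for _ in 0..1024 {
        v.push([0u8; 300]);
        if v.capacity() != last_cap {
            // On every growth, all previously stored elements are copied over.
            println!(
                "grew: len = {:>4}, new capacity = {:>4} (~{} kB moved)",
                v.len(),
                v.capacity(),
                (v.len() - 1) * 300 / 1024
            );
            last_cap = v.capacity();
        }
    }
}
```

With 8-byte `Rc` handles instead of inline 300 B elements, each of those moves touches roughly 1/40th of the memory, which is the trade-off described above.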
Here's the assembly of `BlockStore::get_item_mut` compiled with release options (`-C opt-level=2`). Keep in mind that this code will be executed every single time we're about to retrieve an item instance from the block store.
It would be good to have some performance tests for `Rc`s and `BlockPtr` in many different scenarios. Potentially we could also create something that works "like" `Rc` (e.g. using unsafe pointer arithmetic) and lets us leverage knowledge about the structures we know we're going to use, which a generic `Rc` mechanism won't provide.
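For reference, the two access styles debated in this thread can be contrasted in a minimal sketch (all names here are hypothetical simplifications, not the project's actual types): an index-based pointer that resolves through the store on every access, versus a shared reference-counted handle that dereferences directly:

```rust
use std::cell::RefCell;
use std::rc::Rc;

#[derive(Clone)]
struct Block {
    len: u32,
}

// Style 1: BlockPtr as an index - every access goes through the store.
#[derive(Clone, Copy)]
struct BlockPtr {
    pivot: usize, // position inside the store's Vec
}

struct BlockStore {
    blocks: Vec<Block>,
}

impl BlockStore {
    fn get_mut(&mut self, ptr: BlockPtr) -> &mut Block {
        // extra indirection: store -> Vec -> element, plus a bounds check
        &mut self.blocks[ptr.pivot]
    }
}

// Style 2: a shared handle - access dereferences the Rc directly.
type BlockRef = Rc<RefCell<Block>>;

fn main() {
    let mut store = BlockStore { blocks: vec![Block { len: 1 }; 8] };
    let ptr = BlockPtr { pivot: 3 };
    store.get_mut(ptr).len += 1; // resolved through the store

    let handle: BlockRef = Rc::new(RefCell::new(Block { len: 1 }));
    handle.borrow_mut().len += 1; // resolved directly

    assert_eq!(store.blocks[3].len, 2);
    assert_eq!(handle.borrow().len, 2);
}
```

The index style keeps blocks contiguous and `Copy`-able but pays an indirection on every left/right traversal; the handle style pays per-block allocation and `RefCell` bookkeeping but accesses neighbors directly.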