Ideas about future enhancements #18

Closed

dmonad opened this issue May 25, 2021 · 1 comment

Labels: enhancement (New feature or request)

Comments

dmonad commented May 25, 2021

  • Create a struct for clientID with a custom hashing function and custom ordering.
  • Change type.map to Option<Box<HashMap>>.
  • Create a custom Map with a custom Hasher.
  • Create default HashMap & HashSet types that use a standard hasher shared across y-crdt and lib0.
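
For the hashing-related points, a minimal sketch of what a client-keyed map with a trivial custom Hasher might look like (a hypothetical ClientHasher, not the actual lib0/y-crdt code; it works because u64 keys hash via write_u64, and client IDs are already well distributed):

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

// Hypothetical identity hasher for u64 client IDs: skips SipHash entirely,
// since client IDs are effectively random already. Not the lib0 code.
#[derive(Default)]
struct ClientHasher(u64);

impl Hasher for ClientHasher {
    fn finish(&self) -> u64 {
        self.0
    }
    fn write(&mut self, _bytes: &[u8]) {
        unreachable!("only u64 client IDs are hashed");
    }
    fn write_u64(&mut self, i: u64) {
        self.0 = i;
    }
}

type ClientMap<V> = HashMap<u64, V, BuildHasherDefault<ClientHasher>>;

fn main() {
    let mut blocks: ClientMap<&str> = ClientMap::default();
    blocks.insert(42, "blocks of client 42");
    assert_eq!(blocks.get(&42), Some(&"blocks of client 42"));
}
```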

Horusiath commented Nov 16, 2021

Now that we have benchmarks in place, we'll eventually be able to evaluate several optimization options that we've so far only loosely talked about.

Use custom arena allocator for items

One of the reasons why blocks/items are "allocated" as part of a ClientBlockList is that the list is implemented using an exponentially growing vector buffer. This is a useful property when we e.g. want to load documents, which means creating potentially thousands/millions of blocks, and that is fairly expensive when done block by block. The vector buffer instead preallocates more and more space (and can be used to preallocate the exact number of items we know about).

The problems with using this form of allocation via Vec are:

  1. When we guess the buffer size wrong, the vec will allocate a new, bigger buffer, copy the old items into it and drop the old buffer. These can be expensive operations. An implementation using a linked list of buffers could potentially be faster, as it doesn't copy old contents.
  2. GCing an item brings little to no benefit: Block is an enum of GC/Skip/Item, which means its memory size is always that of the biggest case (an item), so a GC block is (almost, if we don't count internal string references on the heap) the same size as an Item.
  3. It's not really a good abstraction. What we're after is solving the problem of a slow allocator, and for that a document-level arena allocator would be a good fit (see the sketch after this list).
  4. We cannot reference items in any way other than carrying the block store around, as there's no reliable way to reference an item directly: it may be copied to a different memory location whenever the Vec's internal buffer expands.
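
As a reference point for item 3, a document-level arena can be as simple as a list of fixed-capacity chunks that are never reallocated, so allocated items keep stable addresses. A rough sketch under that assumption (crates like typed-arena implement this pattern properly):

```rust
// Rough sketch of an arena with stable addresses: existing chunk buffers
// are never reallocated; a new chunk is started instead of growing in
// place. Illustrative only, not the actual y-crdt design.
struct Arena<T> {
    chunks: Vec<Vec<T>>,
    chunk_capacity: usize,
}

impl<T> Arena<T> {
    fn new(chunk_capacity: usize) -> Self {
        Arena {
            chunks: vec![Vec::with_capacity(chunk_capacity)],
            chunk_capacity,
        }
    }

    /// Allocates `value`. Its address stays stable for the arena's
    /// lifetime, because filled chunk buffers are never moved.
    fn alloc(&mut self, value: T) -> &mut T {
        if self.chunks.last().unwrap().len() == self.chunk_capacity {
            // Start a fresh chunk instead of reallocating the old one.
            self.chunks.push(Vec::with_capacity(self.chunk_capacity));
        }
        let chunk = self.chunks.last_mut().unwrap();
        chunk.push(value);
        chunk.last_mut().unwrap()
    }
}
```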

Use direct item references instead of BlockPtr

This is something that Kevin and I have been fighting over for a while ;) Right now we have a special dedicated structure that contains the information needed to locate an item within the block store, called BlockPtr. It's used in all kinds of situations where an item needs to be referenced (e.g. left/right neighbors, shared map item pointers, etc.).

There are several problems with it:

  1. Using BlockPtr requires carrying a reference to the block store around all the time. Combined with the Rust borrow-checker rules, this leads to somewhat cumbersome pieces of code; e.g. on a few occasions we need to re-acquire a ClientBlockList reference (even if we already had it) just to keep the borrow checker calm.
  2. It causes indirect access, which I believe will be much more expensive: accessing an item via a reference is a single random memory access, while access via a block pointer means:
    • access the block store
    • calculate the position of the client block list (which even with the optimized hasher approach still needs to be computed)
    • jump to the block store (hashmap) table
    • get the client block list
    • check whether a client block list hash collision occurred (that's how hashmaps work)
    • jump to the client block list's buffer
    • use BlockPtr::pivot to get an index hint for the block within the client block list
    • get the block if the pivot position is correct
    • check whether the obtained block is correct
    • fall back to binary search on a pivot miss
    These steps must be executed every time, and we're talking about at least 5 random memory accesses (into memory pages that mostly are not located near each other), 2 comparisons and 1 hash calculation, compared to a single direct memory access.
  3. BlockPtr is a combination of 3 fields: u64, u32 and u32, which makes it heavy compared to a reference (4/8 bytes depending on the architecture). The initial justification was that allocating items on the heap also brings some extra implicit fields (like ref counters), but if we combine direct references with arena allocators, we can greatly reduce any extra weight in this department (a sketch follows below).
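
To make the comparison concrete, a direct reference could be as thin as a NonNull wrapper over an arena-allocated item (hypothetical ItemRef/Item names, not the current API); Option<ItemRef> then stays pointer-sized thanks to the null-pointer niche:

```rust
use std::ptr::NonNull;

// Hypothetical direct reference to an arena-allocated item. Only valid
// while the owning arena is alive and items are never moved or freed
// individually.
#[derive(Clone, Copy)]
struct ItemRef(NonNull<Item>);

struct Item {
    left: Option<ItemRef>,  // pointer-sized, thanks to the NonNull niche
    right: Option<ItemRef>,
    // ... id, content, etc.
}

impl ItemRef {
    /// Dereference the item: a single memory access, no block store
    /// lookup. Unsafe: the caller must guarantee the backing arena
    /// outlives every ItemRef handed out.
    unsafe fn as_ref<'a>(self) -> &'a Item {
        &*self.0.as_ptr()
    }
}
```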

Optimize Item and Branch memory layout

Calculating sizes in Yjs was pretty easy, as every field was either a primitive or a reference to the heap, which basically meant it fit into 4/8 bytes. In Rust it's much easier to put on extra weight, as most fields are inlined structs, e.g.:

  • BlockPtr alone is 24B, and Option<BlockPtr> (used e.g. by an item's left/right neighbor pointers) is 32B.
  • Block alone is a 280B struct, Item is 272B, ItemContent is 88B, and we're not even talking about their total weight (the contents can introduce additional extra bytes on the heap).
  • The Branch struct is 152B, and since it's hidden behind Rc<RefCell<>> we should add 8-12 extra bytes as well, then more for its map field, which is always initialized, etc.

We should put some effort into reducing these sizes; e.g. by replacing Option<BlockPtr> with item references we could lose almost 100B per item. We can use different techniques, like smart pointers encoding some extra data into spare unused bits, etc.
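
As a quick sanity check of the numbers above, a stand-in layout mirroring the u64/u32/u32 description reproduces the 24B/32B sizes (field layout isn't guaranteed for the default repr, but this is what rustc produces in practice; these are not the actual y-crdt definitions):

```rust
use std::mem::size_of;

// Stand-ins mirroring the described layout, not the real structs.
#[allow(dead_code)]
struct Id { client: u64, clock: u32 }   // 12B of fields, padded to 16B (align 8)
#[allow(dead_code)]
struct BlockPtr { id: Id, pivot: u32 }  // 20B of fields, padded to 24B

fn main() {
    assert_eq!(size_of::<BlockPtr>(), 24);
    // No niche available, so Option adds a discriminant plus padding: 32B.
    assert_eq!(size_of::<Option<BlockPtr>>(), 32);
    // A plain reference has a null niche, so Option<&T> stays pointer-sized.
    assert_eq!(size_of::<Option<&Id>>(), size_of::<usize>());
}
```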

Use SmallVec / SmallString

Right now we're using the standard String/Vec implementations, which always allocate a dynamic buffer on the heap no matter how small they are. This may be suboptimal in some cases, e.g. when a single item stores just a single value or character. Rust enables us to define collection types that use stack/local memory up to a certain limit and automatically switch to heap-backed buffers once they pass a certain size. We should make use of them.
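
For instance, with the smallvec crate (analogous crates exist for strings, e.g. smallstr), small buffers live inline and only spill to the heap once they outgrow the inline capacity:

```rust
use smallvec::{smallvec, SmallVec};

fn main() {
    // Up to 4 bytes are stored inline, with no heap allocation.
    let mut v: SmallVec<[u8; 4]> = smallvec![1, 2, 3];
    assert!(!v.spilled());
    // Growing past the inline capacity transparently moves to the heap.
    v.extend([4, 5]);
    assert!(v.spilled());
}
```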

Make ClientBlockList work like a rope

It's a minor improvement, but right now searching for a random clock position uses binary search over a client block list. We could optimize this step by implementing the list as a rope structure, splitting it into buffers (which prevents copying the old buffer's contents into a new one on resize) with their lengths attached. Again, this is a minor thing.
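
A rough sketch of the idea, with a cached clock length per chunk so a lookup can skip whole chunks before searching within one (illustrative names, not the actual ClientBlockList):

```rust
// Each chunk caches the total clock length of its blocks, so a lookup can
// skip whole chunks (or binary-search over prefix sums) before searching
// within a single chunk. Illustrative only.
struct Chunk {
    block_lens: Vec<u32>, // stand-in for per-block clock lengths
    total_len: u32,       // cached sum of `block_lens`
}

struct RopeList {
    chunks: Vec<Chunk>,
}

impl RopeList {
    /// Returns (chunk index, block index) covering the given clock offset.
    fn find(&self, mut clock: u32) -> Option<(usize, usize)> {
        for (ci, chunk) in self.chunks.iter().enumerate() {
            if clock < chunk.total_len {
                // Only this chunk needs to be searched block by block.
                for (bi, &len) in chunk.block_lens.iter().enumerate() {
                    if clock < len {
                        return Some((ci, bi));
                    }
                    clock -= len;
                }
            } else {
                // Skip the whole chunk in one step.
                clock -= chunk.total_len;
            }
        }
        None
    }
}
```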
