
Suggestion for even higher performance gains #4

Open
cipharius opened this issue Feb 28, 2024 · 3 comments

Comments

@cipharius

Looking at your current encode/decode logic, the system seems neat and could fairly easily be adjusted to drop the need for a dynamically growing buffer object by allocating a buffer of exactly the right size.

I think this could be achieved by adding a new method called size to the dataTypeInterface<T>, which would compute and return the serialized size of the value it represents.

Not only would this greatly speed up your encoding/decoding process, it would also simplify the code by removing the explicit memory management via bufferWriter.alloc that currently scatters dynamic allocations across the codebase.
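A minimal sketch of the idea in Luau, assuming a hypothetical dataTypeInterface shape with size and write functions (the names u32, str, encode, and the interface layout are illustrative assumptions, not the library's actual API):

```lua
--!strict
-- Hypothetical sketch: each data type reports its serialized size, so the
-- encoder can sum sizes and allocate one exact-size buffer up front.

local u32 = {
	size = function(_value: number): number
		return 4
	end,
	write = function(b: buffer, offset: number, value: number): number
		buffer.writeu32(b, offset, value)
		return offset + 4
	end,
}

local str = {
	size = function(value: string): number
		return 4 + #value -- 4-byte length prefix plus the string bytes
	end,
	write = function(b: buffer, offset: number, value: string): number
		buffer.writeu32(b, offset, #value)
		buffer.writestring(b, offset + 4, value)
		return offset + 4 + #value
	end,
}

-- Encode a packet: first pass sums sizes, second pass writes into a
-- single buffer, so no reallocation ever happens mid-encode.
local function encode(types, values): buffer
	local total = 0
	for i, t in types do
		total += t.size(values[i])
	end

	local b = buffer.create(total) -- one allocation of exactly the right size
	local offset = 0
	for i, t in types do
		offset = t.write(b, offset, values[i])
	end
	return b
end
```

The key trade-off is one extra pass over the values to compute sizes, in exchange for never paying for a grow-and-copy of the output buffer.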

The same idea could also be applied to the server and client queueing logic. Instead of appending each new packet's buffer to a pending buffer, keep all the pending buffers in a table, which becomes your queue. Then, right before sending, compute the total packet size, allocate one large buffer, and copy each queued buffer's contents into it.
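The queueing variant could look roughly like this in Luau (a sketch under the same assumptions; queue and flush are hypothetical names, and the real replication layer would carry more state):

```lua
--!strict
-- Sketch of the proposed queueing change: pending packets stay as
-- separate buffers in a table, and are merged into one buffer at send time.

local pending: { buffer } = {}

local function queue(packet: buffer)
	table.insert(pending, packet)
end

local function flush(): buffer
	-- First pass: compute the total size of all queued packets.
	local total = 0
	for _, packet in pending do
		total += buffer.len(packet)
	end

	-- Single allocation, then copy each packet in sequence.
	local merged = buffer.create(total)
	local offset = 0
	for _, packet in pending do
		buffer.copy(merged, offset, packet)
		offset += buffer.len(packet)
	end

	table.clear(pending)
	return merged
end
```

Each queue call does one table insert instead of a buffer grow-and-copy, and flush pays exactly one allocation plus one copy per packet.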

I use a similar method in my Luau msgpack implementation, and the fact that it can outperform the native JSON encode/decode methods shows that one can now actually reason about memory allocation patterns in Luau and do a better job than native code that most likely neglected these concerns.

@ffrostfall
Owner

This method is faster and performs better without native code generation; with native code generation enabled, however, buffers outperform tables and functions.

I used to use an approach like this, and it works well w/o codegen enabled. But the second you enable codegen, things change.

I'll likely be using this approach on the client, and the resizing approach on the server.

@cipharius
Author

I don't see how native codegen could change the memory access and allocation patterns. I am not suggesting using a table for the data itself, but a single buffer that is allocated once instead of being reallocated multiple times.

Memory reallocation has significant performance implications even in compiled code, whether C, C++, or Rust.

@ffrostfall
Owner

Because using purely buffers is faster than using tables alongside buffers. That's all.

Tables also need to allocate their own memory. I'll leave this open because the point is accurate: this is the better method when you account for the overhead of creating buffers, but in native codegen there is no such overhead.
