-
Notifications
You must be signed in to change notification settings - Fork 382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize vector literals by storing them in the constant table #1096
Optimize vector literals by storing them in the constant table #1096
Conversation
…rted only for 3 arguments)
…supporting it would require a new opcode
Linking #1091 because all the context is missing. |
Common/include/Luau/Bytecode.h
Outdated
LBC_VERSION_MAX = 4, | ||
LBC_VERSION_TARGET = 4, | ||
LBC_VERSION_MAX = 5, | ||
LBC_VERSION_TARGET = 5, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TARGET has to be left at 4 until flag is removed (Roblox will take care of this future removal).
BytecodeBuilder::getVersion()
should be updated to return 5 when flag is enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
Co-authored-by: vegorov-rbx <75688451+vegorov-rbx@users.noreply.github.com>
Co-authored-by: vegorov-rbx <75688451+vegorov-rbx@users.noreply.github.com>
Co-authored-by: vegorov-rbx <75688451+vegorov-rbx@users.noreply.github.com>
…inen/luau into petrih-vector-literals
@vegorov-rbx Thanks! Committed your suggestions and fixed bytecode versioning. |
uint64_t value; | ||
uint32_t extra = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This unnecessarily increases the size of ConstantKey to 24 bytes due to its placement after value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I'll fix it on Monday.
One more thought about 4d literals: We could always store 4 components in compiler data structures and bytecode, i.e. LBC_CONSTANT_VECTOR would have 4 components regardless of LUA_VECTOR_SIZE. In luau_load call setvvalue with the 4 components read from the input. When LUA_VECTOR_SIZE = 3, the VM would simply truncate the vector to 3 components. The only downside I see is that constants would always be stored with 4 components in bytecode dumps. I think this would not break any existing code or assumptions and would be a reasonable compromise. Note that luaF_vector has the same truncating behavior. My main motivation is to avoid having to remember rules of thumb such as:
We could leave this for future work, but if we did this now we could avoid another bytecode version bump. What do you think @zeux? |
I'm curious what @vegorov-rbx thinks, I think I'd be fine with this given that the behavior of builtin vector constructor is to ignore 4th component in vector3 builds, and default it to 0 in vector4 builds, so replicating this logic in the compiler is fine. The one small downside is that vector4 is an exceptional configuration (I know you use it, but that's not the default and most configs won't use it), so we're paying a small extra price in bytecode size for a non-default configuration. (plus small extra compile time cost for inflated constants unfortunately) However, the price is not that large, and I think it's probably still a net win vs what we currently get with fastcall and separate numeric constants encoded into instructions / constant table as appropriate. FWIW on our internal codebase this entire change is fairly neutral on code size, so we don't use constant vectors as heavily, so from the operational perspective it's fine either way. |
I've updated the PR to support four arguments so that we can see more easily how the code looks like in practice. Waiting for comments, but from my side it looks ok now. |
I think having 4 components in bytecode is a good solution given that while vector4 configuration is not the default one, it's still supported. |
I'm getting codecov test failure... I think these happen because LuauVectorLiterals is not enabled by default. For example, codecov reports that |
Not sure what's up with codecov; it's not supposed to fail at all regardless of the diff coverage. Should be safe to ignore, might be an infrastructure issue on their side. |
Add a comment
Now with one less space
Thanks for the approval! |
With this optimization, built-in vector constructor calls with exactly three arguments are detected by the compiler and turned into vector constants when the arguments are constant numbers. Requires optimization level 2 because built-ins are not folded otherwise by the compiler. Bytecode version is bumped because of the new constant type, but old bytecode versions can still be loaded.
The following synthetic benchmark shows ~6.6x improvement.
Also tried a more real world scenario and could see a few percent improvement.
Added a new fast flag LuauVectorLiterals for enabling the feature.