Prefer lexical names list #1307
Conversation
Shouldn't that be decided by the number of elements at which a linear lookup becomes slower than a hash lookup, regardless of the number of small frames?
src/core/bytecode.c (outdated)

     * search of the list. This should be faster for short lists.
     * The choice of 5 entries is guesswork, and ought to be refined by
     * benchmarking. This location is the only place where we need this
     * "magic" constant, hence I don't see the ned to #define it
"ned"
@lizmat Not sure if I'm answering the wrong question here...
There is a list and (currently) a lookup hash maintained for each and every static frame. The threshold for "when not to bother with a hash" is going to be per-frame. It's easiest to have it as a constant, and the same constant for all frames.
You're right that it should be "becomes slower than a hash lookup" - I have no good idea when that is, or even how to measure it well. 5 was a guess. The memory overhead of the hash structure is a lot more than 5 structs, and there's a bunch more CPU work before you even get to compare the wanted string key with the entry in the hash. Plus doing the hash lookup generates (at least) 1 more pointer indirection, 2 if your hash isn't perfectly spread out and you have two entries in the same linked list.
…mes. The former is an array, the latter a hash. Given that they have the same contents, it's much simpler to iterate over the list.
This permits us to easily tweak the storage format. The "Indexes were formerly stored off-by-one" comment related to code before 2013, back when hashes were implemented with the Apache Portable Runtime. I don't think that it's useful to retain it. :-)
If there are 5 or fewer lexicals in the static frame, don't create a lookup hash for them. Instead, find them by a linear scan of the list. For a short list, this won't be slower, and we save some memory. The value 5 is an educated guess - there might be a better cut-over point. The trade-offs include the amount of CPU used on brute-force string comparison and the number of cache misses from chasing pointers in the hash structure, and will also be affected by whether names in the frame have different lengths. (Our string comparison will rapidly return "not equal" for strings of different lengths.)
e24eb0f to 3658e4c
This all looks sensible to me. I think 5 is a reasonable guess; in many cases we'll have frames where all lexicals are lowered except those needed dynamically ($/ and $!), which cannot be lowered to locals. The strings will typically also be short, because variable names don't tend to be all that long - and non-short ones are more likely to have different lengths and hit the length fast path anyway.
These aren't used anymore after MoarVM#1307
The three patches are related, but each needs a slightly different review.
The first eliminates 5 hash iterations, replacing them with array traversal. It would be a no-brainer, but it affects the debug server, and I don't know how to test that.
I believe that wrapping access to lexical_names makes sense even if the next patch isn't wanted, but it makes more sense with it.
The third patch, eliminating the lookup hash for small lists, should help performance slightly, but