
Dynamic Dispatch #132

Closed
tebbi opened this issue Sep 8, 2020 · 5 comments

Comments

@tebbi

tebbi commented Sep 8, 2020

It's still an open question how dynamic dispatch, in particular virtual and interface method calls, should work post-MVP.
To contribute to the discussion about this, I wrote up a (sketch of a) proposal containing two ideas:

@RossTate
Contributor

I find the idea of RTT dictionaries interesting, but I am confused about some of the applications. In particular, I am not sure why this would be more efficient than the current MVP. I understand the issue of having an extra word for storing the v-table, but that can be resolved by being able to put information into the RTT (at statically determinable offsets). So I am focusing here on time efficiency rather than space efficiency.

For class methods, yes, in the current MVP you need to cast the this pointer, but that overhead seems likely to be smaller than a hash-table lookup (even if you know the lookup will succeed). For Java, the dictionaries for frequently invoked methods like equals and toString will need to have an entry for every class (because hashing does not incorporate inheritance), making lookups for these methods particularly slow (and speculative optimizations for devirtualization for these methods particularly unreliable).
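
To make the cost I have in mind concrete, here is a rough sketch of such a per-method dictionary lookup in C; the layout and the probing scheme are my own assumptions for illustration, not something taken from the proposal:

```c
#include <stddef.h>

typedef void *(*method_t)(void *self);

/* Hypothetical layout: one dictionary per method (e.g. toString), keyed by
 * the receiver's RTT, so it needs an entry for every class. */
struct DictEntry  { void *rtt_key; method_t value; };
struct MethodDict { size_t size; struct DictEntry entries[]; };

static method_t dict_lookup(struct MethodDict *d, void *rtt, size_t rtt_hash) {
    size_t i = rtt_hash % d->size;
    /* Linear probing (an assumption): even a lookup that is known to succeed
     * still pays the hash, the probe loop, and a key comparison on every
     * call, versus a single slot load for a v-table. */
    while (d->entries[i].rtt_key != rtt)
        i = (i + 1) % d->size;
    return d->entries[i].value;
}
```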

For interface methods, the approach described in the Call Tags proposal (and in the referenced paper) enables the hash-index of a method and the collisions within a class to be statically determined, again enabling what is likely a more efficient implementation strategy than this proposal.

So I'm doubtful that this approach would provide better time performance, at least for dynamic dispatch. It seems to me that the opportunity for improvement lies more in providing a mechanism for placing data within RTTs (at statically determinable offsets) to improve space performance. My impression was that the design of such a feature was deliberately deferred until after the MVP. Then again, it's not mentioned in the Post-MVP document, but that might have just been an oversight or an effort to focus that document on items for which the design strategy was unclear. Anyway, I'm happy to share thoughts there if you're interested.

@tebbi
Author

tebbi commented Sep 16, 2020

Thank you for the interesting points!

I agree that for normal virtual dispatch, vtables are probably faster, and embedding them somehow into the RTT would also save the header word without the need for a more complex dispatch mechanism. By making this RTT storage special, we can probably even save the this pointer cast, similar to RTT dictionaries. This doesn't cover interface dispatch though.

Comparing RTT dictionaries with the Call Tags proposal, I would say both approaches are essentially hash tables: one uses a table in the RTT dictionary with RTTs as keys, while the Call Tags proposal uses a table in the vtable/RTT with call tags as keys, with the special feature of using jitted code to select the right entry from a hash bucket. Call tags and RTT dictionaries correspond to each other here, so we are comparing two ways to index into the same two-dimensional space. That being said, I agree that there are clear advantages to the second approach: offsets become static, since call sites usually know their call tag / RTT dictionary, which avoids an offset computation as part of the lookup. And the number of methods per class has less variance than the number of overrides per method, as your Java toString example illustrates.
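
To make the "same two-dimensional space" point concrete, here is a toy illustration (class, method, and table names are made up): dispatch resolves a (class, method) pair to a function, and the two proposals mainly differ in which axis is the outer key.

```c
#include <stdio.h>

typedef const char *(*method_t)(void);

static const char *cat_to_string(void) { return "Cat"; }
static const char *dog_to_string(void) { return "Dog"; }
static const char *cat_equals(void)    { return "Cat.equals"; }
static const char *dog_equals(void)    { return "Dog.equals"; }

enum { CLASS_CAT, CLASS_DOG, NUM_CLASSES };
enum { METHOD_TO_STRING, METHOD_EQUALS, NUM_METHODS };

/* RTT-dictionary axis: one table per method, indexed by the receiver's
 * class (its RTT). */
static method_t by_method[NUM_METHODS][NUM_CLASSES] = {
    [METHOD_TO_STRING] = { cat_to_string, dog_to_string },
    [METHOD_EQUALS]    = { cat_equals,    dog_equals    },
};

/* Call-tag axis: one table per class (in its vtable/RTT), indexed by the
 * method's call tag. */
static method_t by_class[NUM_CLASSES][NUM_METHODS] = {
    [CLASS_CAT] = { cat_to_string, cat_equals },
    [CLASS_DOG] = { dog_to_string, dog_equals },
};

int main(void) {
    /* Both views answer the same question for (Dog, toString). */
    printf("%s\n", by_method[METHOD_TO_STRING][CLASS_DOG]());
    printf("%s\n", by_class[CLASS_DOG][METHOD_TO_STRING]());
    return 0;
}
```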

Interestingly, I think that RTT dictionaries as described in my doc also allow for this second implementation strategy, without any change to the specification (I wasn't aware of that when writing the doc). What I describe now uses roughly the call-tags way to index things, but doesn't use jitted code to select the right function; that is still done in the normal hash-table style:

The RTT dictionary just becomes a globally unique identifier instead of a data structure, with the actual data being stored inside the RTT. Every RTT contains a fixed-size table to store RTT dictionary entries, plus an optional second-level storage if the fixed-size table becomes too small. The lookup sequence for an object o and an RTT dictionary d is then:

let rtt = o.rtt                                   // load the object's RTT
let key = rtt.table[hash(d)%table_size].key       // load the key stored in d's first-level slot
if (key != d) { ... call into slow case ... }     // miss: fall back to the second-level storage
use rtt.table[hash(d)%table_size].value           // hit: this is d's entry for o's RTT

None of these lines contains more than one load (if RTTs contain their table inline).
The table lookups might appear complicated, but since the call site knows d, hash(d)%table_size is a small static offset that can be embedded into the code, so each access is a single load instruction (on x86 and ARM). The whole fast path should then be about 5 instructions: load RTT, load key, compare key with dictionary, branch if not equal, load value.
The tables can be re-arranged dynamically so that the most frequently used entries end up in the first level, which is actually an advantage over call tags, where re-arranging the switch would require re-jitting it.
In addition, this approach preserves the other advantages of RTT dictionaries: flexible usage for both methods and values and no cast for the this pointer.
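
As a concrete sketch of this fast path, assuming a C-like engine representation (the struct layout, table size, and slow-path scan below are my own illustration, not part of the specification sketch):

```c
#include <stddef.h>

#define TABLE_SIZE 8                      /* fixed first-level size, chosen arbitrarily */

typedef void (*func_t)(void *self);

struct DictEntry { void *key; func_t value; };   /* key = RTT dictionary identifier */

struct Rtt {
    struct DictEntry table[TABLE_SIZE];   /* first-level entries, stored inline */
    struct DictEntry *overflow;           /* optional second-level storage */
    size_t overflow_size;
};

struct Object { struct Rtt *rtt; };

/* Slow case: scan the second-level storage (illustrative only). */
static func_t lookup_slow(struct Rtt *rtt, void *d) {
    for (size_t i = 0; i < rtt->overflow_size; i++)
        if (rtt->overflow[i].key == d)
            return rtt->overflow[i].value;
    return NULL;                          /* no entry for this RTT dictionary */
}

/* `slot` stands for hash(d) % TABLE_SIZE; since the call site knows d, it is
 * a compile-time constant, so each access is one load at a static offset. */
static inline func_t lookup_fast(struct Object *o, void *d, size_t slot) {
    struct Rtt *rtt = o->rtt;             /* 1. load RTT   */
    struct DictEntry *e = &rtt->table[slot];
    if (e->key != d)                      /* 2. load key  3. compare  4. branch */
        return lookup_slow(rtt, d);
    return e->value;                      /* 5. load value */
}
```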

@RossTate
Contributor

That does seem like an improvement. But I am still concerned that there will be frequent collisions, both due to the large number of entries (classes tend to have more virtual methods than interface methods) and due to uncoordinated index assignment (with interface methods, you can typically pick a different index for each method of an interface so they do not collide with each other). I worry that this would still be substantial overhead for something like field access (though it is more reasonable for calls).

So I think the gain here is that the revision puts the data in the RTT, but there is still room for improvement: making it possible to ensure that a particular datum is always at a specific location within the RTT.

@tebbi
Author

tebbi commented Sep 28, 2020

True, for virtual dispatch, hashing is clearly less than optimal. With a slight change to the specification, one could allow the user to specify a key when creating an RTT dictionary, which is then used to map to offsets; the idea is that for virtual methods, the user can specify the obvious consecutive offsets manually, while for interface methods, the user could either resort to random values or pick carefully selected non-colliding offsets.
At this point, it would become very interesting to go for a statically typed option again, since that could result in virtual dispatch without any unnecessary overhead: load the RTT, load the method at the known offset, no need for any checking or casting of "this". Specifying such a system would become easier if we replaced RTTs with static types, since that avoids the complications of attaching all the RTT dictionary information to the RTT at creation time. That's one of the reasons why I'm interested in moving in a more static direction for nominal types.
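
A sketch of what that variant could look like, with made-up names and layout: if the producer assigns the virtual methods of a hierarchy consecutive keys, every call site knows its slot statically, and the fast path is just two dependent loads with no key check and no cast of "this".

```c
typedef void (*func_t)(void *self);

struct Rtt {
    void *supertypes;                     /* other RTT data, elided */
    func_t slots[];                       /* entries placed at the user-chosen keys */
};

struct Object { struct Rtt *rtt; };

/* Keys chosen by the module producer: virtual methods of one hierarchy take
 * the obvious consecutive offsets; an interface method picks a key selected
 * not to collide with them (or falls back to a random value). */
enum {
    SLOT_ANIMAL_TO_STRING      = 0,
    SLOT_ANIMAL_EQUALS         = 1,
    SLOT_ANIMAL_SPEAK          = 2,
    SLOT_COMPARABLE_COMPARE_TO = 7,
};

static inline func_t lookup_speak(struct Object *o) {
    /* load the RTT, load the method at the known offset; nothing to check */
    return o->rtt->slots[SLOT_ANIMAL_SPEAK];
}
```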

@tlively
Member

tlively commented Nov 1, 2022

We have various ideas in the post-MVP doc to improve dynamic dispatch via custom meta-objects or more direct support for vtables, and PRs adding more would be welcome. Since this issue is not actionable for the MVP, I'll close it.

@tlively tlively closed this as completed Nov 1, 2022