[Truffle] New implementation of non-small hashes #2328

Merged
merged 12 commits into from Dec 17, 2014

Contributor

@chrisseaton chrisseaton commented Dec 16, 2014

@eregon @nirvdrum please review.

We've got two new things here. First, we've got a proper implementation of hashes with more than 3 key-value pairs. We need a custom data structure because we have to call the Ruby methods hash and eql?, and we need somewhere to store the state for each call site. We can't re-use JRuby's, for the usual reasons, and with the Truffle style of storage strategies we don't want encapsulation - we want the logic to be in the nodes.

Secondly, I've moved some of the hash logic into a node - the basic operation of finding the right bucket. That lets us store the state and possibly do some branch and value profiling, while still providing a nice method call interface.

I've added a test to stress hash implementations.

We aren't doing any rebalancing for overloaded indices yet - we never grow the number of slots.
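
To make the description above concrete, here is a minimal sketch of a chained hash table with a fixed number of slots. It is not the code from this pull request: the class and method names are hypothetical, Java's `Objects.hashCode` and `Objects.equals` stand in for the Ruby `hash` and `eql?` calls, and the sketch ignores insertion order and the node-based structure discussed here.

```java
import java.util.Objects;

// Hypothetical sketch, not the PR's implementation.
final class ChainedHashSketch {
    // One key-value pair plus a link to the next pair in the same slot.
    static final class Entry {
        final int hashed;
        final Object key;
        Object value;
        Entry next;

        Entry(int hashed, Object key, Object value) {
            this.hashed = hashed;
            this.key = key;
            this.value = value;
        }
    }

    private final Entry[] slots; // never resized, so chains grow as the table fills
    private int size;

    ChainedHashSketch(int slotCount) {
        slots = new Entry[slotCount];
    }

    private int indexFor(int hashed) {
        return (hashed & 0x7fffffff) % slots.length; // clear the sign bit first
    }

    // Walk the chain for the key's slot; returns null when the key is absent.
    private Entry find(Object key) {
        int hashed = Objects.hashCode(key); // stand-in for calling Ruby's hash
        for (Entry e = slots[indexFor(hashed)]; e != null; e = e.next) {
            if (e.hashed == hashed && Objects.equals(e.key, key)) { // stand-in for eql?
                return e;
            }
        }
        return null;
    }

    Object get(Object key) {
        Entry e = find(key);
        return e == null ? null : e.value;
    }

    void put(Object key, Object value) {
        Entry existing = find(key);
        if (existing != null) {
            existing.value = value;
            return;
        }
        int hashed = Objects.hashCode(key);
        Entry added = new Entry(hashed, key, value);
        int index = indexFor(hashed);
        added.next = slots[index]; // prepend to the chain for this slot
        slots[index] = added;
        size++;
    }

    int size() {
        return size;
    }
}
```

Because the slot count never changes, lookups in this sketch degrade towards a linear scan once the number of entries is much larger than the number of slots.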


@Specialization
public RubyArray uniq(RubyArray array) {
notDesignedForCompilation();
Member

@eregon eregon Dec 16, 2014

MRI uses a temporary Hash to avoid O(n^2) here.
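
As an illustration of that point, a hash-backed uniq can be sketched in plain Java as below. This is only a sketch: the class name is hypothetical, and Java's `equals`/`hashCode` stand in for Ruby's `eql?`/`hash`. It shows the general technique of using a temporary hash-based set, not MRI's or Truffle's actual code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of uniq using a temporary hash-based set.
final class UniqSketch {
    static <T> List<T> uniq(List<T> input) {
        Set<T> seen = new HashSet<>();
        List<T> result = new ArrayList<>();
        for (T element : input) {
            // add() returns false when the element was already present,
            // so each element costs one expected O(1) set operation
            // instead of a scan over everything kept so far.
            if (seen.add(element)) {
                result.add(element);
            }
        }
        return result;
    }
}
```

The first occurrence of each element is kept and the original order is preserved.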

Member

@eregon eregon Dec 16, 2014

So maybe we could have an overload of notDesignedForCompilation() that maps to CompilerAsserts.neverPartOfCompilation(String message), to keep track of what is not compilation-ready when it is not obvious.

Contributor Author

@chrisseaton chrisseaton Dec 16, 2014

Good idea - we can go through and add a reason for all the notDesigneds when we do a spring clean after 0.6 is released.

Member

@eregon eregon commented Dec 16, 2014

Looks good globally.
Of course HashOperations should disappear, except maybe for some debug stuff. As well as DebugOps.send.

I am slightly uncomfortable with the naming of Bucket for what is a "hashtable entry".
But I understand the need to differentiate from a usual Map.Entry-like entry.
But maybe we don't want such a Map.Entry-like entry? Walking directly over what are called Buckets here sounds good (it might use an iterator), except maybe for data race issues.
In my intuition, a bucket is usually an element in the storage array, the array of buckets/slots. We likely don't have such a concept as an object in a practical implementation, as it's just a linked chain of hashtable entries.

public static boolean isOtherObjectArray(RubyHash hash, RubyHash other) {
return other.getStore() instanceof Object[];
// Arrays are covariant in Java!
return hash.getStore() instanceof Object[] && !(hash.getStore() instanceof Bucket[]);
Member

@eregon eregon Dec 16, 2014

So this means there is no good way to check with instanceof that the ObjectArray strategy is actually using just an Object[]?
But getClass() should do it, so should the assertions in the RubyHash constructor be adapted?
I also wonder whether 2 instanceof checks are better than 1 getClass() comparison.
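
The covariance issue raised here can be shown with a small self-contained sketch. The `Entry` class and the method names below are hypothetical stand-ins, not the PR's types. The sketch only illustrates how `instanceof Object[]` also accepts arrays of any reference type, while an exact `getClass()` comparison does not.

```java
// Hypothetical sketch of the three checks discussed in this thread.
final class StoreKindSketch {
    static final class Entry {}

    // True for Object[], but also for Entry[], String[] and so on,
    // because Java arrays are covariant.
    static boolean looksLikeObjectArray(Object store) {
        return store instanceof Object[];
    }

    // True only when the runtime class is exactly Object[].
    static boolean isExactlyObjectArray(Object store) {
        return store != null && store.getClass() == Object[].class;
    }

    // The two-instanceof form: excludes Entry[] but still accepts
    // other reference array types such as String[].
    static boolean isObjectArrayButNotEntries(Object store) {
        return store instanceof Object[] && !(store instanceof Entry[]);
    }
}
```

If the store can only ever be an `Object[]` or an `Entry[]`, the last two checks agree. They differ only for other array types.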

Contributor

@nirvdrum nirvdrum commented Dec 16, 2014

This needs to merge in the change from 7a4ab54.


- for (int n = 0; n < RubyHash.HASHES_SMALL; n++) {
+ for (int n = 0; n < HashOperations.SMALL_HASH_SIZE; n++) {
      if (n < size && eqlNode.call(frame, store[n * 2], "eql?", null, key)) {
          return store[n * 2 + 1];
Contributor

@nirvdrum nirvdrum Dec 16, 2014

This code was already here, but maybe it'd be easier to follow if KEY_OFFSET and VALUE_OFFSET constants were used.

Contributor Author

@chrisseaton chrisseaton Dec 16, 2014

Good idea - I'll do that on the master branch later.
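
A rough sketch of what the suggested constants could look like for a packed key-value array follows. `KEY_OFFSET` and `VALUE_OFFSET` come from the suggestion above. `ELEMENTS_PER_PAIR`, the class name, and the use of `Objects.equals` in place of the Ruby `eql?` call are assumptions made for illustration only.

```java
import java.util.Objects;

// Hypothetical sketch of a small hash stored as [k0, v0, k1, v1, ...].
final class PackedPairsSketch {
    static final int ELEMENTS_PER_PAIR = 2; // assumed name
    static final int KEY_OFFSET = 0;
    static final int VALUE_OFFSET = 1;

    // Linear scan over the first `size` pairs; returns null when the key is absent.
    static Object lookup(Object[] store, int size, Object key) {
        for (int n = 0; n < size; n++) {
            int base = n * ELEMENTS_PER_PAIR;
            if (Objects.equals(store[base + KEY_OFFSET], key)) { // stand-in for eql?
                return store[base + VALUE_OFFSET];
            }
        }
        return null;
    }
}
```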


- import java.util.LinkedHashMap;
+ import java.util.*;
Contributor

@nirvdrum nirvdrum Dec 16, 2014

Minor, but JRuby core prefers these be expanded.

Contributor

@nirvdrum nirvdrum commented Dec 16, 2014

I'll have to check out the branch to navigate the code since I'm finding it too annoying in GitHub's web UI. But at first blush this looks pretty good.

Member

@eregon eregon commented on 35c1350 Dec 16, 2014

Looks good!
previousBucket might not be needed for many operations, but that is a future optimization.
It can also be re-computed cheaply if we know the chain at an index does not get too long.
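
To illustrate recomputing the predecessor rather than storing a backwards link, here is a minimal sketch of removing a key from a singly linked chain. The `Entry` shape and method name are hypothetical, and `Objects.equals` again stands in for `eql?`. The cost is one walk along the chain, which stays cheap as long as chains are short.

```java
import java.util.Objects;

// Hypothetical sketch: removal from a singly linked chain of entries.
final class ChainRemovalSketch {
    static final class Entry {
        final Object key;
        Entry next;

        Entry(Object key, Entry next) {
            this.key = key;
            this.next = next;
        }
    }

    // Returns the (possibly new) head of the chain after removing `key`.
    static Entry remove(Entry head, Object key) {
        Entry previous = null; // tracked during the walk instead of stored per entry
        for (Entry e = head; e != null; previous = e, e = e.next) {
            if (Objects.equals(e.key, key)) {
                if (previous == null) {
                    return e.next; // removed the first entry in the chain
                }
                previous.next = e.next; // unlink from the middle or end
                return head;
            }
        }
        return head; // key not present, chain unchanged
    }
}
```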

Contributor Author

@chrisseaton chrisseaton commented Dec 16, 2014

I renamed the buckets to entries and removed the backwards link in the bucket chain.

Member

@eregon eregon commented on 96c8da6 Dec 16, 2014

This looks funny but I guess it makes sense once you know this Entry is actually a chain of Entries, that is a bucket.

Member

@eregon eregon commented on 96c8da6 Dec 16, 2014

needs a rename here.

Member

@eregon eregon commented on 96c8da6 Dec 16, 2014

the chain of entries for a given index, forming a bucket

chrisseaton added a commit that referenced this issue Dec 17, 2014
[Truffle] New implementation of non-small hashes
@chrisseaton chrisseaton merged commit 8c9f381 into master Dec 17, 2014
1 check failed
@chrisseaton chrisseaton deleted the truffle-hash branch Dec 17, 2014
@enebo enebo added this to the JRuby 9.0.0.0 milestone Dec 22, 2014