Skip to content

Loading…

Review request: getHash candidate for TypeInfo_AssociativeArray #171

Merged
merged 4 commits into from

3 participants

@quickfur
D Programming Language member

To address AA problems such as issue 6210 and issue 7371, I wrote an override for getHash in TypeInfo_AssociativeArray. I've tested the code and confirmed it fixes issues 6210 and 7371. Basically it iterates over key/value pairs, and computes the hash of the key/value's respective hashes, summing the hashes over all pairs. (See code comments for explanation.)

However, I'm not sure if aaA.d is the best place to put this code. It adds another function to the aaA.d API, which changes D's AA ABI. However, I'm not sure if this function is implementable in object_.d without really dirty hacks such as struct AssociativeArray(K,V).{Slot,Hashtable} (and even with those, I don't know how to instantiate AssociativeArray given only the associated TypeInfo object).

Comments?

@quickfur
D Programming Language member

Rebased to prevent stagnation.

@andralex andralex commented on an outdated diff
src/rt/aaA.d
@@ -796,6 +796,32 @@ BB* _d_assocarrayliteralTX(TypeInfo_AssociativeArray ti, void[] keys, void[] val
}
+static TypeInfo_AssociativeArray _aaUnwrapTypeInfo(TypeInfo tiRaw)
+{
+ TypeInfo_AssociativeArray ti;
+ while (true)
+ {
+ if ((ti = cast(TypeInfo_AssociativeArray)tiRaw) !is null)
+ break;
+ else if (auto tiConst = cast(TypeInfo_Const)tiRaw) {
@andralex D Programming Language member
andralex added a note

remove "else"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@andralex andralex commented on an outdated diff
src/rt/aaA.d
((8 lines not shown))
+ {
+ if ((ti = cast(TypeInfo_AssociativeArray)tiRaw) !is null)
+ break;
+ else if (auto tiConst = cast(TypeInfo_Const)tiRaw) {
+ // The member in object_.d and object.di differ. This is to ensure
+ // the file can be compiled both independently in unittest and
+ // collectively in generating the library. Fixing object.di
+ // requires changes to std.format in Phobos, fixing object_.d
+ // makes Phobos's unittest fail, so this hack is employed here to
+ // avoid irrelevant changes.
+ static if (is(typeof(&tiConst.base) == TypeInfo*))
+ tiRaw = tiConst.base;
+ else
+ tiRaw = tiConst.next;
+ } else
+ assert(0); // ???
@andralex D Programming Language member
andralex added a note

Add a note mentioning the bugs if this is ever reached.

@quickfur D Programming Language member

Hi Andrei, this code was moved verbatim from __aaEqual so that it can be reused. I don't know the reason for the original assert.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@andralex andralex commented on an outdated diff
src/rt/aaA.d
@@ -903,3 +911,78 @@ int _aaEqual(TypeInfo tiRaw, AA e1, AA e2)
return 1; // equal
}
+
+
+/*****************************************
+ * Computes a hash value for the entire AA
+ * Returns:
+ * Hash value
+ */
+extern (C)
+hash_t _aaGetHash(AA* aa, TypeInfo tiRaw)
+{
+ import rt.util.hash;
+
+ hash_t h = 0;
+
+ if (aa.a)
@andralex D Programming Language member
andralex added a note

This guards a large portion of the code. Better yet mention

if (!aa.a) 
{
    return 0;
}

at the very beginning of the function, after which you get to handle the real case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@andralex andralex commented on an outdated diff
src/rt/aaA.d
((24 lines not shown))
+
+ foreach (e; aa.a.b)
+ {
+ while (e)
+ {
+ auto pkey = cast(void*)(e + 1);
+ auto pvalue = pkey + keysize;
+
+ // Compute a hash for the key/value pair by hashing their
+ // respective hash values.
+ hash_t[2] hpair;
+ hpair[0] = e.hash;
+ hpair[1] = valueti.getHash(pvalue);
+
+ // Combine the hash of the key/value pair with the running hash
+ // value using a commutative operator (+) so that the resulting
@andralex D Programming Language member
andralex added a note

s/commutative/associative/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@andralex
D Programming Language member

@quickfur What's the status of this? Thanks!

@andralex
D Programming Language member

ping?

@quickfur
D Programming Language member

Rebased & updated per Andrei's comments.

@andralex
D Programming Language member

This now breaks the builds, presumably because of the const changes (some of which I'll undo). Stay tuned and please fix for next week. Thanks!

@quickfur
D Programming Language member

Are you reverting making toHash const? The changes with const toHash are non-trivial. If you're reverting making toHash const, I'm going to wait until afterwards.

@andralex
D Programming Language member

Honest I'm not sure what the best path is for toHash. @9rnsr ?

@andralex
D Programming Language member

ping @9rnsr and @quickfur. How should we move forward with this?

@andralex
D Programming Language member

reping

@quickfur
D Programming Language member

Sorry for the late update; this morning I finally figured out a simple way to fix the code to work with const toHash without requiring massive changes. So this has been rebased and is ready to merge now.

@quickfur
D Programming Language member

Rebased to avoid stagnation

@quickfur
D Programming Language member

Rebased to avoid stagnation.

@andralex andralex was assigned
@alexrp
D Programming Language member

Assigning to @andralex

@quickfur
D Programming Language member

Should I squash it too?

@alexrp
D Programming Language member

@andralex waiting on you

@andralex
D Programming Language member

worksforme

@andralex andralex merged commit 0650b6e into D-Programming-Language:master

1 check was pending

Details default Pass: 6, Pending: 3
@andralex
D Programming Language member

merged, thanks!

@quickfur quickfur deleted the quickfur:gethash_fixes branch
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
This page is out of date. Refresh to see the latest.
Showing with 110 additions and 20 deletions.
  1. +6 −0 src/object_.d
  2. +104 −20 src/rt/aaA.d
View
6 src/object_.d
@@ -689,6 +689,11 @@ class TypeInfo_AssociativeArray : TypeInfo
this.value == c.value;
}
+ override hash_t getHash(in void* p) nothrow @trusted
+ {
+ return _aaGetHash(cast(void*)p, this);
+ }
+
// BUG: need to add the rest of the functions
override @property size_t tsize() nothrow pure const
@@ -2100,6 +2105,7 @@ extern (C)
int _aaApply2(void* aa, size_t keysize, _dg2_t dg);
void* _d_assocarrayliteralT(TypeInfo_AssociativeArray ti, size_t length, ...);
+ hash_t _aaGetHash(void* aa, const(TypeInfo) tiRaw) nothrow;
}
struct AssociativeArray(Key, Value)
View
124 src/rt/aaA.d
@@ -103,7 +103,7 @@ struct AA
* in value.
*/
-size_t aligntsize(size_t tsize)
+size_t aligntsize(size_t tsize) nothrow
{
version (D_LP64)
// Size of key needed to align value on 16 bytes
@@ -802,6 +802,34 @@ BB* _d_assocarrayliteralTX(TypeInfo_AssociativeArray ti, void[] keys, void[] val
}
+static TypeInfo_AssociativeArray _aaUnwrapTypeInfo(const(TypeInfo) tiRaw) nothrow
+{
+ const(TypeInfo)* p = &tiRaw;
+ TypeInfo_AssociativeArray ti;
+ while (true)
+ {
+ if ((ti = cast(TypeInfo_AssociativeArray)*p) !is null)
+ break;
+
+ if (auto tiConst = cast(TypeInfo_Const)*p) {
+ // The member in object_.d and object.di differ. This is to ensure
+ // the file can be compiled both independently in unittest and
+ // collectively in generating the library. Fixing object.di
+ // requires changes to std.format in Phobos, fixing object_.d
+ // makes Phobos's unittest fail, so this hack is employed here to
+ // avoid irrelevant changes.
+ static if (is(typeof(&tiConst.base) == TypeInfo*))
+ p = &tiConst.base;
+ else
+ p = &tiConst.next;
+ } else
+ assert(0); // ???
+ }
+
+ return ti;
+}
+
+
/***********************************
* Compare AA contents for equality.
* Returns:
@@ -823,25 +851,7 @@ int _aaEqual(TypeInfo tiRaw, AA e1, AA e2)
// Check for Bug 5925. ti_raw could be a TypeInfo_Const, we need to unwrap
// it until reaching a real TypeInfo_AssociativeArray.
- TypeInfo_AssociativeArray ti;
- while (true)
- {
- if ((ti = cast(TypeInfo_AssociativeArray)tiRaw) !is null)
- break;
- else if (auto tiConst = cast(TypeInfo_Const)tiRaw) {
- // The member in object_.d and object.di differ. This is to ensure
- // the file can be compiled both independently in unittest and
- // collectively in generating the library. Fixing object.di
- // requires changes to std.format in Phobos, fixing object_.d
- // makes Phobos's unittest fail, so this hack is employed here to
- // avoid irrelevant changes.
- static if (is(typeof(&tiConst.base) == TypeInfo*))
- tiRaw = tiConst.base;
- else
- tiRaw = tiConst.next;
- } else
- assert(0); // ???
- }
+ TypeInfo_AssociativeArray ti = _aaUnwrapTypeInfo(tiRaw);
/* Algorithm: Visit each key/value pair in e1. If that key doesn't exist
* in e2, or if the value in e1 doesn't match the one in e2, the arrays
@@ -909,3 +919,77 @@ int _aaEqual(TypeInfo tiRaw, AA e1, AA e2)
return 1; // equal
}
+
+
+/*****************************************
+ * Computes a hash value for the entire AA
+ * Returns:
+ * Hash value
+ */
+extern (C)
+hash_t _aaGetHash(AA* aa, const(TypeInfo) tiRaw) nothrow
+{
+ import rt.util.hash;
+
+ if (!aa.a)
+ return 0;
+
+ hash_t h = 0;
+ TypeInfo_AssociativeArray ti = _aaUnwrapTypeInfo(tiRaw);
+ auto keyti = ti.key;
+ auto valueti = ti.next;
+ const keysize = aligntsize(keyti.tsize());
+
+ foreach (e; aa.a.b)
+ {
+ while (e)
+ {
+ auto pkey = cast(void*)(e + 1);
+ auto pvalue = pkey + keysize;
+
+ // Compute a hash for the key/value pair by hashing their
+ // respective hash values.
+ hash_t[2] hpair;
+ hpair[0] = e.hash;
+ hpair[1] = valueti.getHash(pvalue);
+
+ // Combine the hash of the key/value pair with the running hash
+ // value using an associative operator (+) so that the resulting
+ // hash value is independent of the actual order the pairs are
+ // stored in (important to ensure equality of hash value for two
+ // AA's containing identical pairs but with different hashtable
+ // sizes).
+ h += hashOf(hpair.ptr, hpair.length * hash_t.sizeof);
+
+ e = e.next;
+ }
+ }
+
+ return h;
+}
+
+unittest
+{
+ string[int] key1 = [1: "true", 2: "false"];
+ string[int] key2 = [1: "false", 2: "true"];
+
+ // AA lits create a larger hashtable
+ int[string[int]] aa1 = [key1: 100, key2: 200];
+
+ // Ensure consistent hash values are computed for key1
+ assert((key1 in aa1) !is null);
+
+ // Manually assigning to an empty AA creates a smaller hashtable
+ int[string[int]] aa2;
+ aa2[key1] = 100;
+ aa2[key2] = 200;
+
+ assert(aa1 == aa2);
+
+ // Ensure binary-independence of equal hash keys
+ string[int] key2a;
+ key2a[1] = "false";
+ key2a[2] = "true";
+
+ assert(aa1[key2a] == 200);
+}
Something went wrong with that request. Please try again.