Fix Bug 286: Returning AA value whose key is a RefCounted type gets wrong value. #640

ibuclaw · 2018-03-11T19:45:57Z

Though waiting for the first test result, as this code-gen rewrite was added for a reason, but the test case is hidden somewhere in #571.

jpf91

OK, but:

No Changelog?
Testcase is not completely reduced. Is it possible the test case may no longer test this issue correctly if phobos changes?
Could you summarize what this changes exactly? In which cases do we need to stabilize the LHS, in which should we not do that?

ibuclaw · 2018-03-16T22:29:53Z

Yep, I'll add it in. Just wanted opinion first (also makes it easy to rebase as changelog entries always conflict).
Well I hope phobos doesn't change, as its only importing a primitive and a constant. :-)

I'll see if it could be reduced further but I chose to be as close to the original code where the problem was found.

The bug here is this line.

return (matcherCache[set] = S286(set));

Where set is a recounted type and having the codegen rewrite in stabilize_expr meant that the side effect was not saved. And so matcherCache was accessed twice by two different keys - first for assignment and second for return.

ibuclaw · 2018-03-16T22:37:45Z

I'll open a bug report anyway about these routines, as they could be better organized as to what should handle performing an unary operation on a comprex expression and what should remove all side effects. Currently there's some conflation going on between the two that was quickly done to support struct types that are never copied to temp.

Preferably we'll agree something this weekend as its affecting dub in ubuntu 18.04, and so would like it in before the freeze.

jpf91 · 2018-03-17T10:00:43Z

Well I hope phobos doesn't change, as its only importing a primitive and a constant. :-) I'll see if it could be reduced further but I chose to be as close to the original code where the problem was found.

I understand the motivation, but what if CodePointSet gets changed into a non-RC struct some time? Then the test would silently test something different.

And so matcherCache was accessed twice by two different keys

So the key differs depending on the current reference count? Isn't that already a phobos bug by itself?

Preferably we'll agree something this weekend as its affecting dub in ubuntu 18.04, and so would like it in before the freeze.

OK. The current solution is fine by me, feel free to merge at any time. I'm just trying to understand this in detail to ensure the fix really is general enough to fix similar problems.

So if we have (a = b) += c then we need to explicitly rewrite to a = b + c, a in binop_assignment and we can't do that in stabilize_expr because then return (matcherCache[set] = S286(set)) gets rewritten to (matcherCache[set] = S286(set), matcherCache[set]) which evaluates side effect accessing set twice? I think I kinda understand this now, but the whole stabilize_expr/reference stuff is still quite a mystery to me.

So we do we need that special case for (a = b) += c, is this simply because the backend does not handle such assignment chains? In other words, is there any possibility there are other code paths where we have to do something similar and our tests just don't catch it right now, or is this approach completely specific to binop_assignment?

ibuclaw · 2018-03-17T16:33:20Z

So the key differs depending on the current reference count? Isn't that already a phobos bug by itself?

It's a bug on our side because.

return (matcherCache[set] = S286(set));

Should lower to (pseudo-code).

tmp = set.postblit(); // increments ref count.
slot = matcherCache[set];
*slot = S286(tmp);
tmp.dtor();  // decrements ref count.
return *slot;

Whereas we were instead generating.

tmp = set.postblit(); // increments ref count.
slot = matcherCache[set];
*slot = S286(tmp);
tmp.dtor();  // decrements ref count.
return matcherCache[set];

The rewrite erroneously removed the side effect, so slot was never saved for re-use in the return, we just called _aaGetY again.

ibuclaw · 2018-03-17T16:41:24Z

I think I kinda understand this now, but the whole stabilize_expr/reference stuff is still quite a mystery to me.

Yes, it's a bit of a mess, in part related to just getting rvo working. I think stabilize_expr should really force a temporary, returning a TARGET_EXPR slot should do.

Then what the function is currently used for renamed to something like complex_unary_op where it generically handles any operation passed to it.

e.g:

// (a, b, c)  ->  (a, b, &c);
exp = complex_unary_op(ADDR_EXPR, exp);
// (a, b, c)  ->  (a, b, *c);
exp = complex_unary_op(INDIRECT_REF, exp);

Then go though the painful process of splitting out what really needs a side-effect stabilized, and what just wants to apply an operation to the relevant part of the expression.

ibuclaw · 2018-03-17T16:43:09Z

this approach completely specific to binop_assignment?

This approach is specific to binop_assignment, putting it in as a generic rewrite was a mistake. But I think things could be simplified as per previous comment.

jpf91 · 2018-03-17T18:14:05Z

So the key differs depending on the current reference count? Isn't that already a phobos bug by itself?

It's a bug on our side because.
The rewrite erroneously removed the side effect, so slot was never saved for re-use in the return, we just called _aaGetY again.

I get that part and that's a bug in GDC for sure.
But why does matcherCache[set] even return different values? As far as I understand only the reference count of set is different, but it still isn't 0 in the second call, correct? So we pass two valid instances to _aaGetY with the only difference being the reference count value. But I don't think the AA key hash should take the reference count into account, that does not make any sense and seems like a possible source of bugs. So the set type should probably overload the toHash function.

This approach is specific to binop_assignment, putting it in as a generic rewrite was a mistake. But I think things could be simplified as per previous comment.

Ok thanks for explaining. So this fix should really be fine for now.

ibuclaw · 2018-03-18T12:28:52Z

But why does matcherCache[set] even return different values? As far as I understand only the reference count of set is different, but it still isn't 0 in the second call, correct? So we pass two valid instances to _aaGetY with the only difference being the reference count value. But I don't think the AA key hash should take the reference count into account, that does not make any sense and seems like a possible source of bugs. So the set type should probably overload the toHash function.

Just to clarify, this is the actual generated code in release mode.

__aaval2414 = {.filter={}, .this=0B};
__fieldPostblit (&(__copytmp2412 = *set));
*__ctor (&__aaval2414, (struct InversionList &) &__copytmp2412);
return <retval> = *(struct CharMatcher *) _aaGetY (&matcherCache, &_D99TypeInfo_HS3std3uni38__T13InversionListTS3std3uni8GcPolicyZ13InversionListS3reg4mainFZ11CharMatcher6__initZ, 24, (struct InversionList *) set) = __aaval2414;

As it's release mode there is no SAVE_EXPR generated for the call to _aaGetY (bounds checking requires multiple evaluation).

And because all the rewrite does is look for e1 = e2 and returns e1, with disregard to whether or not e1 has a side effect, e1 ends up being evaluated multiple times.

And so the bad codegen does:

<retval> = *_aaGetY(set) = aavalue;
// -- rewritten to -> 
*_aaGetY(set) = aavalue;
<retval> = *_aaGetY(set);

In the first call of _aaGetY(set), the value of set is [..., 1], in the second call, [..., 2].

Why does it increment inside every call to _aaGetY? Because the key has a postblit.
https://github.com/D-Programming-GDC/GDC/blob/31c6fcdd6bcd2e68a7ee5f4d48622238a38e4b2b/libphobos/libdruntime/rt/aaA.d#L397-L403

…rong value.

ibuclaw · 2018-03-18T12:40:11Z

OK, replaced test with one that doesn't depend on phobos.

jpf91 · 2018-03-18T13:28:36Z

Just to clarify, this is the actual generated code in release mode [...]

Thanks for explaining again in detail. I understand that part, but I question that the reference count should even be considered when treating a CodepointSet as a key for AAs. This causes other problems, such as this: https://run.dlang.io/is/ubKJT5 Don't you think that's a bug worth filing against phobos?

OK, replaced test with one that doesn't depend on phobos.

Looks good, this should be ready for merging 👍

ibuclaw · 2018-03-18T21:51:11Z

@jpf91 it may be worth asking about. There might be some explanation for it that I can't think of.

ibuclaw force-pushed the brokenregex branch from a04524f to d26f777 Compare March 12, 2018 00:43

ibuclaw requested a review from jpf91 March 15, 2018 08:48

jpf91 approved these changes Mar 16, 2018

View reviewed changes

ibuclaw force-pushed the brokenregex branch 2 times, most recently from 88736f4 to a1c4b11 Compare March 18, 2018 12:38

Fix Bug 286: Returning AA value whose key is a RefCounted type gets w…

6d67b5e

…rong value.

ibuclaw force-pushed the brokenregex branch from a1c4b11 to 6d67b5e Compare March 18, 2018 12:39

ibuclaw merged commit faab4ac into D-Programming-GDC:master Mar 18, 2018

ibuclaw deleted the brokenregex branch March 18, 2018 21:48

ibuclaw mentioned this pull request May 1, 2018

Add build_unary_op, use it instead of stabilize_expr for various unary ops #658

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix Bug 286: Returning AA value whose key is a RefCounted type gets wrong value. #640

Fix Bug 286: Returning AA value whose key is a RefCounted type gets wrong value. #640

ibuclaw commented Mar 11, 2018

jpf91 left a comment

ibuclaw commented Mar 16, 2018

ibuclaw commented Mar 16, 2018

jpf91 commented Mar 17, 2018 •

edited

ibuclaw commented Mar 17, 2018 •

edited

ibuclaw commented Mar 17, 2018 •

edited

ibuclaw commented Mar 17, 2018 •

edited

jpf91 commented Mar 17, 2018

ibuclaw commented Mar 18, 2018

ibuclaw commented Mar 18, 2018

jpf91 commented Mar 18, 2018

ibuclaw commented Mar 18, 2018

Fix Bug 286: Returning AA value whose key is a RefCounted type gets wrong value. #640

Fix Bug 286: Returning AA value whose key is a RefCounted type gets wrong value. #640

Conversation

ibuclaw commented Mar 11, 2018

jpf91 left a comment

Choose a reason for hiding this comment

ibuclaw commented Mar 16, 2018

ibuclaw commented Mar 16, 2018

jpf91 commented Mar 17, 2018 • edited

ibuclaw commented Mar 17, 2018 • edited

ibuclaw commented Mar 17, 2018 • edited

ibuclaw commented Mar 17, 2018 • edited

jpf91 commented Mar 17, 2018

ibuclaw commented Mar 18, 2018

ibuclaw commented Mar 18, 2018

jpf91 commented Mar 18, 2018

ibuclaw commented Mar 18, 2018

jpf91 commented Mar 17, 2018 •

edited

ibuclaw commented Mar 17, 2018 •

edited

ibuclaw commented Mar 17, 2018 •

edited

ibuclaw commented Mar 17, 2018 •

edited