New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue 3825 - AAs entries are default initialized before the new value is evaluated #1465
Conversation
My initial impression was that the oddness comes from e2ir.c. IndexExp::toIR(). if (modifiable) s = taa->aaGetSymbol("GetX", 1); and this is the only bit of code that uses modifiable (it's set in IndexExp::semantic()). Looks to me like a nasty hack that doesn't actually work... |
What do you mean? I know The root cause of bug 3825 is an incorrect runtime evaluation order of array element setting. To fix the problem, I reordered AST in front-end. |
@9rnsr can we please work this out so it has the same exact semantics as a user-defined version? We'd want |
@andralex Yes. As I commented here, aa[key] op= val; // (op == empty or any binary operators) is rewritten to: ref __aatmp = aa;
ref __aakey = key;
ref __aaval = val;
__aatmp[__aakey] op= __aaval; // assignment (allocate and set slot) Then the runtime evaluation order is equal to user defined |
We were wondering in the message boards, what behavior does this code produce: int[int] a;
++a[0]; Does this insert If it does insert Thanks. |
Hmm, I didn't considered about that. unittest {
int[int] naa;
++naa[0]; // !!
// This is implicitly translated to: naa[0] += 1;
// Allocate the slot naa[0] first, set int.init next,
// and finally increment it.
assert(naa[0] == 1);
import core.exception;
struct S {
void opUnary(string op : "++")() { assert(0); }
}
S[int] saa;
bool thrown = false;
try {
++saa[0]; // !!
assert(0);
} catch (RangeError) thrown = true;
// Getting slot saa[0] first, then call opUnary!"++".
// So RangeViolation occurs.
assert(thrown);
assert(saa.length == 0);
} This seems inconsistent behavior to me, but I'm not sure it is unexpected or not... |
For what it's worth, I think both If it doesn't throw, then |
OK. That seems reasonable. I updated compiler code and test case to implement it. |
I think this is two much, and will break a lot of code. |
The idiom Therefore this is somewhat of a special case. If the value type is numeric, |
To drive the point home: I think the current behavior is useful and principled enough to justify for its being a special case. |
…alue is evaluated
OK. I reverted recent change, and explicitly added test cases for |
Could you make the tests run with ctfe as well? |
@yebblies Hmm, I tried to it, but following errors are shown: In current, maybe, CTFE interpreter is designed as that an invalid AA access immediately makes an error, instead of throwing compile-time RangeError object. To @donc: is this an improvement point of CTFE engine? |
@andralex this seems to me like a weak argument for a special case. Granted, it is probably a frequent usage of AA, but you'll find another frequent use (even more frequent in my experience) in caching. Caching with D's AA is quite bad. First, you can't cache anything immutable as immutable(T)[U] is roughly equivalent to immutable(T[U]) and because you have no way to compute some cached value without 2 round trip into the cache when 1 should be enough. Additionally, this very specific use case can be handled quite nicely doing : Something that can be done as simply as this shouldn't justify any special casing, especially when other really frequent uses case are handled in a totally backward way. |
I'm on the fence about whether to allow ++a[0]. Consider what happens, and what should happen, if the AA is double[int], for example. Making the result 1.0 would be very convenient, but inconsistent with double.init being nan. Making the result nan is equivalent to not having this special case. So that means it's really only useful for int-valued AA's. I don't like the idea of introducing a special case for specific value types; it makes AA behaviour inconsistent which later will lead to problems with generic code. |
Yes, but it's not efficient. Andrei posted the efficient way of doing it, which doesn't involve two AA lookups. I suppose that for the common case of word counters and such there could be a library type wrapping AAs which allows declaring a "default value", used in |
@CyberShadow I know that it does involve 2 aa lookups (and that this is bad). You'll note that, with the current AA interface, 2 lookups when only one is needed is required for many use cases, and that this special case will only solve a very specific one. If 2 lookup are considered an issue (I do consider this is) then this very problem have to be addressed. On a final note, Andrei's solution require 2 AA lookup when the value is created, which is still suboptimal. |
The root of the problem is the design of the current interface. There is no direct way to differentiate between "lookup X and create it if it doesn't exist" vs. "lookup X and throw error if it doesn't exist". An ideal interface lets you differentiate between the two, which solves all of these issues without efficiency problems and without need of special cases. IMO, it's the compiler's job to optimize a[k] = a.get(k,0)+1 into a single lookup. But the underlying AA API needs to provide the necessary hooks for the compiler to do this (by providing a distinct way of saying "lookup X" vs. "lookup or create X"). |
Oh, that's true. There should probably be an AA operation which gives you the value pointer (like |
That would solve the problem, yes. Either that, or add a bool parameter to aa.get to create the entry if it doesn't already exist. So we would have: Value* get(Key key, lazy Value defValue=Value.init, bool create=false) If defValue is not specified, create cannot (should not) be true anyway, so this works. |
|
@quickfur boolean in interface usually ends up confusing everybody. But if get return a ref, the general idea make sense. @CyberShadow can't auto ref help here ? |
What happens if I do: |
In such a case, you can't return a reference so you get a compile error. If the argument is a ref, then a ref can be returned. Anyway, you can always return a reference when you create the entry, so we probably need a function to do that that return a ref instead of a flag as @quickfur suggested. |
You're right, better not conflate getOrCreate with get. It should be a separate method. |
I'd would prefer that getOrCreate became get and get became something else, but this breakage is probably not acceptable. This behavior is much more useful. |
That's all fine. My point was that both the IndexExp::modifiable hack, and this one, are special cases that wouldn't be necessary at all, if AAs really were a library type (this would be an opIndexOpAssign). But on further thought, I think it would be too disruptive to change that right now. So this is probably the best approach in the short term. For the CTFE stuff, I found this while working on bug 9023. While generating test cases, I came to the conclusion that the existing runtime behaviour doesn't make any sense. I wanted to clarify what the correct behaviour should be, before I duplicate it in CTFE. After merging this in, it'd be much easier to do in CTFE. I'll add the CTFE cases when I fix 9023. Summary: I think we should merge this in to get the semantics correct, with the understanding that it'll be removed eventually when a complete AA overhaul happens. LGTM. |
@donc OK, understood. Thanks for your reply. |
Issue 3825 - AAs entries are default initialized before the new value is evaluated
http://d.puremagic.com/issues/show_bug.cgi?id=3825