Fix emplace #1082

monarchdodra · 2013-01-21T16:23:36Z

Because it's been broken for months, with no improvements in sight. Having a correct emplace is (IMO) critical. I'm opening this again (3rd time).

I wrote this fix, which is

small and relatively simple (as simple as emplace gets)
stand alone (no dependency on other pulls)
efficient
complete (AFAIK)

Fixes:

correct emplace(void) implementation for static arrays.
unittest emplace(args) for static arrays.
emplace never calls opAssign.
emplace chooses postblit over constructor (when both are possible).
Correctly postblits and/or constructs nested members...
...but refuses to call disabled postblits.
Correctly handles emplacement from alias this.
Correctly refuses to modify const data
Supports bug 8847.
Correctly refuses to postblit from an immutable, when implicit cast is not possible.

...phew ! And unittests for all of that, of course. Did I miss anything?
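As a rough illustration of two items from the list above (emplace never calls opAssign, and an emplace from an existing instance goes through the postblit), here is a minimal sketch. The `Tracked` type and its counters are invented for this example and are not part of the pull; the sketch only assumes that `std.conv.emplace` behaves as the list describes.

```d
import std.conv : emplace;

// Hypothetical type that counts how it gets copied.
struct Tracked
{
    int value;
    static int postblits;
    static int assigns;

    this(this) { ++postblits; }
    void opAssign(Tracked rhs) { ++assigns; value = rhs.value; }
}

void main()
{
    Tracked src = Tracked(42);
    Tracked dst = void;          // raw, unconstructed memory

    emplace(&dst, src);          // initialize dst from src

    assert(dst.value == 42);
    assert(Tracked.assigns == 0);    // opAssign must not be involved
    assert(Tracked.postblits >= 1);  // the copy went through the postblit
}
```

The point of the `= void` initializer is that `dst` holds garbage before the call, which is exactly the situation where running opAssign on it would be wrong.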

Also, can work in safe and nothrow code! And sometimes CTFE to boot (yay).

Has a deprecated branch for calling opCall: arguably, opCall is not construction, so it should not be supported. However, the old implementation allowed opCall, so I added a deprecated branch.

Finally: ALWAYS diagnoses illegal args at top level with a verbose explanation for special cases (EG, no internal errors).

Regarding the context pointer: it is copied when available. Otherwise, emplace will initialize it to null. This is kind of ugly, but it is consistent with how Voldemort types are handled (such as when they are constructed inside any template).

PLEASE PLEASE PLEASE take the time to review this.

This isn't some small bug fix, or some petty performance improvement. This needs to make it into phobos.

There may be ways to optimize certain branches (thinking static array default emplacement), but correctness trumps efficiency at this point...

If you have any doubts, please voice them, as much as you can. I can justify the behavior (or lack thereof) of everything in this pull.

alexrp · 2013-01-21T16:27:12Z

Waiting for the auto tester, then I'll review.

alexrp · 2013-01-21T16:36:10Z

std/conv.d

    else static if (is(typeof(T(args))))
    {
        // Struct without constructor that has one matching field for
-        // each argument
-        *chunk = T(args);
+        // each argument. Individually emplace each attribute


quickfur · 2013-01-25T05:51:31Z

Did dmd git HEAD break again? Why is the autotester failing in dmd test cases when this is only a Phobos pull request?

jmdavis · 2013-01-25T05:55:37Z

Did dmd git HEAD break again? Why is the autotester failing in dmd test cases when this is only a Phobos pull request?

Because the dmd test suite uses Phobos.

MartinNowak · 2013-01-29T11:14:30Z

The reason I opened my pull request was that I use std.conv.emplace as reference implementation for initializing raw memory. Finding obvious bugs in this function is bad, so I totally agree with you that this is important.
It bothers me that the implementation of such a fundamental idiom has become so complex.
Maybe we should make this a druntime function object.initialize as complement to destroy?
We should get @andralex and @WalterBright on board for correct semantics.

MartinNowak · 2013-01-29T11:23:23Z

std/conv.d

+            memcpy(chunk, p, T.sizeof);
+        else
+            memset(chunk, 0, T.sizeof);
+    }


IIUC the purpose of bitblit+optional postblit is performance. Here you prefer the compiler generated assign-bitblit over memcpy.
Can the compiler actually do better than calling memcpy?

AFAIK, the difference is that memcpy is a 100% run-time call: memcpy does not know the size of the thing to be copied, nor the source/destination addresses. This makes it inherently slower. It has to do a lot of run-time checking, and must implement a copy loop.

By comparison, a (non-elaborate) opAssign is a straight-up assembly memcopy: the compiler knows all (or most) of the parameters at compile time: the source, and the size. This means the call is just replaced by an assembly copy.

Also: memcpy is neither safe nor CTFE-able.

OK, I had the same intuition.
Even though you have separate functions for the different semantics, it wasn't immediately obvious to me that these two static if branches are supposed to do the same thing. Maybe a comment, or not separating the functions, would be more expressive.

memcpy is handled as an intrinsic in many C++ compilers, so it does exploit size and alignment when known statically. Not sure whether dmd does anything in particular with it. cc @WalterBright

In any case, regardless of memcpy's performance, it is neither safe nor CTFE. It may also have to pay for a pre-typeid run-time call (That's run-time... right?)

Currently memcpy is not handled as an intrinsic by dmd. It also does not seem to take any advantage of D array copies with known size.

monarchdodra · 2013-01-29T11:52:27Z

It bothers me that the implementation of such a fundamental idiom has become so complex.

One of the reasons it is so complex is that emplace (for structs) merges both the notions of emplace from another instance (postblit) and actual construction from args. This means it has to try to analyze which one you are trying to call, which can be difficult when you take into account that D does have constructors that take the same type:

import std.stdio : writeln;

struct S
{
    this(this){writeln("postblit");}
    this(S)   {writeln("CC");}
    this(int) {writeln("this(int)");}
}

void main()
{
    S a;        //Nothing
    S b = a;    //postblit
    S c = S(a); //CC
    S d = S(5); //Constructor
}

In particular, there are some strange corner cases where, if the source and destination types match but postblit is disabled, emplace has to try and guess if maybe you want to fall back to CC'ing. It makes a mess of things.

Things would have been much simpler if we had two explicit functions: the first would be the normal emplace (basically emplace with 1 arg of matching type) that just postblits. Then, we'd have one that calls an actual constructor (eg emplace from arguments).

The caller should always know which of the two he wants anyways. It would have made things simpler and safer. EG:

S a = void;
S b;
emplace(&a, b); //Explicit request for postblit initialization
emplaceArgs(&a, 1); //Explicit request to *construct* a from the *arguments* "1".

Oh well, that's how it is now... Still, I think it is worth thinking about such a change.

andralex · 2013-01-29T15:17:19Z

I'm not particularly worried, though indeed simpler would be nicer. This is highly generic and highly leveraged code, the kind that is usually inside the compiler. We can hoist it into the language proper because of introspection, and from that perspective it looks as expected.

MartinNowak · 2013-01-29T16:19:27Z

emplace (for structs) merges both the notions of emplace from another instance (postblit) and actual construction from args

Thanks for the detailed explanation.
So basically emplace supports all initialization schemes for structs and should have exactly the same semantics (nice test case). Therefore it also supported static opCall.
This really makes sense and is actually very simple.

monarchdodra · 2013-01-29T16:27:56Z

So basically emplace supports all initialization schemes for structs and should have exactly the same semantics.

That's the goal yes. The idea is:

Do the same as S a = "args";
IF that doesn't work, do the same as S a = S(args);
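The two-step rule above can be sketched as follows. This is only an illustration: the struct `S` and its `how` field are made up for the example, and the sketch assumes nothing beyond `std.conv.emplace` following the semantics of `S a = args;` and `S a = S(args);`.

```d
import std.conv : emplace;

// Hypothetical struct that records how each instance came to be.
struct S
{
    int x;
    string how;

    this(int v) { x = v; how = "this(int)"; }
    this(this)  { how = "postblit"; }
}

void main()
{
    // Emplace from constructor arguments: same effect as `S a = S(5);`
    S a = void;
    emplace(&a, 5);
    assert(a.x == 5 && a.how == "this(int)");

    // Emplace from an existing instance: same effect as `S b = src;`
    S src = S(7);
    S b = void;
    emplace(&b, src);
    assert(b.x == 7 && b.how == "postblit");
}
```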

Therefore it also supported static opCall.

Well... technically, static opCall is not a construction scheme, so emplace is not supposed to call it. The reason for this is that opCall is just a function like any other that returns a value, and that value can then be assigned/postblitted onto your current instance.

If you want to emplace from an opCall, then you should just call emplace(&a, S(args));
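A small sketch of that suggestion, for illustration only. The `Celsius` type and its static opCall are invented here; the idea is simply that the caller evaluates the opCall and hands the resulting value to emplace, rather than relying on emplace to call opCall itself.

```d
import std.conv : emplace;

// Hypothetical struct whose "constructor-like" syntax is really a static opCall.
struct Celsius
{
    double degrees;

    static Celsius opCall(int tenths)
    {
        Celsius c;                 // default-initialized, no opCall involved
        c.degrees = tenths / 10.0;
        return c;
    }
}

void main()
{
    Celsius a = void;

    // Celsius(215) runs the static opCall at the call site;
    // emplace then just initializes `a` from the returned value.
    emplace(&a, Celsius(215));

    assert(a.degrees == 21.5);
}
```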

That's my stance anyways. It's up for debate.

DmitryOlshansky · 2013-02-03T17:10:29Z

Given preparations for beta, it would be great if we could squeeze this into the next release. Thoughts?

monarchdodra · 2013-02-21T21:24:07Z

Rebased.

Added a unittest for bug 9559, which is new but already fixed by this pull:

http://d.puremagic.com/issues/show_bug.cgi?id=9559

dnadlinger · 2013-03-23T21:38:42Z

std/conv.d

+unittest
+{
+    ////Works, but breaks in "-w -O" because of @@@9332@@@.
+    ////Uncomment test when 9332 is fixed.


This should be the case now?

Yes, TY. Uncommenting.

monarchdodra · 2013-03-28T11:35:17Z

Fixes:

Issue 9824 - Emplace is broken

http://d.puremagic.com/issues/show_bug.cgi?id=9824

IgorStepanov · 2013-04-22T17:45:18Z

What is the status of this pull?
Does it allow emplacing a variable at compile time?

    static struct Json {
       int a;
        void opAssign(Json) {}
        size_t length() const { return aa.length; }
    }
    struct Pair(A,B)
    {
       A first;
       B second;
       this(A f, B s)
       {
          emplace(&first, f);
          emplace(&second, s);
       }
    }
    static p = Pair!(int, const(Json))(4, Json(99)); //CTFE; simple assignment is not allowed, because Json.opAssign is not const.

monarchdodra · 2013-04-23T06:54:16Z

@IgorStepanov

Thank you for your code participation. Allow me to answer:

What is the status of this pull?

Waiting on a good Samaritan to review it. I am also crawling the boards looking for use cases for emplace to test it.

Does it allow emplacing a variable at compile time?

Very partially, and experimentally. Partially, because it requires the types to have a non-elaborate assign. Experimentally, because it has a tendency to break the compiler (pointer stuff).

static p = Pair!(int, const(Json))(4, Json(99)); //CTFE; simple assignment is not allowed, because Json.opAssign is not const.

There is a dual problem in that code.

The first problem is const: if the target type T is const (in this case, a const(Json)), then emplace simply can't do the emplace. I suggest changing the code to:

    struct Pair(A,B)
    {
       A first;
       B second;
       this(A f, B s)
       {
          emplace(cast(Unqual!A*)&first, f);
          emplace(cast(Unqual!B*)&second, s);
       }
    }

This works, although there may be some "unexpected consequences" to the cast I have not thought of? AFAIK, it should be safe.

The second problem is that this can't be done at compile time, because Json has an elaborate opAssign, ergo Pair has an elaborate opAssign. If you comment it out, it should work, but you'll actually just get an internal compiler error. As I said: Experimental.

monarchdodra · 2013-04-23T10:24:41Z

I rebased, did some more tweaking/simplifying/documenting. Added some more early diagnostics for qualified objects.

I also hit a compiler bug with CTFE:
http://d.puremagic.com/issues/show_bug.cgi?id=9982

IgorStepanov · 2013-04-23T14:05:31Z

@donc please see this bug: http://d.puremagic.com/issues/show_bug.cgi?id=9982
The error occurs when you take the address of a struct member and dereference it. Is it hard to fix?

monarchdodra · 2013-04-24T09:59:30Z

Apart from the compiler bug (which is not blocking, and only sometimes triggers with CTFE), AFAIK, everything works.

Could I get a second review on this? I reworked the code a little, and now the "flow" inside emplace is (IMO) simple, clear and straightforward. There are (I'd say) enough unittests to validate correct behavior.

Could we try to get this through? It's important.

IgorStepanov · 2013-04-24T19:41:00Z

Does emplace depend strongly on Phobos? Maybe the correct place for this function is druntime?
Would that require a lot of effort?

monarchdodra · 2013-04-25T10:45:35Z

Does emplace depend strongly on Phobos? Maybe the correct place for this function is druntime?

Not much, it only uses a few minor traits: hasElaborateAssign, isAssignable, and isStaticArray. The only problem is that it does need access to the passed parameter types, and, AFAIK, there are no templates inside druntime. Furthermore, emplace currently uses things like typeid.postblit: This makes a runtime call that is actually not necessary with correct compile time introspection.

I think it would be better to leave it in phobos for now.

If you ask me though, the best place to put such functionality would be straight into the compiler, via placement new. Compiler knows best; emplace merely reverse engineers what the compiler does for its construction sequence...

MartinNowak · 2013-04-25T12:11:54Z

I too think it belongs in druntime, because it's kind of the complement to destroy, and manual memory management is an intrinsic language property.

Not much, it only uses a few minor traits: hasElaborateAssign, isAssignable, and isStaticArray. The only problem is that it does need access to the passed parameter types, and, AFAIK, there are no templates inside druntime.

The template is not the problem, not having access to std.traits is.

monarchdodra · 2013-07-24T07:20:50Z

New reg:

http://d.puremagic.com/issues/show_bug.cgi?id=10690

I think we should try to review this. Even if the plan is to move it to druntime, or change it to typeid(construct), there is currently a lot of code that relies on emplace, and it should be fixed.

pinver · 2013-08-25T19:23:32Z

I would like to push for a fix of emplace: it's like disseminating mines in the phobos fields, and it's a pain for learners like me to hit some of them (see http://forum.dlang.org/thread/nxbdgtdlmwscocbiypjs@forum.dlang.org)

monarchdodra · 2013-08-25T21:05:13Z

I would like to push for a fix of emplace: it's like disseminating mines in the phobos fields, and it's a pain for learners like me to hit some of them (see http://forum.dlang.org/thread/nxbdgtdlmwscocbiypjs@forum.dlang.org)

Thank you for your comment. I just checked, and your code does work perfectly fine with this fix. I added your code to the test suite.

I think I've had just about enough of this broken emplace. Bugs are reported for it all the time, but it doesn't get fixed.

MartinNowak · 2013-08-28T07:32:44Z

std/conv.d

 Returns: A pointer to the newly constructed object (which is the same
 as $(D chunk)).
 */
-T* emplace(T)(T* chunk)
-    if (!is(T == class))
+T* emplace(T)(T* chunk) @safe nothrow pure


A function that dereferences a pointer isn't @safe.

No, it is safe as long as it dereferences it at offset 0. It's the arithmetic that is unsafe.

I'm unsure anyways in these cases, as emplacing over something already constructed can bypass the destructor, leading to a state that may corrupt memory integrity, and/or leak.

That said, extracting a pointer is usually an unsafe operation to begin with...

You're right @klickverbot, it's pointer arithmetic and reinterpreting memory which are unsafe.
Using a reference would be a cleaner solution though.

Well, while emplace might be "memory safe", it is still a very dangerous function to use. By making emplace take a pointer as an argument, it means emplace cannot be used in a pure @safe scope, unless some @trusted function first provided the pointer. I think this is a good thing.

MartinNowak · 2013-08-28T07:37:35Z

I will review it a second time when I find some time.
Meanwhile I wonder whether the ()@trusted{}() trick made it into the official idioms list.
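For readers unfamiliar with the `()@trusted{}()` trick mentioned here: it wraps a single unsafe operation in an immediately-invoked @trusted lambda, so the enclosing function can stay @safe. A minimal sketch follows; `resetToInit` and `Config` are hypothetical names used only for this illustration and do not come from the pull.

```d
import core.stdc.string : memcpy;

// Hypothetical helper: overwrite `target` with T's initial bit pattern.
// The function is @safe; only the memcpy call is vouched for manually.
void resetToInit(T)(ref T target) @safe nothrow
{
    static immutable T initial;   // holds T.init
    () @trusted { memcpy(&target, &initial, T.sizeof); }();
}

// Hypothetical plain-data type with non-zero default values.
struct Config
{
    int level = 3;
    bool enabled = true;
}

void main() @safe
{
    Config c;
    c.level = 9;
    c.enabled = false;

    resetToInit(c);

    assert(c.level == 3 && c.enabled);
}
```

The design trade-off is that the @trusted region is kept as small as possible, so the rest of the function is still checked by the compiler; the author of the lambda takes responsibility for that one call being memory safe.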

dnadlinger · 2013-08-28T08:37:02Z

Okay, I went out on a limb and merged this.

While it is potentially a high-impact change (since emplace is so widely used), I just reviewed it for a second (third? fourth?) time and couldn't find any serious issues.

There is some cleanup/further fixes left to do (see e.g. the comments – @monarchdodra, are there issues for those?), but this fixes a host of difficult to track down bugs, and we absolutely need to ship a fix for those soon.

monarchdodra · 2013-08-28T11:35:10Z

Thank you @klickverbot. A bold move, but I think it was the right move. As you saw, this fixes a couple of bugs. I'm now doing some follow up, and fixing things that depended on emplace being correct.

I just opened 2 new pulls:

Fixup unittest following emplace fix #1528: mostly trivial, and merely writes unittests.
Fix appender form elaborate assign types #1529: more complicated, but it fixes the remaining issues in the bug tracker that weren't immediately fixed.

MartinNowak · 2013-09-02T13:22:24Z

std/conv.d

+        else
+        {
+            static immutable T i;
+            ()@trusted{memcpy(chunk, &i, T.sizeof);}();


We discussed compiler optimization, so how about (cast(ubyte*)chunk)[0..T.sizeof] = (cast(ubyte*)&i)[0..T.sizeof]? It will be rewritten as memcpy but the compiler might directly copy small arrays.

Or better yet ?

enum N = T.sizeof; (*cast(ubyte[N]*)chunk) = (*cast(ubyte[N]*)&i);

This statically calls static array copy. I'm no assembler expert, but I'd be curious to compare the generated asm.

Yeah, I also thought of this after posting. It looks slightly better (and saves a few keystrokes to type) but in both cases the compiler has the same knowledge, it's copying array with constant boundaries.
Also note that dmd doesn't use this, it will simply call memcpy.
It should be fairly simple to add an optimization because the backend already has an IR elem OPmemcpy and also directly uses rep movsq for struct blitting.

So what is your recommendation? Leave it as memcpy, or, cast to ubyte[N]?

If you think we should be casting, then it might be worth writing a helper void binaryBlit(T)(T* chunk, ref S s) function. It might be worth writing it either way, as the emplace implementations use this a lot, it would consolidate it to whatever we choose to do.
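To make the two options concrete, here is one possible shape for such a helper, using the cast-to-`ubyte[N]` variant discussed above. This is a sketch under assumptions: the name `binaryBlit` comes from the comment, but the exact signature (here both parameters use the same type `T`) and the attributes are guesses, and `Point` exists only for the example. It performs a raw bit copy, so no postblit or opAssign runs.

```d
// Hypothetical helper: bitwise copy of `source` over `*chunk`.
// Expressed as a static array assignment so the compiler sees a
// copy with compile-time constant bounds.
void binaryBlit(T)(T* chunk, ref const T source) @trusted pure nothrow @nogc
{
    enum N = T.sizeof;
    *cast(ubyte[N]*) chunk = *cast(const(ubyte)[N]*) &source;
}

// Hypothetical plain-data type for the demonstration.
struct Point
{
    int x;
    int y;
}

void main()
{
    Point src = Point(3, 4);
    Point dst = void;        // uninitialized destination

    binaryBlit(&dst, src);

    assert(dst.x == 3 && dst.y == 4);
}
```

Swapping the body for a `memcpy(chunk, &source, T.sizeof)` call would keep the same interface, which is the consolidation benefit the comment argues for: the choice between the two is then made in one place.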

denis-sh · 2013-09-10T09:28:39Z

Sorry, but I have to write it again. It has been obvious to me for ages that emplace is broken by design. And I can't even imagine any arguments against my opinion. If someone still hasn't thought about it, just try to formulate what exactly emplace does. This is magic stuff, just like destroy.

denis-sh · 2013-09-10T09:36:12Z

Also I'm sure this opinion is already proven by (tons of?) fundamental errors in e.g. Phobos range algorithms because of emplace's design. So imagine what I feel when I show someone a function with a broken design, the potential errors with it, and the (lots of?) real errors he made because of it, and get as a response: "dude, you are incorrect. I'm correct because I'm correct".

So I sincerely ask those who care about D to think about it.

monarchdodra · 2013-09-10T10:20:28Z

Could you elaborate how it is broken "by design"? There are, to my knowledge, no more broken cases with emplace (bar using static arrays, which I am currently fixing).

If you do know of a broken use case, please share it.

denis-sh · 2013-09-10T10:36:16Z

Could you elaborate how it is broken "by design" ?

as I wrote:

just try to formulate what exactly does emplace do?

monarchdodra · 2013-09-10T10:39:03Z

just try to formulate what exactly does emplace do?

Builds a T at memory address chunk from the arguments arg?

denis-sh · 2013-09-10T11:27:05Z

How do you define "builds"?
I'm asking because, when I was fixing emplace, this was the question I was trying to answer for hours (no jokes), reading examples and source code. IMO, this is the same problem as with the definition of what destroy does.

JakobOvrum · 2013-09-10T13:00:22Z

emplace should construct a T at the given address. Surely that's a sufficient definition? I think "construct" can be defined trivially and uncontroversially for most types. One exception I can think of are associative array types, those might have some room for interpretation.

As for destroy, I think the documentation is plentiful. It's not being very specific because it doesn't need to; the point is that trying to reuse the destroyed value is a logic error. With the abstract definition, its implementation is open to change because it's an error if user code relied on implementation details.

alexrp reviewed Jan 21, 2013

monarchdodra mentioned this pull request Jan 25, 2013

check for postblit and opAssign in emplace #1097

Closed

MartinNowak reviewed Jan 29, 2013

This was referenced Mar 19, 2013

Make array CTFE-able #1213

Closed

Finally fix emplace for structs #949

Closed

dnadlinger reviewed Mar 23, 2013

monarchdodra mentioned this pull request Jul 10, 2013

Fix DirEntry string construction and CopyContruction #1407

Merged

monarchdodra mentioned this pull request Jul 21, 2013

Make std.array.array @safe if possible #1425

Merged

monarchdodra added 8 commits August 25, 2013 22:56

Fix emplace

fa96149

Fix spelling

75c935b

Issue 9559 - Range of Nullable doesn't work with std.array.array

e951b2c

Tweak emplace some

62644db

emplace: diagn for quals; simplf alias this brnch

d711117

make "emplace(singleArg)" safe pure nothrow

fbf8ae2

reword emplace deprecation message

6d3fe90

Yet another emplace unittest that is fixed

52cadf9

MartinNowak reviewed Aug 28, 2013

dnadlinger added a commit that referenced this pull request Aug 28, 2013

Merge pull request #1082 from monarchdodra/emplace

40c6760

Fix emplace

dnadlinger merged commit 40c6760 into dlang:master Aug 28, 2013

MartinNowak reviewed Sep 2, 2013
