Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP

Loading…

[RFC] Fix `move` #923

Closed
wants to merge 4 commits into from

6 participants

@denis-sh

It's better to analyze by commit and see commits comments for description.

And yes, I also dislike moving const/immutabe objects. But have no idea how to avoid it and not make move unusable.

[EDITED] 3 times
My move:

void move(T)(ref T source, ref T target)
in { assert(&source == &target || !pointsTo(source, source)); }
body
{
    // Performance optimization:
    // Do not compare addresses if we don't have side effects,
    // will not need addresses in `memcpy`, and T fits in register.
    static if (hasElaborateCopyConstructor!T || hasElaborateAssign!T ||
               hasElaborateDestructor!T || !isAssignable!T ||
               T.sizeof > size_t.sizeof)
        if (&source == &target) return;

    static if (hasElaborateDestructor!T)
        destruct(target, false);

    // If there is no elaborate assign and no elaborate copy constructor,
    // `target = source` has no side effects.
    static if (hasElaborateAssign!T || hasElaborateCopyConstructor!T || !isAssignable!T)
        memcpy(cast(void*) &target, cast(const void*) &source, T.sizeof);
    else
        target = source;

    static if (hasElaborateCopyConstructor!T || hasElaborateDestructor!T)
        setInitialState(source);
}

T move(T)(ref T source)
{
    // Can avoid to check aliasing here.

    static if (hasElaborateCopyConstructor!T || hasElaborateDestructor!T)
    {
        T result = void;
        memcpy(cast(void*) &result, cast(const void*) &source, T.sizeof);
        setInitialState(source); // from pull 928
        return result;
    }
    else
    {
        return source;
    }
}

I also dislike that move doesn't always set source to it's init. As it is documented now, it's not such a big problem as it was but it still looks inconsistent.

So can I just tell move to write init to source obligatory?

And don't ask me to fill all fixed move bug in bugzilla as it will take a full day at least and I will probably still miss something.

Requires setInitialState from pull #928 and destruct from #929.

@denis-sh denis-sh Move `move` precondition to `in` block
* also improve precondition skipping comment
36fc09a
@denis-sh denis-sh referenced this pull request
Closed

Move improvements #922

@DmitryOlshansky
Collaborator

While trying to make move work under CTFE I realized move can (and in fact already does in Phobos unittest) overwrite immutable/const fileds of structs without so much as a notice. WTF.
I'm still curious how to add this bit-blaster attitude to CTFE version as it disallows reinterpret casts and memcpy...

Other then this - one more thing. Please move aliasing check &soource == &target to apply only for structs, it makes little sense to do this check for primitives. (I'd argue that some small structs could be fine wth plain assign and no aliasing check)
Judging by asm output DMD's optimizer keep this check in for e.g. integers ... just madness.Needless yo say it leads to some performance problems I've found in
#787

@monarchdodra
Collaborator

Nitpick:

  1. I'm not entirelly sure we need such a huge example of the dangers of move. A simple warning should do, imo.

  2. avoid using pronouns in documentation : ˋs/you/the caller/ ˋ

@denis-sh

@monarchdodra's documentation change and @blackwhale's optimization are incorporated.

@denis-sh

to @monarchdodra:

I'm not entirelly sure we need such a huge example of the dangers of move. A simple warning should do, imo.

Maybe. Will wait others to show their opinions here.

@monarchdodra
Collaborator

Your new implementation of ˋT move(T)(ref T source)ˋ is awesome. It puts my consolidation change to shame...

@monarchdodra
Collaborator

Hum... your approach to the context pointer issue is indeed a 180° approach from mine. I'm not entirelly sure this is what the standard had in mind about "an init value must be destroyable".

In particular, I can't imagine any sane implementation of a destructor that starts with ˋif (this == init)ˋ

I think this is especially relevent in the sense that the context pointer's value isn't really part of the object's state.

Will wait for others to show their opinions.


I think my context pointer fix is a valid one, so I'll keep my pull open. I may cherry pick some stuff from here though... hope you don't mind.

@denis-sh

Fixed my @blackwhale's optimization implementation.

@DmitryOlshansky
Collaborator

I like the way optimization looks, very simple and generic. But I'm still a bit uneasy about how compiler copies small structs (IRC it should just do a couple of moves but maybe..), does it work for odd-length structs e.g. 3-byte ones?

Of course it would be extreamly useful if it was ever specified how compiler copies structs depending on size, something like:
a) < X register widths - unrolled sequence of moves
b) > X but < Y register widths - rep movsq/movsw/movsb (copy loop)
c) everything else goes to memcpy (function call)

Looking at this kind of list I'd say that a should skip aliasing check but everything from b and c can afford it do it.
Problem is that X & Y are CPU specific most likely...

By far the most interesting case is a struct that contains an array slice (2 words) and it would be great to make it be just 2 moves. Not sure if we can count on it being legal.

Maybe we can get some help from compiler gurus..
@donc can you shed some light on this, please?

@donc
Collaborator

The relevant code is cdstreq() in backend/cod2.c. ('streq' = 'struct equals' = struct assignment). It seems to basically do a rep movs as long as it is more than two registers long. I don't think it ever calls memcpy. Which makes sense since DMC's memcpy is not very fast, it just does rep movsd. If it is less than two registers I have a nasty feeling it just does a single movsd or movsw. Hope I'm wrong, that would be pretty awful.

@DmitryOlshansky
Collaborator

Thanks for the input. It is as I feared it's pretty much tied to backend. Peeked at LDC backend, appears to be transformed into memcpy or analoguos intrinsic.

One last thing I'd love to know is if there is any danger at all of doing memcpy(&this, &this, this.sizeof); - self copying (aside from potentialy wasting cycles) w.r.t. to potential SIMD optimizations and what not. Keeping in mind that memcpy is typically an instrinsic.

Looking at the code I see that this is a primary branch for small structs:

 if (numbytes <= REGSIZE * (6 + (REGSIZE == 4)))
  {     while (numbytes >= REGSIZE)
        {
            c3 = gen1(c3,0xA5);         /* MOVSW                        */
            code_orrex(c3, rex);
            numbytes -= REGSIZE;
        }
        while (numbytes--)
            c3 = gen1(c3,0xA4);         /* MOVSB                        */
  }

From what I understand it should do an unrolled sequence of:
x times movsw
y times movsb
Except actually it does movsd not movsw. The code appears to be quite old.

And that might be okay but... if I'm not mistaken it does:
a) setups esi/edi
b) kills out order execution by using interdependent movs(b/w/d/q)

A lot of work to transport 1-2 words.

Hope I'm wrong, that would be pretty awful.

I'm afraid it is the case :(
For this 6 byte struct it does 1 movsd + 2 movsb:

align(1)
struct C{
    align(1) short abc;
    align(1) short efg;
    align(1) short hij;
}
static assert(C.sizeof == 6);

void main(){
    C c = C(1, 2, 3);
    C x;
    x = c;
    assert(x.abc == 1);
    assert(x.efg == 2);
    assert(x.hij == 3);
}

Generated asm:

enter   10h, 0
push    ebx
push    esi
push    edi
mov     ax, 1
mov     [ebp+var_10], ax
mov     cx, 2
mov     [ebp+var_E], cx
mov     dx, 3
mov     [ebp+var_C], dx
lea     ebx, [ebp+var_8]
xor     eax, eax
mov     [ebx], eax
mov     [ebx+4], ax
lea     esi, [ebp+var_10]
lea     edi, [ebp+var_8]
movsd
movsb
movsb
... //series of cmp/call assert
@denis-sh

To @blackwhale :

One last thing I'd love to know is if there is any danger at all of doing memcpy(&this, &this, this.sizeof); - self copying (aside from potentialy wasting cycles) w.r.t. to potential SIMD optimizations and what not. Keeping in mind that memcpy is typically an instrinsic.

Why are you taking memcpy which behavior is undefined in the case of overlapping source and destination and trying to analyze if a concrete realization will work for this exact overlapping case instead of using memmove which allow overlapping?

@DmitryOlshansky
Collaborator

To @denis-sh I'm not even sure now;) The main argument was being able to make it as fast as possible. I believe you see that checking aliasing for small things (which memmove most definetly does) is not good enough.

My line of reasoning was going like this:

  1. Simple built-in structs are copied bitwise on assignment, there is no mention how this magic works but obviously self assignment must work. (peeking at compilers reveals all kind of _builtin_memcpy though).
  2. Now we need to do the same to 'move' small things with elaborate destructor/postblit that we'd rather avoid.
  3. And all of a sudden we can't get this bitwise assignment magic that we want to imitate because it was never specified in detail.
  4. And looking at C's spec we can't use memcpy directly without aliasing check... even exact overlap

So for now the case of smallish PODs is optimized and it's fine. Elaborate small structs are not for the reasons above.

My lasy problem is with making move CTFE-able. With CTFE-ability requirement I don't see how to implement the bitwise move in the library. If we disallow moving structs with immutable fields it can be implemented as copy (given it's during compile-time I couldn't care less). I'd love to disallow it but we need a consensus esp as it breaks some obscure unittest in std.container.

Anyway I think I got carried away way a bit. The focus should be on the pull request itself.
And I'd say it's good to go as is.

@denis-sh

To @blackwhale:

I believe you see that checking aliasing for small things (which memmove most definetly does) is not good enough.

I don't know much about C's memmove implementations but I don't see why it should check aliasing as it can just compare pointers and choose a direction.

IMHO, it's very bad and is a source of bugs currently and will lead to more bugs in the future that we use C library (especially buggy Digital Mars one) without any reason.

A already proposed to have our own implementation (my variant is called copyOverlapped, it is rather fast but not optimized for small structs), but, as always, nobody needed:
http://forum.dlang.org/thread/jiqkgp$29os$1@digitalmars.com

@DmitryOlshansky
Collaborator

I don't know much about C's memmove implementations but I don't see why it should check aliasing as it can just compare pointers and choose a direction.

That's it. Comparing pointers if we copy say 4 bytes. Yeah, call me crazy but as we see generic code ends up doing this stuff and even worse. In fact, the call itself is too much but we can only hope on inliner.

IMHO, it's very bad and is a source of bugs currently and will lead to more bugs in the future that we use C library (especially buggy Digital Mars one) without any reason.

I tried to rise the topic of over relience on C's mem* well probably folks are alright with it but not going to step in:
http://www.digitalmars.com/d/archives/digitalmars/D/rawCopy_rawTransfer_rawFind_179379.html

The signature it uses allows it to optimize. The intended use is to cover both arrays of and just single structs.
And knowing the exact type it can optimize.

Also truth be told glibc has real fast memcpy with SIMD under the hood. On windows we have no such luck...

A already proposed to have our own implementation (my variant is called copyOverlapped, it is rather fast but not optimized for small structs), but, as always, nobody needed:
http://forum.dlang.org/thread/jiqkgp$29os$1@digitalmars.com

I recall that quite some time ago.
It's fine but to be truly awesome it needs to have versions involving SIMD with at least SSE2. I think array ops in D actually started using them. Need to disassemble again or dig up druntime to check.

Another important moment about it is that not knowning the type it can't optimize for small structs. In other words it's a great as fallback function to use when all special cases are handled. In my propsal of rawTransfer if there is no better way it should forward to something like this.

@DmitryOlshansky
Collaborator

I think I've tackled the CTFE problem, take a look : #936
It just copies for now but there is little else I can do at CTFE.

@denis-sh denis-sh referenced this pull request
Closed

ctfe-able move #936

@DmitryOlshansky
Collaborator

Somehow it started failing the autotester.

@denis-sh

To @blackwhale:

Somehow it started failing the autotester.

I added dependencies on my other pulls. See Requires at the end of pull description.

@denis-sh

By the way, my move(source, target) had incorrect condition for memcpy call. Fixed and added comment.

@DmitryOlshansky
Collaborator

There is a way to list them with some keyword so that auto-tester pulls them too. Can't remember the exact magic though...

@ghost Unknown referenced this pull request
Closed

Add some static array traits #924

@andralex
Owner

LGTM once it passes the tester

@denis-sh

It can't pass the tester as it requires functions from unmerged pulls.

std/algorithm.d
@@ -1443,56 +1443,86 @@ unittest
// move
/**
-Moves $(D source) into $(D target) via a destructive
-copy. Specifically: $(UL $(LI If $(D hasAliasing!T) is true (see
-$(XREF traits, hasAliasing)), then the representation of $(D source)
-is bitwise copied into $(D target) and then $(D source = T.init) is
-evaluated.) $(LI Otherwise, $(D target = source) is evaluated.)) See
-also $(XREF exception, pointsTo).
+Moves $(D source) into $(D target).
+
+Specifically:
+$(UL
+ $(LI Do nothing if $(D &source is &target) (for the first overload only).)
@klickverbot Collaborator

»do« -> »does«, etc. for conformity.

Done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
denis-sh added some commits
@denis-sh denis-sh Simplify and fix complicated and error prone `move`
Issues fixed:
* `move` is over-complicated
* `move` doesn't work with qualified structs
* `move` doesn't work with const/immutable non-struct types
* for a nested struct, `move` tries to call destructor on possibly invalid struct filled by `init` except with a context pointer of `target` (such mix is unexpected and invalid e.g. if default constructor is disabled)
* `move` fails do the correct mix of `init` and context pointer of `target` in previous point in some cases
* lots of other issues I'm to tired to search in list here
67c1118
@denis-sh denis-sh Fix and improve `move` documentation 6b71413
@denis-sh denis-sh Incorporate performance optimization suggested by @blackwhale e6a79de
@DmitryOlshansky
Collaborator

@denis-sh
Still inclined to go for it?
How about rebasing and putting in in all dependant primitives as private helpers?

Another matter is making it CTFE-able with yours rawCopy
http://denis-sh.github.com/phobos-additions/unstd.array.html#rawCopy
but probably we'd better start with this fix alone.

@DmitryOlshansky
Collaborator

Failed to penetrate. More over issues mentioned in commit comments seem to be addressed in the meantime.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Commits on Nov 3, 2012
  1. @denis-sh

    Move `move` precondition to `in` block

    denis-sh authored
    * also improve precondition skipping comment
Commits on Jan 14, 2013
  1. @denis-sh

    Simplify and fix complicated and error prone `move`

    denis-sh authored
    Issues fixed:
    * `move` is over-complicated
    * `move` doesn't work with qualified structs
    * `move` doesn't work with const/immutable non-struct types
    * for a nested struct, `move` tries to call destructor on possibly invalid struct filled by `init` except with a context pointer of `target` (such mix is unexpected and invalid e.g. if default constructor is disabled)
    * `move` fails do the correct mix of `init` and context pointer of `target` in previous point in some cases
    * lots of other issues I'm to tired to search in list here
  2. @denis-sh
  3. @denis-sh
This page is out of date. Refresh to see the latest.
Showing with 90 additions and 70 deletions.
  1. +90 −70 std/algorithm.d
View
160 std/algorithm.d
@@ -1443,56 +1443,85 @@ unittest
// move
/**
-Moves $(D source) into $(D target) via a destructive
-copy. Specifically: $(UL $(LI If $(D hasAliasing!T) is true (see
-$(XREF traits, hasAliasing)), then the representation of $(D source)
-is bitwise copied into $(D target) and then $(D source = T.init) is
-evaluated.) $(LI Otherwise, $(D target = source) is evaluated.)) See
-also $(XREF exception, pointsTo).
+Moves $(D source) into $(D target).
+
+Specifically:
+$(UL
+ $(LI Does nothing if $(D &source is &target) (for the first overload only).
+ )
+ $(LI Destroys $(D target) if needed (for the first overload only, see
+ $(XREF traits, hasElaborateDestructor))
+ )
+ $(LI Bitwise copies $(D source) into $(D target).
+ )
+ $(LI If $(D hasElaborateCopyConstructor!T || hasElaborateDestructor!T)
+ is $(D true) (see $(XREF traits, hasElaborateCopyConstructor)),
+ then sets $(D source) to $(D T.init).
+ )
+)
+
+Note:
+
+move is $(RED dangerous) as it works for $(B every) type $(D T) whatever
+qualifiers or $(D const)/$(D immutable) members it has.
+So it's the caller responsibility to use move properly.
+
+Example:
+---
+// Consider three logically equal functions except `f1` and `f2` require
+// type `S` to not disable default construction.
+void f1(S s)
+{
+ { S tmp; } // <- Destructor (if any) call for `S.init` here
+ g(s);
+}
+
+void f2(S s)
+{
+ S moved;
+ move(s, moved);
+ g(moved);
+ // <- Same destructor call here
+}
+
+void f3(S s)
+{
+ S moved = move(s); // <- Same destructor call here
+ // Also a temporary for `move` return value may be created, resulting
+ // in a postblit and, then, a destructor call for a bitwise copy of `s`,
+ // but it likely will be removed by optimizer (e.i. no temporary created).
+ // This optimization is called NRVO (Named Return Value Optimization).
+ g(moved);
+}
+---
+
+See also $(XREF exception, pointsTo), $(GLOSSARY nrvo).
Preconditions:
$(D &source == &target || !pointsTo(source, source))
*/
void move(T)(ref T source, ref T target)
-{
- if (&source == &target) return;
- assert(!pointsTo(source, source));
- static if (is(T == struct))
- {
- // Most complicated case. Destroy whatever target had in it
- // and bitblast source over it
- static if (hasElaborateDestructor!T) typeid(T).destroy(&target);
-
- memcpy(&target, &source, T.sizeof);
-
- // If the source defines a destructor or a postblit hook, we must obliterate the
- // object in order to avoid double freeing and undue aliasing
- static if (hasElaborateDestructor!T || hasElaborateCopyConstructor!T)
- {
- static T empty;
- static if (T.tupleof.length > 0 &&
- T.tupleof[$-1].stringof.endsWith("this"))
- {
- // If T is nested struct, keep original context pointer
- memcpy(&source, &empty, T.sizeof - (void*).sizeof);
- }
- else
- {
- memcpy(&source, &empty, T.sizeof);
- }
- }
- }
+in { assert(&source == &target || !pointsTo(source, source)); }
+body
+{
+ // Performance optimization:
+ // Do not compare addresses if we don't have side effects,
+ // will not need addresses in `memcpy`, and T fits in register.
+ static if (hasElaborateCopyConstructor!T || hasElaborateAssign!T ||
+ hasElaborateDestructor!T || !isAssignable!T ||
+ T.sizeof > size_t.sizeof)
+ if (&source == &target) return;
+
+ static if (hasElaborateDestructor!T)
+ destruct(target, false);
+
+ static if (hasElaborateAssign!T || !isAssignable!T)
+ memcpy(cast(void*) &target, cast(const void*) &source, T.sizeof);
else
- {
- // Primitive data (including pointers and arrays) or class -
- // assignment works great
target = source;
- // static if (is(typeof(source = null)))
- // {
- // // Nullify the source to help the garbage collector
- // source = null;
- // }
- }
+
+ static if (hasElaborateCopyConstructor!T || hasElaborateDestructor!T)
+ setToInitialState(source);
}
unittest
@@ -1511,6 +1540,12 @@ unittest
move(s11, s12);
assert(s11.a == 10 && s11.b == 11 && s12.a == 10 && s12.b == 11);
+ shared S1 sharedS11, sharedS12;
+ move(sharedS11, sharedS12);
+
+ const int constI11, constI12;
+ void constTest(in int ci1) { const ci2 = move(ci1); }
+
static struct S2 { int a = 1; int * b; }
S2 s21 = { 10, null };
s21.b = new int;
@@ -1548,39 +1583,19 @@ unittest
/// Ditto
T move(T)(ref T source)
{
- // Can avoid to check aliasing.
+ // Can avoid to check aliasing here.
- T result = void;
- static if (is(T == struct))
+ static if (hasElaborateCopyConstructor!T || hasElaborateDestructor!T)
{
- // Can avoid destructing result.
-
- memcpy(&result, &source, T.sizeof);
-
- // If the source defines a destructor or a postblit hook, we must obliterate the
- // object in order to avoid double freeing and undue aliasing
- static if (hasElaborateDestructor!T || hasElaborateCopyConstructor!T)
- {
- static T empty;
- static if (T.tupleof.length > 0 &&
- T.tupleof[$-1].stringof.endsWith("this"))
- {
- // If T is nested struct, keep original context pointer
- memcpy(&source, &empty, T.sizeof - (void*).sizeof);
- }
- else
- {
- memcpy(&source, &empty, T.sizeof);
- }
- }
+ T result = void;
+ memcpy(cast(void*) &result, cast(const void*) &source, T.sizeof);
+ setToInitialState(source);
+ return result;
}
else
{
- // Primitive data (including pointers and arrays) or class -
- // assignment works great
- result = source;
+ return source;
}
- return result;
}
unittest
@@ -1597,6 +1612,9 @@ unittest
S1 s12 = move(s11);
assert(s11.a == 10 && s11.b == 11 && s12.a == 10 && s12.b == 11);
+ shared S1 sharedS11, sharedS12 = move(sharedS11);
+ void constTest(in int ci1, in int ci2) { move(ci1, ci2); }
+
static struct S2 { int a = 1; int * b; }
S2 s21 = { 10, null };
s21.b = new int;
@@ -1664,6 +1682,8 @@ unittest// Issue 8057
int x;
~this()
{
+ // Struct always can equal to its `init`
+ if(this == S.init) return;
// Access to enclosing scope
assert(n == 10);
}
Something went wrong with that request. Please try again.