WIP: Codegen for NewExp::argprefix. #914

kinke · 2015-04-25T12:23:51Z

This is an attempt to fix runnable/sdtor.d:3438.

Reduced test case:

import core.exception, core.stdc.stdio;

struct S {
    __gshared int count;
    ~this() { ++count; }
}

class C
{
    this(S s, int f, S s2) {}
}

int foo(bool flag)
{
    if (flag)
        throw new Exception("hello");
    return 1;
}

void main()
{
    bool flag = true;
    bool threw = false;
    try { new C(S(), foo(flag), S()); }
    catch (Exception) { threw = true; }
    assert(threw == flag);
    printf("S.count = %d\n", S.count);
    if (flag)
        assert(S.count == 1);
    else
        assert(S.count == 2);
}

Without this patch, the first two argument expressions S() and foo(flag) for C::this() cannot be resolved to the variables (__pfx and __pfy) declared for them in DMD's functionParameters(). Their declaration expressions (and dtor gates) form a CommaExp chain represented by NewExp::argprefix which is currently not codegen'd at all.

This patch replaces the first arg subexpression of any NewExp with argprefix, arg0 to include argprefix during codegen. It works for above test case if flag is set to false and no exception is thrown. The two variables are allocated on the caller's stack; S instances are then passed in memory (byval) to C's ctor, and C's ctor then destructs the copies.

The problem is that if an exception occurs when evaluating argprefix, we need to destruct all until then constructed objects at the call site, as C's ctor isn't invoked.

kinke · 2015-04-25T15:50:58Z

This argprefix is only set by the front-end in case any arg expression may throw and previously constructed objects need to be destructed (dlang/dmd#4078). So ideally, we'd try to evaluate argprefix right before the constructor call, and only destruct it if its evaluation failed due to an exception, as the constructor takes care of destruction on success, i.e., something like:

try { <evaluate argprefix>; }
catch { <destruct>; rethrow; }
<call ctor>;

One issue of this approach arises when the NewExp is itself evaluated as part of a toElemDtor() call. In this case, I think the variables might be destructed a second time...
Thoughts? :)

/edit: This seems more feasible:

bool destructArgprefix = true;
try
{
    <evaluate argprefix>;
    destructArgprefix = false;
}
finally
{
    if (destructArgprefix)
        <destruct>;
}

<call ctor>;

kinke · 2015-04-27T22:16:36Z

The 3rd commit fixes this issue for NewExp expressions. Above test works for flag=true too.
runnable/sdtor.d still fails in line 3414 though, for a CallExp (no NewExp). The front-end simply replaces the original CallExp by argprefix, f(). So currently, the objects are destructed by the callee if no exception occurs, otherwise they aren't destructed at all.
Do we destruct CommaExp expressions properly?

kinke · 2015-04-28T19:24:25Z

Seems like we don't:

import core.stdc.stdio;

struct Struct { ~this() { printf("dtor\n"); } }
int foo(bool doThrow) { if (doThrow) throw new Exception("bla"); return 1; }

void main()
{
    bool doThrow = true;
    int a;
    try
    {
        a = (Struct(), foo(doThrow)); // or Struct(), a = foo(doThrow);
        printf("exiting try\n");
    }
    catch (Exception e) { printf("catch\n"); }
}

The dtor isn't invoked if foo() throws. We should probably destruct a CommaExp in a finally block, similar to argprefix (but no need for a gate).

kinke · 2015-05-03T20:45:25Z

The issue can be mitigated by (1) always destructing in a finally block, i.e., enforcing mode DestructInFinally (no DestructNormally), and (2) using an explicit CallExp::argprefix with implicit dtor gate for LDC, analog to NewExp::argprefix (to avoid destructing twice, as the callee takes care of proper parameter destruction). This leads to the creation of an unused landing pad block though if there's no potentially throwing call during the lifetime of the temporary instance (featuring a destructor). @redstar @klickverbot Would that be acceptable? Otherwise we'd have to scan the whole expression tree in toElemDtor() for a potentially throwing operation to determine whether to use DestructInFinally or simply DestructNormally...

Another example of currently failing code:

import core.stdc.stdio;

struct Struct { int a = 6; ~this() { printf("dtor\n"); } }
int foo(bool doThrow) { printf("foo()\n"); if (doThrow) throw new Exception("bla"); return 1; }

void main()
{
    bool doThrow = true;
    int r = Struct().a + foo(doThrow);
}

The destructor of the temporary Struct instance is currently not invoked if foo() throws. It's not just CommaExp, but also BinExp and (all?) other multi-expressions...

dnadlinger · 2015-05-11T18:01:02Z

To be honest, it has been a while since I last looked at that piece of code, so @redstar or @AlexeyProkhin might be the better people to ask.

redstar · 2015-05-15T16:58:27Z

gen/llvmhelpers.h

This prototype can be removed.

redstar · 2015-05-15T17:15:55Z

@kinke The SearchVarsWithDestructors already scans the expression tree for temporaries which must be destructed. Can this visitor reused / extended?

The problem here is that the frontend AST is not well suited for our code generation. An additional lowering step is missing. (Insert the try ... finally code into the AST.) This was discussed earlier and rejected because the frontend AST is not stable enough.

dnadlinger · 2015-05-15T18:48:43Z

There is still the option of convincing people to change the upstream AST accordingly.

kinke · 2015-07-06T09:00:23Z

I've been thinking about this issue and how to fix it properly.

Currently, most expressions are codegen'd via toElem(), some via toElemDtor() (e.g., expression trees in statements or secondary subexpressions). toElemDtor() uses an additional visitor to find all DeclarationExp introducing new temporaries and invokes their destructors at the end of that expression, and, if the outer-most expression is a function call or an assert, also adds a landing pad so that the destructors are invoked when an exception occurs. This is obviously not enough, as an expression tree may be way more complex. For example, in that landing pad, we can't simply invoke all destructors, as some objects may not exist yet if the exception occurs rather early. And just looking at the root of an expression tree to check for potentially thrown exceptions is by no means enough.

That design is a problem imo. I'm in favor of collecting all temporaries when toElem()ing an expression tree, right when visiting a DeclarationExp in ToElemVisitor. Then, when calling any potentially throwing function (incl. the assert runtime functions), a landing pad should be prepared which will destruct all temporaries declared so far. At the end of the outer-most toElem() call, all declared temporaries should be destructed too, i.e., destroy all temporaries at the end of the statement if no exception has occurred.
There'd be no need for toElemDtor() anymore if one doesn't mind destructing all temporaries at the end of the statement and not earlier. There'd be no chance for leaking temporaries, and we wouldn't have to keep multiple visitors in sync. A toElemDtor() could obviously still be implemented (pretty easily) to keep fine-grained scopes for temporaries.

There are some issues with this approach though. Firstly, toElem() calls are re-entrant, so one probably needs a stack of ToElemVisitors/temporaries. Furthermore, the destructors themselves (VarDeclaration::edtor) are expressions and need to be codegen'd via ToElemVisitor as well.
Another problem is the cached lvalue of an expression tree. The nested lvalue of an expression tree is currently pre-evaluated before evaluating the tree (using the cached lvalue for the lvalue expression in the 2nd pass), just to get ahold of its llvm::Value. A more elegant solution would be to store the lvalue in ToElemVisitor when visiting an appropriate lvalue expression, and pass that lvalue up to ancestor expressions like (Bin)AssignExp. That might also get rid of the ugly LDC-specific cachedLvalue in front-end Expressions.

That'd obviously require quite some effort. What do you think about this approach? And how should exceptions thrown by destructors be handled?

redstar · 2015-07-11T22:33:08Z

Your approach seems reasonable. It extends the current search for temporaries to the required level.

kinke · 2015-07-12T22:35:38Z

Thanks Kai. The 4th commit is a 1st sketch. It compiles druntime, but phobos fails (only) for std.stdio (Linux x86_64, merge-2.067, LLVM 3.6):

Instruction does not dominate all uses!
  %__tmpfordtor1934 = alloca %"std.internal.cstring.tempCString!(char, char).tempCString.Res", align 8
  call void @_D3std8internal7cstring21__T11tempCStringTaTaZ11tempCStringFNbNixAaZ3Res6__dtorMFNbNiZv(%"std.internal.cstring.tempCString!(char, char).tempCString.Res"* %__tmpfordtor1934), !dbg !4585
LLVM ERROR: Broken function found, compilation aborted!

I guess I messed up the scope somewhere.

dnadlinger · 2015-07-13T22:51:56Z

Self-contained test case:

struct Foo {
    ~this() {}
    int a;
    alias a this;
}

void callee(int, lazy string);

void caller(string s) {
    callee(Foo(), "prefix" ~ s);
}

The issue is that the delegate function body for the lazy parameter is codegen'd while the calling statement is, and picks up the temporaries from within the caller because of the way static is used.

I think the correct solution is to drop all uses of static (which is not exactly a clean solution anyway) and keep the state in IrFunction. This way, this problem will be resolved naturally.

At some point in the future we should also consider explicitly passing around the visitor(s) though all the codegen functions instead of relying on global state.

kinke · 2015-07-14T18:28:49Z

Thanks David. The static was/still is a hack as I hadn't figured out a proper place for the state yet. IrFunction seems fine.

kinke · 2015-07-14T21:16:55Z

I chose FuncGen, which already contains other stacks. Seems to work quite well so far; the lazy issue is fixed.

kinke · 2015-07-16T18:33:26Z

gen/toir.cpp

I've reordered the two blocks here, so that the more probable assertPassed block comes before assertFailed. I find the resulting .ll code way more readable this way.
Unfortunately, it's not that simple for landing pads - their basic blocks currently need to be created to push (finalize) a landing pad. The normally expected postinvoke basic block is created later, before the invoke instruction and thus placed after the landing pad, leading to .ll code in which following the expected control flow is overly complicated, especially wrt. nested invokes (e.g, in the landing pad).

kinke · 2015-07-16T19:36:03Z

Alright.

import core.stdc.stdio;

struct Struct
{
    int a = 6;
    alias a this;
    ~this() { printf("dtor\n"); }
}

int foo(bool doThrow)
{
    printf("foo()\n");
    if (doThrow)
        throw new Exception("bla");
    return 1;
}

void main()
{
    bool doThrow = true;
    int r = (Struct(), foo(false), Struct()) + foo(doThrow);
}

now yields this .ll:

define i32 @_Dmain({ i64, { i64, i8* }* } %unnamed) #0 {
  %doThrow = alloca i8, align 1                   ; [#uses = 2 type = i8*]
  %r = alloca i32, align 4                        ; [#uses = 1 type = i32*]
  %__slStruc2 = alloca %toelem.Struct, align 4    ; [#uses = 6 type = %toelem.Struct*]
  %.structliteral = alloca %toelem.Struct         ; [#uses = 2 type = %toelem.Struct*]
  %__slStruc3 = alloca %toelem.Struct, align 4    ; [#uses = 4 type = %toelem.Struct*]
  %.structliteral1 = alloca %toelem.Struct        ; [#uses = 2 type = %toelem.Struct*]
  store i8 1, i8* %doThrow
  %1 = getelementptr %toelem.Struct* %.structliteral, i32 0, i32 0 ; [#uses = 1 type = i32*]
  store i32 6, i32* %1
  %2 = bitcast %toelem.Struct* %__slStruc2 to i8* ; [#uses = 1 type = i8*]
  %3 = bitcast %toelem.Struct* %.structliteral to i8* ; [#uses = 1 type = i8*]
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %2, i8* %3, i64 4, i32 1, i1 false)
  %tmp = invoke i32 @_D6toelem3fooFbZi(i1 false)
          to label %postinvoke unwind label %temporariesLandingPad ; [#uses = 0 type = i32]

postinvoke:                                       ; preds = %0
  %4 = getelementptr %toelem.Struct* %.structliteral1, i32 0, i32 0 ; [#uses = 1 type = i32*]
  store i32 6, i32* %4
  %5 = bitcast %toelem.Struct* %__slStruc3 to i8* ; [#uses = 1 type = i8*]
  %6 = bitcast %toelem.Struct* %.structliteral1 to i8* ; [#uses = 1 type = i8*]
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %5, i8* %6, i64 4, i32 1, i1 false)
  %7 = getelementptr %toelem.Struct* %__slStruc3, i32 0, i32 0 ; [#uses = 1 type = i32*]
  %8 = load i8* %doThrow                          ; [#uses = 1 type = i8]
  %9 = trunc i8 %8 to i1                          ; [#uses = 1 type = i1]
  %tmp4 = invoke i32 @_D6toelem3fooFbZi(i1 %9)
          to label %postinvoke3 unwind label %temporariesLandingPad2 ; [#uses = 1 type = i32]

postinvoke3:                                      ; preds = %postinvoke
  %10 = load i32* %7                              ; [#uses = 1 type = i32]
  %11 = add i32 %10, %tmp4                        ; [#uses = 1 type = i32]
  store i32 %11, i32* %r
  invoke void @_D6toelem6Struct6__dtorMFZv(%toelem.Struct* %__slStruc3)
          to label %postinvoke10 unwind label %temporariesLandingPad9

postinvoke10:                                     ; preds = %postinvoke3
  call void @_D6toelem6Struct6__dtorMFZv(%toelem.Struct* %__slStruc2)
  ret i32 0

temporariesLandingPad9:                           ; preds = %postinvoke3
  %landing_pad11 = landingpad { i8*, i32 } personality i32 (i32, i32, i64, i8*, i8*)* @_d_eh_personality
          cleanup                                 ; [#uses = 2 type = { i8*, i32 }]
  %12 = extractvalue { i8*, i32 } %landing_pad11, 0 ; [#uses = 1 type = i8*]
  %13 = extractvalue { i8*, i32 } %landing_pad11, 1 ; [#uses = 0 type = i32]
  call void @_D6toelem6Struct6__dtorMFZv(%toelem.Struct* %__slStruc2)
  call void @_d_eh_resume_unwind(i8* %12)
  unreachable

temporariesLandingPad2:                           ; preds = %postinvoke
  %landing_pad5 = landingpad { i8*, i32 } personality i32 (i32, i32, i64, i8*, i8*)* @_d_eh_personality
          cleanup                                 ; [#uses = 2 type = { i8*, i32 }]
  %14 = extractvalue { i8*, i32 } %landing_pad5, 0 ; [#uses = 1 type = i8*]
  %15 = extractvalue { i8*, i32 } %landing_pad5, 1 ; [#uses = 0 type = i32]
  invoke void @_D6toelem6Struct6__dtorMFZv(%toelem.Struct* %__slStruc3)
          to label %postinvoke7 unwind label %temporariesLandingPad6

postinvoke7:                                      ; preds = %temporariesLandingPad2
  call void @_D6toelem6Struct6__dtorMFZv(%toelem.Struct* %__slStruc2)
  call void @_d_eh_resume_unwind(i8* %14)
  unreachable

temporariesLandingPad6:                           ; preds = %temporariesLandingPad2
  %landing_pad8 = landingpad { i8*, i32 } personality i32 (i32, i32, i64, i8*, i8*)* @_d_eh_personality
          cleanup                                 ; [#uses = 2 type = { i8*, i32 }]
  %16 = extractvalue { i8*, i32 } %landing_pad8, 0 ; [#uses = 1 type = i8*]
  %17 = extractvalue { i8*, i32 } %landing_pad8, 1 ; [#uses = 0 type = i32]
  call void @_D6toelem6Struct6__dtorMFZv(%toelem.Struct* %__slStruc2)
  call void @_d_eh_resume_unwind(i8* %16)
  unreachable

temporariesLandingPad:                            ; preds = %0
  %landing_pad = landingpad { i8*, i32 } personality i32 (i32, i32, i64, i8*, i8*)* @_d_eh_personality
          cleanup                                 ; [#uses = 2 type = { i8*, i32 }]
  %18 = extractvalue { i8*, i32 } %landing_pad, 0 ; [#uses = 1 type = i8*]
  %19 = extractvalue { i8*, i32 } %landing_pad, 1 ; [#uses = 0 type = i32]
  call void @_D6toelem6Struct6__dtorMFZv(%toelem.Struct* %__slStruc2)
  call void @_d_eh_resume_unwind(i8* %18)
  unreachable
}

Pseudo-code translation of the main statement int r = (Struct(), foo(false), Struct()) + foo(doThrow);:

auto __slStruc2 = Struct();
try { foo(false); }
catch // temporariesLandingPad
{
    destruct __slStruc2;
    throw;
}

// no throw: postinvoke
auto __slStruc3 = Struct();
int tmp4;
try { tmp4 = foo(doThrow); }
catch // temporariesLandingPad2
{
    try { destruct __slStruc3; }
    catch // temporariesLandingPad6
    {
        destruct __slStruc2;
        throw;
    }

    // no throw: postinvoke7
    destruct __slStruc2;
    throw;
}

// no throw: postinvoke3
int r = __slStruc3 + tmp4;
try { destruct __slStruc3; }
catch // temporariesLandingPad9
{
    destruct __slStruc2;
    throw;
}

// no throw: postinvoke10
destruct __slStruc2;

We'd like to evaluate expression NewExp::argprefix and only destruct it if an exception interrupts its evaluation.

The front-end uses a single bool variable as gate for all destructors (by replacing each dtor expression by `gate || dtor()`). This flag is set at the end of the argprefix chain of CommaExp. So when no exception occurs in argprefix, the dtors aren't invoked in the inserted finally block. Right after the finally block we have the call to NewExp's ctor, which will take care of destructing its parameters.

…m().

kinke · 2015-07-18T19:54:41Z

There's at least one major issue remaining - if a temporary's constructor throws, it is later destructed in a landing pad. That's because the temporary is declared by a nested DeclarationExp before invoking its constructor method within a CallExp.
This is the reason for failing test13095 in runnable/sdtor.d. test13661, test14022, test14023 and test13669 fail too, all involving static arrays. The previous issues appear to be fixed.

dnadlinger · 2015-07-18T21:24:44Z

For which case was "Restore toElemDtor() functionality." necessary? It seemed like the tests already looked quite good before that.

kinke · 2015-07-18T23:08:00Z

toElemDtor() is still necessary, e.g., for DMD Bugzilla 8360 (an assert's msg is only to be evaluated (and destructed) if the assert fails). For the other cases, it's primarily useful to shorten the temporariesLandingPads.
The 8th commit fixes DMD issue 13095.

kinke · 2015-07-18T23:20:03Z

gen/toir.cpp

DMD uses a similarly ugly hack to detect this case.

dnadlinger · 2015-07-19T16:05:38Z

Ah okay, thanks for the explanation! I'm merging this because it is clearly better than what we currently have, and to make collaboration easier.

WIP: Codegen for NewExp::argprefix.

smolt · 2015-07-21T06:23:19Z

FYI - compile times for std/uni.d unittest soared after this commit. It went from 8 seconds to 4 minutes for x86_64 OS X and 8 seconds to 20 minutes for ARM.

kinke · 2015-07-24T13:23:32Z

@smolt Wow, that's extreme. I haven't noticed any significant slow-down on Linux x64, but haven't measured the compile time for std.uni yet. Note that I've fixed a memory leak in the meantime.

kinke changed the title ~~Codegen for NewExp::argprefix.~~ WIP: Codegen for NewExp::argprefix. Apr 25, 2015

kinke mentioned this pull request Apr 25, 2015

merge-2.067 issues #868

Closed

kinke force-pushed the argprefix branch 2 times, most recently from f1b89ce to 6a452b1 Compare April 25, 2015 15:36

kinke force-pushed the argprefix branch from 6a452b1 to 03eed30 Compare April 25, 2015 16:25

redstar reviewed May 15, 2015
View reviewed changes

gen/llvmhelpers.h Outdated

Copy link

Member

redstar May 15, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This prototype can be removed.

kinke force-pushed the argprefix branch from 39a8cea to e747e3a Compare July 12, 2015 22:22

kinke force-pushed the argprefix branch from e747e3a to a4e9cc9 Compare July 14, 2015 18:21

kinke force-pushed the argprefix branch from 7ba3efa to 6d026db Compare July 16, 2015 18:20

kinke reviewed Jul 16, 2015
View reviewed changes

kinke added 6 commits July 18, 2015 13:34

Codegen for NewExp::argprefix.

1672b5d

Prepare for more flexible toElemDtor().

9f1e6f1

We'd like to evaluate expression NewExp::argprefix and only destruct it if an exception interrupts its evaluation.

First attempt to simplify and fix destruction of temporaries in toEle…

2304b52

…m().

Move temporaries-related state from ToElemVisitor to FuncGen.

cf52b53

After invoke instruction: move postinvoke block before landing pad.

deabad0

kinke force-pushed the argprefix branch from 3891004 to 9f262f0 Compare July 18, 2015 11:48

Restore toElemDtor() functionality.

b9c77c2

kinke force-pushed the argprefix branch from 9f262f0 to b9c77c2 Compare July 18, 2015 15:18

dnadlinger closed this Jul 18, 2015

dnadlinger reopened this Jul 18, 2015

Don't destruct a temporary if its constructor throws.

6fefec5

kinke reviewed Jul 18, 2015
View reviewed changes

gen/toir.cpp

Copy link

Member Author

kinke Jul 18, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DMD uses a similarly ugly hack to detect this case.

dnadlinger added a commit that referenced this pull request Jul 19, 2015

Merge pull request #914 from kinke/argprefix

055d893

WIP: Codegen for NewExp::argprefix.

dnadlinger merged commit 055d893 into ldc-developers:merge-2.067 Jul 19, 2015

kinke mentioned this pull request Jul 30, 2015

2.067 branch: Enormous compile time increase after dtor rework #1010

Closed

kinke deleted the argprefix branch August 24, 2017 19:43

Uh oh!

WIP: Codegen for NewExp::argprefix. #914

WIP: Codegen for NewExp::argprefix. #914

Uh oh!

Conversation

kinke commented Apr 25, 2015

Uh oh!

kinke commented Apr 25, 2015

Uh oh!

kinke commented Apr 27, 2015

Uh oh!

kinke commented Apr 28, 2015

Uh oh!

kinke commented May 3, 2015

Uh oh!

dnadlinger commented May 11, 2015

Uh oh!

redstar May 15, 2015

Choose a reason for hiding this comment

Uh oh!

redstar commented May 15, 2015

Uh oh!

dnadlinger commented May 15, 2015

Uh oh!

kinke commented Jul 6, 2015

Uh oh!

redstar commented Jul 11, 2015

Uh oh!

kinke commented Jul 12, 2015

Uh oh!

dnadlinger commented Jul 13, 2015

Uh oh!

kinke commented Jul 14, 2015

Uh oh!

kinke commented Jul 14, 2015

Uh oh!

kinke Jul 16, 2015

Choose a reason for hiding this comment

Uh oh!

kinke commented Jul 16, 2015

Uh oh!

kinke commented Jul 18, 2015

Uh oh!

dnadlinger commented Jul 18, 2015

Uh oh!

kinke commented Jul 18, 2015

Uh oh!

kinke Jul 18, 2015

Choose a reason for hiding this comment

Uh oh!

dnadlinger commented Jul 19, 2015

Uh oh!

smolt commented Jul 21, 2015

Uh oh!

kinke commented Jul 24, 2015

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants