Skip to content

Conversation

@kinke
Copy link
Member

@kinke kinke commented Apr 25, 2015

This is an attempt to fix runnable/sdtor.d:3438.

Reduced test case:

import core.exception, core.stdc.stdio;

struct S {
    __gshared int count;
    ~this() { ++count; }
}

class C
{
    this(S s, int f, S s2) {}
}

int foo(bool flag)
{
    if (flag)
        throw new Exception("hello");
    return 1;
}

void main()
{
    bool flag = true;
    bool threw = false;
    try { new C(S(), foo(flag), S()); }
    catch (Exception) { threw = true; }
    assert(threw == flag);
    printf("S.count = %d\n", S.count);
    if (flag)
        assert(S.count == 1);
    else
        assert(S.count == 2);
}

Without this patch, the first two argument expressions S() and foo(flag) for C::this() cannot be resolved to the variables (__pfx and __pfy) declared for them in DMD's functionParameters(). Their declaration expressions (and dtor gates) form a CommaExp chain represented by NewExp::argprefix which is currently not codegen'd at all.

This patch replaces the first arg subexpression of any NewExp with argprefix, arg0 to include argprefix during codegen. It works for above test case if flag is set to false and no exception is thrown. The two variables are allocated on the caller's stack; S instances are then passed in memory (byval) to C's ctor, and C's ctor then destructs the copies.

The problem is that if an exception occurs when evaluating argprefix, we need to destruct all until then constructed objects at the call site, as C's ctor isn't invoked.

@kinke kinke changed the title Codegen for NewExp::argprefix. WIP: Codegen for NewExp::argprefix. Apr 25, 2015
@kinke kinke mentioned this pull request Apr 25, 2015
@kinke kinke force-pushed the argprefix branch 2 times, most recently from f1b89ce to 6a452b1 Compare April 25, 2015 15:36
@kinke
Copy link
Member Author

kinke commented Apr 25, 2015

This argprefix is only set by the front-end in case any arg expression may throw and previously constructed objects need to be destructed (dlang/dmd#4078). So ideally, we'd try to evaluate argprefix right before the constructor call, and only destruct it if its evaluation failed due to an exception, as the constructor takes care of destruction on success, i.e., something like:

try { <evaluate argprefix>; }
catch { <destruct>; rethrow; }
<call ctor>;

One issue of this approach arises when the NewExp is itself evaluated as part of a toElemDtor() call. In this case, I think the variables might be destructed a second time...
Thoughts? :)

/edit: This seems more feasible:

bool destructArgprefix = true;
try
{
    <evaluate argprefix>;
    destructArgprefix = false;
}
finally
{
    if (destructArgprefix)
        <destruct>;
}

<call ctor>;

@kinke
Copy link
Member Author

kinke commented Apr 27, 2015

The 3rd commit fixes this issue for NewExp expressions. Above test works for flag=true too.
runnable/sdtor.d still fails in line 3414 though, for a CallExp (no NewExp). The front-end simply replaces the original CallExp by argprefix, f(). So currently, the objects are destructed by the callee if no exception occurs, otherwise they aren't destructed at all.
Do we destruct CommaExp expressions properly?

@kinke
Copy link
Member Author

kinke commented Apr 28, 2015

Seems like we don't:

import core.stdc.stdio;

struct Struct { ~this() { printf("dtor\n"); } }
int foo(bool doThrow) { if (doThrow) throw new Exception("bla"); return 1; }

void main()
{
    bool doThrow = true;
    int a;
    try
    {
        a = (Struct(), foo(doThrow)); // or Struct(), a = foo(doThrow);
        printf("exiting try\n");
    }
    catch (Exception e) { printf("catch\n"); }
}

The dtor isn't invoked if foo() throws. We should probably destruct a CommaExp in a finally block, similar to argprefix (but no need for a gate).

@kinke
Copy link
Member Author

kinke commented May 3, 2015

The issue can be mitigated by (1) always destructing in a finally block, i.e., enforcing mode DestructInFinally (no DestructNormally), and (2) using an explicit CallExp::argprefix with implicit dtor gate for LDC, analog to NewExp::argprefix (to avoid destructing twice, as the callee takes care of proper parameter destruction). This leads to the creation of an unused landing pad block though if there's no potentially throwing call during the lifetime of the temporary instance (featuring a destructor). @redstar @klickverbot Would that be acceptable? Otherwise we'd have to scan the whole expression tree in toElemDtor() for a potentially throwing operation to determine whether to use DestructInFinally or simply DestructNormally...

Another example of currently failing code:

import core.stdc.stdio;

struct Struct { int a = 6; ~this() { printf("dtor\n"); } }
int foo(bool doThrow) { printf("foo()\n"); if (doThrow) throw new Exception("bla"); return 1; }

void main()
{
    bool doThrow = true;
    int r = Struct().a + foo(doThrow);
}

The destructor of the temporary Struct instance is currently not invoked if foo() throws. It's not just CommaExp, but also BinExp and (all?) other multi-expressions...

@dnadlinger
Copy link
Member

To be honest, it has been a while since I last looked at that piece of code, so @redstar or @AlexeyProkhin might be the better people to ask.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This prototype can be removed.

@redstar
Copy link
Member

redstar commented May 15, 2015

@kinke The SearchVarsWithDestructors already scans the expression tree for temporaries which must be destructed. Can this visitor reused / extended?

The problem here is that the frontend AST is not well suited for our code generation. An additional lowering step is missing. (Insert the try ... finally code into the AST.) This was discussed earlier and rejected because the frontend AST is not stable enough.

@dnadlinger
Copy link
Member

There is still the option of convincing people to change the upstream AST accordingly.

@kinke
Copy link
Member Author

kinke commented Jul 6, 2015

I've been thinking about this issue and how to fix it properly.

Currently, most expressions are codegen'd via toElem(), some via toElemDtor() (e.g., expression trees in statements or secondary subexpressions). toElemDtor() uses an additional visitor to find all DeclarationExp introducing new temporaries and invokes their destructors at the end of that expression, and, if the outer-most expression is a function call or an assert, also adds a landing pad so that the destructors are invoked when an exception occurs. This is obviously not enough, as an expression tree may be way more complex. For example, in that landing pad, we can't simply invoke all destructors, as some objects may not exist yet if the exception occurs rather early. And just looking at the root of an expression tree to check for potentially thrown exceptions is by no means enough.

That design is a problem imo. I'm in favor of collecting all temporaries when toElem()ing an expression tree, right when visiting a DeclarationExp in ToElemVisitor. Then, when calling any potentially throwing function (incl. the assert runtime functions), a landing pad should be prepared which will destruct all temporaries declared so far. At the end of the outer-most toElem() call, all declared temporaries should be destructed too, i.e., destroy all temporaries at the end of the statement if no exception has occurred.
There'd be no need for toElemDtor() anymore if one doesn't mind destructing all temporaries at the end of the statement and not earlier. There'd be no chance for leaking temporaries, and we wouldn't have to keep multiple visitors in sync. A toElemDtor() could obviously still be implemented (pretty easily) to keep fine-grained scopes for temporaries.

There are some issues with this approach though. Firstly, toElem() calls are re-entrant, so one probably needs a stack of ToElemVisitors/temporaries. Furthermore, the destructors themselves (VarDeclaration::edtor) are expressions and need to be codegen'd via ToElemVisitor as well.
Another problem is the cached lvalue of an expression tree. The nested lvalue of an expression tree is currently pre-evaluated before evaluating the tree (using the cached lvalue for the lvalue expression in the 2nd pass), just to get ahold of its llvm::Value. A more elegant solution would be to store the lvalue in ToElemVisitor when visiting an appropriate lvalue expression, and pass that lvalue up to ancestor expressions like (Bin)AssignExp. That might also get rid of the ugly LDC-specific cachedLvalue in front-end Expressions.

That'd obviously require quite some effort. What do you think about this approach? And how should exceptions thrown by destructors be handled?

@redstar
Copy link
Member

redstar commented Jul 11, 2015

Your approach seems reasonable. It extends the current search for temporaries to the required level.

@kinke
Copy link
Member Author

kinke commented Jul 12, 2015

Thanks Kai. The 4th commit is a 1st sketch. It compiles druntime, but phobos fails (only) for std.stdio (Linux x86_64, merge-2.067, LLVM 3.6):

Instruction does not dominate all uses!
  %__tmpfordtor1934 = alloca %"std.internal.cstring.tempCString!(char, char).tempCString.Res", align 8
  call void @_D3std8internal7cstring21__T11tempCStringTaTaZ11tempCStringFNbNixAaZ3Res6__dtorMFNbNiZv(%"std.internal.cstring.tempCString!(char, char).tempCString.Res"* %__tmpfordtor1934), !dbg !4585
LLVM ERROR: Broken function found, compilation aborted!

I guess I messed up the scope somewhere.

@dnadlinger
Copy link
Member

Self-contained test case:

struct Foo {
    ~this() {}
    int a;
    alias a this;
}

void callee(int, lazy string);

void caller(string s) {
    callee(Foo(), "prefix" ~ s);
}

The issue is that the delegate function body for the lazy parameter is codegen'd while the calling statement is, and picks up the temporaries from within the caller because of the way static is used.

I think the correct solution is to drop all uses of static (which is not exactly a clean solution anyway) and keep the state in IrFunction. This way, this problem will be resolved naturally.

At some point in the future we should also consider explicitly passing around the visitor(s) though all the codegen functions instead of relying on global state.

@kinke
Copy link
Member Author

kinke commented Jul 14, 2015

Thanks David. The static was/still is a hack as I hadn't figured out a proper place for the state yet. IrFunction seems fine.

@kinke
Copy link
Member Author

kinke commented Jul 14, 2015

I chose FuncGen, which already contains other stacks. Seems to work quite well so far; the lazy issue is fixed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reordered the two blocks here, so that the more probable assertPassed block comes before assertFailed. I find the resulting .ll code way more readable this way.
Unfortunately, it's not that simple for landing pads - their basic blocks currently need to be created to push (finalize) a landing pad. The normally expected postinvoke basic block is created later, before the invoke instruction and thus placed after the landing pad, leading to .ll code in which following the expected control flow is overly complicated, especially wrt. nested invokes (e.g, in the landing pad).

@kinke
Copy link
Member Author

kinke commented Jul 16, 2015

Alright.

import core.stdc.stdio;

struct Struct
{
    int a = 6;
    alias a this;
    ~this() { printf("dtor\n"); }
}

int foo(bool doThrow)
{
    printf("foo()\n");
    if (doThrow)
        throw new Exception("bla");
    return 1;
}

void main()
{
    bool doThrow = true;
    int r = (Struct(), foo(false), Struct()) + foo(doThrow);
}

now yields this .ll:

define i32 @_Dmain({ i64, { i64, i8* }* } %unnamed) #0 {
  %doThrow = alloca i8, align 1                   ; [#uses = 2 type = i8*]
  %r = alloca i32, align 4                        ; [#uses = 1 type = i32*]
  %__slStruc2 = alloca %toelem.Struct, align 4    ; [#uses = 6 type = %toelem.Struct*]
  %.structliteral = alloca %toelem.Struct         ; [#uses = 2 type = %toelem.Struct*]
  %__slStruc3 = alloca %toelem.Struct, align 4    ; [#uses = 4 type = %toelem.Struct*]
  %.structliteral1 = alloca %toelem.Struct        ; [#uses = 2 type = %toelem.Struct*]
  store i8 1, i8* %doThrow
  %1 = getelementptr %toelem.Struct* %.structliteral, i32 0, i32 0 ; [#uses = 1 type = i32*]
  store i32 6, i32* %1
  %2 = bitcast %toelem.Struct* %__slStruc2 to i8* ; [#uses = 1 type = i8*]
  %3 = bitcast %toelem.Struct* %.structliteral to i8* ; [#uses = 1 type = i8*]
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %2, i8* %3, i64 4, i32 1, i1 false)
  %tmp = invoke i32 @_D6toelem3fooFbZi(i1 false)
          to label %postinvoke unwind label %temporariesLandingPad ; [#uses = 0 type = i32]

postinvoke:                                       ; preds = %0
  %4 = getelementptr %toelem.Struct* %.structliteral1, i32 0, i32 0 ; [#uses = 1 type = i32*]
  store i32 6, i32* %4
  %5 = bitcast %toelem.Struct* %__slStruc3 to i8* ; [#uses = 1 type = i8*]
  %6 = bitcast %toelem.Struct* %.structliteral1 to i8* ; [#uses = 1 type = i8*]
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %5, i8* %6, i64 4, i32 1, i1 false)
  %7 = getelementptr %toelem.Struct* %__slStruc3, i32 0, i32 0 ; [#uses = 1 type = i32*]
  %8 = load i8* %doThrow                          ; [#uses = 1 type = i8]
  %9 = trunc i8 %8 to i1                          ; [#uses = 1 type = i1]
  %tmp4 = invoke i32 @_D6toelem3fooFbZi(i1 %9)
          to label %postinvoke3 unwind label %temporariesLandingPad2 ; [#uses = 1 type = i32]

postinvoke3:                                      ; preds = %postinvoke
  %10 = load i32* %7                              ; [#uses = 1 type = i32]
  %11 = add i32 %10, %tmp4                        ; [#uses = 1 type = i32]
  store i32 %11, i32* %r
  invoke void @_D6toelem6Struct6__dtorMFZv(%toelem.Struct* %__slStruc3)
          to label %postinvoke10 unwind label %temporariesLandingPad9

postinvoke10:                                     ; preds = %postinvoke3
  call void @_D6toelem6Struct6__dtorMFZv(%toelem.Struct* %__slStruc2)
  ret i32 0

temporariesLandingPad9:                           ; preds = %postinvoke3
  %landing_pad11 = landingpad { i8*, i32 } personality i32 (i32, i32, i64, i8*, i8*)* @_d_eh_personality
          cleanup                                 ; [#uses = 2 type = { i8*, i32 }]
  %12 = extractvalue { i8*, i32 } %landing_pad11, 0 ; [#uses = 1 type = i8*]
  %13 = extractvalue { i8*, i32 } %landing_pad11, 1 ; [#uses = 0 type = i32]
  call void @_D6toelem6Struct6__dtorMFZv(%toelem.Struct* %__slStruc2)
  call void @_d_eh_resume_unwind(i8* %12)
  unreachable

temporariesLandingPad2:                           ; preds = %postinvoke
  %landing_pad5 = landingpad { i8*, i32 } personality i32 (i32, i32, i64, i8*, i8*)* @_d_eh_personality
          cleanup                                 ; [#uses = 2 type = { i8*, i32 }]
  %14 = extractvalue { i8*, i32 } %landing_pad5, 0 ; [#uses = 1 type = i8*]
  %15 = extractvalue { i8*, i32 } %landing_pad5, 1 ; [#uses = 0 type = i32]
  invoke void @_D6toelem6Struct6__dtorMFZv(%toelem.Struct* %__slStruc3)
          to label %postinvoke7 unwind label %temporariesLandingPad6

postinvoke7:                                      ; preds = %temporariesLandingPad2
  call void @_D6toelem6Struct6__dtorMFZv(%toelem.Struct* %__slStruc2)
  call void @_d_eh_resume_unwind(i8* %14)
  unreachable

temporariesLandingPad6:                           ; preds = %temporariesLandingPad2
  %landing_pad8 = landingpad { i8*, i32 } personality i32 (i32, i32, i64, i8*, i8*)* @_d_eh_personality
          cleanup                                 ; [#uses = 2 type = { i8*, i32 }]
  %16 = extractvalue { i8*, i32 } %landing_pad8, 0 ; [#uses = 1 type = i8*]
  %17 = extractvalue { i8*, i32 } %landing_pad8, 1 ; [#uses = 0 type = i32]
  call void @_D6toelem6Struct6__dtorMFZv(%toelem.Struct* %__slStruc2)
  call void @_d_eh_resume_unwind(i8* %16)
  unreachable

temporariesLandingPad:                            ; preds = %0
  %landing_pad = landingpad { i8*, i32 } personality i32 (i32, i32, i64, i8*, i8*)* @_d_eh_personality
          cleanup                                 ; [#uses = 2 type = { i8*, i32 }]
  %18 = extractvalue { i8*, i32 } %landing_pad, 0 ; [#uses = 1 type = i8*]
  %19 = extractvalue { i8*, i32 } %landing_pad, 1 ; [#uses = 0 type = i32]
  call void @_D6toelem6Struct6__dtorMFZv(%toelem.Struct* %__slStruc2)
  call void @_d_eh_resume_unwind(i8* %18)
  unreachable
}

Pseudo-code translation of the main statement int r = (Struct(), foo(false), Struct()) + foo(doThrow);:

auto __slStruc2 = Struct();
try { foo(false); }
catch // temporariesLandingPad
{
    destruct __slStruc2;
    throw;
}

// no throw: postinvoke
auto __slStruc3 = Struct();
int tmp4;
try { tmp4 = foo(doThrow); }
catch // temporariesLandingPad2
{
    try { destruct __slStruc3; }
    catch // temporariesLandingPad6
    {
        destruct __slStruc2;
        throw;
    }

    // no throw: postinvoke7
    destruct __slStruc2;
    throw;
}

// no throw: postinvoke3
int r = __slStruc3 + tmp4;
try { destruct __slStruc3; }
catch // temporariesLandingPad9
{
    destruct __slStruc2;
    throw;
}

// no throw: postinvoke10
destruct __slStruc2;

kinke added 6 commits July 18, 2015 13:34
We'd like to evaluate expression NewExp::argprefix and only destruct
it if an exception interrupts its evaluation.
The front-end uses a single bool variable as gate for all destructors
(by replacing each dtor expression by `gate || dtor()`).
This flag is set at the end of the argprefix chain of CommaExp.
So when no exception occurs in argprefix, the dtors aren't invoked in
the inserted finally block.
Right after the finally block we have the call to NewExp's ctor, which
will take care of destructing its parameters.
@kinke
Copy link
Member Author

kinke commented Jul 18, 2015

There's at least one major issue remaining - if a temporary's constructor throws, it is later destructed in a landing pad. That's because the temporary is declared by a nested DeclarationExp before invoking its constructor method within a CallExp.
This is the reason for failing test13095 in runnable/sdtor.d. test13661, test14022, test14023 and test13669 fail too, all involving static arrays. The previous issues appear to be fixed.

@dnadlinger
Copy link
Member

For which case was "Restore toElemDtor() functionality." necessary? It seemed like the tests already looked quite good before that.

@dnadlinger dnadlinger closed this Jul 18, 2015
@dnadlinger dnadlinger reopened this Jul 18, 2015
@kinke
Copy link
Member Author

kinke commented Jul 18, 2015

toElemDtor() is still necessary, e.g., for DMD Bugzilla 8360 (an assert's msg is only to be evaluated (and destructed) if the assert fails). For the other cases, it's primarily useful to shorten the temporariesLandingPads.
The 8th commit fixes DMD issue 13095.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DMD uses a similarly ugly hack to detect this case.

@dnadlinger
Copy link
Member

Ah okay, thanks for the explanation! I'm merging this because it is clearly better than what we currently have, and to make collaboration easier.

dnadlinger added a commit that referenced this pull request Jul 19, 2015
WIP: Codegen for NewExp::argprefix.
@dnadlinger dnadlinger merged commit 055d893 into ldc-developers:merge-2.067 Jul 19, 2015
@smolt
Copy link
Member

smolt commented Jul 21, 2015

FYI - compile times for std/uni.d unittest soared after this commit. It went from 8 seconds to 4 minutes for x86_64 OS X and 8 seconds to 20 minutes for ARM.

@kinke
Copy link
Member Author

kinke commented Jul 24, 2015

@smolt Wow, that's extreme. I haven't noticed any significant slow-down on Linux x64, but haven't measured the compile time for std.uni yet. Note that I've fixed a memory leak in the meantime.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants