rules for when deinit is called #11534

mppf · 2018-10-31T19:44:22Z

To the extent the language specification says anything about it, I think it says that variables local to a block are de-initialized at the end of the block. This issue proposes that we adjust the rule.

Here is what I want the rule to be:

Local variables, including temporaries are deinitialized at or before the end of the enclosing block
- the exception is values that are returned; these are instead transferred to the caller
Global variables (possibly including temporaries - see module-scope variables and temporaries #11526) are deinitialized at or before program shutdown
What does "or before" mean in the above? The compiler can move the deinitialization from the end of the enclosing block / end of the program to an earlier point if it is valid to do so. A valid point for a variable has the following properties:
- the variable is no longer used directly (i.e. the variable itself is dead)
- there are no references potentially pointing to that variable
- there are no borrows potentially from that variable
Because deinitializers can have user-visible side effects (such as flushing a buffer, printing something to standard out, updating a file), the exact point at which these run (as well as the program behavior) can vary across Chapel versions.

Implementing such analysis would allow the resolution of #11492.

Examples to think about:

I/O

For the case in #11492, we have a statement like f.writer().write(A); and would like the channel that is the result of writer to be deinited sooner than it is today. This approach would allow that.

Array Slices

proc makeArray() {
  var A:[1..100] int;
  return A;
}
proc main() {
  ref slice = makeArray()[1..10];
  // point 1
  writeln(slice);
  writeln();
  // point 2
}

In this example, there is a reference to a call-expression temporary storing the slice, which in turn refers to a call-expression temporary storing the whole array. As a result, the compiler may not deinitialize either the array or the slice array-view of it at point 1. It must perform this deinitialization after these are no longer used, at point 2.

Owned and Borrows

proc main() {
  var instance: borrowed C = new owned C(1);
  // point 1
  writeln(instance);
  writeln();
  // point 2
}

This case is similar to the array slice case. Here instance is a borrow from an unnamed temporary storing an owned C. The owned C may not be deinitialized until after instance is dead; that is only at point 2.

User Variables vs Temporaries

Many of the important cases to call deinit earlier apply to temporaries, but I'm proposing that this behavior also apply to user variables. E.g.:

var someRecord = new MyRecord();
... code not mentioning someRecord ...

Here someRecord could be deinitd immediately after it is created.

RAII Lock

Chapel doesn't have a C++-style RAII lock type and it's unclear if we will add that in the future. However if we did add it, the lock type could use an attribute (see #7355) to communicate to the compiler that deinit for that type should not be moved. What would a C++-style RAII lock look like in Chapel? E.g.

{
   var lock = takeLock();
   // lock deinit function releases lock, so lock released at the end of this block
}

I suspect that there are clearer ways of writing this pattern. Python with contexts come to mind as a good alternative.

The text was updated successfully, but these errors were encountered:

benharsh · 2018-10-31T23:18:48Z

Maybe instead of the array example you could use this code:

class Data {
  var x : [1..10] int;
}

record MyArr {
  var data = new owned Data();

  proc this(r : range(?,?,?)) {
    return new MySlice(r, data.borrow());
  }
}

record MySlice {
  var dom : domain(1);
  var data : borrowed Data;

  proc init(r : range(?,?,?), data : borrowed Data) {
    this.dom = {r};
    this.data = data;
  }

  proc writeThis(f) {
    f.write(data.x[dom]);
  }
}

proc arrayMaker() {
  return new MyArr();
}

proc main() {
  const ref slice = arrayMaker()[1..5];
  writeln(slice);
}

Which I think demonstrates the same issue but with less internal/special behind-the-scenes code (i.e. array views).

benharsh · 2018-10-31T23:29:41Z

If we're using the rule "at or before" can you actually rely on f.writer().write(A); doing what you want it to do? It seems like we should either say that temporaries are deinit'd at the end of the statement (with the ref-to-call-expr case being an exception), or say they're deinit'd at the end of the scope.

Edit: Otherwise I think it would become difficult for users to reason about the lifetime of their temporary/user variables by reading the code.

bradcray · 2018-11-01T00:03:11Z

I had the same reaction / question as @benharsh just above, but he typed it faster than I managed to.

mppf · 2018-11-01T15:04:54Z

If we're using the rule "at or before" can you actually rely on f.writer().write(A); doing what you want it to do? It seems like we should either say that temporaries are deinit'd at the end of the statement (with the ref-to-call-expr case being an exception), or say they're deinit'd at the end of the scope.

Yes, I think it could be relied upon in that case, because the compiler would do a certain minimal level of analysis once at the time we add this feature, and that would be enough to cover that case. In more complex cases, I'd expect the behavior of programs to potentially change over time.

We could certainly consider specifying in more detail beyond what I put there. In particular, one reasonable approach would be to specify that the temporaries are either deleted at the end of the statement or at the end of the block. (I.e. without necessarily saying which one the compiler will choose in a given case).

I think it'd be interesting to be able to say something like:

in a ref/borrow case, the a temporary is destroyed at the end of the block
otherwise it's destroyed at the end of the statement

but the question is, can we describe in enough detail to a user what "a ref/borrow case" actually means? I certainly am confident that we can get the compiler to detect such cases conservatively, but what about the boundary of analysis, where the compiler thinks it might be a ref/borrow case but it's actually not? It's easy to create aliasing situations, for example.

So, is there a simple conservative rule you'd use to differentiate between the two, that we can specify and explain to users? If not, I think we should allow some variability depending on the strength of the analysis in the implementation.

(One common trick from alias analysis is the "address not taken" rule. Maybe there's something similar we can do here. But, in the I/O example, the channel temporary is passed by const ref to write, so it's not going to be that simple).

mppf · 2018-11-01T15:05:48Z

Maybe instead of the array example you could use this code:

I'm happy that you posted that, but I wanted to emphasize problems users can run in to with simple use of the built-in types and functions.

benharsh · 2018-11-01T17:23:53Z

I was thinking about a rule like "free temporaries at end of statement unless the statement is variable declaration". It seems like that would be enough for records*, but maybe not for owned/borrowed classes? But maybe those need to be a special case anyways? For example:

class C {
  var x : int;

  proc deinit() {
    writeln("destroy: ", x);
  }
}

proc makeOwnedC() {
  return new owned C(5);
}

proc store(ref lhs : borrowed C, rhs : borrowed C) {
  lhs = rhs;
}

proc main() {
  var x : borrowed C;
  writeln("---- before ----");
  store(x, makeOwnedC().borrow());
  writeln("---- after ----");
  writeln(x);
}

This program prints:

---- before ----
---- after ----
{x = 5}
destroy: 5

* "seems like that would be enough for records": if we can ignore passing a record to an extern function by ref...

mppf · 2018-11-01T18:52:19Z

@benharsh - your owned/borrow example is an interesting one. I'd personally lean against special-casing owned/borrowed because records can contain owned/borrowed and then we have to special-case the records and then it doesn't feel so special as much as what happens half the time :)

Anyway, for this code:

proc main() {
  var x : borrowed C;
  writeln("---- before ----");
  store(x, makeOwnedC().borrow());
  writeln("---- after ----");
  writeln(x);
}

we can observe that there was a borrow from the owned temporary. Maybe that would trigger destroying that temporary at the end of the block. Maybe the idea you have is sufficient for refs.

mppf · 2018-11-02T19:04:37Z

As far as I can tell at the moment - Brad, BenH, Vass and I are on the same page for the general direction here.

However Brad and BenH would like more specific rules that can be in the specification (and read, understood, and relied upon) than what I put in the issue description.

What remains to do is to come up with the exact specific rules.

vasslitvinov · 2018-11-02T19:33:32Z

I suggest that "valid points" must be at statement boundaries. This should implicitly include the end of the block.

@benharsh

Destroy vars in scopeless blocks at end of parent block PR #11917 added blocks around .type expressions so that these expressions could be eliminated in general. However this caused certain uses of runtime types to have the runtime type be destroyed before it could be used. This problem is visible with a valgrind run of the test arrays/dinan/init_arraymember/test7 This PR adjusts the logic in addAutoDestroyCalls / walkBlock to add destruction of variables in a scopeless block at the end of the parent block (as if the scopeless block were not present). This matches the scopeless nature of these blocks. If the block needed to be scopeless so a variable declared within it could be used later, then that variable could be used later in the parent block, and so it shouldn't be destroyed until the end of the parent block. Testing of this branch revealed that error handling constructs within scopeless blocks do not function correctly. This is addressed by adding an assert for that case and adjusting the build.cpp code for begins to use a regular block instead of a scopeless one. Note that we have some desire to adjust where variables are destroyed - see #11534. - [x] full local testing - [x] test7 passes with -valgrindexe Reviewed by @benharsh - thanks!

mppf · 2020-02-13T21:59:12Z

So, we have several options identified:

deinit all variables at end of block (what we did before PR Adjust point of deinitialization based on expiring value analysis #14691)
deinit at last mention or end of block according to expiring value analysis (on master after PR # 14691)
deinit all user variables at end of block; call temps in variable declarations at end of block; call temps in other statements after the statement (rules for when deinit is called #11534 (comment))

I'd like to add another option:

deinit user variables at last mention. For call temporaries in a variable declaration, deinitialize just after the variable declared would be deinitialized (so e.g. for a slice captured by ref, the slice is deinitialized after the last mention of the ref). For call temporaries in other calls, deinit them after the statement.

@benharsh

Use simpler rule for deinit point This PR updates the compiler to use simpler rules for when a variable is dead. The expiring value analysis (proposed in #13704) did not end up being as useful as initially expected for copy elision. As a result we are shifting to a simpler design for when copy elision can occur - #14874. That leaves the expiring value analysis impacting the deinitialization point (#11534 #11492) and the ownership transfer from non-nilable checking (#13600). However the ownership transfer checking is not significantly improved with expiring value analysis (it is approximately as effective when handled by the must-alias analysis in the nil checker). That leaves the deinit points. Since a copy elision can cause a variable to be effectively dead in the current scope after it occurs, there is some relationship between deinit points and copy elision. That is why #13704 tried to address these two together. For example: ``` chapel proc acceptsInIntent( in arg ) { ... } proc test() { var x: MyRecord; ... acceptsInIntent(x); // copy elision occurs here // x is effectively dead in this scope // accesses through refs/borrows hopefully are compilation errors // but could be use-after-frees. } ``` However since the copy elision rule is relatively simple, this PR changes the rule for when a variable is dead to simply include the copy elision rule. Besides that, it is based upon the simpler rule in #13704 / #11534 (comment) . Here is the rule as of this PR: > A variable is dead: > * after copy elision if it occurred (after the last mention is > used to copy-initialize a variable or in intent argument) > * for the results of nested call expressions not involved in initializing > a user variable or reference after the end of the statement > * otherwise, at the end of the scope in which it is declared There are two important things to note about the above rule: 1. "involved in initializing a user variable or reference" includes through split-init 2. the above rule applies to module-scope variables (aka global variables) as well as local variables. For global variables Also note that `--no-early-deinit` is available to make all temps deinited end-of-block. This PR takes the following steps to implement the change to deinitialization points: * add a shared helper isInitOrReturn to CallExpr.cpp; use it in possiblyInitsDestroyedVariable * removes --report-expiring flag and the entire expiring value analysis * adjusts the normalizer to leave temps in module init functions (instead of making them global variables) so that this can be determined later (after split-inits are known) * fixes a few bugs with nil checking * adds MarkTempsVisitor / gatherTempsDeadLastMention / markTempsDeadLastMention which are called from fixPrimInitsAndAddCasts in resolveFunction.cpp. This analyzes temp variables by how they are used to determine if they are used to initialize a user variable. Besides adding `FLAG_DEAD_LAST_MENTION` or `FLAG_DEAD_END_OF_BLOCK`, if the variable is in a module initialization function, it will pull it out into global scope. * insertReturnTemps marks the new temps since this runs after MarkTempsVisitor * adjust several tests that have use-after-frees according to the new rules: * nbody_orig_1.chpl * assignRecord.chpl * owned-class-instantiation-types-user-init.chpl * owned-class-instantiation-types.chpl * (and these also which were added after checking in PR #15072 detected an issue with them) * iter-return-first-borrowed-array-element.chpl * owned-record-instantiation-types-user-init.chpl * classes/initializers/generics/inheritance/inherited.chpl Reviewed by @benharsh - thanks! - [x] full local testing

mppf · 2020-03-10T10:55:24Z

PR #15041 implements a simpler rule for deinit points that enables this program to behave as originally expected.

A variable is dead:

after copy elision if it occurred (after the last mention is
used to copy-initialize a variable or in intent argument)

for the results of nested call expressions not involved in initializing
a user variable or reference after the end of the statement

otherwise, at the end of the scope in which it is declared

mppf added type: Design area: Language labels Oct 31, 2018

mppf changed the title ~~rules for deinitializing~~ rules for when deinit is called Oct 31, 2018

This was referenced Oct 31, 2018

I/O and Destroying temporaries at the end of the block vs end of statement #11492

Closed

module-scope variables and temporaries #11526

Closed

mppf mentioned this issue Jan 2, 2019

Destroy vars in scopeless blocks at end of parent block #11955

Merged

2 tasks

mppf mentioned this issue Apr 12, 2019

issues with eliding copies from expiring values #9490

Closed

mppf mentioned this issue Aug 12, 2019

Proposed Language Rules for Expiring Values #13704

Closed

mppf mentioned this issue Dec 19, 2019

control flow expressions and lifetimes of temporaries #10879

Open

mppf self-assigned this Jan 23, 2020

mppf mentioned this issue Mar 3, 2020

Use simpler rule for deinit point #15041

Merged

1 task

mppf closed this as completed Mar 10, 2020

mppf mentioned this issue Aug 26, 2022

Borrowed classes get immediately freed after the first #20516

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

rules for when deinit is called #11534

rules for when deinit is called #11534

mppf commented Oct 31, 2018 •

edited

benharsh commented Oct 31, 2018

benharsh commented Oct 31, 2018 •

edited

bradcray commented Nov 1, 2018

mppf commented Nov 1, 2018

mppf commented Nov 1, 2018

benharsh commented Nov 1, 2018 •

edited

mppf commented Nov 1, 2018

mppf commented Nov 2, 2018

vasslitvinov commented Nov 2, 2018

mppf commented Feb 13, 2020

mppf commented Mar 10, 2020

rules for when deinit is called #11534

rules for when deinit is called #11534

Comments

mppf commented Oct 31, 2018 • edited

I/O

Array Slices

Owned and Borrows

User Variables vs Temporaries

RAII Lock

benharsh commented Oct 31, 2018

benharsh commented Oct 31, 2018 • edited

bradcray commented Nov 1, 2018

mppf commented Nov 1, 2018

mppf commented Nov 1, 2018

benharsh commented Nov 1, 2018 • edited

mppf commented Nov 1, 2018

mppf commented Nov 2, 2018

vasslitvinov commented Nov 2, 2018

mppf commented Feb 13, 2020

mppf commented Mar 10, 2020

mppf commented Oct 31, 2018 •

edited

benharsh commented Oct 31, 2018 •

edited

benharsh commented Nov 1, 2018 •

edited