
compiler: make injectdestructors a MIR pass #450

Merged (1 commit) on Feb 17, 2023

Conversation

zerbina
Collaborator

@zerbina zerbina commented Oct 16, 2022

First, a short summary of how injectdestructors used to operate:

  1. an instruction-based control-flow graph, which also contains instructions related to data-flow analysis (i.e. use and def), is generated for the code fragment the transformations are to be applied to (in this case, either the body of a procedure or a fragment of a module's top-level code)
  2. the control- / data-flow-graph is traversed (executed) in order to figure out which uses of locations are "last reads" and which writes to them are "first writes". The PNodes corresponding to the use and def instructions for which a "last read" or "first write" is detected are marked with the respective node flags nfLastRead and nfFirstWrite.
  3. the AST is processed -- this is where the core logic happens. =destroy calls and temporaries are injected; assignments are rewritten into either =copy calls, =sink calls, or shallow copies (i.e. nkFastAsgn); copies are introduced for arguments to sink parameters where necessary; wasMoved calls are inserted when an lvalue is consumed; scopes are wrapped in try/finally statements/expression if necessary and the nkIdentDefs of relevant locals are moved to the start of their scope.
  4. the optimizer (optimizer.nim) is run, eliminating wasMoved(x); =destroy(x) pairs that cancel each other out

With the new implementation, the transformations are split into multiple passes. The control-flow graph still uses an instruction-based representation, but no longer contains the data-flow-related instructions. There are two new instructions, join and loop, that together enable loop bodies (independent of their nesting) to be evaluated for their effects only once, and also for the graph to be efficiently traversed backwards. The mirexec module implements both the CFG construction and the traversal routines.
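
To illustrate the idea (a hedged sketch only, not the actual mirexec encoding -- the instruction names and list-based program encoding here are invented), explicit join and loop instructions in a flat instruction list make it cheap to find an instruction's predecessors, which is exactly what an efficient backward traversal needs:

```python
# Sketch: a CFG as a flat instruction list. `goto`/`fork` jump forward to
# the `join` with the same label; `loop` jumps backward to it. `fork`
# additionally falls through (it models a conditional jump).
program = [
    ("join", 0),      # loop header
    ("op",   None),   # some effect inside the loop body
    ("fork", 1),      # conditional exit of the loop
    ("loop", 0),      # back-edge to the loop header
    ("join", 1),      # loop exit
    ("op",   None),
]

def predecessors(program, i):
    """All instruction indices from which control can reach `program[i]`."""
    preds = []
    op, label = program[i]
    if op == "join":
        # every goto/fork/loop targeting this join is a predecessor
        preds += [j for j, (o, l) in enumerate(program)
                  if j != i and l == label and o in ("goto", "fork", "loop")]
    # fall-through from the previous instruction, unless it jumps away
    if i > 0 and program[i - 1][0] not in ("goto", "loop"):
        preds.append(i - 1)
    return preds
```

With this, the loop header's predecessors include the back-edge's loop instruction, so a backward walk can follow back-edges directly instead of re-scanning the surrounding tree.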

Instead of analyzing all uses of all locations for "last reads" in one single CFG traversal, each usage in a consume context (e.g. something passed to a sink parameter) is now analyzed separately. This greatly simplifies the analysis and reduces the amount of state that needs to be tracked, resulting in a large speedup.

The aforementioned analysis is modeled as "solving" ownership. Each value that results from a computation gets an ownership status assigned -- a value is considered to be "owned" if there are no operations coming after the operation it resulted from that observe its presence in the location that stores it. Same as with the previous implementation, location access through aliases (i.e. a pointer) is not considered here. That is, accessing a location through a pointer is not considered to be a read or write of the underlying location. If a value is an r-value (e.g. it's the result of a procedure call and is not a view) it is always considered to be owned.
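
As a minimal sketch of that rule (the operation encoding and function name below are invented for illustration; the real analysis runs on the MIR and its control-flow graph):

```python
def is_owned(ops, use_index, location):
    """A use of `location` yields an owned value if no later operation
    observes the value's presence in that location. For simplicity, this
    sketch treats any later direct access as observing; access through
    aliases (pointers) is ignored, matching the analysis described above.
    R-values are always owned, so only lvalue uses need this check."""
    for op, loc in ops[use_index + 1:]:
        if loc == location and op in ("read", "write"):
            return False
    return True

# a straight-line sequence of direct accesses to location "s"
ops = [("read", "s"), ("write", "s"), ("read", "s")]
```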

For destructor injection, instead of tentatively injecting a destructor for each location that has a destructor and letting the optimizer elide it later, a destructor is now only inserted for locations for which it is not certain whether they store (and own) a value once control-flow exits their scope. This analysis also takes implicit moves into account.

Previously, a scope was wrapped in a try and the destructors called inside the corresponding finally clause if one of the following was present inside the scope: a call that raises or an explicit raise (Defects were ignored, independent of --panics), a break/continue statement, a return statement, or another scope containing one of the aforementioned. A finally clause is now only required if at least one location requiring destruction still stores an (owned) value when its scope is exited via unstructured control-flow. For example:

block:
  var s = @[1, 2]
  if cond:
    f_sink(s) # `s` is consumed
    raise ValueError.newException("")
  else:
    discard

s requires a destructor call at the end of the block, but a finally is not needed, since s no longer contains a value once control-flow reaches the raise. To be able to detect consumes, the alive analysis requires value ownership to be figured out already.

The actual assignment rewriting and destructor injection are performed as two separate but concurrently applied MIR passes. While it's possible to perform the copy injection for arguments that are not owned and are used in a consume context as another separate pass, it is applied as part of the assignment rewriting pass, in order to reduce the number of MirTree traversals.

When an assignment doesn't own the source operand and the value represents a resource (i.e. something that has a =destroy hook), the assignment is rewritten into a call to the type-bound =copy hook. If the source value is owned, depending on whether or not the destination location is empty, the assignment is rewritten into either a bit-wise copy (nkFastAsgn) or a call to the type-bound =sink hook. If the source is an lvalue (i.e. there exists a name for the containing location), a destructive move has to be performed (via a call to wasMoved), but only if it can't be proven that no connected write observes the now no longer owned value being present in the source location. The analysis (see needsReset and isLastWrite) takes into account if a destructor is present for the source location.
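
The rules above can be condensed into a small decision procedure (a sketch only; the predicate names are invented, and the real pass additionally has to decide whether a wasMoved reset of the source is needed):

```python
def rewrite_assignment(source_owned: bool, dest_empty: bool,
                       has_destroy_hook: bool) -> str:
    """Pick the rewrite for an assignment, following the rules above."""
    if not source_owned:
        # a non-owned source of a resource type requires a full copy
        return "=copy" if has_destroy_hook else "bitwise-copy"
    # owned source: an empty destination needs no cleanup, so a plain
    # bit-wise copy (nkFastAsgn) suffices; otherwise use `=sink`
    return "bitwise-copy" if dest_empty else "=sink"
```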

Since destructors are only injected for alive locations and wasMoved calls are only generated where necessary, the separate optimizer pass is no longer needed.

Fixes

Assignments to locations with destructors where the right-hand side is a complex expression now respect the strict left-to-right evaluation order:

var
  i = 0
  s = "str"

# works now:
(i = 1; s) =
  if i == 1: "str2"
  else: doAssert false; ""

# also works now:
proc p(s: var string): var string =
  i = 2
  s

p(s) = block:
  doAssert i == 2
  "str2"

A mutation of the argument is now considered to happen when control-flow reaches the call, not when the argument is passed to a var parameter. The difference is subtle but important:

proc f_sink(x: var seq[int], y: sink seq[int]) =
  y[0] = 2
  doAssert x == [1, 2] # passes

proc main() =
  var s = @[1, 2]
  f_sink(s, s) # a full copy is now introduced for the second argument

main()

Globals are now only moved if it can be proven that doing so does not affect program semantics. Previously, assigning a location derived from a global would perform a destructive move in certain circumstances, such as:

# at module scope
var a = "str"
var b = a # moved

proc p() =
  discard

echo a # echoes an empty string

Due to the way the new implementation works, it is now possible for locations derived from globals to be consumed (if safe) when passing them to sink parameters.

As a side-effect of the MIR's canonicalization, the alias analysis now handles tuple access correctly:

proc prc(x: var string) =
  var tup = (x: 0, y: "str")
  # `tup[1]` and `tup.y` were treated as two disjoint locations, causing the assignment
  # below to be erroneously optimized into a move
  x = tup[1]
  doAssert tup.y == "str"

var o: string
prc(o)

Details

Timings

Timings and peakmem are taken from compiling the compiler itself with the options -d:release --compileOnly and no cached compilation artifacts. The original compiler is built with -d:release --exceptions:goto.

| compiler from | bootstrapped with | using | time taken | peakmem |
| --- | --- | --- | --- | --- |
| #d7a70238d0 | --gc:refc | --gc:refc | 24.426s | 1015.844MiB |
| this commit | --gc:refc | --gc:refc | 24.687s | 1014.254MiB |
| #d7a70238d0 | --gc:refc | --gc:orc | 45.206s | 1.99GiB |
| this commit | --gc:refc | --gc:orc | 29.531s | 1.222GiB |
| #d7a70238d0 | --gc:orc | --gc:refc | 15.492s | 978.938MiB |
| this commit | --gc:orc | --gc:refc | 15.714s | 979.023MiB |
| #d7a70238d0 | --gc:orc | --gc:orc | 22.218s | 1.933GiB |
| this commit | --gc:orc | --gc:orc | 18.266s | 1.188GiB |

To-Do

  • adjust the test spec for the tests using --expandArc
  • split up tests/lang_objects/destructor/ttests.nim
  • either fix/cleanup or remove the MIR unit tests
  • move the MIR introduction into a separate PR
  • rebase this PR onto devel
  • remove the compiler/sem/optimizer.nim module -- it's unused now
  • fix the issue with procedure-level globals
  • address the remaining review comments
  • improve the documentation of mirexec.nim (partially done)
  • compile a list of pre-existing issues and newly introduced regressions
  • squash the commits and write a PR/commit message

Notes for reviewers

  • I collected a list of issues/notes in the form of review comments

@saem
Collaborator

saem commented Oct 17, 2022

I think I understand it, going to write that out for confirmation:

injectdestructors:

  1. has mirgen generate an initial MirTree (genProcBody)
  2. then comes up with a list of changes it's going to make (analyse)
  3. applies them to the MirTree
  4. converts the updated MirTree to PNode

Prior to reading it, what I was looking for:

  • that it's basically all lisp-y on the inside, the AST is a program (we briefly discussed it before in chat and it turns out we're very much on the same page)
  • translation to any intermediate should "remember" where it came from
  • the previous point is most true w.r.t. changes made for precise errors/warnings (need lineage)

It seems you've already accounted for the majority of it -- if not all and it's just my understanding falling short. I think it's mostly covered with the ChangeSet + Action which remembers the PNode it's acting upon. So errors should be able to trace their lineage back within MIR. The second bullet point I'm a little less sure about because I can't see how we could correlate the original PNode to the initial MirNode -- the final hop.

So I'm not sure how we flow the info back up to Sem in order for us to ensure we can flow constraints (mostly type info) down from Sem and warnings/errors telemetry back up. Apologies if it's just my lack of understanding of the code.

@zerbina
Collaborator Author

zerbina commented Oct 17, 2022

I think I understand it, going to write that out for confirmation:

Your understanding is correct. I apologize for not having provided a high-level overview.


I think I insufficiently described my motivation for this PR, and since that information can help in understanding some decisions behind the current implementation, here goes a second attempt:

The main contributors to my decision of working on this PR first are as follows:

  • the pre-PR injectdestructors produces AST that violates various (reasonable) expectations of irgen. For example, sometimes a produced nkStmtListExpr has no type (i.e. it isn't actually an expression) or it has a type but doesn't end in an expression. While possible to work around these things, additional logic would be required and irgen would also have to become less strict about its input.
  • the idea of implementing the move-analyser at the back-end IR level has been around since the new back-end's inception, but I no longer think that it's a good idea. Over time, the additional complexity of having the back-end IR accommodate the move-analyser started to outweigh the benefits -- the ideas I collected on how the move-analyser could be implemented work better (if at all) with a slightly higher-level IR.
  • irgen is quite complex. It does a lot of different things, like performing general transformations and lowering some magics. Then there's also the deferred type translation plus the subsequent fix-up of all type IDs required by it.

So to rephrase, the goal here is to introduce an IR that:

  • is meant to support injectdestructors, lambdalifting, and, later on, maybe also other parts of transf (e.g. closureiterators, although this one will probably become unnecessary with the introduction of CPS transforms) efficiently operating on it
  • works as the input to irgen

My current plan for how the flow should look in the near term:

input ->
  transform() ->               # `transf`
    toMir() ->                 # currently `genProcBody`
      lambdalifting() ->
        injectdestructors() ->
          backend()            # `irgen`

Using MirNode in its current form as the input to the new back-end (or more specifically, irgen) would have the following benefits already:

  • multiple transformations previously performed during irgen (listed in the PR's description) are now performed in mirgen. This would simplify irgen a lot.
  • the elision of nodes irrelevant to the back-end (e.g. nkBindStmt, nkTypeDef, etc) together with the linear tree structure make scanning of dependencies (types, procedures, constants, literals, etc.) trivial, rendering, among other things, deferred type translation for irgen obsolete.

The main reason behind the MirNode -> PNode translation is to keep the current code-generators working without having to keep two implementations of injectdestructors around.


The second bullet point I'm a little less sure about because I can't see how we could correlate the original PNode to the initial MirNode -- the final hop.

A MirNode's origin is currently not kept track of (maybe to a source location, but not to a PNode) - this is something that's still missing.

So I'm not sure how we flow the info back up to Sem in order for us to ensure we can flow constraints (mostly type info) down from Sem and warnings/errors telemetry back up.

I'm not sure I understand. In which case is it required for information to flow back from the MIR to Sem? Sorry if it's obvious.

@saem
Collaborator

saem commented Oct 18, 2022

Thank you for the detailed explanation. The motivations make a lot of sense and the overall flow is sensible. I didn't need it, but consider me extra convinced. 😁


So I'm not sure how we flow the info back up to Sem in order for us to ensure we can flow constraints (mostly type info) down from Sem and warnings/errors telemetry back up.

I'm not sure I understand. In which case is it required for information to flow back from the MIR to Sem? Sorry if it's obvious.

Somewhat of a guess on my part, as we're doing lifetime analysis here, then any invariant enforcement would also need to be here?

Possibilities aside, at the time of my prior comment I had done a very quick search in the current inject destructors to look for localReport and saw a few hits raising sem errors. I'm presuming we'll have to do the same-ish. Not a very thorough investigation admittedly.

Reading between the lines we'll likely end up with origin being tracked and nothing about the existing approach seems to make that an impossibility, I think you nailed it. 🎉

compiler/sem/mirtraverse.nim (review thread; outdated, resolved)
@zerbina
Collaborator Author

zerbina commented Nov 2, 2022

The most recent commit contains a partial redesign of the MIR.

The previous iteration was a simple, linear-tree-based IR, which was structurally similar to PNode (albeit with a lot fewer node kinds). Except for things like symbols, literals, and other atoms, expressions and statements all used forward trees. That is, first comes the head node (mnkCall for example), then all child-nodes (which can themselves be other sub-trees), and then the mnkEnd node.

One of the main differences when compared to PNode AST is the omission of complex expressions (if, case, try, block, and statement list expressions) in MIR -- those are transformed into assignments to temporaries. This was done to simplify both data-flow analysis and further code modifications. For example:

let x =
  if a: true
  else: false

is translated to the MIR equivalent of:

var tmp: bool
if a:
  tmp = true
else:
  tmp = false
let x = tmp

A significant issue with this is the loss of information about the control-flow dependencies between separate expressions. Consider:

proc fsink(a: sink string, b: string) = ...

fsink(notLastRead x, (if cond: y else: z))

which got transformed to the MIR equivalent of:

var tmp: string
if cond:
  tmp = y
else:
  tmp = z

fsink(notLastRead x, tmp)

x needs to be copied, but it's not clear at which point, as the argument expressions x and tmp are entirely unrelated in the IR. While the observable effects would be minor in this specific case (string copies happening in a different order), injecting a copy operation directly before the call is wrong, as the copy is required to happen as part of evaluating the respective argument expression.

One solution to this problem would be to assign every argument expression to a temporary first, use the temporaries as the actual arguments, and then inject the copy (or any other operations performed as part of injectdestructors, e.g. sinking into an argument) at the position where the respective temporary is defined. This has the significant downside that these extra constraints and behaviour are not expressed explicitly in the IR.

There also were other issues, so I opted for a partial redesign instead.


The core of the redesign is a shift to a representation where the focus is more on the flow of values. Expressions are now not tree-based anymore but are instead a sequence of applying operations to lvalues and values. Except for if and case, statements and their representation were left untouched -- they still use sub-trees.

Each sequence starts with an input (e.g. a literal or the name of some entity), is followed by zero or more in-out operations, and ends in a sink. A sink is a kind of operation that doesn't produce a value. For example, both if and raise statements are sinks. A procedure call is treated as always returning a value and has to be explicitly followed by a dedicated void sink. Multiple arguments to an operation are supported via argument blocks and the arg sink.

As a simple demonstration:

doSomething(x, "a", (echo "a"; y.b.c.int))

would be represented in the MIR as:

argBlock:
  proc doSomething -> arg
  local x -> arg
  lit "a" -> arg
  argBlock:
    proc echo -> arg
    lit "a" -> arg
  end argBlock ->
  call -> void
  local y -> path -> stdConv -> arg
end argBlock ->
call -> void

Note that the callee procedure is now also only an argument to the call operation.

Reading the above top-to-bottom left-to-right yields the nodes in the order they're placed in memory. The arrows are only meant for visualizing the flow of values. For documentation purposes, I wrote down the new grammar as a doc comment in compiler/sem/mirtrees.nim.

I originally wanted path expressions to use their own IR, mainly because this would allow for de-duplicating them and assigning each one a unique ID, making comparing two path expressions for equality very efficient.

The separate IR is also how they're currently implemented, but none of the expected benefits have materialized so far (comparing two paths for equality hasn't come up yet). There are many not-yet-solved issues, and de-duplicating them is complicated by the fact that conversions can be part of paths, so sameType is required as part of the comparison (or it would be at least -- de-duplication is not yet implemented).
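
To illustrate why conversions get in the way of de-duplication (a sketch with an invented path encoding; same_type is a stand-in for the compiler's sameType):

```python
def same_type(a, b):
    # placeholder for the compiler's structural `sameType` comparison
    return a == b

def paths_equal(a, b):
    """Compare two access paths. Conversion nodes need a type-aware
    comparison, so plain structural equality (and with it, cheap
    de-duplication via unique IDs) is not enough on its own."""
    if len(a) != len(b):
        return False
    for (kind_a, val_a), (kind_b, val_b) in zip(a, b):
        if kind_a != kind_b:
            return False
        if kind_a == "conv":
            if not same_type(val_a, val_b):
                return False
        elif val_a != val_b:
            return False
    return True
```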

In general, there are still a lot of open questions, and some constructs, such as user-ops, are under- or unspecified. Nonetheless, I believe that in its current state, the refactored MIR is already better than the previous iteration.

Related to the MIR changes, there's an experiment with a small template-based DSL for making MIR generation easier -- see mirconstr for the DSL and mirgen for how it's currently applied.


I also made general progress in regards to the actual injectdestructors rewrite, and did some down-scoping by removing the unfinished attempt of a re-implementation of cursor inference. Half of the tests in the destructors and arc categories succeed now.

There are some pre-existing issues with some of the rewrite-rules, mainly in how they interact (or not) with unstructured control-flow. I'll create tests for them, but am probably not going to fix them as part of this PR.

@saem
Collaborator

saem commented Nov 6, 2022

I'm a little (very) sleep deprived, so my review wasn't very insightful, I gather -- feel free to ignore the review comments if they're just unhelpful. I've read the code as best as I can and revisited your comment; I think the new approach is better (operations on l-values).

I'll take another peek tomorrow -- hopefully I'll be able to understand things better then. I wouldn't consider that a blocker.


Aside: a minor thing, but perhaps there might be some value in differentiating between callReturnValue and callUnit or something like that in order to drop the void requirement. But I'm pretty motivated to destroy conditionals/nested code.

compiler/sem/mirtrees.nim (review thread; outdated, resolved)
## This module contains the type definitions for the mid-end IR (MIR) together
## with routines for querying them.
##
## Routines for traversing the used flat-tree-structure are also located here.
Collaborator


Bringing over the high level description of the MIR approach from your recent PR comment would be a nice to have.

I'm popping back and forth because my crappy working memory has to continually remind itself of what's up. 😅

Comment on lines 6 to 9
import
compiler/ast/[
ast_types
]
Collaborator


Casual remark:

I had a fun little realization. I know this module is the entry point, in terms of learning, for the MIR "package" because it's got basically no dependencies. Paired with some instructive comment that's a pretty nice cue in terms of navigating this bit of the compiler.

compiler/sem/mirmodify.nim (review thread; outdated, resolved)
compiler/sem/mirtraverse.nim (two review threads; outdated, resolved)

import
compiler/ast/ast_types,
compiler/sem/mirtrees
Collaborator


any particular reason to separate these out from mirtrees?

Collaborator Author


The construction procedures and utilities are only needed when generating new MIR code. For just comprehending and processing MIR (e.g. mirtraverse.nim, astgen.nim, etc.), they're not needed, so I thought it'd make sense to put them into a separate module.

Collaborator


Fair enough, I can go either way on this sort of thing. Maaaaybe consider renaming it mirtrees_constr.nim so it feels a bit subordinate to the other module. Might hint at "don't read me first".

@haxscramper haxscramper added compiler General compiler tag compiler/sem Related to semantic-analysis system of the compiler refactor Implementation refactor labels Nov 20, 2022
@haxscramper haxscramper added this to the C backend refactoring milestone Nov 22, 2022
@zerbina zerbina changed the title WIP: rewrite injectdestructors compiler: rewrite injectdestructors as MIR passes Dec 3, 2022
@zerbina zerbina marked this pull request as ready for review December 3, 2022 22:31
compiler/ast/ast_query.nim (two review threads; outdated, resolved)
Collaborator

@saem saem left a comment


I'm suggesting small bits like doc comments right now as I'm refreshing my memory; thankfully the complexity is largely in the subject matter and not the implementation. 😄

Observations thus far:

  • the code is clean and I feel like if there were issues I could test and fix them
  • similar to above, I can see myself or others extending it
  • this obviously took a lot of effort and it shows, thank you thank you!
  • my confidence in the change is mostly around CI at this point

🎉


I'm marking it approved. I'll keep reading and if I find areas of improvement that meaningfully simplify things or can easily be extracted, I'll note them in the PR.

compiler/mir/analysis.nim (three review threads; outdated, resolved)
## Stores information about the produced values plus the lvalue effects of
## operations
values: IndexedSeq[OpValue, ValueInfo]
# XXX: `values` currently stores an entry for each *node*. Not every node
Collaborator


Tables are really really really slow. :(

They absolutely tanked performance when I was first prototyping a DOD AST with them.

Comment on lines 114 to 115
## Computes the control-flow graph for the given `tree`. This is a very
## expensive operation!
Collaborator


Suggested change
## Computes the control-flow graph for the given `tree`. This is a very
## expensive operation!
## Computes the control-flow graph for the given `tree`. This is a very
## expensive operation! The high cost is due to two essential reasons:
##
## 1. a control flow graph needs to materialize all paths for a given `tree`
## 2. the amount of allocations/bookkeeping necessary to do so

This is partly me just confirming my understanding, the first point I'm fairly confident about and the second one is more me guessing by looking at the push/pop of sequences and the Table in ClosureEnv.

Aside, I spent a few minutes trying to reason about replacing that table and the best I could figure was maybe the following:

type
  ClosureEnv = object
    # ... existing bits except joins
    joinedNodes: seq[NodePosition]
    nodeIsJoined: PackedSet[NodePosition]

proc addJoin(env: var ClosureEnv, dest: NodePosition): JoinPoint =
  let (id, isNew) =
    if env.nodeIsJoined.containsOrIncl(dest):
      env.nodeIsJoined.add dest
      (env.joinedNodes.len - 1, true)
    else:
      (env.joinedNodes.find dest, false)  # likely more efficient with a reverse find

  JoinPoint(id: id, node: dest, isNew: isNew)

This is mostly a test of my understanding and a bit of fun, I doubt it's an improvement.

Collaborator Author


I wouldn't say that the control-flow graph needs to materialize all paths for tree, rather only all edges (in the form of instructions). However, I might be misunderstanding what you mean.

Regarding the reason for why computeCfg is expensive, the last time I profiled it, the by far largest amount of time was spent inside the tree traversal routines. For a general perspective, using a -d:release compiler built with --gc:orc to compile itself with --gc:orc as the benchmark, more time (~15%) is spent in computeCfg than in mirgen (i.e. genProcBody).

I did another profiling session using the newest commit -- the six most time consuming calls inside computeCfg are:

| Procedure | Time spent |
| --- | --- |
| mirtrees.findEnd | 0.401 % |
| mirtrees.parent | 0.224 % |
| algorithm.sort | 0.0476 % |
| commit | 0.0174 % |
| mirtrees.findParent | 0.0174 % |
| addEdge | 0.011 % |

The "Time spent" is a percentage of the total compilation time. Using either recursion or a skip-list (or both) could reduce the amount of time spent on tree traversal.


The snippet you posted is, apart from two problems that I believe are unintended, correct. I was curious and ran a comparison between your code and the current implementation. Instrumenting a release compiler that is booted with --gc:orc and using it to compile the devel compiler with --gc:orc yields:

| | Time taken (in milliseconds) | Allocations | Deallocations |
| --- | --- | --- | --- |
| Table | 151 | 35219 | 17779 |
| PackedSet + seq + find | 152 - 153 | 36865 | 19425 |
| PackedSet + seq + rfind | 154 - 155 | 36865 | 19425 |

Collaborator


The allocations really dictate performance, kinda sad that PackedSets weren't able to get ahead. I'll have to look closer especially once the AST starts to get the DOD treatment as I'd been hoping they'd do a lot of heavy lifting in a few areas.

@zerbina
Collaborator Author

zerbina commented Dec 9, 2022

Thank you for the review @haxscramper and @saem!

The for-loop PR should fix the issue that currently causes a C compiler error for nimgrep.

@zerbina
Collaborator Author

zerbina commented Dec 22, 2022

Merging the MIR is progressing rather slowly, as there are a lot of separate fixes required first. The most recent ones are fixes to the JS code-generator that are blocking the merging of #498, which is the PR with the fixes for the issue that is making bootstrapping fail in the Linux CI for this PR (the bug causing nimgrep to not compile is already fixed in the MIR PR).

@saem saem assigned saem and unassigned saem Jan 8, 2023
@zerbina
Collaborator Author

zerbina commented Feb 10, 2023

The PR is now rebased onto devel (a slightly outdated version of it), meaning that the things already split out and merged separately are no longer part of the changeset.

As for the test failures, the relevant ones are caused by the temporary injection pass breaking the assumptions of mirgen.

mirgen translates the definitions (i.e. nkIdentDefs) of procedure-level globals in a way so that the resulting MIR code is turned back into an nkVarSection + nkIdentDefs by astgen, but with the injection of temporaries now happening, the assumption made by mirgen that the code it generates reaches astgen without modifications is no longer true.

Since the fix does not depend on this PR and requires changes to transf and the code-generators, I'm going to implement it in a separate PR.

@zerbina zerbina marked this pull request as draft February 12, 2023 16:04
@zerbina zerbina marked this pull request as ready for review February 15, 2023 00:12
Collaborator Author

@zerbina zerbina left a comment


After fixing the remaining bugs (the backward traversal algorithm didn't work for nested loops, which I only noticed due to a single test failing with Valgrind reporting a leak) all tests pass, and I consider this PR to be done (apart from squashing and writing a commit message).

While fixing many bugs the old implementation had, a few new ones were introduced. I noted them down, together with their impact, via review comments, but in general think that none of them are severe enough to block this PR. The most critical one can be hacked around with a medium amount of effort, though I'd prefer not to do so as part of this PR.

Comment on lines +125 to +128
# FIXME: non-lvalue-conversions also count as reads, but because no
# distinction is being made between lvalue- and value-conversions
# at the MIR level, they're currently not considered. This is an
# issue and it needs to be fixed
Collaborator Author


This is a small pre-existing issue that dfa.nim also has. In the current implementation it's only a cosmetic issue, but it should eventually be fixed nonetheless. I postponed it, since the fix requires adjustments to the MIR.

Example:

var x = float(0.0)
discard int(x)
# `x` is treated as being used when control-flow reaches the `discard`
# statement, not when control-flow reaches the conversion  

Collaborator
@saem saem Feb 16, 2023

So, paired with very aggressive/eager free placement, this could introduce a bug where we free before the last use?

I agree that this is minor, IIUC we're not that aggressive currently, nor would we want to be in the future.

Collaborator Author

It's very likely that there's never going to be an issue, even if we're freeing very aggressively. The value resulting from a non-lvalue conversion is always used directly after, and there's nothing in between where a destruction could be placed.
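To make the "nothing in between" point concrete, here is a minimal Nim snippet (an illustration of the language-level situation, not compiler code):

```nim
# The value produced by the conversion ``int(x)`` is consumed by the
# surrounding statement immediately; there is no statement boundary
# between the conversion and its use, so even an eager destructor
# placement has no spot to inject a premature ``=destroy(x)``.
var x = float(0.0)
let converted = int(x) # conversion result is used right away
doAssert converted == 0
```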

Comment on lines +965 to +971
# Another possible improvement is moving the actual hook injection
# to a follow-up pass. The pass here would only inject ``mCopyAsgn`` and
# ``mSinkAsgn`` magics, which the aforementioned follow-up pass then
# expands into the hook calls. This would simplify the logic here a bit
Collaborator Author

It is almost certain that this is the approach (without the magics) that I'll go with when rewriting/refactoring the pass. In my opinion, the benefits (composes better, much simpler logic, easier to understand) far outweigh the downsides (one more pass -> likely slower).

Collaborator

Agreed. I suspect that coalescing passes (fusing a bunch of these operations) will be easier if each pass's logic is very simple/straightforward. Not to mention, it's likely easier to get some quick wins with a bit of pass-composition-based optimization than to maintain a bunch of very carefully hand-tuned implementations.

Comment on lines +1439 to +1444
# FIXME: discriminant assignment lowering also needs to be disabled for
# when generating code running at compile-time (e.g. inside a
# macro)
Collaborator Author

This is going to be a problem once the injectdestructors pass is also used on code meant for running in the VM, but since that's not the case yet, no action is required for now.

Collaborator

Agreed that it can be deferred. Moreover, it's likely a good thing; I have a funny feeling about handling the same thing at different levels.

Comment on lines +344 to +346
# TODO: with this approach, either the last branch needs to fork
# control-flow to the next exception handler or ``mirgen`` has to
# introduce a catch-all handler
Collaborator Author

This is the second most severe issue (and is somewhat pre-existing). In the following example:

try:
  try:
    discard # 1
    raise SomeError.newException("")
  except OtherError:
    discard
except SomeError:
  discard # 2

there will be no path between 1 and 2 in the control-flow graph. The current plan for fixing this is to, as part of transf, add what amounts to an except: raise to every nkTryStmt that has at least one exception handler but no "catch-all" handler. Not only would that fix the issue in question here, it would also allow removing some logic in the C and JS code-generators.

Collaborator

Handling it in a common place makes sense.

Although, I'm not sure I could quite put together the proposed solution; would it effectively be this?

try:
  try:
    try:
      discard # 1
      raise SomeError.newException("")
    except OtherError:
      discard
  except SomeError:
    discard # 2
except:
  raise

Collaborator Author

Almost. The proposed solution would look like this:

try:
  try:
    discard # 1
    raise SomeError.newException("")
  except OtherError:
    discard
  except:
    raise
except SomeError:
  discard # 2
except:
  raise

This is roughly what the C and JS code-generators already do, and the control-flow graph generation logic would generate the correct edges without requiring any changes to it.
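As a sanity check, the lowered form can be run directly; the following sketch (type names taken from the example above, defined here as plain CatchableErrors) confirms that control still flows from 1 to 2:

```nim
type
  SomeError = object of CatchableError
  OtherError = object of CatchableError

var reached = false
try:
  try:
    discard # 1
    raise SomeError.newException("")
  except OtherError:
    discard
  except:
    # the inserted catch-all re-raises, giving the outer handlers
    # a chance to catch the exception
    raise
except SomeError:
  reached = true # 2
except:
  raise

doAssert reached
```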

compiler/sem/mirexec.nim Outdated Show resolved Hide resolved
var saved
var :tmp_1 = sibling[].right
`=copy`(saved, :tmp_1)
var :tmp_2 = sibling[].right
Collaborator Author

Although it looks like it, no copy is created here. The var :tmp_2 = ... is actually a var :tmp_2: var T, but there's an Empty node in the IdentDefs type slot, so nothing is rendered.

Collaborator

Hmm, wish there was a way to comment it without mucking up the input. 🤔

while readLine(f, res):
block label_1:
  while true:
    if op(readLine(f, res)):
Collaborator Author

Since astgen has no way of knowing the name of the original symbol (apart from maybe inspecting the source node), it turns mnkMagic MIR nodes into symbols with the name op.

Collaborator
@saem saem Feb 16, 2023

This is going to require far more fundamental fixes, outside the scope of this PR, of course.

Comment on lines +60 to +66
# a literal constructor expression (which ``initValue`` also expands to)
# not used *directly* in a consume context only produces a non-owning
# container
# TODO: this doesn't match the behaviour the specification describes (i.e.
# constructor expression having the same semantics as procedure calls
# in the context of lifetime hook injection). The behaviour here
# should either be explicitly added to the spec or changed
Collaborator Author

This is pre-existing behaviour that's not described by the specification.

Collaborator

It's useful to know if it changed. Noting that this behaviour is not per the spec, and that one or the other should change, is the right thing to do. I want to have more of a think about what it means; then we can discuss what makes sense.

Documenting it and leaving it as-is is good!

Comment on lines +7 to +12
knownIssue: '''
Semantic analysis duplicates the analysed argument into each usage
inside the template. Since it contains definitions of local variables,
multiple definitions of the same entity exist in the expanded code, which
is illegal at the MIR level, thus making the `injectdestructors` pass fail
'''
Collaborator Author

I decided to disable this test for now. It's the most severe issue, but there are so many hacks in mirgen already that it felt wrong to introduce yet another one. For now, an entity being defined multiple times will raise an internal compiler error when the code reaches the injectdestructors pass.

Collaborator

This sounds like it should be fixed in semantic analysis, rather than hack up injectdestructors.

My next two days are really busy, but I could peek at sem and the eval template code on the weekend; not sure what the fix will entail, though.

Collaborator Author

Yep, semantic analysis is the correct place to fix this.

The issue here was what originally motivated me to look into the macro output sanitizer. Turning the typed AST passed to a typed template parameter into untyped AST before inserting it into the body could fix the problem, but work on the sanitizer is currently stuck due to issues with gensym'ed symbols.

compiler/sem/mirexec.nim Outdated Show resolved Hide resolved
## Summary
The previous `injectdestructors` pass operated on `PNode` AST, and
while it should have been able to work with the AST as generated by
`astgen`, it didn't, which required the canonicalization step to be
disabled when the `injectdestructors` pass was needed.

Since the goal is to have all code-generators operate on either the MIR
or another IR derived from the former, it is necessary that the
MIR -> AST step happens immediately before code-generation, with no
further transformation in-between.

The pass is rewritten from the ground up, fixing many issues / bugs the
previous implementation had and also making the pass significantly
more time and memory efficient than it was previously. In addition,
the pass generates more efficient code due to the smarter omission of
`wasMoved` and `=sink` calls.

An additional change is that calling a procedure that only raises
`Defect`s is now also considered as having control-flow related side-
effects. They previously weren't, which meant that locations sometimes
weren't cleaned up (via invoking the destructor) when raising a
`Defect`.

## Fixes
Assignments to locations with destructors where the right-hand side is a
complex expression now respect the strict left-to-right evaluation
order:
```nim
var
  i = 0
  s = "str"

# works now:
(i = 1; s) =
  if i == 1: "str2"
  else: doAssert false; ""

# also works now:
proc p(s: var string): var string =
  i = 2
  s

p(s) = block:
  doAssert i == 2
  "str2"

```

A mutation of the argument is now considered to happen when
control-flow reaches the call, not when the argument is passed to a
`var` parameter. The difference is subtle but important:
```nim
proc f_sink(x: var seq[int], y: sink seq[int]) =
  y[0] = 2
  doAssert x == [1, 2] # passes

proc main() =
  var s = @[1, 2]
  f_sink(s, s) # a full copy is now introduced for the second argument

main()
```

Globals are now only moved if it can be proven that doing so does not
affect program semantics. Previously, assigning a location derived from
a global would perform a destructive move in certain circumstances,
such as:

```nim
# at module scope
var a = "str"
var b = a # moved

proc p() =
  discard

echo a # echoes an empty string
```

Due to the way the new implementation works, it is now possible for
locations derived from globals to be consumed (if safe) when passing
them to `sink` parameters.

As a side-effect of operating on the MIR, the alias analysis now
handles tuple access correctly:
```nim
proc prc(x: var string) =
  var tup = (x: 0, y: "str")
  # `tup[1]` and `tup.y` were treated as two disjoint locations,
  # causing the assignment below to be erroneously optimized into a
  # move
  x = tup[1]
  doAssert tup.y == "str"

var o: string
prc(o)
```

## Details
In contrast to the previous implementation, the new one splits the pass
up into multiple sub-passes. This makes the code easier to reason
about, more modular, and reduces the amount of state each sub-pass has
access to.

The core part, the lifetime-tracking hook injection, is split into a
*setup*, *analysis*, and *application* step (executed in that order).

During the *setup* step, a control-flow graph of the code is computed
and a database about the entities (i.e. locations) relevant to the pass
built. Using both of them, the *analysis* step (which is itself split up
into multiple sub-steps) computes for all values appearing in a
*consume* context (e.g. argument to a `sink` parameter, right-hand-side
of an assignment) whether they're owned and what locations still store
a value at the end of their scope (and thus needing destruction).

Finally, the *application* step uses all previously gathered information
to rewrite assignments into copy or sink hook calls, inject copies for
unowned values passed to `sink` parameters, and to, where necessary,
inject calls to the destructors.
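
In pseudocode (all routine names are hypothetical and do not correspond to actual compiler symbols), the three steps compose roughly as:

```nim
# Pseudocode sketch only -- none of these names exist in the compiler;
# they merely illustrate the data flow between the sub-steps.
proc injectHooks(tree: MirTree) =
  # setup: control-flow graph plus a database of the relevant entities
  let cfg      = computeCfg(tree)
  let entities = collectEntities(tree)
  # analysis: for each value appearing in a consume context, is it
  # owned? which locations still store a value at scope exit?
  let owned = analyseOwnership(cfg, entities)
  let alive = analyseAliveAtScopeExit(cfg, entities)
  # application: rewrite assignments into copy/sink hook calls, inject
  # copies for unowned sink arguments, and inject destructor calls
  rewriteAssignments(tree, owned)
  injectDestructorCalls(tree, alive)
```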

For both simplicity and efficiency, the assignment rewriting and copy
injection both perform additional data-flow analysis in order to
omit unnecessary calls to `wasMoved` and to turn sink assignments
into shallow assignments (i.e. `FastAsgn`).

### Data-flow analysis
Compared to the data-flow analysis implementation (`dfa.nim`) used
previously, definitions (`def`) and uses (`use`) are not encoded in the
control-flow graph but rather stored separately. The new forward
analysis also doesn't have quadratic time and memory complexity in the
case of loops.

How data-flow analysis is employed is also different: instead of
computing the last-reads and first-writes of *all* locations during a
single traversal of the full control-flow graph, they're now computed
*on-demand*. For example, to figure out whether the value coming from an
lvalue expression is owned (in a consume context), forward data-flow
analysis is performed to check whether a further read is reachable.
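
As a Nim-level illustration of that ownership question (a hypothetical user-code example, not compiler code): whether the argument below may be moved depends on whether a later read is reachable from the consume point.

```nim
proc consume(x: sink string): string =
  # takes ownership of `x`
  result = x

proc demo(): string =
  var s = "value"
  # no read of `s` is reachable after this point, so the analysis can
  # prove the value is owned here: `s` may be moved instead of copied
  result = consume(s)

doAssert demo() == "value"
```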

### Additional changes
- fix `mirchangesets` not compiling
- fix multiple of the sets defined in `mirtrees` having the wrong
  elements
- fix the `indexLike` tag routine for `NodeInstance` not being exported
  from `mirtrees`
- fix `parentEnd` yielding the wrong node in some cases
- add a procedure for querying the enum of a type's equality operator
  magic without the symbol
- add the `OrdinalSeq` type to the `utils/containers.nim` module
- add the `Cursor` type; meant as a temporary placeholder for immutable
  views
- add multiple new destructor-related test cases
- remove the `nfLastRead` and `nfFirstWrite` node flags
- remove the `sfSingleUsedTemp` symbol flag
- remove the `rsemUncollectableRefCycle` warning. It had questionable
  utility and removing it was simpler than to make it work again.
  Should there exist the desire to bring it back, the problematic
  assignment should be detected in `sempass2`
- remove the `optimizer` module
- flag the `arc/tstrformat.nim` test as a known issue

### Timings
Timings and peakmem are taken from compiling the compiler itself with
the options `-d:release --compileOnly` and no cached compilation
artifacts. The original compiler is built with
`-d:release --exceptions:goto`.

|compiler from|bootstrapped with|using|time taken|peakmem|
|:------------|:----------------|:----|:---------|:------|
|#d7a70238d0|`--gc:refc`|`--gc:refc`|24.426s|1015.844MiB|
|this commit|`--gc:refc`|`--gc:refc`|24.687s|1014.254MiB|
|#d7a70238d0|`--gc:refc`|`--gc:orc`|45.206s|1.99GiB|
|this commit|`--gc:refc`|`--gc:orc`|29.531s|1.222GiB|
|#d7a70238d0|`--gc:orc`|`--gc:refc`|15.492s|978.938MiB|
|this commit|`--gc:orc`|`--gc:refc`|15.714s|979.023MiB|
|#d7a70238d0|`--gc:orc`|`--gc:orc`|22.218s|1.933GiB|
|this commit|`--gc:orc`|`--gc:orc`|18.266s|1.188GiB|
@zerbina zerbina changed the title compiler: rewrite injectdestructors as MIR passes compiler: make injectdestructors a MIR pass Feb 17, 2023
@zerbina
Collaborator Author

zerbina commented Feb 17, 2023

bors r+

@bors
Contributor

bors bot commented Feb 17, 2023

Build succeeded:

@bors bors bot merged commit ba81827 into nim-works:devel Feb 17, 2023
@zerbina zerbina deleted the injectdestructors-rewrite branch February 20, 2023 20:59
Labels
compiler/sem Related to semantic-analysis system of the compiler compiler General compiler tag refactor Implementation refactor