compiler: make `injectdestructors` a MIR pass #450

## Summary The previous `injectdestructors` pass operated on `PNode` AST, and while it should have been able to work with the AST as generated by `astgen`, it didn't, which required the canonicalization step to be disabled when the `injecdestructors` pass was needed. Since the goal is to have all code-generators operate on either the MIR or another IR derived from the former, it is necessary that the MIR -> AST step happens immediately before code-generation, with no further transformation in-between. The pass is rewritten from the ground up, fixing many issues / bugs the previous implementation had and also making the the pass significantly more time and memory efficient than it was previously. In addition, the pass generates more efficient code due to the smarter omission of `wasMoved` and `=sink` calls. An additional change is that calling a procedure that only raises `Defect`s is now also considered as having control-flow related side- effects. They previously weren't, which meant that locations sometimes weren't cleaned up (via invoking the destructor) when raising a `Defect`. ## Fixes Assignments to location with destructors where the right-hand side is a complex expression now respect the strict left-to-right evaluation order: ```nim var i = 0 s = "str" # works now: (i = 1; s) = if i == 1: "str2" else: doAssert false; "" # also works now: proc p(s: var string): var string = i = 2 s p(s) = block: doAssert i == 2 "str2" ``` A mutation of the argument is now considered to happen when control-flow reaches the call, not when the argument is passed to a `var` parameter. The difference is subtle but important: ```nim proc f_sink(x: var seq[int], y: sink seq[int]) = y[0] = 2 doAssert x == [1, 2] # passes proc main() = var s = @[1, 2] f_sink(s, s) # a full copy is now introduced for the second argument main() ``` Globals are now only moved if it can proven that doing so does not affect program semantics. Previously, assigning a location derived from a global would perform a destructive move in certain circumstances, such as: ```nim # at module scope var a = "str" var b = a # moved proc p() = discard echo a # echoes an empty string ``` Due to the way the new implementation works, it is now possible for locations derived from globals to be consumed (if safe) when passing them to `sink` parameters. As a side-effect of operating on the MIR, the alias analysis now handles tuple access correctly: ```nim proc prc(x: var string) = var tup = (x: 0, y: "str") # `tup[1]` and `tup.y` were treated as two disjoint locations, # causing the assignment below to be erroneously optimized into a # move x = tup[1] doAssert tup.y == "str" var o: string prc(o) ``` ## Details In contrast to the previous implementation, the new one splits the pass up into multiple sub-passes. This makes the code easier to reason about, more modular, and reduces the amount of state each sub-pass has access to. The core part, the lifetime-tracking hook injection, is split into a *setup*, *analysis*, and *application* step (executed in that order). During the *setup* step, a control-flow graph of the code is computed and a database about the entities (i.e. locations) relevant to the pass built. Using both of them, the *analysis* step (which is itself split up into multiple sub-steps) computes for all values appearing in a *consume* context (e.g. argument to a `sink` parameter, right-hand-side of an assignment) whether they're owned and what locations still store a value at the end of their scope (and thus needing destruction). Finally, the *application* step uses all previously gather information to rewrite assignments into copy or sink hook calls, inject copies for unowned values passed to `sink` parameters, and to, where necessary, inject calls to the destructors. For both simplicity and efficiency, the assignment rewriting and copy injection both perform additional data-flow analysis in order to omit unnecessary calls to `wasMoved` and to turn sink assignments into shallow assignments (i.e. `FastAsgn`). ### Data-flow analysis Compared to the data-flow analysis implementation (`dfa.nim`) used previously, definitions (`def`) and uses (`use`) are not encoded in the control-flow graph but rather stored separately. The new forward analysis also doesn't have quadratic time and memory complexity in the case of loops. How data-flow analysis is employed is also different: instead of computing the last-reads and first-writes of *all* locations during a single traversal of the full control-flow graph, they're now computed *on-demand*. For example, to figure whether the value coming from an lvalue expression is owned (in a consume context) forward data-flow analysis is perfomed to check whether a further read is reachable. ### Additional changes - fix `mirchangesets` not compiling - fix multiple of the sets defined in `mirtrees` having the wrong elements - fix the `indexLike` tag routine for `NodeInstance` not being exported from `mirtrees` - fix `parentEnd` yielding the wrong node in some cases - add a procedure for querying the enum of a types equality operator magic without the symbol - add the `OrdinalSeq` type to the `utils/containers.nim` module - add the `Cursor` type; meant as a temporary placeholder for immutable views - add multiple new destructor-related test cases - remove the `nfLastRead` and `nfFirstWrite` node flags - remove the `sfSingleUsedTemp` symbol flag - remove the `rsemUncollectableRefCycle` warning. It had questionable utility and removing it was simpler than to make it work again. Should there exist the desire to bring it back, the problematic assignment should be detected in `sempass2` - remove the `optimizer` module - flag the `arc/tstrformat.nim` test as a known issue ### Timings Timings and peakmem are taken from compiling the compiler itself with the options `-d:release --compileOnly` and no cached compilation artifacts. The original compiler is build with `-d:release --exceptions:goto`. |compiler from|bootstrapped with|using|time taken|peakmem| |:------------|:----------------|:----|:---------|:------| |#d7a70238d0|`--gc:refc`|`--gc:refc`|24.426s|1015.844MiB| |this commit|`--gc:refc`|`--gc:refc`|24.687s|1014.254MiB| |#d7a70238d0 |`--gc:refc`|`--gc:orc`|45.206s|1.99GiB| |this commit |`--gc:refc`|`--gc:orc`|29.531s|1.222GiB| |#d7a70238d0 |`--gc:orc`|`--gc:refc`|15.492s|978.938MiB| |this commit|`--gc:orc`|`--gc:refc`|15.714s|979.023MiB| |#d7a70238d0 |`--gc:orc`|`--gc:orc`|22.218s|1.933GiB| |this commit|`--gc:orc`|`--gc:orc`|18.266s|1.188GiB|

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

compiler: make `injectdestructors` a MIR pass #450

compiler: make `injectdestructors` a MIR pass #450

Commits on Feb 17, 2023

compiler: make injectdestructors a MIR pass #450

compiler: make injectdestructors a MIR pass #450

Commits on Feb 17, 2023

compiler: make `injectdestructors` a MIR pass #450

compiler: make `injectdestructors` a MIR pass #450