idetools serve command seems to leak memory #450

gradha · 2013-05-26T16:38:53Z

When I run the following command inside nimrod's lib directory:

$ yes "idetools --def --track:system.nim,796,7" | \
  nimrod serve --server.type:stdin system.nim > /dev/null

and watch the process with the system monitor, the nimrod process starts with 30.2MB of total virtual memory usage. After about 4 minutes it has grown to 33.2MB and keeps increasing steadily.

Similarly, when I run https://github.com/gradha/the_hyperlink_vs_nimrod on itself in server mode, in the beginning it starts with 60MB virtual memory size, and about 14000 queries and 30 minutes later the process has grown to 360MB and keeps expanding.

The text was updated successfully, but these errors were encountered:

Araq · 2013-05-26T17:32:42Z

We know. Things are much better if you bootstrap with -gc:markAndSweep for idetools.

Araq · 2015-03-12T12:08:34Z

nimsuggest uses markAndSweep.

450: compiler: make `injectdestructors` a MIR pass r=zerbina a=zerbina First, a short summary of how `injectdestructors` used to operate: 1. an instruction-based control-flow-graph that also contains instruction related to data-flow analysis (i.e. `use` and `def`) is generated for the code fragment (in this case, either the body of a procedure or a fragment of a module's top-level code) the transformations are to be applied to 2. the control- / data-flow-graph is traversed (executed) in order to figure out which uses of locations are "last reads" and which writes to them are "first writes". The `PNode`s corresponding to the `use` and `def` instructions for which a "last read" or "first write" is detected are marked with the respective node flags `nfLastRead` and `nfFirstWrite`. 3. the AST is processed -- this is where the core logic happens. `=destroy` calls and temporaries are injected; assignments are rewritten into either `=copy` calls, `=sink` calls, or shallow copies (i.e. `nkFastAsgn`); copies are introduced for arguments to `sink` parameters where necessary; `wasMoved` calls are inserted when an lvalue is consumed; scopes are wrapped in `try`/`finally` statements/expression if necessary and the `nkIdentDefs` of relevant locals are moved to the start of their scope. 4. the optimizer (`optimizer.nim`) that eliminates `wasMoved(x); =destroy(x)` pairs that cancel each other out is run With the new implementation, the transformations are split into multiple passes. The control-flow graph still uses an instruction-based representation, but no longer contains the data-flow-related instructions. There are two new instructions, `join` and `loop`, that together enable loop bodies (independent of their nesting) to be evaluated for their effects only once, and also for the graph to be efficiently traversed backwards. The `mirexec` module implements both the CFG construction and the traversal routines. Instead of analyzing all uses of all locations for "last reads" in one single CFG traversal, each usage in a consume context (e.g. something passed to a ``sink`` parameter) is now analyzed separately. This greatly simplifies the analysis and reduces the amount of state that needs to be tracked, resulting in a large speedup. The aforementioned analysis is modeled as "solving" ownership. Each value that results from a computation gets an ownership status assigned -- a value is considered to be "owned" if there are no operations coming *after* the operation it resulted from that observe its presence in the location that stores it. Same as with the previous implementation, location access through aliases (i.e. a pointer) is not considered here. That is, accessing a location through a pointer is not considered to be a read or write of the underlying location. If a value is an r-value (e.g. it's the result of a procedure call and is not a view) it is always considered to be owned. For destructor injection, instead of tentatively injecting a destructor for each location that has a destructor and letting the `optimizer` elide it later, a destructor is now only inserted for locations for which it is not certain whether they store (and own) a value once control-flow exits their scope. This analysis also takes implicit moves into account. Previously, a scope was wrapped in a `try` and the destructors called inside the corresponding `finally` clause if one of the following was present inside the scope: a call that raises or an explicit raise (`Defect`s were ignored, independent of `--panics`), a `break`/`continue` statement, a `return` statement, another scope that contains one of the aforementioned. A `finally` clause is now only required if at least one location requiring destruction still stores a (owned) value when its scope is exited via *unstructured* control-flow. For example: ```nim block: var s = `@[1,` 2] if cond: f_sink(s) # `s` is consumed raise ValueError.newException("") else: discard ``` `s` requires a destructor call at the end of the block, but a `finally` is not needed, since `s` no longer contains a value once control-flow reaches the `raise`. To be able to detect consumes, the alive analysis requires value ownership to be figured out already. The actual assignment rewriting and destructor injection are performed as two separate but concurrently applied MIR passes. While it's possible to perform the copy injection for arguments that are not owned and are used in a consume context as another separate pass, it is applied as part of the assignment rewriting pass, in order to reduce the number of `MirTree` traversals. When an assignment doesn't own the source operand and the value represents a resource (i.e. something that has a `=destroy` hook), the assignment is rewritten into a call to the type-bound `=copy` hook. If the source value is owned, depending on whether or not the destination location is empty, the assignment is rewritten into either a bit-wise copy (`nkFastAsgn`) or a call to the type-bound `=sink` hook. If the source is an lvalue (i.e. there exists a name for the containing location), a destructive move has to be performed (via a call to `wasMoved`), but only if it can't be proven that no connected write observes the now no longer owned value being present in the source location. The analysis (see `needsReset` and `isLastWrite`) takes into account if a destructor is present for the source location. Since destructors are only injected for alive locations and `wasMoved` calls are only generated where necessary, the separate `optimizer` pass is no longer needed. ## Fixes Assignments to location with destructors where the right-hand side is a complex expression now respect the strict left-to-right evaluation order: ```nim var i = 0 s = "str" # works now: (i = 1; s) = if i == 1: "str2" else: doAssert false; "" # also works now: proc p(s: var string): var string = i = 2 s p(s) = block: doAssert i == 2 "str2" ``` A mutation of the argument is now considered to happen when control-flow reaches the call, not when the argument is passed to a `var` parameter. The difference is subtle but important: ```nim proc f_sink(x: var seq[int], y: sink seq[int]) = y[0] = 2 doAssert x == [1, 2] # passes proc main() = var s = `@[1,` 2] f_sink(s, s) # a full copy is now introduced for the second argument main() ``` Globals are now only moved if it can proven that doing so does not affect program semantics. Previously, assigning a location derived from a global would perform a destructive move in certain circumstances, such as: ```nim # at module scope var a = "str" var b = a # moved proc p() = discard echo a # echoes an empty string ``` Due to the way the new implementation works, it is now possible for locations derived from globals to be consumed (if safe) when passing them to `sink` parameters. As a side-effect of the MIR's canonicalization, the alias analysis now handles tuple access correctly: ```nim proc prc(x: var string) = var tup = (x: 0, y: "str") # `tup[1]` and `tup.y` were treated as two disjoint locations, causing the assignment # below to be erroneously optimized into a move x = tup[1] doAssert tup.y == "str" var o: string prc(o) ``` ## Details ### Timings Timings and peakmem are taken from compiling the compiler itself with the options `-d:release --compileOnly` and no cached compilation artifacts. The original compiler is build with `-d:release --exceptions:goto`. |compiler from|bootstrapped with|using|time taken|peakmem| |: Co-authored-by: zerbina <100542850+zerbina@users.noreply.github.com>

Araq added the Tools label Nov 13, 2014

Araq closed this as completed Mar 12, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

idetools serve command seems to leak memory #450

idetools serve command seems to leak memory #450

gradha commented May 26, 2013

Araq commented May 26, 2013

Araq commented Mar 12, 2015

idetools serve command seems to leak memory #450

idetools serve command seems to leak memory #450

Comments

gradha commented May 26, 2013

Araq commented May 26, 2013

Araq commented Mar 12, 2015