escape analysis: enables memory safe views (cf 1st class openarray) and optimizations #14976
proc p(x: var ptr int) =
  var y = 4
  x = addr(y)
yes, there are several examples dealing with variables across nested function scopes in tviewfroms.nim (eg, search for your example however is bogus for 3 reasons:
here's a correct example:
var x: ptr int
block:
  var y = [10, 11, 12]
  x = addr y[0] # no warning, block in same stack frame as x
proc fn()=
  var y1 = @[10,11,12]
  var y2 = [10,11,12]
  x = addr y1[0] # no warning, that address is in heap
  x = addr y2[0] # Warning: local 'fn.y2' escapes via 'x' in 'x = addr y2[0]'
  # but the address of the seq itself is on the stack:
  var g {.global.}: ptr seq[int]
  g = y1.addr # Warning: local 'fn.y1' escapes via 'g' in 'g = addr(y1)'
But there's nothing to prevent the Nim compiler from inserting destructors at the end of a block for variables present only in the block's scope, right? I guess what I'm arguing for is that this analysis should also look at symbol scope, since it's unlikely for someone to intentionally want to access a variable that's been destroyed/is no longer accessible by symbolic means.
a destructor can't deallocate stack-allocated memory, so stack-allocated memory can't be invalidated even if a destructor is called. If you disagree, please provide an example.
I know; note that your example is correctly handled (gives however what currently is not supported is the following:
when true:
  proc fun(a: var ptr int, x: ptr int) = a = x
  proc main: ptr int =
    var l0=0
    fun(result, l0.addr)
and it's already catalogued in the test suite (see D20200711T130559) under what I'm not yet tracking is function calls like
The pointer of a ref is stack-allocated. And if the ref is destroyed, the pointer will point into oblivion.
as I said above, please provide a complete example for what you're describing. destructors are for tyObject, not tyRef.
tyRef also uses destructors in gc:arc, and destructors can be used to implement custom ref types. Those would be tyObject.
using an object after it was destroyed but that lives in the same stack frame is a different kind of problem than using memory that was allocated in a different stack frame. Handling the 1st case is useful but can be done in future work (and is "maybe" a very simple modification to
when true:
  type Foo = object
    x: seq[int]
  proc `=destroy`(a: var Foo) =
    echo ("dtor", a.x)
    a.x.reset
  proc main() =
    var f: ptr Foo
    block:
      var f1 = Foo(x: @[10])
      f = f1.addr
      echo f[]
    echo f[]
  main()
prints:
=> not the same kind of corruption as that arising from dereferencing a pointer that was allocated in a different stack frame, eg you can't access invalid memory like that, the worst you get is an allocated object but in a destroyed state.
Your example relies on the fact that seqs can't be nil. Replace the seq with a ref or a ptr. It will crash.
it doesn't change the example fundamentally, you'd get SIGSEGV if trying to access
type Foo = object
  x: ref int
again, it's a different kind of memory problem: SIGSEGV is not the same as dereferencing garbage on the stack. block scope can be handled in future work IMO.
It's not memory safe. Keep in mind you don't know what the destructor will do with the pointer/ref; destructors do not have to reset their target anymore.
I know, but like I said, it's a different kind of memory issue: one is stack corruption, the other is accessing memory that was invalidated for some other reason (eg use after free). In any case, I've now pushed a commit that fixes this, see
If you want reviews, un-draft your PRs. :-)
done, some files are needed for debugging and will be removed for final version so ignore those:
also, it would really help if #15016 could be merged and would cut down on the diff (needed for and instead focus on those:
the most important ones are these:
# this would hold currently and seems correct but is inconsistent
# until bug #14986 is resolved
# nim bug: there's no way to distinguish between
There is, the .global is {sfGlobal, sfPure} iirc.
Well it's a good start. Notes on the implementation
Notes on the design
var s: seq[int]
var x = keepVarness(s[i])
x = 4 # ok, mutate s[i]
setLen s, 0
x = 4 # invalid write!
RFC #178 is fundamentally about detecting this case, whether
Please have a look at #15030 which adds a mechanism of comparable complexity to the Nim compiler, without touching ast.nim, without introducing more recursive dependencies, in 250 lines of code. Of course #15030 is far from perfection too, but I hope it inspires you. (It's also covered by an RFC. ;-) )
We got --experimental:views instead.
var s: seq[int]
var x = keepVarness(s[i])
x = 4 # ok, mutate s[i]
setLen s, 0
x = 4 # invalid write!
@Araq I don't understand this example,
{.experimental: "views".}
type
  Foo = object
    field: string
proc valid(s: var seq[Foo]) =
  let v: lent Foo = s[0] # begin of borrow
  echo v.field # end of borrow
  s.setLen 0 # valid because 'v' isn't used afterwards
proc dangerous(s: var seq[Foo]) =
  let meh: lent Foo = s[0]
  s.setLen 0
  echo meh.field
@Araq {.views.} is easily fooled, eg this will compile and bypass the mutation check:
proc dangerous(s: var seq[Foo]) =
  let meh: lent Foo = s[0]
  proc bar() = echo meh.field
  s.setLen 0
  bar()
but let's discuss views separately and focus on this PR. This PR has the mechanism and modeling in place, via symbol-based constraint propagation, to accommodate this kind of analysis:
type
  ViewConstraint* = object
    # we could model other constraints, eg whether a parameter is being written to
    lhs*: PSym
    rhs*: PSym
    addrLevel*: int # see also `ViewDep`
eg in the following example it gives (there are more examples in PR):
block:
  type Foo = ref object
    f0: Foo
  proc fn24(a: Foo): auto =
    result = (a.f0.f0, a.f0)
  doAssert viewConstraints(fn24) == "fn24.result => fn24.a:-1; "
this can be extended to model the fact that a routine (eg
type
  ViewConstraint* = object
    lhs*: PSym # lhs = nil could be used to encode an invalidation constraint
    rhs*: PSym
    addrLevel*: int # see also `ViewDep`
example
proc fn(a, b: var seq[int]) =
  a.setLen 0
  b.setLen 0
proc dangerous(s, s2: var seq[Foo]) =
  let meh: lent Foo = s[0] # by constraint propagation, we know `meh => s`
  echo meh.field # ok
  fn(s, s2) # by constraint propagation we know this invalidates s, s2
  echo meh.field # this would issue a warning because meh => s was invalidated
the analysis would then proceed in a similar way as done in this PR, adding a check that a view doesn't get invalidated via (possibly indirect) function calls.
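To make the proposed invalidation check concrete, here is a small standalone sketch. It is a toy model, not code from this PR: symbols are plain strings instead of `PSym`, an empty `lhs` stands in for the suggested `lhs = nil`, and `Constraint`, `invalidatedBy` and `staleViews` are hypothetical names chosen for illustration.

```nim
import std/sets

type
  Constraint = object
    lhs: string   # "" stands in for the proposed `lhs = nil` (invalidation)
    rhs: string

proc invalidatedBy(summary: seq[Constraint],
                   formals, actuals: seq[string]): HashSet[string] =
  ## Map a callee's invalidation constraints on its formal parameters
  ## to the actual arguments at one call site.
  for c in summary:
    if c.lhs == "":
      let i = formals.find(c.rhs)
      if i >= 0: result.incl actuals[i]

proc staleViews(views: seq[Constraint],
                invalidated: HashSet[string]): seq[string] =
  ## A view `lhs => rhs` becomes stale once `rhs` has been invalidated.
  for v in views:
    if v.rhs in invalidated: result.add v.lhs

# Summary of `fn(a, b)`: it calls setLen on both parameters.
let fnSummary = @[Constraint(lhs: "", rhs: "a"), Constraint(lhs: "", rhs: "b")]
# Inside `dangerous`: meh => s
let views = @[Constraint(lhs: "meh", rhs: "s")]
# The call `fn(s, s2)` maps a -> s and b -> s2.
let invalid = invalidatedBy(fnSummary, @["a", "b"], @["s", "s2"])
let stale = staleViews(views, invalid)
echo stale
```

A real implementation would also need to follow indirect calls and order the check relative to later uses of the view; this sketch only shows the mapping from callee summary to caller symbols.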
That's just a bug, it doesn't imply to switch strategies altogether over to your way which has no spec and not even a draft of a spec (!).
I will add a spec.
proc dangerous(s: var seq[Foo]) =
  let meh: lent Foo = s[0]
  proc bar() = echo meh.field
  s.setLen 0
  bar()
Not true but it's done in lambda lifting and so unfortunately only triggers when you try to use the code:
proc dangerous(s: var seq[Foo]) =
  let meh: lent Foo = s[0]
  proc bar() = echo meh.field
  s.setLen 0
  bar()
var s = @[Foo(field: "")]
dangerous(s)
This pull request has been automatically marked as stale because it has not had recent activity. If you think it is still a valid PR, please rebase it on the latest devel; otherwise it will be closed. Thank you for your contributions.
ptr T
a lot safer
example (see a lot more in tviewfroms.nim)
for example, the compiler figures out that fn.result depends on a2 but not a1, and uses that to infer that result = fn(b1.addr, b2.addr) is illegal but result = fn(b2.addr, b1.addr) is legal:
notes
when it finds a dependency (lhs => rhs) such that lhs outlives rhs, it issues a warning (which can be turned into an error)
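As a rough illustration of that rule, the following toy model flags every dependency whose left-hand side outlives its right-hand side. It is an assumption-laden sketch and not the compiler code from this PR: `Sym`, `Dep`, `frameDepth`, `outlives` and `escapeWarnings` are made-up names, and lifetime is approximated by stack-frame depth only.

```nim
type
  Sym = object
    name: string
    frameDepth: int   # assumption: 0 = global frame, larger = deeper call
  Dep = object
    lhs, rhs: Sym     # reads as "lhs => rhs": lhs may point into rhs

proc outlives(a, b: Sym): bool =
  ## `a` outlives `b` when `a` lives in a shallower (older) stack frame.
  a.frameDepth < b.frameDepth

proc escapeWarnings(deps: seq[Dep]): seq[string] =
  ## Collect one message per dependency whose lhs outlives its rhs.
  for d in deps:
    if outlives(d.lhs, d.rhs):
      result.add "local '" & d.rhs.name & "' escapes via '" & d.lhs.name & "'"

let
  x = Sym(name: "x", frameDepth: 0)        # global pointer
  y = Sym(name: "y", frameDepth: 0)        # block-local, same frame as x
  y2 = Sym(name: "fn.y2", frameDepth: 1)   # local of a called proc

let warnings = escapeWarnings(@[Dep(lhs: x, rhs: y), Dep(lhs: x, rhs: y2)])
echo warnings
```

Only the `x => fn.y2` dependency is reported here, mirroring the block-versus-proc distinction in the earlier `addr y[0]` / `addr y2[0]` example.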
future work
track memory originating from static data segment (read only memory, eg used for cstrings, as in var a = "foo".cstring) to make sure at CT that such memory can't be written to