Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: add GOEXPERIMENT=checkptr #22218

Open
amscanne opened this issue Oct 11, 2017 · 23 comments

Comments

@amscanne
Copy link

commented Oct 11, 2017

This is a proposal for this change, and see if there's any interest in having it move forward.

Background

During stack growth, pointers that have values between 0 and minLegalPointer will cause a "invalid pointer found on stack" exception to be thrown by the runtime. This suggests that storing these pointers on the stack is not a valid operation, i.e. the following code may cause a runtime exception if the stack happens to grow.

p := unsafe.Pointer(uintptr(1))

Since this is not a permitted state and will cause a runtime exception, it is preferable for the user to receive a recoverable error at the time the state is introduced. Even if this results in the program crashing, the full stack trace will be available making the cause of the problem much easier to diagnose.

Proposal

I propose to add code that causes a recoverable panic at the time the conversion is made. This conversion adds a small cost to each unsafe.Pointer cast. I believe that these conversions in user code should be relatively rare (usually associated with some other costly event, such as allocating a new object), but there may be cases where it's common (e.g. some CGo usage?). The original change did not apply these checks to code in the runtime package itself, due to its presumed safety and the fact that the conversions were common for the implementation of core types (e.g. map, channels, etc.).

Alternate Proposal

As noted in the original change, this behavior may be surprising. Users may be encoding some CGo void* equivalents as pointers in Go, which could contain "illegal" values. The constraint imposed by the stack growth code is non-obvious and undocumented.

Currently, debug.invalidptr is used in both heapBitsForObject and stack adjustment and defaults to on. I propose that a new debug variable is introduced, debug.stackptr, that toggles the behavior of the check during stack growth. I would advocate that this check should default to off, but regardless it would allow just that check to be disabled if it proves problematic.

@gopherbot gopherbot added this to the Proposal milestone Oct 11, 2017

@gopherbot gopherbot added the Proposal label Oct 11, 2017

@cznic

This comment has been minimized.

Copy link
Contributor

commented Oct 11, 2017

I believe that these conversions in user code should be relatively rare (usually associated with some other costly event, such as allocating a new object), but there may be cases where it's common (e.g. some CGo usage?).

Virtual machines are perhaps rare, but they may use code making such conversions every few nanoseconds and in that case the performance would be hurt substantially.

@bcmills

This comment has been minimized.

Copy link
Member

commented Oct 12, 2017

the following code may cause a runtime exception if the stack happens to grow.

The example you give is one that could be caught at compile-time (e.g. by vet). Do you have more realistic examples to demonstrate the problem?

@bcmills

This comment has been minimized.

Copy link
Member

commented Oct 12, 2017

Users may be encoding some CGo void* equivalents as pointers in Go, which could contain "illegal" values.

Note that in C, converting an arbitrary integer to void* produces “implementation-defined” behavior. Portable C code cannot convert arbitrary integers to void*, just as portable Go code cannot convert arbitrary integers to unsafe.Pointer.

@cznic

This comment has been minimized.

Copy link
Contributor

commented Oct 12, 2017

Portable C code cannot convert arbitrary integers to void*, ...

True, yet unfortunately some well known and widely used C code bases do that (and not only to void*). For example SQLite, including magic-valued function pointers, IIRC.

@bcmills

This comment has been minimized.

Copy link
Member

commented Oct 12, 2017

See also #19928, #21897, and #21730.

@amscanne

This comment has been minimized.

Copy link
Author

commented Oct 13, 2017

Virtual machines are perhaps rare, but they may use code making such conversions every few nanoseconds and in that case the performance would be hurt substantially.

I assume you're talking about language-level virtual machines here, where the implementation is in Go. Why would they require constant conversions between uintptr and unsafe.Pointer? I don't follow. It seems like their internal implementations would aim to be well-typed, just like everything else. Do you have an example?

The example you give is one that could be caught at compile-time (e.g. by vet). Do you have more realistic examples to demonstrate the problem?

Sure. Suppose you have something like:

   // C.create_foo => unsafe.Pointer(), maybe encoding errors as well
   foo := create_foo()
   // ... runtime can die starting here.
   if is_foo_okay(foo) {
      ...
   }

You might argue that the create_foo() interface is bad. Agreed, but it's still common. It's even used within the runtime itself for e.g. mmap return values within mem_linux.go:

func sysAlloc(n uintptr, sysStat *uint64) unsafe.Pointer {
  	p := mmap(nil, n, _PROT_READ|_PROT_WRITE, _MAP_ANON|_MAP_PRIVATE, -1, 0)
  	if uintptr(p) < 4096 {
  		if uintptr(p) == _EACCES {
  			print("runtime: mmap: access denied\n")
  			exit(2)
  		}
  		if uintptr(p) == _EAGAIN {
  			print("runtime: mmap: too much locked memory (check 'ulimit -l').\n")
  			exit(2)
  		}
  		return nil
  	}
  	mSysStatInc(sysStat, n)
  	return p
  }

The above function uses "invalid" behavior per #21897. If it were not marked with go:nosplit, and happened to grow while p was on the stack, then it would panic the runtime.

I also encountered this issue in a specific context as well that I will follow up with you about separately.

Note that in C, converting an arbitrary integer to void* produces “implementation-defined” behavior. Portable C code cannot convert arbitrary integers to void*, just as portable Go code cannot convert arbitrary integers to unsafe.Pointer.

I'm with you here, but it still happens. The go runtime itself is guilty of this.

@bcmills

This comment has been minimized.

Copy link
Member

commented Oct 13, 2017

You might argue that the create_foo() interface is bad. Agreed, but it's still common.

Isn't that exactly the use-case that the Go uintptr type (and the corresponding C99 uintptr_t type) is meant to address? (Why return void* / unsafe.Pointer when the caller can always do that conversion explicitly once they have verified that the value is, in fact, a valid pointer?)

It's even used within the runtime itself

The runtime can rely upon whatever implementation details it likes. (Runtime code is not user code!)

That said, arguably we should fix runtime.mmap to return a uintptr so that it sets a better example.

@cznic

This comment has been minimized.

Copy link
Contributor

commented Oct 14, 2017

@amscanne

I assume you're talking about language-level virtual machines here, where the implementation is in Go.

I'm talking about a particular VM.

Why would they require constant conversions between uintptr and unsafe.Pointer?

VM memory consists of the code memory, stack, text and data segments and heap memroy. Except for the code memory, all other VM memory is allocated outside of the Go runtime using mmaps. VM pointers are uintptrs. Every access to any non code memory converts a particular uintptr to an unsafe.Pointer. Essentially every VM operation/instruction has two or more mmaped memory accesses, for example AddI32.

@amscanne

This comment has been minimized.

Copy link
Author

commented Oct 16, 2017

Isn't that exactly the use-case that the Go uintptr type (and the corresponding C99 uintptr_t type) is meant to address? (Why return void* / unsafe.Pointer when the caller can always do that conversion explicitly once they have verified that the value is, in fact, a valid pointer?)

Sure, but this doesn't always happen. See the runtime. Note that in the reference change, I update the code there to use the uintptr correctly:
https://go-review.googlesource.com/c/go/+/34719/1/src/runtime/mem_linux.go

The runtime can rely upon whatever implementation details it likes. (Runtime code is not user code!)

Uff. The argument that the runtime doesn't need to be "valid" Go code is disconcerting. (I say this because one of the referenced issues was closed as not a valid thing to do.) I understand the need for the pragmatic considerations and exceptions for the runtime (hence the exception in the original change), but that's a different thing.

I'm talking about a particular VM.

Gotcha. Yes, I see the calls you're referring to here:
https://github.com/cznic/virtual/blob/06a14e3fe719b20921b8be3ff81847eac0e53527/cpu.go#L65

This does hit the crux of the issue. The code there could be slightly restructured and there would be the risk of triggering runtime errors. (If the pointers lived on the stack during a growth event instead of being immediately consumed.)

I'm not really sure about the cost in that case anyways. The branch predictor will likely make most of it disappear. If the proposal is friendly (though I get the sense that it is not), I would happily run some benchmark if you're got it to assess the impact for your case.

@bcmills

This comment has been minimized.

Copy link
Member

commented Oct 16, 2017

Note that in the reference change, I update the code there to use the uintptr correctly

That seems like a good idea regardless of whether we also add the proposed dynamic check.

@bcmills

This comment has been minimized.

Copy link
Member

commented Oct 16, 2017

I note the following in the reference change description:

This will only happen on conversion to unsafe.Pointer from
non-pointer types, so most internal runtime pointer mangling should be
free from these checks.

Does that actually address a significant fraction of bad pointers? If the pointer is coming from C code, odds are good that it's already encoded as an unsafe.Pointer.

Or does the code generated by cgo itself do explicit conversions? (If so, it still seems unfortunate to couple the runtime pointer-validity check to a cgo implementation detail, especially when the interaction between cgo and the compiler is in question: see #16623.)

@amscanne

This comment has been minimized.

Copy link
Author

commented Oct 16, 2017

Does that actually address a significant fraction of bad pointers? If the pointer is coming from C code, odds are good that it's already encoded as an unsafe.Pointer.

Or does the code generated by cgo itself do explicit conversions? (If so, it still seems unfortunate to couple the runtime pointer-validity check to a cgo implementation detail, especially when the interaction between cgo and the compiler is in question: see #16623.)

This is a good question. It does seem like the structure would currently avoid this check. (The generated code effectively passes a *unsafe.Pointer which is filled in by the generated C stub.) I think the logical thing to do would be to update the CGo code generation to use uintptr across the boundary and convert to unsafe.Pointer only on the Go side.

@gopherbot

This comment has been minimized.

Copy link

commented Oct 17, 2017

Change https://golang.org/cl/71270 mentions this issue: runtime: separate error result for mmap

@aclements

This comment has been minimized.

Copy link
Member

commented Oct 17, 2017

I propose to add code that causes a recoverable panic at the time the conversion is made. This conversion adds a small cost to each unsafe.Pointer cast.

This seems like a not unreasonable debugging mode to me, but it's not clear that it should be on by default. It could be enabled, for example, by a GOEXPERIMENT setting. In that case, I wouldn't mind having it on even in the runtime, since the runtime shouldn't be making up bad unsafe.Pointers either.

Conversions that create bad unsafe.Pointers must violate the unsafe.Pointer rules. Having the ability to trap on clear violations of the rules seems like a good idea. Related to this, I've been exploring adding significantly more safe-points to code, which could expose more code that violates these rules. Having a more aggressive check would be useful for testing this.

Alternate proposal
I propose that a new debug variable is introduced, debug.stackptr, that toggles the behavior of the check during stack growth. I would advocate that this check should default to off, but regardless it would allow just that check to be disabled if it proves problematic.

The real danger here is not that we observe a small-valued not-really-a-pointer. The danger is that such code may also generate a not-really-a-pointer that actually looks like a real pointer. That can crash GC hard. The main point of detecting small-valued pointers is as a first line of defense against this. Turning off that check won't do anyone any favors.

gopherbot pushed a commit that referenced this issue Oct 18, 2017
runtime: separate error result for mmap
Currently mmap returns an unsafe.Pointer that encodes OS errors as
values less than 4096. In practice this is okay, but it borders on
being really unsafe: for example, the value has to be checked
immediately after return and if stack copying were ever to observe
such a value, it would panic. It's also not remotely idiomatic.

Fix this by making mmap return a separate pointer value and error,
like a normal Go function.

Updates #22218.

Change-Id: Iefd965095ffc82cc91118872753a5d39d785c3a6
Reviewed-on: https://go-review.googlesource.com/71270
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Mar 19, 2018

There doesn't seem to be much support for the original proposal, which seems to be unconditionally adding some checking code to conversions from uintptr to unsafe.Pointer. That would slow down correct uses. It might also require some mechanism for turning it off for the garbage collector or other code; not sure.

As a debugging mode it makes sense, perhaps through a compiler flag.

@rsc

This comment has been minimized.

Copy link
Contributor

commented Apr 2, 2018

Following Austin's line of thought, I suggest GOEXPERIMENT=checkptr and make it a throw, not a panic. It would be great as a debugging mode.

@aclements

This comment has been minimized.

Copy link
Member

commented Apr 3, 2018

I apologize if I've lost track of some of the history of this, but was this motivated by an observed problem, or is this theoretical? Neither the issue nor the CL give a concrete motivation.

I discussed this with several of the runtime and compiler developers today and came up with a counter-proposal: enable this behind a GOEXPERIMENT flag, but make the check much more exhaustive than simply looking for small-valued pointers. The small-valued pointers check in stack copying and GC is just a canary in a coal mine: it's meant to detect strong evidence that there are bad pointers around, but it's in no way exhaustive. Silencing the canary is too little, to the point of being misleading. Pointers < 4096 are not more bad than any other bad pointers. So if we're going to add a debugging mode for this, we should own it and detect all of the bad pointers we can detect, including pointers to dead objects in the Go heap.

@amscanne

This comment has been minimized.

Copy link
Author

commented Apr 6, 2018

This was an issue hit in a production system. Given the nature of the issue (panic on stack check), it wasn't obvious to debug.

An experimental check mode seems like a good proposal, presumably it would have caught the bad usage fixed by https://golang.org/cl/71270 at a minimum.

@rsc

This comment has been minimized.

Copy link
Contributor

commented Apr 16, 2018

Everyone seems to agree that the resolution here would be to add GOEXPERIMENT=checkptr and check for more pointers than just < 4096. Accepting that proposal, although no one is promising to work on it.

@rsc rsc modified the milestones: Proposal, Unplanned Apr 16, 2018

@rsc rsc changed the title proposal: Check pointers for illegal values on conversion from uintptr proposal: add GOEXPERIMENT=checkptr Apr 16, 2018

@mdempsky

This comment has been minimized.

Copy link
Member

commented Feb 13, 2019

Some suggestions:

  1. GOEXPERIMENT is very heavy weight and requires rebuilding the entire toolchain, whereas I don't see any reason this can't just be a compiler flag. We can hide it behind -d initially while still experimenting with ideas.

  2. We can check that when converting unsafe.Pointer to *T that the pointer is aligned according to T's alignment. This would be a very lightweight check.

  3. When converting unsafe.Pointer to *T, if the result points into the heap, make sure objectoffset + sizeof(T) doesn't exceed the span's sizeclass. This seems unlikely to find many issues in practice, but shouldn't be too hard to instrument.

  4. We can check that when converting uintptr to unsafe.Pointer that if it points to a Go object in the heap, then it was derived from an existing pointer to that same object according to the pointer arithmetic rules.

For the last one, at compile time, given unsafe.Pointer(U) where U is a uintptr-typed expression, we can define O(U) as:

O(uintptr(P)) = [P]    // if P has type unsafe.Pointer
O(U1 + U2) = O(U1) || O(U2)
O(U1 - _) = O(U1)
O(U1 &^ _) = O(U1)
O(_) = []

This picks out all the unsafe.Pointer expressions that according to package unsafe's strict arithmetic rules (add or subtract offsets, and use &^ to round pointers) that the resulting pointer could derive from. (In most cases, this is likely exactly one pointer.)

The compiler then just needs to invoke a runtime function like:

func checkPointerArithmetic(derived unsafe.Pointer, originals ...unsafe.Pointer) {
    baseD, _, _ := findObject(derived, 0, 0)
    if baseD == 0 {
         // Not in the heap.
         // TODO: Apply checks for stacks and/or globals?
         return
    }

    for _, original := range originals {
        baseO, _, _ := findObject(original, 0, 0)
        if baseD == baseO {
            // derived was computed from original.
            return
        }
    }

    throw("bad pointer arithmetic")
}

This is a bit more expensive than alignment checking, but should still be reasonably fast for how infrequent pointer arithmetic is.

It may also be possible to apply checks for objects on the stack or in globals. For example, we can at least guarantee that a pointer into a stack object or global object was derived from a pointer to that same stack/object region.

@gopherbot

This comment has been minimized.

Copy link

commented Feb 13, 2019

Change https://golang.org/cl/162237 mentions this issue: cmd/compile: add -d=checkptr to validate unsafe.Pointer rules

@mdempsky

This comment has been minimized.

Copy link
Member

commented Feb 13, 2019

I pushed a mostly functional proof-of-concept CL to 162237 that checks for pointers between 0 and minLegalPointer, pointer alignment, and pointer arithmetic. (It does not check pointed-to-object size.) It noticed a handful of "failures" within the standard library:

  • strings/builder.go's noescape function.
  • v := Mincore(unsafe.Pointer(uintptr(1)), 1, &dst) in runtime/runtime_linux_test.go.
  • StorePointer(addr, unsafe.Pointer(new)) in atomic_test.go.
  • syscall.UnixRights creates a past-the-end-of-the-object pointer

I'd be interested if it catches anything interesting in other code bases. To try it out:

  1. Apply the above patch locally.
  2. Reinstall cmd/compile (i.e., go install cmd/compile).
  3. Build programs or run tests with -gcflags=-d=checkptr.
@josharian

This comment has been minimized.

Copy link
Contributor

commented Feb 14, 2019

  • syscall.UnixRights creates a past-the-end-of-the-object pointer

This seems like a bona fide failure, not a scare-quote failure. I'm surprised it hasn't caused any GC-related crashes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
9 participants
You can’t perform that action at this time.