Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: overload allocator within specified namespaces #34586

Open
awnumar opened this issue Sep 28, 2019 · 9 comments
Open

proposal: overload allocator within specified namespaces #34586

awnumar opened this issue Sep 28, 2019 · 9 comments
Labels
Milestone

Comments

@awnumar
Copy link
Contributor

@awnumar awnumar commented Sep 28, 2019

This proposal is aimed at tackling the problem of leaking sensitive data when using Go code that was written by someone else, i.e. from the standard library or elsewhere.

In your own code you can use a special allocator (such as one provided by memguard or using syscalls directly) to hold sensitive data.

However, the vast majority of people are not cryptographers, experts at implementing cryptographic code, or capable of vetting code for soundness. For this reason they must rely on standard implementations provided by the standard library or trustworthy third-parties (libsodium, nacl, etc.)

The problem is that these implementations are often written with little consideration towards limiting the exposure of sensitive state. Take this example of the blake2b-256 digest function from the crypto package:

// Sum256 returns the BLAKE2b-256 checksum of the data.
func Sum256(data []byte) [Size256]byte {
	var sum [Size]byte
	var sum256 [Size256]byte
	checkSum(&sum, Size256, data)
	copy(sum256[:], sum[:Size256])
	return sum256
}

The digest is computed and then copied to a new fixed-sized array which is returned to the caller. There is no way for the caller to ensure that the state remains within specified memory regions. Issues like this are everywhere.

The most straightforward solution is to re-write parts of the implementation yourself, ensuring that they use a specified allocator and clean up state when they are done. However this is non-trivial and cryptographic code is really tricky.

None of the previous proposals (#21374 and #21865) manage to resolve this issue so here I propose something different: the ability to provide a custom allocator within specified namespaces, such as an imported package. I'm not certain yet on the specifics so I very much am looking for discussion at this time to see if this idea is feasible.

We can define an allocator with some bare-minimum required functionality:

  1. A method to allocate a continuous region of memory.
  2. A method to securely de-allocate a previously allocated region of memory.
type Allocator interface {
    Alloc(n int) ([]byte, error)
    Free(b []byte) error
}

(Perhaps a realloc or resize method could also be included, although this can be emulated by allocating a new mapping of the required size, copying the data, and destroying the old mapping. From a performance point of view this would only be useful if the realloc function dynamically resized the memory region without copying data, although this may not be possible when the new size is greater.)

Ignoring syntax for the moment, imagine the runtime knows that when packageA requires memory it should be using this black-box allocator.

  1. packageA executes an instruction requiring a new heap allocation of 32 bytes.
  2. The runtime calls a.Alloc(32) and is essentially given a pointer to a 32 byte region of memory on the heap. The semantics of how the allocator did this is irrelevant, and the only assumption is that this (and only this) region of memory is safe to use.
  3. The runtime casts this allocation to the required type and passes it to packageA.
  4. A finaliser may be attached to the allocation, calling Free when it detects no references to the memory.

Using memguard to implement this is fairly trivial. An added bonus is that memguard keeps a reference to every active allocation that has not yet been destroyed, so the original caller (whoever is importing packageA) may call memguard.Purge() in order to wipe all state, removing reliance on the garbage collector calling the finaliser.

Another bonus of this design is that the interface is very minimal, allowing many different implementations of the allocator to "compete" for the "best" solution. AFAIK the only one at the moment is memguard, API reference here.

Now on to the syntax.

The solution I'm toying with at the moment is some semantics added to the unsafe package, allowing the caller to specify a scope somehow. This function would be called before the library is used to perform any sensitive operation.

package unsafe

func OverloadAlloc(importPath string, a Allocator) error {
    // ...
}
if err := unsafe.OverloadAlloc("golang.org/x/crypto/blake2b", a); err != nil {
    panic(err)
}

This preserves backwards-compatibility, requiring no changes to the Go language syntax or specification.

Another way could be to specify the allocator at import-time using some special syntax.

import <a>"golang.org/x/crypto/blake2b"
import a:"golang.org/x/crypto/blake2b"
import Alloc<a>"golang.org/x/crypto/blake2b"
// idk what the best way is ¯\_(ツ)_/¯

This method would require changes to the language but it is backwards-compatible in that code that worked before would continue working and code that was broken before would almost certainly still be broken, albeit in a different way. (Unless old code happened to accidentally implement the correct interface and also guess the syntax for using it.)

Another solution still would be to overload not packages but something else like a function's namespace and everything it calls, although this might be more tricky. Or perhaps overload the allocator for the entire program but this could be undesirable if the program allocates a lot of memory, most of which is not sensitive. (This could be accomplished anyway with a function in unsafe by passing it the name of the current package, or main).

Anyways, looking for insights.

/cc @FiloSottile @randall77 @ianlancetaylor @agl

@gopherbot gopherbot added this to the Proposal milestone Sep 28, 2019
@gopherbot gopherbot added the Proposal label Sep 28, 2019
@randall77

This comment has been minimized.

Copy link
Contributor

@randall77 randall77 commented Sep 28, 2019

type Allocator interface {
    Alloc(n int) ([]byte, error)
    Free(b []byte) error
}

Is this allocator only used for []byte allocations? How about strings?

Delegating general allocations are hard, as the runtime needs to keep track of which parts of the object are pointers. []byte-only allocation nicely sidesteps that problem. The runtime will still need to track the set of objects returned by such an allocator, and the GC would require a check for each pointer scanned to see if it intersects one of those objects (which could be expensive).

Can the allocator return Go heap memory, or must it use outside-of-heap memory?

The runtime will only call Free when it gets around to it, exactly like finalizers. It doesn't know that the object can be freed until the next GC. I'm not sure that's the desired behavior that crypto code needs.

In general I'm against this proposal. I think it is a much better design to require crypto code that wants this guarantee to code it explicitly. For example, your Sum256 example would require returning a memguard region, not an array, to really be secure. Even an allocation policy like the one described here wouldn't fix this code, as the calling convention requires that the return value be on the stack - it cannot use the allocator.

The problem is that these implementations are often written with little consideration towards limiting the exposure of sensitive state.

That sounds like the thing we should fix. If you want to limit exposure of sensitive state, you must write code to do that. If the stdlib doesn't do that, you have to write your own (or update the stdlib to do what you want).

@awnumar

This comment has been minimized.

Copy link
Contributor Author

@awnumar awnumar commented Sep 28, 2019

Is this allocator only used for []byte allocations? How about strings?

[]byte is the most useful as it can be cast to any other type:

data := []byte("yellow submarine")
str := *(*string)(unsafe.Pointer(&data))
// both reference the same region of memory

Can the allocator return Go heap memory, or must it use outside-of-heap memory?

It would return non-go heap memory, probably attained from mmap.

The runtime will only call Free when it gets around to it, exactly like finalizers. It doesn't know that the object can be freed until the next GC. I'm not sure that's the desired behavior that crypto code needs.

In memguard, a reference to every currently existing allocation is kept, and there are convenience functions to destroy these securely. In a situation where the library is providing an allocator to the runtime, it can keep track of the allocations itself and wipe out only those ones once the library has finished its work and returned a value.

For example, your Sum256 example would require returning a memguard region, not an array, to really be secure. Even an allocation policy like the one described here wouldn't fix this code, as the calling convention requires that the return value be on the stack - it cannot use the allocator.

Good point. Even so, rewriting functions to take an output buffer as an argument or return the data as an ordinary slice is a lot easier than trying to re-write the code to explicitly use a different allocator.

@randall77

This comment has been minimized.

Copy link
Contributor

@randall77 randall77 commented Sep 28, 2019

it can keep track of the allocations itself and wipe out only those ones once the library has finished its work and returned a value.

Then why would the runtime also do that? Just as a backstop?

Even so, rewriting functions to take an output buffer as an argument or return the data as an ordinary slice is a lot easier than trying to re-write the code to explicitly use a different allocator.

I'm not sure i agree. Changing code to use a different allocator just involves passing a alloc Allocator around, calling it when necessary, and keeping track of when to free. While you're in there changing input and output buffers, it isn't much harder to add an allocator.
Keeping track of when to free is the hard part, but you think the allocator itself can do that.

@awnumar

This comment has been minimized.

Copy link
Contributor Author

@awnumar awnumar commented Sep 28, 2019

Then why would the runtime also do that? Just as a backstop?

Yes, mostly. Attaching a finaliser is free so might as well do it.

I'm not sure i agree. Changing code to use a different allocator just involves passing a alloc Allocator around, calling it when necessary, and keeping track of when to free. While you're in there changing input and output buffers, it isn't much harder to add an allocator.

I've tried, it's really not very easy. When you have something like secretbox here where there's lots of internal functions doing lots of allocations, copies, calling other functions and packages and passing state between them, and on top of it all cryptographic code is incredibly difficult to parse and understand, it quickly gets out of hand.

Keeping track of when to free is the hard part, but you think the allocator itself can do that.

a := getAlloc()
unsafe.OverloadAlloc("crypto", a)
crypto.Work(secret, state, &wow)
a.Destroy() // destroys everything that the imported package used
@randall77

This comment has been minimized.

Copy link
Contributor

@randall77 randall77 commented Sep 28, 2019

Yes, mostly. Attaching a finaliser is free so might as well do it.

Attaching a finalizer to non-Go-heap memory is not free. That would be tricky to implement in the runtime.

A finalizer on the Allocator is fine (presumably the Allocator is in the Go heap).

So I'm not sure whether Free is necessary in this proposal.

@awnumar

This comment has been minimized.

Copy link
Contributor Author

@awnumar awnumar commented Sep 28, 2019

Attaching a finalizer to non-Go-heap memory is not free. That would be tricky to implement in the runtime.

We already attach a finaliser on allocations in memguard: https://github.com/awnumar/memguard/blob/11ddc2241fddbb1ddc7d8ffcbd9051d6c118428f/buffer.go#L28-L35

Is this different since it's attached to a struct object instead of a slice? I guess the allocator itself could attach the finaliser if the runtime cannot.

So I'm not sure whether Free is necessary in this proposal.

I thought about this but Free gives the runtime some flexibility in a way. It can be used to emulate realloc, for example.

@randall77

This comment has been minimized.

Copy link
Contributor

@randall77 randall77 commented Sep 28, 2019

Is this different since it's attached to a struct object instead of a slice? I guess the allocator itself could attach the finaliser if the runtime cannot.

The type of object doesn't matter. What matters is whether the object is in the Go heap or not.
memguard sets a finalizer on a Go object, not on a mmap-allocated object.

@awnumar

This comment has been minimized.

Copy link
Contributor Author

@awnumar awnumar commented Sep 28, 2019

I guess the slice itself is still a go object, just the ptr value points at some memory not in the go heap. The runtime doesn't need to concern itself with that location.

@randall77

This comment has been minimized.

Copy link
Contributor

@randall77 randall77 commented Sep 30, 2019

The allocator as spec'd returns a []byte. You cannot set a finalizer on a []byte. You can only set a finalizer on something that is a pointer.

b, err := alloc.Alloc(100)

You can't set a finalizer on &b[0] because that's not a pointer to a heap object.
You could store b in a containing structure:

type C struct {
    b []byte
}
c := &C{b:b}

You could then set a finalizer on c. But that doesn't solve the underlying problem. Just because c is dead doesn't mean that b is dead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.