
Erasing secrets from memory (zero on drop) #11

Closed
hdevalence opened this issue Jan 6, 2017 · 22 comments

@hdevalence
Contributor

Yesterday @burdges mentioned some ideas about how to try to erase secret data from memory, and some things about zero-on-drop behaviour in Rust. It would be good to have a reliable way to clear secret data.

If I understand correctly, just implementing the Drop trait to zero memory may not be sufficient, for two reasons:

  1. the write may be optimized away (see also);

  2. drop() may never be called, because Rust's memory model allows memory leaks: "memory unsafety is doing something with invalid data, a memory leak is not doing something with valid data"

Other notes that may be of interest: this morning at RWC 2017, Laurent Simon (@lmrs2 ?) presented secretgrind.

It could be quite convenient if Rust had a #[secret_stack] function annotation that guaranteed stack erasure, but this would require a language change.

@burdges
Contributor

burdges commented Jan 6, 2017

After this answer to my question on the non-zeroing dynamic drop RFC, I've decided to start using a macro like:

macro_rules! impl_KeyDrop {
    ($t:ident) => {
        impl Drop for $t {
            fn drop(&mut self) {
                // Zero one value of type Self; the count is in units of Self, not bytes.
                // volatile_set_memory is an unstable intrinsic (nightly only).
                unsafe { ::std::intrinsics::volatile_set_memory(self, 0, 1); }
            }
        }
    }
}

which looks okay, except that volatile_set_memory is unstable and needs a nightly feature gate. I suppose a stable version could be built with std::ptr::write_volatile, along the lines of:

macro_rules! impl_KeyDrop {
    ($t:ident) => {
        impl Drop for $t {
            fn drop(&mut self) {
                // Overwrite the whole value with its default; write_volatile is
                // stable and should not be elided by the optimizer.
                unsafe { ::std::ptr::write_volatile::<Self>(self, $t(Default::default())); }
            }
        }
    }
}

And you could just write $t([0u8; 32]) since you should really know the key sizes in advance.
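For example, applied to a concrete 32-byte key type (a self-contained sketch combining the stable macro above with a hypothetical SecretKey):

macro_rules! impl_KeyDrop {
    ($t:ident) => {
        impl Drop for $t {
            fn drop(&mut self) {
                unsafe { ::std::ptr::write_volatile::<Self>(self, $t(Default::default())); }
            }
        }
    }
}

struct SecretKey(pub [u8; 32]);
impl_KeyDrop!(SecretKey); // Drop now volatile-overwrites the key with zeros.

fn main() {
    let key = SecretKey([0x42u8; 32]);
    drop(key); // the 32 key bytes are overwritten with SecretKey([0u8; 32]) here
}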

In particular, this code demonstrates that fixed-length arrays are not zeroed when dropped:

#[derive(Debug, PartialEq, Eq)]
struct SecretKey(pub [u8; 32]);

fn main() {
    let p: *const SecretKey;
    { let s = SecretKey([1u8; 32]); p = &s; ::std::mem::drop(s); }
    // This assertion fails at runtime: the bytes behind `p` are still 1s,
    // showing that dropping did not zero the array. (Reading a dead stack
    // slot like this is undefined behaviour; it is only a demonstration.)
    unsafe { assert_eq!(*p, SecretKey([0u8; 32])); }
}

If secretgrind becomes mature enough, then I'd imagine RFCs and code for appropriate attributes in rustc should go smoothly. I'd think the main issue would be ensuring that his call-graph-based approach did not cause any problems for dynamic linking, not that Rust does that very often.

@burdges
Contributor

burdges commented Jan 6, 2017

I assume memory leaks are only possible with trait objects? I'll go read that blog post.

Assuming so, I wonder if memory leaks could be prevented by always placing a Drop bound on key material whenever trait objects were used, or even making Drop a supertrait of any traits for key material. If not, this sounds like an issue worth raising with the Rust developers.

I see now. You must explicitly opt in to leaking resources with ::std::mem::forget, which is safe and might be used by data structures, although not usually on user types. There are currently around 69 occurrences of forget in the Rust repository. We could maybe submit an RFC for #[never_forget], although they'd probably call it ?Forget or something. ;)
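To make the concern concrete (a minimal sketch with a hypothetical Secret type): mem::forget is a safe function that simply skips the destructor, so a zeroing Drop never runs.

use std::mem;

struct Secret([u8; 32]);

impl Drop for Secret {
    fn drop(&mut self) {
        // Stand-in for a volatile zeroing write.
        println!("zeroing secret");
    }
}

fn main() {
    let s = Secret([1u8; 32]);
    // Entirely safe Rust: the destructor is skipped, "zeroing secret" never
    // prints, and the key bytes stay wherever they happen to be.
    mem::forget(s);
}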

@hdevalence
Contributor Author

Thanks so much for this detailed write-up!

@burdges
Contributor

burdges commented Jan 6, 2017

Also, the documentation for std::ptr::write_volatile says: 'Rust does not currently have a rigorously and formally defined memory model, so the precise semantics of what "volatile" means here is subject to change over time. That being said, the semantics will almost always end up pretty similar to C11's definition of volatile.'

A few other ways to fail to Drop by leaking resources: cycles involving Rc/Arc; unsafely overwriting pointers (including with write_volatile); transmuting to a type that forgets about a pointer; and panics. And @bluss pointed me to rust-lang/rfcs#1066.
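A minimal sketch of the Rc-cycle case (hypothetical Node type): two reference-counted nodes pointing at each other keep both strong counts above zero forever, so Drop never runs.

use std::cell::RefCell;
use std::rc::Rc;

struct Node {
    #[allow(dead_code)]
    secret: [u8; 32],
    next: RefCell<Option<Rc<Node>>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        println!("dropping node"); // never printed below
    }
}

fn main() {
    let a = Rc::new(Node { secret: [1; 32], next: RefCell::new(None) });
    let b = Rc::new(Node { secret: [2; 32], next: RefCell::new(Some(a.clone())) });
    // Close the cycle a -> b -> a: both strong counts stay above zero after
    // main's handles go away, so neither Drop runs and neither secret is zeroed.
    *a.next.borrow_mut() = Some(b);
}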

If we're just trying to zero an array, then we're seemingly just worried about the data structures we insert the key material into. It appears any local stack usage should only leak on panic, plus the scenarios explained by Laurent Simon.

@Manishearth
Contributor

I assume memory leaks are only possible with trait objects?

Memory leaks are possible safely with mem::forget or Rc cycles. It's pretty easy to avoid both in smaller crates, though in a complex application that has a DAG in it you may end up with an Rc cycle if you're not careful. Also things like forgetting to rebox the result of Box::into_raw.

The whole ?Forget (?Leak was the original proposal iirc) thing has been gone over in the past. It's possible to have such a trait, but:

  • ?-traits are hard and having two of them would probably break everything
  • This is tricky to make work with Rc. Possible, but tricky.
  • This changes the safety contract of Rust. There is now a trait that some data structures must opt out of.

There were other issues which I don't quite recall. But mostly, it doesn't seem to be something which will happen if we propose it again (feel free to try though). I too was on the ?Leak train back then, but by now I've come to appreciate the current status quo.

@Manishearth
Contributor

Manishearth commented Jan 7, 2017

Also, I'm not sure this line of reasoning is the right one to follow. We already rely on the Rust type system to make it impossible to leak secrets: reading uninitialized data is already UB. Relying on the same type system to furthermore guarantee Drop is not an additional layer of protection; it's the same layer, and if unsafe code breaks the first it can probably break the second. There may be a discipline that could be followed to keep the secret data thoroughly isolated from other unsafe code; however, ensuring that it's destroyed (by manual inspection/verification) would be a minor check compared to how hard enforcing the rest of that is.

The way I see it, if the secret is widely shared in such a way that the lack of leak guarantees in Rust is an actual practical problem (i.e. it's shared out the wazoo in a reference-counted maybe-tree-or-DAG), you've already lost: bad unsafe code that makes it possible to read random memory will probably be able to access such a long-lived secret anyway. There's a case to be made for nasal demons invoked by C code we call, but it's really the same situation -- if you're sharing the secret data that widely, you've already lost and no amount of static checks will help you.

Basically, for simple uses of a secret value, where it's visually verifiable that it's not being shared in such a way as to cause potential leaks, the leak-checking static analysis doesn't help much. For complicated cases where the static analysis would help, any bugs in unsafe code are likely to break enough to render the leak protection moot anyway. If your threat model doesn't involve broken unsafe code, you don't have to worry anyway since Rust guarantees no uninit reads if your unsafe code is sound.

I understand having protection against this kind of thing built into a different level of the application, e.g. via a sanitizer on top of the existing safety static analysis, or by having extra instrumentation to clear stacks. I don't really see the value of more static analysis here, since in the cases where it matters it won't be robust to other unsafe code breaking down anyway.

(I hope I got this point across well -- it's a bit nuanced and usually I'm all for static checks, just that in this case I don't see it adding anything)

@burdges
Contributor

burdges commented Jan 7, 2017

There are going to be attacks on Rust code through both calls to C libraries and mistakes in unsafe code, so that should be considered in the threat model.

Anyone building a modern messaging application needs to hold Axolotl ratchet key material in non-trivial data structures and serialize it to disk. Pond uses go-protobuf for serialization, for example. Cap'n Proto is far nicer than protobuf, but uses a strange arena allocator that can leave holes.

Right now, I'm wondering whether #[never_forget] couldn't be a static analysis tool:

Imagine X is some forget-finding algorithm, say one looking for panic paths, data structures involving Weak, or outright calls to mem::forget(). If you mark a type as #[never_forget(X)] then nothing happens in cargo build, but you can then run, say, cargo findforget, and it will rerun type and borrow checking for all your dependencies to infer "virtual" ?Forget<X> traits and mark your #[never_forget(X)] types as !Forget<X> to create conflicts.

You could of course use this to find memory leaks more easily, maybe adding the #[never_forget] only temporarily. Yet secret types should always be marked #[never_forget], so that users can run all leak-finding algorithms with cargo findforget. I'd expect you could not eliminate all the warnings this produces, but you could document the output in a never_forget.txt file and give reasons why each remaining warning looks okay.

@Manishearth
Contributor

Yeah, so I am considering broken unsafe code and broken C libraries in the threat model; I'm saying that the leak protection that static analysis can provide will depend on the same invariants plus some extras, and in the face of broken unsafe code it will likely be broken for nontrivial cases.

@burdges
Contributor

burdges commented Jan 7, 2017

Against prepared attackers, sure, but I think just leaving key material around willy-nilly could interact poorly with complex data structures to produce accidental disclosures. A few volatile writes cost little compared with the crypto itself.

An interesting place to raise the static analysis ideas might be the Roadmap RFC because they were discussing interoperability with C++ and this ability to recheck all your dependencies in a slight variant of Rust might be relevant to stuff like on-demand monomorphization or whatever.

@Manishearth
Contributor

but I think just leaving key material around willy-nilly

You're not getting my point here 😄 (my fault, it's nuanced)

Static leak protection won't protect against that. It will protect against the case where you trigger a memory leak in safe Rust.

Triggering a memory leak in safe Rust is hard to do by accident. It being safe to do does not mean it's easy. You need to explicitly call mem::forget, Box::into_raw, or some other thing from the very small list of leaky stdlib/ecosystem functions. It's actually possible to pretty much address this kind of leak with a marker type, wrapper functions over these functions parametrized over the marker type, and grep to forbid the original types. (It's not possible to do this in the core language without it being a breaking change, though it would be nice if it were)
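A rough sketch of that marker-type idea (all names hypothetical, not an existing API): wrap the leaky std functions so they only accept types that opt in, then use grep or a lint to forbid calling the originals directly.

use std::mem;

/// Hypothetical opt-in marker: only types tagged Forgettable may be leaked.
trait Forgettable {}

/// Wrapper over mem::forget that only compiles for opted-in types.
fn forget_checked<T: Forgettable>(value: T) {
    mem::forget(value);
}

struct ScratchBuffer(Vec<u8>);
impl Forgettable for ScratchBuffer {}

#[allow(dead_code)]
struct SecretKey([u8; 32]); // deliberately *not* Forgettable

fn main() {
    forget_checked(ScratchBuffer(vec![0u8; 16])); // compiles
    // forget_checked(SecretKey([0u8; 32]));
    // ^ compile error: `SecretKey: Forgettable` is not satisfied
}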

Or, you need to have reference-counted cycles in your application. If your application is such that you are in danger of Rc cycles involving the key, you already have your key material being shared willy-nilly, and no amount of static analysis will protect you from that.

Basically, the nontrivial/interesting cases that such analysis protects you from are situations where you have the problem regardless of how leaky it is.

I bet a nicer analysis not involving grep can be written out of tree by reading info from -Z save-analysis (which you can run on deps too). https://github.com/nrc/rls-analysis can help.

(I don't really agree that this is relevant to the C++ bits of the roadmap)

@burdges
Contributor

burdges commented Jan 7, 2017

Ahh! I'd misread one statement you made as an argument against zeroing Drops, as opposed to just an argument against aggressive measures to prevent leaks. It's not nuanced; I just glazed over and did not read carefully enough. ;) And -Z save-analysis etc. sounds great, thanks!

@cesarb

cesarb commented Jan 8, 2017

We already rely on the Rust typesystem to make it impossible to leak secrets. Reading uninitialized data is already UB. [...]

Overwriting ("zeroizing") secrets in memory bounds the temporal interval in which they exist. You have to think outside Rust's sandbox: the threat is not only leaking secrets from within the process, but also someone from the outside attaching a debugger, the whole process memory being dumped (a core dump, a VM snapshot, suspend-to-disk), or even direct access to the memory hardware (cold boot attack). If all traces of the secret had been safely erased by then, you're safe.

I proposed something related in rust-lang/rfcs#1496.

As for preventing the write from being optimized away, there's a trick similar to what I did in my constant_time_eq crate: do the zeroing in an #[inline(never)] function. While in theory the compiler could look through it, in practice it works as an optimization barrier. Or you could write the zeroing function in C, or even in assembly if you want to be 100% sure. Or all three: use assembly, fall back to C if you don't have assembly for the architecture, and fall back to #[inline(never)] if you don't have a working C compiler.
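A minimal sketch of the #[inline(never)] trick (not the actual constant_time_eq code; names are illustrative):

use std::ptr;

/// Zero a buffer in a function the optimizer cannot see into from the caller;
/// the volatile writes additionally resist being elided.
#[inline(never)]
fn zero_memory(buf: &mut [u8]) {
    for byte in buf.iter_mut() {
        unsafe { ptr::write_volatile(byte, 0) };
    }
}

fn main() {
    let mut key = [0xAAu8; 32];
    // ... use the key ...
    zero_memory(&mut key);
    assert!(key.iter().all(|&b| b == 0));
}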

@Manishearth
Contributor

You have to think outside Rust's sandbox:

I am. I'm not arguing that zeroing secrets is a bad idea. I think it's a really good thing to do. I'm saying that I don't see any additional value in static analysis in Rust that ensures that values don't leak, in the context of an already-broken sandbox.

@vks

vks commented Jan 9, 2017

Note that you can try to use libc::mlock on supported platforms to avoid memory being paged to the swap area.
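For example (a hedged, Unix-only sketch using the libc crate, with minimal error handling):

extern crate libc; // assumes the libc crate as a dependency

use std::ptr;

fn main() {
    let mut key = [0u8; 32];
    let addr = key.as_ptr() as *const libc::c_void;
    // Lock the page(s) backing the key so they are never written to swap.
    let rc = unsafe { libc::mlock(addr, key.len()) };
    assert_eq!(rc, 0, "mlock failed; check RLIMIT_MEMLOCK (ulimit -l)");
    // ... use the key ...
    // Zero before unlocking, since munlock makes the pages swappable again.
    for b in key.iter_mut() {
        unsafe { ptr::write_volatile(b, 0) };
    }
    unsafe { libc::munlock(addr, key.len()); }
}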

@burdges
Contributor

burdges commented Jan 9, 2017

As I understand it, mlock is kind of a blunt instrument:

  • Memory locks do not stack, that is, pages which have been locked several times by calls to mlock() or mlockall() will be unlocked by a single call to munlock() for the corresponding range or by munlockall().
  • Pages which are mapped to several locations or by several processes stay locked into RAM as long as they are locked at least at one location or by at least one process.

I think this says that if two cryptographic libraries attempt to mlock stack variables, or if you put two mlocked values into a data structure, then they are almost guaranteed to interfere and munlock one another. It sounds like the application needs to mlock its whole stack when in sensitive areas, while any data structures storing key material need their own mlock. Anyone want to write a crate to mlock the stack?

Or..

You can write smallish code, dynamically link it somehow, and just call mlockall. I imagine this approach conflicts with any calls to munlock too. :)

Anyways, I think mlock is beyond the scope of this library, although presumably anyone here would be happy to chat about it.

@vks

vks commented Jan 9, 2017

@burdges I think you want to be the exclusive owner of your secret data anyway and not expose it to other applications/libraries. mlocking everything is not feasible unfortunately, because the default limits are very low. For instance, on my system ulimit -l returns 64.

@burdges
Contributor

burdges commented Jan 9, 2017

Interesting. I suppose any data structures that store vast numbers of keys, like per-contact ratchet states, should be encrypted in memory, then. And threads doing crypto might want a small mlocked stack.

@burdges
Contributor

burdges commented Jan 10, 2017

Just created this related issue: rust-lang/rfcs#1850

@burdges
Contributor

burdges commented Jan 12, 2017

Ideally, one should probably update tars to the new allocator traits or something. I've posted a quick-and-dirty crate to do zero-on-drop the cheap way discussed here, however: https://github.com/burdges/zerodrop-rs

@burdges
Contributor

burdges commented Jan 14, 2017

We'll see what people say about this idea : rust-lang/rfcs#1853

@cesarb

cesarb commented Jan 14, 2017

I would like to present a crate I wrote these last few days inspired by this discussion: https://crates.io/crates/clear_on_drop. Some of the ideas in it might be useful.

burdges referenced this issue in burdges/lake Jan 15, 2017
Rust does not zero non-`Drop` types when it drops them.
Avoid leaking these types, as doing so obstructs zeroing them.
In particular, if you are working with secret key material then
- do not call `::std::mem::forget`,
- do not unsafely zero types with owning pointers,
- ensure your code cannot panic,
- take care with `Weak`, and
- examine the data structures you use for violations of these rules.

See rust-lang/rfcs#320 (comment)
and https://github.com/isislovecruft/curve25519-dalek/issues/11
@hdevalence
Contributor Author

Just to round this out -- we're currently using @cesarb's clear_on_drop to wipe heap allocations. (We heap-allocate in multiscalar multiplication.)
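For reference, a usage sketch along the lines of clear_on_drop's documented example (SecretScalar is a hypothetical type; the exact trait bounds are the crate's):

extern crate clear_on_drop; // assumes clear_on_drop as a dependency

use clear_on_drop::ClearOnDrop;

#[derive(Default)]
struct SecretScalar {
    bytes: [u8; 32],
}

fn main() {
    let mut place = SecretScalar::default();
    {
        // Wrap a mutable reference; when the wrapper is dropped, the target
        // is overwritten with its Default (all-zero) state.
        let mut secret = ClearOnDrop::new(&mut place);
        secret.bytes = [0x42; 32];
        // ... use the secret ...
    } // cleared here
    assert_eq!(place.bytes, [0u8; 32]);
}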

For stack allocations, I don't think there's much we can do, for two reasons:

  1. We have no control over stack allocations anyway, so we have no way to know that the memory we zero is the memory that was used for temporaries (we don't know how big spills are, etc.);
  2. The proper place to zero the stack is at the entrance to the "crypto" part, but this is outside of curve25519-dalek, which is intended as a tool for building higher-level crypto operations.
