Skip to content
This repository has been archived by the owner on Nov 3, 2021. It is now read-only.

memory.fill out-of-range behaviour, and interaction with exceptions #10

Closed
julian-seward1 opened this issue Mar 28, 2018 · 14 comments
Closed

Comments

@julian-seward1
Copy link

For a memory.fill which is partially or entirely out-of-range with respect to the associated memory, what should the behaviour be? I assume it should trap. This would be consistent with the baseline load/store instructions.

Looking forwards, if WebAssembly is extended to support exceptions, and there are exceptions for out-of-range memory accesses, then the exception specification will need to say something about the state of a memory area that is in the valid portion of a partially out-of-range memory.fill (or, in general, any store insn).

As an example, imagine we have a one-page memory (valid offsets 0 .. 0xFFFF inclusive) and we do a memory.fill starting at 0xFF00 for 0x200 bytes. This will then presumably trap and eventually we will end up in the memory exception handler. Then there are the questions:

  • What is the state of 0xFF00 .. 0xFFFF now?

  • What failing access (base, size) pair can we report? Should it be (0xFF00, 0x200) ? Or can it be some subrange of that?

These have relevance when considering how to implement an optimised memory.fill for short constant ranges, by compiling the equivalent of a sequence of baseline store instructions inline.

If we require that the state of the valid area is unchanged following a trapping memory.fill, we will force implementations to perform a single range check for the entire area, which might be inconvenient for very simple implementations that simply want to emit the equivalent code to that of a sequence of baseline store instructions.

If we can allow the contents of the area to be arbitrary and we're allowed to report only a failing sub-range to the exception handler, then the fill and range checks can be done incrementally. But allowing memory to be arbitrary at any point doesn't seem compatible with WebAssembly security goals.

I would therefore (reluctantly) suggest that the memory.fill out of range semantics must be "if any part of the range is invalid then (1) no part of memory is changed and (2) the entire range must be reported to the exception handler." (But I'd prefer to be convinced somehow that the incremental approach is also OK ..)

@rossberg
Copy link
Member

Traps can be caught at the embedding level, so the cases you mention are observable already even without exception support in Wasm. Existing store instructions are specified to have the behaviour you suggest for fill, and the memory model for threads will guarantee the same for atomic access (even with tearing).

So I agree that performing the size check prior to any mutation is the right semantics.

@binji
Copy link
Member

binji commented Mar 28, 2018

Yes, we already require this behavior for data segment initialization as well, so it is consistent.

Like @julian-seward1, I'm a little hesitant to require an up-front bounds check, but at least for memory.fill, it shouldn't be a big deal for implementations w/ trap handlers since they can fill incrementally backward (from high to low addresses) to achieve the same effect. memory.copy could also generally copy the ranges backward, but would have to copy forward to handle overlap when source > dest.

That said, allowing this would require not specifying how bytes are filled during the fill/copy. For threading at least, this is not possible. I suppose we could specify this ordering to allow for this optimization, however.

@lukewagner
Copy link
Member

Agreed on all above; also, wasm doesn't (yet) semantically report the trapping address. Pseudo-semantically, it only reports bytecode offset of the faulting instruction and function index.

@jfbastien
Copy link
Member

What will the behavior be when we also add support for mprotect?

I don't think we want to guarantee any particular ordering for fill / copy when we add atomics (otherwise we need fences everywhere), and I'm guessing we also don't want to guarantee any order with copy because we might want to e.g. perform it in reverse? For segment initialization we guarantee that we'll perform initialization from low to high, until we fault, but segments never have to deal with mprotect so we want do this.

So I'm not sure that we want to mandate a bounds-check up front for fill / copy, because with mprotect this will be different from segment initialization.

@eholk
Copy link

eholk commented Mar 28, 2018

But allowing memory to be arbitrary at any point doesn't seem compatible with WebAssembly security goals.

Which security goals does this violate? It doesn't seem like type safety is violated here, since the memory is just an array of bytes, so no guarantees would be broken by only changing some of the bytes.

@rossberg
Copy link
Member

@jfbastien, what's the difference between init and copy in this regard? Why would mprotect not affect the former?

@jfbastien
Copy link
Member

@jfbastien, what's the difference between init and copy in this regard? Why would mprotect not affect the former?

Ah you're right, I was thinking that no code runs before segment initialization. While that's true for a single module, it's not true when sharing memories between modules, so we've got the same problem.

We basically have restricted segment initialization to be in-order, and for each segment from low to high address, byte-per-byte (or, observably for faults, one page at a time). That doesn't seem like a pessimization, so I guess it's fine with mprotect. Enforcing the same restriction for fill / copy is restrictive, though.

@lars-t-hansen
Copy link
Contributor

I seem to remember that we come back to this issue from time to time during meetings but I forget if we've actually made any decisions here about bounds checking behavior. @binji, what do you remember?

@conrad-watt
Copy link
Contributor

conrad-watt commented Jan 24, 2019

My impression is that it was decided that bulk ops should be incremental, low->high.
That was @jfbastien's position here (WebAssembly/threads#96)
and I remember a consensus consistent with that when we talked in the CG about this PR (WebAssembly/spec#820)

FWIW, with the current memory model this doesn't imply that the low->high ordering MUST be observable by racing accesses in other threads (so no extra barriers needed), just that any trapping behaviour has to be consistent with that ordering.

@binji
Copy link
Member

binji commented Jan 24, 2019

Yes, that's my understanding as well.

@lars-t-hansen
Copy link
Contributor

"low->high" makes sense for fill and init but not in the general case for copy with overlaps, which will sometimes prefer to copy high->low to avoid the use of an intermediate buffer. The specification (currently in a PR) calls for using an intermediate buffer for all copies, but for good performance we don't really want to do that.

@titzer
Copy link

titzer commented Jan 25, 2019

Whatever is decided here will eventually apply to table operations as well, when/if tables can be shared. Currently there are no shared tables and no trapping behavior is possible for copy and init (bounds are checked before operation begins), but with shared tables it will degenerate to the same set of issues.

@lars-t-hansen
Copy link
Contributor

I think we're all aligned on element-at-a-time bounds checking for both memories and tables, certainly both the reference interpreter and SpiderMonkey implement this (haven't checked v8).

@nomeata
Copy link
Contributor

nomeata commented Apr 30, 2020

In case someone stumbles on this old thread; there is a related, longer one that concluded that bounds are checked ahead of time; see #111 (comment).

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants