Discuss implementation defined behavior #75

jfbastien · 2015-05-19T18:17:50Z

Opening this bug as a reminder to go back and write documentation about this.

We want to avoid all forms of undefined behavior that can lead to nasal demons, and instead document how the wasm platform allows for implementation-defined behavior and what the acceptable behaviors are.

C/C++ UB is progressively refined by the compiler, and can be affected by tools such as sanitizers. The wasm platform then nails down some behaviors and leaves others open to the implementation. The implementation can then decide what the behavior is, based on the OS/ISA it's executing on.

Note that behaviors include: "what happens if an enum is out of range", "shift by bitwidth or larger", "what do out-of-bounds accesses do", "what about unaligned accesses", "data races", and much more exciting things!

As a reference, PNaCl has a non-comprehensive list of undefined behavior.

sunfishcode · 2015-05-19T18:49:35Z

Our present answers:

"what happens if an enum is out of range" -> C++ compilers lower enums; that's not wasm's problem
"shift by bitwidth or larger" -> just works
"what do out-of-bounds accesses do" -> they trap. Or in the non-ideal semantics presently being discussed, there are specific unsavory possibilities.
"what about unaligned accesses" -> they work, but if the alignment is less than is claimed, they may be slow
"data races" -> Garbage values and non-deterministic orderings seem unavoidable. I am hoping we can draw the line there, but it's not formalized at present.

There's also:

NaN sign bits and payloads after floating-point operations are implementation-dependent
SIMD may want to retain the "subnormals may or may not be flushed" clause
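
To make the NaN point concrete, here is a minimal Python sketch of how a program could avoid depending on implementation-dependent sign bits and payloads by canonicalizing NaNs before inspecting their bits. The function names and the choice of canonical pattern are assumptions made for illustration; nothing here is taken from a wasm document.

```python
CANONICAL_NAN_F64 = 0x7FF8000000000000  # quiet NaN, sign clear, empty payload

EXPONENT_MASK = 0x7FF0000000000000
MANTISSA_MASK = 0x000FFFFFFFFFFFFF


def is_nan_bits(bits):
    """True if the 64-bit pattern encodes an IEEE 754 binary64 NaN."""
    return (bits & EXPONENT_MASK) == EXPONENT_MASK and (bits & MANTISSA_MASK) != 0


def canonicalize_nan(bits):
    """Map every NaN pattern to one fixed pattern; leave other values alone."""
    return CANONICAL_NAN_F64 if is_nan_bits(bits) else bits


# Two NaN patterns that differ only in the sign bit. Which one a given
# platform produces for, say, 0.0 / 0.0 is the implementation-dependent part
# (the specific values are illustrative assumptions).
nan_a = 0x7FF8000000000000
nan_b = 0xFFF8000000000000
```

After canonicalization both patterns compare equal as integers, so code that hashes or serializes raw float bits would no longer observe the platform difference.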

kg · 2015-05-19T19:12:59Z

"shift by bitwidth or larger" -> "just works" probably isn't precise enough. IIRC some architectures have different answers to what "just works" means here: extra bits masked off (mod N), clamped, treated as 0, etc. The threshold at which the masking/clamping happens varies too.

sunfishcode · 2015-05-19T19:49:29Z

AstSemantics.md has the full scoop. Shift counts are unsigned, unmasked, unclamped, and not treated as zero unless they are zero.
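
The difference between these options is easy to see in a small Python model of a 32-bit left shift. This is only a sketch: it reads "unmasked, unclamped" as shifting by the full unsigned count, and all function names are hypothetical.

```python
BITS = 32
MASK = (1 << BITS) - 1


def shl_unmasked(value, count):
    """Shift by the full unsigned count; counts >= 32 push every bit out."""
    return (value << count) & MASK


def shl_masked(value, count):
    """Use only the low 5 bits of the count (count mod 32)."""
    return (value << (count % BITS)) & MASK


def shl_clamped(value, count):
    """Clamp the count to 31 before shifting."""
    return (value << min(count, BITS - 1)) & MASK


def shl_count_as_zero(value, count):
    """Treat an oversized count as zero, leaving the value unchanged."""
    return value & MASK if count >= BITS else (value << count) & MASK
```

For a count below 32 all four agree. For `1 << 32` they give 0, 1, 0x80000000, and 1 respectively, which is why the semantics need to be spelled out.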

sunfishcode · 2015-05-20T02:30:07Z

A few more things:

there's a maximum callstack depth which depends on dynamic conditions; if the program exceeds that, it traps (Mention that stack overflow is checked. #77)
dynamically resizing the heap may fail due to allocation failure
programs may fail to start for numerous reasons

Unless I've missed something, this is a comprehensive list of incompletely specified behavior in the language itself, at present.
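
As a rough illustration of the first two items, the following Python sketch models an engine whose call depth and heap size are limited. The `Engine` class, its limits, and the choice to report a failed resize with a boolean are all assumptions made for the example, not a description of any implementation.

```python
class Trap(Exception):
    """Hypothetical stand-in for a wasm trap."""


class Engine:
    """Toy engine with a call-depth limit and a heap-size limit (both assumed)."""

    def __init__(self, max_call_depth, max_heap_bytes):
        self.max_call_depth = max_call_depth
        self.max_heap_bytes = max_heap_bytes
        self.call_depth = 0
        self.heap_bytes = 0

    def call(self, function, *args):
        # The limit is checked on every call, so exceeding it is always
        # detected and reported as a trap.
        if self.call_depth >= self.max_call_depth:
            raise Trap("call stack exhausted")
        self.call_depth += 1
        try:
            return function(self, *args)
        finally:
            self.call_depth -= 1

    def resize_heap(self, new_size):
        # Assumption: a failed resize is reported to the caller, not trapped.
        if new_size > self.max_heap_bytes:
            return False
        self.heap_bytes = new_size
        return True


def count_down(engine, n):
    """Recursive function that uses one engine call frame per step."""
    return 0 if n == 0 else 1 + engine.call(count_down, n - 1)


engine = Engine(max_call_depth=8, max_heap_bytes=1024)
shallow_result = engine.call(count_down, 5)  # fits within the limit

try:
    engine.call(count_down, 50)              # exceeds the limit
    deep_call_trapped = False
except Trap:
    deep_call_trapped = True
```

The same program either completes or traps depending only on the limits the engine happens to have, which is the sense in which the behavior depends on dynamic conditions.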

titzer · 2015-05-20T07:48:13Z

On Wed, May 20, 2015 at 4:30 AM, Dan Gohman notifications@github.com wrote:

> A few more things:
>
> there's a maximum callstack depth which depends on dynamic conditions; if the program exceeds that, it traps (Mention that stack overflow is checked. #77)
>
> dynamically resizing the heap may fail due to allocation failure

We should group these two under the category "exceeding resources of execution engine."

> programs may fail to start for numerous reasons

We should list these reasons. For example, linking failures, verification failures, resource exhaustion.

> Unless I've missed something, this is a comprehensive list of incompletely specified behavior in the language itself, at present.


sunfishcode · 2015-05-26T20:59:21Z

Another thing:

SIMD.js is currently proposed to have reciprocal and reciprocal sqrt approximation functions. As approximations, the specific results may vary between platforms.
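
A practical consequence is that code using such approximations should compare results against a tolerance, not expect bit-exact values. The Python sketch below models two platforms whose reciprocal estimates have different precision; `approx_reciprocal` and the 12-bit and 8-bit precisions are assumptions chosen for illustration, not the SIMD.js definition.

```python
import math


def approx_reciprocal(x, mantissa_bits):
    """Hypothetical model: 1/x rounded to `mantissa_bits` bits of mantissa."""
    mantissa, exponent = math.frexp(1.0 / x)  # mantissa lies in [0.5, 1)
    scale = 1 << mantissa_bits
    return math.ldexp(round(mantissa * scale) / scale, exponent)


def close_enough(actual, expected, relative_tolerance):
    """Compare against a tolerance instead of demanding bit-exact results."""
    return abs(actual - expected) <= relative_tolerance * abs(expected)


platform_a = approx_reciprocal(3.0, 12)  # a 12-bit estimate
platform_b = approx_reciprocal(3.0, 8)   # an 8-bit estimate
```

The two results differ from each other, yet each is within its own precision bound of 1/3, so a portable test would check `close_enough` with the loosest tolerance it is willing to accept.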

sunfishcode · 2015-05-28T16:42:34Z

I created #87 to start a document collecting the list here.

jfbastien · 2015-05-28T17:27:53Z

I'm hoping that we can explain UB as a progressive filtering: C++ has wide UB, the compiler narrows it somehow, sanitizers can narrow it more, and then wasm filters it more into implementation-defined behavior.

sunfishcode · 2015-05-28T18:04:25Z

I agree, that sounds useful.

sunfishcode · 2015-06-01T18:16:45Z

On the other hand, this isn't specific to WebAssembly; it's just how C++ works, on any platform. So while there's value in explaining how C++ works to C++ developers, it's not clear where this would fit into the WebAssembly documentation.

titzer · 2015-06-01T20:24:13Z

I think it's important that we limit the scope of undefined behavior or
implementation-defined behavior in wasm. That doesn't seem to be a priority
in the C++ world, but it'd be nice to say, e.g. a misaligned load doesn't
cause your program to jump into the middle of "sqrt" or trash half the heap.
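
A minimal Python sketch of what such bounded behavior could look like: a toy linear memory in which a misaligned access still yields a well-defined value and an out-of-bounds access traps, with no other effect on program state. The `LinearMemory` class and `Trap` exception are hypothetical names for illustration.

```python
import struct


class Trap(Exception):
    """Hypothetical stand-in for a wasm trap."""


class LinearMemory:
    """Toy model of a bounds-checked linear memory (names are illustrative)."""

    def __init__(self, size_in_bytes):
        self.data = bytearray(size_in_bytes)

    def _check(self, address, width):
        # Every access is validated against the current memory size.
        if address < 0 or address + width > len(self.data):
            raise Trap(f"out-of-bounds access at {address}")

    def store_i32(self, address, value):
        self._check(address, 4)
        struct.pack_into("<I", self.data, address, value & 0xFFFFFFFF)

    def load_i32(self, address):
        self._check(address, 4)
        return struct.unpack_from("<I", self.data, address)[0]


memory = LinearMemory(16)
memory.store_i32(1, 0xDEADBEEF)       # misaligned address: still well defined
unaligned_value = memory.load_i32(1)

try:
    memory.load_i32(14)               # bytes 14..17 run past the 16-byte memory
    out_of_bounds_trapped = False
except Trap:
    out_of_bounds_trapped = True
```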

-B


sunfishcode · 2015-06-02T01:56:36Z

We are indeed very strenuously limiting the scope of undefined behavior and implementation-defined behavior in wasm.

And we do have pretty good control flow integrity, since return addresses are stored on the trusted stack and can't be clobbered, and indirect calls will always call into the beginning of some function, never into the middle of a function or into garbage memory. We should advertise this more in the documentation.
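
The indirect-call property can be sketched in Python as a table lookup: a computed target is only ever an index into a table of whole functions, so it can name a function entry point or be rejected, and nothing else. The table, the functions in it, and `call_indirect` as written here are illustrative assumptions.

```python
class Trap(Exception):
    """Hypothetical stand-in for a wasm trap."""


def add(a, b):
    return a + b


def negate(a):
    return -a


# Indirect call targets are indices into this table, never raw addresses.
FUNCTION_TABLE = [add, negate]


def call_indirect(index, *args):
    """Call the function at `index`, trapping on an invalid index."""
    if not (isinstance(index, int) and 0 <= index < len(FUNCTION_TABLE)):
        raise Trap(f"invalid function index {index}")
    return FUNCTION_TABLE[index](*args)
```

A corrupted index can at worst select the wrong function or trap; it cannot land in the middle of a function body or in data.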

However, we can't change C++ itself. After a program is compiled to wasm, its behavior will be relatively fixed (races and other documented details notwithstanding), but before that, C++ optimizers are known to take extensive advantage of the threat of nasal demons, and can trash half the heap if they think they're optimizing something.

titzer · 2015-06-02T07:42:00Z

On Tue, Jun 2, 2015 at 3:56 AM, Dan Gohman notifications@github.com wrote:

> We are indeed very strenuously limiting the scope of undefined behavior and implementation-defined behavior in wasm http://IncompletelySpecifiedBehavior.md.
>
> And we do have pretty good control flow integrity, since return addresses are stored on the trusted stack and can't be clobbered, and indirect calls will always call into the beginning of some function, never into the middle of a function or into garbage memory. We should advertise this more in the documentation.
>
> However, we can't change C++ itself. After a program is compiled to wasm, its behavior will be relatively fixed (races and other documented details notwithstanding), but before that, C++ optimizers are known to take extensive advantage of the threat of nasal demons, and can trash half the heap if they think they're optimizing something.

Agree; nasal demons are C++ compiler territory; we should just make this
explicit in our documentation.

-B


sunfishcode · 2015-06-02T09:16:20Z

#102 is an attempt at addressing the concerns discussed here.

jfbastien · 2015-06-02T17:11:31Z

I'd like to capture a point I made in #102:

I'm not sure that we want to guarantee that there is a trusted call stack, that branches always have a valid destination, or that an application can't clobber the call stack. In the context of running untrusted code on the web we definitely want this guarantee, but I see it as an implementation detail. We should make it possible to implement Web Assembly with entirely different sandboxes, or under entirely different security models.

Two examples:

Targeting NaCl means there doesn't need to be a trusted stack to enforce security when running untrusted code.
Environments such as node.js have a different security boundary, and don't necessarily need to treat code as untrusted and pay the associated cost.

jfbastien · 2015-06-02T17:24:05Z

Linking to issue #105: Alignment will probably require implementation-defined behavior.

lukewagner · 2015-06-03T17:05:49Z

Update on #105: unless @titzer's search finds anything, looks like we get to keep deterministic behavior concerning alignment.

sunfishcode · 2015-06-10T18:55:30Z

@jfbastien does #102 address the concerns here, or is there more you'd like to do here?

jfbastien · 2015-06-10T19:03:42Z

I'll want to revisit this with @davidsehr and others, but that can wait until after going public. Let's just leave it open for now, and try to close before MVP.

binji · 2015-10-23T17:50:04Z

Closing along with #107 being closed.

jfbastien · 2015-10-23T18:49:02Z

This issue was actually addressed by the following text:
https://github.com/WebAssembly/design/blob/master/CAndC%2B%2B.md#undefined-and-implementation-defined-behavior

sunfishcode mentioned this issue May 28, 2015

Create a list of incompletely specified behavior. #87

Merged

jfbastien added the enhancement label May 29, 2015

sunfishcode mentioned this issue Jun 2, 2015

Incompletely specified behavior.md #102

Merged

jfbastien mentioned this issue Jun 2, 2015

Should WebAssembly have spooky action at a distance? #107

Closed

jfbastien added this to the MVP milestone Jun 10, 2015

jfbastien self-assigned this Jun 10, 2015

binji closed this as completed Oct 23, 2015
