Memory model standardization #9

binji opened this issue May 23, 2017 · 18 comments

@binji
Member

binji commented May 23, 2017

We'll need a WebAssembly memory model. The ECMAScript memory model was designed to be able to work with WebAssembly, but we'll likely need a separate document to cover WebAssembly specifics, or perhaps even a shared document that the ES spec references.

Is it necessary that this document exist before we standardize threads in WebAssembly?

@binji binji added the question label May 23, 2017
@lars-t-hansen

We'll also need some kind of notion of guaranteed forward progress, which ES makes separate from its memory model.

FWIW, the "shared document" model was favored by the ES committee, whenever it was mentioned. We could start out by referencing the ES spec and having an addendum to that, if we feel we have additions.

@binji
Member Author

binji commented Jun 6, 2017

We agreed in the CG meeting that a formal memory model document is needed, but not as a blocker for v1 threads. We will, in the interim, reference the ES memory model and note places where WebAssembly differs.

@lars-t-hansen

> We will, in the interim, reference the ES memory model and note places where WebAssembly differs.

One place where WebAssembly differs from JS, and where I think we need to make some decisions, is that wasm has 8-byte memory accesses while JS (currently) does not. This raises the issue of whether 8-byte accesses in wasm are access-atomic (non-tearing) on all platforms.

Recall that in the JS memory model, racy accesses are observed as access-atomic provided all racing accesses are of the same size. (By construction all accesses are aligned.) Thus a program can use integers to simulate pointers within a flat heap without fearing that bogus pointer values will appear out of thin air, while still using racy loads and stores to access "pointer fields".
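
To make the guarantee concrete, here is a minimal sketch (TypeScript; the `PTR_FIELD` offset is made up, and it assumes the `SharedArrayBuffer` has been posted to a second agent): because both sides use plain, aligned, same-size `Int32Array` accesses, the reader can race with the writer yet never observes a value that was not actually written.

```ts
// Sketch only. Assumes `sab` is shared with a second agent (e.g. via
// worker.postMessage(sab)); PTR_FIELD is a hypothetical field offset.
const sab = new SharedArrayBuffer(1024);
const heap = new Int32Array(sab);   // aligned 4-byte views of a flat "heap"
const PTR_FIELD = 16;               // index in i32 units, i.e. byte offset 64

// Agent A: racily publishes a "pointer" (an i32 offset into the flat heap).
function writePointer(newPtr: number): void {
  heap[PTR_FIELD] = newPtr;         // plain (non-atomic) aligned i32 store
}

// Agent B: racily reads the same field with the same access size.
function readPointer(): number {
  // Tear-free per the ES memory model: the result is always some value that
  // was actually stored here, never a byte-wise mixture of two stores,
  // although *which* store it sees is unordered/racy.
  return heap[PTR_FIELD];
}
```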

For 8-byte accesses on 32-bit platforms this is a little trickier. While the x86 has been access-atomic up to 8 bytes even for unaligned accesses and across cache line boundaries for a long time, it has no plain 8-byte integer load and store instructions, and to have the effect of a non-tearing 8-byte load or store the implementation would be forced to use cmpxchg8b even for plain memory accesses. (On ARM the situation is the same except that LDRD and STRD can be used on some newer ARMv7 chips; on MIPS32 there is no solution, but then MIPS32 users are left out in the cold already by our decision to require lock-free 8-byte accesses too.)

I would like to propose that for wasm32 we resolve that 8-byte accesses are not access-atomic but can instead be executed as two four-byte operations (in either order). Obviously this is for the sake of efficiency and simplicity, but even leaving those aside it's not clear that wasm32 really needs access-atomicity for racy operations on 8-byte data, and to me it seems like a minor loss.

Speaking of access-atomicity, we probably should also clarify that only naturally aligned racy accesses (of size 2 and 4) can be access-atomic in the sense of the JS spec.
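
For illustration, a rough sketch (TypeScript, hypothetical offsets) of what the proposed wasm32 behaviour would permit: the engine may lower a plain `i64.store` to two independent `i32` stores, so a racing plain `i64.load` can observe one new half paired with one old half.

```ts
// Sketch of the *proposed* (not adopted) wasm32 semantics for plain i64 accesses.
const sab = new SharedArrayBuffer(64);
const words = new Int32Array(sab);

// One possible lowering of a non-atomic `i64.store` at byte offset 8:
function storeI64Torn(lo: number, hi: number): void {
  words[2] = lo;   // bytes 8..11
  words[3] = hi;   // bytes 12..15 (the proposal allows either order)
}

// A concurrent plain i64 load may therefore observe {new lo, old hi} or
// {old lo, new hi}, i.e. a torn value. Wasm's *atomic* i64 accesses would
// still be required to be non-tearing.
```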

@lars-t-hansen

Related in the JS space: tc39/proposal-bigint#78

@littledan
Contributor

What are the differences intended between the WebAssembly and JavaScript memory models? From these notes I see discussion of tearing on non-aligned memory accesses. Is there a difference in what the semantics should be between the two languages for non-aligned memory accesses?

@lars-t-hansen

I think the "intended" differences between the memory models are "as few as we can manage".

Wasm true atomic accesses (using the atomic instructions) require alignment and are therefore aligned (haha) with JS.

JS does not have non-aligned accesses, so that's not an issue per se; my guess is that we would model wasm unaligned accesses as sets of byte accesses, and that would be that: they would fit neatly into the JS memory model.
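
As a rough picture of "a set of byte accesses" (TypeScript sketch, little-endian, names made up): an unaligned `i32.load` would correspond to four independent byte-sized read events, which is exactly why such an access can tear per byte under races.

```ts
// Sketch only: decompose an unaligned i32 load into four byte reads.
// `bytes` is a Uint8Array view of a SharedArrayBuffer; `addr` may be unaligned.
function unalignedLoadI32LE(bytes: Uint8Array, addr: number): number {
  // Each element read below is a separate event in the memory model, so a
  // racing writer may be observed partially (per-byte tearing).
  return (
    (bytes[addr] |
      (bytes[addr + 1] << 8) |
      (bytes[addr + 2] << 16) |
      (bytes[addr + 3] << 24)) | 0   // signed 32-bit result, like i32.load
  );
}
```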

@binji
Member Author

binji commented Sep 7, 2017

> I would like to propose that for wasm32 we resolve that 8-byte accesses are not access-atomic but can instead be executed as two four-byte operations (in either order).

This seems like a good idea. I'll write something up and bring it up at the next CG meeting.

@littledan
Contributor

FWIW I wrote Lars's semantics up as a BigInt pull request.

I think with some refactoring, we could make the ES spec easy to call into directly from Wasm, and this refactoring could even land before BigInt does, if that's useful.

@rossberg
Member

rossberg commented Sep 7, 2017

@littledan, well, it would be highly unfortunate to make the Wasm core spec dependent on JavaScript. Moreover, it seems rather unlikely that the memory primitives exposed by the JS model will suffice for Wasm once we move further down the road map (think more relaxed atomics, or SIMD, for example).

@littledan
Contributor

@rossberg-chromium Do you think we should have two distinct memory models which refer to the same piece of underlying memory? How would users and implementers be able to reason about how operations from the two languages interacted?

I was imagining that the eventual goal state would either be that Wasm references the memory model in the JS spec, or that the memory model is extracted into a third document that both reference. This sort of extraction is a pretty superficial and political operation, so I thought we could decouple it from actually writing the text, starting in the JS spec.

The JavaScript spec already has hooks for several things in it which are not implemented in the spec. I don't see why we couldn't add relaxed atomics to the JS spec even if it's not exposed with JS operations. If this isn't acceptable to others in working on JS, or if we feel it's important for cleanness or organization, we can prioritize the separation of the JS memory model into a separate document. Another option is that the Wasm memory model could be a series of additions which fit cleanly on top of the JS memory model somehow. But having an entirely separate Wasm memory model formalization, apart from the JS one, seems really suboptimal.

@jfbastien
Member

The question of where the canonical memory model lives, and how to share, should probably be discussed as a joint TC39 / WebAssembly thing. Dan, would you be OK listing the alternatives and their pro / con in a separate issue?

@littledan
Contributor

@jfbastien Sorry for getting off-topic, filed #68 for that discussion.

@carlsmith

What was decided in the end? I cannot find any information on how Wasm interacts with JS in terms of tearing values. If JS uses typed views to access a shared memory, and a Wasm module also accesses the memory, using regular (non-atomic) loads and stores, are there any guarantees?

@conrad-watt
Collaborator

conrad-watt commented Nov 11, 2020

@carlsmith broadly speaking, every Wasm 1.0 access now has a counterpart in JavaScript's DataView API (including non-aligned and 8-byte accesses), and our ordering/tearing guarantees are identical to the choices made there (EDIT: where there's no equivalent typed array access).

IIRC, the choice was that non-aligned accesses and 8-byte non-atomic accesses tear on a per-byte basis, while 8-byte atomic accesses are guaranteed not to tear.
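
Concretely, the sharing setup being asked about looks roughly like this (sketch; the import names `env`/`mem` and the wasm bytes are hypothetical): JS and Wasm operate on the same linear memory, plain typed-array accesses line up with Wasm's plain loads/stores, and `Atomics.*` line up with Wasm's atomic instructions.

```ts
// Sketch only.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 1, shared: true });
const i32 = new Int32Array(memory.buffer);   // memory.buffer is a SharedArrayBuffer
const i64 = new BigInt64Array(memory.buffer);

i32[0] = 42;                                  // plain aligned i32: racy but tear-free

// Plain 8-byte accesses may tear, so use atomics when that matters:
Atomics.store(i64, 1, 0x0123456789abcdefn);   // no-tear, sequentially consistent
const v: bigint = Atomics.load(i64, 1);

// The Wasm side would import this memory, e.g.:
//   (import "env" "mem" (memory 1 1 shared))
// WebAssembly.instantiate(wasmBytes, { env: { mem: memory } });
```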

There is some question about the precise tearing behaviour of SIMD v128 load/store instructions, which I don't think have been explicitly discussed in a concurrent context. The safest choice would be to make them tear on a per-byte basis. This would mean that they'd potentially be weaker than an analogous series of scalar loads (i.e. vectorising a sequence of sequential loads as a Wasm to Wasm transformation would be unsound in general, although the transformation could probably be permitted in Binaryen through a similar mechanism to --ignore-implicit-traps).

@lars-t-hansen

On current hardware, all you get with SIMD loads and stores is bytewise tearing behavior. Intel's SDM volume 3A, section 8.1.1, states directly that SSE accesses larger than 64 bits may be implemented using multiple memory accesses. The ARMv8 ARM (follow the link from e.g. C6.3.168, SIMD&FP LDR (aarch64), to the getters and setters for Mem) is explicit that AccType_VEC accesses are never atomic and have bytewise semantics. The ARM is less explicit for aarch32, but an unbiased reading (of the ARMv8 ARM, which may have firmer semantics than older editions) is that the same interpretation applies, i.e., bytewise.

@lars-t-hansen

BTW, are we sure DataView is the desired JS reference here? DataView is ... complicated. The TypedArray APIs are generally cleaner and allow for native-endian accesses and are the ones that were specced for SharedArrayBuffer. Last I looked, accesses to shared memory via DataView had no guarantees.

@conrad-watt
Collaborator

conrad-watt commented Nov 11, 2020

> BTW, are we sure DataView is the desired JS reference here? DataView is ... complicated. The TypedArray APIs are generally cleaner and allow for native-endian accesses and are the ones that were specced for SharedArrayBuffer. Last I looked, accesses to shared memory via DataView had no guarantees.

Sorry, I was too offhand with my original answer. The way the Wasm model is set up currently, accesses get their cue from the TypedArray semantics wherever possible (including the BigInt extension), and from the DataView semantics otherwise (both little-endian). This boils down to saying that unaligned accesses always tear, so you end up with these tables (where all tearing is per-byte):

| non-atomics | aligned | unaligned |
| --- | --- | --- |
| i8  | no-tear | n/a |
| i16 | no-tear | tearing |
| i32 | no-tear | tearing |
| i64 | tearing | tearing |
| f32 | tearing | tearing |
| f64 | tearing | tearing |

| atomics | aligned | unaligned |
| --- | --- | --- |
| i8  | no-tear | n/a |
| i16 | no-tear | n/a |
| i32 | no-tear | n/a |
| i64 | no-tear | n/a |

AFAIU, aside from the tearing choice (DataView is specifically special-cased to always tear), DataView and TypedArray non-atomic accesses are treated in a uniform way (e.g. in their interaction with happens-before).
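
To see what the "tearing" rows mean operationally, here's a sketch (TypeScript; the two constants and the agent split are made up) in which a torn aligned f64 read would be directly observable. Note this is what the model *permits*: on most current hardware an aligned 8-byte access won't actually tear in practice.

```ts
// Sketch: writer alternates between two f64 values whose 32-bit halves all
// differ; any other value observed by the reader is a torn (mixed-halves) read.
const scratch = new DataView(new ArrayBuffer(8));
scratch.setUint32(0, 0x11111111); scratch.setUint32(4, 0x22222222);
const A = scratch.getFloat64(0);                 // finite double, distinct halves
scratch.setUint32(0, 0x33333333); scratch.setUint32(4, 0x44444444);
const B = scratch.getFloat64(0);                 // finite double, distinct halves

const f64 = new Float64Array(new SharedArrayBuffer(8));   // shared with a second agent

function writerLoop(n: number): void {
  for (let i = 0; i < n; i++) f64[0] = (i & 1) ? A : B;   // plain aligned f64 stores: may tear
}

function readerSawTear(): boolean {
  const v = f64[0];              // plain aligned f64 load
  return v !== A && v !== B;     // a mixed-halves pattern is neither A nor B
                                 // (and never NaN with these bit patterns)
}
```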

@carlsmith

Thank you, @conrad-watt, @lars-t-hansen. Your replies are very helpful.
