
Agenda for the July meeting of WebAssembly's Community Group

  • Host: Google, Kirkland, WA

  • Dates: Tuesday-Thursday July 18-20, 2017

  • Times:

    • Tuesday - 9:00am to 5:00pm
    • Wednesday - 9:00am to 5:00pm all-day GC / managed data discussion
    • Thursday - 9:00am to 5:00pm
  • Location: 6THB, 787 6th St S, Kirkland, WA 98033

    • Room: Banana Seat
  • Wifi: GoogleGuest (no password)

  • Dinner:

    • Wednesday - 6:00pm
    • Izumi
    • 12539 116th Ave N.E., Kirkland, WA 98034
  • Contact:

Registration

Now closed. Registration form

Logistics

  • Where to park

    • Free parking is available outside the building.
  • How to access the building

    • The morning of event we'll meet you at the door.
    • At other times, please call or email the host.
  • Technical presentation requirements (adapters, google hangouts/other accounts required, etc.)

    • Presentations will be done with a Google Hangout.
    • Contact host if you need alternatives.

Hotels

The Heathman Hotel

Nearby Hotels

Agenda items

  • Tuesday - July 18
    1. Opening, welcome and roll call
      1. Opening of the meeting
      2. Introduction of attendees
      3. Host facilities, local logistics
    2. Find volunteers for note taking
    3. Adoption of the agenda
    4. Proposals and discussions
      1. WebAssembly Specification (Andreas Rossberg)
        1. Status update & brief overview
          • What's there and how it's structured
          • Potential meta issues (Markdown scalability, MathJax performance, LaTeX familiarity)
        2. Discussion & feedback
        3. Discussion of further steps:
          • Hosting
          • Tagging for W3C
          • JavaScript & Web API specs (find additional volunteers?)
      2. Multiple return values and generalised block types (Andreas Rossberg)
        1. Strawman proposal
        2. Discussion: Should we unify/generalise the handling of function types?
        3. Discussion: Should the proposal already include a pick instruction or alternative?
      3. Tail calls (Andreas Rossberg)
        1. Strawman proposal
        2. Discussion: Typing, interaction with host functions
        3. Discussion: What instruction scheme should we pick?
        4. Discussion: Priority, timeline?
      4. Update on proposals from May: Non-trapping float-to-int conversions (Dan Gohman)
        1. Proposal Repo
      5. Update on proposals from May: threads (Ben Smith)
        1. POLL: Use floating-point timeout for Wait?
        2. POLL: Should non-shared WebAssembly.Memory be serializable
        3. Discussion: Any further issues to address?
      6. Update on proposals from May: SIMD (James Zern + Brad Nelson)
        1. Presentation on WebP portable SIMD performance.
        2. Discussion on ramifications of performance results.
        3. POLL: Does the WebP port demonstrate sufficient performance consistency to obviate concern around performance cliffs?
        4. POLL: Does the WebP port demonstrate sufficient performance gain to justify adding integer SIMD opcodes?
        5. POLL: Does the WebP port demonstrate a useful constellation of opcodes around a V128 type?
          • For v128.const, v128.load, v128.store?
          • For v128.and, v128.or, v128.xor, v128.not?
          • For v128.shuffle (v8x16)?
          • For i8x16?
            • splat, extract_lane, replace_lane?
            • add, sub, mul, neg?
            • shl, shr_s, shr_u?
            • eq, ne?
            • lt, le, gt, ge x _s, _u?
            • min, max?
          • For i16x8?
            • splat, extract_lane, replace_lane?
            • add, sub, mul, neg?
            • shl, shr_s, shr_u?
            • eq, ne?
            • lt, le, gt, ge x _s, _u?
            • min, max?
          • For {i8x16,i16x8}.{add,sub}saturate[su]?
          • For {i8x16}.mul15?
            • This would be a 15-bit constant followed by a 15-bit shift multiply (to allow for use of MULHI and the arm equivalent which has a different shift).
          • For i32x4?
            • splat, extract_lane, replace_lane?
            • add, sub, mul, neg?
            • shl, shr_s, shr_u?
            • eq, ne?
            • lt, le, gt, ge x _s, _u?
            • min, max?
          • For i64x2?
            • splat, extract_lane, replace_lane?
            • add, sub, mul, neg?
            • shl, shr_s, shr_u?
            • eq, ne?
            • lt, le, gt, ge x _s, _u?
            • min, max?
          • Narrow ops?
          • Widening ops?
          • Horizontal adds?
        6. POLL: Do the integer ops form a single constellation?
    5. Adjourn
  • Wednesday - July 19
    1. Find volunteers for note taking
    2. Proposals and discussions
      1. GC / managed data support (Andreas Rossberg) discussed all-day Wednesday
        1. Discuss motivation and general design space
          • What can't be done with WebAssembly MVP that this proposal allows?
          • goals & non-goals
          • pros, cons, & issues of implementing user-land GC with WebAssembly as is
          • pros, cons, & issues of adding hooks to support user-land GC
          • pros, cons, & issues of making GC built-in
          • should we provide one solution, the other, multiple ones?
        2. Interaction with threads, especially on the web
          • Is it possible / worthwhile to restrict cross-language references to simplify GC problem?
        3. Dive into more specifics as time allows
    3. Adjourn
  • Thursday - July 20
    1. Find volunteers for note taking
    2. Proposals and discussions
      1. Future meeting dates
      2. Presentation: webpack and WebAssembly (Sean Larkin)
      3. Access to DOM/JS Objects
        • Discussion: Can we find a simple way to cover use cases that require Wasm modules to handle DOM / JS objects
          • Past proposals have considered allowing JSObjects in WebAssembly.Table slots.
          • A small set of element access / invocation opcodes could allow faster calling to JS methods + more stand-alone Wasm modules.
        • POLL: Should we prepare a stand-alone proposal to add this facility?
      4. User engagement
        • Discussion: Should we create a user mailing list or forum?
      5. Administration of GitHub
        • Discussion: How should admin access be handled?
      6. Meeting format and dates for Working Group
        • Discussion: How much should be remote?
        • Discussion: How frequently should we meet?
      7. Pure Wasm Threads
        • Discussion: Could we add a way for Wasm to spawn + join threads?
        • Discussion: Should wasm Tables be sharable in this context?
      8. Virtual memory in WebAssembly
        • Discussion: Allow unmapping the zero page?
        • Discussion: Allow other sorts of memory mapping? ArrayBuffers etc?
    3. Closure

Schedule constraints

  • Andreas Rossberg will depart mid-day Thursday.
  • Lars Hansen will depart mid-day Thursday.
  • James Zern is only present Tuesday + Wednesday.

Dates and locations of future meetings

Dates Location Host
2017-11-06 to 2017-11-07 Burlingame, CA TPAC

Meeting notes

Tuesday

Roll call

Attendee Affiliation Note
Keith Miller Apple
Gabriel Dos Reis Microsoft
Andrew Scheidecker self
Brad Nelson Google Host
Lars Hansen Mozilla
Bill Budge Google
Mohamed Hegazy Microsoft
James Zern Google
JF Bastien Apple Chair
Alex Danilo Google
Andreas Rossberg Google
Jakob Olesen Mozilla
Richard Winterton Intel
Jacob Gravelle Google
Rodrigo Kumpera Microsoft
Ben Smith Google
David Wrighton Microsoft
Bobby Powers UMass Amherst
Limin Zhu Microsoft
Michael Holman Microsoft
Saam Barati Apple
Mark S. Miller Google
Leif Kornstaedt Microsoft
Thomas Nattestad Google
Maoni Stephens Microsoft
Louis Lafreniere Microsoft
William Maddox Adobe
Ben L. Titzer Google
Michael Ferris Microsoft
Luke Wagner Mozilla
Arun Purushan Intel
Denis Merigoux Mozilla
Leaf Petersen Google
Daniel Frampton Microsoft
Deepti Gandluri Google
Ed Coyne Google
Rodolph Perfetta ARM
Daniel Lehenbauer Microsoft
Sean Larkin Microsoft
Mike Dunne ACTIV Financial Systems

Agenda Items

Agenda formally adopted.

WebAssembly Specification

Andreas Rossberg presenting.

  • slides
  • JS and Web API aren’t included: separate specs.
  • PLDI paper, initial work for formal description.
  • Small-step vs. big-step semantics for execution; small-step is necessary when threading is introduced

Andreas's questions to the audience:

  • how do we ratify?

  • what about spec version?

  • Brad: As we hand off to the WG, there is a versioning scheme that is required.

  • JF: W3C wants linear versions.

  • Brad: there can be overlap (of versions?)

  • JF: Every post-MVP thing should have a unicode icon as we did in the design repo to make it clear which are grouped together. When we stamp out a version for the W3C process, we have the features included automatically.

  • Andreas: This may not be enough for folks, they may want to know that a feature is on track for a particular version.

  • Brad: There are legal ramifications with versioning. Charter says there is a version number every 6 months.

  • JF: once a version has been released there is a time period for W3C commitments.

  • JF: Having feature groups means we can implement in differing order.

  • Andreas: What about the version numbering scheme (naming).

  • Brad: The charter says v1, v2, etc.

  • Volunteers for JS API and WebAPI specs?

  • It is a separate document in the same repo. Current specs are independent from any browser.

  • Brad: we need to be careful: the WCG is for a specific document so we need to make those as an “extension” of the wasm specs.

  • Andreas: any volunteers?

  • Brad: there is a chance Domenic might be willing but need to check with him.

  • Luke: Have a donut document in between, WebAssembly to Web IDL bindings, Dan Ehrenberg has indicated some interest.

  • JF: what is missing from the non JS side?

  • Andreas: Nothing, though there will be bugs

  • Luke: the JS layer should run in any JS shell, the Web layer in any browser.

  • Andreas: we have a significant number of tests, but corners of the language are probably not tested fully. Negative tests for text format syntax are missing.

  • JF: how do we fix those missing tests? Can we have an outline of what is missing?

  • Andreas: I can try.

  • Ben: it would be nice to have coverage statistics for the tests

  • Andreas: we are also missing tests for the binary format but unsure how to proceed.

  • JF: Can we sync with Dan Ehrenberg?

  • Luke: I’ll start an email thread with a few folks

  • (Domenic will be looped in as well)

  • JF: is anyone using the specs? Andreas doubts it; the spec is too recent.

Action item: Luke to start an email thread, loop in Domenic.

COFFEE BREAK

Proposal: Multiple return values and generalised block types

Andreas Rossberg presenting.

  • slides
  • Post-MVP feature, semantics discussed in the PLDI paper. Addresses symmetry issues with respect to branches (forward branches can have arguments, backward branches currently cannot).
  • JF: Is there a point in doing it if it doesn’t lead to better codegen? In short: what does it allow the toolchain to do? It should avoid some spills no?
  • Luke: gave some example where handling C++ features has become easier with this proposal.
  • Lars: It seems like the block case is more controversial.
  • Andreas: I agree.
  • Luke: It seems like it should be straightforward to implement that case for engines that do SSA conversion.
  • ?? (MS person): .NET uses this to implement things like the ternary operator. It didn’t bring much value and is a source of bugs, as it is an uncommon feature.
  • Keith: this may be more commonly used as it could result in better binaries
  • ?? (MS person): .NET didn’t have a pick instruction; it works from the top of the stack, which may explain the limited use.
  • Andreas: Having pick for locals may be useful for the GC proposal, since some GC types cannot be default-initialized, as is needed for locals.
  • Discussion about using a JS array for multiple returns: an array is the canonical choice already made by TypeScript, and there is no upcoming JS feature which will help.
  • Further discussion about importing a JavaScript function that returns multiple values via an array.
  • Andreas: should the inclusion of pick be a separate proposal?
  • Keith: it feels like a genuinely useful operation.
  • Andreas: pick can be seen as a generalisation of dup, but to be completely useful you need a generalisation of swap as well.
  • Ben T: Pick should be separate item, since it requires a more complex type checking in the context of unreachable code (polymorphic stack).
  • Michael F: Does this include pick across blocks?
  • Andreas: That would require block arguments
  • Luke: Could be an option… (not proposing it!)

POLL

Any objection to multiple results for functions & instructions?

None.

POLL

Should this proposal include block parameters?

SF F N A SA
1 14 3 2 1

Dissent:

  • Brad: concerned that there may be quirks around unification, rather would keep it separate.
  • Lars: Not well motivated at this point, need more data about size wins. We can do what we need with locals currently, may be useful for GC, but need more info.
  • Jakob: same, size data.

POLL

Should this proposal include a pick operator?

SF F N A SA
0 0 11 9 1

Consensus to not do it in this proposal.

Action item: Luke to follow up on the impact for emscripten.

POLL

Should multiple results be reflected as arrays at the JS API level?

SF F N A SA
2 12 2 3 1
  • Keith: not obvious what we want, not critical so we should go back to it later.
  • Brad: doesn’t solve all of our problems
  • Daniel L: Doesn’t seem like something we need to commit to now / introducing a type system (per GC) might create more options (e.g., { x: 0, y: 0 } instead of [0, 0]).

SHORT BREAK 🕺

Non-trapping float to int conversion

Dan is not here, JF presenting.

  • Andreas: why didn’t we change the operator?
  • JF: not to change the semantics. It is also useful to have the trapping version.
  • Ben: the concern was not to break the semantics
  • Keith: not to set a precedent of changing semantics so early in the spec’s lifetime.
  • Andreas: we want to set that precedent — want to allow to define behaviour that previously trapped
  • Ben T: That’s why we made those operators trap in the first place, to allow defining these corner cases later
  • JF: we don’t want to repeat last meeting’s discussion, just to know if we agree with the direction Dan took.
  • JF: the work should continue online (merge pull request). It would be good if this could be merged by the next meeting (Oct/Nov 2017).
  • JF: btw Brad has a pull request to describe the process.
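
To make the two behaviours in this discussion concrete, here is a minimal Python sketch contrasting a trapping float-to-int conversion with a non-trapping one for a signed 32-bit target. It assumes the non-trapping variant saturates (NaN becomes 0, out-of-range values clamp to the nearest representable integer); the notes above do not spell this out, so treat it as an assumption. The names `Trap`, `trunc_trapping` and `trunc_saturating` are hypothetical and are not taken from any spec.

```python
import math

I32_MIN, I32_MAX = -2**31, 2**31 - 1


class Trap(Exception):
    """Stands in for a WebAssembly trap (hypothetical name)."""


def trunc_trapping(x: float) -> int:
    # Existing behaviour: NaN, infinities and out-of-range values trap.
    if math.isnan(x) or math.isinf(x):
        raise Trap("invalid conversion to integer")
    result = math.trunc(x)
    if not (I32_MIN <= result <= I32_MAX):
        raise Trap("integer overflow")
    return result


def trunc_saturating(x: float) -> int:
    # Assumed non-trapping behaviour: every input yields some i32 value.
    if math.isnan(x):
        return 0
    if math.isinf(x):
        return I32_MAX if x > 0 else I32_MIN
    return max(I32_MIN, min(I32_MAX, math.trunc(x)))
```

Keeping both functions side by side mirrors JF's point that the trapping version stays useful: the existing operator keeps its meaning and the non-trapping behaviour arrives under a new name.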

LUNCH BREAK

Tail call

Andreas Rossberg presenting.

  • slides
  • AR: Does anybody have questions about general motivation for Tail calls?
  • David: Do we want it for performance/semantics?
  • AR: Various compilation, optimization techniques are hard to implement without tail calls, other things you can do with tail calls - representation of FSMs
  • KM: In our engine, the instance has some data that it stores right out of TLS data, per instance, if you call to another instance, lexically, the stub needs to return
  • Ben T: Luke talked about disallowing cross-module tail calls.
  • MH: You can allow it without requiring it. The Windows x64 calling convention does not support non-matching calls. The function that is doing the tail call has to have the same or fewer arguments. Weirdness around SIMD parameters - only require tail calls when the signatures match.
  • BT: What is being called depends on the outer context
  • JF: It’s not hard to implement that.
  • AR: You won’t address most of the use cases
  • MH: I’m ok not supporting those use cases. :-)
  • SB: You can emulate tail calls
  • AR: Only way is trampolines, which are inefficient
  • MH: That’s what we would have to do for x64 anyway, calling convention doesn’t allow moving frames
  • AR: So you want to hold everyone hostage to the Windows x64 calling convention?
  • MH: Yes.
  • LH: Don’t you have languages that require tail calls?
  • JF: doesn’t F# do this?
  • DW: CLR does support tail calls in the way that Andreas is describing, calls a thunk, sets up padding, and then jumps. It’s expensive in a fundamental way, but it’s faster than you would expect
  • KM: Is it so inefficient that using that operation is never done?
  • AR: What does F# do?
  • DW: This feature was specifically built for F#, we built it for them, they just do global analysis and convert their tail calls into loops. In order to get performance, F# stopped using this feature.
  • AR: That only works for tail recursion, though, not for tail calls in general
  • JF: For MS implementation to work well, what would that imply for work from your engine, where would the cost be prohibitive? In the specific trampoline case - can’t look at tail calls as an optimization, it would unlock things you couldn’t do before but it would be costly. Which ones would it just work, and when would you need the trampolines?
  • MF: This would be at the LLVM level or higher. If we have this, then LLVM is going to use it everywhere and that would be bad
  • Ben T: How prohibitive would it be to change the ABI?
  • MH: We can’t do that, it would break all the tooling. Windows would not be happy
  • KM: Stack machines only get bigger, interesting to investigate.
  • MH: If you need tail calls, allow them only between functions with matching signatures; coerce so that the functions have the same arguments
  • BT: They could go through an indirection, why can’t these be done in the engine?
  • MH: Will not work for indirect calls, it requires more work at the producer level
  • AR: You can’t do it or it has to be super slow
  • KM: You’d have to have a transformation where you find the function with the largest number of arguments, and convert them all to match.
  • KM: I’d be ok with restrictions on number of arguments
  • JF: What is the best way to figure this out? No matter what we do there is a cost on Edge.
  • MH: Having matching number of arguments would work for us
  • BT: Violates primary use case for functional languages
  • KM: You can have something, optionally parsed section with tail call information
  • AR: Why can’t your engine figure that out? It knows whether tail calls are present.
  • KM: Optional section
  • BS: Wouldn’t that cause problems with Lazy validation?
  • BT: You would have to recompile. Like a C++ builtin
  • KM: You can scan the module and find the largest one
  • BT: You’d have to regenerate code
  • KM: Scan for the biggest one, now you know how much space you need to allocate, it’s weird with lazy validation to push the arguments, information is backwards because you don’t know which functions have tail calls
  • BT: Can you just inflate frames? Why does that bubble up to the tools?
  • MH: It’s not just the tools, You can have empty parameters, that would be fine
  • BT: Figure out type of call, and figure out whether you need to inflate the parameter
  • MH: Only functions, that call functions that have tail calls
  • SB: Why can’t you analyze ahead of time to analyze the module?
  • MH: Because lazy
  • SB: Don’t do it lazily.
  • BN: For argument’s sake, we’re privileging languages that use this. If we provide just the mechanism that doesn’t allow you to grow the stack that might work(?) Not allow C++ to store its primary stuff on the stack, is it that hard to push the problem on to the tools?
  • KM: Difference between not doing something else, delay fixing the problem
  • SB: Having headers for information already available in the module is awful
  • LW: There’s also streaming compile, capabilities are not limited we just have more upfront information
  • SB: Don’t buy that what you’re proposing is a performance win
  • BS: Conclusion we came to was that we need tail call information in the headers
  • LW: You can compute locally what functions do tail calls, and which ones don’t, computed in the engine when it looks at the functions. One does not trivially add tail calls to ABIs of existing functions
  • BN: Does two bits for function signature work?
  • KM: We can call it the two bit tail call type system
  • MH: That might work, don’t want it to be the default
  • JF: It is worrying, but fixable. What would turn MH to neutral to explore tail calls?
  • WM: How bad would it be to allow for separate calling conventions, and understand that tail calls are more expensive, doing global analysis within the module is standard stuff
  • JF: It’s a small subset, very frequent and can have huge performance impact
  • MH: It’s punishing our most popular platform, x64
  • BT: What do we do if it’s bad on one engine? Not do it at all?
  • JF: Goal is to increase consensus
  • BT: What is the ultimate goal? We can fix the problem through fixing the implementation in the engine.
  • GDR: The way I’m reading it is that it’s a post MVP feature, microsoft folks can take it offline, and come back and figure out what the options are.
  • DW: ABI, I’m not sure where the policy comes from, but all the tools produce the same ABI, all the tooling is tied to the same ABI, there are ways to solve the problem. Which limitation do you pick? The core stack has to be consistent for MS tools
  • JF: Microsoft folks can take it offline, and come back with more information
  • AR: Exactly the same conversation we had with JS, what is MS’s incentive to work on it?
  • MH: It’s in the JS spec, we have no intention of implementing it.
  • JF: Let’s assume this is fresh, we trust MS to come back next meeting with a better understanding of what they can do. Let’s take a poll, and know MS will vote against, but we trust MS to come back with reasonable investigation
  • GDR: Not a promise to say yes, but will investigate.
  • LW: We have more information in wasm than in JS (explicit vs. implicit) because type annotations.
  • MH+JF+BN: Don’t allow cross origin globals
  • JF: Maybe Ben has a better idea than I do, what do you expect toolchain to do cross boundaries?
  • BT: Indirect tail calls are possible?
  • JF: Say you want to do a tracing jit, you can’t tail call across that - if you jit through that mechanism, we’re disallowing that if we can’t tail call cross modules.
  • LW: If you have any pinned register state, you need to restore that, indeterminate state
  • SB: I disagree, through use of inlined caching it’s possible
  • LW: You can’t guarantee state with pinned registers
  • SB: I think you’re right. (to LW)
  • JF: Take a poll on general idea of tail calls, revisit this next meeting and see if balance has shifted.
  • BS: We need action items to make sure work gets done.
  • WM: Is it possible to provide the simplest mechanism to provide tail calls w/ zero args and zero results, and require using linear memory for stack?
  • BT: How would this work for languages that disable linear memory?
  • MH: I’m ok with exploring tail calls.
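
For readers unfamiliar with the trampoline emulation that AR and SB refer to above, here is a minimal Python sketch. It is an illustration only, not how any engine or toolchain in this discussion implements it, and the names `trampoline`, `is_even` and `is_odd` are hypothetical. Instead of calling the next function in tail position, each function returns a thunk, and a driver loop keeps running thunks until a non-callable result appears.

```python
def trampoline(func, *args):
    """Run func, then keep calling returned thunks until a plain value appears."""
    result = func(*args)
    while callable(result):
        result = result()
    return result


def is_even(n):
    # A tail call to is_odd is replaced by returning a thunk that performs it.
    return True if n == 0 else (lambda: is_odd(n - 1))


def is_odd(n):
    return False if n == 0 else (lambda: is_even(n - 1))


# The depth here would overflow the stack with direct mutual recursion,
# but the trampoline keeps the stack flat.
print(trampoline(is_even, 100_000))
```

The stack stays at constant depth, but every step pays for a closure allocation and an indirect call through the driver loop, which is the inefficiency AR objects to.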

POLL

Do we want to explore tail calls more?

SF F N A SA
14 7 4 0 0
  • JF: What are the AIs for the next meeting?
  • Microsoft: Figure out your constraints
  • Everyone else: Be flexible about calling convention
  • MF: What are the use cases? Can we see some data? If 99% is the same pattern we could support that.

Action item

  • Microsoft: figure out constraints
  • Google toolchain folks: Gather data on where tail call would be used, what it would look like.

BREAK 💃

Threads

Ben Smith presenting, keeping it brief as most of the bigger issues have been discussed in the previous meeting.

  • BS: presenting float timeout.
  • JF: What is the range for the timeout? What is the practical maximum
  • BS: Practically, I don’t think there are any concerns - someone calculated it for F64, it was ~years. Range is irrelevant, because it is so large, precision on the other hand needs to be taken into account. AR brought up that we are essentially lying about precision.
  • AR: More about precision, why pretend we have this large value when we don’t in practise?
  • JF: Can we agree that it’s a question of aesthetics - there is no technical reason?
  • BT: There actually is, we’re trying to map between FP time, and integer time.
  • AR: It’s just not a correct implementation. I still care about the range being too large.
  • BS: We already have implementation limits.
  • BT: We could just have one numeric type, double :-).
  • AR: JS has no choice, impls are effectively using an integer - it’s an Int53 and then you start rounding
  • BT: NaN is just a pain.
  • BS: Same point could be made about int64(?) as well
  • JF: Have an op for infinite wait?
  • BS: I did consider that, many of the APIs did not make that distinction, there are other APIs where you can pass infinity as a value. Might be nicer to have one instruction instead of two.
  • JF: When we want to allow interruptible threads, we want it to be able to stop wait.
  • BS: Separate waits - timed vs otherwise seems weird. Negative means infinite for the integer case etc.
  • BS: Do we care about smaller than ns for timing resolution?
  • JF: From an OS perspective, I don’t think that’s possible.
  • MM: No hardware in practice is that fast
  • BS: This has happened to APIs in the past, when smaller granularities aren’t taken into account.
  • (Discussion about timing on POSIX time).
  • AS: How do you define that when wasm doesn’t have its own concept of time?
  • JF: Voting Strongly For means keeping the status quo.
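
The range-versus-precision point can be checked with a little arithmetic. The sketch below assumes the timeout is counted in nanoseconds (the notes do not fix a unit) and compares how far an f64 can count exactly with the range of a signed 64-bit integer. All names are illustrative.

```python
NS_PER_DAY = 86_400 * 10**9
NS_PER_YEAR = 365.25 * NS_PER_DAY

# An f64 has a 53-bit significand, so above 2**53 not every integer is representable.
F64_EXACT_LIMIT = 2**53
collides = float(F64_EXACT_LIMIT) == float(F64_EXACT_LIMIT + 1)

# How long a nanosecond counter stays exact in f64, versus the full i64 range.
f64_exact_days = F64_EXACT_LIMIT / NS_PER_DAY
i64_range_years = (2**63 - 1) / NS_PER_YEAR

print(collides)
print(f64_exact_days)
print(i64_range_years)
```

Under that assumption an f64 stops distinguishing adjacent nanosecond counts after roughly 104 days, while an i64 stays exact over its whole range of roughly 292 years.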

POLL

Use floating-point timeout for Wait?

SF F N A SA
1 1 4 7 3

Consensus: change from floating point (current) to int64.

Dissent:

  • KM: Easy to maintain symmetry between Web APIs
  • (?): If we were to add a WebAssembly clock, we would want that to be an integer.

Memory serialization discussion:

  • SB: I can see copy being useful
  • KM: Copy instead of manually reconstructing memory
  • BS: I don’t see the toolchain being able to do that, you would still need manual setup.
  • JF: vfork is exactly this.
  • (BS explaining memory sharing between agents.)
  • (JF & BN Discussing read only permissions for shared memory)
  • BN: Has Domenic chimed in on the issue?
  • BS: Will follow up to see if he wants to chime in.

Action item: Ben Smith to follow up with Domenic.

  • SB: What is the actual question here?
  • BS: It’s just whether we need it, or postpone to later
  • LW: Copy on write is hard to implement - do we want to think about it right away? We have to do something, so let’s postpone for now.
  • JF: Do we have any users?
  • RW: Potential use case with optane memory - special memory that’s retainable.

POLL

Should non-shared WebAssembly.Memory be serializable?

SF F N A SA
0 0 5 9 0

Consensus: Do not explore further unless we have new information.

  • Discussion: Any further issues to address?
  • Future steps: Writing Spec, tests, moving implementation along further - any concerns?
  • JF: Useful to have issues to follow along. Updates on formal memory model?
  • BS: Christian Mattarei is a researcher with Stanford working on memory model standardization, currently implemented a tool that.., large suite of litmus tests, exploring how to integrate with test262.
  • JF: Context on current memory models
  • BS: Once integration into test262 is complete, probably will be able to push forward to define what is needed for the WebAssembly memory model
  • AR: Is it mostly same?
  • BS: Mostly yes, but some differences.
  • AR, BN, LH - Discussion on memory model, weaknesses of axiomatic memory models.
  • AR: At least for release acquire, it’s a solved problem.
  • LH: We have release acquire, we want to allow for racy accesses.
  • JF, AR: More discussion about validity of memory model being worked on, compared to other memory models.
  • JF: Can we have a side note to point to memory model, instead of locking it down in the spec? Practically, we have an understanding of what we want to do. How do we spec it accurately?
  • AR: Specifying memory model itself is useful, even if we don’t block on the memory model, do we plan to do something about it?
  • BS: Yes, but we don’t know who/when. It’s important right now to not block.
  • BN: Who has implementations?
  • LH+LW: We share code between workers.
  • LH: Agents are a compromise
  • JF: It would be nice to semi-formally agree on for the shell, to have a good test suite, maybe in the spec tests.

Action item: Lars to propose a common shell API. Others to discuss.

  • JF: Agent stuff is painful, can’t pass a closure, makes it hard to write interesting tests.
  • LH: We should talk about this in the future.
  • MF: We have a JS SAB, but we don’t do anything more
  • SB, LH, BS - discussion on whether sequential consistency is enough for now.
  • BS: Changing name of WebAssembly.Global? We have some options.
  • (Discussion on potential names)
  • AR: We can’t change the term global, it’s everywhere. All JS Api objects reflect the name.
  • BT: Did we decide they were thread locals?
  • AR: It’s global in terms of scope.

Consensus: Do bikeshedding online.

BREAK

James Zern presenting WebP Portable Intrinsics port.

  • WebP portable intrinsics port
  • JF: What do you mean SSE4?
  • JZ: Forcing default arch to be SSE4, vs arch default which is SSE2 on x64.
  • JF: Just using LLVM?
  • JZ: Yes, clang 4.0.1, last stable release. R14b (Android toolchain) is older, r15b is newer, based on 5.0.
  • JF: What are comparison points?
  • JZ: Comparing to C code, and to arch implementation, but Wasm does not have all the operations we need.
  • JF: Why is Aarch64 slower than armv7?
  • JZ: The relative improvement isn’t as great, I would agree with you, but there’s a fair amount of inlining, but pattern matching on the compiler end is fragile, state of the arm now is that there is a lot of stuff that we’re doing manually - that can slow things down.
  • Post-meeting note JZ: armv7 was run with r14b mistakenly, the numbers were corrected post-discussion.
  • JF: To BN: Did you talk to some other customers who have their own portable intrinsics? Do we have other users?
  • BN: We’re trying to mirror the architectures, we’re trying to address the concern of performance cliffs.
  • JF: Discrepancies are weird, but what’s good is that you get a similar speed up.
  • JZ: Neon, and SSE2 are fairly close as the data shows.
  • KM: What exactly is WebP
  • JZ: Idea is trying to save bits on the wire + brief overview of WebP
  • JF: Could you clarify what 1.4 is?
  • JZ: Compared to C code, 1.0 is C code.
  • JF: Artificially tried to simulate the toolchain?
  • JZ: We’ve left out a bunch of ops, used only the subset of the ones that have been agreed on.
  • KM: What images are you using?
  • JZ: For benchmarking we use a complex image, that gives us good coverage for the codec. Bryce canyon image for test image.
  • (KM+JZ Discussion on picking images for codecs, standard decoding practice is to look for the hardest image decode.)
  • JF: Getting information on benchmarking, running WebP on different platforms.
  • RW: Experimented with data, we have a tool that gives instruction statistics - I can share some data.
  • JZ: What is your test data?
  • RW: It’s an emulator,
  • JZ: What is it running?
  • RW: 1080, hi motion, standard transformer.
  • JF: How much should we care about other architectures?
  • BN: Do you measure on other architectures?
  • JZ: We do have MSA archs, but no hardware. Other products have dropped buildbots for MIPS.
  • JF: Numbers are promising, as CG chair, I want to make sure other architectures are well represented.
  • JF: Do you have numbers for other archs? N9 etc.?
  • JZ: N9 numbers are consistently bad, Samsung performance S5 onwards are a lot closer. Samsung might be another data point.
  • JF: We can’t block this on non-responsiveness from other architectures.
  • (JF+JZ Discussion on N5/N9 performance)
  • MH: We cover x64/ARM, so onboard.
  • JF: What about Mozilla?
  • JF: We don’t want to standardize something without a real world performance.

POLL

Assuming we perform measurements on various x86 and ARM we have access to, and do an honest effort to reach out to other architectures (MIPS, POWER), do we agree that we’ve covered “multiple relevant architectures”?

SF F N A SA
12 8 0 0 0
  • JF: Concerned about measurements on other platforms, but we are signing up to do measurements. We will collect data.
  • JZ: Tradeoff on how much pattern matching do we want vs. how many native ops we need.
  • RW: I emulate, by running with full SSE capabilities.
  • JZ: If you want to see what instructions are being used, don’t use the portable intrinsics. Use the native intrinsics, like you did for VP9, to see what instructions are actually being used, and don’t rely on what is generated in the portable case.
  • DG, JZ, JF, (?) - Discussion on performance cliffs, across architectures, application, images etc.
  • Agree that integer, and floating point ops are separate.

POLL

Does the WebP port demonstrate sufficient performance consistency to obviate concern around performance cliffs?

SF F N A SA
5 8 3 4 0
  • JF: Will follow up with measurements on platforms that matter to Apple / Android, next meeting - new meeting with data that increases consensus, or cripples performance.
  • KM: Having a use case is valuable.
  • SB: This is good for benchmarking in general, not tuned to one application. We owe it to SIMD in wasm to have more than one use case.
  • BN: We’ve used this as a real world application, there are many micro benchmarks.
  • BN: Get some inputs on neutrals
  • RW: Only that it’s a point case.

Action item:

  • JZ to measure on other Android CPU ISAs.
  • JZ / BN to try contacting MIPS / POWER folks to perform measurements.
  • JF to measure on Apple hardware.

POLL

Does the WebP port demonstrate sufficient performance gain to justify adding integer SIMD opcodes (assuming there are no performance cliffs)?

SF F N A SA
9 7 3 1 0

Dissent:

  • SB: Just one payload, there is a possibility that it might just not perform well on other payloads.

POLL

Does the set of integer opcodes present in the portable SIMD port of WebP demonstrate a useful constellation of opcodes? It’s a subset of opcodes already.

SF F N A SA
9 5 0 2 0

Dissent:

  • JF: Concern is that it demonstrates only on WebP, we need another use case. Can be dispelled with measurement.
  • BN: What benchmark? Micro benchmark?
  • JF: Real world.
  • BN: How many is enough?
  • JF: At least 2.

Action item: JZ / BN to gather another similar integer benchmark.

  • Should we include integer Min/Max instructions as a part of the wasm SIMD specification?
  • DG: We want these because pattern matching is fragile, exact numbers are hard to get, these are useful to include just because pattern matching is fragile - encodes performance cliffs.
  • No consensus.
  • Explore narrowing, widening ops?

Action item: JZ / BN to come back with concrete proposal of what the narrowing/widening, min/max operations should look like.
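DG's point about fragile pattern matching can be illustrated with a small sketch. Without a dedicated min operation, a producer has to emit a compare followed by a select, and an engine has to recognize that exact pair to lower it to a single native instruction. The helper names below (`lanes_lt`, `bitselect`, `min_via_select`, `min_direct`) are hypothetical and only model lane-wise behavior on Python lists; they are not wasm opcodes or any agreed-on encoding.

```python
from typing import List

Lanes = List[int]

def lanes_lt(a: Lanes, b: Lanes) -> Lanes:
    """Lane-wise signed compare: all-ones mask (-1) where a < b, else 0."""
    return [-1 if x < y else 0 for x, y in zip(a, b)]

def bitselect(mask: Lanes, a: Lanes, b: Lanes) -> Lanes:
    """Take bits from a where the mask is set, and from b elsewhere."""
    return [(x & m) | (y & ~m) for m, x, y in zip(mask, a, b)]

def min_via_select(a: Lanes, b: Lanes) -> Lanes:
    """The two-step form an engine would need to pattern match."""
    return bitselect(lanes_lt(a, b), a, b)

def min_direct(a: Lanes, b: Lanes) -> Lanes:
    """What a dedicated lane-wise min instruction would compute."""
    return [min(x, y) for x, y in zip(a, b)]

a = [1, -5, 7, 0]
b = [2, -9, 7, -1]
assert min_via_select(a, b) == min_direct(a, b)
```

Both forms compute the same lanes. The concern raised in the discussion is that if the matcher misses the compare-and-select form (for example after other transformations reorder it), performance falls off a cliff, whereas a dedicated opcode states the intent directly.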

Wednesday

GC / managed data

Andreas Rossberg presenting.

  • Slides
  • AR: Cross-heap references (wasm->host heap) require some kind of object management; circular cross-heap references require (global) GC
  • Various: Some work on pure user-land GC with shadow stack in the context of compiling existing runtimes to wasm already done.
  • AR: In the space we're working, safety & security cannot be compromised (performance and expressivity can)
  • AR: It's a huge undertaking to build a good GC and providing GC will make wasm a more tractable target for languages
  • AR: Risk that combining GC and threads will negatively impact hosting platform (JS notably)
  • AR: Risk that needing to support specific languages well will result in substantial feature creep, finalizers/weakrefs obvious example
  • BN: GC tuning becomes a harder problem than now when we tune only for JS, if supporting other languages may feel pressure to do something different
  • AR: User-land GC w/ assist not an easy solution primarily due to security implication of stack walking, thread pool access, thread parking (but also technical issues)
  • LH: Any investigation of GC-less assists for how to handle the host object problem?
  • LW: Not aware of anything. Cycle collection in Firefox.
  • AR: Oilpan is a similar solution in Chrome, but far from easy.
  • Leaf (Dart): Interacting with the DOM really leads to cycles almost immediately
  • KM: How performant does the GC have to be to operate "reasonably well", perhaps relative to a native GC?
  • AR: Anything in the space will be a compromise along all axes and we'll probably not always agree on what's most important
  • KM: Interaction with linear memory - do we think people will use both linear and managed memory? Will this impact the performance of wasm when they use both at the same time?
  • AR: I'm guessing people will use one or the other
  • LW: Really seeing something else now, eg with Elm which could usefully make use of both
  • BT: Could easily imagine linear memory being used for i/o and other bulk data cases, with managed objects for the rest
  • BN: Modularity will probably lead toward both being used in the same program
  • XX (.NET): Some experience with this from CLR, where programs use both. Fair amount of complexity.
  • XX (Intel): Power consumption from GC on a device easily 25-30%, harder to control than power consumption of programs running on linear memory (?)
  • AS: The thread story for managed objects is very unclear
  • KM: Do we have at least a rough breakdown of categories of languages?
    • AR: I imagine at least three families: conventional OO (Java); statically typed functional; dynamically typed. And we need to apply an MVP approach to get off the ground.
  • XX (MS?): Is anything excluded as a goal?
  • AR: Magic interoperability with specific languages, eg universal object models or type systems. Interoperation among languages is something for the translators of those languages to resolve.
  • BN: What about JS in that context?
  • AR: Initially we don't want to provide direct access to JS objects inside Wasm, I expect.
  • AR: A simpler question is, how do we want to reflect wasm heap objects to the JS world? It's a large space and we want to start smaller.
  • Leaf: There's an opportunity to provide good profiling and inspecting API with a built-in GC, ie good tooling support.
  • YY: Eg, object maps
  • Leaf: Object maps would reflect information about the source language via wasm back to the tool
  • BN / BT: Might be useful to survey existing languages and implementations to make sure we don't paint ourselves into a corner (interior pointers, dynamic types)
  • AR: we want to avoid complex things like generic types but instead strive for a simple subtyping system with casts
  • NN: Defining new types at runtime might be useful
  • AR: Is this not analogous to jitting functions, just creating a new module at run-time?
  • KM: Variable-width objects aren’t covered by this proposal either. Probably important to understand what potential targets might need for good performance before we go into details
  • AR: There's always the escape hatch of linear memory
  • BT: I don't think that's quite true; eg for multithreaded GC you really must be able to park the threads somehow
  • AR: Agree, there may be some assists needed
  • AR: We should not stop work on other mechanism (eg generalized tables), it's not mutually exclusive
  • JF: there are reps in the room for other runtimes than JS, perhaps they want to speak up?
  • AS: substantial burden on the wasm runtime provider
  • EC: would really like to see this as an optional feature, now wasm is very appealing and simple and has a low barrier to entry, it's appealing to keep it that way
  • AS: there's a general discussion to be had about how features of this type could be made optional
  • MM: can we also make the flat memory optional? eg a componentization of wasm that also extends to memory
  • LW: without a memory section in the module then loads and stores are validation errors already

POLL

Do we (now) think we want to explore the addition of managed types (without ruling out future enhancements to support user space approaches)?

SF F N A SA
15 16 2 0 0

Dissent:

  • BB (Neutral): I guess I'm skeptical that we can do anything that would be useful for highly-tuned runtimes
  • AR: I agree Go would be a very tough customer
  • XX (.NET): We have a lot of complexity, interior pointers. But we'd like to see where this would go.

Continued:

  • BN: Should we have a general discussion about modularization of the language?
  • BT: Seems like a natural thing to have, there will be programs that never need managed data, why should everyone have to support the GC component?
  • BN: But it's a natural way to interact with the host environment, notably tools like emscripten could depend on it
  • LW: But there could be a portable mode for emscripten that uses tables, not managed data, for example
  • BN: It creates a risk of modes, which will tend to break
  • LW: I really do believe switches / modes for emscripten would work
  • BT: A core + extensions
  • AR: At least having a framework for this allows us to name and talk about feature sets, as well as requirements for feature sets
  • BN: Probably a higher bar for certain embeddings (eg web would require "everything")
  • AR: Likely
  • Various: Up to interest groups to form themselves to define the subsets / profiles they find useful, Arduino is mentioned

POLL

We should devise separate "subsets" to provide a way to talk about feature detection and compatibility.

SF F N A SA
8 9 11 2 0

Dissent:

  • BN (Against): The utility of what we're trying to build is interoperability
  • MH (Against): Unless there's good reason we should not make things optional

Coffee break

  • LW: Staging? Order dependencies?
  • JF: For MVP we had a set of features we knew would work well. Not so easy now. I'd like to see actual use cases (i.e. languages) running on the tech we propose before we start rolling anything out.
  • KM: Before we go into the proposal, should we figure out what tech we actually need to be successful to support the languages we want to support well? Take Java for example. Very hard to match native perf. What are the success criteria? One thing is "support", abstractly; we can "support" Java. But to do it well?
  • AR: Perhaps language subsets are useful, but unclear exactly what they tell us. Are there simpler OO languages we could try than Java?
  • WM: The proposed type system matches Oberon...
  • BT: Java is a statically typed language but requires many dynamic optimizations, this muddles the success criteria.
  • KM: We need to be very clear about what our criteria are, or there's a risk that users will be confused and disappointed.
  • LW: Perhaps a staged approach; "it will work in v1, it will be fast in v5"
  • XX: Thinking in terms of languages we run is wrong, we should think in terms of applications we run - how much of Java do we need to run app XYZ well?
  • JF: Too bad nobody from Unity in the room, they have some experience with C# in this context
  • BT: While people may compare Java-native to Java-on-wasm, they are likely to compare Java-on-JS to Java-on-wasm instead, a lower bar for wasm.
  • (Tech blip resulted in some discussion lossage)
  • (Discussion about whether CLR/JVM is "static" or "dynamic")

Presentation continues:

  • more slides
  • (Bikeshed about nullability)
  • AR: defining types on the JS side and allowing wasm to import them resolves problems with field naming and type identity (that we get from structural typing on the wasm side)
  • AR (in response to MM): some kind of sealing or branded/nominal typing can be added separately to achieve dynamically opaque types and encapsulation
  • WM: Clarify issues with nominal typing vs. structural typing
  • AR: Complications w/ inheritance model
  • (Discussion about how language-level checks are not directly provided)
  • XX (on video): Should consider whether one (nominal vs structural) can easily represent the other or vice versa
  • AR: Clearly structural can easily represent nominal, the other way around is hard
  • AR: Though the type equivalence algorithm for structural types is a little hairy
  • AR: we're not building an object system, things like vtables are a producer’s concern
    • BT: as a result something like InvokeInterface (from Java) will need a different implementation and will be less efficient (than in a Java native impl)
  • XX: complications in the interaction of structs and arrays
  • AR: yes, a later part of this proposal

Lunch

  • AR: I don’t have a lot of slides left. Maybe present the strawman document. Presents slide “Large Design Space Lurks”. Nesting like C requires inner references. Inner references a little annoying but easy when distinguished by type (fat pointers). Type recursion can be difficult to check performantly. I have implemented a prototype. All can be done but experience may tell us it may be better to restrict.
  • AR: Subtyping may be something we want to support. How rich should we make it? Immutability is tied into subtyping. Without it, no sound subtyping in most cases. We need the notion of immutable fields. Nullability - should we distinguish from the regular type? Then leads to problem of default value for ref types.
  • DW: A lot of things that seem natural for functional languages don’t hold for OO languages. Vtables in C++ immutable?
  • AR: Vtable in C++ is immutable, the reference to it from an object isn’t. Fits the model
  • AR: Universal ref type ‘anyref’ and checked downcasts are needed to express generics and other scenarios the type system isn’t expressive enough for. Such cases will necessarily exist. Interactions with import / export, threads and shareability, closures. Tagged types support for disjoint union types. A possible later extension, not to do now.
  • LK: Implications of design on performance, since Wasm is supposed to load and run fast.
  • MH: I believe that recursive structural equivalence checking is NP-complete. With TypeScript we have a depth limit, and when it is hit we decide that the types are not equivalent.
  • AR: problem in full generality is equivalent to checking equivalence of DFAs
  • AR: Worth thinking about restrictions to avoid expensive worst cases.
  • KM: Hitting the depth limit may be allowed to convert to anyref.
  • AR: Then different implementations may check differently, leading to undesirable differences in behavior.
  • SB: Can you expand on what the hard problems are?
  • AR: When you have arbitrary mutually recursive types, you can think of it as a graph. It is the same problem as checking that DFAs are equivalent. It’s like graph isomorphism, but modulo unfolding / minimization.
  • MM: Question - When I implemented something similar, it was simpler and less expressive but when we allowed them to be cyclic, we had a straightforward, fast unwinding algorithm that worked.
  • AR: With one type node and n-ary children, you can already build a general graph.
  • MM: If it is just the problem of infinite … this should be a solved problem, and can be implemented quickly.
  • MH: How to compare structural types?
  • AR: One standard approach: Minimize graphs, then compare the minimized graphs. Minimization is expensive.
  • MM: I think I can explain the algorithm I have in mind quickly. Not sure if now is the right time. Maybe offline.
  • AR: Yes, one remark: not just type equivalence, but subtyping, which is more costly. Corresponds to DFA inclusion
  • MM: I accept that the algorithm I had in mind does not cover this case.
  • DW: What don’t you like about inner references?
  • AR: If you ask GC people, I think they would not be too happy to implement…
  • BT: Fat pointers aren’t too hard, just two words. Otherwise you have raw pointers in the middle of objects.
  • MS: Either you put the burden on the GC or the guys who use inner refs.
  • DW: In the CLR, inner references can only appear on the stack, not in the heap.
  • AR: That simplifies the problem a lot. Fat pointers are probably strictly more powerful.
  • RK: In CLR, fat pointers can’t work because you can pass by reference, and the object could be on the stack or the heap.
  • AR: If you want to map a language with inner refs to this, you’d have to solve the harder more costly problem.
  • RK: Implies boxing everything.
  • AS: When you say that nesting requires inner references, is that because get_field works a particular way? If you have get_field that specifies that it needs an inner pointer?
  • AR: Just to support nesting, you might have “second class” inner references. What does get_field mean in a struct? Try to contain them so they can’t escape somehow. Fat pointers are just a way to do this.
  • AS: You could have get_field/set_field as a single operation with a full chain of fields.
  • BT: You could have a pointer to an integer field so it would escape.
  • AR: When you have arbitrary nestings of arrays and structs, implies a very fat instruction.
  • Leaf: Inner pointers vs. arbitrary nesting is orthogonal. Having arrays embedded in structs would provide that with limited functionality.
  • AR: But it doesn’t give the full functionality. Some people want that. Go is an example.
  • Leaf: Limited functionality vs. no functionality, I’d prefer limited.
  • AR: If design allows them, VM can decide how much to optimize.
  • XX: Your JIT can then take inner references and optimize them away if they’re local? I just want a reference to an array, so I need an inner reference for that?
  • LW: For nesting, can you embed variable length array in struct? Maybe at end.
  • AR: You can distinguish fixed sized arrays vs. variable sized arrays, leading to notion of flexible type. Can be generalized recursively for structs ending in flexible type.
  • LW: Perhaps allowing one special case (variable length array at end of struct) which is common is enough.
  • RK: People are assuming linear and managed memory will be disjoint. Say you want to implement memcpy. May need to copy between the two kinds of memory.
  • AR: One thing that is missing from slide: tagged integers as a reference type. When you have anyref, it could be a reference into the heap or a tagged integer that is an address of linear memory.
  • MS: So it would be up to the implementation to do a dynamic check to see which it is.
  • DW: For C# / .NET we would use tagged pointer. Not a complete solution, since you can change ref to real pointer in fixed blocks. Seems like a limitation for .NET but maybe not a good idea because of safety issue. Might not be possible using provided GC. May not want to go there.
  • AR: Yes, I have no idea how we could possibly allow that under the given safety constraints.
  • DW: You could do it with GC completely in user land. An interesting limitation for .NET. The concept of inner pointers is critical for implementing .NET. No reason not to have them be a different type.
  • AR: One subsumes the other, any regular reference can be converted to an inner reference.
  • DW: We’d need an interior ref to the middle of a struct.
  • AR: Where do we go from here?
  • LW: Are we allowing typed objects to be exposed to JS in MVP?
  • AR: You can always use an indirection instead of nesting, it just costs you performance. The JS API would have something that resembles what ES6-abandoned typed objects were. Hopefully something simpler than some of those proposals.
  • BN: Nulls are needed for .NET?
  • DW: Yes. Everything can be nullable. A requirement for us. Non-nullability is a subset. For us nullable refs are fundamental, we can represent nullable types on top of that. Might start with only nullable?
  • AR: If we do that, then some of the new instructions will have too general types later. Esp for locals, we might want them to be non-null.
  • LW: If we know that impls will make null checks free…
  • AR: Yes, then it becomes less important.
  • BN: We have contexts where that may not be possible. We may not be able to have signal handler control, say WebView on Android.
  • AR: So are there features on the list that we shouldn’t have, or ones we should add.
  • BN: Finalizers!
  • AR: Right, definitely a post-MVP feature. Need to survey and consolidate the design space carefully.
  • RK: Introspection / Reflection?
  • AR: Should be a language implementor concern.
  • RK: Without support, we would have to thunk everywhere around the struct / array types.
  • BT: Reflection could also be a part of the embedder API
  • AR: I too am scared about the length of the list.
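The exchange above about comparing recursive structural types can be made concrete with a small sketch. AR mentions minimizing graphs and then comparing them; the code below instead uses a different standard technique, a pairwise "assume equal while checking" traversal (a bisimulation check in the style of DFA equivalence testing). The type representation is an assumption made purely for illustration: each struct is a list of field strings, where a field is either a primitive name such as `"i32"` or a reference written `"ref:<Name>"`. It does not handle subtyping, mutability, or nullability, which AR notes make the real problem more costly.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding: struct name -> ordered list of field types.
TypeDefs = Dict[str, List[str]]

def equivalent(defs: TypeDefs, a: str, b: str) -> bool:
    """Return True if structs a and b are equal up to unfolding of recursion."""
    assumed: Set[Tuple[str, str]] = set()  # pairs already assumed equal
    work: List[Tuple[str, str]] = [(a, b)]
    while work:
        x, y = work.pop()
        if x == y or (x, y) in assumed:
            continue  # trivially equal, or already being checked (a cycle)
        assumed.add((x, y))
        fields_x, fields_y = defs[x], defs[y]
        if len(fields_x) != len(fields_y):
            return False
        for f, g in zip(fields_x, fields_y):
            f_is_ref, g_is_ref = f.startswith("ref:"), g.startswith("ref:")
            if f_is_ref != g_is_ref:
                return False
            if f_is_ref:
                work.append((f[4:], g[4:]))  # compare the referenced structs
            elif f != g:
                return False
    return True

DEFS: TypeDefs = {
    "List":   ["i32", "ref:List"],    # directly recursive list
    "ListA":  ["i32", "ref:ListB"],   # same list, unfolded once
    "ListB":  ["i32", "ref:ListA"],
    "Tree":   ["i32", "ref:Tree", "ref:Tree"],
    "FList":  ["f64", "ref:FList"],
}

assert equivalent(DEFS, "List", "ListA")
assert not equivalent(DEFS, "List", "Tree")
assert not equivalent(DEFS, "List", "FList")
```

Each pair of type names is visited at most once, so this sketch is quadratic in the number of definitions in the worst case. That is consistent with the suggestion in the discussion that restrictions may be worth considering to avoid expensive worst cases, especially once subtyping is added.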

Break

  • LW: We’ve been talking about three interrelated topics. What are the gating criteria for an MVP for GC support? Can we get implementers to be “guinea pigs”?
  • AR: Also, what are the features that should be in the MVP?
  • JF: One thing we can do is see if we have volunteer implementers here today? Call for Guinea Pigs? If we do that we should set a minimum bar for the feature. May have to do this outside of meeting.
  • LW: So first, what are gating criteria re validating the design?
  • BN: I’m hoping to put short term stopgaps on expanding tables on the agenda tomorrow. Will enough folks be here tomorrow?
  • JF: Also Sean will be presenting wrt Webpack.
  • MH Talking about TypeScript
  • MH: We would be interested in being guinea pigs to integrate between WebAssembly and JS. We would have a restricted subset of the TS language, WebAssembly within a module boundary. I’d love to start emitting TypeScript into the format of the proposal. Problems I see are tagged types. This is what’s off the top of my head.
  • LW: How can TypeScript use GC types if TypeScript doesn’t have a sound type system?
  • MH: TypeScript attempts to model the JS type system, so it is an intentionally looser type system. May be able to be stricter for WebAssembly. Some of the things we’re adding - stricter readability for types. TypeScript allows more downcasts than would be possible in Wasm. Restrict these.
  • XX: Are there threading concerns?
  • MH: I haven’t thought about it. Not an issue since TypeScript, like JS, is single threaded. Find a subset of the TypeScript / JS language that targets Wasm.
  • XX: Avoid wrapping host objects by keeping them in JS?
  • MH: You could have a shadow DOM implementation in WebAssembly. Adding a managed data system on top of what currently exists would be useful.
  • BP: Expectations about what the performance would be in Wasm vs. regular JIT.
  • MH: It’s unclear, we haven’t done any investigations. The plan we had is to implement a GC in linear memory, but it seems like that won’t really be possible. The compiler can verify that some objects can live on the stack, so it may be possible to make it work. Possibly better than what we have today, we’ll have to see.
  • DL: Do you have runtime requirements? String type support?
  • MH: I’ll have to think about it more, do you have something in mind?
  • DL: String runtime may be very large, and non-viable
  • LW: Maybe we should expose JS strings? They’re not great since they’re UTF-16 code units, but they’re built-in… :-)
  • JF: ICU.wasm
  • KM: How does this work? How can Wasm code interact with such a type?
  • BT: Do you plan to write up plans about extending tables for object references?
  • LW: Discuss tomorrow. Dovetails with that discussion.
  • MH: That’s it.

DW presenting for .NET

  • DW: We’re most interested in whether we can represent what you can do with function objects, re value, ref, out params. And our BCL, which is huge: is there a use case that doesn’t imply loading that for any usage? Whether we’ll have the answer by next meeting is unclear. We are certainly interested.
  • LW: Clarifying that it’s not needed for the next meeting, just over the arc of implementation.
  • DW: Can we take some meaningful portion of say ASP.NET and make a demo that does something interesting. But this is very heavyweight for a simple web page.
  • KM: How dense is the .NET framework standard library? If you don’t use it all, how much can you strip as unreachable?
  • DW: You can remove a lot if you take out reflection. Unfortunately, lots of stuff uses it. It’s hard to strip things down.
  • BB: How about Silverlight?
  • DW: Silverlight was very heavyweight, there could be a big WebAssembly blob, which is equivalent to Silverlight, which could be cached by browser. Silverlight apps were very small.
  • LW: Currently we just have indexeddb for sharing WebAssembly modules, implemented in FF.
  • KM: Many clients want to control everything, leading to many copies of similar large runtimes.
  • DW: This is why I’m not super enthusiastic about this.
  • LW: Couldn’t there be a CDN that caches all of these, or perhaps foreign fetch? That might be a reliable way for us to cache these things.
  • RK: Parallels with the mobile app ecosystem.
  • DL: CLR has a safe subset of IL. Would that restrict apps too much? Would you want to verify that an app only uses the safe subset?
  • DW: You could verify ahead of time, at compile time. It is designed to protect against code escaping from the sandbox. It depends on what the threat model is. We’re running in a sandbox, in those cases, in some examples, that is enough. How much value do you get out of that sandbox, vs another… it depends.
  • MM: How do you make use of fine grained protection? There’s value to this. Multiplicative benefit in terms of robustness as well as security.
  • DW: Do we want to support a verifiable subset of app code? When they write C# and don’t use the unsafe switch, then you effectively have a verifier, as long as you trust the sources. You could run a verifier at AOT, you could apply as many as you require.
  • MM: An example of fine grained protection - Google Earth. Takes third party plugins for geospatial calculations, using object to object protection for safety.
  • DW: We’re requiring safe subset verifiability with compiler switch.
  • RK: At least for mono, there is a big chunk of unsafe code which is native interop. We view it as a C# inside a larger C++ codebase.
  • DW: There are multiple .NET teams at Microsoft with varying requirements. Our team is mostly concerned with making apps work. Maybe Rodrigo can talk about Mono perspective.
  • LK: Would you be interested in supporting AppDomains in Wasm?
  • DW: We’ve been trying to step away from AppDomains. For MVP, no.
  • XX: Why are you stepping away from AppDomains?
  • DW: It’s a process model -- an extra process model that makes things more complicated. It’s unclear how much it helps; it hurts because they have to deal with the consequences of the AppDomain. We hope they need it less. Legacy still needs it.

RK Presenting for Mono.

  • Right now at Mono we’re actively working on a port. Not using AOT compiler. Mono doesn’t work well without runtime compile. We’re most interested in interoperating rather than bringing a new class of app to browser.
  • JF: What in the current model doesn’t work for you?
  • RK: (?) For us, we probably bake the basic types.
  • JF: You don’t know the types in advance?
  • RK: We’d bake the System.Object type, but not other types which we’d figure out at runtime. This is important for importing our existing code. We don’t bake them in ahead of time.
  • JF: You can’t just do a prepass to figure out the types and then bake it?
  • RK: Generics and value types mean layout is unknown until runtime. Lots of exotic cases, so shift work to runtime to simplify.
  • JF: One of the assumptions of WebAssembly, is we’re OK pushing the costs to producers, so the work is not on the client.
  • RK: Can’t be complete before runtime. Concern with increasing size of download. Generics specialize types, but not code. So not as much of a compile burden. But need to bake the type at runtime, so essentially creating something equivalent to vtable but perhaps slower and slightly larger.
  • LW: Are your experiments using linear memory and your own GC? (RK: yes) And would you use GC types if they were available?
  • RK: We’re emphasizing smaller download vs. execution speed.

LP presenting for Dart.

  • LP: we don’t have anything concrete now. Google uses Dart for some large web apps. We’re concerned with loading performance, we have compile issues. Wasm is promising to improve startup speed. One of our clients is very interested in DOM / JS interop. Good movement at that boundary is exciting in Wasm. Threads are not an issue for us. For mobile, Wasm may allow sharing of code. Our allocation pattern is to generate a lot of garbage quickly, so this would be important for us.
  • KM: It’s not uncommon to do that in JS either.
  • LP: Lots of people interested in exploring Wasm.
  • LW: Are there any interesting language features that pose challenges?
  • LP: Dart is changing in ways that make Wasm targeting easier. However, you can fall into a dynamic style that would be challenging. We would need to reify type information, since we have generics. We do type introspection, so there might be redundancy with what the runtime would be doing.
  • It’s not that unusual of a language, it’s class based. It has async-await. There’s nothing that unusual. Integer performance is… we’d love to move away from having to rely on 53-bit integers. I don’t see anything too hard yet.
  • LW: Async-await is a question. What would you do there? Or perhaps enough web integration to do promises.
  • KM: Your function can just have a switch at the top… it would have to be implemented...
  • LP: We have translations to JS to implement this in various ways. So we could handle this.
  • LW: At some times, JS may be a faster compilation target if objects are very dynamic.
  • LP: No, can’t switch to very dynamic object in general. You have to start from AnyObject. So we know what is going to be dynamic ahead of time. We do have to generate interfaces for dispatch but we already have to deal with that so no Wasm concern. Dart type system is sound now.
  • LW: Any other languages?
  • JF: Sean will be back tomorrow.
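KM's remark that an async function "can just have a switch at the top" refers to lowering async-await into a resumable state machine. The sketch below hand-lowers a tiny function under assumed names: `FetchAndAdd`, `resume`, `run`, and the `"get_a"` / `"get_b"` labels are all hypothetical, and the driver loop stands in for whatever host or promise integration would actually resume the function.

```python
from typing import Dict, Optional, Tuple

class FetchAndAdd:
    """Hand-lowered form of:
        async def f(): a = await get_a(); b = await get_b(); return a + b
    Locals that live across an await become fields of the machine."""

    def __init__(self) -> None:
        self.state = 0
        self.a = 0

    def resume(self, value: Optional[int] = None) -> Tuple[str, object]:
        # The "switch at the top": dispatch on the saved resumption point.
        if self.state == 0:
            self.state = 1
            return ("await", "get_a")       # suspend until get_a completes
        if self.state == 1:
            self.a = value                  # result of the first await
            self.state = 2
            return ("await", "get_b")       # suspend until get_b completes
        if self.state == 2:
            self.state = 3
            return ("done", self.a + value) # result of the second await
        raise RuntimeError("resumed after completion")

def run(machine: FetchAndAdd, results: Dict[str, int]) -> object:
    """Toy driver: answers each await from a table instead of real I/O."""
    step = machine.resume()
    while step[0] == "await":
        step = machine.resume(results[step[1]])
    return step[1]

assert run(FetchAndAdd(), {"get_a": 2, "get_b": 40}) == 42
```

This is the same shape of translation LP describes already having for JS, so a producer could in principle emit it for Wasm without new engine features, at the cost of code size and an explicit frame object per call.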

Discussion of GC MVP.

  • LW: Next, discussion of MVP release criteria. Two axes - how do we validate this design? Are there two different languages that use this that aren’t toys?
  • BN: Three categories of languages.
  • AR: OO, FP, something dynamic.
  • BN: How do we decide what’s a reasonable subset?
  • LW: At least one language in each category that is to the same level as Emscripten w/ C++?
  • AR: I’ve talked to OCaml folks and they’re interested in using Wasm as sandboxing technology for Docker. That may be a real language.
  • LW: It would be good if it were something real, that people could start using and writing web apps with.
  • AS: I like earlier suggestion that we look at applications rather than just languages. So perhaps a VDom example, or something like a game.
  • LW: For MVP, we didn’t just have Emscripten, we had Unity ….
  • AR: These are different axes, we want to ensure that we can map different languages, but also that we can map different applications.
  • KM: I’m not worried about GC workloads. That exists with JavaScript already. I see apps verifying that the object model doesn’t lead to performance burden. I’m not overly concerned since browsers have robust GC.
  • AR: I agree with that. GC probably not an issue, but should keep an eye on the cost of casts and associated runtime checks. Similar situation as with call_indirect earlier.
  • KM: Another interesting thing: if you can compile a language to Wasm, the implementers will know how common things are and may have intuition that approach is feasible before getting apps. E.g. ‘delete’ performance in JS isn’t concerning since it’s not that widely used - this is well known to browser implementers.
  • AR: Basically what you’re saying is that feedback from the language implementers is useful.
  • KM: Feedback from language implementers is more valuable than having apps.
  • LH: One aspect of that -- whether using the GC system is significantly better than just using linear memory.
  • AR: Ideal if we had candidates who try both implementation approaches.
  • LW: Maybe mono already is doing this?
  • RK: Yeah.
  • WM: We are looking for people to try this out. Is there any implementation prototype?
  • AR: I’ve started reference implementation in interpreter.
  • LW: We could take the reference interpreter, compile it to JS. Use this to test.
  • AR: Interpreter uses BuckleScript which doesn’t support OCaml’s Bigarray type yet. So linear memory would be problematic for now.
  • LW: Doing this in JS, we could have a polyfill that would at least be useful for validation of semantics.
  • AR: It might be possible to hack something up there for people to test.
  • SB: Might be helpful for gauging the ergonomics of the type system. Less useful for evaluating difference between linear memory and managed objects.
  • KM: It may just be better to compile to JS.
  • AR: Maybe a first step before building something more serious.
  • SB: There is benefit to assessing performance early on.
  • AR: That requires doing design early on.
  • SB: Performance is so important that we should do that earlier.
  • LW: It takes a lot of up-front investment to take these steps. We need to give people confidence that they can do it to test, and it won’t change under their feet. The polyfill might help with this.
  • BT: We might need to validate the design across multiple implementations.
  • LW: Multiple languages, multiple applications, multiple implementations… 3 axes. Does this sound about right?
  • SB: What would be the plan for mapping other languages on this proposal? How do we do this? Do we evolve this with other language implementers?
  • AR: We have the repository for the proposal where we can iterate on the design and reference interpreter. Just the normal process.
  • SB: What you said earlier about mapping OCaml onto this. It would be useful to judge the ergonomics.
  • BN: It would be nice if it were OCaml or someone else burning to get stuff on the web.
  • LW: If we can build this polyfill, then a few people will be eager to experiment.
  • AR: Also the Facebook guys. They are building applications in OCaml/Reason today. There’s a chance we can involve them as well.
  • LW: So next step is creating a polyfill.
  • AR: What are the MVP features we want to focus on?
  • LW: Andreas’ list presented earlier. All of them except nesting and internal pointers.
  • MH: Do we need multi-threading?
  • LW: Something to talk about. As a high level sequence we won’t deal with threading first. So for MVP, we just won’t have shared references.
  • BT: So shared globals can’t be reference types.
  • MM: I would be in favor of that long term -- there are some languages that cannot be embedded and use GC. In JS I liked that it was constrained to pure data.
  • SB: Question to Mark: What’s wrong with the general heap and why should that restrict Wasm in the future?
  • MM: Let’s not get into it. :-)
  • LW: Are you worried that in no possible future are JS objects shared across threads but Wasm allows this?
  • MM: The problem is that, conventional shared-memory multi-threading has an inherent data race problem. We’ve confined to bits. Once you have in heap, what locking mechanisms? Calls through methods, unit of locking? Tony Hoare monitor thing? Infinite design mess. End of the day, it is a corrupt and unrescuable way to do concurrency anyway.
  • Laughter
  • LW: Can we move this forward to put this in the same bucket as workers.
  • MM: One particular nightmare problem - if in a data race you fetch or store a shared pointer, how to determine which pointer you fetched. JS avoided this. Forces you to do another difficult round of engineering which you could avoid.
  • AR: We should realize that for the MVP, if we exclude threads we can’t implement full Java. I’m fine with that, but pointing it out.
  • LH: I want to point out we can’t ship an MVP with GC without a plan for concurrency. Like we did for the original Wasm MVP.
  • LW: So we have to have a plan for this. But agreement that we don’t ship this in GC MVP.
  • DW: We can’t do .NET without nested structs and interior references.
  • AR: Just use indirection.
  • DW: Question is how to reference nearby fields without interior ref. Discussion with Andreas, pointing out a possible workaround.
  • AR: So may be a way to avoid nesting in MVP.
  • LW: Closures is another question mark?
  • AR: Certainly not in MVP. So everything in the list except nesting, closures, tagged types.
  • LH, AR: Discussion about tagged types.
  • AR: One thing we talked about -- supporting named access fields. If you want to be usable, you need to have a somewhat richer notion of type imports.
  • LW: Problem of creating a DOM node in Wasm.
  • AR: Exclusions from MVP: threading support, nesting, dynamically abstract types, closures, and tagged types. We have to include tagged integer types.
  • DW: I don’t think I could get a performance measurement without it. There are so many structures on the stack that code which is expected to be allocation-free would now be allocating for many fields.
  • AR: How much nesting do you need?
  • DW: Theoretically you can do everything. Closest example -- generics in .NET are reified. We can’t predict every instantiation, so we have ability to generate generic code that boxes all the locals. That runs about 10x slower.
  • AR: How are locals relevant to nesting?
  • DW: So, the locals are related to aggregate types. Because of the way they’re passed around, we would probably have to allocate them (for closures).
  • AR: Agrees, this can turn fast stack allocation to slow heap allocation.
  • DW: For MVP, it would be useful just to see if it works.
  • BN: To be clear -- finalizers are out, weak references are out.
  • AR: Definitely; a rathole of questions, I didn’t even list that.
  • LW: Any more comments on release requirements for MVP?
  • BT: What about timeframe?
  • LW: Personally, I think we’ll be in the high level design space for at least a year, another year to implement something so at least two years.
  • BT: I wasn’t expecting much movement until the end of the year.
  • JF: Maybe a more public timeline. Apple doesn’t give timeline but the CG can give something for “guinea pigs”.
  • LW: posting something publicly, so people have realistic expectations is useful.
  • BN: We’ll have to renew our charter before this timeframe.
  • LW: At the very beginning of wasm we had the same level of specificity.
  • BN: Something in the DOM / JS space will certainly be needed.
  • BS: We kind of threw out stack-walking approaches. There’s a lot of complexity here - are we still not interested now that we see.
  • LW: A generally useful feature - stack walking - but it’s a question of priorities. It is also a hard problem.
  • BT: How much of that lives in the embedder?
  • LW: Depends on having a use case. If GC isn’t a use case, then reduces the priority. Unity should be able to use shadow stack in user land.
  • AR: I know of no successful design for this.
  • BN: The other approach is provide what the hardware provides. Not a stack walker but ability to examine memory.
  • BT: I think stack walking is much closer to debugging than GC.
  • BN: As for timelines, the sooner we can find compelling use cases the faster this can happen.
  • AR: When you have tables, you need some amount of GC. How do threads complicate this?
  • LW: Yeah, if we want to share tables… if you want to keep your JS GC independent. It’s nuanced and complicated, we shouldn’t need it until we get to pure wasm threads.

End of GC discussion, discussion about topics for tomorrow.

  • JF: We have a meeting at TPAC which will be just a chance to meet and discuss Wasm. In November, Burlingame. Brad will run this, so if folks have ideas. It will be like a Fosdem.
  • AD: It’s up to the individual groups. CSS runs actual work... it’s up to the chair.
  • JF: I won’t be there because of another commitment. We’ve been asked to follow the TPAC format rather than doing our own thing. We don’t have a venue for a CG meeting which we should have around Oct Nov. Any companies willing to host a meeting before the end of the year? We need at least one more iteration to finish up threads. We are still going to have biweekly calls. In person meetings help move things along.
  • BN: For tomorrow, do we have the right folks here to discuss ES6 Modules?
  • JF: I’d like to discuss but couldn’t get Domenic to volunteer for this. Background: modules are a way to integrate stuff dynamically. Wasm modules could fit into this. We need a champion.
  • LW: What about script-type=module?
  • Discussion of Modules and Wasm Module concept.
  • BN: Another topic - any interest in going beyond 32 bit address spaces?
  • JF: Right now we need to figure out how to work in 32 bit. Some very large apps that should be smaller.
  • BS: Memcpy?
  • Discussion: virtual memory as a topic tomorrow?

Thursday

Webpack in WebAssembly

Sean Larkin presenting.

  • Slides
  • Overview of webpack
  • KM: Will webpack have toolchain integrated to compile sources ?
  • SL: Expect to abstract the toolchain inside a loader. The loader will contain everything it needs to compile the correct type of sources to WebAssembly.
  • BN: How will you do the packaging?
  • SL: Will package the loader as an npm module which can contain sources to be compiled once installed.
  • LW: Compile the toolchain to WebAssembly and use it to compile the modules.
  • JF: Emscripten is targeting cpp for the web and cpp developers.
  • SB: How will a javascript developer use cpp?
  • SL: The intent is to take the cpp sources as input and pass in a loader to compile it down to WebAssembly and pass the result to javascript.
  • SB: The current solution is Emscripten, but it is not lightweight enough to achieve this goal at this time.
  • BN: Are you scoping to single cpp file?
  • SL: We would like to have incremental compilation for single file.
  • KM: Right now it is much more expensive to call across WebAssembly modules. What would be the “calling convention” for calls between different modules coming from different source types?
  • SL: There will need to be a javascript layer between modules to determine how to call the different modules.
  • KM: If I have very simple sources with very few functions, all the different modules would have to pay the cost for calling in and out.
  • LW: What we’d like is to bundle all the sources from the same type, cpp for instance, into a single WebAssembly module.
  • BN: The user could describe how they want to bundle the webassembly module.
  • SL: What we really want is for javascript developers to use a webassembly module like any javascript module.
  • KM: Suggestion: every module could initially use a layer to call different modules, and there could be a compilation step that tries to strip the layer and bundle the modules together to avoid the extra cost of cross-module calls.
  • SL: In webpack we have a way to split bundles to asynchronously load different parts of the bundle and this could be applied to WebAssembly modules to split/join different parts.
  • SL: Issues with WebAssembly labels on the webpack repo.
  • End of Webpack in WebAssembly.

Host bindings

LW Presenting Host Bindings.

  • slides
  • MM: Is the overhead you’re worried about the overhead of collection itself?
  • LW: Currently every thread would have its own GC.
  • We could recognize native functions being passed as imports and simply use them directly instead of calling the javascript version.
  • With GC, we could recognize a specific type that represents a DOM node and check if an import is a DOM API and call it directly.
  • Keep references of objects from the host and keep them in a table.
  • KM: With the thread proposal, would table be shared?
  • LW: No.
  • BN: The JS binding section has to define all those new types which are like function types.
  • XX: This would require multiple tables with different types ?
  • LW : Yes.
  • MM: How would you pass other host types in an FFI?
  • LW: We would need a type to represent a host object and not just primitive types.
  • MM: The GC needs to be aware of these opaque references in WebAssembly.
  • KM: The table already gives that functionality as it needs to keep functions.
  • XX: How do you free objects in the table ?
  • LW: Yeah, you would need a method to null an element in the table in order to release it.
  • BT: How would you do feature detection?
  • LW: This could be polyfilled with javascript functions
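
The table-of-references idea above (keep host objects in a table, hand Wasm an index, and null an element to release it) can be modelled in a few lines. This is only a sketch under assumptions: `HostRefTable` and its method names are hypothetical and not part of any proposal, and a Python list stands in for the Wasm table.

```python
from typing import Any, List, Optional


class HostRefTable:
    """Hypothetical model of a table holding opaque host references."""

    def __init__(self) -> None:
        self._slots: List[Optional[Any]] = []
        self._free: List[int] = []  # indices of nulled slots available for reuse

    def add(self, obj: Any) -> int:
        """Store a host object and return the integer index Wasm code would hold."""
        if obj is None:
            raise ValueError("None is reserved to mean an empty slot")
        if self._free:
            index = self._free.pop()
            self._slots[index] = obj
        else:
            index = len(self._slots)
            self._slots.append(obj)
        return index

    def get(self, index: int) -> Any:
        """Look up a host object; an empty or out-of-range slot is an error."""
        if not 0 <= index < len(self._slots) or self._slots[index] is None:
            raise LookupError(f"no host object at index {index}")
        return self._slots[index]

    def release(self, index: int) -> None:
        """Null the slot so the table no longer keeps the object alive."""
        self.get(index)  # raises if the slot is already empty
        self._slots[index] = None
        self._free.append(index)
```

In this model the table is the only thing keeping a host object reachable on behalf of Wasm code, which is why an explicit release step is needed, as LW notes.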

End of slides.

  • BN: I like doing this as a section. It gives us a way to test and iterate on it.
  • KM: We need to make sure we have a good way to feature detect it.
  • BN: What we really need is multiple table and use types.
  • SF: How much do we want this proposal to move forward with/without a web perspective?
  • LH: I think we should try to make it work for JS and then see how we can support more hosts.
  • SF: What I am saying is, we should consciously decide whether we are going to do JS first and then do the others later.
  • SB: How would you feature detect this ?
  • LW: We could compile little things and branch depending on the result.
  • JF: You can feature detect with new WebAssembly.Table({ type: "notanyfunc" … });

POLL

We should develop the "Language Bindings section" version of this proposal more.

SF F N A SA
10 10 0 0 0
  • LW: The tools would have 2 modes: emit the javascript-implemented version or use the new section.
  • SB: Why implement a javascript version when we can already do this?
  • KM: This could be seen as an ergonomic prototype.
  • LW: This can also be done to check how the tooling could utilize the feature.
  • SF: This is from a user perspective: we want to see how a user would want to use this, then as WebAssembly implementers we could remove the javascript bloat to make this work efficiently. It gives us the opportunity to determine if users want this feature.

POLL

We should develop a prototype JS implementation and focus on making that a useable transition strategy.

SF F N A SA
7 13 1 0 0

POLL

We should focus on JS use cases for now, and assess how general the solution is after we have fully explored JS.

SF F N A SA
8 11 2 0 0
  • SF: I think going that way is fine.
  • LW: For non-JS scenario, we could create a generic bindings section with opaque references.

End of host bindings presentation.

User engagement

Andreas Rossberg presenting.

  • AR: Where do we want to direct people to ask questions about WebAssembly? Right now, we point them to github and this is not appropriate: having to open issues raises barrier, causes admin overhead, and will just spam the issue tracker when community grows.
  • LW: After the question has been answered, they don’t always close their issues themselves.
  • AR: We should have a dedicated users mailing list.
  • SF: I would like to direct people to stack overflow because it is easier to search and find similar questions.
  • I don’t want to direct people to public CG WebAssembly because every time people start asking “irrelevant questions”, I see people leave because of this.
  • I think that if we want people to do Q&A we should have something specific for that.
  • BN: Should we have a different notification list?
  • JF: I will look into restricting the public WebAssembly list to announcements only, having another channel for Q&A, and directing people accordingly.
  • AR: All posts on announcement list could also be forwarded to users list.

Action item: JF to create a moderated announcement list for all CG members, and a users list which users opt into.

Administration of GitHub

  • BN: Can we broaden our admins to have more people be responsive to questions.
  • JF: I think we should have a list of admins so all admins are alerted of questions.
  • JF: W3C has guidelines about github contributions, after a certain amount of contributions, you get additional privileges.

Action items

  • When a repo for a proposal is created add the champion as admin
  • Keep limited admins, folks asking for new repos should route through other companies
  • JF / BN: Cleanup existing admins on projects

Meeting format and dates for Working Group

  • BN: On the last steps in order to create the Working Group
  • overview of the WG process
  • Discussion: How much should be remote?
  • Discussion: How frequently should we meet?

Virtual Memory Support

Ben Titzer presenting.

  • BT: No slides, but we’ve often discussed being able to unmap the zero page. Useful for trapping on null pointers, for JVM or C++. Or we might want to implement mmap for non-Web embedders.
  • JF: madvise?
  • BT: madvise: I don’t need these pages at all anymore, so the implementation will tell the OS to unmap. Could be useful for GC.
  • Memcpy, memmove: doesn’t require virtual memory, but could be possible to perform better w/ virtual memory.
  • WM: Sounds like a good idea but some embedded systems might not have the hardware to support it.
  • LW: We should bucket since some are easy like memcpy, but virtual memory stuff may be more difficult. Memcpy is pretty hot in profiles, so we may want to do sooner. Madvise dont_need could be used immediately as well for allocators.
  • BN: Interaction with ArrayBuffers.
  • MH, KM: I’ve seen memcpy also very high in performance profiling. Implementations sometimes don’t do much to optimize beyond a little unrolling.
  • SB: Question about profiling memcpy. Is it system function or library?
  • BS: Relevant issue
  • Relevant future feature
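
As a reference point for the memcpy/memmove/memset discussion above, here is a minimal model of what native bulk operations over linear memory might look like. It is a sketch, not a proposed design: the names `memory_move` and `memory_fill`, the `Trap` exception, and the choice to bounds-check the whole range before writing anything are all assumptions made for illustration.

```python
class Trap(Exception):
    """Stands in for a Wasm trap on an out-of-bounds access."""


def _check(memory: bytearray, addr: int, length: int) -> None:
    # Validate the whole range up front so a failing call writes nothing.
    if addr < 0 or length < 0 or addr + length > len(memory):
        raise Trap(f"out of bounds: addr={addr} len={length}")


def memory_move(memory: bytearray, dst: int, src: int, length: int) -> None:
    """Copy length bytes from src to dst, tolerating overlap like memmove."""
    _check(memory, src, length)
    _check(memory, dst, length)
    # The right-hand slice is a fresh copy, so overlapping ranges are safe.
    memory[dst:dst + length] = memory[src:src + length]


def memory_fill(memory: bytearray, dst: int, value: int, length: int) -> None:
    """Set length bytes starting at dst to the low 8 bits of value, like memset."""
    _check(memory, dst, length)
    memory[dst:dst + length] = bytes([value & 0xFF]) * length
```

An engine could implement the same contract with a single range check followed by an optimized native copy, which is the potential win over a byte-by-byte loop compiled from user code.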

POLL

Should we explore native support for memmove/memset?

SF F N A SA
13 2 2 0 0

Dissent:

  • JF: Seems fine, but would like to see a compelling case (believes there is one, but wants some proof that this makes a difference).
  • SB: What are some of the use cases?

Discussion:

  • BT: Read-only memory.
  • KM: 4G memory hack… this may be difficult on OS without signals.
  • BT: May need an inline bit table check.
  • WM: This will have ASAN-level overhead on non-MM ISA.
  • SB: ASAN is way more expensive than this.
  • WM: May be significant cost on systems without virtual memory.
  • KM: JSC is just a framework, application owns the signal handlers. Not nice to hijack from application.
  • BN: Can you give the application the option to opt-in?
  • KM: Current system will fall back to bounds check if you can’t opt in for privileged apps.
  • We don’t want things that aren’t portable, some environments can’t do this.
  • BN: Can we provide some of this behavior with multiple memories.
  • KM: Multiple memories have other problems, can’t optimize the same way we can currently.
  • SB: Question: if user implements it via bit map, can we do a better job than them?
  • BT: I think so, implementer can skip one of the bounds checks.
  • SB: What does Emscripten do currently?
  • All: it skips memory, but writing/reading is legal.
  • LW: Is there a way to separate zero-page check from more general case?
  • BT: Let’s look at result of poll.
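
The inline bit table check mentioned by BT can be illustrated with a toy model. Everything here is an assumption for illustration: the class name `ProtectedMemory`, one bit per 64 KiB page, and single-byte accesses only (an access spanning two pages would need both pages checked).

```python
PAGE_SIZE = 64 * 1024  # size of a Wasm linear-memory page


class Trap(Exception):
    """Stands in for a Wasm trap."""


class ProtectedMemory:
    """Linear memory with per-page 'mapped' and 'writable' bits kept in software."""

    def __init__(self, pages: int) -> None:
        self.data = bytearray(pages * PAGE_SIZE)
        self.mapped = (1 << pages) - 1    # bit i set: page i may be read
        self.writable = (1 << pages) - 1  # bit i set: page i may be written

    def unmap(self, page: int) -> None:
        """Make every access to the page trap (for example, the zero page)."""
        self.mapped &= ~(1 << page)
        self.writable &= ~(1 << page)

    def make_read_only(self, page: int) -> None:
        self.writable &= ~(1 << page)

    def _page_of(self, addr: int) -> int:
        if not 0 <= addr < len(self.data):
            raise Trap("out of bounds")
        return addr // PAGE_SIZE

    def load8(self, addr: int) -> int:
        if not (self.mapped >> self._page_of(addr)) & 1:
            raise Trap("page not mapped")
        return self.data[addr]

    def store8(self, addr: int, value: int) -> None:
        if not (self.writable >> self._page_of(addr)) & 1:
            raise Trap("page not writable")
        self.data[addr] = value & 0xFF
```

The extra shift and mask on every access is the software cost being debated above; with hardware virtual memory the same protection comes from the page tables instead.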

POLL

Should we explore adding virtual memory operations to WASM (madvise, zero page protection, read-only pages)?

SF F N A SA
4 11 2 0 0
  • MF: shakes his head
  • LL: doesn't quite see the need for this yet

Discussion about mmap:

  • MH: Windows doesn’t have support for mapping memory into already reserved memory.
  • LW: Current discussion about mapping into already shared memory -- some new memory, so it requires multiple memories.
  • JF: Older windows mapping/unmapping not really possible right now, probability of snooping attacks
  • LW: Could have fallback to copy if read-only or COW. Could be a nice way to get file contents into memory.
  • WM: If you’re on a platform that doesn’t have MM hardware, you may be better with segment oriented semantics, via multiple memories, rather than mapping memory ranges.
  • BN: Danger is people assume it’s backed by virtual memory
  • LW: It would be backed by virtual memory, but they’d assume it’s cheap.
  • KM: Multiple memories, will not be performant because JSC pins base address of memory, multiple memories mean you have to find the base of the buffer.
  • WM: I’m suggesting if you have multiple regions of memory treated differently, bounds checked separately -- you’ll have to fetch that info and cache it somewhere -- in registers. More explicit, client more responsible to notifying that it’s touching this area vs. that area.
  • BT: Higher level question about this: how much do we care about platforms that don’t have virtual memory.
  • WM: How much do we care about embedded platforms? It could be running with RTOS, HW MM support even if present may not be visible.
  • BN: tethered to the web embedding?
  • BT: Potentially gate based on platform like SIMD, gate on capabilities of platforms.
  • WM: Seems reasonable. But then these things need to be decoupled. Trap on null, may be more easily simulated in software rather than fancy memory sharing.
  • BN: If you’re not overmapping, you might get better performance if you don’t assume you can allocate large chunks.
  • LW: That’s the concern I’ve heard -- people assume it’s O(1)...
  • BT: 3 levels of cost: zero page, bit table check, or full software page table.
  • KM: Each platform has their own costs. Differentiating between platforms that have virtual memory vs. the ones that don’t and costs associated with them.
  • BN: We could have more constrained behavior than POSIX.
  • KM: How will this work with JS?
  • BN: One idea -- you could have a region visible in the view ...
  • KM: What if it’s in the middle? You can have an array of these?
  • BN: The region that you get an ArrayBuffer view of does not have the zero page.
  • WM: I’d like to understand better what the issues with multiple memories are, it would be a lot easier to say map the file to this memory.
  • JF: It depends whether memories are baked into the code or not.
  • KM: If it’s a dynamic memory, you have to pick at runtime.
  • MH: In all the multiple memories I’ve seen, it has static indexes.
  • JF: If it’s a static number, you can’t just give a pointer to the memory you just loaded.
  • BN: There are two issues: first, C++ will not express this elegantly; secondly, you’d want these allocated dynamically.
  • BT: You can always do that at the user level using a switch.
  • JF: Practically speaking you wouldn’t do that.
  • BT: yeah.
  • KM: Wasm … our implementation optimizes on the fact that there is one memory. There’s exactly one base. We use base register for common case. Your wasm load is single load instruction. As soon as you have multiple memories...
  • MH: You can do that without having multiple memories.
  • KM: Makes the register allocator slower, more live things. Haven’t measured.
  • JF: We should discuss multiple memories separately. Also ignore sharing to JS for now.
  • BN: But if you do this you’re cut off from the rest of the web platform.
  • Discussion about design space for this.
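
BT's remark about selecting a memory at the user level with a switch can be sketched as follows. This is a hypothetical model: two fixed memories stand in for statically indexed Wasm memories, and a "fat pointer" of `(memory index, address)` is an assumed encoding that the notes do not specify.

```python
from typing import List, Tuple

# Two memories with static indexes, as in the designs MH describes.
MEMORIES: List[bytearray] = [bytearray(16), bytearray(16)]


def _load8(memory: bytearray, addr: int) -> int:
    if not 0 <= addr < len(memory):
        raise IndexError(f"address {addr} out of bounds")
    return memory[addr]


def load8_mem0(addr: int) -> int:
    """A load whose memory index (0) is fixed in the code."""
    return _load8(MEMORIES[0], addr)


def load8_mem1(addr: int) -> int:
    """A load whose memory index (1) is fixed in the code."""
    return _load8(MEMORIES[1], addr)


def load8_dynamic(ptr: Tuple[int, int]) -> int:
    """User-level switch: pick the statically indexed load at runtime."""
    mem_index, addr = ptr
    if mem_index == 0:
        return load8_mem0(addr)
    if mem_index == 1:
        return load8_mem1(addr)
    raise ValueError(f"unknown memory index {mem_index}")
```

Every dynamic access pays for the branch and needs a wider pointer, which is consistent with JF's point that in practice you would not do this.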

Poll

Should we explore adding (page-level) shared memory and file mapping capabilities to WASM?

SF F N A SA
1 0 10 2 2

Dissent:

  • Microsoft thinks this basically requires a kernel change.
  • MH: complexity with splitting mappings, kernel team would not be happy probably.
  • LW, KM: Could make this work by limiting it to just one mapping.
  • BB: We did something similar for Pepper, the corner cases when reading/writing from memories did not behave the same way making it unusable.
  • WM: can’t say no to exploring, but it seems like a rathole. Do people need this?
  • LW: MMIO is highest end performance, we’re way below that. How can we efficiently get a stream into wasm memory? That would be a huge improvement, maybe this is a later step.

BREAK 🥞

Pure Wasm Threads

  • BN: Early days, but basic idea: save the cost of JS context. May be simpler to share memories and tables, but things to investigate there. Jukka has mentioned places in emscripten where he would have a desire to use pure wasm threads.

  • Simplest variation of this idea is that wasm threads can’t communicate with JS.

  • Benefits: we can avoid pure worker context -- it also opens the door to how bindings can be handled.

  • JF: Seems natural for non-JS embeddings to want this.

  • BN: True, right now non-js types have to invent their own implementation.

  • JF: With atomics it seems like that should just work out. We shouldn’t need postMessage, right?

  • BN: right.

  • BN: We’ll probably want a join operation, but how does that work? Can it work on main thread.

  • JF: Could make it a promise.

  • BN: Related point; async SAB wait...

  • WM: It’ll be nice for people porting legacy C++ code.

  • BN: Right now, you can spawn threads ahead of time, but you may run out of spawned threads.

  • LW: Specifically I think workers don’t start unless we return from the event loop on some browsers, at least Chrome and Firefox.

  • BN: We’ll want a create operation and a join. Open question: what happens with tables and FFI.

  • LW: I can draw it up on the whiteboard.

  • LW: Agent cluster contains multiple agents, they share SABs. Instance points to module, module points to memory. But tables are not shared. If performing dynamic linking, need to keep separate tables in sync.

  • Can we switch to a model where there is a logically shared instance, and the table points to the shared instance. The table can point to JS things, so that’s tricky. So maybe there is a shared zone in the middle. Shared zone needs to be handled w/ a concurrent GC.

  • SB: So in this example, the shared table could not point to normal JS.

  • LW: Right.

  • BT, LW, SB -- Discussion about sharing tables.

  • BT: … Add an indirection so table can be partially shared.

  • SB: If there is always indirection… why prevent JS things from shared tables? If we want JS things to be in shared tables, then you need to check whether it is JS or not.

  • LW: I was talking to Jukka about this, asked about no JS imports. If there is a shared instance, pointing out to JS.

  • LW: Currently, in JS we only have shared SABs and modules.

  • How do you call JS stuff if you can’t call out. Perhaps you can annotate a function to say that it doesn’t have to call to JS -- for example, calling a WebGL API.

  • BT: Only thing you’re avoiding doing is a broadcast write.

  • LW: Also GC...

  • LW: Some Web APIs can be marked as being JS-free.

  • BT: Emscripten already has a number of functions implemented in JS.

  • LW: Those would need to be moved inline.

  • BT: This will really slow down… if you can only call the blessed web APIs.

  • LW: Limitation is when you have to proxy.

  • BT: Reachability and GC problem is implementation detail, shouldn’t show up in API. Shared table is roots into JS context. GC them independently, to execute write need to make sure you…

  • LW: What about cycles between independent heaps.

  • BN: Can we imagine a separate table for shared things.

  • LW: Issue with cycles.

  • BT: What’s the problem with broadcast write?

  • BN: Do we need to stop the world?

  • BT: Writing to a table…

  • LW: Fine if writes are slow. Trying to avoid by construction cross-worker GC.

  • JF: Shared zone could work, but could you have multiple shared zones?

  • LW: Just one per agent.

  • JF: Reason you’re doing it is to simulate an abstraction for a process. Having multiple would be multiple processes.

  • LW: You could have multiple agent clusters in a single process.

  • JF: Seems like it could work.

  • LW: No difference between worker and pure wasm threads.

  • BN: How do we make Web APIs OK to be accessed in parallel. They have to be parallel in particular way.

  • LW: We want them to be like syscalls. For example offscreen canvas. It’s OK to check affinity, and that can change over time.

  • BN: I’m nervous about broadcast, because that puts us in a world where updates to the table are slower.

  • BT: Fundamentally, in order to be able to call into JS w/ shared tables requires broadcast writes.

  • LW: If we make a thread-local root array, this solves both goals. If we say that it is a thread-local array, and it is a root. That is a good way to let each thread say these are our functions out to JS.

  • BN: Then we forbid in the table any JS references.

  • BT: I think that’s too restrictive. I think everything should be moved into the shared table and make that work.

  • LW: we have the same problem with imports, tables, out edges.

  • BN: The table is our abstraction over indirect calls, seems sad to have to mutate.

  • BT: Algorithm I’m thinking of is pretty simple.

  • SB: How would you not detect timing of walking over tables.

  • BT: Not constant time, you just don’t have to stop the world.

  • SB: You can notice that something got loaded in at different times.

  • BT: Non-atomic?

  • SB: yes

  • LW: If you have pointers to other tables, then other threads can keep it alive. So you have to GC a cluster.

  • SB: A JS function being stored… what does it mean semantically for a worker to store JS function in the shared table.

  • LW: I agree, does it null out…?

  • LW: If you rule out the out edges, it’s easy to see how this works. Maybe this is a reason to make blessed APIs.

  • BT: What’s the timeframe for blessing APIs?

  • JF, BS: Already started, w/ SAB

  • LW: Fallback is to proxy.

  • BN: So you have one IO thread where you can do non-blessed APIs…

  • JF: So you need to be able to postMessage, and promisify await.

  • BT: This is not necessary. You can have a callback when a JS object is collected; you have a callback when the table object is collected. When that callback fires, you edit the shared table to delete the references in the shared table.

  • LW: You could still get cycles w/ shared roots.

  • BN: You’re imagining, Ben, that this is racy?

  • BT: I think you could make it non-racy. Should talk about off-line.

  • All agree.

  • JF: Where do you want to take this idea?

  • LW: First we need to ship basic threads. And importing Web APIs, etc.

  • JF: No poll needed probably.

  • BN: It does make it clearer some of the blockers we’ll need to address.

  • LW: You want a worker that gets callbacks or signals from memory locations they’re waiting on.
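
BT's earlier suggestion (a callback fires when a JS object is collected and edits the shared table to delete the reference) can be illustrated with a single-threaded toy model. This is one possible reading of the idea, not a design from the meeting: `SharedTable` and `JsFunction` are hypothetical names, Python weak references stand in for the collection callback, and the model addresses neither the cycle problem LW raises nor any cross-thread races.

```python
import gc
import weakref
from typing import List, Optional


class JsFunction:
    """Stand-in for a thread-local JS function object."""

    def __init__(self, name: str) -> None:
        self.name = name


class SharedTable:
    """Table whose slots do not keep thread-local objects alive."""

    def __init__(self, size: int) -> None:
        self._slots: List[Optional[weakref.ref]] = [None] * size

    def set(self, index: int, obj: JsFunction) -> None:
        def on_collected(ref: weakref.ref) -> None:
            # Collection callback: delete the table's reference, unless the
            # slot has been overwritten with something else in the meantime.
            if self._slots[index] is ref:
                self._slots[index] = None

        self._slots[index] = weakref.ref(obj, on_collected)

    def get(self, index: int) -> Optional[JsFunction]:
        ref = self._slots[index]
        return None if ref is None else ref()


table = SharedTable(2)
fn = JsFunction("onclick")
table.set(0, fn)
still_alive = table.get(0) is fn  # the owner still holds the object here
del fn          # the owning context drops its last reference
gc.collect()    # make sure collection has happened on any implementation
```

After the owner drops the object, the slot reads as empty, so other threads can no longer reach a dead reference through the table.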

ADJOURN