Can we leave the door open for WebAssembly growing into a universal binary format for.. everything? #249

Closed
aardappel opened this issue Jul 6, 2015 · 15 comments

@aardappel

I appreciate that there's "Web" in WebAssembly, and that the MVP goals are all about making sure that WebAssembly performs well as a replacement for current asm.js & NaCl use cases.

But I see a bigger future. We've never had a cross-platform, stable, portable and widely supported binary standard that supports C-level semantics well. Existing candidates have either not been very stable (LLVM), don't support C efficiently (JVM), or are not widely supported (most everything else).

If in the future all browsers have a high-quality WebAssembly implementation, the attraction of also using it for client (desktop/mobile) and server deployment will be huge. I know there's this:
https://github.com/WebAssembly/design/blob/master/NonWeb.md
but that seems to mostly hint at the Node.js-style use case. I'd hope that non-web client use cases (especially games) are not ignored.

Cross-platform C/C++ building, testing and deployment continues to be a huge pain to this day. The thought that I can make a C++ game on my favourite platform, spit out a wasm binary, test it locally, show it to others using the web, then deploy on other native platforms without further compilation (either by bundling it with a pre-made binary that can run wasm or because the platform supports it natively) makes me positively drool.

Emscripten already comes with ways to access OpenGL/SDL-alike functionality out of the box, so turning this into something that typically all WebAssembly client platforms provide would be amazing for all sorts of client development.

Additionally, I think WebAssembly growing into an even bigger standard would do wonderful things for future programming languages, which could then be on equal footing with C/C++ by simply outputting WebAssembly. Currently, that requires a level of toolchain integration on many platforms that's a huge barrier (see current mobile platforms, especially).

tl;dr: any design choices that tie WebAssembly too closely to JS semantics / JS APIs / DOM today may slow down or even prevent this very beneficial future direction.

@jfbastien
Member

+1 that's exactly what we're aiming for.

You asking this question means it's not clear. Do you have suggestions of what parts of the design we can clarify? You can even send pull requests if you're willing to join the Community Group.

@sunfishcode
Member

I believe there's already a desire to ensure that WebAssembly can be used independently of JS. I created #251 to add a paragraph about this.

Beyond that, I think the functionality you describe would mostly just need a person to do the work to develop, maintain, and package such an environment. That's not something the WebAssembly Community Group itself is likely to do (for the foreseeable future), but others could.

@BrendanEich

+1 of course, but I want to add something that may not be clearly written down (apologies if I missed it). That idea is "1VM" -- the singular nature of the engine running JS today, wasm and JS tomorrow, whose semantics accessed by future wasm syntax can far exceed what we want to put into JS-the-language.

1VM is the piece of the puzzle that I think past in-browser attempts at supporting multiple programming languages missed. Old JScript in IE shared a COM-scripting engine with VBScript, prefiguring this 1VM notion. I'm told that old engine was written in assembly (or major parts were).

In contrast, over a decade ago at Mozilla, for ActiveState and others, Mark Hammond integrated the CPython engine in addition to the SpiderMonkey JS engine, but exposed it only to XUL scripts, not web scripts. To cope with C++/JS cycles, we had added a cross-language-heap Cycle Collector after Bacon & Rajan 2001:

http://researcher.watson.ibm.com/researcher/files/us-bacon/Bacon03Pure.pdf

This CC was extended to handle non-JS scripting engines. All the non-JS, non-C++ stuff eventually was removed; it was too heavy and it did not pay its way.

Someone at Google may have thoughts to add on Dartium / Dart VM in Chrome, which also didn't stick.

I don't know of multiple engine integration attempts in other browsers. IBM Research had Parley back in the day:

http://hirzels.com/martin/papers/vee04-parley.pdf

Anyway, 1VM and future wasm >> JS FTW!

/be

@kg
Contributor

kg commented Jul 7, 2015

As far as 1VM approaches go, the Microsoft CLR is an interesting example (the JVM likewise but less so) - from an early stage the CLR people recognized that multiple languages needed to exist, and each language had different needs. So they built a wide feature set into the VM, despite the fact that their flagship language (C#) didn't expose a lot of those features. To this day there are some features that are only accessible from VB.net, F#, or C++/CLI - based on the needs and style of each language. It's a bit complex, but it has clearly paid off in terms of portability and making that VM appealing to developers with different needs/tastes.

We're in a great position here since there are already high-performance, powerful shipping VMs like SpiderMonkey and V8 deployed to millions (billions?) of machines, with every other VM improving every day. Tapping into that with wasm gives us a huge head start.

@titzer

titzer commented Jul 7, 2015

I share this goal as well, and there have been attempts at this in the past. I think the important design consideration is that being universal is easy when the language is low-level, but too low-level is not a win, since it pushes too many abstractions up, beyond the understanding of the virtual machine. Keeping this in mind when we add an object model, it feels like a low-level object system, with perhaps only records and function pointers, might be a good compromise. That leaves some work for language implementations to map their data structures into records and function pointers (e.g. a Java implementation making explicit vtables) but still leaves enough of a strong contract that a precise GC can be built at the lower level.
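
To make the "records and function pointers" idea concrete, here is a minimal sketch in Python of how a language implementation might spell out its own vtables on top of such primitives. Every name in it (`VTable`, `Record`, `call_virtual`, the `AREA` slot) is hypothetical and chosen for illustration only; none of it comes from a WebAssembly design document.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class VTable:
    """A record whose fields are function pointers, one slot per virtual method."""
    slots: Tuple[Callable[..., int], ...]


@dataclass
class Record:
    """A low-level record: a vtable reference followed by plain data fields."""
    vtable: VTable
    fields: List[int]


AREA = 0  # slot index a compiler would assign to a virtual `area()` method


def square_area(this: Record) -> int:
    return this.fields[0] * this.fields[0]


def rect_area(this: Record) -> int:
    return this.fields[0] * this.fields[1]


SQUARE_VTABLE = VTable(slots=(square_area,))
RECT_VTABLE = VTable(slots=(rect_area,))


def call_virtual(obj: Record, slot: int) -> int:
    """Virtual dispatch spelled out: load the vtable, load the slot, call it."""
    return obj.vtable.slots[slot](obj)


shapes = [Record(SQUARE_VTABLE, [3]), Record(RECT_VTABLE, [2, 5])]
print([call_virtual(s, AREA) for s in shapes])  # prints [9, 10]
```

The VM in this sketch only needs to understand two shapes of data (records and function references), which is the kind of contract a precise GC could trace, while the class hierarchy itself stays a concern of the source language's compiler.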

@aardappel
Author

Great to hear you are all behind this use case, and that it is on the radar. Can't wait for our universal binary future! ;)

@jfbastien: I'll make a PR to improve NonWeb.md tomorrow. #251 will help, but in its current form the text is entirely about server use cases, and does not give any indication of whether it would be suitable for high-performance non-web client use cases (e.g. games). It reads as if non-web means "a better Node.js".

@sunfishcode: I understand this group wouldn't be the one to make such packages, and I'll be careful not to set that expectation in the PR text. I do hope that making that use case easy, practical and performant is of interest to this group.

@BrendanEich: Yes, it is interesting to think that behind the JS semantics of current JS implementations lies a much lower level VM that is capable of C semantics, that is now enabling this WebAssembly future. And that C semantics is the key to allowing the widest range of languages to live on a single VM.
Even wider would be "CPU semantics", as C (and some of the current WebAssembly definition) may preclude more esoteric language features such as continuations. Are you implying with "future wasm syntax" that we can keep extending that reach?

@kg: One curious thing about the CLR is that even though it is powerful enough to run "most of" C++, it has never grown to be a performant enough platform for C++ code, and that now the semantically further removed JS has grown into something that can provide a more performant and less restrictive C++ platform.

@titzer: WebAssembly is actually fairly high-level compared to something like LLVM, in part because it is AST based. I don't think an "object model" is something that necessarily needs to be exposed at the WebAssembly level, since languages can implement these very differently. WebAssembly already has indirect calls, which many object models can be built on top of. APIs can be exposed as static functions or tables of function pointers.

@lukewagner
Member

Agree with @titzer on exposing the lowest-level GC primitives we can and letting different languages' object models be built up from that. I'm hoping these low-level GC primitives can be reflected in JS as Typed Objects.

@aardappel: the key is allowing the compiled program to use the same GC as the rest of the browser which avoids nasty GC cycle/leak problems that you otherwise get when each web app compiles their own GC on top of linear memory. There are also nice side benefits like better browser devtools integration and reducing the download size of small programs.

On the subject of function pointers: we can't load/store function-pointer types directly to/from linear memory because they can be aliased/corrupted; we need some well-defined index into a table of function pointers. But with safe GC memory primitives, we could have function-pointer types that really store function pointers in the heap. With this, one should be able to implement a vtable without any extra indirection.
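
A rough sketch of the table-index scheme, written in Python purely as a model: `memory`, `table`, `store_func_index` and `call_indirect` are hypothetical stand-ins, not an actual engine API. Linear memory holds only an integer index, and the function references themselves live in a table the program cannot write to as raw bytes.

```python
import struct

memory = bytearray(64)  # stand-in for linear memory: raw, freely overwritable bytes


def add_one(x: int) -> int:
    return x + 1


def double(x: int) -> int:
    return x * 2


table = [add_one, double]  # stand-in for the function table, kept outside linear memory


def store_func_index(addr: int, index: int) -> None:
    """Write a 'function pointer' to linear memory as a 32-bit table index."""
    struct.pack_into("<I", memory, addr, index)


def call_indirect(addr: int, arg: int) -> int:
    """Load the index from linear memory, bounds-check it, then call via the table."""
    (index,) = struct.unpack_from("<I", memory, addr)
    if index >= len(table):
        raise RuntimeError("indirect call: table index out of bounds")
    return table[index](arg)


store_func_index(8, 1)
print(call_indirect(8, 21))  # prints 42

struct.pack_into("<I", memory, 8, 0xDEADBEEF)  # simulate a stray write corrupting the slot
try:
    call_indirect(8, 21)
except RuntimeError as err:
    print(err)  # the call traps instead of jumping to an arbitrary address
```

In this model a corrupted slot can at worst select a different table entry or trap, and the price is the extra load through the table on every call. That is the indirection that heap-stored function pointers under safe GC primitives would avoid.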

@jbondc
Contributor

jbondc commented Jul 8, 2015

Hmm, interesting about IBM Parley. I've always wondered about the tradeoffs of a '2VM' design where (a) the first VM deals with typed languages (à la C++), with more low-level control over memory, and (b) the second VM deals with highly dynamic languages (à la Smalltalk, Self, JavaScript), with more managed control over memory.

@BrendanEich

@aardappel: indeed wasm can extend its reach, once we are past the interregnum where asm.js and wasm should be coexpressive to the extent that a "polyfill" or shim from wasm back to asm.js is efficient (this may take a few years; we don't know a priori). I've mentioned call/cc as a farther future possible extension, which won't go into JS but which could go into the 1VM of the future, usable via wasm then. I thought I saw mention of continuations in the FutureFeatures document, but I don't see it now. Perhaps it was removed, or I'm missing it -- or else it could be added.

@jbondc: Parley was no more industry-ready or practical than Jikes RVM, IMHO. 1VM is a requirement and for good reasons, some of which @lukewagner just reiterated (GC).

/be

ghost pushed a commit to aardappel/design that referenced this issue Jul 8, 2015
As discussed in:
WebAssembly#249
NonWeb.md didn't capture the full generality of possible use cases
for WebAssembly as a universal binary standard.
@aardappel
Author

Ok, made an attempt at clarifying the use cases I mention in this issue: #258

ghost pushed a commit to aardappel/design that referenced this issue Jul 8, 2015
As discussed in:
WebAssembly#249
NonWeb.md didn't capture the full generality of possible use cases
for WebAssembly as a universal binary format.
@zmoshansky

Would WebAssembly eventually become like some sort of Portable (Higher Level/More Semantics Maintained) "LLVM IR", as discussed here, 2011?

Perhaps (vaguely) as outlined below:

Any LLVM Frontend -> LLVM IR -> WebAssembly(LLVM Backend)
WebAssembly (LLVM Frontend)  -> LLVM IR -> LLVM Backend

or

LLVM Frontend -> LLVM "Passes" -> "WebAssembly" -> LLVM "Passes" -> LLVM IR -> LLVM Backend

If so, I feel that is a very good way to describe it to a technical person, one that would quickly help distinguish and position it relative to LLVM IR, PNaCl, CLR, JVM bytecode, etc. That is something I have struggled to do from reading the early design documentation.

@kg
Contributor

kg commented Jul 11, 2015

WebAssembly is not intended as a replacement for LLVM IR or a portable LLVM-esque IR, though it is possible you could use it in some scenarios where you would use LLVM IR.

WebAssembly is and will likely always be less expressive than LLVM IR. This is for multiple reasons, but essentially being less expressive makes it more suitable for use as a web format. Likewise for some of our other choices, like using an AST.

kg pushed a commit to kg/design that referenced this issue Jul 13, 2015
As discussed in:
WebAssembly#249
NonWeb.md didn't capture the full generality of possible use cases
for WebAssembly as a universal binary format.
@zmoshansky

Thanks for explaining that, Katelyn; it definitely helps clarify the vision for WA. I'm excited to watch the progress over the coming months.

@BrendanEich

Just for anyone who may have missed the FAQ (and FAQs always need maintenance, so please file issues separately):

https://github.com/WebAssembly/design/blob/master/FAQ.md#why-not-just-use-llvm-bitcode-as-a-binary-format

/be

@sunfishcode
Member

#258 was merged, so it sounds like this issue is resolved. Please reopen or file new issues for further questions!
