WIP: [Swift+WASM] initial support for compiling Swift to WebAssembly #24684

Draft
wants to merge 55 commits into
base: master
from

@zhuowei
Contributor

commented May 10, 2019

What's in this pull request?

This pull request adds initial support for compiling Swift code to WebAssembly.

"Hello world" works, and a large subset of the stdlib already works on WebAssembly.

You can try this yourself with our cloud-hosted toolchain incorporating this port:

https://swiftwasm.org

This patch uses the WASI SDK, so WebAssembly executables generated by this port will work both in browsers and in standalone WebAssembly runtimes such as Wasmtime or Fastly's Lucet.

Links to issues

See SR-9307 for some background and the existing Swift+WASM changes in Swift that this patch depends on.

Also see emscripten#2427 for discussion and previous attempts on porting Swift to WebAssembly, which this port draws heavily upon.

Thanks

Thank you to everyone who helped make this possible:

If you would like to help, you can join us at https://github.com/swiftwasm.

Status of the port

This port is not ready for merging. The biggest blocking issues currently are:

  • Tests are disabled
  • Crash when passing non-throwing closure to function taking throwing closure

We're opening this pull request now to get advance feedback and advice, so we can fix these remaining issues and start cleaning up our patches.

Per Swift's contributing guidelines, we're planning to split each change into a separate pull request.

We've already created pull requests for some minor changes:

We welcome advice on how best to submit these changes for review.

Note that this port also requires changes to Clang and LLVM: the corresponding pull requests are:


Here's a more detailed explanation of the changes included in this pull request, and on what still needs to be done.

How WebAssembly differs from other platforms

WebAssembly is a new platform, with unique attributes that pose issues to Swift's runtime.

  • Functions have strict argument checking
    • Swift often calls functions with extra arguments
    • We couldn't find a way to fix this
    • As a result, Optional.map, for example, crashes because it calls a non-throwing closure with an extra error pointer
  • Limited relocation support in the linker: it can't take the difference between two symbols
    • Swift metadata relies on this. Our solution: switch to absolute pointers
    • This is not the first port that required this
    • We would like to find a way to merge absolute metadata support
    • or find a different workaround
    • e.g. emitting only bitcode for LTO, avoiding intermediate .wasm files
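
To make the relocation issue concrete, here is a minimal sketch in C of the two reference styles. The struct names and helpers are hypothetical and are not taken from the Swift runtime; the sketch only assumes that a relative reference stores "target address minus field address", which is the symbol difference the Wasm linker can't express at link time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts, for illustration only. */
typedef struct {
    int32_t offset;      /* target address minus the address of this field */
} RelativeRef;

typedef struct {
    const void *target;  /* plain absolute pointer */
} AbsoluteRef;

/* Normally the linker computes this difference; here we do it at run time. */
void set_relative(RelativeRef *ref, const void *target) {
    ptrdiff_t delta = (const char *)target - (const char *)ref;
    assert(delta >= INT32_MIN && delta <= INT32_MAX);
    ref->offset = (int32_t)delta;
}

const void *resolve_relative(const RelativeRef *ref) {
    return (const char *)ref + ref->offset;
}

const void *resolve_absolute(const AbsoluteRef *ref) {
    return ref->target;
}

/* Returns 1 when both reference styles resolve to the same payload. */
int relative_and_absolute_agree(void) {
    struct { RelativeRef rel; AbsoluteRef abs; int payload; } image;
    image.payload = 42;
    set_relative(&image.rel, &image.payload);
    image.abs.target = &image.payload;
    return resolve_relative(&image.rel) == resolve_absolute(&image.abs)
        && *(const int *)resolve_relative(&image.rel) == 42;
}
```

The absolute form needs only an ordinary address relocation, at the cost of a full pointer-sized field per reference.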

What's done

Swift changes

  • CMake and utils/build-script changes to compile the stdlib for WebAssembly

    • A lot of this overlaps with ddunbar's changes (#20684)
    • We should probably try to unify our changes with that PR.
  • Metadata: switch to absolute pointers

  • Load the metadata at runtime

    • swiftwasm@bc15cba
    • swiftwasm@885cd05
    • swiftwasm@2057797
    • The metadata read code is shared with ELF, with one change
    • Unlike ELF linkers, the Wasm linker doesn't export section start and end pointer symbols
    • COFF also has this problem, but COFF sections are sorted alphabetically
      • so the Windows port adds dummy sections with names that sort before/after each metadata section, and uses them as start/end symbols
    • Wasm doesn't sort sections this way, so we have to link swift_start and swift_end objects at the start and end of the linker invocation, like older versions of Swift on Linux
  • Disable atomics and -glldb

  • Disable COMDATs for reflection metadata

    • The metadata of each type is emitted in the same section
    • COMDAT support on WebAssembly requires each piece of metadata to be emitted in a separate section
    • #24133
    • Thanks to review comments, we will rework this to actually emit each piece of metadata in its own section
  • Various stdlib changes to fix method signatures

    • #24054
    • #24181
    • #24182
    • swift_once calls are hard to fix, since the calls are compiler-generated
    • We hacked around this by duplicating swift_once as swift_once_shim
    • swiftwasm@4184451
    • swiftwasm@bd86ee9
    • I tried to fix it properly, but it didn't work because I don't know how to generate SIL
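
As a rough illustration of the start/end approach from the "Load the metadata at runtime" item above, the reader walks every record between two boundary symbols. The record type and names below are hypothetical. In the port the boundaries come from two small objects linked first and last; this sketch emulates them with pointers into one array so that it stays portable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type standing in for one metadata entry. */
typedef struct {
    const char *name;
} MetadataRecord;

/* Emulated section contents. In the port, each object file contributes
   records and the link order decides where the section starts and ends. */
static const MetadataRecord records[] = {
    { "TypeA" }, { "TypeB" }, { "TypeC" },
};

/* Stand-ins for the symbols provided by the start and end objects. */
const MetadataRecord *const section_start = records;
const MetadataRecord *const section_end = records + 3;

/* Walks the half-open range [start, end) and counts the records in it. */
size_t count_records(const MetadataRecord *start, const MetadataRecord *end) {
    assert(start <= end);
    size_t count = 0;
    for (const MetadataRecord *r = start; r < end; ++r) {
        ++count;
    }
    return count;
}
```

Calling count_records(section_start, section_end) visits every record exactly once, which is all the runtime needs from the linker.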

LLVM changes

0x0 -> value witness table pointer
0x4 -> metadata entry
... etc

https://github.com/apple/swift/blob/master/docs/ABI/TypeMetadata.rst

So to access the metadata entry 0x4 bytes past the value witness table pointer, a symbol is exported using LLVM's alias directive, pointing 4 bytes past the value witness table pointer.

Other metadata uses similar aliases, again to export a symbol at an offset inside another symbol.

Clang doesn't emit alias directives with an offset, so LLVM's Wasm backend hadn't implemented them. This change adds that support.
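
A minimal sketch of what such an alias points at, assuming a wasm32-style layout in which pointers are 32 bits wide. The struct and field names are hypothetical; only the 0x0 and 0x4 offsets come from the layout above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wasm32-style layout: pointers are modeled as 32-bit values. */
typedef struct {
    uint32_t value_witness_table;  /* offset 0x0 */
    uint32_t metadata_entry;       /* offset 0x4 */
} FullMetadata;

/* The alias is a second symbol whose address is the full object's address
   plus the offset of the metadata entry. */
const uint32_t *address_point(const FullMetadata *full) {
    assert(full != NULL);
    return (const uint32_t *)((const char *)full
                              + offsetof(FullMetadata, metadata_entry));
}
```

Code that refers to the type's metadata uses the aliased address, and the value witness table pointer sits at a negative offset from it.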

  • cherry-pick wasm linking info v2 patch from LLVM master:

Clang changes

What still needs to be done

Most important: fix swift calling convention and extra arguments

WebAssembly has strict function signature checking, so this crashes:

struct Test {
    var a:Int
}
print(Test(a: 1))

inside Optional.map.

Why? Optional.map takes a throwing closure, but is passed a non-throwing closure.

A throwing closure is compiled down to a signature similar to this:

void closure(void* arg1, void* arg2, void* swiftSelf, error** swiftError)

A non-throwing closure has signature like this:

void closure(void* arg1, void* arg2, void* swiftSelf)

without a trailing error pointer.

Swift assumes that extra arguments passed through a function pointer are ignored by the callee, so it doesn't generate a thunk when a non-throwing closure is called as a throwing closure, or when a thin function is called as a thick function.

lib/IRGen/GenFunc.cpp has a comment that explains this further.

This assumption is valid on all platforms that Swift currently supports, but doesn't hold on WebAssembly because of the strict signature checking.

Unfortunately, I have no idea how to fix this.

Modifying the Swift compiler to generate the thunks might be difficult. Currently, thunks are only generated when calling a function with a different calling convention, not between functions with the same calling convention but a different number of arguments. SIL's SILFunctionType doesn't even track if a function throws.
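
For reference, the kind of thunk discussed here looks like this in C. This is a sketch only: the function names are made up, and a real thunk would be emitted by IRGen rather than written by hand.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Error Error;  /* opaque stand-in for a Swift error */

/* Signature a caller of a throwing closure expects. */
typedef void (*ThrowingFn)(void *arg1, void *arg2, void *swiftSelf,
                           Error **swiftError);

/* A non-throwing closure body: adds *arg1 to the int stored in swiftSelf. */
void add_to_total(void *arg1, void *arg2, void *swiftSelf) {
    (void)arg2;
    *(int *)swiftSelf += *(int *)arg1;
}

/* The thunk has the throwing signature and forwards everything except the
   error pointer, so caller and callee signatures match exactly. */
void add_to_total_thunk(void *arg1, void *arg2, void *swiftSelf,
                        Error **swiftError) {
    (void)swiftError;  /* never written: the wrapped closure can't throw */
    add_to_total(arg1, arg2, swiftSelf);
}

/* A caller that, like Optional.map, only knows the throwing signature. */
int call_throwing(ThrowingFn fn, int input) {
    assert(fn != NULL);
    int total = 0;
    Error *error = NULL;
    fn(&input, NULL, &total, &error);
    return error == NULL ? total : -1;
}
```

Passing add_to_total itself through a cast to ThrowingFn is what currently happens in effect; on WebAssembly that indirect call traps, while the thunk is always safe.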

I've never worked with Swift compiler internals, so I don't even know where to start modifying IRGen.

I asked @jrose-apple, who suggested that one short-term alternative is to standardize all swiftcall functions to take only one extra parameter:

void closure(void* arg1, void* arg2, void** extraArgs)

extraArgs would point to an area on the stack, containing swiftself, swifterror, and any other extra parameters.

This way, thin, thick, and throwable thick functions would have the same number of arguments.
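
A minimal sketch of this convention, assuming a hypothetical slot layout in which index 0 holds swiftself and index 1 holds swifterror. The exact layout is an assumption and was not specified in the suggestion.

```c
#include <assert.h>
#include <stddef.h>

/* Every function shares this one signature. */
typedef void (*UniformFn)(void *arg1, void *arg2, void **extraArgs);

/* Hypothetical slot layout of the extra-argument area. */
enum { EXTRA_SELF = 0, EXTRA_ERROR = 1, EXTRA_COUNT = 2 };

/* Thin, non-throwing function: ignores the extra-argument area. */
void double_it(void *arg1, void *arg2, void **extraArgs) {
    (void)extraArgs;
    *(int *)arg2 = 2 * *(int *)arg1;
}

static int failure_marker;  /* stands in for an error object */

/* Throwing function: reports failure through the error slot. */
void fail_if_negative(void *arg1, void *arg2, void **extraArgs) {
    int value = *(int *)arg1;
    if (value < 0) {
        extraArgs[EXTRA_ERROR] = &failure_marker;
        return;
    }
    *(int *)arg2 = value;
}

/* The caller builds the area on its own stack. Returns 1 on error. */
int call_uniform(UniformFn fn, int input, int *output) {
    assert(fn != NULL);
    void *extra[EXTRA_COUNT] = { NULL, NULL };
    fn(&input, output, extra);
    return extra[EXTRA_ERROR] != NULL;
}
```

The cost is an extra memory round trip on every call, including calls that need neither self nor an error.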

I'm guessing this would require either Clang+Swift changes or an LLVM pass.

I found an example in Chrome's PNaCl that transforms function arguments (https://chromium.googlesource.com/native_client/pnacl-llvm/+/mseaborn/merge-34-squashed/lib/Transforms/NaCl/ExpandVarArgs.cpp#170), so the LLVM pass might not be too complicated.

We would really appreciate help and advice on how best to approach this.

Reenable and run tests

  • Tests are completely disabled right now.

Upstream the LLVM patches

  • Currently, the LLVM/Clang patches are based on Swift's stable branches
  • Our goal is to get the LLVM changes into LLVM upstream, so they'll be pulled into Swift's upstream-with-swift branches
  • I've never worked with LLVM before, so I would appreciate advice on how best to structure patches for upstreaming.

Support building Swift stdlib for Wasm using a macOS host

  • It seems that a macOS host can only cross compile for Darwin platforms, as compiling stdlib for Wasm fails on macOS. (It works on Linux.)
  • Is the existing Android cross-compile support also affected by this?
  • @MaxDesiatov is working on getting the stdlib on Wasm building on macOS.

Split this PR into small, reviewable chunks

  • We started to do this with the simpler stdlib signature changes already

Longer-term goals

Get Swift's other libraries working

  • We have not tried building Foundation or any other Swift libraries.
  • Would these libraries/tools even work in WebAssembly, which, as of now, only supports single-threaded execution?

Support link-time optimization

  • @ddunbar discussed the importance of link-time optimizations for Swift on WebAssembly in SR-9307.
  • This would reduce executable size (currently, a Swift Hello world compiles to a 7.8 MB .wasm)
    • Emscripten heavily depends on LTO to keep binary sizes down when compiling C/C++ to WebAssembly, so it's probably worth supporting
  • It can also help us avoid the relative/absolute metadata issue, since relative relocations can be represented in LLVM bitcode.

Work on ways to interop with JavaScript from Swift

@jckarter

Member

commented May 11, 2019

I asked @jrose-apple, who suggested that one short-term alternative is to standardize all swiftcall functions to take only one extra parameter:

void closure(void* arg1, void* arg2, void** extraArgs)

extraArgs would point to an area on the stack, containing swiftself, swifterror, and any other extra parameters.

This way, thin, thick, and throwable thick functions would have the same number of arguments.

I'm guessing this would require either Clang+Swift changes or an LLVM pass.

Rather than spill to the stack, it should be sufficient to make the two swiftself and swifterror arguments always be provided, and leave them undef when they aren't needed. Those are the only two extra arguments that you should need to worry about. Since there's already a distinct swiftcc convention at the LLVM level, it seems natural to me to introduce these arguments, if they don't exist in the LLVM signature, into the wasm-level signature in the backend.
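
As a rough C illustration of this suggestion (not code from the Swift or LLVM sources), every function takes both extra parameters, and callees that don't need them simply ignore them. The names are hypothetical, and C passes NULL where LLVM would use undef.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Error { int code; } Error;  /* hypothetical error type */

/* One signature for thin, thick, and throwing functions. */
typedef int (*SwiftCCFn)(int arg, void *swiftSelf, Error **swiftError);

/* Needs neither self nor error; both are accepted and ignored. */
int increment(int arg, void *swiftSelf, Error **swiftError) {
    (void)swiftSelf;
    (void)swiftError;
    return arg + 1;
}

static Error too_large = { 1 };

/* Uses self as captured context and may report an error. */
int add_to_base(int arg, void *swiftSelf, Error **swiftError) {
    if (arg > 1000) {
        *swiftError = &too_large;
        return 0;
    }
    return *(int *)swiftSelf + arg;
}

/* The caller always supplies both extra arguments. */
int call_swiftcc(SwiftCCFn fn, int arg, void *self, int *failed) {
    assert(fn != NULL && failed != NULL);
    Error *error = NULL;
    int result = fn(arg, self, &error);
    *failed = (error != NULL);
    return result;
}
```

Unlike the extraArgs variant, nothing is spilled to memory except the error slot itself.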

Note that swifterror isn't a real pointer-to-pointer argument for the x86 or arm backends, but really represents an in-out register. If it's possible to model it that way in wasm too, it'd probably lead to better native code size and quality when lowered to native code.

@jckarter
Member

left a comment

If you're never emitting relative references to begin with, you ought to be able to stop emitting GOT-relative pointers altogether. They're only necessary in native code image formats for referencing symbols from other binaries that don't have fixed relative offsets. You should be able to chase down all the IRGen helpers that return ConstantReference values and modify them to always return Direct references. That should save you having to support the tagging scheme here as well.
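
For context, a tagging scheme of this kind can be sketched as follows. The encoding is an assumption: this sketch uses the low bit of the stored value to mean "the target is a GOT-style slot holding the real address", and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding: low bit set means the reference is indirect. */
#define INDIRECT_TAG ((uintptr_t)1)

/* Direct reference: the stored value is the target address itself. */
uintptr_t make_direct(const void *target) {
    return (uintptr_t)target;
}

/* Indirect reference: the stored value is the slot address plus the tag.
   The slot must be at least 2-byte aligned so the low bit is free. */
uintptr_t make_indirect(const void *const *slot) {
    assert(((uintptr_t)slot & INDIRECT_TAG) == 0);
    return (uintptr_t)slot | INDIRECT_TAG;
}

/* Strips the tag and follows the extra level of indirection if needed. */
const void *resolve_tagged(uintptr_t ref) {
    if (ref & INDIRECT_TAG) {
        const void *const *slot = (const void *const *)(ref & ~INDIRECT_TAG);
        return *slot;
    }
    return (const void *)ref;
}
```

If every reference is emitted as Direct, as suggested above, only the untagged branch is ever taken and the tag handling can be dropped.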

@ddunbar

Member

commented May 11, 2019

Metadata relies on this. Solution: switch to absolute pointers

Given the newness of WASM, it would be nice to instead see if support for this can be added to WASM; the feature is useful and could eventually benefit other things.

@zhuowei

Contributor Author

commented May 11, 2019

@jckarter That was one of the alternatives that @jrose-apple suggested. However, this may not work for all methods: According to GenFunc.cpp there could be extra "witness_method generic parameters" after the error parameter, so just adding two parameters might not be enough.

Also, re swiftcc calling convention: Wasm doesn't have registers: it is a stack machine, similar to the Java virtual machine.

@jckarter

Member

commented May 11, 2019

@zhuowei For witness table entry points, there is still a consistent calling convention that all witnesses use. That logic duct tapes over some representation issues in SIL; they should end up ultimately lowering to compatible LLVM level signatures.

If wasm is a stack machine, then it may still be worth considering the intent of the swiftcc special arguments in designing how it lowers to the stack machine representation. The self argument is intended to be mapped to a stable, callee-preserved register, so that (a) context-free functions are ABI compatible with closures that have captures, and (b) in a series of method calls on the same object, the caller can save code size by not having to constantly reload the self argument register. In a stack machine, you could get a similar effect by always pushing the self argument first, and maybe having the callee leave it on the stack after the call (though for a stack machine that's not an obvious win; it has a tradeoff in code size for calls to methods on different objects, since the caller will then have to pop more).

The error argument is similarly intended to be mapped to a fixed, normally callee-preserved register, which the caller sets to zero, and the callee sets to nonzero on error. This is so that nonthrowing functions are ABI compatible when used as throwing functions, and so that propagating errors through multiple stack frames can be done with minimal code size cost for the test and early return. For a stack machine, it seems like we don't really gain anything from passing a value in to the callee, since it's always zero or undef, but you could push the error value after the primary return value when the callee returns, so that the caller can easily test it and either pop or return to propagate the error upward.
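
The "return the error next to the primary result" idea can be sketched in C with a two-field result. This is only an analogy for a multi-value return on a stack machine; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Primary return value plus an error value; error == 0 means success. */
typedef struct {
    int value;
    int error;
} Result;

Result parse_digit(char c) {
    Result r = { 0, 0 };
    if (c < '0' || c > '9') {
        r.error = 1;
        return r;
    }
    r.value = c - '0';
    return r;
}

/* The caller tests the error after each call and returns early to
   propagate it, with no error pointer passed into the callee. */
Result parse_two_digits(const char *s) {
    assert(s != NULL);
    Result tens = parse_digit(s[0]);
    if (tens.error) return tens;
    Result ones = parse_digit(s[1]);
    if (ones.error) return ones;
    Result r = { tens.value * 10 + ones.value, 0 };
    return r;
}
```

A non-throwing callee would always return a zero error, which keeps it compatible with callers that expect the throwing shape.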

@jckarter

Member

commented May 11, 2019

Given the newness of WASM, it would be nice to instead see if support for this can be added to WASM; the feature is useful and could eventually benefit other things.

@ddunbar Swift's intended use of relative references is to reduce load time for memory-mapped native code binaries. Since wasm AIUI generally goes through another compilation stage on the client, it seems to me like the cost of relocating wouldn't be that big a part of the load time cost. Hopefully the wasm binary format already has a reasonably efficient way of representing references to local symbols...

@zhuowei

Contributor Author

commented May 11, 2019

@jckarter Wasm call instructions seem to consume the arguments from the stack (https://godbolt.org/z/NjLc0f), so it might not matter whether self is pushed first or last, from a code density or performance point of view.

I have no experience working on compilers, though, so that's just my guess. For what it's worth, the JVM, which has a similar stack machine and call semantics, pushes the self pointer first, but I'm not sure why it does that.

@jrose-apple

Member

commented May 13, 2019

Swift's intended use [for relative indirections] is to reduce load time for memory mapped native code binaries. Since wasm AIUI generally goes through another compilation stage on the client, it seems to me like the cost of relocating wouldn't be that big a part of the load time cost.

Some of them do reduce in-memory static data size too, but maybe that's not worth pushing a whole feature through WASM.

zhuowei added some commits Apr 9, 2019

Hack: Remove returnaddress calls emitted by exclusivity checks
WebAssembly doesn't have __builtin_return_address.
WebAssembly: Hack: disable all atomic instructions
This matches what Clang does.

This doesn't check that the target is Wasm; will need to do that.

Patcheng's port seems to have working? atomics, so it's probably possible
to get atomics working, but this is fine for now.

This still doesn't get the stdlib to compile, but at least the error now matches
what I get when I run clang on the -emit-ir LLVM IR output.
Disable generating PC-relative relocations in constant builder
This doesn't take care of all the PC relative relocations in constant builder yet.

This still breaks on

`((.Lgot.$sSTMp.656-($ss18EnumeratedSequenceVMn))-56)+1`

Which sits in rodata. Not sure where that's generated yet
WebAssembly: remove even more PCRel relocations
This removes the PCRel reloc in \01l_type_metadata_table but nowhere else
WebAssembly: remove more PCRel relocations
This at least seems to get rid of all the PCRel relocations emitted.

Now it crashes in:

(($sSTMp)+48)-8
The expr type is 0
expression kind not supported

yay?!

zhuowei and others added some commits Apr 17, 2019

build static stdlibs in build script
WebAssembly only supports static linking (currently; dynamic is coming later)
so we need static stdlib; I need to add that to the CMake scripts later
disable DWARF5 and atomics on WebAssembly
LLVM doesn't support generating DWARF5 debugging data for WebAssembly,
and support for atomics is currently very limited.

Set LLVM target options to avoid generating DWARF5 debug data and lower
atomics to regular load/stores when building for WebAssembly.

This shouldn't affect existing platforms.
WebAssembly: HACK: use llvm-ar from the WASI sdk
Ubuntu's GNU ar doesn't work with wasm-ld (doesn't see any files inside the archive)
[stdlib] Fix return type of swift_{uint64,int64,float*}ToString
The return type of these functions are uint64_t in Stubs.cpp
but UInt in the Swift code; this changes the Swift code to match
the C++ return type.

Found when compiling the stdlib for WebAssembly, which requires that
all return types match: UInt maps to i32 while uint64_t maps to i64,
so functions calling these functions fail the validation.
[stdlib] fix return type of getNumRuntimeFunctionCounters
The return type of getNumRuntimeFunctionCounters is defined as
uint64_t in RuntimeInvocationsTracking.cpp, but it has return type
Int in RuntimeFunctionCounters.swift.

Found when compiling the stdlib for WebAssembly, as WebAssembly validates
return types. uint64_t corresponds to i64, but Int is i32,
so the program fails validation.
WebAssembly: add start/end objects for metadata
The WebAssembly linker, unlike ELF linkers, doesn't define
__start and __stop symbols for sections. So we have to make our own.

If you put an object first in the command line, its symbols are placed first;
same for last in command line. So we make two files.

This is the same method that Swift used to use on ELF platforms.

run ./buildstartend.sh; these two objects must be linked at the start
and the end. see the updated linkPlease.sh file.
Add wasm branch scheme to update-checkout-config
This allows cloning all of the repositories directly with
./swift/utils/update-checkout --clone --scheme wasm
without relying on paths hardcoded in swiftwasm-sdk.
It also clones icu that way as well, as that one is used when building on Linux.
Later we could add "WebAssembly" platform to update-checkout-config.json to clone
it only for WebAssembly and Linux platforms, but for now it's pulled for all
platforms to make things easy.

swiftwasm#1
WebAssembly: fix GOT-relative pointers without PCrel
Previous commits changed a bunch of relative pointers to absolute
in the metadata; at the time I didn't add support for adding tag
bits, which signifies if the pointers point to the .got section
(and thus needs an extra layer of indirection.)

This broke _isCImportedTagType on $ss7UnicodeO5ASCIIOMn,
so this commit adds back tag support for metadata pointers.
Revert "WebAssembly: add logging in swift_getAssociatedTypeWitness"
Don't need the logging anymore.

This reverts commit 7d35644.
Revert "WebAssembly: add a ton of logging when looking up metadata"
Don't need logging anymore.

This reverts commit 065ab6f.

@zhuowei zhuowei force-pushed the swiftwasm:swiftwasm branch from 179b67e to 049121f Jun 9, 2019

kverrier and others added some commits Jun 15, 2019

Merge pull request #4 from kverrier/swiftwasm
Update default checkout scheme to "wasm".