diff --git a/ml-proto/README.md b/ml-proto/README.md index d03ec581d6..ce0917ca36 100644 --- a/ml-proto/README.md +++ b/ml-proto/README.md @@ -5,10 +5,13 @@ This repository implements a prototypical reference interpreter for WebAssembly. Currently, it can * *parse* a simple S-expression format, +* *decode* the binary format (work in progress), * *validate* modules defined in it, -* *execute* invocations to functions exported by a module. +* *execute* invocations to functions exported by a module, +* *encode* the binary format, +* *prettyprint* the S-expression format (work in progress). -The file format is a (very dumb) form of *script* that cannot just define a module, but also batch a sequence of invocations. +The S-expression format is a (very dumb) form of *script* that cannot just define a module, but in fact a sequence of them, and a batch of invocations, assertions, and conversions to each one. As such it is different from the binary format, with the additional functionality purely intended as testing infrastructure. (See [below](#scripts) for details.) The interpreter can also be run as a REPL, allowing to enter pieces of scripts interactively. @@ -61,17 +64,34 @@ Either way, in order to run the test suite you'll need to have Python installed. You can call the executable with ``` -wasm [option] [file ...] +wasm [option | file ...] ``` -where `file` is a script file (see below) to be run. If no file is given, you'll get into the REPL and can enter script commands interactively. You can also get into the REPL by explicitly passing `-` as a file name. You can do that in combination to giving a module file, so that you can then invoke its exports interactively, e.g.: +where `file`, depending on its extension, either should be an S-expression script file (see below) to be run, or a binary module file to be loaded. + +A file prefixed by `-o` is taken to be an output file. Depending on its extension, this will write out the preceding module definition in either S-expression or binary format. This option can be used to convert between the two in both directions, e.g.: ``` -./wasm module.wast - +wasm -d module.wast -o module.wasm +wasm -d module.wasm -o module.wast ``` -Note however that the REPL currently is too dumb to allow multi-line input. :) -See `wasm -h` for (the few) options. +The `-d` option selects "dry mode" and ensures that the input module is not run, even if it has a start section. +In the second case, the produced script contains exactly one module definition (work in progress). + +Finally, the option `-e` allows to provide arbitrary script commands directly on the command line. For example: + +``` +wasm module.wasm -e '(invoke "foo")' +``` + +If neither a file nor any of the previous options is given, you'll land in the REPL and can enter script commands interactively. You can also get into the REPL by explicitly passing `-` as a file name. You can do that in combination to giving a module file, so that you can then invoke its exports interactively, e.g.: + +``` +wasm module.wast - +``` + +See `wasm -h` for (the few) additional options. ## S-Expression Syntax @@ -168,9 +188,13 @@ cmd: ( assert_return_nan (invoke * )) ;; assert return with floating point nan result of invocation ( assert_trap (invoke * ) ) ;; assert invocation traps with given failure string ( assert_invalid ) ;; assert invalid module with given failure string + ( input ) ;; read script or module from file + ( output ) ;; output module to file ``` -Invocation is only possible after a module has been defined. +Commands are executed in sequence. Invocation, assertions, and output apply to the most recently defined module (the _current_ module), and are only possible after a module has been defined. Note that there only ever is one current module, the different module definitions cannot interact. + +The input and output commands determine the requested file format from the file name extension. They can handle both `.wast` and `.wasm` files. In the case of input, a `.wast` script will be recursively executed. Again, this is only a meta-level for testing, and not a part of the language proper. @@ -202,11 +226,15 @@ The implementation consists of the following parts: * *Parser* (`lexer.mll`, `parser.mly`, `desguar.ml[i]`). Generated with ocamllex and ocamlyacc. The lexer does the opcode encoding (non-trivial tokens carry e.g. type information as semantic values, as declared in `parser.mly`), the parser the actual S-expression parsing. The parser generates a full AST that is desugared into the kernel AST in a separate pass. +* *Pretty Printer* (`prettyprint.ml[i]`). Turns a module AST back into the textual S-expression format. (Work in progress) + +* *Decoder*/*Encoder* (`decode.ml[i]`, `encode.ml[i]`). The former (work in progress) parses the binary format and turns it into an AST, the latter does the inverse. + * *Validator* (`check.ml[i]`). Does a recursive walk of the kernel AST, passing down the *expected* type for expressions, and checking each expression against that. An expected empty type can be matched by any result, corresponding to implicit dropping of unused values (e.g. in a block). * *Evaluator* (`eval.ml[i]`, `values.ml`, `arithmetic.ml[i]`, `int.ml`, `float.ml`, `memory.ml[i]`, and a few more). Evaluation of control transfer (`br` and `return`) is implemented using local exceptions as "labels". While these are allocated dynamically in the code and addressed via a stack, that is merely to simplify the code. In reality, these would be static jumps. -* *Driver* (`main.ml`, `script.ml[i]`, `error.ml`, `print.ml[i]`, `flags.ml`). Executes scripts, reports results or errors, etc. +* *Driver* (`main.ml`, `run.ml[i]`, `script.ml[i]`, `error.ml`, `print.ml[i]`, `flags.ml`). Executes scripts, reports results or errors, etc. The most relevant pieces are probably the validator (`check.ml`) and the evaluator (`eval.ml`). They are written to look as much like a "specification" as possible. Hopefully, the code is fairly self-explanatory, at least for those with a passing familiarity with functional programming. @@ -215,6 +243,6 @@ In typical FP convention (and for better readability), the code tends to use sin ## What Next? -* Binary format as input and output. +* More tests. -* Compilation to JS/asm.js. +* Compilation to JS/asm.js? diff --git a/ml-proto/given/lib.ml b/ml-proto/given/lib.ml index d6bf1b3d16..a7408a312f 100644 --- a/ml-proto/given/lib.ml +++ b/ml-proto/given/lib.ml @@ -1,5 +1,8 @@ module List = struct + let rec make n x = + if n = 0 then [] else x :: make (n - 1) x + let rec take n xs = match n, xs with | 0, _ -> [] diff --git a/ml-proto/given/lib.mli b/ml-proto/given/lib.mli index 3428f95bc3..cef55599b4 100644 --- a/ml-proto/given/lib.mli +++ b/ml-proto/given/lib.mli @@ -2,6 +2,7 @@ module List : sig + val make : int -> 'a -> 'a list val take : int -> 'a list -> 'a list val drop : int -> 'a list -> 'a list diff --git a/ml-proto/given/source.ml b/ml-proto/given/source.ml index 93ef7de420..c00422a674 100644 --- a/ml-proto/given/source.ml +++ b/ml-proto/given/source.ml @@ -1,6 +1,6 @@ type pos = {file : string; line : int; column : int} type region = {left : pos; right : pos} -type 'a phrase = { at : region; it : 'a} +type 'a phrase = {at : region; it : 'a} (* Positions and regions *) @@ -9,9 +9,14 @@ let no_pos = {file = ""; line = 0; column = 0} let no_region = {left = no_pos; right = no_pos} let string_of_pos pos = - string_of_int pos.line ^ "." ^ string_of_int (pos.column + 1) + if pos.line = -1 then + string_of_int pos.column + else + string_of_int pos.line ^ "." ^ string_of_int (pos.column + 1) + let string_of_region r = - r.left.file ^ ":" ^ string_of_pos r.left ^ "-" ^ string_of_pos r.right + r.left.file ^ ":" ^ string_of_pos r.left ^ + (if r.right = r.left then "" else "-" ^ string_of_pos r.right) let before region = {left = region.left; right = region.left} let after region = {left = region.right; right = region.right} diff --git a/ml-proto/host/encode.ml b/ml-proto/host/encode.ml new file mode 100644 index 0000000000..c92d907aae --- /dev/null +++ b/ml-proto/host/encode.ml @@ -0,0 +1,396 @@ +(* Version *) + +let version = 0x0b + + +(* Encoding stream *) + +type stream = +{ + buf : Buffer.t; + patches : (int * char) list ref +} + +let stream () = {buf = Buffer.create 8192; patches = ref []} +let pos s = Buffer.length s.buf +let put s b = Buffer.add_char s.buf b +let put_string s bs = Buffer.add_string s.buf bs +let patch s pos b = s.patches := (pos, b) :: !(s.patches) + +let to_string s = + let bs = Buffer.to_bytes s.buf in + List.iter (fun (pos, b) -> Bytes.set bs pos b) !(s.patches); + Bytes.to_string bs + + +(* Encoding *) + +let encode m = + let s = stream () in + + let module E = struct + (* Generic values *) + + let u8 i = put s (Char.chr (i land 0xff)) + let u16 i = u8 (i land 0xff); u8 (i lsr 8) + let u32 i = + Int32.(u16 (to_int (logand i 0xffffl)); + u16 (to_int (shift_right i 16))) + let u64 i = + Int64.(u32 (to_int32 (logand i 0xffffffffL)); + u32 (to_int32 (shift_right i 32))) + + let rec vu64 i = + let b = Int64.(to_int (logand i 0x7fL)) in + if i < 128L then u8 b + else (u8 (b lor 0x80); vu64 (Int64.shift_right i 7)) + + let rec vs64 i = + let b = Int64.(to_int (logand i 0x7fL)) in + if -64L <= i && i < 64L then u8 b + else (u8 (b lor 0x80); vs64 (Int64.shift_right i 7)) + + let vu32 i = vu64 (Int64.of_int32 i) + let vs32 i = vs64 (Int64.of_int32 i) + let vu i = vu64 (Int64.of_int i) + let f32 x = u32 (F32.to_bits x) + let f64 x = u64 (F64.to_bits x) + + let bool b = u8 (if b then 1 else 0) + let string bs = vu (String.length bs); put_string s bs + let list f xs = List.iter f xs + let opt f xo = Lib.Option.app f xo + let vec f xs = vu (List.length xs); list f xs + let vec1 f xo = bool (xo <> None); opt f xo + + let gap () = let p = pos s in u32 0l; p + let patch_gap p n = + assert (n <= 0x0fff_ffff); (* Strings cannot excess 2G anyway *) + let lsb i = Char.chr (i land 0xff) in + patch s p (lsb (n lor 0x80)); + patch s (p + 1) (lsb ((n lsr 7) lor 0x80)); + patch s (p + 2) (lsb ((n lsr 14) lor 0x80)); + patch s (p + 3) (lsb (n lsr 21)) + + (* Types *) + + open Types + + let value_type = function + | Int32Type -> u8 0x01 + | Int64Type -> u8 0x02 + | Float32Type -> u8 0x03 + | Float64Type -> u8 0x04 + + let expr_type t = vec1 value_type t + + let func_type = function + | {ins; out} -> u8 0x05; vec value_type ins; expr_type out + + (* Expressions *) + + open Source + open Kernel + open Ast + + let op n = u8 n + let arity xs = vu (List.length xs) + let arity1 xo = bool (xo <> None) + + let memop off align = vu align; vu64 off (*TODO: to be resolved*) + + let var x = vu x.it + let var32 x = vu32 (Int32.of_int x.it) + + let rec expr e = + match e.it with + | Nop -> op 0x00 + | Block es -> op 0x01; list expr es; op 0x17 + | Loop es -> op 0x02; list expr es; op 0x17 + | If (e, es1, es2) -> + expr e; op 0x03; list expr es1; + if es2 <> [] then op 0x04; list expr es2; op 0x17 + | Select (e1, e2, e3) -> expr e1; expr e2; expr e3; op 0x05 + | Br (x, eo) -> opt expr eo; op 0x06; arity1 eo; var x + | Br_if (x, eo, e) -> opt expr eo; expr e; op 0x07; arity1 eo; var x + | Br_table (xs, x, eo, e) -> + opt expr eo; expr e; op 0x08; arity1 eo; vec var32 xs; var32 x + + | Ast.I32_const c -> op 0x0a; vs32 c.it + | Ast.I64_const c -> op 0x0b; vs64 c.it + | Ast.F32_const c -> op 0x0c; f32 c.it + | Ast.F64_const c -> op 0x0d; f64 c.it + + | Ast.Get_local x -> op 0x0e; var x + | Ast.Set_local (x, e) -> unary e 0x0f; var x + + | Ast.Call (x, es) -> nary es 0x12; var x + | Ast.Call_import (x, es) -> nary es 0x1f; var x + | Ast.Call_indirect (x, e, es) -> expr e; nary es 0x13; var x + | Ast.Return eo -> nary1 eo 0x14 + | Ast.Unreachable -> op 0x15 + + | I32_load8_s (o, a, e) -> unary e 0x20; memop o a + | I32_load8_u (o, a, e) -> unary e 0x21; memop o a + | I32_load16_s (o, a, e) -> unary e 0x22; memop o a + | I32_load16_u (o, a, e) -> unary e 0x23; memop o a + | I64_load8_s (o, a, e) -> unary e 0x24; memop o a + | I64_load8_u (o, a, e) -> unary e 0x25; memop o a + | I64_load16_s (o, a, e) -> unary e 0x26; memop o a + | I64_load16_u (o, a, e) -> unary e 0x27; memop o a + | I64_load32_s (o, a, e) -> unary e 0x28; memop o a + | I64_load32_u (o, a, e) -> unary e 0x29; memop o a + | I32_load (o, a, e) -> unary e 0x2a; memop o a + | I64_load (o, a, e) -> unary e 0x2b; memop o a + | F32_load (o, a, e) -> unary e 0x2c; memop o a + | F64_load (o, a, e) -> unary e 0x2d; memop o a + + | I32_store8 (o, a, e1, e2) -> binary e1 e2 0x2e; memop o a + | I32_store16 (o, a, e1, e2) -> binary e1 e2 0x2f; memop o a + | I64_store8 (o, a, e1, e2) -> binary e1 e2 0x30; memop o a + | I64_store16 (o, a, e1, e2) -> binary e1 e2 0x31; memop o a + | I64_store32 (o, a, e1, e2) -> binary e1 e2 0x32; memop o a + | I32_store (o, a, e1, e2) -> binary e1 e2 0x33; memop o a + | I64_store (o, a, e1, e2) -> binary e1 e2 0x34; memop o a + | F32_store (o, a, e1, e2) -> binary e1 e2 0x35; memop o a + | F64_store (o, a, e1, e2) -> binary e1 e2 0x36; memop o a + + | Grow_memory e -> unary e 0x39 + | Current_memory -> op 0x3b + + | I32_add (e1, e2) -> binary e1 e2 0x40 + | I32_sub (e1, e2) -> binary e1 e2 0x41 + | I32_mul (e1, e2) -> binary e1 e2 0x42 + | I32_div_s (e1, e2) -> binary e1 e2 0x43 + | I32_div_u (e1, e2) -> binary e1 e2 0x44 + | I32_rem_s (e1, e2) -> binary e1 e2 0x45 + | I32_rem_u (e1, e2) -> binary e1 e2 0x46 + | I32_and (e1, e2) -> binary e1 e2 0x47 + | I32_or (e1, e2) -> binary e1 e2 0x48 + | I32_xor (e1, e2) -> binary e1 e2 0x49 + | I32_shl (e1, e2) -> binary e1 e2 0x4a + | I32_shr_u (e1, e2) -> binary e1 e2 0x4b + | I32_shr_s (e1, e2) -> binary e1 e2 0x4c + | I32_rotl (e1, e2) -> binary e1 e2 0xb6 + | I32_rotr (e1, e2) -> binary e1 e2 0xb7 + | I32_eq (e1, e2) -> binary e1 e2 0x4d + | I32_ne (e1, e2) -> binary e1 e2 0x4e + | I32_lt_s (e1, e2) -> binary e1 e2 0x4f + | I32_le_s (e1, e2) -> binary e1 e2 0x50 + | I32_lt_u (e1, e2) -> binary e1 e2 0x51 + | I32_le_u (e1, e2) -> binary e1 e2 0x52 + | I32_gt_s (e1, e2) -> binary e1 e2 0x53 + | I32_ge_s (e1, e2) -> binary e1 e2 0x54 + | I32_gt_u (e1, e2) -> binary e1 e2 0x55 + | I32_ge_u (e1, e2) -> binary e1 e2 0x56 + | I32_clz e -> unary e 0x57 + | I32_ctz e -> unary e 0x58 + | I32_popcnt e -> unary e 0x59 + | I32_eqz e -> unary e 0x5a + + | I64_add (e1, e2) -> binary e1 e2 0x5b + | I64_sub (e1, e2) -> binary e1 e2 0x5c + | I64_mul (e1, e2) -> binary e1 e2 0x5d + | I64_div_s (e1, e2) -> binary e1 e2 0x5e + | I64_div_u (e1, e2) -> binary e1 e2 0x5f + | I64_rem_s (e1, e2) -> binary e1 e2 0x60 + | I64_rem_u (e1, e2) -> binary e1 e2 0x61 + | I64_and (e1, e2) -> binary e1 e2 0x62 + | I64_or (e1, e2) -> binary e1 e2 0x63 + | I64_xor (e1, e2) -> binary e1 e2 0x64 + | I64_shl (e1, e2) -> binary e1 e2 0x65 + | I64_shr_u (e1, e2) -> binary e1 e2 0x66 + | I64_shr_s (e1, e2) -> binary e1 e2 0x67 + | I64_rotl (e1, e2) -> binary e1 e2 0xb8 + | I64_rotr (e1, e2) -> binary e1 e2 0xb9 + | I64_eq (e1, e2) -> binary e1 e2 0x68 + | I64_ne (e1, e2) -> binary e1 e2 0x69 + | I64_lt_s (e1, e2) -> binary e1 e2 0x6a + | I64_le_s (e1, e2) -> binary e1 e2 0x6b + | I64_lt_u (e1, e2) -> binary e1 e2 0x6c + | I64_le_u (e1, e2) -> binary e1 e2 0x6d + | I64_gt_s (e1, e2) -> binary e1 e2 0x6e + | I64_ge_s (e1, e2) -> binary e1 e2 0x6f + | I64_gt_u (e1, e2) -> binary e1 e2 0x70 + | I64_ge_u (e1, e2) -> binary e1 e2 0x71 + | I64_clz e -> unary e 0x72 + | I64_ctz e -> unary e 0x73 + | I64_popcnt e -> unary e 0x74 + | I64_eqz e -> unary e 0xba + + | F32_add (e1, e2) -> binary e1 e2 0x75 + | F32_sub (e1, e2) -> binary e1 e2 0x76 + | F32_mul (e1, e2) -> binary e1 e2 0x77 + | F32_div (e1, e2) -> binary e1 e2 0x78 + | F32_min (e1, e2) -> binary e1 e2 0x79 + | F32_max (e1, e2) -> binary e1 e2 0x7a + | F32_abs e -> unary e 0x7b + | F32_neg e -> unary e 0x7c + | F32_copysign (e1, e2) -> binary e1 e2 0x7d + | F32_ceil e -> unary e 0x7e + | F32_floor e -> unary e 0x7f + | F32_trunc e -> unary e 0x80 + | F32_nearest e -> unary e 0x81 + | F32_sqrt e -> unary e 0x82 + | F32_eq (e1, e2) -> binary e1 e2 0x83 + | F32_ne (e1, e2) -> binary e1 e2 0x84 + | F32_lt (e1, e2) -> binary e1 e2 0x85 + | F32_le (e1, e2) -> binary e1 e2 0x86 + | F32_gt (e1, e2) -> binary e1 e2 0x87 + | F32_ge (e1, e2) -> binary e1 e2 0x88 + + | F64_add (e1, e2) -> binary e1 e2 0x89 + | F64_sub (e1, e2) -> binary e1 e2 0x8a + | F64_mul (e1, e2) -> binary e1 e2 0x8b + | F64_div (e1, e2) -> binary e1 e2 0x8c + | F64_min (e1, e2) -> binary e1 e2 0x8d + | F64_max (e1, e2) -> binary e1 e2 0x8e + | F64_abs e -> unary e 0x8f + | F64_neg e -> unary e 0x90 + | F64_copysign (e1, e2) -> binary e1 e2 0x91 + | F64_ceil e -> unary e 0x92 + | F64_floor e -> unary e 0x93 + | F64_trunc e -> unary e 0x94 + | F64_nearest e -> unary e 0x95 + | F64_sqrt e -> unary e 0x96 + | F64_eq (e1, e2) -> binary e1 e2 0x97 + | F64_ne (e1, e2) -> binary e1 e2 0x98 + | F64_lt (e1, e2) -> binary e1 e2 0x99 + | F64_le (e1, e2) -> binary e1 e2 0x9a + | F64_gt (e1, e2) -> binary e1 e2 0x9b + | F64_ge (e1, e2) -> binary e1 e2 0x9c + + | I32_trunc_s_f32 e -> unary e 0x9d + | I32_trunc_s_f64 e -> unary e 0x9e + | I32_trunc_u_f32 e -> unary e 0x9f + | I32_trunc_u_f64 e -> unary e 0xa0 + | I32_wrap_i64 e -> unary e 0xa1 + | I64_trunc_s_f32 e -> unary e 0xa2 + | I64_trunc_s_f64 e -> unary e 0xa3 + | I64_trunc_u_f32 e -> unary e 0xa4 + | I64_trunc_u_f64 e -> unary e 0xa5 + | I64_extend_s_i32 e -> unary e 0xa6 + | I64_extend_u_i32 e -> unary e 0xa7 + | F32_convert_s_i32 e -> unary e 0xa8 + | F32_convert_u_i32 e -> unary e 0xa9 + | F32_convert_s_i64 e -> unary e 0xaa + | F32_convert_u_i64 e -> unary e 0xab + | F32_demote_f64 e -> unary e 0xac + | F32_reinterpret_i32 e -> unary e 0xad + | F64_convert_s_i32 e -> unary e 0xae + | F64_convert_u_i32 e -> unary e 0xaf + | F64_convert_s_i64 e -> unary e 0xb0 + | F64_convert_u_i64 e -> unary e 0xb1 + | F64_promote_f32 e -> unary e 0xb2 + | F64_reinterpret_i64 e -> unary e 0xb3 + | I32_reinterpret_f32 e -> unary e 0xb4 + | I64_reinterpret_f64 e -> unary e 0xb5 + + and unary e o = expr e; op o + and binary e1 e2 o = expr e1; expr e2; op o + and nary es o = list expr es; op o; arity es + and nary1 eo o = opt expr eo; op o; arity1 eo + + (* Sections *) + + let section id f x needed = + if needed then begin + let g = gap () in + let p = pos s in + string id; + f x; + patch_gap g (pos s - p) + end + + (* Type section *) + let type_section ts = + section "type" (vec func_type) ts (ts <> []) + + (* Import section *) + let import imp = + let {itype; module_name; func_name} = imp.it in + var itype; string module_name; string func_name + + let import_section imps = + section "import" (vec import) imps (imps <> []) + + (* Function section *) + let func f = var f.it.ftype + + let func_section fs = + section "function" (vec func) fs (fs <> []) + + (* Table section *) + let table_section tab = + section "table" (vec var) tab (tab <> []) + + (* Memory section *) + let memory mem = + let {min; max; _} = mem.it in + vu64 min; vu64 max; bool true (*TODO: pending change*) + + let memory_section memo = + section "memory" (opt memory) memo (memo <> None) + + (* Export section *) + let export exp = + let {Kernel.name; kind} = exp.it in + (match kind with + | `Func x -> var x + | `Memory -> () (*TODO: pending resolution*) + ); string name + + let export_section exps = + (*TODO: pending resolution*) + let exps = List.filter (fun exp -> exp.it.kind <> `Memory) exps in + section "export" (vec export) exps (exps <> []) + + (* Start section *) + let start_section xo = + section "start" (opt var) xo (xo <> None) + + (* Code section *) + let compress locals = + let combine t = function + | (t', n) :: ts when t = t' -> (t, n + 1) :: ts + | ts -> (t, 1) :: ts + in List.fold_right combine locals [] + + let local (t, n) = vu n; value_type t + + let code f = + let {locals; body; _} = f.it in + vec local (compress locals); + let g = gap () in + let p = pos s in + list expr body; + patch_gap g (pos s - p) + + let code_section fs = + section "code" (vec code) fs (fs <> []) + + (* Data section *) + let segment seg = + let {Memory.addr; data} = seg.it in + vu64 addr; string data + + let data_section segs = + section "data" (opt (vec segment)) + segs (segs <> None && segs <> Some []) + + (* Module *) + + let module_ m = + u32 0x6d736100l; + u32 (Int32.of_int version); + type_section m.it.types; + import_section m.it.imports; + func_section m.it.funcs; + table_section m.it.table; + memory_section m.it.memory; + export_section m.it.exports; + start_section m.it.start; + code_section m.it.funcs; + data_section (Lib.Option.map (fun mem -> mem.it.segments) m.it.memory) + end + in E.module_ m; to_string s diff --git a/ml-proto/host/encode.mli b/ml-proto/host/encode.mli new file mode 100644 index 0000000000..6263761d63 --- /dev/null +++ b/ml-proto/host/encode.mli @@ -0,0 +1 @@ +val encode : Ast.module_ -> string diff --git a/ml-proto/host/lexer.mll b/ml-proto/host/lexer.mll index 2d0c5fd9e0..ab49531b0e 100644 --- a/ml-proto/host/lexer.mll +++ b/ml-proto/host/lexer.mll @@ -169,15 +169,15 @@ rule token = parse (F32_store (o, (Lib.Option.get a 4), e1, e2)) (F64_store (o, (Lib.Option.get a 8), e1, e2))) } | (ixx as t)".load"(mem_size as sz)"_"(sign as s) - { LOAD (fun (o, a, e) -> + { if t = "i32" && sz = "32" then error lexbuf "unknown opcode"; + LOAD (fun (o, a, e) -> intop t (memsz sz (ext s (I32_load8_s (o, (Lib.Option.get a 1), e)) (I32_load8_u (o, (Lib.Option.get a 1), e))) (ext s (I32_load16_s (o, (Lib.Option.get a 2), e)) (I32_load16_u (o, (Lib.Option.get a 2), e))) - (ext s (I32_load32_s (o, (Lib.Option.get a 4), e)) - (I32_load32_u (o, (Lib.Option.get a 4), e)))) + Unreachable) (memsz sz (ext s (I64_load8_s (o, (Lib.Option.get a 1), e)) (I64_load8_u (o, (Lib.Option.get a 1), e))) @@ -186,12 +186,13 @@ rule token = parse (ext s (I64_load32_s (o, (Lib.Option.get a 4), e)) (I64_load32_u (o, (Lib.Option.get a 4), e))))) } | (ixx as t)".store"(mem_size as sz) - { STORE (fun (o, a, e1, e2) -> + { if t = "i32" && sz = "32" then error lexbuf "unknown opcode"; + STORE (fun (o, a, e1, e2) -> intop t (memsz sz (I32_store8 (o, (Lib.Option.get a 1), e1, e2)) (I32_store16 (o, (Lib.Option.get a 2), e1, e2)) - (I32_store32 (o, (Lib.Option.get a 4), e1, e2))) + Unreachable) (memsz sz (I64_store8 (o, (Lib.Option.get a 1), e1, e2)) (I64_store16 (o, (Lib.Option.get a 2), e1, e2)) @@ -366,6 +367,8 @@ rule token = parse | "assert_return_nan" { ASSERT_RETURN_NAN } | "assert_trap" { ASSERT_TRAP } | "invoke" { INVOKE } + | "input" { INPUT } + | "output" { OUTPUT } | name as s { VAR s } diff --git a/ml-proto/host/main.ml b/ml-proto/host/main.ml index 87b1b7c448..ea309866b4 100644 --- a/ml-proto/host/main.ml +++ b/ml-proto/host/main.ml @@ -1,5 +1,5 @@ let name = "wasm" -let version = "0.2" +let version = "0.3" let configure () = Import.register "spectest" Spectest.lookup; @@ -8,82 +8,20 @@ let configure () = let banner () = print_endline (name ^ " " ^ version ^ " spec interpreter") -let parse name lexbuf start = - lexbuf.Lexing.lex_curr_p <- - {lexbuf.Lexing.lex_curr_p with Lexing.pos_fname = name}; - try start Lexer.token lexbuf with Script.Syntax (region, s) -> - let region' = if region <> Source.no_region then region else - {Source.left = Lexer.convert_pos lexbuf.Lexing.lex_start_p; - Source.right = Lexer.convert_pos lexbuf.Lexing.lex_curr_p} in - raise (Script.Syntax (region', s)) - -let error at category msg = - Script.trace ("Error (" ^ category ^ "): "); - prerr_endline (Source.string_of_region at ^ ": " ^ msg); - false - -let process file lexbuf start = - try - let script = parse file lexbuf start in - Script.trace "Desugaring..."; - let script' = Script.desugar script in - Script.trace "Running..."; - Script.run script'; - true - with - | Script.Syntax (at, msg) -> error at "syntax error" msg - | Script.AssertFailure (at, msg) -> error at "assertion failure" msg - | Check.Invalid (at, msg) -> error at "invalid module" msg - | Eval.Trap (at, msg) -> error at "runtime trap" msg - | Eval.Crash (at, msg) -> error at "runtime crash" msg - | Import.Unknown (at, msg) -> error at "unknown import" msg - -let process_file file = - Script.trace ("Loading (" ^ file ^ ")..."); - let ic = open_in file in - try - let lexbuf = Lexing.from_channel ic in - Script.trace "Parsing..."; - let success = process file lexbuf Parser.script in - close_in ic; - if not success then exit 1 - with exn -> close_in ic; raise exn - -let continuing = ref false - -let lexbuf_stdin buf len = - let prompt = if !continuing then " " else "> " in - print_string prompt; flush_all (); - continuing := true; - let rec loop i = - if i = len then i else - let ch = input_char stdin in - Bytes.set buf i ch; - if ch = '\n' then i + 1 else loop (i + 1) - in - let n = loop 0 in - if n = 1 then continuing := false else Script.trace "Parsing..."; - n +let usage = "Usage: " ^ name ^ " [option] [file ...]" -let rec process_stdin () = - banner (); - let lexbuf = Lexing.from_function lexbuf_stdin in - let rec loop () = - let success = process "stdin" lexbuf Parser.script1 in - if not success then Lexing.flush_input lexbuf; - if Lexing.(lexbuf.lex_curr_pos >= lexbuf.lex_buffer_len - 1) then - continuing := false; - loop () - in - try loop () with End_of_file -> - print_endline ""; - Script.trace "Bye." +let args = ref [] +let add_arg source = args := !args @ [source] -let usage = "Usage: " ^ name ^ " [option] [file ...]" let argspec = Arg.align [ "-", Arg.Set Flags.interactive, " run interactively (default if no files given)"; + "-e", Arg.String add_arg, " evaluate string"; + "-i", Arg.String (fun file -> add_arg ("(input \"" ^ file ^ "\")")), + " read script from file"; + "-o", Arg.String (fun file -> add_arg ("(output \"" ^ file ^ "\")")), + " write module to file"; "-s", Arg.Set Flags.print_sig, " show module signatures"; "-d", Arg.Set Flags.dry, " dry, do not run program"; "-t", Arg.Set Flags.trace, " trace execution"; @@ -94,11 +32,12 @@ let () = Printexc.record_backtrace true; try configure (); - let files = ref [] in - Arg.parse argspec (fun file -> files := !files @ [file]) usage; - if !files = [] then Flags.interactive := true; - List.iter process_file !files; - if !Flags.interactive then process_stdin () + Arg.parse argspec (fun file -> add_arg ("(input \"" ^ file ^ "\")")) usage; + List.iter (fun arg -> if not (Run.run_string arg) then exit 1) !args; + if !Flags.interactive || !args = [] then begin + banner (); + Run.run_stdin () + end with exn -> flush_all (); prerr_endline diff --git a/ml-proto/host/parse.ml b/ml-proto/host/parse.ml new file mode 100644 index 0000000000..09f3f3f3b5 --- /dev/null +++ b/ml-proto/host/parse.ml @@ -0,0 +1,28 @@ +type 'a start = + | Module : Ast.module_ start + | Script : Script.script start + | Script1 : Script.script start + +exception Syntax = Script.Syntax + +let parse' name lexbuf start = + lexbuf.Lexing.lex_curr_p <- + {lexbuf.Lexing.lex_curr_p with Lexing.pos_fname = name}; + try start Lexer.token lexbuf + with Syntax (region, s) -> + let region' = if region <> Source.no_region then region else + {Source.left = Lexer.convert_pos lexbuf.Lexing.lex_start_p; + Source.right = Lexer.convert_pos lexbuf.Lexing.lex_curr_p} in + raise (Syntax (region', s)) + +let parse (type a) name lexbuf : a start -> a = function + | Module -> parse' name lexbuf Parser.module1 + | Script -> parse' name lexbuf Parser.script + | Script1 -> parse' name lexbuf Parser.script1 + +let string_to start s = + let lexbuf = Lexing.from_string s in + parse "string" lexbuf start + +let string_to_script s = string_to Script s +let string_to_module s = string_to Module s diff --git a/ml-proto/host/parse.mli b/ml-proto/host/parse.mli new file mode 100644 index 0000000000..afba014fa9 --- /dev/null +++ b/ml-proto/host/parse.mli @@ -0,0 +1,11 @@ +type 'a start = + | Module : Ast.module_ start + | Script : Script.script start + | Script1 : Script.script start + +exception Syntax of Source.region * string + +val parse : string -> Lexing.lexbuf -> 'a start -> 'a (* raise Syntax *) + +val string_to_script : string -> Script.script (* raise Syntax *) +val string_to_module : string -> Ast.module_ (* raise Syntax *) diff --git a/ml-proto/host/parser.mly b/ml-proto/host/parser.mly index 529a4fbfa7..e72a360693 100644 --- a/ml-proto/host/parser.mly +++ b/ml-proto/host/parser.mly @@ -133,6 +133,7 @@ let implicit_decl c t at = %token FUNC START TYPE PARAM RESULT LOCAL %token MODULE MEMORY SEGMENT IMPORT EXPORT TABLE %token ASSERT_INVALID ASSERT_RETURN ASSERT_RETURN_NAN ASSERT_TRAP INVOKE +%token INPUT OUTPUT %token EOF %token INT @@ -154,9 +155,10 @@ let implicit_decl c t at = %nonassoc LOW %nonassoc VAR -%start script script1 +%start script script1 module1 %type script %type script1 +%type module1 %% @@ -336,10 +338,10 @@ segment_list : memory : | LPAR MEMORY INT INT segment_list RPAR - { {initial = Int64.of_string $3; max = Int64.of_string $4; segments = $5} + { {min = Int64.of_string $3; max = Int64.of_string $4; segments = $5} @@ at () } | LPAR MEMORY INT segment_list RPAR - { {initial = Int64.of_string $3; max = Int64.of_string $3; segments = $4} + { {min = Int64.of_string $3; max = Int64.of_string $3; segments = $4} @@ at () } ; @@ -376,9 +378,9 @@ import : export : | LPAR EXPORT TEXT var RPAR - { let at = at () in fun c -> ExportFunc ($3, ($4 c func)) @@ at } + { let at = at () in fun c -> {name = $3; kind = `Func ($4 c func)} @@ at } | LPAR EXPORT TEXT MEMORY RPAR - { let at = at () in fun c -> ExportMemory $3 @@ at } + { let at = at () in fun c -> {name = $3; kind = `Memory} @@ at } ; module_fields : @@ -426,6 +428,8 @@ cmd : { AssertReturnNaN ($5, $6) @@ at () } | LPAR ASSERT_TRAP LPAR INVOKE TEXT const_list RPAR TEXT RPAR { AssertTrap ($5, $6, $8) @@ at () } + | LPAR INPUT TEXT RPAR { Input $3 @@ at () } + | LPAR OUTPUT TEXT RPAR { Output $3 @@ at () } ; cmd_list : | /* empty */ { [] } @@ -450,4 +454,7 @@ script : script1 : | cmd { [$1] } ; +module1 : + | module_ EOF { $1 } +; %% diff --git a/ml-proto/host/print.ml b/ml-proto/host/print.ml index bd64e9be78..a258e55f70 100644 --- a/ml-proto/host/print.ml +++ b/ml-proto/host/print.ml @@ -21,11 +21,13 @@ let print_var_sig prefix i t = let print_func_sig m prefix i f = printf "%s %d : %s\n" prefix i (string_of_func_type (func_type m f)) -let print_export_sig m n f = - printf "export \"%s\" : %s\n" n (string_of_func_type (func_type m f)) - -let print_export_mem n = - printf "export \"%s\" : memory\n" n +let print_export m i ex = + let {name; kind} = ex.it in + let ascription = + match kind with + | `Func x -> string_of_func_type (func_type m (List.nth m.it.funcs x.it)) + | `Memory -> "memory" + in printf "export \"%s\" : %s\n" name ascription let print_table_elem i x = printf "table [%d] = func %d\n" i x.it @@ -33,16 +35,12 @@ let print_table_elem i x = let print_start start = Lib.Option.app (fun x -> printf "start = func %d\n" x.it) start + (* Ast *) let print_func m i f = print_func_sig m "func" i f -let print_export m i ex = - match ex.it with - | ExportFunc (n, x) -> print_export_sig m n (List.nth m.it.funcs x.it) - | ExportMemory n -> print_export_mem n - let print_module m = let {funcs; start; exports; table} = m.it in List.iteri (print_func m) funcs; diff --git a/ml-proto/host/run.ml b/ml-proto/host/run.ml new file mode 100644 index 0000000000..87a299d768 --- /dev/null +++ b/ml-proto/host/run.ml @@ -0,0 +1,130 @@ +(* File types *) + +let sexpr_ext = "wast" +let binary_ext = "wasm" + +let dispatch_file_ext on_sexpr on_binary file = + if Filename.check_suffix file sexpr_ext then + on_sexpr file + else if Filename.check_suffix file binary_ext then + on_binary file + else + raise (Sys_error (file ^ ": Unrecognized file type")) + + +(* Input *) + +let error at category msg = + Script.trace ("Error (" ^ category ^ "): "); + prerr_endline (Source.string_of_region at ^ ": " ^ msg); + false + +let run_from get_script = + try + let script = get_script () in + Script.trace "Running..."; + Script.run script; + true + with + | Decode.Code (at, msg) -> error at "decoding error" msg + | Parse.Syntax (at, msg) -> error at "syntax error" msg + | Script.Assert (at, msg) -> error at "assertion failure" msg + | Check.Invalid (at, msg) -> error at "invalid module" msg + | Eval.Trap (at, msg) -> error at "runtime trap" msg + | Eval.Crash (at, msg) -> error at "runtime crash" msg + | Import.Unknown (at, msg) -> error at "unknown import" msg + | Script.IO (at, msg) -> error at "i/o error" msg + | Script.Abort _ -> false + +let run_sexpr name lexbuf start = + run_from (fun _ -> Parse.parse name lexbuf start) + +let run_binary name buf = + let open Source in + run_from + (fun _ -> let m = Decode.decode name buf in [Script.Define m @@ m.at]) + +let run_sexpr_file file = + Script.trace ("Loading (" ^ file ^ ")..."); + let ic = open_in file in + try + let lexbuf = Lexing.from_channel ic in + Script.trace "Parsing..."; + let success = run_sexpr file lexbuf Parse.Script in + close_in ic; + success + with exn -> close_in ic; raise exn + +let run_binary_file file = + Script.trace ("Loading (" ^ file ^ ")..."); + let ic = open_in_bin file in + try + let len = in_channel_length ic in + let buf = Bytes.make len '\x00' in + really_input ic buf 0 len; + Script.trace "Decoding..."; + let success = run_binary file buf in + close_in ic; + success + with exn -> close_in ic; raise exn + +let run_file = dispatch_file_ext run_sexpr_file run_binary_file + +let run_string string = + Script.trace ("Running (\"" ^ String.escaped string ^ "\")..."); + let lexbuf = Lexing.from_string string in + Script.trace "Parsing..."; + run_sexpr "string" lexbuf Parse.Script + +let () = Script.input_file := run_file + + +(* Interactive *) + +let continuing = ref false + +let lexbuf_stdin buf len = + let prompt = if !continuing then " " else "> " in + print_string prompt; flush_all (); + continuing := true; + let rec loop i = + if i = len then i else + let ch = input_char stdin in + Bytes.set buf i ch; + if ch = '\n' then i + 1 else loop (i + 1) + in + let n = loop 0 in + if n = 1 then continuing := false else Script.trace "Parsing..."; + n + +let rec run_stdin () = + let lexbuf = Lexing.from_function lexbuf_stdin in + let rec loop () = + let success = run_sexpr "stdin" lexbuf Parse.Script1 in + if not success then Lexing.flush_input lexbuf; + if Lexing.(lexbuf.lex_curr_pos >= lexbuf.lex_buffer_len - 1) then + continuing := false; + loop () + in + try loop () with End_of_file -> + print_endline ""; + Script.trace "Bye." + + +(* Output *) + +let create_sexpr_file file m = + () (*TODO: pretty-print*) + +let create_binary_file file m = + Script.trace ("Encoding (" ^ file ^ ")..."); + let s = Encode.encode m in + let oc = open_out_bin file in + try + Script.trace "Writing..."; + output_string oc s; + close_out oc + with exn -> close_out oc; raise exn + +let create_file = dispatch_file_ext create_sexpr_file create_binary_file +let () = Script.output_file := create_file diff --git a/ml-proto/host/run.mli b/ml-proto/host/run.mli new file mode 100644 index 0000000000..8b57b619d9 --- /dev/null +++ b/ml-proto/host/run.mli @@ -0,0 +1,5 @@ +val run_string : string -> bool +val run_file : string -> bool +val run_stdin : unit -> unit + +val create_file : string -> Ast.module_ -> unit diff --git a/ml-proto/host/script.ml b/ml-proto/host/script.ml index b0226e7e11..5327b7bdd8 100644 --- a/ml-proto/host/script.ml +++ b/ml-proto/host/script.ml @@ -3,96 +3,97 @@ open Source (* Script representation *) -type 'm command = 'm command' Source.phrase -and 'm command' = - | Define of 'm +type command = command' Source.phrase +and command' = + | Define of Ast.module_ | Invoke of string * Kernel.literal list - | AssertInvalid of 'm * string + | AssertInvalid of Ast.module_ * string | AssertReturn of string * Kernel.literal list * Kernel.literal option | AssertReturnNaN of string * Kernel.literal list | AssertTrap of string * Kernel.literal list * string + | Input of string + | Output of string -type script = Ast.module_ command list -type script' = Kernel.module_ command list - - -(* Desugaring *) - -let rec desugar_cmd c = desugar_cmd' c.it @@ c.at -and desugar_cmd' = function - | Define m -> Define (Desugar.desugar m) - | Invoke (s, ls) -> Invoke (s, ls) - | AssertInvalid (m, r) -> AssertInvalid (Desugar.desugar m, r) - | AssertReturn (s, ls, lo) -> AssertReturn (s, ls, lo) - | AssertReturnNaN (s, ls) -> AssertReturnNaN (s, ls) - | AssertTrap (s, ls, r) -> AssertTrap (s, ls, r) - -let desugar = List.map desugar_cmd +type script = command list (* Execution *) +module Abort = Error.Make () module Syntax = Error.Make () -module AssertFailure = Error.Make () +module Assert = Error.Make () +module IO = Error.Make () +exception Abort = Abort.Error exception Syntax = Syntax.Error -exception AssertFailure = AssertFailure.Error (* assert command failure *) +exception Assert = Assert.Error +exception IO = IO.Error let trace name = if !Flags.trace then print_endline ("-- " ^ name) -let current_module : Eval.instance option ref = ref None +let current_module : Ast.module_ option ref = ref None +let current_instance : Eval.instance option ref = ref None let get_module at = match !current_module with | Some m -> m - | None -> raise (Eval.Crash (at, "no module defined to invoke")) + | None -> raise (Eval.Crash (at, "no module defined")) + +let get_instance at = match !current_instance with + | Some m -> m + | None -> raise (Eval.Crash (at, "no module defined")) +let input_file = ref (fun _ -> assert false) +let output_file = ref (fun _ -> assert false) let run_cmd cmd = match cmd.it with | Define m -> + let m' = Desugar.desugar m in trace "Checking..."; - Check.check_module m; + Check.check_module m'; if !Flags.print_sig then begin trace "Signature:"; - Print.print_module_sig m + Print.print_module_sig m' end; + current_module := Some m; trace "Initializing..."; - let imports = Import.link m in - current_module := Some (Eval.init m imports) + let imports = Import.link m' in + current_instance := Some (Eval.init m' imports) | Invoke (name, es) -> - trace "Invoking..."; - let m = get_module cmd.at in + trace ("Invoking \"" ^ name ^ "\"..."); + let m = get_instance cmd.at in let v = Eval.invoke m name (List.map it es) in if v <> None then Print.print_value v | AssertInvalid (m, re) -> trace "Asserting invalid..."; - (match Check.check_module m with + let m' = Desugar.desugar m in + (match Check.check_module m' with | exception Check.Invalid (_, msg) -> if not (Str.string_match (Str.regexp re) msg 0) then begin print_endline ("Result: \"" ^ msg ^ "\""); print_endline ("Expect: \"" ^ re ^ "\""); - AssertFailure.error cmd.at "wrong validation error" + Assert.error cmd.at "wrong validation error" end | _ -> - AssertFailure.error cmd.at "expected validation error" + Assert.error cmd.at "expected validation error" ) | AssertReturn (name, es, expect_e) -> - trace "Asserting return..."; - let m = get_module cmd.at in + trace ("Asserting return \"" ^ name ^ "\"..."); + let m = get_instance cmd.at in let got_v = Eval.invoke m name (List.map it es) in let expect_v = Lib.Option.map it expect_e in if got_v <> expect_v then begin print_string "Result: "; Print.print_value got_v; print_string "Expect: "; Print.print_value expect_v; - AssertFailure.error cmd.at "wrong return value" + Assert.error cmd.at "wrong return value" end | AssertReturnNaN (name, es) -> - trace "Asserting return..."; - let m = get_module cmd.at in + trace ("Asserting return \"" ^ name ^ "\"..."); + let m = get_instance cmd.at in let got_v = Eval.invoke m name (List.map it es) in if match got_v with @@ -104,28 +105,48 @@ let run_cmd cmd = then begin print_string "Result: "; Print.print_value got_v; print_string "Expect: "; print_endline "nan"; - AssertFailure.error cmd.at "wrong return value" + Assert.error cmd.at "wrong return value" end | AssertTrap (name, es, re) -> - trace "Asserting trap..."; - let m = get_module cmd.at in + trace ("Asserting trap \"" ^ name ^ "\"..."); + let m = get_instance cmd.at in (match Eval.invoke m name (List.map it es) with | exception Eval.Trap (_, msg) -> if not (Str.string_match (Str.regexp re) msg 0) then begin print_endline ("Result: \"" ^ msg ^ "\""); print_endline ("Expect: \"" ^ re ^ "\""); - AssertFailure.error cmd.at "wrong runtime trap" + Assert.error cmd.at "wrong runtime trap" end | _ -> - AssertFailure.error cmd.at "expected runtime trap" + Assert.error cmd.at "expected runtime trap" ) + | Input file -> + (try if not (!input_file file) then Abort.error cmd.at "aborting" + with Sys_error msg -> IO.error cmd.at msg) + + | Output file -> + (try !output_file file (get_module cmd.at) + with Sys_error msg -> IO.error cmd.at msg) + let dry_cmd cmd = match cmd.it with | Define m -> - Check.check_module m; - if !Flags.print_sig then Print.print_module_sig m + let m' = Desugar.desugar m in + trace "Checking..."; + Check.check_module m'; + if !Flags.print_sig then begin + trace "Signature:"; + Print.print_module_sig m' + end; + current_module := Some m + | Input file -> + (try if not (!input_file file) then Abort.error cmd.at "aborting" + with Sys_error msg -> IO.error cmd.at msg) + | Output file -> + (try !output_file file (get_module cmd.at) + with Sys_error msg -> IO.error cmd.at msg) | Invoke _ | AssertInvalid _ | AssertReturn _ diff --git a/ml-proto/host/script.mli b/ml-proto/host/script.mli index d941636522..4392f3b61e 100644 --- a/ml-proto/host/script.mli +++ b/ml-proto/host/script.mli @@ -1,21 +1,25 @@ -type 'm command = 'm command' Source.phrase -and 'm command' = - | Define of 'm +type command = command' Source.phrase +and command' = + | Define of Ast.module_ | Invoke of string * Kernel.literal list - | AssertInvalid of 'm * string + | AssertInvalid of Ast.module_ * string | AssertReturn of string * Kernel.literal list * Kernel.literal option | AssertReturnNaN of string * Kernel.literal list | AssertTrap of string * Kernel.literal list * string + | Input of string + | Output of string -type script = Ast.module_ command list -type script' = Kernel.module_ command list - -val desugar : script -> script' +type script = command list +exception Abort of Source.region * string exception Syntax of Source.region * string -exception AssertFailure of Source.region * string +exception Assert of Source.region * string +exception IO of Source.region * string -val run : script' -> unit - (* raises Check.Invalid, Eval.Trap, Eval.Crash, Failure *) +val run : script -> unit + (* raises Check.Invalid, Eval.Trap, Eval.Crash, Assert, IO *) val trace : string -> unit + +val input_file : (string -> bool) ref +val output_file : (string -> Ast.module_ -> unit) ref diff --git a/ml-proto/runtests.py b/ml-proto/runtests.py index f9bf5ecfc7..b3ea86813a 100755 --- a/ml-proto/runtests.py +++ b/ml-proto/runtests.py @@ -7,14 +7,16 @@ import glob import sys -class RunTests(unittest.TestCase): - def _runTestFile(self, shortName, fileName, interpreterPath): - logPath = fileName.replace("test/", "test/output/").replace(".wast", ".wast.log") +def tempFile(path): try: - os.remove(logPath) + os.remove(path) except OSError: pass + return path +class RunTests(unittest.TestCase): + def _runTestFile(self, shortName, fileName, interpreterPath): + logPath = tempFile(fileName.replace("test/", "test/output/").replace(".wast", ".wast.log")) commandStr = ("%s %s > %s") % (interpreterPath, fileName, logPath) exitCode = subprocess.call(commandStr, shell=True) self.assertEqual(0, exitCode, "test runner failed with exit code %i" % exitCode) @@ -33,6 +35,25 @@ def _runTestFile(self, shortName, fileName, interpreterPath): actualText = output.read() self.assertEqual(expectedText, actualText) +class TranscodeTests(unittest.TestCase): + def _runTestFile(self, shortName, fileName, interpreterPath): + logPath = tempFile(fileName.replace("test/", "test/output/").replace(".wast", ".wast.log")) + wasmPath = tempFile(fileName.replace("test/", "test/output/").replace(".wast", ".wast.wasm")) + try: + os.remove(wasmPath) + except OSError: + pass + + commandStr = ("%s -d %s -o %s") % (interpreterPath, fileName, wasmPath) + exitCode = subprocess.call(commandStr, shell=True) + self.assertEqual(0, exitCode, "test runner failed with exit code %i" % exitCode) + + commandStr = ("%s %s > %s") % (interpreterPath, wasmPath, logPath) + exitCode = subprocess.call(commandStr, shell=True) + self.assertEqual(0, exitCode, "test runner failed with exit code %i" % exitCode) + + # TODO: once s-expr output works, re-encode and compare + def generate_test_case(rec): return lambda self : self._runTestFile(*rec) @@ -74,4 +95,5 @@ def rebuild_interpreter(path): testFiles = glob.glob("test/*.wast") generate_test_cases(RunTests, interpreterPath, testFiles) + generate_test_cases(TranscodeTests, interpreterPath, testFiles) unittest.main() diff --git a/ml-proto/spec/ast.ml b/ml-proto/spec/ast.ml index 86b1a71166..efff76c307 100644 --- a/ml-proto/spec/ast.ml +++ b/ml-proto/spec/ast.ml @@ -42,8 +42,6 @@ and expr' = | I32_load8_u of Memory.offset * int * expr | I32_load16_s of Memory.offset * int * expr | I32_load16_u of Memory.offset * int * expr - | I32_load32_s of Memory.offset * int * expr - | I32_load32_u of Memory.offset * int * expr | I64_load8_s of Memory.offset * int * expr | I64_load8_u of Memory.offset * int * expr | I64_load16_s of Memory.offset * int * expr @@ -52,7 +50,6 @@ and expr' = | I64_load32_u of Memory.offset * int * expr | I32_store8 of Memory.offset * int * expr * expr | I32_store16 of Memory.offset * int * expr * expr - | I32_store32 of Memory.offset * int * expr * expr | I64_store8 of Memory.offset * int * expr * expr | I64_store16 of Memory.offset * int * expr * expr | I64_store32 of Memory.offset * int * expr * expr @@ -201,7 +198,7 @@ type func = func' Source.phrase and func' = { ftype : var; - locals :Types.value_type list; + locals : Types.value_type list; body : expr list; } diff --git a/ml-proto/spec/check.ml b/ml-proto/spec/check.ml index 8b1b4eeec5..a74ed444f4 100644 --- a/ml-proto/spec/check.ml +++ b/ml-proto/spec/check.ml @@ -288,11 +288,12 @@ let check_elem c x = module NameSet = Set.Make(String) let check_export c set ex = - let name = match ex.it with - | ExportFunc (n, x) -> ignore (func c x); n - | ExportMemory n -> require (c.has_memory) ex.at "no memory to export"; n in - require (not (NameSet.mem name set)) ex.at - "duplicate export name"; + let {name; kind} = ex.it in + (match kind with + | `Func x -> ignore (func c x) + | `Memory -> require c.has_memory ex.at "no memory to export" + ); + require (not (NameSet.mem name set)) ex.at "duplicate export name"; NameSet.add name set let check_start c start = @@ -315,11 +316,11 @@ let check_segment pages prev_end seg = let check_memory memory = let mem = memory.it in - require (mem.initial <= mem.max) memory.at - "initial memory pages must be less than or equal to the maximum"; + require (mem.min <= mem.max) memory.at + "minimum memory pages must be less than or equal to the maximum"; require (mem.max <= 65535L) memory.at "linear memory pages must be less or equal to 65535 (4GiB)"; - ignore (List.fold_left (check_segment mem.initial) 0L mem.segments) + ignore (List.fold_left (check_segment mem.min) 0L mem.segments) let check_module m = let {memory; types; funcs; start; imports; exports; table} = m.it in diff --git a/ml-proto/spec/decode.ml b/ml-proto/spec/decode.ml new file mode 100644 index 0000000000..b3b4790949 --- /dev/null +++ b/ml-proto/spec/decode.ml @@ -0,0 +1,598 @@ +(* Decoding stream *) + +type stream = +{ + name : string; + bytes : string; + pos : int ref; + len : int +} + +exception EOS + +let stream name bs = {name; bytes = bs; pos = ref 0; len = String.length bs} +let substream s end_ = {s with len = end_} + +let len s = s.len +let pos s = !(s.pos) +let eos s = (pos s = len s) + +let check n s = if pos s + n > len s then raise EOS +let skip n s = check n s; s.pos := !(s.pos) + n +let rewind p s = s.pos := p + +let read s = Char.code (s.bytes.[!(s.pos)]) +let peek s = if eos s then None else Some (read s) +let get s = check 1 s; let b = read s in skip 1 s; b +let get_string n s = let i = pos s in skip n s; String.sub s.bytes i n + + +(* Errors *) + +module Code = Error.Make () +exception Code = Code.Error + +let string_of_byte b = Printf.sprintf "%02x" b + +let position s pos = Source.({file = s.name; line = -1; column = pos}) +let region s left right = + Source.({left = position s left; right = position s right}) + +let error s pos msg = raise (Code (region s pos pos, msg)) +let require b s pos msg = if not b then error s pos msg + +let guard f s = + try f s with EOS -> error s (len s) "unexpected end of binary or function" + +let get = guard get +let get_string n = guard (get_string n) +let skip n = guard (skip n) + +let expect b s msg = require (guard get s = b) s (pos s - 1) msg +let illegal s pos b = error s pos ("illegal opcode " ^ string_of_byte b) + +let at f s = + let left = pos s in + let x = f s in + let right = pos s in + Source.(x @@ region s left right) + + + +(* Generic values *) + +let bit i n = Int32.(logand n (shift_left 1l i)) <> 0l + +let u8 s = get s + +let u16 s = + let lo = u8 s in + let hi = u8 s in + hi lsl 8 + lo + +let u32 s = + let lo = Int32.of_int (u16 s) in + let hi = Int32.of_int (u16 s) in + Int32.(add lo (shift_left hi 16)) + +let u64 s = + let lo = Int64.of_int32 (u32 s) in + let hi = Int64.of_int32 (u32 s) in + Int64.(add lo (shift_left hi 32)) + +let rec vu64 s = + let b = u8 s in + let x = Int64.of_int (b land 0x7f) in + if b land 0x80 = 0 then x + else Int64.(logor x (shift_left (vu64 s) 7)) + (*TODO: check for overflow*) + +let rec vs64 s = + let b = u8 s in + let x = Int64.of_int (b land 0x7f) in + if b land 0x80 = 0 + then (if b land 0x40 = 0 then x else Int64.(logor x (logxor (-1L) 0x7fL))) + else Int64.(logor x (shift_left (vs64 s) 7)) + (*TODO: check for overflow*) + +let vu32 s = Int64.to_int32 (vu64 s) (*TODO:check overflow*) +let vs32 s = Int64.to_int32 (vs64 s) (*TODO:check overflow*) +let vu s = Int64.to_int (vu64 s) (*TODO:check overflow*) +let f32 s = F32.of_bits (u32 s) +let f64 s = F64.of_bits (u64 s) + +let bool s = match get s with 0 | 1 as n -> n <> 0 | _ -> error s (pos s - 1) "invalid boolean" +let string s = let n = vu s in get_string n s +let rec list f n s = if n = 0 then [] else let x = f s in x :: list f (n - 1) s +let opt f b s = if b then Some (f s) else None +let vec f s = let n = vu s in list f n s +let vec1 f s = let b = bool s in opt f b s + + +(* Types *) + +open Types + +let value_type s = + match get s with + | 0x01 -> Int32Type + | 0x02 -> Int64Type + | 0x03 -> Float32Type + | 0x04 -> Float64Type + | _ -> error s (pos s - 1) "invalid value type" + +let expr_type s = vec1 value_type s + +let func_type s = + expect 0x05 s "invalid function type"; + let ins = vec value_type s in + let out = expr_type s in + {ins; out} + + +(* Decode expressions *) + +open Kernel +open Ast + +let op s = u8 s +let arity s = vu s +let arity1 s = bool s + +let memop s = + let align = vu s in + (*TODO: check flag bits*) + let offset = vu64 s in + offset, align + +let var s = vu s +let var32 s = Int32.to_int (vu32 s) + +let rec args n stack s pos = args' n stack [] s pos +and args' n stack es s pos = + match n, stack with + | 0, _ -> es, stack + | n, e::stack' -> args' (n - 1) stack' (e::es) s pos + | _ -> error s pos "too few operands for operator" + +let args1 b stack s pos = + match args (if b then 1 else 0) stack s pos with + | [], stack' -> None, stack' + | [e], stack' -> Some e, stack' + | _ -> assert false + +let rec expr stack s = + let pos = pos s in + match op s, stack with + | 0x00, es -> + Nop, es + | 0x01, es -> + let es' = expr_block s in + expect 0x17 s "end opcode expected"; + Block es', es + | 0x02, es -> + let es' = expr_block s in + expect 0x17 s "end opcode expected"; + Loop es', es + | 0x03, e :: es -> + let es1 = expr_block s in + if peek s = Some 0x04 then begin + expect 0x04 s "else or end opcode expected"; + let es2 = expr_block s in + expect 0x17 s "end opcode expected"; + If (e, es1, es2), es + end else begin + expect 0x17 s "end opcode expected"; + If (e, es1, []), es + end + | 0x04, _ -> + assert false (* else *) + | 0x05, e3 :: e2 :: e1 :: es -> + Select (e1, e2, e3), es + | 0x06, es -> + let b = arity1 s in + let x = at var s in + let eo, es' = args1 b es s pos in + Br (x, eo), es' + | 0x07, e :: es -> + let b = arity1 s in + let x = at var s in + let eo, es' = args1 b es s pos in + Br_if (x, eo, e), es' + | 0x08, e :: es -> + let b = arity1 s in + let xs = vec (at var) s in + let x = at var s in + let eo, es' = args1 b es s pos in + Br_table (xs, x, eo, e), es' + + | 0x09 as b, es -> illegal s pos b + + | 0x0a, es -> I32_const (at vs32 s), es + | 0x0b, es -> I64_const (at vs64 s), es + | 0x0c, es -> F32_const (at f32 s), es + | 0x0d, es -> F64_const (at f64 s), es + + | 0x0e, es -> + let x = at var s in + Get_local x, es + | 0x0f, e :: es -> + let x = at var s in + Set_local (x, e), es + + | 0x10 | 0x11 as b, _ -> illegal s pos b + + | 0x12, es -> + let n = arity s in + let x = at var s in + let es1, es' = args n es s pos in + Call (x, es1), es' + | 0x13, es -> + let n = arity s in + let x = at var s in + let es1, es' = args (n + 1) es s pos in + Call_indirect (x, List.hd es1, List.tl es1), es' + + | 0x14, es -> + let b = arity1 s in + let eo, es' = args1 b es s pos in + Return eo, es' + | 0x15, es -> + Unreachable, es + + | 0x16, _ -> assert false (* next *) + | 0x17, _ -> assert false (* end *) + | 0x18 | 0x19 | 0x1a | 0x1b | 0x1c | 0x1d | 0x1e as b, _ -> + illegal s pos b + + | 0x1f, es -> + let n = arity s in + let x = at var s in + let es1, es' = args n es s pos in + Call_import (x, es1), es' + + | 0x20, e :: es -> let o, a = memop s in I32_load8_s (o, a, e), es + | 0x21, e :: es -> let o, a = memop s in I32_load8_u (o, a, e), es + | 0x22, e :: es -> let o, a = memop s in I32_load16_s (o, a, e), es + | 0x23, e :: es -> let o, a = memop s in I32_load16_u (o, a, e), es + | 0x24, e :: es -> let o, a = memop s in I64_load8_s (o, a, e), es + | 0x25, e :: es -> let o, a = memop s in I64_load8_u (o, a, e), es + | 0x26, e :: es -> let o, a = memop s in I64_load16_s (o, a, e), es + | 0x27, e :: es -> let o, a = memop s in I64_load16_u (o, a, e), es + | 0x28, e :: es -> let o, a = memop s in I64_load32_s (o, a, e), es + | 0x29, e :: es -> let o, a = memop s in I64_load32_u (o, a, e), es + | 0x2a, e :: es -> let o, a = memop s in I32_load (o, a, e), es + | 0x2b, e :: es -> let o, a = memop s in I64_load (o, a, e), es + | 0x2c, e :: es -> let o, a = memop s in F32_load (o, a, e), es + | 0x2d, e :: es -> let o, a = memop s in F64_load (o, a, e), es + + | 0x2e, e2 :: e1 :: es -> let o, a = memop s in I32_store8 (o, a, e1, e2), es + | 0x2f, e2 :: e1 :: es -> let o, a = memop s in I32_store16 (o, a, e1, e2), es + | 0x30, e2 :: e1 :: es -> let o, a = memop s in I64_store8 (o, a, e1, e2), es + | 0x31, e2 :: e1 :: es -> let o, a = memop s in I64_store16 (o, a, e1, e2), es + | 0x32, e2 :: e1 :: es -> let o, a = memop s in I64_store32 (o, a, e1, e2), es + | 0x33, e2 :: e1 :: es -> let o, a = memop s in I32_store (o, a, e1, e2), es + | 0x34, e2 :: e1 :: es -> let o, a = memop s in I64_store (o, a, e1, e2), es + | 0x35, e2 :: e1 :: es -> let o, a = memop s in F32_store (o, a, e1, e2), es + | 0x36, e2 :: e1 :: es -> let o, a = memop s in F64_store (o, a, e1, e2), es + + | 0x37 | 0x38 as b, _ -> illegal s pos b + | 0x39, e :: es -> Grow_memory e, es + | 0x3a as b, _ -> illegal s pos b + | 0x3b, es -> Current_memory, es + + | 0x40, e2 :: e1 :: es -> I32_add (e1, e2), es + | 0x41, e2 :: e1 :: es -> I32_sub (e1, e2), es + | 0x42, e2 :: e1 :: es -> I32_mul (e1, e2), es + | 0x43, e2 :: e1 :: es -> I32_div_s (e1, e2), es + | 0x44, e2 :: e1 :: es -> I32_div_u (e1, e2), es + | 0x45, e2 :: e1 :: es -> I32_rem_s (e1, e2), es + | 0x46, e2 :: e1 :: es -> I32_rem_u (e1, e2), es + | 0x47, e2 :: e1 :: es -> I32_and (e1, e2), es + | 0x48, e2 :: e1 :: es -> I32_or (e1, e2), es + | 0x49, e2 :: e1 :: es -> I32_xor (e1, e2), es + | 0x4a, e2 :: e1 :: es -> I32_shl (e1, e2), es + | 0x4b, e2 :: e1 :: es -> I32_shr_u (e1, e2), es + | 0x4c, e2 :: e1 :: es -> I32_shr_s (e1, e2), es + | 0x4d, e2 :: e1 :: es -> I32_eq (e1, e2), es + | 0x4e, e2 :: e1 :: es -> I32_ne (e1, e2), es + | 0x4f, e2 :: e1 :: es -> I32_lt_s (e1, e2), es + | 0x50, e2 :: e1 :: es -> I32_le_s (e1, e2), es + | 0x51, e2 :: e1 :: es -> I32_lt_u (e1, e2), es + | 0x52, e2 :: e1 :: es -> I32_le_u (e1, e2), es + | 0x53, e2 :: e1 :: es -> I32_gt_s (e1, e2), es + | 0x54, e2 :: e1 :: es -> I32_ge_s (e1, e2), es + | 0x55, e2 :: e1 :: es -> I32_gt_u (e1, e2), es + | 0x56, e2 :: e1 :: es -> I32_ge_u (e1, e2), es + | 0x57, e :: es -> I32_clz e, es + | 0x58, e :: es -> I32_ctz e, es + | 0x59, e :: es -> I32_popcnt e, es + | 0x5a, e :: es -> I32_eqz e, es + + | 0x5b, e2 :: e1 :: es -> I64_add (e1, e2), es + | 0x5c, e2 :: e1 :: es -> I64_sub (e1, e2), es + | 0x5d, e2 :: e1 :: es -> I64_mul (e1, e2), es + | 0x5e, e2 :: e1 :: es -> I64_div_s (e1, e2), es + | 0x5f, e2 :: e1 :: es -> I64_div_u (e1, e2), es + | 0x60, e2 :: e1 :: es -> I64_rem_s (e1, e2), es + | 0x61, e2 :: e1 :: es -> I64_rem_u (e1, e2), es + | 0x62, e2 :: e1 :: es -> I64_and (e1, e2), es + | 0x63, e2 :: e1 :: es -> I64_or (e1, e2), es + | 0x64, e2 :: e1 :: es -> I64_xor (e1, e2), es + | 0x65, e2 :: e1 :: es -> I64_shl (e1, e2), es + | 0x66, e2 :: e1 :: es -> I64_shr_u (e1, e2), es + | 0x67, e2 :: e1 :: es -> I64_shr_s (e1, e2), es + | 0x68, e2 :: e1 :: es -> I64_eq (e1, e2), es + | 0x69, e2 :: e1 :: es -> I64_ne (e1, e2), es + | 0x6a, e2 :: e1 :: es -> I64_lt_s (e1, e2), es + | 0x6b, e2 :: e1 :: es -> I64_le_s (e1, e2), es + | 0x6c, e2 :: e1 :: es -> I64_lt_u (e1, e2), es + | 0x6d, e2 :: e1 :: es -> I64_le_u (e1, e2), es + | 0x6e, e2 :: e1 :: es -> I64_gt_s (e1, e2), es + | 0x6f, e2 :: e1 :: es -> I64_ge_s (e1, e2), es + | 0x70, e2 :: e1 :: es -> I64_gt_u (e1, e2), es + | 0x71, e2 :: e1 :: es -> I64_ge_u (e1, e2), es + | 0x72, e :: es -> I64_clz e, es + | 0x73, e :: es -> I64_ctz e, es + | 0x74, e :: es -> I64_popcnt e, es + + | 0x75, e2 :: e1 :: es -> F32_add (e1, e2), es + | 0x76, e2 :: e1 :: es -> F32_sub (e1, e2), es + | 0x77, e2 :: e1 :: es -> F32_mul (e1, e2), es + | 0x78, e2 :: e1 :: es -> F32_div (e1, e2), es + | 0x79, e2 :: e1 :: es -> F32_min (e1, e2), es + | 0x7a, e2 :: e1 :: es -> F32_max (e1, e2), es + | 0x7b, e :: es -> F32_abs e, es + | 0x7c, e :: es -> F32_neg e, es + | 0x7d, e2 :: e1 :: es -> F32_copysign (e1, e2), es + | 0x7e, e :: es -> F32_ceil e, es + | 0x7f, e :: es -> F32_floor e, es + | 0x80, e :: es -> F32_trunc e, es + | 0x81, e :: es -> F32_nearest e, es + | 0x82, e :: es -> F32_sqrt e, es + | 0x83, e2 :: e1 :: es -> F32_eq (e1, e2), es + | 0x84, e2 :: e1 :: es -> F32_ne (e1, e2), es + | 0x85, e2 :: e1 :: es -> F32_lt (e1, e2), es + | 0x86, e2 :: e1 :: es -> F32_le (e1, e2), es + | 0x87, e2 :: e1 :: es -> F32_gt (e1, e2), es + | 0x88, e2 :: e1 :: es -> F32_ge (e1, e2), es + + | 0x89, e2 :: e1 :: es -> F64_add (e1, e2), es + | 0x8a, e2 :: e1 :: es -> F64_sub (e1, e2), es + | 0x8b, e2 :: e1 :: es -> F64_mul (e1, e2), es + | 0x8c, e2 :: e1 :: es -> F64_div (e1, e2), es + | 0x8d, e2 :: e1 :: es -> F64_min (e1, e2), es + | 0x8e, e2 :: e1 :: es -> F64_max (e1, e2), es + | 0x8f, e :: es -> F64_abs e, es + | 0x90, e :: es -> F64_neg e, es + | 0x91, e2 :: e1 :: es -> F64_copysign (e1, e2), es + | 0x92, e :: es -> F64_ceil e, es + | 0x93, e :: es -> F64_floor e, es + | 0x94, e :: es -> F64_trunc e, es + | 0x95, e :: es -> F64_nearest e, es + | 0x96, e :: es -> F64_sqrt e, es + | 0x97, e2 :: e1 :: es -> F64_eq (e1, e2), es + | 0x98, e2 :: e1 :: es -> F64_ne (e1, e2), es + | 0x99, e2 :: e1 :: es -> F64_lt (e1, e2), es + | 0x9a, e2 :: e1 :: es -> F64_le (e1, e2), es + | 0x9b, e2 :: e1 :: es -> F64_gt (e1, e2), es + | 0x9c, e2 :: e1 :: es -> F64_ge (e1, e2), es + + | 0x9d, e :: es -> I32_trunc_s_f32 e, es + | 0x9e, e :: es -> I32_trunc_s_f64 e, es + | 0x9f, e :: es -> I32_trunc_u_f32 e, es + | 0xa0, e :: es -> I32_trunc_u_f64 e, es + | 0xa1, e :: es -> I32_wrap_i64 e, es + | 0xa2, e :: es -> I64_trunc_s_f32 e, es + | 0xa3, e :: es -> I64_trunc_s_f64 e, es + | 0xa4, e :: es -> I64_trunc_u_f32 e, es + | 0xa5, e :: es -> I64_trunc_u_f64 e, es + | 0xa6, e :: es -> I64_extend_s_i32 e, es + | 0xa7, e :: es -> I64_extend_u_i32 e, es + | 0xa8, e :: es -> F32_convert_s_i32 e, es + | 0xa9, e :: es -> F32_convert_u_i32 e, es + | 0xaa, e :: es -> F32_convert_s_i64 e, es + | 0xab, e :: es -> F32_convert_u_i64 e, es + | 0xac, e :: es -> F32_demote_f64 e, es + | 0xad, e :: es -> F32_reinterpret_i32 e, es + | 0xae, e :: es -> F64_convert_s_i32 e, es + | 0xaf, e :: es -> F64_convert_u_i32 e, es + | 0xb0, e :: es -> F64_convert_s_i64 e, es + | 0xb1, e :: es -> F64_convert_u_i64 e, es + | 0xb2, e :: es -> F64_promote_f32 e, es + | 0xb3, e :: es -> F64_reinterpret_i64 e, es + | 0xb4, e :: es -> I32_reinterpret_f32 e, es + | 0xb5, e :: es -> I64_reinterpret_f64 e, es + + | 0xb6, e2 :: e1 :: es -> I32_rotl (e1, e2), es + | 0xb7, e2 :: e1 :: es -> I32_rotr (e1, e2), es + | 0xb8, e2 :: e1 :: es -> I64_rotl (e1, e2), es + | 0xb9, e2 :: e1 :: es -> I64_rotr (e1, e2), es + | 0xba, e :: es -> I64_eqz e, es + + | b, _ when b > 0xba -> illegal s pos b + + | b, _ -> error s pos "too few operands for operator" + +and expr_block s = List.rev (expr_block' [] s) +and expr_block' stack s = + if eos s then stack else + match peek s with + | None | Some (0x04 | 0x16 | 0x17) -> stack + | _ -> + let pos = pos s in + let e', stack' = expr stack s in + expr_block' (Source.(e' @@ region s pos pos) :: stack') s + + +(* Sections *) + +let trace s name = + print_endline + (name ^ " @ " ^ string_of_int (pos s) ^ " = " ^ string_of_byte (read s)) + +let id s = + match string s with + | "type" -> `TypeSection + | "import" -> `ImportSection + | "function" -> `FuncSection + | "table" -> `TableSection + | "memory" -> `MemorySection + | "export" -> `ExportSection + | "start" -> `StartSection + | "code" -> `CodeSection + | "data" -> `DataSection + | _ -> `UnknownSection + +let section tag f default s = + if eos s then default else + let start_pos = pos s in + let size = vu s in + let id_pos = pos s in + if id s <> tag then (rewind start_pos s; default) else + let s' = substream s (id_pos + size) in + let x = f s' in + require (eos s') s' (pos s') "junk at end of section"; + x + + +(* Type section *) + +let type_section s = + section `TypeSection (vec func_type) [] s + + +(* Import section *) + +let import s = + let itype = at var s in + let module_name = string s in + let func_name = string s in + {itype; module_name; func_name} + +let import_section s = + section `ImportSection (vec (at import)) [] s + + +(* Function section *) + +let func_section s = + section `FuncSection (vec (at var)) [] s + + +(* Table section *) + +let table_section s = + section `TableSection (vec (at var)) [] s + + +(* Memory section *) + +let memory s = + let min = vu64 s in + let max = vu64 s in + let _ = bool s in (*TODO: pending change*) + {min; max; segments = []} + +let memory_section s = + section `MemorySection (opt (at memory) true) None s + + +(* Export section *) + +let export s = + let x = at var s in + let name = string s in + {name; kind = `Func x} (*TODO: pending resolution*) + +let export_section s = + section `ExportSection (vec (at export)) [] s + + +(* Start section *) + +let start_section s = + section `StartSection (opt (at var) true) None s + + +(* Code section *) + +let local s = + let n = vu s in + let t = value_type s in + Lib.List.make n t + +let code s = + let locals = List.flatten (vec local s) in + let size = vu s in + let body = expr_block (substream s (pos s + size)) in + {locals; body; ftype = Source.((-1) @@ Source.no_region)} + +let code_section s = + section `CodeSection (vec (at code)) [] s + + +(* Data section *) + +let segment s = + let addr = vu64 s in + let data = string s in + {Memory.addr; data} + +let data_section s = + section `DataSection (vec (at segment)) [] s + + +(* Unknown section *) + +let unknown_section s = + section `UnknownSection (fun s -> skip (len s - pos s) s; true) false s + + +(* Modules *) + +let rec iterate f s = if f s then iterate f s + +let module_ s = + let magic = u32 s in + require (magic = 0x6d736100l) s 0 "magic header not detected"; + let version = u32 s in + require (version = 0x0bl) s 4 "unknown binary version"; + iterate unknown_section s; + let types = type_section s in + iterate unknown_section s; + let imports = import_section s in + iterate unknown_section s; + let func_types = func_section s in + iterate unknown_section s; + let table = table_section s in + iterate unknown_section s; + let memory_limits = memory_section s in + iterate unknown_section s; + let exports = export_section s in + iterate unknown_section s; + let start = start_section s in + iterate unknown_section s; + let func_bodies = code_section s in + iterate unknown_section s; + let segments = data_section s in + iterate unknown_section s; + (*TODO: name section*) + iterate unknown_section s; + require (pos s = len s) s (len s) "junk after last section"; + require (List.length func_types = List.length func_bodies) + s (len s) "function and code section have inconsistent lengths"; + require (memory_limits <> None || segments = []) + s (len s) "data section without memory section"; + let funcs = + List.map2 Source.(fun t f -> {f.it with ftype = t} @@ f.at) + func_types func_bodies in + let memory = + match memory_limits with + | None -> None + | Some memory -> Some Source.({memory.it with segments} @@ memory.at) + in {memory; types; funcs; imports; exports; table; start} + + +let decode name bs = at module_ (stream name bs) + diff --git a/ml-proto/spec/decode.mli b/ml-proto/spec/decode.mli new file mode 100644 index 0000000000..fc7a453a96 --- /dev/null +++ b/ml-proto/spec/decode.mli @@ -0,0 +1,3 @@ +exception Code of Source.region * string + +val decode : string -> bytes -> Ast.module_ (* raise Code *) diff --git a/ml-proto/spec/desugar.ml b/ml-proto/spec/desugar.ml index 1d3b722d77..91ed5f0912 100644 --- a/ml-proto/spec/desugar.ml +++ b/ml-proto/spec/desugar.ml @@ -108,12 +108,6 @@ and expr' at = function | Ast.I32_load16_u (offset, align, e) -> LoadExtend ({memop = {ty = Int32Type; offset; align}; sz = Mem16; ext = ZX}, expr e) - | Ast.I32_load32_s (offset, align, e) -> - LoadExtend - ({memop = {ty = Int32Type; offset; align}; sz = Mem32; ext = SX}, expr e) - | Ast.I32_load32_u (offset, align, e) -> - LoadExtend - ({memop = {ty = Int32Type; offset; align}; sz = Mem32; ext = ZX}, expr e) | Ast.I64_load8_s (offset, align, e) -> LoadExtend ({memop = {ty = Int64Type; offset; align}; sz = Mem8; ext = SX}, expr e) @@ -138,9 +132,6 @@ and expr' at = function | Ast.I32_store16 (offset, align, e1, e2) -> StoreWrap ({memop = {ty = Int32Type; offset; align}; sz = Mem16}, expr e1, expr e2) - | Ast.I32_store32 (offset, align, e1, e2) -> - StoreWrap - ({memop = {ty = Int32Type; offset; align}; sz = Mem32}, expr e1, expr e2) | Ast.I64_store8 (offset, align, e1, e2) -> StoreWrap ({memop = {ty = Int64Type; offset; align}; sz = Mem8}, expr e1, expr e2) diff --git a/ml-proto/spec/eval.ml b/ml-proto/spec/eval.ml index 3dfadc27db..d892d74f52 100644 --- a/ml-proto/spec/eval.ml +++ b/ml-proto/spec/eval.ml @@ -266,7 +266,10 @@ and eval_func instance f vs = let vars = List.map (fun t -> ref (default_value t)) f.it.locals in let locals = args @ vars in let c = {instance; locals; labels = []} in - coerce (type_ c f.it.ftype).out (eval_expr c f.it.body) + let ft = type_ c f.it.ftype in + if List.length vs <> List.length ft.ins then + Crash.error f.at "function called with wrong number of arguments"; + coerce ft.out (eval_expr c f.it.body) and coerce et vo = if et = None then None else vo @@ -302,15 +305,16 @@ and eval_hostop c hostop vs at = (* Modules *) -let init_memory {it = {initial; segments; _}} = - let mem = Memory.create initial in +let init_memory {it = {min; segments; _}} = + let mem = Memory.create min in Memory.init mem (List.map it segments); mem let add_export funcs ex = - match ex.it with - | ExportFunc (n, x) -> ExportMap.add n (List.nth funcs x.it) - | ExportMemory n -> fun x -> x + let {name; kind} = ex.it in + match kind with + | `Func x -> ExportMap.add name (List.nth funcs x.it) + | `Memory -> fun x -> x let init m imports = assert (List.length imports = List.length m.it.Kernel.imports); diff --git a/ml-proto/spec/eval.mli b/ml-proto/spec/eval.mli index afa86ff6e9..c8c4390bd7 100644 --- a/ml-proto/spec/eval.mli +++ b/ml-proto/spec/eval.mli @@ -8,4 +8,3 @@ exception Crash of Source.region * string val init : Kernel.module_ -> import list -> instance val invoke : instance -> string -> value list -> value option (* raises Trap, Crash *) - diff --git a/ml-proto/spec/kernel.ml b/ml-proto/spec/kernel.ml index c57a37569c..402ea8cec4 100644 --- a/ml-proto/spec/kernel.ml +++ b/ml-proto/spec/kernel.ml @@ -119,7 +119,7 @@ and func' = type memory = memory' Source.phrase and memory' = { - initial : Memory.size; + min : Memory.size; max : Memory.size; segments : segment list; } @@ -127,8 +127,10 @@ and segment = Memory.segment Source.phrase type export = export' Source.phrase and export' = - | ExportFunc of string * var - | ExportMemory of string +{ + name : string; + kind : [`Func of var | `Memory] +} type import = import' Source.phrase and import' = diff --git a/ml-proto/test/block_comments.wast b/ml-proto/test/block_comments.wast index b10dd3d2de..0a077cd6d3 100644 --- a/ml-proto/test/block_comments.wast +++ b/ml-proto/test/block_comments.wast @@ -11,3 +11,5 @@ (; ;; bla ;) (; ;; bla ;) + +(module) ;; dummy diff --git a/ml-proto/test/memory.wast b/ml-proto/test/memory.wast index 0cdc8781b0..40d6f37588 100644 --- a/ml-proto/test/memory.wast +++ b/ml-proto/test/memory.wast @@ -9,7 +9,7 @@ (assert_invalid (module (memory 1 0)) - "initial memory pages must be less than or equal to the maximum" + "minimum memory pages must be less than or equal to the maximum" ) (assert_invalid (module (memory 0 0 (segment 0 "a")))