Commit a3028c3

committed
cleanup
1 parent 99ffdc2 commit a3028c3

3 files changed: 53 additions and 16 deletions
README.md

Lines changed: 23 additions & 12 deletions
@@ -31,7 +31,7 @@ The test suite includes both static validation of generated/annotated ASTs, as w
 - [Mono](https://www.mono-project.com/)
 - Tested with Mono 6.12
 - WASM Backend Requirements:
-- [WebAssembly Binary Toolkit (wabt)](https://github.com/WebAssembly/wabt)
+- [WebAssembly Binary Toolkit (wabt)](https://github.com/WebAssembly/wabt), specifically the `wat2wasm` tool
 
 ## Usage
 
@@ -94,7 +94,7 @@ Note that in the above example commands & the `demo_jvm.sh` script all expect th
 
 ## CIL Backend Notes:
 
-The CIL backend for this compiler outputs CIL bytecode in plaintext formatted for the Mono ilsam assembler:
+The CIL backend for this compiler outputs CIL bytecode in plaintext formatted for the Mono `ilasm` assembler:
 1. Use this compiler to generate plaintext bytecode
 - Format: `python3 main.py --mode cil <input file> <output dir>`
 - Example: `python3 main.py --mode cil tests/runtime/binary_tree.py .`
@@ -110,32 +110,43 @@ The `demo_cil.sh` script is a useful utility to compile and run files with the C
 
 ## WASM Backend Notes:
 
-This is WIP, not all features are supported.
+This is WIP, not all features are supported (the binary tree example itself actually does not work, but you can try another one).
 
-Features:
+The WASM backend for this compiler outputs WASM in plaintext `.wat` format which can be converted to `.wasm` using `wat2wasm`:
+1. Use this compiler to generate plaintext WebAssembly
+- Format: `python3 main.py --mode wasm <input file> <output dir>`
+- Example: `python3 main.py --mode wasm tests/runtime/binary_tree.py .`
+2. Run `wat2wasm` assembler to generate `.wasm` files
+- Format: `wat2wasm <.wat file> -o <.wasm file>`
+- Example: `wat2wasm binary_tree.wat -o binary_tree.wasm`
+3. Run the `.wasm` files using a minimal JS runtime
+- Example: `node wasm.js <.wasm file>`
+- Example: `node wasm.js binary_tree.wasm`
+
+The `demo_wasm.sh` script is a useful utility to compile and run files with the WASM backend with a single command (provide the path to the input source file as an argument).
+- To run the same example as above, run `./demo_wasm.sh tests/runtime/binary_tree.py`
+
+### WASM Backend - Supported Features:
 - int, bool, string, list
 - most operators
 - assignment
 - control flow
 - stdlib: print, len, and assert
 - globals
 
-Unsupported/TODO:
+### WASM Backend - Unsupported Features:
 - class/object
 - nonlocal (partial)
-- stdlib: input
+- stdlib: input (node.js does not have synchronous I/O out of the box so this is difficult)
 
-Memory format:
+### WASM Backend - Memory Format, Safety, and Management:
 
 - strings (utf-8) - first 4 bytes for length, followed by 1 byte for each character
 - lists - first 4 bytes for length, followed by 8 bytes for each element
 - ints - i64
-- pointers (objects, strings, lists) - i32
-- None - 0 (i32)
-
-Strings and lists are stored in the heap, aligned to 8 bytes. Note that memory does not get freed/garbage collected, so memory will run out for long-running programs. This is especially a problem with string iteration and string/list concatenation, since indexing a string in Chocopy requires a new string to be allocated.
+- pointers (objects, strings, lists) - i32, where `None` is 0
 
-To provide memory safety, string/list indexing have bounds checking and list operations have a null-check, which crashes the program with a generic "unreachable" instruction.
+Strings, lists, objects, and refs holding nonlocals are stored in the heap, aligned to 8 bytes. Right now, memory does not get freed/garbage collected once it is allocated. To provide memory safety, string/list indexing have bounds checking and list operations have a null-check, which crashes the program with a generic "unreachable" instruction.
 
 ## FAQ
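Review note: the layout described under "Memory Format" in the README hunk above can be sketched in JavaScript. This is a minimal illustration and not code from the repository. The helper names `readString` and `readIntList` are hypothetical, the demo heap is written by hand, and treating every list element as an i64 int is an assumption made for the example. Little-endian byte order is what WebAssembly loads and stores use.

```javascript
// Hypothetical helpers illustrating the documented layout; not part of wasm.js.

// Strings: 4-byte length, then one byte per character (UTF-8).
function readString(buffer, addr) {
  const view = new DataView(buffer);
  const length = view.getUint32(addr, true); // true = little-endian
  const bytes = new Uint8Array(buffer, addr + 4, length);
  return new TextDecoder('utf-8').decode(bytes);
}

// Lists: 4-byte length, then 8 bytes per element.
// Assumption: elements are i64 ints, so they are read as BigInt values.
function readIntList(buffer, addr) {
  const view = new DataView(buffer);
  const length = view.getUint32(addr, true);
  const items = [];
  for (let i = 0; i < length; i++) {
    items.push(view.getBigInt64(addr + 4 + 8 * i, true));
  }
  return items;
}

// Hand-built demo heap: address 0 is reserved for None, so objects start
// at 8-byte-aligned addresses above it.
const memory = new WebAssembly.Memory({ initial: 1 });
const view = new DataView(memory.buffer);
view.setUint32(8, 2, true);                             // string length
new Uint8Array(memory.buffer, 12, 2).set([0x68, 0x69]); // bytes of "hi"
view.setUint32(16, 2, true);                            // list length
view.setBigInt64(20, 7n, true);                         // element 0
view.setBigInt64(28, -1n, true);                        // element 1

console.log(readString(memory.buffer, 8));
console.log(readIntList(memory.buffer, 16));
```

`DataView` is used instead of typed arrays for the multi-byte reads because list elements start 4 bytes after an 8-byte-aligned address, so they are not themselves 8-byte aligned.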

compiler/wasm_backend.py

Lines changed: 9 additions & 0 deletions
@@ -681,6 +681,7 @@ def stdlib(self) -> str:
 (func $nullthrow (param $addr i32) (result i32)
 local.get $addr
 i32.eqz
+;; throw if $addr == 0
 (if
 (then
 unreachable
@@ -695,6 +696,7 @@ def stdlib(self) -> str:
 local.get $idx
 i32.gt_s
 i32.eqz
+;; throw if !($len > $idx)
 (if
 (then
 unreachable
@@ -703,6 +705,7 @@ def stdlib(self) -> str:
 i32.const 0
 local.get $idx
 i32.gt_s
+;; throw if 0 > $idx
 (if
 (then
 unreachable
@@ -794,9 +797,11 @@ def stdlib(self) -> str:
 local.get $left
 i32.load
 local.tee $length
+;; compare $length with len of $right
 local.get $right
 i32.load
 i32.eq
+;; only compare contents if lengths are equal
 (if
 (then
 i32.const 0
@@ -807,17 +812,21 @@ def stdlib(self) -> str:
 local.get $length
 i32.lt_s
 i32.eqz
+;; get left char
 br_if $block
 local.get $left
 local.get $idx
 call $get_char
+;; get right char
 local.get $right
 local.get $idx
 call $get_char
+;; $result = $result && left char == right char
 i32.eq
 local.get $result
 i32.and
 local.set $result
+;; if !$result then break
 local.get $result
 i32.eqz
 br_if $block
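
Review note: as a reading aid, the control flow annotated by the new comments can be modeled in JavaScript. This is a sketch under stated assumptions, not a translation of the generated WAT. The names `checkBounds` and `strEq` are hypothetical, a thrown `Error` stands in for the `unreachable` trap, strings are modeled as plain byte arrays rather than heap pointers, and the values returned by `nullthrow` and by the unequal-length branch are guesses because those parts of the file are outside the hunks.

```javascript
// Hypothetical JS model of the checks; an Error stands in for `unreachable`.

function nullthrow(addr) {
  if (addr === 0) throw new Error('unreachable'); // throw if $addr == 0
  return addr; // assumption: the address is passed through unchanged
}

function checkBounds(len, idx) {
  if (!(len > idx)) throw new Error('unreachable'); // throw if !($len > $idx)
  if (0 > idx) throw new Error('unreachable');      // throw if 0 > $idx
}

function strEq(left, right) {
  const length = left.length;
  // only compare contents if lengths are equal
  // (assumption: unequal lengths yield false)
  if (length !== right.length) return false;
  let result = true;
  for (let idx = 0; idx < length; idx++) {
    // $result = $result && left char == right char
    result = result && left[idx] === right[idx];
    // if !$result then break
    if (!result) break;
  }
  return result;
}

console.log(strEq([104, 105], [104, 105]));
```

The early break means a mismatch in the first character stops the loop immediately, which is what the final `br_if $block` in the hunk does.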

wasm.js

Lines changed: 21 additions & 4 deletions
@@ -1,6 +1,18 @@
-const wasm_path = process.argv[2];
+/**
+* Runtime support for running WASM compiled from Chocopy
+*
+* This is a very minimal runtime since the goal was to implement as much as
+* possible directly in WASM.
+*
+* The only imports from JS to WASM are for `console.log` and `console.assert`,
+* and the latter isn't even strictly necessary.
+*
+* The memory buffer is instantiated by JS, but after that it's never written
+* to and only used to print strings.
+*/
+
 
-// utils for pretty-printing ints, bools, strings
+const wasm_path = process.argv[2];
 
 function logString(offset) {
 // first 4 bytes is the length
@@ -22,7 +34,10 @@ function logBool(val) {
 console.log(val !== 0);
 }
 
-const memory = new WebAssembly.Memory({ initial: 10, maximum: 100 });
+const memory = new WebAssembly.Memory({
+initial: 10,
+maximum: 100
+});
 
 const importObject = {
 imports: {
@@ -31,7 +46,9 @@ const importObject = {
 logBool: x => logBool(x),
 assert: x => console.assert(x)
 },
-js: {
+mem: memory
+},
-js: { mem: memory },
+js: {
+mem: memory
+},
 };
 
 const fs = require('fs');
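
Review note: the `initial: 10, maximum: 100` settings are counted in WebAssembly pages of 64 KiB each, which bounds the heap that the README says is never freed. The sketch below only demonstrates how those limits behave in the JS API. Whether the generated code ever grows memory is not shown in this diff, so the `grow` calls here are purely illustrative.

```javascript
// WebAssembly memory is sized in pages of 64 KiB (65536 bytes).
const PAGE_SIZE = 64 * 1024;

const memory = new WebAssembly.Memory({ initial: 10, maximum: 100 });
const initialBytes = memory.buffer.byteLength; // 10 pages

// grow() takes a page delta and returns the previous size in pages.
const previousPages = memory.grow(90); // now at the 100-page maximum
const maxBytes = memory.buffer.byteLength;

// Growing past `maximum` fails with a RangeError.
let growError = null;
try {
  memory.grow(1);
} catch (err) {
  growError = err;
}

console.log(initialBytes / PAGE_SIZE, maxBytes / PAGE_SIZE, growError !== null);
```

Note that `memory.buffer` has to be re-read after a successful `grow`, since the previous `ArrayBuffer` is detached.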
