Virtual machine that executes bytecode produced by JexCompiler.
jex_vm is implemented on top of the extendable_vm library.
jex_vm executes bytecode which is not human-readable.
Jex is a programming language that I am working on for the purpose of learning about compilers and virtual machines. You can learn more about Jex here as well as compile it to jex_vm bytecode.
This is a stack VM that supports:
- Booleans, Strings, Ints and Functions
- Basic operations like addition, multiplication, concatenation, etc.
- Conditional jumps, function calls and returns
- Exceptions that halt the machine and print the stack trace
- Heap allocated objects
You can download the latest version from the Releases page or build it from source.
After you have the binary executable jex_vm, you can run it:
./jex_vm path/to/bytecode
After you have the binary executable jex_vm.exe, you can run it:
./jex_vm.exe path/to/bytecode
To run with logging, set the environment variable RUST_LOG=jex_vm,extendable_vm.
For example,
RUST_LOG=jex_vm,extendable_vm ./jex_vm path/to/bytecode
You can run the bytecode examples with jex_vm.
For instance, running 2_times_10.bytecode should print 20.
u8 represents an unsigned 8-bit integer.
Name | Opcode (u8) | Arguments (name: type) | Stack (old → new) | Description |
---|---|---|---|---|
Constant | 0 | i: u8 | [] → [value] | Loads the i-th constant from the constant pool onto the stack |
Null | 1 | | [] → [null] | Loads null onto the stack |
True | 2 | | [] → [true] | Loads true onto the stack |
False | 3 | | [] → [false] | Loads false onto the stack |
Pop | 4 | | [x, y] → [x] | Pops the last value from the stack |
Get local | 5 | offset: u8 | [..., x] → [..., x, y] | Gets the offset-th operand in the current call frame and loads it onto the stack |
Set local | 6 | offset: u8 | [..., x, ..., y] → [..., y, ...] | Pops the value and sets the offset-th operand in the current call frame |
Get global | 7 | identifier_i: u8 | [...] → [..., x] | Loads a global value onto the stack by its identifier, which is fetched from the constant pool at index identifier_i |
Define global | 8 | identifier_i: u8 | [x] → [] | Sets a global value with the given identifier |
Set global | 9 | identifier_i: u8 | [x] → [] | Sets a global value with the given identifier |
Print | 10 | | [x] → [] | Prints a value |
Not | 11 | | [x] → [!x] | Logical NOT |
Equal | 12 | | [x, y] → [x == y] | Checks if 2 values are equal |
Greater | 13 | | [x, y] → [x > y] | Checks if the first is greater than the second |
Less | 14 | | [x, y] → [x < y] | Checks if the first is less than the second |
Negate | 15 | | [x] → [-x] | Negates an integer |
Add | 16 | | [x, y] → [x + y] | Adds integers or concatenates strings |
Subtract | 17 | | [x, y] → [x - y] | Subtracts integers |
Multiply | 18 | | [x, y] → [x * y] | Multiplies integers |
Divide | 19 | | [x, y] → [x / y] | Divides integers |
Jump forward | 20 | offset: u8 | | Jumps forward by offset bytes |
Jump forward if false | 21 | offset: u8 | [x] → [] | Jumps forward by offset bytes if the value is false |
Jump backward | 22 | offset: u8 | | Jumps backward by offset bytes |
Call | 23 | arity: u8 | | Calls a function with arity arguments. For example, CALL 3 will call f(a, b, c) when the stack is [f, a, b, c] |
Return | 24 | | Pops the last call frame and puts the returned value on top | Returns from the function |
To string | 25 | | [x] → [str(x)] | Converts a value to a string |
Read line | 26 | | [] → [x] | Suspends the VM and reads a line from STDIN |
Parse int | 27 | | [x] → [int(x)] | Parses a string as an int. Returns null if the string cannot be parsed |
New instance | 28 | | [] → [x] | Creates an empty instance |
Get field | 29 | constant_id: u8 | [obj] → [field_value] | Gets a field of obj: obj[str_constant] |
Set field | 30 | constant_id: u8 | [obj, value] → [obj] | Sets a field of obj: obj[str_constant] = value |
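To illustrate how the stack effects in the table play out, here is a minimal sketch of a dispatch loop for three of the opcodes (Constant, Add, Multiply). It is not the jex_vm implementation: the `run` function is hypothetical, the constant pool is simplified to integers only, and every other opcode is rejected.

```rust
// Opcode values taken from the instruction table above.
const OP_CONSTANT: u8 = 0;
const OP_ADD: u8 = 16;
const OP_MULTIPLY: u8 = 18;

/// Runs `code` against `constants` and returns the final operand stack.
/// Hypothetical helper: integers only, all other opcodes return an error.
fn run(code: &[u8], constants: &[i32]) -> Result<Vec<i32>, String> {
    let mut stack: Vec<i32> = Vec::new();
    let mut ip = 0; // index of the next byte to decode
    while ip < code.len() {
        let opcode = code[ip];
        ip += 1;
        match opcode {
            OP_CONSTANT => {
                // One-byte argument: index into the constant pool.
                let index = *code.get(ip).ok_or("missing argument for Constant")? as usize;
                ip += 1;
                let value = *constants.get(index).ok_or("constant index out of range")?;
                stack.push(value);
            }
            OP_ADD | OP_MULTIPLY => {
                // [x, y] -> [x op y]: y is on top, so it is popped first.
                let y = stack.pop().ok_or("stack underflow")?;
                let x = stack.pop().ok_or("stack underflow")?;
                stack.push(if opcode == OP_ADD { x + y } else { x * y });
            }
            other => return Err(format!("opcode {other} is not handled in this sketch")),
        }
    }
    Ok(stack)
}
```

With the constant pool `[2, 10]`, the code bytes `[0, 0, 0, 1, 18]` (Constant 0, Constant 1, Multiply) leave `[20]` on the stack.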
This describes the format of the bytecode that the VM can read from a file.
structs are used to show what each byte means. Each struct should be viewed as an array of bytes where each value directly follows the previous one (without padding and packing).
For example, struct A represents an array [a1, a2, b] where a1 and a2 correspond to a: u16 and b to b: u8.
struct A {
a: u16,
b: u8
}
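As a sketch of that flat layout, the hypothetical function `decode_a` below reads `struct A` back from three consecutive bytes. The text above does not state the byte order of `a`, so little endian is an assumption here.

```rust
/// Mirrors `struct A` from the text: a two-byte `a` followed by a one-byte `b`.
#[derive(Debug, PartialEq)]
struct A {
    a: u16,
    b: u8,
}

/// Decodes `A` from the first three bytes of `bytes`.
/// Returns `None` if fewer than three bytes are available.
/// Assumption: `a` is stored little endian.
fn decode_a(bytes: &[u8]) -> Option<A> {
    match bytes {
        [a1, a2, b, ..] => Some(A {
            a: u16::from_le_bytes([*a1, *a2]),
            b: *b,
        }),
        _ => None,
    }
}
```

Under that assumption, the bytes `[0x34, 0x12, 7]` decode to `A { a: 0x1234, b: 7 }`.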
Bytecode is an array of bytecode chunks. The first chunk is the global script, which runs first; the other chunks can be called as functions.
Each chunk has n_constants constants (the constant pool) and n_code_bytes executable bytes that contain instructions and their arguments.
struct Bytecode {
chunks: [Chunk]
}
struct Chunk {
n_constants: u8,
constants: [Constant], // of `n_constants` size
n_code_bytes: u16,
code: [u8] // of `n_code_bytes` size
}
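The chunk layout can be sketched from the writer's side as well. The hypothetical `encode_chunk` below takes constants that are already serialized and concatenates the fields in the order given by `struct Chunk`. The byte order of `n_code_bytes` is not stated above, so little endian is an assumption.

```rust
/// Serializes one chunk from already-encoded constants and raw code bytes.
/// Returns `None` if the counts do not fit the u8 / u16 length fields.
fn encode_chunk(constants: &[Vec<u8>], code: &[u8]) -> Option<Vec<u8>> {
    if constants.len() > u8::MAX as usize || code.len() > u16::MAX as usize {
        return None;
    }
    // n_constants, followed by the constants back to back.
    let mut out = vec![constants.len() as u8];
    for constant in constants {
        out.extend_from_slice(constant);
    }
    // Assumption: `n_code_bytes` is stored little endian.
    out.extend_from_slice(&(code.len() as u16).to_le_bytes());
    out.extend_from_slice(code);
    Some(out)
}
```

For one 5-byte constant and the code `[0, 0, 10]` (Constant 0, then opcode 10), the result is 11 bytes long: 1 count byte, 5 constant bytes, 2 length bytes and 3 code bytes.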
Bytecode constants are literal values that are included in the code. There are 3 types of constants: ints, strings and functions.
Each constant type has a unique constant_type, which is used to distinguish it from the other types.
struct Constant {
constant_type: u8,
data: [u8]
}
// Constant := IntConstant | StringConstant | FunctionConstant
struct IntConstant {
constant_type: u8, // always 0
value: i32 // little endian
}
struct StringConstant {
constant_type: u8, // always 1
length: u16,
utf8_data: [u8]
}
struct FunctionConstant {
constant_type: u8, // always 2
chunk_id: u8
}
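A reader for these three layouts might look like the sketch below. The `Constant` enum and `parse_constant` function are hypothetical names, not part of jex_vm. Two details are assumptions because the structs above do not specify them: the string `length` is read as little endian, and it is taken to count bytes of `utf8_data`.

```rust
#[derive(Debug, PartialEq)]
enum Constant {
    Int(i32),
    Str(String),
    Function { chunk_id: u8 },
}

/// Parses one constant from the start of `bytes`.
/// Returns the constant and the number of bytes consumed, or `None` on malformed input.
fn parse_constant(bytes: &[u8]) -> Option<(Constant, usize)> {
    match bytes {
        // IntConstant: type 0, then a little-endian i32.
        [0, b0, b1, b2, b3, ..] => {
            let value = i32::from_le_bytes([*b0, *b1, *b2, *b3]);
            Some((Constant::Int(value), 5))
        }
        // StringConstant: type 1, a u16 length, then that many UTF-8 bytes.
        // Assumption: `length` is little endian and counts bytes.
        [1, l0, l1, rest @ ..] => {
            let len = u16::from_le_bytes([*l0, *l1]) as usize;
            let data = rest.get(..len)?;
            let text = String::from_utf8(data.to_vec()).ok()?;
            Some((Constant::Str(text), 3 + len))
        }
        // FunctionConstant: type 2, then the chunk id.
        [2, chunk_id, ..] => Some((Constant::Function { chunk_id: *chunk_id }, 2)),
        _ => None,
    }
}
```

Returning the consumed byte count lets a caller parse a whole constant pool by advancing through the input one constant at a time.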
All chunks except the first one can be called as functions with a CALL instruction.
A callable chunk must have these 2 constants:
- the first constant must be the function name (string)
- the second constant must be the function arity (int)
To call a chunk, load it onto the stack with a Constant instruction, load the arguments onto the stack and call it with a CALL arity instruction.
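The calling sequence described above can be sketched as a small code emitter. `emit_call` is a hypothetical helper, and it assumes that the function and every argument are already present in the calling chunk's constant pool. The opcode values come from the instruction table.

```rust
const OP_CONSTANT: u8 = 0;
const OP_CALL: u8 = 23;

/// Emits code that loads the function constant at `function_index`,
/// then each argument constant in order, then `Call` with the argument count.
/// Returns `None` if there are more arguments than the one-byte arity can hold.
fn emit_call(function_index: u8, argument_indices: &[u8]) -> Option<Vec<u8>> {
    if argument_indices.len() > u8::MAX as usize {
        return None;
    }
    // The function goes on the stack first, below its arguments.
    let mut code = vec![OP_CONSTANT, function_index];
    for &index in argument_indices {
        code.extend_from_slice(&[OP_CONSTANT, index]);
    }
    code.extend_from_slice(&[OP_CALL, argument_indices.len() as u8]);
    Some(code)
}
```

For a function at constant index 0 and arguments at indices 1, 2 and 3, this produces `[0, 0, 0, 1, 0, 2, 0, 3, 23, 3]`, which matches the `[f, a, b, c]` stack shape expected by CALL 3.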
The executable will be located under target/debug.
cargo build
The executable will be located under target/release.
cargo build --release
cargo test