A lightweight Virtual Machine for dynamic, object-oriented languages. It aims to
be fast, as simple as possible, easily optimizable, with LLVM support, and
easily targetable for language designers and implementors. That's why its
interface (instruction set and bytecode format) is extensively documented and an
example compiler is provided under the
Before anything, I want to give special thanks to my awesome mentors Jeremy Tregguna, Brian Ford, Dirkjan Bussink and Evan Phoenix. Without their teachings and patience I would never have started this in the first place.
I'd love to discuss literally anything about my choices regarding the design and the implementation of TerrorVM. Feel free to ping me on Twitter, drop me an email, or, if you are in Berlin, let's just grab some beers together :) After all,
In TerrorVM, everything is an object, and every object may have a prototype. The basic value types that the VM provides are:
Integer: Simple integers.
String: Immutable strings.
Vector: Dynamically sized vectors that may contain any type.
Map: Hashmaps (for now only strings are supported as keys).
Closure: A first-class function.
True: True boolean.
False: False boolean.
Nil: Represents nothingness. It is falsy just like
These basic types are objects themselves (of type Object). Each one is the
prototype for all objects of its own kind, and provides all the functionality
that those objects will need. This is wired up in the prelude, which I'll
explain a bit further ahead.
Objects are simply collections of slots that may contain any kind of object.
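To make the slot and prototype idea concrete, here is a minimal sketch in Ruby. It is not taken from the VM's source: the SlotObject class and its method names are hypothetical, and walking up the prototype chain on lookup is an assumption about how inheritance resolves slots.

```ruby
# Hypothetical model of a TerrorVM-style object: a bag of named slots plus an
# optional prototype. None of these names come from the VM's source.
class SlotObject
  attr_reader :slots, :prototype

  def initialize(prototype = nil)
    @slots = {}
    @prototype = prototype
  end

  def set_slot(name, value)
    @slots[name] = value
  end

  # Look in our own slots first, then walk up the prototype chain
  # (the chain walk is an assumption about how lookup works).
  def get_slot(name)
    object = self
    while object
      return object.slots[name] if object.slots.key?(name)
      object = object.prototype
    end
    raise KeyError, "slot not found: #{name}"
  end
end

animal = SlotObject.new
animal.set_slot("sound", "generic noise")

dog = SlotObject.new(animal)
dog.set_slot("name", "Rex")

puts dog.get_slot("name")   # found in dog's own slots
puts dog.get_slot("sound")  # found on the prototype
```

Asking for a slot that exists nowhere in the chain raises an error in this sketch, mirroring what GETSLOT is described as doing.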
I'm considering adding Traits, although I'll wait until I see a need for it. In the simplicity of Terror lies its power.
TerrorVM tries to implement as much as possible in its own code, rather than C.
This makes it a perfect candidate as a multi-language VM to implement any
language on top of it. You can find a high-level prelude under
compiler/examples/prelude.rb that compiles down to Terror native
This prelude wires up the VM primitives to the real objects at runtime, so that your code can use them conveniently.
To recompile all examples and kernel files from Ruby to TVM bytecode, do this:
$ make kernel
$ make examples
Implementing your own dynamic language running on TerrorVM
TerrorVM is designed to run dynamic languages. You can easily implement a compiler of your own that compiles your favorite dynamic language down to TVM bytecode.
I've written a demo compiler in Ruby under the
compiler/ folder, just to
show how easy it is to write your own. This demo compiler compiles a subset of
Ruby down to TerrorVM bytecode, so you can easily peek at the source code or
just copy and modify it.
You can write your compiler in whatever language you prefer, of course.
The algorithm of choice for TerrorVM is Baker's treadmill, a real-time, non-moving GC algorithm. Unfortunately it is not implemented yet. It will be as soon as I understand how to do it. Remember it's a work in progress :)
This is a really important topic these days, not to be overlooked. TerrorVM's concurrency support is not in place yet. The plan is for it to feature forking, threads and coroutines, but I might change my mind as I learn more.
The bytecode format might change to be more compact, but I'll describe what it is for now. A file must contain a main block, and may contain other blocks (functions defined there). This is what a block looks like (if you're curious, it's just a hello world):
_main
:2:8
"hello world
"puts
16 PUSHSELF
17 PUSH
0
128 SEND
1
1
20 PUSHNIL
144 RET
As you can see, _main defines the entry point of the file. Then :2:8 means
that this block has two literals and eight lines of instructions. There are
actually only 5 instructions, but the operands for these instructions count
as well, for a total of 8.
Right after these counts, we have the literals, each one in its own line. There
are two kinds of literals: integers and strings. Integers are just numbers, but
strings must be preceded by a
And finally we get to eight lines of numbers, namely the instructions and their
operands. The labels you see beside every instruction (PUSHSELF) are totally
optional: the VM doesn't read them, but they help with debugging when reading a
bytecode file manually.
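The format described above is simple enough to read with a few lines of code. The following Ruby sketch parses one block. It is not the VM's actual loader: Block and parse_block are hypothetical names, and it assumes that the block name and the counts each sit on their own line, followed by one literal per line and one number per instruction line.

```ruby
# Hypothetical parser for one block of the textual bytecode format.
# Block and parse_block are illustrative names, not part of TerrorVM.
Block = Struct.new(:name, :literals, :code)

def parse_block(lines)
  name = lines.shift.strip      # e.g. "_main"
  counts = lines.shift.strip    # e.g. ":2:8"
  literal_count, code_count =
    counts.split(":").reject(&:empty?).map { |n| Integer(n, 10) }

  literals = lines.shift(literal_count).map do |line|
    # Strings carry a leading double quote; anything else is an integer.
    line.start_with?('"') ? line[1..-1] : Integer(line.strip, 10)
  end

  # Keep only the leading number of each line; the label after it is optional.
  code = lines.shift(code_count).map { |line| Integer(line.split.first, 10) }

  Block.new(name, literals, code)
end

source = <<~TVM
  _main
  :2:8
  "hello world
  "puts
  16 PUSHSELF
  17 PUSH
  0
  128 SEND
  1
  1
  20 PUSHNIL
  144 RET
TVM

lines = source.lines.map(&:chomp)
block = parse_block(lines)
p block.literals  # ["hello world", "puts"]
p block.code      # [16, 17, 0, 128, 1, 1, 20, 144]
```

Because parse_block consumes the lines it reads, calling it again on the same array would pick up the next block in the file, if there is one.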
After that there might be more functions. Imagine our hello world defined an
empty closure, then we'd have right after
_block_153
:0:2
20 PUSHNIL
144 RET
That's it! :)
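If you are writing your own compiler, producing this format is mostly string formatting. Here is a rough Ruby sketch of an emitter for a single block. The emit_block helper is hypothetical and is not how the demo compiler is structured; the opcode numbers are the ones visible in the hello world listing, and the one-item-per-line layout is an assumption.

```ruby
# Opcode numbers copied from the hello world listing; every other
# instruction is left out because its number is not shown there.
OPCODES = { PUSHSELF: 16, PUSH: 17, PUSHNIL: 20, SEND: 128, RET: 144 }.freeze

# Hypothetical emitter: turns a block name, its literals and a list of
# [instruction, operands...] into the textual bytecode format.
def emit_block(name, literals, instructions)
  code_lines = instructions.flat_map do |op, *operands|
    # The label after the opcode is optional, but handy when debugging.
    ["#{OPCODES.fetch(op)} #{op}", *operands.map(&:to_s)]
  end
  literal_lines = literals.map do |literal|
    literal.is_a?(String) ? "\"#{literal}" : literal.to_s
  end
  header = ":#{literals.size}:#{code_lines.size}"
  [name, header, *literal_lines, *code_lines].join("\n")
end

text = emit_block("_main", ["hello world", "puts"], [
  [:PUSHSELF],
  [:PUSH, 0],
  [:SEND, 1, 1],
  [:PUSHNIL],
  [:RET]
])
puts text
```

Note that the second count in the header is the number of emitted lines (instructions plus operands), not the number of instructions.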
Examples (high-level Ruby code and its Terror compiled counterpart)
- Hello world (Ruby code, TVM bytecode)
- Maps (Ruby code, TVM code)
- Vectors (Ruby code, TVM code)
- Numbers (Ruby code, TVM code)
- Objects with prototypal inheritance (Ruby code, TVM bytecode)
- Functions and closures (Ruby code, TVM bytecode)
- NOOP: no operation -- does nothing.
- PUSHSELF: pushes the current self to the stack.
- PUSH A: pushes the literal at index A to the stack.
- PUSHTRUE: pushes the true object to the stack.
- PUSHFALSE: pushes the false object to the stack.
- PUSHNIL: pushes the nil object to the stack.
- PUSHLOCAL A: pushes the local at index A to the stack.
- SETLOCAL A: sets the local variable A to the current top of the stack. Does not consume any stack.
- JMP A: Jumps forward A instructions.
- JIF A: Jumps forward A instructions if the top of the stack is falsy (
- JIT A: Jumps forward A instructions if the top of the stack is truthy (any value other than
- GETSLOT A: Pops the object at the top of the stack and asks for its slot with name A (a literal), pushing it to the stack if found -- if not, it'll raise an error.
- SETSLOT A: Pops a value to be set, then pops the object at the top of the stack and sets its slot with name A (a literal) to the value that was first popped. Then pushes that value back to the stack.
- POP: pops a value off the stack.
- DEFN A: takes the closure with the name A (a literal) and pushes it to the stack.
- MAKEVEC A: Pops A elements off the stack and pushes a vector with all of them in the order they were popped (the reverse of the order in which they were pushed).
- SEND A, B: Pops B arguments off the stack, then the receiver, and sends it the message with the name A (a literal) with those arguments.
- DUMP: Prints the contents of the value stack to the standard output.
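To get a feel for how these instructions move values around, here is a toy stack interpreter in Ruby covering a handful of them. It is only a sketch under several assumptions: instructions are symbolic rather than numeric, SEND is delegated to the host Ruby object and pushes the result, arguments are passed in the order they were pushed, and RET (which is not described in the list above) hands back the top of the stack. The run function is a hypothetical name, and jumps, slots and closures are left out.

```ruby
# Hypothetical mini-interpreter for a few of the instructions listed above.
# It does not reflect how the real VM is implemented.
def run(instructions, literals, receiver = nil)
  stack = []
  locals = []

  instructions.each do |op, a, b|
    case op
    when :NOOP      then next
    when :PUSHSELF  then stack.push(receiver)
    when :PUSH      then stack.push(literals[a])
    when :PUSHTRUE  then stack.push(true)
    when :PUSHFALSE then stack.push(false)
    when :PUSHNIL   then stack.push(nil)
    when :PUSHLOCAL then stack.push(locals[a])
    when :SETLOCAL  then locals[a] = stack.last  # leaves the stack untouched
    when :POP       then stack.pop
    when :MAKEVEC
      # Elements end up in the order they were popped.
      stack.push(Array.new(a) { stack.pop })
    when :SEND
      args = stack.pop(b)   # assumption: arguments keep their push order
      target = stack.pop    # the receiver sits below the arguments
      # Assumption: the result of the message is pushed back.
      stack.push(target.public_send(literals[a], *args))
    when :RET
      return stack.pop      # assumption: RET yields the top of the stack
    else
      raise ArgumentError, "unsupported instruction: #{op}"
    end
  end
  nil
end

literals = [2, 3, "+"]

sum = run([[:PUSH, 0], [:PUSH, 1], [:SEND, 2, 1], [:RET]], literals)
puts sum  # 5

vec = run([[:PUSH, 0], [:PUSH, 1], [:MAKEVEC, 2], [:RET]], literals)
p vec     # [3, 2]
```

The second example shows the MAKEVEC ordering: 2 is pushed before 3, so 3 is popped first and ends up at the front of the vector.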
Building the VM
It uses the C11 standard, which both GCC and Clang support, so you'll be alright.
$ git clone git://github.com/txus/terrorvm.git
$ cd terrorvm
$ make
To run the tests:
$ make dev
And to clean the mess:
$ make clean
.tvm bytecode files such as the
numbers.tvm under the
$ ./bin/tvm examples/numbers.tvm
It ships with a simple compiler written in Ruby (Rubinius) that compiles a
tiny subset of Ruby to
.tvm files. Check out the
compiler directory, which
has its own Readme, and the
compiler/examples where we have the
hello_world.rb file used to produce the
TerrorVM doesn't need Ruby to run; even the example compiler is a proof of concept and could be written in any language (even in C obviously).
- Fork it
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Added some feature')
- Push to the branch (git push origin my-new-feature)
- Create new Pull Request