#Implementing BrainFuck JIT in PyPy
In what follows we will be implementing a simple interpreted language using PyPy which is a set of tools for implementing interpreters. The exposition follows PyPy Status Blog: Writing an interpreter with PyPy by Carl Friedrich Bolz
##BrainFuck features
The language consists in a sequence of tapes that can hold integer values and a single pointer to one of the cells. Even with such a simple model we will have IO operations, looping and branching and assignment.
-
Moves the tape pointer one cell to the right
- < Moves the tape pointer once cell to the left
-
- Increment the value of the cell underneath the pointer
-
- Decrement the value of the cell underneath the pointer
- [ If the value under the cell is 0, skip to the instruction after the matching ]
- ] Skip back to matching [
- . Print out a single byte to stdout from the current cell
- , Read in a single byte from stdin from the current cell
##Usage
python evalloop.py 99beer.b
This implementation is very slow so feel free to instead use the below to have a better idea of what's going on underneath the blinking cursor.
python -m trace --trace evalloop.py 99beer.b
##How to build it
###Implementing the tape
Checkout the Tape
class and the rest won't need to much imagination
def __init__(self):
"""
Pointer starts at first cell
Tape starts out as all 0
"""
self.thetape = [0]
self.position = 0
###Main loop
We initialize an empty tape and iterate over each character of code using the program counter.
tape = Tape()
pc = 0
while pc < len(program):
code = program[pc]
Assignment is easy
if code == ">":
tape.advance()
So is IO
elif code == ".":
sys.stdout.write(chr(tape.get()))
For looping we need a way to keep track of matching brackets and we do this via a simple stack when parsing the program and we will pass the matching brackets in a dictionary form to the main loop.
##PyPy translation
The above was undoubtedly incredibly slow, so to speed it up we'll write the above using RPython which implements a restricted subset of Python to allow speedups.
For example, we can no longer use sys.stdout.write
but will instead use os.write
elif code == ".":
os.write(1, chr(tape.get()))
More importantly, we're expected to implement three functions:
target
which returns anentry_point
functionentry_point
which given a filename creates a file pointer forrun
run
which willos.read(fp)
, populate theprogram_contents
, parse it and then run evaluate it.
pypy-evalloop.py
contains the details
###Usage
Go ahead and download Pypy from github and run
python ./pypy/pypy/translator/goal/translate.py pypyevallop.py
The result will be an executable binary that you can then use to interpret the BrainFuck file much much faster.
./pypyevallop-c 99beer.b
##Adding a JIT
Pypy can also generate a JIT for us, to do this we must explictly define the language stack frames: in our case this translates to what is constant across the execution of a particular instruction.
- Constants (green): pc, program, bracket_map
- Variables (red): tape
jitdriver = JitDriver(greens=['pc', 'program', 'bracket_map'], reds=['tape'])
The boilerplate to set everything up is in pypyevallop.py
###Usage
python ./pypy/pypy/translator/goal/translate.py pypyevallop.py
def mainloop(...):
...
while pc < len(program):
jitdriver.jit_merge_point(pc=pc, tape=tape, program=program,bracket_map=bracket_map)
...
Should produce a much larger binary that interprets the BrainFuck files much faster
###How it works
The interpreter is usually running on your code as it is written, when it enters a loop that loop is considered hot and marked to be traced. When the loop is finished, the trace is sent to an optimizer and then an assembler which outputs machine code that gets executed every time an instruction is logged.
You can see how it works by adding a get_location
function that you pass to the JitDriver
def get_location(pc, program, bracket_map):
return "%s_%s_%s" % (
program[:pc], program[pc], program[pc+1:]
)
jitdriver = JitDriver(greens=['pc', 'program', 'bracket_map'], reds=['tape'],
get_printable_location=get_location)