Skip to content

amckewan/forth-itc64

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

65 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

64-bit Indirect-Threaded Forth

This Forth model and implementation was inspired by this Linkedin post about the unique design of the 1980s video game Starflight which was written in Forth. The author remarked that "the interpreter is only 5% of the state: the rest is the Forth heap, full of cons cells (basically) that function as both data and code."

This describes the classic Forth model I first learned from public Forths like FIG-Forth and F83, and is the same model as the first Win32Forth. Simple indirect-threaded code. A small number of words are implemented in machine-dependent assembly code, the rest of the dictionary is built as lists of pointers to other words, the "cons cells."

The article concludes, that "just as in the history of life, a lot of cool weird lineages died out."

We know Forth isn't quite dead yet, although it's not in the top 50 programming languages according to the TIOBE index. There are viable Forth-based software companies, hobby groups and, I am sure, many personal Forth projects. This is one of them.

This is an adaptation of the classic Forth model to run on a modern 64-bit OS.

  • 64-bit cell size and address space
  • 32-bit execution tokens
  • Forth addresses are process addresses (no rel>abs)
  • Separate code and data
  • 32 GB max dictionary size
  • Indirect-threaded inner interpreter
  • Kernel: 250 words, 3 KB code, 10 KB data
  • Extended: 340 words, 15 KB data

Because we extended the cell size from 16 to 64 and the execution token from 16 to 32 bits, we would expect Starflight to be about twice the size of the DOS version (assuming we had the source and the patience to port it). We wouldn't need the swapping code since the entire dictionary can fit in memory. Perhaps it would fit on one of those new-fangled 1.44 MB 3.5" floppies.

What is missing? A LOT! This is just enough to compile itself and pass the standard core test suite. I did this for fun. If you want to program in Forth, consider a more complete system. This is just a toy.

Building

Building requires make, clang, nasm and gforth.

Try make, make test and make run.

Source Description
fo.c C wrapper that allocates memory, loads the dictionary, starts Forth and provides the BIOS
bios.c Implements basic OS functions
cross.f Forth cross-compiler
kernel.f Forth source for the kernel
kernel.asm x86-64 assembly source
rth Loads extensions on top of the kernel
src/ Forth source for the extensions
test/ Test suite
bench/ Benchmarks to compare with gforth (sanity check)

The wrapper fo is built from the C sources.

The assembly code in kernel.asm is built with NASM and produces two files:

code.bin - binary image to be loaded at origin

code.sym - symbols used to build the data dictionary

The cross compiler loads the code symbols and defines the words that build the data dictionary. We then load kernel.f to build the dictionary which is written to data.bin.

With fo, code.bin and data.bin we can then run Forth like this:

$ ./fo rth

Memory Map

Memory is mapped at a fixed address with the mmap() function. Linux and Windows allow you to map addresses as low as 64K, but MacOS won't allocate below 4 GB. Given that, we map our Forth dictionary at address $100000000 (4 GB).

The Forth dictionary is arranged as follows:

      LIMIT --> +-------------------------------+
                |     input buffers (~4K)       |
      FIRST --> +-------------------------------+
                |    data stack, grows down     |
                |                               |
                |                               |
                |                               |
                |                               |
                |     free space, grows up      |
       HERE --> +-------------------------------+
                |                               |
                |          dictionary           |
                |                               |
                +-------------------------------+
                |           code (8K)           |
     ORIGIN --> +-------------------------------+

We use the C stack as the Forth return stack.

x86 Registers

This is the register usage for x86 Linux:

x86 Forth Description
rax top Top of data stack
rbx w XT of the currently executing word
rcx
rdx
rsi
rdi
rsp sp Data-stack pointer
rbp rp Return-stack pointer
r8-11
r12 ip Instruction pointer
r13 lp Local frame pointer
r14 up User pointer (not currently used)
r15 origin Start of dictionary

Unassigned registers are freely used in code words. All Forth registers except rax are preserved across API calls (System V ABI).

Word Headers

Headers have the name preceeding the count so all fields are at a fixed offset from the CFA. The CFA and PFA are cell-aligned.

Field Size Abbr. Description
Name 1-31 Name characters
Count 1 NFA Count + flags
Link 4 LFA Link field, XT of previous definition
Code 8 CFA Code field, points to executable code
Body 0+ PFA Parameter field

Inner Interpreter

Code fields are 8-byte aligned and contain the address of the machine code implementing the word.

An execution token (XT) is the offset, in cells, from the origin to the code field. We limit the XT to 32 bits, giving us a maximum dictionary size of 32 GB.

: XT  ( cfa -- xt )  ORIGIN -  3 RSHIFT ;
: CFA ( xt -- cfa )  3 LSHIFT  ORIGIN + ;

The Forth instruction pointer (IP) is 4-byte aligned and points to the XT of the next word to execute. The x86-64 implementation of NEXT is three instructions:

    mov     ebx,[r12]       ; fetch xt (ebx zero extended -> rbx)
    add     r12,4           ; advance ip
    jmp     [r15+rbx*8]     ; indirect jump via code field

The code for a colon definition pushes the IP onto the return stack and loads the IP from the PFA of the word. In x86-64 (simplified):

    mov     [rp-8],ip       ; push ip to the return stack
    sub     rp,8
    lea     ip,[r15+rbx*8+8] ; new ip = pfa
    next

At the end of a colon definition, we compile ;S which does:

    mov     ip,[rp]         ; pop ip from the return stack
    add     rp,8
    next

Porting

For x86 Windows, we would need to modify the OS interface to account for the different x86-64 ABI.

x86 MacOS should be trivial (a recompile) although no longer meaningful.

ARM MacOS would require porting the assembly code to aarch64. The Forth source and BIOS should not require changes.

Porting to a non-64-bit architecture is more significant since the code in many places just assumes a 64 bit cell size.

About

64-bit Indirect-threaded Forth Model

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors