Skip to content

A sub-1KB, self-hosting, native code Forth without compromise

Notifications You must be signed in to change notification settings

mkicjn/paraforth

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

paraforth

A sub-1KB, self-hosting, native code Forth without compromise

At the core of paraforth is a very small assembly program - just an association list of names to subroutines, and an input loop for invoking them. By pre-populating the list with just enough functionality to build a macro assembler, a self-extensible language kernel is born.

This project is a long-running exercise in building the smallest self-sufficient Forth possible, without ANY sacrifices in speed or usability. No inputting pre-assembled machine code at runtime, and no cobbling together logic operations from NAND.

The entire language, save for just 15 words and 756 bytes of machine code, is implemented in itself - legibly - and builds in place on startup. Additionally, support for full bootstrapping coming soon (TM).

(Please note: This project is an active work in progress.)

Quirks and Features:

  • Tiny binary executable size - under one kilobyte
  • Fast - a simplistic benchmark task demonstrates ~4x speedup over gforth-fast on author's machine
  • Fewer primitives than eForth - 15 vs. 31 - with one spent just to enable line comments out of the box
  • Subroutine-threaded code with primitive inlining - works by postponing blocks of code with { and }
  • Compile-only Forth - code can still be "interpreted" (compiled and executed immediately) with [ and ]
  • All words technically immediate - non-immediates use a shim that compiles a call instruction
  • No internalized number syntax - parsing words like $ and # used for integer literals (at least at first)
    • Quality-of-life extensions in repl.fth implement base and automatic number parsing
  • An assembler for a useful subset of x86-64 implemented at runtime as the first extension of the compiler
  • Working but very basic demo of ELF executable generation (no metacompiler yet)
  • Reasonably extensive design notes in the source code - assumes familiarity with typical Forth internals

Getting Started:

The loader script and list files handle the tedium of concatenating and piping source code files (plus standard input, if applicable). Running ./loader.sh interactive.lst provides the friendliest environment available for experimentation.

From there, you can try entering the canonical Hello World example, which looks like this:

[ ." Hello, world!" cr  bye ]

Note that due to the interaction between cat, standard input, and the pipe to paraforth, you will need to hit enter once paraforth terminates before returning to the terminal. The loader script can also be run with no arguments for additional details.

More example code is available in the examples and src directories.

As a very brief overview, many trivial Forth examples can be translated to paraforth in just a couple of steps:

  • Surround interpreted non-immediate words with brackets to execute them.
    • Example: bye becomes [ bye ], 10 constant x becomes [ 10 ] constant x
  • (Optional) Precede numeric literals with a parsing word indicating the base.
    • Example: 77 becomes # 77 or $ 4d
    • Note: This step is only mandatory if not using the code in repl.fth

Friendly disclaimer: This is scratching the surface. Although this project aims to respect established conventions, standards conformance is not a priority. Deviations are necessary to serve design goals, constraints, and/or personal preferences.

(Old usage notes with some additional details)
  • Compile with make
  • Run manually with, e.g., cat input | ./paraforth > output or cat input - | ./paraforth
  • Debug with gdb paraforth -ex 'r < <(cat input)' and an int3 assembled somewhere
    • Tip: Disassemble latest word with x/10i $rsi+9+N where N is the length of its name (i.e., x/1c $rsi+8)
  • Disassemble using objdump -b binary -m i386:x86-64 -D paraforth

Dependencies:

  • fasm (flat assembler; used to assemble the core)
  • Linux-based OS (only to host syscalls)

My hope for this project is that it will eventually become fully self-hosting, even down to the OS level in the distant future.

Roadmap:

  1. Produce a minimal subroutine-threaded Forth compiler capable of implementing an assembler.
    • (DONE)
  2. Implement a basic assembler using the compiler.
    • (DONE)
  3. Extend the existing Forth compiler in-place using the assembler.
    • (DONE)
  4. Improve usability by providing a REPL with error handling, convenient launch scripts, and library code.
    • (IN PROGRESS)
  5. Bootstrap the core and use the resulting matured system for bigger projects. (Generating UEFI executables?)

(Anything marked done is still subject to improvements over time.)

Resources:

For a list of words paraforth currently offers, load the interactive file list and invoke words (defined in src/repl.fth).

To learn the system in detail, review src/core.asm before proceeding through the files listed in interactive.lst. All source code comments in this project assume familiarity with programming in Forth, as well as typical Forth implementation techniques.

Recommended background resources:

Additionally, a few notable design decisions were inspired by FreeForth.

About

A sub-1KB, self-hosting, native code Forth without compromise

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published