r4 is an interactive environment where the source code IS the machine code. There is no compilation in r4.
r4 is a stack-based, RPN, virtual CPU/VM that supports many registers, functions, locals, floating point, and any amount of RAM.
The number of registers, functions, and memory are configurable and can be scaled as necessary to fit into a system of any size.
For example, one might configure things like this:
- On a Leonardo, 16 registers, 32 functions, 1K of CODE RAM, no VARS RAM.
- On an ESP8266, 512 registers, 512 functions, 12K of CODE RAM, 12K VARS RAM.
- On a RPI Pico, 16K registers, 16K functions, 64K of CODE RAM, 64K VARS RAM.
- On a PC, 64K registers, 64K functions, 128K of CODE RAM, 256K VARS RAM.
There are multiple reasons for creating r4, including:
- Freedom from the need for a multiple gigabyte tool chain and the edit/compile/run/debug loop for developing everyday programs. Of course, you need one of these monsters to build r4, but at least after that, you are free of them.
- Many interpreters use tokens and a large SWITCH statement in a loop to execute the program. In those systems, there needs to be a compiler to parse the input text and generate the machine code that the VM executes. Additionally, the VM's opcodes (the cases in the SWITCH statement) are often arbitrarily assigned and are not human-readable, so they have no meaning to the programmer when inspecting the code that is actually being executed. I wanted to avoid as much of that as possible, and have only one thing to learn: the VM's opcodes.
- A minimal implementation that is "intuitively obvious upon casual inspection" and easy to extend as necessary.
- An interactive programming environment.
- The ability to use the same environment on my personal computer as well as development boards.
- An environment that can be deployed to many different types of development boards via the Arduino IDE.
- Short commands so that there is not a lot of typing needed.
A register name is of the form [A-Z][A-Z0-9]*
(e.g. - REG12).
- They can be retrieved, set, incremented, or decremented in a single operation (r[REG],s[REG],i[REG],d[REG]).
- Example: -
12 sTMP1 34 sTMP2 rTMP1 rTMP2 + .
will print46
.
A reference to a temporary register is of the form [rsid][0-9]
(e.g. r6).
- They are allocated and freed in groups of 10.
- Working with these are very fast, as the name does not need to be hashed.
T+
allocates 10 new temporary registers (r0-r9).T-
frees the most recently allocated set.- They have the same operations [rsid]. For example:
i4
increments register #4. - They are used like other registers. For example:
r7
pushes the value of register #7. - They can be used as local variables inside a function (using T+ and T-), or across functions.
A function name is of the form [A-Z][A-Z0-9]*
(e.g. - BTW10AND20).
Functions are defined in a Forth-like style, using ':', and are called using the 'c' opcode.
0(COPY (F T N--): copy N bytes from F to T)
:COPY T+ s3 s2 s1 r3 0[r1 C@ r2 C! i1 i2] T-;
rFROM rTO rNUMBER cCOPY
Code | Description |
---|---|
"Hello World!" | The standard "hello world" program. |
:MIN %%>($)\; :MAX %%<($)\; | Define functions MIN and MAX |
:BTW T+ s3 s2 s1 r1 r2 > r1 r3 < b& T-; | Using temporary registers |
:BTW10AND20 9 21 cBTW; | Define a function with an alphanumeric name |
32 127[I###"%n[%c] - %d, %x, %b"] | Loop to print the ASCII table |
The Arduino "blink" program is a one-liner, except this version stops when a key is pressed:
1000 sDELAY 13 sLED rLED xPO 1{\ 0 2[I rLED xPWD rDELAY xW] K? ~} K@ \
Explanation of the blink program:
Code | Description |
---|---|
1000 sDELAY | Set register DELAY to 1000 |
13 sLED | Set register LED to 13, the typical port number of the built-in LED |
rLED xPO | Push register LED (13) on the stack and open that port |
1{ | Begin a WHILE loop; DROP control value |
\ 0 2 [ | Begin a FOR loop from 0 to 2 |
I rLED xPWD | Write the loop index (I 0 or 1) to port LED (13) |
rDELAY xW] | Wait for DELAY (1000) milliseconds; end FOR loop |
K? ~} | Check if a key was pressed; if not then continue, else exit loop |
K@ \ | Read the key that was pressed and discard it |
There are more examples here: https://github.com/CCurl/r4/blob/main/examples.txt
There are 2 memory areas in r4:
- CODE: r4 function code goes in this area. Default size is 128K.
- VARS: use with 'V' opcode. Default size is 256K.
xIH
(Info: HERE) returns the first unused address in the CODE area.- The CODE area can also be used for data. See the 'U' opcode.
- The VARS area is not used by the r4 system at all, it is 100% free.
Some development boards support LittleFS. For those boards, the LITTLEFS directive can be #defined to persist data to the board, including supporting automatically loading blocks on boot.
r4 includes a simple block editor. It has a VI-like feel to it.
- The entire system is implemented in a few files, primarily: config.h, r4.h, r4.cpp, pc-main.cpp, and r4.ino.
- There are a few additional files to support optional functionality (e.g.-File access).
- The same code runs on Windows, Linux, and multiple development boards (via the Arduino IDE).
- See the file "config.h" for system configuration settings.
r4 hashes register and function names and uses the hashed value as the index into the array of values or addresses.
Here is the hashing function (the DJB2a hash function using XOR):
int doHash(int max) {
UCELL hash = 5381;
if (!ISALPHA(*pc)) { return 0; }
while (ISALPHANUM(*pc)) {
hash = (hash * 33) ^ *(pc++);
}
return hash & max;
}
This is very fast, but poses some limitations:
- The maximum number of registers and functions need to be powers of 2.
- r4 does NOT detect hash collisions as it does not keep key values.
- My tests have indicated that for a large enough number of buckets, collisions are not common.
- ':' prints "-redef-f:[hash]-" when the given function name already has a value (a possible collision).
- '&' prints "-redef-r:[hash]-" when a the given register name already has a value (a possible collision).
- Use
xh[NAME]
to see info about [NAME]. For example:xhTEST
- The target machine/environment is controlled by the #defines in the file "config.h"
- For Windows, There is a Visual Studio solution. Use the x86 configuration (32-bit).
- For Development boards, I use the Arduino IDE. See the file "config.h" for board-specific settings.
- For Linux systems, there is a makefile. Default is 64-bit, or you can change it to 32-bit.
- I do not have an Apple system, so I haven't tried to build r4 for that environment.
- However, being such a minimal C program, it should not be difficult to port r4 to any environment.
OP | Stack | Description |
---|---|---|
+ | (a b--n) | n: a+b - addition |
- | (a b--n) | n: a-b - subtraction |
* | (a b--n) | n: a*b - multiplication |
/ | (a b--q) | q: a/b - division |
M | (a b--r) | r: a%b - modulo |
S | (a b--q r) | q: div(a,b), r: modulo(a,b) (SLASH-MOD) |
- r4 uses double wide (64-bit) floating point numbers
OP | Stack | Description |
---|---|---|
FF | (i--f) | f: i converted to a float |
FI | (f--i) | i: f converted to an integer |
F+ | (a b--n) | n: a + b |
F- | (a b--n) | n: a - b |
F* | (a b--n) | n: a * b |
F/ | (a b--n) | n: a / b |
F< | (a b--fl) | fl: if (a < b) then 1 , else 0 |
F= | (a b--fl) | fl: if (a = b) then 1 , else 0 |
F> | (a b--fl) | fl: if (a > b) then 1 , else 0 |
F. | (fl--) | Float: print fl |
F_ | (a--b) | b: -a |
FQ | (f--sq) | sq: the SQRT of f |
FT | (f--th) | th: the TANH of f |
OP | Stack | Description |
---|---|---|
b& | (a b--n) | n: a AND b |
b| | (a b--n) | n: a OR b |
b^ | (a b--n) | n: a XOR b |
b~ | (a--b) | b: complement of a (e.g - 1011 => 0100) |
L | (a n--b) | b: a << n (Left-Shift) |
R | (a n--b) | b: a >> n (Right-Shift) |
OP | Stack | Description |
---|---|---|
# | (a--a a) | Duplicate TOS (DUP) |
\ | (a b--a) | Drop TOS (DROP) |
$ | (a b--b a) | Swap top 2 stack items (SWAP) |
% | (a b--a b a) | Push 2nd (OVER) |
_ | (a--b) | b: -a (NEGATE) |
D | (a--b) | b: a-1 (DECREMENT) |
P | (a--b) | b: a+1 (INCREMENT) |
A | (a--b) | b: absolute value of a (ABS) |
xK | (--) | Display the stack (.S) |
OP | Stack | Description |
---|---|---|
@ | (a--n) | Fetch CELL_T n from address a |
! | (n a--) | Store CELL_T n to address a |
C@ | (a--n) | Fetch BYTE n from address a |
C! | (n a--) | Store BYTE n to address a |
U | (n--a) | a: address of byte n in the CODE area. |
V | (n--a) | a: address of byte n in the VARS area. |
- A register name is of the form
[A-Z][A-Z0-9]*
(e.g. - REG43). - The number of registers is controlled by the NUM_REGS #define in "config.h".
OP | Stack | Description |
---|---|---|
&ABC | (N--) | Define register rABC with initial value of N. |
*** NOTE: If register rABC has a value <> 0, r4 prints "-redef-r[hash]-". | ||
This can be used to check if there is a hashing collision. | ||
rABC | (--N) | N: value of register ABC. |
sABC | (N--) | N: value to store in register ABC. |
iABC | (--) | Increment register ABC. |
dABC | (--) | Decrement register ABC. |
- A reference to a temporary register is of the form
[rsid][0-9]
(e.g. r7).
OP | Stack | Description |
---|---|---|
T+ | (--) | Allocate 10 temporary registers |
rN | (--V) | V: value of register #N. |
sN | (V--) | V: value to store in register #N. |
iN | (--) | Increment register #N. |
dN | (--) | Decrement register #N. |
T- | (--) | Free the most recently allocated temporary registers |
- A function name is of the form
[A-Z][A-Z0-9]*
(e.g. - TMP23). - The number of functions is controlled by the NUM_FUNCS #define in "config.h".
- A function definition is limited to ONE line.
- Returning while inside of a loop will eventually cause a problem if not unwound.
- Use '^' to unwind the loop stack first.
OP | Stack | Description |
---|---|---|
:ABC | (--) | Define function ABC. Copy chars to (HERE++) until closing ';'. |
If function ABC has a value <> 0, print "-redef-f[hash]-". | ||
cABC | (--) | Call function ABC. Handles "tail call optimization". |
*** NOTE: this allocates 10 local variables, which are freed on ';' | ||
; | (--) | Return: PC = rpop() |
OP | Stack | Description |
---|---|---|
. | (N--) | Output N as a decimal number. |
, | (N--) | Output N as a character (EMIT) |
"str" | (--) | Output formatted characters until the next '"' see (1) |
B | (--) | Output a single SPACE (32,) |
N | (--) | Output a single NEWLINE (13,10,) |
K? | (--f) | f: non-zero if char is ready to be read, else 0. |
K@ | (--n) | n: Key char, wait if no char is available. |
0-9 | (--N) | Scan DECIMAL number N until non digit |
N.N | (--F) | - Use NNN.NNN (eg - 3.1415) to enter a floating point number F |
- to specify multiple values, separate them by space (4711 3333) | ||
- to enter a negative number, use "negate" (eg - 490_) | ||
hXXX | (--N) | Scan HEX number N until non hex-digit ([0-9,A-F] only ... NOT [a-f]) |
'C | (--n) | n: the ASCII value of C |
`x` | (a--a b) | Copy following chars until closing '`' to (address a). |
- a: address, b next byte after trailing NULL. |
(1) Output formatting:
- Similar to 'C' formatting, the '%' char identifies additional processing.
- %c: (N--) Output N as a char
- %b: (N--) Output N as a number in base 2
- %d: (N--) Output N as a number in base 10
- %x: (N--) Output N as a number in base 16
- %B: (N B--) Output N as a number in base B
- %s: (A--) Output A as a NULL-terminated string
- %e: (--) Output an escape (27)
- %q: (--) Output a quote (34)
- %n: (--) Output a CR/LF (13,10)
- %f: (F--) Output F as a double/float using "%f"
- %g: (F--) Output F as a double/float using "%g"
- any other char after the % is output as is (eg "%%" outputs a single '%')
OP | Stack | Description |
---|---|---|
< | (a b--f) | f: if (a < b) then 1, else 0; |
= | (a b--f) | f: if (a = b) then 1, else 0; |
> | (a b--f) | f: if (a > b) then 1, else 0; |
~ | (x--f) | f: if (x = 0) then 1, else 0; (logical NOT) |
( | (f--) | if (f != 0), execute code in '()', else skip to matching ')' |
X | (a--) | if (a != 0), execute/call function at address a |
G | (a--) | if (a != 0), go/jump to function at address a |
OP | Stack | Description |
---|---|---|
[ | (F T--) | FOR: start a For/Next loop. |
*** NOTE: if (F > T), F and T are swapped | ||
I | (--n) | n: the index of the current FOR loop |
J | (--n) | n: the index of the next outer FOR loop |
p | (n--) | n: number to add to "I" |
^ | (--) | UNWIND, use with ';'. Example: rS rK>(^;) |
] | (--) | NEXT: increment index (I) and loop if (I < T) |
OP | Stack | Description |
---|---|---|
{ | (f--f) | BEGIN: if (f == 0) skip to ending '}' |
^ | (--) | UNWIND, used with ';'. Example: rX ~(^;) |
} | (f--f?) | WHILE: if (f != 0) jump to starting '{', else drop f and continue |
OP | Stack | Description |
---|---|---|
fO | (nm md--fh) | FILE: Open, nm: name, md: mode, fh: fileHandle |
fC | (fh--) | FILE: Close, fh: fileHandle |
fD | (nm--) | FILE: Delete, nm: name |
fR | (fh--c n) | FILE: Read, fh: fileHandle, c: char, n: num |
fW | (c fh--n) | FILE: Write, fh: fileHandle, c: char, n: num |
fL | (a fh--n) | FILE: ReadLine, fh: fileHandle, a: address, n: num (-1 if EOF) |
OP | Stack | Description |
---|---|---|
bL | (N--) | BLOCK: Load block N (block-nnn.r4). This can be nested. |
bA | (--) | BLOCK: Load Abort - stop loading the current block (use wih xh ) |
bE | (N--) | BLOCK: Edit block N (file name is block-nnn.r4) |
bR | (N a sz--) | BLOCK: Read block N; a: addr, sz: size of buffer |
bW | (N a sz--) | BLOCK: Write block N; a: addr, sz: size of buffer |
OP | Stack | Description |
---|---|---|
xV | (--n) | r4 version number (YYYYMMDD) |
xIAF | (--a) | INFO: Address where the function vectors begin |
xIAH | (--a) | INFO: Address of the HERE variable |
xIAR | (--a) | INFO: Address where the registers begin |
xIAU | (--a) | INFO: Address there the CODE area begins |
xIAV | (--a) | INFO: Address there the VARS area begins |
xIC | (--n) | INFO: CELL_T size |
xIF | (--n) | INFO: Number of functions (NUM_FUNCS) |
xIH | (--a) | INFO: Current HERE value |
xIR | (--n) | INFO: Number of registers (NUM_REGS) |
xIU | (--n) | INFO: Size of CODE area (CODE_SZ) |
xIV | (--n) | INFO: Size of VARS area (VARS_SZ) |
xPI | (p--) | Arduino: open pin input (pinMode(p, INPUT)) |
xPU | (p--) | Arduino: open pin pullup (pinMode(p, INPUT_PULLUP)) |
xPO | (p--) | Arduino: open pin output (pinMode(p, OUTPUT) |
xPRA | (p--n) | Arduino: pin read analog (n = analogRead(p)) |
xPRD | (p--n) | Arduino: pin read digital (n = digitalRead(p)) |
xPWA | (n p--) | Arduino: pin write analog (analogWrite(p, n)) |
xPWD | (n p--) | Arduino: pin write digital (digitalWrite(p, n)) |
xs | (a--) | PC: call "system(a)" |
xSR | (--) | R4 System Reset |
xT | (--n) | Time in milliseconds (Arduino: millis(), Windows: GetTickCount()) |
xh[S] | (--n r f) | n: Hash, r: Reg, and f: Func value for [S] (eg - xhALLOT ) |
xH | (n--) | n: new value for screen height (for the editor). If n=0 default to 35. |
xM | (--n) | Time in microseconds (Arduino: micros()) |
xW | (n--) | Wait (Arduino: delay(), Windows: Sleep()) |
xR | (n--r) | r: a pseudo-random number between 0 and n (uses XOR-shift) |
*** NOTE: when n=0, r is the entire CELL_T-sized number | ||
xQ | (--) | PC: Exit R4 |