Skip to content
No description, website, or topics provided.
C C++
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
compiled
sources
.hgignore
Makefile
README.org
isa.h
kramvm.c
scanner.l
vm.c
vm.h

README.org

KRAM: kyle’s register arithmetic machine

What is this?

This is the follow on project to kam, kyle’s arithmetic machine. The former is a stack-based machine, while kram is (as the name suggests) a register machine. Right now, there is no assembler for the register machine, so all programs have to be handwritten (i.e. with HT).

Architecture

The KRAM VM is an 8-bit machine with a 16-bit address space. Much inspiration was drawn from the 6502 CPU; the 6502 emulator I worked on previously provided a lot of the experience needed to get this off the ground quickly.

The KRAM VM has four general purpose 8-bit registers (A, B, X, and Y). A is the accumulator and is used as the target for all value instructions (e.g. ADD). B has no designated purpose; it was added to make doing some operations easier. X and Y are designed as address registers, but can be used otherwise when a program desires. Several instructions use X and Y to derive an address; as the machine operates on 8-bit data, but has a 16-bit address space, X is used to store the high byte of an address and Y is used to store the low byte of an address.

There are three additional registers: an 8-bit flags register, and the 16-bit registers SP and PC. PC stores the program counter, and SP stores the stack pointer. The stack pointer is actually a pair of 8-bit registers, SPA and SPB, to allow for manipulation in programs. The flags register has the layout

76543210
NNNNNNNE

where N is “not-used” and E is “equal”. The E bit is set when a compare instruction returns equal, and cleared when a compare instruction is not equal.

The VM is also initialised with a certain amount of memory, and a few parameters control the machine’s operation:

  • VM_ENTRY_POINT is the default initial value for the PC (currently 0).
  • VM_SP_START is the default starting value for the stack; currently, this is 0x200.
  • VM_DS_START is the default starting value for the data segment; this is currently 0x100.
  • VM_MEMORY is the default amount of memory available to the machine; this is currently 0x400 (1K of memory).

VM_DS_START isn’t used during execution of the VM, but is intended to be used by the assembler. Given these parameters, a default VM has 256B of program space (0-0x100), 256B of data segment (0x100-0x200), and 512B of stack space available; some of the stack can be used for a heap or other memory storage.

The ISA

The ISA operates on 5-bit instructions and 3-bit registers. The MSB of an instruction divides the space into two groups: control and value instructions. The second most-significant bit further divides each group into immediate and register mode.

Instructions

InstructionMnemonicBase value
00000not used0x00
00001BNE IMM0x08
00010BEQ IMM0x10
00011JMP IMM0x18
00100MOV IMM0x20
00101CMP IMM0x28
00110POKE IMM0x30
00111PEEK IMM0x38
01000SYSCALL0x40
01001BNE REG0x48
01010BEQ REG0x50
01011JMP REG0x58
01100MOV REG0x60
01101CMP REG0x68
01110POKE REG0x70
01111PEEK REG0x78
10000ADD IMM0x80
10001SUB IMM0x88
10010MUL IMM0x90
10011DIV IMM0x98
10100AND IMM0xa0
10101OR IMM0xa8
10110NOT IMM0xb0
10111XOR IMM0xb8
11000ADD REG0xc0
11001SUB REG0xc8
11010MUL REG0xd0
11011DIV REG0xd8
11100AND REG0xe0
11101OR REG0xe8
11110NOT REG0xf0
11111XOR REG0xf8

The base value is the op code acting on register A. The opcode for a given register may be determined by adding the register’s value directly to the base value:

RegisterBitsValue
A0000
X0011
Y0102
SPA0113
PC1004
FLG1015
B1106
SPB1117

For example, a register MOV for register A is 60, while a register MOV for register B is 66.

In the instruction lists below, r designates a register using the the syntax $A to indicate register A. i designates an 8-bit value, and ii designates a 16-bit value. Immediate values use the syntax #i and #ii: “#01” indicates an 8-bit immediate value of 1, and “#0100” indicates a 16-bit immediate value of 256.

Instruction List

Branch if not equal (BNE)

Branches to the indicated address if the E bit in the FLG register is not set.

Immediate syntax, and example for branching to 0x0002 if:

BNE	$A	ii
BNE	$A	#0002

Note that BNE does not use the register bits; any register value may be used but register A is preferred for consistency. The example above compiles to the bytes 080002.

In register mode, the address is taken from the value of X and Y, as described previously:

BNE

This compiles to the bytes 48.

Branch if equal (BEQ)

This is equivalent to the BNE instruction, except that it branches when the E bit is set.

Immediate mode example:

BEQ	$A	ii
BEQ	$A	#0002

The example compiles to 100002.

In register mode, the address is taken from the value of X and Y, as described previously:

BEQ

This compiles to the bytes 50.

Jump (JMP)

This unconditionally jumps to the indicated address. Like BNE, this disregards the register value, and $A is preferred for consistency.

Immediate mode example:

JMP	$A	ii
JMP	$A	#0002

This compiles to 180002.

Register mode example:

JMP

JMP also uses $X and $Y as the jump target.

Move (MOV)

This moves a value into the specified register.

Immediate syntax, and an example of setting $A to 1:

MOV	r	i
MOV	$A	#01

This compiles to the bytes 2001.

In register mode, the low three bits of the next byte are used to select a register. The high five bits should be set to 0 for consistency.

MOV	r	r
MOV	$B	$A

This example compiles to 6600.

Compare (CMP)

Compare compares the indicated values. The first value comes from the register indicated in the compare, and the second value depends on the mode.

In immediate mode, the second value comes from an immediate.

CMP	r	i
CMP	$A	#05	; example

The example compiles to 2805.

In register mode, the second value is taken from the indicated register:

CMP	r	r
CMP	$A	$B

The order of registers is not relevant. The example compiles to 6806.

Poke (POKE)

Poke stores the register value to memory.

In immediate mode, the address is taken from the next 16-bit immediate:

POKE	r	ii
POKE	$B	#0200

The example compiles to 360200.

In register mode, the address is taken from the X and Y registers:

POKE	r
POKE	$A

This example compiles to 70.

Peek (PEEK)

Peek loads a value from memory into a register.

In immediate mode, the address is taken from the next 16-bit immediate:

PEEK	r	ii
PEEK	$A	#0200

The example compiles to 380200.

In register mode, the address is taken from the X and Y registers.

PEEK	r
PEEK	$B

The example compiles to 7e.

Syscall (SYSCALL)

Syscall executes a system call in the VM. The syscall table:

SyscallFunctionDescription
0ExitCauses the VM to stop execution.
1PrintStrPrint a string.
2PrintNumPrint a number.

The print commands expect $X and $Y to be loaded with the address of the string. Strings should be NUL-terminated sequences of bytes starting at the address; numbers prints the unsigned byte as a hex-format value.

For example, printing the string at address 0x100:

MOV	$X	#01
MOV	$Y	#00
MOV	$A	#01
SYSCALL

This example compiles to 21012200200140.

ADD

Add performs addition on two values, storing the result in $A.

In immediate mode, the value of the next immediate is added to the value of the register:

ADD	r	i
ADD	$B	2

In the example, 2 is added to the contents of $B, and the result placed in $A. This example compiles to 8602.

In register mode, the next byte contains the register; only the three register bits are used. For consistency, the remaining bits should be zeroed.

ADD	r	r
ADD $B	$X

In the example, $A will contain the addition of the contents of $B and $X. It compiles to c601.

SUB

Sub performs subtraction on two values, storing the difference in $A.

In immediate mode, the value of the next immediate is subtracted from the value of the register:

SUB	r	i
SUB	$B	2

In the example, 2 is subtracted from the contents of $B, and the result placed in $A. This example compiles to 8e02.

In register mode, the next byte contains the register; only the three register bits are used. For consistency, the remaining bits should be zeroed.

SUB	r	r
SUB $X	$B

In the example, $A will contain the difference of $B taken from $X. It compiles to c906.

MUL

MUL computes the product of two values, storing the product in $A.

In immediate mode, the value of the next immediate is multiplied by the contents of the register:

MUL	r	i
MUL	$B	3

In the example, the product of B multiplied by 3 is stored in $A. This example compiles to 9603.

In register mode, the contents of the two registers are multiplied:

MUL	r	r
MUL	$Y	$B

In this example, the result of multiplying $Y by $B is stored in $A. It compiles to d206.

DIV

DIV computes the quotient and remainder of two values. The quotient is stored in $A, and the remainder is stored in $B.

In immediate mode, the contents of the register are divided by the value of the next immediate.

DIV	r	i
DIV	$Y	2

In the example, $A will get the quotient of $Y divided by 2, and $B will get the remainder. This compiles to 9a02.

In register mode, the contents of the two registers are divided:

DIV	r	r
DIV	$X	$B

In this example, $X is divided by $B. It compiles to d906.

AND

AND computes the bitwise AND of two values, storing the result in $A.

In immediate mode, the value of the register is AND’d with an immediate:

AND	r	i
AND	$A	#40

In this example, $A is AND’d with the immediate 0x40. It compiles to a07f.

OR

OR computes the bitwise OR of two values, storing the result in $A.

In immediate mode, the value of the register is OR’d with an immediate:

OR	r	i
OR	$B	0x80

In this example, $B is OR’d with 0x80. It compiles to ae80.

NOT

NOT computes the bitwise inverse of a value, storing the result in $A.

In immediate mode, the register is ignored (but should be $A for consistency), and the immediate is inverted and stored in $A.

NOT	r	i
NOT	$A	#7f

In this example, the value 0x7f is inverted and stored in $A. It compiles to b07f.

In register mode, the value of the register is inverted and stored in $A.

NOT	r
NOT	$B

In the example, the contents of $B are inverted and stored in $A. It compiles to 0xf6.

XOR

XOR computes the bitwise exclusive-OR of two values, storing the result in $A.

In immediate mode, the register is XOR’d with the immediate:

XOR	r	i
XOR	$A	#2a

In this example, $A is XOR’d with 0x2a, and the result stored in $A. It compiles to b02a.

In register mode, the two registers are XOR’d and the result stored in $A.

XOR	r	r
XOR	$X	$B

In this example, $X is XOR’d with $B; it compiles to f906.

Writing KRAM assembly

Note: there is no assembler yet, so this is mostly notes for how to implement it.

Programs are divided into two segments: data and text. The data segment contains definitions to be copied into the data segment, and the text segment, which is copied to the program entry point.

The data section

The data section is started with “.data”; it should contain definitions in the form

[label:]	[.type] [data]

The label is optional, and is a convenience for referring to the data later. The types are:

  • string: a text string to be NUL-terminated
  • bytes: a sequence of immediate values to be stored

For example:

.data
		.bytes #00 #04 #08
hello:		.string "Hello: "

The bytes would be stored starting at the data segment; the offsets to data can be calculated from the data segment start point and taking into account the size of data.

The text segment

Instructions should be entered one per line; semicolons are used as comments and extend to the end of the line. Registers are denoted with a leading $:

NamedNumberedRegister
$A$0A
$B$6B
$X$1X
$Y$2Y
$SPA$3SPA
$SPB$7SPB
$FLG$5FLG

Interacting with the PC is not permitted in user programs.

Immediates are written in hex with a leading #, such as #02.

The first token on the line should be an instruction; stylistically, each token should be separated by a hard tab, but this is not strictly required. In place of an immediate, a bare token will be taken as the name of a label.

The kramvm

kramvm is the implementation of the virtual machine that runs programs. It has a few flags:

  • d dumps the registers to screen after the VM is finished.
  • e allows the user to explicitly set the entry point.
  • m allows the user to explicitly set the memory size.
  • s allows the user to explicitly set the initial stack pointer.

The program should be passed a single filename containing a compiled program; the VM will load the program, run it, and report any errors.

All output from the VM occurs between two lines. If an error occurs, the registers are dumped.

Example: countdown

countdown counts down from 5 to 1, displaying the counter value each iteration.

countdown listing

.data
counter:	.string "The counter is:"	; addr: 0x64
newline:	.string "\n"			; addr: 0x74

.text
MOV	$A	#05		; set the initial value
LOOP:
POKE	$A	#0100		; store counter at 0x100
MOV	$A	#01
MOV	$X	#00
MOV	$Y	#64
SYSCALL				; print counter string
MOV	$A	#02
MOV	$X	#01
MOV	$Y	#00
SYSCALL				; print value of counter
MOV	$A	#01
MOV	$X	#00
MOV	$Y	#74
SYSCALL				; print newline
PEEK	$A	#0100		; load counter into $A
SUB	$A	#01		; decrement counter
CMP	$A	#00		; is counter 0?
BNE	LOOP			; repeat loop if not
MOV	$A	#00
SYSCALL				; exit

Compiled countdown program

0000000: 2005 3001 0020 0121 0022 6440 2002 2101   .0.. .!."d@ .!.
0000010: 2200 4020 0121 0022 7440 3801 0088 0128  ".@ .!."t@8....(
0000020: 0008 0002 2000 4000 0000 0000 0000 0000  .... .@.........
0000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000060: 0000 0000 5468 6520 636f 756e 7465 7220  ....The counter 
0000070: 6973 3a00 0a00 0000 0000 0000 0000 0000  is:.............

Note: this program (as a 256-byte program) is available compiled in compiled/countdown.bin.

Example countdown run

 $ ./kramvm compiled/countdown.bin
Loading 256 byte program.
Starting VM.
------------------------------------------------------------------------
The counter is:5
The counter is:4
The counter is:3
The counter is:2
The counter is:1
------------------------------------------------------------------------
OK

Notes and lessons learned

When I started thinking about this, I drew a lot of inspiration from my study of the 6502 CPU (from the 6502 CPU emulator I wrote most of) and from my attempt at writing a MIPS assembler.

There are quite a few deficiencies with this VM:

  • It only operates on unsigned numbers; there’s no facility for signed operations.
  • There is no support for floating point operations.
  • There is no way to input data into the VM outside of the compiled program.
  • There are no tri-argument instructions, like ADD $B $A #01.
  • There are no logical shifts.
  • Due to the register/immediate symmetry, there are some useless instructions (like NOT in immediate mode).
  • There is no overflow/underflow/carry detection.

As I was writing this documentation and writing a few test programs, I found (and fixed) some design flaws as well:

  • The stack pointer register was originally a single 16-bit register; this made interacting with it practically impossible.
  • The immediate mode of value operations originally operated on a pair of immediates; the way these were implemented made it difficult to perform intra-register operations.
  • Originally, there were only three general registers: A, X, and Y. B came about while writing some programs and writing this documentation; I noticed I had extra slots for registers, so I added it.

For simplicity, I wanted the operand grabbed during the fetch stage (integral to step) to have a fixed size; as the VM operates on 8-bit values, a single byte seemed appropriate. As I begin to list out the necessary instructions, I thought about how I could combine register and instruction into this single operand. First, I started with an instruction size of 4-bits. This wasn’t enough, but that’s where I noticed the possiblity of dividing the instructions into two groups (control and value), and further dividing them into immediate and register modes, and thought how to differentiate these groups with a bit test.

The lack of an assembler made programming interesting. I wrote out the programs on paper, devising the assembler syntax that I eventually settled on this way. Then I had to hand-built instructions, shifting each instruction and adding in the register. Each jump had to be hand-calculated, sometimes with stand-in values until I got far enough to put in the right address. Finally, I wrote the instruction table that appears previously, which made writing programs much easier. There is a certain connection with my own history and with the history of computing that derives from having to enter the op-codes directly in hexadecimal and using a hex editor to enter programs; the differences in writing a program and entering a program becomes explicit, something I still often forget with the compilers and interpreters I have today.

The VM itself (and the instruction set) were planned and implemented over the course of about a day, with some minor tweaks and this documentation written the next.

While it seems at times a bit overkill to do all this work for such a trivial project, upon reflection the lessons and skills learned make this worthwhile. I’ve gained a deeper insight into how computers are designed and work (an understanding improved by TECS, but actually implementing something like this is a whole different side of the problem). Many of deficiencies I was able to fix were discovered while trying to describe how to use the VM, for example.

One of the next projects I want to build is a VM that

  • compiles from Lisp to the byte code for the VM,
  • integrates networking capabilities for communicating with the outside world,
  • is 64-bit.

Source files

  • isa.h contains the ISA definition to be shared between the VM and the assembler.
  • vm.c and vm.h contain the VM imlpementation.
  • kramvm.c contains the user interface for the VM, allowing the user to load programs and tune the VM parameters.
  • compiled/ contains pre-built programs for testing the VM and comparing with the output of the assembler.
  • sources/ contains example source code.

The examples

Compiled programs have the extension “.bin”, while source files have the extension “.rm” (for register machine).

  • helloworld is the canonical “Hello, world” program.
  • twoplustwo adds 2+2 and prints the result; it’s a slightly extended “Hello, world” program.
  • countdown has the previously listed countdown example
  • countdownb uses register B; the previous programs were written prior to the introduction of register B, and this tests its use.
You can’t perform that action at this time.