Joy Assembler

A minimalistic toy assembler written in C++ by Jonathan Frech, August to November 2020.

Building

Joy Assembler requires the C++17 standard and is best built using the provided Makefile.

Build: 🟩 passing (2021-02-10T16:09:47+01:00)

Usage

Joy Assembler provides a basic command-line interface:

./JoyAssembler <input-file.asm> [visualize | step | memory-dump]

The optional argument visualize shows each instruction as it is executed; step additionally pauses before every instruction and advances when enter is pressed. Note that the instruction pointed to is the one that will be executed in the next step, not the one that has just been executed. memory-dump mocks any I/O and outputs a step-by-step memory dump to stdout whilst executing.

Architecture

Joy Assembler mimics a 32-bit architecture. It has four 32-bit registers: two general-purpose registers A (accumulation) and B (operand) and two special-purpose registers PC (program counter) and SC (stack counter).

Along with the above 32-bit registers, whose values will be called words henceforth, a single fixed-size block of 8-bit bytes of memory is provided, addressable either as numeric 4-byte words or as individual bytes. Each Joy Assembler instruction consists of a name (represented as a single op-code byte) paired with four bytes that hold its argument, if any. All instructions' behavior is specified below.

The simulated architecture can also be slightly modified using pragmas (see below).

Joy Assembler files

Joy Assembler files usually end in a .asm file extension and contain all instructions, static data, and other file inclusions that will be executed by the simulated machine. Except for comments, each line contains either an instruction -- to the simulated machine or assembler -- or a piece of data or is left blank. What follows is the documentation of the previously mentioned features.

Comments

A comment is defined as any characters from a semicolon (;) to the end of the line.

One quirk of implementation is that string or character literals cannot contain an unescaped semicolon, as this would be interpreted as a comment. As such, if a semicolon is needed, please resort to either \u003b, \U0000003b or \; (the last one being specially treated whilst removing comments).

; takes one stack argument and returns one stack argument
routine:
    ; load argument
    lsa -8
    ; ...
    ; store argument
    ssa -8
    ret
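
The comment rule above, including the special treatment of \;, can be sketched in a few lines of Python. This is a minimal illustration, not the assembler's actual implementation: the function name strip_comment is hypothetical, and the sketch assumes that an escaped semicolon is kept verbatim for a later literal-parsing step.

```python
def strip_comment(line: str) -> str:
    # Hypothetical helper: return `line` without its trailing comment.
    # Assumption: a backslash escapes the character that follows it, so an
    # escaped semicolon does not start a comment and is kept as written.
    i = 0
    while i < len(line):
        if line[i] == "\\" and i + 1 < len(line):
            i += 2  # skip the backslash and the escaped character
            continue
        if line[i] == ";":
            return line[:i].rstrip()  # everything from here on is a comment
        i += 1
    return line.rstrip()


print(strip_comment("    lsa -8 ; load argument"))
```

Scanning character by character rather than splitting on the first semicolon is what lets the escaped form survive inside string and character literals.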

Definitions

To define a constant value, one can use the definition operator :=:

n := 5
x := 0b00100000

Pragmas

Pragmas are used to alter the simulated machine's fundamental behavior or capability.

| pragma | options | description | default |
|---|---|---|---|
| pragma_memory-size | minimal, dynamic or an unsigned 32-bit value | set the size in bytes of the available memory block | minimal |
| pragma_memory-mode | little-endian or big-endian | set the endianness for all 4-byte memory operations | little-endian |
| pragma_rng-seed | an unsigned 32-bit value | set the random number generator's seed | by default, a random seed is used |
| pragma_static-program | true or false | set all instructions in memory to read-only | true |
| pragma_static-stack-check | true or false | statically ensure that a stack was defined if needed | true |
| pragma_embed-profiler-output | false or true | if enabled, the profiler will tacitly output to stdout | false |
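
To illustrate what pragma_memory-mode changes, the following Python sketch stores and loads a 4-byte word in either byte order. The helper names store_word and load_word are hypothetical and only model the documented behavior; they are not taken from the Joy Assembler source.

```python
def _byteorder(mode: str) -> str:
    # Map the pragma's option names onto Python's byte-order names.
    if mode == "little-endian":
        return "little"
    if mode == "big-endian":
        return "big"
    raise ValueError(f"unknown memory mode: {mode}")


def store_word(memory: bytearray, address: int, value: int,
               mode: str = "little-endian") -> None:
    # Write a 32-bit word as four consecutive bytes.
    memory[address:address + 4] = (value & 0xFFFFFFFF).to_bytes(4, _byteorder(mode))


def load_word(memory: bytearray, address: int,
              mode: str = "little-endian") -> int:
    # Read four consecutive bytes back as one unsigned 32-bit word.
    return int.from_bytes(memory[address:address + 4], _byteorder(mode))


memory = bytearray(4)
store_word(memory, 0, 0xDEADBEEF)
print(memory.hex())  # least significant byte first in little-endian mode
```

Byte-wise instructions such as lya and sya see the individual bytes, so a program that mixes word and byte access can observe the chosen mode.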

Labels

One can either hard-code the program positions to jump to or use an abstraction: program labels. A label is defined by an identifier followed by a colon (:) and accessed by the same identifier preceded by an at sign (@). A label must be defined only once, yet can be used arbitrarily often:

ack:
    ; ...
main:
    mov 2
    psh
    mov 3
    psh
    cal @ack
    pop
    ptu
    pop
    hlt
stack:
    data [0xfff]
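
How labels might be turned into program positions can be sketched as a first pass over the source. This is a simplified model under several assumptions that the text above does not state: the program is loaded at address 0, every non-label line is a five-byte instruction (data lines are not handled), and comments have already been removed. The names resolve_labels and resolve_argument are hypothetical.

```python
INSTRUCTION_SIZE = 5  # one op-code byte plus four argument bytes


def resolve_labels(lines):
    # First pass: map each label name to the address of the next instruction.
    labels = {}
    address = 0  # assumption: the program starts at address 0
    for line in lines:
        text = line.strip()
        if not text:
            continue
        if text.endswith(":"):
            name = text[:-1]
            if name in labels:
                raise ValueError(f"duplicate label: {name}")
            labels[name] = address
        else:
            address += INSTRUCTION_SIZE
    return labels


def resolve_argument(argument, labels):
    # Second pass helper: "@name" becomes an address, anything else a number.
    if argument.startswith("@"):
        return labels[argument[1:]]
    return int(argument, 0)  # base 0 accepts 55, 0xdeadbeef and 0b10001


source = ["ack:", "    nop", "main:", "    mov 2", "    psh", "    cal @ack"]
labels = resolve_labels(source)
print(labels, resolve_argument("@main", labels))
```

Collecting all labels before resolving any argument is what allows a label to be used before the line that defines it.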

Data Blocks

A static data block can be defined using data ..., [optional-size]optional-literal-value, ... or data ..., "string-literal", .... It defines a statically initialized block of memory. Any numeric literal or string character occupies a word (four bytes), string characters being Unicode-capable:

primes:
    data 2, 3, 5, 7, 11, 13, 17 ; ... (finite memory)
global-x:
    data [1]0
global-y:
    data [1]0
buf:
    data [0xff]0xffffffff
msg:
    data "The end is nigh. \u2620\0"
msg2:
    data "I'll say \"goodbye", 0x22, ".", 0b00000000
stack:
    data [0xfff]
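
The word layout of a data block can be modeled as follows. The sketch takes already-parsed items and is only an illustration: the function name data_words and the item representation (an int for a literal, a str for a string literal with escapes already processed, a (count, value) tuple for [count]value) are assumptions made for this example.

```python
def data_words(items):
    # Expand parsed data items into the list of 32-bit words they occupy.
    words = []
    for item in items:
        if isinstance(item, str):
            # Each character occupies one word holding its Unicode code point.
            words.extend(ord(ch) for ch in item)
        elif isinstance(item, tuple):
            count, value = item  # models [count]value
            words.extend([value & 0xFFFFFFFF] * count)
        else:
            words.append(item & 0xFFFFFFFF)
    return words


# Models: data "Hi\0", 0x22, [2]0xffffffff
block = data_words(["Hi\0", 0x22, (2, 0xFFFFFFFF)])
print(len(block) * 4, "bytes")
```

Because every item occupies whole words, the size of a block in bytes is always four times its number of items after expansion.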

Stack

To easily facilitate recursion, Joy Assembler provides native support for a stack, that is, a special-purpose region of memory which is usually accessed from the stack's top and can grow and shrink. When a stack is desired, its static size has to be specified:

stack:
    data [0xfff]

A stack is required when using any stack instruction. Stack underflow, overflow and misalignment are strictly enforced. SC is initialized with @stack when present.
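
The enforced checks can be pictured with a small Python model of the stack. It is a sketch, not the simulator's code: the class and method names are hypothetical, and it assumes that SC points at the next free slot, that the stack grows towards higher addresses in 4-byte steps, and that peek offsets (as used by lsa) are relative to SC.

```python
class StackError(Exception):
    pass


class Stack:
    def __init__(self, base, size_in_words):
        self.base = base                       # address of the stack label
        self.limit = base + 4 * size_in_words  # first address past the stack
        self.sc = base                         # SC starts at the stack label
        self.words = {}

    def _check(self, address):
        if (address - self.base) % 4 != 0:
            raise StackError("stack misalignment")
        if address < self.base:
            raise StackError("stack underflow")
        if address >= self.limit:
            raise StackError("stack overflow")

    def push(self, value):
        self._check(self.sc)
        self.words[self.sc] = value & 0xFFFFFFFF
        self.sc += 4

    def pop(self):
        self._check(self.sc - 4)
        self.sc -= 4
        return self.words[self.sc]

    def peek(self, offset=0):
        address = self.sc + offset
        self._check(address)
        return self.words.get(address, 0)


stack = Stack(base=100, size_in_words=2)
stack.push(2)
stack.push(3)
print(stack.peek(-8), stack.pop(), stack.pop())
```

Under these assumptions, a routine entered via cal finds its return address at offset -4 and its last pushed argument at offset -8, which matches the lsa -8 used in the comment example earlier.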

Instructions

This is the full list of Joy Assembler instructions. Each instruction will be statically loaded into memory using exactly five bytes: a single byte for the instruction's op-code and four further bytes for its arguments, if the instruction allows one, else four zero bytes. When an optional argument is not specified, its default value is used. When an argument is required and not specified or specified and not allowed, an error is reported.
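
The five-byte layout described above can be sketched in Python. The op-code value 0x42 used here is purely hypothetical, since the numeric op-codes are not listed in this document, and storing the argument in the byte order selected by pragma_memory-mode is an assumption of this sketch.

```python
def encode_instruction(opcode: int, argument: int = 0,
                       mode: str = "little-endian") -> bytes:
    # One op-code byte followed by four argument bytes (zero if unused).
    if not 0 <= opcode <= 0xFF:
        raise ValueError("op-code must fit into a single byte")
    byteorder = "little" if mode == "little-endian" else "big"
    return bytes([opcode]) + (argument & 0xFFFFFFFF).to_bytes(4, byteorder)


encoded = encode_instruction(0x42, 5)  # 0x42 is a made-up op-code
print(len(encoded), encoded.hex())
```

A fixed instruction width keeps program positions easy to compute: the n-th instruction of a program starts 5 * n bytes after the first.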

An argument can be specified as either a numeric constant (0xdeadbeef, 55, 0b10001), a character ('🐬', '\U0001D6C7'), a label (@main, @routine, @stack) or a defined constant.

| instruction name | argument | mnemonic | description |
|---|---|---|---|
| nop | | | |
| nop | optional, default: 0 | "no operation" | No operation. Shows up in memory view, displaying the given value. |
| memory | | | |
| lda | required | "load a" | Load the value at the specified memory location into register A. |
| ldb | required | "load b" | Load the value at the specified memory location into register B. |
| sta | required | "store a" | Store the value in register A to the specified memory location. |
| stb | required | "store b" | Store the value in register B to the specified memory location. |
| lia | optional, default: 0 | "load indirect a" | Load the value at the memory location identified by register B, possibly offset by the specified number of bytes, into register A. |
| sia | optional, default: 0 | "store indirect a" | Store the value in register A at the memory location identified by register B, possibly offset by the specified number of bytes. |
| lpc | none | "load program counter" | Load the value of the program counter register PC into register A. |
| spc | none | "store program counter" | Store the value in register A to the program counter register PC. |
| lya | required | "load byte a" | Load the byte at the specified memory location into the least significant byte of register A, modifying it in-place and keeping the three most significant bytes. |
| sya | required | "store byte a" | Store the least significant byte of register A to the specified memory location. |
| jumps | | | |
| jmp | required | "jump" | Jump to the specified program position, regardless of the machine's state. |
| jn | required | "jump negative" | Jump to the specified program position if register A holds a negative value (A < 0), otherwise perform no operation. |
| jnn | required | "jump non-negative" | Jump to the specified program position if register A holds a non-negative value (A >= 0), otherwise perform no operation. |
| jz | required | "jump zero" | Jump to the specified program position if register A holds a zero value (A == 0), otherwise perform no operation. |
| jnz | required | "jump non-zero" | Jump to the specified program position if register A holds a non-zero value (A != 0), otherwise perform no operation. |
| jp | required | "jump positive" | Jump to the specified program position if register A holds a positive value (A > 0), otherwise perform no operation. |
| jnp | required | "jump non-positive" | Jump to the specified program position if register A holds a non-positive value (A <= 0), otherwise perform no operation. |
| je | required | "jump even" | Jump to the specified program position if register A holds an even value (A & 1 == 0), otherwise perform no operation. |
| jne | required | "jump non-even" | Jump to the specified program position if register A holds an odd value (A & 1 == 1), otherwise perform no operation. |
| stack | | | |
| cal | required | "call routine" | Push the current program counter onto the stack, increase the stack counter and jump to the specified program position. |
| ret | none | "return" | Decrease the stack counter and jump to the program position identified by the stack value. |
| psh | none | "push stack" | Push the value of register A onto the stack and increase the stack counter. |
| pop | none | "pop stack" | Decrease the stack counter and pop the stack value into register A. |
| lsa | optional, default: 0 | "load stack into a" | Peek in the stack, possibly offset by the specified number of bytes, and load the value into register A. |
| ssa | optional, default: 0 | "store stack from a" | Poke in the stack, possibly offset by the specified number of bytes, and store the value of register A. |
| lsc | none | "load stack counter" | Load the value of the stack counter register SC into register A. |
| ssc | none | "store stack counter" | Store the value in register A to the stack counter register SC. |
| reg. A | | | |
| mov | required | "move immediately" | Move the specified immediate value into register A. |
| not | none | "bitwise not" | Invert all bits in register A, modifying it in-place. |
| shl | optional, default: 1 | "shift left" | Shift the bits in register A by the specified number of places to the left, modifying it in-place. Vacant bit positions are filled with zeros. |
| shr | optional, default: 1 | "shift right" | Shift the bits in register A by the specified number of places to the right, modifying it in-place. Vacant bit positions are filled with zeros. |
| inc | optional, default: 1 | "increment" | Increment the value in register A by the specified value, modifying it in-place. |
| dec | optional, default: 1 | "decrement" | Decrement the value in register A by the specified value, modifying it in-place. |
| neg | none | "numeric negate" | Numerically negate the value in register A, modifying it in-place. |
| reg. A, B | | | |
| swp | none | "swap register values" | Swap the value of register A with the value of register B. |
| and | none | "bitwise and" | Perform a bit-wise and operation on the value of register A, using the value of register B as a mask, modifying register A in-place. |
| or | none | "bitwise or" | Perform a bit-wise or operation on the value of register A, using the value of register B as a mask, modifying register A in-place. |
| xor | none | "bitwise xor" | Perform a bit-wise xor operation on the value of register A, using the value of register B as a mask, modifying register A in-place. |
| add | none | "numeric add" | Add the value of register B to the value of register A, modifying register A in-place. |
| sub | none | "numeric subtract" | Subtract the value of register B from the value of register A, modifying register A in-place. |
| i/o | | | |
| get | none | "get number" | Input a numerical value from stdin to register A. |
| gtc | none | "get character" | Input a UTF-8 encoded character from stdin and store the Unicode code point to register A. |
| ptu | none | "put unsigned" | Output the unsigned numerical value of register A to stdout. |
| pts | none | "put signed" | Output the signed numerical value of register A to stdout. |
| ptb | none | "put bits" | Output the bits of register A to stdout. |
| ptc | none | "put character" | Output the Unicode code point in register A to stdout, encoded as UTF-8. |
| rnd | | | |
| rnd | none | "pseudo-random number" | Call the value in register A r. Set A to a discretely uniformly distributed pseudo-random number in the range [0..r] (inclusive on both ends). |
| hlt | | | |
| hlt | none | "halt" | Halt the machine. |
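
Some of the register semantics listed above are easy to get wrong when reasoning about 32-bit words, so here is a small Python model of a few of them. It is a sketch under assumptions: signed values are taken to be two's complement, rnd is assumed to treat r as unsigned, and the function names merely mirror the instruction names without claiming to be the simulator's code.

```python
import random

MASK = 0xFFFFFFFF  # registers hold 32-bit words


def to_signed(word):
    # Interpret a word as a signed value (assumed two's complement),
    # as needed for jn, jp and pts.
    word &= MASK
    return word - (1 << 32) if word & 0x80000000 else word


def shl(a, places=1):
    return (a << places) & MASK  # bits shifted past bit 31 are dropped


def shr(a, places=1):
    return (a & MASK) >> places  # vacant positions are filled with zeros


def neg(a):
    return (-a) & MASK


def lya(a, byte):
    # Replace only the least significant byte of A.
    return (a & 0xFFFFFF00) | (byte & 0xFF)


def rnd(a, rng):
    # Uniform in [0..r], inclusive on both ends, like random.randint.
    return rng.randint(0, a & MASK)


print(to_signed(neg(1)), hex(lya(0x12345678, 0xAB)), hex(shr(0x80000000)))
```

Note that shr is a logical shift: shifting a negative value to the right yields a positive one, so it cannot be used as a signed division by two.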

Testing

Automatic tests can be run by invoking make test, which executes the programs in test/programs and compares their sha512-summed memory-dump output to test/pristine-hashes. Note that test files prefixed by test-r- use seeded pseudo-random number generators and thus behave platform-dependently, possibly failing on some machines.
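
The comparison step can be pictured as hashing a program's memory-dump output and checking it against a stored digest. The following Python sketch shows only that idea; the function names are hypothetical, and the actual layout of test/pristine-hashes and the Makefile's test recipe are not described here, so nothing about them is assumed beyond a hexadecimal SHA-512 digest.

```python
import hashlib


def dump_digest(memory_dump: bytes) -> str:
    # SHA-512 digest of a captured memory-dump output, as a hex string.
    return hashlib.sha512(memory_dump).hexdigest()


def matches_pristine(memory_dump: bytes, pristine_hex: str) -> bool:
    # Compare against a stored digest, ignoring case and surrounding whitespace.
    return dump_digest(memory_dump) == pristine_hex.strip().lower()


captured = b"example memory dump\n"  # stands in for real program output
print(matches_pristine(captured, dump_digest(captured)))
```

Comparing digests rather than full dumps keeps the stored reference data small, at the cost of not showing where two dumps differ.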
