This is a universal relocatable macro assembler and linker. It does not target any particular processor; instead, you define the instruction set grammar using pseudo-instructions. This makes it a convenient assembler for custom micro-engines you might design for an FPGA or ASIC.
The definition for a particular processor could be contained in an "include" file. For example, this is how you could define the Motorola MC6800 conditional branch instructions:
; Define a syntax rule: "name" will match "pattern". If there
; is a match, "expr" is returned.
.rule name expr pattern
; Define condition codes, so that "<cc>" will refer to them
; So for example, cs means carry set and will have the value $25
.rule cc $20 ra
.rule cc $22 hi
.rule cc $23 ls
.rule cc $24 cc
.rule cc $24 hs
.rule cc $25 cs
.rule cc $25 lo
.rule cc $26 ne
.rule cc $27 eq
.rule cc $28 vc
.rule cc $29 vs
.rule cc $2a pl
.rule cc $2b mi
.rule cc $2c ge
.rule cc $2d lt
.rule cc $2e gt
.rule cc $2f le
.rule cc $8d sr
; Define the branch instructions
.insn b<cc> <expr>
.emit arg1
.emit arg2-.+1
.end
; b<cc> defines all branch instructions, including bra, bhi, bls,
; bcc, bcs, etc.
; <expr> is the branch target, an arbitrary expression
; There is whitespace between <cc> and <expr>, so the input is
; allowed to have whitespace there.
; Within the body of the instruction, we emit two bytes:
; First byte is the op-code for the branch instruction.
; Second byte is relative branch offset
; arg1, arg2, etc. are replaced with value of the rule or value of
; the expression. They are assigned in the order that values appear
; in the pattern after the ".insn".
; Note that rules can return a comma-separated list of values. Also
; note that the "pattern" part of the rule can include references to
; other named rules or expressions:
.rule mode $F8,arg1 <expr> ; Direct addressing
.rule mode $FA,arg1 #<expr> ; Immediate addressing
; In the above, "arg1" is replaced with the <expr> from the pattern.
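To make the effect of the "b<cc>" pattern concrete, here is a small Python sketch. It is not part of uasm, and the names CC_RULES and branch_opcodes are made up for illustration. It only enumerates the mnemonics that the ".rule cc" lines above produce, together with the value each rule returns as arg1.

```python
# Illustrative only: a copy of the ".rule cc <value> <name>" table above.
CC_RULES = [
    (0x20, "ra"), (0x22, "hi"), (0x23, "ls"), (0x24, "cc"), (0x24, "hs"),
    (0x25, "cs"), (0x25, "lo"), (0x26, "ne"), (0x27, "eq"), (0x28, "vc"),
    (0x29, "vs"), (0x2A, "pl"), (0x2B, "mi"), (0x2C, "ge"), (0x2D, "lt"),
    (0x2E, "gt"), (0x2F, "le"), (0x8D, "sr"),
]


def branch_opcodes(rules=CC_RULES):
    """Map each mnemonic matched by "b<cc>" to the value of its cc rule."""
    return {"b" + name: value for value, name in rules}


if __name__ == "__main__":
    for mnemonic, opcode in branch_opcodes().items():
        print(f"{mnemonic:4s} ${opcode:02X}")
```

Two names may share one value (bcc and bhs, bcs and blo), which is why the table is kept as a list of pairs rather than keyed by value.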
uasm [-l file] [-o file] [-I path]... file
'file' Name of assembly language source file to assemble.
'-I' Add path to include file path list.
'-l' Generate assembly listing.
'-o' Generate object file.
ulink [-q] [-l file] [-o file] [-loc link[,load]:sect+sect+...]... files
'-q' Suppress messages
'files' Names of object files, libraries and text files containing
additional file names. First module found in this list is the
root module of the program. Each other module found here is only
included if there is a reference path from the root module to it.
A library is one or more object modules simply concatenated together.
'-o' Gives name of binary output file. If no name is given, no output file
is generated. Note that the first byte of the output file is always
the first generated byte, regardless of load address.
'-l' Gives name of map/cross-reference listing file. If no name is given,
no map file is generated.
'-loc link[,load]:sect+sect+sect...'
Locate sections:
'link' is the starting address of the given sections in hex. This
is the address the sections are expected to be at when the
program is executed and is used to link symbols together.
'load' is the load address of the sections in hex. It is the
address in the binary output file where the sections will
be placed by the linker. If this is left out, it will
default to the same as the link address.
sect+sect...
list of sections to be located. The first section in the
list is located at the given address. Additional sections
are located directly after the first.
If not all sections are located on the command line, the linker will
prompt with '>' for the locations of the remaining sections.
Note that more than one '-loc' may be specified; there should be one
for each fixed address in the desired memory map.
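The '-loc' rules above can be sketched in Python as follows. This is only an illustration of the description, not the code ulink uses: parse_loc, place_sections and the sizes dictionary are hypothetical names, addresses are assumed to be bare hex digits with no prefix, and section alignment is ignored.

```python
def parse_loc(arg):
    """Split 'link[,load]:sect+sect+...' into (link, load, [section names])."""
    addresses, _, sections = arg.partition(":")
    link_text, _, load_text = addresses.partition(",")
    link = int(link_text, 16)
    # The load address defaults to the link address when it is left out.
    load = int(load_text, 16) if load_text else link
    return link, load, sections.split("+")


def place_sections(arg, sizes):
    """Return {section: (link_address, load_address)}.

    'sizes' is a hypothetical {section name: size in bytes} map.
    The first section goes at the given address; each additional
    section is placed directly after the previous one.
    """
    link, load, names = parse_loc(arg)
    placed = {}
    for name in names:
        placed[name] = (link, load)
        link += sizes[name]
        load += sizes[name]
    return placed


if __name__ == "__main__":
    print(place_sections("8000,0:code+data", {"code": 0x100, "data": 0x20}))
```

In this example 'code' is linked at 8000 and 'data' at 8100 (hex), while both are loaded starting at address 0.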
.set label,expr ; Temporarily set label to value of expression
.equ label,expr ; Permanently set label to value of expression
label: ; Same as '.equ label,.'
.space expr ; Reserve space
.emit expr ; Emit a single byte
.align expr ; Align to multiple of expr
.sect "name" ; Switch to named section
.public label, label, .... ; Make labels public
.macro name,arg,arg,... ; Define normal macro
.end
.foreach name,arg ; Define foreach macro: body is
; expanded for every character
.end ; of the argument.
name arg,arg,arg,... ; Expand macro
.errif expr,string ; Print string if expr is true
.if expr ; Conditional assembly
.elseif expr
.else
.end
.include "filename" ; Include a file
.rule name expr pattern ; Define a syntax rule
.insn pattern ; Define an instruction
.end
; comments are allowed (they are ignored).
<...> refers to another rule.
whitespace means require whitespace here.
<> means whitespace is optional here.
<expr> means expect an expression here, possibly surrounded by whitespace (so
it is not necessary to surround <expr> with <>s).
other characters are literal matches.
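As a rough illustration of the pattern elements listed above, the following Python sketch splits a pattern into tokens. It is an assumption-laden model, not the parser uasm uses: tokenize_pattern and the token names are invented here, a '<' with no closing '>' is treated as a literal, and trailing ';' comments are not handled.

```python
def tokenize_pattern(pattern):
    """Split a .rule/.insn pattern into (kind, text) tokens."""
    tokens = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        close = pattern.find(">", i) if ch == "<" else -1
        if ch.isspace():
            # A run of whitespace in the pattern requires whitespace in the input.
            while i < len(pattern) and pattern[i].isspace():
                i += 1
            tokens.append(("space", ""))
        elif close != -1:
            name = pattern[i + 1:close]
            if name == "":
                tokens.append(("optional-space", ""))  # "<>"
            elif name == "expr":
                tokens.append(("expr", ""))            # "<expr>"
            else:
                tokens.append(("rule", name))          # "<name>"
            i = close + 1
        else:
            tokens.append(("literal", ch))             # anything else
            i += 1
    return tokens


if __name__ == "__main__":
    print(tokenize_pattern("b<cc> <expr>"))
```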
label ; Returns its 32-bit value
. ; Current location value
$FF80 ; Hex constant
@770 ; Octal constant
%1011 ; Binary constant
123 ; Decimal constant
( expr )
- expr
~ expr
! expr
expr << expr
expr >> expr
expr * expr
expr / expr
expr % expr
expr & expr
expr + expr
expr - expr
expr | expr
expr ^ expr
expr == expr
expr != expr
expr < expr
expr > expr
expr >= expr
expr <= expr
expr && expr
expr || expr
expr ? expr : expr
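The four constant notations above can be summarized with a short Python sketch. It is only a model of the notation; parse_constant is a hypothetical helper, and the assembler's real lexer may accept or reject inputs differently.

```python
def parse_constant(text):
    """Convert one numeric literal ($hex, @octal, %binary or decimal) to an int."""
    prefixes = {"$": 16, "@": 8, "%": 2}
    base = prefixes.get(text[0])
    if base is None:
        return int(text, 10)      # no prefix: decimal
    return int(text[1:], base)    # strip the prefix character


if __name__ == "__main__":
    for literal in ("$FF80", "@770", "%1011", "123"):
        print(literal, "=", parse_constant(literal))
```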
In the object module format description we use the following notation for object module components:
<type:name>
where: 'name' is replaced with a descriptive name of the component
'type' indicates how the component is stored in an object file
Object module component types:
'byte' component is a single byte
'num' a variable length unsigned number:
If the number is between 0 and 125 inclusive,
128 is added to the number and it is emitted as a
single byte.
If the number is between 126 and 32767 inclusive,
the number is emitted as two bytes. The most
significant byte is emitted first.
If the number is between 32768 and 2^32-1 inclusive,
a byte equalling 255 is emitted, and then the four
byte number is emitted, most significant byte first.
Note that a flag value of 254 is reserved for future
expansion.
'zstring' a variable length string. The string is emitted
as-is, and includes a terminating NUL.
'string' a variable length string with size prefix. These
strings have the following format:
<num:string-length> <zstring:the string>
The string-length includes the terminating NUL of
the zstring.
'expr' is an expression emitted in reverse-polish notation.
See interm.c and interm.h for how expressions are
emitted.
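The 'num', 'zstring' and 'string' encodings described above can be sketched in Python as shown below. This is an illustration written from the description, not a copy of the code in interm.c. The function names are invented, strings are assumed to be ASCII, and the decoder's rule for telling the three 'num' forms apart by the first byte is inferred from the stated ranges (a two-byte number is below 32768, so its first byte is below 128).

```python
def encode_num(n):
    """Encode an unsigned number in the variable-length 'num' format."""
    if 0 <= n <= 125:
        return bytes([n + 128])              # one byte, 128..253
    if 126 <= n <= 32767:
        return n.to_bytes(2, "big")          # two bytes, MSB first
    if 32768 <= n <= 2**32 - 1:
        return b"\xff" + n.to_bytes(4, "big")  # flag 255, then four bytes
    raise ValueError("number out of range for 'num'")


def decode_num(data, pos=0):
    """Decode a 'num' at data[pos]; return (value, position after it)."""
    first = data[pos]
    if first == 255:
        return int.from_bytes(data[pos + 1:pos + 5], "big"), pos + 5
    if first == 254:
        raise ValueError("flag value 254 is reserved")
    if first >= 128:
        return first - 128, pos + 1
    return int.from_bytes(data[pos:pos + 2], "big"), pos + 2


def encode_zstring(text):
    """Emit the string as-is followed by a terminating NUL (ASCII assumed)."""
    return text.encode("ascii") + b"\x00"


def encode_string(text):
    """Emit <num:string-length> <zstring>; the length counts the NUL."""
    z = encode_zstring(text)
    return encode_num(len(z)) + z


if __name__ == "__main__":
    print(encode_num(125).hex(), encode_num(126).hex(), encode_num(32768).hex())
    print(encode_string("abc").hex())
```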
An object module is composed of records. The general format of a record is as follows:
<byte:type-code> <num:body-size> <bytes:body>
where: <type-code> is a single byte record type code. Record type codes are defined in interm.h
<body-size> gives the size of just the body in bytes.
<body> depends on record type and may have zero length.
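The record framing above is enough to walk through a module without understanding every record body. The Python sketch below shows one way to do that. It is not taken from the linker source: split_records is a hypothetical name, decode_num follows the 'num' description with the first-byte rule inferred from the stated ranges, and the type codes 1 and 2 in the demonstration are placeholders, since the real values are defined in interm.h.

```python
def decode_num(data, pos):
    """Decode a 'num' at data[pos]; return (value, position after it)."""
    first = data[pos]
    if first == 255:
        return int.from_bytes(data[pos + 1:pos + 5], "big"), pos + 5
    if first == 254:
        raise ValueError("flag value 254 is reserved")
    if first >= 128:
        return first - 128, pos + 1
    return int.from_bytes(data[pos:pos + 2], "big"), pos + 2


def split_records(module):
    """Yield (type_code, body) for each record in a module's bytes."""
    pos = 0
    while pos < len(module):
        type_code = module[pos]                  # <byte:type-code>
        size, pos = decode_num(module, pos + 1)  # <num:body-size>
        body = module[pos:pos + size]            # <bytes:body>
        if len(body) != size:
            raise ValueError("truncated record")
        yield type_code, body
        pos += size


if __name__ == "__main__":
    # Placeholder type codes 1 and 2; real codes come from interm.h.
    demo = bytes([1, 128 + 3]) + b"ab\x00" + bytes([2, 128])
    for code, body in split_records(demo):
        print(code, body)
```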
Module name record. Always first record in module.
iMODULE <num:bodysize> <zstring:module-name>
Section list.
iSECTS <num:bodysize> <num:no.sections> { <num:align> <num:size> <string:section-name> } ...
Symbols. The first <no.pubs> symbols are publics. The remaining symbols are external references.
iSYMS <num:bodysize> <num:no.symbols> <num:no.pubs> { <string:symbol-name> <string:source-reference> } ...
Public symbol values. No. values is same as <no.pubs> in iSYMS record, and in same order.
iXDEFS <num:bodysize> { <expr:value> } ...
Data fragment to be placed at given offset of given section.
iFRAG <num:bodysize> <num:section no.> <num:offset> <bytes:data>
Fixups for immediately previously emitted data fragment.
iFIXUPS <num:bodysize> <num:num-fixups> { <num:data-offset> <num:type> <expr:value> <expr:msg> } ...
A type code of 1 indicates that this is a byte fixup and the value of the byte is determined by the value expression.
A type code of 2 indicates that this is a range check instruction, and the message expression is printed if the value expression evaluates to a non-zero (true) value.
End of module.
iEND <num:bodysize>