Ninesquared81/bude

Bude

Bude is a stack-based language inspired in part by Porth.

NOTE: The language is currently very unfinished.

Bude can also generate assembly code, which can then be assembled with FASM.

Building

The project can be built using the makefile provided. The makefile may need to be edited for different environments (e.g. to use a compiler other than GCC).

$ make

Running the compiler

After building the project, the interpreter can be run using the following command:

$ ./bin/bude.exe ./examples/hello_world.bude
Hello, World!

This will run the program hello_world.bude from the examples subdirectory. A full list of command line options can be found by running:

$ ./bin/bude.exe -h

To create a Windows executable file, run:

$ ./bin/bude.exe ./examples/hello_world.bude -a -o hello_world.asm
$ fasm hello_world.asm
flat assembler  version 1.73.31  (1048576 kilobytes memory)
3 passes, 0.1 seconds, 2560 bytes.
$ ./hello_world.exe
Hello, World!

This assumes FASM has been installed and is in the PATH variable. Note that the actual output of FASM may vary.

Language Overview

Bude has a stack for storing 64-bit words. There are instructions to manipulate the stack.

Notation: below, different annotations are used to describe how data on the stack should be interpreted for the different instructions. Note that regardless of the size of the underlying C type, these values always take up one full stack slot (64 bits). A C-like language is used for the result expressions of some operations.

Data:

  • w – arbitrary stack word
  • i – signed integer value
  • f – floating-point value (either 32- or 64-bit)
  • n – number value (integer or floating-point)
  • p – pointer value
  • b – Boolean value
  • c – UTF-8 codepoint
  • <literal> – the literal value denoted
  • pk – structural "pack" type
  • cmp – structural "comp" type
  • T – type variable "T"

Syntax:

  • <symbol> – syntactically required symbol
  • section – syntactical section
  • [ optional-section ] – optional syntactic section
  • … – syntax can be repeated arbitrarily
  • ( groups ) – group syntax sections
  • choice1|choice2 – syntax option

Push instructions

"Lorem ipsum" → ( pstart ilength ) : Push the specified string to the stack.

'q' → c : Push the specified character to the stack.

42 → i : Push the specified integer to the stack.

true → b : Push the Boolean "true" value to the stack.

false → b : Push the Boolean "false" value to the stack.

Pop instructions

w pop → ∅ : Pop the top element from the stack and discard it.

w print → ∅ : Pop the top element from the stack and print it in a format based on its type.

c print-char → ∅ : Pop the top element from the stack and print it as a Unicode character.

Field access operations

pk <field-name: T> → pk T : Push the specified field from the pack.

pk T <- <field-name: T> → pk : Pop the top stack value and use it to set the specified field of the pack underneath.

cmp <field-name: T> → cmp T : Push the specified field from the comp.

cmp T <- <field-name: T> → cmp : Pop the top stack value and use it to set the specified field of the comp underneath.

Arithmetic operations

n1 n2 + → (n1 + n2) : Pop the top two elements and push their sum.

n1 n2 - → (n1 - n2) : Pop the top two elements and push their difference.

n1 n2 * → (n1 * n2) : Pop the top two elements and push their product.

i1 i2 divmod → (i1 / i2) (i1 % i2) : Pop the top two elements and push the quotient and remainder from their division. The remainder is always non-negative (Euclidean division).

i1 i2 idivmod → (i1 /trunc i2) (i1 %trunc i2) : Pop the top two elements and push the quotient and remainder of their truncated division. The quotient is rounded towards zero and the remainder can be negative.

i1 i2 edivmod → (i1 /euclid i2) (i1 %euclid i2) : Pop the top two elements and push the quotient and remainder of their Euclidean division. The remainder is always non-negative; for a positive divisor, this means the quotient is rounded towards negative infinity.

i1 i2 / → (i1 / i2) : Pop the top two stack elements and push the quotient from their division. Acts like divmod pop to pop the remainder.

f1 f2 / → (f1 / f2) : Pop the top two floating-point stack elements and push their ratio. Unlike the integer version, this is exact division and thus has no remainder.

i1 i2 % → (i1 % i2) : Pop the top two stack elements and push the remainder from their division. Acts like divmod swap pop to pop the quotient.

n ~ → (-n) : Negate the top stack element.
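The difference between truncated and Euclidean division only shows up when an operand is negative. The following is a minimal Python sketch of the two rounding rules described above. The function names trunc_divmod and euclid_divmod are made up for illustration, and Python's unbounded integers stand in for Bude's 64-bit words; this is a model of the arithmetic, not Bude's implementation.

```python
def trunc_divmod(a: int, b: int) -> tuple[int, int]:
    """Truncated division: the quotient is rounded towards zero."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    # The remainder is whatever is left over, so it can be negative.
    return q, a - q * b


def euclid_divmod(a: int, b: int) -> tuple[int, int]:
    """Euclidean division: the remainder r always satisfies 0 <= r < |b|."""
    q, r = trunc_divmod(a, b)
    if r < 0:
        # Move the remainder into range and compensate in the quotient.
        if b > 0:
            q, r = q - 1, r + b
        else:
            q, r = q + 1, r - b
    return q, r


for a, b in [(7, 2), (-7, 2)]:
    print(a, b, "trunc:", trunc_divmod(a, b), "euclid:", euclid_divmod(a, b))
```

For -7 and 2 this gives (-3, -1) under truncated division and (-4, 1) under Euclidean division; for 7 and 2 both give (3, 1).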

Comparison operations

n1 n2 < → (n1 < n2) : Pop n2 and n1 and push back whether n1 < n2.

n1 n2 <= → (n1 <= n2) : Pop n2 and n1 and push back whether n1 ≤ n2.

n1 n2 = → (n1 == n2) : Pop n2 and n1 and push back whether n1 = n2.

n1 n2 >= → (n1 >= n2) : Pop n2 and n1 and push back whether n1 ≥ n2.

n1 n2 > → (n1 > n2) : Pop n2 and n1 and push back whether n1 > n2.

n1 n2 /= → (n1 != n2) : Pop n2 and n1 and push back whether n1 ≠ n2.

Logical operations

w not → !w : Replace the top element with its logical inverse (i.e. non-zero → false, 0 → true).

w1 w2 or → (w1 or w2) : Drop w1 if it is equal to zero, else drop w2.

w1 w2 and → (w1 and w2) : Drop w1 if it is non-zero, else drop w2.
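Note that or and and select one of their operands rather than producing a fresh Boolean. A small Python sketch of the rules as worded above (the names bude_not, bude_or and bude_and are hypothetical, and plain Python integers stand in for stack words):

```python
def bude_not(w: int) -> bool:
    # Non-zero maps to false, zero maps to true.
    return w == 0


def bude_or(w1: int, w2: int) -> int:
    # Drop w1 if it is zero (leaving w2), otherwise drop w2 (leaving w1).
    return w2 if w1 == 0 else w1


def bude_and(w1: int, w2: int) -> int:
    # Drop w1 if it is non-zero (leaving w2), otherwise drop w2 (leaving w1).
    return w2 if w1 != 0 else w1


print(bude_or(0, 5), bude_or(3, 5))    # 5 3
print(bude_and(3, 5), bude_and(0, 5))  # 5 0
```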

Memory operations

p deref → (byte *p) : Pop the pointer p and push the first byte it points to.

<var-name: T> → T : Push the specified local variable.

T <- <var-name: T> → ∅ : Pop the top stack value and use it to set the specified local variable.

Stack manipulation

w1 w2 swap → w2 w1 : Swap the top two elements on the stack.

w1 dupe → w1 w1 : Duplicate the top element on the stack.

w1 w2 over → w1 w2 w1 : Copy the next element over the top element.

w1 w2 w3 rot → w2 w3 w1 : Rotate the top three stack elements.
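The four stack manipulation instructions can be modelled on a Python list whose last element is the top of the stack. This is only an illustrative sketch of the stack effects listed above; the Python function names simply mirror the Bude instruction names.

```python
def swap(stack: list) -> None:
    # w1 w2 -> w2 w1
    stack[-2], stack[-1] = stack[-1], stack[-2]


def dupe(stack: list) -> None:
    # w1 -> w1 w1
    stack.append(stack[-1])


def over(stack: list) -> None:
    # w1 w2 -> w1 w2 w1
    stack.append(stack[-2])


def rot(stack: list) -> None:
    # w1 w2 w3 -> w2 w3 w1
    stack.append(stack.pop(-3))


stack = [1, 2, 3]
rot(stack)
print(stack)  # [2, 3, 1]
over(stack)
print(stack)  # [2, 3, 1, 3]
swap(stack)
print(stack)  # [2, 3, 3, 1]
dupe(stack)
print(stack)  # [2, 3, 3, 1, 1]
```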

Conversions

w to <type: T> → T : Convert top stack element to type T (value-preserving).

w as <type: T> → T : Coerce top stack element to type T (bit-pattern-preserving, with truncation).
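The distinction between to and as is the usual one between converting a value and reinterpreting its bits. The Python sketch below illustrates the idea for a 64-bit float and an 8-bit integer. Which type pairs Bude actually accepts for each instruction is not stated here, so the specific conversions (f64 to int, f64 as int, int as u8) and the helper names are assumptions chosen for illustration.

```python
import struct


def f64_to_int(x: float) -> int:
    """Model of a value-preserving conversion: 2.0 becomes the integer 2."""
    return int(x)


def f64_as_int(x: float) -> int:
    """Model of a bit-pattern-preserving coercion: reinterpret the 64 bits."""
    return struct.unpack("<q", struct.pack("<d", x))[0]


def int_as_u8(x: int) -> int:
    """Model of coercion with truncation: keep only the low 8 bits."""
    return x & 0xFF


print(f64_to_int(2.0))       # 2
print(hex(f64_as_int(1.0)))  # 0x3ff0000000000000
print(int_as_u8(300))        # 44
```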

Constructors

F1 F2 F3 … <pack-name> → pk : Construct a pack with the field types F1, F2, F3, ….

F1 F2 F3 … <comp-name> → cmp : Construct a comp with the field types F1, F2, F3, ….

Destructors

pk unpack → F1 F2 F3 … : Unpack the pack on the top of the stack into its fields with types F1, F2, F3, ….

cmp decomp → F1 F2 F3 … : Decompose the comp on the top of the stack into its fields with types F1, F2, F3, ….

Control flow constructs

if condition then then-body [elif elif-condition then elif-then-body …] [else else-body] end

while condition do body end

for [<loop-var> (to|from)] count do body end

The for loop has two forms. The simple form (for count …) loops the number of times specified by count. The counting form (for <loop-var> to count …, for <loop-var> from count …) creates a loop variable and binds it to the name specified, which can be accessed inside the loop. The value stored in the loop variable either starts at zero and counts up to count or counts down to zero from count.

P1 P2 P3 … <func-name> → R1 R2 R3 … : Call the specified function which takes parameters with types P1, P2, P3, … and returns values with types R1, R2, R3, ….

ret : Return from the current function.

Definitions

pack <pack-name> def (<field-name> -> <field-type>) … end

comp <comp-name> def (<field-name> -> <field-type>) … end

func <param-type> … <func-name> [-> <ret-type> …] def func-body end

var <var-name> -> <var-type> … end

import <lib-name> def (func <param-type> … <ext-func-name> [-> <ret-type> …] [from "<alias>"] [with <call-conv>] end) … end

Language Features

Packs and Comps

Bude stores data on the stack. Each value fits into one 64-bit stack slot. This makes things easier to work with, but what if we wanted to compose many "things" together while still treating them as one unit? We need some sort of structural type, like struct in C.

Luckily, Bude has us covered with not one, but two structural types. The first of these is the pack, which still takes up one stack slot but can have multiple non-overlapping fields inside. Each field is accessed by a corresponding name and can hold a value of a certain type. The syntax for defining a pack type is outlined above, but to give an example, let's say we want to store an RGBA value on the stack. Each channel is in the range 0-255, so 4 channels would only need 32 bits in total – much smaller than a stack slot. This is a good call for a pack. We define our pack as:

pack RGBA-Colour def
    r -> u8
    g -> u8
    b -> u8
    a -> u8
end

Now we can construct an RGBA-Colour value using the symbol "RGBA-Colour" as a constructor:

70u8 130u8 180u8 255u8 RGBA-Colour  # Creates an RGBA colour (70, 130, 180, 255) (HTML SteelBlue)

We can extract a field from the newly-created pack by using its name as a getter:

g print  # prints 130

We can also set a field using the <- operator:

150 <- a
a print   # prints 150

We can deconstruct a pack to release all its fields using the unpack instruction:

unpack
print  # prints 150
print  # prints 180
print  # prints 130
print  # prints 70
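One way to picture a pack is as bit fields inside a single 64-bit word. The Python sketch below models the RGBA-Colour example with shifts and masks. The layout (first declared field in the lowest byte) and the helper names are assumptions made for illustration; the layout Bude actually uses is not specified here.

```python
FIELDS = ("r", "g", "b", "a")  # declaration order of RGBA-Colour


def make_pack(r: int, g: int, b: int, a: int) -> int:
    """Combine four u8 values into one word (assumed layout: r in the low byte)."""
    word = 0
    for index, value in enumerate((r, g, b, a)):
        word |= (value & 0xFF) << (8 * index)
    return word


def get_field(word: int, name: str) -> int:
    """Extract one 8-bit field without changing the pack."""
    return (word >> (8 * FIELDS.index(name))) & 0xFF


def set_field(word: int, name: str, value: int) -> int:
    """Return a copy of the pack with one field replaced."""
    shift = 8 * FIELDS.index(name)
    return (word & ~(0xFF << shift)) | ((value & 0xFF) << shift)


def unpack(word: int) -> list[int]:
    """List the fields in declaration order, so the last field is the stack top."""
    return [get_field(word, name) for name in FIELDS]


colour = make_pack(70, 130, 180, 255)
print(get_field(colour, "g"))        # 130
colour = set_field(colour, "a", 150)
print(get_field(colour, "a"))        # 150
print(unpack(colour))                # [70, 130, 180, 150]
print(colour < (1 << 64))            # True: the whole pack fits in one slot
```

Reading the unpacked list from right to left gives 150, 180, 130, 70, which matches the order of the print calls above.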

Packs are great and all, but what if we wanted to store, say, a 3-D vector? If we used 32-bit floats, we'd fill up our pack after only 2 components, and for 64-bit floats, we'd fill it up after only one component. Luckily, Bude has another structural type available to us: the comp. Comps allow us to compose several stack words into a single unit. Syntactically, they work very similarly to packs. For our vector example:

comp Vec3D def
    x -> f64
    y -> f64
    z -> f64
end

We, again, have named fields. This time, each field corresponds to a different stack slot within the comp. As with packs, we can access fields by their name:

5.0 42.0 -0.5 Vec3D
x print  # prints 5.0
y 5 - <- y
y print  # prints 37.0

As with packs, we have an instruction to deconstruct a comp. This time, it's called decomp (for decompose):

decomp
print  # prints -0.5
print  # prints 37.0
print  # prints 5.0
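Because each comp field is its own stack slot, a comp can be modelled as a run of adjacent slots at the top of the stack. The Python sketch below replays the Vec3D example under that assumption; the helper names and the idea that field access is a simple offset from the stack top are illustrative guesses, not a description of Bude's internals.

```python
VEC3D_FIELDS = ("x", "y", "z")  # declaration order of Vec3D


def get_field(stack: list, name: str) -> None:
    """Push a copy of the named field of the comp occupying the top slots."""
    offset = len(VEC3D_FIELDS) - VEC3D_FIELDS.index(name)
    stack.append(stack[-offset])


def set_field(stack: list, name: str) -> None:
    """Pop the top value and store it in the named field of the comp underneath."""
    value = stack.pop()
    offset = len(VEC3D_FIELDS) - VEC3D_FIELDS.index(name)
    stack[-offset] = value


stack = [5.0, 42.0, -0.5]       # 5.0 42.0 -0.5 Vec3D
get_field(stack, "y")           # y
stack.append(stack.pop() - 5)   # 5 -
set_field(stack, "y")           # <- y
print(stack)                    # [5.0, 37.0, -0.5]

# decomp leaves the three fields as ordinary values; printing pops from the top.
while stack:
    print(stack.pop())          # -0.5, then 37.0, then 5.0
```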

Functions

Most programming languages have a concept of a function or procedure: a block of code that can be executed on demand and then returns to the call site. Bude has functions, too. For example, here is a function hello that prints the string "Hello, World!":

func hello def
    "Hello, World!\n" print
end

The function is introduced by the keyword func. Then, we have the function's name followed by the keyword def, which marks the start of the function's body. This is then terminated by end, which is also where the function definition ends.

Often, we want to transfer data to/from a function through parameters and return values. In Bude, arguments and return values are passed/left on the stack. Unlike functions in most programming languages, Bude functions can return multiple values. It's as simple as leaving multiple values on the stack at the end of the function. If we want our Bude function to receive parameters or leave return values, we must specify this with a function signature. The signature comes between the func and def delimiters and is an expansion on the simple name we used in the example above (in fact, in that example, "hello" is the signature). The syntax for signatures is best illustrated with an example:

func int int diff-squares -> int def
    dupe * swap
    dupe * swap
    -
end

This function returns the difference of the squares of the two integers passed. The signature starts with the parameter types (if any), followed by the function name. Return types are introduced by a right arrow -> following the name. If there are no return types, the arrow is omitted. We can have multiple return values:

func first-five-primes -> int int int int int def
    2 3 5 7 11
end
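To see why the body of diff-squares leaves the difference of the squares on the stack, it helps to trace the stack after each word. The Python sketch below is a tiny hypothetical evaluator that supports only the four words used in that body (dupe, swap, * and -) and prints the stack as it goes; it is a reading aid, not a Bude interpreter.

```python
def run(words: list[str], stack: list[int]) -> list[int]:
    """Apply each word to the stack in turn, printing the stack after every step."""
    for word in words:
        if word == "dupe":
            stack.append(stack[-1])
        elif word == "swap":
            stack[-2], stack[-1] = stack[-1], stack[-2]
        elif word == "*":
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs * rhs)
        elif word == "-":
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs - rhs)
        else:
            raise ValueError(f"unsupported word: {word}")
        print(f"{word:>5}  {stack}")
    return stack


DIFF_SQUARES = "dupe * swap dupe * swap -".split()
run(DIFF_SQUARES, [7, 3])  # final stack: [40], i.e. 7*7 - 3*3
```

The first dupe * swap squares the second argument and moves it out of the way, the second squares the first argument and restores the original order, and the final - subtracts.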
