Skip to content

Grasp specification

jedb edited this page Aug 30, 2014 · 2 revisions

Revised Report on the Esoteric Language Grasp

Introduction

Grasp is a graph-oriented esoteric programming language. Its design influences include Scheme/Lisp, from where it borrows the concept of source code being in the form of the primary data structure (replacing lists with graphs), and Funge-98, from where comes the slightly controversial design choice of providing reasonable facilities for practical programming, instead of trying to be as tarpitty as possible.

The name is an abbreviation of "GRAphS are being Processed".

Terms and Definitions

Following standard graph theory terminology, given a directed edge from node x to node y, x is called the tail of the edge, and y is the head. Unqualified uses of the word graph refer to a directed multigraph (multidigraph).

In this specification, verbiage such as "pointed by foo" is intended to be interpreted as shorthand for "pointed to by an outgoing edge labeled foo", sometimes prefixed with "value of the node" if that is necessary to make the statement sensible. Similarly, "edge foo", unless otherwise indicated, means "outgoing edge labeled foo".

Whenever something is chosen "randomly", the uniform distribution is implicated.

Concepts

A running Grasp system is a directed multigraph of labeled nodes and edges, along with some state. A Grasp program is the initial graph that exists before execution starts.

Node labels are the scalar values of Grasp. There are two scalar types: (IEEE 754 double-precision floating point) numbers and (Unicode) strings. Aggregate data structures can be built as subgraphs. A particularly notable such structure is the singly linked list connected with edges labeled next, which is used by several Grasp commands to represent stacks or strings.

Edge labels are arbitrary Unicode strings, and need not be unique even within particular (tail, head) pair.

Execution Semantics

The state of a running Grasp system consists of the graph and an ordered list of instruction pointers (IPs). Each IP consists of a stack of references to nodes; this is called the call stack of the IP. The node referred to by the topmost element of the call stack is the current node of the IP.

Execution proceeds as a sequence of discrete ticks. At each tick, the list of IPs is processed sequentially. To process an IP, its current node is executed. Unless inhibited by the execution, the current node is then updated by following a randomly chosen edge next. If no such edges exist, the IP is removed from the list.

It is an error if a node that is executed does not contain a valid command. See the instruction set listing for the list of valid commands.

If any edges cond exist, the current node is only executed if all cond edges point to nodes that have numeric, non-zero values. The standard update of the current node happens unconditionally, however.

If an edge name exists, its head must have a string value; this string is then called the name of the tail. It is an error for a node to have multiple names. It is not an error to have multiple nodes with the same name.

The nodes referred to by the call stack of any IP, as well as all named nodes, are considered reachable. Any node that is a head of an edge where the tail is reachable is also reachable. Unreachable nodes (and all their outgoing edges) are subject to garbage collection, as it is not possible for them to affect future behavior of the system.

The initial IP list will be populated by inserting one IP for each node named grasp:main, with that node as the only element of the call stack. The order of the initial IP list is indeterminate. It is an error if no nodes named grasp:main exist. Execution terminates when the IP list becomes empty.

(The colon in grasp:main has no semantic meaning, but is used by convention as a namespace separator.)

Instruction Set

Each instruction name is followed, in parentheses, by a specification of edges that have special semantics for the instruction. The specification is a comma-separated list of edge labels, optionally followed by a "+", a "?" or a "*". The items have the following meanings:

  • foo: exactly one outgoing edge foo is required.
  • foo+: one or more outgoing edges foo are required.
  • foo?: an outgoing edge foo is optional, but at most one is allowed.
  • foo*: outgoing edges foo are optional, and any number is allowed.

General-purpose instructions

set (in*, out*)
Let v be the value pointed by in; if there are multiple edges in, randomly select one; if there are no edges in, skip this instruction. Update all nodes pointed by out to have v as their value.

new (tail, head, label)
Insert a new edge with tail pointed by tail, head pointed by head and label pointed by label. It is an error for edge label to point to a node with a numeric value. Note that this instruction can alter the control flow by manipulating edges labeled next.

del (tail*, head*, label*)
Remove all matching edges. An edge matches, if its tail is pointed by any edge tail. If edges head exist, an edge matches only if its head is pointed by any edge head. If edges label exist, an edge matches only if its label is equal to the value pointed by some label.

Stack manipulation

As implied by the semantics of these commands, a stack is a linked list of nodes connected by next edges. A single node (with an arbitrary value) represents an empty stack.

push (stack, in*)
Let s be the node pointed by stack. Create a new node n, and an edge labeled next with tail n and head s. Adjust all incoming edges of s (including edge stack) to have n as their head. If any edges in exist, perform a set with edge stack playing the role of edges out.

pop (stack, out*, empty*)
Let t be the node pointed by stack. It is an error for t to have multiple edges next.

  • If t has one edge next:
    If any edges out exist, perform a set with edge stack playing the role of edge in. Let s be the node pointed by edge next from t. Adjust all incoming edges of t (including edge stack) to have s as their head.
  • If t has no edges next:
    If any edges empty exist, perform a set with edges empty playing the role of edges in. Otherwise, take no action.

pick (stack, depth, out*, empty*)
Edge depth must point to a numeric value; call that value d. Consider d as a stack depth, with 0 denoting the top element. Perform a set with an implicit edge in pointing at the stack node at depth d. If d exceeds the stack depth, apply the semantics of pop on an empty stack.

any integer-valued node (stack)
In homage to Befunge, act as push with an implicit edge in pointing at the current node itself.

Subroutines

The subgraph duplication here is intended to help in writing reentrant functions, as most nontrivial functions would generally involve manipulating the function's subgraph.

call (func+, any edges other than name, next or cond)
Let f be a node with a name matching the value pointed by any edge func. If there are multiple candidates, choose one randomly; if there are no candidates, it is an error. Duplicate the subgraph reachable from f; let f' be the duplicate node corresponding to f. For each outgoing edge of the current node with a label other than func, name, next or cond, add corresponding edges that have tail f' but are otherwise identical. Finally, push a reference to f' on the call stack. Inhibit the normal semantics of updating the (new) current node for this tick.

ret (any edges other than name, next or cond)
Let c be the node in the call stack below the current node, which must exist. For each outgoing edge of the current node with a label other than name, next or cond, add corresponding edges that have tail c but are otherwise identical. Pop the call stack. (In most cases, this makes the subgraph generated by a corresponding call instruction suitable for immediate garbage collection.)

Arithmetics

It is an error for any arguments of these operations to point at nodes with non-numeric values.

add (arg*, out*)
Let v be the sum of all values pointed by arg, or 0 if there are no edges arg. Set the value of all nodes pointed by out to v.

mul (arg*, out*)
Equivalent to add, with the exception of computing the product instead of sum, and with 1 as the identity element.

sub (left, right*, out*)
Let v be the value pointed by left, after subtracting the sum of values pointed by right. Set the value of all nodes pointed by out to v.

div (left, right*, out*)
Equivalent to sub, with the exception of computing the product instead of sum.

mod (left, right, out*)
Let v be the integer remainder of dividing left by right, with v having the same sign of left. Set the value of all nodes pointed by out to v.

IO

The string-based IO operations add a terminating 0 value at the end of the linked-list representations of strings, to make it easier to test for empty string using the stack operations.

getc (out*, fh?)
Let c be the Unicode code point of a single character read from file handle pointed by fh, or the standard input if there are no edges fh. If an end-of-file condition occurs, let c be -1. Set the value of any nodes pointed by out to c.

putc (in+, fh?)
Let c be the value pointed by (randomly selected) in, which must be a number. Write the character corresponding to Unicode code point c to the file handle pointed by fh, or the standard output if there are no edges fh.

gets (out, label, fh?)
Read a line of text from file handle fh or standard input, discarding the newline character if present. Construct a singly linked list of nodes, where the value of each node is a Unicode code point of the corresponding character, terminated by a node with the value 0. Create a new edge with tail pointed by out, head being the node corresponding to first character of the string, and label being equal to the value pointed by label.

puts (in+, nl, fh?)
Let n be the tail of a randomly selected edge in. If the value of n is a string, let s be that string; if the value of n is a number, let s be the string resulting from interpreting n as the node corresponding to the first character of a string in the format constructed by gets. If an edge nl does not exist or points to a non-zero value, append a newline to s. Write s to the file handle fh or the standard output.