code-golf-oriented esoteric programming language
Python

README.md

GS2 documentation

What is GS2?

GS2 is a stack-based, concatenative programming language with functional influences, inspired by GolfScript and J. Its primary purpose is doing well at code golf contests; it achieves this by supplying many built-in commands and syntactical shortcuts that are each only one byte long.

Programming in GS2 is conceptually similar to programming in GolfScript or similar languages: the stack initially contains a string representing standard input, and its contents are printed to standard output when the program is finished.

Values

GS2 values are integers, blocks, or lists. Blocks contain unevaluated code, representing "functions", like in GolfScript. There is no dedicated string type: strings are just lists of integers.

Tokens

Throughout this document mnemonics are listed next to some token names; these are compatible with the included gs2c utility.

Special tokens are:

Token Meaning
$01 $xx Push unsigned byte $xx to stack.
$02 $yy $xx Push signed short $xxyy to stack.
$03 $zz $yy $xx $ww Push signed short $wwxxyyzz to stack.
$04 ss* $zz Push string(s) to stack. ss is separated by $07; the action performed is decided by the string end byte $zz, see below.
$07 $cc Push single-character string "\xcc" to stack.
$08 Opens a block.
$09 Closes a block.
%111xxyyy (hex values $e0 through $fb) Wraps last y + 1 values into a block. x is the final token (what do we do with this block?): 0=nop, 1=map ($34), 2=filter ($35), 3=apply on both of top 2 elements ($38)
$xx $fc (dm1, dump-map1) Map single token inside lists. Short for $08 $90 $xx $09 $34.
$xx $fd (df1, dump-filter1) Filter single token inside lists. Short for $08 $90 $xx $09 $35.
$fe (m:) Open rest-of-program map block.
$ff (f:) Open rest-of-program filter block.

The end bytes for $04 mean the following:

End byte Meaning
$05 push strings to stack, sequentially
$06 push strings to stack in array
$9b sprintf (pops format from ss, pops fitting number of items from stack, pushes formatted string)
$9c regex match (pops regex from ss, pops string from stack, pushes 1 on match, else 0)
$9d regex replace (pops replacement string from ss, pops regex from ss, pops string from stack and pushes re.sub result)
$9e regex find (like $9c, but calls re.findall)
$9f regex split (like $9c, but calls re.split)

The regexes used by these operations may be prefixed by special characters to set a special variable c: by default it is 0, prefixing the regex by $5D sets it to 1, prefixing it by $7D $xx sets it to $xx. c affects the operations as follows:

  • $9c: match whole string if c > 0
  • $9d: perform at most c substitutions (unlimited if c = 0)
  • $9e: find first matching substring only if c > 0 (else array of matching substrings)
  • $9f: perform at most c splits (unlimited if c = 0) Furthermore, if the first character of a program is $04, it may be omitted; an unmatched string closing token will automatically be paired up. (gs2c will automatically perform this optimization.)

The following single-byte tokens have special meaning at the start of a program:

Mode Meaning
$30 (line-mode) Line mode: [program] -> [lines, map program, unlines]
$31 (word-mode) Word mode: [program] -> [words, map program, unwords]
$32 (line-mode-skip-first) Like $30, but ignore first line.

All other tokens are simple operations that pop values from the stack and push results back.

Constants

The following single-byte tokens push constants to the stack:

Byte Constant
$0a (new-line) [$0a]
$0b (empty-list) []
$0c (empty-block) {}
$0d (space) [$20]
$10 - $1a 0 - 10
$1b 100
$1c 1000
$1d 16
$1e 64
$1f 256

Functions

Opcode Meaning
$0e (make-array, extract-array) Pop number n, then pop n elements and push them back into an array; pop array and push each element.
$20 (negate, reverse, eval) Negate numbers; reverse lists; evaluate blocks.
$21 (bnot, head) Bitwise-negates numbers; extract first element from lists.
$22 (not, tail) Boolean negation for numbers; drop first element from lists.
$23 (abs, init) Absolute value for numbers; drop last element from lists.
$24 (digits, last) Push array of base 10 digits for numbers; extract last element from lists.
$25 (random) Push random.randint(0, x-1) for numbers x; choose random element for lists.
$26 (dec, left-uncons) Subtract 1 from numbers; pushes tail and then head for lists.
$27 (inc, right-uncons) Add 1 to numbers; pushes init and then last for lists.
$28 (sign, min) Push sign for numbers; minimum for lists.
$29 (thousand, max) Multiply numbers by 1000; maximum for lists.
$2a (double, lines) Multiply numbers by 2; split list into lines.
$2b (half, unlines) Divide numbers by 2; join with newlines for lists.
$2c (square, words) Square numbers; split list into words.
$2d (sqrt, unwords) Integer square root for numbers, join with space for lists.
$2e (range, length) Push [0..n-1] for numbers n, length for lists.
$2f (range1, sort) Push [1..n] for numbers n; sorts lists; sortBy for blocks.
$30 (add, cat, +) Add numbers, catenate lists/blocks.
$31 (sub, diff, -) Subtract numbers, set difference for lists.
$32 (mul, join, times, fold, \*) Multiply numbers; repeats a list n times; join list of lists with another; fold block over list.
$33 (div, chunks, split, each, /) Divide numbers; splits a list in chunks of size n; split two lists; call block with each element of list.
$34 (mod, step, clean-split, map, %) Modulo numbers; each nth element for list+number; split two lists removing empty lists; maps block over list.
$35 (and, get, when, filter, &) Bitwise and numbers; index list; eval block only when number on top of stack is non-zero, filter list by block.

Et cetera, et cetera, et cetera. You know what, just read the mnemonics in gs2.py and mess around with them. If you need to know anything ask me via mail or IRC (I'm mauris in #anagol on freenode.)

Example usage and gs2c

The included gs2c.py script is an "assembler" that compiles a more readable representation of gs2 opcodes and constants. It reads mnemonics for gs2 functions from the source code of gs2.py itself! Let's write a simple program, compile it using gs2c.py, and run it.

A very simple golf challenge is the following: given a positive integer n on standard input, print a triangle of asterisks of size n:

*
**
***
****
*****

We read the number, then map a block over [1..n], turning each number into asterisks followed by a newline:

read-num range1 m: "*" times new-line

Then gs2c can compile this, and gs2 can run the resulting file:

$ python gs2c.py < stars > compiled && echo 7 | python gs2.py compiled
*
**
***
****
*****
******
*******

Our solution is 7 bytes long: 56 2f fe 07 2a 32 0a. This is pretty good compared to GolfScript's 11.