Minhi: a minimalist high level language

Chris White edited this page Dec 16, 2018 · 13 revisions

TODO:

  • Postfix sigils instead of prefix
  • a sigil by itself is a postfix coercion operator
  • add += and -= (but, for minimalism, not the other <op>=)
  • add fat comma.
  • add len

E.g., array@ += 5 (append); hash% += a=>1 or hash% += ('a',1) (create key); hash% -= 'a' (delete key/value pair); num#$ < 'a' (lexicographic comparison; base 10 assumed); input$# > 10 (numeric comparison; base 10 assumed). The latter can alternatively be input$#>10, input$ #> 10, or input$ # > 10. The change to postfix sigils is so you don't have to remember which is controlling in $input# > 10 - with postfix, the one closest to the operator controls.

  • Unlike the examples in the preceding paragraph, input $ will give a parse error, as barewords cannot be coerced. (The rule against bareword coercion occurred to me as I was typing this example, and is intended to prevent precisely that sort of ambiguity.)
  • add low-precedence infix for, e.g., "(&say(ARG@[1], ' has value ', ARG@[0])) for values@.
  • With postsigils, /(?<sigil>)\d\+/ is no longer available as an alias for ARG elements. Inspired by the Raku "Whatever star", use *[0]..*[<max arg-1>] for the args. This is not confusable with infix *, since it falls in a different place in the parse.

Basics

A Minhi program is exactly a Minhi expression. Minhi is inspired by Perl, VBA, Raku, Lua, and Turbo Pascal (in some order).

  • Comments are from // to eol.
  • All errors are fatal

Data

  • Variables:

    • Scalars: &block (callable/function), $string, #number (signed integer by default), !reference
    • Collections: @array (0-based indexing), %hash.
    • Future? --- ?boolean, *float. What about uint, or specific sizes? Traits?
  • No allomorphs (dualvars).

  • Hash keys are coerced to strings.

  • Barewords can be used as hash keys.

  • Arrays and hashes stringify to their addresses.

  • Sigils are invariant

  • All variables are lexical.

  • All collections (arrays and hashes) are held and passed by reference.

  • Number 0 and empty string are falsy; all other values are truthy.

  • $NA is the bit bucket: you can write anything to it, but it always reads as 0. It is special-cased by -> and ,.

  • No gc, and no way to free allocated memory! :)
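The truthiness, hash-key, and $NA rules above can be modeled in a few lines of Python. This is only a sketch of the stated semantics, not part of any Minhi implementation: the names is_truthy, MinhiHash, and BitBucket are made up for illustration, and using Python's str() for key coercion is an assumption about how the coercion would behave.

```python
def is_truthy(value):
    """Number 0 and the empty string are falsy; everything else is truthy."""
    if isinstance(value, int):
        return value != 0
    if isinstance(value, str):
        return value != ''   # note: the string '0' is truthy
    return True              # arrays, hashes, blocks, references


class MinhiHash:
    """Hash whose keys are coerced to strings on every access."""

    def __init__(self):
        self._items = {}

    def __setitem__(self, key, value):
        self._items[str(key)] = value   # assumed coercion: Python's str()

    def __getitem__(self, key):
        return self._items[str(key)]


class BitBucket:
    """Model of $NA: accepts any write, always reads as 0."""

    def write(self, value):
        pass                 # the value is discarded

    def read(self):
        return 0
```

Under this model, a number key and its string form address the same hash entry, and an empty array is truthy because only 0 and the empty string are falsy.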

Literals

  • \d+: number
  • 0[obx]\d+: base-N number
  • 'str': string (no interpolation)

Future: float literals?
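As a rough illustration of the literal forms listed above, here is a Python sketch that recognizes them with the same patterns. The function name parse_literal is hypothetical. Two points are assumptions rather than anything stated on this page: the letters o, b, and x are taken to mean octal, binary, and hexadecimal, and a string literal is assumed not to contain a single quote.

```python
import re

# Assumed meaning of the base letters in 0[obx]\d+.
BASES = {'o': 8, 'b': 2, 'x': 16}

# One alternative per literal form: base-N number, number, string.
LITERAL = re.compile(r"0([obx])(\d+)|(\d+)|'([^']*)'")


def parse_literal(text):
    """Return the Python value for one Minhi literal."""
    match = LITERAL.fullmatch(text)
    if match is None:
        # All errors are fatal, so anything unrecognized raises.
        raise SyntaxError('not a literal: ' + text)
    base_letter, based_digits, decimal_digits, string_body = match.groups()
    if base_letter is not None:
        # Raises ValueError if a digit is invalid for the base.
        return int(based_digits, BASES[base_letter])
    if decimal_digits is not None:
        return int(decimal_digits)
    return string_body       # no interpolation: the body is used as-is
```

Because the base-N pattern is written with \d+, this sketch accepts only the digits 0 through 9 after the base letter, exactly as the pattern says.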

Operators

Precedence high to low. Empty rows separate precedence levels.

| Op | Arity | Assoc | Notes |
|---|---|---|---|
| ( ) | | | Grouping |
| " | 1 | Prefix | Make a symbol reference (e.g., "$foo). As a special case, quoting anything other than an identifier produces a block. E.g., 1 is a numeric literal (unity), and "1 is shorthand for the block whose only element is the expression 1. |
| sub | 1 | Prefix | As ", but only works on blocks, and makes the block a routine. return (to be added) will exit the innermost enclosing routine. |
| var | 1 | Prefix | Declare a new lexical variable. E.g., var "$answer. Returns a reference to the new variable, so that you can do var "$x=42 as you would expect. |
| const | 1 | Prefix | Declare a new lexical constant. |
| [] | 2 | L | Indexing (postcircumfix). TODO: Maybe also function calls? @x[1] and &x[1] are unambiguously distinguished by the sigil. |
| ^ | 1 | Postfix | Dereference pointer |
| | | | The effect of these is that you can do @foo[42]=%thing; %thing[#number+1]='yay'; @foo[42][#number+1] as you would expect. |
| \ | 1 | Prefix | Take address, e.g., !ptr=\@foo. Since this binds less tightly than the operators above (!), \@foo[42] is the address of element 42 of @foo, not the address of @foo plus 42 or some such. |
| ++, -- | 1 | Prefix | Preincrement, predecrement. Postincrement, postdecrement are currently unsupported. |
| <adjacency> | 2 | L | Call a block. Although this is treated as an operator, parens are required: &foo(1,2,3), not &foo 1,2,3. This should simplify the parser. Open questions: should this be at a different precedence level, or handled using an explicit operator? Maybe .? E.g., &foo.(1,2,3). |
| - | 1 | L | Unary minus |
| * | 2 | L | Multiplication |
| / | 2 | L | Division; type is set by the left operand. Always floor division for integers. |
| mod | 2 | L | Modulus |
| + | 2 | L | Addition or strcat. Numeric/string is set by the type of the left operand. |
| - | 2 | L | Subtraction |
| <= >= < > | 2 | N | Comparisons. Numeric/string is set by the type of the left operand. |
| = <> <=> | 2 | N | Comparisons. Numeric/string is set by the type of the left operand. |
| not | 1 | Prefix | Logical negation. Always returns a number, even for string inputs. |
| and | 2 | L | Logical AND. Returns the term, not just a logical value. Short-circuits. |
| or | 2 | L | As and, but logical OR |
| ??/:: | 3 | R | Ternary operator. Doubled up to avoid confusion with sigils. The :: is optional, so that this operator can double as the if-then. |
| , | 2 | L | List constructor. For single-element lists, use $NA: (1,$NA) |
| .. | 2 | N | While loop. The argument on the left is a block that is the conditional, and the argument on the right is a block to execute. E.g., var #i=0; "(#i<5) .. ( &say(#i); ++#i; ) |
| := | 2 | R | Assignment. Returns the RHS. |
| ; | 2 | L | Separate expressions in a block. The value of the block is the value of the last expression in the block. |
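The "type is set by the left operand" rule and the term-returning and/or can be sketched in Python as follows. All function names here are hypothetical. The page does not say how the right operand is converted, so coercing it to the left operand's type is an assumption, as is returning 1 or 0 from a comparison. The right operand of and/or is passed as a zero-argument callable so that short-circuiting is visible.

```python
def truthy(value):
    # Number 0 and the empty string are falsy; all else is truthy.
    return not (value == 0 or value == '')


def coerce_like(left, right):
    """Coerce right to the type of left (assumed coercion rule)."""
    return str(right) if isinstance(left, str) else int(right)


def minhi_add(left, right):
    # Python's + adds ints and concatenates strs, so the left type decides.
    return left + coerce_like(left, right)


def minhi_div(left, right):
    # Integers only in this sketch: always floor division.
    return left // coerce_like(left, right)


def minhi_less(left, right):
    # Numeric or lexicographic comparison, chosen by the left operand.
    return 1 if left < coerce_like(left, right) else 0


def minhi_and(left, right_thunk):
    # Returns the deciding term; the right side runs only if needed.
    return right_thunk() if truthy(left) else left


def minhi_or(left, right_thunk):
    return left if truthy(left) else right_thunk()
```

For example, minhi_add(1, '2') adds numerically while minhi_add('1', 2) concatenates, and minhi_and(0, thunk) returns 0 without ever calling the thunk.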

Possible future expansions

| Op | Arity | Assoc | Notes |
|---|---|---|---|
| names | 2 | N | Name a block, e.g., for break statements. E.g., foo names ($x=$x+1; break foo; 42) |
  • Logical xor

  • Bitwise ops: band, bor, bnot, bxor

  • return.

  • break, continue. The .. operator can enable these dynamically in a block it is given.

Built-in functions

  • &str(any): stringify any
  • &num($str[, #base]): convert $str as a number in base #base (default 10)
  • &type(any): Return the type of any.
  • &say(...): Print ... then a newline
  • &hear(): Read a line, remove the trailing newline, and return the line
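For readers who want a concrete reference point, the built-ins above map fairly directly onto Python. The sketch below is a guess at the intended behavior, not a specification: the minhi_ names are invented, and having &say print its arguments back to back with no separator is an assumption. &type is left out because the page does not say what it returns.

```python
import io
import sys


def minhi_str(value):
    """&str(any): stringify any value."""
    return str(value)


def minhi_num(text, base=10):
    """&num($str[, #base]): convert text as a number in the given base."""
    return int(text, base)


def minhi_say(*items, out=None):
    """&say(...): print the items, then a newline."""
    stream = sys.stdout if out is None else out
    # Assumption: items are written with no separator between them.
    stream.write(''.join(str(item) for item in items) + '\n')


def minhi_hear(source=None):
    """&hear(): read a line and drop its trailing newline."""
    stream = sys.stdin if source is None else source
    line = stream.readline()
    return line[:-1] if line.endswith('\n') else line


# Usage with in-memory streams standing in for the terminal.
fake_out = io.StringIO()
minhi_say('x', ' has value ', 42, out=fake_out)
echoed = minhi_hear(io.StringIO('hello\n'))
```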

User-defined functions

Functions are defined with a variable and a block. Example:

var "&factorial = sub (
    #0 <= 1 ?? 1 :: #0 * &factorial(#0 - 1)
)

Args are in @ARG automatically. Also, any identifier that matches /^(?<sigil>)\d$/ refers to that element of @ARG. At present, no coercion is performed: &foo(42) will die if &foo tries to access $0. Access through @ARG if the type is unknown. Coercion may be added later.
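The no-coercion rule for sigil-plus-digit arguments can be pictured with a small Python model. This is an illustrative sketch only: the helper arg and the SIGIL_TYPES table are hypothetical, only the $ and # sigils are modeled, and raising TypeError stands in for "die".

```python
# Assumed mapping from sigil to the Python type standing in for it.
SIGIL_TYPES = {'$': str, '#': int}


def arg(args, sigil, index):
    """Fetch args[index] through a sigil, with no coercion."""
    value = args[index]          # IndexError if the argument was not passed
    if not isinstance(value, SIGIL_TYPES[sigil]):
        # No coercion: a type mismatch is fatal.
        raise TypeError('argument %d is not a %s value' % (index, sigil))
    return value


def factorial(*args):
    """Python counterpart of the &factorial routine."""
    n = arg(args, '#', 0)        # '#0' in Minhi
    return 1 if n <= 1 else n * factorial(n - 1)
```

In this model, factorial(5) works because the first argument is a number, while reading that same argument through the $ sigil fails.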

The main program is a routine, and $0..$9 are the first ten arguments, not counting the program name.

$PROGRAM is the program name (bash's $0).