hhas edited this page Feb 21, 2023 · 7 revisions

Iris language guide

Design philosophy

Logo influence

Iris is a “stealth Logo”, which is in turn a “stealth Lisp”. While there are technical advantages arising from homoiconicity, the real benefits are cognitive. Algol/C languages by their syntactic and semantic design present a worldview that is seriously skewed: the language appears more important than its users. Logo places its user front and center, in charge not only of using the base language but of growing and reshaping it so that it better represents the ideas and behaviors which are of interest to her.

Logo encourages compositional thinking. The entire language is described in just three Key Concepts:

  1. This is a Word.

  2. This is how you Perform words.

  3. This is how you Add your own words.

Given a built-in vocabulary of simple, general-purpose words as her starting point, the user adds her own custom words which perform the tasks which are of interest to her. New, more powerful, task-specific words are composed from existing ones:

forward 50 right 90

to square
    repeat 4 [forward 50 right 90]
end

to flower
    repeat 36 [right 10 square]
end

to garden
    repeat 25 [set-random-position flower]
end

A key tenet of Logo is that the user’s words (square, flower, garden) are of equal importance to its own (forward, right, repeat). One set of rules applies to everyone. This is in contrast to Algol/C languages, where built-in words (let, if, repeat, etc) are assigned special significance (special syntax and custom behaviors) while the user’s words (function names) are relegated to second-class status (no special syntax, limited one-size-fits-all behavior).

Logo’s rigorously simple egalitarianism[1] is valuable both conceptually and practically:

  1. It demonstrates to the user that the language and its words are not in any way “special” or superior to her own. She can construct any new behavior she wishes via a single, consistent process: recursively composing words. She can perform any behavior by typing its name, followed by any arguments (values) it needs to do its job.

  2. A novice user of an Algol language is initially overwhelmed by its myriad arcane rules and is expected to learn most or all of them before proceeding to the “hard features”: creating and using her own functions. In Logo, creating and using new words can (and should!) be taught first: the rules for doing so are trivial. Built-in words such as if and repeat can be ignored until/unless they are needed, at which point they are discoverable and learnable through exactly the same tools and rules as every other word, including the user’s own.

As a language, Logo has its own design flaws and compromises:

  1. As in Forth, there are no stop words/punctuation marks to denote the end of each command. Procedures are N-ary; on matching a command name to a procedure, the procedure is responsible for consuming the next N expressions, e.g. the repeat procedure consumes the next 2 expressions. A Logo program is inherently ambiguous without prior knowledge of the procedures that will evaluate it.

  2. Variable scoping is dynamic, not lexical. This allows lazily evaluated arguments, e.g. actions (lists of words) in if and repeat, to be bound at runtime without need for a special thunking mechanism. However, it again makes program behavior hard to reason about: name lookups traverse the entire call stack, where any stack frame could mask the name being searched for with its own definition.
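
The first flaw can be made concrete with a small sketch. It is written in Python rather than Logo, and the `group` function and both arity tables are hypothetical, introduced only for illustration. It groups the same flat token stream under two different assumptions about how many expressions each word consumes:

```python
def group(tokens, arity):
    """Group a flat token list into nested commands, Logo-style.

    Each known word consumes the next arity[word] expressions; any
    other token is treated as a literal value.
    """
    pos = 0

    def read_expr():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok in arity:
            # The word itself decides how much of the stream it consumes.
            return [tok] + [read_expr() for _ in range(arity[tok])]
        return tok

    commands = []
    while pos < len(tokens):
        commands.append(read_expr())
    return commands


tokens = "forward 50 right 90".split()

# The usual turtle vocabulary: both words take one argument.
logo_arity = {"forward": 1, "right": 1}
# Assumption for contrast: a `forward` that takes two arguments.
other_arity = {"forward": 2, "right": 1}

print(group(tokens, logo_arity))   # [['forward', '50'], ['right', '90']]
print(group(tokens, other_arity))  # [['forward', '50', ['right', '90']]]
```

The text is identical in both cases; only knowledge of the procedures determines where one command ends and the next begins.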

Iris design choices

Iris aims to replicate Logo’s positive characteristics while avoiding its negatives:

  1. Iris syntax is word-centric. Punctuation is used sparingly, for disambiguation and clarification, and tracks punctuation rules for written natural language: commas and periods serve as separators; semi-colons connect clauses; colons denote identifier-detail relations. Parentheses, brackets, and braces denote groupings. Symbols such as $ and %, commonly reassigned non-standard meanings in Algol languages, remain free to be matched as currency prefix and percent suffix to numbers, e.g. $20.01, 75%. Overloading the common meanings of symbols is not forbidden but is strongly discouraged.

  2. Commands are explicitly terminated, by punctuation and/or linebreaks. An iris program’s order of evaluation can be determined by grammatical rules alone.

  3. Built-in and user-defined commands all behave according to a single, common set of rules. User-defined commands are equally important to built-ins. Given knowledge of user-defined commands, the pretty-printer can embolden user-defined names to appear more important than built-ins—which, to the user, they are.

  4. Argument evaluation is determined by a handler’s interface. While iris’s default behavior is to eagerly evaluate arguments as any type of value, a handler can apply custom type constraints to an argument to modify this behavior, e.g. to coerce the given value to a specific type (e.g. string, list of: number), to check it falls within certain bounds (number from: 0 to: 100), or defer its evaluation until/unless needed (lazy).
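
As a rough illustration of point 4, the following Python sketch models a handler interface as a list of per-argument constraints. It is not Iris's implementation: `number_in`, `list_of`, `LAZY`, `call`, and the two example handlers are all hypothetical names, and arguments are modeled as zero-argument functions (thunks) so that evaluation can be deferred.

```python
def as_number(value):
    """Coerce a value to a number, e.g. the string "42" to 42.0."""
    return float(value)


def number_in(lo, hi):
    """Constraint: coerce to a number and check it lies within lo..hi."""
    def coerce(value):
        n = as_number(value)
        if not lo <= n <= hi:
            raise ValueError(f"{n} is not in range {lo} to {hi}")
        return n
    return coerce


def list_of(item_coerce):
    """Constraint: coerce every item of a list with item_coerce."""
    return lambda values: [item_coerce(v) for v in values]


LAZY = object()  # marker: hand the unevaluated thunk to the handler


def call(handler, interface, thunks):
    """Evaluate each argument thunk as the handler's interface dictates."""
    args = []
    for constraint, thunk in zip(interface, thunks):
        if constraint is LAZY:
            args.append(thunk)                # deferred: handler decides when
        else:
            args.append(constraint(thunk()))  # eager: evaluate, then coerce
    return handler(*args)


def set_volume(level):
    return f"volume {level}"


def if_then(test, action):
    return action() if test else None


log = []
print(call(set_volume, [number_in(0, 100)], [lambda: "75"]))  # volume 75.0
call(if_then, [bool, LAZY], [lambda: False, lambda: log.append("ran")])
print(log)  # [] because the lazy action was never evaluated
```

The caller writes every argument the same way; whether it is evaluated eagerly, coerced, range-checked, or deferred is decided entirely by the interface the handler declares.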

Iris design compromises

One area where Iris does compromise is operator syntax. While “everything is a command”, some commands are dressed with syntactic sugar and precedence rules, trading simplicity and consistency for user convenience:

  • As in Logo, standard infix arithmetic notation is adopted for familiarity, e.g. 1 + 2 * 3 is preferred to ‘+’ {1, ‘*’ {2, 3}} (although both syntaxes are valid). Comparison, containment, and logic operations are similarly dressed.

  • Assignment and flow control commands are also sugared, e.g. if test then action rather than if test then: action. As a convention, the operator-defined keywords match the underlying command and argument names so the syntax is superficially similar to that of low-punctuation commands; the visible difference being the absence of colon characters. The advantage of adopting operator syntax here is that operands can be written as low-punctuation commands, e.g.:

    if some_command with_argument: x then another_command with_argument: y
    

    Without operator syntax, these inner commands must be explicitly parenthesized; otherwise their labeled arguments will be associated with the outer if command:

    ‘if’ (some_command with_argument: x) then: (another_command with_argument: y)
    
  • A Pascal-like do...done block syntax is also provided as a keyword-based alternative to parentheses, e.g. these are equivalent:

    if test then (some_command, another_command)
    
    if test then do
      some_command
      another_command
    done
    

    (The block is terminated by done rather than end, as end is already used in reference forms.)

  • Casts (as), references (of) and reference forms (at, named, where, etc), and handler definitions (to, when, returning) also employ operator syntax.
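
To show what “dressing” a command with operator syntax amounts to, here is a minimal Python sketch that rewrites infix arithmetic into the nested command form quoted in the first bullet. It is an illustration only, not Iris's parser: the `desugar` function and the `PRECEDENCE` table are assumptions, it handles just integers, parentheses, and four left-associative operators, and it emits straight quotes rather than typographic ones.

```python
import re

# Assumed precedence table for this sketch only; higher binds tighter.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def desugar(source):
    """Rewrite an infix arithmetic expression as nested command syntax."""
    tokens = re.findall(r"\d+|[-+*/()]", source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def operand():
        tok = advance()
        if tok == "(":
            inner = expression(1)
            advance()  # skip the closing ")"
            return inner
        return tok

    def expression(min_prec):
        left = operand()
        # Precedence climbing: absorb operators that bind at least as
        # tightly as min_prec, building the command from the inside out.
        while peek() in PRECEDENCE and PRECEDENCE[peek()] >= min_prec:
            op = advance()
            right = expression(PRECEDENCE[op] + 1)
            left = f"'{op}' {{{left}, {right}}}"
        return left

    return expression(1)


print(desugar("1 + 2 * 3"))    # '+' {1, '*' {2, 3}}
print(desugar("(1 + 2) * 3"))  # '*' {'+' {1, 2}, 3}
```

The precedence table is the extra knowledge a reader needs in order to group the infix form, which is exactly the complexity that plain command syntax avoids.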

The disadvantages of operator syntax are:

  • It increases the language’s syntactic complexity and adds precedence rules.

  • Alphanumeric names reserved for use as operator keywords cannot be used as names (unless explicitly single-quoted).

  • Operator syntax is [currently] defined and imported per-library. If multiple libraries reserve the same names for different operators, there is a likelihood of these operators conflicting. (Iris uses pattern matching where the longest match wins, making it somewhat resistant to conflicts.)

  • While the user may define custom operator syntax for her own commands, this is more complex and again increases the risk of reserved keyword collisions.

  • Introducing new operator definitions to an existing library may create keyword collisions in existing scripts which use names that are now reserved. The library loader will need to version not only public handler interfaces but operator syntax as well, and tools for automated migration of scripts from one version of a library to another will need to be provided.
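
The “longest match wins” rule mentioned above can be sketched as follows. This Python example is an assumption-laden illustration, not Iris's matcher: the operator phrases, the library labels, and the `match_operator` function are all hypothetical.

```python
# Hypothetical operator keyword phrases, as two libraries might define them.
OPERATORS = {
    ("is",): "library A: equality test",
    ("is", "not"): "library A: inequality test",
    ("is", "not", "in"): "library B: non-containment test",
}


def match_operator(words, start):
    """Return (phrase, meaning) for the longest operator phrase that
    begins at words[start], or None if no operator matches there."""
    longest = max(len(phrase) for phrase in OPERATORS)
    for length in range(longest, 0, -1):  # try the longest candidates first
        phrase = tuple(words[start:start + length])
        if phrase in OPERATORS:
            return phrase, OPERATORS[phrase]
    return None


print(match_operator("x is not in y".split(), 1))
# (('is', 'not', 'in'), 'library B: non-containment test')
print(match_operator("x is not y".split(), 1))
# (('is', 'not'), 'library A: inequality test')
```

Because the longer phrase is preferred, two libraries that share a leading keyword can coexist. Two libraries that define the identical phrase would still conflict, which is why the resistance is only partial.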

The optimal balance of custom operator syntax vs standard command syntax is TBD.

Input errors

It is expected that users (both novice and experienced) will make various errors when typing code, e.g.:

  • forgetting to add colons after argument labels

  • not grouping labeled parameters correctly when nesting low-punctuation commands

  • misspelling command and argument names

  • using reserved keywords as ordinary names without single-quoting them

  • mis-balanced parentheses/brackets/braces.

These errors are unavoidable. What is important is how they are treated and handled. The computer should not make its user feel stupid or incompetent for inevitably making minor typos. The editing environment should quickly identify and (as much as is practical) auto-correct these errors immediately, or else mark them for later manual review by the user. As long as the user’s intent is clear, iris should not annoy her with trivial typos and pedantic punctuation niggles. User-defined names should also be amenable to automatic spellcheck (multiple words within a name being clearly separated by underscores). The UX should enable the user to write, while taking care of “dotting every ‘i’ and crossing every ‘t’” for her. This UX will also facilitate voice input, where commands are spoken rather than typed.

Auto-correct should be implemented independent of the core language (i.e. at the tooling level) and apply to code during the authoring process, ideally as the user types. The purpose of a finished written script is to represent the user’s requirements as the computer has understood them, after all detected ambiguities and errors are successfully resolved. The script is, in effect, the “single version of truth” copy which the user approves as representing her wishes. The system should not apply any more automated transforms to that code once the user has signed off on it, unless the user asks for further changes to be made.
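
One way a tool might handle misspelled command names is sketched below in Python, using the standard library's `difflib.get_close_matches`. The list of known names, the `suggest` function, and the 0.8 similarity cutoff are assumptions chosen for illustration, not part of any iris tooling.

```python
import difflib

# Hypothetical names known to the editor (built-in and user-defined).
KNOWN_NAMES = ["forward", "right", "repeat", "square", "flower", "garden",
               "set_random_position"]


def suggest(name, known=KNOWN_NAMES):
    """Return the name itself if known, a single confident correction,
    or None when the name should be marked for manual review."""
    if name in known:
        return name
    candidates = difflib.get_close_matches(name, known, n=2, cutoff=0.8)
    # Auto-correct only when exactly one close candidate exists;
    # otherwise leave the decision to the user.
    return candidates[0] if len(candidates) == 1 else None


print(suggest("forwrad"))  # forward
print(suggest("xyzzy"))    # None
```

Returning None rather than a best guess reflects the policy described above: correct immediately when the intent is clear, and otherwise mark the name for later review by the user.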


[1] Excepting a couple of practical compromises: to...end (used to define new words) and arithmetic operators (standard math notation).
