
Customizing Syntax

Louis Thibault edited this page Jun 30, 2021 · 4 revisions

Lisps do not have syntax, per se. Instead, semantics are defined by how data-types like list, vector or int (i.e. "forms") are evaluated by the language.

As such, "customizing syntax" can mean one of two things:

  1. Customize how text is parsed into forms
  2. Customize how expressions are evaluated

Custom Parsing

Parsing is the responsibility of reader.Reader.

Slurp's reader is inspired by the Clojure reader and uses a read table to map delimiters such as ( or ; onto reader macros, each of which parses a specific type of form (here: a list and a comment, respectively).

The Reader can be extended with new syntactic features by using SetMacro to add a reader.Macro to the read table, or to override an existing one.
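To make the read-table idea concrete, here is a minimal, self-contained sketch. It does not import Slurp: the Macro and ReadTable types, the ReadString helper, and the @...@ raw-text syntax are all hypothetical, chosen only to illustrate how a trigger rune can be mapped onto a parsing function. The signatures of the real reader.Macro and SetMacro may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// Macro is a hypothetical reader-macro signature (not Slurp's reader.Macro).
// It is called after the trigger rune has been consumed and returns one form.
type Macro func(r *strings.Reader) (any, error)

// ReadTable maps a trigger rune onto the macro that parses what follows it.
type ReadTable map[rune]Macro

// SetMacro adds a macro for the trigger rune, or overrides an existing one.
func (t ReadTable) SetMacro(trigger rune, m Macro) { t[trigger] = m }

// ReadString reads a single form from src by consuming the first rune and
// handing the rest of the input to the macro registered for that rune.
func ReadString(t ReadTable, src string) (any, error) {
	r := strings.NewReader(src)
	ch, _, err := r.ReadRune()
	if err != nil {
		return nil, err
	}
	m, ok := t[ch]
	if !ok {
		return nil, fmt.Errorf("no reader macro for %q", ch)
	}
	return m(r)
}

// readUntil collects runes up to the closing delimiter, which it consumes.
func readUntil(r *strings.Reader, end rune) (string, error) {
	var sb strings.Builder
	for {
		ch, _, err := r.ReadRune()
		if err != nil {
			return "", fmt.Errorf("missing closing %q: %w", end, err)
		}
		if ch == end {
			return sb.String(), nil
		}
		sb.WriteRune(ch)
	}
}

// NewTable returns a table with one custom macro: @...@ reads raw text.
func NewTable() ReadTable {
	t := ReadTable{}
	t.SetMacro('@', func(r *strings.Reader) (any, error) {
		return readUntil(r, '@')
	})
	return t
}

func main() {
	form, err := ReadString(NewTable(), "@hello world@")
	fmt.Println(form, err)
}
```

Registering a second macro under the same trigger simply replaces the first, which mirrors the "add or override" behaviour described above.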

N.B.: Reader macros are not the "lisp macros" you may have heard about. The terminology is admittedly confusing, but has been adopted by Slurp to remain consistent with Clojure and the larger lisp ecosystem.

Reader macros come in two flavors:

  1. "Normal" macros are stored in the main read table, and require no additional syntax.
  2. "Dispatch" macros cause the reader to use a reader macro from another table, indexed by the dispatch character (# by default).

Thus, by default, {:foo :bar} represents a map, whereas #{:foo :bar} represents a set.

The dispatch character, like most things in Slurp, is configurable. A macro is marked as a dispatch macro by passing true as the second argument to SetMacro.

The Reader returned by reader.New(...) is configured to support the following forms by default:

Numbers

  • Integers use the int64 Go representation and can be specified using decimal, binary, hexadecimal, or radix notation (e.g., 123, -123, 0b101011, 0xAF, 2r10100, 8r126).
  • Floating point numbers use the float64 Go representation and can be specified using decimal or scientific notation (e.g., 3.1412, -1.234, 1e-5, 2e3, 1.5e3).
  • You can override the number reader using WithNumReader.
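As an illustration of the integer notations listed above, the following sketch converts a literal to int64 using only the Go standard library. The parseInt function is hypothetical and is not Slurp's number reader; in particular, the accepted radix range (2 to 36) is an assumption borrowed from what strconv.ParseInt supports.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseInt converts an integer literal written in decimal, 0b/0x-prefixed,
// or <radix>r<digits> notation into an int64.
func parseInt(lit string) (int64, error) {
	sign, body := "", lit
	if strings.HasPrefix(lit, "-") || strings.HasPrefix(lit, "+") {
		sign, body = lit[:1], lit[1:]
	}
	// Radix notation: the digits before 'r' give the base, e.g. 2r10100.
	if i := strings.IndexAny(body, "rR"); i > 0 {
		radix, err := strconv.Atoi(body[:i])
		if err != nil || radix < 2 || radix > 36 {
			return 0, fmt.Errorf("invalid radix in %q", lit)
		}
		return strconv.ParseInt(sign+body[i+1:], radix, 64)
	}
	// Base 0 lets strconv infer the base from a 0b, 0o, or 0x prefix.
	// Note that it also accepts underscores and treats a leading 0 as octal.
	return strconv.ParseInt(lit, 0, 64)
}

func main() {
	for _, lit := range []string{"123", "-123", "0b101011", "0xAF", "2r10100", "8r126"} {
		n, err := parseInt(lit)
		fmt.Println(lit, n, err)
	}
}
```

A custom number reader could follow the same shape: recognize the notation first, then delegate the digit conversion to strconv.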

Characters

Characters use the rune or uint8 Go representation and can be written in three ways:

  • Simple: \a etc.
  • Special: \newline, \tab etc.
  • Unicode: \u1267
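The three character notations can be told apart with a few string checks. The sketch below is a hypothetical parseChar, not Slurp's character reader: the table of special names beyond \newline and \tab, and the decision to accept any number of hexadecimal digits after \u, are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode/utf8"
)

// special maps character names onto runes. Only newline and tab come from
// the text above; the other entries are assumed for illustration.
var special = map[string]rune{
	"newline": '\n',
	"tab":     '\t',
	"space":   ' ',
	"return":  '\r',
}

// parseChar converts a character literal, including its leading backslash,
// into a rune.
func parseChar(lit string) (rune, error) {
	if !strings.HasPrefix(lit, `\`) || len(lit) < 2 {
		return 0, fmt.Errorf("invalid character literal %q", lit)
	}
	name := lit[1:]
	// Simple: exactly one rune after the backslash, e.g. \a.
	if utf8.RuneCountInString(name) == 1 {
		r, _ := utf8.DecodeRuneInString(name)
		return r, nil
	}
	// Special: a named character such as \newline.
	if r, ok := special[name]; ok {
		return r, nil
	}
	// Unicode: 'u' followed by hexadecimal digits, e.g. \u1267.
	if strings.HasPrefix(name, "u") {
		n, err := strconv.ParseUint(name[1:], 16, 32)
		if err == nil && utf8.ValidRune(rune(n)) {
			return rune(n), nil
		}
	}
	return 0, fmt.Errorf("unknown character literal %q", lit)
}

func main() {
	for _, lit := range []string{`\a`, `\newline`, `\tab`, `\u1267`} {
		r, err := parseChar(lit)
		fmt.Printf("%s -> %q %v\n", lit, r, err)
	}
}
```

The single-rune check runs first so that \u on its own is read as the letter u rather than as an incomplete Unicode escape.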

Others

  • Boolean: true or false are converted to Bool type.
  • Nil: nil is represented as a zero-allocation empty struct in Go.
  • Keywords: Keywords represent symbolic data and start with :. (e.g., :foo)
  • Symbols: Symbols can be used to name a value and can contain any Unicode symbol.
  • Lists: Lists are zero or more forms contained within parentheses. (e.g., (1 2 3), (1 :hello ())).
  • Vectors: Vectors are ordered collections of forms contained within brackets. (e.g., [:foo 42 "hello" name])
  • Maps: [ ⚙️ under development ... ]
  • Sets: [ ⚙️ under development ... ]

Custom Evaluation

Slurp uses core.Env as the stateful environment for evaluating forms. Evaluation happens in two steps:

  1. A form is first analyzed using a core.Analyzer to produce an expression (i.e., syntactical analysis).
  2. An expression (i.e. core.Expr) is evaluated against an env.

Note that the exact evaluation logic depends on the underlying implementation of core.Expr; VectorExpr behaves differently than ConstExpr, for example.

Users are free to provide custom implementations of Analyzer, which maps forms to expressions, as well as custom expressions. In this way, users can support additional features and fine-tune runtime performance.
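The analyze-then-evaluate split can be sketched in a few lines of standalone Go. Everything here is a hypothetical stand-in defined locally (Env as a plain map, the Expr interface, the Symbol form type, and the Analyze and Eval functions); the names ConstExpr and VectorExpr are reused from the text above, but their definitions below are not Slurp's, and the real core.Analyzer and core.Expr interfaces may look different.

```go
package main

import "fmt"

// Env is a hypothetical stand-in for a stateful environment: names -> values.
type Env map[string]any

// Expr is anything that can be evaluated against an Env.
type Expr interface {
	Eval(env Env) (any, error)
}

// ConstExpr evaluates to the value it wraps.
type ConstExpr struct{ Value any }

func (c ConstExpr) Eval(Env) (any, error) { return c.Value, nil }

// SymbolExpr resolves its name in the environment.
type SymbolExpr struct{ Name string }

func (s SymbolExpr) Eval(env Env) (any, error) {
	v, ok := env[s.Name]
	if !ok {
		return nil, fmt.Errorf("unbound symbol %q", s.Name)
	}
	return v, nil
}

// VectorExpr evaluates each item in order and collects the results.
type VectorExpr struct{ Items []Expr }

func (v VectorExpr) Eval(env Env) (any, error) {
	out := make([]any, 0, len(v.Items))
	for _, item := range v.Items {
		val, err := item.Eval(env)
		if err != nil {
			return nil, err
		}
		out = append(out, val)
	}
	return out, nil
}

// Symbol is a hypothetical form type produced by a reader.
type Symbol string

// Analyze is step 1: it maps a form onto an expression.
func Analyze(form any) (Expr, error) {
	switch f := form.(type) {
	case Symbol:
		return SymbolExpr{Name: string(f)}, nil
	case []any:
		items := make([]Expr, 0, len(f))
		for _, sub := range f {
			expr, err := Analyze(sub)
			if err != nil {
				return nil, err
			}
			items = append(items, expr)
		}
		return VectorExpr{Items: items}, nil
	default:
		// Any other form evaluates to itself.
		return ConstExpr{Value: f}, nil
	}
}

// Eval runs both steps: analyze the form, then evaluate the expression.
func Eval(env Env, form any) (any, error) {
	expr, err := Analyze(form)
	if err != nil {
		return nil, err
	}
	return expr.Eval(env)
}

func main() {
	env := Env{"x": int64(42)}
	fmt.Println(Eval(env, []any{Symbol("x"), "hello"}))
}
```

Supporting a new feature in this sketch means adding a case to Analyze and a matching Expr type; moving work such as symbol resolution into the analysis step is one way a custom analyzer could trade analysis time for faster evaluation.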

Slurp provides a builtin Analyzer that supports all builtin forms and performs macro expansion. It should be suitable for most applications.