Skip to content

Commit

Permalink
[ doc ] lexical structure
Browse files Browse the repository at this point in the history
  • Loading branch information
UlfNorell committed Oct 26, 2015
1 parent d1fa1c7 commit aad4189
Showing 1 changed file with 233 additions and 2 deletions.
235 changes: 233 additions & 2 deletions doc/user-manual/language/lexical-structure.rst
Original file line number Diff line number Diff line change
Expand Up @@ -4,5 +4,236 @@
Lexical Structure
*****************

.. note::
This is a stub.
Agda code is written in UTF-8 encoded plain text files with the extension
``.agda``. Most unicode characters can be used in identifiers and whitespace is
important, see Names_ and Layout_ below.

Tokens
------

Keywords and special symbols
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Most non-whitespace unicode can be used as part of an Agda name, but there are
two kinds of exceptions:

special symbols
Characters with special meaning that cannot appear in at all in a name. These are
``.;{}()@"``.

keywords
Reserved words that cannot appear as a `name part <Names_>`_, but can appear
in a name together with other characters.

.. code-block:: none
= | -> → : ? \ λ ∀ .. ...
abstract codata coinductive constructor data eta-equality field forall hiding
import in inductive infix infixl infixr instance let macro module mutual
no-eta-equality open pattern postulate primitive private public quote
quoteContext quoteGoal quoteTerm record renaming rewrite Set syntax tactic
unquote unquoteDecl unquoteDef using where with
The ``Set`` keyword can appear with a number suffix, optionally subscripted
(see :doc:`universe-levels`). For instance ``Set42`` and ``Set₄₂`` are both
keywords.


Names
~~~~~

A `qualified name` is a non-empty sequence of `names` separated by dots
(``.``). A `name` is an alternating sequence of `name parts` and underscores
(``_``), containing at least one name part. A `name part` is a non-empty
sequence of unicode characters, excluding whitespace, ``_``, and `special symbols
<Keywords and special symbols_>`_. A name part cannot be one of the `keywords
<Keywords and special symbols_>`_ above, and cannot start with a single quote,
``'`` (which are used for character literals, see Literals_ below).

Examples
- Valid: ``data?``, ``::``, ``if_then_else_``, ``0b``, ``_⊢_∈_``, ``x=y``
- Invalid: ``data_?``, ``foo__bar``, ``_``, ``a;b``, ``[_.._]``

The underscores in a name indicate where the arguments go when the name is used
as an operator. For instance, the application ``_+_ 1 2`` can be written as ``1
+ 2``. See :doc:`mixfix-operators` for more information. Since most sequences
of characters are valid names, whitespace is more important than in other
languages. In the example above the whitespace around ``+`` is required, since
``1+2`` is a valid name.


Qualified names are used to refer to entities defined in other modules. For
instance ``Prelude.Bool.true`` refers to the name ``true`` defined in the
module ``Prelude.Bool``. See :doc:`module-system` for more information.

Literals
~~~~~~~~

There are four types of literal values: integers, floats, characters, and
strings. See :doc:`built-ins` for the corresponding types, and
:doc:`literal-overloading` for how to support literals for user-defined types.

Integers
Integer values in decimal or hexadecimal (prefixed by ``0x``) notation.
Non-negative numbers map by default to :doc:`built-in natural numbers
<built-ins>`, but can be overloaded. Negative numbers have no default
interpretation and can only be used through :doc:`overloading
<literal-overloading>`.

Examples: ``123``, ``0xF0F080``, ``-42``, ``-0xF``

Floats
Floating point numbers in the standard notation (with square brackets
denoting optional parts):

.. code-block:: none
float ::= [-] decimal . decimal [exponent]
| [-] decimal exponent
exponent ::= (e | E) [+ | -] decimal
These map to :doc:`built-in floats <built-ins>` and cannot be overloaded.

Examples: ``1.0``, ``-5.0e+12``, ``1.01e-16``, ``4.2E9``, ``50e3``.

.. _characters:

Characters
Character literals are enclosed in single quotes (``'``). They can be a
single (unicode) character, other than ``'`` or ``\``, or an escaped
character. Escaped characters starts with a backslash ``\`` followed by an
escape code. Escape codes are natural numbers in decimal or hexadecimal
(prefixed by ``x``), or one of the following special escape codes:

======== ======== ======== ======== ======== ======== ======== ========
Code ASCII Code ASCII Code ASCII Code ASCII
======== ======== ======== ======== ======== ======== ======== ========
``a`` 7 ``b`` 8 ``t`` 9 ``n`` 10
``v`` 11 ``f`` 12 ``\`` ``\`` ``'`` ``'``
``"`` ``"`` ``NUL`` 0 ``SOH`` 1 ``STX`` 2
``ETX`` 3 ``EOT`` 4 ``ENQ`` 5 ``ACK`` 6
``BEL`` 7 ``BS`` 8 ``HT`` 9 ``LF`` 10
``VT`` 11 ``FF`` 12 ``CR`` 13 ``SO`` 14
``SI`` 15 ``DLE`` 16 ``DC1`` 17 ``DC2`` 18
``DC3`` 19 ``DC4`` 20 ``NAK`` 21 ``SYN`` 22
``ETB`` 23 ``CAN`` 24 ``EM`` 25 ``SUB`` 26
``ESC`` 27 ``FS`` 28 ``GS`` 29 ``RS`` 30
``US`` 31 ``SP`` 32 ``DEL`` 127
======== ======== ======== ======== ======== ======== ======== ========

Character literals map to the :doc:`built-in character type <built-ins>` and
cannot be overloaded.

Examples: ``'A'``, ``'∀'``, ``'\x2200'``, ``'\ESC'``, ``'\32'``, ``'\n'``,
``'\''``, ``'"'``.

.. _strings:

Strings
String literals are sequences of, possibly escaped, characters enclosed in
double quotes ``"``. They follow the same rules as `character literals
<characters_>`_ except that double quotes ``"`` need to be escaped rather
than single quotes ``'``. String literals map to the :doc:`built-in string
type <built-ins>` and cannot be overloaded.

Example: ``"Привет \"мир\"\n"``.

Holes
~~~~~

Holes are an integral part of the interactive development supported by the
:any:`Emacs mode <emacs-mode>`. Any text enclosed in ``{!`` and ``!}`` is a
hole and may contain nested holes. A hole with no contents can be written
``?``. There are a number of Emacs commands that operate on the contents of a
hole. The type checker ignores the contents of a hole and treats it as an
unknown (see :doc:`implicit-arguments`).

Example: ``{! f {!x!} 5 !}``

Comments
~~~~~~~~

Single-line comments are written with a double dash ``--`` followed by arbitrary
text. Multi-line comments are enclosed in ``{-`` and ``-}`` and can be nested.
Comments cannot appear in `string literals <strings_>`_.

Example::

{- Here is a {- nested -}
comment -}
s : String --line comment {-
s = "{- not a comment -}"

Pragmas
~~~~~~~

Pragmas are special comments enclosed in ``{-#`` and ``#-}`` that have special
meaning to the system. See :doc:`pragmas` for a full list of pragmas.

Layout
------

Agda is layout sensitive using similar rules as Haskell, with the exception
that layout is mandatory: you cannot use explicit ``{``, ``}`` and ``;`` to
avoid it.

A layout block contains a sequence of `statements` and is started by one of the
layout keywords:

.. code-block:: none
abstract field instance let macro mutual postulate primitive private where
The first token after the layout keyword decides the indentation of the block.
Any token indented more than this is part of the previous statement, a token at
the same level starts a new statement, and a token indented less lies outside
the block.

::

data Nat : Set where -- starts a layout block
-- comments are not tokens
zero : Nat -- statement 1
suc : Nat → -- statement 2
Nat -- also statement 2

one : Nat -- outside the layout block
one = suc zero

Note that the indentation of the layout keyword does not matter.

An Agda file contains one top-level layout block, with the special rule that
the contents of the top-level module need not be indented.

::

module Example where
NotIndented : Set₁
NotIndented = Set

Literate Agda
-------------

Agda supports `literate programming <literate_>`_ where everything in a file is
a comment unless enclosed in ``\begin{code}``, ``\end{code}``. Literate Agda
files have the extension ``.lagda`` instead of ``.agda``. The main use case for
literate Agda is to generate LaTeX documents from Agda code. See
:any:`generating-latex` for more information.

::

\documentclass{article}
% some preable stuff
\begin{document}
Introduction usually goes here
\begin{code}
module MyPaper where
open import Prelude
five : Nat
five = 2 + 3
\end{code}
Now, conclusions!
\end{document}

.. _literate: https://en.wikipedia.org/wiki/Literate_programming

1 comment on commit aad4189

@andreasabel
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like.

Please sign in to comment.