Skip to content


Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
Fetching contributors…

Cannot retrieve contributors at this time

629 lines (460 sloc) 20.488 kb
\section{\Idris{} --- the High Level Language}
\Idris{} is
a pure functional programming language with dependent types. It is
eagerly evaluated by default, and compiled via the Epic supercombinator
library \cite{brady2011epic}.
In this section, I will give a brief introduction to programming in \Idris{},
covering the most important features. A full tutorial is available elsewhere
\Idris{} defines several primitive types: \tTC{Int}, \tTC{Integer} and
\tTC{Float} for numeric operations, \tTC{Char} and \tTC{String} for
text manipulation, and \tTC{Ptr} which represents foreign pointers.
There are also several data types declared in the library, including
\tTC{Bool}, with values \tDC{True} and \tDC{False}. All of the usual
arithmetic and comparison operators are defined for the primitive types,
and are overloaded using type classes.
An \Idris{} program consists of a module declaration, followed by an optional
list of imports and a collection of definitions and declarations, for example:
module main
x : Int
x = 42
main : IO ()
main = putStrLn ("The answer is " ++ show x)
Like Haskell, the main function
is called \texttt{main}, and input and output is managed with an \texttt{IO}
monad. Unlike Haskell, however, a single colon indicates \texttt{:} a type
declaration, emphasising the importance of types,
and \remph{all} functions must have a top level type
declaration. This is due to type inference being, in general,
undecidable for languages with
dependent types.
A module declaration also opens a \remph{namespace}. The fully qualified names
declared in this module are \texttt{main.x} and \texttt{main.main}.
\subsection{Types and Functions}
Data types are declared in a similar way to Haskell data types, with a similar
syntax. Natural numbers and lists, for example, are declared as follows in the
data Nat = O | S Nat -- Natural numbers
-- (zero and successor)
data List a = Nil | (::) a (List a) -- Polymorphic lists
The above declarations are taken from the standard library. Unary natural
numbers can be either zero, or
the successor of another natural number (\texttt{S k}).
Lists can either be empty (\texttt{Nil})
or a value added to the front of another list (\texttt{x :: xs}).
In the declaration for \tTC{List}, we used an infix operator \tDC{::}. New operators
such as this can be added using a fixity declaration, as follows:
infixr 10 ::
This declares that \texttt{::} is a right associative operator (\texttt{infixr})
with a precedence level of 10.
Functions, data constructors and type constructors may all be given infix
operators as names. They may be used in prefix form if enclosed in brackets,
e.g. \tDC{(::)}.
Functions are implemented by pattern matching, again using a similar syntax to
Haskell. Some natural number arithmetic functions can be
defined as follows, again taken from the standard library:
-- Unary addition
plus : Nat -> Nat -> Nat
plus O y = y
plus (S k) y = S (plus k y)
-- Unary multiplication
mult : Nat -> Nat -> Nat
mult O y = O
mult (S k) y = plus y (mult k y)
The standard arithmetic operators \texttt{+} and \texttt{*} are also overloaded
for use by \texttt{Nat}, and are implemented
using the above functions. Unlike Haskell, there is no restriction on whether
types and function names must begin with a capital letter or not.
%names (\tFN{plus} and \tFN{mult} above), data constructors (\tDC{O}, \tDC{S},
%\tDC{Nil} and \tDC{::}) and type constructors (\tTC{Nat} and \tTC{List}) are
%all part of the same namespace.
\Idris{} has an interactive prompt, at which we can test these functions:
Idris> plus (S (S O)) (S (S O))
S (S (S (S O))) : Nat
Idris> mult (S (S (S O))) (plus (S (S O)) (S (S O)))
S (S (S (S (S (S (S (S (S (S (S (S O))))))))))) : Nat
Like arithmetic operations, integer literals are also overloaded using type classes,
meaning that we can also test the functions as follows:
Idris> plus 2 2
S (S (S (S O))) : Nat
Idris> mult 3 (plus 2 2)
S (S (S (S (S (S (S (S (S (S (S (S O))))))))))) : Nat
\subsubsection{\texttt{where} clauses}
Functions can also be defined \emph{locally} using \texttt{where} clauses. For example,
to define a function which reverses a list, we can use an auxiliary function which
accumulates the new, reversed list, and which does not need to be visible globally:
reverse : List a -> List a
reverse xs = revAcc [] xs where
revAcc : List a -> List a -> List a
revAcc acc [] = acc
revAcc acc (x :: xs) = revAcc (x :: acc) xs
Indentation is significant --- functions in the \texttt{where} block must be indented
further than the outer function.
Any names which are visible in the outer scope are also visible in the \texttt{where}
clause (unless they have been redefined, such as \texttt{xs} here).
\emph{However}, names which appear in the type are \emph{not} automatically
in scope. In particular,
in the above example, the \texttt{a} in the top level type and the \texttt{a} in the
auxiliary definition \texttt{revAcc} are \emph{not} the same. If this is the required
behaviour, the \texttt{a} can be brought into scope as follows:
reverse : List a -> List a
reverse {a} xs = revAcc [] xs where
revAcc : List a -> List a -> List a
\subsubsection{Dependent Types}
A standard example of a dependent type is the type of ``lists with length'',
conventionally called ``vectors'' in the dependently typed programming
literature. In \Idris{}, vectors are declared as follows:
data Vect : Set -> Nat -> Set where
Nil : Vect a O
(::) : a -> Vect a k -> Vect a (S k)
Note that this uses the same constructor names as for \tTC{List}. Ad-hoc
name overloading such as this is accepted by \Idris{}, provided that the names
are declared in different namespaces (in practice, normally in different modules).
Ambiguous constructor names are resolved by type.
This declares a family of types, which requires a different form of declaration
from the simple type declarations above. It resembles a Haskell GADT
declaration: it explicitly states the type
of the type constructor \tTC{Vect} --- it takes a type and a \tTC{Nat} as an
argument, where \tTC{Set} stands for the type of types. We say that \tTC{Vect}
is \emph{parameterised} by a type, and \emph{indexed} over \tTC{Nat}. Each
constructor targets a different part of the family of types. \tDC{Nil} can only
be used to construct vectors with zero length, and \tDC{::} to construct
vectors with non-zero length. The type of \tDC{::} states explicitly that an element
of type \texttt{a} and a tail of type \texttt{Vect a k} (i.e., a vector of length \texttt{k})
combine to make a vector of length \texttt{S k}.
Functions on dependent types such as \tTC{Vect} are declared in the same way
as on simple types such as \tTC{List} and \tTC{Nat} above, by pattern matching.
The type of a function over \tTC{Vect} will describe what happens to the
lengths of the vectors involved. For example, \tFN{++}, defined in the
library, appends two \tTC{Vect}s:
(++) : Vect A n -> Vect A m -> Vect A (n + m)
(++) Nil ys = ys
(++) (x :: xs) ys = x :: xs ++ ys
\subsubsection{The Finite Sets}
Finite sets, as the name suggests, are sets with a finite number of elements.
They are declared as follows in the library:
data Fin : Nat -> Set where
fO : Fin (S k)
fS : Fin k -> Fin (S k)
This declares
\tDC{fO} as the zeroth element of a finite set with \texttt{S k} elements;
\texttt{fS n} as the
\texttt{n+1}th element of a finite set with \texttt{S k} elements.
\tTC{Fin} is indexed by a \tTC{Nat}, which
represents the number of elements in the set.
Neither constructor targets \texttt{Fin O}, because we cannot construct an
element of an empty set.
A useful application of the \tTC{Fin} family is to represent bounded
natural numbers. Since the first \tTC{n} natural numbers form a finite
set of \tTC{n} elements, we can treat \tTC{Fin n} as the set of natural
numbers bounded by \tTC{n}.
For example, the following function which looks up an element in a \tTC{Vect},
by a bounded index given as a \tTC{Fin n}, is defined in the library:
index : Fin n -> Vect a n -> a
index fO (x :: xs) = x
index (fS k) (x :: xs) = index k xs
This function looks up a value at a given location in a vector. The location is
bounded by the length of the vector (\texttt{n} in each case), so there is no
need for a run-time bounds check. The type checker guarantees that the location
is no larger than the length of the vector.
Note also that there is no case for \texttt{Nil} here. It would be impossible
to apply such a case ---
since there is no element of \texttt{Fin O}, and the location is a
\texttt{Fin n}, then \texttt{n} can not be \tDC{O}. As a result, attempting to
look up an element in an empty vector would give a compile time type error,
since it would force \texttt{n} to be \tDC{O}.
\subsubsection{Implicit Arguments}
Let us take a closer look at the type of \texttt{index}:
index : Fin n -> Vect a n -> a
It takes two arguments, an element of the finite set of \texttt{n} elements, and a vector
with \texttt{n} elements of type \texttt{a}. But there are also two names,
\texttt{n} and \texttt{a}, which are not declared explicitly. These are \emph{implicit}
arguments to \texttt{index}. The type of \texttt{index} could also be written as:
index : {a:_} -> {n:_} -> Fin n -> Vect a n -> a
This gives bindings for \texttt{a} and \texttt{n} with placeholders for
their types, to be inferred by the machine. These types could also be given explicitly:
index : {a:Set} -> {n:Nat} -> Fin n -> Vect a n -> a
Implicit arguments, given in braces \texttt{\{\}} in the type declaration, are not given in
applications of \texttt{index}; their values can be inferred from the types of
the \texttt{Fin n} and \texttt{Vect a n} arguments. Any name which appears as a parameter
or index in a type declaration, but which is otherwise free, will be automatically
bound as an implicit argument. Indeed, binding arguments in this way is the essence
of dependent types.
Implicit arguments can still be given explicitly in applications, using the syntax
\texttt{\{a=value\}} and \texttt{\{n=value\}}, for example:
index {a=Int} {n=2} fO (2 :: 3 :: Nil)
In fact, any argument, implicit or explicit, may be given a name. For example,
the type of \texttt{index} could be declared as:
index : (i:Fin n) -> (xs:Vect a n) -> a
This can be useful for improving the readability of type signatures, particularly
where the name suggests the argument's purpose.
\subsection{Type Classes}
\Idris{} supports overloading in two ways. Firstly, as we have already seen with
the constructors of \texttt{List} and \texttt{Vect}, names
can be overloaded in an ad-hoc manner and resolved according to the context in which
they are used. This is mostly for convenience, to eliminate the need to decorate
constructor names in similarly structured data types, and eliminate explicit qualification
of ambiguous names where only one is well-typed --- this is especially useful
for disambiguating record field names\footnote{Records are however beyond the scope
of this paper}.
Secondly, \Idris{} implements \remph{type classes}, following Haskell. This
allows a more principled approach to overloading --- a type class gives a
collection of overloaded operations which describe the interface for
\remph{instances} of that class.
A simple example
is the \texttt{Show} type class, which is defined in the library and
provides an interface for converting values to
class Show a where
show : a -> String
This generates a function of the following type (which we call a \emph{method} of the
\texttt{Show} class):
show : Show a => a -> String
An instance of a class
is defined with an \texttt{instance} declaration, which provides implementations of
the function for a specific type. For example, the \texttt{Show} instance for \texttt{Nat}
could be defined as:
instance Show Nat where
show O = "O"
show (S k) = "s" ++ show k
Idris> show (S (S (S O)))
"sssO" : String
Only one instance of a class can be given for a type --- instances may not overlap.
Instance declarations can themselves have constraints. For example, to define a
\texttt{Show} instance for vectors, we need to know that there is a \texttt{Show}
instance for the element type, because we are going to use it to convert each element
to a \texttt{String}:
instance Show a => Show (Vect a n) where
show xs = "[" ++ show' xs ++ "]" where
show' : Vect a n' -> String
show' Nil = ""
show' (x :: Nil) = show x
show' (x :: xs) = show x ++ ", " ++ show' xs
\textbf{Remark: } The type of the auxiliary function \texttt{show'} is
important. The type variables \texttt{a} and \texttt{n} which are part of the
instance declaration for \texttt{Show (Vect a n)} are fixed across the entire
instance declaration. As a result, \texttt{a} need not be constrained
again. Furthermore, it means that if \texttt{n} is used again in the type, it refers
to the (fixed) length of the outermost list \texttt{xs}. Therefore,
there is a different name for the length \texttt{n'} in \texttt{show'}.
Like Haskell type classes, default definitions can be given in the class declaration.
Otherwise, all methods must be given in an instance. For example, there is an
\texttt{Eq} class:
class Eq a where
(==) : a -> a -> Bool
(/=) : a -> a -> Bool
x /= y = not (x == y)
y == y = not (x /= y)
Classes can also be extended. A logical next step from an equality relation \texttt{Eq}
is to define an ordering relation \texttt{Ord}. We can define an \texttt{Ord} class
which inherits methods from \texttt{Eq} as well as defining some of its own:
data Ordering = LT | EQ | GT
class Eq a => Ord a where
compare : a -> a -> Ordering
(<) : a -> a -> Bool
-- etc
\subsection{Matching on intermediate values}
\subsubsection{\texttt{let} bindings}
Intermediate values can be calculated using \texttt{let} bindings:
mirror : List a -> List a
mirror xs = let xs' = rev xs in
app xs xs'
Pattern matching is also supported in \texttt{let} bindings. For example, extracting
fields from a record can be achieved as follows, as well as by pattern matching at the top level:
data Person = MkPerson String Int
showPerson : Person -> String
showPerson p = let MkPerson name age = p in
name ++ " is " ++ show age ++ " years old"
\subsubsection{\texttt{case} expressions}
Another way of inspecting intermediate values of \emph{simple} types
is to use a \texttt{case} expression.
For example, \texttt{list\_lookup}
looks up an index in a list, returning \texttt{Nothing} if the index is out
of bounds. This can be used this to write \texttt{lookup\_default}, which
looks up an index and returns a default value if the index is out of bounds:
lookup_default : Nat -> List a -> a -> a
lookup_default i xs def = case list_lookup i xs of
Nothing => def
Just x => x
The \texttt{case} construct is intended for simple analysis
of intermediate expressions to avoid the need to write auxiliary functions, and is
also used internally to implement pattern matching \texttt{let} and lambda bindings.
It will \emph{only} work if:
\item Each branch \emph{matches} a value of the same type, and \emph{returns} a
value of the same type.
\item The type of the result is ``known''. i.e. the type of the expression can be
determined \emph{without} type checking the \texttt{case}-expression itself.
\subsubsection{The \texttt{with} rule}
Since types can depend on values, the form of some arguments can be determined
by the value of others. For example, if we were to write down the implicit
length arguments to \texttt{(++)}, we would see that the form of the length argument was
determined by whether the vector was empty or not:
(++) : Vect a n -> Vect a m -> Vect a (n + m)
(++) {n=O} [] [] = []
(++) {n=S k} (x :: xs) ys = x :: xs ++ ys
If \texttt{n} was a successor in the \texttt{[]} case, or zero in the
\texttt{::} case, the definition
would not be well typed.
Often, matching is required on the result of an intermediate computation
with a dependent type.
\Idris{} provides a construct for this, the \texttt{with} rule,
inspired by views in \Epigram~\cite{McBride2004a},
which takes account of the
fact that matching on a value in a dependently typed language can affect what
is known about the forms of other values.
For example, a \texttt{Nat} is either even or odd.
If it is even it will
be the sum of two equal \texttt{Nat}s. Otherwise, it is the sum of two equal \texttt{Nat}s
plus one:
data Parity : Nat -> Set where
even : Parity (n + n)
odd : Parity (S (n + n))
We say \texttt{Parity} is a \emph{view} of \texttt{Nat}.
It has a \emph{covering function} which tests whether
it is even or odd and constructs the predicate accordingly.
parity : (n:Nat) -> Parity n
Using this, a function which converts a natural number to a list
of binary digits (least significant first) is written as follows, using the \texttt{with}
natToBin : Nat -> List Bool
natToBin O = Nil
natToBin k with (parity k)
natToBin (j + j) | even = False :: natToBin j
natToBin (S (j + j)) | odd = True :: natToBin j
The value of the result of \texttt{parity k} affects the form of \texttt{k},
because the result
of \texttt{parity k} depends on \texttt{k}.
So, as well as the patterns for the result of the
intermediate computation (\texttt{even} and \texttt{odd}) right of the
\texttt{$\mid$}, the definition also expresses how
the results affect the other patterns left of the $\mid$. Note that there is a
function in the patterns (\texttt{+}) and repeated occurrences of \texttt{j} ---
this is allowed
because another argument has determined the form of these patterns.
Jump to Line
Something went wrong with that request. Please try again.