# edwinb/Idris-dev forked from idris-lang/Idris-dev

### Subversion checkout URL

You can clone with HTTPS or Subversion.

Fetching contributors…

Cannot retrieve contributors at this time

629 lines (460 sloc) 20.488 kb
 \section{\Idris{} --- the High Level Language} \label{sect:hll} \Idris{} is a pure functional programming language with dependent types. It is eagerly evaluated by default, and compiled via the Epic supercombinator library \cite{brady2011epic}. In this section, I will give a brief introduction to programming in \Idris{}, covering the most important features. A full tutorial is available elsewhere \cite{idristutorial}. \subsection{Preliminaries} \Idris{} defines several primitive types: \tTC{Int}, \tTC{Integer} and \tTC{Float} for numeric operations, \tTC{Char} and \tTC{String} for text manipulation, and \tTC{Ptr} which represents foreign pointers. There are also several data types declared in the library, including \tTC{Bool}, with values \tDC{True} and \tDC{False}. All of the usual arithmetic and comparison operators are defined for the primitive types, and are overloaded using type classes. An \Idris{} program consists of a module declaration, followed by an optional list of imports and a collection of definitions and declarations, for example: \begin{SaveVerbatim}{constprims} module main x : Int x = 42 main : IO () main = putStrLn ("The answer is " ++ show x) \end{SaveVerbatim} \useverb{constprims} \noindent Like Haskell, the main function is called \texttt{main}, and input and output is managed with an \texttt{IO} monad. Unlike Haskell, however, a single colon indicates \texttt{:} a type declaration, emphasising the importance of types, and \remph{all} functions must have a top level type declaration. This is due to type inference being, in general, undecidable for languages with dependent types. A module declaration also opens a \remph{namespace}. The fully qualified names declared in this module are \texttt{main.x} and \texttt{main.main}. \subsection{Types and Functions} Data types are declared in a similar way to Haskell data types, with a similar syntax. Natural numbers and lists, for example, are declared as follows in the library: \begin{SaveVerbatim}{natlist} data Nat = O | S Nat -- Natural numbers -- (zero and successor) data List a = Nil | (::) a (List a) -- Polymorphic lists \end{SaveVerbatim} \useverb{natlist} \noindent The above declarations are taken from the standard library. Unary natural numbers can be either zero, or the successor of another natural number (\texttt{S k}). Lists can either be empty (\texttt{Nil}) or a value added to the front of another list (\texttt{x :: xs}). In the declaration for \tTC{List}, we used an infix operator \tDC{::}. New operators such as this can be added using a fixity declaration, as follows: \begin{SaveVerbatim}{infixcons} infixr 10 :: \end{SaveVerbatim} \useverb{infixcons} \noindent This declares that \texttt{::} is a right associative operator (\texttt{infixr}) with a precedence level of 10. Functions, data constructors and type constructors may all be given infix operators as names. They may be used in prefix form if enclosed in brackets, e.g. \tDC{(::)}. \subsection{Functions} Functions are implemented by pattern matching, again using a similar syntax to Haskell. Some natural number arithmetic functions can be defined as follows, again taken from the standard library: \begin{SaveVerbatim}{natfns} -- Unary addition plus : Nat -> Nat -> Nat plus O y = y plus (S k) y = S (plus k y) -- Unary multiplication mult : Nat -> Nat -> Nat mult O y = O mult (S k) y = plus y (mult k y) \end{SaveVerbatim} \useverb{natfns} \noindent The standard arithmetic operators \texttt{+} and \texttt{*} are also overloaded for use by \texttt{Nat}, and are implemented using the above functions. Unlike Haskell, there is no restriction on whether types and function names must begin with a capital letter or not. %Function %names (\tFN{plus} and \tFN{mult} above), data constructors (\tDC{O}, \tDC{S}, %\tDC{Nil} and \tDC{::}) and type constructors (\tTC{Nat} and \tTC{List}) are %all part of the same namespace. \Idris{} has an interactive prompt, at which we can test these functions: \begin{SaveVerbatim}{fntest} Idris> plus (S (S O)) (S (S O)) S (S (S (S O))) : Nat Idris> mult (S (S (S O))) (plus (S (S O)) (S (S O))) S (S (S (S (S (S (S (S (S (S (S (S O))))))))))) : Nat \end{SaveVerbatim} \useverb{fntest} \noindent Like arithmetic operations, integer literals are also overloaded using type classes, meaning that we can also test the functions as follows: \begin{SaveVerbatim}{fntest} Idris> plus 2 2 S (S (S (S O))) : Nat Idris> mult 3 (plus 2 2) S (S (S (S (S (S (S (S (S (S (S (S O))))))))))) : Nat \end{SaveVerbatim} \useverb{fntest} \subsubsection{\texttt{where} clauses} Functions can also be defined \emph{locally} using \texttt{where} clauses. For example, to define a function which reverses a list, we can use an auxiliary function which accumulates the new, reversed list, and which does not need to be visible globally: \begin{SaveVerbatim}{revwhere} reverse : List a -> List a reverse xs = revAcc [] xs where revAcc : List a -> List a -> List a revAcc acc [] = acc revAcc acc (x :: xs) = revAcc (x :: acc) xs \end{SaveVerbatim} \useverb{revwhere} \noindent Indentation is significant --- functions in the \texttt{where} block must be indented further than the outer function. \textbf{Scope:} Any names which are visible in the outer scope are also visible in the \texttt{where} clause (unless they have been redefined, such as \texttt{xs} here). \emph{However}, names which appear in the type are \emph{not} automatically in scope. In particular, in the above example, the \texttt{a} in the top level type and the \texttt{a} in the auxiliary definition \texttt{revAcc} are \emph{not} the same. If this is the required behaviour, the \texttt{a} can be brought into scope as follows: \begin{SaveVerbatim}{revwhereb} reverse : List a -> List a reverse {a} xs = revAcc [] xs where revAcc : List a -> List a -> List a ... \end{SaveVerbatim} \useverb{revwhereb} \subsubsection{Dependent Types} A standard example of a dependent type is the type of lists with length'', conventionally called vectors'' in the dependently typed programming literature. In \Idris{}, vectors are declared as follows: \begin{SaveVerbatim}{vect} data Vect : Set -> Nat -> Set where Nil : Vect a O (::) : a -> Vect a k -> Vect a (S k) \end{SaveVerbatim} \useverb{vect} \noindent Note that this uses the same constructor names as for \tTC{List}. Ad-hoc name overloading such as this is accepted by \Idris{}, provided that the names are declared in different namespaces (in practice, normally in different modules). Ambiguous constructor names are resolved by type. This declares a family of types, which requires a different form of declaration from the simple type declarations above. It resembles a Haskell GADT declaration: it explicitly states the type of the type constructor \tTC{Vect} --- it takes a type and a \tTC{Nat} as an argument, where \tTC{Set} stands for the type of types. We say that \tTC{Vect} is \emph{parameterised} by a type, and \emph{indexed} over \tTC{Nat}. Each constructor targets a different part of the family of types. \tDC{Nil} can only be used to construct vectors with zero length, and \tDC{::} to construct vectors with non-zero length. The type of \tDC{::} states explicitly that an element of type \texttt{a} and a tail of type \texttt{Vect a k} (i.e., a vector of length \texttt{k}) combine to make a vector of length \texttt{S k}. Functions on dependent types such as \tTC{Vect} are declared in the same way as on simple types such as \tTC{List} and \tTC{Nat} above, by pattern matching. The type of a function over \tTC{Vect} will describe what happens to the lengths of the vectors involved. For example, \tFN{++}, defined in the library, appends two \tTC{Vect}s: \begin{SaveVerbatim}{vapp} (++) : Vect A n -> Vect A m -> Vect A (n + m) (++) Nil ys = ys (++) (x :: xs) ys = x :: xs ++ ys \end{SaveVerbatim} \useverb{vapp} \subsubsection{The Finite Sets} Finite sets, as the name suggests, are sets with a finite number of elements. They are declared as follows in the library: \begin{SaveVerbatim}{findecl} data Fin : Nat -> Set where fO : Fin (S k) fS : Fin k -> Fin (S k) \end{SaveVerbatim} \useverb{findecl} \noindent This declares \tDC{fO} as the zeroth element of a finite set with \texttt{S k} elements; \texttt{fS n} as the \texttt{n+1}th element of a finite set with \texttt{S k} elements. \tTC{Fin} is indexed by a \tTC{Nat}, which represents the number of elements in the set. Neither constructor targets \texttt{Fin O}, because we cannot construct an element of an empty set. A useful application of the \tTC{Fin} family is to represent bounded natural numbers. Since the first \tTC{n} natural numbers form a finite set of \tTC{n} elements, we can treat \tTC{Fin n} as the set of natural numbers bounded by \tTC{n}. For example, the following function which looks up an element in a \tTC{Vect}, by a bounded index given as a \tTC{Fin n}, is defined in the library: \begin{SaveVerbatim}{vindex} index : Fin n -> Vect a n -> a index fO (x :: xs) = x index (fS k) (x :: xs) = index k xs \end{SaveVerbatim} \useverb{vindex} \noindent This function looks up a value at a given location in a vector. The location is bounded by the length of the vector (\texttt{n} in each case), so there is no need for a run-time bounds check. The type checker guarantees that the location is no larger than the length of the vector. Note also that there is no case for \texttt{Nil} here. It would be impossible to apply such a case --- since there is no element of \texttt{Fin O}, and the location is a \texttt{Fin n}, then \texttt{n} can not be \tDC{O}. As a result, attempting to look up an element in an empty vector would give a compile time type error, since it would force \texttt{n} to be \tDC{O}. \subsubsection{Implicit Arguments} Let us take a closer look at the type of \texttt{index}: \begin{SaveVerbatim}{vindexty} index : Fin n -> Vect a n -> a \end{SaveVerbatim} \useverb{vindexty} \noindent It takes two arguments, an element of the finite set of \texttt{n} elements, and a vector with \texttt{n} elements of type \texttt{a}. But there are also two names, \texttt{n} and \texttt{a}, which are not declared explicitly. These are \emph{implicit} arguments to \texttt{index}. The type of \texttt{index} could also be written as: \begin{SaveVerbatim}{vindeximppl} index : {a:_} -> {n:_} -> Fin n -> Vect a n -> a \end{SaveVerbatim} \useverb{vindeximppl} \noindent This gives bindings for \texttt{a} and \texttt{n} with placeholders for their types, to be inferred by the machine. These types could also be given explicitly: \begin{SaveVerbatim}{vindeximpty} index : {a:Set} -> {n:Nat} -> Fin n -> Vect a n -> a \end{SaveVerbatim} \useverb{vindeximpty} \noindent Implicit arguments, given in braces \texttt{\{\}} in the type declaration, are not given in applications of \texttt{index}; their values can be inferred from the types of the \texttt{Fin n} and \texttt{Vect a n} arguments. Any name which appears as a parameter or index in a type declaration, but which is otherwise free, will be automatically bound as an implicit argument. Indeed, binding arguments in this way is the essence of dependent types. Implicit arguments can still be given explicitly in applications, using the syntax \texttt{\{a=value\}} and \texttt{\{n=value\}}, for example: \begin{SaveVerbatim}{vindexexp} index {a=Int} {n=2} fO (2 :: 3 :: Nil) \end{SaveVerbatim} \useverb{vindexexp} \noindent In fact, any argument, implicit or explicit, may be given a name. For example, the type of \texttt{index} could be declared as: \begin{SaveVerbatim}{vindexn} index : (i:Fin n) -> (xs:Vect a n) -> a \end{SaveVerbatim} \useverb{vindexn} \noindent This can be useful for improving the readability of type signatures, particularly where the name suggests the argument's purpose. \subsection{Type Classes} \Idris{} supports overloading in two ways. Firstly, as we have already seen with the constructors of \texttt{List} and \texttt{Vect}, names can be overloaded in an ad-hoc manner and resolved according to the context in which they are used. This is mostly for convenience, to eliminate the need to decorate constructor names in similarly structured data types, and eliminate explicit qualification of ambiguous names where only one is well-typed --- this is especially useful for disambiguating record field names\footnote{Records are however beyond the scope of this paper}. Secondly, \Idris{} implements \remph{type classes}, following Haskell. This allows a more principled approach to overloading --- a type class gives a collection of overloaded operations which describe the interface for \remph{instances} of that class. A simple example is the \texttt{Show} type class, which is defined in the library and provides an interface for converting values to \texttt{String}s: \begin{SaveVerbatim}{showclass} class Show a where show : a -> String \end{SaveVerbatim} \useverb{showclass} \noindent This generates a function of the following type (which we call a \emph{method} of the \texttt{Show} class): \begin{SaveVerbatim}{showty} show : Show a => a -> String \end{SaveVerbatim} \useverb{showty} An instance of a class is defined with an \texttt{instance} declaration, which provides implementations of the function for a specific type. For example, the \texttt{Show} instance for \texttt{Nat} could be defined as: \begin{SaveVerbatim}{shownat} instance Show Nat where show O = "O" show (S k) = "s" ++ show k \end{SaveVerbatim} \useverb{shownat} \begin{SaveVerbatim}{shownati} Idris> show (S (S (S O))) "sssO" : String \end{SaveVerbatim} \useverb{shownati} \noindent Only one instance of a class can be given for a type --- instances may not overlap. Instance declarations can themselves have constraints. For example, to define a \texttt{Show} instance for vectors, we need to know that there is a \texttt{Show} instance for the element type, because we are going to use it to convert each element to a \texttt{String}: \begin{SaveVerbatim}{showvec} instance Show a => Show (Vect a n) where show xs = "[" ++ show' xs ++ "]" where show' : Vect a n' -> String show' Nil = "" show' (x :: Nil) = show x show' (x :: xs) = show x ++ ", " ++ show' xs \end{SaveVerbatim} \useverb{showvec} \noindent \textbf{Remark: } The type of the auxiliary function \texttt{show'} is important. The type variables \texttt{a} and \texttt{n} which are part of the instance declaration for \texttt{Show (Vect a n)} are fixed across the entire instance declaration. As a result, \texttt{a} need not be constrained again. Furthermore, it means that if \texttt{n} is used again in the type, it refers to the (fixed) length of the outermost list \texttt{xs}. Therefore, there is a different name for the length \texttt{n'} in \texttt{show'}. Like Haskell type classes, default definitions can be given in the class declaration. Otherwise, all methods must be given in an instance. For example, there is an \texttt{Eq} class: \begin{SaveVerbatim}{eqdefault} class Eq a where (==) : a -> a -> Bool (/=) : a -> a -> Bool x /= y = not (x == y) y == y = not (x /= y) \end{SaveVerbatim} \useverb{eqdefault} \noindent Classes can also be extended. A logical next step from an equality relation \texttt{Eq} is to define an ordering relation \texttt{Ord}. We can define an \texttt{Ord} class which inherits methods from \texttt{Eq} as well as defining some of its own: \begin{SaveVerbatim}{ord} data Ordering = LT | EQ | GT \end{SaveVerbatim} \useverb{ord} \begin{SaveVerbatim}{eqord} class Eq a => Ord a where compare : a -> a -> Ordering (<) : a -> a -> Bool -- etc \end{SaveVerbatim} \useverb{eqord} \subsection{Matching on intermediate values} \subsubsection{\texttt{let} bindings} Intermediate values can be calculated using \texttt{let} bindings: \begin{SaveVerbatim}{letb} mirror : List a -> List a mirror xs = let xs' = rev xs in app xs xs' \end{SaveVerbatim} \useverb{letb} \noindent Pattern matching is also supported in \texttt{let} bindings. For example, extracting fields from a record can be achieved as follows, as well as by pattern matching at the top level: \begin{SaveVerbatim}{letp} data Person = MkPerson String Int showPerson : Person -> String showPerson p = let MkPerson name age = p in name ++ " is " ++ show age ++ " years old" \end{SaveVerbatim} \useverb{letp} \subsubsection{\texttt{case} expressions} Another way of inspecting intermediate values of \emph{simple} types is to use a \texttt{case} expression. For example, \texttt{list\_lookup} looks up an index in a list, returning \texttt{Nothing} if the index is out of bounds. This can be used this to write \texttt{lookup\_default}, which looks up an index and returns a default value if the index is out of bounds: \begin{SaveVerbatim}{listlookup} lookup_default : Nat -> List a -> a -> a lookup_default i xs def = case list_lookup i xs of Nothing => def Just x => x \end{SaveVerbatim} \useverb{listlookup} The \texttt{case} construct is intended for simple analysis of intermediate expressions to avoid the need to write auxiliary functions, and is also used internally to implement pattern matching \texttt{let} and lambda bindings. It will \emph{only} work if: \begin{itemize} \item Each branch \emph{matches} a value of the same type, and \emph{returns} a value of the same type. \item The type of the result is known''. i.e. the type of the expression can be determined \emph{without} type checking the \texttt{case}-expression itself. \end{itemize} \subsubsection{The \texttt{with} rule} Since types can depend on values, the form of some arguments can be determined by the value of others. For example, if we were to write down the implicit length arguments to \texttt{(++)}, we would see that the form of the length argument was determined by whether the vector was empty or not: \begin{SaveVerbatim}{appdep} (++) : Vect a n -> Vect a m -> Vect a (n + m) (++) {n=O} [] [] = [] (++) {n=S k} (x :: xs) ys = x :: xs ++ ys \end{SaveVerbatim} \useverb{appdep} \noindent If \texttt{n} was a successor in the \texttt{[]} case, or zero in the \texttt{::} case, the definition would not be well typed. Often, matching is required on the result of an intermediate computation with a dependent type. \Idris{} provides a construct for this, the \texttt{with} rule, inspired by views in \Epigram~\cite{McBride2004a}, which takes account of the fact that matching on a value in a dependently typed language can affect what is known about the forms of other values. For example, a \texttt{Nat} is either even or odd. If it is even it will be the sum of two equal \texttt{Nat}s. Otherwise, it is the sum of two equal \texttt{Nat}s plus one: \begin{SaveVerbatim}{parity} data Parity : Nat -> Set where even : Parity (n + n) odd : Parity (S (n + n)) \end{SaveVerbatim} \useverb{parity} \noindent We say \texttt{Parity} is a \emph{view} of \texttt{Nat}. It has a \emph{covering function} which tests whether it is even or odd and constructs the predicate accordingly. \begin{SaveVerbatim}{parityty} parity : (n:Nat) -> Parity n \end{SaveVerbatim} \useverb{parityty} \noindent Using this, a function which converts a natural number to a list of binary digits (least significant first) is written as follows, using the \texttt{with} rule: \begin{SaveVerbatim}{natToBin} natToBin : Nat -> List Bool natToBin O = Nil natToBin k with (parity k) natToBin (j + j) | even = False :: natToBin j natToBin (S (j + j)) | odd = True :: natToBin j \end{SaveVerbatim} \useverb{natToBin} \noindent The value of the result of \texttt{parity k} affects the form of \texttt{k}, because the result of \texttt{parity k} depends on \texttt{k}. So, as well as the patterns for the result of the intermediate computation (\texttt{even} and \texttt{odd}) right of the \texttt{$\mid$}, the definition also expresses how the results affect the other patterns left of the $\mid$. Note that there is a function in the patterns (\texttt{+}) and repeated occurrences of \texttt{j} --- this is allowed because another argument has determined the form of these patterns.
Something went wrong with that request. Please try again.