# edwinb/Idris-dev forked from idris-lang/Idris-dev

### Subversion checkout URL

You can clone with
or
.
Fetching contributors…
Cannot retrieve contributors at this time
415 lines (290 sloc) 11.7 KB
 \section{Example: The Well-Typed Interpreter} \label{sect:interp} In this section, we'll use the features we've seen so far to write a larger example, an interpreter for a simple functional programming language, with variables, function application, binary operators and an \texttt{if...then...else} construct. We will use the dependent type system to ensure that any programs which can be represented are well-typed. First, let us define the types in the language. We have integers, booleans, and functions, represented by \texttt{Ty}: \begin{SaveVerbatim}{tydecl} data Ty = TyInt | TyBool | TyFun Ty Ty \end{SaveVerbatim} \useverb{tydecl} \noindent We can write a function to translate these representations to a concrete \Idris{} type --- remember that types are first class, so can be calculated just like any other value: \begin{SaveVerbatim}{interpty} interpTy : Ty -> Set interpTy TyInt = Int interpTy TyBool = Bool interpTy (TyFun A T) = interpTy A -> interpTy T \end{SaveVerbatim} \useverb{interpty} \noindent We're going to define a representation of our language in such a way that only well-typed programs can be represented. We'll index the representations of expressions by their type and the types of local variables (the context), which we'll be using regularly as an implicit argument, so we define everything in a \texttt{using} block: \begin{SaveVerbatim}{usingg} using (G:Vect Ty n) \end{SaveVerbatim} \useverb{usingg} \noindent The full representation of expressions is given in Figure \ref{exprty}. They are indexed by the types of the local variables, and the type of the expression itself: \begin{SaveVerbatim}{exprdecl} data Expr : Vect Ty n -> Ty -> Set \end{SaveVerbatim} \useverb{exprdecl} \noindent Since expressions are indexed by their type, we can read the typing rules of the language from the definitions of the constructors. Let us look at each constructor in turn. \begin{SaveVerbatim}{exprty} data HasType : (i : Fin n) -> Vect Ty n -> Ty -> Set where stop : HasType fO (t :: G) t pop : HasType k G t -> HasType (fS k) (u :: G) t data Expr : Vect Ty n -> Ty -> Set where Var : HasType i G t -> Expr G t Val : (x : Int) -> Expr G TyInt Lam : Expr (a :: G) t -> Expr G (TyFun a t) App : Expr G (TyFun a t) -> Expr G a -> Expr G t Op : (interpTy a -> interpTy b -> interpTy c) -> Expr G a -> Expr G b -> Expr G c If : Expr G TyBool -> Expr G a -> Expr G a -> Expr G a \end{SaveVerbatim} \codefig{exprty}{Expression representation} \noindent We use a nameless representation for variables --- they are \emph{de Bruijn indexed}. Variables are represented by a proof of their membership in the context, \texttt{HasType i G T}, which is a proof that variable \texttt{i} in context \texttt{G} has type \texttt{T}. This is defined as follows: \begin{SaveVerbatim}{hastype} data HasType : (i : Fin n) -> Vect Ty n -> Ty -> Set where stop : HasType fO (t :: G) t pop : HasType k G t -> HasType (fS k) (u :: G) t \end{SaveVerbatim} \useverb{hastype} \noindent We can treat \remph{stop} as a proof that the most recently defined variable is well-typed, and \remph{pop n} as a proof that, if the \texttt{n}th most recently defined variable is well-typed, so is the \texttt{n+1}th. In practice, this means we use \texttt{pop} to refer to the most recently defined variable, \texttt{pop stop} to refer to the next, and so on, via the \texttt{Var} constructor: \begin{SaveVerbatim}{varcon} Var : HasType i G t -> Expr G t \end{SaveVerbatim} \useverb{varcon} \noindent So, in an expression \texttt{$\backslash$x. $\backslash$y. x y}, the variable \texttt{x} would have a de Bruijn index of 1, represented as \texttt{pop stop}, and \texttt{y 0}, represented as \texttt{stop}. We find these by counting the number of lambdas between the definition and the use. \noindent A value carries a concrete representation of an integer: \begin{SaveVerbatim}{valcon} Val : (x : Int) -> Expr G TyInt \end{SaveVerbatim} \useverb{valcon} \noindent A lambda creates a function. In the scope of a function of type \texttt{a -> t}, there is a new local variable of type \texttt{a}, which is expressed by the context index: \begin{SaveVerbatim}{lamcon} Lam : Expr (a :: G) t -> Expr G (TyFun a t) \end{SaveVerbatim} \useverb{lamcon} \noindent Function application produces a value of type \texttt{t} given a function from \texttt{a} to \texttt{t} and a value of type \texttt{a}: \begin{SaveVerbatim}{appcon} App : Expr G (TyFun a t) -> Expr G a -> Expr G t \end{SaveVerbatim} \useverb{appcon} \noindent We allow arbitrary binary operators, where the type of the operator informs what the types of the arguments must be: \begin{SaveVerbatim}{opcon} Op : (interpTy a -> interpTy b -> interpTy c) -> Expr G a -> Expr G b -> Expr G c \end{SaveVerbatim} \useverb{opcon} \noindent Finally, if expressions make a choice given a boolean. Each branch must have the same type: \begin{SaveVerbatim}{ifcon} If : Expr G TyBool -> Expr G a -> Expr G a -> Expr G a \end{SaveVerbatim} \useverb{ifcon} \noindent When we evaluate an \texttt{Expr}, we'll need to know the values in scope, as well as their types. \texttt{Env} is an environment, indexed over the types in scope. Since an environment is just another form of list, albeit with a strongly specified connection to the vector of local variable types, we use the usual \texttt{::} and \texttt{Nil} constructors so that we can use the usual list syntax. Given a proof that a variable is defined in the context, we can then produce a value from the environment: \begin{SaveVerbatim}{envtype} data Env : Vect Ty n -> Set where Nil : Env Nil (::) : interpTy a -> Env G -> Env (a :: G) lookup : HasType i G t -> Env G -> interpTy t lookup stop (x :: xs) = x lookup (pop k) (x :: xs) = lookup k xs \end{SaveVerbatim} \useverb{envtype} \begin{SaveVerbatim}{interpdef} interp : Env G -> Expr G t -> interpTy t interp env (Var i) = lookup i env interp env (Val x) = x interp env (Lam sc) = \x => interp (x :: env) sc interp env (App f s) = interp env f (interp env s) interp env (Op op x y) = op (interp env x) (interp env y) interp env (If x t e) = if interp env x then interp env t else interp env e \end{SaveVerbatim} \codefig{interpdef}{Intepreter definition} \noindent Given this, an interpreter (Figure \ref{interpdef}) is a function which translates an \texttt{Expr} into a concrete \Idris{} value with respect to a specific environment: \begin{SaveVerbatim}{interpty} interp : Env G -> Expr G t -> interpTy t \end{SaveVerbatim} \useverb{interpty} \noindent To translate a variable, we simply look it up in the environment: \begin{SaveVerbatim}{varinterp} interp env (Var i) = lookup i env \end{SaveVerbatim} \useverb{varinterp} \noindent To translate a value, we just return the concrete representation of the value: \begin{SaveVerbatim}{valinterp} interp env (Val x) = x \end{SaveVerbatim} \useverb{valinterp} \noindent Lambdas are more interesting. In this case, we construct a function which interprets the scope of the lambda with a new value in the environment. So, a function in the object language is translated to an \Idris{} function: \begin{SaveVerbatim}{laminterp} interp env (Lam sc) = \x => interp (x :: env) sc \end{SaveVerbatim} \useverb{laminterp} \noindent For an application, we interpret the function and its argument and apply it directly. We know that interpreting \texttt{f} must produce a function, because of its type: \begin{SaveVerbatim}{appinterp} interp env (App f s) = interp env f (interp env s) \end{SaveVerbatim} \useverb{appinterp} \noindent Operators and interpreters are, again, direct translations into the equivalent \Idris{} constructs. For operators, we apply the function to its operands directly, and for \texttt{If}, we apply the \Idris{} \texttt{if...then...else} construct directly. \begin{SaveVerbatim}{opinterp} interp env (Op op x y) = op (interp env x) (interp env y) interp env (If x t e) = if interp env x then interp env t else interp env e \end{SaveVerbatim} \useverb{opinterp} \noindent We can make some simple test functions. Firstly, adding two inputs \texttt{$\backslash$x. $\backslash$y. y + x} is written as follows: \begin{SaveVerbatim}{addtest} add : Expr G (TyFun TyInt (TyFun TyInt TyInt)) add = Lam (Lam (Op (+) (Var stop) (Var (pop stop)))) \end{SaveVerbatim} \useverb{addtest} \noindent More interestingly, we can write a factorial function. First, we write a \emph{lazy} version of the \texttt{App} constructor, so that the recursive branch will only be evaluated if necessary: \begin{SaveVerbatim}{lazyapp} app : |(f : Expr G (TyFun a t)) -> Expr G a -> Expr G t app = \f, a => App f a \end{SaveVerbatim} \useverb{lazyapp} \noindent The vertical bar annotation before the \texttt{f} argument indicates that this argument should be evaluated lazily when \texttt{app} is applied --- that is, the argument will be evaluated only when needed. Then \texttt{fact} (i.e. \texttt{$\backslash$x. if (x == 0) then 1 else (fact (x-1) * x)}) is written as follows: \begin{SaveVerbatim}{facttest} fact : Expr G (TyFun TyInt TyInt) fact = Lam (If (Op (==) (Var stop) (Val 0)) (Val 1) (Op (*) (app fact (Op (-) (Var stop) (Val 1))) (Var stop))) \end{SaveVerbatim} \useverb{facttest} To finish, we write a \texttt{main} program which interprets the factorial function on user input: \begin{SaveVerbatim}{factmain} main : IO () main = do putStr "Enter a number: " x <- getLine print (interp [] fact (cast x)) \end{SaveVerbatim} \useverb{factmain} \noindent Here, \texttt{cast} is an overloaded function which converts a value from one type to another if possible. Here, it converts a string to an integer, giving 0 if the input is invalid. An example run of this program at the \Idris{} interactive environment is shown in Figure \ref{factrun}. \begin{SaveVerbatim}{factrun} \$ idris interp.idr ____ __ _ / _/___/ /____(_)____ / // __ / ___/ / ___/ Version 0.9.2 _/ // /_/ / / / (__ ) http://www.idris-lang.org/ /___/\__,_/_/ /_/____/ Type :? for help Type checking ./interp.idr *interp> :exec Enter a number: 6 720 *interp> \end{SaveVerbatim} \codefig{factrun}{Running the well-typed interpreter} \subsubsection*{Aside: \texttt{cast}} The prelude defines a type class \texttt{Cast} which allows conversion between types: \begin{SaveVerbatim}{castclass} class Cast from to where cast : from -> to \end{SaveVerbatim} \useverb{castclass} \noindent It is a \emph{multi-parameter} type class, defining the source type and object type of the cast. It must be possible for the type checker to infer \remph{both} parameters at the point where the cast is applied. There are casts defined between all of the primitive types, as far as they make sense. \subsection{Unit testing} Recall the \texttt{so} data type: \useverb{sotype} \noindent This requires its parameter to be a \texttt{True} value. One simple application of this is to perform unit testing of functions at compile time. For example, we know that \texttt{4! = 24} so we can set up a test case as follows: \begin{SaveVerbatim}{unitTestFac} unitTestFac : so (interp [] fact 4 == 24) unitTestFac = oh \end{SaveVerbatim} \useverb{unitTestFac} \noindent For the program the compile, the test case must pass. If we change \texttt{24} to \texttt{25}, say, we will get an error message like the following: \begin{SaveVerbatim}{unitTestErr} Type checking ./interp.idr interp.idr:64:Can't unify so True with so False Specifically: Can't unify True with False \end{SaveVerbatim} \useverb{unitTestErr}