Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP
Browse files

various updates

  • Loading branch information...
commit a0dc4567aff92fd85744dfae3dee99ab03fff5b4 1 parent 8fa71d9
tnfchris authored
View
406 Docs/Presentation/Content.tex
@@ -1,259 +1,147 @@
-\section{Introduction}
-
-%\frame
-%{
-% \frametitle{Features of the Beamer Class}
-%
-% \begin{itemize}
-% \item<1-> Normal LaTeX class.
-% \item<2-> Easy overlays.
-% \item<3-> No external programs needed.
-% \end{itemize}
-%}
-
-% \begin{frame}[fragile]
-% \frametitle{By example}
-%
-% \begin{block}<+->{Consider the recursive function last:}
-% \begin{lstlisting}
-% last :: [a] -> a
-% last [] = error "Last"
-% last (x:[]) = x
-% last (x:xs) = last xs
-% \end{lstlisting}
-% \end{block}
-%
-% Runtime two checks for the Cons (:) constructor are done.
-% \end{frame}
-
-\subsection{Type Systems}
-\frame{
- \frametitle{Type inferencing}
-
- \begin{quotation}
- "Type inference refers to the ability to deduce automatically the type of an expression in a programming language."
- \end{quotation}
-
- \only<2->{For example the \emph{identity} function |\x->x : alpha -> alpha|}
-}
-
-\frame{
- \frametitle{Hindley-Milner}
-
- \begin{itemize}
- \item<1-> Damas-Milner
- \item<2-> Principle type
- \item<3-> Decidable inferencing
- \end{itemize}
-}
-
-\frame{
- \frametitle{Higher-rank types}
-
- \begin{itemize}
- \item<1-> Haskell '98 types are rank-1
- \item<2-> |a -> b -> a|
- \item<3-> |forall a b. a -> b -> a|
- \item<4-> |forall a. a -> (forall b. b -> a)|
- \item<5-> |forall b. (forall a. a -> a) -> b -> b|
- \end{itemize}
-
- \only<6->{Refers to the the number of |forall|s nested to the left of a |(->)|}
-}
-
-\frame{
- \frametitle{Higher-rank types}
-
- \begin{code}
- poly = \f -> (f 1, f 'c')
- \end{code}
-
- cannot be expressed without Higher-Rank types.
-}
-
-\frame{
- \frametitle{SystemF}
- Provides typing support for higher-rank functions
-
- \begin{block}<+->{Terms}
- \begin{itemize}
- \item Type abstraction ($\Lambda X. t$)
- \item Type application (t [T])
- \end{itemize}
- \end{block}
-
- \begin{block}<+->{Types}
- \begin{itemize}
- \item Type variables (X)
- \item Universal types (|forall X. T|)
- \end{itemize}
- \end{block}
-}
-
-\frame{
- \frametitle{SystemF}
- Typing of the |id| function
- \begin{itemize}
- \item<1-> $id = \lambda x. x$
- \item<2-> $id = \Lambda X. \lambda x : X -> x : X$
- \item<3-> Typing |id 3|
- \item<4-> id [Int] = [X$\rightarrow$Int]($\lambda x:X. x$)
- \end{itemize}
-}
-
-\frame{
- \frametitle{SystemF}
- \begin{itemize}
- \item<1->|poly = \f -> (f 1, f 'c')| is now typeable
- \item<2->Undecidable
- \item<3->Requires annotation
- \item<4->|poly = \(f :: forall a -> Int) -> (f 1, f 'c')|
- \end{itemize}
-}
-
-\frame{
- \frametitle{The problem}
-
- Type systems are specified using typing rules
- \only<2->{
- \begin{prooftree}
- \AxiomC{$x : \sigma \in \Gamma$}
- \LeftLabel{Var:\quad}
- \UnaryInfC{$\Gamma\vdash x : \sigma$}
- \end{prooftree}
- }
- \only<2->{However..}
-}
-
-\frame{
- \frametitle{The problem}
-
- Disconnect between typing rules and implementation
- \begin{itemize}
- \item<1-> Nondeterministic
- \item<2-> Implicit assumptions
- \item<3-> Implementation does not resemble typing rules
- \item<4-> Complexity explodes with size of AST
- \item<5-> A lot of it due to language used
- \end{itemize}
-}
-
-\frame{
- \frametitle{Goal}
-
- Implement type system using attribute grammars
- \begin{itemize}
- \item<1-> Easier to understand
- \item<2-> Easier to prove correct
- \item<3-> Easier to document and scale
- \end{itemize}
-}
-
-\frame{
- \frametitle{Contributions}
-
- \begin{itemize}
- \item<1-> Implementation using attribute grammars
- \item<2-> Implementation \& specification for the HML type system for EH
- \end{itemize}
-}
-
-\section{Attribute Grammars}
-
-\frame{
- \frametitle{Context Free Grammars}
-
- \begin{itemize}
- \item<1-> Can only describe syntax
- \item<2-> Cannot specify any context-sensitive conditions
- \item<2-> $a^nb^nc^n$
- \item<3-> A way to define semantics/meaning
- \end{itemize}
-}
-
-\frame{
- \frametitle{Context Free Grammars}
-
- \input{./BNF-Expr}
-}
-
-\frame{
- \frametitle{Parse Trees}
-
- \begin{figure}[H]
- \centering
- \begin{tikzpicture}[>=stealth']
- \node (r0) [draw, circle] {+};
- \node (r1) [below left=1cm of r0, draw, circle] {2}
- edge[<-] (r0.south);
- \node (r2) [below right=1cm of r0, draw, circle] {+}
- edge[<-] (r0.south);
- \node (s0) [below left=1.4cm of r2, draw, circle] {-}
- edge[<-] (r2.south);
- \node (s1) [below left=1cm of s0, draw, circle] {3}
- edge[<-] (s0.south);
- \node (s2) [below right=1cm of s0, draw, circle] {5}
- edge[<-] (s0.south);
- \node (f0) [below right=1.5cm of r2, draw, circle] {-}
- edge[<-] (r2.south);
- \node (f1) [below left=1cm of f0, draw, circle] {6}
- edge[<-] (f0.south);
- \node (f2) [below right=1cm of f0, draw, circle] {1}
- edge[<-] (f0.south);
- \end{tikzpicture}
- \caption{Parse tree example for "2 + (3 - 5) + (6 - 1)"}
- \label{fig.example1.parsetree}
- \end{figure}
-
- \begin{itemize}
- \item<1-> AG's are additions to CFG
- \item<2-> Nodes are a production or a non-terminal
- \end{itemize}
-}
-
-\frame{
- \frametitle{}
-
-}
-
-\subsection{Introduction}
-
-\subsection{UUAGC}
-
-\subsection{Ruler-Core}
-
-\section{HML}
-
-\subsection{Introduction}
-
-\subsection{Types}
-
-\subsection{Invariants}
-
-\subsection{Semantics}
-
-\subsection{Utility functions}
-
-\subsection{Normal Form}
-
-\section{Implementation}
-
-\subsection{Inferencing}
-
-\subsection{Unification}
-
-%\section{Additions}
-%
-%\subsection{Tuples}
-%
-%\subsection{Case expressions}
-
-\section{Result}
-
-\section{Problems and Future work}
-
-\subsection{HML}
-
-\subsection{Ruler-Core}
-
-\section{Conclusion}
+\section{Introduction}
+
+%\frame
+%{
+% \frametitle{Features of the Beamer Class}
+%
+% \begin{itemize}
+% \item<1-> Normal LaTeX class.
+% \item<2-> Easy overlays.
+% \item<3-> No external programs needed.
+% \end{itemize}
+%}
+
+% \begin{frame}[fragile]
+% \frametitle{By example}
+%
+% \begin{block}<+->{Consider the recursive function last:}
+% \begin{lstlisting}
+% last :: [a] -> a
+% last [] = error "Last"
+% last (x:[]) = x
+% last (x:xs) = last xs
+% \end{lstlisting}
+% \end{block}
+%
+% Runtime two checks for the Cons (:) constructor are done.
+% \end{frame}
+
+\subsection{Type Systems}
+
+
+
+
+\frame{
+ \frametitle{The problem}
+
+ Type systems are specified using typing rules
+ \only<2->{
+ \begin{prooftree}
+ \AxiomC{$x : \sigma \in \Gamma$}
+ \LeftLabel{Var:\quad}
+ \UnaryInfC{$\Gamma\vdash x : \sigma$}
+ \end{prooftree}
+ }
+ \only<2->{However..}
+}
+
+\frame{
+ \frametitle{The problem}
+
+ Disconnect between typing rules and implementation
+ \begin{itemize}
+ \item<1-> Nondeterministic
+ \item<2-> Implicit assumptions
+ \item<3-> Implementation does not resemble typing rules
+ \item<4-> Complexity explodes with size of AST
+ \item<5-> A lot of it due to language used
+ \end{itemize}
+}
+
+\frame{
+ \frametitle{Goal}
+
+ Implement type system using attribute grammars
+ \begin{itemize}
+ \item<1-> Easier to understand
+ \item<2-> Easier to prove correct
+ \item<3-> Easier to document and scale
+ \end{itemize}
+}
+
+\frame{
+ \frametitle{Contributions}
+
+ \begin{itemize}
+ \item<1-> Implementation using attribute grammars
+ \item<2-> Implementation \& specification for the HML type system for EH
+ \end{itemize}
+}
+
+\section{Attribute Grammars}
+
+\frame{
+ \frametitle{Context Free Grammars}
+
+ \begin{itemize}
+ \item<1-> Can only describe syntax
+ \item<2-> Cannot specify any context-sensitive conditions
+ \item<2-> $a^nb^nc^n$
+ \item<3-> A way to define semantics/meaning
+ \end{itemize}
+}
+
+\frame{
+ \frametitle{Context Free Grammars}
+
+ \input{./BNF-Expr}
+}
+
+\frame{
+ \frametitle{Parse Trees}
+
+ \begin{figure}[H]
+ \centering
+ \begin{tikzpicture}[>=stealth']
+ \node (r0) [draw, circle] {+};
+ \node (r1) [below left=1cm of r0, draw, circle] {2}
+ edge[<-] (r0.south);
+ \node (r2) [below right=1cm of r0, draw, circle] {+}
+ edge[<-] (r0.south);
+ \node (s0) [below left=1.4cm of r2, draw, circle] {-}
+ edge[<-] (r2.south);
+ \node (s1) [below left=1cm of s0, draw, circle] {3}
+ edge[<-] (s0.south);
+ \node (s2) [below right=1cm of s0, draw, circle] {5}
+ edge[<-] (s0.south);
+ \node (f0) [below right=1.5cm of r2, draw, circle] {-}
+ edge[<-] (r2.south);
+ \node (f1) [below left=1cm of f0, draw, circle] {6}
+ edge[<-] (f0.south);
+ \node (f2) [below right=1cm of f0, draw, circle] {1}
+ edge[<-] (f0.south);
+ \end{tikzpicture}
+ \caption{Parse tree example for "2 + (3 - 5) + (6 - 1)"}
+ \label{fig.example1.parsetree}
+ \end{figure}
+
+ \begin{itemize}
+ \item<1-> AG's are additions to CFG
+ \item<2-> Nodes are a production or a non-terminal
+ \end{itemize}
+}
+
+\frame{
+ \frametitle{}
+
+}
+
+\section{Result}
+
+\section{Problems and Future work}
+
+\subsection{HML}
+
+\subsection{Ruler-Core}
+
+\section{Conclusion}
View
135 Docs/Presentation/Presentation.lhs
@@ -1,67 +1,68 @@
-\documentclass[compress,red,professionalfonts]{beamer}
-
-\mode<presentation>
-
-\usepackage{beamerthemeAntibes}
-\usecolortheme{lily}
-
-\title{Type systems with first class polymorphisms using Attribute Grammars}
-\subtitle{Master thesis defense}
-\author[Tamar]{Tamar Christina \\ \small{T.Christina@@students.uu.nl}}
-\institute{University of Utrecht}
-\date{\today}
-
-\setcounter{tocdepth}{1}
-
-%include lhs2TeX.fmt
-%include polycode.fmt
-%include forall.fmt
-%include greek.fmt
-%include spacing.fmt
-
-\usepackage{fancyvrb,url}
-\usepackage[english]{babel}
-%\usepackage[usenames]{color}
-\usepackage{hyperref}
-\usepackage{color}
-\usepackage{textcomp}
-\usepackage{parskip}
-\usepackage{graphicx}
-\usepackage{bussproofs}
-\usepackage{amssymb}
-\usepackage{amsmath}
-\usepackage{latexsym}
-\usepackage{hyperref}
-\usepackage[xindy, toc]{glossaries}
-\usepackage[shell, miktex, pdf]{dottex}
-\usepackage{pgf}
-\usepackage{tikz}
-\usepackage{savesym}
-\savesymbol{geq}
-\savesymbol{leq}
-\usepackage{mathabx}
-\restoresymbol{TXT}{leq}
-\restoresymbol{TXT}{geq}
-
-\usepackage{bnf, float}
-\usetikzlibrary{arrows,positioning}
-\restylefloat{figure}
-
-%include ..\\thesis\\Shorthands.tex
-
-\begin{document}
-
-\frame{\titlepage}
-
-\section[Outline]{}
-\frame{\tableofcontents}
-
-\AtBeginSection[]{
- \begin{frame}{Outline}
- \tableofcontents[currentsection]
- \end{frame}
-}
-
-%include Content.tex
-
-\end{document}
+\documentclass[compress,red,professionalfonts]{beamer}
+
+\mode<presentation>
+
+\usepackage{beamerthemeAntibes}
+\usecolortheme{lily}
+
+\title{Type systems using next generation Attribute Grammars}
+\subtitle{Master thesis defense}
+\author[Tamar]{Tamar Christina \\ \small{T.Christina@@students.uu.nl}}
+\institute{University of Utrecht}
+\date{\today}
+
+\setcounter{tocdepth}{1}
+
+%include lhs2TeX.fmt
+%include polycode.fmt
+%include forall.fmt
+%include greek.fmt
+%include spacing.fmt
+
+\usepackage{fancyvrb,url}
+\usepackage[english]{babel}
+%\usepackage[usenames]{color}
+\usepackage{hyperref}
+\usepackage{color}
+\usepackage{textcomp}
+\usepackage{parskip}
+\usepackage{graphicx}
+\usepackage{bussproofs}
+\usepackage{amssymb}
+\usepackage{amsmath}
+\usepackage{latexsym}
+\usepackage{hyperref}
+\usepackage[xindy, toc]{glossaries}
+\usepackage[shell, miktex, pdf]{dottex}
+\usepackage{pgf}
+\usepackage{tikz}
+\usepackage{savesym}
+\savesymbol{geq}
+\savesymbol{leq}
+\usepackage{mathabx}
+\restoresymbol{TXT}{leq}
+\restoresymbol{TXT}{geq}
+
+\usepackage{bnf, float}
+\usetikzlibrary{arrows,positioning}
+\restylefloat{figure}
+
+%include ..\\thesis\\Shorthands.tex
+
+\begin{document}
+
+\frame{\titlepage}
+
+\section[Outline]{}
+\frame{\tableofcontents}
+
+\AtBeginSection[]{
+ \begin{frame}{Outline}
+ \tableofcontents[currentsection]
+ \end{frame}
+}
+
+%%include Content.tex
+%include Tutorial.tex
+
+\end{document}
View
904 Docs/Presentation/Tutorial.tex
@@ -0,0 +1,904 @@
+\section{Introduction}
+
+\frame{
+ \frametitle{Current Gen Attribute Grammars}
+
+ \begin{itemize}
+ \item<1-> Computations over a single tree
+ \item<2-> Function over attributes
+ \item<3-> Visits are implicit
+ \item<4-> Same phase for synthesis and evaluation
+ \item<5-> Can describe inference but not unification
+ \end{itemize}
+}
+
+\frame{
+ \frametitle{Next Gen Attribute Grammars}
+
+ \begin{itemize}
+ \item<1-> Allows Computations over multiple trees
+ \item<2-> Based on visit sequences
+ \item<3-> Supports explicit visits
+ \item<4-> Supports multiple visits
+ \item<5-> Separate phases for synthesis and evaluation
+ \end{itemize}
+}
+
+\frame{
+ \frametitle{Why Attribute Grammars}
+
+ Implement type system using attribute grammars
+ \begin{itemize}
+ \item<1-> Easier to understand
+ \item<2-> Easier to prove correct
+ \item<3-> Easier to document and scale
+ \end{itemize}
+}
+
+\section{Ruler-core}
+
+\frame{
+ \frametitle{Ruler-Core}
+
+ \begin{block}<+->{Developed by Arie Middelkoop}
+ \begin{definition}
+ A programming language to manipulate visit sequences
+ \end{definition}
+ \end{block}
+}
+
+\frame{
+ \frametitle{Ruler-Core}
+
+ \begin{block}<+->{visits}
+ \begin{itemize}
+ \item<1-> A (partial) function
+ \item<2-> Takes several inputs (inherited)
+ \item<3-> Produces several outputs (synthesized)
+ \item<4-> A continuation for the next visit.
+ \end{itemize}
+ \end{block}
+}
+
+\frame{
+ \frametitle{A case study}
+
+ Implementing the Hindley-Milner type system
+}
+
+\frame{
+ \frametitle{Hindley-Milner}
+
+ \begin{itemize}
+ \item<1-> Damas-Milner
+ \item<2-> Principle type
+ \item<3-> Build set of constrains
+ \item<4-> Decidable inferencing
+ \end{itemize}
+}
+
+\frame{
+ \frametitle{Datatypes: TyExpr}
+
+ \begin{columns}
+ \begin{column}[c]{0.5\linewidth}
+ \begin{code}
+ data TyExpr
+ con Var
+ nm :: Name
+ con Con
+ nm :: Name
+ con App
+ func : TyExpr
+ arg : TyExpr
+ con Paren
+ tyExpr : TyExpr
+ \end{code}
+ \end{column}
+ \begin{column}[c]{0.5\linewidth}
+ \begin{description}
+ \item[|:|] are terminals
+ \item[|::|] are nonterminals
+ \end{description}
+ \end{column}
+ \end{columns}
+}
+
+\frame{
+ \frametitle{Datatypes: Expr}
+
+ \begin{columns}
+ \begin{column}[t]{0.5\linewidth}
+ \begin{code}
+ data Expr
+ con IConst
+ int :: Int
+ con Var
+ nm :: Name
+ con Con
+ nm :: Name
+ con Lam
+ arg :: Name
+ body : Expr
+ \end{code}
+ \end{column}
+ \begin{column}[t]{0.5\linewidth}
+ \begin{code}
+ con App
+ func : Expr
+ arg : Expr
+ con Let
+ nm :: Name
+ bind : Expr
+ body : Expr
+ con Paren
+ tyExpr : Expr
+ \end{code}
+ \end{column}
+ \end{columns}
+}
+
+\frame{
+ \frametitle{interfaces}
+
+ \begin{columns}
+ \begin{column}[c]{0.4\linewidth}
+ \begin{code}
+ itf <name>
+ {visit <name>
+ {attributes}
+ }
+ \end{code}
+ \end{column}
+ \begin{column}[c]{0.6\linewidth}
+ \begin{itemize}
+ \item<1-> Declare nonterminals
+ \item<2-> Multiple visits
+ \item<3-> Every visit is a co-routine
+ \item<4-> order is not relevant
+ \item<5-> Will be automatically scheduled
+ \end{itemize}
+ \end{column}
+ \end{columns}
+}
+
+\frame{
+ \frametitle{Inference interface}
+
+ \begin{columns}
+ \begin{column}[c]{0.4\linewidth}
+ \begin{code}
+ itf Expr
+ visit infer
+ inh ast :: Expr
+ inh env :: Gamma
+ inh frs :: Int
+ syn frs :: Int
+ syn sub :: Env
+ syn ty :: TyExpr
+ \end{code}
+ \end{column}
+ \begin{column}[c]{0.6\linewidth}
+ \begin{description}
+ \item[inh] inherited attributes
+ \item[syn] synthesized attributes
+ \item[] Attributes can share names
+ \item[ast] reguired attribute
+ \end{description}
+ \end{column}
+ \end{columns}
+}
+
+\frame{
+ \frametitle{Semantics}
+
+ \begin{block}<+->{There are two kinds of visits}
+ \begin{itemize}
+ \item<1->{Declared inside of interfaces
+ \begin{itemize}
+ \item<2-> Declares what to calculate
+ \end{itemize}
+ }
+ \item<3->{Declared inside of semantic functions
+ \begin{itemize}
+ \item<4-> Declares how to calculate
+ \end{itemize}
+ }
+ \end{itemize}
+ \end{block}
+}
+
+\frame{
+ \frametitle{Semantic functions}
+
+ \begin{itemize}
+ \item datasem (datatype semantic functions)
+ \item sem (semantic function for an arbitrary interface)
+ \end{itemize}
+}
+
+\frame{
+ \frametitle{Datasem}
+
+ \begin{code}
+ datasem <nonterminal>
+ {clause <name>
+ ...
+ }
+ \end{code}
+
+ \begin{itemize}
+ \item<1-> Declares semantics for a nonterminal
+ \item<2-> Clauses correspond to constructors
+ \item<3-> Implicit children for nonterminals
+ \end{itemize}
+}
+
+\frame{
+ \frametitle{Clauses}
+
+ \begin{code}
+ clause <ident>
+ { rules }
+ \end{code}
+
+ \begin{itemize}
+ \item<1-> Provide a means of scoping
+ \item<2-> A way to do alternatives
+ \item<3-> Tried sequentially
+ \item<4-> Backtracks if not applicable
+ \item<5-> contain rules (matches, bindings etc)
+ \end{itemize}
+}
+
+\frame{
+ \frametitle{Defining inference}
+
+ \begin{code}
+ datasem Expr monad IO
+ clause IConst
+ lhs.ty = TyExpr_Con "Int"
+ lhs.sub = []
+ lhs.frs = lhs.frs
+ \end{code}
+}
+
+\frame{
+ \frametitle{Defining inference}
+
+ \begin{code}
+ datasem Expr monad IO
+ clause IConst
+ lhs.ty = TyExpr_Con "Int"
+ lhs.sub = []
+ lhs.frs = lhs.frs
+ clause Var
+ lhs.ty = fromMaybe (error ...) (lookup loc.nm lhs.env)
+ lhs.sub = []
+ lhs.frs = lhs.frs
+ \end{code}
+}
+
+\frame{
+ \frametitle{Defining inference}
+
+ \begin{code}
+ datasem Expr monad IO
+ clause IConst
+ lhs.ty = TyExpr_Con "Int"
+ lhs.sub = []
+ lhs.frs = lhs.frs
+ clause Var
+ lhs.ty = fromMaybe (error ...) (lookup loc.nm lhs.env)
+ lhs.sub = []
+ lhs.frs = lhs.frs
+ clause Con
+ lhs.ty = TfromMaybe (error ...) (lookup loc.nm lhs.env)
+ lhs.sub = []
+ lhs.frs = lhs.frs
+ \end{code}
+}
+
+\frame{
+ \frametitle{Defining inference}
+
+ \begin{code}
+ datasem Expr monad IO
+ clause IConst
+ lhs.ty = TyExpr_Con "Int"
+ lhs.sub = []
+ lhs.frs = lhs.frs
+ clause Var
+ lhs.ty = fromMaybe (error ...) (lookup loc.nm lhs.env)
+ lhs.sub = []
+ lhs.frs = lhs.frs
+ clause Con
+ lhs.ty = fromMaybe (error ...) (lookup loc.nm lhs.env)
+ lhs.sub = []
+ lhs.frs = lhs.frs
+ clause Paren
+ tyExpr.frs = lhs.frs
+ lhs.ty = tyExpr.ty
+ lhs.sub = tyExpr.sub
+ \end{code}
+}
+
+\frame{
+ \frametitle{Defining inference}
+
+ \begin{code}
+ datasem Expr monad IO
+ default? frs = last
+ default? sub = last
+ default? env = last
+ clause IConst
+ lhs.ty = TyExpr_Con "Int"
+ lhs.sub = []
+ clause Var
+ lhs.ty = fromMaybe (error ...) (lookup loc.nm lhs.env)
+ lhs.sub = []
+ clause Con
+ lhs.ty = fromMaybe (error ...) (lookup loc.nm lhs.env)
+ lhs.sub = []
+ clause Paren
+ lhs.ty = tyExpr.ty
+ \end{code}
+}
+
+\frame{
+ \frametitle{Default rules}
+
+ \textbf{default[\textit{?}]} \hspace{5pt} \emph{attribute} = \emph{function}
+
+ \begin{block}<+->{defining default values for attributes}
+ \begin{description}
+ \item[] Collect |attribute| values in a list
+ \item[default] throws static exception
+ \item[default?] returns empty list
+ \end{description}
+ \end{block}
+}
+
+\frame{
+ \frametitle{Default rules}
+ \center
+ \begin{tikzpicture}
+ [ nd/.style={circle, minimum size=8mm, very thick, draw=black!50!black!50, top color=white, bottom color=white,font=\footnotesize}
+ , attr/.style={rectangle, minimum size=1mm, node distance=4mm, very thick, draw=black!50, top color=white, bottom color=gray!20,font=\footnotesize}
+ , inh/.style={attr}
+ , inh1/.style={syn, xshift=-1.8mm}
+ , syn/.style={attr}
+ , syn1/.style={syn, xshift=2.5mm}
+ , syn2/.style={syn, xshift=4mm}
+ , syn3/.style={syn, xshift=1mm, yshift=-3mm}
+ , arr/.style={->,dashed, very thick}
+ ]
+
+ \node[nd](n1){}
+ [sibling distance=18mm, level distance=16mm]
+ child {
+ node[nd](n2){}
+ }
+ child {
+ node[nd](n3){$\alpha$}
+ child {
+ node[nd](n4){}
+ }
+ child {
+ node[nd](n5){}
+ }
+ };
+
+ \node[syn1, right of=n1] (n1syn) {10};
+ \node[syn1, right of=n2] (n2syn) {3};
+ \node[syn1, right of=n3] (n3syn) {9};
+ \node[syn1, right of=n4] (n4syn) {6};
+ \node[syn1, right of=n5] (n5syn) {8};
+
+ \node[inh1, left of=n1] (n1inh) {1};
+ \node[inh1, left of=n2] (n2inh) {2};
+ \node[inh1, left of=n3] (n3inh) {4};
+ \node[inh1, left of=n4] (n4inh) {5};
+ \node[inh1, left of=n5] (n5inh) {7};
+
+ \node[above of=n1]{|default? value = (+1).last|};
+
+ \draw[arr] (n1inh) to (n2inh);
+ \draw[arr] (n2syn) to (n1);
+ \draw[arr] (n1) to (n3inh);
+ \draw[arr] (n3inh) to (n4inh);
+ \draw[arr] (n4syn) to (n3);
+ \draw[arr] (n3) to (n5inh);
+ \draw[arr] (n5syn) to (n3syn);
+ \draw[arr] (n3syn) to (n1syn);
+
+ \end{tikzpicture}
+
+ $\alpha = [1,2,3,4]$
+}
+
+\frame{
+ \frametitle{Defining inference}
+
+ \begin{prooftree}
+ \AxiomC{$\Gamma \vdash x : \tau_1 \quad e : \tau_2$}
+ \LeftLabel{Lam:\quad}
+ \UnaryInfC{$\Gamma \vdash \lambda x \rightarrow e : \tau_1 \rightarrow \tau_2$}
+\end{prooftree}
+
+ \begin{code}
+ clause Lam
+ (loc.a, loc.frs) = first TyExpr_Var (fresh lhs.frs)
+ body.env = (loc.arg, loc.a):lhs.env
+ body.frs = loc.frs
+
+ lhs.ty = (appAll body.sub loc.a) `mkArrow` body.ty
+ \end{code}
+}
+
+\frame{
+ \frametitle{Defining inference}
+
+ \begin{prooftree}
+ \AxiomC{$\Gamma \vdash f : \tau_1 \rightarrow \tau_2 \quad x : \tau_1$}
+ \LeftLabel{App:\quad}
+ \UnaryInfC{$\Gamma \vdash f \hspace{3pt} x : \tau_2$}
+\end{prooftree}
+
+ \begin{code}
+ clause App
+ (loc.b, loc.frs) = first TyExpr_Var (fresh lhs.frs)
+
+ func.frs = loc.frs
+ arg.frs = func.frs
+ \end{code}
+}
+
+\frame{
+ \frametitle{Defining inference}
+
+ \begin{prooftree}
+ \AxiomC{$\Gamma \vdash f : \tau_1 \rightarrow \tau_2 \quad x : \tau_1$}
+ \LeftLabel{App:\quad}
+ \UnaryInfC{$\Gamma \vdash f \hspace{3pt} x : \tau_2$}
+\end{prooftree}
+
+ \begin{code}
+ clause App
+ (loc.b, loc.frs) = first TyExpr_Var (fresh lhs.frs)
+
+ func.frs = loc.frs
+ arg.frs = func.frs
+
+ child u : Unify = unify
+ u.frs = arg.frs
+ u.exp1 = func.ty
+ u.exp2 = arg.ty `mkArrow` loc.b
+ \end{code}
+}
+
+\frame{
+ \frametitle{Defining inference}
+
+ \begin{prooftree}
+ \AxiomC{$\Gamma \vdash f : \tau_1 \rightarrow \tau_2 \quad x : \tau_1$}
+ \LeftLabel{App:\quad}
+ \UnaryInfC{$\Gamma \vdash f \hspace{3pt} x : \tau_2$}
+\end{prooftree}
+
+ \begin{code}
+ clause App
+ (loc.b, loc.frs) = first TyExpr_Var (fresh lhs.frs)
+
+ func.frs = loc.frs
+ arg.frs = func.frs
+
+ child u : Unify = unify
+ u.frs = arg.frs
+ u.exp1 = func.ty
+ u.exp2 = arg.ty `mkArrow` loc.b
+
+ lhs.ty = appAll u.sub loc.b
+ lhs.sub = func.sub ++ arg.sub ++ u.sub
+ lhs.frs = u.frs
+ \end{code}
+}
+
+\frame{
+ \frametitle{Child}
+
+ \textbf{child} \hspace{5pt} \emph{name} : Interface \hspace{5pt} [= sem\_name]
+
+ \begin{itemize}
+ \item Declares a new \textbf{child} \emph{name}
+ \item Of nonterminal Interface
+ \item sem\_name by default |id|
+ \end{itemize}
+}
+
+\frame{
+ \frametitle{Defining inference}
+
+ \begin{prooftree}
+ \AxiomC{$\Gamma \vdash e_1 : \tau_1 \quad (\Gamma, x : \tau_1) \vdash e_2 : \tau_2 $}
+ \LeftLabel{Let:\quad}
+ \UnaryInfC{$\Gamma \vdash Let \hspace{3pt} x = e_1 \hspace{3pt} in \hspace{3pt} e_2 : \tau_2$}
+\end{prooftree}
+
+ \begin{code}
+ clause Let
+ body.env = (loc.nm, bind.ty):lhs.env
+ lhs.ty = body.ty
+ \end{code}
+}
+
+\frame{
+ \frametitle{Unification}
+
+ \begin{code}
+ itf Unify
+ visit unify
+ inh exp1 :: TyExpr
+ inh exp2 :: TyExpr
+ inh frs :: Int
+ syn sub :: Env
+ syn frs :: Int
+ \end{code}
+
+ \begin{itemize}
+ \item Unification requires traversal of two trees
+ \item Unify |exp1| with |exp2|
+ \end{itemize}
+}
+
+\frame{
+ \frametitle{Semantic function}
+
+ \begin{code}
+ <name> = sem <internal_name> : <Interface>
+ {visit <name>
+ {clause <name>
+ ...
+ }
+ }
+ \end{code}
+
+ \begin{itemize}
+ \item defines semantic function for arbitrary interface
+ \end{itemize}
+}
+
+\frame{
+ \frametitle{Unification}
+
+ \begin{code}
+ unify = sem unify : Unify monad IO
+ visit unify
+ default? frs = last
+ default? sub = const []
+ clause LeftParen
+ match TyExpr.Paren@exp1 = lhs.exp1
+
+ child u : Unify = unify
+
+ u.exp1 = exp1.tyExpr
+ u.exp2 = lhs.exp2
+ \end{code}
+
+ \begin{itemize}
+ \item Physical parenthesis can be removed
+ \item Identical case for the right parenthesis
+ \end{itemize}
+}
+
+\frame{
+ \frametitle{Match}
+
+ \begin{code}
+ match TypeName.ConstructorName@child = <expression>
+ \end{code}
+
+ \begin{itemize}
+ \item |var| and |lhs| are restricted keywords
+ \item Build in types such as |Bool| are special
+ \item<2-> |match True = ...|
+ \end{itemize}
+}
+
+\frame{
+ \frametitle{Unification}
+
+ \begin{code}
+ clause constructor
+ match TyExpr.Con@exp1 = lhs.exp1
+ match TyExpr.Con@exp2 = lhs.exp2
+
+ lhs.sub = case exp1.nm == exp1.nm of
+ True -> []
+ False -> error ...
+ \end{code}
+
+}
+
+\frame{
+ \frametitle{Unification}
+
+ \begin{code}
+ clause app
+ match TyExpr.App@app1 = lhs.exp1
+ match TyExpr.App@app2 = lhs.exp2
+
+ child u1 : Unify = unify
+
+ u1.exp1 = app1.func
+ u1.exp2 = app2.func
+ \end{code}
+}
+
+\frame{
+ \frametitle{Unification}
+
+ \begin{code}
+ clause app
+ match TyExpr.App@app1 = lhs.exp1
+ match TyExpr.App@app2 = lhs.exp2
+
+ child u1 : Unify = unify
+
+ u1.exp1 = app1.func
+ u1.exp2 = app2.func
+
+ child u2 : Unify = unify
+
+ u2.exp1 = appAll u1.sub app1.arg
+ u2.exp2 = appAll u1.sub app2.arg
+
+ lhs.sub = u1.sub ++ u2.sub
+ \end{code}
+}
+
+\frame{
+ \frametitle{Unification}
+
+ \begin{code}
+ clause vars
+ match TyExpr.Var@var1 = lhs.exp1
+ match TyExpr.Var@var2 = lhs.exp2
+ match True = var1.nm == var2.nm
+ \end{code}
+}
+
+\frame{
+ \frametitle{Unification}
+
+ \begin{code}
+ clause vars
+ match TyExpr.Var@var1 = lhs.exp1
+ match TyExpr.Var@var2 = lhs.exp2
+ match True = var1.nm == var2.nm
+ clause leftvar
+ match TyExpr.Var@var1 = lhs.exp1
+ lhs.sub = [(var1.nm, lhs.exp2)]
+ \end{code}
+}
+
+\frame{
+ \frametitle{Unification}
+
+ \begin{code}
+ clause vars
+ match TyExpr.Var@var1 = lhs.exp1
+ match TyExpr.Var@var2 = lhs.exp2
+ match True = var1.nm == var2.nm
+ clause leftvar
+ match TyExpr.Var@var1 = lhs.exp1
+ lhs.sub = [(var1.nm, lhs.exp2)]
+ clause rightvar
+ match TyExpr.Var@var1 = lhs.exp2
+ lhs.sub = [(var1.nm, lhs.exp1)]
+ \end{code}
+}
+
+\frame{
+ \frametitle{Unification}
+
+ \begin{code}
+ clause vars
+ match TyExpr.Var@var1 = lhs.exp1
+ match TyExpr.Var@var2 = lhs.exp2
+ match True = var1.nm == var2.nm
+ clause leftvar
+ match TyExpr.Var@var1 = lhs.exp1
+ lhs.sub = [(var1.nm, lhs.exp2)]
+ clause rightvar
+ match TyExpr.Var@var1 = lhs.exp2
+ lhs.sub = [(var1.nm, lhs.exp1)]
+ clause rest
+ lhs.sub = error "mismatch in unify."
+ \end{code}
+}
+
+\frame{
+ \frametitle{Invoking function}
+
+ \begin{code}
+ typeCheck :: Expr -> IO TyExpr
+ typeCheck exp = do
+ let inh = Inh_Expr_infer
+ { ast_Inh_Expr = exp
+ , env_Inh_Expr = []
+ , frs_Inh_Expr = 0
+ }
+ syn <- invoke_Expr_infer dnt_Expr inh
+ let x = ty_Syn_Expr syn
+ return (alpha_rename x)
+ \end{code}
+}
+
+\frame{
+ \frametitle{Results}
+
+ \begin{code}
+ *HM> let compose = (Expr_Lam "f" ...)
+ *HM> compose
+ \f -> \g -> \x -> f g x
+ *HM> typeCheck compose
+ (a4 -> a3) -> (a2 -> a4) -> a2 -> a3
+ \end{code}
+}
+
+\section{First class polymorphism}
+\frame{
+ \frametitle{Higher-rank types}
+
+ \begin{itemize}
+ \item<1-> Haskell '98 types are rank-1
+ \item<2-> |a -> b -> a|
+ \item<3-> |forall a b. a -> b -> a|
+ \item<4-> |forall a. a -> (forall b. b -> a)|
+ \item<5-> |forall b. (forall a. a -> a) -> b -> b|
+ \end{itemize}
+
+ \only<6->{Refers to the the number of |forall|s nested to the left of a |(->)|}
+}
+
+\frame{
+ \frametitle{Higher-rank types}
+
+ \begin{code}
+ poly = \f -> (f 1, f 'c')
+ \end{code}
+
+ cannot be expressed without Higher-Rank types.
+}
+
+\frame{
+ \frametitle{SystemF}
+ Provides typing support for higher-rank functions
+
+ \begin{block}<+->{Terms}
+ \begin{itemize}
+ \item Type abstraction ($\Lambda X. t$)
+ \item Type application (t [T])
+ \end{itemize}
+ \end{block}
+
+ \begin{block}<+->{Types}
+ \begin{itemize}
+ \item Type variables (X)
+ \item Universal types (|forall X. T|)
+ \end{itemize}
+ \end{block}
+}
+
+\frame{
+ \frametitle{SystemF}
+ Typing of the |id| function
+ \begin{itemize}
+ \item<1-> $id = \lambda x. x$
+ \item<2-> $id = \Lambda X. \lambda x : X \rightarrow x : X$
+ \item<3-> Typing |id 3|
+ \item<4-> id [Int] = [X$\rightarrow$Int]($\lambda x:X. x$)
+ \end{itemize}
+}
+
+\frame{
+ \frametitle{SystemF}
+ \begin{itemize}
+ \item<1->|poly = \f -> (f 1, f 'c')| is not typeable
+ \item<2->Undecidable
+ \item<3->Requires annotation
+ \item<4->|poly = \(f :: forall a -> Int) -> (f 1, f 'c')|
+ \end{itemize}
+}
+
+\section{HML}
+
+\frame{
+ \frametitle{Introduction}
+
+ \begin{itemize}
+ \item<1-> Extension of Hindley-Milner
+ \item<2-> Supports first class polymorphism
+ \item<3-> Insensitive to order of applications
+ \end{itemize}
+}
+
+\frame{
+ \frametitle{Type scheme}
+
+ \begin{eqnarray*}
+ \varphi & ::= & \forall (\alpha \geq \varphi_1). \varphi_2 \\
+ & || & \sigma \\
+ & || & \bot
+ \end{eqnarray*}
+
+ \begin{description}
+ \item[trivial] $\forall (\alpha \geq Int). \varphi$
+ \item[unconstrained] $\forall (\alpha \geq \bot). \varphi$
+ \end{description}
+}
+
+\frame{
+ \frametitle{Type Prefix}
+
+ \begin{eqnarray*}
+ Q ::= \alpha_1\geq\hat{\varphi}_1,\ldots,\alpha_n\geq\hat{\varphi}_n
+ \end{eqnarray*}
+
+ \begin{itemize}
+ \item<1-> Cannot contain a trivial bounds
+ \item<2-> domain of prefix and substitutions disjoint
+ \item<3-> A type is quantified by prefix or itself
+ \end{itemize}
+}
+
+\frame{
+ \frametitle{Benefits}
+
+ \begin{quotation}
+ \textit{inc} $\hspace{16.5pt}$ :: Int $\rightarrow$ Int\\
+ \indent \textit{single} $\hspace{3pt}$ :: $\forall\alpha$. $\alpha$ $\rightarrow$ List $\alpha$\\
+ \indent \textit{append} :: $\forall\alpha$. List $\alpha$ $\rightarrow$ List $\alpha$ $\rightarrow$ List $\alpha$\\
+ \indent \textit{map} $\hspace{10.2pt}$ :: $\forall\alpha\beta$. ($\alpha$ $\rightarrow$ $\beta$) $\rightarrow$ List $\alpha$ $\rightarrow$ List $\beta$
+ \end{quotation}
+
+ and the expression
+
+ \begin{code}
+ let ids = single id
+ poly = \(f::forall a. a -> a) -> (f 1, f 'a')
+ in (map poly ids, append (single inc) ids)
+ \end{code}
+}
+
+\frame{
+ \frametitle{Benefits}
+
+ \begin{code}
+ let ids = single id
+ poly = \(f::forall a. a -> a) -> (f 1, f 'a')
+ in (map poly ids, append (single inc) ids)
+ \end{code}
+
+ \begin{itemize}
+ \item<1-> Not possible in GHC
+ \item<2-> Requires |ids| to be List (Int $\rightarrow$ Int) and List ($\forall\alpha$. $\alpha$ $\rightarrow$ $\alpha$)
+ \item<3-> Anything inside the bounds are instantiated
+ \item<4-> |forall(beta >= forall alpha. alpha -> alpha) . List beta|
+ \item<5-> "The type |List beta| can be used for any |beta| that is an instance of |forall alpha. alpha -> alpha|"
+ \end{itemize}
+}
+
+\section{Conclusion}
+
+\frame{
+ \frametitle{Conclusions}
+
+ \begin{itemize}
+ \item Ruler-core allows traversals of arbitrary number of trees
+ \item Default rules save a lot of work
+ \item Higher-ordered
+ \item Saved \~26\% of code on case study
+ \end{itemize}
+}
View
12 Docs/Thesis/A-Formal.tex
@@ -6,17 +6,17 @@ \chapter{Specification of algorithms}
\begin{figure}[h!]
< subsume :: (Q, phi, sigma) -> (Q, theta)
-< -- where |phi| and |sigma| are in normal form
+< -- where |sigma| is in normal form and |phi| is converted to normal form.
< --
< subsume (Q, bot, rho) =
< _return (Q, [])
< --
< subsume (Q, forall Q_2 . rho_1, forall (vec alpha) . rho_2) =
-< _assume (dom Q) # (dom Q_2) ^^ -- and |vec c| are fresh _constants
+< _assume (dom Q) # (dom Q_2) ^^ -- and |vec c| are fresh xconstants
< _let (Q_1, theta_1) = unify (Q ^ Q_2, subst (vec alpha) (vec c) rho_1, rho_2)
< _let (Q_3, Q_4) = split (Q_1, dom Q)
< _let theta_2 = theta_1 - dom Q_4
-< _failIf (c _in (_con theta_2 cup (_con Q_3))
+< _failIf (c _in (xcon theta_2 cup (xcon Q_3))
< _return (Q_3, theta_2)
\caption{Subsume specification}
@@ -100,9 +100,9 @@ \chapter{Specification of algorithms}
< _return (Q_3, theta_3 dot theta_2 dot theta_1)
< --
< unify(Q, forall alpha . sigma_1, forall beta . sigma_2) =
-< -- assume |c| is a fresh (skolem) _constant
+< -- assume |c| is a fresh (skolem) xconstant
< _let (Q_1, theta_1) = unify(Q, lbrack alpha := c rbrack sigma_1, lbrack beta := c rbrack sigma_2)
-< _failIf (c _in (_con(theta_1) cup (_con(Q_1)))
+< _failIf (c _in (xcon(theta_1) cup (xcon(Q_1)))
< _return (Q_1, theta_1)
\caption{Unify specification}
@@ -117,7 +117,7 @@ \chapter{Specification of algorithms}
< _return (Q, [], Gam(alpha))
< --
< infer(Q, Gam, c) =
-< _return (Q, [], alpha_conv (Gam(c)))
+< _return (Q, [], alphaxconv (Gam(c)))
< --
< infer(Q, Gam, Let (vec (x = e_1)) in e_2) =
< -- where |vec alpha| are fresh variables and |vec alpha # vec (x = e_1)|
View
1,334 Docs/Thesis/AttributeGrammar.tex
@@ -1,666 +1,668 @@
-\chapter{Attribute Grammars}
-\label{AG}
-\emph{Context Free Grammars}(CFG) can only describe the syntax of programming languages\cite{knuth1}. They cannot specify any context-sensitive conditions. In practice we are interested in more then the well-formedness of a grammar. For example the condition that the same value for \emph{n} be enforced in the string $a^nb^nc^n$ cannot be described by using context free grammars\cite{ken}.
-
-In 1968 by Donald E. Knuth~\cite{knuth1} introduced \ags as a way to define the \emph{meaning} to context free languages. A typical is the defining the value of an expression. The expression grammar is defined by the following \emph{CFG} expressed as \emph{BNF}:
-
-\input{./BNF-Expr}
-
-A symbol is either a terminal or a nonterminal. A terminal is typically a single character or a single token. A nonterminal represents a sequence of symbols, and can be seen as a classification of these symbols.
-The \emph{terminals} in this case are the \emph{operators} $+,-$ and the \emph{digits} $0\ldots 9$. The \emph{nonterminals} are the \emph{Expr} and \emph{number} symbols. The grammar specifies that an expression is either a \emph{number} or an \emph{Expr} followed by an \emph{operator} and then another \emph{Expr}. For every sentence that can be produced by this grammar a parse tree can be assigned. For instance, the expression "2 + (3 - 5) + (6 - 1)" produces the parse tree in figure \ref{fig.example1.parsetree}. We define the value of an expression as a composition of the values of its sub expressions, See figure \ref{fig.example1.parsetree}
-
-\begin{figure}[H]
-\centering
-\begin{tikzpicture}[>=stealth']
-\node (r0) [draw, circle] {+};
-\node (r1) [below left=1cm of r0, draw, circle] {2}
- edge[<-] (r0.south);
-\node (r2) [below right=1cm of r0, draw, circle] {+}
- edge[<-] (r0.south);
-\node (s0) [below left=1.4cm of r2, draw, circle] {-}
- edge[<-] (r2.south);
-\node (s1) [below left=1cm of s0, draw, circle] {3}
- edge[<-] (s0.south);
-\node (s2) [below right=1cm of s0, draw, circle] {5}
- edge[<-] (s0.south);
-\node (f0) [below right=1.5cm of r2, draw, circle] {-}
- edge[<-] (r2.south);
-\node (f1) [below left=1cm of f0, draw, circle] {6}
- edge[<-] (f0.south);
-\node (f2) [below right=1cm of f0, draw, circle] {1}
- edge[<-] (f0.south);
-\end{tikzpicture}
-\caption{Parse tree example for "2 + (3 - 5) + (6 - 1)"}
-\label{fig.example1.parsetree}
-\end{figure}
-
-AG's are additions to CFGs that propagate semantic information along through parse trees. Nodes in the AST correspond to either a production or a nonterminal in the grammar. With this in mind there are two kinds of \emph{attributes} defined by knuth\cite{knuth1}:
-\begin{description}
-\item[\textbf{synthesized}] An attribute that is defined in terms of attributes of the \emph{descendants} of the AST node. They are passed bottom-up through the tree.
-\item[\textbf{inherited}] An attribute that is defined in terms of the \emph{ancestors} of the nonterminal. They are passed top-down through the tree.
-\end{description}
-
-An attribute can be both \emph{synthesized} and \emph{inherited}. These are however different attributes which share the same name. Semantics can be defined for the tree in figure \ref{fig.example1.parsetree} by assigning a synthesized attribute named \emph{value} of type \textbf{Int} to the nonterminals \emph{number} and \emph{Expr}. The evaluation rules are:
-
-\begin{figure}[H]
-\begin{grammar}
- [(colon){$\rightarrow$}]
- [(semicolon)$||$]
- [(comma){}]
- [(period){\\}]
- [(quote){\begin{bf}}{\end{bf}}]
- [(nonterminal){$\langle$}{$\rangle$}]
-value(<$Expr^+$>) : value(<$Expr_1$>) + value(<$Expr_2$>).
-value(<$Expr^-$>) : value(<$Expr_1$>) - value(<$Expr_2$>).
-value(<Expr>) \hspace{5pt} : value(<number>).
-value(<number>): <number>.
-\end{grammar}
-\caption{attribute definition for Expressions}
-\label{semantics:bnf:expr}
-\end{figure}
-
-Subscripts disambiguate between the different expression types and superscripts distinguish between the different cases of the \emph{operator} in an expression. Figure \ref{fig.example2.decoratedtree} shows a tree decorated with the synthesized attribute \emph{v} (short for value) along with intermediate values of \emph{v}.
-
-\begin{figure}[H]
-\centering
-\begin{tikzpicture}[>=stealth']
-\node (r0) [draw, circle] {+};
-\node (ri0) [draw, rectangle, right=0.1cm of r0, gray] {v=5};
-\node (r1) [below left=1.7cm of r0, draw, circle] {2}
- edge[<-] (r0.south);
-\node (ri1) [draw, rectangle, right=0.1cm of r1, gray] {v=3};
-\node (r2) [below right=1.7cm of r0, draw, circle] {+}
- edge[<-] (r0.south);
-\node (ri2) [draw, rectangle, right=0.1cm of r2, gray] {v=-2};
-\node (s0) [below left=2.1cm of r2, draw, circle] {-}
- edge[<-] (r2.south);
-\node (ri3) [draw, rectangle, right=0.1cm of s0, gray] {v=15};
-\node (s1) [below left=1.0cm of s0, draw, circle] {3}
- edge[<-] (s0.south);
-\node (ri4) [draw, rectangle, right=0.1cm of s1, gray] {v=3};
-\node (s2) [below right=1.0cm of s0, draw, circle] {5}
- edge[<-] (s0.south);
-\node (ri5) [draw, rectangle, right=0.1cm of s2, gray] {v=5};
-\node (f0) [below right=2.2cm of r2, draw, circle] {-}
- edge[<-] (r2.south);
-\node (ri6) [draw, rectangle, right=0.1cm of f0, gray] {v=5};
-\node (f1) [below left=1.0cm of f0, draw, circle] {6}
- edge[<-] (f0.south);
-\node (ri7) [draw, rectangle, right=0.1cm of f1, gray] {v=6};
-\node (f2) [below right=1.0cm of f0, draw, circle] {1}
- edge[<-] (f0.south);
-\node (ri8) [draw, rectangle, right=0.1cm of f2, gray] {v=1};
-\end{tikzpicture}
-\caption{decorated tree example for "2 + (3 - 5) + (6 - 1)"}
-\label{fig.example2.decoratedtree}
-\end{figure}
-
-\Ags are akin to \emph{catamorphisms}, except without the need to define the algebra and explicit traversals of the tree.
-
-\section{Utrecht University Attribute Grammar Compiler (UUAGC)}
-UUAGC (Utrecht University Attribute Grammar Compiler) is a preprocessor which parses Attribute Grammars in a custom language.
-The grammar, attributes for nonterminals and rules for attributes per production can be specified in attribute grammar source code. The preprocessor translates the attribute grammar code to datatype declarations, folds and semantic functions, which are functions that assign meaning to the specified grammar. In the UHC Type checker, the UUAGC system is used to do most of the work, with the exception of unification.
-
-\subsection{Limits of UUAGC}
-The \emph{UUAGC} system is limited in that it cannot perform case distinctions over multiple \emph{abstract syntax trees} at the same time\cite{visitag}. For most applications this is not an issue, since in most cases there is only one \emph{AST} that needs to be traversed at a time.
-
-In an attribute grammar, we express a function between inherited and synthesized attributes. The productions form the cases of this function, and we give rules for each case. Thus, we distinguish cases based on the production. For some example, we wish to also distinguish cases based on attribute values. This we cannot express with conventional attribute grammars.
-
-With higher-order AGs, it is possible to define a grammar for the call-tree of such a function. A higher-order attribute can then inspect the attribute's values and dispatch to a suitable production. However, this is a rather primitive mechanism to express case distinctions.
-
-Unification is one such example where we need to distinguish cases based on attribute values. Unification is the act of trying to find structural \& semantic equivalence between two types. It is a critical part of type checking. Given two types \emph{$t_{1}$} and \emph{$t_{2}$}, unification attempts to find a list of substitutions that allows the instantiation of type \emph{$t_{1}$} to type \emph{$t_{2}$}. For this to be accomplished it needs to be possible to traverse both \emph{$t_{1}$} and \emph{$t_{2}$} concurrently; while comparing the types at every node.
-
-%\begin{figure}[!h]
-%\begin{center}
-%\begin{neatopic}[width=.5\textwidth, height=50mm]
-% subgraph type1 {
-% node [];
-% a1 [label="a"];
-% a2 [label="b"];
-% a1 -- a2;
-% label = "Type 1";
-% }
-%
-% subgraph type1 {
-% node [];
-% b1 [label="Int"];
-% b2 [label="Int"];
-% b1 -- b2;
-% label = "Type 2";
-% }
-%
-% {rank=same; a1 b1}
-% {rank=same; a2 b2}
-%
-% a1 -- b1 [style=dotted, label ="a:= Int"];
-% a2 -- b2 [style=dotted, label ="b:= Int"];
-%\end{neatopic}
-%\end{center}
-%\caption{Unification example}
-%\label{unify-simple}
-%\end{figure}
-
-When unifying the type \emph{$a \rightarrow b$} with \emph{$Int \rightarrow Int$} both trees are traversed concurrently while the nodes are compared, and ultimately terminating with the substitution list [(a,Int), (b, Int)]. The ability to be able to traverse \emph{AST}s that were just produced is also required because the structure of the \emph{type} being produced is not known beforehand. During type inference more type information is gradually gained on the type that needs to be produced. This generally presents a problem for AGs\cite{ruler-front} because the synthesis and evaluation phases are two separate phases.
-
-\section{Ruler-Core}
-\Rcore addresses these restrictions in AG by introducing a model based on \emph{visit}\cite{visits} functions. The resulting language is more flexible while still retaining the core semantics of reasoning over decorated trees with attributes. The simplest description of ruler-core would be:\emph{a language to manipulate visit sequences.}
-
-\begin{quotation}
-A \emph{visit function}\cite{visitag} is a (partial) function that takes several inputs (\emph{inherited attributes}) and produces several output values (\emph{synthesized attributes}) and a continuation for the next visit.
-\end{quotation}
-
-As with traditional AGs \emph{inherited}, attributes are passed top-down in the tree while \emph{synthesized} attributes are passed bottom-up. An attribute can be both \emph{synthesized} and \emph{inherited}.
-
-Everything that can be expressed in uuagc can be expressed in ruler-core, but the inverse is not true. One of the simplest things that can be done in ruler-core is the evaluation of a single AST. In order to write an evaluator for the expression type in figure \ref{grammar:bnf:expr} we first need to define the \emph{datatypes} and \emph{interfaces}.
-
-\begin{figure}[H]
-\begin{minipage}[t]{0.4\linewidth}
-\begin{code}
-data Expr
- con Num
- val :: Int
- con Expr
- exp1 : Expr
- op :: Operator
- exp2 : Expr
-\end{code}
-\end{minipage}
-\begin{minipage}[t]{0.6\linewidth}
-\begin{code}
-data Operator
- con Plus
- con Minus
-
-itf Expr
- visit eval
- inh ast :: Expr -- input
- syn v :: Int -- output
-\end{code}
-\end{minipage}
-\caption{Evaluating expressions in ruler-core: datatypes}
-\label{example:tutorial1:datatypes}
-\end{figure}
-
-\subsubsection{Data types}
-\Rcore data types resemble Haskell's record syntax with some notable differences.
-
-Instead of an $=$ or \textbar \space like in Haskell, an explicit \textbf{con} keyword is used to indicate the \emph{name} of the constructor. Every element of the constructor must be explicitly named. Indentation is also important since indentation separates constructors, in general \textbf{con} needs to be deeper indented then \textbf{data} and the members of a constructor should be indented further than the \textbf{con}.
-
-Figure \ref{example:tutorial1:datatypes} illustrates two different ways of declaring a type of constructor argument. $:$ is used to declare a type that is a \emph{nonterminal} and $::$ indicates a \emph{terminal}. The reason for this distinction is that for \emph{nonterminal} types some extra machinery is automatically defined. It is important to know that for every nonterminal \rcore enforces that at least one of the declared visits have an \emph{inherited} attribute named \emph{ast} with the type of the nonterminal.
-
-Only types of kind $:: \star$ are allowed by \rcore, which means only monomorphic types are accepted. As is the case with UUAGC, the constructors generated will be in the form of \emph{TypeName\_ConstructorName}. To put this concretely figure \ref{example:tutorial1:datatypes} exposes for the type \emph{Expr} the constructor functions \textbf{Expr\_Num} and \textbf{Expr\_Expr}.
-
-\subsection{Interfaces}
-\emph{Interfaces} are analogous to the interface definitions in other languages, except instead of declaring function prototypes/signatures we declare visits and their attributes. Interfaces declare \emph{nonterminal}s which can be named the same as their corresponding \emph{data types}. For those familiar with uuagc, a \textbf{ATTR} declaration in uuagc would equal an interface declaration with one visit and all the attributes declared in the \textbf{ATTR} would be part of this one visit. The ability to explicitly declare these visits and interfaces is where ruler-core's true abilities come in.
-
-\begin{figure}[h!]
-\begin{code}
-itf <name>
- {visit <name>
- {attributes}
- }
-\end{code}
-\caption{Ruler-core interface declaration syntax}
-\label{itf:syntax}
-\end{figure}
-
-Multiple visits can be defined for an interface. The order of appearance of visits is relevant. Every visit is a new \emph{co-routine} and will be scheduled to run by \rcore. The order of appearance of visits is relevant. Attributes that belong to an earlier visit are computed before attributes of a later visit. Visits are usually executed as soon as possible and as many visits as possible are executed at the same time. % Outside of visits we can also declare attributes. These attributes are thus not explicitly assigned to a visit. They will be automatically assigned to the earliest visit possible.
-
-\subsection{Semantics}
-We define the rules of a production per visit. A rule may only be applied in the same visit or a later visit than it is declared in. Also, visits introduce a scope.
-Rules in ruler core are either an assertion (like a match statement), an assignment (assignment to attributes) or a statement (default rules, invocations, child declarations, etc..)
-
-Furthermore, a visit is divided in one or more clauses. Each clause provides an alternative set of rules for the that visit of the production. During evaluation, clauses are tried in the order of appearance. Special match rules specify conditions for backtracking.
-In ruler-core, productions themselves are actually also clauses.
-
-Visits like all functions have arguments, or in this case attributes. The \emph{inh} keyword indicates an \emph{inherited} attribute (input value), whereas \emph{syn} indicates a \emph{synthesized} attribute (output value). The order of declaration of the attributes is not important.
-
-After declaring the datatypes and interfaces; the actual semantic function can be declared to evaluate the expressions:
-
-\begin{figure}[H]
-\begin{code}
-datasem Expr
- clause Num
- lhs.v = loc.val -- output
- clause Expr
- internal opcheck
- clause Plus
- match Operator.Plus@loc = loc.op
- lhs.v = exp1.v + exp2.v -- output
- clause Minus
- match Operator.Minus@loc = loc.op
- lhs.v = exp1.v - exp2.v -- output
-\end{code}
-\caption{Evaluating expressions in ruler-core: datasem}
-\label{example:tutorial1:datasem}
-\end{figure}
-
-The elements that are part of figure \ref{example:tutorial1:datasem} will be explained in the coming sections:
-
-\subsection{Semantic functions}
-\label{semantics}
-A semantic function is a function that defines a semantic for a interface. % Within a semantic function it is possible to \emph{invoke} any other coroutine(s) that might be needed. Although there is an implicit \textbf{invoke} keyword, it is rarely needed to explicitly \emph{invoke} a visit. When all attributes are defined for a visit it is implicitly invoked.
-
-The contents of a semantic function is formed by rules. Rules are given per production, per visit and per clause. Rules exists in several form. Clauses provide a way to do scoping inside semantic functions. A clause inherits all the attributes of its parent clauses in the same visit. If a visit has only one clause it does not have to be explicitly declared.
-
-There are two ways of defining a semantic function: using the \textbf{datasem} and a \textbf{sem} keyword. The example in figure \ref{example:tutorial1:datasem} uses the former.
-
-\subsubsection{Datasem}
-Defining semantics for a \emph{data type} and \emph{nonterminal}(interface) defined in \rcore can be done with a shorthand: \textbf{datasem}. As the name suggests \textbf{datasem} stands for \emph{datatype semantics}. % Those familiar with uuagc can compare defining a \textbf{datasem} in \rcore with a \textbf{SEM} declaration in uuagc.
-
-\begin{figure}[!h]
-\begin{code}
-datasem <nonterminal> [monad <type>]
- {clause <name>
- ...
- }
-\end{code}
-\caption{Syntax definition of a sem function}
-\label{datasem:syntax}
-\end{figure}
-
-Defining a \textbf{datasem} is a shorthand for defining a \textbf{sem} (more on this later). The \emph{monad} type is optional and will be ignored for now. %, however if a type was specified for any semantic function which is used by, or used in this \textbf{datasem} then to disambiguate you need to define the type in this declaration as well.
-
-The \emph{clauses} in a \textbf{datasem} should coincide with the constructors of the data type. The preprocessor enforces that there is a clause for every declared constructor. Every clause declaration implicitly adds a \textbf{match} statement for every clause. This is the reason why there is a required attribute \emph{ast} for ever nonterminal. This is the attribute on which matches are tried out on in the main clauses of a \textbf{datasem}. A \textbf{match} is essentially an assertion, if failed nothing else for that clause is tried out and backtracking takes place.
-
-For every \textbf{datasem}, in every clause where there is a nonterminal in the definition of that clause, there will be an implicit child declared for that field. It is for this reason that in figure \ref{example:tutorial1:datasem} it is refered to the operator terminal via the local child (loc.op) and the \emph{|exp1|} and \emph{|exp2|} nonterminals directly. In the case of \emph{Expr} there was only one \emph{inherited} attribute: ast, but since ast is filled in automatically by \rcore the invoke is implicitly performed. Which is why the synthesized attributes can be accessed without any further action.
-
-Because of all these properties, a \textbf{datasem} only provides the ability to traverse one tree at a time.
-\subsection{Bindings}
-\label{bindings}
-In Haskell the \emph{Let} binding is used when introducing new variables in a sequential computation. In \rcore the keyword \textbf{let} is not used when assigning values to bindings, however since bindings in \rcore are translated to \textbf{let} declarations by the preprocessor the same behaviors is to expected. This means that binding to a \emph{pattern} on the \emph{left hand side} is valid. e.g. \[ (\alpha, \beta) = \ldots \] is allowed. This allows the definition multiple attributes at the same time.
-
-An attribute may only be defined once. The compiler will generate an error if it finds code that tries to redefine an attribute (there is no shadowing).
-
-The notation for referencing patterns, expressions and variables is \emph{k}.\emph{x} where \emph{k} is the name of child name and \emph{x} the attribute to be referenced. There are two build in reserved children:
-
-\paragraph{lhs}
-The \emph{lhs} child is used to access the \emph{inherited} attributes and to assign values to the \emph{synthesized} attributes. Which one is intended is derived from the context in which they are used: When used at the \emph{left hand side} of an expression they are treated as \emph{synthesized} attributes, but when used in the \emph{right hand side} of a binding they refer to the \emph{inherited} values.
-
-\paragraph{loc}
-The \emph{loc} child is used in a way that is analogous to local variables in other languages. As many \emph{loc} attributes as needed can be defined. The scope of the \emph{loc} is only the clause/visit that it is declared in and the its children.
-
-\subsection{Clauses}
-Clauses are a way of defining alternatives. When an assertion in clause fails it backtracks out of the clause and the next one is tried out. Clauses are tried out in sequential order. A visit can contain multiple clauses, corresponding to the different ways to interpret the \emph{inherited} attributes of the visit.
-
-If no clause can be executed in a visit, the system backtracks to the parent visit and clause. This behavior goes all the way up to the root. In order to be able to generate proper errors it is recommended to always make the collection of clauses for a visit total. The easiest way to do this is to make a \emph{catch-all} clause at the end.
-
-\subsection{Matches}
-Matches are akin to case expressions, like case expressions they force evaluation and attempt to pattern match on the datatype. Except unlike case expressions, you can only on a pattern that contains a constructor. For any data type defined in \rcore itself and the \emph{Bool} type a \textbf{match} can be done.
-
-\begin{figure}[h!]
-\begin{code}
-match TypeName.ConstructorName@child = <expression>
-\end{code}
-\caption{Match syntax definition}
-\label{match:syntax}
-\end{figure}
-
-If the \emph{match} succeeds the \emph{named} attributes defined for the elements of the Constructor are added as attributes of the specified \emph{child}. Of the two reserved children \textbf{lhs} and \textbf{loc} only \textbf{lhs} is not allowed as a child name here\footnote{note that the childname "var" is also reserved, but in the case of var you will get an actual syntax error}. On the other hand if the \emph{match} fails the entire clause is aborted and backtracking is performed.
-There are exceptions to this syntax, \emph{build in} types such as \emph{Bool} which have no children have an alternative syntax:
-
-\begin{figure}[h!]
-\begin{code}
-match True = <expression>
-\end{code}
-\caption{Example match on Bools}
-\label{match:bool}
-\end{figure}
-
-Figure \ref{match:bool} shows that on certain types the rules are relaxed, particularly the Bool type. Note that because there is no build in support for the \emph{Maybe} type, often in this thesis this will be supported by first match on |True = isJust expr| and then a subsequent call the |fromJust| function.
-
-\subsection{Internal}
-Sometimes it is necessary to branch on a value that was just synthesized. This is achieved by using the \emph{internal} keyword. The \emph{internal} keyword provides a means of scoping and branching at the same time. Internals contain a list of \emph{clauses} which will be tried out in order one at a time. Attributes that were declared before the \emph{internal} statement are all in scope inside an internal block.
-
-\begin{figure}[h!]
-\begin{code}
-internal <name>
- {clause}
- {clause}
- ...
-\end{code}
-\caption{Internal syntax definition}
-\label{internal:syntax}
-\end{figure}
-
-A clause may have at most one internal visit, although an internal visit may have clauses with again an internal visit. Any code below the internal is floated upward if possible. Referencing a attribute defined in an internal from a parent clause is invalid.
-
-\subsection{Invoking semantic functions}
-\label{calling:sem}
-To complete this example we also need be able to call semantic functions from Haskell:
-
-\begin{figure}[H]
-\begin{code}
-eval :: Expr -> IO Int
-eval exp = do
- let inh = Inh_Expr_eval { ast_Inh_Expr = exp }
- syn <- invoke_Expr_eval dnt_Expr inh
- let x = v_Syn_Expr syn
- return x
-\end{code}
-\caption{calling wrappers from within haskell}
-\end{figure}
-
-The first line (the let) defines the \emph{inherited} attributes expected for the visit that is to be called. In this case the \emph{eval} visit, which specified that there is one \emph{inherited} attribute called \emph{ast}. For every visit there is a record for the \emph{inherited} and \emph{synthesized} attributes. The record name is build up as \textbf{X\_I\_v} where \textbf{X} equals "Inh" or "Syn", \textbf{I} is the interface name and \textbf{v} the visit name.
-
-The labels of the fields inside these record are made up of \textbf{attr\_X\_I} where \textbf{attr} is the attribute name, \textbf{X} either "Inh" or "Syn" indicating the attribute type and \textbf{I} the interface name.
-
-The second line invokes the \textbf{eval} routine with the given \emph{inherited} attributes and returns the \emph{synthesized} attribute records. The syntax for invoking a visit is \textbf{invoke\_I\_v \emph{wrapper inhs}}. The \textbf{I} indicates the interface name, the \textbf{v} the visit name, \textbf{inhs} stands for the record containing the inherited attributes and finally \textbf{wrapper} is the wrapper function to call. For every \textbf{datasem} \rcore defines a wrapper \textbf{dnt\_I} and for every \textbf{sem} function the name which was explicitly given is used (more on this later).
-
-The expression example can be scaled up by adding variables to the expressions. Two extra constructors need to be introduced to \emph{Expr}. They correspond to introduction and elimination:
-
-\begin{code}
- con Var
- nm :: String
- con Let
- nm :: String
- exp : Expr
- body : Expr
-\end{code}
-
-In order to be able to evaluate variables an \emph{environment} needs to be passed down through the tree to collect all the declarations. This is done by modifying the \emph{interface} of \emph{Expr}. The new \emph{interface} definition is:
-
-\begin{code}
-itf Expr
- visit eval
- inh ast :: Expr
- inh env :: Env
- syn v :: Int
- syn env :: Env
-\end{code}
-
-A new attribute \emph{env} is added which is both a \emph{synthesized} and \emph{inherited} attribute. Strictly speaking the \emph{env} can only be an \emph{inherited} attribute, however we allow access to variables introduced in the left branch in Expr in the right branch. In other words we allow the definition of variable in the left branch to be used in the right branch, which is why \emph{env} has to be synthesized as well.
-
-Now that the type and interface have been updated, the next step is to extend the datatype semantics to support the new clauses.
-
-\begin{code}
- clause Var
- loc.val = lookup loc.nm lhs.env
- lhs.v = fromMaybe (error ...) loc.val
- clause Let
- loc.env = (loc.nm, exp.v): lhs.env
- body.env = loc.env
- lhs.v = body.v
-\end{code}
-
-The code for |Var| and |Let| show that |lhs| is used both for synthesized and inherited attributes. Which is intended is determined by the way it is used (see section \ref{bindings}).
-
-Sometimes special types such as Lists are needed. The next example shows how list support is provided in \rcore by adding the ability to evaluate a list of expressions.
-
-\subsection{Special types}
-At the time of writing \rcore only supports \emph{List}s in the category of special types, but can easily be extended to support any product datatype like |Maybe| and |Map|.
-
-\subsubsection{Lists}
-Lists are declared using the \textbf{type} keyword. The syntax is very familiar to Haskell programmer: \[ \textbf{type} \hspace{5pt} \emph{name} : [\emph{type}] \]
-
-Declaring the list type \emph{Expr} would look like:
-
-\begin{figure}[H]
-\begin{code}
-type Exprs : [Expr]
-\end{code}
-\caption{Exprs declaration using a list type}
-\label{type:exprs}
-\end{figure}
-
-Figure \ref{type:exprs} declares the \emph{nonterminal} Exprs. This definition is semantically almost equivalent the data declaration in figure \ref{type:lists}. It introduces two type constructors \emph{Nil} and \emph{Cons} along with some extra attributes.
-
-\begin{figure}[H]
-\begin{code}
-data Exprs
- con Nil
- con Cons
- hd : Expr
- tl : Exprs
-\end{code}
-\caption{Syntactically equivalent definition of the Exprs type}
-\label{type:lists}
-\end{figure}
-
-Next to creating the syntactical information, ruler-core also generates some \emph{Interface} declarations for every list type.
-
-The interface declared for the \emph{Exprs} example in figure \ref{type:lists} is equivalent to:
-
-\begin{figure}[h!]
-\begin{code}
-itf Exprs
- visit exprs_visit
- inh ast :: Exprs
-\end{code}
-\caption{Ruler-core interface declaration syntax}
-\label{itf:exprs}
-\end{figure}
-
-A special \emph{inherited} attribute \textbf{ast} is declared on which \emph{matches} will be performed in clauses. This interface itself is not useful, instead the preprocessor enforces that at least one of the visits declared for the \emph{non-terminal} Exprs contain an inherited attribute \emph{ast}. If this is not the case an error will be generated.
-
-With figure \ref{type:lists} and \ref{itf:exprs} in mind the real interface to our \emph{Exprs} type can be declared along with the corresponding \textbf{datasem}.
-
-\begin{figure}
-\begin{minipage}[t]{0.4\linewidth}
-\begin{code}
-itf Exprs
- visit eval
- inh ast :: Exprs
- inh env :: Env
- syn v :: [Int]
- syn env :: Env
-\end{code}
-\end{minipage}
-\begin{minipage}[t]{0.6\linewidth}
-\begin{code}
-datasem Exprs
- clause Cons
- hd.env = lhs.env
- tl.env = lhs.env
- lhs.v = hd.v : tl.v
- lhs.env = lhs.env
- clause Nil
- lhs.env = lhs.env
- lhs.v = []
-\end{code}
-\end{minipage}
-\end{figure}
-
-A lot of time is spent just copying over the \emph{env} attribute without actually doing anything with it. The larger the program gets the more problematic this becomes. For that \rcore has a mechanism called default rules.
-
-\subsection{Default rules}
-Default rules are a way of specifying default values for attributes that will be used in case explicit values are not given for the attributes in question. This is particularly useful for when it is required to do nothing in some clauses.
-When using a \emph{default} rule, all the visits of all the children of the current production/node become active. This means every \emph{inherited} attribute must be filled in.
-These rules are for both \emph{inherited} and \emph{synthesized} attributes. In case an attribute is both then the rule applies to both.
-
-Default rules collect values of attributes with the same name as the name for which the rule was declared. The values are collected in order of appearance. The \emph{head} of the list contains the value of the \emph{lhs} attribute (if it exists) and the \emph{tail} of the list contains the value of the last child. There are two kinds of \emph{default} rules:
-\begin{description}
-\item[\textbf{default}] { The \textbf{default} keyword generates a static error when none of the children of an attribute for which the rule is defined for return a result. }
-\item[\textbf{default?}] { The \textbf{default?} returns the empty list instead of an exception when no child has the \emph{synthesized} attribute in mention. For \emph{inherited} attributes, the initial value \emph{lhs.attribute} is also added to the list. }
-\end{description}
-
-\begin{figure}[h!]
-\[
-\textbf{default[\textit{?}]} \hspace{5pt} \emph{attribute} = \emph{function}
-\]
-\caption{Syntax for default expressions}
-\label{default:syntax}
-\end{figure}
-
-The default rules look at the attribute names and not the types. If a child defines an attribute for which there is a \emph{default} rule defined but where the \emph{type} of this attribute is different then the type of the other element of the list, then a type error will be generated by the Haskell compiler since Haskell lists are homogeneous.
-
-Using default rules we can simplify the definition of the Exprs |datasem|:
-
-\begin{code}
-datasem Exprs
- default? env = last
- clause Cons
- lhs.v = hd.v : tl.v
- clause Nil
- lhs.v = []
-\end{code}
-
-\subsection{Multiple tree traversals}
-The previous examples are all possible with standard attribute grammar, whereas the following example is not. To show how to traverse multiple trees at the same time, and the higher-orderedness of \rcore the following example deals with how to compare two trees for equality.
-
-The interesting points of this example are being able to traverse two AST at a time, and compare them while evaluating the tree at the same time.
-If the trees turns out to be equal, then return a new tree with the value of the evaluated node.
-
-The easiest one to start with is the \emph{Operator} terminal. An interface that allows the comparison of two operators needs to be defined:
-
-\begin{figure}[H]
-\begin{minipage}[t]{0.3\linewidth}
-\begin{code}
-data Operator
- con Plus
- con Minus
-\end{code}
-\end{minipage}
-\begin{minipage}[t]{0.7\linewidth}
-\begin{code}
-itf OperatorEq
- visit compare
- inh op1 :: Operator
- inh op2 :: Operator
- syn eq :: Maybe Operator
-\end{code}
-\end{minipage}
-\caption{Interface to compare two operators}
-\end{figure}
-
-The interface \emph{OperatorEq} declares one visit "compare" and within this visit three attributes. The two inherited attributes are the inputs (the two operators to compare) and the synthesized attribute \emph{eq} is the result. The same result pattern will be followed in the rest of the example, if the inputs are equal we return one of the inputs, if not Nothing is returned.
-
-Earlier in section \ref{semantics} it was explained that there are two types of semantic functions. \emph{Data type semantic} functions were explained in that section. This section however requires the second method of implementing semantic functions. The semantic function to compare two Operators is:
-
-\begin{code}
-{
-eqOp = sem eqOp : OperatorEq monad IO
- visit compare
- default? eq = const False
- clause Plus
- match Operator.Plus@loc = lhs.op1
- match Operator.Plus@loc = lhs.op2
-
- lhs.eq = return Operator_Plus
- clause Minus
- match Operator.Minus@loc = lhs.op1
- match Operator.Minus@loc = lhs.op2
-
- lhs.eq = return Operator_Minus
- clause other
-}
-\end{code}
-
-\subsubsection{Sem}
-\begin{figure}[!h]
-\begin{code}
-<name> = sem <internal_name> : <Interface> [monad <type>]
- {visit <name>
- {clause <name>
- ...
- }
- }
-\end{code}
-\caption{Syntax definition of a sem function}
-\label{sem:syntax}
-\end{figure}
-
-With \textb{sem} we can define a semantic function for an arbitrary interface. The different components of figure \ref{sem:syntax} are decomposed as:
-
-%\begin{figure}[ht!]
-\begin{description}
-\item[\textbf{\textit{name}}] This is the name of a semantic function. It is also the name of the Haskell function that will ultimately be generated. The same naming rules apply as for normal Haskell functions.
-\item[\textbf{\textit{internal\_name}}] The internal name is only used internally and is not of real importance for anything done in this thesis.
-\item[\textbf{\textit{Interface}}] Interface is the interface for which we are defining a semantic function. Every \emph{synthesized} attribute of the \emph{interface} must be assigned in the semantic function.
-\item[\textbf{monad \textit{type}}] { This is optional. Failure is handled by backtracking. This value specifies the monad to be used while backtracking. This also forces the monad's type from a completely polymorphic type to a more concrete type. The \textit{type} specified needs to be an instance of \emph{Monad} and \emph{MonadError} due to backtracking of match failures being done by catching errors.}
-\item[\textbf{visit \textit{name}}] The name of the visit for which a semantic function is being defined. At least one visit out of the interface should be implemented, but every \emph{synthesized} attributes must be filled in.
-\item[\textbf{clause \textit{name}}] The clause \textit{name} needs to be a unique name as the names are used to differentiate the clauses from one another inside the semantic function.
-\end{description}
-%\end{figure}
-
-Rules can be given for the entire sem-block, or scoped per visit or per clause.
-A \textbf{sem} function has one important layout rule: every \emph{visit} must be indented at least as deeply as preceding \textbf{sem} keyword. If this is not the case a parse error will be generated.
-
-\subsection{Syntax}
-\Rcore like its ancestor \emph{UUAGC} is implemented as a preprocessor for the language Haskell. From a .rul file pure Haskell code is generated. This code can be used from other Haskell functions as described in Section\cite{calling:sem}. How ruler code is translated to Haskell is outside the scope of this thesis. Details can be found in Middelkoop, et al.~\cite{visitag}.
-
-Though it provides various syntactical elements, one of the most important to remember is that the right hand side of an equal sign ($=$) contains Haskell code, which means it is possible to call any Haskell function including library functions from the RHS.
-
-\subsubsection{Haskell}
-In order to \emph{escape} into Haskell mode \emph{curly} braces \{ \} are used. The declaration of a module header and importing Haskell modules can be done with:
-
-\begin{figure}[!h]
-\begin{code}
-{
-{-# LANGUAGE BangPatterns #-}
-module Eval where
-
-import Control.Monad.Error
-}
-\end{code}
-\caption{Example Haskell mode escape}
-\end{figure}
-
-The location and indentation of the braces do not matter, the code between the braces is copied verbatim to the generated Haskell file. For aesthetic reasons the \emph{curly} braces are usually placed at the beginning of the lines and on a line of their own.
-
-The next part will show how to compare two expressions and declare explicit children in a clause. The place to start is by defining the interface:
-
-\begin{code}
-itf Compare
- visit compare
- inh exp1 :: Expr
- inh exp2 :: Expr
- inh env :: Pair Env
- syn env :: Pair Env
- syn exp :: Maybe Expr
-\end{code}
-Comparing and evaluating two trees at once require two Environment. One per tree. The |Pair| type is just an alias for a 2-tuple of the same type. (e.g (Env, Env))
-
-\subsection{Children}
-One of the benefits of \rcore over traditional AG systems such as \emph{UUAGC} is the ability to construct new trees during attribute evaluation. To accomplish this \rcore allows the declaration of new child \emph{non-terminals}. Due to space constrains only a piece of the semantics for the \emph{Compare} interface will be shown. To see the full source please consult the Appendix \ref{appendix:compare}.
-
-\begin{code}
-eq = sem eq : Compare monad IO
- visit compare
- default? env = last
- default? exp = const Nothing
- clause Num
- match Expr.Num@e1 = lhs.exp1
- match Expr.Num@e2 = lhs.exp2
-
- loc.eq = e1.val == e2.val
- lhs.exp = guard loc.eq >> return (Expr_Num e1.val)
- clause Expr
- match Expr.Expr@l = lhs.exp1
- match Expr.Expr@r = lhs.exp2
-
- child left : Compare = eq
- left.exp1 = l.exp1
- left.exp2 = r.exp1
-
- child op : OperatorEq = eqOp
- op.op1 = l.op
- op.op2 = r.op
-
- child right : Compare = eq
- right.exp1 = l.exp2
- right.exp2 = r.exp2
-
- lhs.exp = liftM3 Expr_Expr left.exp op.eq right.exp
-\end{code}
-
-Contrary to a \textbf{datasem} there is no real distinction between terminals and nonterminals in a \textbf{sem}. There are no implicit children. If a routine is to be invoked it has to be explicitly declared.
-
-\begin{figure}[h!]
-\[
-\textbf{child} \hspace{5pt} \emph{name} : Interface \hspace{5pt} [= sem\_name]
-\]
-\caption{child declaration syntax}
-\end{figure}
-
-This declares a new \textbf{child} \emph{name} belonging to the nonterminal (Interface) defined by the specified visit\cite{visitag}. By default the co-routine to execute is \emph{id} unless otherwise specified. When invoking a child of a nonterminal for which we defined a datasem the \emph{sem\_name} does not have to be defined.
-
-\subsection{Ordering}
-Match statements are guaranteed to be executed in the visit that they are defined in. Rules are ordered automatically, and match statements are prioritized over any other rules. They are also treated as assertions. Statements and assignments can be in any order, when they are scheduled they will be ordered based on their dependencies. Which is why
-
-\begin{code}
-l.e = loc.e
-loc.e = m.l
-\end{code}
-
-is perfectly valid. Because of the dependence of \emph{l.e} on \emph{loc.e}, \emph{loc.e} will be floated above \emph{l.e}. This dependency resolving will also disallow any cycles inside the dependency graph. Cycles would cause an error to be generated by the preprocessor.
+\chapter{Attribute Grammars}
+\label{AG}
+\emph{Context Free Grammars}(CFG) can only describe the syntax of programming languages\cite{knuth1}. They cannot specify any context-sensitive conditions. In practice we are interested in more then the well-formedness of a grammar. For example the condition that the same value for \emph{n} be enforced in the string $a^nb^nc^n$ cannot be described by using context free grammars\cite{ken}.
+
+In 1968 by Donald E. Knuth~\cite{knuth1} introduced \ags as a way to define the \emph{meaning} to context free languages. A typical is the defining the value of an expression. The expression grammar is defined by the following \emph{CFG} expressed as \emph{BNF}:
+
+\input{./BNF-Expr}
+
+A symbol is either a terminal or a nonterminal. A terminal is typically a single character or a single token. A nonterminal represents a sequence of symbols, and can be seen as a classification of these symbols.
+The \emph{terminals} in this case are the \emph{operators} $+,-$ and the \emph{digits} $0\ldots 9$. The \emph{nonterminals} are the \emph{Expr} and \emph{number} symbols. The grammar specifies that an expression is either a \emph{number} or an \emph{Expr} followed by an \emph{operator} and then another \emph{Expr}. For every sentence that can be produced by this grammar a parse tree can be assigned. For instance, the expression "2 + (3 - 5) + (6 - 1)" produces the parse tree in figure \ref{fig.example1.parsetree}. We define the value of an expression as a composition of the values of its sub expressions, See figure \ref{fig.example1.parsetree}
+
+\begin{figure}[H]
+\centering
+\begin{tikzpicture}[>=stealth']
+\node (r0) [draw, circle] {+};
+\node (r1) [below left=1cm of r0, draw, circle] {2}
+ edge[<-] (r0.south);
+\node (r2) [below right=1cm of r0, draw, circle] {+}
+ edge[<-] (r0.south);
+\node (s0) [below left=1.4cm of r2, draw, circle] {-}
+ edge[<-] (r2.south);
+\node (s1) [below left=1cm of s0, draw, circle] {3}
+ edge[<-] (s0.south);
+\node (s2) [below right=1cm of s0, draw, circle] {5}
+ edge[<-] (s0.south);
+\node (f0) [below right=1.5cm of r2, draw, circle] {-}
+ edge[<-] (r2.south);
+\node (f1) [below left=1cm of f0, draw, circle] {6}
+ edge[<-] (f0.south);
+\node (f2) [below right=1cm of f0, draw, circle] {1}
+ edge[<-] (f0.south);
+\end{tikzpicture}
+\caption{Parse tree example for "2 + (3 - 5) + (6 - 1)"}
+\label{fig.example1.parsetree}
+\end{figure}
+
+AG's are additions to CFGs that propagate semantic information along through parse trees. Nodes in the AST correspond to either a production or a nonterminal in the grammar. With this in mind there are two kinds of \emph{attributes} defined by knuth\cite{knuth1}:
+\begin{description}
+\item[\textbf{synthesized}] An attribute that is defined in terms of attributes of the \emph{descendants} of the AST node. They are passed bottom-up through the tree.
+\item[\textbf{inherited}] An attribute that is defined in terms of the \emph{ancestors} of the nonterminal. They are passed top-down through the tree.
+\end{description}
+
+An attribute can be both \emph{synthesized} and \emph{inherited}. These are however different attributes which share the same name. Semantics can be defined for the tree in figure \ref{fig.example1.parsetree} by assigning a synthesized attribute named \emph{value} of type \textbf{Int} to the nonterminals \emph{number} and \emph{Expr}. The evaluation rules are:
+
+\begin{figure}[H]
+\begin{grammar}
+ [(colon){$\rightarrow$}]
+ [(semicolon)$||$]
+ [(comma){}]
+ [(period){\\}]
+ [(quote){\begin{bf}}{\end{bf}}]
+ [(nonterminal){$\langle$}{$\rangle$}]
+value(<$Expr^+$>) : value(<$Expr_1$>) + value(<$Expr_2$>).
+value(<$Expr^-$>) : value(<$Expr_1$>) - value(<$Expr_2$>).
+value(<Expr>) \hspace{5pt} : value(<number>).
+value(<number>): <number>.
+\end{grammar}
+\caption{attribute definition for Expressions}
+\label{semantics:bnf:expr}
+\end{figure}
+
+Subscripts disambiguate between the different expression types and superscripts distinguish between the different cases of the \emph{operator} in an expression. Figure \ref{fig.example2.decoratedtree} shows a tree decorated with the synthesized attribute \emph{v} (short for value) along with intermediate values of \emph{v}.
+
+\begin{figure}[H]
+\centering
+\begin{tikzpicture}[>=stealth']
+\node (r0) [draw, circle] {+};
+\node (ri0) [draw, rectangle, right=0.1cm of r0, gray] {v=5};
+\node (r1) [below left=1.7cm of r0, draw, circle] {2}
+ edge[<-] (r0.south);
+\node (ri1) [draw, rectangle, right=0.1cm of r1, gray] {v=3};
+\node (r2) [below right=1.7cm of r0, draw, circle] {+}
+ edge[<-] (r0.south);
+\node (ri2) [draw, rectangle, right=0.1cm of r2, gray] {v=-2};
+\node (s0) [below left=2.1cm of r2, draw, circle] {-}
+ edge[<-] (r2.south);
+\node (ri3) [draw, rectangle, right=0.1cm of s0, gray] {v=15};
+\node (s1) [below left=1.0cm of s0, draw, circle] {3}
+ edge[<-] (s0.south);
+\node (ri4) [draw, rectangle, right=0.1cm of s1, gray] {v=3};
+\node (s2) [below right=1.0cm of s0, draw, circle] {5}
+ edge[<-] (s0.south);
+\node (ri5) [draw, rectangle, right=0.1cm of s2, gray] {v=5};
+\node (f0) [below right=2.2cm of r2, draw, circle] {-}
+ edge[<-] (r2.south);
+\node (ri6) [draw, rectangle, right=0.1cm of f0, gray] {v=5};
+\node (f1) [below left=1.0cm of f0, draw, circle] {6}
+ edge[<-] (f0.south);
+\node (ri7) [draw, rectangle, right=0.1cm of f1, gray] {v=6};
+\node (f2) [below right=1.0cm of f0, draw, circle] {1}
+ edge[<-] (f0.south);
+\node (ri8) [draw, rectangle, right=0.1cm of f2, gray] {v=1};
+\end{tikzpicture}
+\caption{decorated tree example for "2 + (3 - 5) + (6 - 1)"}
+\label{fig.example2.decoratedtree}
+\end{figure}
+
+\Ags are akin to \emph{catamorphisms}, except without the need to define the algebra and explicit traversals of the tree.
+
+\section{Utrecht University Attribute Grammar Compiler (UUAGC)}
+UUAGC (Utrecht University Attribute Grammar Compiler) is a preprocessor which parses Attribute Grammars in a custom language.
+The grammar, attributes for nonterminals and rules for attributes per production can be specified in attribute grammar source code. The preprocessor translates the attribute grammar code to datatype declarations, folds and semantic functions, which are functions that assign meaning to the specified grammar. In the UHC Type checker, the UUAGC system is used to do most of the work, with the exception of unification.
+
+\subsection{Limits of UUAGC}
+The \emph{UUAGC} system is limited in that it cannot perform case distinctions over multiple \emph{abstract syntax trees} at the same time\cite{visitag}. For most applications this is not an issue, since in most cases there is only one \emph{AST} that needs to be traversed at a time.
+
+In an attribute grammar, we express a function between inherited and synthesized attributes. The productions form the cases of this function, and we give rules for each case. Thus, we distinguish cases based on the production. For some example, we wish to also distinguish cases based on attribute values. This we cannot express with conventional attribute grammars.
+
+With higher-order AGs, it is possible to define a grammar for the call-tree of such a function. A higher-order attribute can then inspect the attribute's values and dispatch to a suitable production. However, this is a rather primitive mechanism to express case distinctions.
+
+Unification is one such example where we need to distinguish cases based on attribute values. Unification is the act of trying to find structural \& semantic equivalence between two types. It is a critical part of type checking. Given two types \emph{$t_{1}$} and \emph{$t_{2}$}, unification attempts to find a list of substitutions that allows the instantiation of type \emph{$t_{1}$} to type \emph{$t_{2}$}. For this to be accomplished it needs to be possible to traverse both \emph{$t_{1}$} and \emph{$t_{2}$} concurrently; while comparing the types at every node.
+
+%\begin{figure}[!h]
+%\begin{center}
+%\begin{neatopic}[width=.5\textwidth, height=50mm]
+% subgraph type1 {
+% node [];
+% a1 [label="a"];
+% a2 [label="b"];
+% a1 -- a2;
+% label = "Type 1";
+% }
+%
+% subgraph type1 {
+% node [];
+% b1 [label="Int"];
+% b2 [label="Int"];
+% b1 -- b2;
+% label = "Type 2";
+% }
+%
+% {rank=same; a1 b1}
+% {rank=same; a2 b2}
+%
+% a1 -- b1 [style=dotted, label ="a:= Int"];
+% a2 -- b2 [style=dotted, label ="b:= Int"];
+%\end{neatopic}
+%\end{center}
+%\caption{Unification example}
+%\label{unify-simple}
+%\end{figure}
+
+When unifying the type \emph{$a \rightarrow b$} with \emph{$Int \rightarrow Int$} both trees are traversed concurrently while the nodes are compared, and ultimately terminating with the substitution list [(a,Int), (b, Int)]. The ability to be able to traverse \emph{AST}s that were just produced is also required because the structure of the \emph{type} being produced is not known beforehand. During type inference more type information is gradually gained on the type that needs to be produced. This generally presents a problem for AGs\cite{ruler-front} because the synthesis and evaluation phases are two separate phases.
+
+\section{Ruler-Core}
+\Rcore addresses these restrictions in AG by introducing a model based on \emph{visit}\cite{visits} functions. The resulting language is more flexible while still retaining the core semantics of reasoning over decorated trees with attributes. The simplest description of ruler-core would be:\emph{a language to manipulate visit sequences.}
+
+\begin{quotation}
+A \emph{visit function}\cite{visitag} is a (partial) function that takes several inputs (\emph{inherited attributes}) and produces several output values (\emph{synthesized attributes}) and a continuation for the next visit.
+\end{quotation}
+
+As with traditional AGs \emph{inherited}, attributes are passed top-down in the tree while \emph{synthesized} attributes are passed bottom-up. An attribute can be both \emph{synthesized} and \emph{inherited}.
+
+Everything that can be expressed in uuagc can be expressed in ruler-core, but the inverse is not true. One of the simplest things that can be done in ruler-core is the evaluation of a single AST. In order to write an evaluator for the expression type in figure \ref{grammar:bnf:expr} we first need to define the \emph{datatypes} and \emph{interfaces}.
+
+\begin{figure}[H]
+\begin{minipage}[t]{0.4\linewidth}
+\begin{code}
+data Expr
+ con Num
+ val :: Int
+ con Expr
+ exp1 : Expr
+ op :: Operator
+ exp2 : Expr
+\end{code}
+\end{minipage}
+\begin{minipage}[t]{0.6\linewidth}
+\begin{code}
+data Operator
+ con Plus
+ con Minus
+
+itf Expr
+ visit eval
+ inh ast :: Expr -- input
+ syn v :: Int -- output
+\end{code}
+\end{minipage}
+\caption{Evaluating expressions in ruler-core: datatypes}
+\label{example:tutorial1:datatypes}
+\end{figure}
+
+\subsubsection{Data types}
+\Rcore data types resemble Haskell's record syntax with some notable differences.
+
+Instead of an $=$ or \textbar \space like in Haskell, an explicit \textbf{con} keyword is used to indicate the \emph{name} of the constructor. Every element of the constructor must be explicitly named. Indentation is also important since indentation separates constructors, in general \textbf{con} needs to be deeper indented then \textbf{data} and the members of a constructor should be indented further than the \textbf{con}.
+
+Figure \ref{example:tutorial1:datatypes} illustrates two different ways of declaring a type of constructor argument. $:$ is used to declare a type that is a \emph{nonterminal} and $::$ indicates a \emph{terminal}. The reason for this distinction is that for \emph{nonterminal} types some extra machinery is automatically defined. It is important to know that for every nonterminal \rcore enforces that at least one of the declared visits have an \emph{inherited} attribute named \emph{ast} with the type of the nonterminal.
+
+Only types of kind $:: \star$ are allowed by \rcore, which means only monomorphic types are accepted. As is the case with UUAGC, the constructors generated will be in the form of \emph{TypeName\_ConstructorName}. To put this concretely figure \ref{example:tutorial1:datatypes} exposes for the type \emph{Expr} the constructor functions \textbf{Expr\_Num} and \textbf{Expr\_Expr}.
+
+\subsection{Interfaces}
+\emph{Interfaces} are analogous to the interface definitions in other languages, except instead of declaring function prototypes/signatures we declare visits and their attributes. Interfaces declare \emph{nonterminal}s which can be named the same as their corresponding \emph{data types}. For those familiar with uuagc, a \textbf{ATTR} declaration in uuagc would equal an interface declaration with one visit and all the attributes declared in the \textbf{ATTR} would be part of this one visit. The ability to explicitly declare these visits and interfaces is where ruler-core's true abilities come in.
+
+\begin{figure}[h!]
+\begin{code}
+itf <name>
+ {visit <name>
+ {attributes}
+ }
+\end{code}
+\caption{Ruler-core interface declaration syntax}
+\label{itf:syntax}
+\end{figure}
+
+Multiple visits can be defined for an interface. The order of appearance of visits is relevant. Every visit is a new \emph{co-routine} and will be scheduled to run by \rcore. The order of appearance of visits is relevant. Attributes that belong to an earlier visit are computed before attributes of a later visit. Visits are usually executed as soon as possible and as many visits as possible are executed at the same time. % Outside of visits we can also declare attributes. These attributes are thus not explicitly assigned to a visit. They will be automatically assigned to the earliest visit possible.
+
+\subsection{Semantics}
+We define the rules of a production per visit. A rule may only be applied in the same visit or a later visit than it is declared in. Also, visits introduce a scope.
+Rules in \rcore are either an assertion (like a match statement), an assignment (assignment to attributes) or a statement (default rules, invocations, child declarations, etc..)
+
+There are two kind of visits, one being the visits declared inside an interface and the other the visits declared in a semantic function. These are separate concepts. The visits of an interface declares what we want to compute and those of a semantic function declares how we want to compute these values.
+
+Furthermore, a semantic visit is divided in one or more clauses. Each clause provides an alternative set of rules for the that visit of the production. During evaluation, clauses are tried in the order of appearance. Special match rules specify conditions for backtracking.
+In ruler-core, productions themselves are actually also clauses.
+
+Interface visits like all functions have arguments, or in this case attributes. The \emph{inh} keyword indicates an \emph{inherited} attribute (input value), whereas \emph{syn} indicates a \emph{synthesized} attribute (output value). The order of declaration of the attributes is not important.
+
+After declaring the datatypes and interfaces; the actual semantic function can be declared to evaluate the expressions:
+
+\begin{figure}[H]
+\begin{code}
+datasem Expr
+ clause Num
+ lhs.v = loc.val -- output
+ clause Expr
+ internal opcheck
+ clause Plus
+ match Operator.Plus@loc = loc.op
+ lhs.v = exp1.v + exp2.v -- output
+ clause Minus
+ match Operator.Minus@loc = loc.op
+ lhs.v = exp1.v - exp2.v -- output
+\end{code}
+\caption{Evaluating expressions in ruler-core: datasem}
+\label{example:tutorial1:datasem}
+\end{figure}
+
+The elements that are part of figure \ref{example:tutorial1:datasem} will be explained in the coming sections:
+
+\subsection{Semantic functions}
+\label{semantics}
+A semantic function is a function that defines a semantic for a interface. % Within a semantic function it is possible to \emph{invoke} any other coroutine(s) that might be needed. Although there is an implicit \textbf{invoke} keyword, it is rarely needed to explicitly \emph{invoke} a visit. When all attributes are defined for a visit it is implicitly invoked.
+
+The contents of a semantic function is formed by rules. Rules are given per production, per visit and per clause. Rules exists in several form. Clauses provide a way to do scoping inside semantic functions. A clause inherits all the attributes of its parent clauses in the same visit. If a visit has only one clause it does not have to be explicitly declared.
+
+There are two ways of defining a semantic function: using the \textbf{datasem} and a \textbf{sem} keyword. The example in figure \ref{example:tutorial1:datasem} uses the former.
+
+\subsubsection{Datasem}
+Defining semantics for a \emph{data type} and \emph{nonterminal}(interface) defined in \rcore can be done with a shorthand: \textbf{datasem}. As the name suggests \textbf{datasem} stands for \emph{datatype semantics}. % Those familiar with uuagc can compare defining a \textbf{datasem} in \rcore with a \textbf{SEM} declaration in uuagc.
+
+\begin{figure}[!h]
+
+\caption{Syntax definition of a sem function}
+\label{datasem:syntax}\begin{code}
+datasem <nonterminal> [monad <type>]
+ {clause <name>
+ ...
+ }
+\end{code}
+\end{figure}
+
+Defining a \textbf{datasem} is a shorthand for defining a \textbf{sem} (more on this later). The \emph{monad} type is optional and will be ignored for now. %, however if a type was specified for any semantic function which is used by, or used in this \textbf{datasem} then to disambiguate you need to define the type in this declaration as well.
+
+The \emph{clauses} in a \textbf{datasem} should coincide with the constructors of the data type. The preprocessor enforces that there is a clause for every declared constructor. Every clause declaration implicitly adds a \textbf{match} statement for every clause. This is the reason why there is a required attribute \emph{ast} for ever nonterminal. This is the attribute on which matches are tried out on in the main clauses of a \textbf{datasem}. A \textbf{match} is essentially an assertion, if failed nothing else for that clause is tried out and backtracking takes place.
+
+For every \textbf{datasem}, in every clause where there is a nonterminal in the definition of that clause, there will be an implicit child declared for that field. It is for this reason that in figure \ref{example:tutorial1:datasem} it is refered to the operator terminal via the local child (loc.op) and the \emph{|exp1|} and \emph{|exp2|} nonterminals directly. In the case of \emph{Expr} there was only one \emph{inherited} attribute: ast, but since ast is filled in automatically by \rcore the invoke is implicitly performed. Which is why the synthesized attributes can be accessed without any further action.
+
+Because of all these properties, a \textbf{datasem} only provides the ability to traverse one tree at a time.
+\subsection{Bindings}
+\label{bindings}
+In Haskell the \emph{Let} binding is used when introducing new variables in a sequential computation. In \rcore the keyword \textbf{let} is not used when assigning values to bindings, however since bindings in \rcore are translated to \textbf{let} declarations by the preprocessor the same behaviors is to expected. This means that binding to a \emph{pattern} on the \emph{left hand side} is valid. e.g. \[ (\alpha, \beta) = \ldots \] is allowed. This allows the definition multiple attributes at the same time.
+
+An attribute may only be defined once. The compiler will generate an error if it finds code that tries to redefine an attribute (there is no shadowing).
+
+The notation for referencing patterns, expressions and variables is \emph{k}.\emph{x} where \emph{k} is the name of child name and \emph{x} the attribute to be referenced. There are two build in reserved children:
+
+\paragraph{lhs}
+The \emph{lhs} child is used to access the \emph{inherited} attributes and to assign values to the \emph{synthesized} attributes. Which one is intended is derived from the context in which they are used: When used at the \emph{left hand side} of an expression they are treated as \emph{synthesized} attributes, but when used in the \emph{right hand side} of a binding they refer to the \emph{inherited} values.
+
+\paragraph{loc}
+The \emph{loc} child is used in a way that is analogous to local variables in other languages. As many \emph{loc} attributes as needed can be defined. The scope of the \emph{loc} is only the clause/visit that it is declared in and the its children.
+
+\subsection{Clauses}
+Clauses are a way of defining alternatives. When an assertion in clause fails it backtracks out of the clause and the next one is tried out. Clauses are tried out in sequential order. A visit can contain multiple clauses, corresponding to the different ways to interpret the \emph{inherited} attributes of the visit.
+
+If no clause can be executed in a visit, the system backtracks to the parent visit and clause. This behavior goes all the way up to the root. In order to be able to generate proper errors it is recommended to always make the collection of clauses for a visit total. The easiest way to do this is to make a \emph{catch-all} clause at the end.
+
+\subsection{Matches}
+Matches are akin to case expressions, like case expressions they force evaluation and attempt to pattern match on the datatype. Except unlike case expressions, you can only on a pattern that contains a constructor. For any data type defined in \rcore itself and the \emph{Bool} type a \textbf{match} can be done.
+
+\begin{figure}[h!]
+\begin{code}
+match TypeName.ConstructorName@child = <expression>
+\end{code}
+\caption{Match syntax definition}
+\label{match:syntax}
+\end{figure}
+
+If the \emph{match} succeeds the \emph{named} attributes defined for the elements of the Constructor are added as attributes of the specified \emph{child}. Of the two reserved children \textbf{lhs} and \textbf{loc} only \textbf{lhs} is not allowed as a child name here\footnote{note that the childname "var" is also reserved, but in the case of var you will get an actual syntax error}. On the other hand if the \emph{match} fails the entire clause is aborted and backtracking is performed.
+There are exceptions to this syntax, \emph{build in} types such as \emph{Bool} which have no children have an alternative syntax:
+
+\begin{figure}[h!]
+\begin{code}
+match True = <expression>
+\end{code}
+\caption{Example match on Bools}
+\label{match:bool}
+\end{figure}
+
+Figure \ref{match:bool} shows that on certain types the rules are relaxed, particularly the Bool type. Note that because there is no build in support for the \emph{Maybe} type, often in this thesis this will be supported by first match on |True = isJust expr| and then a subsequent call the |fromJust| function.
+
+\subsection{Internal}
+Sometimes it is necessary to branch on a value that was just synthesized. This is achieved by using the \emph{internal} keyword. The \emph{internal} keyword provides a means of scoping and branching at the same time. Internals contain a list of \emph{clauses} which will be tried out in order one at a time. Attributes that were declared before the \emph{internal} statement are all in scope inside an internal block.
+
+\begin{figure}[h!]
+\begin{code}
+internal <name>
+ {clause}
+ {clause}
+ ...
+\end{code}
+\caption{Internal syntax definition}
+\label{internal:syntax}
+\end{figure}
+
+A clause may have at most one internal visit, although an internal visit may have clauses with again an internal visit. Any code below the internal is floated upward if possible. Referencing a attribute defined in an internal from a parent clause is invalid.
+
+\subsection{Invoking semantic functions}
+\label{calling:sem}
+To complete this example we also need be able to call semantic functions from Haskell:
+
+\begin{figure}[H]
+\begin{code}
+eval :: Expr -> IO Int
+eval exp = do
+ let inh = Inh_Expr_eval { ast_Inh_Expr = exp }
+ syn <- invoke_Expr_eval dnt_Expr inh
+ let x = v_Syn_Expr syn
+ return x
+\end{code}
+\caption{calling wrappers from within haskell}
+\end{figure}
+
+The first line (the let) defines the \emph{inherited} attributes expected for the visit that is to be called. In this case the \emph{eval} visit, which specified that there is one \emph{inherited} attribute called \emph{ast}. For every visit there is a record for the \emph{inherited} and \emph{synthesized} attributes. The record name is build up as \textbf{X\_I\_v} where \textbf{X} equals "Inh" or "Syn", \textbf{I} is the interface name and \textbf{v} the visit name.
+
+The labels of the fields inside these record are made up of \textbf{attr\_X\_I} where \textbf{attr} is the attribute name, \textbf{X} either "Inh" or "Syn" indicating the attribute type and \textbf{I} the interface name.
+
+The second line invokes the \textbf{eval} routine with the given \emph{inherited} attributes and returns the \emph{synthesized} attribute records. The syntax for invoking a visit is \textbf{invoke\_I\_v \emph{wrapper inhs}}. The \textbf{I} indicates the interface name, the \textbf{v} the visit name, \textbf{inhs} stands for the record containing the inherited attributes and finally \textbf{wrapper} is the wrapper function to call. For every \textbf{datasem} \rcore defines a wrapper \textbf{dnt\_I} and for every \textbf{sem} function the name which was explicitly given is used (more on this later).
+
+The expression example can be scaled up by adding variables to the expressions. Two extra constructors need to be introduced to \emph{Expr}. They correspond to introduction and elimination:
+
+\begin{code}
+ con Var
+ nm :: String
+ con Let
+ nm :: String
+ exp : Expr
+ body : Expr
+\end{code}
+
+In order to be able to evaluate variables an \emph{environment} needs to be passed down through the tree to collect all the declarations. This is done by modifying the \emph{interface} of \emph{Expr}. The new \emph{interface} definition is:
+
+\be