-
Notifications
You must be signed in to change notification settings - Fork 1
/
main.tex
438 lines (383 loc) · 16.7 KB
/
main.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
%outline:
% an explanation of both the type system, a run through of one or two examples
% an architectural description of the implementation.
% Features and limitations should be prominently displayed in your documentation.
\documentclass[a4paper]{article}
\usepackage{semantic}
\usepackage{dsfont}
\usepackage{a4wide}
\usepackage{array}
\usepackage{bussproofs}
\usepackage{latexsym}
\usepackage{verbatim}
% things for the semantic package
\reservestyle{\command}{\textbf}
\command{let,in,:,case,of,if,then,else,letrec,nil,cons,false,true,[]}
\mathlig{-->}{\longrightarrow}
\newcommand{\tyrel}{\sqsubseteq}
% end semantic
\author{Jurri\"en Stutterheim\footnote{3555003}\and Ruben de Gooijer\footnote{3508617}\and Paul van der Walt\footnote{3120805}}
\date{\today}
\title{Implementation notes for Type and Effect Systems}
\begin{document}
\maketitle \tableofcontents
\section{Introduction}
This project aims to implement a tool which can do a control-flow analysis on a
lambda-calculus language with support for data structures. The focus is on
lists%and pairs
, for now.
Control-flow analysis aims to determine for each function application, which
functions may be applied. We also track the creation of data structures and that
of lambda functions. The implementation of our analysis is built on the
algorithm W type inferencing algorithm, using an annotated type system.
Our implementation currently offers support for monovariance and subeffecting.
While support for polyvariance has been attempted, we failed to deliver it fully
operational on time.
This document is organised as follows. First we introduce the design of the
analysis by showing our target language, its type system, its semantics and
our analysis rules. We then proceed to the code level implementation details of
the analysis, after which we discuss in more detail what does and does not work
in our current implementation.
\section{Design}
Here we will detail the design of the language, for example which features it
supports. In Tables \ref{tab:elems} and \ref{tab:grammar} one can see the
grammar of the language. It is based on Haskell (in fact we use the Haskell
source extensions parser), but of course only a subset is supported.
We include support for % pairs and
lists, as well as a case statement in which one
can pattern-match on these data structures. There is not yet support for general
data structures.
\subsection{A simple functional language}
%todo: copied from slides, may need adjustment.
The basic elements of the language can be found in Table \ref{tab:elems}. The
language includes natural number constants, variables, and operators. Program
points are introduced whenever a lambda-expression is encountered.
\begin{table}
\begin{centering}
\begin{tabular}{rcll}
$n$ & $\in$ & \textbf{Num = $\mathds{N}$}& numerals \\
$f,x$ & $\in$ & \textbf{Var} & variables \\
$\oplus$ & $\in$ & \textbf{Op} & binary operators \\
$\pi$ & $\in$ & \textbf{Pnt} & program points \\
$t$ & $\in$ & \textbf{Tm} & terms \\
\end{tabular}
\caption{\label{tab:elems}Basic elements}
\end{centering}
\end{table}
In Table \ref{tab:grammar} one can see what constructs are valid in our subset
of Haskell, since only a very limited subset is supported. Note that recursion
is only supported inside \textbf{let}s.
\begin{table}
\begin{centering}
\begin{tabular}{lcl}
\hline
$t$ & ::= & $n\: |\: \textbf{ false }\: |\: \textbf{ true }\: |\: x\: |\: \lambda_\pi x.t_1 \:$ \\
& $|$ & $t_1 t_2 \:|\: \textbf{ if } t_1 \textbf{ then } t_2 \textbf{ else } t_3 \:|\: \textbf{ let } x = t_1 \textbf{ in } t_2$\\
& $|$ & $\mu f.\textbf{ let } x = t_1 \textbf{ in } t_2$\\
& $|$ & $t_1 \oplus t_2 \:|\: [\,]_\pi \:|\:t_1\<:>_\pi t_2 $\\% \: | \: (t_1,t_2)_\pi $ \\
& $|$ & $ \textbf{case } e_0 \textbf{ of } alts $ \\
$alts$ & ::= & $alt_0; alt_1$ \\
$alt$ & ::= & $patt \rightarrow t$ \\
$patt$ & ::= & $x \:|\: \textbf{True} \:|\: \textbf{False} $\\% \:|\: (patt, patt)$ \\
& $|$ & $(x: patt) \:|\: [\,]$\\
\hline
\end{tabular}
\caption{Abstract syntax. $t$ is a term.}
\label{tab:grammar}
\end{centering}
\end{table}
\subsection{Type System}
Here we will detail the type system which has been implemented in this project.
It consists of a polymorphic type system, with the important addition of
annotations. We use annotations and constraints to be able to do a control-flow
analysis, eventually telling us which functions one may expect to be used where.
Our type system has variables, arrows, and type environments, as well as
annotations, which is what makes it different from a normal polymorphic type
system. Table \ref{tab:typingelems} and \ref{tab:typing} show the elements of
the type system. In the var and let/letrec case \emph{inst} and \emph{gen} are mentioned.
Their definition is identical to the ones found in literature on the standard
Hindley Milner let-polymorphic type system. The final syntax-directed type system is given in table \ref{tab:typing-rules}.
\begin{table}
\begin{centering}
\begin{tabular}{rcll}
$\varphi$ & $\in$ & \textbf{Ann} & annotations \\
$\widehat{\tau}$& $\in$ & $\widehat{\textbf{Ty} } $ & annotated types \\
$\widehat{\sigma} $& $\in$ & $\widehat{\textbf{TyScheme}} $ & annotated type schemes\\
$\widehat{\Gamma}$& $\in$ & $\widehat{\textbf{TyEnv} } $ & annotated type environments \\
\end{tabular}
\caption{Typing elements, annotated types}
\label{tab:typingelems}
\end{centering}
\end{table}
\begin{table}
\begin{centering}
\begin{tabular}{lcl}
$ \varphi$ & ::= & $ \emptyset \:|\: \{\pi\} \:|\: \varphi_1 \cup \varphi_2 $ \\
$\widehat{\tau}$ & ::= & $\alpha \:|\: Nat \: | \: Bool \: | \: $ \\
& $|$ & $\widehat{\tau}_1^{\varphi_1}
\stackrel{\varphi}{\rightarrow}
\widehat{\tau}_2^{\varphi_2} $ \\
& $|$ & $ % \widehat{\tau}_1 \times^\varphi \widehat{\tau}_2 \:|\:
[\widehat{\tau}]^\varphi $ \\
$\widehat{\sigma}$ & ::= & $\widehat{\tau} \:|\: \forall \alpha. \widehat{\sigma}_1 $ \\
$\widehat{\Gamma}$ & ::= & $[\,] \:|\: \widehat{\Gamma}_1[x \mapsto \widehat{\sigma}] $ \\
\end{tabular}
\caption{Typing}
\label{tab:typing}
\end{centering}
\end{table}
\begin{table}
\begin{centering}
\begin{tabular}{ll}
\hline \\
$ [$\emph{t-nat}$] $& \inference{}
{
{\Gamma},C |- \mathds{N} : Nat^{\varphi}
} \\
~&~\\
$ [$\emph{t-true}$] $& \inference{}
{
{\Gamma},C |- \<true>\ : Bool^{\varphi}
} \\
~&~\\
$ [$\emph{t-false}$] $& \inference{}
{
{\Gamma},C |- \<false>\ : Bool^{\varphi}
} \\
~&~\\
$ [$\emph{t-nil}$] $& \inference{
C |- \varphi \sqsupseteq \pi
}
{
{\Gamma},C |- \<nil>_\pi : [\tau^{\varphi_1}]^{\varphi}
} \\
~&~\\
$ [$\emph{t-cons}$] $& \inference{
{\Gamma},C |- e_1 : \tau^{\varphi_1}
& \Gamma,C |- e_2 : [\tau^{\varphi_2}]^{\varphi_3}
& C |- \varphi_1 \sqsupseteq \varphi_2
& C |- \varphi \sqsupseteq \pi
}
{
{\Gamma},C |- \<cons>_\pi\ e_1\ e_2 : [\tau^{\varphi_1}]^{\varphi}
} \\
~&~\\
$ [$\emph{t-var}$] $& \inference{
{\Gamma}(x) = ({\sigma},\varphi)
& C |- \varphi_1 \sqsupseteq \varphi
& inst(\sigma) = \tau
}
{
{\Gamma},C |- x : {\tau^{\varphi_1}}
} \\
~&~\\
$ [$\emph{t-lam}$] $& \inference{
\Gamma[x \mapsto (\tau_x, \varphi_x)], C |- e_o : \tau_0^{\varphi_0}
& C |- \varphi \sqsupseteq \pi
}
{
\Gamma,C |- \lambda_\pi x -> e_0 : \tau_x^{\varphi_x} ->^\varphi \tau_0^{\varphi_0}
} \\
~&~\\
$ [$\emph{t-app}$] $& \inference{
\Gamma,C |- e_1 : \tau_2^{\varphi_2} ->^{\varphi} \tau_0^{\varphi_0}
& \Gamma,C|- e_2 : \tau_2^{\varphi_3}
& C|- \varphi_2 \sqsupseteq \varphi_3
& C|- \varphi_4 \sqsupseteq \varphi_0
}
{
\Gamma, C |- e_1 e_2 : \tau_0^{\varphi_4}
} \\
~&~\\
$ [$\emph{t-if}$] $& \inference{
\Gamma, C |- e_0 : Bool^{\varphi_0}
& \Gamma, C |- e_1 : \tau^{\varphi_1}
& \Gamma, C |- e_2 : \tau^{\varphi_2}
& C |- \varphi \sqsupseteq \varphi_1
& C |- \varphi \sqsupseteq \varphi_2
}
{
\Gamma,C|- \<if>\: e_0\: \<then>\: e_1\: \<else>\: e_2 : \tau^{\varphi}
} \\
~&~\\
$ [$\emph{t-bin-op}$] $& \inference{ % what about all the annotated simple vars?
\Gamma, C |- e_1 : \tau_{1\oplus}^{\varphi_1}
& \Gamma, C |- e_2 : \tau_{2\oplus}^{\varphi_2}
& C |- \varphi \sqsupseteq \varphi_1
& C |- \varphi \sqsupseteq \varphi_2
}
{
\Gamma, C|- e_1 \oplus e_2 : \tau_{\oplus}^\varphi
} \\
~&~\\
$ [$\emph{t-let}$] $& \inference{
\Gamma, C |- e_1 : \tau_1^{\varphi_1}
& \sigma = gen(\tau_1)
& \Gamma[x \mapsto (\sigma, \varphi_1)],C |- e_2 : \tau_2^{\varphi_2}
& C |- \varphi_3 \sqsupseteq \varphi_2
}
{
\Gamma,C |- \<let>\: x = e_1\: \<in>\: e_2 : \tau_2^{\varphi_3}
} \\
~&~\\
$ [$\emph{t-letrec}$] $& \inference{
& \Gamma[f \mapsto (\tau_x^{\varphi_x} -> \tau_r^{\varphi_r}, \varphi),x \mapsto (\tau_x, \varphi_x)],
& C |- e_1 : \tau_r^{\varphi_r}, \\
& \sigma = gen(\tau_r),
& \Gamma[f \mapsto (\sigma, \varphi_r)],
& C |- e_2 : \tau_2^{\varphi_2}
& C |- \varphi_3 \sqsupseteq \varphi_2
}
{
\Gamma,C |- \<letrec>\: f\ x = e_1\: \<in>\: e_2 : \tau_2^{ \varphi_3}
} \\
~&~\\
$ [$\emph{t-case}$] $& \inference{
& \Gamma, C |- e_0 : [\tau_0]^\varphi
& C |- \varphi_3 \sqsupseteq \varphi_1
& C |- \varphi_3 \sqsupseteq \varphi_2
}
{
\Gamma,C |- \<case>\:\ e_0\ \<of>\ \{ pat : \tau_0 \hookrightarrow e_1 : \tau_1^{\varphi_1} ; pat : \tau_0 \hookrightarrow e_2 : \tau_1^{\varphi_2} \} : \tau_2^{ \varphi_3}
} \\
~&~\\
% $ [$\emph{t-app}$] $& \inference{
% ~
% }
% {
% ~
% } \\
% ~&~\\
%TODO FIXME
\hline
\end{tabular}
\caption{Typing judgements}
\label{tab:typing-rules}
\end{centering}
\end{table}
% \subsection{Example Derivation}
% \begin{prooftree}
% \small
% \AxiomC{$asdf$}
% \RightLabel{\scriptsize{id}}
% \UnaryInfC{$A a \vdash A a$}
% \AxiomC{$instance A a \Rightarrow A (Maybe a)$}
% \RightLabel{\scriptsize{inst}}
% \BinaryInfC{$A a oplus A Maybe( a)$}
% \RightLabel{\scriptsize{closure}}
% \UnaryInfC{$A (Maybe a) oplus A (Maybe (Maybe a))$}
% \AxiomC{$$}
% \RightLabel{\scriptsize{id}}
% \UnaryInfC{$A a oplus A a$}
% \AxiomC{$instance A a \Rightarrow A (Maybe a)$}
% \RightLabel{\scriptsize{inst}}
% \BinaryInfC{$A a oplus A (Maybe a)$}
% \RightLabel{\scriptsize{trans}}
% \BinaryInfC{$A a oplus A (Maybe (Maybe a))$}
% \RightLabel{\scriptsize{closure}}
% \UnaryInfC{$A Int oplus A (Maybe (Maybe Int))$}
% \AxiomC{$asdf$}
% \RightLabel{\scriptsize{id}}
% \UnaryInfC{$B a oplus B a$}
% \AxiomC{$class (A a) \Rightarrow B a$}
% \BinaryInfC{$B a oplus A a$}
% \RightLabel{\scriptsize{closure}}
% \UnaryInfC{$B Int oplus A Int$}
% \RightLabel{\scriptsize{trans}}
% \BinaryInfC{$B Int oplus A (Maybe (Maybe Int))$}
% \end{prooftree}
\subsection{Natural semantics}
In this section, Table \ref{tab:natural-semantics}, we define how terms are
evaluated. This is also known as big-step semantics. Notice the addition of
\textbf{case}-statements for data structure use. See Table 5.4 in \cite{nnh}
for reference. Our listing isn't complete, only the additions with respect to the
aforementioned table are shown.
\begin{table}
\begin{centering}
\begin{tabular}{ll}
\hline
$ [$\emph{ns-con}$] $& $ |- c --> c$ \\ ~&~\\
$ [$\emph{ns-fn}$] $& $ |- \lambda_\pi x.t_1 --> \lambda_\pi x.t_1$ \\ ~&~\\
% $ [$\emph{ns-fn$_{rec}$}$] $& $ |- \mu f.\lambda_\pi x.t_1 --> \lambda_\pi x.(t_1[f \mapsto \lambda_\pi x.t_1 ])$ \\ ~&~\\
% $ [$\emph{ns-case$_{pair}$}$] $& \inference{|- e_0 --> (x_1,x_2) & |- x_1 --> v_1 & |- x_2 --> v_2}
% {|- (\<case>\: e_0\: \<of>\: (x_1,x_2) -> e_1) --> e_1[x_1\mapsto v_1, x_2 \mapsto v_2]} \\ ~&~\\
$ [$\emph{ns-case$_{list1}$}$] $& \inference{|- e_0 --> [] }
{|- (\<case>\: e_0\: \<of>\: [] -> e_1; (x\<:>xs) -> e_2) --> e_1} \\ ~&~\\
$ [$\emph{ns-case$_{list2}$}$] $& \inference{|- e_0 --> (x\<:>xs) & x --> v & xs --> vs }
{|- (\<case>\: e_0\: \<of>\: [] -> e_1; (x\<:>xs) -> e_2) --> e_2[x\mapsto v, xs \mapsto vs]} \\ ~&~\\
\hline
\end{tabular}
\caption{Natural semantics for the language}
\label{tab:natural-semantics}
\end{centering}
\end{table}
\section{Program design}
The program compiles to a single executable, named \emph{cfa}. It relies heavily
on the UU Attribute Grammar system. Below is a list with filenames and a short
description what every one of these files does.
\begin{description}
\item [Components.hs] Uses haskell-src-exts to parse Haskell code and convert
it into a simplified AST. It also provides functions to start the analysis.
\item [Main.hs] Main file containing the program entry point.
\item [AG.ag] Main AG file which includes all other AG files. Contains the worklist,
unification and substitution algorithms.
\item [CollectBinders.ag] Disects a quantified type into separate parts.
\item [DataTypes.ag] Contains all of our data types.
\item [FreeAnnVars.ag] Collects all free annotation variables from the AST.
\item [FreeTyVars.ag] Collects all free type variables from the AST.
\item [FreeVars.ag] Collects all free variables from the AST.
\item [Infer.ag] Contains algorithm W and the generalise and instantiate functions.
\end{description}
\section{Features and Limitations}
The assignment suggested implementing CFA with support for tracking data structure
creation, with lists, pairs and a case-statement. Our implementation supports lists and
a case statement, with pattern matching and a fall-through case. As mentioned in
the introduction, we do not offer support for pairs.
Currently we only support a monovariant analysis. One of the most prominent issues
preventing us from fully implementing a polyvariant analysis was that it was not
entirely clear yet how we should go about resolving constraints during the analysis.
In addition, the role of the qualified type (or rather, qualified constraints)
was not completely clear either. While we did have support for quantifying over
annotation variables and generating several constraints on the quantified types,
we were unfortunately forced to remove this support in order to make the monovariant
analysis work correctly. So, even though we did not manage to implement a polyvariant
analysis, we did at the very least learn how to do a polyvariant anaylsis on paper,
how to support it in the generalisation and instantiation functions and which
constraints we should generate for it.
Our case blocks are limited to support for exactly two alternatives.
Generalising to $n$ alternatives would be quite straight-forward, but was dropped
due to time constraints.
We currently know of one particular case in which our current implementation fails to
deliver the expected output:
\begin{verbatim}
let f = \lambda x -> x + 1 in
let g = \lambda y -> y * 2 in
let h = \lambda z -> z 3 in
h g + h f
\end{verbatim}
We expect the result of \emph{h} to be:
\begin{verbatim}
(Nat -> t) ^ {x,y} -> t ^ {z}
\end{verbatim}
But what we actually get is:
\begin{verbatim}
(Nat -> t) ^ {x} -> t ^ {z}
\end{verbatim}
We could not figure out within the time allotted what was causing this to happen. Although
we suspect that somewhere a substitution is going wrong or a constraint that should have been
present isn't generated.
\section{Example programs}
Example programs can be found in the \texttt{examples/} directory of the distribution,
and have an extension \texttt{.hm}. Feed the program the files via \emph{stdin}. Each file
includes comments which describe its expected analysis result, and possible peculiarities
it may have. Note the directory \texttt{ill-typed/}, in which the examples fail by design.
They include examples of infinite types, etc.
\begin{thebibliography}{9}
\bibitem{nnh}
{Flemming Nielson, Hanne Riis Nielson, Chris Hankin},
\emph{Principles of Program Analysis},
Springer, Berlin,
2nd Edition,
2005.
\end{thebibliography}
\end{document}