\documentclass[acmsmall, nonacm, screen]{acmart}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{listings}
\usepackage{caption}
\usepackage{subcaption}
\usepackage{float}
\usepackage{xcolor}
\usepackage{stmaryrd}
\usepackage{anyfontsize}
\newif\ifdraft\draftfalse
\definecolor{green}{HTML}{298a33}
\definecolor{orange}{HTML}{995d02}
\newcommand{\outline}[1]{
\ifdraft
{\color{red}{#1}}
\fi
}
\newcommand\doverline[1]{%
\setbox0=\hbox{$\overline{#1}$}%
\ht0=\dimexpr\ht0-.30ex\relax% CHANGE .15 TO AFFECT SPACING
\overline{\copy0}%
}
\newcommand{\kw}[1]{\textsf{\color{ACMDarkBlue} #1}}
\newcommand{\ifThenElse}[3]{\textsf{\color{ACMDarkBlue}if}~#1~\textsf{\color{ACMDarkBlue}then}~#2~\textsf{\color{ACMDarkBlue}else}~#3}
\newcommand{\caseOf}[2]{\textsf{\color{ACMDarkBlue} case}~#1~\textsf{\color{ACMDarkBlue}of}~\{~#2~\}}
\newcommand{\letIn}[3]{\textsf{\color{ACMDarkBlue}let}~#1 = #2~\textsf{\color{ACMDarkBlue}in}~#3}
\newcommand{\shift}[2]{\textsf{\color{ACMDarkBlue}shift}~#1~\textsf{\color{ACMDarkBlue}in}~#2}
\newcommand{\callcc}[2]{\textsf{\color{ACMDarkBlue}call/cc}~#1~\textsf{\color{ACMDarkBlue}in}~#2}
\newcommand{\reset}[1]{\kw{reset}~#1}
\newcommand{\lambdaE}[2]{\lambda #1.\, #2}
\newcommand{\just}[1]{\textsf{Just}~#1}
\newcommand{\nothing}{\textsf{Nothing}}
\newcommand{\map}[3]{\textsf{map}^{\textsf{#1}}~#2~#3}
\newcommand{\unit}[2]{\textsf{unit}^{\textsf{#1}}~#2}
\newcommand{\join}[2]{\textsf{join}^{\textsf{#1}}~#2}
\newcommand{\cps}[1]{\mathcal{C}\llbracket #1 \rrbracket}
\newcommand{\cpsm}[1]{\mathcal{C}'\llbracket #1 \rrbracket}
\newcommand{\cpsmc}[1]{\mathcal{C}''\llbracket #1 \rrbracket}
\newcommand{\denote}[1]{\mathcal{E}\llbracket #1 \rrbracket}
\newcommand{\stringE}[1]{\textsf{\color{green} ``#1''}}
\newcommand{\quoteE}[1]{{\color{orange} \ulcorner #1 \urcorner}}
\newcommand{\unquoteE}[1]{{\color{black} \llparenthesis #1 \rrparenthesis }}
\lstset{ %
backgroundcolor=\color{white},
commentstyle=\color{ACMGreen},
keywordstyle=\color{ACMDarkBlue},
stringstyle=\color{ACMPurple},
basicstyle=\ttfamily
}
\lstdefinestyle{hs}{
language=Haskell
}
\lstdefinestyle{rkt}{
language=lisp,
deletekeywords={get},
morekeywords={define},
literate=*{(}{{\textcolor{gray}{(}}}{1}
{)}{{\textcolor{gray}{)}}}{1}
}
\AtBeginDocument{%
\providecommand\BibTeX{{%
\normalfont B\kern-0.5em{\scshape i\kern-0.25em b}\kern-0.8em\TeX}}}
\acmBooktitle{}
\begin{document}
\noindent {\bf Delimited Continuations and Monads: Exploring Abstractions from First Principles}
\noindent Submitted for WPE-II examination April 5, 2021 by Harrison Goldstein.
\bigskip
\noindent Committee: Stephanie Weirich (chair), Benjamin Pierce (adviser), Rajeev Alur
\bigskip
This report is intended for a general computer science research audience (including early PhD
students), and assumes basic familiarity with programming languages concepts (e.g., CIS 500 or
equivalent).
\vfill
\pagebreak
\title{Delimited Continuations and Monads}
\subtitle{Exploring Abstractions from First Principles}
\titlenote{
This report was compiled for part of the University of Pennsylvania's WPE-II Exam. The
accompanying talk is available on the author's website.
}
\author{Harrison Goldstein}
\email{hgo@seas.upenn.edu}
\orcid{0000-0001-9631-1169}
\affiliation{%
\institution{University of Pennsylvania}
\city{Philadelphia, PA}
\country{USA}
}
\renewcommand{\shortauthors}{Goldstein}
\begin{abstract}
In 1990, two programming abstractions were introduced independently: {\em delimited
continuations} and {\em monads}. These ideas quickly gained popularity, and over the course of
30 years they have permeated programming languages literature. Modern researchers often assume
knowledge of these abstractions when writing papers, a practice which streamlines exposition
but increases barriers to entry.
This report presents both delimited continuations and monads from first principles by following
the papers that originally popularized the ideas. First I discuss {\em Abstracting Control} by
\citet{danvy1990abstracting}, and then I explore {\em Comprehending Monads} by
\citet{wadler1990comprehending}. Though the two abstractions initially seem to have little to
do with one another, delimited continuations and monads actually have a lot in common: they can
implement many of the same design patterns, and their meta-theories are surprisingly
compatible. I leverage these connections to develop intuition and provide context for future
reading.
\end{abstract}
\maketitle
\section{Introduction} \label{sec:introduction}
Abstractions are the most powerful tools in a computer scientist's toolbox. They separate
high-level ideas from low-level details, highlighting the novel and interesting parts of a
program. Sometimes a particular abstraction becomes so ingrained in a community that members
naturally think in terms of the abstraction and use it without explanation. Broadly, this is
good---a powerful common language makes it easier to communicate ideas---but it does make it more
difficult for outsiders and new researchers to understand a body of work. This report sets out to
explain a couple of these abstractions for researchers familiar with, but not steeped in, the
programming languages literature.
The notion of {\em delimited continuations} is one such abstraction that is ingrained in the
programming languages community. These operators were first presented in {\em Abstracting
Control} by \citet{danvy1990abstracting}, and they are one of the many tools used to manipulate
program continuations.
A {\em continuation} is a first-class representation of the {\em context} around a computation.
For example, when evaluating the sub-expression ``$2$'' in the program
\[ \letIn{i}{84}{\letIn{j}{2}{i / j}}, \]
we say the context is
\[ \letIn{i}{84}{\letIn{j}{\_}{i / j}}, \]
and the continuation, $k$, is
\[ k = \lambdaE{x}{\letIn{i}{84}{\letIn{j}{x}{i / j}}}. \]
The continuation captures the context as a function, which can be used as a first-class value.
Danvy and Filinski present two operators for manipulating continuations: ``$\shift{k}{e}$'' and
``$\reset{e}$.'' The \kw{reset} operation delimits a context, and \kw{shift} packages that
context as a continuation. For example, we can use \kw{shift} and \kw{reset} to extract the
continuation from above:
\begin{align*}
& \reset{(\letIn{i}{84}{\letIn{j}{(\shift{k}{k~2})}{i / j}})} \Rightarrow \\
& \letIn{k}{(\lambdaE{x}{\letIn{i}{84}{\letIn{j}{x}{i / j}}})}{k~2}
\end{align*}
We replaced the expression $2$ with $\shift{k}{k~2}$, which says ``capture the context as $k$ and
then apply that to $2$,'' and we wrapped the whole computation in \kw{reset} to delimit the
context. The result is the same as the original expression, but the continuation is made
explicit. In that example, the \kw{shift} body just applies $k$, but it can do anything. In
particular, if $k$ is not used at all,
\[ \reset{(\letIn{i}{84}{\letIn{j}{(\shift{k}{\textsf{Nothing}})}{i / j}})} \]
the expression essentially throws an {\em exception}: the context is discarded and the result of
the entire computation is just \textsf{Nothing}. This flexibility makes delimited continuations
powerful but also notoriously difficult for newcomers to reason about.
Another abstraction that is deeply embedded in programming languages literature is the {\em
monad}. Monads are notorious for being explained poorly, to the point where people jokingly quote
``a monad is just a monoid in the category of endofunctors''---a sentence which is
incomprehensible even to most community insiders~\cite{iry_2009,mac2013categories}. The internet
is filled with monad tutorials that compare the mathematical structure to burritos (tasty, but
unhelpful) and other ``more accessible'' concepts, but few provide useful explanations. To avoid
falling into this trap of explaining monads badly, I follow here one of the earliest and clearest
explanations of monads for programming, {\em Comprehending Monads} by
\citet{wadler1990comprehending}.
Wadler presents monads using an analogy to list comprehensions like
\[
[ (x,\,y) \mid x \leftarrow [1,\, 2] ;\ y \leftarrow [3,\, 4] ],
\]
which evaluates to the Cartesian product of the lists $[1,\, 2]$ and $[3,\, 4]$. Wadler observes
that comprehension syntax can be used with more than just lists---in fact it can be used with any
monad. One of the simplest monads, called \textsf{Maybe}, is defined as \[ \textsf{data
Maybe}~\alpha = \textsf{Nothing} \mid \textsf{Just}~\alpha \] in languages like Haskell. We can
write a {\em \textsf{Maybe} comprehension} like,
\[
[ i / j \mid i \leftarrow [84]^{\textsf{Maybe}} ;\ j \leftarrow \ifThenElse{n = 0}{\textsf{Nothing}}{[n]^{\textsf{Maybe}}} ]^{\textsf{Maybe}}
\]
which uses comprehension syntax with the \textsf{Maybe} monad to abstract over {\em error
handling}. In the expression, we first bind $i$ to the value $84$. Then we check if $n$ is $0$:
if it is, the whole expression results in \textsf{Nothing}, otherwise we bind $j$ to $n$.
Finally, if the computation has not failed, we compute $i / j$. Just as list comprehensions
provide a way to sequence lists, \textsf{Maybe} comprehensions provide a way to sequence
computations that might fail. There are dozens of monads that programmers use in practice, each
of which gives a different interpretation of comprehension syntax.
Both delimited continuations and monads are interesting on their own, but why present them
together? It is not immediately clear that manipulating continuations has anything to do with
interpreting comprehension syntax; one could be forgiven for assuming that these ideas are
largely orthogonal. But we have already seen that delimited continuations and monads can both be
used to simulate exceptions, and it turns out that there is significantly more overlap. Design
patterns like state, nondeterminism, and more can be implemented using both delimited
continuations and monads. Furthermore, continuations actually form a monad, and Wadler shows that
this fact can be exploited to recover results from Danvy and Filinski.
In this report I make three contributions:
\begin{itemize}
\item I explain delimited continuations and monads by following two influential papers: {\em
Abstracting Control} by Danvy and Filinski (\S~\ref{sec:danvy}) and {\em Comprehending Monads}
by Wadler (\S~\ref{sec:wadler}).
\item I develop intuition for programming with delimited continuations by implementing a number
of the design patterns that Wadler presents as monads. This solidifies the close relationship
between the two language features (\S~\ref{sec:patterns}).
\item I present the {\em continuation monad} and show that, by performing a monad-agnostic
transformation given by Wadler, we can recover the ``extended continuation passing style''
transformation presented by Danvy and Filinski (\S~\ref{sec:contmonad}). This is a small
extension of Wadler's original result.
\end{itemize}
I conclude with references to further reading that builds on these abstractions
(\S~\ref{sec:conclusion}).
\subsection*{Notation} \label{sec:notation}
My source papers differ slightly in notation and conventions. I have chosen a middle ground and
use one unified set of conventions to discuss both papers. Throughout this report, programs are
written in a lambda calculus with syntax defined by:
\begin{align*}
e \Coloneqq &\ x \mid \lambdaE{x}{e} \mid e_1~e_2 \mid \letIn{x}{e_1}{e_2} \\
| &\ \kw{true} \mid \kw{false} \mid \ifThenElse{e_1}{e_2}{e_3} \mid n \mid e_1 + e_2 \\
| &\ (e_1,\, e_2) \mid \kw{fst}~e \mid \kw{snd}~e \mid \cdots
\end{align*}
The core calculus is made up of variables, $x$, lambda abstractions, $\lambdaE{x}{e}$, and
function application $e_1~e_2$. Let-binding makes presentation clearer, so I assume we have that
too, although formally I assume that
\[ \letIn{x}{e_1}{e_2} \]
is implemented as
\[ (\lambdaE{x}{e_2})~e_1, \]
so I often ignore it in meta-theory. Danvy and Filinski use Booleans and natural numbers to
illustrate delimited continuations; Wadler uses pairs and lightweight data structure definitions
to illustrate monads---I also use these features as needed. Since we are almost exclusively
concerned with dynamic semantics, I do not formalize a type system.
Built-in language operators are written in \kw{blue sans-serif} and defined functions are written
in \textsf{black sans-serif}.
\section{{\em Abstracting Control}} \label{sec:danvy}
Before explaining delimited continuations, I should explain some background on continuations and
first-class continuation operators. Continuations first appeared as meta-theoretical
tools~\cite{strachey2000continuations}, but before long they were introduced into concrete
languages via the {\em continuation-passing style} or CPS translation. Here is a call-by-value
CPS translation, $\cps{\cdot}$, for our basic lambda calculus with Booleans and if-statements:
\begin{align*}
\cps{x} &= \lambdaE{\kappa}{\kappa~x} \\
\cps{\lambdaE{x}{e}} &= \lambdaE{\kappa}{\kappa~(\lambdaE{x}{\cps{e}})} \\
\cps{e_1~e_2} &= \lambdaE{\kappa}{\cps{e_1}~(\lambdaE{f}{\cps{e_2}~(\lambdaE{x}{f~x~\kappa})})} \\
\cps{\kw{true}} &= \lambdaE{\kappa}{\kappa~\kw{true}} \\
\cps{\ifThenElse{e_1}{e_2}{e_3}} &= \lambdaE{\kappa}{\cps{e_1}~(\lambdaE{b}{\ifThenElse{b}{\cps{e_2}~\kappa}{\cps{e_3}~\kappa}})}
\end{align*}
This translation extracts every evaluation context---essentially every individual sub-term of a
program---as an explicit continuation function. The resulting terms do not ``return'' in the
traditional sense: when these terms finish computing, they pass the resulting value to a
continuation, $\kappa$, which is provided as an argument. A translated term can be run by
applying it to the trivial continuation, and $\cps{e}~(\lambdaE{x}{x})$ should have the same
semantics as $e$. In other words, this translation is semantics-preserving---it only changes the
structure of the program.
The CPS translation has been used extensively in compilers and interpreters, where it gives
fine-grained control over evaluation. This is because the translation makes evaluation order
explicit. This particular CPS translation is clearly call-by-value, since in the rule for
application,
\[ \cps{e_1~e_2} = \lambdaE{\kappa}{\cps{e_1}~(\lambdaE{f}{\cps{e_2}~(\lambdaE{x}{f~x~\kappa})})} \]
the argument $e_2$ is evaluated before $f$ is applied.
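
To make the translation concrete, here is a small hand-written Haskell sketch of what
CPS-translated arithmetic looks like; the names (\texttt{constC}, \texttt{addC},
\texttt{runCPS}) are my own and do not appear in the paper:
\begin{lstlisting}[style=hs]
-- A CPS-translated value takes its continuation as an extra argument.
type CPS r a = (a -> r) -> r

-- Hand-written translations of a constant and of addition.
constC :: a -> CPS r a
constC x = \kappa -> kappa x

addC :: CPS r Int -> CPS r Int -> CPS r Int
addC m n = \kappa -> m (\x -> n (\y -> kappa (x + y)))

-- Running a translated term: apply it to the trivial continuation.
runCPS :: CPS a a -> a
runCPS m = m id

-- runCPS (addC (constC 1) (constC 2))  ==>  3
\end{lstlisting}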
As CPS translations became more popular, researchers began to consider first-class abstractions
for working with continuations. The most famous of these early abstractions, ``call with current
continuation,'' or \kw{call/cc}~\cite{haynes1984continuations}, can be added as an operator in
our source language and implemented as an extension to the CPS translation:
\[ \cps{\callcc{k}{e}} = (\lambdaE{\kappa}{\cps{e}~\kappa})[k \mapsto
\lambdaE{x}{\lambdaE{\kappa'}{\kappa~x}}], \]
where $e[x \mapsto v]$ means $e$ with $v$ substituted for free instances of $x$.
The \kw{call/cc} operator is extremely powerful. The expression $\callcc{k}{e}$ captures the current
program context as $k$ and then runs $e$, allowing the programmer to arbitrarily pause and resume
evaluation as they see fit. For example,
\[ \callcc{k}{1} \]
aborts the computation, returning the value $1$, and
\[ \callcc{k}{(\kw{print}~\stringE{foo};\ k~10)} \]
pauses the computation, prints $\stringE{foo}$, and then resumes computation with the value 10.
Unfortunately, many have argued that \kw{call/cc} is {\em too} powerful. Consider the popular
Yin-Yang Puzzle:
\begin{align*}
& \kw{let}\ \textsf{yin}\ =\ (\lambdaE{c}{(\kw{print}~\stringE{@};\ c)})~(\callcc{k}{k})~\kw{in} \\
& \kw{let}\ \textsf{yang}\ =\ (\lambdaE{c}{(\kw{print}~\stringE{*};\ c)})~(\callcc{k}{k})~\kw{in} \\
& \textsf{yin}~\textsf{yang}
\end{align*}
What does this confusing mess of continuations do? Apparently it counts. This program prints
increasing numbers in a unary representation,
\[ \stringE{@*@**@***@****@*****...}, \]
but it is extremely difficult to understand exactly why that is.
\citet{felleisen1988theory} reined in the power of \kw{call/cc} with operators that he
called \kw{prompt} and \kw{control}. These allowed programmers to delimit the
effects of \kw{call/cc} with {\em scopes}. Unfortunately, Felleisen's scopes were dynamic,
and his approach did not admit a straightforward translation into a standard lambda calculus.
This was the impetus for Danvy and Filinski's statically delimited continuations.
\subsection{Delimited Continuations and ECPS}
In {\em Abstracting Control}, Danvy and Filinski introduce the \kw{shift} and \kw{reset} operations
mentioned in Section \ref{sec:introduction}, which provide a more usable alternative to
\kw{call/cc}. We extend our lambda calculus:
\[
e \Coloneqq \cdots \mid \shift{k}{e} \mid \reset{e}
\]
Like \kw{call/cc}, these operators can be interpreted via a modified CPS translation. Danvy
and Filinski call the new translation {\em extended continuation-passing style} (ECPS):
\begin{align*}
\cps{\shift{k}{e}} &= (\lambdaE{\kappa}{\cps{e}~\textsf{id}})[k \mapsto \lambdaE{x}{\lambdaE{\kappa'}{\kappa'~(\kappa~x)}}] \\
\cps{\reset{e}} &= \lambdaE{\kappa}{\kappa~(\cps{e}~\textsf{id})}
\end{align*}
The equation for $\reset{e}$ translates $e$ and extracts its final value by applying it to
\textsf{id}. This means that any continuation manipulation done by $e$ does not leak outside of
the scope of the \kw{reset}, so we say that \kw{reset} ``delimits'' the continuation. The
translation for \kw{shift} gives $e$ access to its continuation, $\kappa$, via the variable $k$.
Together, \kw{shift} and \kw{reset} provide an elegant interface for working with continuations.
For example, take the following expression and its evaluation:
\begin{align*}
& 1 + \reset{(10 + \shift{k}{k~(k~100)})} \Rightarrow \\
& 1 + (10 + (10 + 100)) \Rightarrow \\
& 121
\end{align*}
The continuation, $k = \lambdaE{x}{10 + x}$, is captured by the \kw{shift} operator, since
``$10 + \shift{k}{\dots}$'' is within a context delimited by \kw{reset}. The outer ``$1\ +$'' is
not captured because it is outside of the \kw{reset}. The function $k$ is applied twice to $100$,
resulting in $120$, and then we add $1$ to get $121$.
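
The same computation can be reproduced with a hand-rolled continuation monad in Haskell. The
sketch below mirrors the ECPS equations but is my own code, not Danvy and Filinski's:
\begin{lstlisting}[style=hs]
-- A minimal continuation monad with shift and reset (hypothetical names).
newtype Cont r a = Cont { runCont :: (a -> r) -> r }

instance Functor (Cont r) where
  fmap f (Cont m) = Cont (\k -> m (k . f))

instance Applicative (Cont r) where
  pure x = Cont (\k -> k x)
  Cont mf <*> Cont mx = Cont (\k -> mf (\f -> mx (k . f)))

instance Monad (Cont r) where
  Cont m >>= f = Cont (\k -> m (\x -> runCont (f x) k))

-- reset delimits a context; shift captures that context as a function.
reset :: Cont a a -> Cont r a
reset (Cont m) = Cont (\k -> k (m id))

shift :: ((a -> r) -> Cont r r) -> Cont r a
shift f = Cont (\k -> runCont (f k) id)

-- 1 + reset (10 + shift k in k (k 100))
example :: Int
example = 1 + runCont (reset (fmap (10 +) (shift (\k -> pure (k (k 100)))))) id
-- example ==> 121
\end{lstlisting}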
\subsection{A Denotational Semantics}
Danvy and Filinski point out that the ECPS translation, with rules for \kw{shift} and \kw{reset},
is no longer truly in continuation-passing style. In particular, the rule for \kw{shift} means
that the resulting program might not enforce strict call-by-value semantics, since the
translation of \kw{shift} introduces a redex $\kappa'~(\kappa~x)$. This could be solved by
translating the result a second time, but we can use this as an opportunity to take a step back
and look at another way of defining the semantics of our calculus with \kw{shift} and \kw{reset}.
In addition to the ECPS translation, Danvy and Filinski give a denotational translation. Whereas
the ECPS translation is purely syntactic, this transformation is {\em semantic} and maps every
program to a mathematical object in a given {\em domain}. In this case the domain contains
Booleans (since we have \kw{true} and \kw{false} as
base values) and computable functions. This is common practice in programming languages
literature and hearkens back to the origin of continuations~\cite{strachey2000continuations}.
Assuming $\textsf{Ans}$ is a suitable domain of final answers, we define:
\begin{center}
\begin{tabular}{llr}
$\rho \in \textsf{Env}$ & $=$ & $\textsf{Var} \rightharpoonup \textsf{Val}$ \\
$\gamma \in \textsf{MCont}$ & $=$ & $\textsf{Val} \to \textsf{Ans}$ \\
$\kappa \in \textsf{Cont}$ & $=$ & $\textsf{Val} \to \textsf{MCont} \to \textsf{Ans}$ \\
$\mathcal{E}$ & $:$ & $\textsf{Exp} \to \textsf{Env} \to \textsf{Cont} \to \textsf{MCont} \to \textsf{Ans}$
\end{tabular}
\end{center}
The variable environment, $\rho$, maps variables to domain values, $\gamma$ is a
``meta''-continuation that captures the continuation {\em outside} of the closest enclosing
\kw{reset}, and $\kappa$ is the continuation that we are used to (inside of the closest
\kw{reset}). The denotational semantics is given by the equations:
\begin{align*}
\denote{x}~\rho~\kappa~\gamma &= \kappa~(\rho[x])~\gamma \\
\denote{\lambdaE{x}{e}}~\rho~\kappa~\gamma &= \kappa~(\lambdaE{v}{\denote{e}~(\rho[x \mapsto v])})~\gamma \\
\denote{e_1~e_2}~\rho~\kappa~\gamma &=
\denote{e_1}~\rho~(\lambdaE{f}{\denote{e_2}~\rho~(\lambdaE{x}{f~x~\kappa})})~\gamma \\
\denote{\kw{true}}~\rho~\kappa~\gamma &= \kappa~\kw{true}~\gamma \\
\denote{\ifThenElse{e_1}{e_2}{e_3}}~\rho~\kappa~\gamma &=
\denote{e_1}~\rho~(\lambdaE{b}{\ifThenElse{b}{\denote{e_2}~\rho~\kappa}{\denote{e_3}~\rho~\kappa}})~\gamma \\
\denote{\shift{k}{e}}~\rho~\kappa~\gamma &=
\denote{e}~(\rho[k \mapsto \lambdaE{x}{\lambdaE{\kappa'}{\lambdaE{\gamma'}{\kappa~x~(\lambdaE{w}{\kappa'~w~\gamma'})}}}])~(\lambdaE{x}{\lambdaE{\gamma''}{\gamma''~x}})~\gamma \\
\denote{\reset{e}}~\rho~\kappa~\gamma &= \denote{e}~\rho~(\lambdaE{x}{\lambdaE{\gamma'}{\gamma'~x}})~(\lambdaE{x}{\kappa~x~\gamma})
\end{align*}
It is safe to ignore the $\gamma$ arguments for all rules other than the ones for \kw{shift} and
\kw{reset}, since $\gamma$ can be $\eta$-reduced away ($\lambdaE{x}{f~x}$ is equivalent to
$f$). Without $\gamma$, the first five equations look a lot like the standard CPS translation,
just lifted to a semantic domain. The meta-continuations are used in the \kw{shift} and
\kw{reset} rules to capture the outer continuation as a computable function. Remember that the
constructions on the right-hand side are part of a meta-language and represent mathematical
objects rather than concrete syntax. Our continuations are embedded in the meta-theory.
All of this is important for theoretical work on this language, but the syntactic ECPS
transformation is more useful for concrete implementations. Next, we return to the ECPS
translation and explore ways that it can be made more efficient.
\subsection{Metacircular Interpreters}
A {\em metacircular interpreter} is a powerful tool for expressing a language's
semantics~\cite{reynolds1972definitional}. In the case of our \kw{shift}--\kw{reset} language,
the metacircular interpreter looks a lot like the ECPS translation, but it does a bit more work
by translating constructs into an executable meta-language rather than generating syntax directly.
The meta-language for this interpreter is the same lambda calculus we have been working with,
but without \kw{shift} or \kw{reset}. We use a simple {\em quoting} mechanism to specify which
parts of the output are executable code and which are concrete syntax. The code in black is the
executable interpreter, while the $\quoteE{\text{quoted}}$ code is concrete syntax output. We use
$\unquoteE{\text{unquote brackets}}$ to splice computations into quoted segments. The translation
is defined as:
\begin{align*}
\cpsm{x} &= \lambdaE{\kappa}{\kappa~x} \\
\cpsm{\lambdaE{x}{e}} &=
\lambdaE{\kappa}{\kappa~\quoteE{\lambdaE{\unquoteE{x}}{\lambdaE{k}{\unquoteE{\cpsm{e}~(\lambdaE{a}{\quoteE{k~\unquoteE{a}}})}}}}} \\
\cpsm{e_1~e_2} &= \lambdaE{\kappa}{\cpsm{e_1}~(\lambdaE{f}{\cpsm{e_2}~(\lambdaE{x}{\quoteE{\unquoteE{f}~\unquoteE{x}~(\lambdaE{t}{\unquoteE{\kappa~\quoteE{t}}})}})})} \\
\cpsm{\kw{true}} &= \lambdaE{\kappa}{\kappa~\kw{true}} \\
\cpsm{\ifThenElse{e_1}{e_2}{e_3}} &= \lambdaE{\kappa}{\cpsm{e_1}~(\quoteE{\lambdaE{b}{\ifThenElse{b}{\unquoteE{\cpsm{e_2}~\kappa}}{\unquoteE{\cpsm{e_3}~\kappa}}}})} \\
\cpsm{\shift{k}{e}} &= (\lambdaE{\kappa}{\cpsm{e}~\textsf{id}})[k \mapsto \quoteE{\lambdaE{x}{\lambdaE{\kappa'}{\kappa'~\unquoteE{\kappa~\quoteE{x}}}}}] \\
\cpsm{\reset{e}} &= \lambdaE{\kappa}{\kappa~(\cpsm{e}~\textsf{id})}
\end{align*}
and the final ECPS result is given by applying $\cpsm{e}$ to the identity function.
As in the denotational semantics, the continuations are all meta-functions; there are no
$\kappa$s in the resulting programs. The difference is that the meta-functions are executed to
produce concrete syntax, rather than interpreted as abstract functions. For example, the rule for
if-statements will produce a concrete term of the form $\lambdaE{b}{\ifThenElse{\_}{\_}{\_}}$,
rather than a mathematical object. Representing continuations at the meta-level drastically
reduces the size of the resulting programs---there are no unnecessary redexes in the output.
It turns out that we can do even better if we include \kw{shift} and \kw{reset} in our
meta-language. This final ECPS translation,
\begin{align*}
\cpsmc{x} &= x \\
\cpsmc{\lambdaE{x}{e}} &= \quoteE{\lambdaE{\unquoteE{x}}{\lambdaE{\kappa}{\unquoteE{\reset{(\quoteE{\kappa~\unquoteE{\cpsmc{e}}})}}}}} \\
\cpsmc{e_1~e_2} &= \shift{\kappa}{\quoteE{\unquoteE{\cpsmc{e_1}}~\unquoteE{\cpsmc{e_2}}~(\lambdaE{t}{\unquoteE{\kappa~\quoteE{t}}})}} \\
\cpsmc{\kw{true}} &= \kw{true} \\
\cpsmc{\ifThenElse{e_1}{e_2}{e_3}} &= \kw{shift}~\kappa~\kw{in} \\
&\ \ \quoteE{\ifThenElse{\unquoteE{\cpsmc{e_1}}}{\unquoteE{\reset{(\kappa~\cpsmc{e_2})}}}{\unquoteE{\reset{(\kappa~\cpsmc{e_3})}}}} \\
\cpsmc{\shift{k}{e}} &= \shift{\kappa}{(\reset{\cpsmc{e}})[k \mapsto \quoteE{\lambdaE{x}{\lambdaE{\kappa'}{\kappa'~\unquoteE{\kappa~\quoteE{x}}}}}]} \\
\cpsmc{\reset{e}} &= \reset{\cpsmc{e}}
\end{align*}
also avoids unnecessary $\eta$-expansion (i.e., it always produces $f$ instead of
$\lambdaE{x}{f~x}$), resulting in a pleasingly compact final representation.
This final interpreter is essentially written in the same language that it interprets---this is
why metacircular interpreters are called ``meta''---and that poses a bootstrapping problem.
Luckily, we already have both the ECPS translation and our denotational semantics as descriptions
of this meta-language. The efficient interpreter can be written in this meta-language and then
that interpreter can itself be ECPS transformed into a standard lambda calculus.
\subsection{Use Case: Nondeterministic Programming} \label{sec:danvy:nondet}
We can understand this machinery better by examining a use-case for delimited continuations that
Danvy and Filinski mention in their paper: nondeterministic computations.
We start by defining operators \textsf{fail} and \textsf{flip}, which form the basis of
nondeterministic computation:
\begin{align*}
\textsf{fail}~() &= \shift{k}{\stringE{failure}} \\
\textsf{flip}~() &= \shift{k}{(k~\kw{true};\ k~\kw{false};\ \textsf{fail}~())}
\end{align*}
We can think of \textsf{flip} as flipping a coin and \textsf{fail} as signaling a dead-end
outcome.
Concretely the \textsf{fail} operation acts as an exception. It throws away the continuation and
returns the string $\stringE{failure}$. The \textsf{flip} operation calls its continuation on
{\em both} \kw{true} and \kw{false}, which simulates a nondeterministic choice between the two
options. This works because the result of the nondeterministic computation is actually a sequence
of values---all of the possible outcomes. When we call \textsf{flip}, the possible options are
\kw{true} and \kw{false}, so we continue with both. If later in the computation one of those
values is undesirable, we can call \textsf{fail}.
The \textsf{choice} operation builds on \textsf{flip} to simulate a nondeterministic choice of
positive integers up to (and including) a given value.
\[
\textsf{choice}~n = \ifThenElse{n < 1}{\textsf{fail~()}}{\ifThenElse{\textsf{flip}~()}{\textsf{choice}~(n - 1)}{n}}
\]
Calling $\textsf{choice}~2$ will continue with $1$, then $2$, and finally produce
$\stringE{failure}$ once both options are exhausted.
Finally, Danvy and Filinski implement the function
\begin{align*}
\textsf{triple}~n~s =& \\
& \kw{let}~i = \textsf{choice}~n~\kw{in} \\
& \kw{let}~j = \textsf{choice}~(i - 1)~\kw{in} \\
& \kw{let}~k = \textsf{choice}~(j - 1)~\kw{in} \\
& \ifThenElse{i + j + k = s}{(i,\, j,\, k)}{\textsf{fail}~()},
\end{align*}
which finds all triples of distinct positive integers that sum to a given integer $s$. The
complexity is pushed into \textsf{choice}, so this function is nice and readable. Evaluating the
expression
\[ \reset{(\kw{print}~(\textsf{triple}~9~15))} \]
prints all triples of integers up to $9$ that sum to $15$.
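
For readers who want to run this example, here is a sketch in Haskell, assuming the
\texttt{transformers} package for \texttt{ContT}, \texttt{shiftT}, and \texttt{evalContT}. The
names are mine, and I simplify slightly: \textsf{fail} returns unit rather than the string
$\stringE{failure}$, and \textsf{flip} omits the trailing \textsf{fail}:
\begin{lstlisting}[style=hs]
import Control.Monad.Trans.Cont (ContT, evalContT, shiftT)
import Control.Monad.Trans.Class (lift)

-- flip continues with both answers; fail discards its continuation.
flipC :: ContT () IO Bool
flipC = shiftT (\k -> lift (k True >> k False))

failC :: ContT () IO a
failC = shiftT (\_ -> return ())

choice :: Int -> ContT () IO Int
choice n
  | n < 1     = failC
  | otherwise = flipC >>= \b -> if b then choice (n - 1) else return n

triple :: Int -> Int -> ContT () IO (Int, Int, Int)
triple n s =
  choice n       >>= \i ->
  choice (i - 1) >>= \j ->
  choice (j - 1) >>= \k ->
  if i + j + k == s then return (i, j, k) else failC

-- Corresponds to reset (print (triple 9 15)); prints every solution.
main :: IO ()
main = evalContT (triple 9 15 >>= lift . print)
\end{lstlisting}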
The \textsf{triple} example is mostly a toy based on an early benchmark, but these primitives can
be practically useful. In their paper, Danvy and Filinski show an evaluator for {\em
nondeterministic finite automata} (NFAs), which is straightforward to implement thanks to
operators like \textsf{flip} and \textsf{fail}. The implementation is not especially instructive,
so I do not repeat it here.
We return to continuations later on, but for now we can take a break from continuations and look
at monads.
\section{{\em Comprehending Monads}} \label{sec:wadler}
As mentioned in Section \ref{sec:introduction}, monads are as popular as they are poorly
understood. They were originally introduced to the programming languages literature by
\citet{moggi1991notions}, who was working in {\em category theory}. Luckily, monads can be
understood without learning a new branch of mathematics, and {\em Comprehending Monads} by
\citet{wadler1990comprehending} is a great place to start.
A monad is essentially just a data structure (e.g., \textsf{List} or \textsf{Maybe}) with certain
associated operations. First, a monad must have an operation \textsf{map} that applies a function
to the ``elements'' of the data structure. For example,
\[ \map{List}{f}{\overline{x}} \]
applies $f$ to all elements of $\overline{x}$ and
\[ \map{Maybe}{f}{\overline{x}} \]
applies $f$ to the value contained in $\overline{x}$, if one exists. (I adopt Wadler's convention
of writing monadic values with a bar over the variable name.) The \textsf{map} operation must
obey two laws:
\begin{align*}
\textsf{map}~\textsf{id} &= \textsf{id} \\
\textsf{map}~(g \circ f) &= \textsf{map}~g \circ \textsf{map}~f
\end{align*}
The first law says that mapping the identity function over a structure does nothing---this is
necessary to make sure that \textsf{map} does not change the structure itself, only the elements
in the structure. The second law says that \textsf{map} is well-behaved with respect to function
composition.
A data structure with a valid \textsf{map} operation is called a {\em functor}. In order for a
functor to be a monad, it needs two more operations: \textsf{unit} and \textsf{join}. Applying
\textsf{unit} to a value injects that value into the monad structure. In the case of
\textsf{List},
\[ \unit{List}{x} = \textsf{singleton}~x = [x], \]
which is the simplest way to inject a value into a list. Likewise,
\[ \unit{Maybe}{x} = \textsf{Just}~x. \]
The \textsf{join} operation works on ``doubled up'' instances of the monad. It takes a doubled
value like
\begin{center}
$[[1, 2], [3, 4]]$\hspace{5mm}or\hspace{5mm}$\textsf{Just}~(\textsf{Just}~5)$
\end{center}
and flattens it down to a single application of the monad like
\begin{center}
$[1, 2, 3, 4]$\hspace{5mm}or\hspace{5mm}$\textsf{Just}~5$.
\end{center}
To be precise:
\begin{center}
\begin{tabular}{lll}
$\join{List}{\doverline{x}}$ & $=$ & $\textsf{flatten}~\doverline{x}$ \\
$\join{Maybe}{\doverline{x}}$ & $=$ & $\caseOf{\doverline{x}}{\textsf{Just}~(\textsf{Just}~x) \to \textsf{Just}~x;\ \_ \to \textsf{Nothing}}$
\end{tabular}
\end{center}
These operations must obey certain laws in order for the structure to truly be a monad:
\begin{align}
\textsf{map}~f \circ \textsf{unit} &= \textsf{unit} \circ f \\
\textsf{map}~f \circ \textsf{join} &= \textsf{join} \circ \textsf{map}~(\textsf{map}~f) \\
\textsf{join} \circ \textsf{unit} &= \textsf{id} \\
\textsf{join} \circ \textsf{map}~\textsf{unit} &= \textsf{id} \\
\textsf{join} \circ \textsf{join} &= \textsf{join} \circ \textsf{map}~\textsf{join}
\end{align}
Laws (1) and (2) enforce that \textsf{unit} and \textsf{join} are well-behaved with respect to
\textsf{map}. In particular, (1) is a strong statement that says that \textsf{unit} cannot {\em
do} very much---it must commute with an arbitrary function $f$ (modulo the monad structure). Law
(3) says that applying \textsf{unit} to a monadic value and then \textsf{join}-ing is a no-op.
Law (4) says the same as (3) except we apply \textsf{unit} {\em under} the monad using
\textsf{map}. Finally, law (5) says that for a triple-stacked monad (e.g., $[[[1]]]$), either way
of flattening it to a single monad (e.g., $[1]$) is the same.
Importantly, the monad laws require that \textsf{unit} and \textsf{join} are the {\em simplest}
ways to do their respective tasks. For example, suppose we tried to define:
\[ \unit{List}{x} = [x,\, x,\, x,\, x,\, x] \]
Technically this does inject a value into a list, but it is not the simplest way to do so and
\[
(\textsf{join}^{\textsf{List}} \circ \textsf{unit}^{\textsf{List}})~[3] = [3,\, 3,\, 3,\, 3,\, 3] \neq \textsf{id}~[3].
\]
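
The failure of law (3) for this \textsf{unit} is easy to check in Haskell with the built-in list
monad (the names below are my own):
\begin{lstlisting}[style=hs]
import Control.Monad (join)

-- Law (3) requires join . unit = id; the "bad" unit breaks it.
goodUnit, badUnit :: a -> [a]
goodUnit x = [x]
badUnit  x = [x, x, x, x, x]

checkGood, checkBad :: [Int]
checkGood = join (goodUnit [3])   -- ==> [3]
checkBad  = join (badUnit  [3])   -- ==> [3,3,3,3,3], so law (3) fails
\end{lstlisting}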
All of this background on monads and their operations is important, but in practice using these
operations directly does not make for particularly elegant code. In the next section, I explore
Wadler's use of {\em comprehensions} to make programming with monads much more intuitive.
\subsection{Comprehensions}
{\em Comprehending Monads} does more than just explain monads. As the name cleverly suggests, the
paper develops a notation for monadic computations derived from list comprehensions.
List comprehensions are based on set-builder notation (e.g., $\{n + 1 \mid n \in \mathbb{N}\}$)
and were first introduced to programming by the SETL language in the
1960s~\cite{schwartz2012programming}. Since then, list comprehensions have become ubiquitous,
appearing in many modern languages including Racket, Python, and even C++ (with some abuse of
operator overloading). Comprehensions are a convenient way of building lists from other lists,
for example a construction like
\[ [(x,\, y) \mid x \leftarrow \overline{x};\ y \leftarrow \overline{y}]^{\textsf{List}} \]
computes the Cartesian product of $\overline{x}$ and $\overline{y}$. The analogous set-builder
construction looks like
\[ \{(x,\, y) \mid x \in X \wedge y \in Y\}. \]
List comprehensions have the form $[e \mid q]$, where $q$ is called a {\em qualifier}. Wadler
defines the syntax of qualifiers as:
\[ q \Coloneqq \Lambda \mid x \leftarrow e \mid (q_1;\ q_2) \]
Qualifiers can be empty (denoted $\Lambda$), a binder for a variable $x$, or a composition of two
qualifiers. This notation can be a bit heavy, so Wadler usually writes ``$[e]$'' instead of ``$[e
\mid \Lambda]$,'' and he proves that qualifier composition is associative so he can write ``$[e
\mid q_1;\ q_2]$'' instead of $[e \mid (q_1;\ q_2)]$.
{\em Comprehending Monads} centers around the observation that lists are not the only structure
for which comprehensions make sense. We can characterize the behavior of list comprehensions with
the following rules:
\begin{align*}
[e \mid \Lambda] &= \textsf{singleton}~e \\
[e_1 \mid x \leftarrow e_2] &= \textsf{map}~(\lambdaE{x}{e_1})~e_2 \\
[e \mid (q_1;\ q_2)] &= \textsf{flatten}~[[e \mid q_2] \mid q_1]
\end{align*}
Notice that \textsf{singleton} is $\textsf{unit}^{\textsf{List}}$ and \textsf{flatten} is
$\textsf{join}^{\textsf{List}}$. Would using \textsf{unit} and \textsf{join} for other monads
give a reasonable way to re-interpret comprehensions?
Yes! Wadler defines a lambda calculus with generic {\em monad comprehension} syntax:
\begin{align*}
e &\Coloneqq \cdots \mid [e \mid q]^{\textsf{M}} \\
q &\Coloneqq \Lambda \mid x \leftarrow e \mid (q_1;\ q_2)
\end{align*}
For any monad \textsf{M} the monad comprehension behaves according to the following rules:
\begin{align*}
[e \mid \Lambda]^{\textsf{M}} &= \unit{M}{e} \\
[e_1 \mid x \leftarrow e_2]^{\textsf{M}} &= \map{M}{(\lambdaE{x}{e_1})}{e_2} \\
[e \mid (q_1;\ q_2)]^{\textsf{M}} &= \join{M}{[[e \mid q_2]^{\textsf{M}} \mid q_1]^{\textsf{M}}}
\end{align*}
The empty qualifier simply injects $e$ into the monad using \textsf{unit}. The binding qualifier
defines a function: $e_2$ produces some monadic value, and the result of the comprehension is
$\lambdaE{x}{e_1}$ mapped over that value. Finally, qualifier composition translates to a nested
comprehension which is flattened using \textsf{join}. Note that while Wadler opts to extend the
language with comprehensions, they could instead be implemented as syntactic sugar---in other
words, they can be implemented as surface syntax and omitted from the formal language.
Comprehensions can be used for computations like the \textsf{Maybe} comprehension from Section
\ref{sec:introduction},
\[
[ i / j \mid i \leftarrow [84]^{\textsf{Maybe}} ;\ j \leftarrow \ifThenElse{n = 0}{\textsf{Nothing}}{[n]^{\textsf{Maybe}}} ]^{\textsf{Maybe}}
\]
which cleanly handles the fact that the expression binding $j$ might fail, without introducing
significant syntactic overhead.
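
To make the desugaring concrete, here is that comprehension expanded by hand in Haskell, with
\texttt{return}, \texttt{fmap}, and \texttt{join} playing the roles of \textsf{unit},
\textsf{map}, and \textsf{join} (the function name is my own):
\begin{lstlisting}[style=hs]
import Control.Monad (join)

-- [ i / j | i <- [84]^Maybe ; j <- if n = 0 then Nothing else [n]^Maybe ]^Maybe,
-- desugared by hand following the three comprehension rules.
maybeComprehension :: Int -> Maybe Int
maybeComprehension n =
  join (fmap (\i -> fmap (\j -> i `div` j)
                         (if n == 0 then Nothing else return n))
             (return 84))

-- maybeComprehension 2 ==> Just 42
-- maybeComprehension 0 ==> Nothing
\end{lstlisting}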
\subsection{Examples of Monads} \label{sec:monad-examples}
\textsf{List} and \textsf{Maybe} are far from the only monads, and while the formal definitions
are important, the best way to understand monads is by example. Wadler presents a whole zoo of
monads in his paper; exploring those structures provides valuable intuition for how monads work
and what they can do.
For each of these monads, I define the monad data structure (with a fairly standard syntax,
similar to Haskell) and the three operations, \textsf{map}, \textsf{unit}, and \textsf{join}. I
assume that we have case-statements in our language for destructuring these data structures.
\subsubsection{List and Maybe} \label{sec:monad-examples:maybe} \hfill {\bf The Running Examples}
\begin{center}
\framebox[\textwidth]{
\begin{tabular}{lll}
$\textsf{data List}$~$\alpha$ & $=$ & $\textsf{Nil} \mid \textsf{Cons}~\alpha~(\textsf{List}~\alpha)$ \\
$\map{List}{f}{\overline{x}}$ & $=$ & $\caseOf{\overline{x}}{\textsf{Nil} \to \textsf{Nil};\ \textsf{Cons}~y~\overline{y} \to \textsf{Cons}~(f~y)~(\map{List}{f}{\overline{y}})}$ \\
$\unit{List}{x}$ & $=$ & $\textsf{singleton}~x$ \\
$\join{List}{\doverline{x}}$ & $=$ & $\textsf{flatten}~\doverline{x}$ \\
\\
$\textsf{data Maybe}$~$\alpha$ & $=$ & $\textsf{Nothing} \mid \textsf{Just}~\alpha$ \\
$\map{Maybe}{f}{\overline{x}}$ & $=$ & $\caseOf{\overline{x}}{\textsf{Nothing} \to \textsf{Nothing};\ \textsf{Just}~y \to \textsf{Just}~(f~y)}$ \\
$\unit{Maybe}{x}$ & $=$ & $\textsf{Just}~x$ \\
$\join{Maybe}{\doverline{x}}$ & $=$ & $\caseOf{\doverline{x}}{\textsf{Just}~(\textsf{Just}~x) \to \textsf{Just}~x;\ \_ \to \textsf{Nothing}}$
\end{tabular}
}
\end{center}
\medskip
We have already seen the definitions for \textsf{List} and \textsf{Maybe}: the \textsf{List}
monad coincides exactly with list comprehensions, and the \textsf{Maybe} monad can be used to
sequence computations that might fail (returning \textsf{Nothing}).
\subsubsection{Identity and Strictness} \label{sec:monad-examples:identity} \hfill {\bf The Degenerate Cases}
\begin{center}
\framebox[\textwidth]{
\begin{tabular}{lll}
$\textsf{data Id}$~$\alpha$ & $=$ & $\alpha$ \\
$\map{Id}{f}{\overline{x}}$ & $=$ & $f~\overline{x}$ \\
$\unit{Id}{x}$ & $=$ & $x$ \\
$\join{Id}{\doverline{x}}$ & $=$ & $\doverline{x}$ \\
\\
$\textsf{data Str}$~$\alpha$ & $=$ & $\alpha$ \\
$\map{Str}{f}{\overline{x}}$ & $=$ & $\ifThenElse{\overline{x} \neq \bot}{f~\overline{x}}{\bot}$ \\
$\unit{Str}{x}$ & $=$ & $x$ \\
$\join{Str}{\doverline{x}}$ & $=$ & $\doverline{x}$
\end{tabular}
}
\end{center}
\medskip
\textsf{List} and \textsf{Maybe} are relatively simple monads, but there are a couple that are even
simpler. The most basic monad is just an identity---it does nothing. This monad is almost not
worth mentioning, but it does have one cute use-case: comprehension syntax for the identity monad
can be used instead of \kw{let} binding. We can write
\[ [e_3 \mid x \leftarrow e_1;\ y \leftarrow e_2]^{\textsf{Id}} \]
instead of
\[ \letIn{x}{e_1}{\letIn{y}{e_2}{e_3}}. \]
There is no real reason to prefer this syntax, so I will continue to use \kw{let}, but it is a
fun observation.
A slight variation on the identity monad, the {\em strictness} monad, does actually have some
interesting uses. Assuming a call-by-name base language, the strictness monad provides
control over evaluation order by altering the definition of $\textsf{map}$ to force the
evaluation of the argument $\overline{x}$. The expression
\[ [e_3 \mid x \leftarrow e_1;\ y \leftarrow e_2]^{\textsf{Str}} \]
works like
\[ \letIn{!x}{e_1}{\letIn{!y}{e_2}{e_3}} \]
where $\letIn{!x}{e_1}{e_2}$ forces the evaluation of $e_1$ before assigning to $x$. \textsf{Str}
comprehensions give access to call-by-value semantics as needed, without changing the underlying
language.
\subsubsection{State} \label{sec:monad-examples:state} \hfill {\bf Pure ``Imperative'' Programming}
\begin{center}
\framebox[\textwidth]{
\begin{tabular}{lll}
$\textsf{data State}$~$\alpha$ & $=$ & $\sigma \to (\alpha,\, \sigma)$ \\
$\map{State}{f}{\overline{x}}$ & $=$ & $\lambdaE{s}{\letIn{(x,\, s')}{\overline{x}~s}{(f~x,\, s')}}$ \\
$\unit{State}{x}$ & $=$ & $\lambdaE{s}{(x,\, s)}$ \\
$\join{State}{\doverline{x}}$ & $=$ & $\lambdaE{s}{\letIn{(\overline{x},\, s')}{\doverline{x}~s}{\overline{x}~s'}}$
\end{tabular}
}
\end{center}
\medskip
The \textsf{State} monad enables code that {\em looks} stateful, even though there are no actual
references under the hood. Assuming a fixed type $\sigma$ of states, the type $\sigma \to
(\alpha,\, \sigma)$ says that each stateful computation takes a state as input and produces a
{\em result} of type $\alpha$ along with a new state.
Since \textsf{State} is such an important and interesting monad, we should look at the operations
in a bit of detail. As usual, \textsf{unit} is the simplest; it simply injects a value into a
stateful computation. The state stays the same, and the result is the argument passed to
\textsf{unit}. The \textsf{map} operation takes a computation, runs it, and then applies a
function to the result value. Finally, \textsf{join} runs a computation which results in {\em
another} computation, and then it runs that computation to get a final result. These operations
together mean that state comprehensions act like imperative code, with state handled implicitly.
As with many monads, interesting programs in the \textsf{State} monad make use of some auxiliary
operations:
\begin{center}
\begin{tabular}{lll}
$\textsf{get}$ & $=$ & $\lambdaE{s}{(s,\, s)}$ \\
$\textsf{put}~s'$ & $=$ & $\lambdaE{s}{((),\, s')}$
\end{tabular}
\end{center}
The \textsf{get} operation accesses the state, and the \textsf{put} operation sets the state.
These can be used to write a program like
\[
p = [k \mid i \leftarrow \textsf{get};\ \_ \leftarrow \textsf{put}~(i + 3);\ j \leftarrow
\textsf{get};\ \_ \leftarrow \textsf{put}~(j * 7);\ k \leftarrow \textsf{get}]^{\textsf{State}},
\]
which simulates a simple stateful computation. We can call $p~3$ (choosing $3$ as the initial
state) to get the pair $(42,\, 42)$. If we expand the comprehension syntax and monad operation
definitions,
\begin{align*}
\lambda s_0.\ & \kw{let}~(i,\, s_1) = \textsf{get}~s_0~\kw{in} \\
& \kw{let}~((),\, s_2) = \textsf{put}~(i + 3)~s_1~\kw{in} \\
& \kw{let}~(j,\, s_3) = \textsf{get}~s_2~\kw{in} \\
& \kw{let}~((),\, s_4) = \textsf{put}~(j * 7)~s_3~\kw{in} \\
& \kw{let}~(k,\, s_5) = \textsf{get}~s_4~\kw{in} \\
& (k,\, s_5),
\end{align*}
we can see that the state parameters $s_0$, $s_1$, etc. are being passed from one expression to
the next, tracking the state as it evolves.
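
For reference, here is the same monad and the program $p$ as a hand-rolled Haskell sketch
(Haskell's standard libraries ship an equivalent \texttt{Control.Monad.State}); do-notation
stands in for the comprehension syntax:
\begin{lstlisting}[style=hs]
-- The State monad, exactly as defined in the box above.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State m) = State (\s -> let (x, s') = m s in (f x, s'))

instance Applicative (State s) where
  pure x = State (\s -> (x, s))
  State mf <*> State mx =
    State (\s -> let (f, s') = mf s in let (x, s'') = mx s' in (f x, s''))

instance Monad (State s) where
  State m >>= f = State (\s -> let (x, s') = m s in runState (f x) s')

get :: State s s
get = State (\s -> (s, s))

put :: s -> State s ()
put s' = State (\_ -> ((), s'))

-- The program p from above, written with do-notation.
p :: State Int Int
p = do { i <- get; put (i + 3); j <- get; put (j * 7); get }

-- runState p 3 ==> (42, 42)
\end{lstlisting}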
\subsubsection{Reader} \label{sec:monad-examples:reader} \hfill {\bf AKA ``State Reader''}
\begin{center}
\framebox[\textwidth]{
\begin{tabular}{lll}
$\textsf{data Reader}$~$\alpha$ & $=$ & $\rho \to \alpha$ \\
$\map{Reader}{f}{\overline{x}}$ & $=$ & $\lambdaE{r}{f~(\overline{x}~r)}$ \\
$\unit{Reader}{x}$ & $=$ & $\lambdaE{r}{x}$ \\
$\join{Reader}{\doverline{x}}$ & $=$ & $\lambdaE{r}{(\doverline{x}~r)~r}$
\end{tabular}
}
\end{center}
\medskip
The \textsf{Reader} monad is a simplification of the \textsf{State} monad that does not allow the
state to be modified. Despite its relative simplicity, the \textsf{Reader} monad is quite
powerful. After defining the auxiliary operation,
\begin{center}
\begin{tabular}{lll}
$\textsf{ask}$ & $=$ & $\lambdaE{r}{r}$,
\end{tabular}
\end{center}
\textsf{Reader} can be used to keep track of configuration values, environment information, and
any other static value that would otherwise be cumbersome to pass around explicitly. Readers are
also safer and more flexible than global variables or constants, since they are implemented as
functions and thus have a clearly delimited scope.
\subsubsection{Nondeterminism} \label{sec:monad-examples:nondet} \hfill {\bf The (Power) Set Monad}
\begin{center}
\framebox[\textwidth]{
\begin{tabular}{lll}
$\textsf{data ND}$~$\alpha$ & $=$ & $\mathcal{P}(\alpha)$ \\
$\map{ND}{f}{\overline{x}}$ & $=$ & $\{f~x \mid x \in \overline{x}\}$ \\
$\unit{ND}{x}$ & $=$ & $\{x\}$ \\
$\join{ND}{\doverline{x}}$ & $=$ & $\bigcup \doverline{x}$
\end{tabular}
}
\end{center}
\medskip
The nondeterminism monad enables direct-style programming of nondeterministic algorithms, much
like the \textsf{triple} example from Section \ref{sec:danvy:nondet}. A nondeterministic value is
really a set of values, corresponding to each of the possible outcomes of a nondeterministic
computation. We define the same operators that we saw before:
\begin{center}
\begin{tabular}{lll}
$\textsf{fail}$ & $=$ & $\varnothing$ \\
$\textsf{flip}$ & $=$ & $\{\kw{true},\, \kw{false}\}$,
\end{tabular}
\end{center}
We can \textsf{fail} by returning the empty set---there are no possible outcomes---and we can
also \textsf{flip} between \kw{true} and \kw{false} as expected.
The monad operations keep track of all possible outcomes at a given point in the program, but
programs can be written as if there is only one path.
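
Here is a sketch of these operators and the \textsf{triple} function from Section
\ref{sec:danvy:nondet} in Haskell, with lists standing in for sets so the outcomes are easy to
inspect; the definitions are mine, not Wadler's:
\begin{lstlisting}[style=hs]
-- Nondeterminism via the list monad: a value is the list of possible outcomes.
failND :: [a]
failND = []

flipND :: [Bool]
flipND = [True, False]

choice :: Int -> [Int]
choice n
  | n < 1     = failND
  | otherwise = flipND >>= \b -> if b then choice (n - 1) else return n

triple :: Int -> Int -> [(Int, Int, Int)]
triple n s =
  choice n >>= \i ->
  choice (i - 1) >>= \j ->
  choice (j - 1) >>= \k ->
  if i + j + k == s then return (i, j, k) else failND

-- choice 9    ==> [1,2,3,4,5,6,7,8,9]
-- triple 9 15 ==> [(6,5,4),(7,5,3),(7,6,2),(8,4,3),(8,5,2),(8,6,1),(9,4,2),(9,5,1)]
\end{lstlisting}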
\subsubsection{Parser} \hfill {\bf Top-Down Parser Combinators}
\begin{center}
\framebox[\textwidth]{
\begin{tabular}{lll}
$\textsf{data Parser}$~$\alpha$ & $=$ & $\textsf{String} \to \textsf{List}~(\alpha,\, \textsf{String})$ \\
$\map{Parser}{f}{\overline{x}}$ & $=$ & $\lambdaE{i}{[(f~x,\, i') \mid (x,\, i') \leftarrow \overline{x}~i]^{\textsf{List}}}$ \\
$\unit{Parser}{x}$ & $=$ & $\lambdaE{i}{[(x, i)]^{\textsf{List}}}$ \\
$\join{Parser}{\doverline{x}}$ & $=$ & $\lambdaE{i}{[(x,\, i'') \mid (\overline{x},\, i') \leftarrow \doverline{x}~i;\ (x,\ i'') \leftarrow \overline{x}~i']^{\textsf{List}}}$
\end{tabular}
}
\end{center}
\medskip
Finally, here is a more complicated monad: \textsf{Parser}. Parsers can be seen as a combination
of two monads: \textsf{List} and \textsf{State}. Intuitively, the \textsf{String} argument is the
parser input, and the result is a list of potential parses, each of which is made up of a result
of type $\alpha$ and the \textsf{String} that remains after parsing.
There are a number of auxiliary operations that are useful for \textsf{Parser} (often called {\em
parser combinators}), including \textsf{satisfy} which parses a character matching a given
predicate and \textsf{many} which applies a parser multiple times. An amazing amount of research
has been done on the \textsf{Parser} monad and parser combinators~\cite{hutton1996monadic,
leijen2001parsec}, and it is still an active area~\cite{willis2020staged}.
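
To give a flavor, here is a minimal Haskell sketch of the \textsf{Parser} monad with
\textsf{satisfy} and one small derived parser; the code is my own simplification, not Wadler's
definition or any particular library's:
\begin{lstlisting}[style=hs]
import Data.Char (isDigit)

-- A parser consumes part of the input and returns all possible parses.
newtype Parser a = Parser { runParser :: String -> [(a, String)] }

instance Functor Parser where
  fmap f (Parser p) = Parser (\i -> [ (f x, rest) | (x, rest) <- p i ])

instance Applicative Parser where
  pure x = Parser (\i -> [(x, i)])
  Parser pf <*> Parser px =
    Parser (\i -> [ (f x, i'') | (f, i') <- pf i, (x, i'') <- px i' ])

instance Monad Parser where
  Parser p >>= f =
    Parser (\i -> [ r | (x, i') <- p i, r <- runParser (f x) i' ])

-- Parse a single character satisfying a predicate.
satisfy :: (Char -> Bool) -> Parser Char
satisfy ok = Parser (\i -> case i of { (c:cs) | ok c -> [(c, cs)] ; _ -> [] })

-- A two-digit number, built by sequencing parsers monadically.
twoDigits :: Parser Int
twoDigits = satisfy isDigit >>= \d1 ->
            satisfy isDigit >>= \d2 ->
            return (read [d1, d2])

-- runParser twoDigits "42abc" ==> [(42, "abc")]
\end{lstlisting}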
\subsection{Other Formulations of Monads}
Before moving on, I want to address some other formulations of monads that have become more
popular since {\em Comprehending Monads} was published.
A monad's \textsf{join} operation can be used to define a special kind of function composition
called {\em Kleisli composition}. Given a monad \textsf{M} and two functions $f: \beta \to
\textsf{M}~\gamma$ and $g: \alpha \to \textsf{M}~\beta$, we can write a function
\[ f \circ_\textsf{M} g = \textsf{join}^\textsf{M} \circ \textsf{map}^\textsf{M}~f
\circ g \]
that composes them together while properly keeping track of $\textsf{M}$. This is useful for
programming because it means that monadic operations like $f$ and $g$ can be built up
individually and composed together to build larger programs. A very similar construction can also
yield the \kw{bind} (or \textsf{>>=}) definition of monads that is more common in modern
programming languages.
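
Both constructions are one-liners in Haskell; the sketch below uses names of my own choosing:
\begin{lstlisting}[style=hs]
import Control.Monad (join)

-- Kleisli composition and bind, defined from join and fmap as in the text.
composeK :: Monad m => (b -> m c) -> (a -> m b) -> (a -> m c)
composeK f g = join . fmap f . g

bind :: Monad m => m a -> (a -> m b) -> m b
bind m f = join (fmap f m)

-- With the Maybe monad, for example:
half :: Int -> Maybe Int
half n = if even n then Just (n `div` 2) else Nothing

-- (half `composeK` half) 84 ==> Just 21
-- (half `composeK` half) 2  ==> Nothing
\end{lstlisting}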
The \textsf{bind} definition of monads admits syntactic sugar called {\em do--notation}, which is
popular in Haskell. Do--notation is actually very similar to monad comprehension syntax:
\[
\kw{do}\ \{\ i \leftarrow \unit{Maybe}{84};\ j \leftarrow
(\ifThenElse{n = 0}{\nothing}{\unit{Maybe}{n}});\ \unit{Maybe}{i/j}\ \}
\]
The distinctions between comprehension syntax and do--notation are mostly cosmetic. When using
do--notation, programmers write ``$\unit{Maybe}{e}$'' (or more idiomatically
``$\textsf{return}~e$'') instead of ``$[e]^{\textsf{M}}$.'' Also rather than put a return value at
the front of the comprehension, do--blocks evaluate to their final expression. For the remainder
of this paper I stick to comprehension syntax, but programming-focused monad examples usually use
something closer to do--notation.
Next, we examine examples of monads that are defined using delimited continuations.
\section{Common Design Patterns} \label{sec:patterns}
The monadic design patterns that Wadler presents are also a great showcase for delimited
continuations. We have already seen delimited continuations implement exceptions (\textsf{Maybe})
and nondeterminism (\textsf{ND}), but it turns out that they can be used to implement all of the
monads from the previous section.
This presentation was inspired by a blog post by \citet{xia_2019}, which shows how to implement a
variety of monads using continuation-passing. Following Xia's definitions, I present each pattern
by defining (1) the primitive operations for the pattern and (2) a \textsf{run} function that
delimits the scope of the computation using \kw{reset}.
\subsubsection{Maybe} \hfill {\bf See Section \ref{sec:monad-examples:maybe}}
\begin{center}
\framebox[\textwidth]{
\begin{tabular}{lll}
$\textsf{abort}$ & $=$ & $\shift{k}{\textsf{Nothing}}$ \\
$\textsf{run}^{\textsf{Maybe}}~c$ & $=$ & $\reset{(\textsf{Just}~c)}$
\end{tabular}
}
\end{center}
\medskip
We have already seen exceptions in the contexts of both continuations and monads,
although I have been imprecise when ``throwing an exception'' using \kw{shift}. This construction
faithfully simulates the \textsf{Maybe} monad, since its computations either fail and return
$\nothing$ or succeed with a value $v$ and return $\just{v}$.
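
As a sketch, the same construction can be written with the \texttt{Cont} monad's \texttt{shift}
and \texttt{reset} (here assumed to come from the \texttt{transformers} package); \texttt{abort}
and \texttt{runMaybe} are my names for the operations in the box:
\begin{lstlisting}[style=hs]
import Control.Monad.Trans.Cont (Cont, evalCont, reset, shift)

-- abort discards the delimited continuation; run wraps successes in Just.
abort :: Cont (Maybe r) a
abort = shift (\_ -> return Nothing)

runMaybe :: Cont (Maybe a) a -> Maybe a
runMaybe c = evalCont (reset (fmap Just c))

-- runMaybe (fmap (`div` 2) (return 84))  ==> Just 42
-- runMaybe (abort >> return 0)           ==> Nothing
\end{lstlisting}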
\pagebreak
\subsubsection{Identity and Strictness} \hfill {\bf See Section \ref{sec:monad-examples:identity}}
\begin{center}
\framebox[\textwidth]{
\begin{tabular}{lll}
$\textsf{noop}~e$ & $=$ & $\shift{k}{k~e}$ \\
$\textsf{run}^{\textsf{Id}}~c$ & $=$ & $\reset{c}$ \\
\\
$\textsf{force}~e$ & $=$ & $\shift{k}{k~(\ifThenElse{e \neq \bot}{e}{\bot})}$ \\
$\textsf{run}^{\textsf{St}}~c$ & $=$ & $\reset{c}$
\end{tabular}
}
\end{center}
\medskip
Identity can be encoded without using continuations at all, of course, but this presentation is
instructive. Here the \textsf{noop} operation shows how to do nothing using delimited
continuations---just call the continuation. All \textsf{run} needs to do is delimit the
continuation scope.
Just like with monads, we can also encode strictness (provided our source language is
non-strict). We do the same trick as before, forcing $e$ and then calling the continuation.
\subsubsection{State} \hfill {\bf See Section \ref{sec:monad-examples:state}}
\begin{center}
\framebox[\textwidth]{
\begin{tabular}{lll}
$\textsf{get}$ & $=$ & $\shift{k}{\lambdaE{s}{k~s~s}}$ \\
$\textsf{put}~s'$ & $=$ & $\shift{k}{\lambdaE{s}{k~()~s'}}$ \\
$\textsf{run}^{\textsf{State}}~c~i$ & $=$ & $\reset{((\lambdaE{v}{\lambdaE{s}{v}})~c)}~i$
\end{tabular}
}
\end{center}
\medskip
As with the state monad, encoding state with continuations is about capturing state in function
parameters and return values. Our continuations take two parameters: the first is the computation
result, and the second is the new state. The \textsf{get} operation takes in the current state
and continues with that state as both the result and the new state. The \textsf{put} operation
ignores the current state and continues with unit as the result and $s'$ as the new state. The
\textsf{run} function takes an initial state and passes it in {\em outside} the continuation
scope.
As with the monadic version, these operations can be used to simulate stateful code. For example,
this program, which is essentially the same program from Section \ref{sec:monad-examples:state},
\begin{align*}
\textsf{run}^{\textsf{State}}~(& \kw{let}~i = \textsf{get}~\kw{in} \\
& \textsf{put}~(i + 3); \\
& \kw{let}~j = \textsf{get}~\kw{in} \\
& \textsf{put}~(j * 7); \\
& \textsf{get})~3
\end{align*}
returns 42 as expected. In this case the control flow is a bit more complicated, but the
intuition is the same as for the monadic version.
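
The same encoding can be sketched in Haskell, again assuming \texttt{transformers}; as in the box
above, the answer type of the delimited computation is a function expecting the current state:
\begin{lstlisting}[style=hs]
import Control.Monad.Trans.Cont (Cont, evalCont, reset, shift)

-- get, put, and run^State from the box above (my names).
getS :: Cont (s -> r) s
getS = shift (\k -> return (\s -> k s s))

putS :: s -> Cont (s -> r) ()
putS s' = shift (\k -> return (\_ -> k () s'))

runStateC :: Cont (s -> a) a -> s -> a
runStateC c = evalCont (reset (fmap (\v _ -> v) c))

-- runStateC (getS >>= \i -> putS (i + 3) >> getS >>= \j -> putS (j * 7) >> getS) 3
--   ==> 42
\end{lstlisting}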
\subsubsection{Reader} \hfill {\bf See Section \ref{sec:monad-examples:reader}}
\begin{center}
\framebox[\textwidth]{
\begin{tabular}{lll}
$\textsf{ask}$ & $=$ & $\textsf{get}$ \\
$\textsf{run}^{\textsf{Reader}}$ & $=$ & $\textsf{run}^{\textsf{State}}$
\end{tabular}
}
\end{center}
\medskip
\textsf{Reader} is strictly simpler than \textsf{State}, but we can use the same definitions
anyway. We just rename \textsf{get} to \textsf{ask} and don't provide a \textsf{put} operation.
\subsubsection{Nondeterminism} \hfill {\bf See Section \ref{sec:monad-examples:nondet}}
\begin{center}
\framebox[\textwidth]{
\begin{tabular}{lll}
$\textsf{fail}$ & $=$ & $\shift{k}{\varnothing}$ \\
$\textsf{flip}$ & $=$ & $\shift{k}{k~\kw{true} \cup k~\kw{false}}$ \\
$\textsf{run}^{\textsf{ND}}~c$ & $=$ & $\reset{((\lambdaE{v}{\{v\}})~c)}$
\end{tabular}
}
\end{center}
\medskip
We already saw one way of encoding nondeterminism using continuations, but here is another one
that is a bit closer to the monadic presentation. We again make use of sets to define the
primitive operations \textsf{fail} and \textsf{flip}. Our implementation of \textsf{run} lifts
results into singleton sets, and our operators call the continuation on whatever values should be
considered---no values at all in the case of \textsf{fail}, which simply discards the
continuation, or both \kw{true} and \kw{false} in the case of \textsf{flip}.
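
Here is the corresponding Haskell sketch, once more with lists standing in for sets and names of
my own choosing:
\begin{lstlisting}[style=hs]
import Control.Monad.Trans.Cont (Cont, evalCont, reset, shift)

-- fail, flip, and run^ND from the box above, with lists in place of sets.
failND :: Cont [r] a
failND = shift (\_ -> return [])

flipND :: Cont [r] Bool
flipND = shift (\k -> return (k True ++ k False))

runND :: Cont [a] a -> [a]
runND c = evalCont (reset (fmap (: []) c))

-- runND (flipND >>= \b -> if b then return 1 else return 2)  ==> [1,2]
-- runND (flipND >>= \b -> if b then return 1 else failND)    ==> [1]
\end{lstlisting}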
\subsection{Other Monad Implementations}
A couple of Wadler's monad examples are missing here, but not for any fundamental reason.
\textsf{List} can be implemented using delimited continuations without issues, but there are a
number of subtly different ways to do it and the technicalities are not informative.
\textsf{Parser}s can also be done using \kw{shift} and \kw{reset}, but the construction is
complex and, again, not informative. It would seem that anything that can be done with monads can
also be done with delimited continuations.
It turns out that with enough effort, this is roughly true! A few years after {\em Abstracting
Control}, \citet{filinski1994representing} published another paper showing that delimited
continuations can represent all {\em pure} monads (roughly those whose \textsf{map}, \textsf{join},
and \textsf{bind} operations can be implemented in our base lambda calculus). While the
constructions in Section \ref{sec:patterns} are ad-hoc and dependent on the particular monad in
question, Filinski managed to give a translation that represents monadic patterns with
continuations once-and-for-all. Still, the ad-hoc connections are valuable: they provide more
granular intuition than Filinski's presentation, and they are all entirely pure, while Filinski's
formulation relies on a single reference cell.
\section{The Continuation Monad} \label{sec:contmonad}
Up to this point we have viewed delimited continuations and monads in parallel, bouncing back and
forth between the abstractions and showing how they compare. But there is one final insight from
Wadler's paper that I skipped over, and it serves as a bridge between continuations and monads.
Towards the end of {\em Comprehending Monads}, Wadler presents the {\em continuation monad}.
Given a result type, $\rho$, the continuation monad is defined as: