Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
branch: master
Fetching contributors…

Cannot retrieve contributors at this time

486 lines (395 sloc) 11.747 kb
\documentclass{beamer}
\usepackage{listings}
% \usepackage{pgfpages}
% \pgfpagesuselayout{4 on 1}[a4paper,border shrink=5mm,landscape]
\title{Reasoning about laziness}
\author{Johan Tibell\\johan.tibell@gmail.com}
\date{2011-02-12}
\begin{document}
\lstset{language=Haskell}
\frame{\titlepage}
\begin{frame}[fragile]
\frametitle{Laziness}
\begin{itemize}
\item Haskell is a lazy language
\item Functions and data constructors don't evaluate their arguments
until they need them
\begin{lstlisting}
cond :: Bool -> a -> a -> a
cond True t e = t
cond False t e = e
\end{lstlisting}
\item Same with local definitions
\begin{lstlisting}
abs :: Int -> Int
abs x | x > 0 = x
| otherwise = neg_x
where neg_x = negate x
\end{lstlisting}
\end{itemize}
\end{frame}
\begin{frame}[fragile]
\frametitle{Why laziness is important}
\begin{itemize}
\item Laziness supports \emph{modular programming}
\item Programmer-written functions instead of built-in language
constructs
\begin{lstlisting}
(||) :: Bool -> Bool -> Bool
True || _ = True
False || x = x
\end{lstlisting}
\end{itemize}
\end{frame}
\begin{frame}[fragile]
\frametitle{Laziness and modularity}
Laziness lets us separate producers and consumers and still get
efficient execution:
\begin{itemize}
\item Generate all solutions (a huge tree structure)
\item Find the solution(s) you want
\end{itemize}
\begin{lstlisting}
nextMove :: Board -> Move
nextMove b = selectMove allMoves
where
allMoves = allMovesFrom b
\end{lstlisting}
The solutions are generated as they are consumed.
\end{frame}
\begin{frame}[fragile]
\frametitle{Example: summing some numbers}
\begin{lstlisting}
sum :: [Int] -> Int
sum xs = sum' 0 xs
where
sum' acc [] = acc
sum' acc (x:xs) = sum' (acc + x) xs
\end{lstlisting}
\lstinline!foldl! abstracts the accumulator recursion pattern:
\begin{lstlisting}
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f z [] = z
foldl f z (x:xs) = foldl f (f z x) xs
sum = foldl (+) 0
\end{lstlisting}
\end{frame}
\begin{frame}[fragile]
\frametitle{A misbehaving function}
How does evaluation of this expression proceed?
\begin{lstlisting}
sum [1,2,3]
\end{lstlisting}
Like this:
\begin{verbatim}
sum [1,2,3]
==> foldl (+) 0 [1,2,3]
==> foldl (+) (0+1) [2,3]
==> foldl (+) ((0+1)+2) [3]
==> foldl (+) (((0+1)+2)+3) []
==> ((0+1)+2)+3
==> (1+2)+3
==> 3+3
==> 6
\end{verbatim}
\end{frame}
\begin{frame}
\frametitle{Thunks}
A \emph{thunk} represents an unevaluated expression.
\begin{itemize}
\item GHC needs to store all the unevaluated \lstinline!+! expressions
on the heap, until their value is needed.
\item Storing and evaluating thunks is costly, and unnecessary if the
expression was going to be evaluated anyway.
\item \lstinline!foldl! allocates \emph{n} thunks, one for each
addition, causing a stack overflow when GHC tries to evaluate the
chain of thunks.
\end{itemize}
\end{frame}
\begin{frame}[fragile]
\frametitle{Controlling evaluation order}
The \lstinline!seq! function allows to control evaluation order.
\begin{lstlisting}
seq :: a -> b -> b
\end{lstlisting}
Informally, when evaluated, the expression \lstinline!seq a b!
evaluates \lstinline!a! and then returns \lstinline!b!.
\end{frame}
\begin{frame}[fragile]
\frametitle{Weak head normal form}
Evaluation stops as soon as a data constructor (or lambda) is
reached:
\begin{verbatim}
ghci> seq (1 `div` 0) 2
*** Exception: divide by zero
ghci> seq ((1 `div` 0), 3) 2
2
\end{verbatim}
We say that \lstinline!seq! evaluates to \emph{weak head normal
form} (WHNF).
\end{frame}
\begin{frame}[fragile]
\frametitle{Weak head normal form}
Forcing the evaluation of an expression using \lstinline!seq! only
makes sense if the result of that expression is used later:
\begin{lstlisting}
let x = 1 + 2 in seq x (f x)
\end{lstlisting}
The expression
\begin{lstlisting}
print (seq (1 + 2) 3)
\end{lstlisting}
doesn't make sense as the result of \lstinline!1+2! is never used.
\end{frame}
\begin{frame}[fragile]
\frametitle{Exercise}
Rewrite the expression
\begin{lstlisting}
(1 + 2, 'a')
\end{lstlisting}
so that the component of the pair is evaluated before the pair is
created.
\end{frame}
\begin{frame}[fragile]
\frametitle{Solution}
Rewrite the expression as
\begin{lstlisting}
let x = 1 + 2 in seq x (x, 'a')
\end{lstlisting}
\end{frame}
\begin{frame}[fragile]
\frametitle{A strict left fold}
We want to evaluate the expression \lstinline!f z x! \emph{before}
evaluating the recursive call:
\begin{lstlisting}
foldl' :: (a -> b -> a) -> a -> [b] -> a
foldl' f z [] = z
foldl' f z (x:xs) = let z' = f z x
in seq z' (foldl' f z' xs)
\end{lstlisting}
\end{frame}
\begin{frame}[fragile]
\frametitle{Summing numbers, attempt 2}
How does evaluation of this expression proceed?
\begin{verbatim}
foldl' (+) 0 [1,2,3]
\end{verbatim}
Like this:
\begin{verbatim}
foldl' (+) 0 [1,2,3]
==> foldl' (+) 1 [2,3]
==> foldl' (+) 3 [3]
==> foldl' (+) 6 []
==> 6
\end{verbatim}
Sanity check:
\begin{verbatim}
ghci> print (foldl' (+) 0 [1..1000000])
500000500000
\end{verbatim}
\end{frame}
\begin{frame}[fragile]
\frametitle{Computing the mean}
A function that computes the mean of a list of numbers:
\begin{lstlisting}
mean :: [Double] -> Double
mean xs = s / fromIntegral l
where
(s, l) = foldl' step (0, 0) xs
step (s, l) a = (s+a, l+1)
\end{lstlisting}
We compute the length of the list and the sum of the numbers in one
pass.
\begin{verbatim}
$ ./Mean
Stack space overflow: current size 8388608 bytes.
Use `+RTS -Ksize -RTS' to increase it.
\end{verbatim}
Didn't we just fix that problem?!?
\end{frame}
\begin{frame}[fragile]
\frametitle{seq and data constructors}
Remember:
\begin{itemize}
\item Data constructors don't evaluate their arguments when
created
\item \lstinline!seq! only evaluates to the outmost data
constructor, but doesn't evaluate its arguments
\end{itemize}
Problem: \lstinline!foldl'! forces the evaluation of the pair
constructor, but not its arguments, causing unevaluated thunks build
up inside the pair:
\begin{verbatim}
(0.0 + 1.0 + 2.0 + 3.0, 0 + 1 + 1 + 1)
\end{verbatim}
\end{frame}
\begin{frame}[fragile]
\frametitle{Forcing evaluation of constructor arguments}
We can force GHC to evaluate the constructor arguments before the
constructor is created:
\begin{lstlisting}
mean :: [Double] -> Double
mean xs = s / fromIntegral l
where
(s, l) = foldl' step (0, 0) xs
step (s, l) a = let s' = s + a
l' = l + 1
in seq s' (seq l' (s', l'))
\end{lstlisting}
\end{frame}
\begin{frame}[fragile]
\frametitle{Bang patterns}
A \emph{bang patterns} is a concise way to express that an argument
should be evaluated.
\begin{lstlisting}
{-# LANGUAGE BangPatterns #-}
mean :: [Double] -> Double
mean xs = s / fromIntegral l
where
(s, l) = foldl' step (0, 0) xs
step (!s, !l) a = (s + a, l + 1)
\end{lstlisting}
\lstinline!s! and \lstinline!l! are evaluated before the right-hand
side of \lstinline!step! is evaluated.
\end{frame}
\begin{frame}[fragile]
\frametitle{Strictness}
We say that a function is \emph{strict} in an argument, if
evaluating the function always causes the argument to be evaluated.
\begin{lstlisting}
null :: [a] -> Bool
null [] = True
null _ = False
\end{lstlisting}
\lstinline!null! is strict in its first (and only) argument, as it
needs to be evaluated to pick a return value.
\end{frame}
\begin{frame}[fragile]
\frametitle{Strictness - Example}
\lstinline!cond! is strict in the first argument, but not in the
second and third argument:
\begin{lstlisting}
cond :: Bool -> a -> a -> a
cond True t e = t
cond False t e = e
\end{lstlisting}
Reason: Each of the two branches only evaluate one of the two last
arguments to \lstinline!cond!.
\end{frame}
\begin{frame}[fragile]
\frametitle{Strict data types}
Haskell lets us say that we always want the arguments of a
constructor to be evaluated:
\begin{lstlisting}
data PairS a b = PS !a !b
\end{lstlisting}
When a \lstinline!PairS! is evaluated, its arguments are evaluated.
\end{frame}
\begin{frame}[fragile]
\frametitle{Strict pairs as accumulators}
We can use a strict pair to simplify our \lstinline!mean! function:
\begin{lstlisting}
mean :: [Double] -> Double
mean xs = s / fromIntegral l
where
PS s l = foldl' step (PS 0 0) xs
step (PS s l) a = PS (s + a) (l + 1)
\end{lstlisting}
Tip: Prefer strict data types when laziness is not needed for your
program to work correctly.
\end{frame}
\begin{frame}[fragile]
\frametitle{Reasoning about laziness}
A function application is only evaluated if its result is needed,
therefore:
\begin{itemize}
\item One of the function's right-hand sides will be evaluated.
\item Any expression whose value is required to decide which RHS to
evaluate, must be evaluated.
\end{itemize}
By using this ``backward-to-front'' analysis we can figure which
arguments a function is strict in.
\end{frame}
\begin{frame}[fragile]
\frametitle{Reasoning about laziness: example}
\begin{lstlisting}
max :: Int -> Int -> Int
max x y
| x > y = x
| x < y = y
| otherwise = x -- arbitrary
\end{lstlisting}
\begin{itemize}
\item To pick one of the three RHS, we must evaluate \lstinline!x > y!.
\item Therefore we must evaluate \emph{both} \lstinline!x! and
\lstinline!y!.
\item Therefore \lstinline!max! is strict in both \lstinline!x! and
\lstinline!y!.
\end{itemize}
\end{frame}
\begin{frame}[fragile]
\frametitle{Poll}
\begin{lstlisting}
data BST = Leaf | Node Int BST BST
insert :: Int -> BST -> BST
insert x Leaf = Node x Leaf Leaf
insert x (Node x' l r)
| x < x' = Node x' (insert x l) r
| x > x' = Node x' l (insert x r)
| otherwise = Node x l r
\end{lstlisting}
Which arguments is \lstinline!insert! strict in?
\begin{itemize}
\item None
\item 1st
\item 2nd
\item Both
\end{itemize}
\end{frame}
\begin{frame}[fragile]
\frametitle{Solution}
Only the second, as inserting into an empty tree can be done without
comparing the value being inserted. For example, this expression
\begin{lstlisting}
insert (1 `div` 0) Leaf
\end{lstlisting}
does not raise a division-by-zero expression but
\begin{lstlisting}
insert (1 `div` 0) (Node 2 Leaf Leaf)
\end{lstlisting}
does.
\end{frame}
\begin{frame}[fragile]
\frametitle{Some other things worth pointing out}
\begin{itemize}
\item \lstinline!insert x l! is not evaluated before the
\lstinline!Node! is created, so it's stored as a thunk.
\item Most tree based data structures use strict sub-trees:
\begin{lstlisting}
data Set a = Tip
| Bin !Size a !(Set a) !(Set a)
\end{lstlisting}
\end{itemize}
\end{frame}
\begin{frame}[fragile]
\frametitle{Strict function arguments are great for performance}
\begin{itemize}
\item Strict arguments can often be passed as \emph{unboxed} values
(e.g. a machine integer in a register instead of a pointer to an
integer on the heap).
\item The compiler can often infer which arguments are stricts, but
can sometimes need a little help (like in the case of
\lstinline!insert! a few slides back).
\end{itemize}
\end{frame}
\begin{frame}[fragile]
\frametitle{Summary}
Understanding how evaluation works in Haskell is important and
requires practice.
\end{frame}
\end{document}
%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% TeX-PDF-mode: t
%%% End:
Jump to Line
Something went wrong with that request. Please try again.