tibbe/talks

Switch branches/tags
Nothing to show
Fetching contributors…
Cannot retrieve contributors at this time
486 lines (395 sloc) 11.5 KB
 \documentclass{beamer} \usepackage{listings} % \usepackage{pgfpages} % \pgfpagesuselayout{4 on 1}[a4paper,border shrink=5mm,landscape] \title{Reasoning about laziness} \author{Johan Tibell\\johan.tibell@gmail.com} \date{2011-02-12} \begin{document} \lstset{language=Haskell} \frame{\titlepage} \begin{frame}[fragile] \frametitle{Laziness} \begin{itemize} \item Haskell is a lazy language \item Functions and data constructors don't evaluate their arguments until they need them \begin{lstlisting} cond :: Bool -> a -> a -> a cond True t e = t cond False t e = e \end{lstlisting} \item Same with local definitions \begin{lstlisting} abs :: Int -> Int abs x | x > 0 = x | otherwise = neg_x where neg_x = negate x \end{lstlisting} \end{itemize} \end{frame} \begin{frame}[fragile] \frametitle{Why laziness is important} \begin{itemize} \item Laziness supports \emph{modular programming} \item Programmer-written functions instead of built-in language constructs \begin{lstlisting} (||) :: Bool -> Bool -> Bool True || _ = True False || x = x \end{lstlisting} \end{itemize} \end{frame} \begin{frame}[fragile] \frametitle{Laziness and modularity} Laziness lets us separate producers and consumers and still get efficient execution: \begin{itemize} \item Generate all solutions (a huge tree structure) \item Find the solution(s) you want \end{itemize} \begin{lstlisting} nextMove :: Board -> Move nextMove b = selectMove allMoves where allMoves = allMovesFrom b \end{lstlisting} The solutions are generated as they are consumed. \end{frame} \begin{frame}[fragile] \frametitle{Example: summing some numbers} \begin{lstlisting} sum :: [Int] -> Int sum xs = sum' 0 xs where sum' acc [] = acc sum' acc (x:xs) = sum' (acc + x) xs \end{lstlisting} \lstinline!foldl! abstracts the accumulator recursion pattern: \begin{lstlisting} foldl :: (a -> b -> a) -> a -> [b] -> a foldl f z [] = z foldl f z (x:xs) = foldl f (f z x) xs sum = foldl (+) 0 \end{lstlisting} \end{frame} \begin{frame}[fragile] \frametitle{A misbehaving function} How does evaluation of this expression proceed? \begin{lstlisting} sum [1,2,3] \end{lstlisting} Like this: \begin{verbatim} sum [1,2,3] ==> foldl (+) 0 [1,2,3] ==> foldl (+) (0+1) [2,3] ==> foldl (+) ((0+1)+2) [3] ==> foldl (+) (((0+1)+2)+3) [] ==> ((0+1)+2)+3 ==> (1+2)+3 ==> 3+3 ==> 6 \end{verbatim} \end{frame} \begin{frame} \frametitle{Thunks} A \emph{thunk} represents an unevaluated expression. \begin{itemize} \item GHC needs to store all the unevaluated \lstinline!+! expressions on the heap, until their value is needed. \item Storing and evaluating thunks is costly, and unnecessary if the expression was going to be evaluated anyway. \item \lstinline!foldl! allocates \emph{n} thunks, one for each addition, causing a stack overflow when GHC tries to evaluate the chain of thunks. \end{itemize} \end{frame} \begin{frame}[fragile] \frametitle{Controlling evaluation order} The \lstinline!seq! function allows to control evaluation order. \begin{lstlisting} seq :: a -> b -> b \end{lstlisting} Informally, when evaluated, the expression \lstinline!seq a b! evaluates \lstinline!a! and then returns \lstinline!b!. \end{frame} \begin{frame}[fragile] \frametitle{Weak head normal form} Evaluation stops as soon as a data constructor (or lambda) is reached: \begin{verbatim} ghci> seq (1 div 0) 2 *** Exception: divide by zero ghci> seq ((1 div 0), 3) 2 2 \end{verbatim} We say that \lstinline!seq! evaluates to \emph{weak head normal form} (WHNF). \end{frame} \begin{frame}[fragile] \frametitle{Weak head normal form} Forcing the evaluation of an expression using \lstinline!seq! only makes sense if the result of that expression is used later: \begin{lstlisting} let x = 1 + 2 in seq x (f x) \end{lstlisting} The expression \begin{lstlisting} print (seq (1 + 2) 3) \end{lstlisting} doesn't make sense as the result of \lstinline!1+2! is never used. \end{frame} \begin{frame}[fragile] \frametitle{Exercise} Rewrite the expression \begin{lstlisting} (1 + 2, 'a') \end{lstlisting} so that the component of the pair is evaluated before the pair is created. \end{frame} \begin{frame}[fragile] \frametitle{Solution} Rewrite the expression as \begin{lstlisting} let x = 1 + 2 in seq x (x, 'a') \end{lstlisting} \end{frame} \begin{frame}[fragile] \frametitle{A strict left fold} We want to evaluate the expression \lstinline!f z x! \emph{before} evaluating the recursive call: \begin{lstlisting} foldl' :: (a -> b -> a) -> a -> [b] -> a foldl' f z [] = z foldl' f z (x:xs) = let z' = f z x in seq z' (foldl' f z' xs) \end{lstlisting} \end{frame} \begin{frame}[fragile] \frametitle{Summing numbers, attempt 2} How does evaluation of this expression proceed? \begin{verbatim} foldl' (+) 0 [1,2,3] \end{verbatim} Like this: \begin{verbatim} foldl' (+) 0 [1,2,3] ==> foldl' (+) 1 [2,3] ==> foldl' (+) 3 [3] ==> foldl' (+) 6 [] ==> 6 \end{verbatim} Sanity check: \begin{verbatim} ghci> print (foldl' (+) 0 [1..1000000]) 500000500000 \end{verbatim} \end{frame} \begin{frame}[fragile] \frametitle{Computing the mean} A function that computes the mean of a list of numbers: \begin{lstlisting} mean :: [Double] -> Double mean xs = s / fromIntegral l where (s, l) = foldl' step (0, 0) xs step (s, l) a = (s+a, l+1) \end{lstlisting} We compute the length of the list and the sum of the numbers in one pass. \begin{verbatim} \$ ./Mean Stack space overflow: current size 8388608 bytes. Use +RTS -Ksize -RTS' to increase it. \end{verbatim} Didn't we just fix that problem?!? \end{frame} \begin{frame}[fragile] \frametitle{seq and data constructors} Remember: \begin{itemize} \item Data constructors don't evaluate their arguments when created \item \lstinline!seq! only evaluates to the outmost data constructor, but doesn't evaluate its arguments \end{itemize} Problem: \lstinline!foldl'! forces the evaluation of the pair constructor, but not its arguments, causing unevaluated thunks build up inside the pair: \begin{verbatim} (0.0 + 1.0 + 2.0 + 3.0, 0 + 1 + 1 + 1) \end{verbatim} \end{frame} \begin{frame}[fragile] \frametitle{Forcing evaluation of constructor arguments} We can force GHC to evaluate the constructor arguments before the constructor is created: \begin{lstlisting} mean :: [Double] -> Double mean xs = s / fromIntegral l where (s, l) = foldl' step (0, 0) xs step (s, l) a = let s' = s + a l' = l + 1 in seq s' (seq l' (s', l')) \end{lstlisting} \end{frame} \begin{frame}[fragile] \frametitle{Bang patterns} A \emph{bang patterns} is a concise way to express that an argument should be evaluated. \begin{lstlisting} {-# LANGUAGE BangPatterns #-} mean :: [Double] -> Double mean xs = s / fromIntegral l where (s, l) = foldl' step (0, 0) xs step (!s, !l) a = (s + a, l + 1) \end{lstlisting} \lstinline!s! and \lstinline!l! are evaluated before the right-hand side of \lstinline!step! is evaluated. \end{frame} \begin{frame}[fragile] \frametitle{Strictness} We say that a function is \emph{strict} in an argument, if evaluating the function always causes the argument to be evaluated. \begin{lstlisting} null :: [a] -> Bool null [] = True null _ = False \end{lstlisting} \lstinline!null! is strict in its first (and only) argument, as it needs to be evaluated to pick a return value. \end{frame} \begin{frame}[fragile] \frametitle{Strictness - Example} \lstinline!cond! is strict in the first argument, but not in the second and third argument: \begin{lstlisting} cond :: Bool -> a -> a -> a cond True t e = t cond False t e = e \end{lstlisting} Reason: Each of the two branches only evaluate one of the two last arguments to \lstinline!cond!. \end{frame} \begin{frame}[fragile] \frametitle{Strict data types} Haskell lets us say that we always want the arguments of a constructor to be evaluated: \begin{lstlisting} data PairS a b = PS !a !b \end{lstlisting} When a \lstinline!PairS! is evaluated, its arguments are evaluated. \end{frame} \begin{frame}[fragile] \frametitle{Strict pairs as accumulators} We can use a strict pair to simplify our \lstinline!mean! function: \begin{lstlisting} mean :: [Double] -> Double mean xs = s / fromIntegral l where PS s l = foldl' step (PS 0 0) xs step (PS s l) a = PS (s + a) (l + 1) \end{lstlisting} Tip: Prefer strict data types when laziness is not needed for your program to work correctly. \end{frame} \begin{frame}[fragile] \frametitle{Reasoning about laziness} A function application is only evaluated if its result is needed, therefore: \begin{itemize} \item One of the function's right-hand sides will be evaluated. \item Any expression whose value is required to decide which RHS to evaluate, must be evaluated. \end{itemize} By using this backward-to-front'' analysis we can figure which arguments a function is strict in. \end{frame} \begin{frame}[fragile] \frametitle{Reasoning about laziness: example} \begin{lstlisting} max :: Int -> Int -> Int max x y | x > y = x | x < y = y | otherwise = x -- arbitrary \end{lstlisting} \begin{itemize} \item To pick one of the three RHS, we must evaluate \lstinline!x > y!. \item Therefore we must evaluate \emph{both} \lstinline!x! and \lstinline!y!. \item Therefore \lstinline!max! is strict in both \lstinline!x! and \lstinline!y!. \end{itemize} \end{frame} \begin{frame}[fragile] \frametitle{Poll} \begin{lstlisting} data BST = Leaf | Node Int BST BST insert :: Int -> BST -> BST insert x Leaf = Node x Leaf Leaf insert x (Node x' l r) | x < x' = Node x' (insert x l) r | x > x' = Node x' l (insert x r) | otherwise = Node x l r \end{lstlisting} Which arguments is \lstinline!insert! strict in? \begin{itemize} \item None \item 1st \item 2nd \item Both \end{itemize} \end{frame} \begin{frame}[fragile] \frametitle{Solution} Only the second, as inserting into an empty tree can be done without comparing the value being inserted. For example, this expression \begin{lstlisting} insert (1 div 0) Leaf \end{lstlisting} does not raise a division-by-zero expression but \begin{lstlisting} insert (1 div` 0) (Node 2 Leaf Leaf) \end{lstlisting} does. \end{frame} \begin{frame}[fragile] \frametitle{Some other things worth pointing out} \begin{itemize} \item \lstinline!insert x l! is not evaluated before the \lstinline!Node! is created, so it's stored as a thunk. \item Most tree based data structures use strict sub-trees: \begin{lstlisting} data Set a = Tip | Bin !Size a !(Set a) !(Set a) \end{lstlisting} \end{itemize} \end{frame} \begin{frame}[fragile] \frametitle{Strict function arguments are great for performance} \begin{itemize} \item Strict arguments can often be passed as \emph{unboxed} values (e.g. a machine integer in a register instead of a pointer to an integer on the heap). \item The compiler can often infer which arguments are stricts, but can sometimes need a little help (like in the case of \lstinline!insert! a few slides back). \end{itemize} \end{frame} \begin{frame}[fragile] \frametitle{Summary} Understanding how evaluation works in Haskell is important and requires practice. \end{frame} \end{document} %%% Local Variables: %%% mode: latex %%% TeX-master: t %%% TeX-PDF-mode: t %%% End: