Permalink
Switch branches/tags
Nothing to show
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
486 lines (316 sloc) 12 KB
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
module Algebraically where
import Control.Arrow
import Data.List (inits, tails)
{-
***
Notes and Exercises in Algebraic Program Derivation
Gershom Bazerman, May 2015
***
Tested with: GHC 7.10, GHC 7.8
The following are a set of notes and exercises to introduce the very basic notions of program derivation in the style of "The Algebra of Programming" (Bird and de Moor, 1997). That book provides the best account of this material and much more. However, there is a long and storied history of work in this field. See for instance:
"Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire" and follow on work by Meijer, Fokkinga and Paterson (1991)
"Algorithmics -- Towards programming as a mathematical activity" (Meertens, 1987)
"An Exploration of the Bird-Meertens Formalism" (Backhouse, 1989)
"Algebraic Data Types and Program Transformation" (Malcolm, 1990)
"Upwards and downwards accumulations on trees" (Gibbons, 1992)
-}
{- --- --- --- --- --- ---
"I practice till the difficult becomes easy; the easy becomes habit; and the habit becomes beautiful."
--- --- --- --- --- --- -}
--------------------------------------------------------------
-- Part 1: Fixpoints of Functors, and Algebras on Functors
-- equational reasoning requires equations
-- level up equational reasoning by studying equations more powerful than simple substitutions.
-- data types and functions on them come equipped with laws beyond the obvious.
-- strictly positive recursive types can be written as the fixpoints of "signature functors" like so.
data NatF r = ZeroF | SuccF r
data ListF a r = NilF | ConsF a r
data TreeF a r = LeafF a | BranchF r r
data FixF f = FixF {unRoll :: f (FixF f)}
-- Isomorphism between algebras as sets of things to do in each case, and functions out of a functor that encompasses all cases.
type LAlg a b = (a -> b -> b, b)
listFold :: LAlg a b -> ListF a b -> b
listFold (c, z) NilF = z
listFold (c, z) (ConsF a r) = c a r
mkListAlg :: (ListF a b -> b) -> LAlg a b
mkListAlg f = (\x xs -> f (ConsF x xs) , f NilF)
{-
Exercise: Show this is an isomorphism
Exercise: Write the same thing for Trees
-}
listMap :: (a -> b) -> ListF c a -> ListF c b
listMap f NilF = NilF
listMap f (ConsF x y) = ConsF x (f y)
fold :: (a -> b -> b, b) -> [a] -> b
fold (c, z) = foldr c z
a = ((:),[])
{-
exercise: write as pair algebras
1: Sum
2: Length
Now, write them as functor algebras (ListF Int Int -> Int)
substitute and convince yourself these are the same.
-}
-- Part 2: Universal Properties of Folds and equational reasoning
{-
universal property of folds
if h = fold f
<==>
h . listFold a = listFold f . listMap h
-}
lfa :: ListF a [a] -> [a]
lfa = listFold a
up1 :: ([a] -> b) -> LAlg a b -> ListF a [a] -> b
up1 h f = h . listFold a
up2 :: ([a] -> b) -> LAlg a b -> ListF a [a] -> b
up2 h f = listFold f . listMap h
-- Substitute h for fold f, and the following are clearly an identity
up3 :: LAlg a b -> ListF a [a] -> b
up3 f = fold f . listFold a
up4 :: LAlg a b -> ListF a [a] -> b
up4 f = listFold f . listMap (fold f)
-- the universal property shows that this goes the other direction
{-
exercise:
show that up3 and up4 are necessarily the same
-}
{-
exercise:
fold a = id
-}
{-
fold fusion:
h . listFold f = listFold g . listMap h
==>
h . fold f = fold g
-}
ff1 h f g = h . listFold f
ff2 h f g = listFold g . listMap h
{-
*Algebraically> :t ff1
ff1 :: (b -> c) -> LAlg a b -> t -> ListF a b -> c
*Algebraically> :t ff2
ff2 :: (a1 -> c) -> t -> LAlg a c -> ListF a a1 -> c
*Algebraically> :t ff1 `asTypeOf` ff2
ff1 `asTypeOf` ff2 :: (b -> c) -> LAlg a b -> LAlg a c -> ListF a b -> c
h :: b -> c
f :: LALg a b
g :: LAlg a c
h . fold f :: [a] -> c
fold g :: [a] -> c
-}
{-
Exercise: Take h to be show, f to be sum.
Determine g.
-}
sumAlg = ((+),0)
ffex1 :: ListF Integer Integer -> String
ffex1 = show . listFold sumAlg
ffex2 :: ListF Integer Integer -> String
ffex2 = g . listMap (show :: Integer -> String)
where g :: ListF Integer String -> String
g = undefined -- ?
{-
exercise: show that
listFold a . fold (mkListAlg (listMap (listFold a))) == id
fold (mkListAlg (listMap (listFold a))) . listFold a == id
-}
--------------------------------------------------------------
-- Part 3: The Banana Split Threorem
bs1, bs2, bs3, bs4, bs5, bs6 :: LAlg a b -> LAlg a c -> ListF a [a] -> (b,c)
bs1 c1 c2 = (fold c1 &&& fold c2) . listFold a
-- split fusion
bs2 c1 c2 = fold c1 . listFold a &&& fold c2 . listFold a
-- properties of fold
bs3 c1 c2 = listFold c1 . listMap (fold c1) &&& listFold c2 . listMap (fold c2)
-- split expansion
bs4 c1 c2 = listFold c1 . listMap (fst . (fold c1 &&& fold c2)) &&&
listFold c2 . listMap (snd . (fold c1 &&& fold c2))
-- functor splitting
bs5 c1 c2 = listFold c1 . listMap fst . listMap (fold c1 &&& fold c2) &&&
listFold c2 . listMap snd . listMap (fold c1 &&& fold c2)
-- split fusion (backwards)
bs6 c1 c2 = (listFold c1 . listMap fst &&& listFold c2 . listMap snd) . listMap (fold c1 &&& fold c2)
{-
by the universal property of lists, bs1 == bs6 implies that
(fold c1 &&& fold c2) === fold (listFold c1 . listMap fst &&& listFold c2 . listMap snd)
-}
bsResult1, bsResult2 :: LAlg a b -> LAlg a c -> [a] -> (b,c)
bsResult1 c1 c2 = fold c1 &&& fold c2
bsResult2 c1 c2 = fold (mkListAlg (listFold c1 . listMap fst &&& listFold c2 . listMap snd))
{-
exercise, use the banana split theorem to write a single pass average.
-}
{-
exercise: write the bs theorem as an operation on LAlg explicitly, and argue why it is the same.
-}
--------------------------------------------------------------
-- Part 4: Maximum Segment Sum
-- inspired heavily by http://www.iis.sinica.edu.tw/~scm/2010/maximum-segment-sum-origin-and-derivation/
-- A) list homomorphisms
concatList :: [[a]] -> [a]
concatList = foldr (++) []
-- concatList is a natural transformation -- i.e.
-- map f . concatList = concatList . map (map f)
{-
h :: [a] -> a
h [h xs, h ys] == h (xs ++ ys)
==>
h . concat = h . map h
In such a circumstance, we say h is a "list homomorphism" and it is a special fold that is associative and with a zero element. -- i.e. it is given by the action of a monoid.
Other homomorphism laws:
define m xs ys = h [xs,ys]
now, h = foldr m z . map (\x -> h [x])
(where foldr itself may be executed in parallel as it is a fold over a monoidal operation)
Observe: list homomorphisms are subject to _map_ and _reduce_.
-}
{-
Exercise: Give examples of list homomorphisms, and write algebras for them.
Hint: Look at Data.Monoid
-}
-- B) Scanr, inits fusion fusion
tailsAlg :: LAlg a [[a]]
tailsAlg = (go,[[]])
where go x [] = [[x]]
go x (y:ys) = (x : y) : y : ys
{-
fold tailsAlg [1,2,3,4]
[[1,2,3,4],[2,3,4],[3,4],[4],[]]
-}
-- note this is reversed from a typical scanr.
scanRight :: LAlg a b -> LAlg a [b]
scanRight (c,z) = (go,[z])
where go x [] = [c x z]
go x (y:ys) = c x y : y : ys
{-
By inspection, tails is initial with regards to rightward scans.
scanRight alg = map (fold alg) . fold tailsAlg
rememember fusion
h . listFold f = listFold g . listMap h
take:
h = map (fold alg)
f = tailsAlg
g = scanRight alg
and we see
map (fold alg) . listFold tailsAlg :: ListF a [[a]] -> [a]
listFold (scanRight alg) . listMap (map (fold alg)) :: ListF a [[a]] -> [a]
the two are equal, and therefore by fusion, scanRight alg = map (fold alg) . tailsAlg
Or:
scanr f e = map (foldr f e) . tails
-}
{-
note this is an inefficient inits and scan..
-}
initsAlg :: LAlg a [[a]]
initsAlg = (go,[])
where go x [] = [] : [x] : []
go x xs = [] : map (x:) xs
{-
*Algebraically> fold initsAlg [1,2,3,4,5]
[[1],[1,2],[1,2,3],[1,2,3,4],[1,2,3,4,5]]
-}
scanLeft :: LAlg a b -> LAlg a [b]
scanLeft (c,z) = (go,[])
where go x [] = [z,c x z]
go x ys = z : map (c x) ys
{-
inits fusion
map (fold alg) . fold initsAlg :: ListF a [[a]] -> [a]
listFold (scanLeft alg) . listMap (map (fold alg)) :: ListF a [[a]] -> [a]
these two are equal, therefore by fusion, scanLeft alg = map (fold alg) . initsAlg
Or:
scanl f e = map (foldl f e) . inits
-}
-- C) The problem
testData :: [Integer]
testData = [1,2,3,5,-100,5,1000,-1000,-4,25]
segments :: [Integer] -> [[Integer]]
segments = concat . map inits . tails
mss0 :: [Integer] -> Integer
mss0 = maximum . map sum . segments
mss1, mss2, mss3, mss4 :: [Integer] -> Integer
--inline segments
mss1 = maximum . map sum . concatList . map inits . tails
-- by naturality of concat
mss2 = maximum . concatList . map (map sum) . map inits . tails
-- by maximum being a list homomorphism
mss3 = maximum . map maximum . map (map sum) . map inits . tails
-- by map functorality / fusion
mss4 = maximum . map (maximum . map sum . inits) . tails
-- We now denote maximum . map sum . tails as maximum prefix sum
-- maximum segment sum is the maximum of the prefix sums of the tails
-- we wish to manipulate mps into a fold, so that we can use scanr fusion.
mps0,mps1,mps2,mps3,mps4 :: [Integer] -> Integer
mps0 = maximum . map sum . inits
--inits fusion
mps1 = maximum . scanl (+) 0
--expand maximum
mps2 = foldr max 0 . scanl (+) 0
--fold/scan fusion
{-
I believe the following is correct:
if h and g are list homomorphisms, and
g x (h y z) = h (g x y) (g x z) -- i.e. h distributes over g
then
foldr h z . scanr g z = foldr (\x y -> h (g x y) z)
If it is not, then it holds in this specific case, and can be carried through by reasoning directly on (+) and max.
-}
mps3 = foldr (\x y -> max 0 (x + y)) 0
zmax x y = max 0 (x + y)
mps4 = foldr zmax 0
mss5,mss6 :: [Integer] -> Integer
--subsitute in the new mps
mss5 = maximum . map (foldr zmax 0) . tails
-- tails fusion
mss6 = maximum . scanr zmax 0
-- we've gone from quadratic time to linear time!
{-
Exercise: observe that while zmax is derived from two list homomorphisms, zmax itself is not a list homomorphism. Argue why.
-}
--------------------------------------------------------------
-- Part 5: Parallel Prefix Sum
{-
Recall that inits as we gave it had complexity O(N^2).
Note that as a foldl, inits may have complexity O(N), and tails does have such complexity as given.
Can we do better?
One school says, obviously not.
Another says, think parallel!
For simplicity we work with suffix sums (generalized scanr), and assume an O(1) list concatenation
-}
{-
Exercise 1: given an arbitrary operator "op" and an arbitrary z, (scanr op z) is not a list homomorphism. Name a counterexample.
-}
{-
Recall: a list homomorphism is a function h and an operator `op` such that:
h xs `op` h ys = h (xs ++ ys) -- i.e. you can combine then translate, or translate than combine.
This induces a monoid with `op` as the action, and h [] as the unit.
Now we wish to define a function of type
scanOp :: (a -> a -> a) -> [a] -> [a] -> [a]
this function obeys the property that:
scanOp op (scanr op z xs) (scanr op z ys) = scanr op z (xs ++ ys)
-- i.e. we want scanOp to be the monoidal operation induced by "scanr op z"
-}
scanOp :: (a -> a -> a) -> [a] -> [a] -> [a]
-- obvious
scanOp op xs [] = xs
scanOp op [] ys = ys
-- scanOp op xs ys = ?
{-
By the laws of scanr, we know that:
scanr op z (x:xs) = op x (head (scanr op z xs)) : scanr op z xs
We also know that:
scanr op z (xs ++ ys) = _ + scanr op z ys
From this we can conclude that if op is associative,
scanr op z (xs ++ ys) = map (`op` (head (scanr op z ys))) (init $ scanr op z xs) ++ scanr op z ys
From this property we can write our scanOp as desired.
-}
scanOp op xs (y:ys) = map (`op` y) (init xs) ++ (y:ys)
{-
Checking some cases we see that indeed for (op,z) a monoid algebra:
foldr (scanOp op) [] . map (scanr op z) === scanr op z
and hence scanr may be written as a map reduce, and parallelized.
Note that this is not a very good parallel prefix sum, due to the map. Deriving a genuinely efficient one in this manner, yielding O(log n) complexity, takes a fair amount more work.
See: "Extracting and implementing list homomorphisms in parallel program development" (Gorlatch, 1999).
This includes an application of a generalized method to turn maximum segment sum into a homomorphism as well.
-}