# Functional Programming Workshop

This is a short dive into FP concepts and the `functools` module of the Python std lib. We'll focus on things that aren't too Python-specific, and touch on the ideas behind popular async and parallel execution strategies along the way.

This document is organised as follows.
  1. Introduction to (Pure) Functions
  2. Higher order functions (with an emphasis on the ideas behind Map/Reduce)
  3. Functional Design & Patterns
  4. Control Flow


Throughout, remember that programming paradigms (Object-Oriented, Procedural, Functional, etc.) are just stylistic recommendations. An accomplished artist understands the rules of their craft and may, if they choose, intentionally break the rules to produce a desired effect. 

# 1) Functions

## 1.1) Anonymous Functions (lambda)

Python functions are canonically defined with

`def fname(fargs): ...`

but in many circumstances it will be useful to create use-once-and-throw-away functions, for which coming up with names is not necessary.

In [1]:
f = (lambda x: x+1) # implicit return, fits on one line. Parentheses optional, but more readable.
print(f(3)) # you can call them like regular functions

print((lambda x : -3*x+7)(4)) # Once again, but now without polluting the global namespace

del f

4
-5


The most common uses of anonymous functions are (i) mapping a throw-away function over a collection, and (ii) filtering collection elements using a throw-away predicate function (i.e. a function with type `[?] -> Boolean`). Here's a little spoiler of what's to come:

In [2]:
a = map((lambda x:x+6), [0,1,2,3,4,5,6,7,8,9]) # apply the "add six" function to each element
b = filter((lambda x: x%2 == 0), a) # only keep elements that are even numbers

#print(list(a))

for x in b:
    print(x)
    
del a, b

6
8
10
12
14


## 1.2) Pure Functions

The foundation of Functional programming is... well... functions. The distinction between functions and procedures is that a function computes and returns a value, while a procedure executes commands (typically to modify its surroundings). Between pure functions and pure procedures, there's a grey area of so-called *impure* functions.

In [3]:
import time, random
state = 3

def impure_assign (x):
    global state          # Notice that these
    state = x             # procedures aren't...
    
def impure_output (x):    # ... even using the
    print(x)              # return statement!
    
def impure_error ():      # Side effects are basically 
    temp = "str"+42       # anything except computing
    return temp           # and returning a value.

def impure_wait (x):
    time.sleep(x)
    return x

def impure_nondeterministic (x):
    global state         # I have no idea what the global
    return state + x     # state will be when I run this.

# If you don't modify your surroundings eventually, you're basically just
# using your CPU as an electric heater. But, you can use impurity responsibly.

# Functional programming is about avoiding code like this:
state = random.random()
impure_assign(impure_wait(impure_nondeterministic(1)));
impure_output(impure_output(state));

del state
del impure_assign, impure_output, impure_wait, impure_error, impure_nondeterministic

1.1015932908462354
None


A function is *pure* when it promises not to do anything except compute and return some value. It's possible to make finer distinctions between kinds and degrees of side effects, and what kinds of properties functions gain or lose by engaging in them (e.g. parallelisability, referential transparency), but that's a bit too involved for this first incursion.

## 1.3) Contrast with the Imperative style

In functional programming, you tell the computer _what_ to do instead of _how_ to do it. It's up to language designers and compiler writers to figure out _how_ to do _what_ you said.

One consequence of this is that you might not even know (or care) how the computer is producing what you asked to compute. In OOP, data is encapsulated (in objects) and functionality is exposed (in methods). In FP, it's the opposite: data is public and functionality is encapsulated (in functions).

In [4]:
# Avoiding side effects whenever possible makes programming a lot easier.

# Compare imperative (+ side effects):
a = []
b = [1,2,3,4,5,6,7,8,9,10] 
i = b[0]
j = 0
while i <= 5:
    if i % 2 == 0:
        j += 1
    else:
        a.append(i)
        b.remove(i)
    i = b[j]
print(a,b)

del a, b, i, j

# vs declarative (no side effects):
small = (lambda x: x<=5)
odd = (lambda x: x%2!=0)
ls = [1,2,3,4,5,6,7,8,9,10]

# python has syntactic sugar for map/filter: comprehensions
a = [x for x in ls if small(x) and odd(x)]
b = [x for x in ls if not (small(x) and odd(x))]
print(a,b)

del small, odd, ls, a, b

[1, 3, 5] [2, 4, 6, 7, 8, 9, 10]
[1, 3, 5] [2, 4, 6, 7, 8, 9, 10]


These two approaches have their benefits and their tradeoffs. The first approach is frugal with memory and stops early -- it terminates as soon as it reaches a number in `b` that isn't small, so on this pre-sorted input it never looks at the large numbers (although each `b.remove` still has to shift the rest of the list) -- whereas the second approach always scans the whole list (_O(n)_), but still works on lists that haven't been pre-sorted, and is parallelisable (more on this later).

The approach I'd recommend is not to ban all side-effects, but to limit the use of side effects to the inside of functions, so that the functions you're using are both pure and efficient. That said, generally you should try to stay declarative throughout and only tell the computer _how_ to do _what_ you want when you think you know better than the language designer / compiler. And that should not be very often.
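
As a sketch of what "side effects on the inside" might look like, the hypothetical `partition` helper below (not part of the original workshop) mutates two local lists for speed but never touches its input or any global state, so callers can still treat it as pure.

```python
def partition(pred, xs):
    """Split xs into (matching, non_matching) in a single pass."""
    yes, no = [], []                         # local state: invisible to the caller
    for x in xs:
        (yes if pred(x) else no).append(x)   # the mutation stays inside the function
    return yes, no

ls = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
a, b = partition(lambda x: x <= 5 and x % 2 != 0, ls)
print(a, b)   # the input list `ls` is left unchanged
```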


# 2) Higher-Order Functions (HOFs)

Earlier, you saw a spoiler of the `map` and `filter` functions in Python. These are instances of higher-order functions.

In [5]:
# functions can be consumed and returned as values in other functions:

def derivative (f, epsilon=0.01):
    return (lambda x: (f(x+epsilon)-f(x-epsilon))/(2*epsilon))

f = lambda x: 2*x**2+x+4
df = derivative(f)
ddf = derivative(df)

print(round(df(3)))
print(round(df(4)))

print(round(ddf(0)))

del derivative, f, df, ddf

13
17
4


In [6]:
# map is no different: just a function that consumes a function to compute its value. 
def my_map(f, args):
    return [f(x) for x in args] # `map` is basically the same idea as comprehensions and `for` loops

my_map(print, [1,2,3,":-)\n"])


# Watch out: subtle scoping difference (closures look up `i` when called, not when created)
adderlist1 = [(lambda n: i + n) for i in [0,1,2,3,4,5,6,7,8,9,10]]
adderlist2 = my_map((lambda i: (lambda n: n+i)),[0,1,2,3,4,5,6,7,8,9,10])

print(adderlist1[2](3), "<-- bug in Python list comprehension scoping?")
print(adderlist1[5](1), "<-- bug in Python list comprehension scoping?")
print(adderlist2[2](3), " <-- expected answer")
print(adderlist2[5](1), " <-- expected answer")

del adderlist1, adderlist2, my_map

1
2
3
:-)

13 <-- bug in Python list comprehension scoping?
11 <-- bug in Python list comprehension scoping?
5  <-- expected answer
6  <-- expected answer
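
The surprising `13` and `11` above are not a bug: a lambda looks up `i` when it is called, not when it is created, and by then the comprehension's loop variable has reached its final value. A common workaround, sketched here with the hypothetical names `late` and `bound`, is to capture the current value as a default argument.

```python
late = [(lambda n: i + n) for i in range(11)]         # every lambda sees the final i (10)
bound = [(lambda n, i=i: i + n) for i in range(11)]   # the default argument freezes i per lambda

print(late[2](3))    # 13, because i is 10 by the time we call
print(bound[2](3))   # 5, because i was captured as 2
```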


In [7]:
def flip (f):
    return (lambda *args: f(*reversed(args)))

vec = (lambda *args: list(map(str,args)))

print(vec(4,2,5))
print(flip(vec)(4,2,5))
del vec, flip

['4', '2', '5']
['5', '2', '4']


In [8]:
# Exercise: Write a function that checks whether an input function int -> [?] returns a boolean or not

f = None # <your answer here>

# Unit tests:

assert f(lambda x: x%2==0) == True
assert f(lambda x: -3*x +7) == False
assert f(str)==False
print("Success!")

TypeError: 'NoneType' object is not callable

In [9]:
# Solution









# Don't peek!











f = (lambda f: type(f(42)) == type(True)) # Not the most elegant, but it passes the tests...

assert f(lambda x: x%2==0) == True
assert f(lambda x: -3*x +7) == False
assert f(str)==False

## 2.2) Currying and Composing

Software development is easiest when you're reusing code instead of rewriting it. In Functional programming, the code you focus on reusing is ... well, functions.

In OOP, you'd take a general class and subclass it to get your specialised functionality. In FP, you specialise a general function by fixing some of its inputs. This is known as partial application, and is closely related to Currying (named after a person, no relation to food); this workshop uses the two terms interchangeably.
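
Strictly speaking, the two terms differ slightly: partial application fixes some arguments of a function, while currying rewrites an n-argument function as a chain of one-argument functions. The sketch below illustrates the difference for a two-argument function; `curry2` is a hypothetical helper written for this example, not something `functools` provides.

```python
from functools import partial

def curry2(f):
    """Turn f(x, y) into a chain of one-argument functions: curry2(f)(x)(y)."""
    return lambda x: lambda y: f(x, y)

add = lambda x, y: x + y

inc_partial = partial(add, 1)   # partial application: fix x=1 now, pass y later
inc_curried = curry2(add)(1)    # currying: feed the arguments one at a time

print(inc_partial(4), inc_curried(4))
```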

In [10]:
# Currying a function
from functools import partial

# `partial` consumes a general function, and returns a specialised function with some of its arguments fixed.

add = (lambda x,y : x+y)
neq = (lambda x,y : x != y)

inc = partial(add, 1) # incrementing is a special kind of adding
print(inc(4))

warn = partial(print,"WARNING:") # warning is a special kind of printing
warn("Don't feed the working students!")

rm_zeroes = partial(filter, partial(neq,0)) # cleaning a dataset
print(list(rm_zeroes([-1,0,1])))

del inc, warn, neq

5
WARNING: Don't feed the working students!
[-1, 1]


In [11]:
# The most important HOF (IMHO) is composition.
# Because Guido doesn't like functional programming, we'll have to write it ourselves:

# Given two functions f(y) and g(x), return fog(x) = f(g(x))
# Notice: the innermost function (g) is called first, then the leftmost function (f) is called on the result
def compose(f,g):
    return (lambda *args, **kwargs :f(g(*args, **kwargs)))

fog = compose(lambda y:y*2,lambda x: x+1)
alt = compose(lambda x:x+1,lambda y:y*2)

print(fog(2),alt(2))

from math import isfinite
rm_Nans = partial(filter,isfinite)

f = compose(list,compose(rm_Nans,rm_zeroes)) # read from right to left
print(f([-1,0,1,float('Nan'),float('+inf')]))

del fog, alt, rm_Nans, f

6 5
[-1, 1]


As a subjective estimate, 80% of functional programming consists of composing and currying functions. It is so common that the functional language [Haskell](https://www.haskell.org/) makes both operations almost invisible: applying a function to only some of its arguments is written with plain whitespace, and composition is a single dot (`.`).

Writing functions using composition and partial application of other functions (instead of defining them with lambdas) is known as *tacit* functional programming, or _function-level_ programming. It's functional programming in its purest form: your code won't even need explicit variables any more!

In [12]:
# Exercise: Write a function for the polynomial -3x + 7 using only
# partial application and composition of the following functions:

add = (lambda x,y:x+y)
mult = (lambda x,y:x*y)

# Do NOT use `def` or `lambda` in your solution.

f = None # <Your answer here>

# Unit tests
assert f(0)==7
assert f(1)==4
assert f(4)==-5
print("Success!")

TypeError: 'NoneType' object is not callable

In [13]:
# Solution









# Don't peek!










add = (lambda x,y:x+y)
mult = (lambda x,y:x*y)
f = compose(partial(add,7), partial(mult,-3))

assert f(0)==7
assert f(1)==4
assert f(4)==-5

## 2.3) Reducing functions

Another important HOF is `fold`, aka `reduce` (of map/reduce fame).

In [14]:
# Because Guido doesn't like functional programming,
# reduce was removed from the core namespace in Py3.

from functools import reduce

foldl = (lambda f, acc, xs: reduce(f,xs,acc))
foldr = (lambda f, acc, xs: reduce((lambda x, y: f(y, x)), reversed(xs), acc))

def order (x,y):
    return "["+x+" and "+y+"]"

alphabet = ["A","B","C","D"]

print(foldl(order,"<breathe-in>",alphabet))
print(foldr(order,"<stop>",alphabet,))
del order, alphabet

[[[[<breathe-in> and A] and B] and C] and D]
[A and [B and [C and [D and <stop>]]]]


As you can see, the results of folding are nested (from the left or right, up to you!), which means the function we're calling in the reduction gets to see -- and compute values using -- the results of previous calls to itself. This implicit self-reference allows very high-level ideas to be encapsulated with folds.
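
To illustrate how much a fold can express, here is a sketch that rebuilds a few familiar operations from `functools.reduce` alone. The helper names (`maximum`, `reverse`, `flatten`) are made up for this example.

```python
from functools import reduce

# each step sees the accumulated result of all previous steps
maximum = lambda xs: reduce(lambda acc, x: x if x > acc else acc, xs)
reverse = lambda xs: reduce(lambda acc, x: [x] + acc, xs, [])
flatten = lambda xss: reduce(lambda acc, xs: acc + xs, xss, [])

print(maximum([3, 1, 4, 1, 5]))
print(reverse([1, 2, 3]))
print(flatten([[1, 2], [3], [4, 5]]))
```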

In [15]:
# Exercise: write the `average` function using a reduction. You may need the following function:

add = (lambda x,y : x+y)

f = None #<your answer here>

#Unit tests
assert f([42]) == 42
assert f([-1,0,1]) == 0
assert f([0,1])==0.5
print("Success!")

TypeError: 'NoneType' object is not callable

In [16]:
# Solution









# Don't peek!









add = (lambda x,y:x+y)
f = lambda ls: foldl(add,0,ls)/len(ls)

assert f([42]) == 42
assert f([-1,0,1]) == 0
assert f([0,1])==0.5

In [17]:
# folds are the generalisation of a for-loop with an accumulator:
add = (lambda x,y:x+y)
foldl((lambda acc,x: acc+[x+acc[-1]]), [0], [1,2,3,4,5])

[0, 1, 3, 6, 10, 15]

In [18]:
# folds are the generalisation of filter
even = (lambda x: x%2==0)
foldr((lambda x,acc: ([x]+acc if even(x) else acc)) ,[], [1,2,3,4,5,6,7,8])

[2, 4, 6, 8]

In [19]:
# folds are the generalisation of map
f = lambda s: "f("+str(s)+")"
foldl((lambda acc, x: acc+[f(x)]), [], [1,2,3,5])

['f(1)', 'f(2)', 'f(3)', 'f(5)']

One thing that makes `map` and `filter` special amongst other reducing functions is that the computations they perform can be parallelised (if the functions they're folding are pure). Another special thing about them is that they are actually in the core Python namespace! But, because Guido doesn't like functional programming, [they were almost removed from Py3](https://www.artima.com/weblogs/viewpost.jsp?thread=98196).

Given that `map`ping and `filter`ing are special cases of `fold`ing, it should be clear that some subset of all folds can be parallelised. This is possible whenever the function you're `fold`ing is [associative](https://en.wikipedia.org/wiki/Operator_associativity), i.e. grouping the calls from the left or from the right gives the same result. Mapping and filtering are parallelisable because list-joining is associative:

In [20]:
[1]+[2]+[3]+[4] == ([1]+[2])+([3]+[4]) == [1,2]+[3,4] == [1,2,3,4]

True

Sums and products are also associative, incl. their generalisations to quaternions (used in 3D graphics, where this kind of structure helps GPUs do their work in parallel!), booleans, sets, ... , as well as string concatenation, matrix multiplication, ...
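
One way to see why associativity matters is to reduce a list in independent chunks and then combine the partial results, which is what a parallel reduction would do. The sketch below runs sequentially and uses a hypothetical helper `chunked_reduce`; it only demonstrates that regrouping is harmless for an associative operation and harmful otherwise.

```python
from functools import reduce
from operator import add, sub

def chunked_reduce(f, xs, size):
    """Reduce each chunk of `size` elements separately, then reduce the partial results."""
    chunks = [xs[i:i + size] for i in range(0, len(xs), size)]
    partials = [reduce(f, chunk) for chunk in chunks]   # each chunk could go to a different worker
    return reduce(f, partials)

xs = [1, 2, 3, 4, 5, 6, 7, 8]
print(reduce(add, xs), chunked_reduce(add, xs, 3))   # same result: + is associative
print(reduce(sub, xs), chunked_reduce(sub, xs, 3))   # different results: - is not
```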

Depending on what you're doing, you might be able to write whatever function you're building to maintain this associativity. For instance, the average function you wrote above is not associative because it includes division, but if you store the sum and the count of all the numbers separately you can write an associative version of the same logic:

In [21]:
def my_sum (ls_or_dict):
    if type(ls_or_dict) == type({}):
        return ls_or_dict['sum']
    else:
        return foldl((lambda x,y:x+y),0,ls_or_dict)

def my_count (ls_or_dict):
    if type(ls_or_dict) == type({}):
        return ls_or_dict['count']
    else:
        return len(ls_or_dict)

def my_assoc (x,y):
    # only uses `+`, manifestly associative
    return {'count': my_count(x)+my_count(y), 'sum':my_sum(x)+my_sum(y)}

# and, at the end of the parallelisable reduction:
def average(m):
    return m['sum']/m['count']


# This can be parallelised (exponential gains in time, because reduction trees have logarithmic execution time).
# Check the associativity yourself by replacing `foldr` with `foldl`: the result stays the same!
combined = foldr(my_assoc,{'count':0,'sum':0},[[0,1,2,3],[4,5,6],[7,8,9]])

# this can't be parallelised, but it's a single operation so it's OK.
print(combined, "--->", average(combined))

{'count': 10, 'sum': 45} ---> 4.5


In [22]:
# Before moving on, let's just admit to ourselves that Python's built-in parallelism is not very good for FP.

# For instance, `multiprocessing.Pool` offers `map` (and a few variants), but no `filter` or `reduce` (!!!)
# Also, it relies on Python pickles, which serialise functions by name, so every function
# you send to a worker needs to be defined at the top level of a module with `def`.
# Every.
# Single.
# One.
# That means no lambdas and no closures like our `compose`. You can't sensibly do FP without them.

import multiprocessing, time, random

def pretend_work (x):
    time.sleep(random.random())
    print(x)
    return x

def square (x):
    return pretend_work(x**2)

square2 = lambda x: x**2

if __name__ == '__main__':
    with multiprocessing.Pool(2) as p:
        print(p.map(square, [1, 2, 3, 4]))
        print("--------",flush=True)
        print(p.map(square, [1, 2, 3, 4]))
        print("--------",flush=True)
        try:
            p.map(square2, [1, 2, 3, 4])
        except Exception:
            print("Can't Pickle !")
        try:
            p.map((lambda x : x**2), [1, 2, 3, 4])
        except Exception:
            print("Can't Pickle !!")
        try:
            p.filter((lambda x : x%2 == 0),[1,2,3,4])
        except AttributeError: # Pool simply has no `filter` method
            print("Can't Filter !!!")
        
del square


# In what follows, I'll show you code that is parallelisable, but not actually parallel.
# Just assume your language's compiler / big data framework / library gives you decent
# versions of the functions we're discussing.

4
1
16
9
[1, 4, 9, 16]
--------
4
9
1
16
[1, 4, 9, 16]
--------
Can't Pickle !
Can't Pickle !!
Can't Filter !!!
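
The pickling restriction can be checked without starting any worker processes. In this sketch (`can_pickle` is a hypothetical helper), a `functools.partial` over an importable function survives `pickle.dumps`, while a lambda does not, because functions are pickled by name.

```python
import pickle
from functools import partial
from operator import add

def can_pickle(obj):
    """Return True if pickle.dumps accepts obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

print(can_pickle(add))                # importable by name
print(can_pickle(partial(add, 1)))    # partial of an importable function
print(can_pickle(lambda x: x + 1))    # anonymous, so there is no name to look up
```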


## 2.4) map/reduce

In [23]:
# Let's create a pointless, parallelisable map/reduce pipeline!

# first let's fix some of Guido's design oversights in Python...
from functools import partial
from functools import reduce

foldl = (lambda f, acc, xs: reduce(f,xs,acc))

def compose(f,g):
    return (lambda *args, **kwargs :f(g(*args, **kwargs)))

def comp(*fns): # a compose that works for more than two functions
    identity = (lambda x:x)
    return foldl(compose,identity,fns)

add = (lambda x,y:x+y)
even = (lambda x: x%2==0)

# now let's make our pipeline... hint: read these right-to-left ;-)
pipeline = comp(lambda x:x+", ", str, partial(add,1))
parallelisable_map = comp(partial(map,pipeline),partial(filter,even))
mapreduce = (lambda ls: foldl(add,"", parallelisable_map(ls))) # string concat is associative --> parallelisable!

# Just like any other function, this pipeline is reusable, so parallelism is trivial:
thread1 = mapreduce([0,1,2,3,4,5,6])
thread2 = mapreduce([7,8,9,10,11,12])
thread3 = mapreduce([13,14,15,16,17,18])
print(thread1+thread2+thread3)

del pipeline, parallelisable_map, mapreduce

1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 


If we hadn't realised our reduction was parallelisable, then we might have written something closer to
```
thread1 = parallelisable_map(...)
thread2 = parallelisable_map(...)
...
mapreduce(thread1+thread2+...)
```
which is _O(n)_ instead of _O(log(n))_ runtime (given enough parallel workers) -- this can be relevant for BIG data! Of course, Python doesn't know it can parallelise this; you'll need to do that part manually.
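
As a rough sketch of doing that part manually, the standard library's `concurrent.futures.ThreadPoolExecutor` can map a function over chunks without any pickling, since threads share memory. The chunking and the `mapreduce_chunk` helper below are assumptions made for this example; for CPU-bound work in CPython, threads will not give a real speed-up, so treat this as an illustration of the structure only.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def mapreduce_chunk(chunk):
    """Keep even numbers, add one, render as text, and join: one worker's share."""
    return reduce(lambda acc, s: acc + s,
                  (str(x + 1) + ", " for x in chunk if x % 2 == 0),
                  "")

chunks = [[0, 1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12], [13, 14, 15, 16, 17, 18]]

with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(mapreduce_chunk, chunks))   # results come back in input order

result = reduce(lambda acc, s: acc + s, partials, "")    # final (associative) combine
print(result)
```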

In [24]:
# Exercise: Write a mapping pipeline that maps the polynomial -3x+7 only over even numbers, and returns only positive results.
# You may want to use your answers to previous exercises.

f = None # <Your answer here>

# Unit tests
assert list(f([-2,-1,0,1,2,3,4,5,6])) == list(f([-2,0,2,4,6])) == [13,7,1]
print("Success!")

TypeError: 'NoneType' object is not callable

In [25]:
# Solution









# Don't peek!











poly = (lambda x: -3*x+7)
pos = (lambda x: x>0) # strictly positive, as the exercise asks

f = comp(partial(filter,pos),partial(map,poly),partial(filter,even))
assert list(f([-2,-1,0,1,2,3,4,5,6])) == list(f([-2,0,2,4,6])) == [13,7,1]

## 2.5) Apply

`map` is good if you have one function and lots of data; what if you want to apply multiple functions to a single datum? There's a trick for that: encapsulate the idea of "applying a function to a value" in a function.

In [26]:
apply = (lambda f, v: f(v)) # Deceptively simple

# Having function application be a function means we can control function application with functions like map
for x in map((lambda f: apply(f,4)), [(lambda x:x+1),(lambda x:x%2==0),comp((lambda x:x+"!"),str)]):
    print(x)

5
True
4!


In [27]:
# Exercise: write a function `mapply` that applies a list of functions over a datum

mapply = None # your code here

# Unit tests

assert list(mapply([lambda x:x+1,lambda x:x-1],0)) == [1,-1]

# can't be bothered to import a library for a few one-liners, sorry :-p
add = lambda x,y:x+y
mean = lambda ls: foldl(add,0,ls)/len(ls)
stddev = lambda ls: (foldl(add, 0, map((lambda l: (l-mean(ls))**2),ls)) / (len(ls)-1))**0.5

# generate a statistic summary report
data = [4,5,6,7,8]
report = [min,mean,stddev,max]
assert list(mapply(report,data)) == [4, 6.0, 1.5811388300841898, 8]

del mean, stddev

TypeError: 'NoneType' object is not callable

In [28]:
# Solution









# Don't peek!












mapply = (lambda fs,d: map((lambda f:f(d)),fs))

# Unit tests

assert list(mapply([lambda x:x+1,lambda x:x-1],0)) == [1,-1]

add = lambda x,y:x+y
mean = lambda ls: foldl(add,0,ls)/len(ls)
stddev = lambda ls: (foldl(add, 0, map((lambda l: (l-mean(ls))**2),ls)) / (len(ls)-1))**0.5

# generate a statistic summary report
data = [4,5,6,7,8]
report = [min,mean,stddev,max]
assert list(mapply(report,data)) == [4, 6.0, 1.5811388300841898, 8]

del mean, stddev

In [29]:
# Everything together: Using `mapply` to write an extensible data cleaning predicate (for filter).

from math import isfinite
import re

data = [-104, 0 , 1, 12, 36, 2351825723845692374, float('Nan')]

fns = [(lambda x : x != 0), # non-zero
       (lambda x: x < 100), # not too big
       isfinite,            # finite (not NaN, not infinite)
       (lambda x: re.search("^(1|-1)",str(x)) is not None) # first digit is 'one'
      ]

#print(list(map(compose(list,partial(mapply,fns)),data)))

pred = compose((lambda ls: foldl((lambda x,y : x and y),True,ls)),partial(mapply,fns))

list(filter(pred,data)) # this data is clean, just add a `partial(filter,pred)` to the start of the pipeline!!!

[-104, 1, 12]

# 3) Functional Design & Patterns

## 3.1) Laziness

You should notice, when using `map` and `filter`, that you don't actually get the result of the computation you asked for immediately.

Instead, Python returns a lazy iterator, which behaves like a *promise*: it won't actually do the work until it has to.

In [30]:
map((lambda x:x+2),[0,1,2,3])

<map at 0x7fb7f85038d0>

In [31]:
# computations (including their side-effects) aren't performed until we actually need them:
m = map(print,[1,2,3])
print("here")
print(list(m))
del m

here
1
2
3
[None, None, None]
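
One practical consequence worth knowing: the object returned by `map` is a one-shot iterator, so once it has been consumed it stays empty. A small sketch:

```python
m = map(lambda x: x + 2, [0, 1, 2, 3])

first = list(m)    # forces the computation
second = list(m)   # nothing left: the iterator is already exhausted

print(first, second)
```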


In [32]:
# You can create your own promises: just wrap a computation into a lambda with zero arguments!
def lazily_add(x,y):
    print("Too much effort for a demo...")
    return (lambda : x+y)

z = lazily_add(2,3)
print("NOW do the work:")
print(z()) # force the promise by calling the thunk
del lazily_add, z

Too much effort for a demo...
NOW do the work:
5


Promises and forces are the functional programmer's solution to almost every efficiency concern. They can save execution time (don't calculate stuff you don't need) and memory (don't store stuff you don't need). When your promise's thunk is a pure function, you can even choose to store the computed value in memory in order to save some more execution time in the future (_memoization_, covered in the next section), or even try to speculatively evaluate the values you think you'll need on a parallel thread to reduce latency (a special kind of promise known as a _future_).

Contrast this to the imperative style discussed in section 1.3; if you tell the computer how to do what you want, you need to manage execution time and memory yourself.
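
A minimal sketch of a promise that remembers its value once forced. The `Promise` class and its `force` method are hypothetical names chosen for this example; Python has no built-in type with this exact interface.

```python
class Promise:
    """Wrap a zero-argument function; run it at most once, on the first force()."""
    def __init__(self, thunk):
        self._thunk = thunk
        self._done = False
        self._value = None

    def force(self):
        if not self._done:
            self._value = self._thunk()   # do the work only when first needed
            self._done = True
            self._thunk = None            # drop the reference so it can be freed
        return self._value

calls = []

def expensive():
    calls.append("ran")                   # side effect, so we can count executions
    return 2 + 3

p = Promise(expensive)
print(calls)        # nothing has run yet
print(p.force())    # runs the thunk
print(p.force())    # reuses the stored value
print(calls)        # the thunk ran exactly once
```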

In [33]:
# Exercise: write a function that performs the same task as the builtin "if" statement,
# using promises to delay evaluation of the irrelevant branch in the function args.

if_fn = None # your code here

# Unit tests

assert if_fn(True,lambda : 42,lambda : 12) == 42
assert if_fn(False,lambda : 42,lambda : 12) == 12
assert if_fn(True,lambda : 42,lambda : 12+"!") == 42

TypeError: 'NoneType' object is not callable

In [34]:
# Solution









# Don't peek!










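# this is kind of cheap
# if_fn = lambda t,y,n: y() if t else n()

# this exploits the *sequential-left-to-right* short-circuiting of built-in `and`, `or`.
# Caveat: if y() returns something falsy (0, "", None, ...), `or` falls through and n() runs too;
# the "cheap" version above doesn't have that problem.
if_fn = lambda t,y,n: ((t and y()) or n())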

# Unit tests

assert if_fn(True,lambda : 42,lambda : 12) == 42
assert if_fn(False,lambda : 42,lambda : 12) == 12
assert if_fn(True,lambda : 42,lambda : 12+"!") == 42

In [35]:
# the benefits of having control flow reified as functions will be explored more extensively in Sec. 4; until then:

import time, random

def pretend_work (x):
    time.sleep(random.random())
    print(x)
    return x

lazyprintinc = (lambda x: (lambda : pretend_work(x+1)))

# define some data
t = [True, False, True]
y = list(map(lazyprintinc,[10,20,30]))
n = list(map(lazyprintinc,[100,200,300]))

a = map(if_fn,t,y,n)
b = map(print,["Too", "much", "effort", "to", "actually", "do", "this."])

print("Go!")
print(list(a))

del lazyprintinc, a, b

Go!
11
201
31
[11, 201, 31]


## 3.2) Memoization (and decorators more generally)

In [36]:
from functools import lru_cache as memoize
import time

# memoize() returns a function which, given a function, returns a function
# that remembers the values it computes to avoid repeating the work next time

def impure_42 ():
    time.sleep(3)
    for i in range(1,4):
        print("step "+str(i)+" of 1827469261 complete...")
        time.sleep(2)
    return 42

f = memoize(maxsize=1)(impure_42)

# side effects (incl. the time spent waiting while computing the value) occur only once

print("First return takes forever:")
print(f())
print("Next return is immediate (and side-effect free)!")
print(f())

del f

First return takes forever:
step 1 of 1827469261 complete...
step 2 of 1827469261 complete...
step 3 of 1827469261 complete...
42
Next return is immediate (and side-effect free)!
42


In [37]:
# Python (uncharacteristically) has some nice syntactic sugar for this

from random import random

@memoize(maxsize=1)
def rand ():
    # guaranteed to be random!
    return random()

print(rand())
print(rand())
print(rand())

del rand

0.4870539735738739
0.4870539735738739
0.4870539735738739


Python's `@` notation is syntactic sugar for a functional pattern known as _decoration_: take a function, and return a function that adds some extra functionality to it.

In [38]:
# You can think of Currying as "decorating a function with a fixed value for its first argument".

from functools import partial

curry = (lambda x: (lambda f: partial(f,x)))

@curry(1) # curry(1) returns (lambda f: partial(f,1))
def inc(x,y):
    return x+y

print(inc(4)) # NB: the arity of `inc` after decoration is NOT guaranteed to be the same as in the `def` line.

# the unsugared syntax is a few characters shorter and (arguably?) less confusing 
inc = curry(1)(lambda x,y:x+y)

print(inc(6))

del inc

5
7


Decoration is the canonical way to implement ideas from Aspect Oriented programming in the functional style. With decorators, you can separate different concerns into different functions.

In [39]:
def guard_inputs(fn):
    def decorator(f):
        def wrapper(*args):
            assert fn(*args)
            return f(*args)
        return wrapper
    return decorator

def celebrate(f):
    def wrapper(*args):
        try:
            a = f(*args)
            print("successfully called!")
        except AssertionError:
            print("successfully guarded!")
            a = None
        print(a)
        return a
    return wrapper

@celebrate
@guard_inputs(lambda x,y: x>y>0) # assert arbitrary relations between fn inputs
def birthday(x,y):
    return "Python is "+str(x)+" years old! Python 2.7 retires in "+str(y)+" years!"

birthday(-2,0) # would throw an error without decoration
birthday(27,2)

del guard_inputs, celebrate, birthday

successfully guarded!
None
successfully called!
Python is 27 years old! Python 2.7 retires in 2 years!


In the code above, we've separated out three concerns into three functions: the business of talking about Python milestones is separated from the validation code, both of which are separated from the error handling for the validation.

Without decorators, you would likely have written something like

```
def birthday (x,y):
    if (x>y>0):
        print("successfully called!")
        return "Python Milestones string"
    else:
        print("successfully guarded!")
        return None
```

One benefit of working with Aspects is that you can reuse `guard_inputs`/`celebrate` to automatically add this validation logic to any function you're writing (or importing from a library!). Adding more aspects to `birthday` (automatic logging, authentication, memoization, ...) is just a matter of decorating the function; you don't need to change (or sometimes even look at) the code of `birthday`.

Another benefit is that you can easily remove this validation logic once your testing convinces you the rest of your code is calling `birthday` correctly, again without changing the code from `birthday`.

In [40]:
# `wraps` is a decorator for helping you decorate functions
# without losing their original metadata... very meta!

# adapted from the functools documentation:

from functools import wraps

def decorate(f):
    
    @wraps(f) # copy f's metadata (__name__, __doc__, ...) onto wrapper
    def wrapper(*args):
        print('Calling decorated function with args =',args)
        return f(*args)
    
    return wrapper


@decorate
def poly (x,y):
    """A diophantine polynomial"""
    return -3*x +2*y + 7

print(poly(4,6))
print(poly.__doc__) # not lost!

del decorate, poly

Calling decorated function with args = (4, 6)
7
A diophantine polynomial
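
`wraps` also copies `__name__` and records the undecorated function on a `__wrapped__` attribute, which offers one way to bypass a decorator when needed. A short sketch (the `shout` decorator and `greet` function are made up for this example):

```python
from functools import wraps

def shout(f):
    @wraps(f)                      # copy f's metadata onto wrapper
    def wrapper(*args):
        return f(*args).upper()
    return wrapper

@shout
def greet(name):
    """Say hello."""
    return "hello " + name

print(greet.__name__, "|", greet.__doc__)   # metadata of the original function
print(greet("guido"))                        # decorated behaviour
print(greet.__wrapped__("guido"))            # original behaviour, decoration bypassed
```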


In [41]:
# Exercise: Write a decorator that censors emails from any decorated function's inputs (privacy concerns!)

import re

def censor (s):
    if isinstance(s, str):
        return re.sub(r"\S*@\S*","####",s) # raw string: avoids invalid escape sequences
    else:
        return s

    
censoring = None # your code here

# Unit tests

@censoring
def identity(s):
    return s

@censoring
def join(s1,s2):
    return s1+s2

assert identity("My email is name.name@quantillion.io") == 'My email is ####'
assert identity("Our emails are secret@quantillion.io and private@dpa.nl") == "Our emails are #### and ####"
assert join("Nothing to censor here: ","e@pistu.la") == "Nothing to censor here: ####"
print("Success!")

TypeError: 'NoneType' object is not callable

In [42]:
# Solution









# Don't peek!











def censoring (f):
    def wrapper(*args):
        return f(*map(censor,args))
    return wrapper


# Unit tests

@censoring
def identity(s):
    return s

@censoring
def join(s1,s2):
    return s1+s2

assert identity("My email is name.name@quantillion.io") == 'My email is ####'
assert identity("Our emails are secret@quantillion.io and private@dpa.nl") == "Our emails are #### and ####"
assert join("Nothing to censor here: ","e@pistu.la") == "Nothing to censor here: ####"

## 3.3) Lazy Data Structures

Functions produce values. By combining `fold` and laziness, we can lazily generate these values in arbitrary (and possibly nested) structures.

Python actually has built-in support for lazy sequence-like data structures: Generators.

In [43]:
# List comprehension
[x**2 for x in [1,2,3,4]]
# Generator
(x**2 for x in [1,2,3,4])

<generator object <genexpr> at 0x7fb7f84aa678>

In [44]:
# List comprehensions are just syntactic sugar for the following:
a = (x**2 for x in [1,2,3,4])
print(list(a))
del a

[1, 4, 9, 16]


In [45]:
bignum = ((64**64)**64)**64
print("bignum has",len(str(bignum)),"digits")

bignum has 473480 digits


In [46]:
xs = (x**2 for x in range(bignum)) # this sequence would not fit in memory if realised

sum = 0
for x in xs:
    print(x)
    sum += x
    if x>120:
        break

del sum

0
1
4
9
16
25
36
49
64
81
100
121
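
The same "stop when large enough" logic can be written declaratively with `itertools`, the standard library's toolbox for lazy sequences. A sketch, using `count` for an unbounded source and `takewhile` to cut it off (note that, unlike the loop above, `takewhile` drops the first element that fails the test):

```python
from itertools import count, takewhile

squares = (x**2 for x in count())                 # infinite and lazy: nothing is computed yet
small = takewhile(lambda x: x <= 120, squares)    # stop at the first square above 120

result = list(small)
print(result, sum(result))
```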


One limitation of Python's generators is that they only support _serial iteration_ (via the built-in `next()`) from the _beginning_ of the generator (generators behave like "lazy linked lists"). For pure functions this is just an efficiency concern, but when the functions are impure it can trigger a bunch of side-effects you might not want.

Thankfully, since we can create our own promises/thunks, we can fix that limitation by making a "lazy linked list of promises":

In [49]:
from itertools import islice # the canonically "right" way to slice a generator ...
import time, random

print("Wrong:")
a = (pretend_work(x) for x in range(bignum))
print(list(islice(a,7,10))) # ... still evaluates a bunch of junk before reaching what we want!


print("Fixed:")
b = ((lambda x=x: pretend_work(x)) for x in range(10)) # x=x binds the current value to each promise

# map forcing of promises, i.e. partial(map,force), plus forcing the original generator with list
map_force = (lambda seq: list(map((lambda f: f()),seq)))

print(map_force(islice(b,7,10)))

del a,b,map_force

Wrong:
0
1
2
3
4
5
6
7
8
9
[7, 8, 9]
Fixed:
7
8
9
[7, 8, 9]
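
A caveat worth knowing when building promises inside a generator expression: a bare `lambda: ...` looks up `x` when the promise is _forced_, not when it is created. Forcing each promise as soon as it is produced (as `map_force` does) sidesteps the problem, but collecting the promises first does not. The following is a minimal sketch; the names `late` and `early` are made up for illustration.

```python
# Collect all promises first, force them afterwards.
late = list((lambda: x * 10) for x in range(3))
print([p() for p in late])    # [20, 20, 20]: every promise sees the final x

# Binding x as a default argument freezes its value per promise.
early = list((lambda x=x: x * 10) for x in range(3))
print([p() for p in early])   # [0, 10, 20]
```

The default-argument trick is evaluated when the lambda is created, which is why each promise in `early` remembers its own `x`.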


## 3.4) Functions as Data

In Python, lazy sequences are provided by generators. But, not all data structures are sequences. You can kind of think of a function as providing the sort of functionality you'd want from a "lazy dictionary"; given an argument (key), return a value. This suggests a completely new way of thinking about data structures... what if, instead of generating them with functions, we _represented_ them with functions?

Now, this isn't necessarily a good idea, especially given that functional programming emphasises composition and processes over data structures and values. But, it's still kind of cool, and it serves to introduce the powerful notion that `code == data`.

In [50]:
# Let's represent a directed graph using functions.

def a () : return [a, b]
def b () : return [c, d]
def c () : return [a]
def d () : return [b, c, e]
def e () : return [f]
def f () : return [e]

# Now let's randomly walk around the graph

import random

node = a # named `node` so the builtin `next` isn't shadowed
random.seed(2)
for i in range(20):
    node = random.choice(node()) # randomly choose amongst available options
    print(node.__name__)
    
del a,b,c,d,e,f,node

a
a
a
b
c
a
b
c
a
a
b
d
e
f
e
f
e
f
e
f


Traversals of this specific graph eventually hit the two-cycle `f->e->f->e->...`. If you replace the lists `[c, d]` returned by these functions by dictionaries `{c:0.2, d:0.8}`, then you've built your own (runnable!) Markov model. *Executable data structures* are just one of the many implications of "Code is Data".

In [51]:
# Exercise: Write a function which compiles a dict of dicts into a (runnable!) Markov model.

model = {"a":{"a":0.2,"b":0.3,"c":0.5},"b":{"a":0.3,"b":0.2,"c":0.5},"c":{"a":0.4,"b":0.4,"c":0.2}}

runnable_model = None # Your code here


# Just kidding.

del model
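
In case you want to try the exercise anyway, here is one possible sketch rather than a reference solution. The name `compile_markov` and its optional `rng` parameter are hypothetical and introduced only for this example. Each state becomes a function that, when called, returns the function representing the next state.

```python
import random

def compile_markov(model, rng=random):
    """Turn {state: {next_state: probability}} into {state: step_function}."""
    def make_state(name):
        successors = list(model[name])
        weights = [model[name][s] for s in successors]
        def state():
            # draw the next state's name, then return the function that represents it
            return states[rng.choices(successors, weights=weights)[0]]
        state.__name__ = name
        return state
    states = {name: make_state(name) for name in model}
    return states

model = {"a": {"a": 0.2, "b": 0.3, "c": 0.5},
         "b": {"a": 0.3, "b": 0.2, "c": 0.5},
         "c": {"a": 0.4, "b": 0.4, "c": 0.2}}

runnable_model = compile_markov(model, random.Random(0))  # seeded for repeatability

node = runnable_model["a"]
walk = []
for _ in range(10):
    node = node()               # same random-walk loop as for the graph above
    walk.append(node.__name__)
print(walk)
```

The inner functions refer to `states` before it is assigned; that works because the name is only looked up when a state function is actually called.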

Another cool implication of `code == data` is that data structures are actually code in disguise. Some languages take this to the extreme and say that any source code you can execute is basically just structured data. In these languages, _metaprogramming is just regular programming_. In Python, the only data type that can obviously be used to represent a chunk of code is the string. In fact, each cell in this notebook is just a JSON object with a list of strings in a source field.
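
Strings are the obvious representation, but not the only one: the standard library's `ast` module exposes parsed code as a tree of ordinary Python objects that you can inspect and rewrite before running it. The sketch below is only meant to illustrate the idea; the transformer class `SwapAddForSub` is a made-up example.

```python
import ast

source = "x * 2 + 1"
tree = ast.parse(source, mode="eval")   # source string -> tree of Python objects
print(ast.dump(tree.body))              # the expression, shown as nested data

class SwapAddForSub(ast.NodeTransformer):
    """Rewrite every `+` in the tree into `-`."""
    def visit_BinOp(self, node):
        self.generic_visit(node)        # rewrite nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node

new_tree = ast.fix_missing_locations(SwapAddForSub().visit(tree))
code = compile(new_tree, "<ast>", "eval")
print(eval(code, {"x": 5}))             # 5 * 2 - 1 == 9
```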

Functional programming is basically just the insight that if functions can be put into lists, they are data to be manipulated. In fact, our definition of `comp` was just a `fold` of some function over a list of functions. What if we want to print the result of each intermediate step of our map/reduce pipeline? All we need to do is add a `print-and-return-identically` function between the steps, or map a printing decorator to each step, and we're done!

In [52]:
# Of course, adding a printing step between each computation step is just folding over the pipeline:

from functools import reduce
foldl = (lambda f, acc, xs: reduce(f,xs,acc))

def prid (x):
    print(x)
    return x

pipeline = [(lambda x:x+1),(lambda x:x*2),(lambda x:x+4),(lambda x:round(x/2)),(lambda x:x-1)]

printing_pipeline = foldl((lambda acc,x: acc+[prid,x]),[],pipeline)
print([p.__name__ for p in printing_pipeline])

# and then turning the data structure into a function is the same as before:

def compose(f,g): # from earlier
    return lambda x:f(g(x))

foldl(compose,(lambda x:x),printing_pipeline)(7)

del prid, printing_pipeline

['prid', '<lambda>', 'prid', '<lambda>', 'prid', '<lambda>', 'prid', '<lambda>', 'prid', '<lambda>']
6
3
7
14
15


In [53]:
# Exercise: achieve the same result as above by decorating each step of the mapping pipeline

pipeline = [(lambda x:x+1),(lambda x:x*2),(lambda x:x+4),(lambda x:round(x/2)),(lambda x:x-1)]

printing_pipeline = None # your code here

# Unit tests (visual inspection)
foldl(compose,(lambda x:x),printing_pipeline)(7)

TypeError: reduce() arg 2 must support iteration

In [54]:
# Solution









# Don't peek!














def with_printing (f):
    def wrapper(*args):
        a = f(*args)
        print(a)
        return a
    return wrapper

printing_pipeline = [with_printing(p) for p in pipeline] # map(with_printing,pipeline)

foldl(compose,(lambda x:x),printing_pipeline)(7)

del with_printing, printing_pipeline

6
3
7
14
15


Functions are _closures_: they carry around the lexical scope they were defined in. So far, we've created _immutable_ data structures. In order to start mutating them, we need to explicitly tell Python we want to mess around with the nonlocal scope.

In [55]:
undefined = None # or whatever value makes sense for undefined variables
verbose = True
def make_var():
    a = undefined
    def var(x=undefined):
        nonlocal a
        if x is not undefined:
            a = x     # var is NOT a pure function
        if verbose: print(a)
        return a
    return var

i = make_var()
i()       # Initially undefined
i(1)      # set the var-like function stored in `i` to one
i()       # now calls to the var-like function return one
i(2)
i(undefined)   # undefined is the only value you can't assign
i(3)

verbose=False
del i

None
1
1
2
2
3


Whenever you change a cell-value in an Excel spreadsheet, every other cell that depended on that value automatically changes. This idea is known as data flow (aka _Reactive_) programming (as opposed to Control flow programming, which we'll visit in the next section).

Using functions as data makes reactive programming really easy:

In [56]:
# initially undefined
i = make_var()
j = make_var()

k = lambda: (i()+j()) # this is assignment at the function-level, but true equality on the computed-value level!

# let's change the values of i,j:
i(4)
j(3)
# check
print(i(),"+",j(),"==",k())

# and again
print("pick a number...")
j(float(input(">>> "))) # try typing -3
print(i(),"+",j(),"==",k())

del i, j, k

4 + 3 == 7
pick a number...
>>> -3
4 + -3.0 == 1.0


In the code above, the value of the thunk `k()` is recomputed on each call. In practice you might want to perform such re-computations lazily: since the value of `i()` didn't change between the two calls to `k()`, you could use a memoized value instead of calling `i()` again. Personally I'd rather avoid managing mutable state in the first place, so I'm not sure what the best approach is.
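
For what it's worth, one way to get lazy re-computation is to give every variable a version counter and let derived thunks cache their result per combination of versions. This is just a sketch of that idea, not a recommendation; `make_cell` and `derive` are hypothetical helpers that are not used elsewhere in this workshop.

```python
def make_cell(value=None):
    """A mutable cell that counts how often it has been assigned."""
    state = {"value": value, "version": 0}
    def cell(*new):
        if new:                       # called with an argument: assign
            state["value"] = new[0]
            state["version"] += 1
        return state["value"]
    cell.version = lambda: state["version"]
    return cell

def derive(fn, *inputs):
    """A thunk for fn(*inputs) that recomputes only after an input was reassigned."""
    cache = {"versions": None, "value": None}
    def derived():
        versions = tuple(c.version() for c in inputs)
        if versions != cache["versions"]:
            derived.recomputations += 1
            cache["value"] = fn(*(c() for c in inputs))
            cache["versions"] = versions
        return cache["value"]
    derived.recomputations = 0
    return derived

i, j = make_cell(4), make_cell(3)
k = derive(lambda a, b: a + b, i, j)
print(k(), k(), k.recomputations)   # 7 7 1
j(-3)
print(k(), k.recomputations)        # 1 2
```

The price is exactly the bookkeeping the paragraph above worries about: the cache is one more piece of mutable state to keep consistent.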

In [57]:
# True equality also means you can _backpropagate_ assignments
i = make_var()
i(1)

def k (x=undefined):
    if x!=undefined:
        i(x)
    return i()

print(k())   # check that k() == i()
print(k(42)) # setting the value of k...
print(i()) # ... also sets the value of i!

# When k depends on multiple variables i, j, ..., you'll be able to design your own backpropagation strategies. 

1
42
42
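
As a sketch of what such a strategy might look like for two variables: below, `k` stands for `i + j`, and assigning to `k` keeps `i` fixed while pushing the whole change into `j`. That choice is arbitrary (splitting the change evenly would be just as valid). `make_quiet_var` is a hypothetical, non-printing variant of `make_var`, repeated here so the block runs on its own.

```python
undefined = None

def make_quiet_var():
    # same idea as make_var, minus the printing
    a = undefined
    def var(x=undefined):
        nonlocal a
        if x is not undefined:
            a = x
        return a
    return var

i, j = make_quiet_var(), make_quiet_var()
i(4)
j(3)

def k(x=undefined):
    # k == i + j; assigning to k adjusts j and leaves i alone
    if x is not undefined:
        j(x - i())
    return i() + j()

print(k())        # 7
print(k(10))      # 10
print(i(), j())   # 4 6
```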


## 3.6) Bonus Material

The reason I'm even bringing up reactive programming is to showcase _hot-swapping_. Suppose you know in advance that you might want to change a program's code during the program's runtime; particularly, you want to change the definition of some function and make sure any other function you've built on top of it sees the change.

(The only reason I can think of for wanting this is if you're modifying a live system with high-availability requirements)

In [58]:
from functools import partial

print("Regular `partial` behaviour:")

f1 = lambda x:x-3
f2 = partial(map,f1)
print(list(f2([1,2,3])))
f1 = lambda x:x+5
print(list(f2([1,2,3])), "# change in `f1` not seen by `f2`")

print("\nHot-Swapping with manual currying:")

f1 = lambda x:x-3
f2 = lambda ls: map(f1,ls) # !!!
print(list(f2([1,2,3])))
f1 = lambda x:x+5
print(list(f2([1,2,3])))

del f1,f2

Regular `partial` behaviour:
[-2, -1, 0]
[-2, -1, 0] # change in `f1` not seen by `f2`

Hot-Swapping with manual currying:
[-2, -1, 0]
[6, 7, 8]


The difference between currying `f1` with `partial` and manually currying `f1` into `f2` is the difference between calling `f1` _by value_ (pure fn arg) vs. calling it _by reference_.

The variable `f1` in the manually curried version of `f2` is just a reference to some value: because lambda means laziness, Python is not actually fetching the value referenced by `f1` until we call `f2`. The difference is subtle, but super relevant: `partial(map,f1)` is a pure function, while `lambda ls: map(f1,ls)` is very slightly impure: `f1` is resolved out of the global namespace. <sub><sup>Unless your compiler optimises more than you expect it to, and inlines the call to f1...</sup></sub>
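
You can check this directly: a `partial` object stores the function object it was given in its `args` attribute, whereas the lambda stores nothing but the name. A small sketch (the names `original`, `by_value` and `by_name` are just for illustration):

```python
from functools import partial

f1 = lambda x: x - 3
original = f1

by_value = partial(map, f1)          # stores the function object itself
by_name = lambda ls: map(f1, ls)     # stores only the name `f1`

f1 = lambda x: x + 5                 # rebind the name

print(by_value.args[0] is original)  # True: partial still holds the old object
print(list(by_value([1, 2, 3])))     # [-2, -1, 0]
print(list(by_name([1, 2, 3])))      # [6, 7, 8]
```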

In [59]:
# Evil time

f = lambda ls: map(lambda x: x%2==0,ls)

map = filter

print(list(f([1,2,3,4,5,6,7,8])))

del f, map # drop the global `map` that shadows the builtin

[2, 4, 6, 8]

# 4) Control Flow

A pure function computes a value (usually by calling other pure functions) and then returns. But what if we could hijack the whole "return" business?

## 4.1) Recursion

The common way to muck around with how functions return is to have functions recursively call *themselves* to compute the value they want to return. In order to do this, you usually need a named function (or, if you're feeling fancy, you can use the [Y combinator](https://en.wikipedia.org/wiki/Fixed-point_combinator#Fixed_point_combinators_in_lambda_calculus)).
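
If you are feeling fancy: here is a sketch of anonymous recursion in Python. Because Python evaluates arguments eagerly, the textbook Y combinator would itself recurse forever, so the sketch uses its strict-evaluation cousin, usually called the Z combinator. The name `factorial` is only bound at the end, for convenience.

```python
# Z combinator: delays the self-application x(x) behind `lambda v: ...`
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# `self` stands for "the function being defined"; the lambda never refers to its own name
factorial = Z(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))

print(factorial(5))  # 120
```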

In [60]:
# The simplest case of recursion is the infinite loop:
def infiniteLoop():
    return infiniteLoop()

# Python is a decent language, and will throw an error
# instead of just using up all the computer's memory.
try:
    infiniteLoop()
except RecursionError:
    print("maximum recursion depth exceeded!")

maximum recursion depth exceeded!


In [61]:
# Designing a recursive function is like designing a loop: you want some termination condition to avoid infinities

def up_to_ten(x):
    if x > 10:
        return True
    else:
        print("2^"+str(x),"=",str(2**x))
        return up_to_ten(x+1) # pass the result back up, otherwise the caller gets None

up_to_ten(-2)
del up_to_ten

2^-2 = 0.25
2^-1 = 0.5
2^0 = 1
2^1 = 2
2^2 = 4
2^3 = 8
2^4 = 16
2^5 = 32
2^6 = 64
2^7 = 128
2^8 = 256
2^9 = 512
2^10 = 1024


As you can see, recursive functions use input arguments (`x` above) to represent statefulness (instead of capturing variables from the global scope). This is the same strategy we used in `fold`, and a big theme in functional programming more generally.
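
To make the parallel with `fold` concrete, here is a small sketch (the function `total` is made up for illustration) in which the running state lives in an accumulator argument, next to the equivalent `reduce` call:

```python
from functools import reduce

def total(xs, acc=0):
    # the running total is passed along in `acc`, not stored in a mutable variable
    if not xs:
        return acc
    return total(xs[1:], acc + xs[0])

print(total([1, 2, 3, 4]))                              # 10
print(reduce(lambda acc, x: acc + x, [1, 2, 3, 4], 0))  # 10
```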

In [62]:
# not all recursion is self-recursion: 

def one_step_back (x):
    print("< ",x)
    return two_steps_forward(x-1)

def two_steps_forward (x):
    print(">>",x)
    if x>6:
        return True
    else:
        return one_step_back(x+2) # pass the result back up, otherwise the caller gets None

two_steps_forward(1)

del two_steps_forward, one_step_back

>> 1
<  3
>> 2
<  4
>> 3
<  5
>> 4
<  6
>> 5
<  7
>> 6
<  8
>> 7


## 4.2) Trampolines

Laziness + recursion = no stack consumption
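
A minimal sketch of what that slogan means, assuming the convention that a recursive function returns a thunk (a zero-argument lambda) instead of calling itself directly. The helper name `trampoline` and the example `countdown` are illustrative, not part of any library.

```python
def trampoline(f, *args):
    """Run f(*args), then keep forcing returned thunks until a non-callable value appears."""
    result = f(*args)
    while callable(result):
        result = result()
    return result

def countdown(n):
    if n == 0:
        return "liftoff"
    return lambda: countdown(n - 1)   # a promise for the recursive call, not the call itself

print(trampoline(countdown, 100000))  # liftoff
```

Each call to `countdown` returns before the next one starts, so the stack never grows and the recursion limit from section 4.1 is never hit. One caveat of this simple version: it cannot be used for functions whose final result is itself callable.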

## 4.3) Continuations

And now for something completely different.

The most direct way to hijack function returns is to create a function that provides the same utility as `return`: capture the rest of the calculation a function is supposed to return to (aka the function's *continuation*) in an object, and make a function that takes us to that part of the calculation.

In [63]:
# From Peter Norvig:
def call_with_escape_continuation(proc):
    "Call proc with current continuation; escape only"
    ball = RuntimeWarning("Sorry, can't continue this continuation any longer.")
    def throw(retval): ball.retval = retval; raise ball
    try:
        return proc(throw)
    except RuntimeWarning as w:
        if w is ball: return ball.retval
        else: raise w

In [64]:
def g (x):
    return x+"!"

def f (k):
    g(k(41))
    print("!")

call_with_escape_continuation(f)+1

42

There's something deeply abusive going on here: `call_with_escape_continuation` (`call/ec` for short) created a function-like thing (`k`) that skips all the "!" parts of the computation and returns 41 immediately to the place where `call/ec` was called. Continuations can prevent functions `f` and `g` from returning, or even from running cleanup code (closing open files) that isn't protected by `try`/`finally` or a `with` block. Basically, we've taken all the dangerous parts of Python's exception handling and bundled them into a (one-use only) function.

In other languages, continuations can do even more pathological things to your control flow. My recommendation is to treat them like a pathologist treats the plague: study them in order to recognise the symptoms, but never use them.

I only mention this because in the analogy that reducing functions == loops, we have `k` == `break`.
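
Here is a sketch of that analogy: a product over a list that stops at the first zero, using the escape continuation as the `break` of a `reduce`. The escape helper is repeated (lightly reformatted) so the block runs on its own; `product` and `visited` are hypothetical names for this example.

```python
from functools import reduce

def call_with_escape_continuation(proc):
    # same idea as the definition above
    ball = RuntimeWarning("escaped")
    def throw(retval):
        ball.retval = retval
        raise ball
    try:
        return proc(throw)
    except RuntimeWarning as w:
        if w is ball:
            return ball.retval
        raise

def product(xs):
    """Multiply the numbers in xs, stopping at the first zero."""
    visited = []                    # records which elements the reducer actually saw
    def run(k):
        def step(acc, x):
            visited.append(x)
            if x == 0:
                k(0)                # "break": skip the rest of the reduction
            return acc * x
        return reduce(step, xs, 1)
    return call_with_escape_continuation(run), visited

result, visited = product([3, 4, 0, 5, 6])
print(result, visited)  # 0 [3, 4, 0]
```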

# 5) Final Thoughts

There's a lot more that could be talked about: other common functional patterns, the rich history and connections to mathematics, and a few deeper topics that Python simply isn't suited for (tail call elimination, delimited continuations, monads, homoiconicity).