# Functional Programming for Data Analysis

### Jim Pivarski

First notebook: introduction

# x = x + 1

means **x** is infinite, right?

After a long day of messy details, it's good to remember what computers really are:


magical chalkboards that solve the problems we write on them.

<table style="width: 100%; height: 329px;">
<tr style="background-color: white;"><td><span style="font-family: Lato, sans-serif; font-size: 35.84px">Fun quiz: what were the first two high-level programming languages, both written for the IBM 704?</span></td>
<td><img src="https://upload.wikimedia.org/wikipedia/commons/7/7d/IBM_704_mainframe.gif" style="width: 500px; margin-left: auto; margin-right: auto;"></td></tr>
</table>

\#1 FORTRAN

\#2 Lisp

Although you might not think of it as a high-level language, the whole purpose of "FOR-TRAN" is to _translate formulas_ of human-readable mathematics into executable code.

<img src="http://worrydream.com/dbx/slides/slide.007.png" style="width: 600px; margin-left: auto; margin-right: auto;">

And Lisp is...

<img src="https://leanpub.com/site_images/lisphackers/lisp_cycles.png" style="width: 80%; margin-left: auto; margin-right: auto;">

In time, FORTRAN begat COBOL, ALGOL, Pascal, C, Objective C, C++, D, Rust...

("Practical languages for real work...")

While Lisp engendered Prolog, Scheme, ML, Haskell, Clojure...

("Conceptual languages for exploring the bounds of computing...")

Except lately, the distinction is much less clear.

In the "multicore era," functional programming languages have been attracting attention as a way to simplify the human interface to parallel processing.

You say what you want, and the computer gives it to you.

In [None]:
# in Python 3, map is lazy: wrap it in list() to see the results
list(map(lambda x: x**2, [1, 2, 3, 4, 5]))

Did that just apply $f(x) = x^2$ to each integer from left to right, from right to left, or did it spawn five processes on five computers all around the world to produce the results and bring them back to me?

The abstraction, `map`, doesn't specify.
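As a rough illustration of that point, the sketch below evaluates the same squares three ways: left to right, right to left, and concurrently. The helper `square` and the variable names are made up for this example, and a local thread pool from the standard library stands in for "five computers all around the world," which is only an analogy.

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    # the function being mapped; it has no side effects, so order cannot matter
    return x**2

numbers = [1, 2, 3, 4, 5]

# left to right
left_to_right = [square(x) for x in numbers]

# right to left: compute in reverse order, then restore the original positions
right_to_left = [square(x) for x in reversed(numbers)][::-1]

# concurrently: five worker threads (a stand-in for remote machines);
# pool.map returns results in input order regardless of which finishes first
with ThreadPoolExecutor(max_workers=5) as pool:
    in_parallel = list(pool.map(square, numbers))

print(left_to_right == right_to_left == in_parallel)
```

Because the caller only sees the results, any of these execution strategies could sit behind `map` without changing the program's meaning.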

One popular application:

<table style="margin-left: auto; margin-right: auto;">
<tr style="background-color: white;"><td><img src="https://scr.sad.supinfo.com/articles/resources/207908/2807/1.png" style="height: 400px"></td>
<td><img src="https://pbs.twimg.com/profile_images/1252505253/elephant_rgb_sq.png" style="height: 200px"></td></tr>
</table>


The big deal about Hadoop was simply that the distributed shuffling system could be separated from the task at hand: users specify the task as function objects, and the framework takes care of moving the data.

In [11]:
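def mapreduce(mapper, reducer, data):
    """Apply mapper to every value in data, then merge values that share a key with reducer."""
    output = {}
    # the input keys (e.g. file names) are not needed, only the values
    for value1 in data.values():
        # mapper turns one input value into a list of (key, value) pairs
        for key2, value2 in mapper(value1):
            if key2 not in output:
                # first value seen for this key
                output[key2] = value2
            else:
                # combine the running result with the new value
                output[key2] = reducer(output[key2], value2)
    return output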

In [12]:
web_pages = {"one.html": "lol cat", "two.html": "justin bieber cat", "three.html": "bieber cat bieber"}

# every Hadoop tutorial starts with counting words on web pages
find_words = lambda page: [(word, 1) for word in page.split(" ")]
count_them = lambda count1, count2: count1 + count2

mapreduce(find_words, count_them, web_pages)

{'bieber': 3, 'cat': 3, 'justin': 1, 'lol': 1}
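The single-loop version above hides the shuffle step inside one dictionary. As a sketch only, and not how Hadoop is actually implemented, the same word count can be written with the three phases separated. The name `mapreduce_with_shuffle` is hypothetical, and everything runs in one process here.

```python
from collections import defaultdict
from functools import reduce

def mapreduce_with_shuffle(mapper, reducer, data):
    # map phase: every input value yields zero or more (key, value) pairs
    pairs = [pair for value in data.values() for pair in mapper(value)]

    # shuffle phase: gather all values that share a key;
    # on a cluster, this is the step that moves data between machines
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)

    # reduce phase: fold each group down to a single value
    return {key: reduce(reducer, values) for key, values in groups.items()}

web_pages = {"one.html": "lol cat", "two.html": "justin bieber cat", "three.html": "bieber cat bieber"}
find_words = lambda page: [(word, 1) for word in page.split(" ")]
count_them = lambda count1, count2: count1 + count2

word_counts = mapreduce_with_shuffle(find_words, count_them, web_pages)
print(word_counts)
```

The user still supplies only `find_words` and `count_them`; how the pairs get grouped is the framework's business.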

In [13]:
residuals = {"track1": [0.10, -0.03, 0.07], "track2": [0.18, 0.04, -0.03], "track3": [-0.22, 0.13, -0.05]}

# but detector alignment is also a good application
collect_resids = lambda track: [("layer%d" % i, (1.0, track[i])) for i in range(3)]
# each value is a (weight, mean residual) pair
average_them = lambda a, b: (a[0] + b[0], (a[0]*a[1] + b[0]*b[1])/(a[0] + b[0]))

mapreduce(collect_resids, average_them, residuals)

{'layer0': (3.0, 0.02),
 'layer1': (3.0, 0.04666666666666667),
 'layer2': (3.0, -0.003333333333333332)}
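Why carry a weight along with each residual? A reducer may be applied in any grouping or order, so it has to give the same answer either way. The sketch below compares the weighted reducer with a hypothetical `naive_average` that just averages two numbers, using the `layer0` residuals from above. Both functions are rewritten here as plain `def`s so the block stands alone.

```python
from functools import reduce
import math

def average_them(a, b):
    # combine two (weight, mean) pairs into one
    (w1, r1), (w2, r2) = a, b
    return (w1 + w2, (w1*r1 + w2*r2) / (w1 + w2))

def naive_average(r1, r2):
    # forgets how many measurements went into each side
    return (r1 + r2) / 2.0

layer0 = [0.10, 0.18, -0.22]
weighted = [(1.0, r) for r in layer0]

# the weighted reducer agrees (up to rounding) regardless of order
left_to_right = reduce(average_them, weighted)
right_to_left = reduce(average_them, reversed(weighted))
print(math.isclose(left_to_right[1], right_to_left[1]))

# the naive reducer depends on the order in which values arrive
naive_ltr = reduce(naive_average, layer0)
naive_rtl = reduce(naive_average, reversed(layer0))
print(math.isclose(naive_ltr, naive_rtl))
```

The first comparison holds and the second does not, which is why the alignment example reduces `(weight, residual)` pairs instead of bare residuals.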