In [1]:
# setup
from IPython.display import display, HTML
display(HTML('<style>.prompt{width: 0px; min-width: 0px; visibility: collapse}</style>'))
display(HTML(open('rise.css').read()))

# imports
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
sns.set(style="whitegrid", font_scale=1.5, rc={'figure.figsize':(12, 6)})


# CMPS 6610
# Algorithms

## Reduce Sort


Today's agenda:  

- `iterate` vs `reduce`
- sorting with sequences

## Reduce


> A function that repeatedly applies an **associative binary operation** to a collection of elements until the result is *reduced* to a single value.

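Associative operations let us regroup (re-parenthesize) the operations without changing the result. Note that this is different from commutativity, which would let us reorder the operands.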
- $plus(plus(2,3), 5) = plus(2, plus(3,5)) = 10$
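
To make the difference between associativity and commutativity concrete, here is a small check. The helper names `is_associative_on` and `concat` are made up for this illustration, and testing a single triple of inputs can only refute associativity, never prove it.

```python
def plus(x, y):
    return x + y

def subtract(x, y):
    return x - y

def concat(x, y):
    return x + y  # used on strings below: concatenation

def is_associative_on(f, x, y, z):
    """Check f(f(x, y), z) == f(x, f(y, z)) for one triple of inputs."""
    return f(f(x, y), z) == f(x, f(y, z))

print(is_associative_on(plus, 2, 3, 5))          # True
print(is_associative_on(subtract, 2, 3, 5))      # False: (2-3)-5 = -6 but 2-(3-5) = 4
print(is_associative_on(concat, 'a', 'b', 'c'))  # True
print(concat('a', 'b') == concat('b', 'a'))      # False: associative but not commutative
```

String concatenation shows that an operation can be safe to regroup even when its operands cannot be reordered.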

<br>

**formal definition of reduce**:

$reduce \: (f : \alpha \times \alpha \rightarrow \alpha) (id : \alpha) (a : \mathbb{S}_\alpha) : \alpha$

Inputs:
- $f$: an associative binary function
- $id$: the **left identity** of $f$ $\:\: \equiv \:\:$ $f(id, x) = x$ for all $x \in \alpha$
- $a$: the input sequence

Returns:
- a value of type $\alpha$ that is the result of the "sum" with respect to $f$ of the input sequence $a$


<br>

When $f$ is associative: $reduce \: f \: id \: a  \: \equiv \: iterate \: f \: id \: a$

<br>

$reduce \: f \: id \: a =
\begin{cases}
id & \hbox{if} \: |a| = 0\\
a[0] & \hbox{if} \: |a| = 1\\
f(reduce \: f \: id \: (a[0 \ldots \lfloor \frac{|a|}{2} \rfloor - 1]), \\ \:\:\:reduce \: f \: id \: (a[\lfloor \frac{|a|}{2} \rfloor \ldots |a|-1])) & \hbox{otherwise}
\end{cases}
$

## reduce is a variant of iterate that allows for easier parallelism





In [3]:
def reduce(f, id_, a):
    # print('a=%s' % a) # for tracing
    if len(a) == 0:
        return id_
    elif len(a) == 1:
        return a[0]
    else:
        # can call these in parallel
        return f(reduce(f, id_, a[:len(a)//2]),
                 reduce(f, id_, a[len(a)//2:]))
        
def times(x, y):
    return x * y

reduce(times, 1, [1,2,4,6,8])

384

Work and Span of reduce?

$$W(n) = 2W(n/2) + 1 \in O(n)$$

$$S(n) = S(n/2) + 1 \in O(\lg n)$$
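
One way to check these recurrences empirically is to instrument `reduce` so that it reports its own cost. This is only a sketch: the name `reduce_with_cost` is hypothetical, it counts calls to `f` only, and it ignores the cost of list slicing (which the recurrences above also treat as constant).

```python
def plus(x, y):
    return x + y

def reduce_with_cost(f, id_, a):
    """Return (result, work, span).

    work = total number of calls to f
    span = longest chain of calls to f that depend on one another
    """
    if len(a) == 0:
        return id_, 0, 0
    if len(a) == 1:
        return a[0], 0, 0
    mid = len(a) // 2
    left, work_l, span_l = reduce_with_cost(f, id_, a[:mid])
    right, work_r, span_r = reduce_with_cost(f, id_, a[mid:])
    # the two halves are independent, so the span takes the max, not the sum
    return f(left, right), work_l + work_r + 1, max(span_l, span_r) + 1

for n in [1, 2, 5, 8, 1024]:
    _, work, span = reduce_with_cost(plus, 0, list(range(n)))
    print(n, work, span)
```

For $n \geq 1$ the work comes out to $n - 1$ and the span to $\lceil \lg n \rceil$, for example 1023 and 10 when $n = 1024$.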

In [3]:
# compare with iterate; sometimes called "left folding"
def iterate(f, x, a):
    if len(a) == 0:
        return x
    else:
        return iterate(f, f(x, a[0]), a[1:])
    
iterate(times, 1, [1,2,4,6,8])

384

Work and Span of iterate?

$W(n) = W(n-1) + 1 \in O(n)$

$S(n) = S(n-1) + 1 \in O(n)$
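
Unrolling the recurrences shows where the linear bounds come from. This assumes that each call to $f$ costs constant time and that the base case costs $W(0) = S(0) = 1$:

$$W(n) = W(n-1) + 1 = W(n-2) + 2 = \cdots = W(0) + n = n + 1 \in O(n)$$

$$S(n) = S(n-1) + 1 = S(n-2) + 2 = \cdots = S(0) + n = n + 1 \in O(n)$$

The span equals the work because every call needs the accumulated value from the previous call, so no two applications of $f$ can overlap.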

## Does order matter?

![lfold](figures/lfold.png)

For what function $f$ would $iterate$ and $reduce$ return different answers?

```python
return iterate(f, f(x, a[0]), a[1:])
```

vs


```python
return f(reduce(f, id_, a[:len(a)//2]),
         reduce(f, id_, a[len(a)//2:]))
```      
           

In [19]:
def subtract(x, y):
    return x - y

print(iterate(subtract, 0, [10,5,2,1]))

print(reduce(subtract, 0, [10,5,2,1]))

-18
4
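
To see why the two answers differ, we can build the expression as a string instead of evaluating it. The helper `show_subtract` is a made-up name for this sketch; `iterate` and `reduce` are repeated so the cell runs on its own.

```python
def iterate(f, x, a):
    if len(a) == 0:
        return x
    return iterate(f, f(x, a[0]), a[1:])

def reduce(f, id_, a):
    if len(a) == 0:
        return id_
    if len(a) == 1:
        return a[0]
    mid = len(a) // 2
    return f(reduce(f, id_, a[:mid]), reduce(f, id_, a[mid:]))

def show_subtract(x, y):
    """Return the subtraction as a parenthesized string rather than a number."""
    return '(%s - %s)' % (x, y)

print(iterate(show_subtract, 0, [10, 5, 2, 1]))  # ((((0 - 10) - 5) - 2) - 1)
print(reduce(show_subtract, 0, [10, 5, 2, 1]))   # ((10 - 5) - (2 - 1))
```

The two functions parenthesize the input differently, and subtraction is not associative, so the values disagree. Note also that `reduce` never uses the `0` on a non-empty input, while `iterate` starts from it, and `0` is not a left identity of subtraction ($0 - x \neq x$ in general).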


So, why use *reduce*?

- Unlike *iterate*, which is strictly sequential, *reduce* makes two independent recursive calls that can run in parallel.
  - Span of *iterate* is **linear**; span of *reduce* is **logarithmic**. 
  - (we'll cover this later)
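
As a rough sketch of what running the independent combinations in parallel could look like, the version below combines adjacent pairs in rounds using a thread pool. The names `parallel_reduce` and `max_workers` are assumptions made for this example. The pairwise grouping differs from the halving recursion above, but for an associative `f` the result is the same. In standard CPython, threads do not speed up CPU-bound functions, so this illustrates the structure rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_reduce(f, id_, a, max_workers=4):
    """Tree-style reduce: each round combines adjacent pairs concurrently."""
    level = list(a)
    if len(level) == 0:
        return id_
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while len(level) > 1:
            # adjacent pairs keep the left-to-right order of the sequence
            pairs = [(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
            combined = list(pool.map(lambda p: f(p[0], p[1]), pairs))
            if len(level) % 2 == 1:
                combined.append(level[-1])  # odd element carries over to the next round
            level = combined
    return level[0]

print(parallel_reduce(lambda x, y: x + y, 0, range(10)))       # 45
print(parallel_reduce(lambda x, y: x + y, '', list('abcde')))  # abcde
```

Each round halves the number of remaining values, so there are about $\lg n$ rounds, which mirrors the logarithmic span.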

Many divide and conquer algorithms can be expressed with reduce.

<br>

Recall `sum_list_recursive`:

In [9]:
# recursive, serial
def sum_list_recursive(mylist):
    if len(mylist) == 0:   # guard: an empty list would otherwise recurse forever
        return 0
    if len(mylist) == 1:
        return mylist[0]
    return (
        sum_list_recursive(mylist[:len(mylist)//2]) +
        sum_list_recursive(mylist[len(mylist)//2:])
    )

sum_list_recursive(range(10))

45

How can we specify this with reduce?

In [11]:
def plus(x, y):
    return x + y

reduce(plus, 0, range(10))

45
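
For comparison, Python's standard library also has a function called `reduce`, in `functools`. It applies the function cumulatively from left to right, so it behaves like our `iterate` (a left fold), not like the divide and conquer `reduce` defined above. Note its argument order: function, iterable, then the initial value.

```python
import functools

def plus(x, y):
    return x + y

def subtract(x, y):
    return x - y

print(functools.reduce(plus, range(10), 0))          # 45
print(functools.reduce(subtract, [10, 5, 2, 1], 0))  # -18, computed as ((((0-10)-5)-2)-1)
```

For an associative function such as `plus` the distinction does not affect the answer, only the amount of available parallelism.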

For more complicated combination functions, we can define a generic version of most divide and conquer algorithms and show that it can be implemented with `reduce` and `map`.

In [None]:
## Generic divide and conquer algorithm.

def my_divide_n_conquer_alg(mylist):    
    if len(mylist) == 0:
        return LEFT_IDENTITY  # identity element of COMBINE_FUNCTION
    elif len(mylist) == 1:
        return BASECASE(mylist[0]) # basecase for 1
    else:
        return COMBINE_FUNCTION(
            my_divide_n_conquer_alg(mylist[:len(mylist)//2]),
            my_divide_n_conquer_alg(mylist[len(mylist)//2:])
        )

def COMBINE_FUNCTION(solution1, solution2):
    """ return the combination of two recursive solutions"""
    pass

def BASECASE(value):
    """ return the basecase value for a single input"""
    pass

### is equivalent to
# our reduce calls len(), so the map result must be materialized as a list
reduce(COMBINE_FUNCTION, LEFT_IDENTITY, list(map(BASECASE, mylist)))
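
As one possible instance of this template, here is a sketch that finds the minimum and maximum of a list in a single pass. The names `basecase`, `combine`, `IDENTITY`, and `min_max` are chosen for this example only; `reduce` is repeated so the cell runs on its own.

```python
import math

def reduce(f, id_, a):
    if len(a) == 0:
        return id_
    if len(a) == 1:
        return a[0]
    mid = len(a) // 2
    return f(reduce(f, id_, a[:mid]), reduce(f, id_, a[mid:]))

def basecase(value):
    """A single value is both the min and the max of itself."""
    return (value, value)

def combine(s1, s2):
    """Combine two (min, max) pairs into the (min, max) of their union."""
    return (min(s1[0], s2[0]), max(s1[1], s2[1]))

# identity: combining it with any pair returns that pair unchanged
IDENTITY = (math.inf, -math.inf)

def min_max(mylist):
    return reduce(combine, IDENTITY, list(map(basecase, mylist)))

print(min_max([3, 9, 1, 7, 4]))  # (1, 9)
```

Since `combine` is associative, the same three ingredients would also work with `iterate`, just with linear span.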

### Example: Sorting with Reduce

In [2]:
def merge(left, right):
    """
    Takes in two sorted lists and returns a sorted list that combines them both.
    """
    i = j = 0
    result = []
    while i < len(left) and j < len(right):
        if right[j] < left[i]:   # out of order: e.g., left=[4], right=[3]
            result.append(right[j])
            j += 1
        else:                   # in order: e.g., left=[1], right=[2]
            result.append(left[i])
            i += 1    
    # append any remaining items (at most one list will have items left)
    result.extend(left[i:])
    result.extend(right[j:])
    return result

merge([1,4,8], [2,3,10])

[1, 2, 3, 4, 8, 10]

What are the base case and the left identity?

In [29]:
def singleton(value):
    """ Create a list with one element. """
    return [value]

## reduce(COMBINE_FUNCTION, LEFT_IDENTITY, list(map(BASECASE, mylist)))
reduce(merge, [], list(map(singleton, [1,3,6,4,8,7,5,2])))

[1, 2, 3, 4, 5, 6, 7, 8]

What if we use `iterate` instead of `reduce`?

In [30]:
iterate(merge, [], list(map(singleton, [1,3,6,4,8,7,5,2])))

[1, 2, 3, 4, 5, 6, 7, 8]

### Order matters

![order](figures/order.png)




### Analysis

`iterate(merge, [], list(map(singleton, [1,3,6,4,8,7,5,2])))`

This is **insertion sort**!

- We iterate from left to right.
- At each step we insert the next element into the appropriate place in the sorted list.

$[1] \rightarrow [1,3] \rightarrow [1,3,6] \rightarrow [1,3,4,6] \ldots$
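
The intermediate states can be printed with a small tracing variant of `iterate`. This is a sketch with assumptions: `iterate_trace` is a hypothetical helper written as a loop, and `heapq.merge` from the standard library stands in for the `merge` function above so the cell is self-contained.

```python
import heapq

def merge(left, right):
    # stand-in for the merge function above: merges two sorted lists
    return list(heapq.merge(left, right))

def iterate_trace(f, x, a):
    """Loop version of iterate that also records every intermediate accumulator."""
    trace = []
    for item in a:
        x = f(x, item)
        trace.append(x)
    return x, trace

singletons = [[v] for v in [1, 3, 6, 4, 8, 7, 5, 2]]
result, trace = iterate_trace(merge, [], singletons)
for state in trace:
    print(state)
```

Each printed list is the previous one with exactly one new element inserted at its sorted position, which is the defining step of insertion sort.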

<br><br>

Assuming the `merge` function has **work** $O(n)$.

$$W(n) = W(n-1) + n \in O(n^2)$$

<br><br>

Assuming the `merge` function has **span** $O(\lg n)$ (note that our implementation above is a sequential loop, so its span is $O(n)$; a parallel merge is needed to reach $O(\lg n)$).

$$S(n) = S(n-1) + \lg n \in O(n \lg n)$$


<br><br><br><br>

`reduce(merge, [], list(map(singleton, [1,3,6,4,8,7,5,2])))`

This is **merge sort**!

<br><br>

Assuming the `merge` function has **work** $O(n)$.

$$W(n) = 2W(n/2) + n \in O(n \lg n)$$

Assuming the `merge` function has **span** $O(\lg n)$.

$$S(n) = S(n/2) + \lg n \in O(\lg^2 n)$$
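
The two work bounds can be compared empirically by counting how many elements `merge` writes in total. This is a sketch under stated assumptions: `make_counting_merge` and `work_of` are hypothetical helpers, `iterate` is written as a loop (equivalent to the recursive definition), and the count is a proxy for work, not a timing.

```python
def make_counting_merge():
    """Return (merge, counter); counter['moved'] totals the elements written by merge."""
    counter = {'moved': 0}
    def merge(left, right):
        i = j = 0
        result = []
        while i < len(left) and j < len(right):
            if right[j] < left[i]:
                result.append(right[j])
                j += 1
            else:
                result.append(left[i])
                i += 1
        result.extend(left[i:])
        result.extend(right[j:])
        counter['moved'] += len(result)
        return result
    return merge, counter

def iterate(f, x, a):
    for item in a:
        x = f(x, item)
    return x

def reduce(f, id_, a):
    if len(a) == 0:
        return id_
    if len(a) == 1:
        return a[0]
    mid = len(a) // 2
    return f(reduce(f, id_, a[:mid]), reduce(f, id_, a[mid:]))

def work_of(strategy, n):
    """Sort n reversed values with the given strategy and return the merge count."""
    merge, counter = make_counting_merge()
    singletons = [[v] for v in range(n, 0, -1)]
    result = strategy(merge, [], singletons)
    assert result == list(range(1, n + 1))
    return counter['moved']

for n in [8, 64, 512]:
    print(n, work_of(iterate, n), work_of(reduce, n))
```

For these powers of two, the `iterate` count is $n(n+1)/2$ (36 for $n = 8$) and the `reduce` count is $n \lg n$ (24 for $n = 8$), matching the $O(n^2)$ and $O(n \lg n)$ bounds.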

### Map-Reduce

Scalable, parallel programming model popularized by Google.

![figures/mr.png](figures/mr.png)
[source](https://dzone.com/articles/word-count-hello-word-program-in-mapreduce)
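
The word count example from the linked article can be sketched on a single machine with the `map` and `reduce` pattern from this lecture. This is only an illustration of the idea, not the distributed framework: the names `count_words` and `combine_counts` and the sample documents are made up, and a real MapReduce system also groups intermediate results by key across machines.

```python
from collections import Counter

def reduce(f, id_, a):
    if len(a) == 0:
        return id_
    if len(a) == 1:
        return a[0]
    mid = len(a) // 2
    return f(reduce(f, id_, a[:mid]), reduce(f, id_, a[mid:]))

def count_words(document):
    """Map step: turn one document into a Counter of its words."""
    return Counter(document.lower().split())

def combine_counts(c1, c2):
    """Reduce step: adding two Counters sums the counts of each word."""
    return c1 + c2

documents = ['the cat sat', 'the dog sat', 'the end']
totals = reduce(combine_counts, Counter(), list(map(count_words, documents)))
print(totals['the'], totals['sat'], totals['cat'])  # 3 2 1
```

Adding counters is associative, and the empty `Counter()` is its identity, so the per-document counts can be combined in any grouping.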

