## Longest Gap

Consider the following problem: given an input sequence $a$ of $n$ integers and an integer $key$, find the longest gap between two consecutive appearances of $key$, i.e., the maximum number of elements lying strictly between them. For example, if $key=7$, then the answer for the list below is $3$ (the underlined run):

$[5, 6, \mathbf{7}, 9, 8, \mathbf{7}, \underline{4, 5, 6}, \mathbf{7}, 9, 12, \mathbf{7}, 9]$

For simplicity, we will assume that $key$ appears at least twice in $a$.
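To pin down exactly what is being asked, here is a minimal sequential sketch of the problem. The function name `longest_gap_bruteforce` is hypothetical and is not part of the exercise; it walks the list once and is only meant as a reference to check answers against.

```python
def longest_gap_bruteforce(a, key):
    """Sequential reference: most elements strictly between consecutive keys."""
    best = 0
    prev = None  # index of the most recent occurrence of key
    for i, v in enumerate(a):
        if v == key:
            if prev is not None:
                # elements strictly between positions prev and i
                best = max(best, i - prev - 1)
            prev = i
    return best


example = [5, 6, 7, 9, 8, 7, 4, 5, 6, 7, 9, 12, 7, 9]
print(longest_gap_bruteforce(example, 7))  # 3
```

This loop is inherently sequential (span $O(n)$), which is why the rest of the question builds the answer from `map`, `scan`, and `reduce` instead.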


We will solve this problem using `map`, `reduce`, and `scan`. The main observation is that we can first map the input to a sequence where position $i$ is $1$ if the value at $i$ matches $key$, and is $0$ otherwise. E.g., for the example above:

$[0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0]$
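As a quick illustration of this first step, the sketch below builds the 0/1 sequence for the example. The helper name `indicator` is a placeholder chosen here; it plays the role that `my_map` has in the skeleton.

```python
def indicator(key, v):
    # 1 if this element is an occurrence of key, 0 otherwise
    return 1 if v == key else 0


example = [5, 6, 7, 9, 8, 7, 4, 5, 6, 7, 9, 12, 7, 9]
flags = [indicator(7, v) for v in example]
print(flags)  # [0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0]
```

Each position is computed independently of the others, so this step is a plain `map`.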

Next, we will run a scan on this sequence. We will then run map-reduce over the resulting prefix sequence to count how many positions share each prefix value, and a final reduce over those counts will solve the original problem.
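To see why the prefix sequence is useful, the sketch below computes it for the example and counts how often each prefix value occurs. It uses `itertools.accumulate` and `collections.Counter` purely as sequential stand-ins for `scan` and map-reduce; that substitution is an assumption made for illustration, not the course implementation.

```python
from collections import Counter
from itertools import accumulate

flags = [0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0]

# Inclusive prefix sums: prefixes[i] = number of keys in positions 0..i
prefixes = list(accumulate(flags))
print(prefixes)  # [0, 0, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4]

# How many positions share each prefix value
group_sizes = Counter(prefixes)
print(sorted(group_sizes.items()))  # [(0, 2), (1, 3), (2, 4), (3, 3), (4, 2)]
```

A prefix value $k$ that lies strictly between $0$ and the total number of occurrences covers the $k$-th occurrence of the key plus the gap that follows it, so its count is the gap length plus one. Here the largest such count is $4$ (for $k=2$), giving a gap of $3$.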

 

Below is a skeleton of the implementation. Your job is to fill in the blanks (marked with `...`).

In [None]:
def longest_gap(a, key):
    a_prime = list(map(lambda v: my_map(key, v), a))
    prefixes, last = scan(...)
    mr_result = run_map_reduce(map_f, reduce_f, prefixes)
    counts = [v[1] for v in mr_result]
    result = reduce(...)
    return ...

In the below, you are welcome to reference any functions we have defined in lecture, labs, or assignments. Assume that we are using the most efficient versions of `map`, `reduce`, `scan`, and `run_map_reduce` that we have discussed.

a) Write the `my_map` implementation in Python.

b) Complete the call to `scan` in line 3.

c) Write the `map_f` and `reduce_f` functions passed to `run_map_reduce` in line 4.

d) Complete the call to `reduce` and return the final result in lines 6 and 7.

e) What are the Work and Span of the final algorithm? Show your work.
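For reference when reasoning about part e), one common work-efficient formulation of `scan` is contraction: combine adjacent pairs, recurse on the half-length sequence, then expand. The sketch below is an assumption about what "most efficient" means here and may differ in detail from the version discussed in lecture; the name `scan_contract` is hypothetical. It returns an inclusive prefix list and the total, like the `scan` used in the skeleton.

```python
def scan_contract(f, id_, a):
    """Inclusive scan by contraction; returns (prefixes, total)."""
    n = len(a)
    if n == 0:
        return [], id_
    if n == 1:
        return [a[0]], a[0]
    # Contract: combine adjacent pairs (each pair is independent).
    pairs = [f(a[i], a[i + 1]) for i in range(0, n - 1, 2)]
    # Recurse on the half-length sequence.
    # partial[j] is the combination of a[0] .. a[2j+1].
    partial, _ = scan_contract(f, id_, pairs)
    # Expand: odd positions are read off directly; even positions
    # need one extra application of f (each position is independent).
    out = []
    for i in range(n):
        if i % 2 == 1:
            out.append(partial[i // 2])
        elif i == 0:
            out.append(a[0])
        else:
            out.append(f(partial[i // 2 - 1], a[i]))
    return out, out[-1]


print(scan_contract(lambda x, y: x + y, 0, [0, 0, 1, 0, 0, 1]))
# ([0, 0, 1, 1, 1, 2], 2)
```

If the contract and expand loops are run in parallel, the recurrences are $W(n) = W(n/2) + O(n)$ and $S(n) = S(n/2) + O(1)$, which give $O(n)$ work and $O(\lg n)$ span.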







In [None]:
from collections import defaultdict
import math

        
def plus(x, y):
    return x + y

def iterate(f, x, a):
    if len(a) == 0:
        return x
    else:
        return iterate(f, f(x, a[0]), a[1:])
        
def reduce(f, id_, a):
    # print('a=%s' % a) # for tracing
    if len(a) == 0:
        return id_
    elif len(a) == 1:
        return a[0]
    else:
        # can call these in parallel
        return f(reduce(f, id_, a[:len(a)//2]),
                  reduce(f, id_, a[len(a)//2:]))

def scan(f, id_, a):
    """
    This is a horribly inefficient implementation of scan
    only to understand what it does.
    We'll discuss how to make it more efficient later.
    """
    return (
            [reduce(f, id_, a[:i+1]) for i in range(len(a))],
             reduce(f, id_, a)
           )

def collect(pairs):
    result = defaultdict(list)
    for pair in pairs:
        result[pair[0]].append(pair[1])
    return list(result.items())

def flatten(sequences):
    return iterate(plus, [], sequences)



def run_map_reduce(map_f, reduce_f, mylist):
    pairs = flatten(list(map(map_f, mylist)))
    groups = collect(pairs)
    return [reduce_f(g) for g in groups]


## Start from Here Q3
def my_map(key, v):
    if key == v:
        return 1
    else:
        return 0

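def map_f(value):
    # Emit one (prefix value, 1) pair per position.
    return [(value, 1)]

def reduce_f(group):
    # group is (prefix value, [1, 1, ...]); the sum counts positions sharing that value.
    return (group[0], reduce(plus, 0, group[1]))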

def longest_gap(a, key):
    # Indicator sequence: 1 where a[i] == key, 0 elsewhere.
    a_prime = list(map(lambda v: my_map(key, v), a))
    print('a_prime', a_prime)
    # Inclusive prefix sums; `last` is the total number of occurrences of key.
    prefixes, last = scan(plus, 0, a_prime)
    print('prefixes', prefixes)

    # Count how many positions share each prefix value.
    mr_result = run_map_reduce(map_f, reduce_f, prefixes)
    print('mr_result', mr_result)
    counts = [v[1] for v in mr_result]
    print('counts', counts)

    # Only prefix values 1..last-1 describe a stretch between two occurrences.
    # Value 0 is the run before the first key, and value `last` is the final key
    # plus whatever follows it (e.g. [9, 9, 9, 7, 7] must give 0, not 2).
    between = [v[1] for v in mr_result if 0 < v[0] < last]
    # Each remaining count is one key plus the gap after it, hence the -1.
    return reduce(max, -math.inf, between) - 1

a = [5, 6, 7, 9, 8, 7, 4, 5, 6, 7, 9, 12, 7]
key = 7

longest_gap(a,key)

# Work: n (map) + n (scan) + n (map-reduce) + n (counts) + n (reduce) = O(n)
# Span: 1 (map) + lg n (scan) + lg n (map-reduce) + 1 (counts) + lg n (reduce) = O(lg n)


a_prime [0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1]
prefixes [0, 0, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4]
mr_result [(0, 2), (1, 3), (2, 4), (3, 3), (4, 1)]
counts [2, 3, 4, 3, 1]


3
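One way to gain confidence in the prefix-count idea beyond a single example is to compare it with a direct scan over positions on random inputs. The sketch below is self-contained and uses `itertools.accumulate` and `collections.Counter` as sequential stand-ins for `scan` and `run_map_reduce`; both function names are hypothetical. It only considers prefix values strictly between $0$ and the total number of occurrences, so runs before the first occurrence and after the last one are not mistaken for gaps.

```python
import random
from collections import Counter
from itertools import accumulate


def gap_via_prefix_counts(a, key):
    """Prefix-count idea with sequential stand-ins; assumes key occurs >= 2 times."""
    flags = [1 if v == key else 0 for v in a]
    prefixes = list(accumulate(flags))
    last = prefixes[-1]  # total number of occurrences of key
    counts = Counter(prefixes)
    # Prefix values 1..last-1 each cover one key plus the gap after it.
    return max(counts[k] for k in range(1, last)) - 1


def gap_by_positions(a, key):
    """Direct check: distances between consecutive occurrence indices."""
    positions = [i for i, v in enumerate(a) if v == key]
    return max(q - p - 1 for p, q in zip(positions, positions[1:]))


random.seed(0)
for _ in range(1000):
    a = [random.randint(0, 3) for _ in range(random.randint(2, 12))]
    if a.count(0) < 2:
        continue  # the problem assumes at least two occurrences
    assert gap_via_prefix_counts(a, 0) == gap_by_positions(a, 0)
print("all random cases agree")
```

Inputs such as `[9, 9, 9, 7, 7]` (long run before the first key) and `[7, 7, 9, 9, 9]` (long run after the last key) are the cases most worth checking by hand.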