## Python `map` and `reduce` (15 min)

- You may have heard about [MapReduce](https://en.wikipedia.org/wiki/MapReduce) in the context of big data.
- We won't get into details here, but at least we will introduce `map` and `reduce`.
- You also saw these in DSCI 511 (the R part); here we'll go into slightly more detail and do it in Python.

#### `map`

In [3]:
def times_two(x):
    return x*2

data = [1, 2, 3, 4]

In [4]:
list(map(times_two, data))

[2, 4, 6, 8]

In [5]:
list(range(5))

[0, 1, 2, 3, 4]

- In Python 3, `map` returns a map object, which is a lazy iterator that behaves much like a generator.
- The idea is that you might not want to store all the results in memory; often you just want to iterate through them.

In [6]:
result = map(times_two, data)

In [7]:
result[2]

TypeError: 'map' object is not subscriptable

In [8]:
next(result)

2

In [9]:
next(result)

4

You can explicitly convert the result to a list:

In [10]:
list(map(times_two, data))

[2, 4, 6, 8]

In [11]:
[e for e in map(times_two, data)]

[2, 4, 6, 8]

But often this is unnecessary and just takes up more time/memory for no reason.
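
For instance, when the goal is only to aggregate or loop over the results, the map object can be consumed directly. The following is a minimal sketch that re-declares `times_two` and `data` so it runs on its own; the names `total`, `first_pass`, and `second_pass` are just for illustration. It also shows one consequence of laziness: a map object can be iterated only once.

```python
def times_two(x):
    return x * 2

data = [1, 2, 3, 4]

# Consume the map object directly; no intermediate list is built
total = sum(map(times_two, data))
print(total)  # 20

# A map object is exhausted after one pass
result = map(times_two, data)
first_pass = list(result)
second_pass = list(result)
print(first_pass)   # [2, 4, 6, 8]
print(second_pass)  # []
```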

- You will see something similar to `map` going by many names, like `apply` in pandas. In R there is also `purrr::map`.
- All of these share the same idea: apply a function to each element of a list.
- In fact, the input doesn't even have to be a list; it could be a generator:

In [128]:
result = map(times_two, map(times_two, data))

In [129]:
result

<map at 0xa224a7da0>

In [130]:
next(result)

4

Here, we applied `map` to a map object. This is already an example of the interchangeable nature of lists and generators. 
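
The same works with a generator you write yourself. As a small sketch, `countdown` below is a hypothetical generator function introduced only for this illustration; `map` pulls values from it one at a time, exactly as it would from a list.

```python
def times_two(x):
    return x * 2

def countdown(n):
    """Yield n, n-1, ..., 1 (hypothetical helper for illustration)."""
    while n > 0:
        yield n
        n -= 1

# map accepts the generator directly; nothing is computed until we iterate
doubled = map(times_two, countdown(3))
doubled_list = list(doubled)
print(doubled_list)  # [6, 4, 2]
```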

Note: you'll often see people using `lambda` functions inline, as in:

In [131]:
list(map(lambda x: x*2, data))

[2, 4, 6, 8]

This is more convenient than actually naming a function `times_two`, which feels unnecessary.

#### `reduce`

In [132]:
from functools import reduce

- Another common operation is to reduce, or aggregate, data.
- Examples: sum, max

In [133]:
data

[1, 2, 3, 4]

In [134]:
sum(data)

10

In [135]:
max(data)

4

- These are examples of a general phenomenon in which data are aggregated **pairwise**. 
- That is, $1+2+3+4=((1+2)+3)+4$
- And $\max\{1,2,3,4\}=\max\{\max\{\max\{1,2\},3\},4\}$

In [136]:
reduce(lambda x, y: x+y, data)

10

In [137]:
reduce(lambda x, y: x if x > y else y, data)

4
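
The pairwise grouping described above can be made visible with `itertools.accumulate`, which yields every intermediate result rather than only the final one. This is a small sketch that re-declares `data` so it runs on its own; the names `running_sum` and `running_max` are illustrative. The last element of each list matches what `reduce` returns for the same function.

```python
from itertools import accumulate

data = [1, 2, 3, 4]

# Intermediate results of ((1+2)+3)+4
running_sum = list(accumulate(data, lambda x, y: x + y))
print(running_sum)  # [1, 3, 6, 10]

# Intermediate results of max(max(max(1, 2), 3), 4)
running_max = list(accumulate(data, lambda x, y: x if x > y else y))
print(running_max)  # [1, 2, 3, 4]
```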

Conveniently, `reduce` can take in a generator as the data, so it can be coupled effectively with `map`. For example:

In [138]:
reduce(lambda x, y: x+y, map(lambda x: x*2, data))

20

Here, we multiplied all the numbers by $2$ and then added them together. The generator from `map` was aggregated by `reduce`.
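
One detail that matters when chaining the two: if the iterable handed to `reduce` turns out to be empty, there is no first element to start from. `functools.reduce` accepts an optional third argument, an initial value, that covers this case. The sketch below uses `0` as the initial value because we are summing; the variable names are illustrative.

```python
from functools import reduce

data = [1, 2, 3, 4]

# With an initial value, the aggregation starts from 0 instead of the first element
doubled_total = reduce(lambda x, y: x + y, map(lambda x: x * 2, data), 0)
print(doubled_total)  # 20

# An empty input now returns the initial value
empty_total = reduce(lambda x, y: x + y, map(lambda x: x * 2, []), 0)
print(empty_total)  # 0

# Without an initial value, reducing an empty iterable raises TypeError
try:
    reduce(lambda x, y: x + y, map(lambda x: x * 2, []))
    raised = False
except TypeError:
    raised = True
print(raised)  # True
```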

`reduce` and recursion:

- We tend to think of these functions recursively, especially `reduce`. 
- In fact, here is an implementation of `reduce`:

In [139]:
def my_reduce(func, data):
    """ 
    Apply a function to pairs of elements in data, recursively from left to right.

    Parameters
    ----------
    func : function
        a function taking two arguments, that we will reduce on
    data : list
        a list of values

    Returns
    -------
    object
        the single value left once all elements have been combined; each recursive call works on a list that is one element shorter

    Example
    --------
    >>> data = [1,2,3,4]
    >>> my_reduce(lambda x,y: x*y, data)
    24
    """

    if len(data) == 1:
        return data[0]

    # Apply the function to the first two elements
    new_element = func(data[0], data[1])

    # Concatenate the new element and the rest of the list
    new_list = [new_element] + data[2:]

    # Recursively reduce on the new list
    return my_reduce(func, new_list)

In [140]:
my_reduce(lambda x, y: x+y, data)

10

- The ideas of data aggregation and recursion are tied together.
- You just need to define an aggregation operation on two elements, then apply it recursively.
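
Because `my_reduce` relies on `len` and slicing, it only accepts lists (or similar sequences), not a map object or generator. The following is a sketch of an iterative variant that accepts any iterable. The name `my_reduce_iter` is hypothetical, and this is not necessarily how `functools.reduce` is implemented internally; it only mirrors the left-to-right pairwise behavior described above.

```python
def my_reduce_iter(func, iterable):
    """Combine the elements of any iterable pairwise, left to right."""
    it = iter(iterable)
    try:
        # The first element seeds the running result
        result = next(it)
    except StopIteration:
        raise TypeError("my_reduce_iter() of empty iterable") from None

    # Fold each remaining element into the running result
    for element in it:
        result = func(result, element)
    return result

data = [1, 2, 3, 4]
total = my_reduce_iter(lambda x, y: x + y, map(lambda x: x * 2, data))
print(total)  # 20
```

The loop keeps only one running value in memory, so it also avoids building a new list at every step.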

Summary:

| Python name | Other names | Inputs      |  Outputs |
|-------------|-------------|-------------|----------|
|   `map`     |   apply     | a function of one argument and a list/iterable | a new list/iterable with the function applied to each element |
| `reduce`    | aggregate  | a function of two arguments and a list/iterable | a single value with the function applied recursively to pairs of elements |


## (optional) Exercise 7: `map` and `reduce`
rubric={accuracy:1}

**NOTE**: this optional exercise pertains to the optional section of the lecture video on `map` and `reduce`.

Write a function `commonLetters` that takes in a list of strings, converts the strings to lower case and finds all the characters that are present in _all_ the strings. Your function should return the result as a Python `set` of characters. Your function must use Python's `map` and `reduce` functions for the heavy lifting - no loops, recursion, or other trickery! You must also write at least 3 tests for your function, in addition to the test provided.

Some potentially helpful functions:

- You can convert a string `s` to lower case with [`s.lower()`](https://docs.python.org/3/library/stdtypes.html#str.lower).
- You can find the intersection (common elements) between two sets `a` and `b` with [`a.intersection(b)`](https://docs.python.org/3.7/library/stdtypes.html#frozenset.intersection), or `a & b` for short.

In [45]:
# Provided example code
s = "BLAH blah I am sayIng Stuff"
s.lower()

'blah blah i am saying stuff'

In [46]:
set1 = {'a', 'b', 'c', 99}
set2 = {'b', 'c', 'd', 99}
set1.intersection(set2)

{99, 'b', 'c'}

In [1]:
from functools import reduce

In [2]:
### BEGIN SOLUTION

def commonLetters(strings):
    """Finds the characters common to all the strings given and 
       returns them as a Python set. Case insensitive.

    Parameters
    ----------
    strings : list
        strings you want to find common characters between

    Returns
    -------
    set
        the common characters present in all strings

    Examples
    --------
    >>> commonLetters(["123", "345", "367"])
    {'3'}
    """

    lower_strings = map(str.lower, strings)
    set_strings = map(set, lower_strings)
    return reduce(lambda x, y: x & y, set_strings)

### END SOLUTION

In [3]:
# provided code
assert commonLetters(["abc", "ABC", "AbCdE", "the quick brown fox jumped over the lazy dog",
                      "abraham lincoln", "abracadabra", "chEEs3"]) == {'c'}

In [4]:
### BEGIN SOLUTION

assert commonLetters(["abc", "def"]) == set()
assert commonLetters(["Aac", 'Bbaccc', 'cccbbbbaaa']) == {'a', 'c'}
assert commonLetters(["", "the quick brown fox jumped over the lazy dog"]) == set()

### END SOLUTION

## True/False

1. Python generators contain one or more `yield` statements instead of `return` statements.
2. Python generators allow "random access", i.e. `gen[3]` to get the 4th element.
3. `map(f, x)` applies the function `f` to every element of `x`.
4. It is reasonable to use the same function `f` with both `map` and `reduce`.