# Class Review - Map, Reduce and Filter


## 1. Map

The `map` function usually receives two parameters: a function and an iterable (usually a list or a tuple). What `map` does is apply the function to every element of the iterable and return an iterator that yields the results. 

The terminology might be a bit confusing, but the idea is simple. Let's look at a couple of examples:

In [1]:
nums = [1,2,3,4,5]

def double(n):
    return n*2

In the code block above we have a list of numbers and a function that takes a single number and returns the same number multiplied by two. 

If we wanted to apply the function to each number in our `nums` list, we could use a `for` loop:

In [2]:
result = []
for n in nums:
    result.append(double(n))
print(result)

[2, 4, 6, 8, 10]


We could also use a list comprehension to achieve the same result:

In [9]:
result = [double(n) for n in nums]
print(result)

[2, 4, 6, 8, 10]


`map` allows us to do the same thing, with a couple of key differences:

In [10]:
result = list(map(double,nums))
print(result)

[2, 4, 6, 8, 10]


The syntax is more concise: we don't have to write a `for` loop, bind each element of `nums` to the variable `n`, or call `double` ourselves, since `map` does that job for us. 

Another key difference is that `map` does not return a list: we need to convert the result to a list in order to see the values we expect. Let's look at what happens if we just print the `map` result above, without converting it to a list:

In [11]:
result = map(double,nums)
print(result)

<map object at 0x7fb8357d1320>


`map object`... Now what is that?

A `map object` is a type of iterator. Without going into too much detail, an iterator is an object that produces a sequence of values. Unlike a list or tuple, however, an iterator only lets us access one value at a time, and each value can be read only once. We do that by using the built-in `next` function:

In [12]:
next(result)

2

As we can see, `next(result)` gave us the first element in our `result` map object. If we continue using `next`, we can get the rest of the values:

In [13]:
print(next(result))
print(next(result))
print(next(result))
print(next(result))

4
6
8
10


We have arrived at the last value of our iterator. Using `next` again would raise a `StopIteration` exception. 

In [14]:
print(next(result))

StopIteration: 
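As a minimal sketch of what an exhausted iterator looks like in practice, the snippet below rebuilds the `result` map object from scratch, consumes every value, and then handles the `StopIteration` exception. It also shows that `next` accepts an optional default value to return instead of raising. The variable name `values` is introduced here only for illustration.

```python
nums = [1, 2, 3, 4, 5]

def double(n):
    return n * 2

result = map(double, nums)

# Consume all five values, one next() call at a time.
values = [next(result) for _ in range(5)]
print(values)  # [2, 4, 6, 8, 10]

# The iterator is now exhausted, so next() raises StopIteration.
try:
    next(result)
except StopIteration:
    print("no more values")

# Passing a default as the second argument avoids the exception.
print(next(result, None))  # None
```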

Iterators can be quite useful when dealing with a very large number of items. For example, if our iterator can produce one million values and we only need the first 100, there's no point in spending processing power and memory to build a list of one million elements. An iterator computes each value only when it is requested, so it is a lot more efficient in those situations.
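The one-million-values scenario can be sketched roughly like this. The example assumes we use `itertools.islice` to take the first 100 results, which is just one way of doing it. The `calls` list is a hypothetical helper added only to record how many times `double` actually runs.

```python
from itertools import islice

calls = []  # records every argument double() is actually called with

def double(n):
    calls.append(n)
    return n * 2

# range() and map() are both lazy: nothing has been computed yet.
lazy = map(double, range(1_000_000))

# Take only the first 100 results from the iterator.
first_100 = list(islice(lazy, 100))

print(first_100[:5])  # [0, 2, 4, 6, 8]
print(len(calls))     # 100
```

Even though the map object could produce one million values, `double` only ran 100 times.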

Again, there's no need to get into too much detail: most of the time we'll just convert the map object into a list instead of using it as an iterator.
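The section opens by saying that `map` *usually* receives two parameters. It can also take more than one iterable, in which case the function must accept one argument per iterable. Here is a small sketch of that. The names `prices`, `quantities` and `total` are made up for this example.

```python
prices = [10, 20, 30]
quantities = [1, 2, 3]

def total(price, quantity):
    return price * quantity

# map() pairs up the elements: total(10, 1), total(20, 2), total(30, 3)
totals = list(map(total, prices, quantities))
print(totals)  # [10, 40, 90]
```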

## 2. Filter

Much like `map`, the `filter` function also receives two parameters. The first parameter is a *filtering function*; the second parameter is an iterable (usually a tuple or a list). 

The *filtering function* will receive an element of the iterable and **return a boolean (`True` or `False`) depending on whether the element fulfills a certain condition.** `filter` will then return an iterator containing only the elements of the iterable for which the filtering function returned `True`.

Let's see how this would work in practice. 

In [16]:
nums = [1,2,3,4,5,6,7,8,9,10]

def no_odds(n):
    return n % 2 == 0

We want to use the `no_odds` function to return a new list without the odd numbers from `nums`. We could use a `for` loop and an `if` statement to do that:

In [17]:
result = []
for n in nums:
    if no_odds(n):
        result.append(n)
print(result)

[2, 4, 6, 8, 10]


We can do the same thing with a list comprehension, of course:

In [18]:
result = [n for n in nums if no_odds(n)]
print(result)

[2, 4, 6, 8, 10]


The same could be done using the `filter` function, as expected. It works a lot like `map`.

In [19]:
result = list(filter(no_odds,nums))
print(result)

[2, 4, 6, 8, 10]


As you probably noticed, the result from `filter` had to be converted into a list before printing. Without the conversion, the result would be a `filter` object:

In [22]:
result = filter(no_odds,nums)
print(result)

<filter object at 0x7fb8357d1cf8>


The `filter object` is an iterator, just like the `map object` we saw above. 

In [23]:
print(next(result))
print(next(result))
print(next(result))
print(next(result))
print(next(result))

2
4
6
8
10


Again, although the `filter object` has its uses, most of the time we'll just convert it into a list instead of using it as an iterator.
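One practical reason to convert the result into a list is that an iterator can only be consumed once. The sketch below illustrates this with the same `nums` and `no_odds` definitions. The names `evens`, `first_pass` and `second_pass` are introduced only for this example.

```python
nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

def no_odds(n):
    return n % 2 == 0

evens = filter(no_odds, nums)

first_pass = list(evens)   # consumes the whole iterator
second_pass = list(evens)  # nothing is left to consume

print(first_pass)   # [2, 4, 6, 8, 10]
print(second_pass)  # []
```

If we need the values more than once, storing them in a list right away avoids this surprise.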

## 3. Reduce

Unlike the previous two functions, `reduce` is not a built-in: it has to be imported from the `functools` module. Let's do that before anything else.

In [24]:
from functools import reduce

`reduce` will usually receive two parameters: a function and an iterable (usually a list or tuple). An optional third parameter can be passed, which we'll discuss later. 

The idea behind `reduce` is to combine the elements of a list, two at a time, using a function that takes two arguments: the result accumulated so far and the next element. At the end, the whole list has been transformed into a single value. Sounds confusing? Let's start with a list of numbers and a function that multiplies two numbers:

In [25]:
nums = [1,2,3,4,5,6,7,8,9,10]
def multiply(a,b):
    return a*b

We want to use the multiply function to find out the result we would get after multiplying all elements in our list. This is how we would do it with a loop:

In [26]:
result = 1 #because if we start with 0, everything will be multiplied by 0 ;)
for n in nums:
    result = multiply(result,n)
print(result)

3628800


So, the result of multiplying all numbers in our list is `3628800`. Incidentally, since `nums` holds the integers from 1 to 10, we have just computed the factorial of 10. But that's quite a bit of code, isn't it? Let's see how we could do this with `reduce`:

In [27]:
result = reduce(multiply,nums)
print(result)

3628800


Beautiful, isn't it? We get the same result with a lot less code. As a bonus, there is no conversion needed: `reduce` returns the exact value we expected.
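To see the accumulation happening step by step, one option is `itertools.accumulate`, which applies the same two-argument function but yields every intermediate result instead of only the last one. This is only an illustrative sketch. `steps` is a name chosen for this example.

```python
from functools import reduce
from itertools import accumulate

nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

def multiply(a, b):
    return a * b

# Each entry is the running product up to that point in the list.
steps = list(accumulate(nums, multiply))
print(steps)
# [1, 2, 6, 24, 120, 720, 5040, 40320, 362880, 3628800]

# The last intermediate value is what reduce() returns.
print(steps[-1] == reduce(multiply, nums))  # True
```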

Now, I promised we would talk about the optional third parameter. Let's say we have this list of students:

In [29]:
students = [{'name':'Maria', 'age':20}, 
            {'name':'José', 'age':23}, 
            {'name':'Pancho', 'age':33},
            {'name':'Isabel', 'age':38}]

I want to use reduce to get the average of all ages. We could try to do it like this:

In [36]:
def sum_ages(a,b):
    return a['age'] + b['age']

result = reduce(sum_ages,students)/len(students)
print(result)

TypeError: 'int' object is not subscriptable

We got a weird error, didn't we? Here's what happened: the function runs once and adds the ages for Maria and José. We get 43. Then the function runs again, using as parameters the accumulated value (43) and the next element in the list (`{'name':'Pancho', 'age':33}`). So it tries to add `43['age']` and `{'name':'Pancho', 'age':33}['age']`... but the number 43 is not a dictionary, so it doesn't have an `age` key, does it? When we try to look up a key with square brackets on a data type that doesn't support that (like an integer), we get a TypeError: `'int' object is not subscriptable` 

We could try to fix it by adjusting our `sum_ages` function so that it doesn't look for an `age` key in the first parameter. Do you think it will work?

In [35]:
def sum_ages(a,b):
    return a + b['age']

result = reduce(sum_ages,students)/len(students)
print(result)

TypeError: unsupported operand type(s) for +: 'dict' and 'int'

Didn't work, right? What happened now is that the first time our `sum_ages` function runs, it runs with the arguments `{'name':'Maria', 'age':20}` and `{'name':'José', 'age':23}`. So it tries to do this:

`return {'name':'Maria', 'age':20} + {'name':'José', 'age':23}['age']`

Since we're not getting the value of the `age` key for the first parameter, we end up trying to add a dictionary and an integer. Python tells us we cannot do that: `TypeError: unsupported operand type(s) for +: 'dict' and 'int'`

The solution is to pass an initial value for the accumulator as the third parameter to `reduce`. This is the value the accumulation starts from, before any element of the list has been processed. Since it's a sum, we'll use `0`. For a multiplication, we would use `1`; for string concatenation, `""`... you get the idea.

In [37]:
def sum_ages(a,b):
    return a + b['age']

result = reduce(sum_ages,students,0)/len(students)
print(result)

28.5


Now it runs beautifully. The first time our `sum_ages` function runs, it receives the initial value as the first parameter and the first element in our list (`{'name':'Maria', 'age':20}`) as the second:

`return 0 + {'name':'Maria', 'age':20}['age']`

That runs with no issues, and we'll continue adding the value in the `'age'` key of each element to our accumulator until we reach the end of our list.
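To make the role of the third parameter more concrete, here is a rough pure-Python sketch of what `reduce` does. It is not the real implementation in `functools`. The names `my_reduce` and `_MISSING` are hypothetical and exist only for this illustration.

```python
_MISSING = object()  # hypothetical marker meaning "no initial value was given"

def my_reduce(function, iterable, initial=_MISSING):
    it = iter(iterable)
    if initial is _MISSING:
        # Without an initial value, the first element becomes the accumulator.
        try:
            accumulated = next(it)
        except StopIteration:
            raise TypeError("empty iterable with no initial value") from None
    else:
        accumulated = initial
    # Combine the accumulator with each remaining element, one at a time.
    for item in it:
        accumulated = function(accumulated, item)
    return accumulated

students = [{'name': 'Maria', 'age': 20},
            {'name': 'José', 'age': 23},
            {'name': 'Pancho', 'age': 33},
            {'name': 'Isabel', 'age': 38}]

def sum_ages(a, b):
    return a + b['age']

average = my_reduce(sum_ages, students, 0) / len(students)
print(average)  # 28.5

# With an initial value, an empty list simply returns that value.
print(my_reduce(sum_ages, [], 0))  # 0
```

Notice that without the initial value, the first dictionary itself would become the accumulator, which is exactly what caused the second error above.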

## BONUS:

To keep our code short and neat, we'll usually pass lambda functions to `map`, `filter` and `reduce` instead of defining a named function separately. Here's the challenge: go back to the beginning of this notebook and reimplement the examples of `map`, `filter` and `reduce` using lambda functions. 
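As a hint on syntax only, here is a small sketch of how a lambda compares to a regular function. To avoid giving away the challenge, it uses `sorted` rather than any of the three functions above. The names `word_length`, `word_length_lambda` and `words` are made up for this example.

```python
# A regular named function...
def word_length(word):
    return len(word)

# ...and an equivalent lambda: parameters before the colon, result after it.
word_length_lambda = lambda word: len(word)

print(word_length("reduce"), word_length_lambda("reduce"))  # 6 6

# Lambdas are usually written inline, right where the function is needed.
words = ["filter", "map", "reduce!"]
by_length = sorted(words, key=lambda word: len(word))
print(by_length)  # ['map', 'filter', 'reduce!']
```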