# Map, Filter, Reduce

or Processing Iterables Without a Loop

---

## [Mapping](https://docs.python.org/3.9/library/functions.html#map)

Mapping consists of applying a transformation function to an iterable to produce a new iterable. Each item in the new iterable is the result of calling the transformation function on the corresponding item in the original iterable.

- The goal of `map()` is to apply a function to every item of an iterable, so you can process and transform all the items without writing an explicit `for` loop.

#### `map()` returns a _map object_, an iterator that yields items on demand

Like a generator, a map object is lazy: it computes each item only when asked for it, which keeps memory consumption low. This is why `map()` returns an iterator in Python 3 instead of a list, as it did in Python 2.
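To make the "on demand" behaviour concrete, here is a minimal sketch. The function `subtract_ten` and the list `numbers` are hypothetical names chosen for illustration; they are not part of the notebook.

```python
def subtract_ten(n):
    # Hypothetical transformation function, used only for this illustration
    return n - 10

numbers = [10, 20, 30]
lazy = map(subtract_ten, numbers)   # nothing has been computed yet

first = next(lazy)    # computes only the first item
rest = list(lazy)     # consumes the remaining items
again = list(lazy)    # the iterator is exhausted, so this is empty

print(first, rest, again)  # 0 [10, 20] []
```

A map object can be consumed only once; to traverse the results again, build a new `map()` or store them in a list.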

In [1]:
lst = [i for i in range(10_000_000)]

lst[:10]

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [2]:
len(lst)

10000000

In [8]:
def restar(numero):
    
    """
    This function must receive a single input element.
    For map, think of it as a function applied to one element of the iterable.
    
    numero is local: it only exists inside this function.
    
    """
    
    return numero-10

In [12]:
numero

NameError: name 'numero' is not defined

In [9]:
restar(8)

-2

In [5]:
%%time

res = []

for e in lst:
    
    # e is each element of the list
    
    n = restar(e)
    
    res.append(n)
    
res[:10]

CPU times: user 1.08 s, sys: 60.2 ms, total: 1.14 s
Wall time: 1.14 s


[-10, -9, -8, -7, -6, -5, -4, -3, -2, -1]

In [6]:
%%time

# same as above, written as a list comprehension

res = [restar(numero) for numero in lst]

res[:10]

CPU times: user 616 ms, sys: 62.4 ms, total: 678 ms
Wall time: 678 ms


[-10, -9, -8, -7, -6, -5, -4, -3, -2, -1]

In [13]:
%%time

# map(function, iterable)

map(restar, lst)

CPU times: user 4 µs, sys: 1e+03 ns, total: 5 µs
Wall time: 8.11 µs


<map at 0x1456e4910>

In [14]:
%%time

# map(function, iterable)

list(map(restar, lst))[:10]

CPU times: user 424 ms, sys: 53.6 ms, total: 477 ms
Wall time: 477 ms


[-10, -9, -8, -7, -6, -5, -4, -3, -2, -1]
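The `%%time` magic used in these cells only works in IPython/Jupyter, and a bare `map(...)` call returns in microseconds because no item is computed until the iterator is consumed. As a rough sketch, the same comparison in plain Python could use `timeit`. The names `subtract_ten`, `data`, `with_loop`, `with_comprehension` and `with_map` are assumptions made for this example, and the list is much smaller than the notebook's.

```python
import timeit

def subtract_ten(n):
    # Hypothetical stand-in for the transformation being timed
    return n - 10

data = list(range(100_000))  # far smaller than the notebook's 10M items

def with_loop():
    out = []
    for item in data:
        out.append(subtract_ten(item))
    return out

def with_comprehension():
    return [subtract_ten(item) for item in data]

def with_map():
    # list() forces the lazy map object to do the actual work
    return list(map(subtract_ten, data))

for fn in (with_loop, with_comprehension, with_map):
    seconds = timeit.timeit(fn, number=5)
    print(f"{fn.__name__}: {seconds:.3f} s for 5 runs")
```

Absolute timings depend on the machine and Python version, so treat the ranking as indicative rather than definitive.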

In [16]:
%%time

x = map(restar, lst)

CPU times: user 5 µs, sys: 0 ns, total: 5 µs
Wall time: 9.3 µs


In [17]:
type(x)

map

In [18]:
print(x)

<map object at 0x1456dd100>


In [19]:
id(x)

5459792128

In [20]:
lst[:10]

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [22]:
%%time

map(lambda numero: numero-10, lst)

CPU times: user 4 µs, sys: 1 µs, total: 5 µs
Wall time: 7.87 µs


<map at 0x1456c6f10>

In [24]:
%%time

map(lambda numero: numero*2, lst)

CPU times: user 4 µs, sys: 1 µs, total: 5 µs
Wall time: 9.78 µs


<map at 0x1456d5f10>

In [25]:
%%time

list(map(lambda numero: numero*2, lst))[:10]

CPU times: user 456 ms, sys: 61.1 ms, total: 517 ms
Wall time: 539 ms


[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

In [26]:
[1,2,3] * 2

[1, 2, 3, 1, 2, 3]

In [27]:
dictio = {'a': 2, 'b': 3, 'c': 4}

dictio

{'a': 2, 'b': 3, 'c': 4}

In [28]:
for e in dictio:
    print(e)

a
b
c


In [29]:
for e in dictio.values():
    print(e)

2
3
4


In [30]:
for e in dictio.items():
    print(e)

('a', 2)
('b', 3)
('c', 4)


In [31]:
map(lambda x: x**2, dictio.values())

<map at 0x1456d5fd0>

In [32]:
list(map(lambda x: x**2, dictio.values()))

[4, 9, 16]

In [33]:
map(lambda x: x[1]**2, dictio.items())

<map at 0x1456dd280>

In [34]:
list(map(lambda x: x[1]**2, dictio.items()))

[4, 9, 16]

In [38]:
list(map(lambda x: (x[0], x[1]**2), dictio.items()))

[('a', 4), ('b', 9), ('c', 16)]

In [39]:
dict(map(lambda x: (x[0], x[1]**2), dictio.items()))

{'a': 4, 'b': 9, 'c': 16}
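For comparison, the same key/value transformation can be sketched as a dict comprehension, which unpacks each `(key, value)` pair instead of indexing with `x[0]` and `x[1]`. The variable names below are assumptions for this example.

```python
values = {'a': 2, 'b': 3, 'c': 4}  # same shape as the dictionary used above

# map() version: each item is a (key, value) tuple
squared_with_map = dict(map(lambda item: (item[0], item[1] ** 2), values.items()))

# dict comprehension version: the tuple is unpacked into key and value
squared_with_comprehension = {key: value ** 2 for key, value in values.items()}

print(squared_with_comprehension)  # {'a': 4, 'b': 9, 'c': 16}
```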

In [41]:
def funcion():
    
    return 2,3

In [42]:
funcion()

(2, 3)

In [44]:
def funcion2():
    
    return (2,3)

In [45]:
funcion2()

(2, 3)

---

## [Filtering](https://docs.python.org/3.9/library/functions.html#filter)

Filtering consists of applying a predicate or Boolean-valued function to an iterable to generate a new iterable. Items in the new iterable are produced by filtering out any items in the original iterable that make the predicate function return false.

- The goal of the `filter()` function is to keep only the elements of our sequence for which the function we pass returns a truthy value, removing the rest.

In [46]:
def buscar_par(numero):
    
    return numero%2==0   # True when numero is even

In [47]:
buscar_par(8)

True

In [48]:
buscar_par(3)

False

In [49]:
lst[:10]

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [50]:
%%time

# filter(function, iterable)

filter(buscar_par, lst)

CPU times: user 4 µs, sys: 1 µs, total: 5 µs
Wall time: 9.06 µs


<filter at 0x1100bae80>

In [53]:
%%time

list(filter(buscar_par, lst))[:10]

CPU times: user 531 ms, sys: 16 ms, total: 547 ms
Wall time: 545 ms


[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

In [54]:
%%time

res = []

for e in  lst:
    
    if e%2==0:
        
        res.append(e)
    
    else:
        pass
    

res[:10]

CPU times: user 576 ms, sys: 26.2 ms, total: 602 ms
Wall time: 602 ms


[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
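The loop and `filter()` versions above also have a list comprehension equivalent. A minimal sketch, assuming a hypothetical predicate `is_even` and a small `numbers` range:

```python
def is_even(n):
    # Hypothetical predicate: True for even numbers
    return n % 2 == 0

numbers = range(20)

evens_with_filter = list(filter(is_even, numbers))
evens_with_comprehension = [n for n in numbers if is_even(n)]

print(evens_with_comprehension[:5])  # [0, 2, 4, 6, 8]
```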

In [59]:
%%time

list(filter(lambda patata: patata%2, lst))[:10]  # odd numbers: patata%2 is 1, which is truthy

CPU times: user 390 ms, sys: 13.7 ms, total: 404 ms
Wall time: 402 ms


[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]

In [60]:
%%time

list(filter(lambda patata: patata%2==0, lst))[:10]  # even numbers

CPU times: user 436 ms, sys: 13.8 ms, total: 450 ms
Wall time: 449 ms


[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

In [55]:
3%2

1
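The predicate passed to `filter()` does not have to return a strict `True` or `False`: any value is interpreted by its truthiness. The short sketch below shows why `n % 2` alone selects the odd numbers.

```python
# The remainder is 0 (falsy) for even numbers and 1 (truthy) for odd ones
for n in (2, 3):
    remainder = n % 2
    print(n, remainder, bool(remainder))
# 2 0 False
# 3 1 True

odds = list(filter(lambda n: n % 2, range(6)))
print(odds)  # [1, 3, 5]
```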

In [61]:
# JSON-like format (a list of dictionaries)

hoteles=[
    
    {'name':'Ritz', 'hasPool':True, 'stars':5},
    
    {'name':'Pension Lola', 'hasPool':True, 'stars':2},
    
    {'name':'Roma Norte', 'hasPool':False, 'stars':3},
    
    {'name':'Palace', 'hasPool':True, 'stars':4},
]

In [64]:
type(hoteles)

list

In [62]:
# fewer than 3 stars

# here, inside the lambda, x is each dictionary

list(filter(lambda x: x['stars']<3, hoteles))

[{'name': 'Pension Lola', 'hasPool': True, 'stars': 2}]

In [63]:
# more than 3 stars

list(filter(lambda x: x['stars']>3, hoteles))

[{'name': 'Ritz', 'hasPool': True, 'stars': 5},
 {'name': 'Palace', 'hasPool': True, 'stars': 4}]

In [66]:
(lambda x: x['stars']>3)({'name':'Palace', 'hasPool':True, 'stars':4})

True

In [68]:
# fewer than 3 stars and with a pool (multiple conditions)

list(filter(lambda x: x['stars']<3 and x['hasPool'], hoteles))

[{'name': 'Pension Lola', 'hasPool': True, 'stars': 2}]

In [69]:
list(filter(lambda x: x['stars']>3 and x['hasPool'] and x['name']=='Ritz', hoteles))

[{'name': 'Ritz', 'hasPool': True, 'stars': 5}]

In [71]:
dictio['d']

KeyError: 'd'

In [72]:
dictio.get('d', 20)

20

In [73]:
dictio.get('a', 20)

2

In [74]:
list(filter(lambda x: x.get('stars', 0)>3, hoteles))

[{'name': 'Ritz', 'hasPool': True, 'stars': 5},
 {'name': 'Palace', 'hasPool': True, 'stars': 4}]

In [78]:
list(filter(lambda x: x.get('hola', 10)>3, hoteles))

[{'name': 'Ritz', 'hasPool': True, 'stars': 5},
 {'name': 'Pension Lola', 'hasPool': True, 'stars': 2},
 {'name': 'Roma Norte', 'hasPool': False, 'stars': 3},
 {'name': 'Palace', 'hasPool': True, 'stars': 4}]
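Using `.get()` with a default matters when some records lack the key. In the sketch below, the list `hotels` and the entry without a `'stars'` key are made-up data for illustration only.

```python
hotels = [
    {'name': 'Ritz', 'stars': 5},
    {'name': 'Hostal Sin Datos'},        # hypothetical entry with no 'stars' key
    {'name': 'Roma Norte', 'stars': 3},
]

# x['stars'] would raise KeyError on the second entry;
# x.get('stars', 0) treats the missing rating as 0 instead
top = list(filter(lambda x: x.get('stars', 0) > 3, hotels))

print(top)  # [{'name': 'Ritz', 'stars': 5}]
```

The default should be chosen so that records with missing data end up on the intended side of the condition; a default of `10`, as in the `'hola'` example above, lets every record through.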

---

## [Reducing](https://docs.python.org/3.9/library/functools.html#functools.reduce)

Reducing consists of applying a reduction function to an iterable to produce a single cumulative value.

- The goal of the `reduce()` function is to aggregate all elements in a sequence into a single value, by applying a two-argument function cumulatively from left to right.
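As a minimal sketch of this cumulative behaviour, the example below also uses the optional third argument of `functools.reduce`, the initializer, which provides the starting value. The helper `add` is a hypothetical name for this illustration.

```python
from functools import reduce

def add(a, b):
    # a is the accumulated value so far, b is the next item
    return a + b

total = reduce(add, [1, 2, 3, 4])                 # ((1 + 2) + 3) + 4
total_from_100 = reduce(add, [1, 2, 3, 4], 100)   # accumulation starts at 100
empty_total = reduce(add, [], 0)                  # empty input: the initializer is returned

print(total, total_from_100, empty_total)  # 10 110 0
```

Without an initializer, calling `reduce()` on an empty iterable raises a `TypeError`, so passing one is a reasonable habit when the input may be empty.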

MapReduce, whose name comes from the map and reduce functions of functional programming, is a programming model and an associated implementation for processing and generating big data sets with a parallel, distributed algorithm on a cluster.

In [79]:
from functools import reduce

In [80]:
reduce

<function _functools.reduce>

In [81]:
# equivalent to: lambda a,b: a+b


def sumar(a, b):
    
    print(a, b)
    
    return a+b

In [82]:
sumar(2, 3)

2 3


5

In [83]:
lst[:10]

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [84]:
reduce(sumar, lst[:10])

0 1
1 2
3 3
6 4
10 5
15 6
21 7
28 8
36 9


45

In [85]:
sum(lst[:10])

45

In [86]:
prod

NameError: name 'prod' is not defined

In [87]:
def producto(a, b):
    
    print(a, b)
    
    return a*b

In [88]:
reduce(producto, lst[1:10])

1 2
2 3
6 4
24 5
120 6
720 7
5040 8
40320 9


362880

In [89]:
numero_prod = reduce(producto, lst[1:10])

1 2
2 3
6 4
24 5
120 6
720 7
5040 8
40320 9


In [90]:
numero_prod

362880

In [92]:
res = 1

for e in lst[1:10]:
    
    print(res)
    
    res *= e   # res = res * e
    
res

1
1
2
6
24
120
720
5040
40320


362880
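There is no builtin named `prod` (hence the `NameError` earlier), but since Python 3.8 the standard library offers `math.prod`. A short sketch comparing it with the `reduce()` version, using a silent lambda instead of the printing `producto` function:

```python
from functools import reduce
from math import prod  # available since Python 3.8

numbers = range(1, 10)

with_reduce = reduce(lambda a, b: a * b, numbers)
with_prod = prod(numbers)

print(with_reduce, with_prod)  # 362880 362880
```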

In [93]:
lst_lst = [[1,2,3], [4,5,6], [7,8,9]]


len(lst_lst)

3

In [94]:
# a list with 9 elements

reduce(lambda a,b: a+b, lst_lst)  # flatten

[1, 2, 3, 4, 5, 6, 7, 8, 9]

In [95]:
reduce(sumar, lst_lst)  # flatten

[1, 2, 3] [4, 5, 6]
[1, 2, 3, 4, 5, 6] [7, 8, 9]


[1, 2, 3, 4, 5, 6, 7, 8, 9]
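Flattening with `reduce()` builds a new, longer list at every step, which gets slow for many sublists. As an alternative sketch (not the notebook's method), `itertools.chain.from_iterable` or a nested comprehension gives the same result in a single pass. The variable names are assumptions for this example.

```python
from functools import reduce
from itertools import chain

nested = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

flat_reduce = reduce(lambda a, b: a + b, nested)      # repeated list concatenation
flat_chain = list(chain.from_iterable(nested))        # lazily chains the sublists
flat_comprehension = [item for sub in nested for item in sub]

print(flat_chain)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```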

In [96]:
reduce(lambda a,b,c: a+b+c, lst_lst)  

TypeError: <lambda>() missing 1 required positional argument: 'c'
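The error above occurs because `reduce()` always calls the function with exactly two arguments: the accumulated value and the next item. The sketch below records each call to make this visible; `record_and_add` and `calls` are hypothetical names for this illustration.

```python
from functools import reduce

calls = []

def record_and_add(accumulator, item):
    # Store the pair of arguments reduce() passed in on this step
    calls.append((accumulator, item))
    return accumulator + item

result = reduce(record_and_add, [1, 2, 3, 4])

print(calls)   # [(1, 2), (3, 3), (6, 4)]
print(result)  # 10
```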