In [None]:
import re
import numpy as np
import pandas as pd


# Map, Filter & Reduce

Let's recap the differences between the two programming paradigms we've seen so far:

**Imperative Paradigm**
- The program is a series of **instructions** that modify a **state**:

```python
x = 0
for i in range(10):
    x = (x + i)*2
print(x)
```
- The *variable* `x` is the state of our program, which is modified through the `for` loop.

- One of the simplest forms of programming, typical of older programming languages (C, Fortran and COBOL, for example).

**Functional Programming**
- There is no state: the program defines functions which are applied over the input.
```python
def somar_2(x):
    return x + 2

def mult_4(x):
    return x * 4

saida = somar_2(mult_4(somar_2(entrada)))
```
- In the functional paradigm, functions are values: they can be assigned to variables, passed as arguments and combined like any other object.
- Originated in the late 1950s with LISP and is present today in many data-oriented languages such as R, Julia and (partly) Python.

## Functions are variables

In [None]:
soma_1 = lambda x: x + 1


In [None]:
soma_1


In [None]:
def soma_1_c(x):
    return x + 1


In [None]:
soma_1_c


In [None]:
soma_1(soma_1_c(10))


We can use this to define new functions in terms of other functions:

In [None]:
somar_n = lambda x, n: x + n


In [None]:
soma_1 = lambda x: somar_n(x, 1)


In [None]:
soma_1(10)
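
The cells above define `soma_1` by calling `somar_n`. For a function that actually returns another function, one option is a closure. The sketch below is only illustrative: `criar_somador` is a hypothetical helper name that does not appear elsewhere in this notebook.

```python
def criar_somador(n):
    """Return a new function that adds n to its argument."""
    def somador(x):
        # n is remembered from the enclosing call (a closure)
        return x + n
    return somador


soma_1_closure = criar_somador(1)
soma_5_closure = criar_somador(5)

print(soma_1_closure(10))  # 11
print(soma_5_closure(10))  # 15
```

Each call to `criar_somador` produces an independent function, so there is no need to write one `lambda` per value of `n`.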


## The `map` concept

One of the key concepts in functional programming is **mapping**: applying a function to the elements of a set, list or other iterable. 

In [1]:
lista_exemplo = [10, 12, 34, 23, 2, 6, 7]


In [2]:
def div_2(x):
    return x / 2


A simple call of `div_2(lista_exemplo)` will not work!

In [None]:
div_2(lista_exemplo)


The function `div_2` expects a number as its argument, but `lista_exemplo` is a list!

We could create an empty list and use a loop to iterate over `lista_exemplo`:

In [None]:
new_list = []

for item in lista_exemplo:
    new_list.append(div_2(item))

new_list


Another way is to use a list comprehension, one of the tools in the functional programming toolbox:

In [None]:
[div_2(item) for item in lista_exemplo]


A third way is using `map()`:

In [None]:
for i in map(div_2, lista_exemplo):
    print(i)


The results from `map()` are **lazy**: they are not computed when you call `map()`, but only when you actually ask for them!

In [3]:
list(map(div_2, lista_exemplo))


[5.0, 6.0, 17.0, 11.5, 1.0, 3.0, 3.5]

An *interesting* behavior of **lazy** iterators is that they are **consumed as you iterate over their elements**: once exhausted, they yield nothing more.

In [None]:
resultado_map = map(div_2, lista_exemplo)
for i in resultado_map:
    print(i)


In [None]:
list(resultado_map)


In [None]:
resultado_map = map(div_2, lista_exemplo)


In [None]:
list(resultado_map)


### Lazy evaluation

Lazy evaluation is an important concept in Big Data: it saves memory and CPU by performing computations only **when they are needed**.
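
One way to observe this is to record which values a mapped function actually processes. This is a minimal sketch: `chamadas` and `dobrar_com_registro` are hypothetical names introduced only for this illustration.

```python
chamadas = []  # records every value the function actually processes


def dobrar_com_registro(x):
    chamadas.append(x)
    return x * 2


resultado_lazy = map(dobrar_com_registro, [1, 2, 3, 4, 5])
print(chamadas)  # [] -> nothing has been computed yet

primeiro = next(resultado_lazy)
print(primeiro)  # 2
print(chamadas)  # [1] -> only the first element was processed
```

The remaining four elements are computed only if something keeps consuming `resultado_lazy`, for example a `for` loop or `list()`.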

In [4]:
lista_telefones = [
    19999571559,
    "(21) 2412-0107",
    "(34) 99762-1166",
    "91-4002-8282",
    "(19) 3542-1820",
    "(19) 3561-9525",
    "(34) 3333-5802",
]
pattern = r"[0-9]{2}"


In [5]:
lista_dds = list(map(lambda x: re.findall(pattern, str(x))[0], lista_telefones))
print(lista_dds)


['19', '21', '34', '91', '19', '19', '34']

In [None]:
for ddd in map(lambda x: "".join(re.findall(pattern, str(x)))[:2], lista_telefones):
    print(ddd)


## Filtering `filter()`

The second important piece of the functional paradigm is the `filter()` function: it lets us filter the elements of an iterable using a function that returns boolean values. Like `map()`, `filter()` evaluates an iterable lazily and returns only the elements for which the applied function returns `True`.

Let's continue our example with a list of phone numbers and a function that extracts the DDD (the area code):

In [None]:
lista_telefones = [
    19999571559,
    "(21) 2412-0107",
    "(34) 99762-1166",
    "91-4002-8282",
    "(19) 3542-1820",
    "(19) 3561-9525",
    "(34) 3333-5802",
]


def extrair_ddd(telefone):
    """
    Receives a phone number and returns its DDD (area code).

    telefone (str or int): Phone number whose first two numeric digits are the DDD
    """
    pattern = r"[0-9]{2}"
    return "".join(re.findall(pattern, str(telefone)))[:2]


In [None]:
map_19 = filter(lambda x: extrair_ddd(x) == "19", lista_telefones)
for i in map_19:
    print(i)


In [None]:
lista_ddd_19 = list(
    filter(lambda x: extrair_ddd(x) == "19", lista_telefones)
)
print(lista_ddd_19)


In [None]:
filtro_19 = filter(lambda x: extrair_ddd(x) == "19", lista_telefones)
for telefone in filtro_19:
    print(telefone)


Both `map()` and `filter()` are similar to list comprehensions. The main difference is that they are *lazy*, while a list comprehension builds the whole list immediately:

In [6]:
[telefone for telefone in lista_telefones if extrair_ddd(telefone) == "19"]


[19999571559, '(19) 3542-1820', '(19) 3561-9525']
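
For a lazy version of the comprehension syntax, Python offers generator expressions, written with parentheses instead of square brackets. The sketch below compares the three styles on a small list of numbers; all variable names are chosen only for this illustration.

```python
numeros = [10, 12, 34, 23, 2, 6, 7]

# List comprehension: builds the whole list immediately
pares_lista = [n / 2 for n in numeros if n % 2 == 0]

# Generator expression: same logic, evaluated lazily
pares_gerador = (n / 2 for n in numeros if n % 2 == 0)

# map + filter: also lazy
pares_map = map(lambda n: n / 2, filter(lambda n: n % 2 == 0, numeros))

resultado_gerador = list(pares_gerador)
resultado_map = list(pares_map)

print(pares_lista)        # [5.0, 6.0, 17.0, 1.0, 3.0]
print(resultado_gerador)  # [5.0, 6.0, 17.0, 1.0, 3.0]
print(resultado_map)      # [5.0, 6.0, 17.0, 1.0, 3.0]
```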

## Aggregating iterables with `reduce()`

The function `reduce()` implements an `accumulator`. Let's see how this works with the simple function `sum_two_elements(a, b)`:

```python
def sum_two_elements(a,b):
    return a+b
```

Now, let's use `reduce()` to *reduce* our list to a single value by summing its elements:

```python
reduce( sum_two_elements, [1,4,6,8] )
```

```python
# With no initial value, the first element seeds the accumulator.
a = 1      # accumulator
b = 4      # value
a = a + b  # 5: the accumulator receives the cumulative sum

b = 6      # value
a = a + b  # 11

b = 8      # value
a = a + b  # 19

print(a)   # 19, the value returned by reduce()
```
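
`reduce()` also accepts an optional third argument, an initial value for the accumulator. The sketch below shows how it changes the result, and why it helps with empty iterables; the values `100` and `0` are arbitrary choices for illustration.

```python
from functools import reduce


def sum_two_elements(a, b):
    return a + b


# Without an initial value, the first element seeds the accumulator
sem_inicial = reduce(sum_two_elements, [1, 4, 6, 8])
print(sem_inicial)  # 19

# With an initial value, the accumulator starts from it
com_inicial = reduce(sum_two_elements, [1, 4, 6, 8], 100)
print(com_inicial)  # 119

# An initial value also makes reduce() safe on an empty iterable;
# without it, reduce() raises a TypeError
lista_vazia = reduce(sum_two_elements, [], 0)
print(lista_vazia)  # 0
```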

In [8]:
from functools import reduce


### Example 1: Numbers

In [None]:
def somar_ab(a, b):
    print(f"a={a}, b={b}")
    return a + b


In [None]:
lista_numeros = [1, 4, 6, 8]
reduce(somar_ab, lista_numeros)


In [None]:
def comp_ab(x, y):
    print(f"a={x}, b={y}")
    if x > y:
        return x
    else:
        return y


reduce(comp_ab, [2, 10, 25, 1, -10, 13, 40, 20])


### Example 2: Strings

In [None]:
lista_letras = ["P", "e", "d", "r", "o"]


In [None]:
reduce(lambda x, y: x + y, lista_letras)


Let's use reduce to select the longest string in a list:

In [None]:
lista_nomes = ["Amapá", "Roraima", "Pará", "Piauí", "Maranhão"]
reduce(lambda x, y: x if len(x) > len(y) else y, lista_nomes)


### Example 3: Chaining Map, Filter & Reduce

In [12]:
list_tuples = [(12, 119), (-12, 43), (28, 39), (12, 21), (-14, 43)]

In [13]:
map_prod = map(lambda x: x[0] * x[1], list_tuples)
filt_neg = filter(lambda x: x > 0, map_prod)
smallest = reduce(lambda x, y: x if x < y else y, filt_neg)

print(smallest)

252
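
As a point of comparison, the same pipeline could be sketched with generator expressions and the built-in `min()`, which performs the reduction done by the `lambda` passed to `reduce()`. The names `produtos`, `positivos` and `smallest_alt` are hypothetical and used only here.

```python
list_tuples = [(12, 119), (-12, 43), (28, 39), (12, 21), (-14, 43)]

# Generator expressions stay lazy, like map() and filter()
produtos = (a * b for a, b in list_tuples)
positivos = (p for p in produtos if p > 0)

# min() consumes the lazy pipeline and keeps the smallest value
smallest_alt = min(positivos)
print(smallest_alt)  # 252
```

Which style to prefer is largely a matter of readability: `reduce()` generalizes to any two-argument function, while built-ins such as `min()`, `max()` and `sum()` cover the most common reductions.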
