## map, reduce and filter

`map`, `reduce` and `filter` are three higher-order functions that facilitate a functional approach to programming. They appear in most functional languages and are also available in Python, which is a multi-paradigm language rather than a pure functional one: `map` and `filter` are built-ins, while `reduce` lives in `functools`. They are often used to make code more concise.

### Map

`map` applies a function to every item of a list (or any other iterable) and returns the results. The calls run one after another, not in parallel.

It takes a function and a collection of items as parameters and runs the function on each item in the original collection, collecting each return value. In Python 3 the result is a lazy `map` object (an iterator) rather than a new list, so wrap it in `list()` or `tuple()` when a concrete collection is needed.
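The behaviour described above can be sketched as a small generator. `my_map` is a hypothetical name used only for illustration and covers just the single-iterable case; the built-in `map` is implemented in C and also accepts several iterables.

```python
def my_map(func, items):
    """Illustrative stand-in for the single-iterable form of map()."""
    for item in items:
        # Nothing runs until the caller asks for the next value.
        yield func(item)

lengths = my_map(len, ["Aalok", "Vivek"])
first_pass = list(lengths)   # consumes the generator
second_pass = list(lengths)  # the generator is already exhausted

print(first_pass)   # [5, 5]
print(second_pass)  # []
```

Like the real `map` object, the generator can be consumed only once.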

The following example takes a list of names and returns a list of the lengths of those names, first with a plain loop and then with `map`:

In [2]:
names = ["Manish Gupta", "Aalok", "Vivek","Durga Prasad"]
lst = []

for name in names:
    lst.append(len(name))
    
print(lst)

[12, 5, 5, 12]


In [13]:
names =  ("Manish Gupta", "Aalok", "Vibhor","Durga Prasad")

lst = map(len, names)
print(lst, list(lst))

<map object at 0x7f9485c9af50> [12, 5, 6, 12]


In [17]:
for data in map(len, names):
    print(data)

12
5
6
12


In [19]:
for data in map(str.upper, names):
    print(data)

MANISH GUPTA
AALOK
VIBHOR
DURGA PRASAD


In [22]:
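# This is a map that squares every number in the passed collection.
# `lst` from the earlier cell is a map object that has already been consumed,
# so the lengths are recomputed here.
power = map(lambda x: x * x, map(len, names))
print(power)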

<map object at 0x7f9485c9a7d0>


Let's perform similar operations on a larger collection.

In [31]:
import random

student_count = 999
max_marks = 100
min_marks = 0
semester = 8
marks = [[random.randint(min_marks, max_marks) for _ in range(semester)] for _ in range(student_count)]

In [32]:
print(marks[:3])

[[49, 0, 95, 1, 23, 6, 83, 33], [93, 18, 84, 27, 74, 77, 68, 11], [94, 82, 84, 48, 96, 88, 74, 94]]


In [34]:
%%timeit

results = []
for a in marks:
    results.append(sum(a[:6]) * 0.1 + sum(a[6:]))

535 µs ± 12.3 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


Let's use `map` to get the same result:

In [38]:
%%timeit
results = tuple(map(lambda a: sum(a[:6]) * 0.1 + sum(a[6:]), marks))

634 µs ± 35.6 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


In [40]:
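%%timeit
# Note: this only creates the lazy map object; no marks are processed yet,
# which is why the timing is in nanoseconds.
results = map(lambda a: sum(a[:6]) * 0.1 + sum(a[6:]), marks)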

227 ns ± 21.8 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


In [41]:
%%timeit

lst = []
for data in map(lambda a: sum(a[:6]) * 0.1 + sum(a[6:]), marks):
    lst.append(data)

651 µs ± 31.2 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


In [43]:
results = tuple(map(lambda a: sum(a[:6]) * 0.1 + sum(a[6:]), marks))

print(results[:10])

(133.4, 116.30000000000001, 217.2, 154.8, 76.5, 112.4, 97.7, 169.6, 147.4, 206.7)


In [46]:
user_details = {
    1: {
    "name": "Mayank Johri",
    "age":44
    },
    2: {
    "name": "Rahul Saxena",
    "age":48
    },
    3: {
    "name": "Sachin",
    "age": 42
    },
    4: {
    "name": "Rajeev Chaturvedi",
    "age": 43
    }
}

The objective is to find all users with age 44 or above.

In [47]:
def _isvalid(user):
    # The comparison already yields a bool, so no conditional expression is needed.
    flg = user['age'] >= 44
    return user['name'], flg

In [48]:
for user in user_details.values():
    print(_isvalid(user))

('Mayank Johri', True)
('Rahul Saxena', True)
('Sachin', False)
('Rajeev Chaturvedi', False)


In [49]:
valid_users = tuple(map(_isvalid, user_details.values()))
print(valid_users)

(('Mayank Johri', True), ('Rahul Saxena', True), ('Sachin', False), ('Rajeev Chaturvedi', False))
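`map` keeps every user and only attaches a flag. If the goal is to keep just the users aged 44 or above, `filter` may be a closer fit. The following is a minimal sketch that repeats the same `user_details` data so that it runs on its own; the variable name `older_users` is an assumption made for illustration.

```python
user_details = {
    1: {"name": "Mayank Johri", "age": 44},
    2: {"name": "Rahul Saxena", "age": 48},
    3: {"name": "Sachin", "age": 42},
    4: {"name": "Rajeev Chaturvedi", "age": 43},
}

# filter() keeps only the users for which the lambda returns True,
# and map() then extracts the name from each remaining user.
older_users = tuple(
    map(lambda user: user["name"],
        filter(lambda user: user["age"] >= 44, user_details.values()))
)
print(older_users)  # ('Mayank Johri', 'Rahul Saxena')
```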


In [15]:
import random

names_dict = {}
names = ["Mayank", "Manish", "Aalok", "Roshan Musheer"]
code_names = ['Mr. Normal', 'Mr. God', 'Mr. Cool', 'The Big Boss']

for i in range(len(names)):
    # Pick a code name, retrying until one is found that is not already taken.
    name = random.choice(code_names)
    while name in names_dict.values():
        name = random.choice(code_names)
    names_dict[names[i]] = name
        
print(names_dict)

{'Manish': 'The Big Boss', 'Aalok': 'Mr. Cool', 'Roshan Musheer': 'Mr. God', 'Mayank': 'Mr. Normal'}


This can be rewritten with `random.shuffle`, `zip` and a dictionary comprehension wrapped in a lambda:

In [8]:
import random

names = ["Mayank", "Manish", "Aalok", "Roshan Musheer"]
code_names = ['Mr. Normal', 'Mr. God', 'Mr. Cool', 'The Big Boss']
random.shuffle(code_names)

a_dict = lambda: {k: v for k, v in zip(names, code_names)}
print(a_dict())

{'Mayank': 'Mr. Cool', 'Roshan Musheer': 'Mr. Normal', 'Aalok': 'The Big Boss', 'Manish': 'Mr. God'}


In [50]:
# TODO -> Try the above one using map, if possible

In [56]:
# WHY? 
# def dictMap(f, xs) :
#     return dict((f(i), i) for i in xs)
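One possible take on the TODO above, offered as a sketch rather than the intended answer: `dict()` accepts any iterable of key/value pairs, so a `map` that builds those pairs works. The names `shuffled` and `pairs` are assumptions for illustration, and in practice `dict(zip(names, shuffled))` does the same job without `map`.

```python
import random

names = ["Mayank", "Manish", "Aalok", "Roshan Musheer"]
code_names = ['Mr. Normal', 'Mr. God', 'Mr. Cool', 'The Big Boss']

# random.sample returns a new shuffled list and leaves code_names untouched.
shuffled = random.sample(code_names, k=len(code_names))

# map() with two iterables calls the lambda with one item from each.
pairs = map(lambda name, code: (name, code), names, shuffled)
names_dict = dict(pairs)
print(names_dict)
```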

`map` also accepts functions that take more than one parameter; pass one iterable per parameter:

In [57]:
lst = [1, 2, 3, 4]
lst2 = [2, 3, 4, 5]
print(list(map(pow, lst, lst2)))

[1, 8, 81, 1024]
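When the iterables differ in length, `map` stops as soon as the shortest one runs out. A small sketch with made-up lists (`short` and `longer` are illustrative names):

```python
from operator import add

short = [1, 2, 3]
longer = [10, 20, 30, 40, 50]

# Only three pairs can be formed, so only three results are produced.
sums = list(map(add, short, longer))
print(sums)  # [11, 22, 33]
```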


In [58]:
def fahrenheit(T):
    return (9 / 5) * T + 32

temp = (36.5, 37, 37.5, 39)

F = map(fahrenheit, temp)
print(list(F))

[97.7, 98.60000000000001, 99.5, 102.2]


### Reduce

`reduce` takes a function and a collection of items. It returns a single value that is created by combining the items. This is a simple reduce. It returns the product of all the items in the collection.

In [10]:
# Original Code
val = 1

for item in range(1, 6):
    val *=  item
    
print(val)

120


In [51]:
# Converted Code

from functools import reduce

product = reduce(lambda total, num: total * num, range(1, 6))
print(product) # (((1 * 2 )* 3 )* 4) * 5

120


In [53]:
# Adding the elements of the range,
# this time with an initial value passed to `reduce`.
# In the example below, `10` is the initial value.

print(reduce(lambda total, num: total + num, range(1, 6), 10)) 
#-> 10 + 1 + 2 + 3 + 4 + 5

25


Check the last argument in the example above: `10` is the initial value, so the result is `25` instead of `15`.

In the above example, `num` is the current item and `total` is the accumulator, that is, the value returned by the lambda for the previous item. `reduce()` walks through the items. For each one, it runs the lambda on the current `total` and `num` and passes the result on as the `total` of the next iteration.

What is `total` in the first iteration when no initial value is given? There is no previous result to pass along, so `reduce()` uses the first item in the collection as `total` and starts iterating at the second item. That is, the first `num` is the second item.
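The walk described above can be sketched in plain Python. `my_reduce` and the `_MISSING` sentinel are hypothetical names used only for illustration; the real `functools.reduce` is implemented in C, but it follows the same rules for the first accumulator.

```python
_MISSING = object()  # sentinel, so that None can still be a valid initial value

def my_reduce(func, items, initial=_MISSING):
    """Illustrative stand-in for functools.reduce()."""
    iterator = iter(items)
    if initial is _MISSING:
        # No initial value: the first item becomes the accumulator.
        try:
            total = next(iterator)
        except StopIteration:
            raise TypeError("my_reduce() of empty iterable with no initial value") from None
    else:
        total = initial
    for num in iterator:
        total = func(total, num)
    return total

product = my_reduce(lambda total, num: total * num, range(1, 6))
shifted_sum = my_reduce(lambda total, num: total + num, range(1, 6), 10)
print(product)      # 120
print(shifted_sum)  # 25
```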

This code counts how often the substring 'the' appears in a list of strings:

In [55]:
sentences = ['Copy the variable assignment for our new empty list ',
             'Copy the expression that we’ve been append-ing into this new list ',
             'Copy the for loop line, excluding the final ',
             'Copy the if statement line, also without the ']

count = 0
for sentence in sentences:
    count += sentence.count('the')

print(count)

6


In [56]:
def add(a, b):
    """
    In the first iteration, `a` is the first element and `b` is the second element.
    In the next iteration, `a` is the value returned by the first iteration and
    `b` is the third element, and so on for the remaining iterations.
    """
    print(a, b)
    if isinstance(a, int):  # Second iteration onwards
        result = a + b.count('the')
    else:  # First iteration
        result =  a.count('the') + b.count('the')
    return result


print(reduce(add, sentences))

Copy the variable assignment for our new empty list  Copy the expression that we’ve been append-ing into this new list 
2 Copy the for loop line, excluding the final 
4 Copy the if statement line, also without the 
6


In [57]:
def add(a, b):
    """
    By providing an initial value of `0`, the type check from the previous example is no longer needed.
    """
    print(a, b)
    result = a + b.count('the')
    return result


print(reduce(add, sentences, 0))

0 Copy the variable assignment for our new empty list 
1 Copy the expression that we’ve been append-ing into this new list 
2 Copy the for loop line, excluding the final 
4 Copy the if statement line, also without the 
6
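Instead of printing inside the function, the intermediate accumulator values can be collected with `itertools.accumulate`, which yields every running result rather than only the last one. This is a sketch with shortened, made-up sentences; the `initial` keyword needs Python 3.8 or later.

```python
from functools import reduce
from itertools import accumulate

sentences = ['Copy the variable',
             'Copy the expression',
             'Copy the for loop, excluding the final',
             'Copy the if statement, also without the']

def add(total, sentence):
    # The accumulator is always an int because the initial value is 0.
    return total + sentence.count('the')

running = list(accumulate(sentences, add, initial=0))
print(running)                     # [0, 1, 2, 4, 6]
print(reduce(add, sentences, 0))   # 6
```

The last element of `running` is the value `reduce` returns.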


#### Joining strings using `reduce`

- **Method 1**

In [58]:
colors = ["pink", "purple", "black", "yellow", "purple", "indego", "white", "peach"]
cols = "".join(colors)
print(cols)

pinkpurpleblackyellowpurpleindegowhitepeach


In [59]:
%%timeit

colors = ["pink", "purple", "black", "yellow", "purple", "indego", "white", "peach"]
cols = "".join(colors)

215 ns ± 16.8 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


- **Method 2**

In [61]:
import operator
from functools import reduce

colors = ["pink", "purple", "black", "yellow", "purple", "indego", "white", "peach"]
reduce(operator.add, colors)

'pinkpurpleblackyellowpurpleindegowhitepeach'

In [53]:
%%timeit
import operator
from functools import reduce

colors = ["pink", "purple", "black", "yellow", "purple", "indego", "white", "peach"]
d = reduce(operator.add, colors)

1.47 µs ± 65.5 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


Let's take another example. We can also use `reduce` to build up a dictionary, here a tally of how often `random.choice` picks each colour.

In [63]:
# Logic without `reduce`
import random

colors = ["pink", "purple", "black", "yellow", "green", "indego", "white", "peach"]
pre_val = {}

for _ in range(10000):
    colour = random.choice(colors)
    if colour in pre_val:
        pre_val[colour] += 1
    else:
        pre_val[colour] = 1
        
print(pre_val)

{'yellow': 1259, 'black': 1263, 'peach': 1244, 'white': 1249, 'indego': 1259, 'green': 1234, 'purple': 1203, 'pink': 1289}


In [79]:
import random

colors = ["pink", "purple", "black", "yellow", "green", "indego", "white", "peach"]

def random_finder(pre_val, _):
    """
    Tally one random colour per call to see how evenly `random.choice` picks.
    The second argument (the item from `range`) is ignored.
    """
    colour = random.choice(colors)
    if colour in pre_val:
        pre_val[colour] += 1
    else:
        pre_val[colour] = 1
    return pre_val

counts = reduce(random_finder, range(10000), {})
print(counts)

{'purple': 1295, 'black': 1294, 'indego': 1193, 'white': 1212, 'peach': 1230, 'pink': 1243, 'green': 1232, 'yellow': 1301}
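For comparison, the standard library's `collections.Counter` builds the same kind of tally directly. This is a sketch of an alternative, not a replacement for the `reduce` exercise above; the exact counts differ on every run because the picks are random.

```python
import random
from collections import Counter

colors = ["pink", "purple", "black", "yellow", "green", "indego", "white", "peach"]

# Counter consumes the generator and counts how often each colour appears.
counts = Counter(random.choice(colors) for _ in range(10000))
print(dict(counts))
```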


In [52]:
vals = [[10.592503862004378, 10.381575625004189, 10.195353463001084],
        [0.7834748989989748, 0.7892336399963824, 0.786548912001308],
        [0.82304873400426, 0.7867131360035273, 0.7751070890008123]]

def my_sum(val, a):
    val.append(sum(a))
    return val

sum_val = reduce(my_sum, vals, [])
print(sum_val)

[31.16943295000965, 2.3592574509966653, 2.3848689590085996]


In [119]:
help(reduce)

Help on built-in function reduce in module _functools:

reduce(...)
    reduce(function, sequence[, initial]) -> value
    
    Apply a function of two arguments cumulatively to the items of a sequence,
    from left to right, so as to reduce the sequence to a single value.
    For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates
    ((((1+2)+3)+4)+5).  If initial is present, it is placed before the items
    of the sequence in the calculation, and serves as a default when the
    sequence is empty.



In [122]:
vals = [[10.592503862004378, 10.381575625004189, 10.195353463001084],
        [0.7834748989989748, 0.7892336399963824, 0.786548912001308],
        [0.82304873400426, 0.7867131360035273, 0.7751070890008123]]

def my_sum(val, a):
    print(f"val: {val},\n a: {a}")
    val.append(sum(a))
    return val

sum_val = reduce(my_sum, vals)
print(sum_val)

val: [10.592503862004378, 10.381575625004189, 10.195353463001084],
 a: [0.7834748989989748, 0.7892336399963824, 0.786548912001308]
val: [10.592503862004378, 10.381575625004189, 10.195353463001084, 2.3592574509966653],
 a: [0.82304873400426, 0.7867131360035273, 0.7751070890008123]
[10.592503862004378, 10.381575625004189, 10.195353463001084, 2.3592574509966653, 2.3848689590085996]


**NOTE:**

How does `reduce()` come up with its initial accumulator? Without a third argument it uses the first item, which is why the example above appends the sums to the first inner list (and mutates `vals[0]`) instead of building a fresh list of three sums. The initial accumulator is specified with the third argument to `reduce()`. This also allows the accumulator to be of a different type from the items in the collection.
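A minimal sketch of the difference the third argument makes, using short made-up lists and the same `my_sum` helper as in the cells above:

```python
from functools import reduce

def my_sum(val, a):
    val.append(sum(a))
    return val

vals = [[1, 2], [3, 4], [5, 6]]

# With an initial empty list, the sums go into a fresh accumulator.
with_initial = reduce(my_sum, vals, [])
print(with_initial)  # [3, 7, 11]
print(vals[0])       # [1, 2]

# Without it, the first inner list becomes the accumulator and is mutated.
without_initial = reduce(my_sum, vals)
print(without_initial)             # [1, 2, 7, 11]
print(without_initial is vals[0])  # True
```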

In [101]:
import os
from array import array

def xor(txt, key):
    # XOR each character of `txt` with the matching integer in `key`.
    # Assumes ASCII text and key values in range(256), so each result fits in a byte.
    return array('B', (ord(a) ^ b for a, b in zip(txt, key))).tobytes()

with open(os.path.join('code', 'data', "pg7864.txt")) as fp:
    txt = fp.read()

In [35]:
import array
a1 = array.array('B', map(ord,"from array import array;a = tuple('all the world is a stage and all the men and women merely players')"))
print(a1)

array('B', [102, 114, 111, 109, 32, 97, 114, 114, 97, 121, 32, 105, 109, 112, 111, 114, 116, 32, 97, 114, 114, 97, 121, 59, 97, 32, 61, 32, 116, 117, 112, 108, 101, 40, 39, 97, 108, 108, 32, 116, 104, 101, 32, 119, 111, 114, 108, 100, 32, 105, 115, 32, 97, 32, 115, 116, 97, 103, 101, 32, 97, 110, 100, 32, 97, 108, 108, 32, 116, 104, 101, 32, 109, 101, 110, 32, 97, 110, 100, 32, 119, 111, 109, 101, 110, 32, 109, 101, 114, 101, 108, 121, 32, 112, 108, 97, 121, 101, 114, 115, 39, 41])


In [34]:
# %%timeit
s = "from array import array;a = tuple('all the world is a stage and all the men and women merely players')"
d = [ord(a) for a in s]
print(d)

[102, 114, 111, 109, 32, 97, 114, 114, 97, 121, 32, 105, 109, 112, 111, 114, 116, 32, 97, 114, 114, 97, 121, 59, 97, 32, 61, 32, 116, 117, 112, 108, 101, 40, 39, 97, 108, 108, 32, 116, 104, 101, 32, 119, 111, 114, 108, 100, 32, 105, 115, 32, 97, 32, 115, 116, 97, 103, 101, 32, 97, 110, 100, 32, 97, 108, 108, 32, 116, 104, 101, 32, 109, 101, 110, 32, 97, 110, 100, 32, 119, 111, 109, 101, 110, 32, 109, 101, 114, 101, 108, 121, 32, 112, 108, 97, 121, 101, 114, 115, 39, 41]


In [6]:
import timeit
 
# `array.tostring()` was removed in Python 3.9; `tobytes()` is its replacement.
print(timeit.repeat("array('B', map(ord, a)).tobytes()",setup="from array import array;a = tuple('all the world is a stage and all the men and women merely players')"))
print(timeit.repeat("''.join(a)",setup="from array import array;a = list('all the world is a stage and all the men and women merely players')"))
print(timeit.repeat("''.join(a)",setup="from array import array;a = tuple('all the world is a stage and all the men and women merely players')"))

[10.592503862004378, 10.381575625004189, 10.195353463001084]
[0.7834748989989748, 0.7892336399963824, 0.786548912001308]
[0.82304873400426, 0.7867131360035273, 0.7751070890008123]


#### Benefits of `map` and `reduce`


* They are often one-liners.
* The important parts of the iteration (the collection, the operation and the return value) are always in the same places in every map and reduce.
* The code in a loop may affect variables defined before it or code that runs after it. By convention, maps and reduces are functional: the function passed in is expected to return a value without modifying outside state.
* `map` and `reduce` are elemental operations. Every time a person reads a for loop, they have to work through the logic line by line. There are few structural regularities they can use to create a scaffolding on which to hang their understanding of the code. In contrast, `map` and `reduce` are at once building blocks that can be combined into complex algorithms, and elements that the code reader can instantly understand and abstract in their mind. “Ah, this code is transforming each item in this collection. It’s throwing some of the transformations away. It’s combining the remainder into a single output.”
* `map` and `reduce` have many friends that provide useful, tweaked versions of their basic behaviour. For example: `filter`, `all` and `any`.
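The "transform, throw some away, combine" reading from the list above can be sketched as a three-step pipeline, followed by the built-in `any` and `all`. The numbers and variable names are made up for illustration.

```python
from functools import reduce

numbers = range(1, 11)

# Transform each item, throw some results away, combine the remainder.
squares = map(lambda n: n * n, numbers)
even_squares = filter(lambda n: n % 2 == 0, squares)
total = reduce(lambda acc, n: acc + n, even_squares, 0)
print(total)  # 4 + 16 + 36 + 64 + 100 = 220

# any() and all() reduce a collection to a single boolean.
has_multiple_of_seven = any(n % 7 == 0 for n in numbers)
all_positive = all(n > 0 for n in numbers)
print(has_multiple_of_seven, all_positive)  # True True
```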

### Filtering

The function `filter(function, iterable)` offers an elegant way to keep only those elements of a list for which the function returns `True`, filtering out the rest.

The function `filter(f, l)` needs a function `f` as its first argument. `f` returns a Boolean value, i.e. either `True` or `False` (more precisely, any truthy or falsy value). This function is applied to every element of the list `l`. Only if `f` returns `True` is the element included in the result. In Python 3 the result is a lazy `filter` object, so wrap it in `list()` or `tuple()` to get a concrete collection.
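As a rough sketch, the rule described above fits in a few lines. `my_filter` is a hypothetical name used only for illustration; the second part uses the built-in `filter` with `None` as the function, which keeps the truthy elements.

```python
def my_filter(predicate, items):
    """Illustrative stand-in for filter() with an explicit predicate."""
    for item in items:
        if predicate(item):
            yield item

fib = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
odd = list(my_filter(lambda x: x % 2 != 0, fib))
print(odd)  # [1, 1, 3, 5, 13, 21, 55]

# Passing None instead of a function keeps every truthy element.
truthy = list(filter(None, [0, 1, "", "a", None, [], [0]]))
print(truthy)  # [1, 'a', [0]]
```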

In [85]:
%%timeit
# Original Code
fib = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

result = []
for element in fib:
    if element % 2 != 0:  # That means it's an odd number
        result.append(element)
# print(result)

988 ns ± 51.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [64]:
# %%timeit
# Using Filter

fib = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

# Odd Elements
# If the lambda returns `False`, the value is dropped;
# if it returns `True`, the value is added to the
# new collection.

result = tuple(filter(lambda x: x % 2 != 0, fib))
print(result)

(1, 1, 3, 5, 13, 21, 55)


In [66]:
# Even Elements
result = list(filter(lambda x: x % 2 == 0, fib))
print(result)

[0, 2, 8, 34]


In [68]:
apis = [
    {'name': 'UpdateUser', 'type': 'POST', "body": "{'name': '$name'}"},
    {'name': 'addUser', 'type': 'POST', "body": "{name : '$name'}"},
    {'name': 'listUsers', 'type': 'GET'}
]

In [70]:
posts = 0
for api in apis:
    if 'type' in api and api['type'] == 'POST':
        posts += 1

print(posts)

2


In [71]:
posts = 0
for api in apis:
    if api.get('type', "") == 'POST':
        posts += 1

print(posts)

2


In [73]:
# !! Gotcha !!
# `filter` does not return the count, but all the elements which have `type` == `POST`
posts = 0
c = []

c = filter(lambda x:'type' in x and x['type'] == 'POST', apis)
print(c)
print(tuple(c))

# Once consumed, a filter object is exhausted and yields nothing more.
print(c)
print(list(c))

<filter object at 0x7f94843bc490>
({'name': 'UpdateUser', 'type': 'POST', 'body': "{'name': '$name'}"}, {'name': 'addUser', 'type': 'POST', 'body': "{name : '$name'}"})
<filter object at 0x7f94843bc490>
[]


In [74]:
posts = 0
c = []

c = filter(lambda x:'type' in x and x['type'] == 'POST', apis)
posts = tuple(c)

print(posts)
print(len(posts))

({'name': 'UpdateUser', 'type': 'POST', 'body': "{'name': '$name'}"}, {'name': 'addUser', 'type': 'POST', 'body': "{name : '$name'}"})
2


In [76]:
posts = 0
c = []

c = filter(lambda x: x.get('type', "") == 'POST', apis)
posts = tuple(c)

print(posts)
print(len(posts))

({'name': 'UpdateUser', 'type': 'POST', 'body': "{'name': '$name'}"}, {'name': 'addUser', 'type': 'POST', 'body': "{name : '$name'}"})
2
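When only the count is needed, one option is to add up `1` for each element that passes the filter, so no intermediate tuple is built. This is a sketch that repeats the `apis` data in order to run on its own.

```python
apis = [
    {'name': 'UpdateUser', 'type': 'POST', "body": "{'name': '$name'}"},
    {'name': 'addUser', 'type': 'POST', "body": "{name : '$name'}"},
    {'name': 'listUsers', 'type': 'GET'}
]

# Each element that passes the filter contributes 1 to the sum.
post_count = sum(1 for _ in filter(lambda x: x.get('type', "") == 'POST', apis))
print(post_count)  # 2
```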


In [77]:
# We need the heights from the collection, but only for the people
# who have provided one.

people = [{'name': 'Mary', 'height': 160},
          {'name': 'Isla', 'height': 80},
          {'name': 'Sam'}]

# `filter` removes all the elements which do not have a `height`,
# and `map` then extracts the value from the remaining ones.

heights = tuple(map(lambda x: x['height'],
              filter(lambda x: 'height' in x, people)))
print(heights)

# Now let's find the average height of the collection.
if len(heights) > 0:
    from operator import add
    average_height = reduce(add, heights) / len(heights)
    print(f'average_height: {average_height}')

(160, 80)
average_height: 120.0
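For the averaging step, the standard library's `statistics.mean` is an alternative to `reduce(add, ...) / len(...)`. The sketch below repeats the `people` data and guards against the empty case, since `mean` raises `StatisticsError` on an empty sequence.

```python
from statistics import mean

people = [{'name': 'Mary', 'height': 160},
          {'name': 'Isla', 'height': 80},
          {'name': 'Sam'}]

# Keep only the people who provided a height, then extract the values.
heights = [person['height'] for person in people if 'height' in person]

# mean() needs at least one value, so fall back to None when there is none.
average_height = mean(heights) if heights else None
print(f'average_height: {average_height}')
```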
