# Big Data Platforms

## Functional Programming and Map Reduce 

Functional programming is all about expressions. The expression-oriented tools that Python provides are:

    map(aFunction, aSequence)
    filter(aFunction, aSequence)
    reduce(aFunction, aSequence)
    lambda
    list comprehension
    
References: https://www.bogotobogo.com/python/python_fncs_map_filter_reduce.php

Let us look at map-reduce style examples in plain Python. Hadoop is not used here.

## Lambda Functions

In [3]:
#regular function (named add_numbers to avoid shadowing the built-in sum)
def add_numbers(x, y):
    return x + y

In [4]:
#Lambda
f = lambda x, y : x + y

In [5]:
print(add_numbers(2, 3))
print(f(2, 3))

5
5


## Map

The map function is the simplest of the Python built-ins used for functional programming. A common task with lists and other sequences is to apply an operation to each item and collect the results. For example, squaring every item in a list can be done easily with a for loop:

In [None]:
items = [1, 2, 3, 4, 5]
squared = []
for x in items:
    squared.append(x ** 2)

squared

Since this is such a common operation, Python has a built-in that does most of the work for us.

The **map(aFunction, aSequence)** function applies a passed-in function to each item in an iterable object. In Python 3 it returns an iterator that yields the function call results, so we wrap it in list() to see them all at once.

In [None]:
items = [1, 2, 3, 4, 5]

def sqr(x): return x ** 2

list(map(sqr, items))


We passed in a user-defined function to be applied to each item in the list. map calls sqr on each list item, and list() collects all the return values into a new list. Because map expects a function to be passed in, it is also one of the places where *lambda* routinely appears:

In [None]:
list(map((lambda x: x **2), items))

The sequence does not have to hold data. While we still use a lambda as aFunction, we can pass a list of functions as aSequence:

In [None]:
def square(x):
    return x ** 2
def cube(x):
    return x ** 3

funcs = [square, cube]
for r in range(5):
    value = map(lambda x: x(r), funcs)
    print(list(value))

Because map is equivalent to a for loop, with a little extra code we can always write a general mapping utility ourselves:

In [None]:
def mymap(aFunc, aSeq):
    result = []
    for x in aSeq:
        result.append(aFunc(x))
    return result

print(list(map(sqr, [1, 2, 3])))

mymap(sqr, [1, 2, 3])

Since it's a built-in, map is always available and always works the same way. It can also be faster than a manually coded for loop, particularly when the mapped function is itself a built-in. Beyond that, map can be used in more advanced ways. For example, given multiple sequence arguments, it sends items taken from the sequences in parallel as distinct arguments to the function:

In [None]:
print( pow(2,10), pow(3,11), pow(4,12))

list(map(pow, [2, 3, 4], [10, 11, 12]))

In [None]:
x = [1,2,3]
y = [4,5,6]

from operator import add
print (list(map(add, x, y)))  # output [5, 7, 9]
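
One detail worth knowing when passing several sequences: in Python 3, map stops as soon as the shortest input is exhausted. A minimal sketch, where the lists `shorter` and `longer` are made up for illustration:

```python
from operator import add

shorter = [1, 2, 3]
longer = [10, 20, 30, 40, 50]

# map pairs items by position and stops at the end of the shortest input
pairwise_sums = list(map(add, shorter, longer))
print(pairwise_sums)  # [11, 22, 33]
```

The extra items in `longer` are simply ignored, so it can be worth checking lengths first if silent truncation would be a problem.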

The map call is similar to a **list comprehension** expression, but map applies a function call to each item instead of an arbitrary expression. Because of this limitation, it is a somewhat less general tool. In some cases, however, map may be faster than a list comprehension, such as when mapping a built-in function, and it can require less code.
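
To make the comparison concrete, the sketch below writes the same transformation both ways, then shows a case where the comprehension's arbitrary expression is more convenient. The variable names are illustrative only.

```python
signed = [-2, -1, 0, 1]
items = [1, 2, 3, 4, 5]

# Mapping an existing function: both forms give the same result
via_map = list(map(abs, signed))
via_comprehension = [abs(x) for x in signed]
print(via_map, via_comprehension)

# An arbitrary expression fits directly in a comprehension...
squares_plus_one = [x ** 2 + 1 for x in items]

# ...while map needs the expression wrapped in a function
squares_plus_one_map = list(map(lambda x: x ** 2 + 1, items))
print(squares_plus_one, squares_plus_one_map)
```

Which form reads better is largely a matter of taste. A common rule of thumb is to use map when a suitable function already exists and a comprehension otherwise.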

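In Python 2, if **function is None**, the **identity function is assumed**, and with multiple arguments map() returns a list of tuples containing the corresponding items from all iterables (a kind of transpose operation). Python 3 no longer supports this: map requires a callable, and zip() covers the transpose case. The iterable arguments may be sequences or any other iterable objects; in Python 3 the result is an iterator, so we wrap it in list(), as in this string example: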

In [16]:
#string example

list(map(lambda word: word.upper(), ["University", "Chicago"]))

['UNIVERSITY', 'CHICAGO']
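
In Python 3, map returns a lazy iterator rather than a list, and pairing up corresponding items from several iterables is done with zip. A short sketch of both points (the variable names are illustrative):

```python
names = ["University", "Chicago"]
lengths = [10, 7]

upper = map(str.upper, names)  # nothing is computed yet
first = next(upper)            # items are produced on demand
rest = list(upper)             # list() consumes whatever is left
leftover = list(upper)         # the iterator is now exhausted

print(first, rest, leftover)

# zip pairs up corresponding items from each iterable
pairs = list(zip(names, lengths))
print(pairs)
```

Because a map object can be consumed only once, wrap it in list() when the results are needed more than once.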

## Filter and Reduce

Filter extracts each element in the sequence for which the function returns True. The reduce function reduces a list to a single value by combining elements via a supplied function. 

In other words, filter selects items based on a test function, while reduce applies a function to pairs made of the running result and the next item.

Because they return iterables, range and filter both require list calls to display all their results in Python 3.

In [None]:
print(list(range(-5,5)))

print(list( filter((lambda x: x < 0), range(-5,5))))

Like map, this function is roughly equivalent to a for loop, but it is built-in and fast:

In [None]:
result = []
for x in range(-5, 5):
    if x < 0:
        result.append(x)

result

Here is another use case for filter(): finding the intersection of two lists:

In [None]:
a = [1,2,3,5,7,9]
b = [2,3,5,6,7,8]

print (list(filter(lambda x: x in a, b)))  # prints out [2, 3, 5, 7]

Note that we can do the same with *list comprehension*:

In [None]:
a = [1,2,3,5,7,9]
b = [2,3,5,6,7,8]
print ([x for x in a if x in b]) # prints out [2, 3, 5, 7]
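
Both versions test membership with `in` on a list, which scans the list on every check. For larger inputs, a common alternative is to convert one list to a set first. A minimal sketch, assuming the items are hashable; `a_set` is a name introduced here:

```python
a = [1, 2, 3, 5, 7, 9]
b = [2, 3, 5, 6, 7, 8]

a_set = set(a)  # set membership tests take constant time on average

# Filtering b keeps the items in b's order, like the filter version
common = list(filter(lambda x: x in a_set, b))
print(common)  # [2, 3, 5, 7]
```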

In [None]:
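from functools import reduce

# Multiply left to right: ((1 * 2) * 3) * 4 = 24
print(reduce((lambda x, y: x * y), [1, 2, 3, 4]))
# Divide left to right: ((1 / 2) / 3) / 4, roughly 0.0417
print(reduce((lambda x, y: x / y), [1, 2, 3, 4]))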
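
reduce works from left to right, carrying a running result. The sketch below uses `itertools.accumulate` to expose the intermediate values, and shows the optional third argument to reduce, which supplies a starting value and is useful for empty sequences. The variable names are illustrative.

```python
from functools import reduce
from itertools import accumulate
from operator import mul

numbers = [1, 2, 3, 4]

# accumulate yields every intermediate result of the left-to-right fold
steps = list(accumulate(numbers, mul))
print(steps)  # [1, 2, 6, 24]

# reduce returns only the final value: ((1 * 2) * 3) * 4
product = reduce(mul, numbers)
print(product)  # 24

# With an initial value, reduce also works on an empty sequence
empty_product = reduce(mul, [], 1)
print(empty_product)  # 1
```

Without the initial value, calling reduce on an empty sequence raises a TypeError.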

In [None]:
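def fahrenheit(T):
    return (float(9)/5)*T + 32
def celsius(T):
    return (float(5)/9)*(T-32)
temp = (36.5, 37, 37.5, 39)

# Materialize each map as a list so the results can be reused and displayed
F = list(map(fahrenheit, temp))
C = list(map(celsius, F))
print(F)
print(C)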

In [None]:
Celsius = [39.2, 36.5, 37.3, 37.8]
# Convert to a list: a map object can only be iterated once
Fahrenheit = list(map(lambda x: (float(9)/5)*x + 32, Celsius))
print(Fahrenheit)

C = map(lambda x: (float(5)/9)*(x-32), Fahrenheit)
print(list(C))

In [None]:
#sum all numbers in this order: ((47 + 11) + 42) + 13
reduce(lambda x,y: x+y, [47,11,42,13])

In [None]:
#find the maximum number in a list 
f = lambda a,b: a if (a > b) else b

reduce(f, [47,11,42,102,13])
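
The hand-written lambdas in these two cells have ready-made counterparts: `operator.add` for addition and the built-in `max` for the largest item. A quick check that they agree with the reduce versions (the variable names are introduced here for illustration):

```python
from functools import reduce
from operator import add

values = [47, 11, 42, 102, 13]

# operator.add replaces the hand-written lambda x, y: x + y
total = reduce(add, [47, 11, 42, 13])
print(total)  # 113

# max gives the same answer as the reduce-with-lambda version
largest_reduce = reduce(lambda a, b: a if a > b else b, values)
largest_builtin = max(values)
print(largest_reduce, largest_builtin)  # 102 102
```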

In [None]:
#Calculating the sum of the numbers from 1 to 100: 
reduce(lambda x, y: x+y, range(1,101))
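
The pieces above can be combined into a small map-reduce style word count in plain Python, with no Hadoop involved. This is only an illustrative sketch: the sample `documents` and the helper names `map_phase` and `merge_counts` are invented here and are not part of any framework.

```python
from collections import Counter
from functools import reduce

documents = [
    "big data platforms",
    "map reduce in python",
    "python map and python filter",
]

def map_phase(doc):
    # Map step: turn one document into its own word counts
    return Counter(doc.split())

def merge_counts(left, right):
    # Reduce step: merge two partial counts into one
    return left + right

partial_counts = map(map_phase, documents)
word_counts = reduce(merge_counts, partial_counts, Counter())

print(word_counts["python"])  # 3
print(word_counts["map"])     # 2
```

Each document is processed independently in the map step, which is what would let a real platform spread that work across machines. Only the reduce step needs to see results from more than one document.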