# Python is cool

This article is adapted from the GitHub repository [chiphuyen/python-is-cool](https://github.com/chiphuyen/python-is-cool#1-lambda-map-filter-reduce).

 #### Table of contents
 1. [Lambda, map, filter, reduce](#I)
 2. [List manipulation](#II)
 3. [Classes and magic methods](#III)
 4. [Local namespace, object's attributes](#IV)
 5. [Wild import](#V)
 6. [Decorator to time your functions](#VI)
 7. [Caching with @functools.lru_cache](#VII)


### I. Lambda, map, filter, reduce <a name= "I"></a>

> The **lambda** keyword is used to create functions inline.

In [8]:
# Function
def square_fn(x):
    return x*x
# Lambda
square_lbd= lambda x: x*x

print('Square function:' + str(square_fn(5))) 
print('Square lambda: ' + str(square_lbd(5)))

Square function:25
Square lambda: 25


In [13]:
for i in range(10):
    assert square_fn(i) == square_lbd(i)
# assert raises an AssertionError if the condition is False

Lambdas are especially useful in conjunction with functions like **filter**, **map** and **reduce**.

> ##### map(function, iterable)

In [25]:
nums = [1/3, 333/7, 2323/2230, 40/34, 2/3]
num_square1= [num*num for num in nums]
print(num_square1)

# Using map with a named function
num_square2= map(square_fn, nums)
print(list(num_square2))

# Using map with a lambda
num_square3= map(square_lbd, nums)
print(list(num_square3))

[0.1111111111111111, 2263.0408163265306, 1.0851472983570953, 1.384083044982699, 0.4444444444444444]
[0.1111111111111111, 2263.0408163265306, 1.0851472983570953, 1.384083044982699, 0.4444444444444444]
[0.1111111111111111, 2263.0408163265306, 1.0851472983570953, 1.384083044982699, 0.4444444444444444]
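One detail worth keeping in mind when reading the cell above: in Python 3, **map** returns a lazy iterator rather than a list, which is why the results are wrapped in **list()** before printing. The short sketch below (the variable names are made up for illustration) shows that such an iterator can be consumed only once.

```python
nums = [1, 2, 3]
squares = map(lambda x: x * x, nums)

first_pass = list(squares)   # consumes the iterator
second_pass = list(squares)  # nothing is left to consume

print(first_pass)   # [1, 4, 9]
print(second_pass)  # []
```

If the squared values are needed more than once, it is usually simpler to store the list rather than the map object.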


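You can also use **map** with more than one iterable. For example, suppose you want to measure the error of a simple linear function *f(x) = ax + b* against the true labels *labels*. The code below takes the square root of the summed squared errors and divides it by the number of samples, which is not the textbook mean squared error. These two methods are equivalent: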

In [32]:
a, b = 3, -0.5
xs = [2, 3, 4, 5]
labels = [6.4, 8.9, 10.9, 15.3]

# Method 1: using a loop
errors = []
for i, x in enumerate(xs):
    errors.append((a * x + b - labels[i]) ** 2)
    # ** is the power operator: 3 ** 2 == 9
result1 = sum(errors) ** 0.5 / len(xs)

# Method 2: using a map
errors_2 = map(lambda x, y: (a * x + b - y) ** 2, xs, labels)
result2 = sum(errors_2) ** 0.5 / len(xs)

print(result1, result2)

0.35089172119045514 0.35089172119045514
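As a third way to write the same computation, the sketch below pairs the two lists with **zip** inside a list comprehension, which some readers find easier to scan than a two-argument lambda. The names `squared_errors`, `result3` and `result_map` are introduced here for illustration only.

```python
import math

a, b = 3, -0.5
xs = [2, 3, 4, 5]
labels = [6.4, 8.9, 10.9, 15.3]

# Same computation as Method 2, written with zip and a list comprehension
squared_errors = [(a * x + b - y) ** 2 for x, y in zip(xs, labels)]
result3 = sum(squared_errors) ** 0.5 / len(xs)

# Cross-check against the map-based version
result_map = sum(map(lambda x, y: (a * x + b - y) ** 2, xs, labels)) ** 0.5 / len(xs)
assert math.isclose(result3, result_map)
print(result3)
```

Both **map** with several iterables and **zip** stop as soon as the shortest iterable is exhausted, so lists of unequal length are truncated silently.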


> ##### filter(function, iterable)

In [34]:
bad_preds = filter(lambda x: x > 5, [2, 4, 7, 8])
print(list(bad_preds))

[7, 8]
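For comparison, here is a minimal sketch of the same filtering written as a list comprehension, plus the special case where **None** is passed instead of a function. The variable names are illustrative.

```python
preds = [2, 4, 7, 8]

# filter with a lambda and the equivalent list comprehension
bad_preds_filter = list(filter(lambda x: x > 5, preds))
bad_preds_comp = [x for x in preds if x > 5]
print(bad_preds_filter == bad_preds_comp)  # True

# Passing None as the function keeps only the truthy elements
truthy = list(filter(None, [0, 1, '', 'a', None]))
print(truthy)  # [1, 'a']
```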


> ##### reduce(function, iterable, initializer)

**reduce** is used when we want to iteratively apply an operator to all elements in a list. For example, if we want to calculate the product of all elements in a list: 

In [39]:
nums = [4, 5, 6]
product1 = 1
for num in nums:
    product1 *= num
print(product1)

# using reduce
from functools import reduce
product2 = reduce(lambda x, y: x * y, nums)
print(product2)

120
120
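The signature above lists an optional initializer that the cell does not use. The sketch below shows what it does, computing a product with **operator.mul** from the standard library in place of a lambda.

```python
from functools import reduce
import operator

nums = [4, 5, 6]

# The initializer is the starting value of the accumulation
product = reduce(operator.mul, nums, 1)
print(product)  # 120

# With an initializer, an empty iterable simply returns the initializer
empty_product = reduce(operator.mul, [], 1)
print(empty_product)  # 1

# Without one, reducing an empty iterable raises TypeError
try:
    reduce(operator.mul, [])
except TypeError as exc:
    print(f"TypeError: {exc}")
```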


#### Note on the performance of lambda functions
**lambda** functions are meant for one-time use. Each time the expression *lambda x: dosomething(x)* is evaluated, a new function object has to be created, which hurts performance if the expression is evaluated many times (e.g. when it is passed to **reduce** inside a loop).

When you assign a name to the lambda function as in *fn = lambda x: dosomething(x)*, its performance is slightly slower than the same function defined using **def**, but the difference is negligible.


### II. List manipulation<a name= "II"></a>

#### Slicing

In [41]:
elems = list(range(10))
print(elems)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


In [44]:
print(elems[::])

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


In [42]:
print(elems[::-1])

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]


In [43]:
print(elems[::-2])

[9, 7, 5, 3, 1]


In [45]:
print(elems[::2])

[0, 2, 4, 6, 8]


In [46]:
print(elems[1:5])

[1, 2, 3, 4]


In [49]:
print(elems[-3::])

[7, 8, 9]


In [54]:
print(elems[-2::-2])

[8, 6, 4, 2, 0]
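All of the examples above follow the pattern `elems[start:stop:step]`, where any of the three parts may be omitted. As a small sketch, the same selections can be written with explicit **slice** objects, which makes the omitted parts visible as **None**.

```python
elems = list(range(10))

# elems[start:stop:step] is equivalent to indexing with a slice object
assert elems[1:5] == elems[slice(1, 5)]
assert elems[::-2] == elems[slice(None, None, -2)]
assert elems[-2::-2] == elems[slice(-2, None, -2)]

# Slicing returns a new list, so the original is left untouched
copy = elems[:]
copy[0] = 100
print(elems[0], copy[0])  # 0 100
```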


#### Flattening
We can **flatten** a list of lists using **sum**:

In [59]:
list_of_lists = ([1], [2, 3], [4, 5, 6])
print(sum(list_of_lists, []))

[1, 2, 3, 4, 5, 6]
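The **sum** trick is compact, but it builds a new list at every step, so it can become slow when there are many sublists. As a sketch of two common alternatives, the code below uses **itertools.chain.from_iterable** and a nested list comprehension. The variable names are illustrative.

```python
from itertools import chain

list_of_lists = [[1], [2, 3], [4, 5, 6]]

flat_sum = sum(list_of_lists, [])
flat_chain = list(chain.from_iterable(list_of_lists))
flat_comp = [item for sublist in list_of_lists for item in sublist]

print(flat_chain)  # [1, 2, 3, 4, 5, 6]
print(flat_sum == flat_chain == flat_comp)  # True
```

All three approaches flatten only one level of nesting.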


#### List vs generator
To illustrate the difference between a list and a generator, let's look at an example of creating n-grams out of a list of tokens.

In [66]:
tokens = ['i', 'want', 'to', 'go', 'school']
def ngrams(tokens, n):
    gram = []
    for i in range(0, len(tokens)-n + 1):
        gram.append(tokens[i:i+n])
    return gram
print(ngrams(tokens, 3))

[['i', 'want', 'to'], ['want', 'to', 'go'], ['to', 'go', 'school']]
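The cell above builds the whole list of n-grams in memory before returning it. A generator version, sketched below under the hypothetical name `ngrams_gen`, produces one n-gram at a time with **yield** and only does the work when a value is requested.

```python
def ngrams_gen(tokens, n):
    """Yield the n-grams of tokens one at a time instead of building a list."""
    for i in range(len(tokens) - n + 1):
        yield tokens[i:i + n]

tokens = ['i', 'want', 'to', 'go', 'school']
gen = ngrams_gen(tokens, 3)

first = next(gen)   # computes only the first n-gram
rest = list(gen)    # consumes the remaining n-grams

print(first)  # ['i', 'want', 'to']
print(rest)   # [['want', 'to', 'go'], ['to', 'go', 'school']]
```

A generator cannot be indexed or sliced, and it is exhausted after one pass. That is the trade-off for not holding every n-gram in memory at once.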


### III. Classes and magic methods<a name= "III"></a>

In Python, magic methods are prefixed and suffixed with a double underscore **__**, also known as dunder. The most well-known magic method is probably **__**init**__**.

In [75]:
class Node:
    ''' A struct to denote the node of a binary tree.
    It contains a value and pointers to left and right children'''
    def __init__(self, value, left=None, right=None):
        self.value= value
        self.left= left
        self.right =right

In [76]:
root = Node(5)
print(root)

<__main__.Node object at 0x00000215A902AA30>


Ideally, when a user prints out a node, we want to print out the node's value and the values of its children, if it has any. To do so, we use the magic method **__**repr**__**, which must return a string.

In [77]:
class Node:
    ''' A struct to denote the node of a binary tree.
    It contains a value and pointers to left and right children'''
    def __init__(self, value, left=None, right=None):
        self.value= value
        self.left= left
        self.right =right
    def __repr__(self):
        strings = [f'value: {self.value}']
        strings.append(f'left: {self.left.value}' if self.left else 'left: None')
        strings.append(f'right: {self.right.value}' if self.right else 'right: None')
        return ','.join(strings)
left = Node(4)
root = Node(5, left)
print(root)

value: 5,left: 4,right: None


We'd also like to compare two nodes by comparing their values. To do so, we overload the operator **==** with **__**eq**__**, **<** with **__**lt**__**, and **>=** with **__**ge**__**.

In [79]:
class Node:
    ''' A struct to denote the node of a binary tree.
    It contains a value and pointers to left and right children'''
    def __init__(self, value, left=None, right=None):
        self.value= value
        self.left= left
        self.right =right
    def __repr__(self):
        strings = [f'value: {self.value}']
        strings.append(f'left: {self.left.value}' if self.left else 'left: None')
        strings.append(f'right: {self.right.value}' if self.right else 'right: None')
        return ','.join(strings)
    def __eq__(self, other):
        return self.value == other.value
    def __lt__(self, other):
        return self.value < other.value
    def __ge__(self, other):
        return self.value >= other.value
left = Node(4)
root = Node(5, left)
print(left == root)
print(left < root)
print(left >= root)

False
True
False
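Writing every comparison method by hand gets repetitive. One possible shortcut, sketched below, is the standard library decorator **functools.total_ordering**, which derives the remaining comparisons from **__**eq**__** and one ordering method. The class name `OrderedNode` is made up for this example, and returning **NotImplemented** for foreign types is an addition that the original class does not have.

```python
from functools import total_ordering

@total_ordering
class OrderedNode:
    """Hypothetical variant of Node that defines only __eq__ and __lt__."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

    def __eq__(self, other):
        if not isinstance(other, OrderedNode):
            return NotImplemented  # let Python handle unrelated types
        return self.value == other.value

    def __lt__(self, other):
        if not isinstance(other, OrderedNode):
            return NotImplemented
        return self.value < other.value

left = OrderedNode(4)
root = OrderedNode(5, left)

print(left <= root)  # True, derived by total_ordering
print(left > root)   # False, derived by total_ordering
print(left == 4)     # False, instead of raising AttributeError
```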


For a comprehensive list of supported magic methods, see [here](https://www.tutorialsteacher.com/python/magic-methods-in-python) or the official Python documentation [here](https://docs.python.org/3/reference/datamodel.html#special-method-names) (slightly harder to read).

Some of the methods that I highly recommend:

+ **__**len**__**: to overload the len() function.
+ **__**str**__**: to overload the str() function.
+ **__**iter**__**: if you want your objects to be iterable. If the class also defines **__**next**__** (with **__**iter**__** returning the object itself), you can call next() on your object.

For classes like Node where we know for sure all the attributes they can support (in the case of Node, they are value, left, and right), we might want to use **__**slots**__** to declare those attributes for both a performance boost and memory savings. For a comprehensive understanding of the pros and cons of **__**slots**__**, see this [absolutely amazing answer by Aaron Hall on StackOverflow](https://stackoverflow.com/questions/472000/usage-of-slots/28059785#28059785).
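As a minimal sketch of what **__**slots**__** changes, the class below (named `SlottedNode` here purely for illustration) declares its three attributes up front. Instances then have no per-object **__**dict**__**, and assigning an undeclared attribute fails.

```python
class SlottedNode:
    """Hypothetical variant of Node with a fixed set of attributes."""
    __slots__ = ('value', 'left', 'right')

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

node = SlottedNode(5)
has_dict = hasattr(node, '__dict__')
print(has_dict)  # False: there is no per-instance dictionary

try:
    node.parent = None  # 'parent' is not listed in __slots__
except AttributeError as exc:
    print(f"AttributeError: {exc}")
```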

### IV. Local namespace, object's attributes<a name= "IV"></a>

The **locals()** function returns a dictionary containing the variables defined in the local namespace.

In [87]:
class Model1: 
    def __init__(self, hidden_size=100, num_layers=3, learning_rate=3e-4):
        print(locals())
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.learning_rate = learning_rate
model1 = Model1()

{'self': <__main__.Model1 object at 0x00000215A89E3D00>, 'hidden_size': 100, 'num_layers': 3, 'learning_rate': 0.0003}


All attributes of an object are stored in its **__**dict**__**

In [88]:
print(model1.__dict__)

{'hidden_size': 100, 'num_layers': 3, 'learning_rate': 0.0003}


Note that manually assigning each of the arguments to an attribute can be quite tiring when the list of the arguments is large. To avoid this, we can directly assign the list of arguments to the object's **__**dict**__**.

In [90]:
class Model2: 
    def __init__(self, hidden_size=100, num_layers=3, learning_rate=3e-4):
        params = locals()
        del params['self']
        self.__dict__ = params
model2 = Model2()
print(model2.__dict__)

{'hidden_size': 100, 'num_layers': 3, 'learning_rate': 0.0003}


This can be especially convenient when the object is initialized using the catch-all **\*\*kwargs**, though the use of **\*\*kwargs** should be kept to a minimum.

In [93]:
class Model3:
    def __init__(self, **kwargs):
        self.__dict__ = kwargs

model3 = Model3(hidden_size=100, num_layers=3, learning_rate=3e-4)
print(model3.__dict__)

{'hidden_size': 100, 'num_layers': 3, 'learning_rate': 0.0003}
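One reason to keep catch-all keyword arguments to a minimum is that a misspelled argument is accepted without complaint. The sketch below demonstrates the problem and one possible guard. `StrictModel` and its `ALLOWED` set are hypothetical names invented for this example, not part of the original notebook.

```python
class Model3:
    def __init__(self, **kwargs):
        self.__dict__ = kwargs

# The typo 'hiden_size' is silently stored as a new attribute
model = Model3(hiden_size=100)
print(hasattr(model, 'hidden_size'))  # False

class StrictModel:
    """Hypothetical variant that rejects unknown keyword arguments."""
    ALLOWED = {'hidden_size', 'num_layers', 'learning_rate'}

    def __init__(self, **kwargs):
        unknown = set(kwargs) - self.ALLOWED
        if unknown:
            raise TypeError(f"unexpected arguments: {sorted(unknown)}")
        self.__dict__.update(kwargs)

try:
    StrictModel(hiden_size=100)
except TypeError as exc:
    print(f"TypeError: {exc}")

strict = StrictModel(hidden_size=100, num_layers=3)
print(strict.__dict__)  # {'hidden_size': 100, 'num_layers': 3}
```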


### V. Wild import<a name= "V"></a>

Often, you run into a wild import `*` that looks something like this:

In [None]:
# Import all public names from the module parts.py
from parts import *

### VI. Decorator to time your functions<a name= "VI"></a>

It's often useful to know how long it takes a function to run, e.g. when you need to compare the performance of two algorithms that do the same thing. One naive way is to call **time.time()** at the beginning and end of each function and print out the difference.

For example, let's compare two algorithms that calculate the n-th Fibonacci number: one uses memoization and one doesn't.

In [95]:
def fib_helper(n):
    if n < 2:
        return n
    return fib_helper(n - 1) + fib_helper(n - 2)

def fib(n):
    """ fib is a wrapper function so that later we can change its behavior
    at the top level without affecting the behavior at every recursion step.
    """
    return fib_helper(n)

def fib_m_helper(n, computed):
    if n in computed:
        return computed[n]
    computed[n] = fib_m_helper(n - 1, computed) + fib_m_helper(n - 2, computed)
    return computed[n]

def fib_m(n):
    return fib_m_helper(n, {0: 0, 1: 1})

In [96]:
for n in range(20):
    assert fib(n) == fib_m(n)

In [99]:
import time

start = time.time()
fib(30)
print(f'Without memoization, it takes {time.time() - start:7f} seconds.')

start = time.time()
fib_m(30)
print(f'With memoization, it takes {time.time() - start:.7f} seconds.')

Without memoization, it takes 0.662593 seconds.
With memoization, it takes 0.0000000 seconds.
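The 0.0000000 reading above most likely reflects the limited resolution of **time.time()** on the machine used, not a genuinely instantaneous call. As a sketch of an alternative, **time.perf_counter()** from the standard library is intended for measuring short durations. The helper functions are repeated so that the snippet runs on its own.

```python
import time

def fib_m_helper(n, computed):
    if n in computed:
        return computed[n]
    computed[n] = fib_m_helper(n - 1, computed) + fib_m_helper(n - 2, computed)
    return computed[n]

def fib_m(n):
    return fib_m_helper(n, {0: 0, 1: 1})

start = time.perf_counter()
result = fib_m(30)
elapsed = time.perf_counter() - start

print(result)  # 832040
print(f'With memoization, it takes {elapsed:.7f} seconds.')
```

The exact figure varies from run to run and from machine to machine, so no expected timing is shown here.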


If you want to time multiple functions, it can be a drag having to write the same code over and over again. It'd be nice to have a way to specify how to change any function in the same way. In this case, the change would be to call **time.time()** at the beginning and the end of each function and print out the time difference.

This is exactly what decorators do. They allow programmers to change the behavior of a function or class. Here's an example to create a decorator **timeit**.

In [100]:
def timeit(fn): 
    # *args and **kwargs are to support positional and named arguments of fn
    def get_time(*args, **kwargs): 
        start = time.time() 
        output = fn(*args, **kwargs)
        print(f"Time taken in {fn.__name__}: {time.time() - start:.7f}")
        return output  # make sure that the decorator returns the output of fn
    return get_time 

Add the decorator **@timeit** to your functions.

In [102]:
@timeit
def fib(n):
    return fib_helper(n)

@timeit
def fib_m(n):
    return fib_m_helper(n, {0: 0, 1: 1})

fib(30)
fib_m(30)


Time taken in fib: 0.6815777
Time taken in fib_m: 0.0000000


832040
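One refinement of the decorator above: wrapping a function replaces its metadata with that of the inner **get_time** function. The sketch below uses **functools.wraps** so that the decorated function keeps its own name and docstring. The names `timeit_wrapped` and `add` are invented for this illustration, and **time.perf_counter()** is swapped in for **time.time()**.

```python
import functools
import time

def timeit_wrapped(fn):
    @functools.wraps(fn)  # copy fn's __name__, __doc__, etc. onto get_time
    def get_time(*args, **kwargs):
        start = time.perf_counter()
        output = fn(*args, **kwargs)
        print(f"Time taken in {fn.__name__}: {time.perf_counter() - start:.7f}")
        return output
    return get_time

@timeit_wrapped
def add(a, b):
    """Return the sum of a and b."""
    return a + b

total = add(2, 3)
print(total)         # 5
print(add.__name__)  # add (it would be get_time without functools.wraps)
print(add.__doc__)   # Return the sum of a and b.
```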

### VII. Caching with @functools.lru_cache<a name= "VII"></a>

Memoization is a form of cache: we cache the previously calculated Fibonacci numbers so that we don't have to calculate them again.

Caching is such an important technique that Python provides a built-in decorator to give your function caching capability. If you want **fib_helper** to reuse the previously calculated Fibonacci numbers, you can just add the decorator **lru_cache** from **functools**. **lru** stands for "least recently used". For more information on caching, see [here](https://docs.python.org/3/library/functools.html).

In [103]:
import functools

@functools.lru_cache()
def fib_helper(n):
    if n < 2:
        return n
    return fib_helper(n - 1) + fib_helper(n - 2)

@timeit
def fib(n):
    """ fib is a wrapper function so that later we can change its behavior
    at the top level without affecting the behavior at every recursion step.
    """
    return fib_helper(n)

fib(50)
fib_m(50)

Time taken in fib: 0.0000000
Time taken in fib_m: 0.0000000


12586269025
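To see the cache at work, functions decorated with **lru_cache** expose a **cache_info()** method that reports hits and misses. The sketch below uses a separate function, named `fib_cached` here for illustration, with `maxsize=None` so that nothing is ever evicted.

```python
import functools

@functools.lru_cache(maxsize=None)
def fib_cached(n):
    if n < 2:
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)

result = fib_cached(50)
info = fib_cached.cache_info()

print(result)       # 12586269025
print(info.misses)  # 51: each of fib_cached(0) .. fib_cached(50) is computed once
print(info.hits)    # 48: every other recursive call is served from the cache
```

Calling `fib_cached.cache_clear()` empties the cache, which can be handy when timing the function repeatedly.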

Source post: [github chiphuyen](https://github.com/chiphuyen/python-is-cool#1-lambda-map-filter-reduce)