<a href="https://colab.research.google.com/github/1dhiman/100days-ml/blob/master/2020/making_python_programs_fast.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Making Python Programs Fast

## Timing and Profiling


In [2]:
# slow_program.py
# Computes e to the power of x by summing its Taylor series
from decimal import Decimal, getcontext

def exp(x):
    getcontext().prec += 2
    i, lasts, s, fact, num = 0, 0, 1, 1, 1
    while s != lasts:
        lasts = s
        i += 1
        fact *= i
        num *= x
        s += num / fact
    getcontext().prec -= 2
    return +s

exp(Decimal(150))
exp(Decimal(400))
exp(Decimal(3000))

Decimal('1.393709580666379697318341937E+65')

#### Timing Specific Functions

We might want to time only the slow function, without measuring the rest of the code. For that we can use a simple decorator:

In [0]:
from functools import wraps
import time

def timeit_wrapper(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()  # Alternatively, you can use time.process_time()
        func_return_val = func(*args, **kwargs)
        end = time.perf_counter()
        print('{0:<10}.{1:<8} : {2:<8}'.format(func.__module__, func.__name__, end - start))
        return func_return_val
    return wrapper

This decorator can then be applied to the function under test like so:

In [12]:
@timeit_wrapper
def exp(x):
    getcontext().prec += 2
    i, lasts, s, fact, num = 0, 0, 1, 1, 1
    while s != lasts:
        lasts = s
        i += 1
        fact *= i
        num *= x
        s += num / fact
    getcontext().prec -= 2
    return +s

print('{0:<10} {1:<8} {2:^8}'.format('module', 'function', 'time'))
exp(Decimal(150))
exp(Decimal(400))
exp(Decimal(3000))

module     function   time  
__main__  .exp      : 0.004896852999991097
__main__  .exp      : 0.05113874599999235
__main__  .exp      : 14.764340761000085


Decimal('7.646200989054704889310727660E+1302')

One thing to consider is what kind of time we actually want to measure. The `time` module provides `time.perf_counter` and `time.process_time`. `perf_counter` measures wall-clock time, which includes time spent sleeping and time when your Python process is not running, so it can be affected by machine load. `process_time`, on the other hand, returns the sum of the user and system CPU time of the current process and excludes time spent sleeping, so it reflects only the work done by your process.
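To see how the two clocks differ, here is a minimal sketch that measures a function that only waits and a function that only computes. The helper names `measure`, `sleepy`, and `busy` are made up for this illustration, and the exact numbers will depend on your machine.

```python
import time

def measure(func):
    """Return (wall_seconds, cpu_seconds) spent in one call to func."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func()
    cpu_elapsed = time.process_time() - cpu_start
    wall_elapsed = time.perf_counter() - wall_start
    return wall_elapsed, cpu_elapsed

def sleepy():
    time.sleep(0.2)  # waiting: the wall clock advances, the CPU stays idle

def busy():
    sum(i * i for i in range(200_000))  # computing: both clocks advance

for name, func in (("sleepy", sleepy), ("busy", busy)):
    wall, cpu = measure(func)
    print(f"{name:<8} wall={wall:.4f}s  cpu={cpu:.4f}s")
```

For `sleepy`, the CPU time should be close to zero while the wall time is about 0.2 seconds. For `busy`, the two values should be much closer to each other.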

## Making It Faster

**Use Built-in Data Types**

Built-in data types are very fast, especially in comparison to our custom types like trees or linked lists. That's mainly because the built-ins are implemented in C, which we can't really match in speed when coding in Python.
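As a rough illustration, the sketch below sums the same values stored in a built-in `list` and in a hand-written singly linked list. The `Node` class and the helper functions are hypothetical and exist only for this comparison. Timings vary by machine and Python version.

```python
import timeit

class Node:
    """One element of a minimal singly linked list."""
    __slots__ = ("value", "next")

    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def build_linked_list(values):
    """Build a linked list holding `values` in order and return its head."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head

def linked_sum(head):
    """Sum all values by walking the list node by node in pure Python."""
    total = 0
    node = head
    while node is not None:
        total += node.value
        node = node.next
    return total

values = list(range(10_000))
head = build_linked_list(values)

t_builtin = timeit.timeit(lambda: sum(values), number=200)
t_custom = timeit.timeit(lambda: linked_sum(head), number=200)
print(f"built-in list + sum(): {t_builtin:.4f}s")
print(f"custom linked list:    {t_custom:.4f}s")
```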

**Use Local Variables**

This comes down to how quickly Python can look up a name in each scope. Lookup speed differs between a local variable in a function (fastest), an instance attribute such as `self.name` (slower), and a global name such as an imported function like `time.time` (slowest).

You can improve performance with seemingly unnecessary assignments like this:

In [0]:
#  Example #1
class FastClass:

    def __init__(self, value):
        self.value = value

    def do_stuff(self):
        temp = self.value  # this speeds up lookup in loop
        for i in range(10000):
            ...  # Do something with `temp` here

#  Example #2
import random

def fast_function():
    r = random.random
    for i in range(10000):
        print(r())  # calling `r()` here, is faster than global random.random()
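One way to check this on your own machine is to time the same loop reading a global name and a local copy of it. This is a small sketch. `FACTOR`, `use_global`, and `use_local` are names invented for the example, and newer CPython versions cache global lookups, so the gap may be small.

```python
import timeit

FACTOR = 3  # module-level (global) name

def use_global(n=100_000):
    total = 0
    for i in range(n):
        total += i * FACTOR  # looks up the global name on every iteration
    return total

def use_local(n=100_000):
    factor = FACTOR  # copy the global into a local variable once
    total = 0
    for i in range(n):
        total += i * factor  # local variable access inside the loop
    return total

t_global = timeit.timeit(use_global, number=20)
t_local = timeit.timeit(use_local, number=20)
print(f"global lookup: {t_global:.4f}s")
print(f"local lookup:  {t_local:.4f}s")
```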

**Use Functions**

This might seem counterintuitive, since calling a function pushes a new frame onto the stack and adds overhead for the call and return, but it relates to the previous point. If you put all your code at the top level of a file instead of inside a function, it will run slower because every variable is a global. You can therefore speed up your code just by wrapping it in a `main` function and calling it once, like so:

In [0]:
def main():
    ...  # All your previously global code

main()
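The effect can be measured with a sketch like the following, which compiles the same loop once as top-level code and once inside a function, then runs both with `exec`. The source strings and the `run` helper are made up for this illustration. Using `exec` only approximates running a real script.

```python
import timeit

# The same loop written as top-level (global scope) code...
GLOBAL_SCOPE_SRC = """
total = 0
for i in range(100_000):
    total += i
"""

# ...and wrapped in a function that is called once.
FUNCTION_SCOPE_SRC = """
def main():
    total = 0
    for i in range(100_000):
        total += i
    return total

total = main()
"""

global_code = compile(GLOBAL_SCOPE_SRC, "<global_scope>", "exec")
function_code = compile(FUNCTION_SCOPE_SRC, "<function_scope>", "exec")

def run(code):
    """Execute compiled module-level code in a fresh namespace."""
    namespace = {}
    exec(code, namespace)
    return namespace["total"]

t_global = timeit.timeit(lambda: run(global_code), number=20)
t_function = timeit.timeit(lambda: run(function_code), number=20)
print(f"top-level code:   {t_global:.4f}s")
print(f"wrapped in main(): {t_function:.4f}s")
```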

**Don't Access Attributes**

Another thing that might slow down your programs is the dot operator (`.`), which is used to access object attributes. Each use triggers an attribute lookup through `__getattribute__`, typically involving a dictionary lookup, which adds overhead when it is repeated inside a loop.

In [0]:
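#  Slow:
import re

regex = r'\d+'        # example pattern (placeholder)
line = 'a1 b22 c333'  # example input (placeholder)

def slow_func():
    for i in range(10000):
        re.findall(regex, line)  # Slow! Looks up `findall` on `re` every iteration

#  Fast:
from re import findall

def fast_func():
    for i in range(10000):
        findall(regex, line)  # Faster! No attribute lookup inside the loop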
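The same idea applies to methods. A common variation, sketched below, binds `list.append` to a local name before the loop. The function names are invented for the example, and recent CPython versions optimize attribute access, so the difference may be modest.

```python
import timeit

def append_with_dot(n=100_000):
    result = []
    for i in range(n):
        result.append(i)  # attribute lookup on every iteration
    return result

def append_prebound(n=100_000):
    result = []
    append = result.append  # look the method up once, outside the loop
    for i in range(n):
        append(i)
    return result

t_dot = timeit.timeit(append_with_dot, number=20)
t_bound = timeit.timeit(append_prebound, number=20)
print(f"result.append(i): {t_dot:.4f}s")
print(f"append(i):        {t_bound:.4f}s")
```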

**Beware of Strings**

Operations on strings can get quite slow when run in a loop, for example with the modulus operator (`%s`) or `.format()`. Prefer f-strings where you can: they are readable, concise, and usually the fastest option. The methods below are listed roughly from fastest to slowest:

```
f'{s} {t}'  # Fast!
s + ' ' + t
' '.join((s, t))
'%s %s' % (s, t)
'{} {}'.format(s, t)
Template('$s $t').substitute(s=s, t=t)  # Slow! (needs `from string import Template`)
```
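The ordering can shift between Python versions, so it is worth measuring. Here is a minimal sketch using `timeit`, with the sample values `s` and `t` and the `candidates` dictionary being assumptions made for the example:

```python
import timeit
from string import Template

s, t = "hello", "world"  # sample inputs for the comparison

# Each entry builds the same string with a different technique.
candidates = {
    "f-string": lambda: f"{s} {t}",
    "concatenation": lambda: s + " " + t,
    "join": lambda: " ".join((s, t)),
    "percent": lambda: "%s %s" % (s, t),
    "format": lambda: "{} {}".format(s, t),
    "Template": lambda: Template("$s $t").substitute(s=s, t=t),
}

for name, build in candidates.items():
    elapsed = timeit.timeit(build, number=100_000)
    print(f"{name:<14} {elapsed:.4f}s")
```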

**Generators Can Be Fast**

Generators are not inherently faster, as they were designed for lazy computation, which saves memory rather than time. However, the saved memory can make your program run faster. How? If you have a large dataset and you don't use generators (iterators), the data might not fit in the CPU's L1 cache, which slows down lookups of values in memory significantly.

When it comes to performance, it's very important that the CPU can keep the data it's working on as close as possible, which means in the cache.
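The memory difference is easy to observe with `sys.getsizeof`. The sketch below builds the same sequence as a list and as a generator. The variable names are illustrative, and `getsizeof` reports only the container itself, not the integers it refers to, so the real footprint of the list is even larger.

```python
import sys

n = 1_000_000

squares_list = [i * i for i in range(n)]  # all values are materialized up front
squares_gen = (i * i for i in range(n))   # values are produced one at a time

print(f"list object:      {sys.getsizeof(squares_list):>10} bytes")
print(f"generator object: {sys.getsizeof(squares_gen):>10} bytes")

# Both produce the same result when consumed.
total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)
print(total_from_list == total_from_gen)
```

Note that a generator can be consumed only once, so this approach fits best when the data is processed in a single pass.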

[Source](https://martinheinz.dev/blog/13)