# Intro to memoization/caching
with the `functools.lru_cache` decorator (least-recently-used caching)

Docs
* https://docs.python.org/3/library/functools.html

Other tricks
* tabulate - convert collections to tables for printing as strings, html, markdown, etc.
* tqdm - progress bars for loops, showing elapsed time and iteration rate

# Generic Example

### lru_cache decorator
* LRU is short for Least Recently Used
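* by default the cache holds up to 128 entries (`maxsize=128`); with `maxsize=None` it can grow without bound
* to limit memory usage, set the `maxsize` parameter to an integer, e.g. `@lru_cache(maxsize=256)`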
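
A small sketch of what a bounded cache does when it fills up. The function `square` and the tiny `maxsize=2` are made up for illustration; `cache_info()` is the inspection method that `lru_cache` attaches to the decorated function.

```python
from functools import lru_cache

@lru_cache(maxsize=2)
def square(x):
    # hypothetical stand-in for an expensive computation
    return x * x

square(1)  # miss: computed and stored
square(2)  # miss: computed and stored, cache is now full
square(1)  # hit: 1 becomes the most recently used key
square(3)  # miss: evicts 2, the least recently used key
square(2)  # miss again, because 2 was evicted

info = square.cache_info()
print(info)  # named tuple with hits, misses, maxsize, currsize
```

Calling `square.cache_clear()` empties the cache and resets these counters.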

From docs:<br>
*An LRU (least recently used) cache works best when the most recent calls are the best predictors of upcoming calls (for example, the most popular articles on a news server tend to change each day). The cache’s size limit assures that the cache does not grow without bound on long-running processes such as web servers. In general, the LRU cache should only be used when you want to reuse previously computed values. Accordingly, it doesn’t make sense to cache functions with side-effects, functions that need to create distinct mutable objects on each call, or impure functions such as time() or random().*
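
To make the warning in the quote concrete, here is a minimal sketch of two cases where caching probably does the wrong thing. The names `next_ticket` and `make_list` are hypothetical examples, not part of any library.

```python
from functools import lru_cache
from itertools import count

_ticket = count(1)  # hidden state: makes the function below impure

@lru_cache(maxsize=None)
def next_ticket():
    # meant to return a new number on each call
    return next(_ticket)

first = next_ticket()
second = next_ticket()  # served from the cache, so the counter never advances

@lru_cache(maxsize=None)
def make_list(n):
    # meant to return a fresh list on each call
    return [0] * n

a = make_list(3)
b = make_list(3)  # the very same list object as a
a.append(99)      # so this change is visible through b as well

print(first, second)  # both calls return the same ticket
print(a is b, b)
```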

### No caching

In [1]:
from time import sleep
from tqdm import tqdm
from functools import lru_cache

def do_something(x):
    sleep(1)
    return x

# call do_something(1) 5 times; note that the params are the same each time
for _ in tqdm(range(5)):
    x = 1
    _ = do_something(x)

100%|██████████| 5/5 [00:05<00:00,  1.00s/it]


### Caching with lru_cache

In [2]:
@lru_cache
def do_something_lru_cached(x):
    sleep(1)
    return x

# 5 calls with different arguments; every call is a cache miss, so each still takes ~1 s
inputs = [1,2,3,4,5]
for n in tqdm(inputs):
    _ = do_something_lru_cached(n)

100%|██████████| 5/5 [00:05<00:00,  1.00s/it]


In [3]:
# 100 calls with the same argument; the result for 1 is already cached, so every call is a hit
for n in tqdm(range(100)):
    _ = do_something_lru_cached(1)

100%|██████████| 100/100 [00:00<00:00, 1252031.04it/s]


### Naive cache from scratch
* naive because it never evicts anything, so the cache grows without bound
* a real Least Recently Used cache needs a size-bounded data structure that tracks access order, so the oldest keys (function inputs) can be popped off first

In [4]:
# naive cache
cache = {}
def do_something_my_cache(x):
    # first check if we have computed this result before
    # (membership test, so falsy results such as 0 or None still count as hits)
    if x in cache:
        return cache[x]

    sleep(1)

    # store result in cache for reuse
    cache[x] = x
    return x

# 100 calls with same argument
# this performs much faster than the non-cached version
# because only the first call has to compute the result
for n in tqdm(range(100)):
    _ = do_something_my_cache(1)

100%|██████████| 100/100 [00:01<00:00, 99.53it/s]
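
For comparison, a bounded LRU cache can be sketched with `collections.OrderedDict`, which remembers insertion order and can move a key to the end. This is only an illustration of the idea and not how `functools.lru_cache` is implemented; `simple_lru_cache` and `double` are hypothetical names, and the sketch handles hashable positional arguments only.

```python
from collections import OrderedDict
from functools import wraps

def simple_lru_cache(maxsize=128):
    """Hypothetical decorator: bounded LRU cache for positional args only."""
    def decorator(func):
        cache = OrderedDict()  # oldest key first, most recently used last

        @wraps(func)
        def wrapper(*args):
            if args in cache:
                cache.move_to_end(args)    # mark as most recently used
                return cache[args]
            result = func(*args)
            cache[args] = result
            if len(cache) > maxsize:
                cache.popitem(last=False)  # drop the least recently used key
            return result

        wrapper.cache = cache  # exposed for inspection in this demo only
        return wrapper
    return decorator

calls = []  # records which inputs were actually computed

@simple_lru_cache(maxsize=2)
def double(x):
    calls.append(x)
    return 2 * x

for value in [1, 2, 1, 3, 2]:
    double(value)

print(calls)               # inputs that missed the cache
print(list(double.cache))  # keys still cached, oldest first
```

With `maxsize=2`, the second call with `1` is a hit, `3` evicts `2`, and the final `2` has to be recomputed.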


# Fibonacci (cliche)

Memoization demos often use recursive functions such as Fibonacci, whose naive implementation makes an exponentially growing number of calls, or factorial.

In [5]:
from functools import lru_cache
from tabulate import tabulate
from tqdm import tqdm

def fib(n):
    if n < 2:
        return n
    return fib(n-1) + fib(n-2)

fib_numbers = []
for i in tqdm(range(10)):
    fib_numbers.append(fib(i))
    
fib_numbers = []
for i in tqdm(range(30)):
    fib_numbers.append(fib(i))
    
fib_numbers = []
for i in tqdm(range(35)):
    fib_numbers.append(fib(i))
    
table = tabulate(list(enumerate(fib_numbers))[:10], 
                 headers=['index', 'fibonacci number'], tablefmt='github')

print(table)

# redefine fib with caching; the recursive calls now reuse stored results
@lru_cache
def fib(n):
    if n < 2:
        return n
    return fib(n-1) + fib(n-2)

fib_numbers = []
for i in tqdm(range(100)):
    fib_numbers.append(fib(i))

100%|██████████| 10/10 [00:00<00:00, 114912.44it/s]
100%|██████████| 30/30 [00:00<00:00, 81.68it/s] 
100%|██████████| 35/35 [00:04<00:00,  8.68it/s] 


|   index |   fibonacci number |
|---------|--------------------|
|       0 |                  0 |
|       1 |                  1 |
|       2 |                  1 |
|       3 |                  2 |
|       4 |                  3 |
|       5 |                  5 |
|       6 |                  8 |
|       7 |                 13 |
|       8 |                 21 |
|       9 |                 34 |


100%|██████████| 100/100 [00:00<00:00, 528249.87it/s]
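
The gap between the two timings comes from how many times the function body runs. A rough way to see this is to count the calls directly, sketched here with hypothetical names `fib_plain` and `fib_cached` so the two versions do not shadow each other.

```python
from functools import lru_cache

call_count = 0  # number of times the uncached function body runs

def fib_plain(n):
    global call_count
    call_count += 1
    if n < 2:
        return n
    return fib_plain(n - 1) + fib_plain(n - 2)

@lru_cache(maxsize=None)
def fib_cached(n):
    if n < 2:
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)

plain_result = fib_plain(20)
plain_calls = call_count

cached_result = fib_cached(20)
cached_info = fib_cached.cache_info()

print(plain_result, plain_calls)        # same answer, tens of thousands of calls
print(cached_result, cached_info.misses)  # one computation per distinct n in 0..20
```

Each miss is one real computation, so the cached version does work proportional to `n`, while the plain version repeats the same subproblems over and over.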
