# Summary

This notebook provides some tools for profiling and timing Python code. This can 
help you make your code a lot faster. However, there are limits to how much speed 
you can gain by rewriting Python code alone. For additional speedup you must use HPC libraries. I will 
cover this in an additional notebook.

In [1]:
import numpy as np
import time

In [2]:
def utility(x, floor):
    floored_consumption = np.where(x < floor, floor, x)
    return np.log(floored_consumption)

In [3]:
max_grid = 1_000_000
num_grid = 1_000_000
consumption = np.linspace(1, max_grid, num_grid)

### Timing

Timing code is the basic building block of writing fast code. However, do not waste time at the beginning of your project making your code fast. First make it work, then make it fast, because:

> [Premature optimization is the root of all evil. - Donald Knuth](https://wiki.c2.com/?PrematureOptimization)

We start with naive timing, i.e. taking the time when the code starts and when it finishes.

In [4]:
tic = time.time()
utility(consumption, 1)
toc = time.time()
print(f"Time elapsed: {toc - tic: .5f} seconds")

Time elapsed:  0.02897 seconds
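For measuring short intervals, `time.perf_counter()` is generally a better clock than `time.time()`, because it is monotonic and has the highest available resolution. A minimal sketch of the same naive timing wrapped in a reusable helper; the name `time_once` and the small test array are hypothetical and not part of the notebook:

```python
import time

import numpy as np


def time_once(func, *args, **kwargs):
    """Run func once and return (result, elapsed seconds). Hypothetical helper."""
    # perf_counter is a monotonic, high-resolution clock meant for intervals.
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


values = np.linspace(1, 1_000, 1_000)
result, elapsed = time_once(np.log, values)
print(f"Time elapsed: {elapsed:.5f} seconds")
```

A single measurement like this is noisy, so it is best treated as a rough first impression only.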


Jupyter notebooks, however, have built-in magic commands that can be used to time the execution of code.
Two percent signs (%%) always mean a cell magic, and one (%) always means a line magic.

The %%time magic runs the cell once:

In [5]:
%%time
utility(consumption, 1)


CPU times: user 4.79 ms, sys: 5.62 ms, total: 10.4 ms
Wall time: 9.88 ms


array([ 0.        ,  0.69314718,  1.09861229, ..., 13.81550856,
       13.81550956, 13.81551056])

The %timeit magic runs the code multiple times to get an average runtime:

In [6]:
%timeit utility(consumption, 1)

6.06 ms ± 128 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
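Outside of a notebook the magic commands are not available, but the standard-library `timeit` module offers the same kind of repeated measurement. A rough sketch that mirrors the "7 runs, 100 loops each" layout above; the smaller grid is an assumption to keep the example quick:

```python
import timeit

import numpy as np


def utility(x, floor):
    floored_consumption = np.where(x < floor, floor, x)
    return np.log(floored_consumption)


consumption = np.linspace(1, 10_000, 10_000)

# 7 runs of 100 calls each; every entry in `runs` is the total time of one run.
runs = timeit.repeat(lambda: utility(consumption, 1), repeat=7, number=100)
per_call = np.array(runs) / 100  # seconds per single call in each run

print(f"mean {per_call.mean() * 1e6:.1f} µs, std {per_call.std() * 1e6:.1f} µs per call")
print(f"best {per_call.min() * 1e6:.1f} µs per call")
```

The minimum is often the most stable number to compare across variants, since the slower runs mostly reflect other activity on the machine.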


### Profiling

If you have multiple operations, you want to find out which one takes the most time. This is called profiling. Let's write down a few functions:

In [7]:
def calc_choosen_utility(consumption, floor, disutil):
    """Calculate the chosen utility for a given consumption level."""
    unemployed_util, work_util = calc_utilities(consumption, disutil, floor)
    utilities = np.column_stack((unemployed_util, work_util))
    choice = determine_optimal_choice(utilities)
    # Pick, for each row, the utility of the chosen alternative.
    utility_of_choice = np.take_along_axis(utilities, choice, axis=1)
    return utility_of_choice


def determine_optimal_choice(utilities):
    """Determine the optimal choice."""
    shocks = np.random.gumbel(size=(utilities.shape[0], 2))
    choice_specific_util = utilities + shocks
    return np.argmax(choice_specific_util, axis=1, keepdims=True)


def calc_utilities(cons, disutil, floor):
    """Calculate the utilities for unemployed and employed."""
    floor_consumption = calc_floor_consumption(cons, floor)
    base_utility = np.log(floor_consumption)
    utility_work = base_utility - disutil
    return base_utility, utility_work


def calc_floor_consumption(cons, floor):
    """Ensure that the consumption is not below the floor."""
    # Work on a copy so that the caller's array is not modified in place.
    floored = cons.copy()
    floored[floored < floor] = floor
    return floored

We will use snakeviz as the profiler. It is a browser-based profiler that can be used in a Jupyter notebook. It is not installed by default, so we have to install it first. It is installed via pip and specified in the environment.yml file. To use it after installation, we need to load the extension first.

In [8]:
%load_ext snakeviz

In [9]:
%%snakeviz -t


floor_consumption = 0.1
disutility = 0.5

utility_of_choice = calc_choosen_utility(consumption, floor_consumption, disutility)

 
*** Profile stats marshalled to file '/tmp/tmpavq4rt9v'.
Opening SnakeViz in a new tab...
snakeviz web server started on 127.0.0.1:8080; enter Ctrl-C to exit
http://127.0.0.1:8080/snakeviz/%2Ftmp%2Ftmpavq4rt9v


However, large codebases and models are usually not executed in a Jupyter notebook, so we need to use the command line. We can use the cProfile module to profile our code, with the -o flag saving the output to a file. We can then use snakeviz to visualize the output. For example, the command to profile a script is:

```bash
python -m cProfile -o profile.prof profiling_timing.py
```

and then to visualize the output:
    
```bash
snakeviz profile.prof
```
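cProfile can also be driven from inside a script, which is handy when only one part of a larger program should be profiled. A minimal sketch using the standard-library `cProfile` and `pstats` modules; the functions `slow_step`, `fast_step` and `run_model` are hypothetical stand-ins for real model code:

```python
import cProfile
import io
import pstats

import numpy as np


def slow_step(n):
    """Hypothetical workload: draw random numbers and sort them."""
    rng = np.random.default_rng(0)
    return np.sort(rng.normal(size=n))


def fast_step(n):
    """Hypothetical workload: a cheap vectorized sum."""
    return np.arange(n).sum()


def run_model():
    slow_step(200_000)
    fast_step(200_000)


# Profile only the call we are interested in.
profiler = cProfile.Profile()
profiler.enable()
run_model()
profiler.disable()

# Sort by cumulative time and print the ten most expensive entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(10)
report = stream.getvalue()
print(report)
```

Calling `profiler.dump_stats("profile.prof")` instead of printing would write a file in the same format as the `-o` flag, which snakeviz can then open.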

In [10]:
# Assumed baseline: plain NumPy logarithm without a consumption floor.
def utility_log(x):
    return np.log(x)

%timeit utility_log(consumption)

In [None]:
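import numba as nb


@nb.jit(nopython=True)
def utility_log_numba(x):
    # Write into a new array so the input is not overwritten between calls.
    out = np.empty_like(x)
    for i in range(len(x)):
        out[i] = np.log(x[i])
    return out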

In [None]:
# Call once first so that JIT compilation is not included in the timing.
utility_log_numba(consumption)
%timeit utility_log_numba(consumption)

3.86 ms ± 43.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


# Introduce a consumption floor

In [None]:
def utility_log_floor(x, floor):
    return np.log(np.maximum(x, floor))

@nb.njit()
def utility_log_floor_numba(x, floor):
    # Write into a new array so the input is not overwritten between calls.
    out = np.empty_like(x)
    for i in range(len(x)):
        out[i] = np.log(max(x[i], floor))
    return out

In [None]:
%timeit utility_log_floor(consumption, 1)

7.26 ms ± 159 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [None]:
# Warm up the Numba version so that compilation is not timed.
utility_log_floor_numba(consumption, 1)
%timeit utility_log_floor_numba(consumption, 1)

3.91 ms ± 16.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
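A timing comparison is only meaningful if the variants compute the same thing and do not change their input between runs. A small sketch of such a check, using a plain Python loop as a stand-in for the compiled version; the names `floor_vectorized` and `floor_loop` and the test grid are hypothetical:

```python
import numpy as np


def floor_vectorized(x, floor):
    """Vectorized version: log of consumption, bounded below by floor."""
    return np.log(np.maximum(x, floor))


def floor_loop(x, floor):
    """Loop version that writes into a new array instead of modifying x."""
    out = np.empty_like(x)
    for i in range(len(x)):
        out[i] = np.log(max(x[i], floor))
    return out


grid = np.linspace(0.5, 100, 1_000)
original = grid.copy()

# Both variants should return the same values up to floating point tolerance.
np.testing.assert_allclose(floor_loop(grid, 1), floor_vectorized(grid, 1))
# The input must be unchanged, otherwise repeated timing runs see different data.
np.testing.assert_array_equal(grid, original)
print("Both variants agree and leave the input untouched.")
```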
