# Profiling tutorial
Never try to optimise your code without profiling first.

## 1. Coarse-grained profiling
Measuring the performance of an entire program run.

### /usr/bin/time
Great tool for getting an initial feel of program behaviour.
- Language independent - can measure anything
- Gives some basic data like total time spent, total memory used
- Also gives more advanced information such as user/system CPU time, page faults, context switches and I/O operations.
- Good for benchmarking your program: monitoring its performance as you make changes and making sure it does not get worse.

These two examples contrast the behaviour of an I/O-intensive program with that of a computation-intensive one:

**macOS:** replace `-v` with `-l`.

In [None]:
%%bash 
/usr/bin/time -v curl -s http://example.com > /dev/null

In [None]:
%%bash 
/usr/bin/time -v python -c "for i in list(range(int(1e6))): n = i"

The `-o` option writes the report to a file instead of the terminal:

/usr/bin/time -v -o myfile.txt {command}
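
Some of the same figures can also be collected from inside Python. The following is a minimal sketch, assuming a Unix-like system (the standard-library `resource` module is not available on Windows); the helper name `measure_child` is made up for illustration and is not part of any library.

```python
import resource
import subprocess
import sys

def measure_child(cmd):
    """Run cmd to completion and return its resource usage (Unix only)."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "user_cpu_secs": after.ru_utime - before.ru_utime,
        "system_cpu_secs": after.ru_stime - before.ru_stime,
        # ru_maxrss is a high-water mark over all children waited for so far,
        # not a delta; it is in kilobytes on Linux and bytes on macOS.
        "max_rss": after.ru_maxrss,
        "major_page_faults": after.ru_majflt - before.ru_majflt,
        "voluntary_ctx_switches": after.ru_nvcsw - before.ru_nvcsw,
        "involuntary_ctx_switches": after.ru_nivcsw - before.ru_nivcsw,
    }

# Same computation-intensive one-liner as above, run with the current interpreter
stats = measure_child([sys.executable, "-c", "for i in list(range(int(1e6))): n = i"])
for name, value in stats.items():
    print("{}: {}".format(name, value))
```

This is not a replacement for `/usr/bin/time`, but it can be handy when a benchmark script needs to record these numbers itself.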

## 2. Function-level python profiling

### Timing a single block of code
For a one-off timing you can record the time at the start and end of the function, or use Jupyter's timing magics.

In [None]:
import time
def myfunc():
    start = time.perf_counter()
    n = 0
    for i in range(int(1e6)):
        n = i
    print('Time taken: {} secs'.format(time.perf_counter() - start))

myfunc()

You can easily wrap this in a decorator:

In [None]:
import functools
import time
def time_func(func):
    # functools.wraps keeps the wrapped function's name and docstring intact
    @functools.wraps(func)
    def wrapper(*args, **kw):
        start_time = time.perf_counter()
        result = func(*args, **kw)
        end_time = time.perf_counter()
        print('Func {} took {} secs'.format(func.__name__, (end_time - start_time)))
        return result
    return wrapper

@time_func
def myfunc():
    n = 0
    for i in range(int(1e6)):
        n = i

myfunc()
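
When the code to be timed is a block rather than a whole function, a context manager can play the same role. This is a small sketch built on the standard-library `contextlib`; the name `timed` is made up for illustration.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label):
    """Print how long the body of the `with` block took."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the block raises, so the timing is always reported
        elapsed = time.perf_counter() - start
        print('{} took {:.4f} secs'.format(label, elapsed))

with timed('loop'):
    n = 0
    for i in range(int(1e6)):
        n = i
```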

### Using Jupyter %time and %timeit
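`%time` runs the code once and reports the elapsed time; `%timeit` runs it repeatedly (`-n` loops per run, `-r` runs) and reports statistics over the runs: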

In [None]:
%%time
n = 0
for i in range(int(1e6)):
    n = i

In [None]:
%%timeit -n 3 -r 10
n = 0
for i in range(int(1e6)):
    n = i

In [None]:
def myfunc():
    n = 0
    for i in range(int(1e6)):
        n = i
%timeit -n 3 -r 10 myfunc()
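
Outside Jupyter, the standard-library `timeit` module offers similar functionality. The sketch below roughly mirrors `%timeit -n 3 -r 10`, using `number` for the loops per run and `repeat` for the number of runs; reporting the best run is one common choice, not the only one.

```python
import timeit

def myfunc():
    n = 0
    for i in range(int(1e6)):
        n = i

# Each entry is the total time for `number` calls; one entry per repeat
timings = timeit.repeat(myfunc, number=3, repeat=10)
best_per_call = min(timings) / 3
print('Best of {} runs: {:.4f} secs per call'.format(len(timings), best_per_call))
```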

### Memory profiling a block of code
This requires the `memory_profiler` package, which can be installed with `conda install memory_profiler` or `pip install memory_profiler`.

In [None]:
%load_ext memory_profiler
def myfunc():
    n = 0
    for i in range(int(1e6)):
        n = i
%memit myfunc()

In [None]:
%%memit
n = 0
for i in range(int(1e6)):
    n = i
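
If installing `memory_profiler` is not an option, the standard-library `tracemalloc` module gives a related but different measurement. It is a swapped-in technique, not what `%memit` does: it tracks memory allocated by Python itself rather than the memory usage of the whole process. The function `build_list` is a made-up example that deliberately allocates a large list.

```python
import tracemalloc

def build_list():
    # Hypothetical example function that allocates a noticeable amount of memory
    return list(range(int(1e6)))

tracemalloc.start()
data = build_list()
# Both values are in bytes and only cover allocations made since start()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print('Current: {:.1f} MiB, peak: {:.1f} MiB'.format(current / 2**20, peak / 2**20))
```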

### Using cProfile to get function call data
The cProfile module reports how much time is spent in each function of your Python program and how many times each function was called. cProfile is part of the Python standard library, so there is nothing to install.

In [None]:
%%bash
python -m cProfile -s cumulative walk.py

In [None]:
# Or you can call it from within python
import cProfile
from walk import keep_python_busy
cProfile.run('keep_python_busy()', sort='cumulative')

These results are not very readable though. Let's output them to a file so we can use some visualisation tools.

In [1]:
import cProfile
from walk import keep_python_busy
cProfile.run('keep_python_busy()', filename='walk.prof')
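
Saved profile data can also be inspected in text form with the standard-library `pstats` module. The sketch below is self-contained, so it profiles a made-up function `busy_work` rather than the tutorial's `walk` module, and prints only the five most expensive entries.

```python
import cProfile
import io
import pstats

def busy_work():
    # Hypothetical stand-in for the code being profiled
    return sum(i * i for i in range(200000))

profiler = cProfile.Profile()
profiler.runcall(busy_work)

# Collect the report in a string so it can be printed or saved
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.strip_dirs().sort_stats('cumulative').print_stats(5)
report = stream.getvalue()
print(report)
```

To read a file written by `cProfile.run(..., filename=...)` instead, pass the file name to `pstats.Stats`, for example `pstats.Stats('walk.prof')`.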

### Interpreting cProfile results with snakeviz

In [None]:
%%bash
snakeviz walk.prof

### Using PyCharm Pro's profiler
The "Profile" button in PyCharm Pro makes it easy to iterate between profiling and optimising.

![pycharm profiling button](resources/pycharm_profile_button.png)
This will give you a call list and a call graph similar to KCacheGrind (but not interactive):
![pycharm profiling graph](resources/pycharm_profile_graph.png)

## 3. Line-level profiling
The tools we've seen so far are great for narrowing down where the bottlenecks are. However, once we have identified which functions are slow, we usually want to make them faster, and for that we need to know which lines are the slowest.

### Using line_profiler
We need to specify both the function we want to profile and the top-level code we want to run. These may differ, because the function being profiled may be called from many other functions.

In [1]:
%load_ext line_profiler

from walk import primes_in_range

%lprun -f primes_in_range primes_in_range(100000)

### Using memory_profiler
Likewise, we may want to find out which lines in our code increase our memory consumption.
You may want to restart the kernel before running this, as Python keeps a cache of objects.

In [1]:
%load_ext memory_profiler

from walk import use_memory

%mprun -f use_memory use_memory(str_repeat=100000, str_count=1000)


