# Performance
We already discussed that performance is stochastic, *i.e.*, sequential runs of the same program can lead to different execution times. To recapitulate, we can measure:
- **Counts** of how often an event occurs.
- The **duration** of some interval.
- The **size** of a given variable.

Next, we discuss how to measure performance in Python.
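As a rough illustration of the three kinds of measurement listed above, the sketch below uses only the standard library. The function `count_even` and its workload are hypothetical, chosen only to have something to measure.

```python
import sys
import time
from collections import Counter

def count_even(values):
    """Hypothetical workload: count how many values are even."""
    events = Counter()
    for v in values:
        if v % 2 == 0:
            events["even"] += 1  # count: how often the event occurs
    return events

data = list(range(100_000))

wall_start = time.perf_counter()  # wall-clock timer
cpu_start = time.process_time()   # CPU time used by this process
events = count_even(data)
cpu_elapsed = time.process_time() - cpu_start
wall_elapsed = time.perf_counter() - wall_start

print("count   :", events["even"])
print(f"duration: {wall_elapsed:.6f} s wall, {cpu_elapsed:.6f} s CPU")
# sys.getsizeof reports the list object itself, not the integers it refers to
print("size    :", sys.getsizeof(data), "bytes")
```

Running it several times should give the same count and size but slightly different durations, which is the stochastic behavior mentioned above.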

## Measuring CPU time
### `timeit`
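This module can be used to measure the execution time of a piece of code. In particular, it can be called from the terminal. It takes the code to time as a string (not the path to a .py file), and the optional `-s` flag gives setup code that runs once and is not timed:
```bash
python -m timeit "My Python code"
python -m timeit -s "My setup code" "My Python code"
```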
In a Jupyter Notebook, we do it using the **magic function** `%timeit`.

In [None]:
import numpy as np

In [None]:
%timeit [x**4 for x in range(10000)]
%timeit np.arange(10000)**4

As you can see, one way to deal with stochastic execution times is to use statistics. The output of this magic function shows the mean and standard deviation of the time per loop, obtained by running the code that follows it a number of times.
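Outside a notebook, similar statistics can be collected with the `timeit` module itself. The following is a minimal sketch, assuming 5 repeats of 100 executions each; these numbers are arbitrary choices and not the ones `%timeit` picks automatically.

```python
import statistics
import timeit

NUMBER = 100  # executions per repeat (arbitrary choice)

# Time the same statement in 5 independent repeats.
totals = timeit.repeat(
    stmt="[x**4 for x in range(1000)]",
    repeat=5,
    number=NUMBER,
)

per_loop = [t / NUMBER for t in totals]  # seconds per single execution
mean = statistics.mean(per_loop)
std = statistics.stdev(per_loop)
print(f"{mean * 1e6:.1f} us +/- {std * 1e6:.1f} us per loop "
      f"(best of 5: {min(per_loop) * 1e6:.1f} us)")
```

The minimum is often reported alongside the mean because it is the repeat least disturbed by other activity on the machine.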

We can also measure the execution time of functions.

In [None]:
%%timeit # Now it applies to the entire cell
def sum2d(arr):
    M, N = arr.shape
    result = 0.0
    for i in range(M):
        for j in range(N):
            result += arr[i,j]
    return result

In [None]:
# The previous cell only timed the function definition, and names defined
# inside a %%timeit cell are not kept, so we define the function again
def sum2d(arr):
    M, N = arr.shape
    result = 0.0
    for i in range(M):
        for j in range(N):
            result += arr[i,j]
    return result

In [None]:
a = np.ones((2048, 2048))

In [None]:
a.size == 2048 ** 2 # elements

In [None]:
%timeit sum2d(a)

### `njit` from `numba`
The `njit` function from the Numba library is used for **Just-In-Time (JIT) compilation of Python code** to achieve significant performance improvements. `Numba` is a Just-In-Time compiler for Python that translates your Python functions into optimized machine code, often resulting in execution speeds comparable to compiled languages like C and Fortran.

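**Note.** The function `jit` itself is not deprecated. What was deprecated is its old behaviour of silently falling back to the slower object mode when called without `nopython=True`. `njit` is equivalent to `jit(nopython=True)` and never falls back. See [here](https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit).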

In [None]:
from numba import njit

In [None]:
a = np.ones((2048, 2048))

In [None]:
@njit
def sum2dv3(arr):
    M, N = arr.shape
    result = 0.0
    for i in range(M):
        for j in range(N):
            result += arr[i,j]
    return result

In [None]:
%timeit sum2d(a)

In [None]:
%timeit sum2dv3(a)

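Amazing! Keep in mind that the first call to `sum2dv3` also includes the compilation time, so that call is much slower than the following ones.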

### `numexpr`
The `numexpr` library in Python is designed to efficiently evaluate numerical expressions on arrays. It provides a way to accelerate numerical computations, especially those involving large arrays, by optimizing memory usage and utilizing multiple CPU cores.

In [None]:
import numexpr as ne

In [None]:
a = np.random.rand(100000)
b = np.random.rand(100000)
%timeit np.sin(a) + np.log(b)
%timeit ne.evaluate("sin(a) + log(b)")

In [None]:
%timeit 2*a + 3*b
%timeit ne.evaluate("2*a + 3*b")

## Measuring Size

In [None]:
x = np.array([1.3, 2.4, 3.3])

In [None]:
x.data  # Buffer object (memoryview) pointing to the start of the array's data

In [None]:
# The 'data' entry is a 2-tuple whose first element is a Python integer
# holding the address of the data area that stores the array contents
# (the second element is a read-only flag).
x.__array_interface__

In [None]:
# Size (number of elements of the array)
x.size

In [None]:
# Memory size of one array element (in bytes)
x.itemsize

In [None]:
# Memory size of the full array (in bytes)
x.itemsize * x.size
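The product above is also available directly as the `nbytes` attribute. The short sketch below checks this and shows how the size depends on the data type; the `float32` conversion is an added illustration, not part of the original example.

```python
import numpy as np

x = np.array([1.3, 2.4, 3.3])  # float64 by default: 8 bytes per element
assert x.nbytes == x.itemsize * x.size

# The same values stored as float32 need half the memory.
x32 = x.astype(np.float32)
print(x.dtype, x.nbytes)      # float64 24
print(x32.dtype, x32.nbytes)  # float32 12
```

Note that `nbytes` counts only the array elements, not the small overhead of the array object itself.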

## Profiling
Profiling in Python involves analyzing the performance of your code to identify bottlenecks and areas that can be optimized for better efficiency. Python offers several tools and libraries for profiling code. Here we cover some that ship with the standard library (*i.e.*, that do not require additional software).

### `cProfile`
**syntax (on bash)**
```bash
python -m cProfile my_script.py
```

In [None]:
from os import system  # function that runs a command in the system shell

In [None]:
system("cat examples/Example0.py")

In [None]:
system("python -m cProfile examples/Example0.py")

What are we seeing?
- `ncalls`: This column shows the number of times each function was called during the execution of the program.
- `tottime`: This column indicates the total time (in seconds) spent in each function excluding time spent in its subfunctions. It's the "internal" time spent exclusively in the function itself.
- `percall`: This column shows the average time (in seconds) spent in each function call, calculated as tottime / ncalls.
- `cumtime`: This column represents the cumulative time (in seconds) spent in the function and all its subfunctions. It includes the time spent in the function itself and all the functions called from it.
- `percall`: This column indicates the average cumulative time (in seconds) per call, calculated as cumtime / ncalls.
- `filename:lineno(function)`: This column provides information about the location of the function in your code, including the filename, line number, and function name.

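By default, the command-line output is sorted by function name (`Ordered by: standard name`). Add `-s cumtime` (for example, `python -m cProfile -s cumtime my_script.py`) to sort by the cumtime column, which helps you quickly identify the functions that consume the most overall time. These are potential candidates for optimization. You will want to look at functions with **high cumtime and ncalls values**.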
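The same report can be produced from within Python using `cProfile` together with `pstats`. The sketch below is a minimal example; `slow_sum` and `main` are hypothetical functions used only to have something to profile.

```python
import cProfile
import io
import pstats

def slow_sum(n):
    """Hypothetical workload: sum of squares with an explicit loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def main():
    return [slow_sum(20_000) for _ in range(20)]

profiler = cProfile.Profile()
profiler.enable()    # start collecting statistics
main()
profiler.disable()   # stop collecting

# Write the report to a string, sorted by cumulative time, top 5 entries only.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

In the report, `slow_sum` should show `ncalls` equal to 20 and account for nearly all of the cumulative time of `main`.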

### `profile`
The `profile` module is another built-in profiler with the same interface as `cProfile`. It is written in pure Python, so it adds more overhead to the profiled program, but it is easier to extend. It outputs information about function calls and their time consumption. You can use the `profile` module to profile specific parts of your code.

In [None]:
import profile

In [None]:
def main():
    x = [1.0] * (2048 * 2048)
    a = str(x[0])
    a += " is a one..."
    del x
    print(a)

profiler = profile.Profile()
profiler.runcall(main)
profiler.print_stats()

The output is still similar to that of `cProfile`. To get the performance line by line, we need something else.
1. Install `line_profiler` using `pip` or `anaconda`.
2. In the .py file that you want to analyze, put the decorator `@profile` above the function that you want to profile.
3. Use `kernprof` on your .py file. It is installed together with `line_profiler`; the script `kernprof.py` is found [here](https://github.com/pyutils/line_profiler/blob/main/kernprof.py), but also inside `examples`. The `-l` flag enables line-by-line profiling and `-v` prints the results when the script finishes.
4. Execute the command 
```bash
kernprof -l -v my_script.py
```

Try to do this for the example files in `examples/`.

There is also a way to do it directly from Python, without leaving the notebook. Bear with me.

In [None]:
from line_profiler import LineProfiler

In [None]:
def main(a, b, c):
    print("a= ", a)
    print("b= ", b)
    print(np.dot(a, b))
    print(a @ b)

a = np.array([[1,2],[4,3]])
b = np.array([[1,2],[4,3]])
c = np.arange(2) + 1

lp = LineProfiler()
lp_wrapper = lp(main)
lp_wrapper(a,b,c)
lp.print_stats()