# 1. Code Profiling in Python: Introduction

**Profiling** is a technique for pinpointing the most resource-intensive parts of an application. It lets us search for bottlenecks in the code in a rigorous and systematic way.

A **profiler** is a program that runs an application and monitors the execution time of its functions, thus detecting the functions in which the application spends most of its time.

**Benchmarks** are small scripts used to assess the total execution time of an application.

We will learn how to run benchmarks and measure the performance of the code.

## 1.1 Application Optimization Principles

- Make it run!
- Make it right!
- Make it fast!

## 1.2 Particle Simulator

The Particle Simulator is an application created to showcase the benchmarking techniques.

The Particle Simulator simulates a performance-intensive scientific system. It can simulate the rotation of multiple particles around a central point. The angular velocity of each particle is constant. Each particle starts rotating from a predefined position, identified by the coordinates x and y. The users of the simulator want to be able to determine the position of any particle at any point in time.

The code of the particle simulator is located in the project directory called **simulator**.

The particle simulator code is very simple. It contains two classes:

1) The Particle class describes a particle. Each particle is characterized by the coordinates x and y, the angular velocity, and the color.
2) The ParticleSimulator class tracks the movement of the attached particles. Its method **.evolve()** computes the position of every attached particle after the time interval **dt**.
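
The two classes described above could look roughly like the following minimal sketch. It is not the code from the **simulator** directory: the attribute names, the default color, the choice of the origin as the central point, and the exact rotation formula are all assumptions made for illustration.

```python
import math


class Particle:
    """A particle with a position, an angular velocity, and a color."""

    def __init__(self, x, y, ang_vel, color="blue"):
        self.x = x
        self.y = y
        self.ang_vel = ang_vel  # assumed unit: radians per unit of time
        self.color = color


class ParticleSimulator:
    """Moves the attached particles around the origin (assumed center)."""

    def __init__(self, particles):
        self.particles = particles

    def evolve(self, dt):
        """Advance every attached particle by the time interval dt."""
        for p in self.particles:
            angle = p.ang_vel * dt
            cos_a, sin_a = math.cos(angle), math.sin(angle)
            # Rotate the position vector (x, y) by `angle` around the origin.
            p.x, p.y = p.x * cos_a - p.y * sin_a, p.x * sin_a + p.y * cos_a


particle = Particle(1.0, 0.0, ang_vel=math.pi / 2)
simulator = ParticleSimulator([particle])
simulator.evolve(1.0)  # a quarter turn: the particle ends up near (0, 1)
print(particle.x, particle.y)
```

The sketch rotates each position exactly in one step. The real implementation may instead advance the particles in many small time steps, which is the kind of loop that makes such code worth profiling.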

#### Running Particle Simulator Visualization Code
 
The Particle Simulator app includes a simple visualization. It is located in the main.py file, and you can run it from the command line as follows:

```
$ python main.py
```

To see the visualization on the screen, you have to install a **matplotlib** UI backend together with the corresponding Python module.
On my Linux system I have Qt5 installed, and in addition I needed to install the Python module PyQt5. I'm not adding this as a dependency,
because it is system dependent, and you need to figure out what works best on your machine.

## 1.3 Project Structure

```
.
├── benchmarks
│   ├── benchmark.py
│   └── __init__.py
├── main.py
├── simulator
│   ├── __init__.py
│   └── particle_simulator.py
├── tests
│   ├── __init__.py
│   └── test_particle_simulator.py
└── Profiling_in_Python.ipynb
```

#### Explanation

- The file benchmarks/benchmark.py contains benchmark scripts.
- The file simulator/particle_simulator.py contains particle simulator code.
- The directory tests contains unit tests.

## 1.4 Code Profiling Tools We are Going to Learn

1) Unix **time** command
2) Python module **timeit**
3) Pytest plugin **pytest-benchmark**
4) Python module **cProfile**, and graphical tools used to visualize **cProfile** data
5) Python third party library **line_profiler**
6) Python third party library **memory_profiler**
7) Python disassembly module **dis**

## 1.5 Code Profiling Basics with **time** and **timeit**

In this section we will learn the basics of code profiling with the Unix command **time** and the Python module **timeit**.

### 1.5.1 Unix **time** command

The project file **benchmarks/benchmark.py** contains the benchmark code whose run time we want to measure.
I will show how to do it with the Unix **time** command. The command needs to be run from the Unix command line:


```
$ time python -m benchmarks.benchmark
```

That is what I'm getting back:

```
real    0m0,096s
user    0m0,088s
sys     0m0,008s
```

Please note that your results may be different, because they depend on your computer configuration.

The interpretation of the results is as follows:

**real**: The actual (wall-clock) time elapsed while running the process from start to finish.

**user**: The cumulative CPU time, summed over all CPUs, that the process spent executing in user mode, that is, running its own computations.

**sys**: The cumulative CPU time, summed over all CPUs, spent in the kernel on behalf of the process, for system-related tasks such as memory allocation.
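
The difference between the three values can also be observed from within Python. The following is a minimal sketch, not part of the project: the helper **measure()** and the two sample tasks are hypothetical names introduced here. It uses **time.perf_counter()** for the wall-clock time and **os.times()** for the user and system CPU time of the current process.

```python
import os
import time


def measure(func):
    """Return the (real, user, sys) seconds consumed while func() runs."""
    cpu_start = os.times()
    wall_start = time.perf_counter()
    func()
    real = time.perf_counter() - wall_start
    cpu_end = os.times()
    return real, cpu_end.user - cpu_start.user, cpu_end.system - cpu_start.system


def compute():
    # CPU-bound work: counts mostly towards the "user" time.
    return sum(i * i for i in range(300_000))


def wait():
    # Sleeping consumes wall-clock time but almost no CPU time.
    time.sleep(0.1)


for task in (compute, wait):
    real, user, sys_time = measure(task)
    print(f"{task.__name__:8s} real={real:.3f}s user={user:.3f}s sys={sys_time:.3f}s")
```

For the sleeping task, the real time should be close to 0.1 s while the user and sys times stay near zero. This is why the real time alone says little about how much work the CPU actually did.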

### 1.5.2 Python **timeit** module

The **timeit** module is used to measure the execution time of small snippets of Python code.

**timeit** runs the snippet in a loop *n* times and measures the total execution time of the *n* loops. It repeats this operation *r* times, chooses the quickest series of *n* loops, and divides its time by *n* to obtain the average runtime of a single loop.

I will set *n* to 100 and *r* to 3. 
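
The procedure described above can be sketched by hand, which may help to see what the reported number means. This is only an illustration of the idea, not how **timeit** is implemented: the real module also compiles the statement and disables garbage collection while timing. The names **best_average()** and **workload()** are hypothetical.

```python
import time


def best_average(func, n=100, r=3):
    """Run func() n times per series, for r series in total.

    Returns the average time per call within the fastest series,
    together with the total time of every series.
    """
    series_times = []
    for _ in range(r):
        start = time.perf_counter()
        for _ in range(n):
            func()
        series_times.append(time.perf_counter() - start)
    return min(series_times) / n, series_times


def workload():
    # Hypothetical stand-in for the code being measured.
    return sorted(range(1000), reverse=True)


average, series_times = best_average(workload, n=100, r=3)
print(f"Series totals: {series_times}")
print(f"Best average:  {average * 1e6:.1f} µs per loop")
```

Taking the minimum rather than the mean of the series is a deliberate choice: slower series are usually caused by other processes interfering with the measurement, not by the code itself.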

It is possible to use timeit from the command line, from the Jupyter Notebook (IPython), or as a part of the Python code. I will demonstrate all three methods by measuring the run time of the function called **run_benchmark()**. This function creates a ParticleSimulator object with one hundred Particle objects, and then moves the particles using the .evolve() method with an evolution time step of 0.1.

Please note that it is usually best to run timeit from the command line, where the measurement is not influenced by the overhead of the Jupyter Notebook environment.

#### Example 1: Using **timeit** from the command line

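```
$ python -m timeit -s "from benchmarks.benchmark import run_benchmark" -n 100 -r 3 "run_benchmark()"
```

The statement to time, **"run_benchmark()"**, is passed as the last argument. If it is omitted, **timeit** measures its default statement **pass** instead of the function, and the result is in the nanosecond range. This is what I got on my machine when I ran the command without the statement:

```
100 loops, best of 3: 7.22 nsec per loop
```

A figure like this reflects only the loop overhead, so it cannot be compared with the results of the other examples.

With the statement in place, the **timeit** command runs the function 100 times and measures the total runtime of all 100 runs, then it repeats the process 3 times. The best (the quickest) series of 100 runs is chosen, and the average function runtime is calculated from this series.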

#### Example 2: Using **timeit** from Jupyter Notebook (IPython environment)

IPython is a Python shell that improves the interactivity of the Python interpreter. IPython accepts *magic commands*, which are statements that start with the % symbol. **timeit** can be invoked as a magic command.

```
# Timing the run_benchmark() execution time using timeit and IPython magic commands

from benchmarks.benchmark import run_benchmark

%timeit -n 100 -r 3 run_benchmark()
```

```
36.9 ms ± 216 µs per loop (mean ± std. dev. of 3 runs, 100 loops each)
```


#### Example 3: Using **timeit** from the Python code

```
# Timing the run_benchmark() using timeit and Python

import timeit

n = 100
r = 3

result = timeit.repeat("run_benchmark()", setup="from benchmarks.benchmark import run_benchmark", number=n, repeat=r)

print("Result:\n", result)
```

```
Result:
 [3.8569349130002593, 3.785130720000325, 3.818997477999801]
```


The results obtained from the Python code are in a slightly different form, but they are still comparable to the results obtained with the magic command in **Example 2**. (In both cases the code was executed in the same environment, the Jupyter Notebook.)

The result shows the execution time of three series of 100 loops. Let's derive the average loop runtime the way **timeit** does it.
To do so, we select the best series of 100 loops, and then calculate the average loop runtime by dividing its time by the number of loops in the series.

#### Compare the result with the one you get by timing run_benchmark() with timeit from the command line.

#### Check the calculations and compare the results with the results obtained in the previous example.

```
# Single loop execution time

function_avg_runtime = min(result) / n
print(f"Function average runtime: {function_avg_runtime * 1000} ms.")
```

```
Function average runtime: 37.85130720000325 ms.
```
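
The **%timeit** magic command reports a mean and a standard deviation instead of the best series. As a rough cross-check, the same kind of figure can be derived from the list returned by **timeit.repeat()**. The sketch below reuses the three series totals printed in Example 3. The use of the population standard deviation is an assumption about how IPython computes its figure.

```python
import math

# Series totals in seconds, copied from the output of Example 3.
result = [3.8569349130002593, 3.785130720000325, 3.818997477999801]
n = 100

# Seconds per loop, one value per series.
per_loop = [total / n for total in result]

best = min(per_loop)
mean = math.fsum(per_loop) / len(per_loop)
# Population standard deviation (assumed to match the %timeit convention).
std_dev = math.sqrt(math.fsum((t - mean) ** 2 for t in per_loop) / len(per_loop))

print(f"best: {best * 1e3:.1f} ms per loop")
print(f"mean: {mean * 1e3:.1f} ms ± {std_dev * 1e6:.0f} µs per loop")
```

The mean is always at least as large as the best value, so a mean-based figure and a best-of figure for the same code should be expected to differ slightly.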


## Exercise 1:

Re-run **Example 1**, **Example 2**, and **Example 3** on your own machine, and update the notebook with your own results.