# Chapter 01 - Benchmarking and Profiling

## 01 - Designing your application

Profiling is central to optimizing Python code: it identifies the bottlenecks, the slow parts of an application. By focusing on these critical sections, developers can significantly improve performance without wasting time on unnecessary optimizations.

Tools for profiling:
- **cProfile**: A standard Python module for measuring function execution time.
- **line_profiler**: A third-party package for profiling individual lines of code.
- **memory_profiler**: A tool for analyzing memory usage.
- **KCachegrind**: A graphical tool for visualizing profiling data.

Additionally, we will use benchmarks as a method for measuring the overall execution time of an application.

In [35]:
from matplotlib import pyplot as plt
from matplotlib import animation
from random import uniform
import timeit

import ipytest
import pytest

ipytest.autoconfig()

#### Purpose of Computer Simulations:
- Used in various fields like Physics, Chemistry, and Astronomy.
- Essential for studying realistic systems with a large number of bodies.
- Performance optimization is crucial for efficient simulations.

#### Particle Simulation Example:
- Simulates particles rotating around a central point.
- Requires: Initial positions, Speeds, and Rotation directions.
- Calculates: Particle positions at each time step.

#### Visual representation:
Origin: (0, 0)
Position: x, y vector
Velocity: vx, vy vector

> **Key Point**: Optimizing performance is critical for large-scale simulations to handle complex systems efficiently.

An example system is shown in the following figure. The origin of the system is the $\large (0, 0)$ point, the position is indicated by
the $\large x, y$ vector, and the velocity is indicated by the $\large vx, vy$ vector:

<div style="text-align: center;">
<img src="imgs/001.png" style="width:250px">
</div>

The basic feature of a circular motion is that the particles always move perpendicular to the direction connecting the particle and the center. To move the particle, we simply change the position by taking a series of very small steps (which correspond to advancing the system for a small interval of time) in the direction of motion, as shown in the following figure:

<div style="text-align: center;">
<img src="imgs/002.png" style="width:250px">
</div>

Our goal is to create a system of particles in uniform circular motion.

We will start by designing the application in an object-oriented way. According to our requirements, it is natural to have a generic **`Particle`** class that stores the particle positions, **x** and **y**, and their angular velocity (**ang_vel**).

In [36]:
class Particle:
    """Stores the particle positions, x and y, and their angular velocity"""

    __slots__ = ('x', 'y', 'ang_vel')

    def __init__(self, x, y, ang_vel):
        self.x = x
        self.y = y
        self.ang_vel = ang_vel

Another class, called **`ParticleSimulator`**, will encapsulate the laws of motion and will be responsible for changing the positions of the particles over time. The `__init__` method will store a list of **Particle** instances and the evolve method will change the particle positions according to our laws.

In short, we will compute the velocity components (v_x and v_y) of a particle at a given point (x, y) of its circular trajectory. These components are always perpendicular to the radius that connects the particle to the center of the circle. We should keep in mind that:

**Particle Movement:**
- Particles move at a constant speed.
- Direction of movement is perpendicular to the radius from the center.
- Velocity components (v_x, v_y) are calculated using:
    - $\large v_x = -\frac{y}{\sqrt{x^2 + y^2}}$
    - $\large v_y = \frac{x}{\sqrt{x^2 + y^2}}$
    - where:
        - v_x and v_y: the components of the velocity direction along x and y, respectively (normalized to unit length).
        - x and y: the coordinates of the particle at a given instant.
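The two formulas above can be checked numerically. The following is a minimal sketch, not part of the simulator; the helper name `direction` and the sample point are chosen here only for illustration. It verifies that the computed vector is perpendicular to the radius (zero dot product) and has unit length.

```python
import math


def direction(x, y):
    """Return the unit vector perpendicular to the radius at (x, y)."""
    norm = math.sqrt(x**2 + y**2)
    return -y / norm, x / norm


# Sample point, chosen arbitrarily for this illustration
x, y = 0.3, 0.5
v_x, v_y = direction(x, y)

# Perpendicular vectors have a zero dot product
dot = x * v_x + y * v_y
# The direction vector is normalized, so its length should be 1
length = math.hypot(v_x, v_y)

print(f"dot product: {dot:.2e}, length: {length:.6f}")
```

Because the vector has unit length, the speed of each particle in the simulator is set entirely by the factor that multiplies it.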


**Circular Motion Approximation:**
- Circular motion is approximated by dividing time into small time steps (dt).
- In each time step, the particle moves in a straight line tangent to the circle.

In order to avoid a strong divergence, such as the one illustrated in the following figure, it is necessary to take very small time steps:

<div style="text-align: center;">
<img src="imgs/003.png" style="width:250px">
</div>
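The divergence can be quantified with a small experiment. This is a sketch under stated assumptions: `radius_after` is a hypothetical helper that applies the same tangent-step update to a single particle starting at (1, 0) and returns its final distance from the origin. Since every step is perpendicular to the radius, each step can only increase that distance, and larger steps increase it more.

```python
import math


def radius_after(total_time, timestep, x=1.0, y=0.0, ang_vel=1.0):
    """Advance one particle with tangent steps; return its final radius."""
    nsteps = round(total_time / timestep)
    for _ in range(nsteps):
        norm = math.hypot(x, y)
        v_x, v_y = -y / norm, x / norm
        x += timestep * ang_vel * v_x
        y += timestep * ang_vel * v_y
    return math.hypot(x, y)


# The exact trajectory would stay at radius 1.0
coarse = radius_after(1.0, 0.1)      # 10 large steps
fine = radius_after(1.0, 0.0001)     # 10,000 small steps

print(f"coarse steps: radius = {coarse:.5f}")
print(f"fine steps:   radius = {fine:.5f}")
```

With the coarse step the particle visibly spirals outward, while the fine step keeps it very close to the original circle, at the cost of many more iterations.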



In [37]:
class ParticleSimulator:

    def __init__(self, particles):
        self.particles = particles

    def evolve(self, dt):
        timestep = 0.00001
        nsteps = int(dt/timestep)

        for i in range(nsteps):
            for p in self.particles:
                # 1. calculate the direction
                norm = (p.x**2 + p.y**2)**0.5
                v_x = (-p.y)/norm
                v_y = p.x/norm
                # 2. calculate the displacement
                d_x = timestep * p.ang_vel * v_x
                d_y = timestep * p.ang_vel * v_y

                p.x += d_x
                p.y += d_y
                # 3. repeat for all the time steps

- *`def visualize(simulator: ParticleSimulator):`* The `visualize` function takes a `ParticleSimulator` instance as an argument and displays the particle trajectories in an animated plot.

In [38]:
def visualize(simulator: ParticleSimulator):
    # plt.matplotlib.use('Qt5Agg')  # Or 'TkAgg' 
    plt.matplotlib.use('TkAgg') 

    X = [p.x for p in simulator.particles]
    Y = [p.y for p in simulator.particles]

    fig = plt.figure()
    ax = plt.subplot(111, aspect='equal')
    # Set up the axes and use the plot function to display the particles. plot takes a list of x and y coordinates.
    line, = ax.plot(X, Y, 'ro')

    # Axis limits
    plt.xlim(-1, 1)
    plt.ylim(-1, 1)

    # Write an initialization function, init, and a function, animate, that updates the 
    # x and y coordinates using the line.set_data method.

    # It will be run when the animation starts
    def init():
        
        line.set_data([], [])
        return line,

    def animate(i):
        # We let the particles evolve for 0.01 time units
        simulator.evolve(0.01)
        X = [p.x for p in simulator.particles]
        Y = [p.y for p in simulator.particles]

        line.set_data(X, Y)
        return line,

    # Create a FuncAnimation instance by passing the init and animate functions
    # plus the interval parameters, which specify the update interval, and blit,
    # which improves the update rate of the image.

    # Call the animate function every 10 ms
    anim = animation.FuncAnimation(fig,
                                   animate,
                                   init_func=init,
                                   blit=True,
                                   interval=10, cache_frame_data=False)
    plt.show()

> To make an interactive visualization, we will use the *matplotlib.pyplot.plot* function to display the particles as points and the *matplotlib.animation.FuncAnimation* class to animate the evolution of the particles over time.

- *`def test_visualize():`* animates a system of three particles rotating in different directions. Note that the third particle completes a round three times faster than the others. The *test_visualize* function is helpful to graphically understand the system time evolution.

In [39]:
def test_visualize():
    particles = [Particle(0.3, 0.5, +1),
                 Particle(0.0, -0.5, -1),
                 Particle(-0.1, -0.4, +3)]

    simulator = ParticleSimulator(particles)
    visualize(simulator)

In [40]:
test_visualize()

## 02 - Writing tests and benchmarks

At this moment, we have the first version of the particle simulator. Now we can start measuring our performance and tune up our code so that the simulator can handle as many particles as possible. As a first step, we will write a test and a benchmark.

Tests are important to code optimization. By creating a robust test suite, you can confidently experiment with an optimization technique without compromising the correctness of the code.

The specific example involves testing a simulation function named evolve. The test will simulate three particles for a short duration and compare the results with a known correct implementation. This unit-testing approach ensures that the evolve function produces accurate results, even after code modifications.

#### Unit-Testing

In [41]:
def test_evolve():
    particles = [Particle(0.3,  0.5, +1),
                 Particle(0.0, -0.5, -1),
                 Particle(-0.1, -0.4, +3)]

    simulator = ParticleSimulator(particles)

    simulator.evolve(0.1)

    p0, p1, p2 = particles

    def fequal(a, b):
        return abs(a - b) < 1e-5

    assert fequal(p0.x, 0.2102698450356825)
    assert fequal(p0.y, 0.5438635787296997)

    assert fequal(p1.x, -0.0993347660567358)
    assert fequal(p1.y, -0.4900342888538049)

    assert fequal(p2.x,  0.1913585038252641)
    assert fequal(p2.y, -0.3652272210744360)

    print("🟢 test_evolve passed")

In [42]:
test_evolve()

🟢 test_evolve passed
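The `fequal` helper in `test_evolve` compares floats with an absolute tolerance. As a possible alternative (a sketch, not the approach the test above uses), the standard library's `math.isclose` expresses the same idea. One small difference: `math.isclose` treats a difference exactly equal to the tolerance as close, while `fequal` uses a strict `<`.

```python
import math


def fequal(a, b, tol=1e-5):
    """Absolute-tolerance float comparison built on math.isclose."""
    # rel_tol=0.0 disables the relative check so only abs_tol applies
    return math.isclose(a, b, rel_tol=0.0, abs_tol=tol)


expected = 0.2102698450356825
print(fequal(0.21027, expected))   # difference is well below 1e-5
print(fequal(0.2103, expected))    # difference is about 3e-5
```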


#### Benchmark

> Benchmarking Python code refers to comparing the performance of one program to variations of the program.

- A test ensures the correctness of our functionality but gives little information about its running time. 
- A benchmark is a simple and representative use case that can be run to assess the running time of an application. 
- Benchmarks are very useful to keep score of how fast our program is with each new version that we implement.

We can write a representative benchmark by instantiating one hundred `Particle` objects with random coordinates and angular velocity, and feeding them to a `ParticleSimulator` instance. We then let the system evolve for 0.1 time units:

In [43]:
def benchmark():
    particles = [Particle(uniform(-1.0, 1.0),
                          uniform(-1.0, 1.0),
                          uniform(-1.0, 1.0))
                 for i in range(100)]

    simulator = ParticleSimulator(particles)
    simulator.evolve(0.1)

**Timing your benchmark**

A very simple way to time a benchmark is through the Unix time command.

**Using the `time` Command**

The `time` command is available on Unix-like systems, either as a shell built-in or as a standalone utility, and measures the execution time of a command. It provides three types of time measurements:

1. **Real time**: The total elapsed time from when the command starts to when it finishes.
2. **User time**: The amount of time the command spent executing in user mode.
3. **System time**: The amount of time the command spent executing in kernel mode.

The Unix *`time`* command is one of the simplest and most direct ways to benchmark a program.

In [44]:
!time python3 'ch-01-particle-sim.py'

python3 'ch-01-particle-sim.py'  1.50s user 0.05s system 164% cpu 0.944 total
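The difference between real time and CPU time can also be observed from within Python. The following sketch is only an illustration and uses a made-up workload: `time.perf_counter` measures elapsed (real) time, while `time.process_time` measures the user plus system CPU time of the current process, so time spent sleeping appears only in the former.

```python
import time

wall_start = time.perf_counter()
cpu_start = time.process_time()

# Waiting: counts toward real time but uses almost no CPU
time.sleep(0.05)
# Computing: counts toward both real time and CPU time
total = sum(i * i for i in range(200_000))

wall = time.perf_counter() - wall_start
cpu = time.process_time() - cpu_start

print(f"real time: {wall:.3f} s")
print(f"CPU time:  {cpu:.3f} s")
```

A large gap between the two numbers usually means the program spent time waiting (for I/O, sleep, or other processes) rather than computing.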


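Another convenient way to time Python scripts is the *`timeit`* module. This module runs a snippet of code in a loop $\large n$ times and measures the total execution time. It then repeats the same operation $\large r$ times. Since Python 3.7 the default value of $\large r$ is 5 for `timeit.repeat` and the command-line interface, while IPython's `%timeit` uses 7 runs by default. The command-line interface reports the best run, whereas `%timeit` reports the mean and standard deviation across runs.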

The `timeit` module can be used as a Python package, from the command line, or from IPython through magic commands.

In [45]:
%timeit benchmark()

425 ms ± 42 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [46]:
result = timeit.timeit('benchmark()', 
                       setup='from __main__ import benchmark', 
                       number=10)

# result is the time (in seconds) to run the whole loop
print(result)

result = timeit.repeat('benchmark()',
                       setup='from __main__ import benchmark',
                       number=10,
                       repeat=3)

# result is a list containing the time of each repetition (repeat=3 in this case)
print(result)

4.289500959999714
[4.092992166002659, 4.682570675002353, 4.19110455199916]
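The values returned by `timeit.timeit` and `timeit.repeat` are totals for the whole loop, so a per-call figure requires dividing by `number`. The sketch below shows one common way to summarize the results: take the minimum of the repetitions, which is least affected by other activity on the machine. The `workload` function is a hypothetical stand-in for `benchmark()`, kept cheap so the example runs quickly.

```python
import timeit


def workload():
    """Hypothetical stand-in for benchmark(), cheap enough to run quickly."""
    return sum(i * i for i in range(1_000))


number = 100
# timeit accepts a callable directly, so no setup string is needed
times = timeit.repeat(workload, number=number, repeat=3)

# Each entry is the total time for `number` calls
best_per_call = min(times) / number
print(f"best of {len(times)}: {best_per_call * 1e6:.1f} microseconds per call")
```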


The pytest executable can be used from the command line to discover and run tests contained in Python modules.

In [47]:
!python3 -m pytest 'ch-01-particle-sim-test.py'

platform linux -- Python 3.10.12, pytest-7.3.1, pluggy-1.0.0
rootdir: /home/vitormeriat/repos/python-fundamentals/py_fundamentals/high-performance-techniques
plugins: cov-4.0.0
collected 2 items

ch-01-particle-sim-test.py ..                                            [100%]



In [48]:
%%ipytest

def my_func(x):
    return x // 2 * 2 

def test_my_func_0():
    assert my_func(0) == 0
    assert my_func(1) == 0

def test_my_func_2():
    assert my_func(2) == 2
    assert my_func(3) == 2

[32m.[0m[32m.[0m[32m                                                                                           [100%][0m
[32m[32m[1m2 passed[0m[32m in 0.01s[0m[0m


In [49]:
@pytest.mark.parametrize("n", [0.1])
def test_evolve(n):
    particles = [Particle(0.3,  0.5, +1),
                 Particle(0.0, -0.5, -1),
                 Particle(-0.1, -0.4, +3)]

    simulator = ParticleSimulator(particles)

    simulator.evolve(n)

    p0, p1, p2 = particles

    def fequal(a, b):
        return abs(a - b) < 1e-5

    assert fequal(p0.x, 0.2102698450356825)
    assert fequal(p0.y, 0.5438635787296997)

    assert fequal(p1.x, -0.0993347660567358)
    assert fequal(p1.y, -0.4900342888538049)

    assert fequal(p2.x,  0.1913585038252641)
    assert fequal(p2.y, -0.3652272210744360)

In [50]:
ipytest.run()

[32m.[0m[32m.[0m[32m.[0m[32m                                                                                          [100%][0m
[32m[32m[1m3 passed[0m[32m in 0.02s[0m[0m


<ExitCode.OK: 0>

## 03 - Finding bottlenecks with cProfile
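As a starting point, here is a minimal sketch of how `cProfile` can be driven from Python code. It does not profile the particle simulator; `slow_sum` and `workload` are hypothetical functions used only to produce something measurable. The profiler collects per-function statistics, and `pstats.Stats` sorts and prints them.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical hot spot: a plain Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def workload():
    """Hypothetical entry point that calls the hot spot repeatedly."""
    return [slow_sum(10_000) for _ in range(20)]


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Collect the report in a string so it can be printed or inspected
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

In the report, `ncalls` is the number of calls, `tottime` is the time spent in the function itself, and `cumtime` also includes the time spent in the functions it calls. The same data can be gathered for a whole script with `python -m cProfile -s cumulative script.py`.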



In [None]:
def timing():
    result = timeit.timeit('benchmark()',
                           setup='from __main__ import benchmark',
                           number=10)
    # Result is the time it takes to run the whole loop
    print(result)

    result = timeit.repeat('benchmark()',
                           setup='from __main__ import benchmark',
                           number=10,
                           repeat=3)
    # Result is a list of times
    print(result)


def benchmark_memory():
    particles = [Particle(uniform(-1.0, 1.0),
                          uniform(-1.0, 1.0),
                          uniform(-1.0, 1.0))
                 for i in range(100000)]

    simulator = ParticleSimulator(particles)
    simulator.evolve(0.001)
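The `benchmark_memory` function above builds a large number of particles, which suggests it is intended for a memory profiler such as `memory_profiler`. As a rough alternative that needs only the standard library (an assumption made here, not one of the tools listed at the start of the chapter), `tracemalloc` can report how much memory Python allocated while the particles were created. `SlimParticle` is a hypothetical copy of `Particle`, and the count is reduced so the sketch runs quickly.

```python
import tracemalloc
from random import uniform


class SlimParticle:
    """Hypothetical stand-in mirroring the Particle class."""

    __slots__ = ("x", "y", "ang_vel")

    def __init__(self, x, y, ang_vel):
        self.x = x
        self.y = y
        self.ang_vel = ang_vel


tracemalloc.start()
particles = [SlimParticle(uniform(-1.0, 1.0),
                          uniform(-1.0, 1.0),
                          uniform(-1.0, 1.0))
             for _ in range(10_000)]
# get_traced_memory returns (current, peak) in bytes
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"current: {current / 1024:.0f} KiB, peak: {peak / 1024:.0f} KiB")
print(f"about {current / len(particles):.0f} bytes per particle")
```

Unlike `memory_profiler`, this does not give a line-by-line breakdown; it only reports totals for allocations made while tracing was active.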

In [2]:
%%timeit

benchmark()

383 ms ± 3.03 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [3]:
%%time

benchmark()

CPU times: user 373 ms, sys: 2.98 ms, total: 376 ms
Wall time: 375 ms
