# Profiling

Profiling is a crucial step in finding where the bottlenecks in our code are, so that we can decide which tools to use to speed it up. Several existing libraries help with this: `cProfile` is good for high-level profiling, `line_profiler` for profiling specific lines of code, `memory_profiler` for tracking memory usage, and `timeit` for timing small snippets in isolation.

## cProfile

In [1]:
!python -m cProfile -s time ../scripts/01-time-profiling.py

[1, 2, 3, 4, 5] -> job -> [1, 4, 9, 16, 25]
         12147 function calls (12011 primitive calls) in 5.027 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        5    5.001    1.000    5.001    1.000 {built-in method time.sleep}
       83    0.005    0.000    0.005    0.000 {built-in method nt.stat}
        2    0.003    0.001    0.003    0.001 {built-in method _imp.create_dynamic}
       13    0.002    0.000    0.002    0.000 {built-in method _io.open_code}
       13    0.002    0.000    0.002    0.000 {built-in method marshal.loads}
      241    0.001    0.000    0.002    0.000 <frozen importlib._bootstrap_external>:96(_path_join)
        2    0.001    0.000    0.001    0.000 {built-in method _imp.exec_dynamic}
       13    0.001    0.000    0.001    0.000 {method 'read' of '_io.BufferedReader' objects}
        1    0.001    0.001    0.020    0.020 context.py:1(<module>)
       46    0.001    0.000    0.001    0.000 {bui
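
The command-line run above also profiles interpreter start-up and imports, which is why `importlib` entries appear in the table. A minimal sketch of profiling only a chosen region from inside Python follows; `job` and `profile_jobs` are hypothetical stand-ins, not the contents of `01-time-profiling.py`.

```python
import cProfile
import io
import pstats
import time


def job(x: int) -> int:
    """Hypothetical stand-in for the profiled work: sleep briefly, then square."""
    time.sleep(0.01)
    return x ** 2


def profile_jobs(values: list[int]) -> tuple[list[int], str]:
    """Run job() over values under cProfile and return results plus a text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    results = [job(v) for v in values]
    profiler.disable()

    # Sort by internal time, matching `-s time` on the command line,
    # and keep only the five most expensive entries.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats(pstats.SortKey.TIME).print_stats(5)
    return results, buffer.getvalue()


results, report = profile_jobs([1, 2, 3, 4, 5])
print(results)
print(report)
```

Because only the region between `enable()` and `disable()` is recorded, the report should be dominated by `time.sleep` rather than by import machinery.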

## memory_profiler

In [2]:
!python ../scripts/02-memory-profiling.py

Filename: c:\Users\samca\Documents\Python Projects\nextgen2025-codingbootcamp-session08\scripts\02-memory-profiling.py

Line #    Mem usage    Increment  Occurrences   Line Contents
     5    102.9 MiB    102.9 MiB           1   @profile
     6                                         def memory_intensive_function():
     7                                             # say this doesnt fit into memory
     8    127.6 MiB     24.7 MiB           1       df = pd.read_csv("../data/huge_file.csv") # e.g. 100TiB of memory?!
     9    127.8 MiB      0.2 MiB           1       df["transaction_value"].sum()
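
When installing `memory_profiler` is not an option, the standard library's `tracemalloc` can give a rough per-line picture instead. This is a different tool from the one used above, shown only as a sketch: `build_table` is a hypothetical stand-in for the CSV load, and `tracemalloc` counts allocations made through Python's memory allocator, so its numbers should not be expected to match the process-level figures in the table above.

```python
import tracemalloc


def build_table(n_rows: int) -> list[dict]:
    """Hypothetical memory-hungry step: build a list of row dictionaries."""
    return [{"transaction_value": float(i)} for i in range(n_rows)]


tracemalloc.start()
rows = build_table(100_000)
total = sum(row["transaction_value"] for row in rows)
current, peak = tracemalloc.get_traced_memory()  # both values are in bytes
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

print(f"current: {current / 2**20:.1f} MiB, peak: {peak / 2**20:.1f} MiB")
# Show the three source lines responsible for the most allocated memory.
for stat in snapshot.statistics("lineno")[:3]:
    print(stat)
```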




## timeit

In [None]:
import time
import timeit

def job_with_fixed_execution_time(x: int):
    """
    execution time is a fixed 0.1s
    """
    time.sleep(0.1)
    return x ** 2

N = 10
execution_time = timeit.timeit(lambda: job_with_fixed_execution_time(0), number=N)
print(f"Execution Time: {execution_time/N:.3f} seconds")

Execution Time: 0.100 seconds
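
A single `timeit.timeit` call gives one total, which can hide run-to-run noise. A small sketch using `timeit.repeat` to collect several independent measurements is shown below; the function is redefined with a shorter sleep purely to keep the demo quick, and `REPEATS` is an illustrative name.

```python
import time
import timeit


def job_with_fixed_execution_time(x: int) -> int:
    """Same shape as the function above, with a shorter sleep to keep the demo quick."""
    time.sleep(0.01)
    return x ** 2


N = 5        # calls per measurement
REPEATS = 3  # independent measurements

# timeit.repeat returns one total time (in seconds) per measurement.
totals = timeit.repeat(lambda: job_with_fixed_execution_time(0), number=N, repeat=REPEATS)
per_call = [total / N for total in totals]

print(f"best: {min(per_call):.3f} s, worst: {max(per_call):.3f} s")
```

The minimum is usually the most informative figure, since slower runs tend to reflect interference from other processes rather than the code itself.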


In [6]:
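import time

t = 0   # accumulated elapsed time in nanoseconds
N = 10  # number of repetitions

for _ in range(N):
    t_i = time.perf_counter_ns()
    job_with_fixed_execution_time(0)
    t_f = time.perf_counter_ns()

    t += (t_f - t_i)

print(t/1e9)    # total time in seconds
print(t/N/1e9)  # mean time per call in seconds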


1.0047918
0.10047918
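
The manual `perf_counter_ns` pattern can be wrapped in a context manager so the start and stop bookkeeping is written once. This is one possible sketch; `stopwatch` and `timings` are hypothetical names, not part of any library.

```python
import time
from contextlib import contextmanager


@contextmanager
def stopwatch(label: str, results: dict):
    """Hypothetical helper: store the elapsed seconds of the with-block under `label`."""
    start = time.perf_counter_ns()
    try:
        yield
    finally:
        # Record the time even if the block raises.
        results[label] = (time.perf_counter_ns() - start) / 1e9


timings: dict[str, float] = {}
with stopwatch("sleep", timings):
    time.sleep(0.01)

print(timings)
```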


## PyTorch Profiler

If you are debugging models built with PyTorch, its built-in profiler provides excellent insight into the timing and memory consumption of a model's functions, particularly when tracing is enabled.

https://pytorch.org/tutorials/recipes/recipes/profiler_recipe.html
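
As a rough illustration of the workflow, the sketch below profiles a single forward pass on the CPU. It assumes PyTorch is installed; the `Linear` layer and the input shape are hypothetical stand-ins for a real model.

```python
def profile_forward_pass() -> str:
    """Profile one forward pass of a toy model and return the summary table."""
    # Imported here so the file still loads where PyTorch is not installed.
    import torch
    from torch.profiler import ProfilerActivity, profile, record_function

    model = torch.nn.Linear(128, 64)   # hypothetical stand-in model
    inputs = torch.randn(32, 128)

    with profile(
        activities=[ProfilerActivity.CPU],
        profile_memory=True,
        record_shapes=True,
    ) as prof:
        with record_function("model_inference"):  # label shown in the table
            model(inputs)

    # prof.export_chrome_trace("trace.json") would also write a trace file
    # that can be opened in a trace viewer.
    return prof.key_averages().table(sort_by="cpu_time_total", row_limit=10)
```

Calling `print(profile_forward_pass())` prints a table of operators sorted by total CPU time, with the `model_inference` block appearing as its own row.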