Sometimes `function-level profiling` shows that a method is expensive but, because the method is complex, does not show which part of it is responsible.

* `Line-level profiling` provides more detail by measuring performance at the level of individual lines of code.
* It records how many times each line is executed and how much time is spent on it.
* Because this level of detail can be costly to collect, it is applied only to specific methods you choose.
* This approach makes it easier to spot individual lines that consume a disproportionate amount of runtime.
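
To make the idea of per-line hit counts concrete, here is a minimal sketch that uses only the standard library's `sys.settrace`. It is an illustration, not `line_profiler`'s implementation, and it records hit counts only (no timings). The names `count_line_hits` and `total` are hypothetical and exist only for this example.

```python
import sys
from collections import Counter


def count_line_hits(func, *args, **kwargs):
    """Run func and count how often each of its lines executes.

    Keys of the returned Counter are line offsets relative to the
    `def` line of func (offset 0 is the `def` line itself).
    """
    hits = Counter()
    code = func.__code__

    def tracer(frame, event, arg):
        # Ignore every frame that does not belong to the target function.
        if frame.f_code is not code:
            return None
        if event == "line":
            hits[frame.f_lineno - code.co_firstlineno] += 1
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        # Always remove the trace hook, even if func raises.
        sys.settrace(None)
    return result, hits


def total(n):
    s = 0
    for i in range(n):
        s += i
    return s


result, hits = count_line_hits(total, 3)
for offset in sorted(hits):
    print(f"offset {offset}: {hits[offset]} hit(s)")
```

The loop body (`s += i`, offset 3) is counted once per iteration, which is the kind of information the `Hits` column of a line profiler reports. Tracing every line is expensive, which is why it is restricted to the functions you select.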


# Exercise 1:

* Download the Python code file (`fizzbuzz.py`).

We want to demonstrate how `line_profiler` works on an example.

We start with the code below, defined in the global context.

The code loops through the numbers 1 to 100.
For each number:

* Prints “FizzBuzz” if it’s divisible by both 3 and 5.
* Prints “Fizz” if it’s divisible by 3 only.
* Prints “Buzz” if it’s divisible by 5 only.
* Otherwise, prints the number itself.

In [None]:
n = 100
for i in range(1, n + 1):
    if i % 3 == 0 and i % 5 == 0:
        print("FizzBuzz")
    elif i % 3 == 0:
        print("Fizz")
    elif i % 5 == 0:
        print("Buzz")
    else:
        print(i)

## To profile it, we will have to wrap it up in a function, e.g., `fizzbuzz`

In [None]:
def fizzbuzz(n):
    for i in range(1, n + 1):
        if i % 3 == 0 and i % 5 == 0:
            print("FizzBuzz")
        elif i % 3 == 0:
            print("Fizz")
        elif i % 5 == 0:
            print("Buzz")
        else:
            print(i)

fizzbuzz(100)

### Let's decorate the function using @profile

In [1]:
# load the required package as follows
!pip install line_profiler
%load_ext line_profiler

from line_profiler import profile

@profile
def fizzbuzz(n):
    for i in range(1, n + 1):
        if i % 3 == 0 and i % 5 == 0:
            print("FizzBuzz")
        elif i % 3 == 0:
            print("Fizz")
        elif i % 5 == 0:
            print("Buzz")
        else:
            print(i)

fizzbuzz(100)

1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz
16
17
Fizz
19
Buzz
Fizz
22
23
Fizz
Buzz
26
Fizz
28
29
FizzBuzz
31
32
Fizz
34
Buzz
Fizz
37
38
Fizz
Buzz
41
Fizz
43
44
FizzBuzz
46
47
Fizz
49
Buzz
Fizz
52
53
Fizz
Buzz
56
Fizz
58
59
FizzBuzz
61
62
Fizz
64
Buzz
Fizz
67
68
Fizz
Buzz
71
Fizz
73
74
FizzBuzz
76
77
Fizz
79
Buzz
Fizz
82
83
Fizz
Buzz
86
Fizz
88
89
FizzBuzz
91
92
Fizz
94
Buzz
Fizz
97
98
Fizz
Buzz


### Trigger the line profiling

From the command line:

```bash
python -m kernprof -lvr fizzbuzz.py
```

or, in a Jupyter notebook:

In [3]:
%lprun -f fizzbuzz fizzbuzz(100)

1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz
16
17
Fizz
19
Buzz
Fizz
22
23
Fizz
Buzz
26
Fizz
28
29
FizzBuzz
31
32
Fizz
34
Buzz
Fizz
37
38
Fizz
Buzz
41
Fizz
43
44
FizzBuzz
46
47
Fizz
49
Buzz
Fizz
52
53
Fizz
Buzz
56
Fizz
58
59
FizzBuzz
61
62
Fizz
64
Buzz
Fizz
67
68
Fizz
Buzz
71
Fizz
73
74
FizzBuzz
76
77
Fizz
79
Buzz
Fizz
82
83
Fizz
Buzz
86
Fizz
88
89
FizzBuzz
91
92
Fizz
94
Buzz
Fizz
97
98
Fizz
Buzz


Timer unit: 1e-09 s

Total time: 0.004619 s
File: /var/folders/s0/2zsybdkd50s2xdxzyg37j4611m1b1l/T/ipykernel_96351/402999492.py
Function: fizzbuzz at line 7

Line #      Hits         Time  Per Hit   % Time  Line Contents
     7                                           @profile
     8                                           def fizzbuzz(n):
     9       101      24000.0    237.6      0.5      for i in range(1, n + 1):
    10       100      50000.0    500.0      1.1          if i % 3 == 0 and i % 5 == 0:
    11         6     237000.0  39500.0      5.1              print("FizzBuzz")
    12        94      34000.0    361.7      0.7          elif i % 3 == 0:
    13        27    1077000.0  39888.9     23.3              print("Fizz")
    14        67      22000.0    328.4      0.5          elif i % 5 == 0:
    15        14     553000.0  39500.0     12.0              print("Buzz")
    16                                                   else:
    17        53    2622000.0  49471.7     56.8              print(i)

* In 100 iterations, “FizzBuzz” is printed 6 times.

* Profiling shows those calls take 5.1% of the runtime, slightly less than the 6% one might expect from the iteration count, because the control-flow lines and the slower `print(i)` calls take up part of the total.

* “Fizz” is printed 27 times, accounting for 23.3% of the runtime.

* “Buzz” is printed 14 times, accounting for 12.0% of the runtime.

* Each `print` call has a similar per-hit time of about 39 to 50 microseconds, so together the four `print` lines account for about 97% of the runtime.
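
As a cross-check, the `% Time` and `Per Hit` columns can be recomputed from the `Hits` and `Time` columns. The sketch below hard-codes the numbers from the profiler table above; the `rows` list and the variable names are ours, chosen for illustration only.

```python
# (line number, hits, time in ns) copied from the profiler table above.
# The timer unit is 1e-09 s, so the Time column is in nanoseconds.
rows = [
    (9, 101, 24_000),
    (10, 100, 50_000),
    (11, 6, 237_000),     # print("FizzBuzz")
    (12, 94, 34_000),
    (13, 27, 1_077_000),  # print("Fizz")
    (14, 67, 22_000),
    (15, 14, 553_000),    # print("Buzz")
    (17, 53, 2_622_000),  # print(i)
]
print_lines = {11, 13, 15, 17}

total_ns = sum(time_ns for _, _, time_ns in rows)
percent = {line: 100 * time_ns / total_ns for line, _, time_ns in rows}
print_share = sum(percent[line] for line in print_lines)

for line, hits, time_ns in rows:
    per_hit_us = time_ns / hits / 1000  # nanoseconds -> microseconds
    print(f"line {line:>2}: {hits:>3} hits, "
          f"{per_hit_us:6.1f} us/hit, {percent[line]:5.1f}% of total")

print(f"print calls together: {print_share:.1f}% "
      f"of {total_ns / 1e9:.6f} s")
```

The summed time matches the reported total of 0.004619 s, and the `print` lines dominate it, so in this function the cost is in the output rather than in the modulo tests.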

# Exercise 2 - Bubble sort:
* Download `bubblesort.py` from: https://icr-rse-group.github.io/carpentry-pando-python/files/bubblesort/bubblesort.py

Here’s what it does step by step:

* Loops through the array multiple times.
* In each pass, compares adjacent elements `arr[j]` and `arr[j+1]`.
* If they’re out of order (`arr[j] > arr[j+1]`), it swaps them.
* Uses a flag `swapped` to track if any swap happened in the pass.
* If no swaps occur, the array is already sorted, so the loop stops early (this is an optimisation).

In [None]:
# Doing the necessary changes to profile the code

import sys
import random
from line_profiler import profile        # Import profile decorator

@profile                                 # Decorate the function to be profiled
def main():                              # Create a simple function with the code to be profiled
    # Argument parsing
    if len(sys.argv) != 2:
        print("Script expects 1 positive integer argument, %u found."%(len(sys.argv) - 1))
        sys.exit()
    n = int(sys.argv[1])
    # Init
    random.seed(12)
    arr = [random.random() for i in range(n)]
    print("Sorting %d elements"%(n))
    # Sort
    for i in range(n - 1): 
        swapped = False
        for j in range(0, n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        # If no two elements were swapped in the inner loop, the array is sorted
        if not swapped:
            break
    # Validate
    is_sorted = True
    for i in range(n - 1):
        if arr[i] > arr[i+1]:
            is_sorted = False
    print("Sorting: %s"%("Passed" if is_sorted else "Failed"))
    
main()                                  # Call the created function

In [None]:
# Below is the original content of the file, before the changes above
import sys
import random

# Argument parsing
if len(sys.argv) != 2:
    print("Script expects 1 positive integer argument, %u found."%(len(sys.argv) - 1))
    sys.exit()
n = int(sys.argv[1])
# Init
random.seed(12)
arr = [random.random() for i in range(n)]
print("Sorting %d elements"%(n))
# Sort
for i in range(n - 1): 
    swapped = False
    for j in range(0, n - i - 1):
        if arr[j] > arr[j + 1]:
            arr[j], arr[j + 1] = arr[j + 1], arr[j]
            swapped = True
    # If no two elements were swapped in the inner loop, the array is sorted
    if not swapped:
        break
# Validate
is_sorted = True
for i in range(n - 1):
    if arr[i] > arr[i+1]:
        is_sorted = False
print("Sorting: %s"%("Passed" if is_sorted else "Failed"))
    

## To profile the code from the command line, pass the number of elements (an integer > 0) as the argument
First, make sure the `@profile` decorator has been added, as in the modified version above.

```bash
python -m kernprof -lvr bubblesort.py 100
```

# Observations:
* From the profiling output we can see that the `print` statements were the most expensive individual calls (“Per Hit”); however, each was called only once.
* Most of the execution time was spent in the inner loop (lines 19-22). This part can benefit from some optimisation.
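
To see why the inner loop dominates, it helps to count how often its comparison runs. The sketch below reuses the sorting logic from the exercise in a function that also counts comparisons. `bubble_sort_counting` is a hypothetical helper written for this illustration; it is not part of `bubblesort.py`.

```python
import random


def bubble_sort_counting(values):
    """Bubble sort a copy of values; return (sorted list, comparison count)."""
    arr = list(values)
    n = len(arr)
    comparisons = 0
    for i in range(n - 1):
        swapped = False
        for j in range(n - i - 1):
            comparisons += 1  # one execution of the inner-loop test
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        # No swaps in this pass means the list is already sorted.
        if not swapped:
            break
    return arr, comparisons


n = 100
random.seed(12)
data = [random.random() for _ in range(n)]
sorted_data, comparisons = bubble_sort_counting(data)

worst_case = n * (n - 1) // 2  # upper bound: 4950 comparisons for n = 100
print(f"{comparisons} comparisons for {n} elements (worst case {worst_case})")

# A reversed list never exits early, so it reaches the upper bound.
_, reversed_comparisons = bubble_sort_counting(range(n, 0, -1))
print(f"reversed input: {reversed_comparisons} comparisons")
```

Lines outside the loops run once, while the inner-loop lines can run up to n(n-1)/2 times, so even a cheap per-hit cost adds up to most of the total time.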

#### In the following sections, Stacy will show how we can optimise the code now that we know which lines are the most expensive.