## ⚡ Profiler

- Implement element-wise multiplication in two ways
    - In the first, implement it in one line using list comprehension and `zip`
    - In the second, use numpy to perform it in one line as well

**Expected Time To Finish Task:** ≤ 10 Minutes

In [3]:
import numpy as np

def python_multiply(A: list[float], B: list[float]) -> list[float]:
    # TODO [1]: Return the element-wise product of A and B using pure Python
    return [a * b for a, b in zip(A, B)]

def numpy_multiply(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    # TODO [2]: Return the element-wise product of A and B using NumPy
    return np.multiply(A, B)

In [4]:
A = [1, 2, 3, 4]
B = [2, 3, 4, 5]
assert python_multiply(A, B) == [2, 6, 12, 20]
assert (numpy_multiply(np.array(A), np.array(B)) == np.array([2, 6, 12, 20])).all()

Now let's compare the running time of both functions.

In [5]:
import time
import random

def multiply_benchmark(*functions):
    benchmarks = []
    # Loop over each given function and time 100 multiplications for each
    for i, func in enumerate(functions):
        total_time = 0
        for _ in range(100):
            # TODO [3]: Generate two random lists of length 100000
            
            list1 = [random.random() for _ in range(100000)]
            list2 = [random.random() for _ in range(100000)]

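            # Convert to NumPy arrays before starting the timer so conversion cost is excluded.
            # NOTE: this assumes the NumPy function is passed second (index 1).
            if i == 1: list1, list2 = np.array(list1), np.array(list2)
            # Measure the time taken for the operation with a monotonic, high-resolution clock
            start_time = time.perf_counter()
            func(list1, list2)
            end_time = time.perf_counter()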

            # TODO [4]: Add the time taken to the total time
            total_time += (end_time - start_time)

        # TODO [5]: Calculate the average time over the 100 multiplications
        avg_time = total_time / 100
        # append the function name and average time to the list
        benchmarks.append((func.__name__, avg_time))

    return benchmarks


benchmarks = multiply_benchmark(python_multiply, numpy_multiply)
print(benchmarks)

[('python_multiply', 0.01834573030471802), ('numpy_multiply', 0.0005717158317565918)]
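
As a cross-check on the hand-rolled timer, the same comparison can be sketched with the standard library's `timeit` module. This is an alternative under stated assumptions, not the exercise's intended solution: the inputs are generated once instead of per iteration, and the `number` and `repeat` values are arbitrary choices.

```python
import random
import timeit

import numpy as np

def python_multiply(A, B):
    return [a * b for a, b in zip(A, B)]

def numpy_multiply(A, B):
    return np.multiply(A, B)

n = 100_000
list1 = [random.random() for _ in range(n)]
list2 = [random.random() for _ in range(n)]
# Convert once, outside the timed calls
arr1, arr2 = np.array(list1), np.array(list2)

number = 10  # calls per timing run (arbitrary)
# timeit.repeat returns one total per run; the minimum is the least noisy estimate
py_time = min(timeit.repeat(lambda: python_multiply(list1, list2), number=number, repeat=3)) / number
np_time = min(timeit.repeat(lambda: numpy_multiply(arr1, arr2), number=number, repeat=3)) / number

print(f"python: {py_time:.6f} s, numpy: {np_time:.6f} s, speedup: {py_time / np_time:.1f}x")
```

The printed speedup will vary between machines and runs.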


What do you notice from the output above and why?

In [8]:
'''
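NumPy is significantly faster than pure Python: about 32x in the run above
(0.0183 s vs 0.00057 s per call). The exact ratio depends on array size and hardware.

Why:
1. NumPy's element-wise operations run in compiled C code, not in the Python interpreter
2. NumPy uses vectorization: a single call runs the whole loop in C (often with SIMD instructions) instead of a Python-level loop
3. Arrays store values in a contiguous, typed buffer, which is cache-friendly; a list stores pointers to separately allocated float objects
4. There is no per-element interpreter overhead (bytecode dispatch, type checks, object creation)
5. NumPy is designed specifically for numerical computation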
'''
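
The memory-layout point can be made concrete with a small sketch. It assumes CPython, where `sys.getsizeof` reports the size of the list's pointer array and of each boxed float separately; the variable names are illustrative.

```python
import sys

import numpy as np

values = [float(i) for i in range(1000)]
arr = np.array(values)

# A list holds pointers; every float is its own heap-allocated Python object
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# An ndarray holds the raw 8-byte doubles back to back in one buffer
array_bytes = arr.nbytes

print(arr.dtype, arr.flags["C_CONTIGUOUS"])
print(f"list: {list_bytes} bytes, ndarray data: {array_bytes} bytes")
```

The array needs exactly 8 bytes per element, while the list pays for a pointer plus a full float object per element. This is one reason the NumPy loop is friendlier to the CPU cache.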


Would multithreading help speed the pure Python function? Why?

In [9]:
'''
No, multithreading would not speed up the pure Python function. Here's why:

1. Global Interpreter Lock (GIL): Standard CPython has a GIL that prevents multiple threads from executing Python bytecode simultaneously. Only one thread can run in the interpreter at a time.

2. CPU-bound vs I/O-bound: This is a CPU-bound task (pure computation), not I/O-bound (waiting for files, network, etc.). Under the GIL, threads mainly help I/O-bound tasks, where a waiting thread releases the GIL so others can run.

3. To get real parallelism: You would need multiprocessing instead, which uses separate Python processes, each with its own interpreter and GIL. However, this adds overhead for spawning processes and copying data between them.

4. NumPy is different: NumPy releases the GIL during many of its array operations, so NumPy code running in several threads can execute in parallel.
'''
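
The claim can be checked empirically with a rough sketch like the one below. `threaded_multiply` is a hypothetical helper, not part of the exercise; it splits the inputs into chunks and multiplies each chunk in its own thread. On a standard GIL-enabled CPython build the threaded version is not expected to beat the single-threaded one, but exact timings depend on the machine.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

def python_multiply(A, B):
    return [a * b for a, b in zip(A, B)]

def threaded_multiply(A, B, workers=4):
    # Hypothetical helper: split the inputs into roughly equal chunks, one task per chunk
    chunk = max(1, (len(A) + workers - 1) // workers)
    slices = [(A[i:i + chunk], B[i:i + chunk]) for i in range(0, len(A), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves chunk order, so the results concatenate correctly
        parts = pool.map(lambda pair: python_multiply(*pair), slices)
        return [x for part in parts for x in part]

list1 = [random.random() for _ in range(100_000)]
list2 = [random.random() for _ in range(100_000)]

for func in (python_multiply, threaded_multiply):
    start = time.perf_counter()
    func(list1, list2)
    print(f"{func.__name__}: {time.perf_counter() - start:.4f} s")
```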