# Vectorization

Vectorization is a method for accelerating mathematical operations on arrays.

Vectorization converts algorithms that operate on single values into ones that operate on entire arrays (vectors) at once. Instead of iterating over elements with explicit Python loops, you apply functions directly to arrays, and NumPy runs the loop internally in pre-compiled C code, which is far more efficient.
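As a minimal illustration of the idea (the array `values` and the squaring operation are chosen here purely for demonstration), the same element-wise computation can be written with an explicit loop or as a single array expression:

```python
import numpy as np

# Illustrative input; any 1-D float array would do
values = np.array([1.0, 2.0, 3.0, 4.0])

# Loop version: one Python-level operation per element
squared_loop = np.empty_like(values)
for i in range(len(values)):
    squared_loop[i] = values[i] ** 2

# Vectorized version: one expression applied to the whole array
squared_vec = values ** 2

print(np.array_equal(squared_loop, squared_vec))
```

Both versions produce the same array. The difference is where the loop runs: in the Python interpreter for the first, inside NumPy's compiled code for the second.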

*TODO*:
Add "from scratch" implementation of np.vectorize

In [2]:
import numpy as np
import time

In [11]:
np.random.seed(42)

# Create small arrays to demonstrate the difference on small inputs
x_small = np.random.rand(10_000)
y_small = np.random.rand(10_000)

# Create large arrays to demonstrate the difference on large inputs
x_large = np.random.rand(100_000_000)
y_large = np.random.rand(100_000_000)

In [8]:
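def my_dot(x: np.ndarray, y: np.ndarray) -> float:
    """Compute the dot product of x and y with an explicit Python loop."""
    total = 0.0  # named `total` to avoid shadowing the built-in sum()
    for xi, yi in zip(x, y):
        total += xi * yi

    return total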
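Before comparing speed, it is worth checking that the loop-based function and `np.dot` agree. The sketch below repeats a loop-based dot product so that it runs on its own, and uses small illustrative arrays (`a` and `b`, not part of the notebook's data). The two results are compared with `np.isclose` rather than `==`, because the two implementations may add the terms in a different order and so differ by floating-point rounding.

```python
import numpy as np


def my_dot(x: np.ndarray, y: np.ndarray) -> float:
    """Loop-based dot product, repeated here so the snippet is self-contained."""
    total = 0.0
    for xi, yi in zip(x, y):
        total += xi * yi
    return total


# Illustrative test data (assumption: any pair of equal-length 1-D arrays works)
rng = np.random.default_rng(0)
a = rng.random(1_000)
b = rng.random(1_000)

loop_result = my_dot(a, b)
vectorized_result = np.dot(a, b)

# Compare with a tolerance instead of exact equality
results_match = bool(np.isclose(loop_result, vectorized_result))
print(results_match)
```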

## Compare implementations

### Small data

In [12]:
# Measure the time for the vectorized version on small arrays
start_time = time.time()
np.dot(x_small, y_small)
end_time = time.time()
print(f"Vectorized version duration: {1000 * (end_time - start_time):.2f} ms")

# Measure the time for the non-vectorized version on small arrays
start_time = time.time()
my_dot(x_small, y_small)
end_time = time.time()
print(f"Non-Vectorized version duration: {1000 * (end_time - start_time):.2f} ms")

Vectorized version duration: 0.80 ms
Non-Vectorized version duration: 15.90 ms
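A single `time.time()` measurement of a sub-millisecond call can be noisy, since the clock's resolution and one-off overheads may dominate the result. One possible way to get steadier numbers is the standard-library `timeit` module, which runs the call many times. The sketch below is an assumption about how that could look, not the measurement used above. The arrays `x` and `y`, the helper `loop_dot`, and the repeat counts are all illustrative choices.

```python
import timeit

import numpy as np

# Illustrative data of the same size as the small arrays above
rng = np.random.default_rng(42)
x = rng.random(10_000)
y = rng.random(10_000)


def loop_dot(x: np.ndarray, y: np.ndarray) -> float:
    """Loop-based dot product (hypothetical helper for this sketch)."""
    total = 0.0
    for xi, yi in zip(x, y):
        total += xi * yi
    return total


# Each entry is the total time for `number` calls; `repeat` gives 5 such entries
vectorized_times = timeit.repeat(lambda: np.dot(x, y), number=100, repeat=5)
loop_times = timeit.repeat(lambda: loop_dot(x, y), number=10, repeat=5)

# Take the best run and convert to milliseconds per call
vectorized_ms = 1000 * min(vectorized_times) / 100
loop_ms = 1000 * min(loop_times) / 10

print(f"Vectorized version: {vectorized_ms:.4f} ms per call")
print(f"Non-vectorized version: {loop_ms:.4f} ms per call")
```

Taking the minimum over several repeats is a common convention, since slower runs usually reflect interference from other processes rather than the code itself.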


### Large data

In [13]:
# Measure the time for the vectorized version on large arrays
start_time = time.time()
np.dot(x_large, y_large)
end_time = time.time()
print(f"Vectorized version duration (large arrays): {1000 * (end_time - start_time):.2f} ms")

# Measure the time for the non-vectorized version on large arrays
start_time = time.time()
my_dot(x_large, y_large)
end_time = time.time()
print(f"Non-Vectorized version duration (large arrays): {1000 * (end_time - start_time):.2f} ms")


Vectorized version duration (large arrays): 203.87 ms
Non-Vectorized version duration (large arrays): 29846.21 ms
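To put the two experiments side by side, the speedup can be computed from the durations printed above. The snippet only does arithmetic on those reported numbers; the variable names are illustrative.

```python
# Durations in milliseconds, copied from the outputs above
small_vectorized_ms = 0.80
small_loop_ms = 15.90
large_vectorized_ms = 203.87
large_loop_ms = 29846.21

# Speedup = loop time / vectorized time
small_speedup = small_loop_ms / small_vectorized_ms
large_speedup = large_loop_ms / large_vectorized_ms

print(f"Speedup on small arrays: {small_speedup:.1f}x")
print(f"Speedup on large arrays: {large_speedup:.1f}x")
```

In these particular runs the vectorized version is roughly 20 times faster on the small arrays and roughly 146 times faster on the large ones. The exact ratios will vary with hardware and NumPy build.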
