### Experiment

Measuring the efficiency of vectorized NumPy operations compared with standard (non-vectorized) Python functions.

In [12]:
import numpy as np
import random
import time

In [13]:
# Create a regular Python list and a NumPy array holding the same one million random integers
python_list = [random.randint(0, 10000) for _ in range(0, 1000000)]
numpy_array = np.array(python_list)

In [14]:
func = lambda num: num+2 if num%2==0 else num-2
vect = np.vectorize(func)

In [15]:
# Regular python list passed to non-vectorized function
start_time = time.time()
L = list(map(func, python_list))
end_time = time.time()
print(f"{round((end_time - start_time)*1000, 2)} milliseconds")

82.37 milliseconds


In [16]:
# Regular Python list passed to the np.vectorize-wrapped function
start_time = time.time()
vect(python_list)
end_time = time.time()
print(f"{round((end_time - start_time)*1000, 2)} milliseconds")

154.53 milliseconds


In [17]:
# NumPy array passed to the np.vectorize-wrapped function
start_time = time.time()
vect(numpy_array)
end_time = time.time()
print(f"{round((end_time - start_time)*1000, 2)} milliseconds")

123.63 milliseconds


In [28]:
# NumPy array passed to a built-in vectorized operation
start_time = time.time()
L = np.where(numpy_array%2==0, numpy_array+2, numpy_array-2)
end_time = time.time()
print(f"{round((end_time - start_time)*1000, 2)} milliseconds")

6.51 milliseconds


In [29]:
# Measuring runtime using the %timeit magic command
numpy_array2 = np.random.randint(0, 10000, 1000000)
%timeit L = np.where(numpy_array2%2==0, numpy_array2+2, numpy_array2-2)

4.95 ms ± 83.6 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
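
The `%timeit` magic is only available in IPython and Jupyter. Outside a notebook, a comparable measurement can be sketched with the standard-library `timeit` module. The helper name `where_version` and the `number`/`repeat` values below are illustrative choices, not part of the original experiment, and the timings will differ from machine to machine.

```python
import timeit

import numpy as np

# Same kind of input as the cell above: one million random integers.
numpy_array2 = np.random.randint(0, 10000, 1_000_000)


def where_version():
    """Hypothetical wrapper around the expression timed with %timeit."""
    return np.where(numpy_array2 % 2 == 0, numpy_array2 + 2, numpy_array2 - 2)


# Each entry in `runs` is the total time for `number` calls, in seconds.
number = 10
runs = timeit.repeat(where_version, number=number, repeat=5)

# Convert to milliseconds per call and report the best run,
# which is the one least affected by other activity on the machine.
per_call_ms = [run / number * 1000 for run in runs]
print(f"best of {len(per_call_ms)}: {min(per_call_ms):.2f} ms per call")
```

Repeating the measurement smooths out the noise that a single `time.time()` pair is exposed to.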


### Explanation

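In this experiment, `map` over a plain Python list (82.37 ms) was faster than `np.vectorize` on either input (154.53 ms for the list, 123.63 ms for the NumPy array), while the built-in vectorized `np.where` expression was by far the fastest (6.51 ms). Several factors explain why `np.vectorize` does not deliver the speedup its name suggests: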

1. **Overhead of `np.vectorize`**: The `np.vectorize` function is provided primarily for convenience, not for performance. The implementation is essentially a for loop that applies a Python function to each element of an array. It does not provide the same performance benefits as a true vectorized operation in NumPy.

2. **Function Call Overhead**: Each call to the lambda function in `np.vectorize` incurs Python function call overhead, which can be significant when applied to large arrays.

3. **Loss of bulk, compiled execution**: NumPy's speed comes from performing whole-array operations in compiled code. With `np.vectorize` you lose this advantage, because the work reverts to element-by-element calls into Python.

4. **List comprehensions and `map`**: Python's list comprehensions and the `map` function iterate over a list with little per-element overhead. They can be faster than `np.vectorize` because they avoid the overhead associated with NumPy's array handling around each function call.
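
The per-element behaviour described above can be observed directly by counting calls. This is a minimal sketch: `shift_by_parity` is a hypothetical named version of the notebook's lambda, and `otypes` is passed explicitly so that `np.vectorize` does not make an extra call to infer the output type.

```python
import numpy as np

call_count = 0


def shift_by_parity(num):
    """Named stand-in for the notebook's lambda; also counts its own calls."""
    global call_count
    call_count += 1
    return num + 2 if num % 2 == 0 else num - 2


arr = np.arange(1000)

# Declaring the output type up front avoids a type-inference call.
vect = np.vectorize(shift_by_parity, otypes=[np.int64])
result = vect(arr)

# The Python function runs for every element of the array.
print(f"elements: {arr.size}, Python-level calls: {call_count}")
```

The built-in `np.where` expression from the experiment produces the same values without any Python-level call per element, which is where its speed advantage comes from.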

### Conclusion
- **`vect(numpy_array)`**: This uses `np.vectorize` on a NumPy array, which is slow due to the reasons mentioned above.
- **`vect(python_list)`**: This uses `np.vectorize` on a Python list, which is even slower because it involves converting the list to a NumPy array internally.
- **`list(map(func, python_list))`**: This uses the `map` function with a Python list, which is faster here because it applies the function directly and avoids the NumPy overhead described above.

For true performance benefits with NumPy, use built-in vectorized operations (e.g., `numpy.where`, `numpy.add`, etc.) instead of `np.vectorize`. If you need to apply a custom function element-wise and cannot express it with built-in operations, consider a list comprehension or `map`, which outperformed `np.vectorize` in this experiment.
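
Before swapping one approach for another, it is worth confirming that they return the same values. The sketch below does this for the three approaches compared here. It uses a smaller array and a seeded generator so the check runs quickly and repeatably, and `shift_by_parity` is a hypothetical named version of the notebook's lambda.

```python
import numpy as np


def shift_by_parity(num):
    """Add 2 to even numbers, subtract 2 from odd numbers."""
    return num + 2 if num % 2 == 0 else num - 2


# Seeded generator; the upper bound is exclusive, so values span 0..10000.
rng = np.random.default_rng(seed=0)
numpy_array = rng.integers(0, 10001, size=100_000)
python_list = numpy_array.tolist()

# The three approaches from the experiment.
mapped = list(map(shift_by_parity, python_list))
vectorized = np.vectorize(shift_by_parity)(numpy_array)
builtin = np.where(numpy_array % 2 == 0, numpy_array + 2, numpy_array - 2)

# All three should agree element for element.
assert np.array_equal(mapped, vectorized)
assert np.array_equal(vectorized, builtin)
print("all three approaches produce identical results")
```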