# Which Is Faster, Python Lists or NumPy Arrays, and Why?

## Homogeneous Data Types

NumPy arrays are designed to store elements of a single data type, which allows compact memory storage and efficient access patterns. Python lists can hold elements of different data types, so each element's type has to be checked at run time, which adds overhead.

In [6]:
import numpy as np
import time

# Creating a NumPy array with homogeneous data types
np_array = np.ones(1000000)

# Creating a Python list with homogeneous data types
py_list = [1] * 1000000

# Timing the sum over the NumPy array
# (time.perf_counter() is a monotonic, high-resolution clock meant for timing)
start_time = time.perf_counter()
sum_np = np.sum(np_array)
end_time = time.perf_counter()
print("Sum of NumPy array took:", end_time - start_time, "seconds")

# Timing the sum over the Python list
start_time = time.perf_counter()
sum_list = sum(py_list)
end_time = time.perf_counter()
print("Sum of Python list took:", end_time - start_time, "seconds")


Sum of NumPy array took: 0.002003908157348633 seconds
Sum of Python list took: 0.006054878234863281 seconds
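
To make the homogeneity point concrete, here is a minimal sketch (the variable names are illustrative) showing that a list keeps each element's own type, while `np.array` converts the same values to one common dtype with a fixed size per element.

```python
import numpy as np

# A list keeps whatever objects it is given, each with its own type.
mixed_list = [1, 2.5, 3]
list_types = [type(x).__name__ for x in mixed_list]
print(list_types)  # ['int', 'float', 'int']

# np.array converts the same values to a single common dtype.
arr = np.array(mixed_list)
print(arr.dtype)     # float64
print(arr.itemsize)  # 8 (bytes per element)
print(arr.nbytes)    # 24 (3 elements * 8 bytes)
```

Because every element has the same size and type, NumPy can locate element `i` by simple arithmetic and process the whole buffer without inspecting each value's type.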


## Contiguous Memory Layout

NumPy arrays are stored in contiguous blocks of memory, which allows efficient access and manipulation of the data because the processor can take advantage of cache locality. Python lists, by contrast, store references to objects that can be scattered in memory, which leads to less efficient data access patterns.

In [7]:
import numpy as np
import time

# Creating large NumPy arrays
array1 = np.ones(1000000)
array2 = np.ones(1000000)

# Creating large Python lists
list1 = [1] * 1000000
list2 = [1] * 1000000

# Timing NumPy array addition (contiguous memory)
start_time = time.perf_counter()
result_array = array1 + array2
end_time = time.perf_counter()
print("NumPy array addition (contiguous memory) took:", end_time - start_time, "seconds")

# Timing Python list addition (non-contiguous memory)
start_time = time.perf_counter()
result_list = [list1[i] + list2[i] for i in range(1000000)]
end_time = time.perf_counter()
print("Python list addition (non-contiguous memory) took:", end_time - start_time, "seconds")


NumPy array addition (contiguous memory) took: 0.014335155487060547 seconds
Python list addition (non-contiguous memory) took: 0.06327295303344727 seconds
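
The memory layout itself can be inspected directly. The sketch below, using a small illustrative array, reads the `flags` and `strides` attributes to show when an array is contiguous, and uses `np.ascontiguousarray` to pack a strided view back into one block.

```python
import numpy as np

arr = np.arange(10, dtype=np.float64)

# A freshly created array occupies one contiguous block:
# moving to the next element means stepping 8 bytes forward.
print(arr.flags["C_CONTIGUOUS"])  # True
print(arr.strides)                # (8,)

# Slicing with a step returns a view that skips every other element,
# so the view is no longer contiguous.
every_other = arr[::2]
print(every_other.flags["C_CONTIGUOUS"])  # False
print(every_other.strides)                # (16,)

# np.ascontiguousarray copies the data into a new contiguous block.
packed = np.ascontiguousarray(every_other)
print(packed.flags["C_CONTIGUOUS"])  # True

# A list built with [1] * n holds n references to the same int object.
same_refs = [1] * 3
print(all(x is same_refs[0] for x in same_refs))  # True
```

The last check is worth keeping in mind when benchmarking: a list created with `[1] * n` does not actually scatter its elements across memory, so timings on such a list mostly reflect per-element interpreter work rather than cache misses.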


## Vectorized Operations

NumPy provides a wealth of optimized, precompiled functions for array operations, implemented in compiled code (mostly C, with linear algebra delegated to BLAS/LAPACK libraries). These functions run their loops outside the Python interpreter and can use low-level optimizations such as SIMD instructions, which enables faster execution. Python lists have no such built-in vectorized operations, so element-wise work usually requires explicit loops and the per-element overhead that comes with them.

In [8]:
import numpy as np
import time

# Creating large NumPy arrays
array1 = np.ones(1000000)
array2 = np.ones(1000000)

# Creating large Python lists
list1 = [1] * 1000000
list2 = [1] * 1000000

# Timing NumPy vectorized addition
start_time = time.perf_counter()
result_array = array1 + array2
end_time = time.perf_counter()
print("NumPy vectorized addition took:", end_time - start_time, "seconds")

# Timing Python list addition with explicit loop
start_time = time.perf_counter()
result_list = [list1[i] + list2[i] for i in range(1000000)]
end_time = time.perf_counter()
print("Python list addition with loop took:", end_time - start_time, "seconds")


NumPy vectorized addition took: 0.005323171615600586 seconds
Python list addition with loop took: 0.08379411697387695 seconds
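
A single timed run can be noisy. As a sketch of a more repeatable measurement, the snippet below uses the standard library's `timeit.repeat` and reports the best run for each approach. The helper names `add_numpy` and `add_loop` and the smaller size `n` are choices made for this illustration, not part of the cells above.

```python
import timeit

import numpy as np

n = 100_000  # smaller than the cells above so the repeats finish quickly
a = np.ones(n)
b = np.ones(n)
list_a = a.tolist()
list_b = b.tolist()


def add_numpy():
    # One vectorized call; the loop runs in compiled code.
    return a + b


def add_loop():
    # Explicit Python-level loop over pairs of elements.
    return [x + y for x, y in zip(list_a, list_b)]


# Both approaches compute the same values.
assert add_numpy().tolist() == add_loop()

# Run each function 10 times per repeat and keep the fastest repeat,
# which is the measurement least disturbed by other system activity.
numpy_best = min(timeit.repeat(add_numpy, number=10, repeat=5)) / 10
loop_best = min(timeit.repeat(add_loop, number=10, repeat=5)) / 10

print(f"NumPy vectorized addition: {numpy_best:.6f} seconds per call")
print(f"Python loop addition:      {loop_best:.6f} seconds per call")
```

The absolute numbers depend on the machine and the NumPy version, so the ratio between the two is more informative than either value alone.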


## Lower Overhead

NumPy arrays have lower overhead than Python lists because they store element values directly in their memory block, whereas Python lists store pointers to separate Python objects. Following each pointer and then unpacking the object it refers to adds overhead and slows down operations on lists.

In [9]:
import numpy as np
import time

# Creating a large NumPy array
np_array = np.ones(1000000)

# Creating a large Python list
py_list = [1] * 1000000

# Timing element-wise multiplication for NumPy array
start_time = time.perf_counter()
result_np = np_array * 2
end_time = time.perf_counter()
print("NumPy array element-wise multiplication took:", end_time - start_time, "seconds")

# Timing element-wise multiplication for Python list
start_time = time.perf_counter()
result_list = [x * 2 for x in py_list]
end_time = time.perf_counter()
print("Python list element-wise multiplication took:", end_time - start_time, "seconds")


NumPy array element-wise multiplication took: 0.0031881332397460938 seconds
Python list element-wise multiplication took: 0.05414628982543945 seconds
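
The pointer overhead also shows up in memory use. The following sketch compares a rough footprint for a list of distinct floats with the data buffer of the equivalent array. It assumes CPython, where `sys.getsizeof` on a list counts only the list object and its pointer table, so the float objects are added separately.

```python
import sys

import numpy as np

n = 1000
py_list = [float(i) for i in range(n)]  # n distinct float objects
np_array = np.array(py_list)            # the same values as float64

# The list stores one pointer per element...
pointer_bytes = sys.getsizeof(py_list)
# ...and each pointer refers to a full Python float object.
object_bytes = sum(sys.getsizeof(x) for x in py_list)
list_total = pointer_bytes + object_bytes

print("List object and pointers:", pointer_bytes, "bytes")
print("Float objects:", object_bytes, "bytes")
print("List total:", list_total, "bytes")
print("NumPy data buffer:", np_array.nbytes, "bytes")  # 8000 (1000 * 8)
```

The exact list figures vary between Python versions and platforms, but the array's data buffer is always `n * itemsize` bytes, with no per-element object header.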


These examples illustrate why NumPy arrays are generally faster and more efficient than Python lists for numerical and large-scale data operations.