# Why NumPy arrays can be faster

### Data structure

- An **array** is a collection of elements of a single data type (homogeneous), stored in one contiguous block of memory.
- A Python **list**, on the other hand, can hold elements of different types (heterogeneous). The list itself stores only references; the objects they refer to can live anywhere in memory.
- The flexibility of lists comes at a cost: to allow these flexible (dynamic) types, each item is a complete Python object that carries its own type information, reference count, and other metadata.
- In a NumPy array every element has the same (fixed) type, so storing this information per element would be redundant; it is stored once for the whole array.
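The difference between the two containers can be seen directly. The snippet below is a minimal sketch; the variable names (`mixed`, `arr`, `upcast`) are illustrative only.

```python
import numpy as np

# A list happily holds objects of unrelated types.
mixed = [1, 2.5, "three", None]
element_types = [type(x).__name__ for x in mixed]
print(element_types)   # ['int', 'float', 'str', 'NoneType']

# A NumPy array has exactly one dtype, recorded once for the whole array.
arr = np.array([1, 2, 3, 4], dtype=np.int64)
print(arr.dtype)       # int64
print(arr.itemsize)    # 8 (bytes per element)

# Mixing ints and floats does not produce a mixed array:
# NumPy picks a single dtype that can represent every value.
upcast = np.array([1, 2.5, 3])
print(upcast.dtype)    # float64
```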


![memory](https://jakevdp.github.io/PythonDataScienceHandbook/figures/array_vs_list.png)

- For the array, we have a single pointer to one contiguous block of data.
- The Python list is a pointer to a block of pointers, each of which points to a full Python object, so all the extra per-object information has to be stored for every element.
- Fixed-type NumPy-style arrays lack this flexibility, but are much more efficient for storing and manipulating data.
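A rough way to see the storage overhead is to compare byte counts. This is only a sketch: `sys.getsizeof` reports CPython-specific sizes, the exact list figures vary between Python versions, and the array's small fixed header is ignored here.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
arr = np.arange(n, dtype=np.int64)

# The list object itself: essentially the block of pointers.
list_container_bytes = sys.getsizeof(py_list)
# Plus one full Python int object per element.
list_object_bytes = sum(sys.getsizeof(x) for x in py_list)
list_total_bytes = list_container_bytes + list_object_bytes

# The array's data buffer: n elements * 8 bytes each.
array_data_bytes = arr.nbytes

print(f"list:  {list_total_bytes} bytes (pointers + objects)")
print(f"array: {array_data_bytes} bytes of data")
```

On CPython the list total comes out several times larger than the array's data buffer, because every element pays for its own object header.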

### Operations in C

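- A NumPy array's data buffer is "C-compatible": it is laid out like a plain C array, so compiled code can work on it directly.
- Many `numpy` operations are implemented directly in C.
- These operations take less time to execute, e.g. because no dynamic type checks are required for each element.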
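To illustrate what "C-compatible" means in practice, the sketch below inspects the raw layout of a small array. The array shape and the variable names are illustrative; the point is that the buffer is just fixed-size values packed back to back, the same layout a C `int32_t[2][3]` would have.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]], dtype=np.int32)

# Row-major (C order) layout: moving one column is 4 bytes,
# moving one row is 3 * 4 = 12 bytes.
print(arr.flags["C_CONTIGUOUS"])  # True
print(arr.strides)                # (12, 4)

# The data is nothing more than 6 packed 4-byte integers.
raw = arr.tobytes()
print(len(raw))                   # 24

# Reinterpreting the same bytes as int32 recovers the values in order.
flat = np.frombuffer(raw, dtype=np.int32)
print(flat.tolist())              # [1, 2, 3, 4, 5, 6]
```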


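### Vectorisation

- Operations can be _vectorised_: applied to the whole array in a single call instead of element by element in a Python loop. If we want to multiply `[2, 4, 6, 8]` by 2 in pure Python, the interpreter typically has to perform 4 computations, one after another.
- NumPy can vectorise this: conceptually it multiplies `[2, 4, 6, 8]` by `[2, 2, 2, 2]` in one operation. The loop runs in compiled code, which can often process several elements per CPU instruction (SIMD) rather than one at a time.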
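The multiplication example can be written both ways. The following is a minimal sketch; the timing part uses an arbitrary array size of one million elements, and the measured numbers depend on the machine, so no specific speed-up is claimed.

```python
import timeit
import numpy as np

values = [2, 4, 6, 8]

# Pure Python: one multiplication per loop iteration.
looped = [v * 2 for v in values]
print(looped)               # [4, 8, 12, 16]

# NumPy: a single vectorised expression over the whole array.
arr = np.array(values)
vectorised = arr * 2
print(vectorised.tolist())  # [4, 8, 12, 16]

# Multiplying by the scalar 2 gives the same result as
# multiplying element-wise by [2, 2, 2, 2].
same = np.array_equal(arr * 2, arr * np.array([2, 2, 2, 2]))
print(same)                 # True

# Rough timing comparison on larger inputs.
big_list = list(range(1_000_000))
big_arr = np.arange(1_000_000)
loop_time = timeit.timeit(lambda: [v * 2 for v in big_list], number=5)
vec_time = timeit.timeit(lambda: big_arr * 2, number=5)
print(f"Python loop: {loop_time:.3f} s, NumPy: {vec_time:.3f} s")
```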