# Video: Comparing Memory Usage of NumPy Arrays vs Python lists

This video compares the memory usage of NumPy arrays and Python lists, and exactly quantifies the differences in space usage.

* At the beginning of this module, we saw that a 64-bit floating-point number, which should take 8 bytes, actually takes 24 bytes as a Python object.

In [None]:
import sys
sys.getsizeof(3.0)

24

* What if we convert that to NumPy?

In [None]:
import numpy as np
sys.getsizeof(np.asarray(3.0))

104

* Eek!
* NumPy is not the most efficient way to handle individual numbers.
* That is 13x overhead: 104 bytes to hold 8 bytes of data.
* Let's look at a bigger array, since we have been saying that NumPy will be more efficient for lots of numbers.

In [None]:
test_array = np.arange(1_000_000, dtype=np.float64)
sys.getsizeof(test_array)

8000112

* The test array we just created has one million numbers, counting up from zero like the `range` function.
* The `dtype` parameter makes the element type explicit, so we can be certain every value is a 64-bit float.
* Since each of those values takes 8 bytes, the minimum space to store them is 8,000,000 bytes.
* The total size of the array is 8,000,112 bytes.
* So there are only 112 bytes of overhead here.
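A minimal sketch of the same arithmetic, comparing the raw data size (`nbytes`) with what `sys.getsizeof` reports for a few array lengths. The lengths chosen and the `overheads` dictionary are illustrative assumptions, and the exact overhead depends on the NumPy version, so no specific number is asserted here.

```python
import sys
import numpy as np

# Compare the raw data size (nbytes) with the size reported by getsizeof
# for 1-D float64 arrays of increasing length.
overheads = {}  # illustrative name: maps array length to overhead in bytes
for n in (1, 1_000, 1_000_000):
    arr = np.arange(n, dtype=np.float64)
    data_bytes = arr.nbytes            # n values * 8 bytes each
    total_bytes = sys.getsizeof(arr)   # data plus the array object's header
    overheads[n] = total_bytes - data_bytes
    print(f"{n:>9,} values: {data_bytes:>9,} data bytes, {overheads[n]} overhead bytes")
```

The overhead should come out the same for every length, which is why it becomes negligible once the array is large.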

* What about storing the same numbers in a list?


In [None]:
test_list = [float(v) for v in range(1_000_000)]
sys.getsizeof(test_list)

8448728

* The list is less than 6% bigger than the NumPy array, but there is an important caveat: that figure does not include the size of the numbers in the list.
* The documentation for `sys.getsizeof` says:

https://docs.python.org/3/library/sys.html
> Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.

* So the size reported for a Python list does not include the sizes of the objects inside the list.
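A small sketch of this shallow behaviour, using two one-element lists whose contents differ enormously in size. The variable names are made up for illustration.

```python
import sys

small = "x"
large = "x" * 1_000_000   # a string of one million characters

list_with_small = [small]
list_with_large = [large]

# Each list holds exactly one reference, so getsizeof reports the same
# size for both, even though the referenced strings differ enormously.
print(sys.getsizeof(list_with_small), sys.getsizeof(list_with_large))
print(sys.getsizeof(small), sys.getsizeof(large))
```

The list only stores a pointer to each element, so the element sizes have to be added separately.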

In [None]:
sys.getsizeof(test_list) + sum(sys.getsizeof(v) for v in test_list)

32448728

* So storing the same numbers actually takes about 4 times as much space with a Python list compared to a NumPy array.
* The exact ratio depends on how many numbers are involved, and it will be a bit worse with more dimensions, since nested lists add more overhead on the Python side.
* This should be enough for now.
* We will circle back to this in a couple of weeks as a visualization example.
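For anyone who wants to check the ratios, including the multi-dimensional case, here is a rough sketch. The `deep_getsizeof` helper is hypothetical (it is not part of `sys` or NumPy) and only follows nested lists. The 1,000 by 1,000 shape is an arbitrary choice, and the printed ratios will vary with the Python and NumPy versions.

```python
import sys
import numpy as np

def deep_getsizeof(obj, seen=None):
    """Hypothetical helper: size of obj plus everything reachable through nested lists."""
    if seen is None:
        seen = set()
    if id(obj) in seen:          # count each distinct object only once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, list):
        size += sum(deep_getsizeof(item, seen) for item in obj)
    return size

n = 1_000
flat_list = [float(v) for v in range(n * n)]
nested_list = [[float(r * n + c) for c in range(n)] for r in range(n)]

flat_array = np.arange(n * n, dtype=np.float64)
square_array = flat_array.reshape(n, n).copy()   # copy so the 2-D array owns its data

ratio_1d = deep_getsizeof(flat_list) / sys.getsizeof(flat_array)
ratio_2d = deep_getsizeof(nested_list) / sys.getsizeof(square_array)
print(f"1-D ratio: {ratio_1d:.2f}, 2-D ratio: {ratio_2d:.2f}")
```

The `.copy()` matters because a reshaped view does not own its data, and `sys.getsizeof` would then leave the data buffer out of the total.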