# NumPy Arrays vs. Python Lists

## Why Do We Need NumPy?

In data science, we often work with millions of numerical values.
Using Python lists for such data becomes slow and memory-inefficient.
NumPy solves this problem by providing fast, memory-optimized arrays
designed for numerical computation.

## NumPy vs. Python Lists – Performance Test

In [1]:
import numpy as np
import time

size = 100_000_000  # 100 million elements per list
l1 = list(range(size))
l2 = list(range(size))

# Time element-wise addition with a Python-level loop
start = time.perf_counter()
[x + y for x, y in zip(l1, l2)]
end = time.perf_counter()
list_time = end - start

print(f"Python List Speed:  {list_time:.4} seconds")

print("=" * 55)

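# Convert outside the timed region so only the addition is measured
arr1 = np.array(l1)
arr2 = np.array(l2)

# Time the same addition as a single vectorized operation
start = time.perf_counter()
arr1 + arr2
end = time.perf_counter()
numpy_time = end - start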

print(f"Numpy Array Speed: {numpy_time:.4} seconds")

Python List Speed:  8.268 seconds
=======================================================
Numpy Array Speed: 0.2422 seconds


### Observation
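In this run, NumPy added the two 100-million-element arrays about 34 times
faster than the list comprehension (0.2422 s versus 8.268 s); exact timings
vary by machine. The addition is vectorized: it runs as a single compiled
loop over the array data instead of a Python-level loop that handles one
object at a time.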
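
A single `start`/`end` measurement can be noisy. As a rough sketch of a more repeatable comparison, the same addition can be timed with the standard-library `timeit` module, keeping the fastest of several runs. The smaller `size` and the helper names `add_lists` and `add_arrays` are assumptions made for this illustration, not part of the original test.

```python
import timeit

import numpy as np

size = 1_000_000  # assumed smaller size so repeated runs stay quick

l1 = list(range(size))
l2 = list(range(size))
arr1 = np.array(l1)
arr2 = np.array(l2)


def add_lists():
    # Element-by-element addition executed by the Python interpreter
    return [x + y for x, y in zip(l1, l2)]


def add_arrays():
    # One vectorized call; the loop over elements runs in compiled code
    return arr1 + arr2


# Call each function once per run, repeat 5 runs, and keep the fastest
list_best = min(timeit.repeat(add_lists, number=1, repeat=5))
numpy_best = min(timeit.repeat(add_arrays, number=1, repeat=5))

print(f"Python list: {list_best:.4f} seconds")
print(f"NumPy array: {numpy_best:.4f} seconds")
print(f"Speedup:     {list_best / numpy_best:.1f}x")
```

Taking the minimum rather than the mean reduces the influence of unrelated background activity on the result.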

## Memory Efficiency – NumPy vs. Lists

In [2]:
import sys

list_data = list(range(1000))
numpy_data = np.array(list_data)

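print(f"List Data : {sys.getsizeof(list_data)} Bytes\nNumpy Data: {numpy_data.nbytes} Bytes")

List Data : 8056 Bytes
Numpy Data: 8000 Bytes


Note: sys.getsizeof(list_data) reports only the list object itself,
that is, a small header plus one 8-byte pointer per element. It does
not count the 1000 integer objects those pointers refer to, each of
which typically occupies about 28 bytes in 64-bit CPython. The list's
real footprint is therefore several times larger than the 8000 bytes
NumPy needs for the same values.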
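
To make the hidden cost visible, the sketch below adds the size of every referenced integer object to the size of the list container. It is an approximation under stated assumptions: CPython shares small integers between all users, so the per-element sum slightly overstates what this one list costs, and the exact numbers depend on the Python version and platform. The variable names `container_bytes` and `element_bytes` are illustrative.

```python
import sys

import numpy as np

list_data = list(range(1000))
numpy_data = np.array(list_data)

# Size of the list object itself: a header plus one pointer per element
container_bytes = sys.getsizeof(list_data)

# Size of the int objects the pointers refer to (each int is its own object)
element_bytes = sum(sys.getsizeof(x) for x in list_data)

print(f"List container : {container_bytes} Bytes")
print(f"List elements  : {element_bytes} Bytes")
print(f"List total     : {container_bytes + element_bytes} Bytes")

# nbytes is the raw data buffer; getsizeof also includes the array header
print(f"NumPy data     : {numpy_data.nbytes} Bytes")
print(f"NumPy + header : {sys.getsizeof(numpy_data)} Bytes")
```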

## Vectorization – No More Loops!

In [3]:
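l1 = [1, 2, 3, 4, 5]
arr1 = np.array(l1)

# Loop version: cube each element one at a time
cube = [x ** 3 for x in l1]
print("Creating Cube from loops: ", cube)

# Vectorized version: one expression cubes every element
cube = arr1 ** 3
print("Creating Cube from NumPy: ", cube)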

Creating Cube from loops:  [1, 8, 27, 64, 125]
Creating Cube from NumPy:  [  1   8  27  64 125]
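
Vectorization is not limited to a single arithmetic operator. As a small illustrative sketch (the example values and variable names are chosen here, not taken from the original notebook), the same idea covers filtering with a boolean mask and reductions such as a sum of squares:

```python
import numpy as np

values = [1, 2, 3, 4, 5]
arr = np.array(values)

# Loop version: keep the even values, then compute 2 * v + 1 for each
loop_result = [v * 2 + 1 for v in values if v % 2 == 0]

# Vectorized version: a boolean mask selects the even values in one step
mask = arr % 2 == 0
vec_result = arr[mask] * 2 + 1

print(loop_result)  # [5, 9]
print(vec_result)   # [5 9]

# Reductions work the same way: sum of squares without an explicit loop
loop_total = sum(v ** 2 for v in values)
vec_total = np.sum(arr ** 2)

print(loop_total)  # 55
print(vec_total)   # 55
```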


## Summary

NumPy arrays store elements of a single data type in one contiguous
memory block and process them with compiled C loops. Python lists store
references to separately allocated objects, which increases memory usage
and forces element-by-element work in the interpreter.
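
These layout properties can be inspected directly. The following is a minimal sketch that assumes an explicit `int64` dtype so the numbers are the same on every platform; the `mixed` list is a made-up example showing that a list, by contrast, can hold objects of any type.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5], dtype=np.int64)

print(arr.dtype)                  # int64: every element has the same type
print(arr.itemsize)               # 8: bytes per element
print(arr.nbytes)                 # 40: 5 elements * 8 bytes
print(arr.strides)                # (8,): step in bytes to the next element
print(arr.flags["C_CONTIGUOUS"])  # True: stored as one contiguous block

# A list holds references, so its elements may be objects of different types
mixed = [1, 2.5, "three"]
print([type(x).__name__ for x in mixed])  # ['int', 'float', 'str']
```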
