# Python Lists vs NumPy Arrays (Beginner-Friendly)
What’s the difference, when to use each, and how to write faster and more memory‑efficient numeric code.

By the end of this mini‑lesson you will be able to:
- Explain how lists (heterogeneous, flexible) differ from NumPy arrays (homogeneous, vectorized)
- Perform elementwise math using arrays and understand broadcasting
- Compare performance and memory usage between lists and arrays
- Convert between lists and arrays and understand resizing trade‑offs

## 1) Setup
We’ll use NumPy and a few standard modules.

In [None]:
# If running locally and numpy isn't installed, uncomment:
# !pip install numpy

import numpy as np
import sys, time
print("NumPy:", np.__version__)

## 2) Quick Recap: Python Lists
- General‑purpose containers (can mix types)
- Dynamically sized (append/insert/remove are common)
- Great for small collections, text, mixed objects
- Not optimized for large‑scale numeric math

In [None]:
# Heterogeneous list example
lst = [1, 2.5, "three", True]
print("List contents:", lst)

# Repetition and concatenation
print("[1,2,3] * 3 ->", [1,2,3] * 3)      # list repetition (concatenates three copies)
print("[1,2,3] + [4,5] ->", [1,2,3] + [4,5])

# Elementwise math requires loops or comprehensions
x = [1,2,3,4,5]
y = [10,20,30,40,50]
xy_sum = [a + b for a, b in zip(x, y)]
xy_sq  = [a*a for a in x]
print("Sum (lists):", xy_sum)
print("Square (lists):", xy_sq)

## 3) NumPy Arrays: Homogeneous and Fast
- Fixed‑type, homogeneous elements (`dtype`)
- Vectorized operations (avoid Python loops)
- Efficient memory layout and C‑backed operations

In [None]:
arr = np.array([1, 2, 3, 4, 5], dtype=np.int64)
print("arr:", arr)
print("dtype:", arr.dtype, "ndim:", arr.ndim, "shape:", arr.shape)

# Elementwise math is natural
print("arr + 10 ->", arr + 10)
print("arr * 2  ->", arr * 2)

# NOTE: Contrast with lists: '*' means repetition for lists, multiplication for arrays
print("[1,2,3] * 2 (list repetition) ->", [1,2,3] * 2)
print("np.array([1,2,3]) * 2 (elementwise) ->", np.array([1,2,3]) * 2)
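A minimal sketch of what "homogeneous" means in practice, assuming a small `int64` array (the names `ints`, `items`, and `as_float` are illustrative, not from the lesson). Writing a float into an integer array converts the value to the array's dtype, whereas a list simply stores whatever object it is given.

```python
import numpy as np

ints = np.array([1, 2, 3], dtype=np.int64)
ints[0] = 7.9                 # converted to int64 on assignment; the fraction is dropped
print("int array after assignment:", ints, ints.dtype)

items = [1, 2, 3]
items[0] = 7.9                # a list keeps the float object unchanged
print("list after assignment:", items)

# To keep fractional values, convert the whole array to a float dtype first
as_float = ints.astype(np.float64)
as_float[1] = 2.5
print("float copy:", as_float, as_float.dtype)
```

If fractional values matter, it is usually safer to pick a float dtype up front than to rely on conversion at assignment time.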

## 4) Elementwise Operations and Broadcasting
Broadcasting lets arrays with compatible shapes interact without manual loops.
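Rules (brief): shapes are compared dimension by dimension, starting from the last. Two dimensions are compatible when they are equal or when one of them is 1; a size‑1 dimension is "stretched" to match the other.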

In [None]:
A = np.arange(12).reshape(3,4)
b = np.array([10, 20, 30, 40])   # shape (4,)
print("A:\n", A)
print("b:", b)
print("A + b (row-wise broadcast):\n", A + b)

c = np.array([[100],[200],[300]]) # shape (3,1)
print("\nA + c (column-wise broadcast):\n", A + c)
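The sketch below shows the other side of the rules: shapes that do not line up raise a `ValueError`, and adding a size‑1 axis with `np.newaxis` is one way to make them compatible. The names `v`, `col`, and `broadcast_ok` are illustrative.

```python
import numpy as np

A = np.arange(12).reshape(3, 4)      # shape (3, 4)
v = np.array([1, 2, 3])              # shape (3,)

# Trailing dimensions are 4 and 3: not equal and neither is 1, so this fails
try:
    A + v
    broadcast_ok = True
except ValueError as exc:
    broadcast_ok = False
    print("Broadcast failed:", exc)

# Reshape v to (3, 1) so it is stretched across the 4 columns
col = v[:, np.newaxis]
result = A + col
print("col shape:", col.shape)
print("A + col:\n", result)
```

Checking `.shape` on both operands before combining them is a quick way to predict whether broadcasting will apply.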

## 5) Performance: Loops vs Vectorization
Vectorized array math is usually much faster than pure‑Python loops, especially for large data.

In [None]:
N = 500_000
lst1 = list(range(N))
lst2 = list(range(N))

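# time.perf_counter() is a high-resolution clock intended for measuring short durations
start = time.perf_counter()
res_list = [a + b for a, b in zip(lst1, lst2)]
list_time = time.perf_counter() - start

arr1 = np.arange(N)
arr2 = np.arange(N)
start = time.perf_counter()
res_arr = arr1 + arr2
arr_time = time.perf_counter() - start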

print(f"List time: {list_time:.4f} s, Array time: {arr_time:.4f} s, Speedup: {list_time/arr_time:.1f}x")
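A single timed run can be noisy. As a hedged alternative, the sketch below uses the standard `timeit` module to repeat each measurement and keep the best result. The helper names `square_list` and `square_array` and the size `N = 100_000` are assumptions chosen for illustration; actual timings depend on your machine.

```python
import timeit
import numpy as np

N = 100_000
lst = list(range(N))
arr = np.arange(N, dtype=np.int64)   # explicit int64 so the squares cannot overflow

def square_list():
    # Pure-Python loop via a list comprehension
    return [a * a for a in lst]

def square_array():
    # Vectorized: the loop runs inside NumPy's compiled code
    return arr * arr

# Each measurement runs the function 5 times; keep the fastest of 3 measurements
list_best = min(timeit.repeat(square_list, number=5, repeat=3))
arr_best = min(timeit.repeat(square_array, number=5, repeat=3))

print(f"list: {list_best:.4f} s, array: {arr_best:.4f} s (5 runs each, best of 3)")
```

Taking the minimum rather than the mean reduces the influence of background activity on the machine.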

## 6) Memory Usage: Objects vs Contiguous Buffers
Lists hold references to Python objects (each with overhead).
Arrays store raw, contiguous numbers, which is compact.

In [None]:
# Rough memory comparison (values will vary by system)

def total_size_of_int_list(n):
    L = list(range(n))
    # Container size + objects (each int is a Python object)
    return sys.getsizeof(L) + sum(sys.getsizeof(x) for x in L)

for n in (1_000, 100_000):
    list_bytes = total_size_of_int_list(n)
    arr = np.arange(n, dtype=np.int64)
    array_bytes = arr.nbytes  # raw data buffer only
    print(f"n={n:>6} | list approx bytes={list_bytes:,} | array bytes={array_bytes:,} | ratio(list/array)≈{list_bytes/max(array_bytes,1):.1f}")

# Also note: the NumPy array object itself has some overhead beyond .nbytes,
# but for large n, the data buffer dominates.
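Because every element of an array has the same dtype, the size of the data buffer is simply the element count times `itemsize`. The sketch below illustrates this for a few common dtypes; the dictionary name `sizes` is illustrative.

```python
import numpy as np

n = 1_000
sizes = {}
for dt in (np.int8, np.int32, np.int64, np.float64):
    a = np.zeros(n, dtype=dt)
    sizes[a.dtype.name] = a.nbytes   # nbytes == a.size * a.itemsize
    print(f"{a.dtype.name:>8}: itemsize={a.itemsize} bytes, nbytes={a.nbytes:,}")
```

Smaller dtypes save memory but hold a narrower range of values (`int8` covers only -128 to 127), so choose the smallest dtype that safely fits your data.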

## 7) Resizing and Methods: append/insert
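- Lists: cheap appends (amortized constant time); inserts and removals work at any position but shift the later elements
- Arrays: size is fixed; operations like `np.append` return a new array (copy)
- For repeated growth, prefer pre‑allocating the array, or accumulating values in a list and converting once at the end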

In [None]:
# List grows in-place
L = []
for i in range(5):
    L.append(i)
print("List after appends:", L)

# NumPy append creates a new array each time (can be costly in loops)
A = np.array([], dtype=np.int64)
for i in range(5):
    A = np.append(A, i)  # new array returned
print("Array after np.append in loop:", A)

# Preferred: preallocate when possible
B = np.empty(5, dtype=np.int64)  # contents are uninitialized until every slot is written
for i in range(5):
    B[i] = i
print("Preallocated array:", B)
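When the final size is not known in advance, pre‑allocation is not possible. A minimal sketch of the list‑accumulation approach mentioned above follows, along with `np.concatenate` for joining several arrays in one step. The names `values`, `chunks`, and `joined` are illustrative.

```python
import numpy as np

# Accumulate in a list (cheap appends), then convert once
values = []
for i in range(5):
    values.append(i * i)
arr = np.array(values, dtype=np.int64)
print("From accumulated list:", arr)

# Joining several arrays: collect them, then concatenate once
chunks = [np.arange(3), np.arange(3, 6)]
joined = np.concatenate(chunks)
print("Concatenated once:", joined)
```

Both patterns copy the data a single time at the end instead of once per loop iteration.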

## 8) Conversions and Dtypes
- List to array: `np.array(my_list, dtype=...)` (choose dtype for speed/memory)
- Array to list: `arr.tolist()` (converts numbers back to Python objects)
- Mixing types in arrays leads to upcasting (e.g., ints + floats -> float)

In [None]:
L = [1, 2, 3]
arr_i32 = np.array(L, dtype=np.int32)
arr_f32 = np.array(L, dtype=np.float32)
print("arr_i32:", arr_i32, arr_i32.dtype)
print("arr_f32:", arr_f32, arr_f32.dtype)

mix = np.array([1, 2.5])
print("Mixed -> upcast to float:", mix, mix.dtype)

back_to_list = arr_i32.tolist()
print("Back to list:", back_to_list, type(back_to_list[0]))
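Two conversion details that often surprise beginners are sketched below: `astype` from float to int drops the fractional part rather than rounding, and mixing numbers with strings produces a string array. The variable names are illustrative.

```python
import numpy as np

f = np.array([1.9, -2.7, 3.0])
i = f.astype(np.int32)            # fractional part is dropped (truncation toward zero)
print("float -> int32:", i)

rounded = np.rint(f).astype(np.int32)   # round to nearest first, then convert
print("rounded -> int32:", rounded)

mixed = np.array([1, "two"])      # numbers and strings together become strings
print("Mixed with a string:", mixed, "kind:", mixed.dtype.kind)

nested = np.arange(4).reshape(2, 2).tolist()   # 2-D arrays become nested lists
print("2-D tolist():", nested)
```

A string dtype disables numeric operations on the array, so it is worth checking `arr.dtype` right after converting data of uncertain origin.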

## 9) When to Use What
- Use lists for small, mixed‑type, or non‑numeric data and when the collection grows or shrinks often.
- Use NumPy arrays for large numeric datasets, vectorized math, and memory‑efficient storage.

## 10) Practice Exercises
1) Create two size‑10 lists and two arrays with values 0..9. Add them elementwise (list via comprehension, array via +). Compare results.
2) Show how `* 3` behaves for a list vs an array. Explain the difference.
3) Use broadcasting to add a vector of shape (4,) to a matrix of shape (3,4).
4) Measure time to compute squares of the first 300k integers using a list comprehension vs `np.arange` with vectorized `** 2`.
5) Estimate memory for a list of 50k ints vs an `int32` array of 50k elements.

In [None]:
# Your workspace for the exercises
# 1) Your code here

# 2) Your code here

# 3) Your code here

# 4) Your code here

# 5) Your code here