# 📌 Broadcasting in NumPy

NumPy provides **vectorization** and **broadcasting** to make numerical operations faster and more memory-efficient. Both techniques help you avoid slow Python loops.

---

## 1. ❌ Why Are Loops Slow in Python?

- **Python’s interpreter**: Each loop iteration is executed as interpreted bytecode, which is slower than compiled code.
- **High overhead**: Every iteration pays for dynamic type checks, creation of a Python object for each element, and index management.

👉 Although loops are convenient, they do not take advantage of NumPy’s optimized C-based backend.

### Example: Looping Over Arrays in Python
```python
import numpy as np
 
arr = np.array([1, 2, 3, 4, 5])
result = []
 
# Using a loop to square each element (slow)
for num in arr:
    result.append(num ** 2)
 
print(result)
# Output: [1, 4, 9, 16, 25]
# (NumPy 2.x prints each item as np.int64(...), because the list holds NumPy scalars)
```

---

## 2. ⚡ Vectorization: Fixing the Loop Problem

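Vectorization applies an operation to the **entire array at once** instead of iterating element by element in Python.
This is faster because:

* The element-wise loop runs inside NumPy’s **compiled C code** rather than in the interpreter.
* Those compiled loops can use **SIMD (Single Instruction, Multiple Data)** instructions where the hardware and the operation allow it → several elements processed per instruction.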

### Example: Vectorized Operation

```python
arr = np.array([1, 2, 3, 4, 5])
result = arr ** 2  # Vectorized operation
print(result)  
# Output: [ 1  4  9 16 25]
```

✅ Faster and more readable.
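
To make the speed claim concrete, here is a minimal timing sketch. The helper names `square_with_loop` and `square_vectorized`, the array size, and the repeat count are illustrative choices, not part of the original example. Absolute timings depend on your machine, so no expected numbers are shown.

```python
import timeit

import numpy as np

# Hypothetical benchmark array; the size is an arbitrary choice.
arr = np.arange(10_000, dtype=np.int64)


def square_with_loop(values):
    """Square each element with an explicit Python-level loop."""
    return [v ** 2 for v in values]


def square_vectorized(values):
    """Square the whole array in a single NumPy call."""
    return values ** 2


# Both approaches should produce the same numbers.
loop_result = square_with_loop(arr)
vec_result = square_vectorized(arr)
same = np.array_equal(np.array(loop_result), vec_result)

# Time each approach over 20 runs; results vary by machine.
loop_time = timeit.timeit(lambda: square_with_loop(arr), number=20)
vec_time = timeit.timeit(lambda: square_vectorized(arr), number=20)

print(f"Results match: {same}")
print(f"Loop:       {loop_time:.4f} s")
print(f"Vectorized: {vec_time:.4f} s")
```

On typical hardware the vectorized version should be noticeably faster, and the gap usually widens as the array grows.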

---

## 3. 📡 Broadcasting: Scaling Arrays Without Extra Memory

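Broadcasting lets you perform element-wise operations between arrays of **different shapes** without first building expanded copies of the smaller one.
NumPy conceptually "stretches" the smaller array across the larger one, reusing its existing data instead of duplicating it in memory.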
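
One way to see that the "stretching" does not duplicate data is to inspect a broadcast view directly. This is a small sketch using `np.broadcast_to`; the target shape `(4, 3)` is an arbitrary choice for illustration.

```python
import numpy as np

row = np.array([1, 2, 3])

# np.broadcast_to returns a read-only view with the requested shape.
stretched = np.broadcast_to(row, (4, 3))

print(stretched.shape)                   # (4, 3)
print(stretched.strides[0])              # 0: every "row" points at the same memory
print(np.shares_memory(row, stretched))  # True
```

A stride of 0 along the first axis means NumPy re-reads the same three values for each row. Note that the *result* of an arithmetic operation such as `arr + row` is still a new, full-size array; only the stretched operand avoids the copy.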

---

### Example: Broadcasting with Scalar

```python
arr = np.array([1, 2, 3, 4, 5])
result = arr + 10   # Broadcasting scalar
print(result)  
# Output: [11 12 13 14 15]
```

---

### Example: Broadcasting with Two Same-Shape Arrays

```python
arr1 = np.array([1, 2, 3])
arr2 = np.array([10, 20, 30])

# Shapes already match, so this is plain element-wise addition (no stretching needed)
result = arr1 + arr2
print(result)  
# Output: [11 22 33]
```

---

### Example: Broadcasting a 2D Array and a 1D Array

```python
arr1 = np.array([[1, 2, 3], [4, 5, 6]])
arr2 = np.array([1, 2, 3])
 
result = arr1 + arr2
print(result)
# Output:
# [[2 4 6]
#  [5 7 9]]
```
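
The example above adds one value per column. To add one value per *row* instead, the 1D array needs an extra trailing axis so that it becomes a column. The sketch below uses hypothetical names (`matrix`, `row_offsets`) that are not part of the original example.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
row_offsets = np.array([10, 20])           # shape (2,), one value per row

# Shape (2,) does not line up with (2, 3), so add a trailing axis -> (2, 1).
column = row_offsets[:, np.newaxis]

result = matrix + column
print(result)
# Output:
# [[11 12 13]
#  [24 25 26]]
```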

---

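## 4. 🔎 How Does Broadcasting Work?

1. NumPy compares shapes **from the trailing (rightmost) dimension backward**; a shape with fewer dimensions is treated as if it were padded with leading 1s.
2. Each pair of dimensions must be **compatible**:

   * The sizes must be **equal** OR one of them must be **1**.
3. If every pair is compatible → NumPy **stretches** each size-1 dimension to match the other array, without copying data.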
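
The compatibility rule can be checked without doing any arithmetic. The sketch below assumes NumPy 1.20 or newer, where `np.broadcast_shapes` is available, and then shows what happens when two shapes are not compatible.

```python
import numpy as np

# np.broadcast_shapes reports the result shape without allocating arrays.
print(np.broadcast_shapes((2, 3), (3,)))    # (2, 3)
print(np.broadcast_shapes((4, 1), (1, 5)))  # (4, 5)

# Trailing sizes 3 and 2 differ and neither is 1, so this pair is incompatible.
a = np.ones((2, 3))
b = np.ones(2)
try:
    a + b
    compatible = True
except ValueError as err:
    compatible = False
    print("Not broadcastable:", err)
```

When you hit this error, reshaping one operand (for example with `np.newaxis` or `reshape`) so that the mismatched dimension becomes 1 is usually the fix.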

---

## 5. 🧑‍💻 Hands-on: Real-World Example (Machine Learning)

👉 Normalizing a dataset (common preprocessing step).

```python
# Simulating a dataset (5 samples, 3 features)
data = np.array([[10, 20, 30],
                 [15, 25, 35],
                 [20, 30, 40],
                 [25, 35, 45],
                 [30, 40, 50]])
 
# Mean & standard deviation (per column)
mean = data.mean(axis=0)
std = data.std(axis=0)
 
# Normalize data using broadcasting
normalized_data = (data - mean) / std
 
print(normalized_data)
```

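Here, `mean` and `std` each have shape `(3,)`, so broadcasting subtracts `mean` and divides by `std` **column-wise** across all 5 rows of `data` automatically. The result is a standardized (z-score) dataset in which each column has mean 0 and standard deviation 1.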
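
As a sanity check, the sketch below confirms that each column ends up with mean 0 and standard deviation 1. It also shows a row-wise variant, which is an addition for illustration and not part of the original example: there, `keepdims=True` keeps the statistics in shape `(5, 1)` so they broadcast across the columns.

```python
import numpy as np

data = np.array([[10, 20, 30],
                 [15, 25, 35],
                 [20, 30, 40],
                 [25, 35, 45],
                 [30, 40, 50]])

# Column-wise: statistics have shape (3,), which lines up with the last axis of (5, 3).
col_scaled = (data - data.mean(axis=0)) / data.std(axis=0)
print(np.allclose(col_scaled.mean(axis=0), 0.0))  # True
print(np.allclose(col_scaled.std(axis=0), 1.0))   # True

# Row-wise: keepdims=True keeps shape (5, 1) so it broadcasts across the 3 columns.
row_mean = data.mean(axis=1, keepdims=True)
row_std = data.std(axis=1, keepdims=True)
row_scaled = (data - row_mean) / row_std
print(row_scaled.shape)  # (5, 3)
```

Without `keepdims=True`, the row statistics would have shape `(5,)`, which does not broadcast against `(5, 3)`.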

---

## ✅ Summary

* **Loops are slow** → Python interpreter overhead on every iteration.
* **Vectorization** → Applies operations to entire arrays at once using NumPy’s fast C backend.
* **Broadcasting** → Operates on arrays of different but compatible shapes without copying the smaller array.
* **Real-world use** → Normalizing datasets, scaling features, etc.

---

In [1]:

```python
import numpy as np
```

In [2]:

```python
arr = np.array([1, 2, 3, 4, 5])
result = arr ** 2  # Vectorized operation
print(result)
```

```
[ 1  4  9 16 25]
```

In [3]:

```python
result + 10
```

```
array([11, 14, 19, 26, 35])
```

In [4]:

```python
arr1 = np.array([[1, 2, 3], [4, 5, 6]])
arr2 = np.array([1, 2, 3])
result = arr1 + arr2  # Broadcasting arr2 across arr1
print(result)
```

```
[[2 4 6]
 [5 7 9]]
```

In [5]:

```python
# Simulating a dataset (5 samples, 3 features)
data = np.array([[10, 20, 30],
                 [15, 25, 35],
                 [20, 30, 40],
                 [25, 35, 45],
                 [30, 40, 50]])
# Calculating mean and standard deviation for each feature (column)
mean = data.mean(axis=0)
std = data.std(axis=0)
# Normalizing the data using broadcasting
normalized_data = (data - mean) / std
print(normalized_data)
```

```
[[-1.41421356 -1.41421356 -1.41421356]
 [-0.70710678 -0.70710678 -0.70710678]
 [ 0.          0.          0.        ]
 [ 0.70710678  0.70710678  0.70710678]
 [ 1.41421356  1.41421356  1.41421356]]
```