# Broadcasting in NumPy

## 1. Why Loops Are Slow

In Python, loops over numerical data are typically slow because:
- **Interpreter Overhead:** Each iteration is executed by the Python interpreter, which processes the loop body one step at a time.
- **Per-Element Overhead:** Every element is handled as a separate Python object, so each iteration pays for type checks, index management, and function calls.

While Python loops are easy to understand, they are inefficient for numerical computation. NumPy helps overcome this limitation.

### Example: Looping Over Arrays in Python
```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
result = []

# Using a loop to square each element (slow)
for num in arr:
    # int() converts the NumPy scalar to a plain Python int so the list prints cleanly
    result.append(int(num) ** 2)

print(result)  # Output: [1, 4, 9, 16, 25]
```
This works, but it is **slow**, especially with large datasets.

---

## 2. Vectorization: Fixing the Loop Problem

**Vectorization** allows you to perform operations on entire arrays at once without explicit loops.  
NumPy’s backend (written in C) runs the loop in compiled code, which is often one to two orders of magnitude faster than the equivalent Python loop on large arrays.

### Example: Vectorized Operation
```python
arr = np.array([1, 2, 3, 4, 5])
result = arr ** 2  # Vectorized operation
print(result)  # Output: [ 1  4  9 16 25]
```

### Why is it Faster?
- **Low-level Implementation:** NumPy’s vectorized operations run in compiled C code over contiguous, fixed-type buffers.  
- **Batch Processing:** Because the data is stored contiguously, many of these inner loops can use **SIMD** (Single Instruction, Multiple Data) instructions where the CPU supports them.
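
The difference described above can be measured directly. The following is a minimal sketch using the standard-library `timeit` module; the helper names `square_with_loop` and `square_vectorized` are illustrative, and the actual timings depend on the machine and NumPy build, so no specific numbers are claimed here.

```python
import timeit

import numpy as np

# int64 is requested explicitly so the squares cannot overflow on any platform
values = np.arange(100_000, dtype=np.int64)


def square_with_loop(arr):
    """Square each element one at a time in the Python interpreter."""
    return [int(x) ** 2 for x in arr]


def square_vectorized(arr):
    """Square every element with a single call into NumPy's compiled code."""
    return arr ** 2


# Run each version 10 times and report the total elapsed time
loop_time = timeit.timeit(lambda: square_with_loop(values), number=10)
vec_time = timeit.timeit(lambda: square_vectorized(values), number=10)

print(f"loop:       {loop_time:.4f} s")
print(f"vectorized: {vec_time:.4f} s")
```

Both functions produce the same values; only the place where the loop runs (the interpreter versus compiled code) changes.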

---

## 3. Broadcasting: Scaling Arrays Without Extra Memory

**Broadcasting** enables operations between arrays of different shapes without creating redundant copies of the smaller array.  
It virtually "stretches" the smaller array to match the larger one, efficiently and automatically.

### Example: Broadcasting with Scalar
```python
arr = np.array([1, 2, 3, 4, 5])
result = arr + 10  # Broadcasting scalar 10 across all elements
print(result)  # Output: [11 12 13 14 15]
```
Here, the scalar `10` is **broadcast** across the entire array.
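
One way to see that the stretching is virtual is to build the stretched view explicitly with `np.broadcast_to`. This is only an illustration of the idea: `arr + 10` does not need this call, and the variable names `scalar` and `stretched` are chosen here for the example.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
scalar = np.array(10)  # a zero-dimensional array holding a single value

# broadcast_to returns a read-only view with the requested shape
stretched = np.broadcast_to(scalar, arr.shape)

print(stretched)                             # [10 10 10 10 10]
print(stretched.strides)                     # (0,)
print(np.shares_memory(scalar, stretched))   # True
```

A stride of `0` means every position in the view reads the same memory location, so the five "copies" of `10` occupy the space of one.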

---

## 4. Broadcasting with Arrays of Different Shapes

Broadcasting shines when working with arrays of different dimensions.  
NumPy automatically aligns arrays to perform element-wise operations without copying data.

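### Example: Two Arrays of the Same Shape
```python
arr1 = np.array([1, 2, 3])
arr2 = np.array([10, 20, 30])
result = arr1 + arr2
print(result)  # Output: [11 22 33]
```

Both arrays have shape `(3,)`, so the addition is already element-wise and nothing needs to be stretched. Broadcasting only comes into play when the shapes differ, as in the next example.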

### Example: Broadcasting a 2D Array and a 1D Array
```python
arr1 = np.array([[1, 2, 3],
                 [4, 5, 6]])
arr2 = np.array([1, 2, 3])

result = arr1 + arr2
print(result)
# Output:
# [[2 4 6]
#  [5 7 9]]
```

In this example, `arr2` is broadcast **across each row** of `arr1`.
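
Broadcasting down the columns instead takes one extra step. The sketch below assumes a hypothetical `row_offsets` array with one value per row; because its shape `(2,)` does not line up with `(2, 3)`, it is given a length-1 axis with `np.newaxis` first.

```python
import numpy as np

arr1 = np.array([[1, 2, 3],
                 [4, 5, 6]])        # shape (2, 3)
row_offsets = np.array([10, 20])    # shape (2,), one value per row

# Shapes (2, 3) and (2,) are incompatible: the trailing sizes are 3 and 2
try:
    arr1 + row_offsets
except ValueError as exc:
    print("incompatible:", exc)

# A length-1 axis gives shape (2, 1), which stretches across the columns
result = arr1 + row_offsets[:, np.newaxis]
print(result)
# [[11 12 13]
#  [24 25 26]]
```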

---

### How Broadcasting Works

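1. **Dimension Compatibility:** Shapes are compared from the trailing (rightmost) dimension backwards. Two dimensions are compatible when they are equal or when one of them is `1`; a missing leading dimension is treated as `1`.
2. **Stretching Arrays:** NumPy virtually stretches each size-`1` dimension to match the other array, **without copying data**.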
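
The two rules above can be written out as a small shape calculator. This is a sketch for illustration, not NumPy's own implementation: `broadcast_shape` is a hypothetical helper, and `np.broadcast_shapes` (available in NumPy 1.20 and later) is the library function that does the same job.

```python
from itertools import zip_longest

import numpy as np


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    result = []
    # Walk both shapes from the last dimension backwards; a missing
    # leading dimension is treated as size 1.
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for dim_a, dim_b in pairs:
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(reversed(result))


print(broadcast_shape((2, 3), (3,)))        # (2, 3)
print(broadcast_shape((5, 1), (1, 4)))      # (5, 4)
print(np.broadcast_shapes((5, 1), (1, 4)))  # (5, 4)
```

Calling `broadcast_shape((2, 3), (2,))` raises `ValueError`, because the trailing sizes `3` and `2` differ and neither is `1`.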

---

## 5. Hands-on: Applying Broadcasting to Real-World Scenarios

Let’s use broadcasting in a real-world **machine learning** example: normalizing data.

### Example: Normalizing Data Using Broadcasting
```python
# Simulating a dataset (5 samples, 3 features)
data = np.array([[10, 20, 30],
                 [15, 25, 35],
                 [20, 30, 40],
                 [25, 35, 45],
                 [30, 40, 50]])

# Calculating mean and standard deviation for each feature (column)
mean = data.mean(axis=0)
std = data.std(axis=0)

# Normalizing the data using broadcasting
normalized_data = (data - mean) / std
print(normalized_data)
```
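Here, `mean` and `std` each have shape `(3,)`, so broadcasting applies the subtraction and division to every row of the `(5, 3)` array, column by column, with no loops required. Note that `std()` computes the population standard deviation (`ddof=0`) by default.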
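
A quick check of the result, plus two variations that often come up in practice, might look like the sketch below. The zero-variance guard (`safe_std`) and the row-wise variant are assumptions added for illustration and are not part of the example above.

```python
import numpy as np

data = np.array([[10, 20, 30],
                 [15, 25, 35],
                 [20, 30, 40],
                 [25, 35, 45],
                 [30, 40, 50]], dtype=float)

mean = data.mean(axis=0)   # shape (3,)
std = data.std(axis=0)     # shape (3,), population std (ddof=0)

# Assumed policy: a constant column has std 0, so divide by 1 instead
safe_std = np.where(std == 0, 1.0, std)
normalized = (data - mean) / safe_std

# After normalization each column has mean 0 and standard deviation 1
print(np.allclose(normalized.mean(axis=0), 0.0))  # True
print(np.allclose(normalized.std(axis=0), 1.0))   # True

# Row-wise normalization needs shape (5, 1), so keep the reduced axis
row_mean = data.mean(axis=1, keepdims=True)
row_std = data.std(axis=1, keepdims=True)
row_normalized = (data - row_mean) / row_std
print(row_normalized.shape)                       # (5, 3)
```

Without `keepdims=True`, the row statistics would have shape `(5,)`, which cannot be broadcast against `(5, 3)`.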

---

## ✅ Summary

- **Loops are slow** due to Python’s interpreter and per-iteration overhead.  
- **Vectorization** performs operations on entire arrays at once, making code cleaner and faster.  
- **Broadcasting** allows operations between arrays of different shapes by stretching smaller arrays efficiently.  
- **Practical Use Case:** Broadcasting is widely used in **data preprocessing**, such as feature normalization in machine learning.


In [3]:
import numpy as np

### 2. Vectorization: Fixing the Loop Problem

In [4]:
arr = np.array([1, 2, 3, 4, 5])
result = arr ** 2                               # Vectorized operation
result                                          # Output: [1 4 9 16 25]

array([ 1,  4,  9, 16, 25])

### 3. Broadcasting: Scaling Arrays Without Extra Memory

In [5]:
arr1 = np.array([1, 2, 3, 4, 5])
result1 = arr1 + 10                                 # Broadcasting: 10 is added to all elements
result1                                             # Output: [11 12 13 14 15]

array([11, 12, 13, 14, 15])

### 4. Broadcasting with Arrays of Different Shapes

In [8]:
arr3 = np.array([[1, 2, 3], [4, 5, 6]])
arr4 = np.array([1, 2, 3])

result = arr3 + arr4                                        # Broadcasting arr4 across arr3
result                                                      # arr4 is added to both rows

array([[2, 4, 6],
       [5, 7, 9]])

### 5. Hands-on: Applying Broadcasting to Real-World Scenarios

In [10]:
data = np.array([[10, 20, 30],                                  # Simulating a dataset (5 rows, 3 columns)
                 [15, 25, 35],
                 [20, 30, 40],
                 [25, 35, 45],
                 [30, 40, 50]])


mean = data.mean(axis=0)                                      # Calculating mean and standard deviation for each feature (column)
std = data.std(axis=0)


normalized_data = (data - mean) / std                        # Normalizing the data using broadcasting
normalized_data

array([[-1.41421356, -1.41421356, -1.41421356],
       [-0.70710678, -0.70710678, -0.70710678],
       [ 0.        ,  0.        ,  0.        ],
       [ 0.70710678,  0.70710678,  0.70710678],
       [ 1.41421356,  1.41421356,  1.41421356]])