# 🚀 Section 3: Broadcasting at Scale and Advanced Indexing Tricks

NumPy’s broadcasting is one of its most powerful features: it lets you perform operations on arrays of different shapes *without* explicit loops or copies.

In this section, we’ll go deep into:
- The rules that govern broadcasting
- Expanding dimensions with `np.newaxis` and `reshape`
- Using advanced indexing (`np.ix_`, boolean logic, fancy indexing)
- Writing expressive and efficient vectorized code

## ⚙️ 1. Broadcasting Fundamentals

When NumPy performs arithmetic between arrays of different shapes, it applies **broadcasting rules** to align dimensions automatically.

### Broadcasting Rules:
1. NumPy compares shapes **from right to left**; a missing leading dimension is treated as size 1.
2. Dimensions are compatible if they are **equal** or one of them is **1**.
3. If all dimensions are compatible, NumPy ‘stretches’ dimensions of size 1 to match the other.
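The rules above can be checked on shapes alone, without building any arrays. The sketch below assumes NumPy 1.20 or newer, where `np.broadcast_shapes` is available; the variable names are illustrative only.

```python
import numpy as np

# np.broadcast_shapes applies the broadcasting rules to shapes only.
s1 = np.broadcast_shapes((3,), (3, 1))       # (3,) is treated as (1, 3)
s2 = np.broadcast_shapes((4, 3), (3,))       # trailing 3 matches 3
s3 = np.broadcast_shapes((5, 1, 4), (3, 1))  # each size-1 axis is stretched
print(s1, s2, s3)

# (4, 3) and (4,) are incompatible: the trailing dimensions are 3 and 4.
incompatible = False
try:
    np.broadcast_shapes((4, 3), (4,))
except ValueError as exc:
    incompatible = True
    print("Incompatible shapes:", exc)
```

Reading the shapes right to left, as rule 1 describes, is usually the quickest way to predict which of these cases you are in.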

In [ ]:
import numpy as np

a = np.array([1, 2, 3])           # Shape: (3,)
b = np.array([[10], [20], [30]])  # Shape: (3,1)

print("a shape:", a.shape)
print("b shape:", b.shape)

# Broadcast addition
c = a + b
print("\nBroadcasted result (3x3):\n", c)

Here, NumPy treats `a` as shape `(1,3)` and repeats it down the rows, while `b`, with shape `(3,1)`, is repeated across the columns, so both match a `(3,3)` grid automatically.

No loops. No expanded copies of the inputs. Just broadcasting.

## 📐 2. Visualizing Broadcasting with Shape Alignment

Let’s look at a practical example using data normalization, a common preprocessing step in data science.

In [ ]:
# Example: Normalize columns of a dataset
data = np.random.randint(0, 100, size=(4, 3))
print("Data:\n", data)

# Compute column means and std
col_mean = data.mean(axis=0)   # Shape: (3,)
col_std = data.std(axis=0)     # Shape: (3,)

# Broadcasting automatically applies across rows
normalized = (data - col_mean) / col_std
print("\nNormalized data:\n", normalized)

Here, `(4,3)` minus `(3,)` works because NumPy broadcasts the 1D mean vector to match `(4,3)` automatically.

This same concept powers **row-wise or column-wise transformations** without manual iteration.
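For the row-wise case, the per-row statistic has to keep a trailing axis of size 1, otherwise its shape would not line up with the data. A minimal sketch, using a small hand-written array (not the random data above) so the result is easy to check:

```python
import numpy as np

data = np.array([[1.0, 2.0, 3.0],
                 [10.0, 20.0, 60.0]])   # Shape: (2, 3)

# keepdims=True keeps the reduced axis as size 1, so the mean has shape (2, 1)
row_mean = data.mean(axis=1, keepdims=True)

# (2, 3) minus (2, 1) broadcasts across the columns of each row
row_centered = data - row_mean

print("row_mean shape:", row_mean.shape)
print("Row-centered data:\n", row_centered)
```

Without `keepdims=True` the mean would have shape `(2,)`, which does not broadcast against `(2, 3)` because the trailing dimensions are 2 and 3.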

## 🧩 3. Expanding Dimensions Explicitly with `np.newaxis` and `reshape`

When shapes are *almost* compatible but not quite, we can insert singleton dimensions manually using `np.newaxis` or `.reshape()`.

Each added axis has size 1, allowing broadcasting along it.

In [ ]:
x = np.arange(3)        # Shape: (3,)
y = np.arange(4)        # Shape: (4,)

# Expand x to shape (3,1)
x2 = x[:, np.newaxis]

# Now x2 (3,1) and y (4,) → broadcast to (3,4)
grid = x2 + y
print("x2 shape:", x2.shape)
print("y shape:", y.shape)
print("\nResulting grid:\n", grid)

### Quick pattern to remember:
- Add `np.newaxis` (or `.reshape()`) where dimensions differ.
- The side you expand determines how NumPy tiles your data.
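As a rough illustration of both points, the sketch below shows three interchangeable ways to add the axis, and how expanding `x` versus `y` changes the orientation of the result. The names `col_a`, `grid_xy`, and so on are made up for this example.

```python
import numpy as np

x = np.arange(3)   # Shape: (3,)
y = np.arange(4)   # Shape: (4,)

# Three equivalent ways to turn x into a (3, 1) column
col_a = x[:, np.newaxis]
col_b = x.reshape(-1, 1)
col_c = x[:, None]            # np.newaxis is an alias for None

# Expanding x gives a (3, 4) grid; expanding y instead gives (4, 3)
grid_xy = x[:, np.newaxis] + y
grid_yx = x + y[:, np.newaxis]

print("grid_xy shape:", grid_xy.shape)
print("grid_yx shape:", grid_yx.shape)
```

The two grids hold the same sums, just transposed, so the choice of which side to expand is really a choice of output layout.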

## ⚡ 4. Advanced Indexing Tricks: `np.ix_` and Boolean Masks

`np.ix_` allows you to select **combinations** of rows and columns without loops. It creates index grids that broadcast together perfectly.

In [ ]:
# Example: Select specific rows and columns
arr = np.arange(16).reshape(4, 4)
rows = np.array([0, 2])
cols = np.array([1, 3])

sub = arr[np.ix_(rows, cols)]
print("Original array:\n", arr)
print("\nSelected elements:\n", sub)
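To see why `np.ix_` selects combinations, it may help to look at the index arrays it builds and compare them with plain fancy indexing. This is an illustrative sketch that recreates the same `arr`, `rows`, and `cols`; `block` and `pairs` are names chosen for this example.

```python
import numpy as np

arr = np.arange(16).reshape(4, 4)
rows = np.array([0, 2])
cols = np.array([1, 3])

# np.ix_ returns one index array per axis, shaped so they broadcast as a grid
row_idx, col_idx = np.ix_(rows, cols)
print(row_idx.shape, col_idx.shape)   # (2, 1) (1, 2)

block = arr[row_idx, col_idx]   # every (row, col) combination, shape (2, 2)
pairs = arr[rows, cols]         # only the pairs (0, 1) and (2, 3), shape (2,)

print("Block:\n", block)
print("Pairs:", pairs)
```

Passing `rows` and `cols` directly pairs them up element by element, which is a common source of confusion when a sub-matrix was intended.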

Boolean masks work similarly: the comparison that builds the mask broadcasts automatically, and the mask can then filter or modify data.

This is the foundation for conditional filtering in large datasets.

In [ ]:
# Masking example: Replace all values > threshold
mask = arr > 8
arr_masked = np.where(mask, -1, arr)

print("Mask:\n", mask)
print("\nMasked array:\n", arr_masked)

Masking and broadcasting combine beautifully, for example when applying row-wise thresholds or per-column conditions in one line.
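A small sketch of both cases follows. The threshold values and the names `col_limits`, `row_limits`, and `capped` are invented for illustration.

```python
import numpy as np

arr = np.arange(16).reshape(4, 4)

# Per-column thresholds: shape (4,) lines up with the last axis
col_limits = np.array([2, 6, 10, 14])
col_mask = arr > col_limits

# Per-row thresholds: shape (4, 1) lines up with the first axis
row_limits = np.array([1, 5, 9, 13])[:, np.newaxis]
row_mask = arr > row_limits

# Cap each row at its own limit in one expression
capped = np.where(row_mask, row_limits, arr)

print("Column mask:\n", col_mask)
print("\nRow-capped array:\n", capped)
```

Here `np.where` broadcasts all three arguments together, so the `(4, 1)` limits are matched against every column of the corresponding row.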

## 🧠 5. Under the Hood: How Broadcasting Works Internally

- Broadcasting doesn’t duplicate data: it creates **virtual views** with adjusted **strides** (a stride of 0 along each stretched axis).
- Each singleton dimension (size 1) is treated as repeating along that axis.
- This saves massive amounts of memory and enables true **vectorized computation**.

Internally, NumPy computes an **output stride pattern** for each operand, aligning shapes before performing the elementwise operation.

In [ ]:
# Example: Broadcasting view mechanics
a = np.arange(3).reshape(3, 1)
b = np.arange(4).reshape(1, 4)
result = a + b

print("a shape, strides:", a.shape, a.strides)
print("b shape, strides:", b.shape, b.strides)
print("result shape:", result.shape)

NumPy computes the broadcasted shape `(3,4)` and uses adjusted strides to avoid copying the inputs, reading the data as if it were expanded. Only the result is allocated as a new array.
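The strides printed above belong to the original arrays, so they do not show the adjustment itself. One way to make it visible is `np.broadcast_to`, which returns the broadcast view explicitly; the sketch below is only meant to expose the mechanism, not to suggest you need this call in everyday code.

```python
import numpy as np

a = np.arange(3).reshape(3, 1)

# np.broadcast_to returns a read-only view with the broadcast shape
view = np.broadcast_to(a, (3, 4))

print("a strides:   ", a.strides)
print("view strides:", view.strides)   # stride 0 along the stretched axis
print("shares memory:", np.shares_memory(a, view))
print("writeable:", view.flags.writeable)
```

A stride of 0 means that stepping along that axis never moves through memory, so the same three values are read repeatedly instead of being copied.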

## ⚠️ 6. Best Practices and Pitfalls

✅ **Best Practices:**
- Use broadcasting instead of loops for arithmetic and comparisons.
- Use `np.newaxis` or `.reshape()` to align shapes explicitly.
- Use `np.ix_` for clean row/column combinations.
- Chain broadcasting with ufuncs for clean, vectorized transformations.

⚠️ **Pitfalls:**
- Be cautious with **unintended expansion**: the inputs are never copied, but the result is a full-size array, so broadcasting can silently produce massive temporary arrays.
- Always check `.shape` before operations.
- Estimate the size of the broadcast result before combining large arrays: a `(n,1)` array with a `(1,m)` array produces `n*m` elements.
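One way to guard against unintended expansion is to compute the result size from the shapes before running the operation. A minimal sketch, assuming NumPy 1.20 or newer for `np.broadcast_shapes` and a `float64` result; the shapes are made-up examples.

```python
import math
import numpy as np

shape_a = (100_000, 1)
shape_b = (1, 100_000)

# Work out the result shape from the shapes alone, without allocating anything
out_shape = np.broadcast_shapes(shape_a, shape_b)
n_bytes = math.prod(out_shape) * np.dtype(np.float64).itemsize

print("Result shape:", out_shape)
print(f"Result size: {n_bytes / 1e9:.0f} GB")
```

Two inputs of under a megabyte each would produce an 80 GB result here, which is the kind of blow-up a quick shape check catches.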

## 💪 Challenge Exercise

**Task:**
1. Create a 1D array `x = np.arange(5)` and another 1D array `y = np.arange(1,6)`.
2. Compute the **pairwise squared Euclidean distance matrix** between `x` and `y` using broadcasting.
3. Verify your result matches a loop-based computation.

*(Hint: use `(x[:, np.newaxis] - y[np.newaxis, :])**2`)*

# --- End of Section 3 ---

Next up → **Section 4: Universal Functions (ufuncs) and Custom Creation**

We’ll explore how NumPy’s core computation system, the **ufunc**, works internally, and how to define your own for high-performance, domain-specific operations.