# NumPy Broadcasting

Broadcasting is the mechanism NumPy uses to apply element-wise arithmetic to arrays of different shapes, without making explicit copies of the smaller array.

## The Broadcasting Rule

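To broadcast two arrays, NumPy compares their shapes axis by axis, starting from the trailing (rightmost) axis. Two axes are compatible when their sizes are equal or when one of them is 1. If one array has fewer dimensions than the other, its shape is treated as if it were padded with 1s on the left.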

In [None]:
import numpy as np
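
The rule above can be sketched in plain Python. The helper `broadcast_shape` below is a hypothetical name used only for illustration, not a NumPy function; it walks two shapes from the trailing axis and returns the resulting shape, or raises `ValueError` when two sizes are incompatible.

```python
from itertools import zip_longest

import numpy as np


def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: shape produced by broadcasting two shapes."""
    result = []
    # Walk both shapes from the trailing axis; a missing axis counts as size 1.
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for dim_a, dim_b in pairs:
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f'incompatible sizes {dim_a} and {dim_b}')
    return tuple(reversed(result))


print(broadcast_shape((4, 3), (3,)))          # (4, 3)
print(broadcast_shape((4, 1, 2), (1, 4, 2)))  # (4, 4, 2)

# Cross-check the sketch against what NumPy actually produces.
numpy_shape = (np.empty((4, 3)) + np.empty((3,))).shape
print(numpy_shape == broadcast_shape((4, 3), (3,)))  # True
```

Shapes such as (4, 3) and (4,) fail this check, because the trailing sizes 3 and 4 differ and neither is 1.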

### Example 1: Scalar and an array

In [None]:
a = np.array([1.0, 2.0, 3.0])
b = 2.0
# The scalar b behaves as if it were stretched to the shape of a, (3,)
a * b

### Example 2: Mismatched dimensions

In [None]:
a = np.array([[ 0.0,  0.0,  0.0],
               [10.0, 10.0, 10.0],
               [20.0, 20.0, 20.0],
               [30.0, 30.0, 30.0]])
b = np.array([1.0, 2.0, 3.0])
# Shapes (4, 3) and (3,) are compatible: b is added to every row of a
a + b
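
To make the "stretching" in Example 2 visible, the sketch below uses `np.broadcast_to` to build the stretched version of `b` explicitly. The name `b_stretched` is introduced here for illustration. The result is a read-only view, so no data is copied.

```python
import numpy as np

a = np.array([[ 0.0,  0.0,  0.0],
              [10.0, 10.0, 10.0],
              [20.0, 20.0, 20.0],
              [30.0, 30.0, 30.0]])
b = np.array([1.0, 2.0, 3.0])

# View b as if it had been repeated along a new leading axis of length 4.
b_stretched = np.broadcast_to(b, a.shape)
print(b_stretched.shape)       # (4, 3)
print(b_stretched.strides[0])  # 0: every row points at the same memory

# Adding the explicit view matches letting NumPy broadcast on its own.
print(np.array_equal(a + b, a + b_stretched))  # True
```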

## Reshaping Arrays

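The `reshape` method returns an array with a new shape over the same data; the total number of elements must stay the same. It is often used to prepare arrays for broadcasting, for example by turning a 1D array of shape (n,) into a column of shape (n, 1).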

In [None]:
a = np.arange(6)
print(f'Original array: {a}')
b = a.reshape((2, 3))
print(f'Reshaped array:\n{b}')
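
As a minimal sketch of how reshaping prepares an array for broadcasting, the example below adds one offset per row to a (4, 3) array. The array `row_offsets` and its values are made up for illustration. With shape (4,) the addition fails; reshaped to (4, 1) it broadcasts across the columns.

```python
import numpy as np

a = np.arange(12.0).reshape((4, 3))                   # shape (4, 3)
row_offsets = np.array([100.0, 200.0, 300.0, 400.0])  # shape (4,), one per row

try:
    a + row_offsets  # trailing sizes 3 and 4 are incompatible
except ValueError as err:
    print(f'Broadcast failed: {err}')

# Reshape to a column so that the trailing axis has size 1.
result = a + row_offsets.reshape((4, 1))
print(result.shape)  # (4, 3)
print(result[1])     # [203. 204. 205.]
```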

### Example 3: Normalizing Data

Broadcasting is useful for standardizing data: for each feature in a dataset, you subtract the feature's mean and divide by its standard deviation.

In this example, `X` is a (5, 3) array representing 5 data points with 3 features each. We calculate the mean (`X_mean`) and standard deviation (`X_std`) along `axis=0`, which gives us the mean and standard deviation for each *column* (feature). Both `X_mean` and `X_std` are 1D arrays of shape (3,).

When we execute `X - X_mean`, NumPy sees that the shapes (5, 3) and (3,) are not the same. It then applies the broadcasting rule: it "stretches" `X_mean` across the rows of `X`, effectively subtracting the column's mean from every element in that column. The same process happens for the division with `X_std`. This allows us to normalize the entire dataset in a single, readable line of code.

In [None]:
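X = np.random.rand(5, 3) * 10        # 5 data points, 3 features
print(f'Original data:\n{X}')
X_mean = X.mean(axis=0)              # shape (3,): one mean per feature
X_std = X.std(axis=0)                # shape (3,): one standard deviation per feature
X_normalized = (X - X_mean) / X_std  # (5, 3) with (3,) broadcasts to (5, 3)
print(f'Normalized data:\n{X_normalized}')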
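
The sketch below checks the result of the column-wise normalization and then shows a row-wise variant. The seeded generator and the names `row_mean`, `row_std`, and `X_row_normalized` are assumptions added for illustration. Reducing along `axis=1` yields shape (5,), which does not broadcast against (5, 3); passing `keepdims=True` keeps the reduced axis with size 1 so the shapes line up.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
X = rng.random((5, 3)) * 10

X_normalized = (X - X.mean(axis=0)) / X.std(axis=0)
# Each column should now have mean close to 0 and standard deviation close to 1.
print(np.allclose(X_normalized.mean(axis=0), 0.0))  # True
print(np.allclose(X_normalized.std(axis=0), 1.0))   # True

# Row-wise variant: keepdims=True gives shape (5, 1) instead of (5,).
row_mean = X.mean(axis=1, keepdims=True)
row_std = X.std(axis=1, keepdims=True)
X_row_normalized = (X - row_mean) / row_std
print(row_mean.shape)          # (5, 1)
print(X_row_normalized.shape)  # (5, 3)
```

If a feature is constant, its standard deviation is zero and the division produces `nan` or `inf` values, so real data may need a guard for that case.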

### Example 4: Creating a Distance Matrix

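Broadcasting can also compute the distance between every pair of points in a dataset without explicit Python loops.

Here, we want to compute the Euclidean distance between every pair of points in the `points` array. A naive approach would use nested Python loops, which are slow for large inputs. The broadcasted solution runs the same arithmetic inside NumPy's compiled code, at the cost of a temporary array that holds one difference vector per pair of points.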

1. We start with a `points` array of shape (4, 2).
2. We use `np.newaxis` to create two new views of the data: `points_row` with shape (4, 1, 2) and `points_col` with shape (1, 4, 2).
3. When we subtract them (`points_row - points_col`), NumPy broadcasts the two arrays into a resulting array of shape (4, 4, 2). This new array contains the vector differences between every possible pair of points.
4. We then square the differences, sum along the last axis (`axis=2`) to get the squared Euclidean distances, and take the square root to get the (4, 4) distance matrix.

In [None]:
points = np.array([[0, 0], [1, 2], [3, 1], [4, 4]])

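# Prepare the points for broadcasting
points_row = points[:, np.newaxis, :]  # shape (4, 1, 2)
points_col = points[np.newaxis, :, :]  # shape (1, 4, 2)
# Pairwise differences have shape (4, 4, 2); summing over axis=2 gives (4, 4)
distances = np.sqrt(np.sum((points_row - points_col)**2, axis=2))
print(f'Distance matrix:\n{distances}')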
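
As a sanity check, the sketch below compares the broadcasted computation with the nested-loop approach mentioned in the text. The loop version and the name `naive` are written here for illustration only; both should agree up to floating-point rounding.

```python
import numpy as np

points = np.array([[0, 0], [1, 2], [3, 1], [4, 4]])

# Broadcasted version: (4, 1, 2) minus (1, 4, 2) gives (4, 4, 2).
diff = points[:, np.newaxis, :] - points[np.newaxis, :, :]
distances = np.sqrt(np.sum(diff**2, axis=2))

# Naive version: one Python-level iteration per pair of points.
n = len(points)
naive = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        naive[i, j] = np.sqrt(np.sum((points[i] - points[j])**2))

print(np.allclose(distances, naive))        # True
print(np.allclose(distances, distances.T))  # True: the matrix is symmetric
print(distances[0, 3])                      # sqrt(32), about 5.657
```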