# Broadcasting in ML (Summary)

Broadcasting allows operations between tensors of different shapes by automatically expanding smaller tensors without copying data.

## Why it’s used:
- Enables fast vectorized computation
- Saves memory (no data duplication)
- Simplifies batch operations
- Allows clean bias addition in neural networks
- Produces readable, math-like code

**In short:** Broadcasting makes ML computations efficient, scalable, and easy to express.
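As a minimal sketch of the bias-addition point above, the cell below adds one bias vector to a whole batch of activations and checks the result against an explicit loop. The layer sizes and the names `X`, `W`, `b`, and `Z_loop` are assumptions chosen only for illustration.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
batch_size, n_in, n_out = 4, 3, 2

rng = np.random.default_rng(0)
X = rng.standard_normal((batch_size, n_in))  # one row per example
W = rng.standard_normal((n_in, n_out))       # layer weights
b = np.array([0.5, -0.5])                    # one bias per output unit, shape (2,)

# (4, 2) + (2,) -> (4, 2): the bias vector is broadcast across every row.
Z = X @ W + b

# The same computation written as an explicit Python loop over the batch.
Z_loop = np.stack([X[i] @ W + b for i in range(batch_size)])

print(Z.shape)
print(np.allclose(Z, Z_loop))
```

The broadcast version avoids both the Python-level loop and any manual tiling of `b`.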


In [None]:
import numpy as np
# Matrix of calories consumed by 4 individuals (columns) over 3 days (rows)
A = np.array([[12, 1.3, 5, 69],
              [1.2, 1.5, 6, 11],
              [33, 2.5, 21, 1.8]])
print(A)

[[12.   1.3  5.  69. ]
 [ 1.2  1.5  6.  11. ]
 [33.   2.5 21.   1.8]]


In [3]:
cal = A.sum(axis=0)  # sum down each column: one total per column, shape (4,)
print(cal)

[46.2  5.3 32.  81.8]


In [4]:
percentage = (A / cal) * 100  # (3, 4) / (4,): cal is broadcast across the 3 rows
print(percentage)

[[25.97402597 24.52830189 15.625      84.35207824]
 [ 2.5974026  28.30188679 18.75       13.44743276]
 [71.42857143 47.16981132 65.625       2.200489  ]]
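A quick sanity check on the result above is that every column of percentages should sum to 100. The sketch below also shows the row-wise variant, which needs `keepdims=True` so that the totals keep shape (3, 1) and line up with the rows. The names `col_pct`, `row_totals`, and `row_pct` are illustrative, not from the original notebook.

```python
import numpy as np

A = np.array([[12, 1.3, 5, 69],
              [1.2, 1.5, 6, 11],
              [33, 2.5, 21, 1.8]])

# Column-wise: (3, 4) / (4,) divides each column by its own total.
col_pct = A / A.sum(axis=0) * 100
print(np.allclose(col_pct.sum(axis=0), 100))

# Row-wise: A.sum(axis=1) alone has shape (3,), which does not broadcast
# against the trailing dimension of size 4. keepdims=True gives (3, 1).
row_totals = A.sum(axis=1, keepdims=True)
row_pct = A / row_totals * 100
print(row_totals.shape)
print(np.allclose(row_pct.sum(axis=1), 100))
```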


Broadcasting provides a means of vectorizing array operations so that looping occurs in C instead of Python. It does this without making needless copies of data and usually leads to efficient algorithm implementations. (From the NumPy documentation.)

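NumPy operations are usually performed element by element on pairs of arrays, which in the simplest case requires both arrays to have exactly the same shape. NumPy’s broadcasting rule relaxes this constraint when the arrays’ shapes meet certain conditions. The simplest broadcasting example occurs when an array and a scalar value are combined in an operation: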

In [7]:
import numpy as np
a = np.array([1.0, 2.0, 3.0])
b = 2.0
a * b

array([2., 4., 6.])

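When operating on two arrays, NumPy compares their shapes element-wise, starting with the trailing (rightmost) dimension and working leftward. Two dimensions are compatible when: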

1. they are equal, or
2. one of them is 1.


A set of arrays is called “broadcastable” to the same shape if the above rules produce a valid result.
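The compatibility rule can be sketched as a small function that walks the shapes from the trailing dimension leftward, treating any missing leading dimension as 1. `broadcast_shape` is a hypothetical helper written for illustration, not part of NumPy; the last line compares it with `np.broadcast_shapes`, which assumes NumPy 1.20 or newer.

```python
import numpy as np

def broadcast_shape(*shapes):
    """Hypothetical helper: return the broadcast shape or raise ValueError."""
    ndim = max((len(s) for s in shapes), default=0)
    result = []
    for axis in range(1, ndim + 1):
        # Size of each shape at position -axis; missing leading dims count as 1.
        sizes = [s[-axis] if axis <= len(s) else 1 for s in shapes]
        non_one = {n for n in sizes if n != 1}
        if len(non_one) > 1:
            raise ValueError(f"incompatible sizes {sizes} at axis -{axis}")
        result.append(non_one.pop() if non_one else 1)
    return tuple(reversed(result))

shapes = [(5, 1), (1, 6), (6,), ()]
print(broadcast_shape(*shapes))
print(np.broadcast_shapes(*shapes))  # assumes NumPy >= 1.20
```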

For example, if a.shape is (5,1), b.shape is (1,6), c.shape is (6,), and d.shape is () so that d is a scalar, then a, b, c, and d are all broadcastable to shape (5,6):

- a acts like a (5,6) array where a[:,0] is broadcast to the other columns,
- b acts like a (5,6) array where b[0,:] is broadcast to the other rows,
- c acts like a (1,6) array and therefore like a (5,6) array where c[:] is broadcast to every row, and
- d acts like a (5,6) array where the single value is repeated.
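The four-array example can be checked with `np.broadcast_arrays`, which returns views of its inputs stretched to the common shape. The concrete values filled into `a`, `b`, `c`, and `d` below are assumptions made only so there is something to inspect. A stride of 0 along an axis indicates that the same memory is reused along that axis rather than copied.

```python
import numpy as np

a = np.arange(5.0).reshape(5, 1)   # shape (5, 1)
b = np.arange(6.0).reshape(1, 6)   # shape (1, 6)
c = np.arange(6.0)                 # shape (6,)
d = np.float64(2.0)                # shape (), a scalar

a2, b2, c2, d2 = np.broadcast_arrays(a, b, c, d)
print(a2.shape, b2.shape, c2.shape, d2.shape)

# Every column of a2 repeats a[:, 0]; every row of b2 repeats b[0, :].
print(np.array_equal(a2[:, 3], a[:, 0]))
print(np.array_equal(b2[2, :], b[0, :]))

# The stretched axis of a2 has stride 0: no data was duplicated.
print(a2.strides)
```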

Examples of shapes that do not broadcast:

In [None]:
A      (1d array):  3
B      (1d array):  4 # trailing dimensions do not match

A      (2d array):      2 x 1
B      (3d array):  8 x 4 x 3 # second from last dimensions mismatched
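To see how these mismatches surface in practice, the sketch below tries to add arrays with the two shape pairs listed above and records the `ValueError` that NumPy raises. The `pairs` and `failed` names are illustrative.

```python
import numpy as np

# The two incompatible shape pairs from the listing above.
pairs = [((3,), (4,)), ((2, 1), (8, 4, 3))]

failed = []
for shape_a, shape_b in pairs:
    try:
        np.ones(shape_a) + np.ones(shape_b)
    except ValueError as err:
        # NumPy reports that the operands could not be broadcast together.
        failed.append((shape_a, shape_b))
        print(shape_a, shape_b, "->", err)
```

Printing both shapes next to the error message is usually the fastest way to spot which dimension is mismatched.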

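When two arrays do not line up as they are, the np.newaxis index operator can insert a length-1 axis so that broadcasting applies. In the example below, a[:, np.newaxis] has shape (4,1), which broadcasts with b of shape (3,) to give a (4,3) result: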

In [8]:
import numpy as np
a = np.array([0.0, 10.0, 20.0, 30.0])
b = np.array([1.0, 2.0, 3.0])
a[:, np.newaxis] + b

array([[ 1.,  2.,  3.],
       [11., 12., 13.],
       [21., 22., 23.],
       [31., 32., 33.]])
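For comparison, here are a few equivalent ways to build the same (4, 3) table. This is a sketch of alternatives, not a recommendation of one over another; the variable names are illustrative.

```python
import numpy as np

a = np.array([0.0, 10.0, 20.0, 30.0])
b = np.array([1.0, 2.0, 3.0])

via_newaxis = a[:, np.newaxis] + b   # (4, 1) + (3,) -> (4, 3)
via_none = a[:, None] + b            # np.newaxis is an alias for None
via_reshape = a.reshape(-1, 1) + b   # -1 lets NumPy infer the length
via_outer = np.add.outer(a, b)       # ufunc.outer builds the same table

print(via_newaxis.shape)
print(np.array_equal(via_newaxis, via_outer))
```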

## Common debugging tips

In [1]:
import numpy as np

a = np.random.randn(5)
print(a)

[0.46756338 0.38641201 0.99530364 0.54788397 2.10916174]


In [None]:
print(a.shape)  # a rank 1 array with shape (5,): it acts as neither a column nor a row vector

(5,)


In [None]:
print(a.T)  # the transpose of a rank 1 array is the same as the original array

[0.46756338 0.38641201 0.99530364 0.54788397 2.10916174]


In [6]:
print(np.dot(a, a.T))  # a.T is just a, so this is the inner product (a scalar), not a 5x5 outer product

6.107299176019863


To avoid this ambiguous behaviour of rank 1 arrays, give the array an explicit second dimension when you create it:

In [None]:
a = np.random.randn(5, 1)  # a proper 5x1 column vector
print(a)

b = np.random.randn(1, 5)  # a proper 1x5 row vector

[[-1.55317978]
 [ 1.24865255]
 [ 0.72242178]
 [ 1.27838947]
 [ 0.2955039 ]]


In [None]:
print(a.T) 

[[-1.55317978  1.24865255  0.72242178  1.27838947  0.2955039 ]]


In [9]:
print(np.dot(a, a.T))  # (5,1) times (1,5) gives the 5x5 outer product

[[ 2.41236742 -1.93938189 -1.1220509  -1.98556867 -0.45897067]
 [-1.93938189  1.55913318  0.9020538   1.59626427  0.36898169]
 [-1.1220509   0.9020538   0.52189323  0.9235364   0.21347845]
 [-1.98556867  1.59626427  0.9235364   1.63427964  0.37776907]
 [-0.45897067  0.36898169  0.21347845  0.37776907  0.08732255]]


If you do end up with a rank 1 array, you can reshape it into a column vector with a = a.reshape(5,1).
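A minimal sketch of this fix, together with a shape assertion that catches the problem early. It also shows one way a stray rank 1 array can bite: adding it to a column vector silently broadcasts to a 5x5 matrix instead of raising an error. The names `col`, `row`, and `surprise` are illustrative.

```python
import numpy as np

a = np.random.randn(5)   # rank 1 array, shape (5,)

col = a.reshape(5, 1)    # explicit column vector
row = a.reshape(1, 5)    # explicit row vector

# A cheap guard: fail loudly if the shape is not what the code expects.
assert col.shape == (5, 1)

# Pitfall: (5,) + (5, 1) broadcasts to (5, 5) without any warning.
surprise = a + col
print(surprise.shape)    # (5, 5)

# With explicit shapes the intent is unambiguous.
print(np.dot(row, col).shape)   # (1, 1): inner product
print(np.dot(col, row).shape)   # (5, 5): outer product
```

Sprinkling `assert x.shape == (...)` checks through numerical code is cheap and documents the expected shapes for the reader.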