## Broadcasting Example
In NumPy, broadcasting is the set of rules that lets arithmetic operations work on arrays of different shapes. The smaller array is treated as if it were replicated to match the shape of the larger one. This avoids explicit Python loops over the arrays, which makes the code shorter and faster.

For example, suppose you have two arrays: A with shape (3, 2) and B with shape (2,). You can multiply them element-wise using broadcasting as follows:

In [15]:
import numpy as np

A = np.array([[1,2],[3,4],[5,6]])
B = np.array([2,3])
C = A * B
print(C)

[[ 2  6]
 [ 6 12]
 [10 18]]


#### Cool, what just happened?
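B has shape (2,), so it lines up with the last axis of A (the columns) and is stretched along the first axis, as if it were repeated once for each of A's 3 rows to give shape (3, 2). That virtual array is then multiplied element-wise with A. NumPy does not actually build the replicated copy in memory.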
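
To inspect how B lines up against A, the sketch below uses `np.broadcast_shapes` (available in NumPy 1.20 and later) and `np.broadcast_to`, and compares the result with explicit replication through `np.tile`. The names `result_shape`, `B_view` and `B_tiled` are introduced here for illustration only.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)
B = np.array([2, 3])                     # shape (2,)

# Shape that broadcasting produces for these two operands
result_shape = np.broadcast_shapes(A.shape, B.shape)

# Read-only view of B stretched to A's shape; no data is copied
B_view = np.broadcast_to(B, A.shape)

# Explicit replication, which does allocate a full (3, 2) array
B_tiled = np.tile(B, (3, 1))

print(result_shape)
print(B_view)
print(np.array_equal(A * B, A * B_tiled))
```

The view returned by `np.broadcast_to` has a stride of 0 along the stretched axis, which is why broadcasting needs no extra memory for the repeated rows.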

In machine learning, broadcasting is particularly useful when dealing with large datasets that are represented as arrays. It allows us to perform mathematical operations between the dataset and a smaller array (such as a weight vector or bias term) without explicitly replicating the smaller array to match the shape of the dataset. This avoids allocating the replicated copy, which reduces memory use, and keeps the computation in vectorized array code rather than Python loops. Broadcasting is supported by NumPy and by machine learning libraries such as PyTorch and TensorFlow.
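
As a rough illustration of the weight vector and bias case, the sketch below uses a made-up dataset. The names `X`, `w`, `b`, `weighted` and `scores`, and all the values, are hypothetical and only serve to show the shapes involved.

```python
import numpy as np

# Hypothetical toy data: 5 samples with 3 features each
X = np.arange(15, dtype=float).reshape(5, 3)   # shape (5, 3)
w = np.array([0.5, -1.0, 2.0])                 # one weight per feature, shape (3,)
b = 0.1                                        # scalar bias

# w (shape (3,)) is broadcast across the 5 rows of X
weighted = X * w                               # shape (5, 3)

# The scalar b is broadcast across the 5 per-sample sums
scores = weighted.sum(axis=1) + b              # shape (5,)

# Same result as the matrix-vector product plus bias
print(np.allclose(scores, X @ w + b))
```

Neither `w` nor `b` has to be copied out to the size of the dataset before the arithmetic runs.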

## Let's look at another example with food!
Let's say we want to figure out what percentage of the calories in each food comes from carbs, protein and fat.

|         | Apples | Beef | Eggs | Potatoes |
|---------|--------|------|------|----------|
| Carb    | 56.0   | 0    | 4.4  | 68.0     |
| Protein | 1.2    | 104  | 52.0 | 8.0      |
| Fat     | 1.8    | 135  | 99.0 | 0.9      |

In [16]:
import numpy as np
# create the 3x4 numpy matrix of the data (number of rows, number of columns)
A = np.array([[56.0, 0.0, 4.4, 68.0],[1.2, 104, 52.0, 8.0],[1.8, 135, 99.0, 0.9]])
print(A)
print(A.shape)


[[ 56.    0.    4.4  68. ]
 [  1.2 104.   52.    8. ]
 [  1.8 135.   99.    0.9]]
(3, 4)


## Broadcasting and Array Axes
It is important to remember that axis 0 runs vertically, down the rows (the Y direction), while axis 1 runs horizontally, across the columns (the X direction).

The convention of numbering axes in Python and NumPy can be a bit counterintuitive at first, but it has its origins in mathematics and linear algebra.

In linear algebra, we often represent matrices as collections of column vectors. When we perform matrix multiplication, we are essentially taking linear combinations of these column vectors, with the coefficients of the linear combinations determined by the rows of the other matrix.

In NumPy, arrays are stored in memory in row-major (C) order by default, meaning that the elements of each row are stored contiguously in memory. This can provide a performance benefit in many cases, as it allows for more efficient memory access patterns when data is read row by row.
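
The memory layout can be checked through an array's `strides`, the number of bytes to step to reach the next element along each axis. The sketch below fixes the dtype to `int64` (8 bytes per element) so the numbers are predictable; `arr` and `arr_f` are illustrative names.

```python
import numpy as np

# 2 rows, 3 columns, 8 bytes per element
arr = np.arange(6, dtype=np.int64).reshape(2, 3)

# Row-major: moving to the next row skips 3 elements (24 bytes),
# moving to the next column skips 1 element (8 bytes)
print(arr.strides)
print(arr.flags['C_CONTIGUOUS'])

# The same values stored column-major (Fortran order) for comparison
arr_f = np.asfortranarray(arr)
print(arr_f.strides)
```

Both arrays hold the same values and index the same way; only the order of the elements in memory differs.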

Given this convention, it makes sense to number the first axis (the rows) as 0 and the second axis (the columns) as 1. When we access an element with the syntax arr[i, j], the first index i selects the row and the second index j selects the column, so the order of the indices matches the order of the axes.
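
A small sketch of that indexing order, using a made-up 2x3 array whose values encode their own position (`arr` is an illustrative name):

```python
import numpy as np

arr = np.array([[10, 11, 12],
                [20, 21, 22]])   # 2 rows, 3 columns

print(arr.shape)    # axis 0 has length 2 (rows), axis 1 has length 3 (columns)
print(arr[1, 2])    # row 1, column 2
print(arr[0])       # indexing axis 0 only returns the whole first row
print(arr[:, 0])    # fixing the axis 1 index returns the first column
```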

In [17]:

# now let's sum together the values in each column of our data
# axis=0 sums vertically (down each column), giving one total per food
cal = A.sum(axis=0)
print("cal:", cal)
# axis=1 sums horizontally (across each row), giving one total per nutrient
hcal = A.sum(axis=1)
print("hcal: ", hcal)


cal: [ 59.  239.  155.4  76.9]
hcal:  [128.4 165.2 236.7]
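
The axis passed to `sum` is the one that disappears from the result's shape. The sketch below checks the shapes and also shows the `keepdims=True` option of `sum`, which keeps the summed axis with length 1. The names `col_totals`, `row_totals` and `row_totals_2d` are introduced here for illustration.

```python
import numpy as np

A = np.array([[56.0, 0.0, 4.4, 68.0],
              [1.2, 104.0, 52.0, 8.0],
              [1.8, 135.0, 99.0, 0.9]])      # shape (3, 4)

col_totals = A.sum(axis=0)                    # axis 0 collapses -> shape (4,)
row_totals = A.sum(axis=1)                    # axis 1 collapses -> shape (3,)
row_totals_2d = A.sum(axis=1, keepdims=True)  # axis 1 kept with length 1 -> shape (3, 1)

print(col_totals.shape, row_totals.shape, row_totals_2d.shape)
```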


## Calculating our Percentages without loops

In [18]:
# Next, for each value in our original matrix, we need to determine
# what percent of the total calories for that item the value represents.
# For this we can take advantage of NumPy broadcasting, which automatically
# stretches our smaller cal array so the division is carried out on each
# value in our larger A array
percents = A / cal * 100
print("percents: ", percents)
percents = A / cal.reshape(1, 4) * 100
print("percents: ", percents)

percents:  [[94.91525424  0.          2.83140283 88.42652796]
 [ 2.03389831 43.51464435 33.46203346 10.40312094]
 [ 3.05084746 56.48535565 63.70656371  1.17035111]]
percents:  [[94.91525424  0.          2.83140283 88.42652796]
 [ 2.03389831 43.51464435 33.46203346 10.40312094]
 [ 3.05084746 56.48535565 63.70656371  1.17035111]]


### Wait, but then why do both assignments of percents result in the same array?
The percents variable is the same after both reassignments because they are mathematically equivalent.

In the first assignment, percents is the result of dividing A by cal and then multiplying by 100. cal has shape (4,), so NumPy broadcasts it against A's shape (3, 4), and the division happens element-wise.

In the second reassignment, cal is reshaped to have a row dimension of 1 and a column dimension of 4 using the reshape function. This reshaped cal array is then used for element-wise division with A and then multiplication by 100 to calculate the percents array.

Both methods yield the same percents array because broadcasting compares shapes starting from the last axis and treats any missing leading axis as having length 1. The 1-D cal of shape (4,) is therefore handled exactly like cal.reshape(1, 4), and in both cases that single row is stretched across the 3 rows of A.
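
The same shape rule explains when a reshape is actually required. As a sketch, the code below repeats the column-wise calculation and then tries the row-wise one (percent of each nutrient's total), which is not part of the original walkthrough. The names `same`, `row_division_failed` and `row_percents` are illustrative.

```python
import numpy as np

A = np.array([[56.0, 0.0, 4.4, 68.0],
              [1.2, 104.0, 52.0, 8.0],
              [1.8, 135.0, 99.0, 0.9]])   # shape (3, 4)
cal = A.sum(axis=0)                        # shape (4,)
hcal = A.sum(axis=1)                       # shape (3,)

# (3, 4) / (4,): the last axes match, so cal is treated as (1, 4)
percents = A / cal * 100
same = np.allclose(percents, A / cal.reshape(1, 4) * 100)
print(same)
print(percents.sum(axis=0))   # every column adds up to 100

# (3, 4) / (3,): the last axes are 4 and 3, so the shapes are incompatible
row_division_failed = False
try:
    A / hcal
except ValueError as err:
    row_division_failed = True
    print("cannot broadcast:", err)

# Reshaping to (3, 1) lines hcal up with the rows instead
row_percents = A / hcal.reshape(3, 1) * 100
print(row_percents.sum(axis=1))   # every row adds up to 100
```

So the reshape to (1, 4) is optional, while the reshape to (3, 1) is what makes the row-wise division possible.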

## Sweet, now let's look at a few more examples of Broadcasting in Python

In [19]:
# Broadcasting a single variable into a vector
X = np.array([1,2,3,4])
v = 100
Y = X + (X * v)
print(Y)

[101 202 303 404]


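As we can see from the printed results, when X is multiplied by v, the scalar v is broadcast to the shape of X, which is (4,), as if it were the vector [100, 100, 100, 100]. It is then multiplied element-wise with X, and the product is added to X.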
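
To make the scalar case concrete, the sketch below builds the stretched version of v explicitly and confirms it gives the same answer. `v_view` and `Y_explicit` are names introduced here for illustration.

```python
import numpy as np

X = np.array([1, 2, 3, 4])
v = 100

print(np.shape(v))                    # a scalar has an empty shape: no axes at all

# What v looks like once it is stretched to X's shape
v_view = np.broadcast_to(v, X.shape)
print(v_view)

Y = X + (X * v)
Y_explicit = X + (X * np.full(X.shape, v))   # same arithmetic with a real 4-element array
print(np.array_equal(Y, Y_explicit))
```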