# Broadcasting of `numpy`-arrays

## Introduction

- By default, arithmetic operators and universal functions on `numpy`-arrays act element-wise.
- This trivially works for arrays with the same shape.

In [1]:
import numpy as np

a = np.arange(10)
b = np.arange(10, 20)

c = np.arange(25).reshape((5, 5))
d = np.arange(25, 50).reshape((5,5))

print(a, b)
print(a + b)   # element-wise addition

print(c, d)
print(c * d)   # element-wise multiplication

print(a + d)   # Error: shapes do not match and the
               # addition is not defined!

[0 1 2 3 4 5 6 7 8 9] [10 11 12 13 14 15 16 17 18 19]
[10 12 14 16 18 20 22 24 26 28]
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]] [[25 26 27 28 29]
 [30 31 32 33 34]
 [35 36 37 38 39]
 [40 41 42 43 44]
 [45 46 47 48 49]]
[[   0   26   54   84  116]
 [ 150  186  224  264  306]
 [ 350  396  444  494  546]
 [ 600  656  714  774  836]
 [ 900  966 1034 1104 1176]]


ValueError: operands could not be broadcast together with shapes (10,) (5,5) 

- **However,** it is also possible to do `numpy`-operations on arrays of different
shapes *if* NumPy can transform these arrays to the same shape: this conversion is called **broadcasting**.
- You already used broadcasting in operations between arrays and scalars!

In [None]:
import numpy as np

a = np.arange(5)

print(a)
print(a * 2)  # the '2' is broadcast (stretched) to a (5,)-array

b = np.arange(25).reshape((5,5))

print(b)
print(b * 2) # the '2' is broadcast (stretched) to a (5,5)-array
print(b + a) # the (5,)-array 'a' is broadcast to a
             # (5,5)-array
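
To make the *stretching* visible, the following sketch uses `np.broadcast_to`, which returns a read-only view of an array broadcast to a given shape. It is only meant as an illustration: in an expression like `b + a`, NumPy broadcasts implicitly and does not copy the data. The name `a_stretched` is made up for this example.

```python
import numpy as np

a = np.arange(5)                     # shape (5,)
b = np.arange(25).reshape((5, 5))    # shape (5, 5)

# explicit version of what happens implicitly in 'b + a'
a_stretched = np.broadcast_to(a, b.shape)
print(a_stretched)                   # every row equals 'a'

# the implicit and the explicit form give the same result
print(np.array_equal(b + a, b + a_stretched))
```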

## Practical Example: Bias correction of astronomical data

- Optical astronomical data are two-dimensional arrays of pixel values.
- Part of the CCD is not illuminated during an exposure and therefore provides *noisy* zero-level pixel values for that exposure. This part of the CCD is called the *overscan region*.
- You need to estimate one overscan value per column or row (depending on where the overscan region is; see the figure below) and subtract that value from the corresponding column or row of your science data.

Effectively, we need to apply a lower-dimensional array (an overscan column/row) to a higher-dimensional one (the science data). We need to *stretch* the one-dimensional overscan values to a two-dimensional array so that an element-wise subtraction can be done.

This is a typical application of `numpy`-array broadcasting.

![bias](figs/bias.png)

In [13]:
import numpy as np
import numpy.random as nr

# make sure that 'random numbers' are reproducible in the following:
nr.seed(1)
# create fake data with some overscan region (horizontal overscan lines)
data_hor = nr.normal(loc=100, scale=1.0, size=40).reshape(10,4)
overscan_hor = nr.normal(loc=10, scale=1.0, size=8).reshape(2,4)
data_hor[-2:,:] = overscan_hor

print(data_hor)
# perform overscan correction (horizontal case)
ov = data_hor[-2:,:].mean(axis=0)  # one overscan value per column
print(ov)
print(data_hor[:-2] - ov)
print(ov.shape, data_hor.shape)

[[ 101.62434536   99.38824359   99.47182825   98.92703138]
 [ 100.86540763   97.6984613   101.74481176   99.2387931 ]
 [ 100.3190391    99.75062962  101.46210794   97.93985929]
 [  99.6775828    99.61594565  101.13376944   98.90010873]
 [  99.82757179   99.12214158  100.04221375  100.58281521]
 [  98.89938082  101.14472371  100.90159072  100.50249434]
 [ 100.90085595   99.31627214   99.87710977   99.06423057]
 [  99.73211192  100.53035547   99.30833925   99.60324647]
 [   9.80816445    9.11237104    9.25284171   11.6924546 ]
 [  10.05080775    9.36300435   10.19091548   12.10025514]]
[  9.9294861    9.23768769   9.7218786   11.89635487]
[[ 91.69485926  90.15055589  89.74994965  87.03067651]
 [ 90.93592153  88.46077361  92.02293317  87.34243823]
 [ 90.38955299  90.51294193  91.74022934  86.04350442]
 [ 89.74809669  90.37825795  91.41189085  87.00375386]
 [ 89.89808569  89.88445389  90.32033515  88.68646034]
 [ 88.96989472  91.90703602  91.17971213  88.60613947]
 [ 90.97136985  90.078584

In [None]:
import numpy as np
import numpy.random as nr

# create fake data with some overscan region (two vertical overscan columns)
data_ver = nr.normal(loc=100, scale=1.0, size=40).reshape(10,4)
overscan_ver = nr.normal(loc=10, scale=1.0, size=20).reshape(10,2)
data_ver[:,:2] = overscan_ver

print(data_ver)
# perform overscan correction (vertical case)
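
One possible way to complete the vertical case is sketched below. It assumes, as in the cell above, that the overscan occupies the first two columns; reusing `nr.seed(1)` and the name `corrected` are choices made for this sketch only. The per-row overscan values have shape `(10,)`, which is not broadcast-compatible with the `(10, 2)` science data, so a trailing axis is added explicitly.

```python
import numpy as np
import numpy.random as nr

nr.seed(1)  # assumption: reuse the seed of the horizontal example
data_ver = nr.normal(loc=100, scale=1.0, size=40).reshape(10, 4)
overscan_ver = nr.normal(loc=10, scale=1.0, size=20).reshape(10, 2)
data_ver[:, :2] = overscan_ver

# one overscan value per row: mean over the two overscan columns
ov = data_ver[:, :2].mean(axis=1)                  # shape (10,)

# (10, 2) and (10,) are not compatible; (10, 2) and (10, 1) are
corrected = data_ver[:, 2:] - ov[:, np.newaxis]
print(ov.shape, corrected.shape)
```

Calling `mean(axis=1, keepdims=True)` instead would return a `(10, 1)`-array directly, which makes the `np.newaxis` step unnecessary.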

## Formal definition of Broadcasting

- Broadcasting consists of a set of rules that permit *element-wise* operations on arrays of different shapes.
- Element-wise operations on arrays are only valid when the arrays' shapes are either equal or *compatible*.
- To determine whether two shapes are compatible, `numpy` compares their dimensions, starting with the trailing ones and working backwards. If two dimensions are equal, or if one of them equals 1, the comparison continues. If one dimension is 1 and the other is larger than 1, the array with the size-1 dimension is stretched along that dimension to match the other one. Otherwise, a `ValueError` is raised (saying something like "operands could not be broadcast together with shapes ...").
- When one of the shapes runs out of dimensions (because it has fewer dimensions than the other shape), `numpy` uses 1 in the comparison until the other shape's dimensions run out as well.
- Once `numpy` determines that two shapes are compatible, the shape of the result is simply the maximum of the two shapes' sizes in each dimension.
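
The rules above can be sketched as a small pure-Python function. The helper `broadcast_shape` is hypothetical and written only for illustration; it is not how NumPy implements broadcasting internally. It left-pads the shorter shape with 1s and then compares the dimensions pairwise, which is equivalent to comparing from the trailing dimension backwards.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape or raise ValueError (illustrative helper)."""
    ndim = max(len(shape_a), len(shape_b))
    # a shape that runs out of dimensions is left-padded with 1s
    padded_a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for dim_a, dim_b in zip(padded_a, padded_b):
        if dim_a == dim_b or dim_a == 1 or dim_b == 1:
            # keep the dimension that is not the stretched size-1 one
            result.append(dim_b if dim_a == 1 else dim_a)
        else:
            raise ValueError(
                f"shapes {shape_a} and {shape_b} are not compatible")
    return tuple(result)

print(broadcast_shape((5, 5), (5,)))       # (5, 5)
print(broadcast_shape((10, 2), (10, 1)))   # (10, 2)
print(broadcast_shape((8, 1, 6), (7, 1)))  # (8, 7, 6)
```

Calling `broadcast_shape((10,), (5, 5))` raises a `ValueError`, mirroring the failing `a + d` from the introduction.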

**Examples:**
![broadcasting_1](figs/broadcasting_1.png)

**Important:**

The rules above are precise and complete!
- Note that missing dimensions in one array can only be *left-padded*!
- If you need to *right-pad* an array to make it compatible, you need to do this explicitly with `np.newaxis` or a reshape command.

![broadcasting_2](figs/broadcasting_2.png)

In [None]:
import numpy as np

a = np.arange(10).reshape(2, 5)
b = np.arange(2)

# print(a + b) would not work!!
# we need to manually 'right-pad' one dimension to
# 'b' to make it broadcast-compatible with a:
b = b.reshape((2, 1))
# b = b[:, np.newaxis] # equivalent to the reshape command above
print(a, b)
print(a + b)
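
As a quick check of the padding rule, the following sketch contrasts the failing case with the two working ones. It is only an illustration of the rule stated above; the arrays are the same toy arrays as in the previous cell.

```python
import numpy as np

a = np.arange(10).reshape(2, 5)
b = np.arange(2)

try:
    a + b   # (2,) is left-padded to (1, 2); trailing 5 vs. 2 is incompatible
except ValueError as err:
    print("failed as expected:", err)

# after explicit right-padding, (2, 5) and (2, 1) are compatible
print((a + b[:, np.newaxis]).shape)

# left-padding is automatic: (2, 5) and (5,) work without any reshape
print((a + np.arange(5)).shape)
```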