## Week 3 Daily Exercise 4

Create a 2 x 3 x 4 array called `sample` of normally distributed values centered at 10 with a standard deviation of 3.

First, compute the mean `mu` across the 0th dimension (which has a length of 2). Before you write this code, what should the shape of the array of means be? What should the values approximately be?

Second, subtract `mu` from `sample` (this involves implicit broadcasting because the arrays have different shapes). How does the resulting array relate to `sample`?

Third, repeat step 1 but compute the mean `mu2` across the 2nd dimension (which has a length of 4). How should its shape relate to the shape of `mu`?

Finally, repeat step 2 but with `mu2`. This won't work as smoothly as it did with `mu`. Why? Research a way to fix `mu2` so that the broadcast works correctly (or read the solution file; this one is tricky).

In [None]:
import numpy

In [None]:
numpy.random.seed(42)

In [None]:
sample = numpy.random.normal(10, scale=3, size=(2, 3, 4))
print(sample)

## This works

In [None]:
mu = sample.mean(axis=0)
print(mu)
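To answer the questions from step 1, here is a small sketch that checks the shape of `mu` and the spread to expect. Averaging over axis 0 removes that axis, leaving shape (3, 4). Each entry is the mean of two independent draws, so it should be near 10 with a standard deviation of about 3 / sqrt(2) ≈ 2.12. The name `expected_sd` is introduced here only for illustration.

```python
import numpy

numpy.random.seed(42)
sample = numpy.random.normal(10, scale=3, size=(2, 3, 4))

mu = sample.mean(axis=0)

# Reducing over axis 0 drops that axis: (2, 3, 4) -> (3, 4)
print(mu.shape)

# Each mean averages 2 independent draws, so its spread is 3 / sqrt(2)
expected_sd = 3 / numpy.sqrt(2)
print(expected_sd)
```

With only two values per mean, individual entries of `mu` can still land a few units away from 10.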

In [None]:
centered_sample = sample - mu
print(centered_sample)
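One way to see how the centered array relates to `sample` is to check its shape and its mean along axis 0. The sketch below rebuilds the arrays so it can run on its own. It also shows a side effect of having only two entries along axis 0: the two centered slices are equal and opposite.

```python
import numpy

numpy.random.seed(42)
sample = numpy.random.normal(10, scale=3, size=(2, 3, 4))

mu = sample.mean(axis=0)        # shape (3, 4)
centered_sample = sample - mu   # mu is broadcast across axis 0

# Same shape as sample; only the values are shifted
print(centered_sample.shape)

# The mean along axis 0 is now zero, up to floating-point error
print(numpy.allclose(centered_sample.mean(axis=0), 0))

# With two entries along axis 0, the centered slices mirror each other
print(numpy.allclose(centered_sample[0], -centered_sample[1]))
```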

## This doesn't work

In [None]:
mu2 = sample.mean(axis=2)
print(mu2)

In [None]:
# raises ValueError: shapes (2, 3, 4) and (2, 3) cannot be broadcast together
centered_sample = sample - mu2
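The failure comes from how NumPy lines up shapes: it compares them from the right, so the trailing 4 of `sample` is matched against the trailing 3 of `mu2`. The sketch below prints both shapes and catches the error so the cell can run to completion. The flag `broadcast_ok` is a name made up for this illustration.

```python
import numpy

numpy.random.seed(42)
sample = numpy.random.normal(10, scale=3, size=(2, 3, 4))
mu2 = sample.mean(axis=2)

print(sample.shape)  # (2, 3, 4)
print(mu2.shape)     # (2, 3)

# Shapes are aligned from the right: 4 vs 3 do not match and neither is 1
try:
    centered_sample = sample - mu2
    broadcast_ok = True
except ValueError as err:
    broadcast_ok = False
    print("Broadcast failed:", err)
```

The subtraction with `mu` worked because its shape (3, 4) matches the trailing dimensions of (2, 3, 4).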

## This is the fix

Add the missing dimension back into `mu2`, with length 1, so that the broadcast works!

In [None]:
mu2 = sample.mean(axis=2)[:, :, None]
print(mu2)
print(mu2.shape)

In [None]:
centered_sample = sample - mu2
print(centered_sample)

Using `reshape()` would also work. Its downside is that the shape is hard-coded, though this can be avoided, for example by building the shape from `sample.shape` or by passing `keepdims=True` to `mean()`.

In [None]:
mu3 = sample.mean(axis=2).reshape((2,3,1))
print(mu3)
print(mu3.shape)

In [None]:
# these values should be the same as above
centered_sample = sample - mu3
print(centered_sample)
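As a sketch of ways to avoid the hard-coded shape, the cell below compares three alternatives against the `[:, :, None]` fix: `keepdims=True`, `numpy.expand_dims`, and a `reshape` built from `sample.shape`. The variable names (`reference`, `mu_keep`, `mu_expand`, `mu_reshape`) are chosen here for illustration only.

```python
import numpy

numpy.random.seed(42)
sample = numpy.random.normal(10, scale=3, size=(2, 3, 4))

# The fix used above, kept as a reference
reference = sample.mean(axis=2)[:, :, None]

# Option 1: ask mean() to keep the reduced axis with length 1
mu_keep = sample.mean(axis=2, keepdims=True)

# Option 2: insert a new length-1 axis at position 2
mu_expand = numpy.expand_dims(sample.mean(axis=2), axis=2)

# Option 3: reshape using the leading dimensions of sample
mu_reshape = sample.mean(axis=2).reshape(sample.shape[:2] + (1,))

for candidate in (mu_keep, mu_expand, mu_reshape):
    print(candidate.shape, numpy.allclose(candidate, reference))

# After centering, the mean along axis 2 is zero up to floating-point error
centered_sample = sample - mu_keep
print(numpy.allclose(centered_sample.mean(axis=2), 0))
```

Of these, `keepdims=True` is probably the least error-prone, since the axis only has to be named once.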