# Broadcasting

It’s possible to do operations on arrays of different sizes. In some cases NumPy can
transform these arrays automatically so that they behave like same-sized arrays. This conversion is
called **broadcasting**.

![numpy broadcasting in 2D. Copyright: Emmanuelle Gouillart, Didrik Pinte, Gaël Varoquaux, and Pauli Virtanen](numpy_broadcasting.png)

In [None]:
import numpy as np
a = np.array([[0, 10, 20, 30]])
b = np.array([[0, 1, 2]])
print("a:", a)
print("b:", b)

In [None]:
a.T

In [None]:
a.T + b
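
As a rough sketch of what the cells above compute: `a.T` has shape `(4, 1)` and `b` has shape `(1, 3)`, so NumPy treats both as if they were stretched to `(4, 3)` before adding. The explicit stretching with `np.broadcast_to` is shown only for comparison; it is not how you would normally write this.

```python
import numpy as np

a = np.array([[0, 10, 20, 30]])   # shape (1, 4)
b = np.array([[0, 1, 2]])         # shape (1, 3)

# Implicit broadcasting: (4, 1) + (1, 3) -> (4, 3)
result = a.T + b
print(result.shape)
print(result)

# The same sum with both operands stretched explicitly to the common shape
explicit = np.broadcast_to(a.T, (4, 3)) + np.broadcast_to(b, (4, 3))
print(np.array_equal(result, explicit))
```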

##  Exercise: Route 66

*Adapted from [Scipy Lectures](http://www.scipy-lectures.org/intro/numpy/index.html) by Emmanuelle Gouillart, Didrik Pinte, Gaël Varoquaux, and Pauli Virtanen*

Given the mileposts, construct an array of distances (in miles) between the cities of Route 66: Chicago, Springfield, Saint-Louis, Tulsa, Oklahoma City, Amarillo, Santa Fe, Albuquerque, Flagstaff and Los Angeles.

```
mileposts = np.array([[0, 198, 303, 736, 871, 1175, 1475, 1544, 1913, 2448]])
```
![Distances on Route 66](route66.png)

## Broadcasting rules

Broadcasting may seem a bit magical, but it is quite natural to use when the output of a problem is an array with more dimensions than the input data. There is a simple rule that determines whether broadcasting is valid and what shape the broadcast arrays will have:

>  In order to broadcast, the size of the trailing axes for both arrays in an operation must either be the same or one of them must be one. 

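The rule holds for the three additions in the first figure. Shapes are compared axis by axis starting from the last one, and a missing leading axis is treated as having size one: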

```
a:      4 x 3     a:      4 x 3      a:      4 x 1
b:      4 x 3     b:          3      b:          3
result: 4 x 3     result: 4 x 3      result: 4 x 3
```
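
The rule can also be written as a few lines of Python. The sketch below uses a hypothetical helper, `broadcast_shape` (not part of NumPy), which applies the rule to two shape tuples and compares its prediction with the shape NumPy actually produces for the three cases above.

```python
import numpy as np
from itertools import zip_longest

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the trailing-axes rule to two shapes."""
    result = []
    # Walk both shapes from the last axis backwards;
    # an axis missing from the shorter shape counts as size 1.
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    for dim_a, dim_b in pairs:
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(reversed(result))

for shape_a, shape_b in [((4, 3), (4, 3)), ((4, 3), (3,)), ((4, 1), (3,))]:
    predicted = broadcast_shape(shape_a, shape_b)
    actual = (np.zeros(shape_a) + np.zeros(shape_b)).shape
    print(shape_a, "+", shape_b, "->", predicted, "NumPy:", actual)
```

Shapes that break the rule, such as `(4, 3)` and `(4,)`, make the helper raise `ValueError`, just as NumPy does for the corresponding addition.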

Let's look at another example:

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
image = plt.imread('lena.jpg')
plt.imshow(image)

We want to give the image a red tint by keeping the red channel and scaling down the green and blue channels:

In [None]:
scale = np.array([1., 0.6, 0.6])

In [None]:
print(image.shape)
print(scale.shape)

In [None]:
scaled = scale * image
print(scaled.shape)

In [None]:
plt.imshow(scaled.astype(np.uint8))

```
image  (3d array): 512 x 512 x 3
scale  (1d array):             3
scaled (3d array): 512 x 512 x 3
```
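
If the image file is not at hand, the same broadcasting can be tried on a synthetic array. This is a minimal sketch that assumes a uniform grey `uint8` RGB image as a stand-in for the photo. It also shows that multiplying by a float array yields a float result, which is why the cell above converts back with `astype(np.uint8)` before plotting.

```python
import numpy as np

# Stand-in for the photo (assumption): a uniform grey uint8 RGB image
image = np.full((512, 512, 3), 200, dtype=np.uint8)
scale = np.array([1., 0.6, 0.6])

# scale has shape (3,), so it is matched against the last axis of image
scaled = scale * image
print(scaled.shape, scaled.dtype)

# One pixel: red is unchanged, green and blue are reduced to 60 %
print(scaled[0, 0])
```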

## Quiz

What is the shape of the `result` array?
```
A = np.random.rand(8, 1, 6, 1)
B = np.random.rand(7, 1, 5)
result = A + B
```



```
A      (4d array):  8 x 1 x 6 x 1
B      (3d array):      7 x 1 x 5
Result (?  array):              ?
```

## `np.newaxis`

You can control broadcasting by inserting singleton dimensions (axes of size one) with `np.newaxis`. For example, to convert a 1-dimensional array to a 2-dimensional one:

In [None]:
x = np.arange(3)
x.shape

In [None]:
x[:, np.newaxis].shape
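
To illustrate how the position of `np.newaxis` steers broadcasting, the sketch below turns the same 1-dimensional array into a column and into a row, then adds the two to get a table of all pairwise sums.

```python
import numpy as np

x = np.arange(3)             # shape (3,)
column = x[:, np.newaxis]    # shape (3, 1)
row = x[np.newaxis, :]       # shape (1, 3)
print(column.shape, row.shape)

# (3, 1) + (1, 3) broadcasts to (3, 3): table[i, j] == x[i] + x[j]
table = column + row
print(table)
```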

## Exercise: `np.newaxis`
Insert a single `np.newaxis` so that this code works:


```python
x = np.arange(8).reshape(4, 2)
y = np.arange(4)
x + y
```

## Exercise: Normalising data

Given the following array:

```
a = np.array([[2, 3, 1], [4, 1, 1]])
```

From each column of `a`, subtract that column's mean (computed across rows). Next, from each row, subtract that row's mean (computed across columns).

## Quiz: Broadcasting rules
 
Given the arrays:
```
X = np.random.rand(10,3)
Y = np.random.rand(3)
```

Which of the following will *not* produce an error? What will be the shapes of the final broadcast arrays?
 
a) `X + Y`

b) `X[np.newaxis, :] + Y`

c) `X + Y[:, np.newaxis]`

d) `X[:, np.newaxis] + Y`
 
e) `X + Y[np.newaxis, :]`

f) `X[:, np.newaxis, :] + Y`


# Extra problems


> ## Broadcasting indices

> Predict and verify the shape of `y`:
> 
> ```python
> x = np.empty((10, 8, 6))
> 
> idx0 = np.zeros((3, 8)).astype(int)
> idx1 = np.zeros((3, 1)).astype(int)
> idx2 = np.zeros((1, 1)).astype(int)
> 
> y = x[idx0, idx1, idx2]
> ```

> ## Distances
> 
> Given an array of longitudes and latitudes of major European capitals (in degrees, one row per city, longitude first), calculate the pairwise distances between them. Use the approximate formula: 
>
> $$D=6371.009\sqrt{(\Delta\phi)^2 + (\Delta\lambda)^2}\qquad \text{(in kilometers)},$$
>
> where $\Delta\phi=\phi_1-\phi_2$ and $\Delta\lambda=\lambda_1-\lambda_2$ are the differences between the latitudes and longitudes of the two cities in radians. (*Hint*: To convert degrees to radians, multiply them by $\pi/180$.)
> ```
> coords = np.array([
>                   [ 23.71666667,  37.96666667], # Athens
>                   [ 13.38333333,  52.51666667], # Berlin
>                   [ -0.1275    ,  51.50722222], # London
>                   [ -3.71666667,  40.38333333], # Madrid
>                   [  2.3508    ,  48.8567    ], # Paris
>                   [ 12.5       ,  41.9       ]  # Rome
>                   ])
> ```
> When you are done you can compare the results with a more [precise formula](https://en.wikipedia.org/wiki/Geographical_distance#Spherical_Earth_projected_to_a_plane):
>
> $$D=6371.009\sqrt{(\Delta\phi)^2 + (\cos(\phi_m)\Delta\lambda)^2}$$
>
> where $\phi_m = (\phi_1+\phi_2) / 2$ is the mean latitude.

> ## Exercise: Creating a two-dimensional grid
> 
> What are the shapes of `x`, `y` and `z` in the following two cases:
>
> ```
> x, y = np.mgrid[:10, :5]
> z = x + y
> ```
> 
> and 
> 
> ```
> x, y = np.ogrid[:10, :5]
> z = x + y
> ```
> 
> What might be the advantage of using `np.ogrid` over `np.mgrid`?




## Further reading

* NumPy docs, http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html