Intro to numpy broadcasting
-----------------------------

In [1]:
import numpy as np

In [2]:
a = np.ones((3,4))

In [4]:
print(a)

[[ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]]


In [5]:
# the multiplication is "broadcast" over the whole array:
print(a * 3)

[[ 3.  3.  3.  3.]
 [ 3.  3.  3.  3.]
 [ 3.  3.  3.  3.]]


This is the core of "vectorization" -- doing operations over a whole array at the speed of C -- and it is key to numpy performance.

For comparison, regular Python lists and comprehensions:

In [6]:
l = range(10000)

In [7]:
# using regular python list comprehensions
%timeit [i*3 for i in l]

1000 loops, best of 3: 872 µs per loop


Now the numpy way:

In [8]:
a = np.arange(10000) # create an array

In [9]:
%timeit a * 3

The slowest run took 35.82 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 7.24 µs per loop
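
`%timeit` is an IPython magic, so the two timings above only run inside IPython or a notebook. A minimal sketch of the same comparison with the standard-library `timeit` module follows; the names `values`, `arr` and `n_runs` are chosen here for illustration, and the numbers will differ from machine to machine.

```python
import timeit

import numpy as np

values = list(range(10000))   # plain Python list
arr = np.arange(10000)        # numpy array with the same contents

n_runs = 100                  # illustrative repeat count, kept small on purpose

# Average time per call, in seconds, for each approach
list_time = timeit.timeit(lambda: [i * 3 for i in values], number=n_runs) / n_runs
array_time = timeit.timeit(lambda: arr * 3, number=n_runs) / n_runs

print(f"list comprehension: {list_time * 1e6:.1f} µs per loop")
print(f"numpy array:        {array_time * 1e6:.1f} µs per loop")
```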


### in-place operations

One of the primary reasons "augmented assignment" was added to Python
is to support in-place operations:
    `+=`, `*=`, etc.

In [10]:
a = np.arange(10)
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [12]:
a+=3
print(a)

[ 3  4  5  6  7  8  9 10 11 12]


`a` was changed in place -- no new memory allocation, and the loop runs at the speed of C.

So "augmented assignment" operators are not just syntactic sugar!
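
One way to see the difference is to keep a second name bound to the same array and check what happens to it. This is a small sketch, not part of the original notebook; `alias` and the three flag names are made up for the illustration.

```python
import numpy as np

a = np.arange(10)
alias = a                     # second name bound to the same array object

a += 3                        # in-place: writes into the existing buffer
same_object = alias is a      # True: no new array was created
alias_updated = bool(alias[0] == 3)   # True: the alias sees the change

a = a + 3                     # not in-place: builds a new array and rebinds `a`
rebound = alias is not a      # True: `alias` still refers to the old array

print(same_object, alias_updated, rebound)
```

After the last line, `alias` still holds the values 3 through 12, while `a` refers to a freshly allocated array.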

How do you broadcast to multiple dimensions?

In [13]:
x = np.linspace(0,10,4)
y = np.linspace(100,200,3)

x.shape = (1, -1)
y.shape = (-1, 1)

print(x)


[[  0.           3.33333333   6.66666667  10.        ]]


In [14]:
print(y)

[[ 100.]
 [ 150.]
 [ 200.]]


In [15]:
a = np.arange(4)
print(a.shape)
print(x.shape)
print(a * x)

(4,)
(1, 4)
[[  0.           3.33333333  13.33333333  30.        ]]


In [16]:
print(a.shape)
print(y.shape)
print(a * y)

(4,)
(3, 1)
[[   0.  100.  200.  300.]
 [   0.  150.  300.  450.]
 [   0.  200.  400.  600.]]
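
The shapes combine according to numpy's broadcasting rule: shapes are compared starting from the trailing dimension, two dimensions are compatible when they are equal or one of them is 1, and missing leading dimensions are treated as 1. So `(4,)` against `(3, 1)` gives `(3, 4)`. The sketch below checks that with `np.broadcast` and makes the implicit stretching explicit with `np.broadcast_to`; the variable names are illustrative only.

```python
import numpy as np

a = np.arange(4)                              # shape (4,)
y = np.linspace(100, 200, 3).reshape(-1, 1)   # shape (3, 1)

# np.broadcast reports the result shape without computing the product
result_shape = np.broadcast(a, y).shape

# np.broadcast_to returns read-only views stretched to the common shape
a_stretched = np.broadcast_to(a, result_shape)
y_stretched = np.broadcast_to(y, result_shape)

# Multiplying the stretched views gives the same values as plain a * y
same_result = np.array_equal(a * y, a_stretched * y_stretched)

print(result_shape, same_result)
```

The stretched views do not copy the data, which is part of why broadcasting is cheap.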


In [17]:
print(np.sqrt(x**2 * y**2))

[[    0.           333.33333333   666.66666667  1000.        ]
 [    0.           500.          1000.          1500.        ]
 [    0.           666.66666667  1333.33333333  2000.        ]]


In [21]:
x = np.arange(12).reshape((3,4))
y = np.arange(12).reshape((3,4)) * 10
print(x)
print(y)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[[  0  10  20  30]
 [ 40  50  60  70]
 [ 80  90 100 110]]


In [22]:
x + y

array([[  0,  11,  22,  33],
       [ 44,  55,  66,  77],
       [ 88,  99, 110, 121]])

In [23]:
z = np.array([100, 200, 300, 400])

In [24]:
print(z)

[100 200 300 400]


In [25]:
x.shape

(3, 4)

In [26]:
z.shape

(4,)

In [28]:
z * x

array([[   0,  200,  600, 1200],
       [ 400, 1000, 1800, 2800],
       [ 800, 1800, 3000, 4400]])

In [29]:
z = np.array([100, 200, 300])
z.shape = (3,1)
z

array([[100],
       [200],
       [300]])

In [30]:
z * x

array([[   0,  100,  200,  300],
       [ 800, 1000, 1200, 1400],
       [2400, 2700, 3000, 3300]])
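
The reshape to `(3, 1)` matters here. A flat `z` of shape `(3,)` would be lined up against the last dimension of `x`, which has length 4, and the multiplication would fail. A short sketch of that failure and of the `np.newaxis` spelling of the same fix follows; `failed`, `column` and `product` are names chosen for this example.

```python
import numpy as np

x = np.arange(12).reshape((3, 4))
z = np.array([100, 200, 300])      # shape (3,)

# Trailing dimensions are 3 and 4: neither equal nor 1, so this raises.
try:
    z * x
    failed = False
except ValueError:
    failed = True

# A trailing axis of length 1 gives shape (3, 1), which does broadcast.
column = z[:, np.newaxis]
product = column * x

print(failed, column.shape, product.shape)
```

Indexing with `np.newaxis` returns a new view and leaves `z` itself unchanged, whereas assigning to `z.shape` modifies the array in place.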