## Vectorizing code ##

It is important to make sure that the code is efficient. This means vectorizing operations where possible instead of relying on for loops.

This is particularly important when doing the exercises, as they are automatically graded. If the code is not efficient, some of the tests used to evaluate your solution might exceed the time limit given by the kernel, in which case you will not get full points.

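Here is an example that computes the same quantity twice: first with nested for loops, then with vectorized NumPy functions.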


In [None]:
def sumProductsWithLoops(x, y):
    """Return the sum of x[i] * y[j] for all pairs of indices i, j. Uses for loops.

    >>> sumProductsWithLoops(np.arange(3000), np.arange(3000))
    20236502250000

    """
    result = 0
    for i in range(len(x)):
        for j in range(len(y)):
            result += x[i] * y[j]
            # print("{} * {} = {}".format(x[i], y[j], x[i] * y[j]))
    return result

In [None]:
def sumProductsVectorized(x, y):
    """Return the sum of x[i] * y[j] for all pairs of indices i, j. Uses vectorized functions.

    >>> sumProductsVectorized(np.arange(3000), np.arange(3000))
    20236502250000

    """
    result = np.sum(x) * np.sum(y)
    return result
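
The vectorized version works because the double sum factors: the sum of x[i] * y[j] over all pairs i, j equals sum(x) * sum(y). The following is a minimal sketch that checks this on small inputs by comparing against the explicit matrix of pairwise products. The names `sumProductsOuter` and `sumProductsFactored` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def sumProductsOuter(x, y):
    """Sum all pairwise products by building the full product matrix."""
    # np.outer(x, y)[i, j] == x[i] * y[j]; this needs len(x) * len(y) memory.
    return np.outer(x, y).sum()


def sumProductsFactored(x, y):
    """Sum all pairwise products using the factored form sum(x) * sum(y)."""
    return np.sum(x) * np.sum(y)


x = np.arange(4)     # [0, 1, 2, 3], sum is 6
y = np.arange(1, 4)  # [1, 2, 3], sum is 6

outer_result = sumProductsOuter(x, y)
factored_result = sumProductsFactored(x, y)
print(outer_result, factored_result)
```

Both approaches avoid Python-level loops, but the factored form never materializes the product matrix, so it is the cheaper choice for large inputs.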

In [None]:
import time
import numpy as np

a = np.arange(5000)
b = np.arange(5000)

t0 = time.time()
print(sumProductsWithLoops(a, b))
t1 = time.time()
print('Time for loops = {:02.2f} seconds.'.format(t1-t0))  # This is too slow.

t0 = time.time()
print(sumProductsVectorized(a, b))
t1 = time.time()
print('Time for vectorized = {:02.2f} seconds.'.format(t1-t0))  # This is fast.
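
A single `time.time()` measurement can be noisy, and the vectorized call is often too fast to register at two decimal places. As a rough sketch, the standard-library `timeit` module can repeat each measurement and report the best run. The functions are redefined here under hypothetical names (`loopVersion`, `vectorizedVersion`) so the cell runs on its own, and the input size of 200 is an arbitrary choice to keep the loop version quick.

```python
import timeit

import numpy as np


def loopVersion(x, y):
    """Sum of x[i] * y[j] over all pairs, using nested for loops."""
    result = 0
    for i in range(len(x)):
        for j in range(len(y)):
            result += x[i] * y[j]
    return result


def vectorizedVersion(x, y):
    """Sum of x[i] * y[j] over all pairs, using vectorized functions."""
    return np.sum(x) * np.sum(y)


a = np.arange(200)
b = np.arange(200)

# Run each function once per measurement, three times, and keep the fastest.
loop_time = min(timeit.repeat(lambda: loopVersion(a, b), number=1, repeat=3))
vec_time = min(timeit.repeat(lambda: vectorizedVersion(a, b), number=1, repeat=3))

print('Best time for loops      = {:.6f} seconds.'.format(loop_time))
print('Best time for vectorized = {:.6f} seconds.'.format(vec_time))
```

Taking the minimum over several repeats reduces the influence of other processes running on the machine.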


In [None]:
def mySolution(x, y):
    """Return the sum of x[i] * y[j] for all pairs of indices i, j. Uses vectorized functions.

    >>> mySolution(np.arange(3000), np.arange(3000))
    20236502250000

    """
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
import numpy as np
import time

a = np.arange(100)
b = np.arange(100)
solution = 24502500
time_tolerance = 10  # in seconds

t0 = time.time()
np.testing.assert_equal(mySolution(a, b), solution)
t1 = time.time()
total = t1 - t0
assert total < time_tolerance, 'Time limit exceeded'

a = np.arange(5000)
b = np.arange(5000)
solution = 156187506250000

t0 = time.time()
np.testing.assert_equal(mySolution(a, b), solution)
t1 = time.time()
total = t1 - t0
assert total < time_tolerance, 'Time limit exceeded'