# Python Vectorization using the numpy library

Welcome to Python and the vectorization made possible by the numpy library. In this assignment we measure how much faster vectorized operations run compared with classic loop-based operations.

**After this assignment you will:**
- Understand how vectorization with the numpy library accelerates computations
- Have a basic working knowledge of numpy functions and numpy matrix/vector operations
- Understand the concept of "broadcasting"
- Be able to vectorize code to achieve better performance

Let's get started!

In [1]:
from clearml import Task

In [None]:
task = Task.init('jhub', 'jhub with git test 2')
task.execute_remotely()

ClearML Task: created new task id=6e3cf1b3ac4c4e97aed1ead3730235b0
ClearML results page: http://20.54.188.124:80/projects/eaf394b98830476d84d93be52e06978d/experiments/6e3cf1b3ac4c4e97aed1ead3730235b0/output/log
2021-03-23 09:34:21,390 - clearml.Task - INFO - Waiting for repository detection and full package requirement analysis
2021-03-23 09:34:21,615 - clearml.Task - INFO - Finished repository detection and package analysis


In [1]:
import numpy as np   # Import numpy library and use alias "np" instead of "numpy" to make the code shorter.
import time          # Import time library to measure time of operations
print("test")
a = np.random.rand(1000000) # Generate a vector consisting of 1000000 random numbers from a uniform distribution over [0, 1)
b = np.random.rand(1000000) # Generate a vector consisting of 1000000 random numbers from a uniform distribution over [0, 1)

vtic = time.time()         # Store the time when the computation is started
dot_vec = np.dot(a,b)     # Compute the dot product of the above two vectors using the vectorized function np.dot
vtoc = time.time()         # Store the time when the computation is finished
print ("dot_vec = " + str(dot_vec))

print("Vectorized dot product computation time: " + str(1000 * (vtoc-vtic)) + "ms\n")

dot_for = 0
ltic = time.time()         # Store the time when the computation is started
for i in range(1000000):  # Compute the dot product of the above two vectors using the classic loop
    dot_for += a[i]*b[i]
ltoc = time.time()         # Store the time when the computation is finished
print ("dot_for = " + str(dot_for))

print("For-looped dot product computation time: " + str(1000 * (ltoc-ltic)) + "ms\n")

### COMPARE VECTORIZED AND LOOP-BASED IMPLEMENTATIONS ###
print ("The vectorized implementation is " + str((ltoc-ltic)/(vtoc-vtic)) + " times faster than the loop-based implementation.")

dot_vec = 249646.19573187083
Vectorized dot product computation time: 2.635955810546875ms

dot_for = 249646.1957318662
For-looped dot product computation time: 745.3939914703369ms

The vectorized implementation is 282.7793958031838 times faster than the loop-based implementation.


What is the difference between the vectorized and non-vectorized versions of this operation?

Now we focus on the small difference between what this notebook calls a list (a rank-1 numpy array of shape (n,)) and a vector (a 2D array of shape (n, 1) or (1, n)).
Notice that transposing a rank-1 array leaves it unchanged, whereas a vector can be transposed.

In [2]:
import numpy as np

print("List of values:")
a = np.random.randn(6)   # generates a rank-1 array ("list") of samples from the standard normal distribution
print(a)
print(a.shape)           # the shape (6,) shows that a is a rank-1 array, not a row or column vector
print(a.T)               # transposing has no effect because a has only one axis
print(np.dot(a,a.T))     # hence this is just the inner product of a with itself (a scalar)

print("Vector of values:")
b = np.random.randn(6,1) # generates a vector (one-column matrix) of samples from the normal distribution
print(b)
print(b.shape)           # the shape (6, 1) shows that b is a one-column matrix (column vector)
print(b.T)               # the vector can be transposed
print(np.dot(b,b.T))     # now we get a matrix as a result of multiplication of the vectors

List of values:
[-0.43236993 -0.65867936 -0.420887    1.26058941  0.07471862  0.11301377]
(6,)
[-0.43236993 -0.65867936 -0.420887    1.26058941  0.07471862  0.11301377]
2.405388764317745
Vector of values:
[[ 0.49683185]
 [ 0.6213976 ]
 [-0.15230618]
 [ 0.66420251]
 [-1.17435026]
 [ 0.13359446]]
(6, 1)
[[ 0.49683185  0.6213976  -0.15230618  0.66420251 -1.17435026  0.13359446]]
[[ 0.24684189  0.30873012 -0.07567056  0.32999696 -0.58345462  0.06637398]
 [ 0.30873012  0.38613498 -0.0946427   0.41273385 -0.72973844  0.08301527]
 [-0.07567056 -0.0946427   0.02319717 -0.10116215  0.17886081 -0.02034726]
 [ 0.32999696  0.41273385 -0.10116215  0.44116498 -0.78000639  0.08873377]
 [-0.58345462 -0.72973844  0.17886081 -0.78000639  1.37909854 -0.15688668]
 [ 0.06637398  0.08301527 -0.02034726  0.08873377 -0.15688668  0.01784748]]


In [3]:
import numpy as np

C=np.random.randn(5,1)  # generates a column-vector of samples from the standard normal distribution
D=np.random.randn(1,5)  # generates a row-vector of samples from the standard normal distribution
print("We define matrices and vectors using (m, n) where m is a number of rows, and n is a number of columns:\n")
print(C)
print("... is a column-vector\n")
print(D)
print("... is a row-vector")

We define matrices and vectors using (m, n) where m is a number of rows, and n is a number of columns:

[[-0.1535463 ]
 [-0.2957405 ]
 [-0.76221853]
 [ 0.76496457]
 [-1.13372388]]
... is a column-vector

[[-1.59202725 -0.88312305  0.43632512 -0.30072669  0.84826458]]
... is a row-vector


In [4]:
import numpy as np

a = np.random.randn(5)    # a rank-1 array ("list") can be reshaped to create a vector
print(a)
print(a.shape)            # the shape attribute holds the dimensions of the array
a = a.reshape((5,1))
print(a)
print(a.shape)

assert(a.shape == (5, 1)) # we can check whether the shape is correct and can continue computations

[ 0.91387436 -1.44334935 -0.89935717 -0.2363411   0.15483172]
(5,)
[[ 0.91387436]
 [-1.44334935]
 [-0.89935717]
 [-0.2363411 ]
 [ 0.15483172]]
(5, 1)
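
As a small aside on the reshape step above, the following sketch (not part of the original assignment) shows two other ways to obtain a column vector from a rank-1 array: `reshape(-1, 1)`, where `-1` lets numpy infer the length, and indexing with `np.newaxis`. The variable names `col_a`, `col_b` and `row` are illustrative only.

```python
import numpy as np

a = np.arange(5.0)            # rank-1 array of shape (5,)

col_a = a.reshape(-1, 1)      # -1 asks numpy to infer the number of rows
col_b = a[:, np.newaxis]      # adds a new axis of length 1 as the second dimension
row = a.reshape(1, -1)        # row vector of shape (1, 5)

print(col_a.shape, col_b.shape, row.shape)

# Checking the shape before continuing, as in the cell above
assert col_a.shape == (5, 1)
```

Both forms avoid hard-coding the length, which can help when the same code has to handle arrays of different sizes.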


In [5]:
import time

x1 = [5, 1, 0, 3, 8, 2, 5, 6, 0, 1, 2, 5, 9, 0, 7]
x2 = [2, 5, 2, 0, 3, 2, 2, 9, 1, 0, 2, 5, 4, 0, 9]

### CLASSIC DOT PRODUCT OF VECTORS IMPLEMENTATION ###
ltic = time.process_time()
dot = 0
for i in range(len(x1)):
    dot+= x1[i] * x2[i]
ltoc = time.process_time()
print ("for-looped dot = " + str(dot) + "\n ----- Computation time = " + str(1000*(ltoc - ltic)) + "ms\n")

### VECTORIZED DOT PRODUCT OF VECTORS ###
vtic = time.process_time()
dot = np.dot(x1,x2)
vtoc = time.process_time()
print ("vectorized dot = " + str(dot) + "\n ----- Computation time = " + str(1000*(vtoc - vtic)) + "ms")

for-looped dot = 235
 ----- Computation time = 0.2900000000001235ms

vectorized dot = 235
 ----- Computation time = 0.38800000000005497ms
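
For inputs this small, a single measurement is dominated by timer resolution and call overhead, so the two numbers above say little about either approach. A minimal sketch of a more stable measurement, assuming the standard-library `timeit` module is acceptable here, repeats each call many times and keeps the best run. The helper `loop_dot` and the repetition count `n_calls` are illustrative choices, not part of the original notebook.

```python
import timeit
import numpy as np

x1 = [5, 1, 0, 3, 8, 2, 5, 6, 0, 1, 2, 5, 9, 0, 7]
x2 = [2, 5, 2, 0, 3, 2, 2, 9, 1, 0, 2, 5, 4, 0, 9]

def loop_dot(u, v):
    """Dot product computed with an explicit Python loop."""
    total = 0
    for i in range(len(u)):
        total += u[i] * v[i]
    return total

# Run each version many times and keep the best of several repeats,
# which reduces the influence of timer resolution and background noise.
n_calls = 10000
loop_time = min(timeit.repeat(lambda: loop_dot(x1, x2), number=n_calls, repeat=3))
vec_time = min(timeit.repeat(lambda: np.dot(x1, x2), number=n_calls, repeat=3))

print("loop:       " + str(1e6 * loop_time / n_calls) + " us per call")
print("vectorized: " + str(1e6 * vec_time / n_calls) + " us per call")
```

One plausible reason the vectorized call does not win on tiny inputs is that `np.dot` first has to convert the Python lists to arrays; the benefit of vectorization shows up as the inputs grow.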


In [6]:
import time

x1 = np.random.rand(100000000)
x2 = np.random.rand(100000000)

### CLASSIC DOT PRODUCT OF VECTORS IMPLEMENTATION ###
ltic = time.process_time()
dot = 0
for i in range(len(x1)):
    dot+= x1[i] * x2[i]
ltoc = time.process_time()
print ("for-looped dot = " + str(dot) + "\n ----- Computation time = " + str(1000*(ltoc - ltic)) + "ms\n")

### VECTORIZED DOT PRODUCT OF VECTORS ###
vtic = time.process_time()
dot = np.dot(x1,x2)
vtoc = time.process_time()
print ("vectorized dot = " + str(dot) + "\n ----- Computation time = " + str(1000*(vtoc - vtic)) + "ms\n")

### COMPARE VECTORIZED AND LOOP-BASED IMPLEMENTATIONS ###
print ("The vectorized implementation is " + str((ltoc-ltic)/(vtoc-vtic)) + " times faster than the loop-based implementation.")

for-looped dot = 25000698.046254445
 ----- Computation time = 154460.65399999998ms

vectorized dot = 25000698.04624013
 ----- Computation time = 416.64800000000923ms

The vectorized implementation is 370.722177953564 times faster than the loop-based implementation.


In [7]:
import time

x1 = [5, 1, 0, 3, 8, 2, 5, 6, 0, 1, 2, 5, 9, 0, 7]
x2 = [2, 5, 2, 0, 3, 2, 2, 9, 1, 0, 2, 5, 4, 0, 9]

### CLASSIC OUTER PRODUCT IMPLEMENTATION ###
ltic = time.process_time()
outer = np.zeros((len(x1),len(x2))) # we create a len(x1)*len(x2) matrix with only zeros
for i in range(len(x1)):
    for j in range(len(x2)):
        outer[i,j] = x1[i] * x2[j]
ltoc = time.process_time()
print ("for-looped outer = " + str(outer) + "\n ----- Computation time = " + str(1000*(ltoc - ltic)) + "ms\n")

### VECTORIZED OUTER PRODUCT ###
vtic = time.process_time()
outer = np.outer(x1,x2)
vtoc = time.process_time()
print ("vectorized outer = " + str(outer) + "\n ----- Computation time = " + str(1000*(vtoc - vtic)) + "ms")

for-looped outer = [[10. 25. 10.  0. 15. 10. 10. 45.  5.  0. 10. 25. 20.  0. 45.]
 [ 2.  5.  2.  0.  3.  2.  2.  9.  1.  0.  2.  5.  4.  0.  9.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 6. 15.  6.  0.  9.  6.  6. 27.  3.  0.  6. 15. 12.  0. 27.]
 [16. 40. 16.  0. 24. 16. 16. 72.  8.  0. 16. 40. 32.  0. 72.]
 [ 4. 10.  4.  0.  6.  4.  4. 18.  2.  0.  4. 10.  8.  0. 18.]
 [10. 25. 10.  0. 15. 10. 10. 45.  5.  0. 10. 25. 20.  0. 45.]
 [12. 30. 12.  0. 18. 12. 12. 54.  6.  0. 12. 30. 24.  0. 54.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 2.  5.  2.  0.  3.  2.  2.  9.  1.  0.  2.  5.  4.  0.  9.]
 [ 4. 10.  4.  0.  6.  4.  4. 18.  2.  0.  4. 10.  8.  0. 18.]
 [10. 25. 10.  0. 15. 10. 10. 45.  5.  0. 10. 25. 20.  0. 45.]
 [18. 45. 18.  0. 27. 18. 18. 81.  9.  0. 18. 45. 36.  0. 81.]
 [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [14. 35. 14.  0. 21. 14. 14. 63.  7.  0. 14. 35. 28.  0. 63.]]
 ----- Computation time = 0.6159999

In [8]:
import time

x1 = np.random.rand(2000)
x2 = np.random.rand(2000)

### CLASSIC OUTER PRODUCT IMPLEMENTATION ###
ltic = time.process_time()
outer = np.zeros((len(x1),len(x2))) # we create a len(x1)*len(x2) matrix with only zeros
for i in range(len(x1)):
    for j in range(len(x2)):
        outer[i,j] = x1[i] * x2[j]
ltoc = time.process_time()
print ("Computation time of the for-looped outer product = " + str(1000*(ltoc - ltic)) + "ms\n")

### VECTORIZED OUTER PRODUCT ###
vtic = time.process_time()
outer = np.outer(x1,x2)
vtoc = time.process_time()
print ("Computation time of the vectorized outer product = " + str(1000*(vtoc - vtic)) + "ms\n")

### COMPARE VECTORIZED AND LOOP-BASED IMPLEMENTATIONS ###
print ("The vectorized implementation is " + str((ltoc-ltic)/(vtoc-vtic)) + " times faster than the loop-based implementation.")

Computation time of the for-looped outer product = 6423.577999999992ms

Computation time of the vectorized outer product = 26.929999999993015ms

The vectorized implementation is 238.5287040475922 times faster than the loop-based implementation.


In [9]:
import time

x1 = [5, 1, 0, 3, 8, 2, 5, 6, 0, 1, 2, 5, 9, 0, 7]
x2 = [2, 5, 2, 0, 3, 2, 2, 9, 1, 0, 2, 5, 4, 0, 9]

### CLASSIC ELEMENTWISE IMPLEMENTATION ###
ltic = time.process_time()
mul = np.zeros(len(x1))
for i in range(len(x1)):
    mul[i] = x1[i] * x2[i]
ltoc = time.process_time()
print ("for-looped elementwise multiplication = " + str(mul) + "\n -- Computation time = " + str(1000*(ltoc-ltic)) + "ms\n")

### VECTORIZED ELEMENTWISE MULTIPLICATION ###
vtic = time.process_time()
mul = np.multiply(x1,x2)
vtoc = time.process_time()
print ("vectorized elementwise multiplication = " + str(mul) + "\n -- Computation time = " + str(1000*(vtoc-vtic)) + "ms\n")

for-looped elementwise multiplication = [10.  5.  0.  0. 24.  4. 10. 54.  0.  0.  4. 25. 36.  0. 63.]
 -- Computation time = 0.0ms

vectorized elementwise multiplication = [10  5  0  0 24  4 10 54  0  0  4 25 36  0 63]
 -- Computation time = 0.0ms



In [9]:
import time

x1 = np.random.rand(10000000)
x2 = np.random.rand(10000000)

### CLASSIC ELEMENTWISE IMPLEMENTATION ###
ltic = time.process_time()
mul = np.zeros(len(x1))
for i in range(len(x1)):
    mul[i] = x1[i] * x2[i]
ltoc = time.process_time()
print ("for-looped elementwise multiplication = " + str(mul) + "\n -- Computation time = " + str(1000*(ltoc-ltic)) + "ms\n")

### VECTORIZED ELEMENTWISE MULTIPLICATION ###
vtic = time.process_time()
mul = np.multiply(x1,x2)
vtoc = time.process_time()
print ("vectorized elementwise multiplication = " + str(mul) + "\n -- Computation time = " + str(1000*(vtoc-vtic)) + "ms\n")

### COMPARE VECTORIZED AND LOOP-BASED IMPLEMENTATIONS ###
print ("The vectorized implementation is " + str((ltoc-ltic)/(vtoc-vtic)) + " times faster than the loop-based implementation.")

for-looped elementwise multiplication = [0.66348001 0.16807259 0.53696384 ... 0.0412303  0.133277   0.4877795 ]
 -- Computation time = 14321.769999999986ms

vectorized elementwise multiplication = [0.66348001 0.16807259 0.53696384 ... 0.0412303  0.133277   0.4877795 ]
 -- Computation time = 60.31999999999016ms

The vectorized implementation is 237.42987400534355 times faster than the loop-based implementation.


In [10]:
import time

x1 = [5, 1, 0, 3, 8, 2, 5, 6, 0, 1, 2, 5, 9, 0, 7]

### CLASSIC GENERAL DOT PRODUCT IMPLEMENTATION ###
W = np.random.rand(3,len(x1)) # Random 3*len(x1) numpy array
ltic = time.process_time()
gdot = np.zeros(W.shape[0])
for i in range(W.shape[0]):
    for j in range(len(x1)):
        gdot[i] += W[i,j] * x1[j]
ltoc = time.process_time()
print ("for-looped gdot = " + str(gdot) + "\n ----- Computation time = " + str(1000*(ltoc - ltic)) + "ms\n")

### VECTORIZED GENERAL DOT PRODUCT ###
vtic = time.process_time()
gdot = np.dot(W,x1)
vtoc = time.process_time()
print ("vectorized gdot = " + str(gdot) + "\n ----- Computation time = " + str(1000*(vtoc - vtic)) + "ms")

for-looped gdot = [23.41846324 37.29901079 33.79428552]
 ----- Computation time = 0.48799999999005195ms

vectorized gdot = [23.41846324 37.29901079 33.79428552]
 ----- Computation time = 1.936000000000604ms


In [12]:
import time

x1 = np.random.rand(2000000)

### CLASSIC GENERAL DOT PRODUCT IMPLEMENTATION ###
W = np.random.rand(30,len(x1)) # Random 30*len(x1) numpy array
ltic = time.process_time()
gdot = np.zeros(W.shape[0])
for i in range(W.shape[0]):
    for j in range(len(x1)):
        gdot[i] += W[i,j] * x1[j]
ltoc = time.process_time()
print ("for-looped gdot = " + str(gdot) + "\n ----- Computation time = " + str(1000*(ltoc - ltic)) + "ms\n")

### VECTORIZED GENERAL DOT PRODUCT ###
vtic = time.process_time()
vgdot = np.dot(W,x1)
vtoc = time.process_time()
print ("vectorized gdot = " + str(vgdot) + "\n ----- Computation time = " + str(1000*(vtoc - vtic)) + "ms\n")

### COMPARE VECTORIZED AND LOOP-BASED IMPLEMENTATIONS ###
print ("The vectorized implementation is " + str((ltoc-ltic)/(vtoc-vtic)) + " times faster than the loop-based implementation.")

for-looped gdot = [500437.37503957 500588.65822974 499997.37729725 500431.96677249
 499990.57606129 500101.13619234 500440.12059663 499942.05914268
 500172.0930138  500390.68438827 500448.98426112 500097.82552791
 499998.0396671  499742.41071095 500061.37584791 500397.58645545
 500078.11552049 500198.59793205 500271.19178775 500528.01391101
 500336.90809614 500269.9702921  500297.29997051 500053.67258239
 500211.98801755 500208.28556697 500396.42538635 499823.80758932
 500409.58407897 500245.64277862]
 ----- Computation time = 27515.625ms

vectorized gdot = [500437.37503957 500588.65822974 499997.37729725 500431.96677249
 499990.57606129 500101.13619234 500440.12059663 499942.05914268
 500172.0930138  500390.68438827 500448.98426112 500097.82552791
 499998.0396671  499742.41071095 500061.37584791 500397.58645545
 500078.11552049 500198.59793205 500271.19178775 500528.01391101
 500336.90809614 500269.9702921  500297.29997051 500053.67258239
 500211.98801755 500208.28556697 500396.425386
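
The general dot product in the cells above is a matrix-vector product. As a hedged sketch on a much smaller problem, the code below checks that the explicit double loop, `np.dot`, and the `@` operator agree. The seeded generator and the sizes are assumptions chosen only to keep the example quick and reproducible. The comparison uses `np.allclose` because loop-based and vectorized sums may differ in the last digits, as the dot product outputs earlier in this notebook show.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded generator, chosen only for reproducibility
W = rng.random((3, 15))          # small 3 x 15 matrix
x = rng.random(15)               # rank-1 array of length 15

# Explicit double loop, as in the cells above
loop_result = np.zeros(W.shape[0])
for i in range(W.shape[0]):
    for j in range(x.shape[0]):
        loop_result[i] += W[i, j] * x[j]

dot_result = np.dot(W, x)   # vectorized matrix-vector product
at_result = W @ x           # the @ operator computes the same product for these operands

print(np.allclose(loop_result, dot_result))
print(np.allclose(dot_result, at_result))
```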

# Python Broadcasting: adding a matrix and a vector

**Write code in which:**
- First, the transposed vector b is broadcast over the rows of the matrix A
- Second, the vector b is broadcast over the columns of the transposed matrix A

For details, read the [broadcasting documentation](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html).

In [18]:
import numpy as np

A = np.random.randn(3,6)   # Definition of the random matrix of 3 rows and 6 columns
b = np.random.randn(6,1)   # Definition of the random column-vector of 6 rows

print ("A = " + str(A) + "\n")
print ("A.T = " + str(A.T) + "\n")
print ("b = " + str(b) + "\n")
print ("b.T = " + str(b.T) + "\n")
print ("A + b.T = " + str(A + b.T) + "\n")
print ("A.T + b = " + str(A.T + b))

A = [[-0.69146698 -1.00407449 -1.17002022 -0.54720112  0.2725857   0.36188234]
 [ 0.94055063  0.40151944 -1.53609868  0.98313012 -1.14942301 -0.9764585 ]
 [-0.4398725   1.29403465 -0.57294572  0.12110486  1.34163127  1.46686881]]

A.T = [[-0.69146698  0.94055063 -0.4398725 ]
 [-1.00407449  0.40151944  1.29403465]
 [-1.17002022 -1.53609868 -0.57294572]
 [-0.54720112  0.98313012  0.12110486]
 [ 0.2725857  -1.14942301  1.34163127]
 [ 0.36188234 -0.9764585   1.46686881]]

b = [[-0.69587528]
 [ 0.72487031]
 [-0.10024606]
 [-0.74431228]
 [ 2.42925773]
 [ 0.20146459]]

b.T = [[-0.69587528  0.72487031 -0.10024606 -0.74431228  2.42925773  0.20146459]]

A + b.T = [[-1.38734226 -0.27920418 -1.27026627 -1.29151341  2.70184343  0.56334693]
 [ 0.24467535  1.12638975 -1.63634473  0.23881784  1.27983472 -0.77499391]
 [-1.13574778  2.01890496 -0.67319178 -0.62320743  3.77088899  1.6683334 ]]

A.T + b = [[-1.38734226  0.24467535 -1.13574778]
 [-0.27920418  1.12638975  2.01890496]
 [-1.27026627 -1.636344
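
To make explicit what broadcasting does in the two sums above, the following sketch replicates the vector by hand with `np.tile` and compares the result with the broadcast sum. It uses small deterministic arrays instead of random ones so the comparison is exact. The names `explicit_rows` and `explicit_cols` are illustrative.

```python
import numpy as np

A = np.arange(18.0).reshape(3, 6)    # 3 x 6 matrix
b = np.arange(6.0).reshape(6, 1)     # 6 x 1 column vector

# A (3, 6) + b.T (1, 6): the single row b.T is reused for each of the 3 rows of A
explicit_rows = np.tile(b.T, (3, 1))   # shape (3, 6), every row is a copy of b.T

# A.T (6, 3) + b (6, 1): the single column b is reused for each of the 3 columns of A.T
explicit_cols = np.tile(b, (1, 3))     # shape (6, 3), every column is a copy of b

print(np.array_equal(A + b.T, A + explicit_rows))
print(np.array_equal(A.T + b, A.T + explicit_cols))
print(np.array_equal((A + b.T).T, A.T + b))   # the two sums are transposes of each other
```

Broadcasting gives the same values without materializing the replicated copies, which is why it is usually preferred over explicit tiling.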

# Reshaping arrays of a given shape

**In deep learning we commonly use two numpy features: [np.shape](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.shape.html) to retrieve the shape (dimensions) of an array, and [np.reshape()](https://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html) to change the shape of arrays (matrices/vectors):**

For example, when 2D images (represented by 3D arrays of shape $(length, height, depth = 3)$) are processed, we usually convert ("unroll") them to 1D vectors of shape $(length*height*3, 1)$.

We implement `image2vector()` that takes an input of shape (length, height, 3) and returns a vector of shape (length\*height\*3, 1):
``` python
v = v.reshape((v.shape[0]*v.shape[1]*v.shape[2], 1)) # v.shape[0] = length; v.shape[1] = height; v.shape[2] = rgbcomponents
```

In [15]:
def image2vector(image):
    v = image.reshape((image.shape[0]*image.shape[1]*image.shape[2]),1)
    return v

In [17]:
# A 3 by 4 by 4 array standing in for a simple image; note that its shape is (3, 4, 4), so here the axis of length 3 comes first rather than last
image = np.array(
      [[[ 11, 12, 13, 14 ],
        [ 15, 16, 17, 18 ],
        [ 21, 22, 23, 24 ],
        [ 25, 26, 27, 28 ]],

       [[ 31, 32, 33, 34 ],
        [ 35, 36, 37, 38 ],
        [ 41, 42, 43, 44 ],
        [ 45, 46, 47, 48 ]],

       [[ 51, 52, 53, 54 ],
        [ 55, 56, 57, 58 ],
        [ 61, 62, 63, 64 ],
        [ 65, 66, 67, 68 ]]])

print ("image2vector(image) = " + str(image2vector(image)))

image2vector(image) = [[11]
 [12]
 [13]
 [14]
 [15]
 [16]
 [17]
 [18]
 [21]
 [22]
 [23]
 [24]
 [25]
 [26]
 [27]
 [28]
 [31]
 [32]
 [33]
 [34]
 [35]
 [36]
 [37]
 [38]
 [41]
 [42]
 [43]
 [44]
 [45]
 [46]
 [47]
 [48]
 [51]
 [52]
 [53]
 [54]
 [55]
 [56]
 [57]
 [58]
 [61]
 [62]
 [63]
 [64]
 [65]
 [66]
 [67]
 [68]]
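
The order of the unrolled values follows from numpy's default row-major (C) layout. The sketch below, which uses a hypothetical tiny array of shape (2, 4, 3) rather than a real image, shows that `reshape(-1, 1)` gives the same kind of column vector without spelling out the product of the dimensions, and that reshaping back recovers the original array.

```python
import numpy as np

image = np.arange(24).reshape(2, 4, 3)   # hypothetical tiny "image" of shape (2, 4, 3)

v = image.reshape(-1, 1)                 # -1 lets numpy compute 2*4*3 = 24
print(v.shape)

# With the default C (row-major) order, the last axis varies fastest,
# so the first three entries are the three channel values of pixel (0, 0).
print(v[:3, 0], image[0, 0, :])

restored = v.reshape(image.shape)        # reshaping back recovers the original array
print(np.array_equal(restored, image))
```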


# Conversion of a list to a vector

**To use the elements of a list in vector operations, we first need to convert the list to a numpy vector:**

In [5]:
import numpy as np   # Import numpy library and use alias "np" instead of "numpy" to make the code shorter.

v1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
v2 = [11, 12, 13, 14, 15, 16, 17, 18, 19]
print(v1)
print (np.shape(v1))
print(v2)
print (np.shape(v2))

v1 = np.array(v1).reshape(len(v1),1)
v2 = np.array(v2).reshape(len(v2),1)
print(v1)
print (np.shape(v1))
print(v2)
print (np.shape(v2))
print (v1.T)
print (np.shape(v1.T))
print(v2.T)
print (np.shape(v2.T))

[1, 2, 3, 4, 5, 6, 7, 8, 9]
(9,)
[11, 12, 13, 14, 15, 16, 17, 18, 19]
(9,)
[[1]
 [2]
 [3]
 [4]
 [5]
 [6]
 [7]
 [8]
 [9]]
(9, 1)
[[11]
 [12]
 [13]
 [14]
 [15]
 [16]
 [17]
 [18]
 [19]]
(9, 1)
[[1 2 3 4 5 6 7 8 9]]
(1, 9)
[[11 12 13 14 15 16 17 18 19]]
(1, 9)
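
A short illustration of why the conversion matters: on plain Python lists, `+` and `*` concatenate and repeat, whereas on numpy arrays they act elementwise. The shortened lists and the names `a1` and `a2` below are illustrative only.

```python
import numpy as np

v1 = [1, 2, 3]
v2 = [11, 12, 13]

print(v1 + v2)          # list concatenation: [1, 2, 3, 11, 12, 13]
print(v1 * 2)           # list repetition: [1, 2, 3, 1, 2, 3]

a1 = np.array(v1)
a2 = np.array(v2)
print(a1 + a2)          # elementwise sum: [12 14 16]
print(a1 * 2)           # elementwise scaling: [2 4 6]
```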
