---
layout: default
title: "Python vectorization"
categories: deeplearning
permalink: /ML27/
order: 27
comments: true
---

In [1]:
%pylab --no-import-all inline
import pandas as pd
plt.rcParams["mathtext.fontset"] = "cm"

Populating the interactive namespace from numpy and matplotlib




# Python vectorization
In the pre-deep-learning era vectorization was optional; in the deep-learning era it is a necessity, because both networks and datasets have grown enormously.

## Vector-vector product
In particular, in deep learning (and in machine learning in general) we need to calculate 

$$
z = w^Tx+b
$$

for 

$$
w =
\begin{bmatrix}
\vdots \\ \vdots
\end{bmatrix} \in \mathbb{R}^{n_x}
\qquad 
x = \begin{bmatrix}
\vdots \\ \vdots
\end{bmatrix} \in \mathbb{R}^{n_x}
$$

The vectorized form of this operation in Python is

In [2]:
w, x, b = np.random.rand(3, 10)

In [3]:
np.dot(w, x) + b

array([2.94551122, 3.67388637, 3.62153802, 2.95427753, 3.40170278,
       2.89747577, 3.37748581, 3.55059818, 2.90561121, 3.68952505])

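where `np.dot(w, x)` $\equiv w^Tx$. Note that `b` is drawn here as a vector of 10 elements rather than a scalar, so the scalar returned by `np.dot(w, x)` is added to each element of `b` and the result is an array of 10 values.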
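
To make the benefit of vectorization concrete, the sketch below computes $z = w^Tx + b$ twice on the same data: once with an explicit Python loop and once with `np.dot`. The array size, the seed, the scalar value of `b` and the helper name `dot_loop` are assumptions chosen for illustration. Timings depend on the machine, so none are quoted here.

```python
import time
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
n_x = 100_000                   # illustrative size, not from the original post
w = rng.random(n_x)
x = rng.random(n_x)
b = 0.5                         # scalar bias (illustrative value)


def dot_loop(w, x):
    """Non-vectorized w^T x: one multiply-add per Python iteration."""
    total = 0.0
    for w_i, x_i in zip(w, x):
        total += w_i * x_i
    return total


start = time.perf_counter()
z_loop = dot_loop(w, x) + b
t_loop = time.perf_counter() - start

start = time.perf_counter()
z_vec = np.dot(w, x) + b
t_vec = time.perf_counter() - start

# Both versions compute the same number; only the running time differs
print(f"loop:       z = {z_loop:.6f}  ({t_loop * 1e3:.2f} ms)")
print(f"vectorized: z = {z_vec:.6f}  ({t_vec * 1e3:.2f} ms)")
```

On typical hardware the vectorized call should be noticeably faster, since the loop over the elements runs in compiled code rather than in the Python interpreter.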

## Matrix-vector product
The same function also computes the matrix-vector product $Av$, where

$$
A = \begin{bmatrix}
\ddots &  \\
&   \\
&  \ddots \\
\end{bmatrix} \in \mathbb{R}^{m \times n} \qquad 
v=\begin{bmatrix}
\vdots \\ \vdots
\end{bmatrix} \in \mathbb{R}^n
$$

In [4]:
v = np.random.rand(10)
A = np.random.rand(3, 10)

In [5]:
np.dot(A, v)

array([2.53630323, 2.18786889, 1.99920726])

Notice that the exact same syntax performs both vector-vector and matrix-vector multiplication. This works because `np.dot` chooses its behavior based on the dimensions of its arguments. To learn more, check out [its documentation](https://numpy.org/doc/stable/reference/generated/numpy.dot.html).
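
As a quick check of how `np.dot` adapts to its arguments, the following sketch prints the shape of the result for three combinations of 1-D and 2-D inputs. The array sizes and the seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
w = rng.random(10)              # 1-D array, shape (10,)
x = rng.random(10)              # 1-D array, shape (10,)
A = rng.random((3, 10))         # 2-D array, shape (3, 10)

inner = np.dot(w, x)            # 1-D with 1-D: inner product, a scalar
mat_vec = np.dot(A, x)          # 2-D with 1-D: matrix-vector product, shape (3,)
mat_mat = np.dot(A, A.T)        # 2-D with 2-D: matrix product, shape (3, 3)

print(np.ndim(inner), mat_vec.shape, mat_mat.shape)

# For these cases the @ operator gives the same results as np.dot
assert np.isclose(w @ x, inner)
assert np.allclose(A @ x, mat_vec)
assert np.allclose(A @ A.T, mat_mat)
```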

## Vectorized element-wise operations
To apply a function element by element to a whole array, you can use ufuncs ([numpy universal functions](https://numpy.org/doc/stable/reference/generated/numpy.ufunc.html#numpy.ufunc)) such as `np.exp` and `np.log`. Arithmetic operators such as `+` and `*` also act element-wise.

In [6]:
v = np.random.rand(10).round(2)

In [7]:
v

array([0.01, 0.38, 0.11, 0.32, 0.73, 0.64, 0.43, 0.89, 0.6 , 0.23])

In [8]:
np.exp(v).round(2)

array([1.01, 1.46, 1.12, 1.38, 2.08, 1.9 , 1.54, 2.44, 1.82, 1.26])

In [9]:
np.log(v).round(2)

array([-4.61, -0.97, -2.21, -1.14, -0.31, -0.45, -0.84, -0.12, -0.51,
       -1.47])

In [10]:
v + 1

array([1.01, 1.38, 1.11, 1.32, 1.73, 1.64, 1.43, 1.89, 1.6 , 1.23])

In [11]:
v * 2

array([0.02, 0.76, 0.22, 0.64, 1.46, 1.28, 0.86, 1.78, 1.2 , 0.46])
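
Element-wise operations can be combined into larger expressions. As a sketch, the block below evaluates the sigmoid function $1 / (1 + e^{-v})$ on a small array, once with a Python loop and once with ufuncs. The choice of sigmoid, the sample values and the shift of `0.3` are illustrative assumptions, not part of the cells above.

```python
import math
import numpy as np

v = np.array([0.01, 0.38, 0.11, 0.32, 0.73])  # illustrative sample values

# Non-vectorized: call math.exp once per element
sig_loop = np.array([1.0 / (1.0 + math.exp(-v_i)) for v_i in v])

# Vectorized: np.exp, the unary minus, + and / all act element-wise
sig_vec = 1.0 / (1.0 + np.exp(-v))

# Binary ufuncs work the same way, e.g. an element-wise maximum with 0
clipped = np.maximum(v - 0.3, 0)

assert np.allclose(sig_loop, sig_vec)
print(sig_vec.round(3))
print(clipped.round(2))
```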

## Broadcasting
For a complete guide to broadcasting, check out [numpy's great documentation](https://numpy.org/doc/stable/user/basics.broadcasting.html).

In [12]:
A = pd.DataFrame([[56, 0, 4.4, 6.8], [1.2, 104, 52, 8], [1.8, 135, 99, 0.9]], 
                        columns=['Apples', 'Beef', 'Eggs', 'Potatoes'], index=['Carb', 'Protein', 'Fat'])
A

| | Apples | Beef | Eggs | Potatoes |
|---|---|---|---|---|
| Carb | 56.0 | 0 | 4.4 | 6.8 |
| Protein | 1.2 | 104 | 52.0 | 8.0 |
| Fat | 1.8 | 135 | 99.0 | 0.9 |


In [13]:
A = A.values
A

array([[ 56. ,   0. ,   4.4,   6.8],
       [  1.2, 104. ,  52. ,   8. ],
       [  1.8, 135. ,  99. ,   0.9]])

In [14]:
cal = A.sum(axis=0)
cal

array([ 59. , 239. , 155.4,  15.7])

In [15]:
(A / cal.reshape(1, 4) * 100)

array([[94.91525424,  0.        ,  2.83140283, 43.31210191],
       [ 2.03389831, 43.51464435, 33.46203346, 50.95541401],
       [ 3.05084746, 56.48535565, 63.70656371,  5.73248408]])

In [16]:
A / cal * 100

array([[94.91525424,  0.        ,  2.83140283, 43.31210191],
       [ 2.03389831, 43.51464435, 33.46203346, 50.95541401],
       [ 3.05084746, 56.48535565, 63.70656371,  5.73248408]])

In general, if you have an $m \times n$ matrix (A):

* if you apply an operation with a $1 \times n$ matrix (B), then B is conceptually copied $m$ times (once per row of A) and the operation is applied element-wise
* if you apply an operation with an $m \times 1$ matrix (C), then C is conceptually copied $n$ times (once per column of A) and the operation is applied element-wise
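
The two rules above can be checked by comparing a broadcast operation with the same operation on explicitly tiled copies. This is a minimal sketch that reuses the values of `A` from the example. The names `B`, `C`, `B_tiled` and `C_tiled` are introduced here for illustration, and `np.tile` only serves to make the copies visible (NumPy itself does not need to materialize them when broadcasting).

```python
import numpy as np

A = np.array([[56.0, 0.0, 4.4, 6.8],
              [1.2, 104.0, 52.0, 8.0],
              [1.8, 135.0, 99.0, 0.9]])  # shape (3, 4), so m = 3 and n = 4

B = A.sum(axis=0, keepdims=True)  # shape (1, 4): one total per column
C = A.sum(axis=1, keepdims=True)  # shape (3, 1): one total per row

# Explicit copies, equivalent to what broadcasting does implicitly
B_tiled = np.tile(B, (3, 1))      # B repeated m = 3 times, shape (3, 4)
C_tiled = np.tile(C, (1, 4))      # C repeated n = 4 times, shape (3, 4)

assert np.allclose(A / B, A / B_tiled)
assert np.allclose(A / C, A / C_tiled)

# Percentage of each column total, as in the example above
print((A / B * 100).round(1))
```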

## numpy Vectors
`numpy` offers great flexibility at the cost of rigor: sometimes wrong-looking expressions give unexpectedly correct results, and vice versa.
Here is a series of considerations and suggestions for dealing with `numpy`.

For example, let's take a random vector of 5 elements

In [19]:
a = np.random.rand(5)
a

array([0.09985692, 0.55150143, 0.05648411, 0.56128704, 0.83764142])

Its shape is

In [21]:
a.shape

(5,)

This is called a rank-1 array in numpy. It is neither a row vector nor a column vector, and its behavior is sometimes unexpected.

For example, its transpose is equal to itself 

In [22]:
a.T

array([0.09985692, 0.55150143, 0.05648411, 0.56128704, 0.83764142])

and `np.dot(a, a.T)` returns a scalar (the inner product), not the matrix you would expect from the outer product $aa^T$

In [23]:
np.dot(a, a.T)

1.3340019714208795

So, instead of rank-1 arrays, you may want to use rank-2 arrays (explicit row or column vectors), which behave much more predictably.

In [26]:
a = np.random.rand(5, 1)
a

array([[0.2285247 ],
       [0.38376194],
       [0.94287005],
       [0.40764731],
       [0.55051624]])

In [27]:
a.T

array([[0.2285247 , 0.38376194, 0.94287005, 0.40764731, 0.55051624]])

In [29]:
np.dot(a, a.T)

array([[0.05222354, 0.08769908, 0.21546909, 0.09315748, 0.12580656],
       [0.08769908, 0.14727323, 0.36183764, 0.15643952, 0.21126718],
       [0.21546909, 0.36183764, 0.88900393, 0.38435844, 0.51906527],
       [0.09315748, 0.15643952, 0.38435844, 0.16617633, 0.22441646],
       [0.12580656, 0.21126718, 0.51906527, 0.22441646, 0.30306813]])

Rank-1 arrays can always be reshaped into row or column vectors (or higher-dimensional arrays)

In [31]:
a = np.random.rand(5)
a

array([0.67024056, 0.40038358, 0.24884655, 0.91545872, 0.42160758])

In [32]:
a.reshape(5, 1)

array([[0.67024056],
       [0.40038358],
       [0.24884655],
       [0.91545872],
       [0.42160758]])

In [33]:
a.reshape(1, 5)

array([[0.67024056, 0.40038358, 0.24884655, 0.91545872, 0.42160758]])
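
One way to guard against rank-1 surprises is to make shapes explicit and check them. The sketch below is a suggestion rather than a rule: the seed and the variable names are illustrative, and `reshape(-1, 1)` and `np.newaxis` are just two of several equivalent ways to add an axis.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
a = rng.random(5)               # rank-1 array, shape (5,)

col = a.reshape(-1, 1)          # column vector, shape (5, 1)
row = a[np.newaxis, :]          # row vector, shape (1, 5)

# Cheap shape checks catch mistakes early
assert col.shape == (5, 1)
assert row.shape == (1, 5)

# Pitfall: mixing a rank-1 array with a column vector broadcasts to a matrix
mixed = a + col
print(mixed.shape)              # (5, 5), not (5,) or (5, 1)

# With explicit column vectors, inner and outer products are unambiguous
outer = np.dot(col, col.T)      # shape (5, 5)
inner = np.dot(col.T, col)      # shape (1, 1)
assert np.isclose(inner[0, 0], np.dot(a, a))
```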