# Numpy and Vectorization

NumPy arrays make linear algebra efficient in two ways: the code is shorter to write, and the computation typically runs much faster than an equivalent pure-Python loop.

In [1]:
import numpy as np
import time

NumPy adds numeric types, vectors, matrices, and many matrix functions that base Python does not provide.

Python arithmetic operators work on NumPy data types, and many NumPy functions accept Python data types.

## Vectors
- Vectors are ordered arrays of numbers. 
- The elements of a vector are all the same type.
- The number of elements in the array is referred to as its *dimension* or *rank*, written $n$.
- The index is the position of an element in the vector. In Python, indexing starts at zero, so the elements run from $x_0$ to $x_{n-1}$.


## Arrays
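NumPy's basic data structure is an indexable n-dimensional array containing elements of the same dtype.
Here, unlike for vectors above, dimension refers to the number of indexes of an array: a 1-D array (a vector) takes one index, and a 2-D array (a matrix) takes two.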
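
As a small illustration of the two meanings of "dimension", the sketch below inspects the `ndim`, `shape`, and `dtype` attributes of a 1-D and a 2-D array. The variable names `vec` and `mat` are only illustrative.

```python
import numpy as np

vec = np.array([1.0, 2.0, 3.0])          # 1-D array: one index, vec[i]
mat = np.array([[1, 2, 3], [4, 5, 6]])   # 2-D array: two indexes, mat[i, j]

# ndim is the number of indexes; shape gives the length along each index
print(f"vec.ndim = {vec.ndim}, vec.shape = {vec.shape}, vec.dtype = {vec.dtype}")
print(f"mat.ndim = {mat.ndim}, mat.shape = {mat.shape}")
```

Here `vec` has three elements (dimension 3 in the vector sense) but `vec.ndim` is 1, because a single index is enough to address any element.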

## Vector creation
NumPy data creation routines generally take the shape of the object as their first parameter. The shape can be a single value for a 1-D result or a tuple such as `(m, n)` for a multi-dimensional result.


In [3]:
# Produces arrays of 0's, useful for building a data container
a = np.zeros(4); print(f"np.zeros(4) :     a = {a}, a shape = {a.shape}")
a = np.zeros((4,)); print(f"np.zeros((4,)) :  a = {a}, a shape = {a.shape}")
a = np.zeros((4,4)); print(f"np.zeros((4,4)) : a = {a}, a shape = {a.shape}")
# Generate random floats in the half-open interval [0, 1)
a = np.random.random_sample(4); print(f"np.random.random_sample(4): a = {a}, a shape = {a.shape}")

np.zeros(4) :     a = [0. 0. 0. 0.], a shape = (4,)
np.zeros((4,)) :  a = [0. 0. 0. 0.], a shape = (4,)
np.zeros((4,4)) : a = [[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]], a shape = (4, 4)
np.random.random_sample(4): a = [0.81279037 0.28287954 0.746396   0.33818884], a shape = (4,)


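Some data creation routines do not take a shape tuple. `np.arange` takes scalar start, stop, and step values, and `np.random.rand` takes each dimension as a separate integer argument.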

In [12]:
a = np.arange(10.); print(f"np.arange(10.):       a = {a}, a shape = {a.shape}")
a = np.random.rand(10,2); print(f"np.random.rand(10,2): a = {a}, a shape = {a.shape}")

np.arange(10.):       a = [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.], a shape = (10,)
np.random.rand(10,2): a = [[0.19104916 0.3685134 ]
 [0.28215562 0.49664503]
 [0.23652574 0.68018303]
 [0.70030055 0.55624855]
 [0.78327524 0.75445803]
 [0.75601888 0.0471009 ]
 [0.30810343 0.02222786]
 [0.93980188 0.95539408]
 [0.80703379 0.26591955]
 [0.28693882 0.01414412]], a shape = (10, 2)
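
To make the difference in calling conventions concrete, the sketch below creates three random arrays of the same shape. It also uses `np.random.default_rng`, which is not used elsewhere in these notes and is shown only as an alternative; the seed value is arbitrary.

```python
import numpy as np

# Shape passed as one tuple argument
a = np.random.random_sample((3, 2))

# Shape passed as separate integer arguments
b = np.random.rand(3, 2)

# Generator API: shape passed as a tuple; a fixed seed makes runs repeatable
rng = np.random.default_rng(seed=0)
c = rng.random((3, 2))

print(f"a.shape = {a.shape}, b.shape = {b.shape}, c.shape = {c.shape}")
```

All three produce floats in the interval [0, 1) with shape `(3, 2)`; only the way the shape is supplied differs.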


Arrays can also be specified manually. Note that if any one of the values contains a decimal point, NumPy stores the whole array as a float type.

In [13]:
a = np.array([5,4,3,2]);  print(f"np.array([5,4,3,2]):  a = {a},     a shape = {a.shape}, a data type = {a.dtype}")
a = np.array([5.,4,3,2]); print(f"np.array([5.,4,3,2]): a = {a}, a shape = {a.shape}, a data type = {a.dtype}")

np.array([5,4,3,2]):  a = [5 4 3 2],     a shape = (4,), a data type = int32
np.array([5.,4,3,2]): a = [5. 4. 3. 2.], a shape = (4,), a data type = float64
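
The data type does not have to be inferred from the values. As a minimal sketch, the `dtype` argument and the `astype` method can set it explicitly; the specific values below are only examples.

```python
import numpy as np

# Integer literals stored as floats by requesting the dtype explicitly
a = np.array([5, 4, 3, 2], dtype=np.float64)

# Converting floats to integers drops the fractional part (truncation toward zero)
b = np.array([5.7, 4.2, 3.9]).astype(np.int64)

# Mixing an integer array with a float scalar gives a float result
c = np.array([1, 2, 3]) + 0.5

print(f"a = {a}, dtype = {a.dtype}")
print(f"b = {b}, dtype = {b.dtype}")
print(f"c = {c}, dtype = {c.dtype}")
```

Setting the dtype explicitly also avoids platform differences in the default integer type, such as the `int32` shown in the output above.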


## Operations

### Indexing
NumPy arrays can be indexed and sliced much like Python lists.

To access a single element, use `array[i]`.

In [19]:
# Vector indexing operations on a 1 D vector
a = np.arange(10)
print(a)

# Access an element
print(f"a[2]  = {a[2]}, a[0] = {a[0]}, Accessing an element returns a scalar")

# negative indexes count from the end
print(f"a[-1] = {a[-1]}, a[-2] = {a[-2]}")  

# If an index is out of range, it will produce an error.
try:
    c = a[10]
except Exception as e:
    print("a[10] = The error message you'll see is:")
    print(e)

[0 1 2 3 4 5 6 7 8 9]
a[2]  = 2, a[0] = 0, Accessing an element returns a scalar
a[-1] = 9, a[-2] = 8
a[10] = The error message you'll see is:
index 10 is out of bounds for axis 0 with size 10


### Slicing
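Slicing selects a subset of elements using a set of three values: (**`start:stop:step`**). The `stop` index is exclusive, and any of the three values can be omitted. Basic slicing returns a view of the original data rather than a copy.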

In [22]:
#vector slicing operations
a = np.arange(10)
print(f"a         = {a}")

# Access 5 consecutive elements: indexes 2 through 6 (stop index 7 is excluded)
c = a[2:7:1];  print("a[2:7:1] = ", c)

# Access every second element in the same range
c = a[2:7:2];     print("a[2:7:2] = ", c)

# Access all elements from index 3 (the fourth element) onward
c = a[3:];        print("a[3:]    = ", c)

# Access all elements below index 3
c = a[:3];        print("a[:3]    = ", c)

# Access all elements
c = a[:];         print("a[:]     = ", c)

a         = [0 1 2 3 4 5 6 7 8 9]
a[2:7:1] =  [2 3 4 5 6]
a[2:7:2] =  [2 4 6]
a[3:]    =  [3 4 5 6 7 8 9]
a[:3]    =  [0 1 2]
a[:]     =  [0 1 2 3 4 5 6 7 8 9]
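
One detail worth checking when slicing: a basic NumPy slice shares memory with the array it came from, unlike a slice of a Python list. The sketch below shows the effect and how `.copy()` avoids it. The names `sl_view` and `sl_copy` are illustrative.

```python
import numpy as np

a = np.arange(10)

sl_view = a[2:5]          # basic slice: shares memory with a
sl_copy = a[2:5].copy()   # independent copy of the same elements

sl_view[0] = 100          # also changes a[2]
sl_copy[1] = -1           # leaves a untouched

print(f"a       = {a}")
print(f"shares memory with view: {np.shares_memory(a, sl_view)}")
print(f"shares memory with copy: {np.shares_memory(a, sl_copy)}")
```

If a slice will be modified and the original must stay intact, calling `.copy()` on the slice is the safer choice.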


### Single vector operations

In [23]:
a = np.array([1,2,3,4])
print(f"a             : {a}")

# Negate elements of a
b = -a
print(f"b = -a      :  {b}")

# Sum of all elements
b = np.sum(a)
print(f"b = np.sum(a) : {b}")

# Mean of all elements
b = np.mean(a)
print(f"b = np.mean(a): {b}")

# Square of all elements
b = a**2
print(f"b = a**2     :  {b}")

# square root of all elements
b = np.sqrt(a)
print(f"b = np.sqrt(a): {b}")



a             : [1 2 3 4]
b = -a      :  [-1 -2 -3 -4]
b = np.sum(a) : 10
b = np.mean(a): 2.5
b = a**2     :  [ 1  4  9 16]
b = np.sqrt(a): [1.         1.41421356 1.73205081 2.        ]


### Vector - Vector operations (elements)

NumPy's arithmetic, logical, and comparison operations also apply to vectors. These operators work on an element-by-element basis, so the two vectors must be the same size.

In [26]:
a = np.array([ 1, 2, 3, 4])
b = np.array([-1,-2, 3, 4])
print(f"Binary operators work element wise: ")
print(f"a + b ={a + b}")
print(f"a <= b ={a <= b}")
print(f"a >= b ={a >= b}")
# exponential and log functions
print(f"np.exp(a) = {np.exp(a)}")
print(f"np.log(a) = {np.log(a)}")


Binary operators work element wise: 
a + b =[0 0 6 8]
a <= b =[False False  True  True]
a >= b =[ True  True  True  True]
np.exp(a) = [ 2.71828183  7.3890561  20.08553692 54.59815003]
np.log(a) = [0.         0.69314718 1.09861229 1.38629436]
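
To see what happens when the sizes do not match, the sketch below adds a 4-element vector to a 2-element vector and catches the resulting error. The variable `mismatch_error` is only there to hold the exception for inspection.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
c = np.array([1, 2])

mismatch_error = None
try:
    d = a + c          # sizes 4 and 2 cannot be combined element-wise
except ValueError as e:
    mismatch_error = e

print("a + c raised an error:")
print(mismatch_error)
```

NumPy reports the two shapes in the error message, which usually makes this kind of mismatch quick to locate.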


### Scalar vector operations
Vectors can be scaled by scalar values (numbers). The scalar multiplies all the elements of the vector.

In [27]:
a = np.array([1,2,3,4])
b = 5 * a
print(f"b = 5 * a : {b}")

b = 5 * a : [ 5 10 15 20]


### Dot product
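The dot product is a scalar obtained by multiplying two vectors element by element and summing the results. The two vectors must have the same number of elements.

Geometrically, it equals the length of one vector multiplied by the signed length of the projection of the other vector onto it. Practically, it condenses two vectors into a single number that still carries information about how closely they are aligned.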

$$ x = \sum_{i=0}^{n-1} a_i b_i $$
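
The link between the dot product and projection can be checked numerically. The sketch below uses the identity $a \cdot b = \|a\| \|b\| \cos\theta$ together with `np.linalg.norm`; the names `cos_theta` and `proj_a_on_b` are illustrative and not part of the original notes.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([-1.0, 4.0, 3.0, 2.0])

dot_ab = np.dot(a, b)
norm_a = np.linalg.norm(a)   # Euclidean length of a
norm_b = np.linalg.norm(b)   # Euclidean length of b

# Cosine of the angle between the two vectors
cos_theta = dot_ab / (norm_a * norm_b)

# Signed length of the projection of a onto the direction of b
proj_a_on_b = dot_ab / norm_b

print(f"a . b = {dot_ab}")
print(f"cos(theta) = {cos_theta:.4f}")
print(f"projection of a onto b = {proj_a_on_b:.4f}")
```

Multiplying `proj_a_on_b` by `norm_b` recovers the dot product, which is the geometric statement made above.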

In [31]:
# Implementation with a for loop
a = np.array([1, 2, 3, 4])
b = np.array([-1, 4, 3, 2])

def dot(a, b):
    x = 0
    for i in range(a.shape[0]):
        x += a[i] * b[i]
    return x

print(f"dot(a,b) = {dot(a,b)}")


dot(a,b) = 24


To use a vectorized approach, use `np.dot(a, b)`.

In [32]:

c = np.dot(b, a)
print(f"NumPy 1-D np.dot(b, a) = {c}, np.dot(b, a).shape = {c.shape} ")

c = np.dot(a, b)
print(f"NumPy 1-D np.dot(a, b) = {c}, np.dot(a, b).shape = {c.shape} ") 


NumPy 1-D np.dot(b, a) = 24, np.dot(b, a).shape = () 
NumPy 1-D np.dot(a, b) = 24, np.dot(a, b).shape = () 
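
The speed benefit of vectorization can be measured with the `time` module imported at the top. The sketch below is a rough comparison, not a rigorous benchmark: the array length, the seed, and the helper name `loop_dot` are assumptions, and the timings will vary by machine.

```python
import time
import numpy as np

def loop_dot(a, b):
    """Dot product computed one element at a time in a Python loop."""
    x = 0.0
    for i in range(a.shape[0]):
        x += a[i] * b[i]
    return x

rng = np.random.default_rng(seed=1)
a = rng.random(200_000)
b = rng.random(200_000)

tic = time.perf_counter()
c_vec = np.dot(a, b)
vec_ms = 1000 * (time.perf_counter() - tic)

tic = time.perf_counter()
c_loop = loop_dot(a, b)
loop_ms = 1000 * (time.perf_counter() - tic)

print(f"np.dot   = {c_vec:.4f}, time = {vec_ms:.4f} ms")
print(f"loop_dot = {c_loop:.4f}, time = {loop_ms:.4f} ms")
```

Both versions return the same value up to floating-point rounding; on typical hardware the `np.dot` call is expected to be much faster, since it avoids the per-element overhead of the Python interpreter.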


### Vector Vector operations

These operations appear frequently in machine learning. In the first course (supervised learning), the common setup is:

m = number of observations (examples), n = number of features (columns)

X is a 2-dimensional array of shape (m, n)

w is a 1-dimensional vector of shape (n,)

In [33]:
X = np.array([[1],[2],[3],[4]])
w = np.array([2])
c = np.dot(X[1], w)

print(f"X[1] has shape {X[1].shape}")
print(f"w has shape {w.shape}")
print(f"c has shape {c.shape}")

X[1] has shape (1,)
w has shape (1,)
c has shape ()
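
The same idea extends from one example to all of them. As a sketch with made-up values, the block below uses a matrix with m = 3 examples and n = 2 features and computes the dot product of every row with `w` in a single call. The name `all_rows` is illustrative.

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])    # shape (m, n) = (3, 2)
w = np.array([0.5, -1.0])     # shape (n,) = (2,)

# One example at a time: a (2,) vector dotted with a (2,) vector gives a scalar
one_row = np.dot(X[1], w)

# All examples at once: (3, 2) times (2,) gives a vector of shape (3,)
all_rows = np.dot(X, w)

print(f"one_row = {one_row}, shape = {one_row.shape}")
print(f"all_rows = {all_rows}, shape = {all_rows.shape}")
```

Element `all_rows[1]` equals `one_row`, so the matrix form simply performs the per-example dot product for every row.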


## Matrices

Matrices are 2-D arrays. They are denoted with a capital, bold letter such as $\mathbf{X}$. Here $m$ is the number of rows (observations) and $n$ is the number of columns (features). The elements of a matrix can be referenced with a two-dimensional index. Row indexes run from 0 to $m-1$ and column indexes from 0 to $n-1$.
![image.png](attachment:image.png)

### Numpy arrays
2-D matrices are used to hold training data. Training data is m examples by n features.

One can extract an example as a vector and operate on that.

Each nested bracket is a row, and each number in these brackets is a column of that row.

In [35]:
# NumPy routines which allocate memory and fill with user specified values
a = np.array([[5,4,3], [4,3,2], [3,2,1]]);   print(f" a shape = {a.shape}, np.array: a = {a}")
a = np.array([[5],   # One can also
              [4],   # separate values
              [3]]); # into separate rows
print(f" a shape = {a.shape}, np.array: a = {a}")

 a shape = (3, 3), np.array: a = [[5 4 3]
 [4 3 2]
 [3 2 1]]
 a shape = (3, 1), np.array: a = [[5]
 [4]
 [3]]


### Indexing

Matrices take a second index. Specifying only one index returns a whole row as a 1-D vector.

In [39]:
a = np.arange(6).reshape(-1,2) # Reshaping rearranges the vector into matrix form
print(f"a.shape: {a.shape}, \na= {a}")

#access an element, [row, column]
print(f"\n  a[2,0] = {a[2, 0]}, Accessing an element returns a scalar\n ")

#access a row
print(f"a[2].shape:   {a[2].shape}, a[2]   = {a[2]}, type(a[2])   = {type(a[2])}")

a.shape: (3, 2), 
a= [[0 1]
 [2 3]
 [4 5]]

  a[2,0] = 4, Accessing an element returns a scalar
 
a[2].shape:   (2,), a[2]   = [4 5], type(a[2])   = <class 'numpy.ndarray'>
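
Columns can be extracted in a similar way by placing `:` in the row position. The sketch below shows two variants: indexing the column with an integer gives a 1-D array, while indexing it with a length-one slice keeps the result 2-D. The names `col` and `col_2d` are illustrative.

```python
import numpy as np

a = np.arange(6).reshape(-1, 2)   # [[0 1], [2 3], [4 5]]

col = a[:, 1]        # integer column index: 1-D array of shape (3,)
col_2d = a[:, 1:2]   # slice column index: 2-D array of shape (3, 1)

print(f"a[:, 1]   = {col}, shape = {col.shape}")
print(f"a[:, 1:2] shape = {col_2d.shape}")
```

The 2-D form is useful when a later operation expects a column matrix rather than a flat vector.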


### Reshape
As shown in the previous example, arrays can be reshaped. If one argument is set to `-1`, the routine computes that dimension from the size of the array and the other dimensions. In `reshape(-1, 2)`, for example, it computes the number of rows needed for two columns.
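
A short sketch of how `-1` behaves, using a 12-element vector chosen for illustration. The last case shows that the requested shape must divide the number of elements evenly; `reshape_error` is a hypothetical name used to hold the exception.

```python
import numpy as np

v = np.arange(12)

a = v.reshape(-1, 4)   # rows computed: 12 / 4 = 3, so shape (3, 4)
b = v.reshape(2, -1)   # columns computed: 12 / 2 = 6, so shape (2, 6)

reshape_error = None
try:
    v.reshape(-1, 5)   # 12 elements cannot fill rows of 5 columns
except ValueError as e:
    reshape_error = e

print(f"a.shape = {a.shape}, b.shape = {b.shape}")
print(f"reshape(-1, 5) failed: {reshape_error}")
```

Only one dimension can be given as `-1`, since NumPy needs the others to work out the missing value.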

### Slicing


In [40]:
#vector 2-D slicing operations
a = np.arange(20).reshape(-1, 10)
print(f"a = \n{a}")

a = 
[[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]]


In [44]:

# Access 5 consecutive elements in the first row (start:stop:step)
print("a[0, 2:7:1] = ", a[0, 2:7:1], ",  a[0, 2:7:1].shape =", a[0, 2:7:1].shape, "a 1-D array")


#access 5 consecutive elements (start:stop:step) in two rows
print("a[:, 2:7:1] = \n", a[:, 2:7:1], ",  a[:, 2:7:1].shape =", a[:, 2:7:1].shape, "a 2-D array")

# access all elements
print("a[:,:] = \n", a[:,:], ",  a[:,:].shape =", a[:,:].shape)


# access all elements in one row (very common usage)
print("a[1,:] = ", a[1,:], ",  a[1,:].shape =", a[1,:].shape, "a 1-D array")
# same as
print("a[1]   = ", a[1],   ",  a[1].shape   =", a[1].shape, "a 1-D array")


a[0, 2:7:1] =  [2 3 4 5 6] ,  a[0, 2:7:1].shape = (5,) a 1-D array
a[:, 2:7:1] = 
 [[ 2  3  4  5  6]
 [12 13 14 15 16]] ,  a[:, 2:7:1].shape = (2, 5) a 2-D array
a[:,:] = 
 [[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]] ,  a[:,:].shape = (2, 10)
a[1,:] =  [10 11 12 13 14 15 16 17 18 19] ,  a[1,:].shape = (10,) a 1-D array
a[1]   =  [10 11 12 13 14 15 16 17 18 19] ,  a[1].shape   = (10,) a 1-D array
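
Row and column slices can be combined freely. As a further sketch on the same 2 x 10 array, the block below uses a step, a sub-block, and negative indexes; the variable names are illustrative.

```python
import numpy as np

a = np.arange(20).reshape(-1, 10)

every_fifth = a[:, ::5]   # columns 0 and 5 of every row
block = a[0:2, 3:6]       # rows 0 to 1, columns 3 to 5
last_cols = a[:, -2:]     # last two columns of every row

print("a[:, ::5] = \n", every_fifth)
print("a[0:2, 3:6] = \n", block)
print("a[:, -2:] = \n", last_cols)
```

Each result is still a 2-D array because both positions use a slice rather than a single integer.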
