np.zeros creates a NumPy array filled with zeros. Its first parameter, the shape, can be an integer or a tuple.
Pass a tuple to create a multi-dimensional array (matrix).
np.zeros is one of the NumPy routines that allocate memory and fill the array with a value.

In [None]:
import numpy as np
a = np.zeros((4,2,3))
print(a)
print(a.shape)
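
As a small illustration of the integer-versus-tuple distinction described above, the sketch below builds one array of each kind. The variable names v and m are chosen here for illustration only.

```python
import numpy as np

v = np.zeros(4)        # integer argument: 1-D array with 4 elements
m = np.zeros((4, 2))   # tuple argument: 2-D array with 4 rows and 2 columns

print(v.shape, v.dtype)   # (4,) float64
print(m.shape, m.dtype)   # (4, 2) float64
```

np.zeros returns float64 values unless a different dtype is requested.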

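Several routines create an array of random samples, for example np.random.rand(), np.random.random(), and np.random.random_sample().
All of them return uniformly distributed values in the half-open interval [0.0, 1.0).
np.random.random_sample() takes the shape as a single argument (an integer or a tuple), and np.random.random() is an alias for it; np.random.rand() takes each dimension as a separate argument instead. For new code, the NumPy documentation recommends the Generator API obtained from np.random.default_rng().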

In [4]:
a = np.random.random_sample((3,3))
print(a)

[[0.90473656 0.85504913 0.62337834]
 [0.20795681 0.23784495 0.04553333]
 [0.33678925 0.29971299 0.44281961]]
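
For comparison, here is a minimal sketch of the same kind of sample drawn through the Generator API. The seed value 1 is an arbitrary choice made here so that the run is repeatable; it is not taken from the cells above.

```python
import numpy as np

rng = np.random.default_rng(seed=1)   # arbitrary seed, used only for repeatability
a = rng.random((3, 3))                # shape passed as a tuple

print(a.shape)                               # (3, 3)
print(((a >= 0.0) & (a < 1.0)).all())        # True: every value lies in [0.0, 1.0)
```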


NumPy routines which allocate memory and fill arrays with values, but which do not take a shape tuple as their input argument.

In [8]:
a = np.arange(4.);              print(f"np.arange(4.):     a = {a}, a shape = {a.shape}, a data type = {a.dtype}")
a = np.random.rand(4);          print(f"np.random.rand(4): a = {a}, a shape = {a.shape}, a data type = {a.dtype}")

np.arange(4.):     a = [0. 1. 2. 3.], a shape = (4,), a data type = float64
np.random.rand(4): a = [0.4853188  0.0577209  0.05784702 0.98934162], a shape = (4,), a data type = float64
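
To make the difference in arguments concrete, this sketch shows what the two routines accept instead of a shape tuple. The particular numbers are illustrative.

```python
import numpy as np

m = np.random.rand(2, 3)    # dimensions given as separate arguments, not as a tuple
r = np.arange(1, 10, 3)     # start=1, stop=10 (exclusive), step=3

print(m.shape)   # (2, 3)
print(r)         # [1 4 7]
```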


NumPy routines which allocate memory and fill arrays with user-specified values.

In [9]:
a = np.array([5,4,3,2]);  print(f"np.array([5,4,3,2]):  a = {a},     a shape = {a.shape}, a data type = {a.dtype}")
a = np.array([5.,4,3,2]); print(f"np.array([5.,4,3,2]): a = {a}, a shape = {a.shape}, a data type = {a.dtype}")

np.array([5,4,3,2]):  a = [5 4 3 2],     a shape = (4,), a data type = int32
np.array([5.,4,3,2]): a = [5. 4. 3. 2.], a shape = (4,), a data type = float64
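
The default integer type that np.array infers depends on the platform and NumPy version, so the int32 above may appear as int64 elsewhere. A minimal sketch of requesting the type explicitly through the dtype parameter:

```python
import numpy as np

a = np.array([5, 4, 3, 2], dtype=np.int64)   # force 64-bit integers
b = np.array([5, 4, 3, 2], dtype=float)      # integers stored as float64

print(a.dtype)   # int64
print(b.dtype)   # float64
print(b)         # [5. 4. 3. 2.]
```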


Slicing selects elements using a set of three values (start:stop:step). The start index is included, the stop index is excluded, and any of the three values can be omitted.

In [15]:
a = np.random.random_sample(10)
print(a)
print(a[2:8:2])
b = np.random.random_sample((4,4))
print(b)
print(b[1, 2:4:1])

[0.01632076 0.8706762  0.54598966 0.32587853 0.09604277 0.51671067
 0.11704239 0.34234209 0.23211286 0.53398493]
[0.54598966 0.09604277 0.11704239]
[[0.94985589 0.16799599 0.37256549 0.61725078]
 [0.22003995 0.94159407 0.18567294 0.60589771]
 [0.35155902 0.98973286 0.89249208 0.55473746]
 [0.0424125  0.18477165 0.49668163 0.22501484]]
[0.18567294 0.60589771]
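
Random values make it hard to see which positions a slice picks. The sketch below repeats the idea on np.arange data, where each value equals its index; the 4x4 array is built with reshape purely for convenience.

```python
import numpy as np

a = np.arange(10)                  # [0 1 2 ... 9], value equals index
print(a[2:8:2])                    # [2 4 6]  indices 2, 4, 6 (8 is excluded)
print(a[:3])                       # [0 1 2]  start omitted, defaults to 0
print(a[7:])                       # [7 8 9]  stop omitted, runs to the end

b = np.arange(16).reshape(4, 4)    # rows are [0..3], [4..7], [8..11], [12..15]
print(b[1, 2:4])                   # [6 7]        row 1, columns 2 and 3
print(b[:, 0])                     # [ 0  4  8 12] every row, column 0
```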


Single-vector operations: several operations act on one vector, either element by element or by reducing it to a single value.

In [23]:
a = np.array([1,2,3,4])

negative_a = -a
print(f"negative of a: {negative_a}")

sum_a = np.sum(a)
print(f"Sum:  {sum_a}")

mean_a = np.mean(a)
print(f"mean of a: {mean_a}")

square_a = a**2
print(f"square of each value in a:  {square_a}")


negative of a: [-1 -2 -3 -4]
Sum:  10
mean of a: 2.5
square of each value in a:  [ 1  4  9 16]


Element-wise operations on two arrays.
The arrays must have the same shape (or shapes that NumPy can broadcast together); otherwise the operation raises an error.

In [35]:
a = np.array([[[1,1],[1,1]],[[2,2],[2,2]]])
b = np.array([[[3,3],[3,3]], [[4,4],[4,4]]])
sum_a_b = a + b
print(f"sum of a + b\n{sum_a_b}")

sum of a + b
[[[4 4]
  [4 4]]

 [[6 6]
  [6 6]]]
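
The shape requirement can be checked directly. In this sketch, assuming two 1-D arrays of different lengths (names a, b, c chosen here), the mismatched addition raises a ValueError:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])
c = np.array([1, 2, 3])            # one element short

print(a + b)                       # [11 22 33 44]

try:
    a + c                          # shapes (4,) and (3,) do not match
except ValueError as err:
    print(f"shape mismatch: {err}")
```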


Scalar-vector operations: the scalar is applied to every element of the array.

In [37]:
a = np.array([[[1,1],[1,1]],[[2,2],[2,2]]])
a_scale_5 = a*5
print(f"a is scaled by 5: \n{a_scale_5}")

a is scaled by 5: 
[[[ 5  5]
  [ 5  5]]

 [[10 10]
  [10 10]]]


Vector-vector dot product: multiply the two vectors element by element and sum the results, which gives a scalar.

In [44]:
a = np.arange(4)
b = 2*np.arange(4)
print(f"a: {a}")
print(f"b: {b}")

dot_a_b = np.dot(a,b)
print(f"dot of a and b is: {dot_a_b}") # 0*0 + 1*2 + 2*4 + 3*6
multi_a_b = a*b
print(f"each value of a times b is: {multi_a_b}") # [0*0, 1*2, 2*4, 3*6]
mul_a_b = np.multiply(a,b)
print(mul_a_b)

a: [0 1 2 3]
b: [0 2 4 6]
dot of a and b is: 28
each value of a times b is: [ 0  2  8 18]
[ 0  2  8 18]
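
The relationship between the two operations above can be written out explicitly: the dot product equals the sum of the element-wise product. The sketch also uses the @ operator, which gives the same result for 1-D arrays.

```python
import numpy as np

a = np.arange(4)         # [0 1 2 3]
b = 2 * np.arange(4)     # [0 2 4 6]

d1 = np.dot(a, b)        # dot product
d2 = np.sum(a * b)       # sum of the element-wise product
d3 = a @ b               # matrix-multiplication operator on 1-D arrays

print(d1, d2, d3)        # 28 28 28
```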


Speed test: vectorized np.dot versus an explicit Python loop

In [47]:
import time
def my_dot(a, b):
    """
    Compute the dot product of two vectors.

    Args:
      a (ndarray (n,)):  input vector
      b (ndarray (n,)):  input vector with same dimension as a

    Returns:
      x (scalar): dot product of a and b
    """
    x = 0
    for i in range(a.shape[0]):
        x = x + a[i] * b[i]
    return x
np.random.seed(1)
a = np.random.rand(10000000)  # very large arrays
b = np.random.rand(10000000)

tic = time.time()  # capture start time
c = np.dot(a, b)
toc = time.time()  # capture end time

print(f"np.dot(a, b) =  {c:.4f}")
print(f"Vectorized version duration: {1000*(toc-tic):.4f} ms ")

tic = time.time()  # capture start time
c = my_dot(a,b)
toc = time.time()  # capture end time

print(f"my_dot(a, b) =  {c:.4f}")
print(f"loop version duration: {1000*(toc-tic):.4f} ms ")

del a, b  # remove these big arrays from memory

np.dot(a, b) =  2501072.5817
Vectorized version duration: 5.0457 ms 
my_dot(a, b) =  2501072.5817
loop version duration: 2783.4256 ms 
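
A single time.time() measurement can vary noticeably from run to run. As a rough alternative, the sketch below repeats each measurement with the standard-library timeit module and keeps the best time. The array size of 100,000 and the helper name loop_dot are choices made here to keep the run short; absolute timings depend on the machine.

```python
import timeit

import numpy as np

rng = np.random.default_rng(1)
a = rng.random(100_000)   # smaller than the arrays above so the loop finishes quickly
b = rng.random(100_000)


def loop_dot(x, y):
    """Dot product computed with an explicit Python loop."""
    total = 0.0
    for i in range(x.shape[0]):
        total += x[i] * y[i]
    return total


# Best of 3 repeats; the vectorized call runs 10 times per repeat, so divide by 10.
t_vec = min(timeit.repeat(lambda: np.dot(a, b), number=10, repeat=3)) / 10
t_loop = min(timeit.repeat(lambda: loop_dot(a, b), number=1, repeat=3))

print(f"vectorized: {1000 * t_vec:.4f} ms per call")
print(f"loop:       {1000 * t_loop:.4f} ms per call")
```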


Bonus Test

In [48]:
X_train = [[1,2], [3,4]]
w = [3, 8]
dot_X_w = np.dot(X_train, w)
print(f"dot of X_train and w {dot_X_w}")

dot of X_train and w [19 41]
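
When the first argument is 2-D and the second is 1-D, np.dot takes the dot product of each row with the vector. The sketch below reproduces the result above row by row; X and rows are names used here for illustration.

```python
import numpy as np

X = np.array([[1, 2], [3, 4]])
w = np.array([3, 8])

rows = [int(np.dot(row, w)) for row in X]   # one dot product per row
print(rows)            # [19, 41]  ->  1*3 + 2*8 = 19,  3*3 + 4*8 = 41
print(np.dot(X, w))    # [19 41]
```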


Reshape
The example below uses reshape to shape the array:
a = np.arange(6).reshape(-1, 2)
This line of code first creates a 1-D vector of six elements. It then reshapes that vector into a 2-D array using the reshape command. This could have been written:
a = np.arange(6).reshape(3, 2)
to arrive at the same 3-row, 2-column array. The -1 argument tells the routine to compute the number of rows given the size of the array and the number of columns.

In [None]:
#vector indexing operations on matrices
a = np.arange(6).reshape(-1, 2)   #reshape is a convenient way to create matrices
print(f"a.shape: {a.shape}, \na= {a}")

#access an element
print(f"\na[2,0].shape:   {a[2, 0].shape}, a[2,0] = {a[2, 0]},     type(a[2,0]) = {type(a[2, 0])} Accessing an element returns a scalar\n")

#access a row
print(f"a[2].shape:   {a[2].shape}, a[2]   = {a[2]}, type(a[2])   = {type(a[2])}")
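
The same array can also be indexed by column. This short sketch shows the element, row, and column cases side by side, with the values that np.arange(6).reshape(-1, 2) produces:

```python
import numpy as np

a = np.arange(6).reshape(-1, 2)   # [[0 1], [2 3], [4 5]]

print(a[2, 0])      # 4        single element, shape ()
print(a[2])         # [4 5]    whole row, shape (2,)
print(a[:, 0])      # [0 2 4]  whole column, shape (3,)
```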

Bonus Test

In [53]:
a = np.array([1,2,3,4,5,6,7,8])
a= a.reshape(4,-1)
print(f"a after reshape\n{a}")

a after reshape
[[1 2]
 [3 4]
 [5 6]
 [7 8]]
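
The -1 dimension can only be inferred when the total number of elements divides evenly by the other dimensions. A minimal sketch of both cases, using the same eight-element array:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8])

print(a.reshape(2, -1).shape)    # (2, 4): 8 elements / 2 rows = 4 columns

try:
    a.reshape(3, -1)             # 8 is not divisible by 3
except ValueError as err:
    print(f"cannot reshape: {err}")
```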
