# COMP S493F Lab 3

In [None]:
%env TF_CPP_MIN_LOG_LEVEL=2

env: TF_CPP_MIN_LOG_LEVEL=2


In this lesson, you'll work on:

- basic linear algebra using NumPy, and
- more NumPy array indexing.

### Student name: *Lo Tsz Kin*


# Basic linear algebra using NumPy

Linear algebra is an important topic in the implementation of neural networks. In this section, we examine dot product and matrix product using the NumPy library.

## Dot product of vectors

Let's begin with basic vector operations. The code below creates two vectors as 1D NumPy arrays, `a` and `b`, both with shape `(3,)`.

In [None]:
import numpy as np
a = np.array([1, 2, 3])
b = np.array([10, 20, 30])
print(a, a.shape)
print(b, b.shape)

# (3,) is a Python tuple containing 1 item

[1 2 3] (3,)
[10 20 30] (3,)


NumPy supports element-wise operations, such as using the `+`, `-`, `*`, and `/` operators. In the following, `a + 100` adds 100 to each element of `a`; `a + b` adds the corresponding elements in `a` and `b`, and `a * b` multiplies the corresponding elements in `a` and `b`. In all 3 cases, the results are 1D arrays with shape `(3,)`.

In [None]:
print(a + 100)
print(a + b)
print(a * b)

[101 102 103]
[11 22 33]
[10 40 90]


In [None]:
print(a + np.array([5]))

[6 7 8]
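The cell above adds a length-1 array to `a`. As a minimal sketch of what happens, NumPy stretches (broadcasts) the shape `(1,)` array to match shape `(3,)`, so the result is the same as adding the scalar 5. The names `five`, `broadcast_sum`, and `scalar_sum` are illustrative only.

```python
import numpy as np

a = np.array([1, 2, 3])
five = np.array([5])  # 1D array with shape (1,)

# The length-1 array is broadcast across all 3 elements of a
broadcast_sum = a + five
scalar_sum = a + 5

print(broadcast_sum, broadcast_sum.shape)
print(np.array_equal(broadcast_sum, scalar_sum))  # True
```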


The dot product of two vectors can be obtained in several ways: the `@` operator, the `dot()` method, the `np.dot()` function, and the `np.matmul()` function. In all 4 cases below, the result is 140 (= 1&times;10 + 2&times;20 + 3&times;30).

In [None]:
print(a @ b)
print(a.dot(b))
print(np.dot(a, b))
print(np.matmul(a, b))

140
140
140
140
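To connect the dot product with the element-wise operations shown earlier, the sketch below multiplies the corresponding elements and sums the products. The name `manual_dot` is illustrative only.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

# Multiply corresponding elements, then add up the products
manual_dot = np.sum(a * b)

print(manual_dot)           # 140
print(manual_dot == a @ b)  # True
```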


## Matrix product

Let's turn to matrix product. The code below creates two matrices: a 2&times;3 matrix called `c` and a 3&times;2 matrix called `d`. Recall that the matrix product operation requires that the last dimension of the first matrix (here 3 in 2&times;3 of `c`) equals the first dimension of the second matrix (here 3 in 3&times;2 of `d`).

In [None]:
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[10, 20], [30, 40], [50, 60]])
print(c, c.shape)
print(d, d.shape)

# To compute c @ d with c of shape (2, 3) and d of shape (3, 2),
# the second dimension of c must equal the first dimension of d

[[1 2 3]
 [4 5 6]] (2, 3)
[[10 20]
 [30 40]
 [50 60]] (3, 2)


The matrix product is obtained using the same NumPy APIs as the dot product of vectors: the `@` operator, the `dot()` method, the `np.dot()` function, and the `np.matmul()` function. In all 4 cases below, the result is a 2&times;2 matrix. For example, the first number in the resulting matrix, 220, is calculated using the first row of `c` and the first column of `d` (i.e. 220 = 1&times;10 + 2&times;30 + 3&times;50).

In [None]:
print(c @ d)
print(c.dot(d))
print(np.dot(c, d))
print(np.matmul(c, d))

[[220 280]
 [490 640]]
[[220 280]
 [490 640]]
[[220 280]
 [490 640]]
[[220 280]
 [490 640]]
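The row-by-column rule can be checked with explicit loops. The sketch below fills each entry of the result with the dot product of one row of `c` and one column of `d`. It is meant for illustration only, since `c @ d` is the practical way to do this. The name `manual` is an assumption.

```python
import numpy as np

c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[10, 20], [30, 40], [50, 60]])

n_rows, n_cols = c.shape[0], d.shape[1]
manual = np.zeros((n_rows, n_cols), dtype=c.dtype)
for i in range(n_rows):
    for j in range(n_cols):
        # Entry (i, j) is the dot product of row i of c and column j of d
        manual[i, j] = c[i, :] @ d[:, j]

print(manual)
print(np.array_equal(manual, c @ d))  # True
```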


The code below computes the matrix product of `c` and `d`, then adds 9000 to it (i.e. adds 9000 to each element of the matrix product). The contents and shape of the result are displayed.

In [None]:
result = c @ d + 9000
print(result, result.shape)

[[9220 9280]
 [9490 9640]] (2, 2)


# Practice - Question 1 of 3

The code below creates two arrays `a1` and `a2` of random values (0.0 to 1.0), with shapes `(2, 3)` and `(5, 3)`.

In [None]:
import numpy as np
np.random.seed(42)
a1 = np.random.random((2, 3))
a2 = np.random.random((5, 3))
print(a1, a1.shape)
print(a2, a2.shape)

[[0.37454012 0.95071431 0.73199394]
 [0.59865848 0.15601864 0.15599452]] (2, 3)
[[0.05808361 0.86617615 0.60111501]
 [0.70807258 0.02058449 0.96990985]
 [0.83244264 0.21233911 0.18182497]
 [0.18340451 0.30424224 0.52475643]
 [0.43194502 0.29122914 0.61185289]] (5, 3)


## Q1a

Write code to try to compute the matrix product of `a1` and `a2` inside a `try-except` statement as:

```python
try:
    result = a1 @ a2
except Exception as e:
    print("Error:", e)
```

In [None]:
try:
    result = a1 @ a2
except Exception as e:
    print("Error:", e)

Error: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 5 is different from 3)
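The error reports that size 5 differs from size 3, i.e. the inner dimensions do not match. As a rough sketch, the shapes can be checked before multiplying. The helper `can_matmul()` and the arrays `x`, `y`, and `w` are hypothetical and cover 2D arrays only.

```python
import numpy as np

def can_matmul(x, y):
    """Return True if 2D arrays x and y have compatible shapes for x @ y."""
    return x.shape[1] == y.shape[0]

x = np.zeros((2, 3))
y = np.zeros((5, 3))
w = np.zeros((3, 4))

print(can_matmul(x, y))  # False: 3 != 5
print(can_matmul(x, w))  # True: 3 == 3
```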


## Q1b

Write code to print the contents and shape of `a2.T` (i.e. the transpose of `a2`).

In [None]:

print("Contents of a2.T:")
print(a2.T)

print("Shape of a2.T:")
print(a2.T.shape)

Contents of a2.T:
[[0.05808361 0.70807258 0.83244264 0.18340451 0.43194502]
 [0.86617615 0.02058449 0.21233911 0.30424224 0.29122914]
 [0.60111501 0.96990985 0.18182497 0.52475643 0.61185289]]
Shape of a2.T:
(3, 5)


## Q1c

Write code to compute the matrix product of `a1` and `a2.T`, and keep the result in a variable called `result`. Print the contents and shape of `result`.

In [None]:

result = np.dot(a1, a2.T)

print("Matrix Product:")
print(result)

print("Shape of Result:")
print(result.shape)

Matrix Product:
[[1.28525324 0.9947397  0.64675177 0.74205833 0.88652906]
 [0.26368252 0.57840584 0.55984141 0.23912325 0.39947042]]
Shape of Result:
(2, 5)


## &#x2766;

# Practice - Question 2 of 3

The Keras program below is written for studying the operations of a sigmoid neuron. Note that you're not required to fully understand its code (in this lesson). In brief, the neuron has weights in the `weights` variable, and bias in the `bias` variable. It performs prediction from the values in the `inputs` variable and displays the predicted value.

In [None]:
# A sigmoid neuron for studying its operations
import numpy as np
from tensorflow import keras

weights = np.array([0.1, -0.2, 0.3])
bias = np.array([0.4])
inputs = np.array([5, 6, 7])

model = keras.models.Sequential([
    keras.layers.Dense(1, activation="sigmoid", input_shape=inputs.shape)
])
model.set_weights([weights.reshape((3, 1)), bias])
y_pred = model.predict(inputs.reshape((1, -1)), verbose=0)
print(y_pred)

[[0.85814893]]


## Q2a

Write a `sigmoid(x)` Python function that takes a parameter `x` to implement the sigmoid/logistic function: $$\text{sigmoid}(x) = 1 / (1 + e^{-x})$$

Use `np.exp(-x)` for $e^{-x}$.

In [None]:
def sigmoid(x):
    return 1 / (1 + np.exp(-x))



## Q2b

Write 3 lines of code to do the following:

- Compute the dot product of `weights` and `inputs`, and add `bias` to the product. Keep the result in a variable called `z`.
- Apply the `sigmoid()` function to `z`. Keep the result in a variable called `y`.
- Print the value of `y`.

In [None]:

z = np.dot(weights, inputs) + bias
y = sigmoid(z)
print(y)

[0.85814894]
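The same computation extends to several input vectors at once by using a matrix product. The sketch below assumes a hypothetical `batch` array in which each row is one input vector. The names `batch`, `z_batch`, and `y_batch` are not part of the lab.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

weights = np.array([0.1, -0.2, 0.3])
bias = np.array([0.4])

# Hypothetical batch: each row is one input vector with 3 features
batch = np.array([[5, 6, 7],
                  [0, 0, 0]])

z_batch = batch @ weights + bias  # shape (2,): one z value per row
y_batch = sigmoid(z_batch)

print(y_batch.shape)  # (2,)
print(y_batch)
```

The first value matches the single-input result above, since the first row of `batch` equals `inputs`.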


## &#x2766;

# Integer array indexing

We've learned basic ways of indexing NumPy arrays using integers and slices. For instance, the example below shows an array `a` of 5 strings. `a[0]` uses an integer `0` in indexing the array and results in the first item `"a"`. `a[0:2]` uses a slice `0:2` in indexing the array and results in the first two items in a new array `["a", "b"]`.

In [None]:
import numpy as np
a = np.array(list("abcde"))  # array of 5 strings
print(a)
print(a[0])  # use an integer in indexing
print(a[0:2])  # use a slice in indexing

['a' 'b' 'c' 'd' 'e']
a
['a' 'b']


## 1D data arrays

In NumPy, *integer array indexing* uses an integer array to select arbitrary items in another array (the data array). The integer array contains indices that refer to items in the data array.

The following shows two examples of integer array indexing. Note that in `a[ix]`, `ix` is an integer array; in `a[[0, 1, 2]]`, `[0, 1, 2]` is an integer array. (Technically `ix` and `[0, 1, 2]` are lists, not arrays; but lists can be used in integer array indexing so we call them integer arrays here.) In both examples, `[0, 1, 2]` selects the items in `a` at indices 0, 1, and 2, i.e. `"a"`, `"b"`, and `"c"`.

In [None]:
ix = [0, 1, 2]
print(a[ix])  # use an integer array in indexing
print(a[[0, 1, 2]])  # use an integer array in indexing

['a' 'b' 'c']
['a' 'b' 'c']
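As noted, lists work as index arrays. For comparison, the sketch below uses a true NumPy integer array and gets the same selection. The names `letters` and `ix_array` are illustrative only.

```python
import numpy as np

letters = np.array(list("abcde"))
ix_array = np.array([0, 1, 2])  # a NumPy integer array rather than a list

print(letters[ix_array])
print(np.array_equal(letters[ix_array], letters[[0, 1, 2]]))  # True
```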


The indices in the integer array may have repeated values, which select an item in the data array multiple times. In the integer array below, the three `0`'s select the item `"a"` at index 0 of `a` three times; the two `-1`'s select the item `"e"` at index -1 of `a` two times.

In [None]:
print(a[[0, 2, 0, 0, -1, -1]])

['a' 'c' 'a' 'a' 'e' 'e']


## 2D data arrays

When integer array indexing is applied to a multidimensional data array (2D or above), the indices in the integer array refer to axis-0 of the data array. For a 2D data array, this means selecting rows. Let's see some examples using a 2D data array with shape `(5, 10)`.

In [None]:
b = np.arange(50).reshape((5, 10))
print(b, b.shape)

[[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]
 [20 21 22 23 24 25 26 27 28 29]
 [30 31 32 33 34 35 36 37 38 39]
 [40 41 42 43 44 45 46 47 48 49]] (5, 10)


In the code below, `b[[1, 3]]` selects the second row (index `1`) and the fourth row (index `3`) of `b`.

In [None]:
print(b[[1, 3]])  # use an integer array in indexing

[[10 11 12 13 14 15 16 17 18 19]
 [30 31 32 33 34 35 36 37 38 39]]


Take extra care in the use of brackets, i.e. `[` and `]`, in integer array indexing. Omitting brackets would give you a totally different result, such as `b[[1, 3]]` in the above example versus `b[1, 3]` in the next example (which is the item `13` at the second row and the fourth column).

In [None]:
print(b[1, 3])  # use two integers in indexing

13
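The difference is easy to see by comparing the shapes of the two results. The sketch below uses a separate array named `grid` (an illustrative name) with the same contents as `b`.

```python
import numpy as np

grid = np.arange(50).reshape((5, 10))

rows = grid[[1, 3]]  # integer array indexing: rows 1 and 3
item = grid[1, 3]    # two integers: row 1, column 3

print(rows.shape)          # (2, 10)
print(np.ndim(item))       # 0, i.e. a single value
print(item == grid[1][3])  # True
```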


When the integer array contains all indices of the data array exactly once (along axis-0), all items in the data array are selected once. This has the effect of reordering the items in the data array according to the index order in the integer array. For example, the following `index_order` integer array selects, in that order, the rows at indices `3`, `0`, `1`, `4`, `2` of the data array `b`.

In [None]:
index_order = [3, 0, 1, 4, 2]
print(b[index_order])

[[30 31 32 33 34 35 36 37 38 39]
 [ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]
 [40 41 42 43 44 45 46 47 48 49]
 [20 21 22 23 24 25 26 27 28 29]]


## Sorting parallel arrays

*Parallel arrays* refer to the use of two or more arrays to store a dataset such that the items at the same index of the arrays belong to the same record. They are frequently used in machine learning to store the features (prediction inputs) and labels (prediction outputs) of training/test examples.

In the example below, the `names` array contains 5 names and the `marks` array contains 5 marks. Note that names and marks are associated by the indices in the arrays: Peter gets 88 (both at index 0), Paul gets 56 (both at index 1), Mary gets 90 (both at index 2), David gets 73 (both at index 3), and Ann gets 60 (both at index 4).

In [None]:
import numpy as np
names = np.array(["Peter", "Paul", "Mary", "David", "Ann"])
marks = np.array([88, 56, 90, 73, 60])
print(names)
print(marks)

['Peter' 'Paul' 'Mary' 'David' 'Ann']
[88 56 90 73 60]


Now, let's say we want to sort the marks in increasing order. NumPy has a `np.sort()` function for sorting an array. The code below attempts to sort both the `names` and `marks` arrays using `np.sort()`. While both arrays are sorted, the association between the arrays is lost! E.g. the result indicates that Ann gets 56, which is incorrect because Ann gets 60 in the original data.

In [None]:
sorted_names = np.sort(names)
sorted_marks = np.sort(marks)
print(sorted_names)
print(sorted_marks)

['Ann' 'David' 'Mary' 'Paul' 'Peter']
[56 60 73 88 90]


Sorting the arrays individually breaks the association. To maintain the association, a single integer array is used to index both arrays, as you'll do in the question below.

# Practice - Question 3 of 3

Here are the original parallel arrays again:

In [None]:
import numpy as np
names = np.array(["Peter", "Paul", "Mary", "David", "Ann"])
marks = np.array([88, 56, 90, 73, 60])

# 56 goes first, index 1
# 60 goes second, index 4
# 73 goes third, index 3
# 88 goes fourth, index 0
# 90 goes last, index 2
print(names)
print(marks)

['Peter' 'Paul' 'Mary' 'David' 'Ann']
[88 56 90 73 60]


To maintain the association between the two arrays: (1) we need an integer array for ordering the data in both arrays; (2) the integer array should contain the indices of the `marks` array, each index once; (3) the indices in the integer array should be in the order of the increasing values in `marks`. The integer array fulfilling these is `[1, 4, 3, 0, 2]`, where `1` is the index of the smallest item 56 (i.e. `marks[1]`) in `marks`, `4` is the index of the next smallest item 60 (i.e. `marks[4]`), `3` is the index of the next smallest item 73 (i.e. `marks[3]`), followed by the index `0` of 88 (i.e. `marks[0]`) and index 2 of 90 (i.e. `marks[2]`).

This integer array can be obtained using the NumPy function `np.argsort()`. That is, calling `np.argsort(marks)` returns `[1, 4, 3, 0, 2]`.
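As a small sketch of how `np.argsort()` relates to `np.sort()`, the example below uses a hypothetical `values` array. Indexing the array with its own argsort result gives the same contents as sorting it directly.

```python
import numpy as np

values = np.array([30, 10, 20])  # hypothetical data

order_idx = np.argsort(values)
print(order_idx)          # [1 2 0]
print(values[order_idx])  # [10 20 30]
print(np.array_equal(values[order_idx], np.sort(values)))  # True
```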

## Q3a

Write code to do the following:

- Invoke `np.argsort(marks)` and keep the result in a variable called `order`.
- Print the contents of `order`.

In [None]:
order = np.argsort(marks)
print(order)

[1 4 3 0 2]


## Q3b

Write code to do the following:

- Use the `order` integer array to index the `names` array. Keep the result in a variable called `sorted_names`.
- Use the `order` integer array to index the `marks` array. Keep the result in a variable called `sorted_marks`.
- Print the contents of `sorted_names`.
- Print the contents of `sorted_marks`.

Note that `sorted_marks` should contain the sorted marks, and the association between the names and marks should be maintained, i.e. each name is paired with the same mark as in the original data.

In [None]:

sorted_names = names[order]
sorted_marks = marks[order]

print(sorted_names)
print(sorted_marks)


['Paul' 'Ann' 'David' 'Peter' 'Mary']
[56 60 73 88 90]
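The same idea gives decreasing order. One possible approach, sketched below, reverses the index array returned by `np.argsort()` before indexing both arrays. The name `descending` is illustrative only.

```python
import numpy as np

names = np.array(["Peter", "Paul", "Mary", "David", "Ann"])
marks = np.array([88, 56, 90, 73, 60])

descending = np.argsort(marks)[::-1]  # reverse the increasing order

print(names[descending])  # ['Mary' 'Peter' 'David' 'Ann' 'Paul']
print(marks[descending])  # [90 88 73 60 56]
```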


## &#x2766;

# Extras

## Boolean array indexing

In NumPy, *Boolean array indexing* uses a Boolean array to decide whether or not to include each item in a data array. A data item is included once if the corresponding Boolean value in the Boolean array is `True`, or not included if it is `False`. Note that the Boolean array must have the same size as the data array (or the first dimension of the data array).

The `a` array below contains 10 integers. The `mask` array is a Boolean array obtained by `a % 2 == 0`, which gives `True` for even numbers in `a` and `False` for odd numbers in `a`. E.g. the first item `a[0]` (i.e. 10) is even so the first Boolean value `mask[0]` is `True`; `a[1]` (i.e. 11) is odd so `mask[1]` is `False`.

In [None]:
import numpy as np
a = np.arange(10, 20)
print(a)

[10 11 12 13 14 15 16 17 18 19]


In [None]:
mask = a % 2 == 0
print(mask)

[ True False  True False  True False  True False  True False]


To perform Boolean array indexing, we use the `mask` Boolean array when indexing the `a` array, i.e. `a[mask]` in the example below. The Boolean array can also be directly created and used in indexing an array, as in `a[a % 2 == 0]`. In both cases, the even numbers in `a` are obtained according to the `True` values in the Boolean arrays. The Boolean array is sometimes called a *mask*.

In [None]:
print(a[mask])
print(a[a % 2 == 0])

[10 12 14 16 18]
[10 12 14 16 18]
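For a 2D data array, a Boolean array whose size matches the first dimension selects whole rows. The sketch below assumes a hypothetical `table` array with 4 rows and a `row_mask` with one Boolean value per row.

```python
import numpy as np

table = np.arange(12).reshape((4, 3))
row_mask = np.array([True, False, True, False])  # one value per row

selected = table[row_mask]  # keeps rows 0 and 2

print(selected)
print(selected.shape)  # (2, 3)
```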


### E1a

You are given an array of 5 integers called `a` as follows.

In [None]:
import numpy as np
a = np.array([1, 2, 3, 4, 5])
print(a)

[1 2 3 4 5]


Write code to do the following:

- Index the `a` array using the Boolean array `[False, False, False, False, False]`, and print the result.
- Index the `a` array using the Boolean array `[True] * 5`, and print the result.
- Use a `try-except` statement and try to index the `a` array using the Boolean array `[True, True]`. Display the error for any exception that occurs.

### E1b

Boolean array indexing is also useful in manipulating parallel arrays. You're given the `names` and `marks` arrays again below.

In [None]:
import numpy as np
names = np.array(["Peter", "Paul", "Mary", "David", "Ann"])
marks = np.array([88, 56, 90, 73, 60])
print(names)
print(marks)

['Peter' 'Paul' 'Mary' 'David' 'Ann']
[88 56 90 73 60]


Write code to do the following:

- Create a Boolean array as `marks >= 70` and keep the result in a variable called `mask`. This Boolean array contains `True` for marks above or equal to 70, and `False` otherwise.
- Index the `names` array using `mask`, and keep the result in a variable called `high_names`.
- Index the `marks` array using `mask`, and keep the result in a variable called `high_marks`.
- Print the contents of `high_names`.
- Print the contents of `high_marks`.

Note that the resulting `high_names` and `high_marks` are parallel arrays containing high marks (&ge;70) and the associated names.

## Solutions to extra exercises

### E1a

In [None]:
import numpy as np
a = np.array([1, 2, 3, 4, 5])
print(a)

[1 2 3 4 5]


In [None]:
# Solution
print(a[[False, False, False, False, False]])
print(a[[True] * 5])
try:
    print(a[[True, True]])
except Exception as e:
    print("Error:", e)

[]
[1 2 3 4 5]
Error: boolean index did not match indexed array along dimension 0; dimension is 5 but corresponding boolean dimension is 2


### E1b

In [None]:
import numpy as np
names = np.array(["Peter", "Paul", "Mary", "David", "Ann"])
marks = np.array([88, 56, 90, 73, 60])
print(names)
print(marks)

['Peter' 'Paul' 'Mary' 'David' 'Ann']
[88 56 90 73 60]


In [None]:
# Solution
mask = marks >= 70
high_names = names[mask]
high_marks = marks[mask]
print(high_names)
print(high_marks)

['Peter' 'Mary' 'David']
[88 90 73]


## &#x2766;