
# Introduction to Python

# Lecture 2

## Learning objectives:

At the end of this lecture, you will be able to:

* Modify elements in a `list`.
* Iterate through different combinations of lists.
* Use a `tuple` to store data elements and understand how it differs from a `list`.
* Explain the difference between locally-scoped and globally-scoped variables.
* Use an `if`-statement to execute some code blocks conditionally.
* Perform computations using [Numerical Python (*NumPy*)](http://www.numpy.org/).
* Handle multidimensional arrays.

## Changing elements in a list
Let's say we want to add 2 to all the numbers in a list `v`. To do that, we first need to be able to modify a single element, which we do by assigning to it through its index:

In [24]:
v = [-1, 1, 10]
print(f"List before modification: {v=}")

v[1] = 4  # assign 4 to the 2nd element (index 1) in v
print(f"List after modification: {v=}")

List before modification: v=[-1, 1, 10]
List after modification: v=[-1, 4, 10]


Note how we used `{v=}` in the f-string to print `v=` in front of the list's value.

Now, to add 2 to all values we need a `for` loop over indices:

In [25]:
v = [-1, 1, 10]
for i in range(len(v)):
    v[i] = v[i] + 2
print(v)

[1, 3, 12]


Note that this time we iterate over the indices of the list elements

```python
for i in range(len(v)):
    ...
```

instead of iterating over the values of elements in the list

```python
for e in v:
    ...
```
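
The difference matters as soon as we want to modify the list. The following is a minimal sketch (the name `v_after_value_loop` is only illustrative): assigning to the loop variable `e` merely rebinds a name and leaves the list untouched, whereas assigning through an index stores the new value in the list.

```python
v = [-1, 1, 10]

# Iterating over values: `e` is just a name bound to each element in turn.
# Rebinding it does not write anything back into the list.
for e in v:
    e = e + 2
v_after_value_loop = v.copy()
print(v_after_value_loop)  # [-1, 1, 10]

# Iterating over indices: v[i] = ... stores the new value in the list.
for i in range(len(v)):
    v[i] = v[i] + 2
print(v)  # [1, 3, 12]
```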

## `enumerate` built-in

As we have seen previously, we often need to use both the value of an element in a sequence and its index. Python provides a convenient built-in function, `enumerate`, to make this syntax clearer:

In [26]:
v = [-1, 1, 10]
for index, value in enumerate(v):
    v[index] = value + 2

print(v)

[1, 3, 12]
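
When the goal is simply a transformed list, a list comprehension is a common alternative. As a sketch (the name `w` is only illustrative): unlike the loops above, this builds a new list rather than modifying the existing one in place. The second half shows the optional `start` argument of `enumerate`, which changes the value the counter begins at.

```python
v = [-1, 1, 10]

w = [value + 2 for value in v]  # new list; v itself is unchanged
print(v)  # [-1, 1, 10]
print(w)  # [1, 3, 12]

# enumerate accepts a start value for the counter (default is 0)
for position, value in enumerate(v, start=1):
    print(position, value)
```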


## Traversing multiple lists simultaneously: `zip(list1, list2, ...)`
Let us consider how we can loop over elements in both `Cdegrees` and `Fdegrees` at the same time. One approach would be to use list indices:

In [27]:
# First, we have to recreate the data from lecture 1.
Cdegrees = [deg for deg in range(-20, 41, 5)]
Fdegrees = [(9/5)*deg + 32 for deg in Cdegrees]

for i in range(len(Cdegrees)):
    print(Cdegrees[i], Fdegrees[i])

-20 -4.0
-15 5.0
-10 14.0
-5 23.0
0 32.0
5 41.0
10 50.0
15 59.0
20 68.0
25 77.0
30 86.0
35 95.0
40 104.0


An alternative construct, regarded as more "Pythonic", uses the `zip` built-in function:

In [28]:
for C, F in zip(Cdegrees, Fdegrees):
    print(C, F)

-20 -4.0
-15 5.0
-10 14.0
-5 23.0
0 32.0
5 41.0
10 50.0
15 59.0
20 68.0
25 77.0
30 86.0
35 95.0
40 104.0


Using `zip`, we can also traverse three or more lists simultaneously:

In [29]:
l1 = [3, 6, 1]
l2 = [1, 1, 0]
l3 = [9, 3, 2]

for e1, e2, e3 in zip(l1, l2, l3):
    print(f"{e1 = }, {e2 = }, {e3 = }")

e1 = 3, e2 = 1, e3 = 9
e1 = 6, e2 = 1, e3 = 3
e1 = 1, e2 = 0, e3 = 2


If the lists are of unequal length, then the loop stops when we reach the end of the shortest list. Experiment with this:

In [30]:
l1 = [3, 6, 1, 4, 6]  # len(l1) == 5
l2 = [1, 1, 0, 7]     # len(l2) == 4
l3 = [9, 3, 2, 0, 9]  # len(l3) == 5

for e1, e2, e3 in zip(l1, l2, l3):
    print(f"{e1 = }, {e2 = }, {e3 = }")

e1 = 3, e2 = 1, e3 = 9
e1 = 6, e2 = 1, e3 = 3
e1 = 1, e2 = 0, e3 = 2
e1 = 4, e2 = 7, e3 = 0
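
If silently dropping the extra elements is not what we want, the standard library offers `itertools.zip_longest`, which continues to the end of the longest input and pads the shorter ones. A minimal sketch, where the choice of `None` as the fill value is only an assumption for illustration:

```python
from itertools import zip_longest

l1 = [3, 6, 1, 4, 6]  # len(l1) == 5
l2 = [1, 1, 0, 7]     # len(l2) == 4

# zip stops at the end of the shortest list
short = list(zip(l1, l2))

# zip_longest continues to the end of the longest list, padding with fillvalue
padded = list(zip_longest(l1, l2, fillvalue=None))

print(len(short), len(padded))  # 4 5
print(padded[-1])               # (6, None)
```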


## Nested lists: list of lists
A `list` can contain **any** object as its element, including another `list`. To illustrate this, consider storing the conversion table as a single Python list rather than two separate lists:

In [31]:
Cdegrees = [C for C in range(-20, 41, 5)]
Fdegrees = [(9/5)*C + 32 for C in Cdegrees]
table1 = [Cdegrees, Fdegrees]  # List of two lists

print(f"{table1 = }")
print(f"{table1[0] = }")  # access the first element of list table1 - Cdegrees list
print(f"{table1[1] = }")  # access the second element of list table1 - Fdegrees list
print(f"{table1[1][3] = }")  # access 4th element in the 2nd list

table1 = [[-20, -15, -10, -5, 0, 5, 10, 15, 20, 25, 30, 35, 40], [-4.0, 5.0, 14.0, 23.0, 32.0, 41.0, 50.0, 59.0, 68.0, 77.0, 86.0, 95.0, 104.0]]
table1[0] = [-20, -15, -10, -5, 0, 5, 10, 15, 20, 25, 30, 35, 40]
table1[1] = [-4.0, 5.0, 14.0, 23.0, 32.0, 41.0, 50.0, 59.0, 68.0, 77.0, 86.0, 95.0, 104.0]
table1[1][3] = 23.0


This gives us a table with two rows. How do we create a table with two columns instead?

In [32]:
table2 = []
for C, F in zip(Cdegrees, Fdegrees):
    row = [C, F]
    table2.append(row)

print(table2)

[[-20, -4.0], [-15, 5.0], [-10, 14.0], [-5, 23.0], [0, 32.0], [5, 41.0], [10, 50.0], [15, 59.0], [20, 68.0], [25, 77.0], [30, 86.0], [35, 95.0], [40, 104.0]]


We can also use list comprehension to do this more elegantly:

In [33]:
table2 = [[C, F] for C, F in zip(Cdegrees, Fdegrees)]
print(table2)

[[-20, -4.0], [-15, 5.0], [-10, 14.0], [-5, 23.0], [0, 32.0], [5, 41.0], [10, 50.0], [15, 59.0], [20, 68.0], [25, 77.0], [30, 86.0], [35, 95.0], [40, 104.0]]


And we can loop through this list as before:

In [34]:
for C, F in table2:
    print(C, F)

-20 -4.0
-15 5.0
-10 14.0
-5 23.0
0 32.0
5 41.0
10 50.0
15 59.0
20 68.0
25 77.0
30 86.0
35 95.0
40 104.0


Since elements of `table2` are length-2 lists, in each iteration, we *unpack* each of the length-2 elements to `C` and `F`.
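
The same unpacking idea can be used to go back from the two-column layout to the two-row layout. The sketch below uses a shortened copy of the table and the `*` operator, which has not been introduced in this lecture: it passes each row to `zip` as a separate argument. The name `columns` is only illustrative.

```python
table2 = [[-20, -4.0], [-15, 5.0], [-10, 14.0]]  # shortened example data

# zip(*table2) is equivalent to zip(table2[0], table2[1], table2[2]):
# it pairs up all first entries, then all second entries.
columns = [list(col) for col in zip(*table2)]
print(columns)  # [[-20, -15, -10], [-4.0, 5.0, 14.0]]
```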

## Tuples: lists that cannot be changed

Tuples are **constant** lists, i.e. we can use them in much the same way as lists except we cannot modify them. They are an example of an [**immutable**](http://en.wikipedia.org/wiki/Immutable_object) type.

In [35]:
t = (2, 4, 6, "temp.pdf")  # Define a tuple.
t = 2, 4, 6, "temp.pdf"    # The parentheses can be omitted in this context.

Let us see what happens when we try to modify the tuple like we did with a list:

```python
t[1] = -1

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-593c03edf054> in <module>()
----> 1 t[1] = -1

TypeError: 'tuple' object does not support item assignment
```

```python
t.append(0)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-19-78592bf72d62> in <module>()
----> 1 t.append(0)

AttributeError: 'tuple' object has no attribute 'append'
```

```python
del t[1]

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-20-0193a527a912> in <module>()
----> 1 del t[1]

TypeError: 'tuple' object doesn't support item deletion
```

However, we can use the tuple to compose a new tuple:

In [36]:
t = t + (-1.0, -2.0)
print(t)

(2, 4, 6, 'temp.pdf', -1.0, -2.0)


So, why would we use tuples when lists have more functionality?

* Tuples are constant and thus *protected against accidental changes*.
* Tuples are typically slightly *faster* to create and use less memory than lists.
* Tuples are *widely used* in Python software (so you need to know about tuples to understand other people's code!)
* Tuples (but not lists) are hashable and can be used as *keys in dictionaries* (more about dictionaries later).
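
As a brief preview of the last point, here is a sketch with hypothetical data (the dictionary `temperature` and its values are made up for illustration). A tuple of grid coordinates works as a dictionary key, while a list in the same role is rejected with a `TypeError`.

```python
# Hypothetical data: map (x, y) grid points to a measured value.
temperature = {(0, 0): 21.5, (0, 1): 22.0}

temperature[(1, 0)] = 20.8      # a tuple works as a key
print(temperature[(0, 1)])      # 22.0

try:
    temperature[[1, 1]] = 19.9  # a list is not hashable, so this fails
except TypeError as err:
    print(f"TypeError: {err}")
```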

**WARNING**: A tuple's immutability applies only to the references it holds. If a tuple contains mutable elements, e.g. a `list`, those elements can still be changed, so the contents of the tuple are not guaranteed to stay constant. Let us have a look at this example:

In [37]:
t = (1, 2, [3, 4])
t[2].append(5)  # we are appending 5 to the list which the tuple holds a reference of
print(t)

(1, 2, [3, 4, 5])


Therefore, for a tuple to be truly immutable, it must contain only immutable elements. A practical way to check this is to call the built-in `hash` function on the tuple: it raises a `TypeError` if the tuple contains an unhashable (typically mutable) element, such as a `list`.

In [38]:
t = (1, 2, [3, 4])  # tuple contains a list which is a mutable type
try:
    hash(t)
    print(f"Tuple {t = } is immutable.")
except TypeError:
    print(f"Tuple {t = } is mutable.")

Tuple t = (1, 2, [3, 4]) is mutable.


In [39]:
t = (1, 2, "abc", (5, 6))  # tuple contains only immutable types
try:
    hash(t)
    print(f"Tuple {t = } is immutable.")
except TypeError:
    print(f"Tuple {t = } is mutable.")

Tuple t = (1, 2, 'abc', (5, 6)) is immutable.


What does it then mean that some types are mutable and some are not? We will talk about this in lecture 4.

## The `if` construct
Suppose we need to implement the following function:
$$
f(x)=
\begin{cases}
    \sin(x),& \text{if } 0 \leq x \leq \pi\\
    0,              & \text{otherwise}
\end{cases}
$$
To do this, we need the `if` construct:

In [40]:
from math import sin, pi


def f(x):
    if 0 <= x <= pi:
        return sin(x)
    else:
        return 0


print(f"{f(-pi/2) = }")
print(f"{f(pi/2) = }")
print(f"{f(3*pi/2) = }")

f(-pi/2) = 0
f(pi/2) = 1.0
f(3*pi/2) = 0


Please note the indentations we used to define which statements belong to which condition. Sometimes, it is clearer to write this as a conditional expression:

In [41]:
def f(x):
    return sin(x) if 0 <= x <= pi else 0


print("f(-pi/2) =", f(-pi/2))
print("f(pi/2) =", f(pi/2))
print("f(3*pi/2) =", f(3*pi/2))

f(-pi/2) = 0
f(pi/2) = 1.0
f(3*pi/2) = 0


The `else` block can be omitted if there are no statements to execute when the condition is `False`. In general, we can chain multiple conditions with `elif`. The conditions are tested from top to bottom, and only the block belonging to the first condition that is `True` is executed.

```python
if condition1:
    <block of statements, executed if condition1 is True>
elif condition2:
    <block of statements, executed if condition1 is False and condition2 is True>
elif condition3:
    <block of statements, executed if conditions 1 and 2 are False and condition3 is True>
else:
    <block of statements, executed if conditions 1, 2, and 3 are False>
    
<next statement of the program>
```
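
As a concrete sketch of this pattern, consider the hypothetical function `describe` below (the thresholds and labels are arbitrary choices for illustration). Note that 10 satisfies both `temp_c < 15` and `temp_c < 25`, but only the block of the first matching condition runs.

```python
def describe(temp_c):
    """Return a rough label for a temperature in degrees Celsius."""
    if temp_c < 0:
        return "freezing"
    elif temp_c < 15:
        return "cold"
    elif temp_c < 25:
        return "mild"
    else:
        return "hot"


for temp in (-5, 10, 20, 30):
    print(temp, describe(temp))
```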

## Vectors and arrays

You have known **vectors** since high school mathematics, e.g. point $(x, y)$ in the plane, point $(x, y, z)$ in space. In general, we can describe a vector $v$ as an $n$-tuple of numbers: $v=(v_0, \ldots, v_{n-1})$. One way to store vectors in Python is by using sequences, e.g. *lists* or *tuples*: $v_i$ is stored as `v[i]`.

**Arrays** are a generalisation of vectors where we can have multiple indices: $A_{ij}$, $A_{ijk}$. In Python, this is represented as a nested list, accessed as `A[i][j]`, `A[i][j][k]`.

**Example**: Matrices, a table of numbers with one index for the row and one for the column
$$
\left\lbrack\begin{array}{cccc}
0 & 12 & -1 & 5\cr
11 & 5 & 5 & -2
\end{array}\right\rbrack
\hspace{1cm}
A =
\left\lbrack\begin{array}{ccc}
A_{0,0} & \cdots &  A_{0,n-1}\cr
\vdots & \ddots &  \vdots\cr
A_{m-1,0} & \cdots & A_{m-1,n-1}
\end{array}\right\rbrack
$$
The number of indices in an array is the *number of dimensions*. Using these terms, a vector can be described as a one-dimensional array or dimension-1 array.

In practice, we use [Numerical Python (*NumPy*)](http://www.numpy.org/) arrays instead of lists to represent mathematical arrays because they are **much** faster for large arrays.

Let us consider an example where we store $(x,y)$ points along a curve in Python lists and numpy arrays:

In [42]:
# Sample function
def f(x):
    return x**3


# Generate n points in [0, 1]
n = 5
dx = 1 / (n-1)  # x spacing

X = [i*dx for i in range(n)]  # Python list
Y = [f(x) for x in X]

# Turn these Python lists into Numerical Python (NumPy) arrays:
import numpy as np  # as a convention, we import "numpy as np"

x2 = np.array(X)
y2 = np.array(Y)

Instead of first making lists with $x$ and $y = f(x)$ data, and then turning the lists into arrays, we can make NumPy arrays directly:

In [43]:
n = 5                        # number of points
x2 = np.linspace(0, 1, n)    # generates n points between 0 and 1
y2 = np.zeros(n)
for i in range(n):
    y2[i] = f(x2[i])

List comprehensions create lists, not arrays, but we can do:

In [44]:
y2 = np.array([f(xi) for xi in x2])  # list -> array

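Passing a list built by a comprehension as an argument to another function is very common. When the comprehension is the only argument, Python allows us to omit the square brackets `[]`, which passes a *generator expression* that produces the values one at a time instead of building the whole list in memory first. Note, however, that `np.array` does not consume generators (it would wrap the generator object itself in a zero-dimensional array of type `object`), so with NumPy we use `np.fromiter` instead:

In [45]:
y2 = np.fromiter((f(xi) for xi in x2), dtype=float)  # generator -> array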

Generator expressions are beyond the scope of this introductory lecture series. If you would like to learn more, please refer to [PEP 289](https://peps.python.org/pep-0289/).

### When and where to use NumPy arrays

* Python lists can hold any mix of Python objects, whereas the elements of a NumPy array all share the same type. We refer to NumPy arrays as flat sequences, and to lists and tuples as containers (or container sequences).
* Arrays are most efficient when the elements are basic number types (*float*, *int*, *complex*).
* In that case, arrays are stored efficiently in the computer's memory, and we can compute very efficiently with the array elements.
* We can compute mathematical operations on whole arrays without loops in Python. For example,

In [46]:
import math

x = np.linspace(0, 2, 10001)
y = np.zeros(10001)
for i in range(len(x)):
    y[i] = math.sin(x[i])

can be coded as

In [47]:
y = np.sin(x)

In the latter case, the loop over all elements is performed in an efficient C function. Replacing Python `for` loops with operations on whole arrays is called vectorisation. It is a very **convenient** and **efficient**, and therefore **important**, programming technique to master.

Let us consider a simple vectorisation example: a loop to compute $x$ coordinates (`x2`) and $y=f(x)$ coordinates (`y2`) along a function curve:

In [48]:
x2 = np.linspace(0, 1, n)
y2 = np.zeros(n)
for i in range(n):
    y2[i] = f(x2[i])

This computation can be replaced by:

In [49]:
x2 = np.linspace(0, 1, n)
y2 = f(x2)

The advantage of this approach is:

* There is no need to allocate space for `y2` (via the NumPy *zeros* function).
* There is no need for a loop.
* It is *much faster*.
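
The speed difference can be measured with the standard-library `timeit` module. The following is a rough sketch rather than a rigorous benchmark: the helper names `with_loop` and `vectorised`, the array size, and the number of repetitions are arbitrary choices, and the timings depend on the machine.

```python
import timeit

import numpy as np


def f(x):
    return x**3


n = 100_000
x2 = np.linspace(0, 1, n)


def with_loop():
    # element-by-element evaluation in a Python loop
    y = np.zeros(n)
    for i in range(n):
        y[i] = f(x2[i])
    return y


def vectorised():
    # one call operating on the whole array
    return f(x2)


t_loop = timeit.timeit(with_loop, number=5)
t_vec = timeit.timeit(vectorised, number=5)
print(f"loop: {t_loop:.4f} s, vectorised: {t_vec:.4f} s")
```

Both versions compute the same values; only the time taken differs.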

## How vectorised functions work
Consider the function

In [50]:
def f(x):
    return x**3

$f(x)$ is intended for a number $x$, i.e. a *scalar*. So, what happens when we call `f(x2)`, where `x2` is a NumPy array? **The function evaluates $x^3$ for an array $x$**. NumPy supports arithmetic operations on arrays, which correspond to the equivalent operations on each element. For example,

In [51]:
r1 = x**3                   # x[i]**3 for all i
r2 = np.cos(x)              # cos(x[i]) for all i
r3 = x**3 + x*np.cos(x)     # x[i]**3 + x[i]*cos(x[i]) for all i
r4 = x/3*np.exp(-x*0.5)     # x[i]/3*exp(-x[i]*0.5) for all i

In each of these cases, a highly optimised C-function is actually called to evaluate the expression. In this example, the `cos` function called for an `array` is imported from NumPy rather than from the `math` module which only acts on scalars.

Notes:

* Functions that can operate on arrays are called **vectorised functions**.
* Vectorisation is the process of turning a non-vectorised expression/algorithm into a vectorised expression/algorithm.
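* Mathematical expressions built from arithmetic operators and NumPy functions automatically work for both scalar and array (vector) arguments, i.e. no extra vectorisation effort is needed from the programmer. Functions from the `math` module, by contrast, only accept scalars.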
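
Functions that contain an `if` test are a common case where vectorisation does need some work, because a condition on an array does not reduce to a single `True` or `False`. As a sketch, the piecewise function from the `if` section can be vectorised with `np.where`. The names `f_scalar` and `f_vec` are only illustrative, and `np.where` is one of several possible approaches.

```python
import math

import numpy as np

x = np.linspace(-np.pi, 2 * np.pi, 7)


def f_scalar(x):
    # works for a single number only
    return math.sin(x) if 0 <= x <= math.pi else 0.0


# Calling f_scalar(x) with the array x would raise an error, because the
# chained comparison needs a single True/False value.


def f_vec(x):
    # np.where picks sin(x) where the condition holds and 0.0 elsewhere
    return np.where((0 <= x) & (x <= np.pi), np.sin(x), 0.0)


y = f_vec(x)
print(y)
```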

### Watch out for references vs. copies of arrays!
Consider this code:

In [52]:
a = x
a[-1] = 42
print(x[-1])

42.0


Notice what happened here - we changed a value in `a`, but the corresponding value in `x` was also changed! This is because `a` refers to the same array as `x`. If we want a separate copy of `x`, then we have to make an explicit copy:

In [53]:
a = x.copy()
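
The sketch below illustrates the three cases side by side: a second name for the same array, an explicit copy, and a slice. The slice case is an addition to the text above: in NumPy, a basic slice is a *view* that shares data with the original array. `np.shares_memory` is used here only to make the sharing visible.

```python
import numpy as np

x = np.linspace(0, 1, 5)  # [0.   0.25 0.5  0.75 1.  ]

a = x         # second name for the same array
b = x.copy()  # independent copy
c = x[1:3]    # a basic slice is a view onto the same data

print(a is x)                  # True
print(np.shares_memory(b, x))  # False
print(np.shares_memory(c, x))  # True

c[0] = 99.0   # writes through to x
print(x[1])   # 99.0
print(b[1])   # 0.25 (the copy is unaffected)
```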

## Generalised array indexing

We can select a slice of an array using `a[start:stop:inc]`, where the slice `start:stop:inc` implies a set of indices starting from `start`, up to (but not including) `stop`, in increments of `inc`. Any integer list or array can also be used to indicate a set of indices:

In [54]:
a = np.linspace(1, 8, 8)
print(a)

[1. 2. 3. 4. 5. 6. 7. 8.]


In [55]:
a[[1, 6, 7]] = 10  # i.e. set the elements with indices 1, 6, and 7 in the array to 10.
print(a)

[ 1. 10.  3.  4.  5.  6. 10. 10.]


In [56]:
a[range(2, 8, 3)] = -2   # same as a[2:8:3] = -2
print(a)

[ 1. 10. -2.  4.  5. -2. 10. 10.]


Even boolean expressions can be used to select part of an array(!)

In [57]:
print(a[a < 0])  # pick out all negative elements

[-2. -2.]


In [58]:
a[a < 0] = a.max()  # if a[i]<0, set a[i]=10
print(a)

[ 1. 10. 10.  4.  5. 10. 10. 10.]
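
Several conditions can be combined into one mask. A minimal sketch, assuming a fresh array `a`: NumPy uses the element-wise operators `&` (and), `|` (or), and `~` (not) for this, and each comparison must be wrapped in parentheses because these operators bind more tightly than `<` and `>`.

```python
import numpy as np

a = np.linspace(1, 8, 8)     # [1. 2. 3. 4. 5. 6. 7. 8.]

mask = (a > 2) & (a < 7)     # element-wise "and"
print(mask)
print(a[mask])               # [3. 4. 5. 6.]

print(a[(a < 2) | (a > 7)])  # element-wise "or": [1. 8.]
print(a[~mask])              # element-wise "not": [1. 2. 7. 8.]
```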


## 2D arrays
When we have a table of numbers,

$$
\left\lbrack\begin{array}{cccc}
0 & 12 & -1 & 5\cr
-1 & -1 & -1 & 0\cr
11 & 5 & 5 & -2
\end{array}\right\rbrack
$$

(i.e. a *matrix*) it is natural to use a two-dimensional array $A_{i, j}$ with one index for the rows and one for the columns:

$$
A =
\left\lbrack\begin{array}{ccc}
A_{0,0} & \cdots &  A_{0,n-1}\cr
\vdots & \ddots &  \vdots\cr
A_{m-1,0} & \cdots & A_{m-1,n-1}
\end{array}\right\rbrack
$$

Let us recreate this array using NumPy:

In [59]:
A = np.zeros((3, 4))  # we create a 2-dimensional (3 x 4) array filled with zeros

A[0, 0] = 0
A[1, 0] = -1
A[2, 0] = 11

A[0, 1] = 12
A[1, 1] = -1
A[2, 1] = 5

A[0, 2] = -1
A[1, 2] = -1
A[2, 2] = 5

# we can also use the same syntax that we used for nested lists

A[0][3] = 5
A[1][3] = 0
A[2][3] = -2

print(A)

[[ 0. 12. -1.  5.]
 [-1. -1. -1.  0.]
 [11.  5.  5. -2.]]
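
Filling an array element by element is useful for illustration, but when the values are known in advance it is usually shorter to pass a nested list to `np.array`. A sketch, where `dtype=float` is chosen only to match the output above:

```python
import numpy as np

A = np.array(
    [
        [0, 12, -1, 5],
        [-1, -1, -1, 0],
        [11, 5, 5, -2],
    ],
    dtype=float,
)

print(A.shape)           # (3, 4)
print(A[2, 3], A[2][3])  # -2.0 -2.0
```

Both index styles give the same element. `A[2, 3]` looks it up in one step, whereas `A[2][3]` first extracts row 2 and then indexes into that row.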


Next, let us create a nested list and then convert it into a 2D array:

In [60]:
Cdegrees = range(0, 101, 10)
Fdegrees = [9/5*C + 32 for C in Cdegrees]
table = [[C, F] for C, F in zip(Cdegrees, Fdegrees)]  # create a nested list
print(table)

[[0, 32.0], [10, 50.0], [20, 68.0], [30, 86.0], [40, 104.0], [50, 122.0], [60, 140.0], [70, 158.0], [80, 176.0], [90, 194.0], [100, 212.0]]


In [61]:
# Convert this nested list into a NumPy array:
table2 = np.array(table)
print(table2)

[[  0.  32.]
 [ 10.  50.]
 [ 20.  68.]
 [ 30.  86.]
 [ 40. 104.]
 [ 50. 122.]
 [ 60. 140.]
 [ 70. 158.]
 [ 80. 176.]
 [ 90. 194.]
 [100. 212.]]
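
Notice that the integer Celsius values were printed as floats. Because all elements of a NumPy array share one type, NumPy picks a type that can represent every value in the nested list. The sketch below uses a shortened table and inspects the `dtype` attribute; the name `ints_only` is only illustrative.

```python
import numpy as np

table = [[0, 32.0], [10, 50.0]]  # integers and floats mixed
table2 = np.array(table)
print(table2.dtype)              # float64

ints_only = np.array([[0, 32], [10, 50]])
print(ints_only.dtype)           # an integer type (platform dependent)
```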


To see the number of elements in each dimension, we ask for the array's `shape`:

In [62]:
print(table2.shape)

(11, 2)


i.e. our table has 11 rows and 2 columns.

Let us write a loop over all elements of the array `table2`:

In [63]:
for i in range(table2.shape[0]):
    for j in range(table2.shape[1]):
        print(f"table2[{i}, {j}] = {table2[i, j]}")

table2[0, 0] = 0.0
table2[0, 1] = 32.0
table2[1, 0] = 10.0
table2[1, 1] = 50.0
table2[2, 0] = 20.0
table2[2, 1] = 68.0
table2[3, 0] = 30.0
table2[3, 1] = 86.0
table2[4, 0] = 40.0
table2[4, 1] = 104.0
table2[5, 0] = 50.0
table2[5, 1] = 122.0
table2[6, 0] = 60.0
table2[6, 1] = 140.0
table2[7, 0] = 70.0
table2[7, 1] = 158.0
table2[8, 0] = 80.0
table2[8, 1] = 176.0
table2[9, 0] = 90.0
table2[9, 1] = 194.0
table2[10, 0] = 100.0
table2[10, 1] = 212.0


Alternatively:

In [64]:
for index_tuple, value in np.ndenumerate(table2):
    print(f"index {index_tuple} has value {value}")

index (0, 0) has value 0.0
index (0, 1) has value 32.0
index (1, 0) has value 10.0
index (1, 1) has value 50.0
index (2, 0) has value 20.0
index (2, 1) has value 68.0
index (3, 0) has value 30.0
index (3, 1) has value 86.0
index (4, 0) has value 40.0
index (4, 1) has value 104.0
index (5, 0) has value 50.0
index (5, 1) has value 122.0
index (6, 0) has value 60.0
index (6, 1) has value 140.0
index (7, 0) has value 70.0
index (7, 1) has value 158.0
index (8, 0) has value 80.0
index (8, 1) has value 176.0
index (9, 0) has value 90.0
index (9, 1) has value 194.0
index (10, 0) has value 100.0
index (10, 1) has value 212.0


We can also extract slices from multi-dimensional arrays as before. For example, extract the second column:

In [65]:
print(table2[:, 1])  # 2nd column (index 1)

[ 32.  50.  68.  86. 104. 122. 140. 158. 176. 194. 212.]
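
Rows are extracted in the same way, and the extracted columns can be used directly in vectorised expressions. As a sketch (the names `C_col`, `F_col`, and `row3` are illustrative), the following rebuilds the table and checks the conversion formula on whole columns:

```python
import numpy as np

table2 = np.array([[C, 9/5*C + 32] for C in range(0, 101, 10)])

C_col = table2[:, 0]  # first column: Celsius
F_col = table2[:, 1]  # second column: Fahrenheit
row3 = table2[3, :]   # fourth row (index 3); same as table2[3]

print(row3)                                  # [30. 86.]
print(np.allclose(F_col, 9/5*C_col + 32))    # True
```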


Play with this more complicated example:

In [66]:
t = np.linspace(1, 30, 30).reshape(5, 6)
print(t)

[[ 1.  2.  3.  4.  5.  6.]
 [ 7.  8.  9. 10. 11. 12.]
 [13. 14. 15. 16. 17. 18.]
 [19. 20. 21. 22. 23. 24.]
 [25. 26. 27. 28. 29. 30.]]


In [67]:
print(t[1:-1:2, 2:])

[[ 9. 10. 11. 12.]
 [21. 22. 23. 24.]]
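
To see why these rows and columns are selected, it can help to apply each slice to a plain list of indices first. The sketch below does that and then reproduces the result with explicit index lists via `np.ix_`, which selects every combination of the given rows and columns. The names `rows`, `cols`, and `same` are illustrative.

```python
import numpy as np

t = np.linspace(1, 30, 30).reshape(5, 6)

rows = list(range(5))[1:-1:2]  # [1, 3]: from row 1, stop before the last row, step 2
cols = list(range(6))[2:]      # [2, 3, 4, 5]: from column 2 to the end
print(rows, cols)

# np.ix_ builds index arrays that select every (row, column) combination
same = t[np.ix_(rows, cols)]
print(np.array_equal(same, t[1:-1:2, 2:]))  # True
```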
