### CS102/CS103

Prof. Götz Pfeiffer<br />
School of Mathematics, Statistics and Applied Mathematics<br />
NUI Galway

# Lecture 8: Lists and Loops

We have seen some types of data that really are **collections** of
data, rather than singletons:  

* a string is an ordered collection of letters;

* a range represents a collection of integers.

And in the case of strings we have seen certain operations,
like indexing, slicing and concatenation, which make sense
for general ordered collections.  In this lecture we'll 
discuss `python`'s data type for general ordered collections:
`list`.  And we'll see how lists collaborating with definite loops
yield some powerful programming tools.

Before that, let's look back once more at Euclid's algorithm
for the computation of the greatest common divisor of two
integers.

## Euclid's Algorithm, Extended

Any pair of integers $a$ and $b$ has the following property.
If $d = \gcd(a, b)$, then by [Bézout's Lemma](https://en.wikipedia.org/wiki/B%C3%A9zout%27s_identity), there are
integers $x$ and $y$ such that
$$d = xa + yb.$$
This is an interesting property, because in the case $d = 1$,
the equation reduces modulo $a$ to
$$1 \equiv yb \pmod{a},$$
showing that $y$ is the **modular inverse** of $b$ (modulo $a$).


These numbers $x$ and $y$ can be determined by running
a $\gcd$ calculation backwards:  
Suppose we know numbers $x$ and $y$ such that
$$\gcd(b, a \% b) = x b + y (a \% b)$$
then
$$\gcd(a, b) = y a + (x - \lfloor a / b \rfloor y) b$$
as $\gcd(a, b) = \gcd(b, a \% b)$ and
$a - \lfloor a / b \rfloor b = a \% b$.
This process is called the
**extended Euclidean algorithm** and it can be implemented as
follows.

In [1]:
def egcd(a, b):
    "find integers  x, y  such that  gcd(a,b) = x*a + y*b"
    if b == 0:
        return (1, 0)
    x, y = egcd(b, a % b)
    x, y = y, x - (a // b) * y
#    print(x * a + y * b, "=", x, "*", a, "+", y, "*", b)
    return (x, y)

egcd(352, 123)

(-29, 83)

Check the result:

In [2]:
-29 * 352 + 83 * 123

1
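To see where these coefficients come from, the computation for $a = 352$ and $b = 123$ can be traced by hand. First come the divisions with remainder of Euclid's algorithm:
$$\begin{aligned}
352 &= 2 \cdot 123 + 106\\
123 &= 1 \cdot 106 + 17\\
106 &= 6 \cdot 17 + 4\\
17 &= 4 \cdot 4 + 1
\end{aligned}$$
Then the remainders are substituted back, from the bottom up:
$$\begin{aligned}
1 &= 17 - 4 \cdot 4\\
  &= 17 - 4 \cdot (106 - 6 \cdot 17) = 25 \cdot 17 - 4 \cdot 106\\
  &= 25 \cdot (123 - 106) - 4 \cdot 106 = 25 \cdot 123 - 29 \cdot 106\\
  &= 25 \cdot 123 - 29 \cdot (352 - 2 \cdot 123) = -29 \cdot 352 + 83 \cdot 123
\end{aligned}$$
This agrees with the pair $(-29, 83)$ returned by `egcd(352, 123)`.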

We can use this algorithm to compute modular inverses:

In [3]:
def print_modular_inverse(a, m):
    "compute the modular inverse of a modulo m"
    x, y = egcd(m, a)
    d = x * m + y * a
    if d != 1:
        print(a, "has no inverse mod", m, ": d = ", d)
    else:
        print(a, "^-1 = ", y, "mod", m)

def print_all_modular_inverses(m):
    for a in range(m):
        print_modular_inverse(a, m)
        
print_all_modular_inverses(30)

0 has no inverse mod 30 : d =  30
1 ^-1 =  1 mod 30
2 has no inverse mod 30 : d =  2
3 has no inverse mod 30 : d =  3
4 has no inverse mod 30 : d =  2
5 has no inverse mod 30 : d =  5
6 has no inverse mod 30 : d =  6
7 ^-1 =  13 mod 30
8 has no inverse mod 30 : d =  2
9 has no inverse mod 30 : d =  3
10 has no inverse mod 30 : d =  10
11 ^-1 =  11 mod 30
12 has no inverse mod 30 : d =  6
13 ^-1 =  7 mod 30
14 has no inverse mod 30 : d =  2
15 has no inverse mod 30 : d =  15
16 has no inverse mod 30 : d =  2
17 ^-1 =  -7 mod 30
18 has no inverse mod 30 : d =  6
19 ^-1 =  -11 mod 30
20 has no inverse mod 30 : d =  10
21 has no inverse mod 30 : d =  3
22 has no inverse mod 30 : d =  2
23 ^-1 =  -13 mod 30
24 has no inverse mod 30 : d =  6
25 has no inverse mod 30 : d =  5
26 has no inverse mod 30 : d =  2
27 has no inverse mod 30 : d =  3
28 has no inverse mod 30 : d =  2
29 ^-1 =  -1 mod 30
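Some of the inverses above are printed as negative numbers, such as $-7$ for $17$. If a representative between $0$ and $m - 1$ is preferred, the result can be reduced with the `%` operator. The following is a minimal sketch, not part of the lecture code: the function name `modular_inverse` and the choice to return `None` when no inverse exists are assumptions made for illustration, and `egcd` is repeated so that the cell runs on its own.

```python
def egcd(a, b):
    "find integers  x, y  such that  gcd(a,b) = x*a + y*b"
    if b == 0:
        return (1, 0)
    x, y = egcd(b, a % b)
    return (y, x - (a // b) * y)

def modular_inverse(a, m):
    "return the inverse of a modulo m as a number in range(m), or None"
    x, y = egcd(m, a)
    if x * m + y * a != 1:
        return None          # gcd(a, m) is not 1: no inverse exists
    return y % m             # reduce y to a representative in 0, ..., m-1

print(modular_inverse(17, 30))   # 23, since 17 * 23 = 391 = 13 * 30 + 1
print(modular_inverse(4, 30))    # None, since gcd(4, 30) = 2
```

Returning a value instead of printing it makes the function usable inside other computations, at the price of a separate `print` where output is wanted.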


## Lists

Often data come as collections, or lists, of items of
the same or even of different types.  `python` has a `list` data
type to cater for those.

In [4]:
list(range(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [5]:
list("hello")

['h', 'e', 'l', 'l', 'o']

In [6]:
[]

[]

### List Literals

* A **list literal** is a **comma-separated** sequence of values,
enclosed in a pair of **square brackets**.

In [7]:
items = ["Physics", "Maths", 2017, 2018, 3.1415, range(10)]
items

['Physics', 'Maths', 2017, 2018, 3.1415, range(0, 10)]

* The items in a list need not all have the same type.

### Accessing List Items

The length of a list, i.e., the number of items it contains,
is determined by the `len()` function, as for strings.

In [8]:
len(items)

6

**Indexing** and **slicing** work for lists as they do for strings:

In [9]:
items[1]

'Maths'

* indices start at $0$

In [10]:
items[-3]

2018

* negative indices count from the end of the list

In [11]:
items[3:5]

[2018, 3.1415]

* a **slice** is determined by two indices: the start index is included, the stop index is not.

In [12]:
items[:]  # make a copy of the list

['Physics', 'Maths', 2017, 2018, 3.1415, range(0, 10)]

* omitted slice indices default to the beginning and end of the list.
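The defaults for omitted slice indices can be tried out on a small list. The following sketch uses a hypothetical list `subjects` and the name `duplicate`, both chosen only for illustration. It also shows that a full slice is a new list object that is equal to, but not identical with, the original.

```python
subjects = ["Physics", "Maths", 2017, 2018]

print(subjects[:2])      # start omitted: from the beginning up to index 2
print(subjects[2:])      # stop omitted: from index 2 to the end

# the two parts together give back the whole list
print(subjects[:2] + subjects[2:] == subjects)

duplicate = subjects[:]  # both indices omitted: a copy of the whole list
print(duplicate == subjects)   # True: same items in the same order
print(duplicate is subjects)   # False: two different list objects
```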

### Updating Lists

In [13]:
items[1] = 'Chemistry'
items

['Physics', 'Chemistry', 2017, 2018, 3.1415, range(0, 10)]

In [14]:
items[1:4] = 'Biology', 2019
items

['Physics', 'Biology', 2019, 3.1415, range(0, 10)]

In [15]:
items[2:3] = ['a', 'c', 'e']
items

['Physics', 'Biology', 'a', 'c', 'e', 3.1415, range(0, 10)]

In [16]:
items[2] = [1,2,3]
items

['Physics', 'Biology', [1, 2, 3], 'c', 'e', 3.1415, range(0, 10)]

In [17]:
len(items)

7

* Assigning a value to a non-existent list position, as in
```python
items[7] = "new"
```
raises an `IndexError`.  The `append()` and `extend()` methods
can be used to extend a list at its end.
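What such an error looks like can be seen without stopping the program by catching it. This is a small sketch with a hypothetical list `letters`. The `try` and `except` statements are used here only to display the message.

```python
letters = ["a", "b", "c"]      # valid indices are 0, 1, 2

try:
    letters[3] = "new"         # position 3 does not exist
except IndexError as error:
    message = str(error)
    print("IndexError:", message)

print(letters)                 # the list is unchanged
```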

In [18]:
items = ["hearts"]        # a (singleton) list
items.append("diamonds")  # extended by one item
items

['hearts', 'diamonds']

In [19]:
items + ["clubs", "spades"]

['hearts', 'diamonds', 'clubs', 'spades']

In [20]:
items

['hearts', 'diamonds']

In [21]:
items.extend(["clubs", "spades"])  # extended by another sequence
items

['hearts', 'diamonds', 'clubs', 'spades']

### Deleting Items from a List

In [22]:
items.remove('diamonds')
items

['hearts', 'clubs', 'spades']
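The `remove()` method deletes an item by its value. To delete by position instead, `python` offers the `del` statement and the `pop()` method. A minimal sketch, using a hypothetical list `suits`:

```python
suits = ["hearts", "diamonds", "clubs", "spades"]

last = suits.pop()       # removes the last item and returns it
del suits[0]             # removes the item at index 0
print(last, suits)

try:
    suits.remove("hearts")     # 'hearts' is no longer in the list
except ValueError:
    print("cannot remove 'hearts': not in the list")
```

Calling `remove()` with a value that does not occur in the list raises a `ValueError`, so it helps to know whether the item is present before removing it.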

### Membership test

In [23]:
'clubs' in items

True

In [24]:
'diamonds' in items

False

In [25]:
if 'diamonds' in items:
    items.remove('diamonds')
    
items

['hearts', 'clubs', 'spades']

In [26]:
if 'clubs' in items:
    items.remove('clubs')
    
items

['hearts', 'spades']

* There are many other methods that apply to lists.  We'll see
some of them as we go along.

## Lists and Loops

**Definite loops** (aka **`for` loops**) work well with lists.
Such a `for` statement has the general form
```
for <var> in <list>:
    <body>
```
It consists of a **heading** and a **body**.
The heading, between the keyword `for` and the colon (`:`), introduces a **loop variable** `<var>`
and names the list object `<list>` to loop over.
The `<body>` is an indented sequence of statements.

To execute a `for` statement means to execute its `<body>` of
statements once for each item of the list,
using that item for one iteration as value of the variable `<var>`.

In [27]:
for item in items:
    print(item)

hearts
spades
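The same statement form also works for the other ordered collections mentioned at the start of this lecture, strings and ranges. A short sketch (the names `letter` and `i` are arbitrary choices):

```python
# a string is an ordered collection of letters
for letter in "abc":
    print(letter)

# a range represents a collection of integers
for i in range(3):
    print(i, i * i)
```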


### Accumulators

A widespread pattern makes use of an **accumulator**: a variable that collects a result step by step while a loop runs through a list.

For example, suppose we want to know the sum $$\sum_{i=1}^{10} i$$ of the numbers
from $1$ to $10$.

One way to implement this as a `python` program is to
**initialize** an **accumulator variable** `total`, say,
with a value of $0$, and then use a loop over the numbers in
question, and add each number to the accumulator.

In [28]:
total = 0
for i in [1,2,3,4,5,6,7,8,9,10]:
    total = total + i
    
total

55

This looks useful, let's turn it into a function. We use the fact
that the list `[1,...,n]` in `python` is represented by 
`range(1, n+1)`, the range of numbers from $1$ up to but not including $n+1$.

Also, there is a shorthand for the accumulator sum
```python
total = total + i
```
which can be written as
```python
total += i
```

In [29]:
def sum_up_to(n):
    "compute the sum of 1, ..., n"
    total = 0
    for i in range(1, n+1):
        total += i
    
    return total

In [30]:
sum_up_to(10)

55

Now that we have the function, we can use it a couple of times, 
for instance to study the question: How does this sum of numbers behave as the argument `n` grows?

In [31]:
for n in range(10):
    print(n, "->", sum_up_to(n))

0 -> 0
1 -> 1
2 -> 3
3 -> 6
4 -> 10
5 -> 15
6 -> 21
7 -> 28
8 -> 36
9 -> 45


Hmm.  Is that not the same as $n \mapsto \tfrac12 n (n+1)$?  Let's see.

In [32]:
for n in range(10):
    print(n, '->', n * (n+1) // 2 )

0 -> 0
1 -> 1
2 -> 3
3 -> 6
4 -> 10
5 -> 15
6 -> 21
7 -> 28
8 -> 36
9 -> 45
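Comparing two printed tables by eye only works for a few values. The comparison can itself be left to a loop. The sketch below collects every `n` for which the two values differ in a list, here called `mismatches` (a name chosen for illustration). An empty list is evidence for the formula on the tested range, not a proof. The function `sum_up_to` is repeated so that the cell runs on its own.

```python
def sum_up_to(n):
    "compute the sum of 1, ..., n"
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

mismatches = []                      # accumulates the failing values of n
for n in range(1000):
    if sum_up_to(n) != n * (n + 1) // 2:
        mismatches.append(n)

print(mismatches)                    # [] means: no difference found
```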


The **product** 
$$n! = \prod_{i=1}^n i$$
of all numbers from $1$ to $n$ is called the **factorial** of $n$.

In [33]:
def factorial(n):
    "compute the factorial of n"
    total = 1
    for i in range(1,n+1):
        total *= i
    return total

In [34]:
factorial(5)

120

Factorials become very large very quickly.

In [35]:
factorial(49)

608281864034267560872252163321295376887552831379210240000000000
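How quickly factorials grow can be measured by counting decimal digits, and the results can be cross-checked against `factorial` from `python`'s standard `math` module. A minimal sketch, with our own `factorial` repeated so that the cell runs on its own. The variable names `digits` and `agrees` are chosen for illustration.

```python
import math

def factorial(n):
    "compute the factorial of n"
    total = 1
    for i in range(1, n + 1):
        total *= i
    return total

for n in [5, 10, 20, 49]:
    digits = len(str(factorial(n)))              # number of decimal digits
    agrees = factorial(n) == math.factorial(n)   # compare with the library
    print(n, "->", digits, "digits, agrees with math.factorial:", agrees)
```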

###  Prime Sieving

Another algorithm you might have seen before is the
[sieve of Eratosthenes](https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes), a method for finding all prime numbers
up to a given limit, say $1000$.

There you start with the list of numbers from $2$ to $1000$
($1$ is not a prime).  

The first number in the list is $p = 2$.
Remove all its multiples $ap$ with $a > 1$ from the list.

The next remaining number in the list is $p = 3$.
Remove all its multiples $ap$ with $a > 1$ from the list.

Continue in this way until the end of the list is reached.
The list of surviving numbers is the list of primes up to $1000$.

A moment's reflection shows that the
smallest multiple of $p$ still in the list is $p^2$:
every smaller multiple $ap$ with $1 < a < p$ has a prime factor
less than $p$ and has been removed already.
The list of multiples to consider is represented
in `python` by
```python
range(p*p, 1001, p)
```
a range of numbers starting at $p^2$ and increasing by $p$ at every step.
It stops before $1001$, so that $1000$ itself is still included.

In [36]:
list(range(23*23,1001,23))

[529,
 552,
 575,
 598,
 621,
 644,
 667,
 690,
 713,
 736,
 759,
 782,
 805,
 828,
 851,
 874,
 897,
 920,
 943,
 966,
 989]

Here's the `python` code that provides a naive (but working) implementation of this strategy:

In [37]:
limit = 1000
primes = list(range(2, limit + 1))
for p in primes:
    for ap in range(p*p, limit + 1, p):
        if ap in primes:
            primes.remove(ap)
            
for p in primes:
    print(p, end=', ')
            

2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599, 601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683, 691, 701, 709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797, 809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887, 907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997, 

Here is the same strategy, wrapped in a function:

In [38]:
def primes_up_to(limit):
    "compute a list of primes up to limit"
    primes = list(range(2, limit + 1))
    for p in primes:
        # stop at limit + 1, so that limit itself is sieved, too
        for ap in range(p*p, limit + 1, p):
            if ap in primes:
                primes.remove(ap)
                
    return primes

primes = primes_up_to(1000)
for p in primes:
    print(p, end=', ')


2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263, 269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349, 353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433, 439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521, 523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599, 601, 607, 613, 617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683, 691, 701, 709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797, 809, 811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887, 907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997, 
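Both `ap in primes` and `primes.remove(ap)` search the list from the front, which makes the naive version slow for large limits. A common variant avoids the searching by keeping one `True`/`False` flag per number and crossing numbers out by index. The following is a sketch of that variant, not the lecture's implementation. The names `sieve_flags` and `is_prime` are assumptions made for illustration.

```python
def sieve_flags(limit):
    "list the primes up to limit, using one boolean flag per number"
    # is_prime[n] tells whether n is still a candidate;
    # list repetition with * builds a list of limit + 1 copies of True
    is_prime = [True] * (limit + 1)

    for p in range(2, limit + 1):
        if is_prime[p]:
            # cross out the multiples p*p, p*p + p, ... up to limit
            for ap in range(p * p, limit + 1, p):
                is_prime[ap] = False

    # collect the survivors (0 and 1 are skipped: they are not primes)
    primes = []
    for n in range(2, limit + 1):
        if is_prime[n]:
            primes.append(n)
    return primes

print(sieve_flags(30))
print(len(sieve_flags(1000)), "primes up to 1000")
```

Crossing out a number by index takes the same time wherever it sits in the list, whereas `remove()` first has to find the item.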

### Summary: Lists and Loops

* The `list` data type allows `python` programs to work with
ordered collections of data.

* A list literal is a comma-separated list of values
(or expressions), enclosed in square brackets.

* List items can be accessed by indexing and slicing.

* List items or slices can be updated by suitable
assignment statements.

* Lists can be extended, by single items or other lists.

* Items can be removed from a list.

* The `in` operator tests membership of an item in a list.

* A `for` loop iterates over the items in a list
and executes a body of statements, once for each item.
* The accumulator pattern uses an accumulator variable
and a loop over a list to compute a value by suitable
updates of the accumulator variable.

* Updating a variable through arithmetical modifications
is supported by a special syntax (`+=`, `*=`, ...).