# NB08: Lists

## Programming Fundamentals

## L.EIC/2022-23

#### João Correia Lopes$^{1}$, Pedro Vasconcelos$^{2}$, Nuno Macedo$^{1}$
$^{1}$FEUP/DEI & INESC TEC\
$^{2}$FCUP/DCC & LIACC

> “A list is only as strong as its weakest link.”

Donald E. Knuth

## Goals

By the end of this class, the student should be able to:

- Describe the use of lists, which are sequences of elements, possibly of different types

- Enumerate the main methods available to work with lists

## Bibliography

- Peter Wentworth, Jeffrey Elkner, Allen B. Downey, and Chris Meyers, *How to Think Like a Computer Scientist — Learning with Python 3* (Section 5.3) [[PDF](https://media.readthedocs.org/pdf/howtothink/latest/howtothink.pdf)]
[[HTML](http://openbookproject.net/thinkcs/python/english3e/)]

- Brad Miller and David Ranum, *Learning with Python: Interactive Edition*. Based on material by Jeffrey Elkner, Allen B. Downey, and Chris Meyers (Chapter 10) [[HTML](https://runestone.academy/runestone/books/published/thinkcspy/index.html)]


# 8 Data types: Lists

### A compound data type (recap)

- So far we have seen built-in types like `int`, `float`, `bool`, `str`, tuples and briefly lists.

- Strings, **lists**, and tuples are qualitatively different from the others because they are made up of smaller pieces.

- **Lists (and tuples) group any number of items, of different types, into a single compound value**.

- Types that comprise smaller pieces are called **collections** or **compound data types**.

- Depending on what we are doing, we may want to treat a compound data type as a single thing.

## 8.1 List values

### Lists

- A **list** is an ordered collection of values.

- The values that make up a list are called its **elements**, or its **items**.

- Lists, like tuples, are similar to strings (which are ordered collections of characters) except that the elements of a list can have any type and for any one list, the items can be of **different types**.

- Lists, tuples and strings --- and other collections that maintain the order of their items --- are called **sequences**.

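- The main difference between lists and the other sequences we have seen (strings and tuples) is that lists are **mutable**: their elements can be changed after the list is created.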

### List values

- There are several ways to create a new list; the simplest is to enclose the elements in square brackets (`[` and `]`):

```python
  numbers = [10, 20, 30, 40]

  words = ["spam", "bungee", "swallow"]

  stuffs = ["hello", 2.0, 5, [10, 20]]
```

- List elements can be of any type, including other lists.

- A list within another list is said to be **nested**.

- A list with no elements is called an **empty list**, and is denoted `[]`.
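
Besides the bracket notation, the built-in `list()` constructor builds a list from another sequence. The following is a minimal sketch; the variable names are illustrative and not part of the course material.

```python
# list() with no arguments gives an empty list, the same value as [].
empty = list()

# list() applied to a string yields a list of its characters.
letters = list("spam")

# list() applied to a range yields the numbers in that range.
digits = list(range(5))

# Brackets can also nest existing lists inside a new one.
nested = [letters, digits]

print(empty)    # []
print(letters)  # ['s', 'p', 'a', 'm']
print(digits)   # [0, 1, 2, 3, 4]
print(nested)   # [['s', 'p', 'a', 'm'], [0, 1, 2, 3, 4]]
```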

Try it here:

In [None]:
stuffs = ["hello", 2.0, 5, [10, 20]]
print(stuffs)
type(stuffs)

In [None]:
vocabulary = ["iteration", "selection", "control"]
numbers = [17, 123]
mixed_list = ["hello", 2.0, 5*2, [10, 20]]
empty = []

print(numbers)
print(mixed_list)
print(empty)

In [None]:
newlist = [ numbers, vocabulary ]
print(newlist)

## 8.2 Accessing elements

### Index operator

- The syntax for accessing the elements of a list is the index operator `[]`.

    - The syntax is the same as the syntax for accessing the characters of a string.

- The expression inside the brackets specifies the index.

- Remember that the indices start at 0.

- Negative indices count backwards from the end of the list, so `-1` refers to the last element.

```python
  >>> numbers = [10, 20, 30, 40]

  >>> numbers[1]
  20
  >>> numbers[-3]
  20
```

$\Rightarrow$
<https://github.com/fp-leic/public/tree/master/lectures/08/lindex.py>

In [None]:
numbers = [17, 123, 87, 34, 66, 8398, 44]
print(numbers[2])
print(numbers[9 - 8])
print(numbers[-2])
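
The relation between negative and non-negative indices can be checked directly. This sketch reuses the `numbers` list from the cell above; the name `n` is introduced here only for illustration.

```python
numbers = [17, 123, 87, 34, 66, 8398, 44]
n = len(numbers)  # 7

# A negative index -k names the same element as index n - k.
print(numbers[-1], numbers[n - 1])  # 44 44
print(numbers[-2], numbers[n - 2])  # 8398 8398
print(numbers[-n], numbers[0])      # 17 17

# Valid indices run from -n to n - 1; anything else, such as numbers[n],
# raises an IndexError.
```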

### List traversal

It is common to use a loop variable as a list index as in the following:


In [None]:
horsemen = ["war", "famine", "pestilence", "death"]
for i in [0, 1, 2, 3]:
    print(horsemen[i])

But, sometimes, there is no need for indexing:

In [None]:
horsemen = ["war", "famine", "pestilence", "death"]
for h in horsemen:
    print(h)
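
When both the position and the item are needed, the built-in `enumerate()` offers a middle ground between the two loops above. It is not covered in this section, so take the following as a short sketch rather than as required material.

```python
horsemen = ["war", "famine", "pestilence", "death"]

# enumerate() yields (index, item) pairs, so no manual indexing is needed.
for i, h in enumerate(horsemen):
    print(i, h)

# The same pairs, collected into a list for inspection.
pairs = list(enumerate(horsemen))
print(pairs)
```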

## 8.3 List length

- The function `len` returns the length of a list, which is equal to the number of its elements.

- It is a good idea to use this value as the upper bound of a loop, as it accommodates changes in the length of the list.

```python
   horsemen = ["war", "famine", "pestilence", "death"]

   for i in range(len(horsemen)):
       print(horsemen[i])

   len(["car makers", 1, ["Ford", "Toyota", "BMW"], [1, 2, 3]])
```

Let's try it:

In [None]:
horsemen = ["war", "famine", "pestilence", "death"]

for i in range(len(horsemen)):
    print(horsemen[i])

Try the following and explain the results:

In [None]:
len(["car makers", 1, ["Ford", "Toyota", "BMW"], [1, 2, 3]])

In [None]:
a_list =  ["hello", 2.0, 5, [10, 20]]
print(len(a_list))
print(len(['spam!', 1, ['Brie', 'Roquefort', 'Pol le Veq'], [1, 2, 3]]))
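
One way to read these results: `len` counts only the top-level elements, so a nested list counts as a single element no matter how many items it holds. A small sketch, where the name `cars` is made up for the example:

```python
cars = ["car makers", 1, ["Ford", "Toyota", "BMW"], [1, 2, 3]]

# len() counts top-level elements only: each nested list counts as one.
print(len(cars))      # 4

# To measure a nested list, index it first and then take its length.
print(len(cars[2]))   # 3

# A string element has its own length too: its number of characters.
print(len(cars[0]))   # 10
```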

## 8.4 List membership

- `in` and `not in` are Boolean operators that test membership in a sequence.

```python
  >>> horsemen = ["war", "famine", "pestilence", "death"]
  >>> "pestilence" in horsemen
  True
  >>> "debauchery" in horsemen
  False
  >>> "debauchery" not in horsemen
  True
```

$\Rightarrow$
<https://github.com/fp-leic/public/tree/master/lectures/08/students.py>

Think about the result, before trying the code:

In [None]:
a_list = [3, 67, "cat", [56, 57, "dog"], [ ], 3.14, False]
print(3.14 in a_list)
print("dog" in a_list)

### Nested loop revisited

Count how many students are taking "CompSci".

In [None]:
students = [
    ("John", ["CompSci", "Physics"]),
    ("Vusi", ["Maths", "CompSci", "Stats"]),
    ("Jess", ["CompSci", "Accounting", "Economics", "Management"]),
    ("Sarah", ["InfSys", "Accounting", "Economics", "CommLaw"]),
    ("Zuki", ["Sociology", "Economics", "Law", "Stats", "Music"])]

In [None]:
counter = 0
for name, subjects in students:
    if "CompSci" in subjects:
        counter += 1

print("The number of students taking CompSci is", counter)
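
The same loop can be wrapped in a function so that it works for any subject. This is only a sketch: the function name `count_enrolled` and its parameters are hypothetical, chosen for this example.

```python
def count_enrolled(students, subject):
    """Count the students whose subject list contains `subject`."""
    counter = 0
    for name, subjects in students:
        if subject in subjects:   # membership test on the inner list
            counter += 1
    return counter


students = [
    ("John", ["CompSci", "Physics"]),
    ("Vusi", ["Maths", "CompSci", "Stats"]),
    ("Jess", ["CompSci", "Accounting", "Economics", "Management"]),
    ("Sarah", ["InfSys", "Accounting", "Economics", "CommLaw"]),
    ("Zuki", ["Sociology", "Economics", "Law", "Stats", "Music"])]

print(count_enrolled(students, "CompSci"))  # 3
print(count_enrolled(students, "Stats"))    # 2
print(count_enrolled(students, "Latin"))    # 0
```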

## 8.5 List operations

### The `+` operator

- The `+` operator concatenates lists:

```python
   >>> a = [1, 2, 3]
   >>> b = [4, 5, 6]
   >>> c = a + b
   >>> c
   [1, 2, 3, 4, 5, 6]
```

In [None]:
first_list = [1, 2, 3]
second_list = [4, 5, 6]
both_lists = first_list + second_list
print(both_lists)

### The `*` operator

- Similarly, the `*` operator repeats a list a given number of times:

```python
   >>> [0] * 4
   [0, 0, 0, 0]
   >>> [1, 2, 3] * 3
   [1, 2, 3, 1, 2, 3, 1, 2, 3]
```

In [None]:
print([0] * 4)
print([1, 2, 3] * 3)
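
Both operators build a new list and leave their operands untouched. The short sketch below checks this, using illustrative variable names:

```python
a = [1, 2, 3]
b = [4, 5, 6]

c = a + b      # a new list; a and b are not modified
d = a * 2      # also a new list

print(a)       # [1, 2, 3]
print(b)       # [4, 5, 6]
print(c)       # [1, 2, 3, 4, 5, 6]
print(d)       # [1, 2, 3, 1, 2, 3]
print(len(c) == len(a) + len(b))  # True
```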

## 8.6 List slices

- The slice operations we saw previously with strings let us work with sublists:

```python
   >>> a_list = ["a", "b", "c", "d", "e", "f"]
   >>> a_list[1:3]
   ['b', 'c']
```

In [None]:
a_list = ["a", "b", "c", "d", "e", "f"]
a_list[:4]

In [None]:
a_list[3:]

In [None]:
a_list[:]
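
For reference, here is a sketch of what the slices above evaluate to, together with two further cases (a negative start and an out-of-range end) that are not part of the cells above:

```python
a_list = ["a", "b", "c", "d", "e", "f"]

print(a_list[:4])     # ['a', 'b', 'c', 'd']   from the start up to index 3
print(a_list[3:])     # ['d', 'e', 'f']        from index 3 to the end
print(a_list[:])      # ['a', 'b', 'c', 'd', 'e', 'f']   the whole list
print(a_list[-2:])    # ['e', 'f']             the last two elements

# Unlike indexing, slicing does not fail when a bound is out of range:
# the bound is simply clipped to the length of the list.
print(a_list[4:100])  # ['e', 'f']
```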

## 8.7 Lists are mutable

- Unlike strings and tuples, lists are **mutable**, which means we can change their elements.

- An assignment to an element of a list is called **item assignment**.

```python
   >>> fruit = ["banana", "apple", "quince"]
   >>> fruit[0] = "pear"
   >>> fruit[2] = "orange"
   >>> fruit
   ['pear', 'apple', 'orange']
```

$\Rightarrow$
<https://github.com/fp-leic/public/tree/master/lectures/08/lassign.py>

In [None]:
a_list = ["a", "b", "c", "d", "e", "f"]
a_list[1:3] = []           # delete
a_list

In [None]:
a_list = ["a", "d", "f"]
a_list[1:1] = ["b", "c"]  # insert
a_list

In [None]:
a_list[4:4] = ["e"]       # insert
a_list
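
Deleting and inserting are special cases of a more general rule: assigning to a slice replaces that slice with the elements of the list on the right, whatever its length. A minimal sketch of the general case:

```python
a_list = ["a", "b", "c", "d", "e", "f"]

# Replace two elements with three: the list grows by one.
a_list[1:3] = ["x", "y", "z"]
print(a_list)   # ['a', 'x', 'y', 'z', 'd', 'e', 'f']
grown = a_list[:]   # keep a snapshot for comparison

# Replace three elements with one: the list shrinks by two.
a_list[1:4] = ["b"]
print(a_list)   # ['a', 'b', 'd', 'e', 'f']
```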

## 8.8 List deletion

- Using slices to delete list elements can be error-prone.

- The `del` statement removes elements from a list.

```python
   >>> a = ["one", "two", "three"]
   >>> del a[1]
   >>> a
   ['one', 'three']
```

In [None]:
a_list = ["a", "b", "c", "d", "e", "f"]
del a_list[1:5]
a_list
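
The sketch below walks through `del` with a positive index, a negative index and a slice, showing the list after each step:

```python
a_list = ["a", "b", "c", "d", "e", "f"]

del a_list[0]      # remove one element by index
print(a_list)      # ['b', 'c', 'd', 'e', 'f']

del a_list[-1]     # negative indices work too
print(a_list)      # ['b', 'c', 'd', 'e']

del a_list[1:3]    # remove a slice: the elements at indices 1 and 2
print(a_list)      # ['b', 'e']
```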

## 8.9 Objects and references

### Objects and references with strings

- Since strings are *immutable*, Python can save resources by making two names that are given the same string value **refer** to the same object. This is an optimization made by the interpreter, so programs should not rely on it.

```python
   >>> a = "banana"
   >>> b = "banana"
   >>> a is b
   True
```
![banana](https://raw.githubusercontent.com/fp-leic/public/main/notebooks/08/banana.png)

- `==` is for value equality: use it when you would like to know if two objects have the same value.

- `is` is for reference equality: use it when you would like to know if two references refer to the same object.

### Objects and references with lists

- But, with lists, that's not the case!

- This is because lists can change and become different.

```python
   >>> a = [1, 2, 3]
   >>> b = [1, 2, 3]
   >>> a == b
   True
   
   >>> a is b
   False
```

![lists](https://raw.githubusercontent.com/fp-leic/public/main/notebooks/08/lists.png)

$\Rightarrow$
<https://github.com/fp-leic/public/tree/master/lectures/08/references.py>

### Another example

- `a` and `b` have the same value but do not refer to the same object.

In [None]:
a = [81, 82, 83]
b = [81, 82, 83]

![references](https://raw.githubusercontent.com/fp-leic/public/main/notebooks/08/refs.png)

Try it here:

In [None]:
a == b

In [None]:
a is b
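
The built-in `id()` function gives another view of the same distinction: it returns an object's identity, and `x is y` holds exactly when the two identities are equal. A short sketch, with variable names chosen for illustration:

```python
a = [81, 82, 83]
b = [81, 82, 83]

same_value = a == b            # compares the elements
same_object = a is b           # compares the identities
same_id = id(a) == id(b)       # the same test, spelled with id()

print(same_value)    # True
print(same_object)   # False
print(same_id)       # False
```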

### Repetition and References

With a list, the repetition operator creates copies of the references.

In [None]:
origlist = [45, 76, 34, 55]
print(origlist * 3)

newlist = [origlist] * 3

print(newlist)

![repetition](https://raw.githubusercontent.com/fp-leic/public/main/notebooks/08/repetition.png)

In [None]:
origlist[1] = 99

print(newlist)

![aliases2](https://raw.githubusercontent.com/fp-leic/public/main/notebooks/08/alias2.png)
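
If independent inner lists are wanted instead, each one has to be a separate object. One possible sketch, assuming a slice clone (`origlist[:]`) for each inner list; the names `shared` and `independent` are made up for the example:

```python
origlist = [45, 76, 34, 55]

# Three references to the same inner list.
shared = [origlist] * 3

# Three separate clones, built one at a time.
independent = []
for _ in range(3):
    independent = independent + [origlist[:]]

origlist[1] = 99

print(shared[0])        # [45, 99, 34, 55]  the change shows through
print(independent[0])   # [45, 76, 34, 55]  the clone is unaffected
```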

## 8.10 Aliasing

- Since variables refer to objects, if we assign one variable to another, both variables refer to the same object.

- Although this behavior can be useful, it is sometimes unexpected or undesirable.

```python
   >>> a = [1, 2, 3]
   >>> b = a

   >>> a is b
   True

   >>> b[0] = 5
   >>> a
   [5, 2, 3]
```
![aliases](https://raw.githubusercontent.com/fp-leic/public/main/notebooks/08/lists2.png)

### Watch out for aliases!

- Because the same list has two different names, `a` and `b`, we say that it is **aliased**.
- Changes made with one alias affect the other:

In [None]:
a = [81, 82, 83]
b = a
print(b is a)
b[0] = 85
a

### Alias

![alias](https://raw.githubusercontent.com/fp-leic/public/main/notebooks/08/alias.png)

### The use of aliases

Why would you want to refer to one and the same object by two different names?

- Ordinarily, you don't.

- However, some Python programming constructs automatically make use of aliases.

- In fact, a function's parameters are aliases for the objects passed as arguments, so a change made to a list inside the function is visible to the caller.
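
The effect of aliasing through a function parameter can be seen in a small sketch. The function name `double_in_place` and the list `things` are hypothetical, introduced only for this example.

```python
def double_in_place(values):
    """Double every element of `values`, modifying the caller's list."""
    for i in range(len(values)):
        values[i] = 2 * values[i]


things = [2, 5, 9]
double_in_place(things)   # `values` is an alias for `things` during the call
print(things)             # [4, 10, 18]
```

The function returns nothing; its whole effect is the change to the list it was given.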


## 8.11 Cloning lists

- Often we want to modify a list and also keep a copy of the original.

- One way to **clone** a list is to use the slice operator; another is the list method `copy()`.

```python
   >>> a = [1, 2, 3]
   >>> b = a[:]       # slice clone: works, but the intent is less obvious
   >>> c = a.copy()   # the copy() method states the intent explicitly
   
   >>> b
   [1, 2, 3]

   >>> b[0] = 5

   >>> a
   [1, 2, 3]
```

Let's try it:

In [None]:
a = [1, 2, 3]
b = a.copy()   # get a clone of a
print(b is a)  # is it the same object?
print(b)

Now we are free to make changes to `b` without worrying that we’ll inadvertently be changing `a`:

In [None]:
b[0] = 5
print("b =", b)
print("a =", a)
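
One caveat worth knowing: both `a[:]` and `a.copy()` make a *shallow* copy, so nested lists are still shared between the original and the clone. The sketch below illustrates this and, as one possible remedy, uses `copy.deepcopy()` from the standard library, which is not otherwise covered in this section.

```python
import copy

a = [1, 2, [10, 20]]
b = a.copy()            # shallow copy: the nested list is shared
c = copy.deepcopy(a)    # deep copy: the nested list is cloned as well

b[0] = 5                # changing a top-level slot affects only b
b[2][0] = 99            # changing the shared nested list affects a too

print(a)   # [1, 2, [99, 20]]
print(b)   # [5, 2, [99, 20]]
print(c)   # [1, 2, [10, 20]]
```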

To see it all, go to [Python Tutor](http://www.pythontutor.com/visualize.html#code=a%20%3D%20%5B81,%2082,%2083%5D%0Ab%20%3D%20%5B81,%2082,%2083%5D%0A%0Aprint%28a%20%3D%3D%20b%29%0Aprint%28a%20is%20b%29%0A%0Ab%20%3D%20a%0Aprint%28a%20%3D%3D%20b%29%0Aprint%28a%20is%20b%29%0A%0Ab%5B0%5D%20%3D%205%0Aprint%28a%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false), now!

## 8.12 Using `zip()`

![zip](https://raw.githubusercontent.com/fp-leic/public/main/notebooks/08/zip.png)

### `zip()`

- The function `zip()` is available in the built-in namespace, so it can be used without any import.

- According to [the official documentation](https://docs.python.org/3/library/functions.html#zip), Python’s `zip()` function behaves as follows:

> Returns an iterator of tuples, where the i-th tuple contains the i-th element
> from each of the argument sequences or iterables.
>
> The iterator stops when the shortest input iterable is exhausted.
>
> With a single iterable argument, it returns an iterator of 1-tuples.
> With no arguments, it returns an empty iterator.

[Understanding the Python zip() Function](https://realpython.com/python-zip-function/#understanding-the-python-zip-function)

### Using `zip()`



In [None]:
coordinate = ['x', 'y', 'z']
value = [1, 2, 3, 4, 5]

result = zip(coordinate, value)
result_list = list(result)
print(result_list)
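
Two points from the documentation quoted above can be checked with a short sketch: the result is as long as the shortest input, and a `zip` object is an iterator, so it can be consumed only once. The names `pairs`, `z`, `first_pass` and `second_pass` are illustrative.

```python
coordinate = ['x', 'y', 'z']
value = [1, 2, 3, 4, 5]

pairs = list(zip(coordinate, value))
print(pairs)        # [('x', 1), ('y', 2), ('z', 3)]
print(len(pairs))   # 3, the length of the shorter list

# A zip object is an iterator: once consumed, it yields nothing more.
z = zip(coordinate, value)
first_pass = list(z)
second_pass = list(z)
print(first_pass == pairs)  # True
print(second_pass)          # []
```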

### The inverse of `zip()`

- `zip()` is its own inverse, provided you use the special `*` operator to unpack the list of tuples into separate arguments.

```python
    c, v =  zip(*result_list)
    print("c =", c)
    print("v =", v)
```

$\Rightarrow$
<https://github.com/fp-leic/public/tree/master/lectures/08/zip.py>

Try it yourself here:

In [None]:
l1 = ['a', 'b', 'c']
l2 = [1, 2, 3, 4]

z1 = zip(l1, l2)
print(z1)
print(type(z1))

l3 = list(z1)
print(l3)
print(type(l3))

In [None]:
c, n =  zip(*l3)
print("c =", c)
print("n =", n)
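
To make the unpacking step explicit, here is a sketch with a literal list of pairs (the name `pairs` is made up for the example). Note that the "unzipped" results are tuples, and that elements dropped by the first `zip()` because one input was longer cannot be recovered.

```python
pairs = [('a', 1), ('b', 2), ('c', 3)]

# zip(*pairs) is equivalent to zip(('a', 1), ('b', 2), ('c', 3)).
c, n = zip(*pairs)
print(c)   # ('a', 'b', 'c')
print(n)   # (1, 2, 3)

# The results are tuples; list() converts them when a list is needed.
letters = list(c)
print(letters)   # ['a', 'b', 'c']
```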

Another example:

In [None]:
letters = ['a', 'b', 'c']
numbers = [0, 1, 2]
for l, n in zip(letters, numbers):
    print(f'Letter: {l}')
    print(f'Number: {n}')

Visualise it in [Python Tutor](<http://www.pythontutor.com/visualize.html>).

## 8.13 List operations

### The built-in `sorted()`

  - All programmers will have to write code to sort items or data at some point.
  - Sorting can be critical to the user experience in your application.
  
  - `sorted(iterable, /, *, key=None, reverse=False)`
    - Return a *new list* containing all items from the iterable in ascending order.
    - A custom *key function* can be supplied to customize the sort order, and
    - the *reverse* flag can be set to request the result in descending order.
  
  - With no additional arguments, `sorted()` orders the values in ascending order, that is, from smallest to largest.

$\Rightarrow$
 [Ordering Values With sorted()](https://realpython.com/python-sort/#ordering-values-with-sorted)

Let's try it:

In [None]:
students = [
    ("John", ["CompSci", "Physics"]),
    ("Vus", ["Maths", "CompSci", "Stats"]),
    ("Ann", ["CompSci", "Accounting", "Economics", "Management"]),
    ("Sarah", ["InfSys", "Accounting", "Economics", "CommLaw"]),
    ("Zuki", ["Sociology", "Economics", "Law", "Stats", "Music"])]

In [None]:
sorted(students)

In [None]:
sorted(students, reverse=True)

In [None]:
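def reverse_tuple(t):
    # t is a (name, subjects) tuple; the sort key is the name spelled backwards
    return t[0][::-1]

# order the students by the reversed spelling of their names
sorted(students, key=reverse_tuple)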
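
A smaller sketch may help to separate the roles of `key` and `reverse`. It uses the built-in `len` as the key function and a list of words chosen for the example; it also shows that `sorted()` leaves the original list unchanged.

```python
words = ["pestilence", "war", "death", "famine"]

by_alphabet = sorted(words)                           # default ordering
by_length = sorted(words, key=len)                    # shortest first
longest_first = sorted(words, key=len, reverse=True)  # longest first

print(by_alphabet)     # ['death', 'famine', 'pestilence', 'war']
print(by_length)       # ['war', 'death', 'famine', 'pestilence']
print(longest_first)   # ['pestilence', 'famine', 'death', 'war']
print(words)           # ['pestilence', 'war', 'death', 'famine']
```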

### More list operations

- See the Python Standard Library for:

  - a comprehensive list of "Common Sequence Operations":
    [[PSL](https://docs.python.org/3.8/library/stdtypes.html#common-sequence-operations)];

  - a comprehensive list of operations on "Mutable Sequence Types":
    [[PSL](https://docs.python.org/3.8/library/stdtypes.html#mutable-sequence-types)].

# Further reading

### Python Lists

 Python Tutorial || Learn Python Programming -- Socratica

In [None]:
from IPython.display import YouTubeVideo
YouTubeVideo('ohCDWZgNIU0')

-- João Correia Lopes & Pedro Vasconcelos