# Lists



_(c) 2022, Mark van den Brand and Lina Ochoa Venegas, Eindhoven University of Technology_

## Table of Contents

- [1. A Python List is an Array](#history)
- [2. Lists versus Strings](#lists-vs-strings)
- [3. Creating a List](#creating-list)
- [4. Accessing List Elements](#accessing-elements)
- [5. Lists are Mutable](#lists-mutable)
- [6. Traversing a List](#traversing-list)
- [7. Binary List Operations](#binary-list-op)
- [8. List Slices](#list-slices)
- [9. List Methods](#list-methods)
- [10. Map, Filter, and Reduce](#map-filter-reduce)
- [11. Lists and Functions](#lists-functions)
- [12. Deleting Elements](#deleting-elem)
- [13. Lists and Strings](#lists-strings)
- [14. Lists and Files](#lists-files)
- [15. Lists as Values](#lists-values)
- [16. List as Arguments](#lists-args)

## 1. A Python List is an Array <a class="anchor" id="history"></a>

Another way of storing data is by means of a list. The elements of a list may contain *arbitrary* values.
However, Python “lists” are not linked lists. Internally they are implemented as *arrays*, so any element can be accessed directly by its index.

### A bit of history

<img src="assets/Rutishauser.jpg" alt="Heinz Rutishauser" width="400"/>

<div style="text-align:center">
    <span style="font-size:0.9em; font-weight: bold;"><b>Heinz Rutishauser (30 January 1918 – 10 November 1970) was a Swiss mathematician and a pioneer of modern numerical mathematics and computer science.</b></span>
</div>

Heinz Rutishauser's programming language Superplan (1949–1951) included multi-dimensional arrays. 
Because of the importance of array structures for efficient computation, the earliest high-level programming languages, including FORTRAN (1957), COBOL (1960), and Algol 60 (1960), provided support for multi-dimensional arrays.

*Superplan* was a high-level programming language developed between 1949 and 1951 by *Heinz Rutishauser*, 
the name being a reference to "Rechenplan" (i.e. computation plan), in Konrad Zuse's terminology 
designating a single Plankalkül program.

The language was described in Rutishauser's 1951 publication "Über automatische Rechenplanfertigung bei programmgesteuerten Rechenmaschinen" (i.e. Automatically created Computation Plans for Program-Controlled Computing Machines).

Superplan introduced the keyword "for" (German: *für*) together with its `for` loop, which is an important language concept when traversing lists.

## 2. Lists versus Strings <a class="anchor" id="lists-vs-strings"></a>

We have already worked with the notion of a *sequence*: a string is a sequence of single characters, 
whereas a list is a sequence of, in principle, *arbitrary* values, that is, values of any type.

The values in a list are called **elements** or **items**.

## 3. Creating a List <a class="anchor" id="creating-list"></a>

A new list can be created in different ways.
The simplest way is to put the elements between square (`[` and `]`) brackets.
Lists can also be assigned to variables.

If you want to use a list as argument or result of a function together with a type hint you have to add the following line to your code:

`from typing import List`

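If you know the type of the elements, for instance all elements are integers, you can write `nums: List[int]`. If you do not know the type of the elements, or the list contains elements of different types, you can write `elems: List[Any]`. This additionally requires `from typing import Any`. Note the capital `A`: the lowercase `any` is an unrelated built-in function, not a type.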
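
The following cell is a minimal sketch of both kinds of type hint. The function `total_length` and the variables `word_lengths` and `mixed` are names chosen for this illustration only.

```python
from typing import Any, List


def total_length(words: List[str]) -> int:
    """Return the combined length of all strings in the list."""
    total: int = 0
    for w in words:
        total += len(w)
    return total


# All elements are integers, so the element type can be given.
word_lengths: List[int] = [2, 3]

# Elements of different types: use Any (capital A, imported from typing).
mixed: List[Any] = ['data science', 2, 3.14]

print(total_length(['ab', 'cde']))
```

Type hints are not enforced when the program runs; they document the intention and can be checked by external tools.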

In [None]:
from typing import List

In [None]:
int_list: List = [10, 20, 30, 40]

print(int_list)
type(int_list)

This is a list of four integers.

In [None]:
str_list: List = ['data science', 'computer science', 'programming', 'statistics']

print(str_list)
type(str_list)

This is a list of four strings.
However, it is not necessary that the elements of a list are of the same type.

We can even have other lists as elements.
A list within another list is said to be a **nested** list.

In [None]:
messy_list: List = ['data science', 2, 3.14, [7, 42]]

print(messy_list)
type(messy_list)

Of course, you can assign list values to variables.

In [None]:
bikes: list = ['Gazelle', 'Trek', 'Sparta', 'Specialized']
numbers: list = [7, 42, 6 * 42]
empty_list: list = []

print(bikes, numbers, empty_list)

<div class="alert alert-success">
    <b>Do It Yourself!</b><br>
    Create a list of strings, where each string represents the name of a Dutch artist. Assign the list to the variable <i>dutch_artists</i>.
</div>

In [None]:
# Remove this line and add your code here

## 4. Accessing List Elements <a class="anchor" id="accessing-elements"></a>

The syntax for accessing elements of a list is the same as for accessing characters of a string: you use the bracket operator.

```python
list_variable_or_value[index_expression]
```

The expression inside the brackets is the index of the list element.

In [None]:
bikes[0]

List indices work the same way as string indices:

* Any integer expression can be used as an index.
* If you try to read or write an element that does not exist, you get an `IndexError`.
* If an index has a negative value, it counts backward from the end of the list.

In [None]:
'Grape'[-5]

In [None]:
bikes[-1]
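
The index rules above can be tried out as follows. This is a small sketch on a separate list, `brands`, introduced only for this example; the `try`/`except` construct is used here just to show the error without stopping the program.

```python
brands = ['Gazelle', 'Trek', 'Sparta', 'Specialized']

# Any integer expression can be used as an index.
last_by_length = brands[len(brands) - 1]

# A negative index counts backward from the end of the list.
last_by_negative = brands[-1]

print(last_by_length, last_by_negative)

# Valid indices are 0..3, so index 4 does not exist.
index_error_raised = False
try:
    brands[4]
except IndexError as error:
    index_error_raised = True
    print('IndexError:', error)
```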

The `in` operator also works for lists.

In [None]:
'Sparta' in bikes

In [None]:
'JanJanssen' in bikes

<div class="alert alert-success">
    <b>Do It Yourself!</b><br>
    Check if numbers 3 and 252 are in the list stored in the variable <i>numbers</i>.
</div>

In [None]:
# Remove this line and add your code here

## 5. Lists are Mutable <a class="anchor" id="lists-mutable"></a>

In contrast to strings, list elements can be changed.
**Lists are mutable**, so we can change the value of a list element via an assignment. 
Therefore, we can use an indexed list on the left-hand side of an assignment.

In [None]:
bikes[0] = 'Merida'

print(bikes)
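
To see the contrast with strings, the sketch below attempts the same kind of assignment on a string and on a list. The names `word` and `chars` are chosen for this example only.

```python
word = 'Trek'
chars = ['T', 'r', 'e', 'k']

chars[0] = 'D'          # allowed: lists are mutable
print(chars)

type_error_raised = False
try:
    word[0] = 'D'       # not allowed: strings are immutable
except TypeError as error:
    type_error_raised = True
    print('TypeError:', error)

print(word)             # the string is unchanged
```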

<div class="alert alert-success">
    <b>Do It Yourself!</b><br>
    Change the last item of your list <i>dutch_artists</i>. Think of another artist that you have not included before.
</div>

In [None]:
# Remove this line and add your code here

## 6. Traversing a List <a class="anchor" id="traversing-list"></a>

The most common way of traversing a list is by means of using a `for` loop. 
The syntax is the same as for strings.

<div class="alert alert-info">
    <b>Iterator name</b><br>
    The name of the <i>iterator</i> can be short, because it is typically a local variable.
</div>

In [None]:
for b in bikes:
    print(b)

This way of traversing a list works fine if you only need to read the elements: an iterator over the elements of the list is then sufficient. 

If you need to *update* the elements of the list you have to iterate using explicit *integer indices*.
A common way to do that is to combine the built-in functions `range` and `len`, as shown in the next cell.

`len` returns the length of the list.

In [None]:
len(numbers)

`range` produces the sequence of indices from `0` to `n-1`, where `n` is the length of the list, obtained via the function `len`. In Python 3 the result is a `range` object rather than a list, which is what the next cell displays.

In [None]:
range(len(numbers))
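
To make the individual indices visible, a `range` object can be converted to a list. The sketch below uses its own small list, `values`, so that it does not depend on the contents of `numbers`.

```python
values = [7, 42, 252]

indices = range(len(values))
print(indices)         # range(0, 3)
print(list(indices))   # [0, 1, 2]
```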

In [None]:
print(numbers)

for i in range(len(numbers)):
    numbers[i] *= 2
    
print(numbers)

This loop traverses the list and updates each individual element.
In each iteration of the loop, `i` gets the index of the next element in the range. 
The assignment statement in the body uses `i` to read the old value of the element and to assign the new value.
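
As an alternative sketch, not used elsewhere in this notebook, the built-in function `enumerate` yields the index and the element together, which avoids combining `range` and `len`. The list `prices` is a name chosen for this example.

```python
prices = [7, 42, 252]

# enumerate yields (index, element) pairs
for i, value in enumerate(prices):
    prices[i] = value * 2

print(prices)   # [14, 84, 504]
```

Whether you prefer `range(len(...))` or `enumerate` is largely a matter of style; both give you the index needed to update the list.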

In [None]:
for x in []:
    print('This never happens.')

Is this a useful `for` loop?

A `for` loop with an empty list never executes its body.

<div class="alert alert-success">
    <b>Do It Yourself!</b><br>
    Iterate over the <i>dutch_artists</i> list and change its items so all names appear in capital letters.
</div>

In [None]:
# Remove this line and add your code here

## 7. Binary List Operations <a class="anchor" id="binary-list-op"></a>

Two useful list operations are `+` and `*`. 
We have already seen these operations for strings. 

The `+` operator concatenates two lists, just as it does with strings.

In [None]:
a: List = [1, 2, 3]
b: List = [4, 5, 6]
c: List = a + b
c

The `*` operator repeats a list for a given number of times.

In [None]:
a: List = [1, 2, 3] * 5
a

## 8. List Slices <a class="anchor" id="list-slices"></a>

Similar to strings it is also possible to take a slice of a list.

In [None]:
subjects: List = ['data science', 'computer science', 'programming', 'statistics']

subjects[1:3]

Recall that if you omit the first index, the slice starts at the beginning. 

In [None]:
subjects[:4]

If you omit the second, the slice goes to the end. 

In [None]:
subjects[2:]

If you omit both, the slice is a copy of the entire list.

In [None]:
subjects[:]

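Lists are mutable, so it is possible to use the slice operator on the left-hand side of an assignment.
Beware, however, that this changes the list itself. If you do not want this, make a copy first.

A slice operator on the left side of an assignment can update multiple elements, and the replacement does not need to have the same length as the slice. In the next cell, one element is replaced by two.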

In [None]:
subjects[3:4] = ['software engineering', 'machine learning']

subjects
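
The sketch below illustrates both points: a slice assignment can change the length of a list, and a copy made beforehand with `[:]` is not affected. The names `courses` and `backup` are chosen for this example only.

```python
courses = ['data science', 'computer science', 'programming', 'statistics']
backup = courses[:]              # independent copy of the whole list

courses[1:3] = ['algorithms']    # two elements are replaced by one

print(courses)
print(len(courses))              # 3
print(backup)                    # the copy still has four elements
```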

<div class="alert alert-success">
    <b>Do It Yourself!</b><br>
    Slice the <i>dutch_artists</i> list. We require all items except the first one.
</div>

In [None]:
# Remove this line and add your code here

## 9. List Methods <a class="anchor" id="list-methods"></a>

Python provides methods that operate on lists. 
For example, `append` adds a new element to the end of a list.

In [None]:
subjects: List = ['data science', 'computer science', 'programming', 'statistics']

subjects.append('software engineering')
subjects

An alternative way of appending elements to a list is shown in the following cell.

In [None]:
subjects: List = ['data science', 'computer science', 'programming', 'statistics']

print(subjects[4:])

subjects[4:] = ['software engineering']
subjects

However, beware that this way of appending is error-prone: if you use the wrong index, you will end up replacing elements instead of appending them.

In [None]:
print(subjects[4:])

subjects[4:] = ['system engineering']
subjects

The method `extend` appends all elements of the list given as argument to the list it is applied to. 
The list given as argument is not changed.

In [None]:
subjects1: List = ['data science', 'computer science', 'programming', 'statistics']
subjects2: List = ['software engineering', 'artificial intelligence']

subjects1.extend(subjects2)

print(subjects1)
print(subjects2)

The method `sort` sorts the elements of the list to which it is applied.
Strings are sorted in alphabetical order (more precisely, by character code, so uppercase letters come before lowercase ones); integers and floats are sorted from low to high.

In [None]:
subjects1.sort()
subjects1

In [None]:
numbers: List = [9, 5, 6, 2, 7, 1, 8, 4, 3]
numbers.sort()

numbers

To be able to sort a list, its elements must be comparable with one another, that is, the `<` operator must be defined between their *types*.
So, sorting a list consisting of integers and strings is not possible (the next cell raises a `TypeError`), but a list of integers and floats can be sorted.

In [None]:
ns_list: List = [3, 5, 2, 7, 'abc', 1, 'xyz']
ns_list.sort()

ns_list

In [None]:
ns_list: List = [3, 5, 2, 7, 4.0, 1, 1.5]
ns_list.sort()

ns_list
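
If you want to see the error without interrupting the notebook, you can catch it, as in the following sketch. The names `mixed` and `numeric` are chosen for this example only.

```python
mixed = [3, 5, 'abc', 1]

sort_failed = False
try:
    mixed.sort()        # int and str cannot be compared with <
except TypeError as error:
    sort_failed = True
    print('TypeError:', error)

numeric = [3, 5, 2, 4.0, 1.5]
numeric.sort()          # int and float can be compared
print(numeric)
```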

Most list methods are void: 
they modify the list in place and return `None`. 

Observe what happens if you write `nums = nums.sort()`.

In [None]:
nums: List = [9, 5, 6, 2, 7, 1, 8, 4, 3]
nums = nums.sort()

print(nums)
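
If you need a sorted list but want to keep the original, one option (not covered further in this notebook) is the built-in function `sorted`, which returns a new list. The sketch below contrasts it with the `sort` method; the names `scores`, `ordered`, and `result` are chosen for this example.

```python
scores = [9, 5, 6, 2]

ordered = sorted(scores)    # returns a new, sorted list
print(scores)               # [9, 5, 6, 2], unchanged
print(ordered)              # [2, 5, 6, 9]

result = scores.sort()      # sorts in place and returns None
print(result)               # None
print(scores)               # [2, 5, 6, 9]
```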

<div class="alert alert-success">
    <b>Do It Yourself!</b><br>
    Use the <i>append</i> method to add a new artist to the <i>dutch_artists</i> list. Then sort all items.
</div>

In [None]:
# Remove this line and add your code here

## 10. Map, Filter, and Reduce <a class="anchor" id="map-filter-reduce"></a>

We have seen in the previous section a number of operations that can be applied to lists.
We will continue discussing a number of operations that can be applied to the individual elements of a list.

### Reduce
Suppose you want to add all numerical values of a list. 
You would probably write the following function using a `for` statement.

In [None]:
def add_all(nums: List[int]) -> int:
    """
    Adds the values of a list of integers.
    :param nums: list of integer numbers
    :returns: the sum of the list of integers.
    """
    total: int = 0
    for n in nums:
        total += n
    return total

print(add_all([9, 5, 6, 2, 7, 1, 8, 4, 3]))

As the loop runs, `total` accumulates the sum of the elements; a variable used this way is
sometimes called an **accumulator**.

Adding up the elements of a list is such a common operation that Python provides it as the
built-in function `sum`.

In [None]:
nums: List[int] = [9, 5, 6, 2, 7, 1, 8, 4, 3]

sum(nums)

An operation like this combines a sequence of elements into a single value and is called **reduce**.
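
For combining operations other than addition, the standard library offers `functools.reduce`. The sketch below uses it to rebuild the sum; the helper function `add` and the variable names are chosen for this illustration.

```python
from functools import reduce
from typing import List


def add(x: int, y: int) -> int:
    """Combine the running total x with the next element y."""
    return x + y


values: List[int] = [9, 5, 6, 2, 7, 1, 8, 4, 3]

# reduce starts with 0 and applies add to the total and each element in turn
total: int = reduce(add, values, 0)
print(total)   # 45
```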

### Map

Sometimes you want to apply an operation to every element of a list and collect the results in a new list.

Suppose you want to capitalize all strings in a list.

In [None]:
def capitalize_all(lst: List[str]) -> List[str]:
    """
    Capitalizes the first letter of all string elements.
    :param lst: list of strings
    :returns: the list of capitalized strings.
    """
    rlst: List[str] = []
    for e in lst:
        rlst.append(e.capitalize())
    return rlst

print(subjects1)
print(capitalize_all(subjects1))

`rlst` is initialized with an empty list; each time through the loop, we append the next element.
So `rlst` is another kind of accumulator.

An operation like `capitalize_all` is sometimes called a **map** because it “maps” a function
(in this case the method `capitalize`) onto each of the elements in a sequence.
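
Python also has a built-in function `map` that performs this pattern. The following is a sketch, assuming you pass the string method as `str.capitalize`; the result of `map` is converted to a list to make it printable. The names `topic_names` and `capitalized` are chosen for this example.

```python
from typing import List

topic_names: List[str] = ['data science', 'programming']

# map applies str.capitalize to every element
capitalized: List[str] = list(map(str.capitalize, topic_names))

print(capitalized)   # ['Data science', 'Programming']
print(topic_names)   # the original list is unchanged
```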

### Filter

Another common operation is to select some of the elements from a list and return a sublist.

For example, the following function takes a list of numbers and returns a list that contains
all numbers greater than 5.

In [None]:
def gtr_than_five(nums: List[int]) -> List[int]:
    """
    Filters out all numbers less than or equal to 5.
    :param nums: list of integers
    :returns: list of numbers greater than 5.
    """
    result_nums: List[int] = []
    for n in nums:
        if n > 5:
            result_nums.append(n)
    return result_nums

print(nums)
print(gtr_than_five(nums))

An operation like `gtr_than_five` is called a **filter** because it selects some of the elements and
filters out the others.
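
The same selection can be sketched with the built-in function `filter`, which keeps the elements for which a given function returns `True`. The helper `is_greater_than_five` and the other names are chosen for this example.

```python
from typing import List


def is_greater_than_five(n: int) -> bool:
    """Decide whether an element should be kept."""
    return n > 5


candidates: List[int] = [9, 5, 6, 2, 7]
selected: List[int] = list(filter(is_greater_than_five, candidates))

print(selected)   # [9, 6, 7]
```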

Map, filter, and reduce allow a concise but powerful way of manipulating lists, certainly if combined.
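
As a small sketch of such a combination, the function below, a hypothetical example named `sum_of_squares_of_evens`, selects the even numbers (filter), squares them (map), and adds the squares (reduce) in a single loop.

```python
from typing import List


def sum_of_squares_of_evens(nums: List[int]) -> int:
    """
    Adds the squares of all even numbers in a list.
    :param nums: list of integers
    :returns: the sum of the squares of the even elements.
    """
    total: int = 0
    for n in nums:
        if n % 2 == 0:        # filter: keep even numbers only
            total += n * n    # map: square; reduce: accumulate
    return total


print(sum_of_squares_of_evens([9, 5, 6, 2, 7, 1, 8, 4, 3]))   # 120
```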

Later, we will see even more powerful mechanisms to manipulate the elements of a list, which also work on similar data structures such as *sets*.

<div class="alert alert-success">
    <b>Do It Yourself!</b><br>
    Change all items in the <i>dutch_artists</i> list, so now they are shown in lower case. Is this a map, filter or reduce function?
</div>

In [None]:
# Remove this line and add your code here

## 11. Lists and Functions <a class="anchor" id="lists-functions"></a>

We already saw that we can use the function `sum` to add all items in a list, and the function `len` to get the size of a list.

There are other handy functions that you can use to compute values without writing your own loops.
Some of these functions are `min` and `max`.

We use the function `max` to get the maximum number of a list.

In [None]:
max(numbers)

We use the function `min` to get the minimum number of a list.

In [None]:
min(numbers)

<div class="alert alert-success">
    <b>Do It Yourself!</b><br>
    Can you compute the average of the list <i>numbers</i> without using any loop?
</div>

In [None]:
# Remove this line and add your code here

## 12. Deleting Elements <a class="anchor" id="deleting-elem"></a>

There are several ways of removing elements from a list. Which one to choose depends on the behaviour you want.

We start with the `pop` method. This operation removes the element at a given index; if no index is given, it removes the last element of the list.
The `pop` method returns the element that was removed.

In [None]:
topics: List = ['data science', 'computer science', 'programming', 'statistics']
elem: str = topics.pop(1)

print(topics)
print(elem)
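
The sketch below shows both forms of `pop` side by side. The list `queue` and the other names are chosen for this example only.

```python
queue = ['data science', 'computer science', 'programming']

last = queue.pop()      # no index: removes and returns the last element
first = queue.pop(0)    # index 0: removes and returns the first element

print(last)
print(first)
print(queue)
```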

If you are not interested in the removed element, you can use the `del` statement.

In [None]:
topics: List = ['data science', 'computer science', 'programming', 'statistics']

del topics[1]
print(topics)

If you know the element you want to remove, but not its index, you can use `remove`. It removes the first occurrence of the given value.
The return value of `remove` is `None`.

In [None]:
topics: List = ['data science', 'computer science', 'programming', 'statistics']

topics.remove('programming')
print(topics)
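
Two details of `remove` are worth trying out: only the first matching element is removed, and asking for a value that is not in the list raises a `ValueError`. The list `tags` in this sketch is a name chosen for the example.

```python
tags = ['python', 'lists', 'python']

tags.remove('python')       # removes the first occurrence only
print(tags)                 # ['lists', 'python']

value_error_raised = False
try:
    tags.remove('java')     # 'java' is not in the list
except ValueError as error:
    value_error_raised = True
    print('ValueError:', error)
```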

If several consecutive elements have to be removed from a list, it is better to use `del` with a slice.

In [None]:
topics: List = ['data science', 'computer science', 'programming', 'statistics']

del topics[1:3]
print(topics)

<div class="alert alert-success">
    <b>Do It Yourself!</b><br>
    Remove the last item of the <i>dutch_artists</i> list. What operation did you choose and why?
</div>

In [None]:
# Remove this line and add your code here

## 13. Lists and Strings <a class="anchor" id="lists-strings"></a>

A string is a sequence of characters and a list is a sequence of values, but a list of characters is not the same as a string. 

To convert from a string to a list of characters, you can use `list`.

In [None]:
s: str = 'data science'

lst: list = list(s)
print(lst)

Because `list` is the name of a built-in function, you should avoid using it as a variable
name.
The `list` function breaks a string into individual characters. 

If you want to break a string (sentence) into words, you can use the `split` method.
By default, it splits the string at whitespace.

In [None]:
claim: str = 'Data science is younger than computer science or not?'

words: List = claim.split()
print(words)

An optional argument called a **delimiter** specifies which characters to use as word boundaries.
The following example uses a hyphen `-` as a delimiter.

In [None]:
big_word: str = 'Multi-language-programming'

words: list = big_word.split('-')
print(words)

The method `join` is the inverse of the `split` operation.
It takes a list of strings (words) and concatenates the elements.

`join` is a string method: you invoke it on the delimiter and
pass the list as an argument.

In [None]:
words: List = ['Data', 'science', 'is', 'fun!']
sentence: str = ' '.join(words)
print(sentence)
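
Because `split` and `join` are inverses, splitting on a delimiter and joining with the same delimiter gives back the original string. The sketch below checks this on a date string and then reorders the parts; all names are chosen for this example.

```python
date = '01-09-2020'

parts = date.split('-')
print(parts)                  # ['01', '09', '2020']

rebuilt = '-'.join(parts)     # same delimiter: back to the original
print(rebuilt == date)        # True

# Joining the parts in another order with another delimiter
reordered = '/'.join([parts[2], parts[1], parts[0]])
print(reordered)              # 2020/09/01
```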

<div class="alert alert-success">
    <b>Do It Yourself!</b><br>
    Can you replace the dots in the following message with spaces?
</div>

In [None]:
msg = 'We.want.to.modify.this.message'

# Remove this line and add your code here

## 14. Lists and Files <a class="anchor" id="lists-files"></a>

We usually read files to extract interesting information from them. 
To do so, we first need to find *interesting* lines that contain *interesting* information required by our program. 
To find this information we need to **parse** the interesting lines.

Let us look at this example. We want to detect lines that have the following form:

```python
'INFO Sending email [01-09-2020T07:44:11.144] from:bob@mail.nl to:alice@mail.nl'
```

We want to extract the sender of the email of all lines that start with the "INFO Sending email" string.

Later in the course we will see a more powerful mechanism, *regular expressions*, to find patterns in strings.

In [None]:
def print_sender() -> None:
    """ 
    Looks for interesting lines and prints the sender of an email.
    """
    logs = open('datasets/logs.txt') # Get lines from the logs.txt file
    
    for line in logs:
        line = line.rstrip() # Strings are immutable: keep the stripped result
        
        if line.startswith('INFO Sending email'): # Pick interesting lines
            words: List = line.split() # Split line into words
            print(words[4]) # Print the sender field, e.g. from:bob@mail.nl

    logs.close() # Release the file once all lines have been processed

print_sender()
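
The same parsing steps can be tried without a file. The sketch below works on a list of strings instead; the second and third sample lines are invented for this illustration, and `extract_senders` is a hypothetical helper. It additionally splits the field `from:bob@mail.nl` at the colon to keep only the address.

```python
from typing import List

# Sample data: the first line has the form shown above, the others are made up
sample_lines: List[str] = [
    'INFO Sending email [01-09-2020T07:44:11.144] from:bob@mail.nl to:alice@mail.nl\n',
    'DEBUG Cache refreshed\n',
    'INFO Sending email [01-09-2020T08:02:40.310] from:carol@mail.nl to:bob@mail.nl\n',
]


def extract_senders(lines: List[str]) -> List[str]:
    """
    Collects the sender addresses of all 'INFO Sending email' lines.
    :param lines: list of log lines
    :returns: list of sender addresses.
    """
    senders: List[str] = []
    for line in lines:
        line = line.rstrip()                        # drop the trailing newline
        if line.startswith('INFO Sending email'):   # pick interesting lines
            words: List[str] = line.split()         # split line into words
            sender_field: str = words[4]            # e.g. 'from:bob@mail.nl'
            senders.append(sender_field.split(':')[1])
    return senders


print(extract_senders(sample_lines))
```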

<div class="alert alert-success">
    <b>Do It Yourself!</b><br>
    Can you now print the receiver of the email?
</div>

In [None]:
# Remove this line and add your code here

## 15. Lists as Values <a class="anchor" id="lists-values"></a>

### Objects and Values

There is a subtle difference between *objects* and *values*. 
Even when two objects represent the same value, they may be considered to be distinct.

In [None]:
a: str = 'Data'
b: str = 'Data'

a == b

`a` and `b` both refer to the string `"Data"`, but do they actually refer to the same string?

<img src="assets/figure102.png"/>

`a` and `b` can refer to two different **objects** that have the same value, or they refer to the same **object**.

We can check this by means of the `is` operator.

In [None]:
a: str = 'Data'
b: str = 'Data'

print(a is b)
print(a == b)

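In this case Python creates one string object. This reuse of string objects is an optimization of the Python implementation and should not be relied upon.

Observe what happens if you change both strings to "Data science". 
Try to explain this based on the <i>split</i> method from the previous section.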

In [None]:
a: str = 'Data science'
b: str = 'Data science'

print(a is b)
print(a == b)

For lists, no such sharing takes place: two separately created `list` objects are never the same object.

In [None]:
a: List = [1, 2, 3]
b: List = [1, 2, 3]

a is b

If you create two lists that are exactly the same (the same elements in the same order), two different objects are created.
The two lists are **equivalent** (they have the same elements), but they are not **identical** (they are not the same object).

If two objects are identical, they are also equivalent, but if they are equivalent, they are not necessarily identical.
If you want to be precise with your terminology, then you say that an object has a value.

If you evaluate `[1, 2, 3]`, you get a list object whose
value is a sequence of integers. 
If another list has the same elements, we say it has the
same value, but it is not the same object.
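
The difference between equivalent and identical corresponds to the operators `==` and `is`, as the following sketch shows. The names `first_list` and `second_list` are chosen for this example.

```python
first_list = [1, 2, 3]
second_list = [1, 2, 3]

print(first_list == second_list)   # True: same value (equivalent)
print(first_list is second_list)   # False: different objects (not identical)
print(first_list is first_list)    # True: an object is identical to itself
```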

### Aliasing

What will be the result of the following Python fragment?

In [None]:
a: List = [1, 2, 3]
b: List = a
b is a


The association of a variable with an object is called a **reference**.

In the previous example, both `a` and `b` refer to the same object:

<img src="assets/figure104.png"/>

If an object has more than one reference then the object is **aliased**.

If the aliased object is mutable, then changes made via one variable affect the other.

In [None]:
a: List = [1, 2, 3]
b: List = a
b[1] = 42

print(b)
print(a)

In this case `a` and `b` are aliases for the same object.
**Aliasing** is useful, but dangerous:
it makes programs harder to understand, because you have to keep all references in mind.

It is safer to avoid aliasing when you are working with mutable objects.

<div class="alert alert-success">
    <b>Do It Yourself!</b><br>
    Read about the <i>copy()</i> method. Can you modify the following code so only <i>b</i> is modified?
</div>

In [None]:
a = [1, 2, 3]
b = a
b[1] = 42

print(b)
print(a)

## 16. List as Arguments <a class="anchor" id="lists-args"></a>

If you pass a list as an argument to a function, you have to be aware of aliasing.
The function gets a reference to the list, which is a mutable object, so every modification
to the list inside the function affects the original list.

In [None]:
from typing import Any

def delete_head(lst: List[Any]) -> None:
    """ 
    Deletes the head of a list.
    :param lst: list from which the head is deleted
    """
    del lst[0]
    
topics = ['Data science', 'Computer science', 'Programming']
delete_head(topics)

print(topics)

The parameter `lst` and the variable `topics` refer to the same object, so they are aliases.

<div class="alert alert-info">
    <b>Side effect free</b><br>
    Such a function has a so-called <b>side effect</b>. When developing functions, make sure they are side-effect free in order to increase the understandability of your software.
</div>

It is important to distinguish between operations that modify lists and operations that create new lists. 
The `append` method modifies a list, whereas the `+` operator creates a
new list.

In [None]:
lst1: List = [1, 2]
lst2: List = lst1.append(3)

print(lst1)
print(lst2)

The return value of the method `append` is `None`, so `lst2` is `None`, while `lst1` now contains three elements.

In [None]:
lst3: List = lst1 + [4]

print(lst1)
print(lst3)

It is important to beware of aliases when you are writing functions that are supposed to modify lists.
The following function does not delete the head of a list.

<div class="alert alert-info">
    <b>List with arbitrary elements</b><br>
    The type hint of the <code>lst</code> argument of <code>bad_delete_head</code> is <code>List[Any]</code> because this function operates on lists with arbitrary elements.
</div>

In [None]:
print(lst3)

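from typing import Any

def bad_delete_head(lst: List[Any]) -> None:
    """ 
    Attempts (and fails) to delete the head of a list.
    :param lst: list from which the head should be deleted
    """
    lst = lst[1:]   # WRONG! This only rebinds the local name lst to a new list
    print(lst)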
    
bad_delete_head(lst3)
print(lst3)

At the beginning of `bad_delete_head`, `lst` and `lst3` refer to the same list. 
At the end, `lst` refers to a new list, but `lst3` still refers to the original, unmodified list.
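
If the caller's list really has to change, the existing object must be modified instead of rebinding the parameter. One way to sketch this, besides `del lst[0]`, is a slice assignment; `delete_head_in_place` is a hypothetical name used for this illustration.

```python
from typing import Any, List


def delete_head_in_place(lst: List[Any]) -> None:
    """
    Deletes the head of a list by modifying the list object itself.
    :param lst: list from which the head is deleted
    """
    lst[:] = lst[1:]   # slice assignment replaces the contents of the same object


items = [1, 2, 3, 4]
alias = items          # second reference to the same list

delete_head_in_place(items)

print(items)           # [2, 3, 4]
print(alias is items)  # True: still one object, now modified
```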

An alternative is to write a function that creates and returns a new list. For example, `tail`
returns all but the first element of a list.

In [None]:
from typing import Any

def tail(lst: List[Any]) -> List[Any]:
    """ 
    Returns a new list without the head; the original list is not changed.
    :param lst: list whose head is left out
    :returns: new list without head
    """
    return lst[1:]
    
letters = ['a', 'b', 'c']
rest = tail(letters)

print(rest)

---
This Jupyter Notebook is based on Chapter 8 of the book Python for Everybody and Chapter 10 of the book Think Python.

---

# (End of Notebook)

&copy; 2022-2023 - **TU/e** - Eindhoven University of Technology