# LBYCPA1 Module 7 
## Collection Array: Lists and Tuples

### Objectives:
1. To understand the concept of sequences
1. To become familiar with the list and tuple collection types
1. To use lists and tuples for effective storage of data
1. To utilize lists and tuples in solving computational problems
1. (Add an objective...)

### Materials and Tools:
1. Instructor's lecture notes
1. Jupyter Notebook
1. Flowchart Software (Diagrams.net, Lucidchart, SmartDraw, etc.)
1. (Add a material or tool...)

### Sequences
A sequence is a positionally ordered collection of other objects. Sequences maintain a left-to-right order among the items they contain: their items are stored and fetched by their relative position. The Python sequence data types include strings, lists, range objects returned by `range()`, and tuples. Some common operations for the sequence type object can work on both mutable and immutable sequences.

The operations in the following table are supported by most sequence types, both mutable and immutable.

| Operation | Result |
|:- |:- |
| `x in s` | `True` if an item of `s` is equal to `x`, else `False` |
| `x not in s` | `False` if an item of `s` is equal to `x`, else `True` |
| `s + t` | the concatenation of `s` and `t` |
| `s * n` or `n * s` | equivalent to adding `s` to itself `n` times |
| `s[i]` | `i`th item of `s`, origin 0 |
| `s[i:j]` | slice of `s` from `i` to `j` |
| `s[i:j:k]` | slice of `s` from `i` to `j` with step `k` |
| `len(s)` | length of `s` |
| `min(s)` | smallest item of `s` |
| `max(s)` | largest item of `s` |
| `s.index(x[, i[, j]])` | index of the first occurrence of `x` in `s` (at or after index `i` and before index `j`) |
| `s.count(x)` | total number of occurrences of `x` in `s` |
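
As a minimal sketch of the table above, the same operations can be tried on a list, a tuple, and a string. The variable names and sample values here are illustrative only and do not come from the rest of this module.

```python
# Three sequence types: a list, a tuple, and a string
names = ["Sarah", "John", "Joe", "Emily"]   # list (mutable)
point = (3, 1, 4, 1, 5)                     # tuple (immutable)
word = "sequence"                           # string (immutable)

has_joe = "Joe" in names        # membership test
combined = point + (9, 2)       # concatenation builds a new tuple
repeated = [0] * 3              # repetition
first_letter = word[0]          # indexing starts at 0
middle = word[2:5]              # items at indices 2, 3 and 4
every_other = point[::2]        # slice with a step of 2
first_one = point.index(1)      # index of the first occurrence of 1
ones = point.count(1)           # number of occurrences of 1

print(has_joe, combined, repeated)
print(first_letter, middle, every_other)
print(len(names), min(point), max(point), first_one, ones)
```

Note that none of these operations changes `names`, `point`, or `word`; each one returns a new value.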

The operations in the following table are defined on mutable sequence types.

| Operation | Result |
|:- |:- |
| `s[i] = x` | item `i` of `s` is replaced by `x` |
| `s[i:j] = t` | slice of `s` from `i` to `j` is replaced by the contents of the iterable `t` |
| `del s[i:j]` | same as `s[i:j] = []` |
| `s[i:j:k] = t` | the elements of `s[i:j:k]` are replaced by those of `t` |
| `del s[i:j:k]` | removes the elements of `s[i:j:k]` from the list |
| `s.append(x)` | appends `x` to the end of the sequence (same as `s[len(s):len(s)] = [x]`) |
| `s.clear()` | removes all items from `s` (same as `del s[:]`) |
| `s.copy()` | creates a shallow copy of `s` (same as `s[:]`) |
| `s.extend(t)` or `s += t` | extends `s` with the contents of `t` (for the most part the same as `s[len(s):len(s)] = t`) |
| `s *= n` | updates `s` with its contents repeated `n` times |
| `s.insert(i, x)` | inserts `x` into `s` at the index given by `i` (same as `s[i:i] = [x]`) |
| `s.pop(i)` | retrieves the item at `i` and also removes it from `s` |
| `s.remove(x)` | remove the first item from `s` where `s[i]` is equal to `x` |
| `s.reverse()` | reverses the items of `s` in place |
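
The mutable operations can be traced step by step on a small list. This is only a sketch with made-up values; the comment after each statement shows the list as it stands after that statement.

```python
s = [10, 20, 30, 40, 50]

s[0] = 11               # [11, 20, 30, 40, 50]
s[1:3] = [21, 22, 23]   # [11, 21, 22, 23, 40, 50]
del s[4:]               # [11, 21, 22, 23]
s.append(60)            # [11, 21, 22, 23, 60]
s.extend([70, 80])      # [11, 21, 22, 23, 60, 70, 80]
s.insert(1, 15)         # [11, 15, 21, 22, 23, 60, 70, 80]
removed = s.pop(2)      # removed is 21; s is [11, 15, 22, 23, 60, 70, 80]
s.remove(22)            # [11, 15, 23, 60, 70, 80]
s.reverse()             # [80, 70, 60, 23, 15, 11]

backup = s.copy()       # shallow copy, unaffected by the clear() below
s.clear()               # []

print(removed, backup, s)
```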

### Lists
A `list` is a sequence of data. An 'array' in most other languages is a similar concept, but Python lists are more general than most arrays as they can hold a mixture of types.

Lists may be constructed in several ways:
* Using a pair of square brackets to denote the empty list: `[]`
* Using square brackets, separating items with commas: `[a]`, `[a, b, c]`
* Using a list comprehension: `[x for x in iterable]`
* Using the type constructor: `list()` or `list(iterable)`
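
A short sketch of the four construction styles listed above, using illustrative names and values:

```python
empty = []                            # empty pair of square brackets
letters = ["a", "b", "c"]             # items separated by commas
squares = [n * n for n in range(5)]   # list comprehension
from_string = list("abc")             # type constructor applied to an iterable
from_range = list(range(3))           # a range object is also an iterable

print(empty, letters, squares, from_string, from_range)
```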

In [None]:
lab_group0 = ["Sarah", "John", "Joe", "Emily", "Quang"]
print("Lab group members: " + str(lab_group0))
print("Size of lab group: " + str(len(lab_group0)))

print("Check the Python object type: " + str(type(lab_group0)))

The `len` function returns the length (number of items) of the list.

An empty list is created by

In [None]:
my_list = []
another_list = list()

A list of length 5 with repeated values can be created by

In [None]:
my_list = ["Oliver"] * 5
print(my_list)

Lists can also be concatenated with the `+` operator:

In [None]:
print(lab_group0 + my_list)

#### Indexing
Lists store data in order, so it is possible to 'index into' a list using an integer (this will be familiar to anyone who has used C), e.g.:

In [None]:
first_member = lab_group0[0]
third_member = lab_group0[2]
print(first_member, third_member)

or

In [None]:
for i in range(len(lab_group0)):
    print(lab_group0[i])

Indices start from zero, and run through to (length - 1).

Indexing can be useful for numerical computations. We use it here to compute the dot product of two vectors:

In [None]:
# Two vectors of length 4
x = [1.0, 3.5, 7.2, 8.9]
y = [-1.0, 27.1, 1.0, 6]

print("i : x[i], y[i] : x*y : dot")
# Compute dot-product
dot_product = 0.0
for i in range(len(x)):
    dot_product += round(x[i]*y[i], 2)
    print(i, ":", x[i], ",", y[i], ":", round(x[i]*y[i], 2), ":", dot_product)

print(dot_product)

When possible, it is better to iterate over a list rather than use indexing, since there are data structures that support iterating but do not support indexing.
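
As a sketch of the index-free alternative, the same dot product can be computed by iterating over both vectors at once with the built-in `zip`. This version does not round the individual terms, so intermediate values may differ slightly from the indexed loop; note also that `zip` stops at the end of the shorter sequence.

```python
x = [1.0, 3.5, 7.2, 8.9]
y = [-1.0, 27.1, 1.0, 6]

# Pair up corresponding entries and accumulate their products
dot_product = 0.0
for xi, yi in zip(x, y):
    dot_product += xi * yi

# The same computation written with sum() and a generator expression
dot_sum = sum(xi * yi for xi, yi in zip(x, y))

print(round(dot_product, 2), round(dot_sum, 2))
```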

If we have a list-of-lists,

In [None]:
lab_group0 = ["Sarah", "John", "Joe", "Emily"]
lab_group1 = ["Roger", "Rachel", "Amer", "Caroline", "Colin"]
lab_groups = [lab_group0, lab_group1]
print(lab_groups)

In [None]:
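# Search every group for "Emily" and print (group index, position in group)
for grp, group in enumerate(lab_groups):
    for index, member in enumerate(group):
        if member == "Emily":
            print(grp, index)
            break  # leaves the inner loop only; the outer loop moves on to the next group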

we can use the first index to access a list, and a second index to access the entry in that list:

In [None]:
group = lab_groups[0]
print(group, type(group))

name = lab_groups[1][4]
print(name, type(name))
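
Searching and double indexing can be combined in a small helper. The function name `find_member` is hypothetical and used only for illustration; returning from the function ends both loops as soon as a match is found.

```python
def find_member(groups, name):
    """Return (group_index, member_index) of the first match, or None."""
    for group_index, group in enumerate(groups):
        for member_index, member in enumerate(group):
            if member == name:
                return group_index, member_index
    return None


lab_groups = [
    ["Sarah", "John", "Joe", "Emily"],
    ["Roger", "Rachel", "Amer", "Caroline", "Colin"],
]

position = find_member(lab_groups, "Emily")
print(position)

# Use the two indices to read the entry back from the list-of-lists
group_index, member_index = position
print(lab_groups[group_index][member_index])
```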

#### Iteration
Looping over each item in a list (or more generally a sequence) is called *iteration*. We iterate over the members of the lab group using the syntax:

In [None]:
for i in range(len(lab_group0)):
    print(lab_group0[i], type(i))

In [None]:
for member in lab_group0:
    print(member, type(member))

Say we want to iterate over the names of the lab group members, and get the position of each member in the list. We use `enumerate` for this:

In [None]:
for n, member in enumerate(lab_group0):
    print(n, member)

In the above, `n` is the position in the list and `member` is the $n$th entry in the list. Sometimes we need to know the position, in which case `enumerate` is helpful. However, when possible it is preferable to use a 'plain' iteration. Note that Python counts from zero - it uses zero-based indexing.
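
When a printed list should count from one rather than zero, `enumerate` accepts an optional `start` argument. A minimal sketch, with the list redefined here so the example is self-contained:

```python
lab_group0 = ["Sarah", "John", "Joe", "Emily"]

numbered = []
for position, member in enumerate(lab_group0, start=1):
    # position counts 1, 2, 3, ... while list indices still start at 0
    numbered.append(f"{position}. {member}")

print("\n".join(numbered))
```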

#### Heterogeneity
Python lists are heterogeneous data structures - this means they can store mixed types, e.g.

In [None]:
mixed_list = ["Adam", 2 + 4j, 1.0, 4]
for entry in mixed_list:
    print(entry, type(entry))

Arrays in most languages are homogeneous - every element in the array must have the same type.

#### Manipulating lists 
There are many functions for manipulating lists. It might be useful to sort the list:

In [None]:
print(lab_group0)
lab_group0.sort()
for member in lab_group0:
    print(member)

In the above, `sort` is known as a method of a `list`. It performs an *in-place* sort, i.e. `lab_group0` is sorted, rather than creating a new list with sorted entries (for the latter we would use `sorted(lab_group0)`, which returns a new list).
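
The difference between the two can be seen in a short sketch (the list is redefined here with illustrative values). A common mistake is to write `names = names.sort()`, which replaces the list with `None`.

```python
names = ["Sarah", "John", "Joe", "Emily"]
original = list(names)           # keep a copy for comparison

ordered_copy = sorted(names)     # returns a new sorted list
untouched = (names == original)  # True: sorted() did not change names

result = names.sort()            # sorts names in place and returns None

descending = sorted(names, reverse=True)  # optional reverse order

print(ordered_copy, untouched, result, descending)
```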

With lists we can add and remove students:

In [None]:
# Remove the second student (indexing starts from 0, so 1 is the second element)
student2 = lab_group0.pop(1)
print(lab_group0)
print(student2)

# Add new student "Josephine" at the end of the list
lab_group0.append("Josephine")
print(lab_group0)

or maybe join two groups to create one larger group:

In [None]:
lab_group1 = ["Roger", "Rachel", "Amer", "Caroline", "Colin"]

lab_group = lab_group0 + lab_group1
print(lab_group)

or create a list of group lists:

In [None]:
lab_groups = [lab_group0, lab_group1]
print(lab_groups)

print("---")

print("Print each lab group (name and members):")
for index, lab_group in enumerate(lab_groups):
    print(index, lab_group)

The official Python Tutorial page [More on Lists](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) describes the rest of the methods for Python lists.

### Tuples
The tuple data type is almost identical to the list data type, except that tuples are immutable: once a tuple is created, its items cannot be reassigned, appended, or removed. A tuple is written with an opening parenthesis and a closing parenthesis, `()`.

For something that should not change after it has been created, such as a vector of length three with fixed entries, a tuple is more appropriate than a list. It is 'safer' in this case since it cannot be modified accidentally in a program. Being immutable ('read-only') also permits implementations to be optimised for speed.

To create a tuple, use round brackets. Say at a college each student is assigned a room, and the rooms are numbered. A student 'Laura' is given room 32:

In [None]:
room = ("Laura", 32)
print("Room allocation: " + str(room))
print("Length of entry: " + str(len(room)))
print(type(room))

We can iterate over tuples in the same way as with lists,

In [None]:
# Iterate over tuple values
for d in room:
    print(d)

and we can index into a tuple:

In [None]:
# Index into tuple values
print(room[1])
print(room[0])

It is also possible to assign the elements of a tuple into several variables in a single assignment statement. This process is called tuple unpacking and it works for any sequence on the right-hand side. Sequence unpacking requires that there are as many variables on the left side of the equals sign as there are elements in the sequence. For example:

In [None]:
student, room_number = room
print(f"Student: {student}, Room number: {room_number}")
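
If the number of variables does not match the number of elements, Python raises a `ValueError`. The sketch below catches that error so the cell keeps running; the flag `unpack_failed` is only there for illustration.

```python
room = ("Laura", 32)

# Two variables for two elements: works
student, room_number = room

# Three variables for two elements: raises ValueError
unpack_failed = False
try:
    a, b, c = room
except ValueError as error:
    unpack_failed = True
    print("Unpacking failed:", error)

# Unpacking works for other sequences too, such as a list
first, second = ["x", "y"]
print(student, room_number, first, second)
```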

We might have a student who is given permission to live outside of college. To keep track of them we still want an entry for the student, but there is no associated room number. We can create a tuple of length one:

In [None]:
room = ("Adrian",)
print("Room allocation: " + str(room))
print("Length of entry: " + str(len(room)))
print(type(room))

As part of a rooms database, we can create a list of tuples:

In [None]:
room_allocation = [("Adrian",), ("Laura", 32), ("John", 31), ("Penelope", 28), ("Fraser", 28), ("Gaurav", 19)]
print(room_allocation)

To make it easier for students to find their rooms on a printed room list, we can sort the list:

In [None]:
room_allocation.sort()
print(room_allocation)
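
Tuples are compared element by element, so the sort above orders the entries by name, the first element of each tuple. As a sketch of one possible variation, entries can instead be ordered by room number by passing a `key` function to `sorted`. Entries without a room are filtered out first, since they have no second element to sort on.

```python
room_allocation = [("Adrian",), ("Laura", 32), ("John", 31),
                   ("Penelope", 28), ("Fraser", 28), ("Gaurav", 19)]

# Keep only the entries that have a room number
with_rooms = [entry for entry in room_allocation if len(entry) > 1]

# Sort by the second element of each tuple (the room number)
by_room = sorted(with_rooms, key=lambda entry: entry[1])

for name, number in by_room:
    print(number, name)
```

Because Python's sort is stable, entries that share a room number keep their original relative order.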

We could improve the printed list by not printing those living outside of the college:

In [None]:
for entry in room_allocation:
    if len(entry) > 1:
        print("Name: " + str(entry[0]) + " \n  Room: " + str(entry[1]))

#### Immutability

Though tuples may seem similar to lists, they are often used in different situations and for different purposes. Tuples are immutable, and usually contain a heterogeneous sequence of elements that are accessed via unpacking or indexing. Lists are mutable, and their elements are usually homogeneous and are accessed by iterating over the list. 

If we try to change the student's name in the `room` tuple, Python raises a `TypeError`:

In [None]:
room[0] = "Liza"
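
Since a tuple cannot be changed in place, one way to get an "updated" entry is to build a new tuple from pieces of the old one. A minimal sketch, using an illustrative `room` value and catching the error so the cell keeps running:

```python
room = ("Laura", 32)

try:
    room[0] = "Liza"          # item assignment is not allowed on a tuple
except TypeError as error:
    print("Cannot modify a tuple:", error)

# Build a new tuple: a one-item tuple plus a slice of the old one
new_room = ("Liza",) + room[1:]

print(room)       # the original tuple is unchanged
print(new_room)
```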

#### Operations and Methods

A tuple can also be concatenated to another tuple or can be repeated using the same operators used for strings:

In [None]:
dice_roll_1 = (1, 4, 3, 5, 1)
dice_roll_2 = (2, 4, 5, 7, 4)

all_rolls = dice_roll_1 + dice_roll_2 # concatenation
print(all_rolls)

double_rolls = all_rolls * 2 # repetition
print(double_rolls)

Tuples also have the `count` method:

In [None]:
double_rolls.count(1) # Count all the 1s in the tuple

If we need to know the index of the first occurrence of an element in a tuple, we can use the `index` method:

In [None]:
double_rolls.index(7)
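
The `index` method raises a `ValueError` when the value is not present, so it can help to test membership with `in` first. The sketch below rebuilds `double_rolls` from the values used above; the name `position_of_six` is illustrative.

```python
double_rolls = (1, 4, 3, 5, 1, 2, 4, 5, 7, 4) * 2

ones = double_rolls.count(1)            # number of 1s in the tuple
first_seven = double_rolls.index(7)     # index of the first 7

# The optional second argument starts the search at a later index
second_seven = double_rolls.index(7, first_seven + 1)

# Check membership before calling index() to avoid a ValueError
if 6 in double_rolls:
    position_of_six = double_rolls.index(6)
else:
    position_of_six = None

print(ones, first_seven, second_seven, position_of_six)
```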

## References
- Lutz, M. (2009). *Learning Python: Powerful Object-Oriented Programming*. Beijing: O'Reilly.
- Python Software Foundation (2022). *An Informal Introduction to Python - Python 3.10.4 documentation*. Retrieved from https://docs.python.org/3/tutorial/introduction.html#lists
- Python Software Foundation (2022). *Data Structures - Python 3.10.4 documentation*. Retrieved from https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences
- Sweigart, A. (2019). *Automate The Boring Stuff With Python, 2nd Edition*. No Starch Press, US.