# Lists, tuples, and sets
> the basics of built-in Python collections

## Lists

### Slice assignment techniques

Slice notation (as in `lista[index1:index2]`) can be used in assignments to replace the elements in a list.

This works even if the list on the right-hand side has more or fewer elements than the slice being replaced, in which case the length of the list changes.

In [5]:
x = [1, 2, 3, 4]

assert x[:len(x)] == [1, 2, 3, 4]

x[len(x):] = [5, 6, 7]
assert x == [1, 2, 3, 4, 5, 6, 7 ]

x[:0] = [-1, 0]
assert x == [-1, 0, 1, 2, 3, 4, 5, 6, 7]

x[1:-1] = []
assert x == [-1, 7]

For example, if you have a list that is 10 items long, you can use slice syntax to move the last three items to the beginning while keeping them in the same order:

In [None]:
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
x[0:0] = x[-3:]
x[-3:] = []

assert x == [8, 9, 10, 1, 2, 3, 4, 5, 6, 7]
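
The same idea can be wrapped in a small helper. This is only a sketch: the name `move_tail_to_front` is made up for this example and is not part of Python.

```python
def move_tail_to_front(items, n):
    """Move the last n items of a list to its front, in place, keeping their order."""
    if n <= 0:
        return  # items[-0:] would select the whole list, so handle this case explicitly
    tail = items[-n:]   # copy the last n items
    del items[-n:]      # remove them from the end
    items[:0] = tail    # slice assignment at position 0 inserts them at the front


x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
move_tail_to_front(x, 3)
assert x == [8, 9, 10, 1, 2, 3, 4, 5, 6, 7]
```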

### Appending a single element to a list with `l.append()`

The `append()` method defined on lists lets you append a single element to a list.

In [6]:
x = [1, 2, 3]
x.append("catorce")

assert x == [1, 2, 3, "catorce"]

Because `append()` adds its argument as a single element, using it to extend a list with another list nests that list instead of adding its items:

In [9]:
x = [1, 2, 3, 4]
y = [5, 6, 7]

x.append(y)

assert x == [1, 2, 3, 4, [5, 6, 7]]

### Extending a list with another list with `l.extend()`

The `extend()` method lets you append one list to another:

In [10]:
x = [1, 2, 3, 4]
y = [5, 6, 7]

x.extend(y)

assert x == [1, 2, 3, 4, 5, 6, 7]
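
Although the heading talks about lists, `extend()` accepts any iterable, not only a list. A short illustration with a tuple, a range, and a string:

```python
x = [1, 2]

x.extend((3, 4))        # a tuple
x.extend(range(5, 7))   # a range producing 5 and 6
x.extend("ab")          # a string is iterated character by character

assert x == [1, 2, 3, 4, 5, 6, "a", "b"]
```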

### Inserting elements within a list with `l.insert()`

The `insert()` method lets you insert a new list element between two existing elements, or at the front or back of the list.

The method takes the index position where the new element should be inserted, and a second argument which should be the element itself.

In [11]:
x = [1, 2, 3]

x.insert(0, "zero")
assert x == ["zero", 1, 2, 3]

x.insert(4, "four")
assert x == ["zero", 1, 2, 3, "four"]

x.insert(2, "one and a half")
assert x == ["zero", 1, "one and a half", 2, 3, "four"]

### Removing list items or slices by index with `del`

The `del` statement lets you delete an element or a slice from a list.



In [19]:
x = ['a', 2, 'c', 7, 9, 11]
del x[1]

assert x == ['a', 'c', 7, 9, 11]

del x[:2]
assert x == [7, 9, 11]

### Removing an element from a list with `l.remove()`

The `remove()` method looks for the first instance of a given value in a list and removes that value from the list:

In [22]:
x = [1, 2, 3, 4, 3, 5]

x.remove(3)
assert x == [1, 2, 4, 3, 5]

x.remove(3)
assert x == [1, 2, 4, 5]

### Reversing a list in place with `l.reverse()`

The `reverse()` method efficiently reverses a list in place.

In [23]:
x = [1, 3, 5, 6, 7]
x.reverse()

assert x == [7, 6, 5, 3, 1]
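
If the original list must stay untouched, a reversed copy can be built instead. The sketch below shows two common ways: a slice with a negative step and the `reversed()` built-in.

```python
x = [1, 3, 5, 6, 7]

y = x[::-1]             # slice with step -1 returns a new, reversed list
z = list(reversed(x))   # reversed() returns an iterator; list() materializes it

assert y == [7, 6, 5, 3, 1]
assert z == [7, 6, 5, 3, 1]
assert x == [1, 3, 5, 6, 7]   # the original is unchanged
```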

### Sorting lists in place with `l.sort()`

The `sort()` method lets you sort a list in place:

In [31]:
x = [3, 8, 4, 0, 2, 1]
x.sort()

assert x == [0, 1, 2, 3, 4, 8]

To sort a list without changing the original list you can use the `sorted()` built-in function as seen below:

In [32]:
x = [3, 8, 4, 0, 2, 1]
y = sorted(x)

assert x == [3, 8, 4, 0, 2, 1]
assert y == [0, 1, 2, 3, 4, 8]

Alternatively, you can use the following hack to create a copy of the list before sorting it in-place:

In [33]:
x = [3, 8, 4, 0, 2, 1]
y = x[:]
y.sort()

assert x == [3, 8, 4, 0, 2, 1]
assert y == [0, 1, 2, 3, 4, 8]

Note that the `sort` method requires all items in the list to be of comparable types. As a result, trying to sort a list that contains numbers and strings will raise an exception:

In [36]:
x = [1, 2, 3, "catorce"]
try:
    x.sort()
except Exception as e:
    print(f"Exception caught {type(e)}: {e}")

Exception caught <class 'TypeError'>: '<' not supported between instances of 'str' and 'int'


`sort` has an optional `reverse` parameter that causes the sort to be performed in reverse order:

In [37]:
x = [3, 8, 4, 0, 2, 1]
x.sort(reverse=True)
assert x == [8, 4, 3, 2, 1, 0]


#### Custom sorting

By default, `sort` uses built-in Python comparison functions to determine the ordering in which elements should be sorted.

However, `sort` is flexible enough to let you use any other suitable ordering by providing a custom key function through the `key` parameter.

For example, you can sort a list of strings based only on their length:

In [40]:
x = ["uno", "dos", "tres", "cuatro", "cinco"]

def num_chars(s):
    return len(s)

x.sort(key=num_chars)
assert x == ["uno", "dos", "tres", "cinco", "cuatro"]

x = ["uno", "tres", "cuatro"]
x.sort(key=num_chars, reverse=True)
assert x == ["cuatro", "tres", "uno"]
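
The key function does not need to be a named function. As a sketch, the built-in `len` can be passed directly, and a `lambda` works for one-off keys. The example also relies on the fact that Python's sort is stable: items with equal keys keep their relative order.

```python
words = ["uno", "dos", "tres", "cuatro", "cinco"]

# Pass the built-in len directly as the key.
words.sort(key=len)
assert words == ["uno", "dos", "tres", "cinco", "cuatro"]

# Sort by the last letter; ties keep the order they had before this sort.
words.sort(key=lambda s: s[-1])
assert words == ["uno", "cinco", "cuatro", "dos", "tres"]
```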

#### Exercise: lists

Suppose that you have a list in which each element is in turn a list: `[[1, 2, 3], [2, 1, 3], [4, 0, 1]]`.

Write a program that sorts this list by the second element in each list so that the result is `[[4, 0, 1], [2, 1, 3], [1, 2, 3]]`.

In [41]:
x = [[1, 2, 3], [2, 1, 3], [4, 0, 1]]

def by_second_element(l):
    return l[1]

x.sort(key=by_second_element)
assert x == [[4, 0, 1], [2, 1, 3], [1, 2, 3]]
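
An alternative sketch of the same solution uses `operator.itemgetter` from the standard library in place of the hand-written key function, together with `sorted()` so that the original list is left as it was:

```python
from operator import itemgetter

x = [[1, 2, 3], [2, 1, 3], [4, 0, 1]]

# itemgetter(1) builds a function that returns element 1 of its argument.
y = sorted(x, key=itemgetter(1))

assert y == [[4, 0, 1], [2, 1, 3], [1, 2, 3]]
assert x == [[1, 2, 3], [2, 1, 3], [4, 0, 1]]   # x keeps its original order
```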


### Checking list membership

You can use the `in` operator to test whether a value is in a list. Conversely, you can use the `not in` operator to check that a value is not:

In [1]:
3 in [1, 3, 5, 7]

True

In [1]:
2 not in [1, 3, 5, 7]

True

### List concatenation with `+`

The `+` operator is overloaded when working with lists as the concatenation operator:

In [8]:
l = [1, 2, 3] + [4, 5]
assert l == [1, 2, 3, 4, 5]

Note the difference between `+` and extend:

In [9]:
l1 = [1, 2, 3]
l1.extend([4, 5])
assert l1 == [1, 2, 3, 4, 5]

The `+` operator returns a new list resulting from concatenating the left and right lists, while `extend()` is a method that modifies in place the list it is applied to.
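
The difference becomes visible when a second label refers to the same list. In this sketch, `alias` is just an illustrative name for that second label:

```python
a = [1, 2, 3]
alias = a               # a second label for the same list object

b = a + [4, 5]          # + builds a brand-new list
assert b == [1, 2, 3, 4, 5]
assert b is not a
assert alias == [1, 2, 3]          # the original list was not touched

a.extend([4, 5])        # extend() changes the existing list
assert alias == [1, 2, 3, 4, 5]    # the change is visible through the alias
assert alias is a
```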

### List initialization with `*`

When working with long lists, it is sometimes useful to use the `*` operator to correctly size the list at the start of the program:

In [10]:
zeroes = [0] * 10
assert zeroes == [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

In [11]:
options = [None] * 5
assert options == [None, None, None, None, None]

In [12]:
x = [3, 1] * 2
assert x == [3, 1, 3, 1]

The `*` operator when used with lists is called the *list multiplication* operator.
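
One caveat worth knowing: list multiplication repeats references to the same objects, it does not copy them. That is harmless for immutable values such as `0` or `None`, but surprising when the repeated element is itself a list. The sketch below shows the problem and one common workaround based on a list comprehension:

```python
# The inner list is created once and referenced twice.
rows = [[0] * 3] * 2
rows[0][0] = 99
assert rows == [[99, 0, 0], [99, 0, 0]]   # both "rows" changed
assert rows[0] is rows[1]

# Building each inner list separately avoids the sharing.
grid = [[0] * 3 for _ in range(2)]
grid[0][0] = 99
assert grid == [[99, 0, 0], [0, 0, 0]]
assert grid[0] is not grid[1]
```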

### List minimum and maximum with `min` and `max`

The functions `min` and `max` return the smallest and largest elements in a list.

In [13]:
minimum = min([3, 1, 4, 1, 5, 9])
assert minimum == 1

maximum = max([3, 1, 4, 1, 5, 9])
assert maximum == 9

Trying to find the min or max of a list with different types of objects causes a `TypeError` exception to be raised:

In [14]:
try:
    min([1, 2, 3, "catorce"])
except Exception as e:
    print(f"Exception caught {type(e)}: {e}")


Exception caught <class 'TypeError'>: '<' not supported between instances of 'str' and 'int'
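
Like `sort`, both `min` and `max` accept a `key` function. They also accept a `default` value that is returned when the iterable is empty; without it, an empty list raises a `ValueError`. A brief sketch:

```python
words = ["tres", "uno", "cuatro"]

assert min(words, key=len) == "uno"       # shortest string
assert max(words, key=len) == "cuatro"    # longest string

# default is returned when there is nothing to compare.
assert max([], default=None) is None

try:
    max([])
    raised = False
except ValueError:
    raised = True
assert raised
```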


### List search with `index`

The `index` method returns the position of the first occurrence of an element in a list, or raises a `ValueError` exception if the element is not found:

In [16]:
x = [1, 3, "five", 7, 9, 11]
assert x.index("five") == 2

try:
    x.index(5)
except Exception as e:
    print(f"Exception caught {type(e)}: {e}")


Exception caught <class 'ValueError'>: 5 is not in list
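
When a missing value is an expected situation rather than an error, the exception can be wrapped in a helper. The function `find` below is a hypothetical name used only for this sketch. The example also shows the optional second argument of `index`, which sets the position where the search starts.

```python
def find(items, value, default=-1):
    """Return the position of value in items, or default if it is absent."""
    try:
        return items.index(value)
    except ValueError:
        return default


x = [1, 3, "five", 7, 3]

assert find(x, "five") == 2
assert find(x, 5) == -1

assert x.index(3) == 1       # first occurrence
assert x.index(3, 2) == 4    # first occurrence at or after position 2
```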


### Counting list matches with `count`

The method `count` searches through a list looking for a given value and returns the number of times that the value is found:

In [17]:
x = [1, 2, 2, 3, 5, 2, 5]
assert x.count(2) == 3
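
If you need the counts of every value rather than just one, calling `count` repeatedly scans the list each time. A possible alternative, sketched here, is `collections.Counter` from the standard library, which tallies all values in a single pass:

```python
from collections import Counter

x = [1, 2, 2, 3, 5, 2, 5]
counts = Counter(x)

assert counts[2] == 3
assert counts[5] == 2
assert counts[4] == 0                       # missing values count as zero
assert counts.most_common(1) == [(2, 3)]    # most frequent value and its count
```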

### Summary of list operations

The following table summarizes the most relevant list operations:

| List operation | Description | Example |
| :------------- | :---------- | :------ |
| [] | Creates an empty list | `x = []` |
| len | Returns the length of a list | `len(x)` |
| append | Adds a single element to the end of a list | `x.append("y")` |
| extend | Adds another list to the end of the list | `x.extend(["a", "b"])` |
| insert | Inserts a new element at a given position in the list | `x.insert(2, "y")` |
| del | Removes a list element or slice | `del x[0]`<br>`del x[1:3]` |
| remove | Searches for and removes a given value from a list | `x.remove("y")` |
| reverse | Reverses a list in place | `x.reverse()` |
| sort | Sorts a list in place | `x.sort()` |
| + | Returns the list that results from concatenating the two given | `z = x1 + x2` |
| * | Returns the list that results from replicating a list the number of times given | `z = ["y"] * 3` |
| min | Returns the smallest element in a list | `min(x)` |
| max | Returns the largest element in a list | `max(x)` |
| index | Returns the position of a value in a list | `i = x.index("b")` |
| count | Counts the number of times a value occurs in a list | `x.count("y")` |
| sum | Sums the items in the list | `sum(x)` |
| in | Returns whether an item is in a list | `"y" in x` |


### Exercise

1. What would be the result of `len([1, 2] * 3)`?
2. What are the two differences between using `in` and `index`?
3. Which of the following will raise an exception?
    1. `min(["a", "b", "c"])`
    2. `max([1, 2, "three"])`
    3. `[1, 2, 3].count("one")`

The result of 1 will be 6:

In [18]:
assert len([1, 2] * 3) == 6

The two differences between `in` and `index` are:
+ `in` checks whether an element exists in the list, while `index` returns where it is located.
+ `index` will raise an exception if the element is not found, while `in` will just return `False`.

The first expression won't raise an exception &mdash; it will return `"a"`:

In [19]:
assert min(["a", "b", "c"]) == "a"

The second expression will raise an exception because not all the elements of the list are of the same type:

In [25]:
try:
    max([1, 2, "three"])
except TypeError as e:
    print("Ooops: ", {e})

Ooops:  {TypeError("'>' not supported between instances of 'str' and 'int'")}


### Exercise

If you have a list `x`, write code to safely remove an item, if and only if, that value is in the list. Modify the code to remove the element only if the item occurs in the list more than once.

The `remove` method can be used to remove an element from the list, but an exception will be raised if the item is not found. We can use `in` to ensure the item is in the list:

In [35]:
from typing import Any  # `any` is the built-in function; the type hint is typing.Any


def safe_remove(l: list, elem: Any) -> None:
    if elem in l:
        l.remove(elem)

l = [1, 2, 3, 4, 5]
safe_remove(l, 3)
assert l == [1, 2, 4, 5]

l = [1, 2, 3]
safe_remove(l, 5)
assert l == [1, 2, 3]
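
For the second part of the exercise, one possible approach is to use `count` to check that the value occurs more than once before removing it. The name `remove_if_repeated` is made up for this sketch:

```python
from typing import Any


def remove_if_repeated(l: list, elem: Any) -> None:
    """Remove the first occurrence of elem only if it appears more than once."""
    if l.count(elem) > 1:
        l.remove(elem)


l = [1, 2, 3, 2]
remove_if_repeated(l, 2)
assert l == [1, 3, 2]      # one of the two 2s was removed

remove_if_repeated(l, 2)
assert l == [1, 3, 2]      # only one 2 left, so nothing happens

remove_if_repeated(l, 9)
assert l == [1, 3, 2]      # absent values are ignored
```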


## Nested lists

Lists can be nested. One application of nesting is to represent two-dimensional matrices.

In [36]:
m = [[0, 1, 2], [10, 11, 12], [20, 21, 22]]
assert m[0] == [0, 1, 2]
assert m[0][1] == 1
assert m[2][2] == 22

You might run into a 'gotcha' with nested lists because of the way variables refer to objects. Remember the mental model that variables are just labels given to objects.

Consider the following example:

In [37]:
nested = [0]
original = [nested, 1]
original

[[0], 1]

The value of the nested list can be modified using either the `original` or the `nested` label:

In [38]:
nested[0] = "zero"
original

[['zero'], 1]

In [39]:
original[0][0] = 0
original

[[0], 1]

In [40]:
nested

[0]

Up to this point both `nested` and `original` are connected, but if we make `nested` point to another list, the connection will be broken:

In [41]:
nested = [2]
original

[[0], 1]

After doing `nested = [2]`, the `nested` label points to a completely different list, while `original` still references the former list.
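
The `is` operator, which tests whether two labels refer to the very same object, makes this connection explicit. A compact restatement of the steps above:

```python
nested = [0]
original = [nested, 1]

# Both labels lead to the same list object.
assert original[0] is nested

# Rebinding the label does not affect the object held by original.
nested = [2]
assert original[0] is not nested
assert original == [[0], 1]
```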

### Copying a list

There are several ways to copy a list (using the techniques already seen):
+ you can take a full slice of a list `x[:]`
+ you can use the concatenation operator `x + []`
+ you can use the list multiplication operator `x * 1`

Of these, `x[:]` is usually the most efficient, but all three create a *shallow copy*: the new list is a distinct object, yet it holds references to the same elements as the original rather than copies of them.

To create a deep copy of the list you can use the `deepcopy` function from the `copy` module:

In [44]:
original = [[0], 1]
shallow = original[:]
assert shallow == [[0], 1]

In [45]:
import copy

original = [[0], 1]
deep = copy.deepcopy(original)
assert deep == [[0], 1]

Because `original[:]` creates a shallow copy, the connection between `original` and `shallow` is preserved for the nested list they share:

In [46]:
original = [[0], 1]
shallow = original[:]

shallow[0][0] = "zero"
assert original == [["zero"], 1]

With `deepcopy`, by contrast, new objects are created for the nested elements, so `deep` is independent of `original`:

In [47]:
import copy

original = [[0], 1]
deep = copy.deepcopy(original)
deep[0][0] = "zero"

assert original == [[0], 1]
deep

[['zero'], 1]

### Exercise

Suppose that you have the list `x == [[1, 2, 3], [4, 5, 6], [7, 8, 9]]`. What code would you use to get a copy `y` of that list in which you could change the elements without the side effect of changing the contents of `x`?

If you don't want the original list `x` to be connected to the copied one `y` you need to use `copy.deepcopy`.
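
A minimal sketch of that answer:

```python
import copy

x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
y = copy.deepcopy(x)

# Changing a nested element of the copy leaves the original alone.
y[0][0] = "changed"

assert y[0] == ["changed", 2, 3]
assert x == [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```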

### Unpacking lists

It's quite common to unpack list elements (what in JavaScript is known as destructuring). This operation assigns the list's elements to other variables.

You can use `*` on one of the target variables to absorb any number of elements not matched by the others (that is, the *surplus*). Note that the elements captured by the starred variable are placed in a list.

In [6]:
a, b = [1, 2]
assert a == 1
assert b == 2

[a, b] = [1, 2]
assert a == 1
assert b == 2
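
The starred form works with lists as well. The variable names in this sketch are arbitrary:

```python
first, *rest = [1, 2, 3, 4]
assert first == 1
assert rest == [2, 3, 4]

# The starred variable can also sit in the middle.
head, *middle, tail = [1, 2, 3, 4]
assert head == 1
assert middle == [2, 3]
assert tail == 4

# When nothing is left over, the starred variable is an empty list.
only, *nothing = [1]
assert only == 1
assert nothing == []
```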

## Tuples

Tuples are data structures similar to lists, but they are immutable. As such, they can be used as keys in dictionaries, a role that lists can't fill.

In [1]:
x = (1, 2, 3)
try:
    x[2] = "two"
except TypeError as e:
    print("Ooops: ", {e})

Ooops:  {TypeError("'tuple' object does not support item assignment")}


Note that a tuple being immutable does not mean that it cannot hold mutable objects: the objects held by a tuple can still be changed through any *variable*/*label* that references them.

A tuple that holds a mutable (unhashable) object, such as a list, cannot be used as a dictionary key.
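
Both points can be checked with a short sketch. The names `inner`, `t`, and `d` are illustrative only:

```python
inner = [1, 2]
t = (inner, "a")

# The tuple itself cannot be reassigned, but the list it holds can change.
inner.append(3)
assert t == ([1, 2, 3], "a")

# A tuple of hashable items works as a dictionary key...
d = {(1, 2): "ok"}
assert d[(1, 2)] == "ok"

# ...but a tuple containing a list does not.
try:
    d[t] = "fails"
    raised = False
except TypeError:
    raised = True
assert raised
```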

### Syntax for empty tuples and one-element tuples

Empty tuples are specified as an empty set of parentheses `()`.

One-element tuples are specified with a trailing comma, so that they can be differentiated from the corresponding parenthesized expression:

In [None]:
x = () # empty tuple

y = (1,) # single element tuple

### Making a copy of a tuple

Tuples can be copied in the same way as lists:

In [2]:
x = (1, 2, 3)
y = x[:]
assert y == (1, 2, 3)

### Unpacking tuples

It's quite common to unpack tuples (what in JavaScript is known as destructuring) and assign their individual elements to other variables.

You can use `*` on one of the target variables to absorb any number of elements not matched by the others (that is, the *surplus*). Note that the elements captured by the starred variable are placed in a list, even when unpacking a tuple.

In [4]:
x = (1, 2, 3, 4)
a, b, *c = x

assert a == 1
assert b == 2
assert c == [3, 4]

Packing and unpacking can be performed with lists and tuples:

In [8]:
[a, b] = 1, 2 # 1, 2 is effectively a tuple
assert a == 1
assert b == 2

(c, d) = 3, 4
assert c == 3
assert d == 4
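
Two everyday uses of packing and unpacking, shown as a small sketch: swapping two variables without a temporary, and receiving several values returned by a function (here the built-in `divmod`, which returns a tuple).

```python
a, b = 1, 2

# The right-hand side is packed into a tuple first, then unpacked.
a, b = b, a
assert (a, b) == (2, 1)

# divmod returns a (quotient, remainder) tuple that can be unpacked directly.
quotient, remainder = divmod(17, 5)
assert quotient == 3
assert remainder == 2
```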

### Converting between lists and tuples

You can easily convert tuples to lists using the `list` built-in function. Similarly, the `tuple` function converts a list to the corresponding tuple:

In [10]:
l = list((1, 2, 3, 4))
assert l == [1, 2, 3, 4]

t = tuple([1, 2, 3, 4])
assert t == (1, 2, 3, 4)

An interesting *hack*. You can use the `list` function to conveniently break a string into its characters:

In [11]:
list("Hello")

['H', 'e', 'l', 'l', 'o']

## Sets

A set in Python is an unordered collection of objects used when membership and uniqueness in the set are the main things you're interested in.

Like dictionary keys, the items in a set must be hashable, which in practice usually means immutable (for example, ints, floats, strings, and tuples of hashable items can be members of a set, but lists, dictionaries, and sets can't).

### Set operations

In addition to `in`, `len`, and iteration with `for`, sets have several specific operations:

In [23]:
# Create a set from a list
x = set([1, 2, 3, 4, 5])
assert x == {1, 2, 3, 4, 5}

x.add(6)
assert x == {1, 2, 3, 4, 5, 6}

x.remove(5)
assert x == {1, 2, 3, 4, 6}

assert 1 in x

assert 5 not in x

# union
y = {1, 7, 8, 9}
assert x | y == {1, 2, 3, 4, 6, 7, 8, 9}

# intersection
assert x & y == {1}

# symmetric difference (xor)
assert x ^ y == {2, 3, 4, 6, 7, 8, 9}
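
A few more set operations are worth a quick look. This sketch reuses the same values and covers difference, subset testing, and the contrast between `remove`, which raises `KeyError` for a missing item, and `discard`, which does not:

```python
x = {1, 2, 3, 4, 6}
y = {1, 7, 8, 9}

# difference: items in x that are not in y
assert x - y == {2, 3, 4, 6}

# subset test and disjointness
assert {1, 2} <= x
assert x.isdisjoint({10, 11})

# discard silently ignores missing items...
x.discard(42)
assert x == {1, 2, 3, 4, 6}

# ...while remove raises KeyError.
try:
    x.remove(42)
    raised = False
except KeyError:
    raised = True
assert raised
```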

You can create a set from a sequence (such as a list or tuple) using the `set` built-in function.

When a sequence is made into a set, its duplicates are removed:

In [26]:
x = [1, 2, 3, 4, 5, 4, 3, 2, 1]
assert set(x) == {1, 2, 3, 4, 5}
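
Because sets are unordered, converting a list to a set and back does not guarantee that the original order survives. If the order matters, one commonly used alternative is `dict.fromkeys`, which relies on dictionaries preserving insertion order (Python 3.7 and later). The values below are made up for the sketch:

```python
x = [3, 1, 3, 2, 1]

# Keys of a dict are unique and keep the order of first appearance.
unique_in_order = list(dict.fromkeys(x))

assert unique_in_order == [3, 1, 2]
assert set(unique_in_order) == set(x)
```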

### Frozensets

The `frozenset` type can be used to create an immutable set. Because a `frozenset` is immutable and hashable, it can be used as a member of another set (or as a dictionary key):

In [30]:
x = set([1, 2, 3, 1, 3, 5])
z = frozenset(x)
assert z == {1, 2, 3, 5}

try:
    z.add(6)
except AttributeError as e:
    print("Ooops: ", {e})

Ooops:  {AttributeError("'frozenset' object has no attribute 'add'")}
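
The hashability of `frozenset` can be seen in a short sketch. The names `groups` and `labels` and their contents are invented for the example:

```python
# A set of frozensets is allowed.
groups = {frozenset({1, 2}), frozenset({3})}
assert frozenset({2, 1}) in groups    # element order does not matter

# A frozenset can also be a dictionary key.
labels = {frozenset({1, 2}): "pair"}
assert labels[frozenset({2, 1})] == "pair"

# A set of regular sets is not allowed, because sets are unhashable.
try:
    bad = {{1, 2}}
    raised = False
except TypeError:
    raised = True
assert raised
```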


### Exercise

Read a set of temperature data (the monthly high temperatures at Heathrow Airport from 1948 through 2016) from the file [lab_05.txt](data/lab_05.txt) and then find some basic statistics about the data: the highest and lowest temperatures, the mean (average) temperature, and the median temperature (the temperature in the middle if all the temperatures are sorted).

How many unique temperatures are in the file?

The first step is to inspect the file [lab_05.txt](data/lab_05.txt).

```
8.9
7.9
14.2
15.4
18.1
19.1
...
10.2

```

It is a series of float values, one per row.

Thus, the best way to calculate some statistics will be to load the values into a list.

In [39]:
temps = []
with open("data/lab_05.txt") as f:
    for line in f:
        temps.append(float(line.strip()))

print(len(temps))

828


To calculate the max and min temperatures we can directly use the `min` and `max` built-in functions.

For the average temperature we just have to sum all the temperatures and divide the sum by the number of elements in the list:

In [34]:
min_temp = min(temps)
max_temp = max(temps)
avg_temp = sum(temps) / len(temps)

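Finally, for the median, we sort the list and take the element in the middle. (With an even number of values, as here, this picks the upper of the two middle values rather than their average.)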

In [35]:
median_temp = sorted(temps)[len(temps) // 2]

As a result:

In [36]:
print(f"Min: {min_temp}")
print(f"Max: {max_temp}")
print(f"Avg: {avg_temp}")
print(f"Median: {median_temp}")

Min: 0.8
Max: 28.2
Avg: 14.84830917874396
Median: 14.7
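
The median above takes a single middle element. For a list with an even number of values, the usual definition averages the two middle elements instead. The sketch below shows a hand-written version next to `statistics.median` from the standard library. The `median` helper and the sample values are made up for illustration and are not taken from the data file.

```python
import statistics


def median(values):
    """Return the median, averaging the two middle values for even-length input."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]                        # odd length: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even length: mean of the two


sample = [3.0, 1.0, 4.0, 2.0]   # made-up values, even length

assert median(sample) == 2.5
assert statistics.median(sample) == 2.5
assert median([3.0, 1.0, 2.0]) == 2.0
```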


Finally, to find how many unique temperatures are in the file we can do:

In [37]:
unique_temps = set(temps)
print(f"Unique: {len(unique_temps)}")
print(f"Total number of temps: {len(temps)}")

Unique: 217
Total number of temps: 828
