# Collections


Previously, we mainly worked with single variables. In practice, however, we want to be able to handle groups of elements, i.e. *collections*.

Here we cover two of them:

- Lists: ordered collections of objects
- Dictionaries: mappings from keys to objects

---

## Lists

By introducing `split()` and `join()` previously, we have started using a new type of object: the *list*. A `list` acts like some kind of container in which we can store all kinds of information. 

### accessing lists: indexes and slices

In many ways, lists are very similar to strings. For example, we can access their components using indexes, and we can use slice indexes to access parts of a list. Let's try this out. Think about the output before running the code:

In [None]:
numbers = ['one', 'two', 'three', 'four', 'five']
print(numbers[1])
print(numbers[-2])
print(numbers[0:3])

### append()
We can also add new items to a list. For that you use the `append()` method. Let's see how that works. Say we want to keep a list of all our good reads. We first declare an empty list using square brackets. Next, we add some books to the list:

In [None]:
# start with an empty list
good_reads = []
good_reads.append("The Hunger Games")
print(good_reads)
good_reads.append("A Clockwork Orange")
print(good_reads)

As a reminder: do you get the syntax that goes with the `append()` method? The list we wish to append the item to goes first and we join `append()` to this list using a dot (`.`). In between the round brackets that go with the method name, we place the actual string that we wish to add to the list. We call such an input value an *argument* or a *parameter* that we *pass* to a function. Next, the function may return a *return value*, i.e. an object that we can print or assign to a variable (like in `books = ", ".join(good_reads)`, where the return value is a string).
Make sure that you are familiar with this terminology because you will often come across such terms when you look for help online!
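
To make this terminology concrete, here is a minimal sketch. The variable names `count` and `books` are only chosen for illustration, and we assume you have already met `len()` when working with strings:

```python
good_reads = ["The Hunger Games"]

# "A Clockwork Orange" is the argument that we pass to append()
good_reads.append("A Clockwork Orange")

# len() takes the list as its argument and returns an integer
count = len(good_reads)
print(count)  # 2

# join() takes the list as its argument and returns a new string
books = ", ".join(good_reads)
print(books)  # The Hunger Games, A Clockwork Orange
```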

### replacing elements in a list

Now, if for some reason we don't like a particular book anymore, we can replace it with a new book as follows, using the old book's index:

In [None]:
good_reads[1] = "Pride and Prejudice"
print(good_reads)

As you can see, it is no problem to reset or update an individual item in a list. This is different, however, for strings. Suppose that we have file names whose base name ends with some letter ('a', 'b', 'c', ...) and that we want to replace that letter with "x". If we try to change a single character in a string, Python raises an error because strings (and some other types) are *immutable*. That means that they cannot be changed through an index, as opposed to lists, which *are* mutable. Observe how we first convert the string into a list of characters, replace one character and then join the modified list of characters back into a string.

In [None]:
file = "task_a.txt"
# file[-5] = "x"  # this would raise a TypeError, because strings are immutable
list_chars = list(file)
print(list_chars)
list_chars[-5] = "x"
print(list_chars)
delimiter = ""
print(delimiter.join(list_chars))
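
If you want to see the error for yourself without stopping the notebook, the sketch below catches it with `try`/`except` (a construct we assume has not been covered yet, so treat it as a preview). It also shows one possible alternative to the list detour: building a new string from slices. The name `new_file` is just an illustration.

```python
file = "task_a.txt"

try:
    file[5] = "x"  # strings do not support item assignment
except TypeError as error:
    print("TypeError:", error)

# Alternative: build a new string from the slices around the character
new_file = file[:5] + "x" + file[6:]
print(new_file)  # task_x.txt
print(file)      # task_a.txt (the original string is unchanged)
```

Note that slicing does not change `file` either: it creates a new string, which we assign to a new variable.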

## DIY

Here's another small DIY! Add two new titles to the list of `good_reads`. Then, try to change the title of the second book in our good reads collection:

In [None]:
# insert your code here

Lists are a really powerful way of dealing with your data in Python. Let's explore some other ways in which we can manipulate lists.

#### remove()

Let's assume our good reads collection has grown a lot and we would like to remove some of the books from the list. Python provides the method `remove()`, which you can call on a list and which takes as its argument the item we would like to remove.

In [None]:
good_reads = ["The Hunger Games", "A Clockwork Orange", 
             "Pride and Prejudice", "Water for Elephants", "Illias", "Water for Elephants", "Water for Elephants"]
print(good_reads)
good_reads.remove("Water for Elephants")
print(good_reads)
good_reads.remove("Water for Elephants")
print(good_reads)

If we try to remove a book that is not in our collection, Python raises an error (a `ValueError`) to signal that something is wrong.

In [None]:
good_reads.remove("White Oleander")
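
One way to avoid this error is to check whether the item is in the list before removing it. The sketch below shows that approach, followed by a variant that catches the `ValueError` instead. Both `if` and `try`/`except` are assumed to be unfamiliar at this point, so read this as a preview rather than required knowledge.

```python
good_reads = ["The Hunger Games", "A Clockwork Orange"]
title = "White Oleander"

# Option 1: only call remove() when the title is actually in the list
if title in good_reads:
    good_reads.remove(title)
else:
    print(title, "is not in the list")

# Option 2: try to remove it and catch the error if that fails
try:
    good_reads.remove(title)
except ValueError:
    print("Could not remove", title)

print(good_reads)  # ['The Hunger Games', 'A Clockwork Orange']
```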

Note, however, that `remove()` will only delete the *first* item in the list that is identical to the argument which you passed to the function. Execute the code in the block below and you will see that only the first instance of "Pride and Prejudice" gets deleted.

In [None]:
good_reads = ["The Hunger Games", "A Clockwork Orange", 
             "Pride and Prejudice", "Water for Elephants", "Pride and Prejudice"]
good_reads.remove("Pride and Prejudice")
print(good_reads)
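
If you do want to get rid of every copy of a title, one possible approach is to keep calling `remove()` for as long as the title still occurs in the list. This is only a sketch: it uses a `while` loop, which we assume will be introduced properly later, and the variable `copies` is purely illustrative.

```python
good_reads = ["The Hunger Games", "Pride and Prejudice",
              "Water for Elephants", "Pride and Prejudice"]

# count() tells us how many times an item occurs in the list
copies = good_reads.count("Pride and Prejudice")
print(copies)  # 2

# Keep removing the title for as long as it still occurs in the list
while "Pride and Prejudice" in good_reads:
    good_reads.remove("Pride and Prejudice")

print(good_reads)  # ['The Hunger Games', 'Water for Elephants']
```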

Just as with strings, we can concatenate two lists using the `+` operator. Here is an example:

In [None]:
# first we specify two lists of strings:
good_reads = ["A Clockwork Orange", 
              "Pride and Prejudice", "Water for Elephants",
              "The Shadow of the Wind", "Bel Canto"]
bad_reads = ["Fifty Shades of Grey", "Twilight", "The Hunger Games"]

# then we combine them
all_reads = good_reads + bad_reads
print(all_reads)

good_reads += bad_reads
print(good_reads)
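
The example above uses both `+` and `+=`, and the difference is easy to miss. The following sketch, with shortened lists for readability, shows that `+` builds a new list and leaves both originals alone, whereas `+=` extends the list on its left-hand side:

```python
good_reads = ["Bel Canto"]
bad_reads = ["Twilight"]

# + creates a new list; good_reads and bad_reads stay as they were
all_reads = good_reads + bad_reads
print(all_reads)   # ['Bel Canto', 'Twilight']
print(good_reads)  # ['Bel Canto']

# += adds the items of bad_reads to good_reads itself
good_reads += bad_reads
print(good_reads)  # ['Bel Canto', 'Twilight']
print(bad_reads)   # ['Twilight']
```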

#### .sort() and sorted()

It is always nice to organise your bookshelf. We can sort our collection alphabetically with the following expressions:

In [None]:
sorted_reads = sorted(good_reads)
print(good_reads)
print(sorted_reads)

good_reads.sort()
print(good_reads)

Can you spot the difference between `sort()` and `sorted()`?
* `sorted()` is a function that takes an object (typically a list, but it works on strings too) and returns a new, sorted list, which you can assign to a variable. This does not change the original object!
* `.sort()` is a list method. It does not return anything; instead, it sorts the list in-place (altering it). There is no need for assigning to a new variable. In fact, if you do, this new variable will contain `None` (since `.sort()` does not return anything). Don't worry if this confuses you - it is confusing at first. If you run into bugs in the future, just remember that it may be caused by this difference in behaviour.
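
The pitfall described above can be reproduced in a few lines. In this sketch the variable names `wrong` and `letters` are only illustrative:

```python
good_reads = ["Twilight", "Bel Canto", "A Clockwork Orange"]

# .sort() sorts the list in place and returns None
wrong = good_reads.sort()
print(wrong)       # None
print(good_reads)  # ['A Clockwork Orange', 'Bel Canto', 'Twilight']

# sorted() returns a new list, even when its argument is a string
letters = sorted("python")
print(letters)     # ['h', 'n', 'o', 'p', 't', 'y']
```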

Both `sorted()` and `.sort()` accept additional arguments that change how they sort. For example, we might be interested in a list sorted in reverse order. This is done by passing the extra argument `reverse=True` between the parentheses. See the following code:

In [None]:
my_list = ['a', 'c', 'b']
print(sorted(my_list))
print(sorted(my_list, reverse = True))
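
The `.sort()` method accepts the same `reverse` argument. As a small sketch, remember that here the list itself is changed:

```python
my_list = ['a', 'c', 'b']

# Sort in place, from high to low
my_list.sort(reverse=True)
print(my_list)  # ['c', 'b', 'a']
```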

#### nested lists

Until now, our lists only consisted of strings. However, a list can contain all kinds of data types, such as integers and even lists! Do you understand what is happening in the following example? Have a close look at the square brackets used.

In [None]:
nested_list = [[1, 2, 3, 4], [5, 6, 7, 8]]
print(nested_list[0])
print(nested_list[0][0])
print(nested_list[1][2])
print(nested_list[0][:-2])

We can put this to use to enhance our good reads collection with a score for every book we have. Each entry in our collection will be a small list of its own, consisting of the title of the book followed by a score in the range `1` to `10`: `[title, score]`. We initialize an empty list, and add two books to it:

In [None]:
good_reads = []
good_reads.append(["Pride and Prejudice", 8])
good_reads.append(["A Clockwork Orange", 9])
print(good_reads)

## DIY

Update the `good_reads` collection with some of your own books and give them all a score and a publication year by nesting lists. Can you print out the score you gave to the first book in the list? And the publication year of the third item in your list? (Hint: you can pile up indexes using square brackets!)

In [None]:
# insert your code here

##### What we have learnt

To finish this section, here is an overview of the new concepts you have learnt. Go through them and make sure you understand them all.

- lists
- nested lists
- *mutable* versus *immutable*
- `.append()`
- `.remove()`
- `.sort()`

-------------

## Dictionaries

Our little good reads collection is starting to look quite impressive and we can perform all kinds of manipulations on it. Now, imagine that our list is large and we would like to look up the score we gave to a particular book. How are we going to find that book? For this purpose Python provides another, more appropriate data structure, named `dictionary`, or `dict` for short. A `dictionary` is similar to the dictionaries you have at home. It consists of entries, each made up of a *key* and the *value* that belongs to that key. Let's define one:

In [None]:
my_dict = {"book": "physical objects consisting of a number of pages bound together",
           "sword": "a cutting or thrusting weapon that has a long metal blade"}
print(my_dict)

Take a close look at the new syntax. Notice the curly brackets and the colons.

You can define an empty dictionary with `my_dict = {}` or `my_dict = dict()`.

To look up the value of a given key, we *index* the dictionary using that key (again, between square brackets):

In [None]:
description = my_dict["sword"]
print(description)
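
What happens when we look up a key that is not in the dictionary? Python raises a `KeyError`. The sketch below shows this, together with two common ways to avoid it: testing for the key with `in`, and using the `.get()` method. The key `"shield"` is a made-up example, and `try`/`except` is assumed to be new at this point.

```python
my_dict = {"sword": "a cutting or thrusting weapon that has a long metal blade"}

# Indexing with a key that does not exist raises a KeyError
try:
    my_dict["shield"]
except KeyError:
    print("No entry for 'shield'")

# 'in' tests whether a key is present
print("sword" in my_dict)   # True
print("shield" in my_dict)  # False

# get() returns None, or a default of your choice, instead of raising an error
print(my_dict.get("shield"))             # None
print(my_dict.get("shield", "unknown"))  # unknown
```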

We can also add new entries, or update existing ones:

In [None]:
my_dict["pie"] = "dish baked in pastry-lined pan often with a pastry top"
print(my_dict)
my_dict["sword"] = "a pointy metal stick"
print(my_dict)

Like lists, dictionaries are mutable, which means we can add entries to them and remove entries from them. Let's define an empty dictionary and add some books to it. The titles will be our keys and the scores their values. After this, we will remove a title. You do this with the `del` statement, followed by the dictionary and, between square brackets, the key.

In [None]:
good_reads = {}
good_reads["Pride and Prejudice"] = 8
good_reads["A Clockwork Orange"] = 9
print(good_reads["Pride and Prejudice"])
print(good_reads)
del good_reads["A Clockwork Orange"]
print(good_reads)

In a way, this is similar to what we have seen before when we altered our book `list`. There we indexed the list using an integer to access a particular book. Here we directly use the title of the book. Note that the keys in a dictionary must be unique: why would that be?
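
As a hint, the short sketch below shows what happens when we assign to the same key twice:

```python
good_reads = {}
good_reads["Pride and Prejudice"] = 8
good_reads["Pride and Prejudice"] = 10  # same key: the old score is replaced

print(good_reads)       # {'Pride and Prejudice': 10}
print(len(good_reads))  # 1
```

The dictionary still holds a single entry, because a key can only point to one value at a time.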

## DIY

Update the new good reads data structure with your own books. Try to print out the score you gave for one of the books which you added.

In [None]:
# put your code here

#### keys(), values(), items()

To retrieve a list of all the books we have in our collection, we can ask the dictionary for its keys and convert them into a list:

In [None]:
keys = good_reads.keys()
print(keys) # You can see that the dictionary method .keys() does not really return a list
keys = list(keys) # But we can cast the dict_keys object as a list
print(keys)

Similarly we can ask for the values:

In [None]:
print(list(good_reads.values()))

Finally, if we want to work with both keys and values at the same time, we can ask for the key-value pairs explicitly with the `.items()` method:

In [None]:
print(good_reads.items())
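
A typical use of `.items()` is to walk through all entries of a dictionary. The following sketch assumes a `for` loop, which may only be introduced later in the course; the names `title`, `score` and `pairs` are illustrative.

```python
good_reads = {"Pride and Prejudice": 8, "A Clockwork Orange": 9}

# Each item is a (key, value) pair that we can unpack into two names
for title, score in good_reads.items():
    print(title, "scored", score)

# Like .keys(), .items() can be converted into a list
pairs = list(good_reads.items())
print(pairs)  # [('Pride and Prejudice', 8), ('A Clockwork Orange', 9)]
```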

An important property of dictionaries that you should keep in mind is that, unlike lists, **dictionaries are indexed by key, not by position**: there is no "first" or "third" entry that you can access with an integer index. Since Python 3.7, dictionaries do remember the order in which entries were inserted, so `.keys()`, `.values()` and `.items()` present their results in insertion order. In older versions of Python this order was not guaranteed and could look semi-random. In practice, the order does not usually matter, because you will typically look things up in a dictionary by key instead of reading it from beginning to end.

##### What we have learnt

To finish this section, here is an overview of the new concepts and functions you have learnt. Make sure you understand them all.

- dictionary
- indexing dictionaries and accessing values through their keys
- adding items to a dictionary
- removing items from a dictionary with `del`
- `.keys()`
- `.values()`
- `.items()`