<img src="../figures/HeaDS_logo_large_withTitle.png" width="300">

<img src="../figures/tsunami_logo.PNG" width="600">

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Center-for-Health-Data-Science/PythonTsunami/blob/fall2021/Data_structures/lists.ipynb)


# Containers

*Prepared by [Katarina Nastou](https://www.cpr.ku.dk/staff/?pure=en/persons/672471)*

*Note: This notebook's contents have been adapted from Colt Steele's slides used in "[Modern Python 3 Bootcamp Course](https://www.udemy.com/course/the-modern-python3-bootcamp/)" on Udemy*

## Overview

- [Lists](#Lists)
- [Sets](#Sets)
- [Dictionaries](#Dictionaries)
- [Tuples](#Tuples)

# Containers
You were previously introduced to the following basic data types in Python: ``bool``, ``int``, ``float`` and ``str``. Python has further built-in data structures, which you will learn about in this notebook. These data types act as containers that can hold several items (also called elements). In particular, this notebook covers:

* ``list``
* ``dict`` (dictionary)
* ``tuple``
* ``set``

# Lists

> A list is a container of ordered items that can be accessed by their index.

* To create an empty list:
    * use square brackets ``[]``
    * use the built-in function ``list()``
* The items in a list are separated by commas.
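
As a small sketch of the two ways to create an empty list (the variable names are only illustrative), the cell below also shows that ``list()`` can build a list from another iterable such as a string:

```python
# Two equivalent ways to create an empty list
empty_a = []
empty_b = list()
print(empty_a == empty_b)   # True

# list() can also build a list from another iterable, e.g. a string
letters = list("abc")
print(letters)              # ['a', 'b', 'c']
```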

In [None]:
tasks = ["Install Python", "Learn Python", "Take a break"]

* To find out how many elements there are in a list, you can use the built-in function ``len``.

In [None]:
len(tasks)

## Slicing: accessing values in a ``list``

The elements in a list are ordered and they can be accessed by their index. List indices start at ``0``, i.e. the first element in your list lives at the index position ``0``.

In [None]:
friends = ["Ashley", "Matt", "Michael"]
print(friends[0]) 
print(friends[2]) 
print(friends[3]) # IndexError

**To access values from the end**, you can use a negative number to index backwards:

In [None]:
friends = ["Ashley", "Matt", "Michael"]
print(friends[-1]) 
print(friends[-3]) 
print(friends[-4]) # IndexError

**To check if a value is in a list**, you can use the ``in`` operator:

In [None]:
friends = ["Ashley", "Matt", "Michael"]
print("Ashley" in friends) 
print("Jason" in friends)

**To access several elements at once**, you can use a technique called _slicing:_

```python
    some_list[start:end:step]
```

1. **First parameter:** ``start``  
Tells Python which index to start slicing from. A negative number counts from the end of the list, so the slice starts that many elements before the end.

In [16]:
first_list = [1, 2, 3, 4, 5, 6, 7]

# slice from index 1 (this is the second element in the list)
print(first_list[1:])

# slice from index 3
print(first_list[3:]) 

# slice starting at the third element from the end
print(first_list[-3:])

[2, 3, 4, 5, 6, 7]
[4, 5, 6, 7]
[5, 6, 7]


2. **Second parameter:** ``end``  
Specifies the index at which the slice stops; the element at this index is not included. A negative number counts from the end of the list, so ``-1`` stops just before the last element.

In [17]:
# slice up to (but excluding) index 2
print(first_list[:2]) 

# slice starting from index 1 up to (but excluding) index 4
print(first_list[1:4]) 

# slice up to (but excluding) the last element,
# i.e. the first element counted from the end
print(first_list[:-1])

[1, 2]
[2, 3, 4]
[1, 2, 3, 4, 5, 6]


3. **Third parameter:** ``step``  
The ``step`` sets how many positions to move at a time. E.g. a step of ``2`` takes only every second element of the list. A negative ``step`` walks through the list backwards, which reverses the order.

In [19]:
# access entire list from start to end, but only count every other element
print(first_list[::2])

# start at index 1 and count backwards to the beginning of the list
print(first_list[1::-1])

# start at the end and count backwards, stopping before index 1
print(first_list[:1:-1])

[1, 3, 5, 7]
[2, 1]
[7, 6, 5, 4, 3]
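
Two properties of slicing are easy to miss: a slice is a new list, and slice bounds that fall outside the list do not raise an error. The following is a minimal sketch with illustrative variable names:

```python
numbers = [1, 2, 3, 4, 5, 6, 7]

# A slice is a new list; changing it leaves the original untouched
first_three = numbers[:3]
first_three[0] = 100
print(first_three)     # [100, 2, 3]
print(numbers)         # [1, 2, 3, 4, 5, 6, 7]

# Out-of-range slice bounds are clipped instead of raising an IndexError
print(numbers[5:100])  # [6, 7]
print(numbers[10:])    # []
```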


## Nested Lists
Lists can contain any kind of element, even other lists!
To access an element in a sublist of a list, you first need to access the sublist by its index and then the element in the sublist by its index.

In [21]:
nested_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
nested_list[0][1] # access 2

2

## Note on slicing
The same slicing techniques that work for lists also work for strings. You can access individual characters in a string and use the ``start``, ``end`` and ``step`` parameters when slicing strings: ``some_string[start:end:step]``.

In [17]:
my_string = "Programming is fun!"
print(my_string[::-1]) # print string backwards

!nuf si gnimmargorP


### Exercise 1
1. Define a list called `random_things` that is at least 4 elements long.  The data is completely up to you, but it must contain at least 1 `str` and 1 `float`. 
2. Use the ``len`` function to check that your list is indeed at least 4 elements long.

In [None]:
# your code goes here

### Exercise 2

1. The list below contains a few spelling errors. Correct the entries in the list **by accessing an element by its index** and assigning a new string.
    - Change "Petre" to "Peter"
    - Change "Monika" to "Monica"
    - Change "george" to "George" (capitalize it)
    
2. Access the name "Louis" from the list. Then access the last two characters of that string and print them backwards.

In [22]:
# DON'T CHANGE ANYTHING UP HERE!
people = ["Petre","Joanna","Louis","Angie","Monika","george"]
# DON'T CHANGE ANYTHING UP HERE!

# your code goes here

### Exercise 3: Quiz
- **Question 1**: Given a list `numbers = [1,2,3,4]`  - what does `numbers[::-1]`  return?

    (a) `[1,2,3,4]`  
    (b) `[1,4]`  
    (c) `[4,3,2,1]`  
    (d) `[4]`  
    
- **Question 2**: Given a list `numbers = [1,2,3,4]`  - what does `numbers[1:3]`  return?

    (a) `[1,2,3]`  
    (b) `[2,3]`   
    (c) `[2,3,4]`  
    (d) `[1,2]`  
    (e) `[3]`  
    
- **Question 3**: Given a list `numbers = [1,2,3,4]`  - what does `numbers[-2]`  return?

    (a) `[3]`  
    (b) `3`  
    (c) `[1,2,3]`  
    (d) `[2]`  
    (e) `2`  

In [None]:
# your code goes here

## List Methods

Working with lists is very common - there are quite a few things we can do!

**Adding elements to a list:**
* ``append``: add an item to the end of the list.
* ``extend``: add every item of the iterable passed to ``extend`` to the end of the list.
* ``insert``: insert an item at a given position. 

In [None]:
first_list = [1, 2, 3, 4]

In [22]:
# append
first_list.append(5)
print(first_list)

[1, 2, 3, 4, 5]


In [25]:
# extend
correct_list = [1, 2, 3, 4]
correct_list.extend([5, 6, 7, 8])
print(correct_list) 

[1, 2, 3, 4, 5, 6, 7, 8]
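
The difference between ``append`` and ``extend`` shows most clearly when the argument is itself a list. A short sketch (the variable names are illustrative):

```python
appended = [1, 2, 3, 4]
appended.append([5, 6])     # the whole list becomes ONE new element
print(appended)             # [1, 2, 3, 4, [5, 6]]
print(len(appended))        # 5

extended = [1, 2, 3, 4]
extended.extend([5, 6])     # each value is added separately
print(extended)             # [1, 2, 3, 4, 5, 6]
print(len(extended))        # 6
```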


In [None]:
# insert
first_list = [1, 2, 3, 4]
first_list.insert(2, 'Hi!') 
print(first_list) 

**Removing elements from a list:**
* ``clear``: remove all items from a list.
* ``pop``
    - Remove the item at the given position in the list, and return it.
    - If no index is specified, remove and return the last item in the list.
* ``remove``
    - Remove the first item in the list whose value equals the given argument.
    - Raises a `ValueError` if the item is not found.
* ``del``: a statement (not a method) that deletes the item at a given index, e.g. ``del first_list[3]``.

In [None]:
# clear
first_list = [1, 2, 3, 4]
first_list.clear()
print(first_list)

In [None]:
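# pop
first_list = [1, 2, 3, 4]         # re-create the list, since clear() emptied it
last_item = first_list.pop()      # removes and returns the last item
print(last_item)
second_item = first_list.pop(1)   # removes and returns the item at index 1
print(second_item)

# the elements are then not in the list anymore
print(first_list)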

In [None]:
# remove
first_list = [1, 2, 3, 4, 4, 4]
first_list.remove(2)
print(first_list)
first_list.remove(4)
print(first_list) 

In [None]:
# del 
first_list = [1, 2, 3, 4]
del first_list[3]
print(first_list)
del first_list[1]
print(first_list)
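
Since ``remove`` raises a `ValueError` when the item is missing, it can help to guard the call. The sketch below shows two possible ways, assuming a hypothetical list called `fruits`; the ``try``/``except`` form may not have been covered yet and is included only for illustration:

```python
fruits = ["apple", "banana", "cherry"]

# Option 1: check membership first
if "kiwi" in fruits:
    fruits.remove("kiwi")

# Option 2: try to remove and handle the error
try:
    fruits.remove("kiwi")
except ValueError:
    print("'kiwi' is not in the list")

print(fruits)   # ['apple', 'banana', 'cherry']
```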

**Other list methods:**
* `index`: return the index of the first occurrence of the specified item in the list.
* `count`: return the number of times the specified item appears in the list.
* `reverse`: reverse the elements of the list (in-place).
* `sort`: sort the items of the list (in-place).

In [12]:
# index
letters = ["a", "a", "b", "z", "l", "i", "k", "a", "o"]

# find the index of the first time the letter "a" appears in the list
print(letters.index("a")) 

# find the index of the first time the letter "a" appears from index 6 up to (but excluding) index 8
print(letters.index("a", 6, 8))

0
7


In [13]:
# count
numbers = [1, 2, 3, 4, 3, 2, 1, 4, 10, 2]

# count how often 2 appears in the list
print(numbers.count(2))

# count how often 21 appears in the list
print(numbers.count(21))

3
0


In [14]:
# reverse
first_list = [1, 2, 3, 4]
first_list.reverse()
print(first_list) 

[4, 3, 2, 1]


In [15]:
# sort
another_list = [6, 4, 1, 2, 5]
another_list.sort()
print(another_list)

[1, 2, 4, 5, 6]
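
Because ``sort`` works in-place, it returns ``None`` rather than the sorted list. If the original order should be kept, the built-in function ``sorted`` returns a new list instead. A brief sketch with an illustrative list called `scores`:

```python
scores = [6, 4, 1, 2, 5]

# sorted() returns a new list and leaves the original as it was
ordered = sorted(scores)
print(ordered)   # [1, 2, 4, 5, 6]
print(scores)    # [6, 4, 1, 2, 5]

# sort() changes the list itself and returns None
result = scores.sort(reverse=True)
print(scores)    # [6, 5, 4, 2, 1]
print(result)    # None
```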


### Exercise 4

The instructions are given as comments in the cell below. Work in groups for 10 minutes.

In [None]:
# Create a list called instructors

# Add the following strings to the instructors list 
    # "Marc"
    # "Rita"
    # "Henry"

# Remove the last value in the list

# Remove the first value in the list
 
# Add the string "Done" to the beginning of the list

# Print to make sure you did this right

### Exercise 5: Quiz
1. Question: Which of the following methods does not add one or more elements to a list?
    1. extend
    2. add
    3. append  
    4. insert

2. Question: Given a list `numbers = [1,2,3]` how would you access the first element in the list?

3. Question: Given a list `numbers = [1,2,3]` what would the result of `numbers.pop(5)` be?
    
    1. `None`
    2. `IndexError`
    3. `3`

4. Question: Which of the following is not true about lists:
    1. You can loop over them using a for or a while loop 
    2. The index starts at 1
    3. They are collections of elements

# Sets

> A set is a collection of unique, unordered, unchangeable, and unindexed elements.

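* To create an empty set, use the built-in function ``set()``. Empty curly brackets ``{}`` create an empty dictionary, not a set.
* To create a set that already contains elements, list them in curly brackets, e.g. ``{1, 2, 3}``.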
* **Unique elements:** Each element in a set can only appear once.
* **Unordered and unindexed elements:** In a set, you cannot know in which order the elements might appear. Since the elements can appear in any order, they do not have an index and you therefore cannot access the elements of a set by an index.
* **Unchangeable elements:** You cannot change an element of a set in place, but you can remove elements and add new ones.
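
A quick way to check what each form creates is to look at the type. The sketch below uses illustrative variable names and also shows that duplicates are dropped when a set is built:

```python
empty_set = set()
print(type(empty_set))      # <class 'set'>

# {} without any elements is an empty dictionary, not a set
empty_braces = {}
print(type(empty_braces))   # <class 'dict'>

# duplicates are dropped when a set is created
unique_numbers = set([1, 2, 2, 3, 3, 3])
print(len(unique_numbers))  # 3
```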

In [57]:
# create a set
number_set = {1,2,3}
print(type(number_set))

<class 'set'>


### Set methods
You can use the following methods on sets (among others):

* ``add``: add an element to the set
* ``pop``: remove an arbitrary element from the set and return it
* ``remove``: remove a specified element from the set (raises a `KeyError` if the element is not in the set)

In [62]:
# add a number to the set
number_set.add(4)
print(number_set)

{1, 2, 3, 4}


In [63]:
# remove the number again
number_set.remove(4)
print(number_set)

# remove an arbitrary element from the set (pop returns the removed element)
number_set.pop()

{1, 2, 3}


1
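
The three ways of removing elements behave differently when something is missing. A minimal sketch, assuming a hypothetical set called `colours`; ``discard`` is a further set method that is not listed above:

```python
colours = {"red", "green", "blue"}

# remove raises a KeyError if the element is missing
try:
    colours.remove("yellow")
except KeyError:
    print("'yellow' is not in the set")

# discard does nothing if the element is missing
colours.discard("yellow")

# pop returns the element it removed
removed = colours.pop()
print(removed in colours)   # False
print(len(colours))         # 2
```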

Other set methods include:

* ``difference``: return the elements of this set that are not in the other set(s)
* ``intersection``: return the elements that all of the sets have in common
* ``union``: return a set containing all elements from all of the sets

In [65]:
# create two sets
first_set = {1,2,3}
second_set = {2,3,4,5}

In [66]:
# difference
first_set.difference(second_set)

{1}

In [67]:
# intersection
first_set.intersection(second_set)

{2, 3}

In [68]:
# union
first_set.union(second_set)

{1, 2, 3, 4, 5}
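
These three methods also have operator forms (``-``, ``&`` and ``|``). The sketch below re-creates the two sets and checks that each operator gives the same result as the corresponding method:

```python
first_set = {1, 2, 3}
second_set = {2, 3, 4, 5}

# each operator is equivalent to the method with the same meaning
print(first_set - second_set == first_set.difference(second_set))    # True
print(first_set & second_set == first_set.intersection(second_set))  # True
print(first_set | second_set == first_set.union(second_set))         # True
```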

### Exercise 6

1. Examine the three examples below: Is there a difference between case a) and case b)? 

**Example 1:**
```python
first_set.difference(second_set)     # case a)
second_set.difference(first_set)     # case b)
```

**Example 2:**
```python
first_set.intersection(second_set)     # case a)
second_set.intersection(first_set)     # case b)
```

**Example 3:**
```python
first_set.union(second_set)     # case a)
second_set.union(first_set)     # case b)
```

2. Can you think of a use case in which it makes more sense to use a ``set`` than a ``list``? And vice versa: when does it make more sense to use a ``list`` instead of a ``set``?

In [None]:
# your code goes here

# Dictionaries

> A dictionary stores (key, value) pairs.

- Dictionaries keep their insertion order. This is guaranteed since `Python 3.7` (in CPython 3.6 it was an implementation detail).
- Dictionary values are accessed by keys.
- Each key in the dictionary is unique and duplicates are not allowed.

In [27]:
# define dictionary
city_population = {
    'Tokyo': 13350000, # a key-value pair
    'Los Angeles': 18550000,
    'New York City': 8400000,
    'San Francisco': 1837442,
}

# display dictionary
city_population 

In [30]:
# access the value for the key 'New York City'
print(city_population['New York City'])

8400000


In [None]:
# access the value for the key 'Copenhagen'
# this key is not in the dictionary yet, so a KeyError is raised
print(city_population['Copenhagen'])
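
To avoid the `KeyError` for a missing key, you can check for the key with ``in`` or use the dictionary method ``get``. A short sketch on a separate, illustrative dictionary called `populations`:

```python
populations = {"Tokyo": 13350000, "New York City": 8400000}

# check for the key before accessing it
if "Copenhagen" in populations:
    print(populations["Copenhagen"])
else:
    print("No entry for Copenhagen")

# get returns None (or a default you choose) when the key is missing
print(populations.get("Copenhagen"))      # None
print(populations.get("Copenhagen", 0))   # 0
print(populations.get("Tokyo"))           # 13350000
```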

**Change values** by specifying the key and using the `=` operator:

In [None]:
city_population['New York City'] = 73847834
city_population

**To add a new (key, value) pair**, you can choose between different ways:

In [31]:
# using the = operator
city_population['Copenhagen'] = 1000000

# using the update method
city_population.update({'Barcelona': 5000000})

city_population

{'Tokyo': 13350000,
 'Los Angeles': 18550000,
 'New York City': 73847834,
 'San Francisco': 1837442,
 'Copenhagen': 1000000,
 'Barcelona': 5000000}

**A dictionary can hold complex data types as values:**

In [41]:
food = {"fruits": ["apple", "orange"], "vegetables": ["chicken", "cauliflower"]}

In [42]:
# access the value of the key "fruits"
print(food["fruits"])

# access element at index 0 in the list
print(food["fruits"][0])

['apple', 'orange']
apple


**To return all items of a dictionary:**

In [43]:
food.items()

dict_items([('fruits', ['apple', 'orange']), ('vegetables', ['chicken', 'cauliflower'])])

**To remove a (key, value) pair from the dictionary:**

In [44]:
del food['vegetables']

In [45]:
food.items()

dict_items([('fruits', ['apple', 'orange'])])
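
Besides ``items``, dictionaries offer ``keys`` and ``values``, and ``pop`` removes a key while returning its value. The following sketch uses a hypothetical dictionary called `menu` so that it does not change `food`:

```python
menu = {"fruits": ["apple", "orange"], "drinks": ["water", "tea"]}

print(list(menu.keys()))     # ['fruits', 'drinks']
print(list(menu.values()))   # [['apple', 'orange'], ['water', 'tea']]

# loop over (key, value) pairs
for category, entries in menu.items():
    print(category, "->", entries)

# pop removes a key and returns its value
drinks = menu.pop("drinks")
print(drinks)                # ['water', 'tea']
print(menu)                  # {'fruits': ['apple', 'orange']}
```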

# Tuples

> A tuple is an (immutable) ordered sequence of values.

* To create a tuple, use round brackets ``()``.
* "Immutable" means that the elements of a tuple can only be accessed, but _not changed._
* Tuples can be used as keys in dictionaries and as elements of sets (lists cannot!).
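
The points above can be tried out directly. This is a minimal sketch with illustrative names (`point`, `visited`); the ``try``/``except`` blocks are only there so that the cell keeps running after the expected errors:

```python
point = (5, 6)
print(point[0])        # 5 (reading works like with lists)

# assigning to an index fails because tuples are immutable
try:
    point[0] = 10
except TypeError as error:
    print("TypeError:", error)

# a tuple can be a set element or dictionary key, a list cannot
visited = {(0, 0), (1, 2)}
try:
    visited.add([3, 4])
except TypeError as error:
    print("TypeError:", error)

print(len(visited))    # 2
```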

In [50]:
# create a tuple
t = (5, 6)       

# equivalent formulation (the round brackets are optional here)
# t = 5, 6       

t

(5, 6)

Tuples are useful for instance as keys in dictionaries:

In [51]:
# use tuples as dictionary keys
grades = {("Niels", "Jensen"): 3.0, ("Morten", "Schubert"): 2.8, ("Niels", "Christiansen"): 4.2}
grades[("Niels", "Jensen")]

3.0

### Exercise 7

1. Create a dictionary with 4 elements.

2. List the keys in the dictionary.

3. List the values in the dictionary.

4. Create a dictionary where the values are lists (e.g. countries as keys and lists of cities as values) and access one of the keys and one of the values in the list.

5. Add another country and its list of cities.

6. Remove one of the countries and its elements.