<img src="https://github.com/Center-for-Health-Data-Science/PythonTsunami/blob/oct_2022_3days/figures/HeaDS_logo_large_withTitle.png?raw=1" width="300">

<img src="https://github.com/Center-for-Health-Data-Science/PythonTsunami/blob/oct_2022_3days/figures/tsunami_logo.PNG?raw=1" width="600">

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Center-for-Health-Data-Science/PythonTsunami/blob/oct_2022_3days/solutions/iterables_wSolutions.ipynb#scrollTo=C45Ny6IznwGu)


*Originally prepared by [Katarina Nastou](https://www.cpr.ku.dk/staff/?pure=en/persons/672471), updated by HeaDS*

*Note: This notebook's contents have been adapted from Colt Steele's slides used in "[Modern Python 3 Bootcamp Course](https://www.udemy.com/course/the-modern-python3-bootcamp/)" on Udemy*

# Containers
You were previously introduced to the following basic data types in Python: ``bool``, ``int``, ``float`` and ``str``. This notebook introduces further fundamental data structures in Python.

These data structures are containers that can hold several items. In particular, this notebook covers:

* ``list``
* ``set``
* ``tuple``
* ``dictionary``

# Lists

> A list is a container of ordered elements that can be accessed by their index.

* To create an empty list:
    * use square brackets ``[]``
    * use the built-in function ``list()``
* The elements in a list are separated by commas.
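As a small sketch of the two ways to create an empty list mentioned above (the variable names are made up for illustration):

```python
# two equivalent ways to create an empty list
empty_a = []
empty_b = list()

# both are empty lists, so they compare equal
print(empty_a == empty_b)
print(len(empty_a))

# list() can also turn another iterable (here a string) into a list
letters = list("abc")
print(letters)
```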

In [None]:
tasks = ["Install Python", "Learn Python", "Take a break"]

* To find out how many elements there are in a list, you can use the built-in function ``len``.

In [None]:
len(tasks)

## Accessing values in a ``list``

The elements in a list are ordered and can thus be accessed by their index. Lists start counting at ``0``, i.e. the first element in your list lives at the index position ``0``.

### Accessing single elements

In [None]:
friends = ["Ashley", "Matt", "Michael"]
print(friends[0]) 
print(friends[2]) 
print(friends[3]) # IndexError

**To access values from the end**, you can use a negative number to index backwards:

In [None]:
friends = ["Ashley", "Matt", "Michael"]
print(friends[-1]) 
print(friends[-3]) 
print(friends[-4]) # IndexError

**To check if a value is in a list**, you can use the ``in`` operator:

In [None]:
friends = ["Ashley", "Matt", "Michael"]
print("Ashley" in friends) 
print("Jason" in friends)
print("ashley" in friends) 

### Accessing multiple elements: slicing

**To access several elements at once**, you can use a technique called _slicing:_

```python
    some_list[start:end:step]
```

1. **First parameter:** ``start``  
Tell Python which index to start slicing from. If you enter a negative number, the start position is counted from the end of the list.

In [None]:
first_list = [1, 2, 3, 4, 5, 6, 7]

# slice from index 1 (this is the second element in the list)
print(first_list[1:])

# slice from index 3
print(first_list[3:]) 

# slice from the third element from the end (i.e. the last 3 elements)
print(first_list[-3:])

[2, 3, 4, 5, 6, 7]
[4, 5, 6, 7]
[5, 6, 7]


2. **Second parameter:** ``end``  
The slice stops just before this index, i.e. the element at ``end`` itself is not included. A negative number counts from the end of the list, so it specifies how many items to exclude from the end.

In [None]:
# slice up to (but excluding) index 2
print(first_list[:2]) 

# slice starting from index 1 up to (but excluding) index 4
print(first_list[1:4]) 

# slice up to (but excluding) the last element
# = the first element from the end
print(first_list[:-1])

[1, 2]
[2, 3, 4]
[1, 2, 3, 4, 5, 6]


3. **Third parameter:** ``step``  
The ``step`` indicates the number to count at a time. E.g. a step of ``2`` only counts every second number in the list. We can reverse the order by using negative values for the ``step`` parameter.

In [None]:
# access entire list from start to end, but only count every other element
print(first_list[::2])

# start at index 1 and count backwards to the start of the list
print(first_list[1::-1])

# start at the end and count backwards, stopping before index 1
print(first_list[:1:-1])

[1, 3, 5, 7]
[2, 1]
[7, 6, 5, 4, 3]
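The three parameters can also be combined in a single slice. The following is a minimal sketch with a made-up list called `letters`, not part of the original examples:

```python
letters = ["a", "b", "c", "d", "e", "f", "g"]

# start at index 1, stop before index 6, take every second element
every_second = letters[1:6:2]
print(every_second)   # ['b', 'd', 'f']

# start at index 5, walk backwards, stop before index 1
backwards = letters[5:1:-1]
print(backwards)      # ['f', 'e', 'd', 'c']

# slicing returns a new list; the original list is unchanged
print(letters)
```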


## Nested Lists
Lists can contain any kind of element, even other lists!
To access an element in a sublist of a list, you first need to access the sublist by its index and then the element in the sublist by its index.

In [None]:
nested_list = [[1, 2, 3],
               [4, 5, 6],
               [7, 8, 9]]
nested_list[0][1] # access 2

2

Did you know that strings behave a lot like lists, too? They are what we call ``subscriptable``. A string is not a simple element like an integer. It consists of a sequence of characters, which you can access just like list elements.

In [3]:
my_string = "Programming is fun!"
print(my_string[-4:])

fun!
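A few more list-like operations on strings, as a short sketch (the variable `word` is just an illustrative name):

```python
word = "Python"

# indexing works as for lists
first_char = word[0]
print(first_char)      # P

# slicing with a negative step gives a reversed copy
reversed_word = word[::-1]
print(reversed_word)   # nohtyP

# for strings, the in operator checks for substrings
has_th = "th" in word
print(has_th)          # True

# len counts the characters
print(len(word))       # 6
```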


## Exercise

~ 15 minutes

### 1. Defining a list

Let's start with defining a list called `random_things` that is at least 4 elements long.  The data is completely up to you, but it must contain at least 1 `str` and 1 `float`. 

Use the ``len`` function to check if your list is indeed at least 4 elements long.

In [72]:
# your code goes here

random_things = ['one', 2, 3, 4.0]
len(random_things)

4

### 2. Accessing elements

Next, we should practice accessing elements in a list.

In [73]:
my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

###
# your code goes below
###

# display the first element in my_list
print(my_list[0])

# display the last element in my_list
print(my_list[-1])

# display all but the last element in my_list
print(my_list[:-1])

# display the last 3 elements
print(my_list[-3:])

# display all even numbers using the step parameter from slicing
print(my_list[1::2])

1
10
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[8, 9, 10]
[2, 4, 6, 8, 10]


**Quiz**
- **Question 1**: Given a list `numbers = [1,2,3,4]`  - what does `numbers[::-1]`  return?

    (a) `[1,2,3,4]`  
    (b) `[1,4]`  
    (c) `[4,3,2,1]`  
    (d) `[4]`  
    
- **Question 2**: Given a list `numbers = [1,2,3,4]`  - what does `numbers[1:3]`  return?

    (a) `[1,2,3]`  
    (b) `[2,3]`   
    (c) `[2,3,4]`  
    (d) `[1,2]`  
    (e) `[3]`  
    
- **Question 3**: Given a list `numbers = [1,2,3,4]`  - what does `numbers[-2]`  return?

    (a) `[3]`  
    (b) `3`  
    (c) `[1,2,3]`  
    (d) `[2]`  
    (e) `2`  

In [74]:
# if you are unsure, just try it out here

numbers = [1,2,3,4]

# 1 : c
print(numbers[::-1])

# 2 : b
print(numbers[1:3])

# 3 : b
print(numbers[-2])

[4, 3, 2, 1]
[2, 3]
3


### 3. Changing elements

Here you have a list of names, but it contains a few spelling errors. Correct the entries in the list **by accessing an element by its index** and assigning a new string.

- Change "Petre" to "Peter"
- Change "Monika" to "Monica"
- Change "george" to "George" (capitalize it)

In [75]:
# DON'T CHANGE ANYTHING UP HERE!
people = ["Petre","Joanna","Louis","Angie","Monika","george"]
# DON'T CHANGE ANYTHING UP HERE!

# your code goes here
people[0] = "Peter"
people[-2] = "Monica"
people[-1] = "George"
print(people)

['Peter', 'Joanna', 'Louis', 'Angie', 'Monica', 'George']


### 4. Nested lists

This is a nested list of strings. Print the second element from each sublist.

In [76]:
shopping_list = [['apples','bananas','oranges'],
                 ['milk', 'eggs', 'cheese'],
                 ['soap', 'toothbrush', 'tissues']]

# your code goes here
print(shopping_list[0][1])
print(shopping_list[1][1])
print(shopping_list[2][1])

bananas
eggs
toothbrush


## List Methods

Working with lists is very common - there are quite a few things we can do!

**Adding elements to a list:**
* ``append``: add an item to the end of the list.
* ``extend``: add all items of an iterable (e.g. another list) to the end of the list.
* ``insert``: insert an item at a given position. 

Note on append/extend: ``append`` adds a single element, while ``extend`` adds every element of another iterable.
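The difference is easy to see when the argument is a string, since a string is itself an iterable of characters. A minimal sketch with made-up values:

```python
letters = ["a", "b"]

# append adds its argument as ONE element, whatever it is
letters.append("cd")
print(letters)   # ['a', 'b', 'cd']

# extend loops over its argument and adds each item separately
letters.extend("ef")
print(letters)   # ['a', 'b', 'cd', 'e', 'f']
```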

In [20]:
# append
first_list = [1, 2, 3, 4]
first_list.append(5)
print(first_list)

[1, 2, 3, 4, 5]


In [21]:
# but careful with this one!
first_list = [1, 2, 3, 4]
first_list.append([5,6])
print(first_list)

[1, 2, 3, 4, [5, 6]]


In [22]:
# extend
correct_list = [1, 2, 3, 4]
correct_list.extend([5, 6, 7, 8])
print(correct_list) 

[1, 2, 3, 4, 5, 6, 7, 8]


In [None]:
# insert
first_list = [1, 2, 3, 4]
first_list.insert(2, 'Hi!') 
print(first_list) 

[1, 2, 'Hi!', 3, 4]


**Removing elements from a list:**
* ``clear``: remove all items from a list.
* ``pop``
    - Remove the item at the given position in the list, and return it.
    - If no index is specified, removes & returns last item in the list.
* ``del``: a statement (not a method) that deletes the value at a given index from a list.

In [None]:
# clear
first_list = [1, 2, 3, 4]
first_list.clear()
print(first_list)

[]


In [None]:
# pop
first_list = [1, 2, 3, 4]
last_item = first_list.pop() 
print(last_item)
second_item = first_list.pop(1) 
print(second_item)

# the elements are then not in the list anymore
print(first_list)

4
2
[1, 3]


In [None]:
# del 
first_list = [1, 2, 3, 4]
del first_list[3]
print(first_list)
del first_list[1]
print(first_list)

[1, 2, 3]
[1, 3]


**Other useful list methods:**
* `count`: return the number of times x appears in the list.
* `sort`: sort the items of the list (in-place).
* `copy`: return a copy of the list, which you can assign to a new variable.

In [None]:
# count
numbers = [1, 2, 3, 4, 3, 2, 1, 4, 10, 2]

# count how often 2 appears in the list
print(numbers.count(2))

# count how often 21 appears in the list
print(numbers.count(21))

3
0


In [None]:
# sort
another_list = [6, 4, 1, 2, 5]
another_list.sort()
print(another_list)

[1, 2, 4, 5, 6]


In [36]:
# copy
unsorted_list = [6, 4, 1, 2, 5]

# just assigning a list to a new variable name does not copy it
# you now simply have two variable names pointing to the same list
sorted_list = unsorted_list
sorted_list.sort()
print(sorted_list)
print(unsorted_list)

[1, 2, 4, 5, 6]
[1, 2, 4, 5, 6]


In [37]:
# this is the way to copy a list
unsorted_list = [6, 4, 1, 2, 5]

sorted_list = unsorted_list.copy()
sorted_list.sort()
print(sorted_list)
print(unsorted_list)

[1, 2, 4, 5, 6]
[6, 4, 1, 2, 5]
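One caveat worth knowing, sketched here with made-up data: `copy` creates a new outer list but does not duplicate nested lists, so sublists are still shared between the original and the copy.

```python
original = [[1, 2], [3, 4]]
shallow = original.copy()

# appending to the copy does not affect the original outer list
shallow.append([5, 6])
print(original)   # [[1, 2], [3, 4]]

# but the sublists are shared: changing one through the copy
# is visible in the original as well
shallow[0].append(99)
print(original)   # [[1, 2, 99], [3, 4]]
print(shallow)    # [[1, 2, 99], [3, 4], [5, 6]]
```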


### Exercise

The instructions are given as comments. You have ~10 minutes.

In [30]:
shopping_list = [['apples','bananas','oranges'],
                 ['milk', 'eggs', 'cheese'],
                 ['soap', 'toothbrush', 'tissues']]


# your code goes here


# add one item to every sublist in shopping_list

shopping_list[0].append('grapes')
shopping_list[1].append('butter')
shopping_list[2].append('shampoo')
print(shopping_list)

# remove the last sublist from shopping_list

del shopping_list[-1]
print(shopping_list)

# make the shopping list a 'flat' list (non-nested) called new_list, without changing shopping_list
# Tip: take the first sublist and extend it with the second

new_list = shopping_list[0].copy()
print(new_list)
new_list.extend(shopping_list[1])
print(shopping_list)
print(new_list)

[['apples', 'bananas', 'oranges', 'grapes'], ['milk', 'eggs', 'cheese', 'butter'], ['soap', 'toothbrush', 'tissues', 'shampoo']]
[['apples', 'bananas', 'oranges', 'grapes'], ['milk', 'eggs', 'cheese', 'butter']]
['apples', 'bananas', 'oranges', 'grapes']
[['apples', 'bananas', 'oranges', 'grapes'], ['milk', 'eggs', 'cheese', 'butter']]
['apples', 'bananas', 'oranges', 'grapes', 'milk', 'eggs', 'cheese', 'butter']


# Sets

> A set is a collection of unique, unordered, unchangeable, and unindexed elements.

* To create a set:
    * use curly brackets with at least one element, e.g. ``{1, 2, 3}``. Note that empty curly brackets ``{}`` create an empty dictionary, not a set.
    * use the built-in function ``set()``, which is also the only way to create an empty set.
* **Unique elements:** Each element in a set can only appear once.
* **Unordered and unindexed elements:** The elements of a set have no defined order. They therefore have no index, and you cannot access the elements of a set by index.
* **Unchangeable elements:** You cannot change an element of a set in place, but you can remove elements and add new ones.
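The following sketch illustrates these properties with the set methods `add` and `remove`. The names `empty_set` and `colors` are made up for this example:

```python
# an empty set must be created with set(); {} creates an empty dictionary
empty_set = set()
print(type(empty_set))
print(type({}))

colors = {"red", "green"}
colors.add("blue")      # add a new element
colors.add("red")       # already present, so nothing changes
colors.remove("green")  # remove an element

print(len(colors))      # 2
print("blue" in colors) # True
```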

In [71]:
# create a set
number_set = {1,2,3}

Why is this useful?

A common use is to get all unique values in a list.

In [70]:
my_list = [1,4,7,1,2,3,3,7,2,7,7,1]
set(my_list)

{1, 2, 3, 4, 7}

# Tuples

> A tuple is an (immutable) ordered container of values. 

* To create a tuple, use round brackets ``()``.
* "Immutable" means that the elements of a tuple can only be accessed, but _not changed._
* Tuples can be used as keys in dictionaries and as elements of sets (lists cannot!).
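A minimal sketch of both points, using hypothetical names (`point`, `distances`). The `try`/`except` block is only there so that the example keeps running after the expected error:

```python
point = (3, 4)

# elements are accessed by index, just like in a list
print(point[0])

# trying to change an element raises a TypeError
try:
    point[0] = 10
except TypeError as error:
    print("Cannot modify a tuple:", error)

# a tuple can serve as a dictionary key
distances = {(0, 0): 0.0, (3, 4): 5.0}
print(distances[point])
```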

In [40]:
# create a tuple
t = (5, 6)       

# equivalent formulation
#t = 5, 6       

t

(5, 6)

We will see them in action now.

# Dictionaries

> A dictionary stores (key, value) pairs.

- Dictionaries are created with curly brackets ``{key: value}``
- Dictionaries keep their keys in insertion order (guaranteed since `Python 3.7`).
- Dictionary values are accessed by keys.
- Each key in the dictionary is unique and duplicates are not allowed.
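To illustrate the last two points, here is a small sketch with a made-up dictionary `ages`:

```python
# if a key is repeated, the last value wins and the key is stored only once
ages = {"Ann": 31, "Bob": 25, "Ann": 32}
print(ages)        # {'Ann': 32, 'Bob': 25}
print(len(ages))   # 2

# new keys are added at the end (insertion order)
ages["Cleo"] = 40
key_order = list(ages)
print(key_order)   # ['Ann', 'Bob', 'Cleo']
```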

In [55]:
# define dictionary
city_population = {
    'Tokyo': 13350000, # a key-value pair
    'Los Angeles': 18550000,
    'New York City': 8400000,
    'San Francisco': 1837442,
}

# display dictionary
city_population

{'Los Angeles': 18550000,
 'New York City': 8400000,
 'San Francisco': 1837442,
 'Tokyo': 13350000}

In [None]:
# access the value for the key 'New York City'
print(city_population['New York City'])

8400000


**Change values** by specifying the key and using the `=` operator:

In [None]:
city_population['New York City'] = 73847834
city_population

{'Los Angeles': 18550000,
 'New York City': 73847834,
 'San Francisco': 1837442,
 'Tokyo': 13350000}

**To add a new (key, value) pair**, you can choose between different ways:

In [None]:
# using the = operator
city_population['Copenhagen'] = 1000000

# using the update method
city_population.update({'Barcelona': 5000000})

city_population

{'Barcelona': 5000000,
 'Copenhagen': 1000000,
 'Los Angeles': 18550000,
 'New York City': 73847834,
 'San Francisco': 1837442,
 'Tokyo': 13350000}

**A dictionary can hold complex data types as values:**

In [44]:
food = {"fruits": ["apple", "orange"], "vegetables": ["carrot", "eggplant"]}

In [45]:
# access the value of the key "fruits"
print(food["fruits"])

# access element at index 0 in the list
print(food["fruits"][0])

['apple', 'orange']
apple


**To remove a (key, value) pair from the dictionary:**

In [49]:
food

{'fruits': ['apple', 'orange'], 'vegetables': ['carrot', 'eggplant']}

In [51]:
del food['vegetables']

In [52]:
food

{'fruits': ['apple', 'orange']}
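Note that `del` raises a `KeyError` if the key does not exist. The sketch below, with a made-up dictionary `stock`, shows one way to guard against that with the `in` operator, and the dictionary method `pop` as an alternative that also returns the removed value:

```python
stock = {"apples": 3, "pears": 0}

# check that the key exists before deleting it
if "bananas" in stock:
    del stock["bananas"]

# pop removes the key and returns its value
pears_left = stock.pop("pears")
print(pears_left)  # 0
print(stock)       # {'apples': 3}
```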

### Exercise

You have ~ 10 minutes.

1. Create a dictionary with 4 elements.

In [85]:
personal_data = {
    'name': ['Doe','John'],
    'age': 30,
    'place of birth': 'New York',
    'savings': 0
}

2. List the keys in the dictionary.

In [82]:
personal_data.keys()

dict_keys(['name', 'age', 'place of birth', 'savings'])

3. List the values in the dictionary.

In [86]:
personal_data.values()

dict_values([['Doe', 'John'], 30, 'New York', 0])

4. Create a dictionary where the values are lists (e.g. countries as keys and lists of cities as values), then access one of the keys and one of the values in a list.

In [91]:
geographic_dict = {
    'Denmark': ['Copenhagen', 'Aarhus'],
    'France': ['Paris', 'Marseille'],
    'USA': ['Washington','New York']
}

print(list(geographic_dict.keys())[0])
print(geographic_dict['USA'][0])

Denmark
Washington


5. Add another country and its list of cities.

In [92]:
geographic_dict['Japan'] = ['Tokyo', 'Osaka']

print(geographic_dict)

{'Denmark': ['Copenhagen', 'Aarhus'], 'France': ['Paris', 'Marseille'], 'USA': ['Washington', 'New York'], 'Japan': ['Tokyo', 'Osaka']}


6. Remove one of the countries and its elements.

In [93]:
del geographic_dict['USA']

print(geographic_dict)

{'Denmark': ['Copenhagen', 'Aarhus'], 'France': ['Paris', 'Marseille'], 'Japan': ['Tokyo', 'Osaka']}
